1. Cluster planning
hadoop102 | master, worker |
hadoop103 | worker |
hadoop104 | worker |
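2. Environment preparation
1) All three nodes need the JDK (1.8+) installed, with the related environment variables configured.
2) A database is required; MySQL (5.7+) or PostgreSQL (8.2.15+) is supported.
3) ZooKeeper (3.4.6+) must be deployed.
4) All three nodes need the process management package psmisc. Install it on each node as follows:
[root@hadoop102 ~]$ sudo yum install -y psmisc
[root@hadoop103 ~]$ sudo yum install -y psmisc
[root@hadoop104 ~]$ sudo yum install -y psmisc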
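The minimum versions listed above can be checked before installing anything. The following is a minimal sketch, assuming plain dotted numeric version strings (a suffix such as the _231 in a JDK build string would have to be stripped first). version_ge is a hypothetical helper name, and the ZooKeeper version in the example is a made-up value, not one taken from this cluster.

```shell
# version_ge A B: succeed when dotted numeric version A >= B.
# Missing fields count as 0, so 5.7 is treated like 5.7.0.
version_ge() {
  _a=$1
  _b=$2
  while [ -n "$_a" ] || [ -n "$_b" ]; do
    _x=${_a%%.*}
    _y=${_b%%.*}
    # Drop the leading field; clear the string once the last field is consumed.
    if [ "$_a" = "$_x" ]; then _a=; else _a=${_a#*.}; fi
    if [ "$_b" = "$_y" ]; then _b=; else _b=${_b#*.}; fi
    _x=${_x:-0}
    _y=${_y:-0}
    if [ "$_x" -gt "$_y" ]; then return 0; fi
    if [ "$_x" -lt "$_y" ]; then return 1; fi
  done
  return 0
}

# Example: compare a hypothetical installed ZooKeeper version with the 3.4.6 minimum.
zk_version="3.5.7"
if version_ge "$zk_version" "3.4.6"; then
  echo "ZooKeeper $zk_version meets the 3.4.6 minimum"
else
  echo "ZooKeeper $zk_version is older than 3.4.6"
fi
```

The same helper can be reused for the MySQL (5.7) and PostgreSQL (8.2.15) minimums once the installed version string has been obtained.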
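3. Initialize the database
DolphinScheduler stores its metadata in a relational database, so the corresponding database and user must be created.
Note: to avoid errors caused by a password that is too simple, lower the password strength policy first:
mysql> set global validate_password_length=4;
mysql> set global validate_password_policy=0;
With policy 0 the password may consist only of digits or only of letters. Without these settings you may get: ERROR 1819 (HY000): Your password does not satisfy the current policy requirements
The work has three steps: 1. create the dolphinscheduler database, 2. create the dolphinscheduler user, 3. grant the dolphinscheduler user all privileges on the dolphinscheduler database.
1. Create the dolphinscheduler database
mysql> CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
2. Create the dolphinscheduler user
mysql> CREATE USER 'dolphinscheduler'@'%' IDENTIFIED BY 'dolphinscheduler';
% means this user can connect to the MySQL database from any host.
3. Grant the dolphinscheduler user all privileges on the dolphinscheduler database
mysql> GRANT ALL PRIVILEGES ON dolphinscheduler.* TO 'dolphinscheduler'@'%'; -- keep the statement as written; do not change it to lowercase
4. Refresh privileges
mysql> flush privileges;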
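The statements above can also be generated from a small script so that the database name, user and password are typed only once. This is a rough sketch, not an official tool: ds_init_sql is a hypothetical helper, and it assumes the three values contain no quote characters, since nothing is escaped.

```shell
# ds_init_sql DB USER PASSWORD: print the SQL for the steps described above.
# Assumption: the arguments contain no quotes or other special characters.
ds_init_sql() {
  cat <<EOF
CREATE DATABASE $1 DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE USER '$2'@'%' IDENTIFIED BY '$3';
GRANT ALL PRIVILEGES ON $1.* TO '$2'@'%';
FLUSH PRIVILEGES;
EOF
}

# Print the statements for review; nothing is sent to MySQL here.
ds_init_sql dolphinscheduler dolphinscheduler dolphinscheduler
```

After reviewing the output, it could be piped into the client, for example ds_init_sql dolphinscheduler dolphinscheduler dolphinscheduler | mysql -uroot -p.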
4. Deployment
1. Extract the package
[root@hadoop102 ~]# cd /opt/software
[root@hadoop102 software]# tar -zxvf apache-dolphinscheduler-3.0.0-bin.tar.gz
[root@hadoop102 software]# ll
drwxrwxr-x 10 root root 4096 Sep 19 09:40 apache-dolphinscheduler-3.0.0-bin
-r-------- 1 root root 148680098 Sep 12 21:51 apache-dolphinscheduler-3.0.0-bin.tar.gz
2. Enter the apache-dolphinscheduler-3.0.0-bin directory and configure it
Three kinds of configuration file need to be modified:
/opt/software/apache-dolphinscheduler-3.0.0-bin/bin/env/install_env.sh
/opt/software/apache-dolphinscheduler-3.0.0-bin/bin/env/dolphinscheduler_env.sh
The common.properties file in each of the four locations below; all four are configured identically. The common.properties under worker-server does not need to be configured, because at startup worker-server reads the common.properties under tools.
/opt/software/apache-dolphinscheduler-3.0.0-bin/tools/conf/common.properties
/opt/software/apache-dolphinscheduler-3.0.0-bin/api-server/conf/common.properties
/opt/software/apache-dolphinscheduler-3.0.0-bin/alert-server/conf/common.properties
/opt/software/apache-dolphinscheduler-3.0.0-bin/master-server/conf/common.properties
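Start configuring:
Configuration 1:
[root@hadoop102 env]# vim /opt/software/apache-dolphinscheduler-3.0.0-bin/bin/env/install_env.sh
Modify the following settings:
# 1. Configuration is done only on hadoop102. The scheduler learns about the cluster through ips; during installation the files are distributed to the hosts listed here (hadoop103, hadoop104)
ips="hadoop102,hadoop103,hadoop104"
# 2. SSH port, leave unchanged
sshPort="22"
# 3. Which host(s) run the master; must be one or more of the hosts in ips above
masters="hadoop102" # multiple hosts: "hadoop102,hadoop103"
# 4. Default worker group of DolphinScheduler; this is the default worker group shown after logging in to the UI. The hosts must be listed in ips above
workers="hadoop102:default,hadoop103:default,hadoop104:default"
# 5. Hosts for alertServer and apiServers
alertServer="hadoop102"
apiServers="hadoop102"
# 6. Installation path of DolphinScheduler; it must be a path the deploy user has permission for. Note: once the cluster has been started, later configuration changes must be made under this path. Changes made in the package directory have no effect!
installPath="/opt/module/apache-dolphinscheduler-quality"
# 7. Deploy user: the user that starts DolphinScheduler; it needs sudo privileges and passwordless login configured
deployUser="hdfs"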
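The rule that masters and workers must be drawn from ips is easy to break when editing by hand. Below is a small sketch of a consistency check; hosts_in_ips is a hypothetical helper, not part of the DolphinScheduler scripts, and it assumes comma-separated lists with no spaces.

```shell
# hosts_in_ips IPS LIST: succeed when every host in the comma-separated LIST
# also appears in the comma-separated IPS. A ":group" suffix, as used in
# workers, is ignored.
hosts_in_ips() {
  _rest=$2
  while [ -n "$_rest" ]; do
    _entry=${_rest%%,*}
    if [ "$_rest" = "$_entry" ]; then _rest=; else _rest=${_rest#*,}; fi
    _host=${_entry%%:*}
    case ",$1," in
      *",$_host,"*) ;;
      *) echo "host not listed in ips: $_host" >&2; return 1 ;;
    esac
  done
  return 0
}

# Values as used in this deployment.
ips="hadoop102,hadoop103,hadoop104"
masters="hadoop102"
workers="hadoop102:default,hadoop103:default,hadoop104:default"

if hosts_in_ips "$ips" "$masters" && hosts_in_ips "$ips" "$workers"; then
  echo "masters and workers are consistent with ips"
fi
```

Wrapping the list in commas before matching keeps a partial name such as hadoop10 from matching hadoop102.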
Configuration 2:
vim /opt/software/apache-dolphinscheduler-3.0.0-bin/bin/env/dolphinscheduler_env.sh
# 1. Configure JAVA_HOME
export JAVA_HOME=/usr/local/jdk1.8.0_231
# 2. Configure the MySQL database, the one created earlier
export DATABASE="mysql"
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:mysql://172.24.140.181:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8"
export SPRING_DATASOURCE_USERNAME="dolphinscheduler"
export SPRING_DATASOURCE_PASSWORD="dolphinscheduler"
# 3. ZooKeeper settings
export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}
export REGISTRY_ZOOKEEPER_CONNECT_STRING="172.24.140.181:2181,172.24.140.182:2181,172.24.140.183:2181"
# 4. Hadoop and Hive environment variables. Note: if Python or DataX is used later, it must be configured here as well
export HADOOP_HOME=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hadoop
export HADOOP_CONF_DIR=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hadoop/etc/hadoop
export SPARK_HOME=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/spark
export HIVE_HOME=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive
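The ZooKeeper connect string is a comma-separated list of host:port pairs, and a typo in one entry is easy to miss. As an illustration only, it could be assembled from a host list; zk_connect_string is a hypothetical helper and assumes every node listens on the same port.

```shell
# zk_connect_string PORT HOST...: join hosts into "host:port,host:port,...".
zk_connect_string() {
  _port=$1
  shift
  _out=
  for _h in "$@"; do
    # Prefix a comma only when something has already been collected.
    _out="${_out:+$_out,}$_h:$_port"
  done
  printf '%s\n' "$_out"
}

# The three ZooKeeper addresses used in this deployment.
REGISTRY_ZOOKEEPER_CONNECT_STRING=$(zk_connect_string 2181 172.24.140.181 172.24.140.182 172.24.140.183)
export REGISTRY_ZOOKEEPER_CONNECT_STRING
echo "$REGISTRY_ZOOKEEPER_CONNECT_STRING"
```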
Configuration 3:
vim /opt/software/apache-dolphinscheduler-3.0.0-bin/tools/conf/common.properties
# 1. Temporary file path
data.basedir.path=/tmp/dolphinschedulerquality
# 2. Resource storage: HDFS, S3, NONE. Resources are files such as scripts and jars; this sets where uploads are stored, and later calls look them up there
resource.storage.type=HDFS
# 3. Base path on HDFS; resources are uploaded under this path
resource.upload.path=/dolphinscheduler
# 4. Ignore this part if Kerberos is not enabled; if it is enabled, configure as follows
# 4.1 whether to startup kerberos
hadoop.security.authentication.startup.state=true
# 4.2 java.security.krb5.conf path
java.security.krb5.conf.path=/opt/krb5.conf
# 4.3 login user from keytab username
login.user.keytab.username=hdfs-mycluster@ESZ.COM
# 4.4 login user from keytab path
login.user.keytab.path=/opt/hdfs.headless.keytab
# 4.5 kerberos expire time, the unit is hour
kerberos.expire.time=2
# 6. User that operates on HDFS; it must be the HDFS superuser, which is whoever starts the NameNode. On CDH this is the hdfs user. If you are unsure which user it is, look it up before filling this in
hdfs.root.user=hdfs
# 7. If NameNode HA is enabled, copy core-site.xml and hdfs-site.xml into the conf directory and set this to the cluster (nameservice) name, e.g. hdfs://mycluster (a logical HA name takes no port)
fs.defaultFS=hdfs://sd-140-181:8020
# 8. DolphinScheduler finds out whether a YARN job has finished by querying YARN through the port below; the default normally does not need to change
resource.manager.httpaddress.port=8088
# 9. YARN address; there are two cases depending on whether HA is enabled
# if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
yarn.resourcemanager.ha.rm.ids=
# if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace ds1 to actual resourcemanager hostname
yarn.application.status.address=http://sd-140-181:%s/ws/v1/cluster/apps/%s
# job history status url when application number threshold is reached(default 10000, maybe it was set to 1000)
yarn.job.history.status.address=http://sd-140-181:19888/ws/v1/history/mapreduce/jobs/%s
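Instead of editing each key in vim, the values could be set from a script. The sketch below is an assumption-laden helper rather than anything shipped with DolphinScheduler: set_prop is a hypothetical name, it expects simple key=value lines without leading spaces, and values must not contain backslashes because awk -v interprets escape sequences. The demonstration runs on a scratch file, not on the real common.properties.

```shell
# set_prop FILE KEY VALUE: replace the line "KEY=..." in FILE, or append
# "KEY=VALUE" when the key is not present. Other lines are kept as they are.
set_prop() {
  _tmp=$(mktemp) || return 1
  if awk -v k="$2" -v v="$3" '
      BEGIN { FS = "="; found = 0 }
      $1 == k { print k "=" v; found = 1; next }
      { print }
      END { if (!found) print k "=" v }
    ' "$1" > "$_tmp"; then
    # Write back through cat so the original file keeps its owner and mode.
    cat "$_tmp" > "$1"
    _rc=$?
  else
    _rc=1
  fi
  rm -f "$_tmp"
  return "$_rc"
}

# Demonstration on a scratch file.
demo=$(mktemp)
printf '%s\n' 'resource.storage.type=NONE' 'resource.upload.path=/dolphinscheduler' > "$demo"
set_prop "$demo" resource.storage.type HDFS
set_prop "$demo" hdfs.root.user hdfs
cat "$demo"
rm -f "$demo"
```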
After Configuration 3 is done, set the same parameters in the following files as well:
vim /opt/software/apache-dolphinscheduler-3.0.0-bin/api-server/conf/common.properties
vim /opt/software/apache-dolphinscheduler-3.0.0-bin/alert-server/conf/common.properties
vim /opt/software/apache-dolphinscheduler-3.0.0-bin/master-server/conf/common.properties
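Since these files are meant to carry the same settings as the tools copy, one option is to edit tools/conf/common.properties once and copy it over the other three. The following is a sketch under that assumption (that the files should be byte-identical); sync_common_properties is a hypothetical helper, and backing up the originals first is advisable.

```shell
# sync_common_properties DS_HOME: copy tools/conf/common.properties over the
# api-server, alert-server and master-server copies.
sync_common_properties() {
  _src="$1/tools/conf/common.properties"
  [ -f "$_src" ] || { echo "missing $_src" >&2; return 1; }
  for _svc in api-server alert-server master-server; do
    cp "$_src" "$1/$_svc/conf/common.properties" || return 1
  done
}

# Run only when the extracted package is actually present on this machine.
DS_HOME=/opt/software/apache-dolphinscheduler-3.0.0-bin
if [ -d "$DS_HOME" ]; then
  sync_common_properties "$DS_HOME"
fi
```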
Place the JDBC driver jar
The official documentation requires driver version 8.0.16 or later; I am currently using 8.0.30. Do not reuse the jar that ships with MySQL 5.x. If you do not have a suitable one, download it!
$ cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/api-server/libs/
$ cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/alert-server/libs
$ cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/master-server/libs
$ cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/worker-server/libs
$ cp /usr/share/java/mysql-connector-java.jar /opt/software/apache-dolphinscheduler-3.0.0-bin/tools/libs
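With five target directories it is easy to miss one. A quick check along these lines can confirm the driver landed everywhere; missing_driver_dirs is a hypothetical helper, and it assumes the jar file name starts with mysql-connector-java.

```shell
# missing_driver_dirs DS_HOME: print each libs directory that contains no
# mysql-connector-java*.jar; succeed only when none is missing.
missing_driver_dirs() {
  _missing=0
  for _svc in api-server alert-server master-server worker-server tools; do
    _found=0
    for _jar in "$1/$_svc/libs"/mysql-connector-java*.jar; do
      # An unmatched glob stays literal, so test that the file really exists.
      if [ -f "$_jar" ]; then _found=1; fi
    done
    if [ "$_found" -eq 0 ]; then
      echo "$1/$_svc/libs"
      _missing=1
    fi
  done
  return "$_missing"
}

# Run only when the extracted package is actually present on this machine.
DS_HOME=/opt/software/apache-dolphinscheduler-3.0.0-bin
if [ -d "$DS_HOME" ]; then
  missing_driver_dirs "$DS_HOME" || echo "copy the driver jar into the directories listed above" >&2
fi
```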
Initialize and start DolphinScheduler
Initialize the database
$ cd /opt/software/apache-dolphinscheduler-3.0.0-bin/tools/bin
$ sh upgrade-schema.sh
Install and start
$ cd /opt/software/apache-dolphinscheduler-3.0.0-bin/bin/
$ sh install.sh
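Open http://test01:12345/dolphinscheduler/ui/login in a browser to log in to the system UI (replace test01 with the host that runs your API server).
The default username and password are admin/dolphinscheduler123
Time zone problem after startup:
Add export SPRING_JACKSON_TIME_ZONE=Asia/Shanghai to bin/env/dolphinscheduler_env.sh and restart the services.
Alternatively, after startup the time zone can be changed in the upper-right corner of the UI.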
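For the env-file approach, the export line can be added so that repeated runs do not duplicate it. A minimal sketch follows: ensure_export is a hypothetical helper, it leaves an existing export of the same variable untouched even when the value differs, and the demonstration uses a scratch file rather than the real dolphinscheduler_env.sh.

```shell
# ensure_export FILE NAME VALUE: append "export NAME=VALUE" to FILE unless a
# line exporting NAME is already present.
ensure_export() {
  if grep -q "^export $2=" "$1" 2>/dev/null; then
    return 0
  fi
  printf 'export %s=%s\n' "$2" "$3" >> "$1"
}

# Demonstration on a scratch file; the second call changes nothing.
env_file=$(mktemp)
ensure_export "$env_file" SPRING_JACKSON_TIME_ZONE Asia/Shanghai
ensure_export "$env_file" SPRING_JACKSON_TIME_ZONE Asia/Shanghai
cat "$env_file"   # prints a single line: export SPRING_JACKSON_TIME_ZONE=Asia/Shanghai
rm -f "$env_file"
```

On a running cluster the file to edit is the one under the configured installPath, and the services still need a restart for the change to take effect.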
If anything is unclear, leave a comment; I check them regularly.
Source: https://blog.csdn.net/sinat_35197112/article/details/128256442