I. Preface
Azkaban is a workflow scheduling tool. When tasks depend on one another, plain crontab jobs cannot express those dependencies,
so a workflow engine is needed. Compared with Oozie, Azkaban submits tasks as a client, whereas Oozie distributes tasks to random nodes in the cluster environment.
Given our existing architecture, we chose Azkaban as the workflow engine. However, the setup tutorials on the Internet all have various problems, which is really annoying. After several days of hard work, I finally got Azkaban running. The detailed steps are described below.
One suggestion: never use the latest build of Azkaban, because you will run into unexpected pitfalls.
II. Steps
1. Download the Azkaban 3.25.0 package from the following address:
https://github.com/azkaban/azkaban/releases/tag/3.25.0
2. Upload the downloaded package to the target directory on the server. This article uses /mnt/data8/IRE/.
3. Extract the archive and rename the directory
tar -xf azkaban-3.25.0.tar.gz
mv azkaban-3.25.0 azkaban
4. Compile the source code (run these inside the azkaban directory)
./gradlew build            # compile
./gradlew build -x test    # compile and skip the tests; run this only if the previous command fails, because the build sometimes hangs in the test phase and cannot finish. It is not needed if the previous command succeeded.
./gradlew installDist      # package
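The "build first, skip tests only if that fails" logic from step 4 can be sketched as a small shell function. This is only an illustration: `build_azkaban` is a hypothetical helper name, and the path to the Gradle wrapper is passed as a parameter (defaulting to `./gradlew`).

```shell
#!/bin/sh
# Hypothetical helper (not part of Azkaban): run the full build and fall back
# to a build without tests only when the first attempt fails, then package.
build_azkaban() {
    gradlew="${1:-./gradlew}"
    if "$gradlew" build; then
        echo "build succeeded with tests"
    else
        echo "build failed, retrying without tests" >&2
        "$gradlew" build -x test || return 1
        echo "build succeeded without tests"
    fi
    # Packaging step from the article
    "$gradlew" installDist
}

# Example (not run here): call it from the extracted source directory
# cd /mnt/data8/IRE/azkaban && build_azkaban ./gradlew
```

Keeping the wrapper path as a parameter makes the function easy to try out against a stub before pointing it at the real build.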
5. Create a directory for running the compiled Azkaban
Create a directory named azkaban-ire under /mnt/data8/IRE/:
mkdir azkaban-ire
6. Copy the built tar packages into the azkaban-ire directory
cp azkaban/azkaban-web-server/build/distributions/azkaban-web-server-0.1.0-SNAPSHOT.tar.gz azkaban-ire/
cp azkaban/azkaban-exec-server/build/distributions/azkaban-exec-server-0.1.0-SNAPSHOT.tar.gz azkaban-ire/
cp azkaban/azkaban-db/build/distributions/azkaban-db-0.1.0-SNAPSHOT.tar.gz azkaban-ire/
Extract each tar package inside azkaban-ire and rename the resulting directory. Renaming is strongly recommended, because many absolute paths are needed later and the long default names are cumbersome. The commands are as follows:
tar -xf azkaban-db-0.1.0-SNAPSHOT.tar.gz
tar -xf azkaban-exec-server-0.1.0-SNAPSHOT.tar.gz
tar -xf azkaban-web-server-0.1.0-SNAPSHOT.tar.gz
mv azkaban-db-0.1.0-SNAPSHOT azkaban-db
mv azkaban-exec-server-0.1.0-SNAPSHOT azkaban-exec-server
mv azkaban-web-server-0.1.0-SNAPSHOT azkaban-web-server
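Since the extract-and-rename commands repeat the same pattern three times, they could be folded into one function. The following is a minimal sketch; `unpack_and_rename` is a hypothetical name, and it assumes each archive unpacks into a folder named after the archive (as the packages in this article do).

```shell
#!/bin/sh
# Hypothetical helper: extract <name>-<version>.tar.gz inside <dir> and
# rename the extracted folder to plain <name>.
unpack_and_rename() {
    dir="$1"; name="$2"; version="$3"
    (
        cd "$dir" &&
        tar -xf "${name}-${version}.tar.gz" &&
        mv "${name}-${version}" "${name}"
    )
}

# Example (not run here): process the three packages from step 6
# for c in azkaban-db azkaban-exec-server azkaban-web-server; do
#     unpack_and_rename /mnt/data8/IRE/azkaban-ire "$c" 0.1.0-SNAPSHOT
# done
```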
7. Import the Azkaban SQL
Select an existing database or create one named azkaban. The commands are as follows:
CREATE DATABASE azkaban;
CREATE USER 'azkaba'@'%' IDENTIFIED BY 'azkaba';
GRANT ALL ON azkaban.* TO 'azkaba'@'localhost' IDENTIFIED BY 'azkaba';
FLUSH PRIVILEGES;
USE azkaban;
SOURCE azkaban-ire/azkaban-db/create-all-sql-0.1.0-SNAPSHOT.sql
If the database already contains tables created by another Azkaban version, we recommend dropping them manually before running the script, as shown below:
drop table active_executing_flows;
drop table active_sla;
drop table execution_flows;
drop table execution_jobs;
drop table execution_logs;
drop table executor_events;
drop table executors;
drop table project_events;
drop table project_files;
drop table project_flows;
drop table project_permissions;
drop table project_properties;
drop table properties_index;
drop table projects;
drop table properties;
drop table triggers;
drop table project_flow_files;
drop table project_versions;
drop table qrtz_blob_triggers, qrtz_calendars, qrtz_cron_triggers, qrtz_fired_triggers, qrtz_job_details, qrtz_locks, qrtz_paused_trigger_grps, qrtz_scheduler_state, qrtz_simple_triggers, qrtz_simprop_triggers, qrtz_triggers;
The last line of create-all-sql-0.1.0-SNAPSHOT.sql can be removed, because the field it adds already exists.
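A plain `drop table` fails when a table is missing, so a cleanup that tolerates absent tables may be more convenient. The sketch below generates `DROP TABLE IF EXISTS` statements from a list of names; `gen_drop_sql` is a hypothetical helper, and piping its output into the `mysql` client with the root account is an assumption based on the credentials used later in this article.

```shell
#!/bin/sh
# Hypothetical helper: print one "DROP TABLE IF EXISTS" statement per
# table name passed as an argument.
gen_drop_sql() {
    for t in "$@"; do
        printf 'DROP TABLE IF EXISTS %s;\n' "$t"
    done
}

# Example (not run here): feed the statements to the mysql client
# gen_drop_sql active_executing_flows active_sla execution_flows \
#     | mysql -u root -p azkaban
```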
8. Configure azkaban-web-server
8.1 Create the conf directory in azkaban-web-server. The structure is as follows:
mkdir conf
├── conf
│   ├── azkaban-users.xml
│   ├── azkaban.properties
│   ├── global.properties
│   └── log4j.properties
azkaban-users.xml
<azkaban-users>
  <user username="azkaban" password="azkaban" roles="admin" groups="azkaban" />
  <user username="metrics" password="metrics" roles="metrics" />
  <user username="admin" password="admin" roles="admin,metrics" />
  <role name="admin" permissions="ADMIN" />
  <role name="metrics" permissions="METRICS" />
</azkaban-users>
azkaban.properties
# Azkaban personalization settings
# Set the project name
azkaban.name=Test
# Set the project subtitle
azkaban.label=My Local Azkaban
azkaban.color=#FF3601
azkaban.default.servlet.path=/index
# An absolute path must be used here; otherwise the web resources cannot be found by the page
web.resource.dir=/mnt/data8/IRE/azkaban-web-server/web/
# Set this to Asia/Shanghai; otherwise schedules run on the default US time zone
default.timezone.id=Asia/Shanghai

# Azkaban UserManager class
user.manager.class=azkaban.user.XmlUserManager
# An absolute path must be used here
user.manager.xml.file=/mnt/data8/IRE/azkaban-web-server/conf/azkaban-users.xml

# Loader for projects
# Be sure to use an absolute path here; otherwise the file cannot be found
executor.global.properties=/mnt/data8/IRE/azkaban-web-server/conf/global.properties
azkaban.project.dir=projects

database.type=mysql
mysql.port=3306
mysql.host=127.0.0.1
mysql.database=azkaban
mysql.user=root
mysql.password=root
mysql.numconnections=100

# Velocity dev mode
velocity.dev.mode=false

# Azkaban Jetty server properties
jetty.maxThreads=25
jetty.ssl.port=8443
jetty.use.ssl=false
jetty.port=8081
jetty.keystore=keystore
jetty.password=password
jetty.keypassword=keypassword
jetty.truststore=keystore
jetty.trustpassword=password
jetty.excludeCipherSuites=SSL_RSA_WITH_DES_CBC_SHA

# Azkaban executor settings
executor.port=12321

# Mail settings
# mail.sender=<email account>
# mail.host=<email server>
# mail.user=<email account>
# mail.password=<email password>
mail.sender=
mail.host=
job.failure.email=
job.success.email=

lockdown.create.projects=false
cache.directory=cache

# JMX stats
jetty.connector.stats=true
executor.connector.stats=true
global.properties (leave it empty)
log4j.properties
log4j.rootLogger=INFO,C
log4j.appender.C=org.apache.log4j.ConsoleAppender
log4j.appender.C.Target=System.err
log4j.appender.C.layout=org.apache.log4j.PatternLayout
log4j.appender.C.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
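Because a missing file in conf is an easy mistake to make, the layout from step 8.1 can be checked with a short script before starting the server. This is a sketch under the assumption that the four files listed above are all that is required; `check_conf` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print every expected file that is missing from a
# conf directory and return non-zero if at least one is missing.
check_conf() {
    conf_dir="$1"; shift
    missing=0
    for f in "$@"; do
        if [ ! -f "$conf_dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example (not run here): check the web server's conf directory
# check_conf azkaban-web-server/conf \
#     azkaban-users.xml azkaban.properties global.properties log4j.properties
```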
8.2 Generate a keystore file.
Run the following command in the conf directory:
keytool -keystore keystore -alias azkaban -genkey -keyalg RSA
The prompts and answers look like this:
Enter keystore password:
Re-enter new password:
What is your first and last name?
  [Unknown]:  azkaban.test.com
What is the name of your organizational unit?
  [Unknown]:  Azkaban
What is the name of your organization?
  [Unknown]:  Test
What is the name of your City or Locality?
  [Unknown]:  Beijing
What is the name of your State or Province?
  [Unknown]:  Beijing
What is the two-letter country code for this unit?
  [Unknown]:  CN
Is CN=azkaban.test.com, OU=Azkaban, O=Test, L=Beijing, ST=Beijing, C=CN correct?
  [no]:  yes
Enter key password for <azkaban>
        (RETURN if same as keystore password):
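The interactive prompts can also be avoided by passing the answers on the command line. The sketch below uses keytool's standard `-dname`, `-storepass`, and `-keypass` options; `build_dname` and `gen_keystore` are hypothetical helper names, and the passwords are placeholders that would presumably need to match `jetty.password` and `jetty.keypassword` in azkaban.properties.

```shell
#!/bin/sh
# Hypothetical helper: assemble an X.500 distinguished name from its parts
# (common name, org unit, organization, locality, state, country).
build_dname() {
    printf 'CN=%s, OU=%s, O=%s, L=%s, ST=%s, C=%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Hypothetical helper: generate the keystore without prompts.
# $1 = keystore password, $2 = key password, $3 = distinguished name.
# Only defined here; nothing is generated until it is called.
gen_keystore() {
    keytool -genkey -keyalg RSA -alias azkaban -keystore keystore \
        -storepass "$1" -keypass "$2" -dname "$3"
}

# Example (not run here), using the answers shown above:
# gen_keystore 'password' 'keypassword' \
#     "$(build_dname azkaban.test.com Azkaban Test Beijing Beijing CN)"
```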
8.3 Create a logs directory in azkaban-web-server to hold Azkaban's runtime logs.
The currentpid file is generated at startup.
9. Configure azkaban-exec-server
9.1 If azkaban-exec-server has no conf directory, create it with the following files:
├── conf
│   ├── azkaban.properties
│   ├── log4j.properties
azkaban.properties
# Azkaban
default.timezone.id=Asia/Shanghai

# Azkaban JobTypes plugins
azkaban.jobtype.plugin.dir=plugins/jobtypes

# Loader for projects (absolute path)
executor.global.properties=/mnt/data8/IRE/azkaban-web-server/conf/global.properties
azkaban.project.dir=projects

database.type=mysql
mysql.port=3306
mysql.host=127.0.0.1
mysql.database=azkaban
mysql.user=root
mysql.password=root
mysql.numconnections=100

# Azkaban executor settings
executor.maxThreads=50
executor.port=12321
executor.flow.threads=30

# JMX stats
jetty.connector.stats=true
executor.connector.stats=true

# Uncomment to enable in-memory stats for Azkaban
# executor.metric.reports=true
# executor.metric.milisecinterval.default=60000
log4j.properties
log4j.rootLogger=INFO,C
log4j.appender.C=org.apache.log4j.ConsoleAppender
log4j.appender.C.Target=System.err
log4j.appender.C.layout=org.apache.log4j.PatternLayout
log4j.appender.C.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
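The web server and the executor share several settings (for example `executor.port` and the `mysql.*` values), and a mismatch between the two azkaban.properties files is easy to overlook. The following sketch compares a key across both files; `get_prop` and `same_prop` are hypothetical helpers, and the parsing only handles simple `key=value` lines.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last "key=value" line for a
# key in a .properties file. Commented-out lines are ignored.
get_prop() {
    file="$1"
    # Escape dots so they match literally in the sed pattern
    key=$(printf '%s' "$2" | sed 's/[.]/\\./g')
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Hypothetical helper: succeed only if both files hold the same value for a key.
same_prop() {
    [ "$(get_prop "$1" "$3")" = "$(get_prop "$2" "$3")" ]
}

# Example (not run here):
# same_prop azkaban-web-server/conf/azkaban.properties \
#           azkaban-exec-server/conf/azkaban.properties executor.port \
#     || echo "executor.port differs between web and exec server"
```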
9.2 Create the plugins/jobtypes folder in azkaban-exec-server.
mkdir -p plugins/jobtypes
In the jobtypes directory, create a commonprivate.properties file with the following content:
execute.as.user=false
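Step 9.2 can be done in one go with a few lines of shell. This is a minimal sketch; `init_jobtypes` is a hypothetical helper name, and the executor directory is passed in as an argument.

```shell
#!/bin/sh
# Hypothetical helper: create plugins/jobtypes under the executor directory
# and write commonprivate.properties with the single setting from step 9.2.
init_jobtypes() {
    exec_home="$1"
    mkdir -p "$exec_home/plugins/jobtypes" &&
    printf 'execute.as.user=false\n' \
        > "$exec_home/plugins/jobtypes/commonprivate.properties"
}

# Example (not run here):
# init_jobtypes /mnt/data8/IRE/azkaban-ire/azkaban-exec-server
```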
9.3 Create a logs directory in azkaban-exec-server
9.4 The final directory layout of azkaban-exec-server consists of the conf, plugins/jobtypes, and logs folders created above.
The executors, projects, and temp directories are created later, when tasks are executed.
10. Start azkaban-web-server
Be sure to start it from the directory that contains bin, not from inside the bin directory, as follows:
cd azkaban-web-server/
bin/start-web.sh
11. Start azkaban-exec-server
cd azkaban-exec-server/
bin/start-exec.sh
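To avoid accidentally launching a start script from inside bin, the change of directory can be built into a wrapper. The sketch below is an illustration only; `start_from_home` is a hypothetical helper, and it runs the script in a subshell so the caller's working directory stays unchanged.

```shell
#!/bin/sh
# Hypothetical helper: run bin/<script> with the server directory (the one
# that contains bin) as the working directory.
start_from_home() {
    server_dir="$1"; script="$2"
    if [ ! -x "$server_dir/bin/$script" ]; then
        echo "not found or not executable: $server_dir/bin/$script" >&2
        return 1
    fi
    ( cd "$server_dir" && "bin/$script" )
}

# Example (not run here), in the order used by this article:
# start_from_home /mnt/data8/IRE/azkaban-ire/azkaban-web-server start-web.sh
# start_from_home /mnt/data8/IRE/azkaban-ire/azkaban-exec-server start-exec.sh
```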
12. Verify the result.
13. Open the web front end in a browser at <host>:8081.
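The web server may take a moment to come up, so a check on port 8081 is best repeated a few times. The following is a generic retry sketch; `retry` is a hypothetical helper, and probing the page with `curl` on `localhost` is an assumption (any command that fails while the server is down would work).

```shell
#!/bin/sh
# Hypothetical helper: run a command up to <attempts> times, waiting
# <delay> seconds between attempts; succeed as soon as the command succeeds.
retry() {
    attempts="$1"; delay="$2"; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example (not run here): wait up to about 30 seconds for the web UI
# retry 10 3 curl -sf -o /dev/null "http://localhost:8081/" \
#     && echo "Azkaban web server is up"
```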