Perform Spark tasks with crontab scheduling


Today's main contents:

1. Using the crontab timer under Linux
2. Writing Perl scripts under Linux
3. Invoking Linux commands in Java programs
4. Instance: performing the Spark task at 0:30 every day

1. Use the crontab timer under Linux

1. Installation

yum -y install vixie-cron
yum -y install crontabs

2. Start and stop commands

service crond start     // start the service
service crond stop      // stop the service
service crond restart   // restart the service
service crond reload    // reload the configuration
service crond status    // view the crond service status

3. View all Timer tasks

crontab -l

Suppose the listing shows one timer task that executes the test.sh script with sh every minute.
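Such an entry would look like the following (the path to test.sh is hypothetical, since the original does not show one; substitute your own):

    * * * * * sh /data/tools/test.sh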

4. Add Timer task

crontab -e

5. crontab time expressions

Basic format:

* * * * * command
minute  hour  day-of-month  month  day-of-week  command

6. Common examples

// at minutes 15 and 45 of every hour (a comma lists values side by side)
    15,45 * * * * command

// at minutes 15 and 45 of each hour from 8 to 11 every day (a dash denotes a range)
    15,45 8-11 * * * command

// at minutes 3 and 15 of each hour from 8 to 11 every Monday
    3,15 8-11 * * 1 command

// at minutes 3 and 15 of each hour from 8 to 11 every two days (*/n denotes a frequency)
    3,15 8-11 */2 * * command
2. Write Perl scripts under Linux

1. Install Perl First

yum -y install gcc gcc-c++ make automake autoconf libtool perl
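If the installation succeeded, the interpreter will report its version:

    perl -v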

2. Write one of the simplest Perl scripts

vi test.pl

The contents are as follows:

#!/usr/bin/perl
use strict;

print "hello\nworld!\n";

The first "#" indicates that this line is a comment
A second "!" Indicates that this line is not a normal annotation, but rather a declaration line of the interpreter path
The following "/usr/bin/perl" is the path to the Perl interpreter's installation, and may be: "/usr/local/bin/perl," if that doesn't work, change this.
Use strict is strictly checked grammar
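A quick way to see what use strict buys you (a sketch run from the shell; the exact wording of the error varies between Perl versions):

    perl -e 'use strict; $x = 1;'     # fails: Global symbol "$x" requires explicit package name
    perl -e 'use strict; my $x = 1;'  # compiles cleanly, because $x is declared with my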

3. Add executable permissions to the script

chmod 764 test.pl

The basic permissions of a Linux file consist of 9 bits: each of the three identities owner/group/other has read/write/execute flags, scored r=4, w=2, x=1.
The score for each identity is cumulative. For example, when the permissions are [-rwxrwx---], this means:
owner: rwx = 4+2+1 = 7
group: rwx = 4+2+1 = 7
other: --- = 0+0+0 = 0
That is, the file's permission number is 770. By the same arithmetic, the 764 used above grants the owner rwx (7), the group rw- (6), and others r-- (4).
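You can verify the result with ls; after chmod 764 the permission string should read -rwxrw-r--:

    ls -l test.pl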

4. Then execute the Perl file

./test.pl

Since we have declared the interpreter path, we do not need to run "perl test.pl"; executing "./test.pl" directly is enough.

3. Invoke Linux commands in Java programs

This mainly uses the two classes Process and Runtime; a code example follows:

    Runtime rt = Runtime.getRuntime();
    String[] cmd = {"/bin/sh", "-c", "cd ~"};   // hand the command string to sh -c
    Process proc = rt.exec(cmd);                // throws IOException
    proc.waitFor();                             // wait for the command to finish
    proc.destroy();

If the -c option is present, sh reads the command to be executed from the string argument that follows it.
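The same mechanism can be tried directly in a terminal; sh executes whatever the quoted string contains (pwd is appended here only to make the effect visible):

    /bin/sh -c "cd ~ && pwd"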

4. Instance: perform the Spark task at 0:30 every day

1. First, write the Perl script that performs the Spark task: getappinfo.pl

#!/usr/bin/perl
use strict;

# Get the date to process. The multiplier after 3600 was truncated in the
# original; 24 (one day back) is assumed here, since the script runs at 0:30.
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = localtime(time - 3600 * 24);
# $year is counted from 1900, so 1900 must be added;
$year += 1900;
# $mon is counted from 0, so 1 must be added;
$mon += 1;

print "$year-$mon-$mday-$hour-$min-$sec, wday: $wday, yday: $yday, isdst: $isdst\n";

sub exec_spark
{
    my $dst_date = sprintf("%d%02d%02d", $year, $mon, $mday);
    # Note: --executor-cores lost its value in the original; 4 is a placeholder.
    my $spark_generateapp = "nohup /data/install/spark-2.0.0-bin-hadoop2.7/bin/spark-submit --master spark://hxf:7077 --executor-memory 30G --executor-cores 4 --conf spark.default.parallelism=300 --class com.analysis.main.GenAppInfo /home/hadoop/jar/analysis.jar $dst_date > /home/hadoop/logs/genappinfo.log &";
    print "$spark_generateapp\n";

    return system($spark_generateapp);
}

if (!exec_spark())
{
    print "done\n";
    exit(0);
}
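Before handing the script over to cron, it is worth running it once by hand (with execute permission added, as in section 2) to check the assembled spark-submit command line:

    ./getappinfo.pl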

2. Add the timer task: execute getappinfo.pl at 0:30 every day

crontab -e

Add the following:

30 0 * * * /data/tools/getappinfo.pl
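This form of entry assumes the script is executable; if it is not, either add the permission or prefix the entry with perl:

    chmod +x /data/tools/getappinfo.pl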

3. The Spark program called by the script is as follows:

package com.analysis.main

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object TestCrontab {

  // args -> 20170101
  def main(args: Array[String]): Unit = {
    if (args.length == 0) {
      System.err.println("parameter exception")
      System.exit(1)
    }

    val year = args(0).substring(0, 4)
    val month = args(0).substring(4, 6)
    val day = args(0).substring(6, 8)

    // Set the serializer to KryoSerializer (this can also be set in the configuration file)
    System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

    // Set the application name and create the Spark session
    val sparkConf = new SparkConf().setAppName("GenerateAppInfo_" + args(0))
    val spark = SparkSession.builder().config(sparkConf).enableHiveSupport().getOrCreate()
    println("Start " + "GenerateAppInfo_" + args(0))

    import spark.sql
    sql("use arrival")
    val sqlStr = "select opttime, firstimei, secondimei, thirdimei, applist, year, month, day from base_arrival" +
      " where year=" + year + " and month=" + month + " and day=" + day
    sql(sqlStr).show()

    // Run getappinfo_new.pl
    val rt = Runtime.getRuntime()
    val cmd = Array("/bin/sh", "-c", "/data/tools/getappinfo_new.pl")
    try {
      val proc = rt.exec(cmd)
      proc.waitFor()
      proc.destroy()
      println("Performed the extract appinfo_new task")
    } catch {
      case e: Exception => println("Perform the extract appinfo_new task failed: " + e.getMessage())
    }
  }
}

This program first queries the data from Hive and displays it, then calls the Linux shell to execute another Perl script, getappinfo_new.pl, in which we can write whatever follow-up actions we need.
