Dr. Elephant User Guide (Dr. Elephant series, article 2)

Source: Internet
Author: User

This article describes how to use Dr. Elephant for task analysis.

UI Home

After Dr. Elephant starts, its home page shows the following sections:


Cluster Statistics Information

The gray section of the home page contains the latest cluster information. This section lists the number of tasks that have been analyzed in the last 24 hours, the number of tasks that can be optimized, and the number of tasks that need to be optimized.

Latest Task Statistics

This section lists the most recently analyzed tasks.

Search Page

Click "Search" to open the search page. On this page, we can search for tasks using the following criteria:

    • Task ID: Enter a task ID to search for a specific task or task flow. The search returns the task details page.
    • Task flow execution ID/URL: Enter the execution ID or URL of a task flow (for example, an Azkaban task flow) to find all tasks triggered by that flow execution.
    • User name: The name of the user who submitted the task.
    • Task type: Search for all tasks of a specific type.
    • Severity (level to be optimized): After Dr. Elephant diagnoses a task, it generates a detailed diagnostic report that includes the task's level to be optimized. We can search for tasks by that level. For example, if we enter "critical" as the level, the results contain every task that at least one heuristic diagnosed as critical. The search can also name a specific heuristic; the results then contain only the tasks that this particular heuristic diagnosed as critical.
    • Task end date: We can also filter by the time a task finished. Set the start and end of the period in the "from" and "to" input boxes. The period is a left-closed, right-open interval [from, to): it includes the "from" instant but excludes the "to" instant.

All of these search conditions can be combined. For example, if we set the user name to "user1" and the level to be optimized to "critical", then click Search, the results contain all tasks submitted by user1 that were diagnosed as critical.
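The way these criteria combine can be sketched in a few lines of Python. This is only an illustrative in-memory model, not Dr. Elephant's implementation: the `AnalyzedTask` record, the `matches` and `search` functions, the sample data, and the heuristic name "Mapper time" are all hypothetical, and the sketch assumes that a severity search matches the level exactly.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class AnalyzedTask:
    """Hypothetical record of one analyzed task (not Dr. Elephant's real schema)."""
    task_id: str
    username: str
    finished: datetime
    # Heuristic name -> level assigned by that heuristic.
    levels: Dict[str, str] = field(default_factory=dict)


def matches(task: AnalyzedTask,
            username: Optional[str] = None,
            severity: Optional[str] = None,
            heuristic: Optional[str] = None,
            time_from: Optional[datetime] = None,
            time_to: Optional[datetime] = None) -> bool:
    """Return True when the task satisfies every criterion that was given."""
    if username is not None and task.username != username:
        return False
    if severity is not None:
        if heuristic is not None:
            # A heuristic was named: only its result counts.
            if task.levels.get(heuristic) != severity:
                return False
        elif severity not in task.levels.values():
            # No heuristic named: at least one heuristic must report this level.
            return False
    # Half-open interval [time_from, time_to): start included, end excluded.
    if time_from is not None and task.finished < time_from:
        return False
    if time_to is not None and task.finished >= time_to:
        return False
    return True


def search(tasks: List[AnalyzedTask], **criteria) -> List[AnalyzedTask]:
    """Keep the tasks that satisfy all criteria, preserving input order."""
    return [t for t in tasks if matches(t, **criteria)]


# Made-up sample data for illustration only.
tasks = [
    AnalyzedTask("job_1", "user1", datetime(2024, 1, 1, 9, 0),
                 {"Mapper memory": "CRITICAL", "Mapper time": "NONE"}),
    AnalyzedTask("job_2", "user1", datetime(2024, 1, 2, 0, 0),
                 {"Mapper memory": "LOW"}),
    AnalyzedTask("job_3", "user2", datetime(2024, 1, 1, 12, 0),
                 {"Mapper time": "CRITICAL"}),
]

hits = search(tasks, username="user1", severity="CRITICAL")
print([t.task_id for t in hits])  # ['job_1']
```

With the same data, a search from 2024-01-01 00:00 to 2024-01-02 00:00 returns job_1 and job_3 but not job_2, because job_2 finished exactly at the excluded "to" instant.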

Task Details

Click on a task in the UI to go to the Task Details page.

Task Information
    1. Task tracker link (Job tracker): This link points to the task's tracking page, where you can see the task's details, its logs, and information about its map and reduce tasks.
    2. Task execution link (Job execution): This link points to the execution page of the task in the scheduler. For example, with the Azkaban scheduler, it points to the execution page of this task.
    3. Task definition (Job definition): This link points to the task's definition page in the scheduler. For example, with the Azkaban scheduler, it points to the properties page of this task.
    4. Task flow execution link (Flow execution): This link points to the execution page of the entire task flow. For example, with the Azkaban scheduler, it points to the execution page of this task flow.
    5. Task flow definition (Flow definition): Like the task definition (Job definition) above, but it points to the definition page of the task flow.
    6. Task history: This link points to the task history page, which is described in detail below.
    7. Task flow history: This link points to the task flow history page, which is described in detail below.

Heuristic Algorithm Diagnostic Report

When Dr. Elephant analyzes a task, it runs all of its heuristics against that task. Each heuristic calculates a level to be optimized for the task: none, low, moderate, severe, or critical. The detailed analysis page of each task shows this level along with the other analysis results. If the task's level is not none, at least one heuristic suggests that the task needs to be optimized, and a corresponding link to a help page explains the optimization recommendations proposed by that heuristic. The task's developers can use this link to help optimize their tasks.
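As a rough illustration of this flow, the sketch below runs every registered heuristic over one task's metrics and reports which heuristics returned a level other than NONE. The two heuristics, their rules, their cut-off values, and the metric names are all invented for the example and do not reflect Dr. Elephant's actual heuristics.

```python
from typing import Callable, Dict, List

Metrics = Dict[str, float]
Heuristic = Callable[[Metrics], str]


def memory_usage_heuristic(m: Metrics) -> str:
    # Hypothetical rule: the smaller the share of requested memory
    # that is actually used, the more urgent the tuning.
    ratio = m["used_mb"] / m["requested_mb"]
    if ratio < 0.2:
        return "CRITICAL"
    if ratio < 0.4:
        return "SEVERE"
    if ratio < 0.6:
        return "MODERATE"
    if ratio < 0.8:
        return "LOW"
    return "NONE"


def runtime_heuristic(m: Metrics) -> str:
    # Hypothetical rule: flag tasks that run longer than an hour.
    return "SEVERE" if m["runtime_min"] > 60 else "NONE"


# Every registered heuristic is run against every analyzed task.
HEURISTICS: Dict[str, Heuristic] = {
    "Memory usage": memory_usage_heuristic,
    "Runtime": runtime_heuristic,
}


def diagnose(metrics: Metrics) -> Dict[str, str]:
    """Run all heuristics and return heuristic name -> level."""
    return {name: check(metrics) for name, check in HEURISTICS.items()}


def needs_attention(report: Dict[str, str]) -> List[str]:
    """Heuristics whose level is not NONE; these would get a help link."""
    return [name for name, level in report.items() if level != "NONE"]


report = diagnose({"used_mb": 500, "requested_mb": 2000, "runtime_min": 10})
print(report)
print(needs_attention(report))
```

In this example the task uses a quarter of the memory it requested, so only the hypothetical memory heuristic flags it.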

Task Comparison

On the Dr. Elephant home page, click "Compare" to open the task comparison page. There we can compare any two executions of a task flow at the task level. Tasks that appear in both executions are compared and shown at the top; the remaining tasks are shown below them, in task-flow order.
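The ordering rule described above (common tasks first, then the rest in flow order) can be sketched as follows. The function name, the tuple layout, and the sample task names are assumptions made for illustration; this is not how Dr. Elephant itself stores or renders the comparison.

```python
from typing import List, Optional, Tuple

# One task in a flow execution: (task name, level to be optimized).
Execution = List[Tuple[str, str]]
Row = Tuple[str, Optional[str], Optional[str]]


def compare_executions(first: Execution, second: Execution) -> List[Row]:
    """Return rows of (task, level in first, level in second).

    Tasks present in both executions come first; tasks found in only one
    execution follow, each group kept in its original flow order.
    """
    second_levels = dict(second)
    first_names = {name for name, _ in first}

    common = [(name, level, second_levels[name])
              for name, level in first if name in second_levels]
    only_first = [(name, level, None)
                  for name, level in first if name not in second_levels]
    only_second = [(name, None, level)
                   for name, level in second if name not in first_names]
    return common + only_first + only_second


# Made-up flow executions for illustration.
run_a = [("extract", "LOW"), ("transform", "SEVERE"), ("load", "NONE")]
run_b = [("extract", "NONE"), ("validate", "MODERATE"), ("load", "NONE")]

for row in compare_executions(run_a, run_b):
    print(row)
```

Here "extract" and "load" appear in both runs and are listed first; "transform" and "validate" each appear in only one run and are listed afterwards.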


Task History Page

The task history page compares all recent executions of a particular task.


Search Box

On the task history page, we can search for a specific task by entering its ID or URL in the search box. Clicking Search shows the execution history of that task. The task details page described earlier also has a link to the task's history page. The line chart on this page plots the performance score of each past execution of the task.

Performance Score Chart

The performance score chart is a line chart: the x-axis is time and the y-axis is the score. Hovering over a point on the line opens a pop-up box that lists the top 3 phases of the task that caused performance problems in that execution. The score is calculated with a simple formula; the lower the score, the better the task performed.
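The "top 3" list in the pop-up can be pictured as a simple ranking. The sketch below assumes that each phase has its own numeric score and that a higher score means a bigger problem; the article does not give the scoring formula, so the per-phase scores, the phase names other than Mapper memory, and the function name are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def top_offenders(phase_scores: Dict[str, int], n: int = 3) -> List[Tuple[str, int]]:
    """Return up to n phases with the highest non-zero scores, worst first."""
    ranked = sorted(phase_scores.items(), key=lambda item: item[1], reverse=True)
    # Phases with a score of 0 caused no problem, so they are left out.
    return [(name, score) for name, score in ranked if score > 0][:n]


# Made-up scores for one execution of a task.
scores = {
    "Mapper skew": 0,
    "Mapper GC": 12,
    "Mapper memory": 40,
    "Reducer time": 7,
    "Shuffle and sort": 25,
}

print(top_offenders(scores))
# [('Mapper memory', 40), ('Shuffle and sort', 25), ('Mapper GC', 12)]
```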

Tabular View of Heuristic Results

Below the performance score chart is a table of the task's recent executions. The first column is the time of each execution; clicking a time jumps to the execution details page of the task in the scheduler. Each of the remaining columns represents a phase of task execution. Each cell for a phase contains a number of colored dots, and each color represents a level to be optimized assigned by a heuristic. Hovering over a dot opens a box with some of the heuristic's optimization recommendations for the task.

Task Flow History Page

The task flow history page compares all recent executions of a particular task flow.


Search Box

On the task flow history page, we can search for a specific task flow by entering its ID or URL in the search box. Clicking Search shows the execution history of that task flow.

Performance Score Chart

The performance score chart is a line chart: the x-axis is time and the y-axis is the score. Hovering over a point on the line opens a pop-up box that lists the top 3 phases of the task flow that caused performance problems in that execution. The score is calculated with a simple formula; the lower the score, the better the task flow performed.

Tabular View of Heuristic Results

Below the performance score chart is a table of the task flow's recent executions. The first column is the time of each execution; clicking a time jumps to the execution details page of the task flow in the scheduler. Each of the remaining columns represents a task in the task flow. Each cell for a task contains a number of colored dots. Hovering over a dot opens a pop-up that lists all the heuristics and the level to be optimized that each one assigned to the task.

Help

Click "Help" on the Dr. Elephant home page to open the help page. You can also reach a help page through the "explain" link on the task details page (this link appears when a heuristic's diagnosis is moderate, severe, or critical). The help page introduces all the heuristics and the optimization recommendations they give. Click a specific heuristic to see its detailed recommendations; for example, the Mapper Memory heuristic has its own page of optimization suggestions.

Level to Be Optimized

The level to be optimized reflects the performance of a task and indicates how urgently the task needs tuning. Each heuristic has a number of thresholds that can be configured through parameters, and the heuristic uses them to assign a level to each task it diagnoses. There are 5 levels. In descending order of urgency: CRITICAL > SEVERE > MODERATE > LOW > NONE


Severity    Color    Description
CRITICAL             This task urgently needs to be optimized.
SEVERE               This task has a lot of room for optimization.
MODERATE             This task has further room for optimization.
LOW                  This task has a little room for optimization.
NONE                 This task is fine and has no room for optimization.
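One way to picture how configurable thresholds turn a measured value into one of the five levels is the sketch below. It assumes four ascending limits per heuristic and a metric where larger values are worse; the function name, the example limits, and the metric are hypothetical and are not taken from Dr. Elephant's configuration.

```python
from bisect import bisect_right
from typing import Sequence

# Levels in ascending order of urgency, as listed in this guide.
LEVELS = ["NONE", "LOW", "MODERATE", "SEVERE", "CRITICAL"]


def severity_for(value: float, thresholds: Sequence[float]) -> str:
    """Map a metric value to a level using four ascending limits.

    thresholds holds the lowest value that counts as LOW, MODERATE,
    SEVERE and CRITICAL, in that order. Larger values are assumed worse.
    """
    limits = list(thresholds)
    if len(limits) != 4 or limits != sorted(limits):
        raise ValueError("expected four thresholds in ascending order")
    # bisect_right counts how many limits the value has reached.
    return LEVELS[bisect_right(limits, value)]


# Hypothetical limits for a made-up metric.
limits = [1.5, 2.0, 4.0, 8.0]

for value in (1.0, 1.5, 3.0, 8.0):
    print(value, severity_for(value, limits))
```

With these limits, 1.0 maps to NONE, 1.5 to LOW, 3.0 to MODERATE, and 8.0 to CRITICAL. Changing the limits changes the level assigned to the same measurement, which is the effect of tuning a heuristic's thresholds.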



