Explain syntax
Hive provides an EXPLAIN command that shows the execution plan for a query. The syntax is:
EXPLAIN [EXTENDED] query
Adding EXTENDED to the EXPLAIN statement produces extra information about the operators in the plan. This is typically physical information, such as file names.
A Hive query is converted into a sequence of stages (more precisely, a directed acyclic graph of stages). These stages may be map/reduce stages, or stages that perform metastore or file system operations such as move and rename. The EXPLAIN output consists of three parts:
Query abstract syntax tree
Dependencies between different stages of the Execution Plan
Description of each stage
The description of a stage shows a sequence of operators together with the metadata associated with each operator. The metadata may include, for example, the filter expression of a FilterOperator, the select expressions of a SelectOperator, or the output file names of a FileSinkOperator.
Example
Consider the following explain query:
EXPLAIN
FROM src INSERT OVERWRITE TABLE dest_g1 SELECT src.key, sum(substr(src.value,4)) GROUP BY src.key;
The statement output contains the following parts:
Abstract syntax tree
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_TABREF src)) (TOK_INSERT (TOK_DESTINATION (TOK_TAB dest_g1)) (TOK_SELECT (TOK_SELEXPR (TOK_COLREF src key)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_FUNCTION substr (TOK_COLREF src value) 4)))) (TOK_GROUPBY (TOK_COLREF src key))))
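The parenthesized tree can be read as an S-expression: each opening parenthesis starts a node whose first token is the node type and whose remaining items are its children. The following is a minimal Python sketch for viewing it as an indented tree. It assumes the parentheses are balanced, as in the string embedded below. The helper names parse_ast and show are illustrative and are not part of Hive.

```python
def parse_ast(text):
    """Parse a parenthesized AST string into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = []
            while tokens[pos] != ")":
                node.append(read())
            pos += 1  # consume the closing ")"
            return node
        return tok

    tree = read()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return tree


def show(node, depth=0):
    """Return the tree as indented lines, one node or leaf per line."""
    if isinstance(node, str):
        return ["  " * depth + node]
    lines = ["  " * depth + node[0]]
    for child in node[1:]:
        lines.extend(show(child, depth + 1))
    return lines


# The tree for the example query, with balanced parentheses (assumed form).
AST = (
    "(TOK_QUERY (TOK_FROM (TOK_TABREF src)) "
    "(TOK_INSERT (TOK_DESTINATION (TOK_TAB dest_g1)) "
    "(TOK_SELECT (TOK_SELEXPR (TOK_COLREF src key)) "
    "(TOK_SELEXPR (TOK_FUNCTION sum "
    "(TOK_FUNCTION substr (TOK_COLREF src value) 4)))) "
    "(TOK_GROUPBY (TOK_COLREF src key))))"
)

tree = parse_ast(AST)
print("\n".join(show(tree)))
```

Printed this way, TOK_FROM and TOK_INSERT appear as the two children of TOK_QUERY, and the destination, select list and group-by clause appear as the three children of TOK_INSERT.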
Dependency Graph
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2
This shows that Stage-1 is the root stage, Stage-2 is executed after Stage-1 completes, and Stage-0 is executed after Stage-2 completes.
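Because the stages form a directed acyclic graph, a valid execution order is any topological order of that graph. The sketch below, in Python, parses dependency lines of the shape shown in this example and derives such an order with the standard library's graphlib module (Python 3.9 or later). The function name parse_dependencies is illustrative, and the handling of several comma-separated dependencies on one line is an assumption, since this example only shows one dependency per stage.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

STAGE_DEPENDENCIES = """\
Stage-1 is a root stage
Stage-2 depends on stages: Stage-1
Stage-0 depends on stages: Stage-2
"""


def parse_dependencies(text):
    """Map each stage name to the set of stages it depends on."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        root = re.fullmatch(r"(Stage-\d+) is a root stage", line)
        if root:
            deps[root.group(1)] = set()
            continue
        dep = re.fullmatch(r"(Stage-\d+) depends on stages: (.+)", line)
        if dep:
            # Assumption: several dependencies would be comma-separated.
            deps[dep.group(1)] = {s.strip() for s in dep.group(2).split(",")}
    return deps


deps = parse_dependencies(STAGE_DEPENDENCIES)
# static_order() yields every stage after the stages it depends on.
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['Stage-1', 'Stage-2', 'Stage-0']
```

For this example the graph is a simple chain, so there is only one possible order.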
Plan for each stage
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        src
            Reduce Output Operator
              key expressions:
                    expr: key
                    type: string
              sort order: +
              Map-reduce partition columns:
                    expr: rand()
                    type: double
              tag: -1
              value expressions:
                    expr: substr(value, 4)
                    type: string
      Reduce Operator Tree:
        Group By Operator
          aggregations:
                expr: sum(UDFToDouble(VALUE.0))
          keys:
                expr: KEY.0
                type: string
          mode: partial1
          File Output Operator
            compressed: false
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.mapred.SequenceFileOutputFormat
                name: binary_table

  Stage: Stage-2
    Map Reduce
      Alias -> Map Operator Tree:
        /tmp/hive-zshao/67494501/106593589.10001
          Reduce Output Operator
            key expressions:
                  expr: 0
                  type: string
            sort order: +
            Map-reduce partition columns:
                  expr: 0
                  type: string
            tag: -1
            value expressions:
                  expr: 1
                  type: double
      Reduce Operator Tree:
        Group By Operator
          aggregations:
                expr: sum(VALUE.0)
          keys:
                expr: KEY.0
                type: string
          mode: final
          Select Operator
            expressions:
                  expr: 0
                  type: string
                  expr: 1
                  type: double
            Select Operator
              expressions:
                    expr: UDFToInteger(0)
                    type: int
                    expr: 1
                    type: double
              File Output Operator
                compressed: false
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat
                    serde: org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe
                    name: dest_g1

  Stage: Stage-0
    Move Operator
      tables:
            replace: true
            table:
                input format: org.apache.hadoop.mapred.TextInputFormat
                output format: org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat
                serde: org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe
                name: dest_g1
In this example there are two map/reduce stages (Stage-1 and Stage-2) and one file system related stage (Stage-0). Stage-0 simply moves the results from a temporary directory to the directory corresponding to the table dest_g1.
A map/reduce stage itself has two parts:
A mapping from table alias to map operator tree: this mapping tells the mappers which operator tree to call in order to process the rows from a particular table or from the result of a previous map/reduce stage. In Stage-1 of the example above, the rows of the src table are processed by an operator tree rooted at a Reduce Output Operator. Similarly, in Stage-2 the rows of the result of Stage-1 are processed by another operator tree rooted at another Reduce Output Operator. Each of these Reduce Output Operators partitions the data to the reducers according to the criteria shown in its metadata.
A reduce operator tree: this is the operator tree that processes all the rows on the reducers of the map/reduce job. In Stage-1, for example, the reduce operator tree carries out a partial aggregation, whereas the reduce operator tree in Stage-2 computes the final aggregation from the partial aggregates produced by Stage-1.
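The partial-then-final aggregation across the two map/reduce stages can be imitated in a few lines of Python. This is only a sketch of the data flow described by the plan, not of how Hive implements it. The rows in SRC, the reducer count, and the function names stage1 and stage2 are all hypothetical. Stage 1 spreads rows over reducers at random (the plan lists rand() as its partition column) and each reducer sums what it receives per key. Stage 2 partitions the partial sums by key and adds them up.

```python
import random
from collections import defaultdict

# Hypothetical rows standing in for table src: (key, value).
SRC = [("a", "xyz1"), ("b", "xyz2"), ("a", "xyz3"), ("b", "xyz4"), ("a", "xyz5")]
NUM_REDUCERS = 2  # hypothetical reducer count


def stage1(rows, rng):
    """Partial aggregation: rows are spread over reducers at random."""
    partitions = [[] for _ in range(NUM_REDUCERS)]
    for key, value in rows:
        # Map side: emit key and substr(value, 4); Hive's substr is 1-based,
        # so it corresponds to the Python slice value[3:].
        partitions[rng.randrange(NUM_REDUCERS)].append((key, value[3:]))
    partials = []
    for part in partitions:
        sums = defaultdict(float)
        for key, text in part:
            sums[key] += float(text)  # plays the role of sum(UDFToDouble(VALUE.0))
        partials.extend(sums.items())
    return partials


def stage2(partials):
    """Final aggregation: partial sums are partitioned by key, then summed."""
    partitions = [defaultdict(float) for _ in range(NUM_REDUCERS)]
    for key, partial_sum in partials:
        partitions[hash(key) % NUM_REDUCERS][key] += partial_sum
    result = {}
    for part in partitions:
        result.update(part)
    return result


totals = stage2(stage1(SRC, random.Random(0)))
print(sorted(totals.items()))  # [('a', 9.0), ('b', 6.0)]
```

The same key may reach several reducers in stage 1, so its partial sums are incomplete there. Partitioning by key in stage 2 brings all partial sums for a key to one reducer, which is why the final totals do not depend on the random assignment.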
Translated from https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain