Analysis of Hive's hive.mapred.mode parameter

Source: Internet
Author: User

Hive has a configuration parameter, hive.mapred.mode, that takes one of two values, nonstrict or strict. The default is nonstrict.

When it is set to strict, the compile phase rejects three kinds of statements:

1. Cartesian product joins. No reduce-side join key is specified in this case, so only one reducer is used, which becomes a performance bottleneck when the data volume is large.

// Use only 1 reducer in case of Cartesian product
if (reduceKeys.size() == 0) {
  numReds = 1;

  // Cartesian product is not supported in strict mode
  if (conf.getVar(HiveConf.ConfVars.HIVEMAPREDMODE).equalsIgnoreCase(
      "strict")) {
    throw new SemanticException(ErrorMsg.NO_CARTESIAN_PRODUCT.getMsg());
  }
}

2. ORDER BY without a LIMIT. ORDER BY forces the number of reducers to 1; without a LIMIT, all of the data is sent to that single reducer for a full sort.

if (sortExprs == null) {
  sortExprs = qb.getParseInfo().getOrderByForClause(dest);
  if (sortExprs != null) {
    assert numReducers == 1;
    // In strict mode, in the presence of ORDER BY, LIMIT must be specified
    Integer limit = qb.getParseInfo().getDestLimit(dest);
    if (conf.getVar(HiveConf.ConfVars.HIVEMAPREDMODE).equalsIgnoreCase(
        "strict")
        && limit == null) {
      throw new SemanticException(generateErrorMessage(sortExprs,
            ErrorMsg.NO_LIMIT_WITH_ORDERBY.getMsg()));
    }
  }
}

3. Reading a partitioned table without specifying a partition predicate.

Note: for a table with multiple partition levels, a predicate on any one of the partition columns is enough to pass the check.

// If the "strict" mode is on, we have to provide a partition pruner for
// each table.
if ("strict".equalsIgnoreCase(HiveConf.getVar(conf,
    HiveConf.ConfVars.HIVEMAPREDMODE))) {
  if (!hasColumnExpr(prunerExpr)) {
    throw new SemanticException(ErrorMsg.NO_PARTITION_PREDICATE
        .getMsg("for alias \"" + alias + "\" Table \""
            + tab.getTableName() + "\""));
  }
}
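
The three checks can be summarized in one small, self-contained sketch. This is only an illustration under stated assumptions, not Hive's implementation: StrictModeSketch, QueryFacts, firstViolation and the rule names it returns are all hypothetical, and in Hive the checks live in different places in the compiler rather than in a single method. The partition check assumes, as the note above says, that a filter on any one partition column is sufficient.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical, simplified model of the three strict-mode checks.
// None of these types are Hive classes; the names are illustrative only.
final class StrictModeSketch {

    // The few facts about a query that the three checks look at.
    static final class QueryFacts {
        final boolean hasJoin;
        final int joinKeyCount;               // 0 join keys on a join means a Cartesian product
        final boolean hasOrderBy;
        final Integer limit;                  // null when the query has no LIMIT clause
        final List<String> partitionColumns;  // empty when the table is not partitioned
        final Set<String> filteredColumns;    // columns referenced by the WHERE predicate

        QueryFacts(boolean hasJoin, int joinKeyCount, boolean hasOrderBy, Integer limit,
                   List<String> partitionColumns, Set<String> filteredColumns) {
            this.hasJoin = hasJoin;
            this.joinKeyCount = joinKeyCount;
            this.hasOrderBy = hasOrderBy;
            this.limit = limit;
            this.partitionColumns = partitionColumns;
            this.filteredColumns = filteredColumns;
        }
    }

    // Returns the name of the first rule the query breaks, or empty if it is accepted.
    static Optional<String> firstViolation(String mode, QueryFacts q) {
        if (!"strict".equalsIgnoreCase(mode)) {
            return Optional.empty();          // nonstrict mode accepts everything
        }
        if (q.hasJoin && q.joinKeyCount == 0) {
            return Optional.of("CARTESIAN_PRODUCT");
        }
        if (q.hasOrderBy && q.limit == null) {
            return Optional.of("ORDER_BY_WITHOUT_LIMIT");
        }
        // Assumption: a filter on any one partition column is enough.
        if (!q.partitionColumns.isEmpty()
                && Collections.disjoint(q.partitionColumns, q.filteredColumns)) {
            return Optional.of("NO_PARTITION_PREDICATE");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // ORDER BY without LIMIT on an unpartitioned table.
        QueryFacts unboundedSort = new QueryFacts(false, 0, true, null,
                Collections.<String>emptyList(), Collections.<String>emptySet());
        System.out.println(firstViolation("nonstrict", unboundedSort)); // Optional.empty
        System.out.println(firstViolation("strict", unboundedSort));    // Optional[ORDER_BY_WITHOUT_LIMIT]

        // Two-level partitioned table filtered on only one partition column.
        QueryFacts oneLevelFilter = new QueryFacts(false, 0, false, null,
                Arrays.asList("dt", "hour"), new HashSet<>(Arrays.asList("dt")));
        System.out.println(firstViolation("strict", oneLevelFilter));   // Optional.empty
    }
}
```

Unlike the Hive excerpts, which throw a SemanticException at the point of detection, this sketch returns a rule name so that the outcome is easy to inspect.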

With large data volumes, all three cases produce inefficient MapReduce jobs and hurt execution time, although throwing an exception outright does feel heavy-handed.

Strict mode is a reasonable choice for ad hoc query front ends outside the online production environment, such as Hiveweb or operations tools.
