Hive CliDriver hack

Source: Internet
Author: User
Tags log4j

For background on CliDriver, see "Hive Source Code Analysis: The CLI Entry Class".

This entry point was built for the Hive shell. When I wanted to submit a Hive job from my own application, I found I could not use it directly (previously, for MR, I used RunJar).

As that source code analysis explains, CliDriver does a lot of work, so my only option was to hack it a little.

After copying the CliDriver source code, two things remain to be done: hack log4j so that it uses my own configuration, and redefine the output stream to capture the results of executing HQL.

Hacking log4j is easy. CliDriver contains the following code, which reinitializes log4j:

    boolean logInitFailed = false;
    String logInitDetailMessage;
    try {
      logInitDetailMessage = LogUtils.initHiveLog4j();
    } catch (LogInitializationException e) {
      logInitFailed = true;
      logInitDetailMessage = e.getMessage();
    }
Because of this code, no matter what we do from the outside, we cannot configure the logging system ourselves. Commenting it out is enough.

To redefine the output stream, look at this part:

    CliSessionState ss = new CliSessionState(new HiveConf(SessionState.class));
    ss.in = System.in;
    try {
      ss.out = new PrintStream(System.out, true, "UTF-8");
      ss.info = new PrintStream(System.err, true, "UTF-8");
      ss.err = new CachingPrintStream(System.err, true, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      return 3;
    }

Redefine it as follows:
    ss.out = new YourHivePrintStream();
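The stream class assigned to ss.out above is only a placeholder name. As a minimal sketch of what such a class could look like, assuming all you need is to collect the query output in memory, the hypothetical CapturingPrintStream below buffers everything written to it and returns it as a string. The class name and the captured() method are illustrative and not part of Hive.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for "YourHivePrintStream": keeps all output in memory.
final class CapturingPrintStream extends PrintStream {
    private final ByteArrayOutputStream buffer;

    private CapturingPrintStream(ByteArrayOutputStream buffer)
            throws UnsupportedEncodingException {
        // autoflush = true, same encoding CliDriver uses for its own streams
        super(buffer, true, "UTF-8");
        this.buffer = buffer;
    }

    static CapturingPrintStream create() {
        try {
            return new CapturingPrintStream(new ByteArrayOutputStream());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every JVM
            throw new IllegalStateException(e);
        }
    }

    // Returns everything written so far, decoded as UTF-8.
    String captured() {
        flush();
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With something like this in place, the application could assign ss.out = CapturingPrintStream.create() and read captured() once the statement has finished.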
Even after this is done, you still cannot run Hive statements that require MR. You first have to resolve a UGI conflict, otherwise you will hit various permission exceptions, such as
[25-11:04:58,499] [ERROR] [main] [hive.ql.Driver] Authorization failed:No privilege 'Select' found for inputs { database:db, table:tb}. Use show grant to get more details.

This is a big pitfall. If you look for the cause in the permissions themselves, the logic makes no sense: permissions are defined on the HDFS side, the permission configuration looks normal when you log in to HDFS, and the same statement runs fine from the Hive command line.

Finally, reading the log from the very beginning, I found a warning:

[25-11:04:53,106] [WARN] [main] [hadoop.security.UserGroupInformation] No groups available for user gdpi
[25-11:04:53,108] [WARN] [main] [hadoop.security.UserGroupInformation] No groups available for user gdpi
The warning was even logged twice in a row. The only option was to read the source, starting with the place where this warning is emitted:
    public synchronized String[] getGroupNames() {
      ensureInitialized();
      try {
        List<String> result = groups.getGroups(getShortUserName());
        return result.toArray(new String[result.size()]);
      } catch (IOException ie) {
        LOG.warn("No groups available for user " + getShortUserName());
        return new String[0];
      }
    }
It is in UserGroupInformation. From there I traced the callers layer by layer through Hive.

CliDriver:

    SessionState.start(ss);

    // execute cli driver work
    int ret = 0;
    try {
      ret = executeDriver(ss, conf, oproc);
    } catch (Exception e) {
      ss.close();
      throw e;
    }
See what SessionState.start(ss) does:
    try {
      startSs.authenticator = HiveUtils.getAuthenticator(
          startSs.getConf(), HiveConf.ConfVars.HIVE_AUTHENTICATOR_MANAGER);
      startSs.authorizer = HiveUtils.getAuthorizeProviderManager(
          startSs.getConf(), HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER,
          startSs.authenticator);
      startSs.createTableGrants = CreateTableAutomaticGrant.create(startSs
          .getConf());
    } catch (HiveException e) {
      throw new RuntimeException(e);
    }
HiveUtils.getAuthenticator() reads the class name of the configured authenticator manager and then instantiates it:
      if (cls != null) {
        ret = ReflectionUtils.newInstance(cls, conf);
      }

It does more than instantiate: ReflectionUtils.newInstance() also calls
    setConf(result, conf);

    public static void setConf(Object theObject, Configuration conf) {
      if (conf != null) {
        if (theObject instanceof Configurable) {
          ((Configurable) theObject).setConf(conf);
        }
        setJobConf(theObject, conf);
      }
    }

The default authenticator manager is org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator, and its setConf is implemented like this:

    @Override
    public void setConf(Configuration conf) {
      this.conf = conf;
      UserGroupInformation ugi = null;
      try {
        ugi = ShimLoader.getHadoopShims().getUGIForConf(conf);
      } catch (Exception e) {
        throw new RuntimeException(e);
      }

      if (ugi == null) {
        throw new RuntimeException(
            "Can not initialize HadoopDefaultAuthenticator.");
      }

      this.userName = ShimLoader.getHadoopShims().getShortUserName(ugi);
      if (ugi.getGroupNames() != null) {
        this.groupNames = Arrays.asList(ugi.getGroupNames());
      }
    }

Sure enough, it calls the following method twice, which explains the two warnings:
    ugi.getGroupNames()

The expected user groups are not returned because this user does not exist in my environment (see the previous article, "Those login for Hadoop UserGroupInformation"). Look again at how UGI gets the user's groups:
    public synchronized String[] getGroupNames() {
      ensureInitialized();
      try {
        List<String> result = groups.getGroups(getShortUserName());
        return result.toArray(new String[result.size()]);
      } catch (IOException ie) {
        LOG.warn("No groups available for user " + getShortUserName());
        return new String[0];
      }
    }
This depends on org.apache.hadoop.security.Groups#getGroups():
    public List<String> getGroups(String user) throws IOException {
      // No need to lookup for groups of static users
      List<String> staticMapping = staticUserToGroupsMap.get(user);
      if (staticMapping != null) {
        return staticMapping;
      }
      // Return cached value if available
      CachedGroups groups = userToGroupsMap.get(user);
      long startMs = Time.monotonicNow();
      // if cache has a value and it hasn't expired
      if (groups != null && (groups.getTimestamp() + cacheTimeout > startMs)) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("Returning cached groups for '" + user + "'");
        }
        return groups.getGroups();
      }
      // Create and cache user's groups
      List<String> groupList = impl.getGroups(user);
      long endMs = Time.monotonicNow();
      long deltaMs = endMs - startMs;
      UserGroupInformation.metrics.addGetGroups(deltaMs);
      if (deltaMs > warningDeltaMs) {
        LOG.warn("Potential performance problem: getGroups(user=" + user + ") " + "took " + deltaMs + " milliseconds.");
      }
      groups = new CachedGroups(groupList, endMs);
      if (groups.getGroups().isEmpty()) {
        throw new IOException("No groups found for user " + user);
      }
      userToGroupsMap.put(user, groups);
      if (LOG.isDebugEnabled()) {
        LOG.debug("Returning fetched groups for '" + user + "'");
      }
      return groups.getGroups();
    }
The key part is
    // Create and cache user's groups
    List<String> groupList = impl.getGroups(user);
This impl is initialized when Groups is instantiated:
    public Groups(Configuration conf) {
      impl =
        ReflectionUtils.newInstance(
            conf.getClass(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING,
                          ShellBasedUnixGroupsMapping.class,
                          GroupMappingServiceProvider.class),
            conf);

      cacheTimeout =
        conf.getLong(CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_SECS,
            CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_SECS_DEFAULT) * 1000;
      warningDeltaMs =
        conf.getLong(CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_WARN_AFTER_MS,
          CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_WARN_AFTER_MS_DEFAULT);
      parseStaticMapping(conf);

      if (LOG.isDebugEnabled())
        LOG.debug("Group mapping impl=" + impl.getClass().getName() +
            "; cacheTimeout=" + cacheTimeout + "; warningDeltaMs=" +
            warningDeltaMs);
    }
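
The interplay between this constructor and getGroups() can be reduced to a toy model. The sketch below is not Hadoop code: ToyGroups and its Provider interface are hypothetical stand-ins, and the static mapping and cache expiry are left out. It only illustrates two properties that matter for this hack: the provider is fixed when the object is constructed, and an empty result is turned into an IOException, which is what UserGroupInformation reports as "No groups available".

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the lookup; names are illustrative, not Hadoop's.
final class ToyGroups {
    // Stand-in for GroupMappingServiceProvider.
    interface Provider {
        List<String> getGroups(String user) throws IOException;
    }

    private final Provider impl;                       // chosen once, at construction
    private final Map<String, List<String>> cache = new HashMap<>();
    int providerCalls = 0;                             // exposed only to observe caching

    ToyGroups(Provider impl) {
        this.impl = impl;
    }

    List<String> getGroups(String user) throws IOException {
        List<String> cached = cache.get(user);
        if (cached != null) {
            return cached;                             // cache hit: provider is not asked
        }
        providerCalls++;
        List<String> fetched = impl.getGroups(user);
        if (fetched.isEmpty()) {
            // Unknown user: nothing is cached and the caller sees an exception.
            throw new IOException("No groups found for user " + user);
        }
        cache.put(user, fetched);
        return fetched;
    }
}
```

In this model, swapping in a different provider requires building a new ToyGroups object; an existing instance keeps the provider and the cache it started with.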

So the user-to-group mapping is provided by a pluggable service with a default value. In the code the default is org.apache.hadoop.security.ShellBasedUnixGroupsMapping, while in core-site.xml the default is org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback. Either way, the groups are resolved from the current operating system, so this has to be hacked. I implemented my own provider:
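    import java.io.IOException;
    import java.util.List;

    import com.google.common.collect.Lists;
    import org.apache.hadoop.security.GroupMappingServiceProvider;

    public class MyUserGroupsMapping implements GroupMappingServiceProvider {
        // Every user belongs to exactly one group, named after the user.
        @Override
        public List<String> getGroups(String user) throws IOException {
            return Lists.newArrayList(user);
        }

        @Override
        public void cacheGroupsRefresh() throws IOException {
            // does nothing in this provider of user to groups mapping
        }

        @Override
        public void cacheGroupsAdd(List<String> groups) throws IOException {
            // does nothing in this provider of user to groups mapping
        }
    }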

Then I modified core-site.xml. Hive finally managed to submit the MR jobs it needed, but they all failed. The logs on the cluster showed a ClassNotFoundException: my custom implementation class MyUserGroupsMapping does not exist on the cluster. The only remedy was to first set the configuration to my GroupMappingServiceProvider, keep it until the local Hive is ready to submit MR, and then restore the original value:
        // Set all properties specified via command line
        HiveConf conf = ss.getConf();
        /* hack start */
        // set hadoop.security.group.mapping to return the group of user, cause the user does not exist
        String hadoopSecurityGroupMappingClass = conf.get(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING);
        conf.setClass(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING, MyUserGroupsMapping.class,
                GroupMappingServiceProvider.class);
        console.printInfo("Set hadoop_security_group_mapping ...");
        resetGroupsMapping(conf);
        /* hack end */
        for (Map.Entry<Object, Object> item : ss.cmdProperties.entrySet()) {
            conf.set((String) item.getKey(), (String) item.getValue());
            ss.getOverriddenConfigurations().put((String) item.getKey(), (String) item.getValue());
        }
        // Read prompt configuration and substitute variables.
        prompt = conf.getVar(HiveConf.ConfVars.CLIPROMPT);
        prompt = new VariableSubstitution().substitute(conf, prompt);
        prompt2 = spacesForString(prompt);

        SessionState.start(ss);
        /* hack start */
        // prevent submitting MR to the RM with my mapping class value, restore the old value
        if (StringUtils.isEmpty(hadoopSecurityGroupMappingClass)) {
            conf.unset(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING);
        } else {
            conf.set(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING, hadoopSecurityGroupMappingClass);
        }
        console.printInfo("Reset hadoop_security_group_mapping ...");
        resetGroupsMapping(conf);
        /* hack end */
        // execute cli driver work
        int ret = 0;
        try {
            ret = executeDriver(ss, conf, oproc);
        } catch (Exception e) {
            ss.close();
            throw e;
        }
        ss.close();

There is a trap here: changing the conf alone is not enough. The group mapping also has to be reset:
    private void resetGroupsMapping(Configuration conf) {
        console.printInfo(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING + ":" +
                conf.get(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING));
        Groups.getUserToGroupsMappingServiceWithLoadedConfiguration(conf);
        UserGroupInformation.setConfiguration(conf);
    }
Only then is the cache cleared and the groups fetched again, which shows up in the debug log as
    Returning fetched groups
Done.
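
The set-then-restore pattern used in this hack can also be sketched in isolation. The example below assumes a plain Map in place of HiveConf and a hypothetical helper named withOverride. It does not touch Hadoop's Groups cache, which the real hack still has to reset through resetGroupsMapping(). It only shows how try/finally guarantees that the original value, or its absence, comes back even if the wrapped action throws.

```java
import java.util.Map;
import java.util.function.Supplier;

// Illustrative helper; the Map stands in for a Hadoop/Hive configuration object.
final class TemporaryOverride {
    private TemporaryOverride() {
    }

    // Sets key to tempValue, runs the action, then restores the previous state.
    static <T> T withOverride(Map<String, String> conf, String key,
                              String tempValue, Supplier<T> action) {
        boolean hadKey = conf.containsKey(key);
        String oldValue = conf.get(key);
        conf.put(key, tempValue);
        try {
            return action.get();
        } finally {
            if (hadKey) {
                conf.put(key, oldValue);   // restore the original value
            } else {
                conf.remove(key);          // key was unset before: unset it again
            }
        }
    }
}
```

In the hack above, the equivalent of the action is SessionState.start(ss), and the key is hadoop.security.group.mapping, the value behind CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING.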
