For CliDriver itself, refer to "Hive Source Code Analysis: the CLI Entry Class".
This entry point was born for the Hive shell, and when I wanted to submit a Hive task from inside my own application, I found I could not use it directly (before the MR RunJar step). As the source code analysis above shows, CliDriver does a lot of work, so I could only hack it a little.
After copying CliDriver's source code, the work to be done was: hack log4j, and redefine the output streams with our own configuration so we can capture the results of executing HQL.
Hacking log4j is easy. CliDriver contains the following code, which reinitializes log4j:
boolean logInitFailed = false;
String logInitDetailMessage;
try {
  logInitDetailMessage = LogUtils.initHiveLog4j();
} catch (LogInitializationException e) {
  logInitFailed = true;
  logInitDetailMessage = e.getMessage();
}
Because of this, no matter how we configure the logging system from the outside it gets overwritten; commenting out this code fixes that.
Redefining the output streams starts here:
CliSessionState ss = new CliSessionState(new HiveConf(SessionState.class));
ss.in = System.in;
try {
  ss.out = new PrintStream(System.out, true, "UTF-8");
  ss.info = new PrintStream(System.err, true, "UTF-8");
  ss.err = new CachingPrintStream(System.err, true, "UTF-8");
} catch (UnsupportedEncodingException e) {
  return 3;
}
Then substitute your own stream:
ss.out = new YourHivePrintStream();
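YourHivePrintStream above is only a placeholder. A minimal sketch of what such a stream could look like, assuming all you want is to buffer each printed line so the embedding application can read the HQL results afterwards (the class name and buffering strategy are my own, not Hive API):

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;

// Sketch: a PrintStream that remembers every line printed to it,
// so the embedding application can collect query results.
class CapturingPrintStream extends PrintStream {
  private final List<String> lines = new ArrayList<>();

  CapturingPrintStream() throws UnsupportedEncodingException {
    super(new ByteArrayOutputStream(), true, "UTF-8");
  }

  @Override
  public void println(String line) {
    lines.add(line);      // capture for the application
    super.println(line);  // still write to the underlying stream
  }

  public List<String> getLines() {
    return lines;
  }
}
```

After the query runs, the application reads `getLines()` instead of parsing console output.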
After this is done, you still cannot run a Hive statement that requires MR. The remaining problem is resolving the UGI conflict, or you will hit all kinds of no-permission exceptions, such as:
[25-11:04:58,499] [ERROR] [main] [hive.ql.Driver] Authorization failed: No privilege 'Select' found for inputs {DATABASE: db, TABLE: tb}. Use show grant to get more details.
Here is a big pit. If you go looking for the cause of this permission exception you will never make sense of the logic: permissions are defined on the HDFS side, logging in to HDFS shows the permission configuration is normal, and the same statement runs fine from the Hive command line. Finally, reading the log from the very beginning, I found a warning:
[25-11:04:53,106] [WARN] [main] [hadoop.security.UserGroupInformation] No groups available for user gdpi
[25-11:04:53,108] [WARN] [main] [hadoop.security.UserGroupInformation] No groups available for user gdpi
And it is warned twice in a row. The only way to find the reason is to read the source; first, locate where this warning message comes from:
public synchronized String[] getGroupNames() {
  ensureInitialized();
  try {
    List<String> result = groups.getGroups(getShortUserName());
    return result.toArray(new String[result.size()]);
  } catch (IOException ie) {
    LOG.warn("No groups available for user " + getShortUserName());
    return new String[0];
  }
}
It is in UserGroupInformation. Tracing one layer up lands in Hive's CliDriver:
SessionState.start(ss);
// execute CLI driver work
int ret = 0;
try {
  ret = executeDriver(ss, conf, oproc);
} catch (Exception e) {
  ss.close();
  throw e;
}
Now see what SessionState.start(ss) does:
try {
  startSs.authenticator = HiveUtils.getAuthenticator(
      startSs.getConf(), HiveConf.ConfVars.HIVE_AUTHENTICATOR_MANAGER);
  startSs.authorizer = HiveUtils.getAuthorizeProviderManager(
      startSs.getConf(), HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER,
      startSs.authenticator);
  startSs.createTableGrants = CreateTableAutomaticGrant.create(startSs
      .getConf());
} catch (HiveException e) {
  throw new RuntimeException(e);
}
HiveUtils.getAuthenticator() reads the class name of the configured authenticator manager and then instantiates it:
if (cls != null) {
  ret = ReflectionUtils.newInstance(cls, conf);
}
It does instantiate the class, but ReflectionUtils.newInstance() also calls:
setConf(result, conf);
public static void setConf(Object theObject, Configuration conf) {
  if (conf != null) {
    if (theObject instanceof Configurable) {
      ((Configurable) theObject).setConf(conf);
    }
    setJobConf(theObject, conf);
  }
}
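This instantiate-then-setConf pattern is what makes the group lookup run as a side effect of mere reflection. A Hadoop-free sketch of the pattern (Config, Configurable, and MiniReflectionUtils here are simplified stand-ins, not the real Hadoop classes):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for Hadoop's Configuration.
class Config {
  final Map<String, String> props = new HashMap<>();
}

// Simplified stand-in for Hadoop's Configurable interface.
interface Configurable {
  void setConf(Config conf);
}

// A class whose setConf has side effects, like HadoopDefaultAuthenticator.
class MyAuthenticator implements Configurable {
  Config conf;
  @Override
  public void setConf(Config conf) {
    this.conf = conf; // side effects (such as a group lookup) would happen here
  }
}

class MiniReflectionUtils {
  // newInstance() both constructs the object and, if it is Configurable,
  // immediately calls setConf() -- so configuration side effects fire
  // during instantiation, before the caller touches the object.
  static <T> T newInstance(Class<T> cls, Config conf) throws Exception {
    T obj = cls.getDeclaredConstructor().newInstance();
    if (obj instanceof Configurable) {
      ((Configurable) obj).setConf(conf);
    }
    return obj;
  }
}
```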
The authenticator manager's default value is org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator, and its setConf is implemented this way:
@Override
public void setConf(Configuration conf) {
  this.conf = conf;
  UserGroupInformation ugi = null;
  try {
    ugi = ShimLoader.getHadoopShims().getUGIForConf(conf);
  } catch (Exception e) {
    throw new RuntimeException(e);
  }
  if (ugi == null) {
    throw new RuntimeException(
        "Can not initialize HadoopDefaultAuthenticator.");
  }
  this.userName = ShimLoader.getHadoopShims().getShortUserName(ugi);
  if (ugi.getGroupNames() != null) {
    this.groupNames = Arrays.asList(ugi.getGroupNames());
  }
}
Sure enough, ugi.getGroupNames() is called twice here: once in the null check and once inside it, which matches the two warnings.
The reason it cannot get the desired user group is that this user does not exist in my environment (see the previous article, "Those Logins of Hadoop UserGroupInformation"). Look again at UGI's method for getting user groups:
public synchronized String[] getGroupNames() {
  ensureInitialized();
  try {
    List<String> result = groups.getGroups(getShortUserName());
    return result.toArray(new String[result.size()]);
  } catch (IOException ie) {
    LOG.warn("No groups available for user " + getShortUserName());
    return new String[0];
  }
}
This depends on org.apache.hadoop.security.Groups#getGroups():
public List<String> getGroups(String user) throws IOException {
  // No need to lookup for groups of static users
  List<String> staticMapping = staticUserToGroupsMap.get(user);
  if (staticMapping != null) {
    return staticMapping;
  }
  // Return cached value if available
  CachedGroups groups = userToGroupsMap.get(user);
  long startMs = Time.monotonicNow();
  // If cache has a value and it hasn't expired
  if (groups != null && (groups.getTimestamp() + cacheTimeout > startMs)) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Returning cached groups for '" + user + "'");
    }
    return groups.getGroups();
  }
  // Create and cache user's groups
  List<String> groupList = impl.getGroups(user);
  long endMs = Time.monotonicNow();
  long deltaMs = endMs - startMs;
  UserGroupInformation.metrics.addGetGroups(deltaMs);
  if (deltaMs > warningDeltaMs) {
    LOG.warn("Potential performance problem: getGroups(user=" + user + ") " +
        "took " + deltaMs + " milliseconds.");
  }
  groups = new CachedGroups(groupList, endMs);
  if (groups.getGroups().isEmpty()) {
    throw new IOException("No groups found for user " + user);
  }
  userToGroupsMap.put(user, groups);
  if (LOG.isDebugEnabled()) {
    LOG.debug("Returning fetched groups for '" + user + "'");
  }
  return groups.getGroups();
}
The key lines are:
// Create and cache user's groups
List<String> groupList = impl.getGroups(user);
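The overall shape of getGroups() is: static mapping first, then an expiring per-user cache, then the pluggable impl. The cache behavior can be sketched in isolation (pure Java, illustrative names of my own, not Hadoop's):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch of the expiring per-user group cache in Groups#getGroups.
class GroupCache {
  static class Entry {
    final List<String> groups;
    final long timestampMs;
    Entry(List<String> groups, long timestampMs) {
      this.groups = groups;
      this.timestampMs = timestampMs;
    }
  }

  private final Map<String, Entry> cache = new HashMap<>();
  private final long cacheTimeoutMs;
  private final Function<String, List<String>> impl; // the pluggable provider

  GroupCache(long cacheTimeoutMs, Function<String, List<String>> impl) {
    this.cacheTimeoutMs = cacheTimeoutMs;
    this.impl = impl;
  }

  List<String> getGroups(String user, long nowMs) {
    Entry e = cache.get(user);
    if (e != null && e.timestampMs + cacheTimeoutMs > nowMs) {
      return e.groups;                      // fresh cached value
    }
    List<String> groups = impl.apply(user); // fetch from the provider
    cache.put(user, new Entry(groups, nowMs));
    return groups;
  }
}
```

Within the timeout the provider is never consulted again, which matters for the reset trick later in this article.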
This impl is initialized when Groups is instantiated:
public Groups(Configuration conf) {
  impl =
    ReflectionUtils.newInstance(
        conf.getClass(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING,
                      ShellBasedUnixGroupsMapping.class,
                      GroupMappingServiceProvider.class),
        conf);
  cacheTimeout =
    conf.getLong(CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_SECS,
        CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_SECS_DEFAULT) * 1000;
  warningDeltaMs =
    conf.getLong(CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_WARN_AFTER_MS,
        CommonConfigurationKeys.HADOOP_SECURITY_GROUPS_CACHE_WARN_AFTER_MS_DEFAULT);
  parseStaticMapping(conf);
  if (LOG.isDebugEnabled())
    LOG.debug("Group mapping impl=" + impl.getClass().getName() +
        "; cacheTimeout=" + cacheTimeout + "; warningDeltaMs=" +
        warningDeltaMs);
}
So there is a pluggable service providing the user-to-groups mapping. Its default in code is org.apache.hadoop.security.ShellBasedUnixGroupsMapping, and in core-site.xml the default value is org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback. Either way, the user groups depend on the current operating system's users, so this has to be hacked too. I implemented one:
public class MyUserGroupsMapping implements GroupMappingServiceProvider {

  @Override
  public List<String> getGroups(String user) throws IOException {
    return Lists.newArrayList(user);
  }

  @Override
  public void cacheGroupsRefresh() throws IOException {
    // does nothing in this provider of user to groups mapping
  }

  @Override
  public void cacheGroupsAdd(List<String> groups) throws IOException {
    // does nothing in this provider of user to groups mapping
  }
}
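Wiring the custom provider in through core-site.xml would look roughly like this (hadoop.security.group.mapping is the real Hadoop property key; the package name is illustrative):

```xml
<property>
  <name>hadoop.security.group.mapping</name>
  <!-- package name is an assumption; use wherever your class actually lives -->
  <value>com.example.MyUserGroupsMapping</value>
</property>
```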
Then core-site.xml is modified accordingly. With that, the MR job Hive needs finally got submitted, but it failed. The logs on the cluster showed the cause: ClassNotFoundException, because my custom MyUserGroupsMapping class does not exist on the cluster. The rescue: first point the configuration at my GroupMappingServiceProvider, and once the local Hive client is ready to submit the MR job, restore the original value:
// set all properties specified via command line
HiveConf conf = ss.getConf();

/* hack start */
// set hadoop.security.group.mapping to a mapping that simply returns the
// user as its own group, because the user does not exist locally
String hadoopSecurityGroupMappingClass =
    conf.get(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING);
conf.setClass(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING,
    MyUserGroupsMapping.class, GroupMappingServiceProvider.class);
console.printInfo("Set HADOOP_SECURITY_GROUP_MAPPING ...");
resetGroupsMapping(conf);
/* hack end */

for (Map.Entry<Object, Object> item : ss.cmdProperties.entrySet()) {
  conf.set((String) item.getKey(), (String) item.getValue());
  ss.getOverriddenConfigurations().put((String) item.getKey(), (String) item.getValue());
}

// read prompt configuration and substitute variables
prompt = conf.getVar(HiveConf.ConfVars.CLIPROMPT);
prompt = new VariableSubstitution().substitute(conf, prompt);
prompt2 = spacesForString(prompt);

SessionState.start(ss);

/* hack start */
// prevent submitting MR to RM with my mapping class as the value;
// restore the old value instead
if (StringUtils.isEmpty(hadoopSecurityGroupMappingClass)) {
  conf.unset(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING);
} else {
  conf.set(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING,
      hadoopSecurityGroupMappingClass);
}
console.printInfo("Reset HADOOP_SECURITY_GROUP_MAPPING ...");
resetGroupsMapping(conf);
/* hack end */

// execute CLI driver work
int ret = 0;
try {
  ret = executeDriver(ss, conf, oproc);
} catch (Exception e) {
  ss.close();
  throw e;
}
ss.close();
There is a trap here: changing conf alone is not enough.
private void resetGroupsMapping(Configuration conf) {
  console.printInfo(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING + ": " +
      conf.get(CommonConfigurationKeys.HADOOP_SECURITY_GROUP_MAPPING));
  Groups.getUserToGroupsMappingServiceWithLoadedConfiguration(conf);
  UserGroupInformation.setConfiguration(conf);
}
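The underlying reason the reset is needed: the mapping service sits behind a static singleton, so later changes to the Configuration object are never consulted until the singleton itself is rebuilt. A minimal sketch of that trap, stripped of Hadoop specifics:

```java
// Sketch of the static-singleton trap: once GROUPS is built from a config,
// later config changes are ignored unless the singleton is explicitly
// rebuilt (which is what the reset call above forces).
class GroupsSingleton {
  private static GroupsSingleton GROUPS;
  final String mappingClass;

  private GroupsSingleton(String mappingClass) {
    this.mappingClass = mappingClass;
  }

  static GroupsSingleton get(String configuredMapping) {
    if (GROUPS == null) {
      GROUPS = new GroupsSingleton(configuredMapping);
    }
    return GROUPS; // the argument is ignored once a cached instance exists
  }

  static GroupsSingleton refresh(String configuredMapping) {
    GROUPS = new GroupsSingleton(configuredMapping); // force rebuild
    return GROUPS;
  }
}
```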
Only then is the cache emptied and the groups re-fetched, and the log shows "Returning fetched groups" again. Done.