Detailed analysis of permission authentication for Hive task submission

I recently encountered a problem while studying Hue: I wrote an HQL query in Hive Editor and got a permission error after submitting it:

Authorization failed:No privilege 'Select' found for inputs {database:xxx, table:xxx, columnName:xxx}. Use show grant to get more details.

The login user of Hue is hadoop, and querying the same table from the Hive CLI works fine. However, when Hue connects to HiveServer2, the corresponding table cannot be queried. To rule out Hue as the cause, I connected to HiveServer2 with Beeline instead, and the same permission error is reported.
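Beeline is itself a JDBC client, so an equivalent minimal JDBC reproduction looks roughly like the following sketch (host, port, database, and table names are placeholders, not from the original setup):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class Hs2Repro {
    public static void main(String[] args) throws Exception {
        // Connect to HiveServer2 as the hadoop user; host and port are placeholders.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hs2-host:10000/default", "hadoop", "");
             Statement stmt = conn.createStatement();
             // The same "No privilege 'Select' found" error is raised by this query.
             ResultSet rs = stmt.executeQuery("SELECT * FROM xxx LIMIT 1")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}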



The stack trace from HiveServer2 is shown in the figure:

[Figure: stack trace of the authorization failure in HiveServer2]


Working back from the stack trace, the relevant source code is summarized below (only the important parts are listed). The permission verification flow when Hive compiles a submitted SQL statement is as follows:

Driver.compile(String command, boolean resetTaskIds) {
    if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED)) {
        try {
            perfLogger.perfLogBegin(LOG, PerfLogger.DO_AUTHORIZATION);
            // perform permission verification
            doAuthorization(sem);
        }
    }
}

Driver.doAuthorization(BaseSemanticAnalyzer sem) {
    // the op operation type here is QUERY
    if (op.equals(HiveOperation.CREATETABLE_AS_SELECT) || op.equals(HiveOperation.QUERY)) {
        if (cols != null && cols.size() > 0) {
            // perform the more fine-grained verification
            ss.getAuthorizer().authorize(tbl, null, cols,
                op.getInputRequiredPrivileges(), null);
        }
    }
}

BitSetCheckedAuthorizationProvider.authorize(Table table, Partition part, List<String> columns,
        Privilege[] inputRequiredPriv, Privilege[] outputRequiredPriv) {
    // verify the user, database, and table privileges
    authorizeUserDBAndTable(table, inputRequiredPriv, outputRequiredPriv, inputCheck, outputCheck);
    // verify the privileges on the columns of the table
    for (String col : columns) {
        PrincipalPrivilegeSet partColumnPrivileges = hive_db.get_privilege_set(
            HiveObjectType.COLUMN, table.getDbName(), table.getTableName(), partValues, col,
            this.getAuthenticator().getUserName(), this.getAuthenticator().getGroupNames());
        authorizePrivileges(partColumnPrivileges, inputRequiredPriv, inputCheck2,
            outputRequiredPriv, outputCheck2);
    }
}

Hive permission verification first calls authorizeUserDBAndTable to check whether the user has access to the DB and Table being queried; these correspond to the DB_PRIVS and TBL_PRIVS tables in the MetaStore. During verification, the driver talks to the HiveMetaStore process over Thrift to fetch the relevant rows from the MetaStore database. If the user already holds the required privilege at a coarser granularity, the check returns immediately and skips the finer-grained checks; that is, if the user has the relevant privilege on the database, the Table and Column privileges are not verified at all.
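This coarse-to-fine short circuit can be illustrated with a small self-contained sketch. It is not the actual Hive source (which works on PrincipalPrivilegeSet objects fetched from the MetaStore), just the check order in miniature, with invented privilege sets:

import java.util.Set;

public class CoarseToFineCheck {
    // Returns true as soon as a coarser-grained grant covers the required privilege,
    // mirroring the order user -> database -> table used by authorizeUserDBAndTable.
    static boolean authorized(Set<String> userPrivs, Set<String> dbPrivs,
                              Set<String> tablePrivs, String required) {
        if (userPrivs.contains(required)) return true;   // global/user-level grant
        if (dbPrivs.contains(required))   return true;   // DB_PRIVS: table/column never checked
        return tablePrivs.contains(required);            // TBL_PRIVS
    }

    public static void main(String[] args) {
        // hadoop has Select on the database, so the check stops at the DB level.
        System.out.println(authorized(Set.of(), Set.of("Select"), Set.of(), "Select")); // true
    }
}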

The DB_PRIVS table shows that the hadoop user does have the Select privilege on the database being accessed, which is why queries work in traditional CLI mode. That is consistent with the code above, since permission verification in CLI mode and HiveServer2 mode goes through the same code path. Remote debugging revealed that the value of this.getAuthenticator().getUserName() is hive, the user that started HiveServer2, rather than hadoop, the user who submitted the SQL. The code that sets up the authenticator is as follows:

SessionState.start(SessionState startSs) {
    // instantiate the default HadoopDefaultAuthenticator; when HiveUtils loads the class
    // via ReflectionUtils reflection, HadoopDefaultAuthenticator.setConf() is called
    startSs.authenticator = HiveUtils.getAuthenticator(startSs.getConf(),
        HiveConf.ConfVars.HIVE_AUTHENTICATOR_MANAGER);
}

HadoopDefaultAuthenticator.setConf(Configuration conf) {
    ugi = ShimLoader.getHadoopShims().getUGIForConf(conf);
}

HadoopShimsSecure.getUGIForConf(Configuration conf) throws IOException {
    return UserGroupInformation.getCurrentUser();
}

UserGroupInformation.getCurrentUser() throws IOException {
    AccessControlContext context = AccessController.getContext();
    Subject subject = Subject.getSubject(context);
    // when HiveServer is started the subject is empty, so getLoginUser() is called
    if (subject == null || subject.getPrincipals(User.class).isEmpty()) {
        return getLoginUser();
    } else {
        return new UserGroupInformation(subject);
    }
}

UserGroupInformation.getLoginUser() {
    if (loginUser == null) {
        try {
            Subject subject = new Subject();
            LoginContext login;
            if (isSecurityEnabled()) {
                login = newLoginContext(HadoopConfiguration.USER_KERBEROS_CONFIG_NAME,
                    subject, new HadoopConfiguration());
            } else {
                login = newLoginContext(HadoopConfiguration.SIMPLE_CONFIG_NAME,
                    subject, new HadoopConfiguration());
            }
            login.login();
            loginUser = new UserGroupInformation(subject);
            loginUser.setLogin(login);
            loginUser.setAuthenticationMethod(isSecurityEnabled()
                ? AuthenticationMethod.KERBEROS : AuthenticationMethod.SIMPLE);
            loginUser = new UserGroupInformation(login.getSubject());
            String fileLocation = System.getenv(HADOOP_TOKEN_FILE_LOCATION);
            if (fileLocation != null) {
                Credentials cred = Credentials.readTokenStorageFile(new File(fileLocation), conf);
                loginUser.addCredentials(cred);
            }
            loginUser.spawnAutoRenewalThreadForUserCreds();
        } catch (LoginException le) {
            LOG.debug("failure to login", le);
            throw new IOException("failure to login", le);
        }
        if (LOG.isDebugEnabled()) {
            LOG.debug("UGI loginUser:" + loginUser);
        }
    }
    return loginUser;
}

When HiveServer2 is started, getLoginUser() is called for the first time. Since loginUser is null, a LoginContext is created and its login() method is called; login() eventually calls the commit() method of HadoopLoginModule. The general logic of commit() is as follows (see the sketch after the list):

1. If Kerberos is used, the user is the Kerberos login user.

2. If the Kerberos user is empty and security is not enabled, the value of HADOOP_USER_NAME is read from the environment variables.

3. If HADOOP_USER_NAME is not set in the environment either, the system user is used, i.e. the user who started the HiveServer2 process.
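Put together, the user that commit() ends up with can be approximated by the following sketch. It is a simplification, not the actual HadoopLoginModule code, and kerberosPrincipal stands in for whatever a Kerberos login would have produced:

public class LoginUserResolution {
    // Approximate resolution order used by HadoopLoginModule.commit():
    // Kerberos principal -> HADOOP_USER_NAME environment variable -> OS user of the process.
    static String resolveLoginUser(String kerberosPrincipal, boolean securityEnabled) {
        if (securityEnabled && kerberosPrincipal != null) {
            return kerberosPrincipal;                        // 1. Kerberos login user
        }
        String envUser = System.getenv("HADOOP_USER_NAME");  // 2. environment variable
        if (envUser != null && !envUser.isEmpty()) {
            return envUser;
        }
        return System.getProperty("user.name");              // 3. user that started the JVM (HiveServer2)
    }

    public static void main(String[] args) {
        // With security off and HADOOP_USER_NAME unset, this prints the process owner,
        // e.g. "hive" when evaluated inside HiveServer2.
        System.out.println(resolveLoginUser(null, false));
    }
}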

Subsequent calls simply return this cached login user, which is the user that started HiveServer2, so the UserName attribute of the authenticator ends up as hive. When the MetaStore privilege tables are then queried with the user hive, no matching grants are found and authorization fails, unless privileges have been granted to the hive user. So there are two options: either grant the relevant privileges to the hive user, which makes permission verification largely meaningless, or, better, provide a custom hive.security.authenticator.manager implementation that performs verification based on the user who actually submitted the SQL.
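A minimal sketch of such an authenticator is shown below. It assumes a Hive version in which SessionState carries the session user via SessionState.get().getUserName() (newer Hive releases ship a similar built-in class, SessionStateUserAuthenticator); the package and class names here are illustrative:

package com.example.hive.security;  // illustrative package name

import org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator;
import org.apache.hadoop.hive.ql.session.SessionState;

/**
 * Resolves the user from the HiveServer2 session instead of from the
 * UserGroupInformation of the HiveServer2 process, so authorization is
 * checked against the user who submitted the SQL (e.g. hadoop), not hive.
 */
public class SessionUserAuthenticator extends HadoopDefaultAuthenticator {

    @Override
    public String getUserName() {
        SessionState ss = SessionState.get();
        if (ss != null && ss.getUserName() != null) {
            return ss.getUserName();        // user who opened the HiveServer2 session
        }
        return super.getUserName();         // fall back to the process user
    }
}

It would then be configured by pointing hive.security.authenticator.manager at this class in hive-site.xml. Note that the inherited getGroupNames() still reflects the process user, so group-based grants may need similar treatment.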
