hadoop.job.ugi no longer takes effect as of Cloudera CDH3b3

Source: Internet
Author: User
Tags: kinit


After several days, I finally found the cause. In the past, the company used vanilla hadoop-0.20.2; setting hadoop.job.ugi from Java to the correct Hadoop user and group was enough to access HDFS normally and to create and delete files.
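For reference, a minimal sketch of that old, now-defunct approach (the user and group names are placeholders, and the snippet assumes the Hadoop 0.20-era client libraries on the classpath):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

// Pre-CDH3b3 style: choose the effective user/group by setting hadoop.job.ugi.
// From CDH3b3 onward this setting is silently ignored.
Configuration conf = new Configuration();
conf.set("hadoop.job.ugi", "hadoopuser,hadoopgroup"); // format: "user,group1,group2,..."
FileSystem fs = FileSystem.get(conf); // before CDH3b3, this acted as hadoopuser
```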

After upgrading to CDH3b4, this stopped working. After searching through a great deal of documentation to no avail, I finally found the relevant note:

The hadoop.job.ugi configuration no longer has any effect. Instead, please use the UserGroupInformation.doAs API to impersonate other users on a non-secured cluster. (As of CDH3b3)

In other words: hadoop.job.ugi is no longer effective; use UserGroupInformation.doAs instead when the cluster is not secured.


Incompatible changes:


  • The TaskTracker configuration parameter mapreduce.tasktracker.local.cache.numberdirectories has been renamed mapreduce.tasktracker.cache.local.numberdirectories. (As of CDH3u0)
  • The job-level configuration parameters mapred.max.maps.per.node, mapred.max.reduces.per.node, mapred.running.map.limit, and mapred.running.reduce.limit have been removed. (As of CDH3b4)
  • CDH3 no longer contains packages for Debian Lenny or Ubuntu Hardy, Jaunty, or Karmic. Check the upgrade instructions if you are using an Ubuntu release past its end of life. If you are using a release for which Cloudera's Debian or RPM packages are not available, you can always use the tarballs from the CDH download page. (As of CDH3b4)
  • The hadoop.job.ugi configuration no longer has any effect. Instead, please use the UserGroupInformation.doAs API to impersonate other users on a non-secured cluster. (As of CDH3b3)
  • The UnixUserGroupInformation class has been removed. Please see the new methods in the UserGroupInformation class. (As of CDH3b3)
  • The resolution of groups for a user is now performed on the server side. For a user's group membership to take effect, it must be visible on the NameNode and JobTracker machines. (As of CDH3b3)
  • The mapred.tasktracker.procfsbasedprocesstree.sleeptime-before-sigkill configuration has been renamed mapred.tasktracker.tasks.sleeptime-before-sigkill. (As of CDH3b3)
  • The HDFS and MapReduce daemons no longer run as a single shared hadoop user. Instead, the HDFS daemons run as hdfs and the MapReduce daemons run as mapred. See Changes in User Accounts and Groups in CDH3. (As of CDH3b3)
  • Due to a change in the internal compression APIs, CDH3 is incompatible with versions of the hadoop-lzo open source project prior to 0.4.9. (As of CDH3b3)
  • CDH3 changes the wire format for Hadoop's RPC mechanism. Thus, you must upgrade any existing client software at the same time the cluster is upgraded. (All versions)
  • Zero values for the dfs.socket.timeout and dfs.datanode.socket.write.timeout configuration parameters are now respected. Previously, zero values for these parameters resulted in a 5-second timeout. (As of CDH3u1)
  • When Hadoop's Kerberos integration is enabled, it is now required that either kinit be on the path for user accounts running the Hadoop client, or that the hadoop.kerberos.kinit.command configuration option be manually set to the absolute path of kinit. (As of CDH3u1)
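For the last item, the kinit location can be pinned in core-site.xml with an entry like the following (the path is only an example; point it at wherever kinit is actually installed):

```
<property>
  <name>hadoop.kerberos.kinit.command</name>
  <value>/usr/kerberos/bin/kinit</value>
</property>
```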
Hive
  • The upgrade of Hive from CDH2 to CDH3 requires several manual steps. Please be sure to follow the upgrade guide closely. See Upgrading Hive and Hue in CDH3.
Address: https://ccp.cloudera.com/display/cdhdoc/incompatible#changes

So how do you use UserGroupInformation.doAs?
For example, the superuser oozie wants to access HDFS on behalf of the user joe. The following code runs as oozie, but the actual access is performed as joe:
......
             UserGroupInformation ugi =
                 UserGroupInformation.createProxyUser(user, UserGroupInformation.getLoginUser());
             ugi.doAs(new PrivilegedExceptionAction<Void>() {
               public Void run() throws Exception {
                 // Submit a job
                 JobClient jc = new JobClient(conf);
                 jc.submitJob(conf);
                 // OR access HDFS
                 FileSystem fs = FileSystem.get(conf);
                 fs.mkdirs(someFilePath);
                 return null;
               }
             });

Configure the NameNode and JobTracker as follows:

             <property>
               <name>hadoop.proxyuser.oozie.groups</name>
               <value>group1,group2</value>
               <description>Allow the superuser oozie to impersonate any members of the groups group1 and group2</description>
             </property>
             <property>
               <name>hadoop.proxyuser.oozie.hosts</name>
               <value>host1,host2</value>
               <description>The superuser can connect only from host1 and host2 to impersonate a user</description>
             </property>

Without this configuration, impersonation will not succeed.


Caveats

The superuser must have Kerberos credentials to be able to impersonate another user; it cannot use delegation tokens for this feature. It would be wrong if the superuser added its own delegation token to the proxy user's UGI, as that would allow the proxy user to connect to the service with the privileges of the superuser.

However, if the superuser does want to give a delegation token to joe, it must first impersonate joe and get a delegation token for joe, in the same way as in the code example above, and add it to joe's UGI. In this way, the delegation token will have joe as its owner.
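That flow can be sketched roughly as follows (a sketch, not official CDH code: it assumes the superuser is already logged in, that conf is a final Configuration in scope, and that the FileSystem in use supports getDelegationToken; the renewer argument "joe" is an example value):

```java
// Impersonate joe first, then fetch the delegation token *as joe*,
// so the token's owner is joe rather than the superuser.
UserGroupInformation joeUgi =
    UserGroupInformation.createProxyUser("joe", UserGroupInformation.getLoginUser());
Token<?> joeToken = joeUgi.doAs(new PrivilegedExceptionAction<Token<?>>() {
  public Token<?> run() throws Exception {
    FileSystem fs = FileSystem.get(conf);
    return fs.getDelegationToken("joe"); // token owner: joe
  }
});
joeUgi.addToken(joeToken); // attach to joe's UGI, not the superuser's
```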

For more information about secure impersonation using UserGroupInformation.doAs, see http://hadoop.apache.org/common/docs/stable/Secure_Impersonation.html.
As mentioned above, for Java code to access Hadoop and operate normally, Kerberos authentication must be configured and the UserGroupInformation.doAs method used.
If this is not done, must the application be run as the hadoop user?!
