Use Hadoop ACL to control access permissions

Source: Internet
Author: User
Tags: hdfs, dfs

I. HDFS Access Control

Enable permission checking and ACLs in hdfs-site.xml:

<property>
  <name>dfs.permissions.enabled</name>
  <value>true</value>
</property>

<property>
  <name>dfs.namenode.acls.enabled</name>
  <value>true</value>
</property>


Set the default umask for new files and directories in core-site.xml:

<property>
  <name>fs.permissions.umask-mode</name>
  <value>002</value>
</property>
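With a umask of 002, new directories default to mode 775 and new files to 664 (group-writable, world-readable). A quick illustration, assuming an HDFS client is on the PATH; the path is only an example:

```shell
# With fs.permissions.umask-mode = 002, HDFS strips only the
# "other" write bit from the default modes:
#   directories: 777 & ~002 = 775  (drwxrwxr-x)
#   files:       666 & ~002 = 664  (-rw-rw-r--)
hdfs dfs -mkdir /tmp/umask-demo
hdfs dfs -ls -d /tmp/umask-demo
```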


The requirements and solutions are as follows:

1. Apart from the data warehouse owner, ordinary users must not create databases, or tables in the default database.
Change the permissions of /user/hive/warehouse to 755. With the owner set to hadoop (or whoever owns the warehouse), nobody else can create a database, or create a table in the default database.

2. After the data warehouse owner creates a database, it is assigned to a project team, which can then create its own tables.
Change the owner of /user/hive/warehouse/database.db to the project team.

3. The data warehouse owner creates a database but does not grant the project team permission to create tables; instead, the owner creates the tables and only allows the team to insert partitions.
The data warehouse owner keeps ownership of /user/hive/warehouse/database.db, so the project team cannot create tables. After the owner creates a table for the team, the table directory is chowned to the team.

4. Some tables may only be read and written by the project team.
Change the /user/hive/warehouse/database.db/<table name> directory to 770.

5. Some tables may only be read and written by one specific user in the project team.
Change the owner of the /user/hive/warehouse/database.db/<table name> directory to that user and the permissions to 700.

6. A specific user from another group needs to insert data into one of the project team's tables.
Grant that user write access with an ACL; for example, give user mapengxu write permission on table testp1 in database cdntest.db:

hdfs dfs -setfacl -R -m user:mapengxu:rwx /user/hive/warehouse/cdntest.db/testp1

7. A specific user from another group needs permission to read one of the project team's tables.

hdfs dfs -setfacl -R -m user:mapengxu:r-x /user/hive/warehouse/cdntest.db/testp1

8. All users in another group need permission to read one of the project team's tables.

hdfs dfs -setfacl -R -m group:data_sum:r-x /user/hive/warehouse/cdntest.db/testp1

9. Create a shared database in which every user can create tables, but tables are kept for only 30 days.
Change the permissions of /user/hive/warehouse/database.db to 777, and set up a scheduled task that scans this directory and the Hive database; any table older than 30 days is dropped and its directory deleted.

10. Combine these measures with basic SQL-level access control.
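The chmod/chown steps behind requirements 1-5 can be sketched as follows; the database, table, user, and group names (hadoop, cdn, mapengxu, cdntest.db, testp1) follow the article's own examples and should be replaced with your own:

```shell
# 1. Lock down the warehouse root so only the owner can create databases.
hdfs dfs -chown hadoop /user/hive/warehouse
hdfs dfs -chmod 755 /user/hive/warehouse

# 2. Hand a database directory to a project team so it can create tables.
hdfs dfs -chown -R cdn:cdn /user/hive/warehouse/cdntest.db

# 4. Restrict a table to the owning project team (group read/write).
hdfs dfs -chmod -R 770 /user/hive/warehouse/cdntest.db/testp1

# 5. Restrict a table to a single user.
hdfs dfs -chown -R mapengxu /user/hive/warehouse/cdntest.db/testp1
hdfs dfs -chmod -R 700 /user/hive/warehouse/cdntest.db/testp1

# Verify the resulting permissions and ACL entries on a table directory.
hdfs dfs -getfacl /user/hive/warehouse/cdntest.db/testp1
```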


II. Task Scheduling
Queues are managed by user group, with permissions unified across the portal and Jenkins. Resources are allocated per group, which makes it easy to report the cluster resources each project team uses per day and per week. mapred-site.xml is configured as follows:
<property>
  <name>mapred.acls.enabled</name>
  <value>true</value>
</property>
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>group.name</value>
</property>

fair-scheduler.xml is configured as follows:

<?xml version="1.0"?>
<allocations>

  <pool name="cdn">
    <maxResources>1000 vcores</maxResources>
    <maxRunningJobs>10</maxRunningJobs>
    <weight>1.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
  </pool>
  <pool name="data_sum">
    <maxResources>1000 vcores</maxResources>
    <maxRunningJobs>10</maxRunningJobs>
    <weight>1.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
  </pool>

  <userMaxAppsDefault>2</userMaxAppsDefault>

  <queuePlacementPolicy>
    <rule name="primaryGroup" create="false"/>
    <rule name="secondaryGroupExistingQueue" create="false"/>
    <rule name="user" create="false"/>
    <rule name="reject"/>
  </queuePlacementPolicy>
</allocations>
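With mapred.fairscheduler.poolnameproperty set to group.name, the placement rules route each job to the pool named after the submitter's primary group, falling back to a secondary group that already has a queue, then to a per-user queue, and otherwise rejecting the job. A hedged sketch of checking the routing and pinning a job to a pool explicitly; the example jar and input/output paths are placeholders:

```shell
# The placement rules route a job by the submitting user's primary group.
id -gn    # e.g. prints "cdn" -> the job lands in the "cdn" pool

# A job can also be pinned to a pool explicitly
# (jar name and paths below are placeholders):
hadoop jar hadoop-examples.jar wordcount \
  -Dmapred.fairscheduler.pool=cdn \
  /input /output
```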

