How-to: Enable User Authentication and Authorization in Apache HBase

With the default Apache HBase configuration, everyone is allowed to read from and write to all tables available in the system. For many enterprise setups, this kind of policy is unacceptable.

Administrators can set up firewalls that decide which machines are allowed to communicate with HBase. However, machines that can pass the firewall are still allowed to read from and write to all tables. This kind of mechanism is effective but insufficient, because HBase still cannot differentiate between multiple users that use the same client machines, and there is still no granularity with regard to HBase table, column family, or column qualifier access.

In this post, we'll discuss how Kerberos is used with Hadoop and HBase to provide user authentication, and how HBase implements user authorization to grant users permissions for particular actions on a specified set of data.

Secure HBase: Authentication & Authorization

A secure HBase aims to protect against sniffers, unauthenticated/unauthorized users and network-based attacks. It does not protect against authorized users who accidentally delete all the data.

HBase can be configured to provide user authentication, which ensures that only authorized users can communicate with HBase. The authentication system is implemented at the RPC level, and is based on the Simple Authentication and Security Layer (SASL), which supports (among other authentication mechanisms) Kerberos. SASL allows authentication, encryption negotiation and/or message integrity verification on a per-connection basis (see the "hbase.rpc.protection" configuration property).

The next step after enabling user authentication is to give an admin the ability to define a series of user authorization rules that allow or deny particular actions. The authorization system, also known as the Access Controller coprocessor or Access Control List (ACL), is available from HBase 0.92 (CDH4) onward and gives the ability to define an authorization policy (read/write/create/admin), with table/family/qualifier granularity, for a specified user.

Kerberos

Kerberos is a network authentication protocol. It is designed to provide strong authentication for client/server applications by using secret-key cryptography. The Kerberos protocol uses strong cryptography (AES, 3DES, ...) so that a client can prove its identity to a server (and vice versa) across an insecure network connection. After a client and server have used Kerberos to prove their identities, they can also encrypt all of their communications to assure privacy and data integrity as they go about their business.

Ticket Exchange Protocol

At a high level, to access a service using Kerberos, each client must follow three steps:

    • Kerberos Authentication: The client authenticates itself to the Kerberos Authentication Server and receives a Ticket Granting Ticket (TGT).
    • Kerberos Authorization: The client requests a service ticket from the Ticket Granting Server, which issues a ticket and a session key if the client TGT sent with the request is valid.
    • Service Request: The client uses the service ticket to authenticate itself to the server that is providing the service the client is using (e.g. HDFS, HBase, ...).
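The three steps above can be sketched as a toy model. This is not real Kerberos: tickets here are JSON payloads signed with an HMAC instead of being encrypted, the client sends its secret directly, and the session key is left out. The names ToyKdc, ToyService, seal and unseal are made up for this illustration.

```python
import hashlib
import hmac
import json
import os
import time


def seal(key: bytes, payload: dict) -> dict:
    """Toy ticket: a JSON body plus an HMAC tag made with the issuer's key."""
    body = json.dumps(payload, sort_keys=True)
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def unseal(key: bytes, ticket: dict) -> dict:
    """Check the tag and the expiry time, then return the ticket claims."""
    expected = hmac.new(key, ticket["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, ticket["tag"]):
        raise PermissionError("ticket was not issued with this key")
    claims = json.loads(ticket["body"])
    if claims["expires"] < time.time():
        raise PermissionError("ticket expired")
    return claims


class ToyKdc:
    """Plays both the Authentication Server and the Ticket Granting Server."""

    def __init__(self):
        self.tgs_key = os.urandom(32)
        self.user_secrets = {}   # principal -> secret bytes
        self.service_keys = {}   # service name -> key shared with that service

    def authenticate(self, principal: str, secret: bytes) -> dict:
        # Step 1: the client proves who it is and receives a TGT.
        stored = self.user_secrets.get(principal)
        if stored is None or not hmac.compare_digest(stored, secret):
            raise PermissionError("unknown principal or wrong secret")
        return seal(self.tgs_key, {"principal": principal,
                                   "expires": time.time() + 600})

    def grant_service_ticket(self, tgt: dict, service: str) -> dict:
        # Step 2: a valid TGT is exchanged for a ticket bound to one service.
        claims = unseal(self.tgs_key, tgt)
        return seal(self.service_keys[service],
                    {"principal": claims["principal"], "service": service,
                     "expires": time.time() + 300})


class ToyService:
    """A service (e.g. HBase) that only trusts tickets sealed with its key."""

    def __init__(self, name: str, key: bytes):
        self.name = name
        self.key = key

    def handle(self, ticket: dict) -> str:
        # Step 3: the service validates the ticket and learns the caller's name.
        claims = unseal(self.key, ticket)
        if claims.get("service") != self.name:
            raise PermissionError("ticket is for another service")
        return claims["principal"]
```

A client would call authenticate once, then grant_service_ticket for each service it wants to reach, and finally present that ticket to the service. Presenting the TGT itself to the service fails, because the service does not hold the ticket granting key.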

HBase, HDFS, ZooKeeper SASL

Since HBase depends on HDFS and ZooKeeper, secure HBase relies on a secure HDFS and a secure ZooKeeper. This means that the HBase servers need to create a secure service session, as described above, to communicate with HDFS and ZooKeeper.

All the files written by HBase are stored in HDFS. As in Unix filesystems, the access control provided by HDFS is based on users, groups and permissions. All of the files created by HBase have "hbase" as the user, but this access control is based on the username provided by the system, and everyone who can access the machine is potentially able to "sudo" as the user "hbase". Secure HDFS adds the authentication steps that guarantee that the "hbase" user is trusted.

ZooKeeper has an Access Control List (ACL) on each znode that allows read/write access to users based on user information, in a similar manner to HDFS.

HBase ACL

Now that our users are authenticated via Kerberos, we are sure that the username we received is one of our trusted users. Sometimes this is not enough granularity; we want to control whether a specified user is able to read or write a table. To do that, HBase provides an authorization mechanism, which allows restricted access for specified users.

To enable this feature, you must enable the Access Controller coprocessor by adding it to hbase-site.xml under the master and region server coprocessor classes. (See how to set up the HBase security configuration here.)
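As a rough sketch of what that configuration can look like, the short script below renders an hbase-site.xml fragment. The property names and the AccessController class name are given as I recall them from the HBase reference guide and should be checked against the documentation for your HBase version; a full secure setup needs further properties that are not shown here. The helper render_properties is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Class name of the Access Controller coprocessor (verify for your version).
ACCESS_CONTROLLER = "org.apache.hadoop.hbase.security.access.AccessController"

# Assumed minimal set of authorization properties; not a complete secure setup.
SECURITY_PROPERTIES = {
    "hbase.security.authorization": "true",
    "hbase.coprocessor.master.classes": ACCESS_CONTROLLER,
    "hbase.coprocessor.region.classes": ACCESS_CONTROLLER,
}


def render_properties(properties: dict) -> str:
    """Render name/value pairs in the Hadoop-style <configuration> layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    if hasattr(ET, "indent"):  # pretty-printing is only available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


print(render_properties(SECURITY_PROPERTIES))
```

The printed fragment would be merged into the existing hbase-site.xml on the master and on every region server, followed by a restart of those processes.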

A coprocessor is code that runs inside each HBase region server and/or master. It is able to intercept most operations (put, get, delete, ...), and run arbitrary code before and/or after the operation is executed.

Using this ability to execute some code before each operation, the Access Controller coprocessor can check the user rights and decide whether the user can or cannot execute the operation.
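The interception idea can be illustrated with a small model of "pre" hooks. This is only a sketch of the pattern, not the HBase coprocessor API: ToyRegion, make_access_hook and the REQUIRED_RIGHT mapping are invented for the example.

```python
from typing import Callable, Dict, List, Optional, Set


class AccessDenied(Exception):
    """Raised by a hook to veto an operation before it runs."""


class ToyRegion:
    """A tiny key/value store that calls every registered hook before an operation."""

    def __init__(self) -> None:
        self.rows: Dict[str, str] = {}
        self.pre_hooks: List[Callable[[str, str, str], None]] = []

    def _run_pre_hooks(self, user: str, op: str, row: str) -> None:
        for hook in self.pre_hooks:
            hook(user, op, row)  # a hook blocks the operation by raising

    def put(self, user: str, row: str, value: str) -> None:
        self._run_pre_hooks(user, "put", row)
        self.rows[row] = value

    def get(self, user: str, row: str) -> Optional[str]:
        self._run_pre_hooks(user, "get", row)
        return self.rows.get(row)

    def delete(self, user: str, row: str) -> None:
        self._run_pre_hooks(user, "delete", row)
        self.rows.pop(row, None)


# Assumed mapping from operation to the right it needs (R = read, W = write).
REQUIRED_RIGHT = {"get": "R", "put": "W", "delete": "W"}


def make_access_hook(rights: Dict[str, Set[str]]) -> Callable[[str, str, str], None]:
    """Build a hook that rejects users who lack the right for the operation."""
    def hook(user: str, op: str, row: str) -> None:
        if REQUIRED_RIGHT[op] not in rights.get(user, set()):
            raise AccessDenied(f"{user} may not {op} {row}")
    return hook
```

Because the check runs before the operation body, a denied put never reaches the stored data, which mirrors the role the Access Controller plays in front of each request.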

Rights Management and _acl_ table

The HBase shell has a couple of commands that allow an admin to manage the user rights:

    • grant <user> <rights> [table] [family] [qualifier]
    • revoke <user> [table] [family] [qualifier]

As you can see, an admin has the ability to restrict user access based on the table schema:

    • Give User-W only read rights to Table-X/Family-Y ( grant 'User-W', 'R', 'Table-X', 'Family-Y' )
    • Give User-W the full read/write rights to Qualifier-Z ( grant 'User-W', 'RW', 'Table-X', 'Family-Y', 'Qualifier-Z' )

An admin also has the ability to grant global rights, which operate at the cluster level, such as creating tables, balancing regions, shutting down the cluster and so on:

    • Give User-W the ability to create tables ( grant 'User-W', 'C' )
    • Give User-W the ability to manage the cluster ( grant 'User-W', 'A' )

All the permissions are stored in a table created by the Access Controller coprocessor, called _acl_. The primary key of this table is the table name that you specify in the grant command. The _acl_ table has just one column family, and each qualifier describes the granularity of rights for a particular table/user. The value contains the actual rights granted.

As you can see, the HBase shell commands are tightly related to how the data is stored. The grant command adds or updates one row, and the revoke command removes one row from the _acl_ table.
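To make that layout concrete, here is a minimal in-memory model of the table. It is a sketch under stated assumptions rather than the actual storage format: the comma-joined qualifier encoding, the GLOBAL_ROW key used for cluster-level grants, and the rule that a broader grant also covers narrower scopes are all choices made for this illustration.

```python
from typing import Dict, Optional

GLOBAL_ROW = "_acl_"  # hypothetical row key used here for cluster-level grants


class ToyAclTable:
    """rows[table name][qualifier] = rights string such as "R" or "RW"."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, str]] = {}

    @staticmethod
    def _qualifier(user: str, family: Optional[str] = None,
                   qualifier: Optional[str] = None) -> str:
        # The qualifier encodes how narrow the grant is: user, then family, then column.
        parts = [user]
        if family is not None:
            parts.append(family)
            if qualifier is not None:
                parts.append(qualifier)
        return ",".join(parts)

    def grant(self, user: str, rights: str, table: Optional[str] = None,
              family: Optional[str] = None, qualifier: Optional[str] = None) -> None:
        row = table or GLOBAL_ROW
        self.rows.setdefault(row, {})[self._qualifier(user, family, qualifier)] = rights

    def revoke(self, user: str, table: Optional[str] = None,
               family: Optional[str] = None, qualifier: Optional[str] = None) -> None:
        row = table or GLOBAL_ROW
        cells = self.rows.get(row, {})
        cells.pop(self._qualifier(user, family, qualifier), None)
        if not cells:
            self.rows.pop(row, None)  # drop the row once its last entry is gone

    def is_allowed(self, user: str, right: str, table: str,
                   family: Optional[str] = None, qualifier: Optional[str] = None) -> bool:
        # Look from the broadest scope (global) down to the narrowest (qualifier).
        scopes = [(GLOBAL_ROW, self._qualifier(user)), (table, self._qualifier(user))]
        if family is not None:
            scopes.append((table, self._qualifier(user, family)))
            if qualifier is not None:
                scopes.append((table, self._qualifier(user, family, qualifier)))
        return any(right in self.rows.get(row, {}).get(q, "") for row, q in scopes)
```

With this model, grant('User-W', 'R', 'Table-X', 'Family-Y') stores a single entry under the row Table-X, and a read of any qualifier in Family-Y is allowed while a write is not.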

Access Controller under the hood

As mentioned previously, the Access Controller coprocessor uses the ability to intercept each user request, and checks whether the user has the rights to execute the operation.

For each operation, the Access Controller needs to query the _acl_ table to see if the user has the rights to execute the operation.

However, this operation can have a negative impact on performance. The solution to this problem is to use the _acl_ table for persistence and ZooKeeper to speed up the rights lookup. Each region server loads the _acl_ table in memory and gets notified of changes by the ZKPermissionWatcher. In this way, every region server has the updated value at all times, and each permission check is performed by using an in-memory map.
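The persist-then-notify pattern described above can be sketched as follows. This is a toy model with invented names (ToyCoordinator, ToyRegionServer, grant); it uses plain callbacks in place of ZooKeeper watches and ignores failures, ordering and reconnects.

```python
from typing import Callable, Dict, List

Permissions = Dict[str, str]  # user -> rights string, for one table


class ToyCoordinator:
    """Stands in for ZooKeeper: forwards each change to every registered watcher."""

    def __init__(self) -> None:
        self._watchers: List[Callable[[str, Permissions], None]] = []

    def watch(self, callback: Callable[[str, Permissions], None]) -> None:
        self._watchers.append(callback)

    def publish(self, table: str, permissions: Permissions) -> None:
        for callback in self._watchers:
            callback(table, dict(permissions))  # each watcher gets its own copy


class ToyRegionServer:
    """Loads the ACL table once, then answers checks from its in-memory map."""

    def __init__(self, acl_table: Dict[str, Permissions],
                 coordinator: ToyCoordinator) -> None:
        self.cache = {table: dict(perms) for table, perms in acl_table.items()}
        coordinator.watch(self._on_change)

    def _on_change(self, table: str, permissions: Permissions) -> None:
        if permissions:
            self.cache[table] = permissions
        else:
            self.cache.pop(table, None)

    def is_allowed(self, user: str, right: str, table: str) -> bool:
        # No table read here: the check only touches the local map.
        return right in self.cache.get(table, {}).get(user, "")


def grant(acl_table: Dict[str, Permissions], coordinator: ToyCoordinator,
          user: str, rights: str, table: str) -> None:
    acl_table.setdefault(table, {})[user] = rights  # persist the change first
    coordinator.publish(table, acl_table[table])    # then notify every server
```

The design trades a small propagation delay after each grant for permission checks that never leave the region server's memory.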

Roadmap

While Kerberos is a stable, well-tested and proven authentication system, the HBase ACL feature is still very basic and its semantics are still evolving. HBASE-6096 is the umbrella JIRA to reference for all the improvements to ship in a v2 of the ACL feature.

Another open topic on authorization and access control is implementing a per-KeyValue security system (HBASE-6222) that would give the ability to have different values on the same cell, each associated with a security tag. That would allow showing a particular piece of information based on the user's permissions.

Conclusion

HBase security adds extra features that allow you to protect your data against sniffers or other network attacks (by using Kerberos to authenticate users and encrypt communications between services), and allow you to define user authorization policies, restrict operations, and limit data visibility for particular users.

Original address: http://blog.cloudera.com/blog/2012/09/understanding-user-authentication-and-authorization-in-apache-hbase/
