Monitor and audit access rights for IBM InfoSphere BigInsights and Cloudera Hadoop

http://www.ithov.com/server/124456.shtml

You will also learn about a quick-start monitoring implementation that applies only to IBM InfoSphere BigInsights.

The buzz around big data has focused on infrastructure that supports extremes of volume, velocity, and variety, and on the real-time analytics capabilities that this infrastructure enables. While big data environments such as Hadoop are relatively new, the truth is that the key data security issues in a big data environment already existed before it. Where there is data, there is the potential for privacy breaches, unauthorized access, or inappropriate access by privileged users.

The same compliance requirements should be enforced across big data environments and more traditional data management architectures; there is no reason to weaken data security just because the technology is immature and still evolving. In fact, as more data is absorbed into big data environments, organizations will face greater risks and threats to the repositories where that data is stored.

If you are responsible for data security in your organization, you may need to answer questions such as the following:

• Who is running a specific big data request? What MapReduce jobs are they running? Are they trying to download all of the sensitive data, or is this a normal marketing query for customer insight?
• Is there an unusually large number of file permission exceptions that could be caused by hackers making automated attempts to access sensitive data?
• Are these jobs on the list of programs that are authorized to access the data? Or has a new application been developed that you were not previously aware of?

What you need is the ability to integrate big data applications and analytics into your existing data security infrastructure, rather than relying on homegrown scripts and monitoring programs, which are laborious to develop, error-prone, and easy to subvert.
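To make the second question above concrete, here is a minimal Python sketch of counting permission exceptions per user and flagging anyone above a threshold. It is purely illustrative: the `AuditEvent` record, its field names, the sample users and paths, and the threshold are all assumptions made for this example, not an InfoSphere Guardium or Hadoop API. It also shows the kind of homegrown script that a monitoring product is meant to replace.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit record; the field names are illustrative only."""
    user: str
    command: str      # e.g. "cat", "chmod"
    path: str
    permitted: bool   # False means a permission exception was raised


def users_with_excess_denials(events, threshold):
    """Return {user: denial_count} for users with more than `threshold` denials."""
    denials = Counter(e.user for e in events if not e.permitted)
    return {user: n for user, n in denials.items() if n > threshold}


# Made-up sample data for the sketch.
events = [
    AuditEvent("alice", "cat", "/data/marketing/q1.csv", True),
    AuditEvent("mallory", "cat", "/data/hr/salaries.csv", False),
    AuditEvent("mallory", "cat", "/data/hr/reviews.csv", False),
    AuditEvent("mallory", "tail", "/data/finance/cards.csv", False),
    AuditEvent("bob", "chmod", "/data/tmp/scratch", False),
]

suspects = users_with_excess_denials(events, threshold=2)
print(suspects)
```

With this sample data, only the user with three denied requests is flagged; a single denial, as for `bob`, stays below the threshold. A real deployment would also need time windows, baselines per user, and tamper-resistant storage of the events, which is where a dedicated monitoring solution comes in.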

This article shows you how to extend IBM InfoSphere Guardium V9 (a comprehensive data activity monitoring and compliance solution) to cover access monitoring and reporting for the Hadoop ecosystem.

Although this article includes a high-level overview of InfoSphere Guardium, it does not describe how to install and configure an InfoSphere Guardium collector. It describes how to configure InfoSphere Guardium to monitor supported Hadoop activity and send it to an InfoSphere Guardium collector, where security analysts can prepare reports. You'll see several examples of out-of-the-box reports to help you get started quickly.

InfoSphere Guardium Introduction

The IBM InfoSphere Guardium solution continuously monitors database transactions with lightweight software probes, as shown in Figure 1.

Figure 1. InfoSphere Guardium Data Activity Monitoring


These probes (known as S-TAPs, for software taps) monitor all database transactions, including those of privileged users, at the operating system kernel level without relying on the database audit logs, which ensures separation of duties. S-TAP also requires no changes to the database or its applications.

The probe forwards each transaction to a hardened collector (an appliance) on the network, where it is compared against previously defined policies to detect violations. The system can respond with a variety of policy-based actions, including generating alerts.
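The general idea of comparing activity against predefined policy and choosing an action can be sketched in a few lines of Python. This is only an illustration of first-match rule evaluation under assumed names: the `Rule` class, the glob-style patterns, the sample users and paths, and the action names are hypothetical and do not represent how InfoSphere Guardium defines or evaluates policies.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """Hypothetical policy rule: glob patterns plus the action to take."""
    user_pattern: str
    command_pattern: str
    path_pattern: str
    action: str  # illustrative actions: "alert", "log", "ignore"


def evaluate(policy, user, command, path, default="log"):
    """Return the action of the first matching rule, or `default` if none match."""
    for rule in policy:
        if (fnmatchcase(user, rule.user_pattern)
                and fnmatchcase(command, rule.command_pattern)
                and fnmatchcase(path, rule.path_pattern)):
            return rule.action
    return default


# Rules are checked in order, so the specific exception comes first.
policy = [
    Rule("etl_svc", "*", "/data/sensitive/*", "log"),
    Rule("*", "*", "/data/sensitive/*", "alert"),
    Rule("*", "*", "/tmp/*", "ignore"),
]

print(evaluate(policy, "etl_svc", "cat", "/data/sensitive/cards.csv"))
print(evaluate(policy, "alice", "cat", "/data/sensitive/cards.csv"))
print(evaluate(policy, "alice", "chmod", "/tmp/scratch"))
```

In this sketch the known service account is only logged, any other user touching the sensitive directory triggers an alert, and scratch-space activity is ignored. Ordering rules from most to least specific is a design choice of the example, not a statement about the product.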

InfoSphere Guardium supports a wide variety of deployment options to accommodate very large and geographically dispersed infrastructures. Because this article gives only a brief introduction to InfoSphere Guardium, see the Resources section for links to more information about its features. Note that not all features are available for all data sources.

Benefits of using InfoSphere Guardium for Hadoop monitoring

Using InfoSphere Guardium can greatly simplify your audit readiness process by providing targeted, actionable information. If your current Hadoop audit readiness plan amounts to archiving log data in the hope that you will never need it, you may be unable to meet many audit requirements on timeliness alone. Forensic analysis is undoubtedly time-consuming, and it ties up resources in developing your own scripts, resources that you would rather use to create business advantage with Hadoop.

With InfoSphere Guardium, much of the heavy lifting is done for you. You define security policies that specify what data needs to be retained and how to respond to policy violations. Data events are written directly to the InfoSphere Guardium collector, so privileged users never have the opportunity to access that data and cover their tracks. Out-of-the-box reports let you get Hadoop monitoring up and running quickly, and these reports can easily be customized to meet your audit needs.

InfoSphere Guardium S-TAP was designed from the outset to keep performance overhead low; after all, S-TAP is also used to monitor production database environments. With Hadoop, you are unlikely to see more than 3% overhead, which is minimal for most Hadoop workloads.

Finally, InfoSphere Guardium provides monitoring capabilities throughout the Hadoop stack, from the user interface down to storage, as shown in Figure 2.

Figure 2. The importance of data activity monitoring throughout the Hadoop stack


Why is this important? Although much of the activity in Hadoop ultimately breaks down into MapReduce and HDFS operations, at that level you may not be able to tell what a user higher in the stack was actually trying to do, or even who that user is. This is similar to showing a series of disk-segment I/O operations instead of a database audit trail. Monitoring at several levels therefore makes it more likely that you can understand the activity, and it also lets you audit activity that enters directly at lower points in the stack.

Hadoop Activity Monitoring

The events that can be monitored include:

• Session and user information.
• HDFS operations: commands (cat, tail, chmod, chown, expunge, and so on).
• MapReduce jobs: jobs, operations, permissions.
• Exceptions, such as authorization failures.
• Hive/HBase queries: alter, count, create, drop, get, put, list, and so on.

The following example shows how some simple Hadoop commands appear in an InfoSphere Guardium report.

This article originally appeared on "Wind Letter Net". If you reproduce it, please keep the original link: http://www.ithov.com/server/124456.shtml
