Today, data is growing exponentially. Some experts have compared the rate of growth to the classic chessboard parable: place a grain of rice on the first square of a chessboard, two grains on the second, four on the third, doubling the amount on every successive square. By the 64th square, the rice would amount to roughly 1,000 times the annual output of the entire world.
Industries today, including healthcare, finance, retail and government, face the problem of how best to use the vast amounts of data they collect. Big data service providers offer a variety of applications that make it easier to analyze large volumes of data and extract highly valuable insights, helping organizations work across departments and business functions. This convergence has driven a rapid increase in big data use across every sector of the economy.
However, as big data applications connect to databases over the network, the most sensitive information in the enterprise is often not protected with the same level of priority. To guard against the risk of data theft by hackers, companies moving to take full advantage of big data must also adopt the security measures needed to protect the integrity of their data assets.
Process automation
To extract valuable insights from big data analysis, large datasets are split into smaller, finer-grained components, processed in parallel by a Hadoop cluster, and then regrouped to produce useful information. The process is almost completely automated, so it requires a large volume of machine-to-machine communication over the network.
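The split/process/regroup pattern described above is the MapReduce model that Hadoop implements. A minimal single-process sketch in Python (the dataset and the word-count task are illustrative stand-ins, not part of any real Hadoop job):

```python
from collections import Counter
from functools import reduce

def map_phase(chunk):
    # Process one small component independently (conceptually, one cluster node).
    return Counter(chunk.split())

def reduce_phase(partials):
    # Regroup: merge the partial results from every node into one final result.
    return reduce(lambda a, b: a + b, partials, Counter())

# Split the dataset into smaller pieces, process each separately, regroup.
dataset = ["error warn info", "error error info", "warn info info"]
partials = [map_phase(chunk) for chunk in dataset]
totals = reduce_phase(partials)
print(totals["error"])  # 3
```

In a real Hadoop cluster the map and reduce phases run on different machines, which is exactly why the automated machine-to-machine communication discussed here becomes necessary.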
Several levels of authorization occur in a Hadoop infrastructure, including:
Access to the Hadoop cluster
Inter-cluster communication
Cluster access to data sources
These authorizations are often based on SSH (Secure Shell) keys, which are well suited to Hadoop because their security level supports automated machine-to-machine communication. Many popular cloud-based Hadoop services also use SSH as the way to access the Hadoop cluster. Ensuring that identities in a big data environment are properly granted is a high priority, but it is also challenging. Organizations that wish to make use of big data analysis should consider the following questions:
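Concretely, an SSH authorization is an entry in a server's `authorized_keys` file. A simplified Python parser that inventories who can access a machine is sketched below; note it is a deliberately naive sketch, and an options field containing embedded spaces would defeat the `split()` it relies on:

```python
KEY_TYPES = {"ssh-rsa", "ssh-ed25519", "ecdsa-sha2-nistp256", "ssh-dss"}

def parse_authorized_keys(text):
    # Return (key_type, comment) for each entry, skipping blanks and comments.
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[0] not in KEY_TYPES:   # entry starts with an options field
            fields = fields[1:]
        if len(fields) >= 2 and fields[0] in KEY_TYPES:
            comment = fields[2] if len(fields) > 2 else ""
            entries.append((fields[0], comment))
    return entries

# Two hypothetical machine identities: an ETL account and a restricted backup key.
sample = (
    "ssh-ed25519 AAAAC3Nza... hadoop-etl@node01\n"
    'command="/usr/bin/rsync" ssh-rsa AAAAB3Nza... backup@node02\n'
)
print(parse_authorized_keys(sample))
```

The comment field (`hadoop-etl@node01`) is often the only human-readable clue to who owns a key, which is part of why these identities are so easy to lose track of.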
1. Who should be responsible for creating authorizations for big data analysis?
2. Are these authorizations properly managed?
3. Who can access these specific authorizations?
4. What happens if the person who originally created an authorization leaves the company?
5. Does the "need to know" security rule directly shape the level of access granted?
These problems are not unique to big data. In fact, they become more pressing as data centers automate more of their business processes. Automated machine-to-machine transactions account for roughly 80% of all communications in the data center, yet most administrators focus on the 20% of traffic associated with employee accounts. Industries that rely heavily on data, such as finance and cloud-based services, typically have about a 4:1 ratio of machine identities to human ones. So why is this large set of identities overlooked? Clearly, as the volume of big data rises, so does the urgency of managing machine-to-machine identities.
The business risk
Ignoring machine-to-machine authentication is a frightening risk: mismanaged authorizations can lead to serious data breaches. Although the security management of end-user identities has improved significantly, machine-based authentication remains seriously neglected, leaving the IT environment exposed to a far-reaching attack vector.
Part of the reason for this neglect may be the difficulty of changing systems already in operation. Bringing central authentication and access management to thousands of machine-based identities is undoubtedly a complex task. Given the rising risk, however, new tools and processes are needed to mitigate and counter it.
Currently, IT administrators track authentication keys manually to protect machine-to-machine transactions. Outdated methods, such as spreadsheets or home-grown scripts, remain a popular choice for monitoring, allocating, and inventorying keys. As with any manual process, human error introduces mistakes, leaving many deployed keys unaccounted for. This approach also usually lacks regular scanning, so keys can serve as back doors without administrators ever knowing.
Compliance is a business matter
Compliance standards across multiple industries require central control over these authentication keys. Enterprises that fail to comply can face heavy fines and, once violations are identified, the risk of reputational damage. For example, the recently tightened PCI standards require any enterprise that accepts payment cards to maintain strict control over who has access to sensitive financial information. That affects the banking, catering, retail and healthcare industries; as a result, many vertical industries are rapidly upgrading their security posture to minimize the risks of non-compliance.
Steps to enhance machine-to-machine network security
To address these risks, we recommend that your organization take the following steps to achieve best practices:
Discovery: Data center administrators, security teams, and authentication and access managers often have limited visibility into where identity information is stored, what it is allowed to do, and which business processes it supports. The important first step in remediation is therefore passive, non-invasive discovery.
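A passive discovery pass can be as simple as walking the filesystem, reading every `authorized_keys` file, and fingerprinting each key so duplicates across hosts can be correlated. A sketch, assuming read access to the hosts being scanned (the demo directory and key material are fabricated for illustration):

```python
import base64
import hashlib
import os
import tempfile

def fingerprint(b64_blob):
    # OpenSSH-style SHA-256 fingerprint of a base64-encoded public-key blob.
    digest = hashlib.sha256(base64.b64decode(b64_blob)).digest()
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")

def discover_keys(root):
    # Passive scan: only reads authorized_keys files, never modifies anything.
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name != "authorized_keys":
                continue
            path = os.path.join(dirpath, name)
            with open(path) as fh:
                for line in fh:
                    fields = line.split()
                    if len(fields) >= 2 and fields[0].startswith(("ssh-", "ecdsa-")):
                        found.append((path, fingerprint(fields[1])))
    return found

# Demo on a throwaway directory standing in for a real filesystem.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "home/etl/.ssh"))
with open(os.path.join(root, "home/etl/.ssh/authorized_keys"), "w") as fh:
    fh.write("ssh-ed25519 " + base64.b64encode(b"demo-key").decode() + " etl@node01\n")
print(discover_keys(root))
```

The output, a list of (location, fingerprint) pairs, is exactly the inventory that spreadsheet-based tracking fails to keep current.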
Fix: Once visibility and the associated controls are established, authentication violations can be corrected without disrupting ongoing business processes. For example, a machine identity may support a legitimate process yet hold a higher level of privilege than it needs. With centralized administration, the level of privilege assigned to that identity can be repaired.
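One common way to cut an SSH identity's privilege down to what its process actually needs is to prepend restriction options to its `authorized_keys` entry, pinning it to a single command and disabling forwarding. A sketch, where `restrict_entry` and the ingest-script path are hypothetical names for illustration:

```python
def restrict_entry(entry, command):
    # Prepend least-privilege options to a bare authorized_keys entry.
    # Assumes the entry has no existing options field (hypothetical helper).
    options = (f'command="{command}",no-port-forwarding,'
               "no-agent-forwarding,no-pty")
    return f"{options} {entry}"

before = "ssh-ed25519 AAAAC3Nza... etl@node01"
after = restrict_entry(before, "/opt/hadoop/bin/ingest.sh")
print(after)
```

Because only the options prefix changes, the key itself and the process that uses it keep working, which is the point: remediation without disrupting the business process.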
Monitoring: The network environment needs continuous monitoring to determine which identities are actively used and which are associated with inactive users or processes. The good news is that in many enterprises, unused (and therefore no longer needed) identities tend to dominate. Once these unused identities are identified and removed, the scope of the problem shrinks significantly.
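Usage can be inferred from sshd's own logs: successful public-key logins record the key's fingerprint, so any inventoried fingerprint that never appears in the logs is a candidate for removal. A sketch, assuming a key inventory from the discovery step and syslog-style sshd lines (the fingerprints and hosts below are made up):

```python
import re

# Matches the key fingerprint sshd prints on successful public-key logins.
LOG_PATTERN = re.compile(r"Accepted publickey for \S+ .*?(SHA256:\S+)")

def used_fingerprints(log_lines):
    # Collect every fingerprint that actually authenticated.
    return {m.group(1) for line in log_lines if (m := LOG_PATTERN.search(line))}

inventory = {"SHA256:aaa", "SHA256:bbb", "SHA256:ccc"}  # from discovery
logs = [
    "Jul 10 sshd[101]: Accepted publickey for etl from 10.0.0.5 port 22 ssh2: ED25519 SHA256:aaa",
    "Jul 11 sshd[102]: Accepted publickey for backup from 10.0.0.6 port 22 ssh2: RSA SHA256:bbb",
]
unused = inventory - used_fingerprints(logs)
print(sorted(unused))  # ['SHA256:ccc']
```

Run over a long enough window, the `unused` set is the list of identities that can likely be retired, shrinking the attack surface.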
Management: Additions, changes and deletions of machine identities should all go through central control. This enables policy-based management that tracks how each identity is used, ensures that no unauthorized identities are added, and produces evidence for compliance audits.
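The essence of that central control is a single registry through which every change flows and which records who did what, when. A minimal sketch (the `KeyRegistry` class and the sample identities are invented for illustration, not any particular product's API):

```python
import datetime

class KeyRegistry:
    # Minimal central registry: every add/remove is attributed and logged.

    def __init__(self):
        self.keys = {}       # fingerprint -> owner
        self.audit_log = []  # evidence trail for compliance review

    def add(self, fingerprint, owner):
        if fingerprint in self.keys:
            raise ValueError("identity already registered")
        self.keys[fingerprint] = owner
        self._record("add", fingerprint, owner)

    def remove(self, fingerprint, actor):
        del self.keys[fingerprint]
        self._record("remove", fingerprint, actor)

    def _record(self, action, fingerprint, who):
        stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
        self.audit_log.append((stamp, action, fingerprint, who))

registry = KeyRegistry()
registry.add("SHA256:aaa", "etl-service")
registry.remove("SHA256:aaa", "admin@corp")
print(len(registry.audit_log))  # 2
```

The audit log is the compliance evidence the text refers to: an auditor can replay it to verify that every identity in production was added through policy.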
Of course, Rome was not built in a day, and none of this can be achieved all at once. Applying the relevant tools and processes on a daily basis reinforces these best practices and provides proactive risk and compliance management.
Looking to the future
When it comes to extracting value from information through analysis, big data opens up unlimited possibilities, but amid the endless stream of new technologies, companies need a way to keep their data secure and keep pace with the current threat landscape. Automated machine-to-machine identity management can deliver significant business benefits, such as savings in time and cost. Beyond those immediate benefits, it also supports long-term compliance and enhances corporate credibility. Organizations that want to leverage the full advantages of big data must also ensure the highest level of security, following the best practices described above.