I. Risks fall into two classes: internal and external
First, the internal risks:
During the deployment of a CDH big data cluster, user accounts named after the services are created automatically.
Each account is one line in /etc/passwd, made up of seven colon-separated fields:
username (login_name):password field (passwd):user ID (UID):group ID (GID):comment (users):home directory (directory):login shell (shell)
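The seven-field layout can be made concrete with a short parser. This is only a sketch: the sample line is made up for illustration (the real UID, GID, comment and home directory of a service account vary from one installation to another), and the names PasswdEntry and parse_passwd_line are hypothetical helpers, not part of CDH.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    """One /etc/passwd line split into its seven fields."""
    login_name: str
    passwd: str
    uid: int
    gid: int
    comment: str
    home: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split a single /etc/passwd line on ':' and convert UID/GID to int."""
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    name, passwd, uid, gid, comment, home, shell = fields
    return PasswdEntry(name, passwd, int(uid), int(gid), comment, home, shell)


# Illustrative line only; values on a real cluster node will differ.
SAMPLE_LINE = "hdfs:x:995:992:Hadoop HDFS:/var/lib/hadoop-hdfs:/bin/bash"

entry = parse_passwd_line(SAMPLE_LINE)
print(entry.login_name, entry.uid, entry.gid, entry.shell)
```

An "x" in the password field means the actual password data is kept in /etc/shadow rather than in /etc/passwd.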
cat /etc/shadow
The second field of each line in the shadow file holds the encrypted password. For these service accounts the field is "!!" (that is, ":!!:"), which indicates that no password has ever been set for the user.
Without strong authentication, Hadoop trusts the identity that a client claims. A malicious user may therefore impersonate a legitimate user or server to break into the Hadoop cluster: submitting jobs maliciously, modifying the JobTracker state, tampering with data on HDFS, or posing as a NameNode or TaskTracker to accept tasks.
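To see which accounts on a node fall into the "!!" case described for the shadow file, the second field of each line can be checked. The following is a minimal sketch that works on sample text; the entries shown are invented for illustration, and accounts_without_password is a hypothetical helper name.

```python
from typing import List


def accounts_without_password(shadow_text: str) -> List[str]:
    """Return the login names whose second shadow field is exactly '!!'."""
    names = []
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if fields[1] == "!!":
            names.append(fields[0])
    return names


# Invented sample in /etc/shadow layout (nine colon-separated fields).
SAMPLE_SHADOW = """\
root:$6$examplesalt$examplehash:17500:0:99999:7:::
hdfs:!!:17500::::::
yarn:!!:17500::::::
"""

print(accounts_without_password(SAMPLE_SHADOW))
```

On a real node the same function could be fed the contents of /etc/shadow, which normally requires root privileges to read.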
Solution:
Add the Kerberos authentication mechanism, so that every node in the cluster is what it claims to be and can be trusted. With Kerberos, the authentication keys are placed on the trusted nodes before the cluster is deployed. While the cluster is running, its nodes authenticate themselves with these keys, and only authenticated nodes can be used normally. A node that attempts impersonation cannot communicate with the nodes in the cluster, because it did not obtain the key information in advance. This prevents malicious use of, or tampering with, the Hadoop cluster and keeps the cluster reliable and secure.
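The idea that only a node holding a key distributed in advance can prove its identity can be illustrated with a toy challenge-response exchange. This is not the Kerberos protocol itself, which relies on a key distribution center and tickets; it is a simplified pre-shared-key sketch using HMAC from the Python standard library, and all names in it are hypothetical.

```python
import hashlib
import hmac
import secrets


def respond(key: bytes, challenge: bytes) -> bytes:
    """A node proves knowledge of its key by returning an HMAC of the challenge."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify(key: bytes, challenge: bytes, response: bytes) -> bool:
    """The verifier recomputes the HMAC and compares in constant time."""
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


# Key placed on trusted nodes before deployment (assumption for this sketch).
cluster_key = secrets.token_bytes(32)
# An impostor never received the key, so it can only guess one.
impostor_key = secrets.token_bytes(32)

# A fresh random challenge, so old responses cannot simply be replayed.
challenge = secrets.token_bytes(16)

genuine_ok = verify(cluster_key, challenge, respond(cluster_key, challenge))
impostor_ok = verify(cluster_key, challenge, respond(impostor_key, challenge))

print("genuine node accepted:", genuine_ok)
print("impostor accepted:", impostor_ok)
```

The genuine node is accepted because it holds the same key as the verifier, while the impostor's response fails the comparison. Kerberos achieves the same effect without every pair of nodes having to share a key directly.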
CDH Big Data Cluster Security Risk Summary