Deploying Hadoop distributed clusters using Docker
I searched the internet for a long time and could not find a document on building a Hadoop distributed cluster with Docker, so I had no choice but to write one myself.
One: Environment preparation:
1: First, prepare a CentOS 7 operating system; installing it in a virtual machine is fine.
2: Install Docker on CentOS 7; the version used here is 1.8.2.
The installation steps are as follows:
<1> Install the specified version of Docker:
yum install -y docker-1.8.2-10.el7.centos
<2> The installation may fail with a dependency conflict; if so, remove this package first:
rpm -e lvm2-7:2.02.105-14.el7.x86_64
Start Docker:
service docker start
To verify the installation:
<3> After startup, run docker info. If two warning lines appear at the bottom of the output, the firewall needs to be shut down and the system rebooted:
systemctl stop firewalld
systemctl disable firewalld
Note: Reboot the system after executing the commands above:
reboot
<4> Running containers may fail with an error.
SELinux needs to be turned off.
Workaround:
1: setenforce 0 (takes effect immediately, no reboot required)
2: Set SELINUX=disabled in the /etc/selinux/config file, then reboot the system for it to take effect.
Performing both steps is recommended; this ensures SELinux stays off after the system restarts.
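The config-file edit in step 2 can be scripted with sed. A minimal sketch, run here against a temporary copy of the file so it is safe to try anywhere; on a real CentOS 7 host the target is /etc/selinux/config:

```shell
# Work on a temporary copy; on a real host, point this at /etc/selinux/config.
cfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$cfg"
# The edit from step 2: switch SELinux to disabled
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
grep '^SELINUX=' "$cfg"
```

The `^SELINUX=` anchor deliberately leaves the SELINUXTYPE line untouched.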
3: Next, build a base image for Hadoop, using the Dockerfile method.
First, build an image with SSH enabled, which will make later steps easier. (Note that enabling SSH has an impact on container security.)
Note: The root user's password in this image is root.
mkdir centos-ssh-root
cd centos-ssh-root
vi Dockerfile
# Use an existing OS image as the base
FROM centos
# Image maintainer
MAINTAINER crxy
# Install the openssh-server and sudo packages, and set sshd's UsePAM parameter to no
RUN yum install -y openssh-server sudo
RUN sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
# Install openssh-clients
RUN yum install -y openssh-clients
# Add the test user root with password root, and add this user to sudoers
RUN echo "root:root" | chpasswd
RUN echo "root   ALL=(ALL)       ALL" >> /etc/sudoers
# The following two lines are special: they are required on CentOS 6,
# otherwise sshd in the created container cannot be logged into
RUN ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key
RUN ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key
# Start the sshd service and expose port 22
RUN mkdir /var/run/sshd
EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]
Build command:
docker build -t="crxy/centos-ssh-root" .
Check that the image was just built successfully: docker images
4: Build an image with the JDK on top of this base image
Note: The JDK used is version 1.7.
mkdir centos-ssh-root-jdk
cd centos-ssh-root-jdk
cp ../jdk-7u75-linux-x64.tar.gz .
vi Dockerfile
FROM crxy/centos-ssh-root
ADD jdk-7u75-linux-x64.tar.gz /usr/local/
RUN mv /usr/local/jdk1.7.0_75 /usr/local/jdk1.7
ENV JAVA_HOME /usr/local/jdk1.7
ENV PATH $JAVA_HOME/bin:$PATH
Build command:
docker build -t="crxy/centos-ssh-root-jdk" .
Check that the image was built successfully: docker images
5: Build an image with Hadoop on top of this JDK image
Note: Hadoop version 2.4.1 is used.
mkdir centos-ssh-root-jdk-hadoop
cd centos-ssh-root-jdk-hadoop
cp ../hadoop-2.4.1.tar.gz .
vi Dockerfile
FROM crxy/centos-ssh-root-jdk
ADD hadoop-2.4.1.tar.gz /usr/local
RUN mv /usr/local/hadoop-2.4.1 /usr/local/hadoop
ENV HADOOP_HOME /usr/local/hadoop
ENV PATH $HADOOP_HOME/bin:$PATH
Build command:
docker build -t="crxy/centos-ssh-root-jdk-hadoop" .
Check that the image was built successfully: docker images
Two: Build Hadoop distributed cluster
1: Cluster planning
We will build a cluster of three nodes: one master and two slaves.
Master node: hadoop0, IP 192.168.2.10
Slave node 1: hadoop1, IP 192.168.2.11
Slave node 2: hadoop2, IP 192.168.2.12
However, a Docker container's IP changes after the container restarts, so we need to give the containers fixed IPs. We will use pipework to set a fixed IP for each Docker container.
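As a preview of how pipework fits the IP plan above, its invocation shape looks roughly like this. This is a sketch only: it assumes pipework is installed on the host and uses a bridge named br1, which is an assumption of this sketch, not something stated in the source.

```shell
# Sketch: assign the planned fixed IPs with pipework, run on the host
# after each container has started. Bridge name br1 is an assumption.
pipework br1 hadoop0 192.168.2.10/24
pipework br1 hadoop1 192.168.2.11/24
pipework br1 hadoop2 192.168.2.12/24
```

pipework creates the bridge if it does not exist and adds a new interface inside each named container with the given address.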
2: Start three containers, to serve as hadoop0, hadoop1, and hadoop2.
Run the following command on the host; it sets the container's hostname and container name, and maps ports 50070 and 8088 (the Hadoop web UIs) on hadoop0:
docker run --name hadoop0 --hostname hadoop0 -d -P -p 50070:50070 -p 8088:8088 crxy/centos-ssh-root-jdk-hadoop