Docker Container Executor (DCE) is an important feature of Hadoop 2.6.0: with it, Hadoop, the giant of the big data world, can finally take advantage of Docker, a favorite of today's virtualization and cloud world.
There are already plenty of articles about Docker concepts, so this article will not go into them and simply quotes the description from the Hadoop community: "Docker (https://www.docker.io/) combines an easy-to-use interface to Linux containers with easy-to-construct image files for those containers. In short, Docker launches very light weight virtual machines." Hadoop uses Docker's capabilities mainly through its new component, the Docker Container Executor (DCE). With DCE, the YARN NodeManager can launch YARN containers inside Docker containers.
Following the Hadoop community article http://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html, we can try this feature out. Before trying it, install Docker and download the Docker image sequenceiq/hadoop-docker:2.4.1.
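The preparation step above can be sketched roughly as follows. This is only a minimal sketch: the image name comes from the article, while the helper name have_docker and the error messages are illustrative assumptions. It assumes the Docker daemon is already running on the NodeManager host.

```shell
#!/bin/sh
# Image name taken from the article; everything else here is illustrative.
IMAGE="sequenceiq/hadoop-docker:2.4.1"

# Hypothetical helper: returns 0 if a docker client is on the PATH.
have_docker() {
    command -v docker >/dev/null 2>&1
}

if have_docker; then
    # Confirm the client runs, then download the image the jobs will use.
    docker --version
    docker pull "$IMAGE" || echo "pull failed (is the Docker daemon running?)" >&2
else
    echo "docker client not found; install Docker first" >&2
fi
```

Pulling the image ahead of time on every NodeManager host should keep the first job from stalling while the image downloads.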
Also, change the yarn-site.xml configuration:
<property>
  <name>yarn.nodemanager.docker-container-executor.exec-name</name>
  <value>/usr/bin/docker</value>
  <description>
    Name or path to the Docker client. This is a required parameter. If this is empty, the user must pass an image name as part of the job invocation (see below).
  </description>
</property>
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.DockerContainerExecutor</value>
  <description>
    This is the container executor setting that ensures that all jobs are started with the DockerContainerExecutor.
  </description>
</property>
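A quick sanity check of the edited file might look like the sketch below. The helper check_dce_config is hypothetical, and the location of yarn-site.xml (HADOOP_CONF_DIR, falling back to /etc/hadoop/conf) is an assumption that depends on the installation. It only checks that the two settings appear in the file; it does not validate the XML.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the exec-name property and the
# DockerContainerExecutor class name both appear in the given file.
check_dce_config() {
    conf_file="$1"
    grep -qF "yarn.nodemanager.docker-container-executor.exec-name" "$conf_file" &&
    grep -qF "org.apache.hadoop.yarn.server.nodemanager.DockerContainerExecutor" "$conf_file"
}

# Assumed location; adjust to where your installation keeps yarn-site.xml.
CONF="${HADOOP_CONF_DIR:-/etc/hadoop/conf}/yarn-site.xml"

if [ -f "$CONF" ]; then
    if check_dce_config "$CONF"; then
        echo "DCE properties found in $CONF"
    else
        echo "DCE properties missing from $CONF" >&2
    fi
else
    echo "no yarn-site.xml at $CONF" >&2
fi
```

The NodeManager reads yarn-site.xml at startup, so it most likely needs a restart before the new executor takes effect.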
Finally, submit the Hadoop MapReduce job teragen to the ResourceManager with the following command:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar teragen \
  -Dmapreduce.map.env="yarn.nodemanager.docker-container-executor.image-name=sequenceiq/hadoop-docker:2.4.1" \
  -Dyarn.app.mapreduce.am.env="yarn.nodemanager.docker-container-executor.image-name=sequenceiq/hadoop-docker:2.4.1" \
  10000 /tmp/teragen
MapReduce Job Execution:
While the MR job runs, we can observe that 3 Docker containers are running as well, with the MR job's tasks executing inside them.
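One way to make this observation from a NodeManager host is sketched below. The helper list_job_containers is a hypothetical name, and filtering the docker ps output by image name is just one simple approach; the container count you see depends on the job and the cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the running containers whose `docker ps`
# line mentions the given image name.
list_job_containers() {
    docker ps | grep -F "$1"
}

if command -v docker >/dev/null 2>&1; then
    # Image name taken from the article.
    list_job_containers "sequenceiq/hadoop-docker" ||
        echo "no matching containers running" >&2
else
    echo "docker client not found" >&2
fi
```

Run it while the job is in progress; the containers are gone once the job's tasks finish.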
Of course, after the job finishes successfully, the execution results can be seen in the ResourceManager web console.
Key features of Hadoop 2.6.0 Docker Container Executor (DCE)