Author: Madequi | Published on September 3, 2014
Gigaom senior editor Derrick Harris recently published an article titled "Hadoop jobs are about to be able to run easily and safely in Docker containers." The article opens by noting that Hadoop startup Altiscale's work to make Docker a suitable environment for running Hadoop jobs is nearing completion. One of the biggest issues still unresolved is network security.
Raymie Stata is a former Yahoo CTO and a founding member of Altiscale, the Hadoop-as-a-service startup. He and another engineer, Dinesh Subhraveti, briefed Harris on their work.
According to Stata, the team is working closely with the Docker community to advance Docker's integration with YARN. He argues that this matters for any enterprise that has to run a multi-tenant Hadoop environment: Docker provides not only a fast, standard way to deploy applications to YARN, but also isolation between applications, which is important for both security and performance. The following is a schematic diagram of the Docker container running the application:
However, Subhraveti points out that one major improvement has to come before Docker can be integrated with YARN: introducing user namespaces into Docker, so that an application running with root permissions inside a container cannot compromise the host or degrade the performance of other containers. That work may not be complete until the end of the year, at which point Hadoop users should be able to run Docker containers on YARN without worrying about security.
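To make the idea behind user namespaces concrete, here is a minimal Python sketch of the mapping they rely on. It is not Altiscale's or Docker's code. It assumes the Linux kernel's `uid_map` line format (start ID inside the namespace, start ID on the host, range length). The mapping values and the function names `parse_uid_map` and `to_host_uid` are hypothetical and chosen only for illustration.

```python
def parse_uid_map(text):
    """Parse 'inside_start outside_start length' lines into tuples of ints."""
    ranges = []
    for line in text.strip().splitlines():
        inside, outside, length = (int(field) for field in line.split())
        ranges.append((inside, outside, length))
    return ranges


def to_host_uid(container_uid, ranges):
    """Translate a UID seen inside the namespace to the host UID.

    Returns None when the UID falls outside every mapped range.
    """
    for inside, outside, length in ranges:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


# Hypothetical mapping: container UIDs 0..65535 map to host UIDs 100000..165535.
EXAMPLE_MAP = "0 100000 65536"

ranges = parse_uid_map(EXAMPLE_MAP)
# Root (UID 0) inside the container corresponds to an unprivileged host UID.
print(to_host_uid(0, ranges))
```

Under a mapping like this, a process that is root inside the container is an ordinary unprivileged user from the host's point of view, which is the property Subhraveti describes.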
Docker has been drawing a growing amount of discussion lately. Readers who are interested in it can read an earlier Gigaom article on how Docker seized the ideal moment to become the open source darling of the cloud.