Author: inner Calm
Original link: http://www.zjbonline.com/2016/03/05/Jenkins-Docker Build continuous integration test environment
This article focuses on how, in the context of Jenkins-managed continuous integration and testing, introducing Docker lets us optimize resource allocation and improve the performance and stability of the whole environment.
About Jenkins
Jenkins is a widely used framework for continuous integration, automated testing, and continuous deployment; some project teams even use it as a process-management tool. Depending on the workload, Jenkins is typically deployed in one of two ways.
Single node (Master) deployment
This deployment suits most projects with lighter build tasks, where a single node is sufficient to meet the needs of daily development.
Multi-node (master-slave) deployment
Projects that are large, with frequent code submissions (and therefore frequent builds) and heavy automated-test loads, usually adopt this structure. Here the master acts only as a manager, responsible for task scheduling, slave node management, and task status collection; the actual build tasks are assigned to the slave nodes. In theory there is no limit to the number of slave nodes a master can manage, but performance and stability usually degrade as the number grows, to a degree that depends on the master's hardware.
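As a concrete sketch, a slave node is typically attached to the master by running the Jenkins agent JAR on it. The master hostname, node name, and secret below are placeholders; note also that newer Jenkins versions ship the JAR as `agent.jar`, while older ones called it `slave.jar`:

```shell
# Download the agent JAR from the master (this URL path is served by Jenkins itself).
curl -sO http://jenkins-master:8080/jnlpJars/agent.jar

# Connect this machine to the master as the node "slave-1".
# The secret is generated by Jenkins when the node is created in the UI.
java -jar agent.jar \
  -jnlpUrl http://jenkins-master:8080/computer/slave-1/jenkins-agent.jnlp \
  -secret <node-secret> \
  -workDir /var/jenkins
```

Once connected, the node appears in the master's node list and can receive scheduled build tasks.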
About Docker
Docker is a virtualization platform that helps program developers and system administrators develop, deploy, and run applications. Docker lets you package and ship applications in containers much like standard shipping containers, isolating them from the surrounding environment as much as possible, and shortens the time from code testing to production deployment. In short, Docker provides a technology that lets developers package application code together with its runtime environment into an image and push that image to an image registry; a test or production environment simply pulls the image and starts it to complete the deployment (like opening a container). For more detailed information about Docker, please refer to the official documentation.
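The build/push/pull/run workflow just described can be sketched with standard Docker CLI commands; the registry address and image name here are placeholders:

```shell
# Build an image that bundles the application with its runtime environment.
docker build -t registry.example.com/myapp:1.0 .

# Push the image to the (private) image registry.
docker push registry.example.com/myapp:1.0

# On a test or production host: pull the image and start a container from it.
docker pull registry.example.com/myapp:1.0
docker run -d --name myapp registry.example.com/myapp:1.0
```

The same image runs identically on any host with a Docker daemon, which is what makes deployment "like opening a container".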
Current difficulties encountered by Jenkins
With the popularity of agile development, automated testing has become a must for every project. A project that has been developed over many years accumulates a staggering number of automated tests, and to ensure that each deployment is correct, all of them must be re-run every time. Depending on the project, some need a round of regression tests weekly, while others need a round every day. So we group the test cases and run them on multiple test machines in parallel to reduce the time one round of testing requires. This in turn requires enough hardware resources to meet the demand. The following figure shows a typical extension structure that manages automated tests through Jenkins: a master host manages multiple test machines, and the master assigns the test tasks to them.
Current Jenkins (master-slave) structure
There are some drawbacks to this structure:
Automated testing typically simulates user behavior by capturing and controlling the screen, which means only one set of test cases can run on a single test machine at a time; otherwise the cases interfere with each other. This wastes resources, because the hardware of one test machine could often support several sets of test cases.
Maintenance personnel must keep every test machine online; when the machines number, say, 100 or even 1,000, this maintenance burden becomes considerable.
A regression test usually runs at night (so that code submitted the previous day is verified by the time the development team arrives the next morning), which means the hundreds of test machines that work through the night sit idle during the day — another dimension of waste.
More importantly, a test machine's environment can become polluted for various reasons, preventing tests from running smoothly; when this happens, there is no particularly good remedy other than manual intervention by maintenance staff.
As mentioned before, when the number of slave nodes reaches a certain level, the master node's performance and stability deteriorate.
Before running tests, each slave node must download the latest build from the central repository, decompress it, and set up the runtime environment. This usually takes considerable time — and imagine hundreds of slaves downloading the latest build simultaneously, a huge load on the central repository.
How to introduce Docker to solve these problems
A natural idea is to replace all the test machines with Docker containers, handing the work of managing those containers to more specialized tools such as Google's Kubernetes or Docker's official Swarm. All build environments are packaged as Docker images: the automated-test environment is one image, the compile-and-unit-test environment is another, and so on. The improved topology is shown in the following illustration:
The only technical difficulty in this scheme is that automated tests require a desktop system, while Docker containers normally run without a graphical interface. The solution is straightforward: provide a desktop (a VNC server) inside the container. Depending on the Linux distribution, we can choose TightVNC or TigerVNC. Either VNC server provides a virtual desktop that can be reached remotely through a VNC client.
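A minimal sketch of what such a container image might do, assuming a Debian/Ubuntu base; the package names, display number, and geometry are illustrative:

```shell
# Inside the container image: install a lightweight desktop and a VNC server
# (package names are for Debian/Ubuntu; they differ on other distributions).
apt-get update
apt-get install -y xfce4 tightvncserver

# Start a virtual desktop on display :1 (exposed as TCP port 5901).
# A VNC client can then connect to <container-host>:5901, and UI-driven
# automated tests inside the container run against this virtual display.
vncserver :1 -geometry 1280x800 -depth 24
export DISPLAY=:1
```

With `DISPLAY` pointing at the virtual desktop, screen-capturing test tools behave as if a physical monitor were attached.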
All six deficiencies mentioned above are addressed under this structure.
Each test machine can run multiple sets of automated test cases at the same time, so running the same amount of regression testing requires at most half as many machines.
Through Kubernetes or Swarm we can automate container management, reducing the maintenance staff's workload.
Docker containers can be dynamically added or removed as needed, and because every environment a build requires is encapsulated in a Docker image, these machines can conveniently be reused for other build tasks without extra configuration — just pull the image for the other task and start it.
Each test starts in a fresh container, so the environment is completely clean. When a container has a problem, it is destroyed and a new container completes the same test work.
Under this structure Jenkins needs to manage only one slave node, while Kubernetes or Swarm can manage thousands of containers.
Within the same Docker environment (the same test machine), only one copy of the latest automated-test image needs to be downloaded to start multiple containers; and since the number of test machines is greatly reduced under this structure, the load on the central repository (the private image registry) drops significantly.
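The elasticity described in the list above can be sketched with either orchestrator; the deployment/service name and replica counts below are placeholders:

```shell
# Kubernetes: run 20 parallel copies of the regression-test image for the
# nightly run, then scale back down when the round is finished.
kubectl scale deployment regression-test --replicas=20
kubectl scale deployment regression-test --replicas=5

# Docker Swarm equivalent for a service of the same name.
docker service scale regression-test=20
```

Because every replica starts from the same image, scaling up adds clean, identically configured test environments with no manual setup.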
Further thoughts
A project developed for 5 years has accumulated tens of thousands of regression tests, which take dozens of test machines running 8 or 9 hours nonstop to complete. Imagine this project continuing for another 5 or even 10 years: the number of test machines will keep growing, yet the feedback cycle will still be 8 or 9 hours. From our observation, after years of maintenance many functional modules are already quite mature, with little code change. Do the test cases for those modules really need to run every time? I believe the answer is no — but the question is how to determine whether the current code change affects those modules, and with the present design I don't think anyone dares to say so with one hundred percent certainty. Perhaps we need to adjust the existing code: following the idea of microservices, split the delivery so that each release updates only the changed modules. Then our regression suite would also need to cover only the modules that changed. At that point, we might even be able to run a regression test on every code submission.