1 Preface
With the rapid development of modern information technology, software and system code bases have grown larger and larger, with more components and more tangled dependencies, and every new release feels like an overnight long-distance ride on a slow green train with no seat: exhausting. Software delivery is a complex project that touches every detail of software development, and a problem in any one link can cause the software to be delivered late, or with worrying quality.
From the enterprise's perspective, using more scientific tools and processes to improve product quality and customer satisfaction is a hard requirement. From the employee's perspective, life has many things worth pursuing, and precious time should not be wasted on mechanical, repetitive work.
Lenovo Enterprise Network Disk has provided professional cloud storage services to enterprise customers since early 2007, serving more than 250,000 enterprises over ten years. Software updates and iterations are routine. The service runs on hundreds of servers and is a very complex Internet application: the server side alone has dozens of modules working together, plus a variety of clients, each requiring a different build and release environment. Sometimes a single module must be released on its own; sometimes several modules must be released together. All of this makes every upgrade very complicated. During one major version upgrade, the operations and development teams went more than 40 hours without sleep; the incident not only affected users' service but also left the team exhausted. Experiences like this led us to think about how to solve the problem through technical means, free our engineers from simple manual labor, and prepare for even larger clusters in the future.
Shortening release times and improving release accuracy: that was our original motivation for building this system.
2 Problem
Let's start by borrowing a picture (from the official ThoughtWorks documentation) to review the complete flow of a software release:
Throughout this flow, code management, integration and testing, and release to production are the three main links, and all of our problems were concentrated in them.
1. Code Management
Chaotic code management is a common problem for development teams: the branch design is unreasonable, with too many or too few branches and confusing dependencies between them; permission control is missing; everything depends on individual discipline; and there is no code review.
2. Integration and Testing
From the development environment to the test environment there was no unified deployment process; the development team published directly to test ("wild" builds). Differences in build environments and in individual skill led to all kinds of inexplicable (and sometimes very low-level) problems, greatly reducing the efficiency and accuracy of testing.
3. On-line Delivery
When code was finally deployed to the production environment, it required frequent manual operations by both operations staff and developers, which was time-consuming, laborious, and error-prone. The whole process was not repeatable and left no records; rollback was complicated and sometimes impossible. Once something went wrong on-line, the impact on our users was very bad.
3 Practice
Over the years we have kept reflecting on our development process and tried many approaches, accumulating a large amount of production operations experience while serving customers and building a number of tools and processes to solve upgrade and release problems. Below, based on the production practice of Lenovo Enterprise Network Disk, we share some of our approach to building a continuous delivery system.
As shown in the figure, we will mainly discuss the following aspects:
3.1 Code Management
Code is the source of the software delivery process, so planning and managing it properly is particularly important.
3.1.1 Code Repository
In the early days, all of our developers' code was stored in a single SVN repository, with branches and tags scattered across the subdirectories of each module. SVN is a good tool, but it is too flexible: everything depends on strict discipline, which in practice means relying on self-consciousness, and people inevitably slack off. Once someone breaks the rules, cleaning up afterwards is a miserable process.
So our first step was to migrate from SVN to Git. The modules were split into separate repositories, each with its own access control and a unified branching model. For repository software we chose Gerrit; it was originally a code review tool with a strong permission management system, and Git repository hosting is just an accompanying feature.
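As an illustration, here is a minimal sketch of how such a per-module migration might be scripted with git-svn; the module names and SVN URL are assumptions, not our actual layout:

```python
import subprocess

# Hypothetical module list and SVN root; substitute the real layout.
MODULES = ["auth", "storage", "sync"]
SVN_ROOT = "svn://svn.example.com/netdisk"

for mod in MODULES:
    # Convert one SVN subdirectory (with its trunk/branches/tags)
    # into a standalone Git repository, preserving its history.
    subprocess.run(
        ["git", "svn", "clone", f"{SVN_ROOT}/{mod}",
         "--stdlayout", "--prefix=svn/", mod],
        check=True,
    )
```

Each resulting repository can then be pushed to Gerrit, where its access rights are managed independently.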
In fact, when migrating from SVN to Git, many engineers asked why we were migrating at all: SVN was not bad. It was not about chasing the technology bandwagon, but about the automation work to come (including code review tooling); and of course, Git's powerful branching and its distributed nature were a big reason too.
3.1.2 Branch Design
For branching we followed a fairly common Git branching model (see the reference link) and made some adjustments for our own needs:
1. Two long-lived branches, dev and master: dev is the development branch, master is the externally stable branch, and the continuous delivery system pulls release code from master;
2. For auxiliary branches we use only feature branches and hotfix branches. In principle feature branches are avoided where possible and reserved for new features with a long development cycle; fast-moving features are committed directly to dev. A short walkthrough of this model follows.
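Here is a minimal sketch of the day-to-day flow under this model, driven from Python for illustration; the names after `feature/` and `hotfix/`, the tag, and the commits themselves are hypothetical examples, not prescribed names:

```python
import subprocess

def git(*args):
    """Run a git command in the current repository, stopping on failure."""
    subprocess.run(["git", *args], check=True)

# Routine work lands directly on dev.
git("checkout", "dev")

# A long-running feature gets its own branch off dev...
git("checkout", "-b", "feature/quota-report", "dev")
# ...and is merged back when finished.
git("checkout", "dev")
git("merge", "--no-ff", "feature/quota-report")

# A release: dev is merged into master, the branch the delivery
# system builds release artifacts from.
git("checkout", "master")
git("merge", "--no-ff", "dev")
git("tag", "v1.2.0")

# An urgent production fix branches from master and is merged
# back into both master and dev once verified.
git("checkout", "-b", "hotfix/login-crash", "master")
```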
3.1.3 Code Review
Code is the source of product quality; if code quality is poor, no amount of auxiliary measures will help. Code review is a critical means of ensuring code quality, and it should be practiced whenever a team has more than one member.
There are two modes of code review:
• Pre-integration review (pre review)
As the name implies, the code is reviewed before it is merged into the target branch; if there are problems, it is revised and reviewed again, and only once the review passes is it merged. Representative tools are GitHub and Gerrit: GitHub reviews at the branch level, while Gerrit reviews individual commits.
• Post-integration review (post review)
The code is merged first and reviewed afterwards; problems can only be fixed with new commits. Representative tools are Review Board and Phabricator (both of which, in fact, also support pre review). This approach tends to leave the target branch unstable, so it is generally not recommended.
We use the first mode, pre-integration review, with Gerrit as the tool: reviews are done commit by commit, and a commit is merged into the target branch only after a mandatory review (the merge itself is automatic).
Enough talk; a picture is worth a thousand words. Here is our code submission workflow:
The yellow part of the figure is the code review step: each commit must be reviewed by another person (Code-Review +2) and verified by the continuous integration system (Verified +1) before it is merged into the target branch.
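From the developer's side the flow is simple. The sketch below shows the push convention, again driven from Python for consistency; the remote name and commit message are assumptions, while `refs/for/<branch>` is Gerrit's standard magic ref for opening a review:

```python
import subprocess

def git(*args):
    subprocess.run(["git", *args], check=True)

# Commit locally as usual (hypothetical change and message).
git("add", "-A")
git("commit", "-m", "fix: handle expired upload tokens")

# Instead of pushing to the branch itself, push to Gerrit's magic
# ref; Gerrit opens a review that must collect Code-Review +2 and
# Verified +1 before it can be submitted to dev.
git("push", "origin", "HEAD:refs/for/dev")
```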
Code Review page:
3.2 Build and Deployment
Here I simply divide build and deployment into continuous integration and the deployment pipeline. In fact the two overlap in many places; continuous integration here covers only build verification and automatic integration, while the deployment pipeline covers the whole process from build to deployment into the different environments.
3.2.1 Continuous Integration
Continuous integration is a big topic and a core practice of agile development. In the continuous delivery process, continuous integration forms a pipeline from development to deployment and sits at the core of the whole delivery flow. Its emphasis is on fast feedback: identifying and correcting problems quickly, before the code is integrated.
We run unit tests, compilation checks, static analysis, and coverage measurement (this stage is kept within 5 minutes, which is one of the reasons we split the repositories in the first place). A build is triggered immediately when a developer submits code, the result is fed back to the developer within 5 minutes, and errors are fixed quickly until verification passes.
Our tool is Jenkins, the most popular continuous integration software; through its plug-ins it supports Gerrit and is very powerful.
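In our setup the Gerrit plug-in handles the voting, but to make the feedback loop concrete, here is a minimal sketch of what the end of a CI job amounts to: run the checks, then vote Verified back to Gerrit over its SSH interface. The host, account, and script names are assumptions:

```python
import subprocess, sys

def vote_verified(change, patchset, ok):
    """Report the build result to Gerrit via its SSH CLI
    (hypothetical host/account; the trigger plug-in normally does this)."""
    score = "+1" if ok else "-1"
    note = "Build succeeded" if ok else "Build failed"
    subprocess.run(
        ["ssh", "-p", "29418", "jenkins@gerrit.example.com",
         "gerrit", "review", "--verified", score,
         "-m", f"'{note}'", f"{change},{patchset}"],
        check=True,
    )

# Hypothetical per-module check scripts (see the next paragraph).
checks = ["./compile.sh", "./unit_test.sh", "./static_scan.sh"]
ok = all(subprocess.run([c]).returncode == 0 for c in checks)

# The change number and patchset number arrive as job parameters.
vote_verified(sys.argv[1], sys.argv[2], ok)
sys.exit(0 if ok else 1)
```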
In the actual implementation, each module is required to provide scripts (or an equivalent method) for compiling, running unit tests, and the other steps in a clean environment. The build environment itself can be created from configuration with Vagrant or Docker; internally we use Docker to isolate the various build environments.
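As a sketch of that isolation, each build step can run in a throwaway container so nothing from the host or from other modules leaks in; the image name, paths, and script names below are assumptions:

```python
import subprocess

def run_in_container(image, workdir, command):
    """Run one build step in a disposable Docker container, mounting
    the module checkout at /src so the host stays untouched."""
    subprocess.run(
        ["docker", "run", "--rm",
         "-v", f"{workdir}:/src", "-w", "/src",
         image] + command,
        check=True,
    )

# Hypothetical module whose build contract is "provide these scripts".
ws = "/ci/workspace/mod-storage"
run_in_container("builder/java:8", ws, ["./compile.sh"])
run_in_container("builder/java:8", ws, ["./unit_test.sh"])
```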
Pipeline
3.2.2 Deployment Pipeline
As the name implies, this step deploys the packaged software into the different runtime environments and automatically handles each environment's configuration (domain names, database information, login credentials, and so on). It relies heavily on the preceding steps: repository planning, branch planning, and the continuous integration pipeline.
A typical deployment pipeline
There are several principles to follow when building a deployment pipeline (a sketch illustrating the first two follows the list):
1. The process must be repeatable;
2. Build once, deploy many times;
3. Modular deployment;
4. Change management;
5. Auditability;
6. Fast rollback.
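A minimal sketch of what "repeatable" and "build once, deploy many times" mean in practice: the same artifact, built once from master, is promoted through the environments, and only the configuration differs. The host names, paths, artifact name, and install command are hypothetical:

```python
import subprocess

ARTIFACT = "netdisk-1.2.0.tar.gz"   # built once, from master

def deploy(host, env):
    """Copy the already-built artifact to a host and install it with
    that environment's configuration (hypothetical install command)."""
    subprocess.run(["scp", ARTIFACT, f"deploy@{host}:/opt/releases/"],
                   check=True)
    subprocess.run(["ssh", f"deploy@{host}",
                    f"/opt/bin/install_release /opt/releases/{ARTIFACT} --env {env}"],
                   check=True)

# The pipeline runs the same function for every stage; only the
# production stage waits for a manual trigger.
deploy("test-01.example.com", "test")
deploy("prod-01.example.com", "production")
```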
In selecting a deployment tool we examined two candidates: ThoughtWorks Go and Jenkins (with the Delivery Pipeline plug-in).
Go has pipelines built in, but it is less flexible than Jenkins. One advantage of Jenkins is that our continuous integration already runs on it, so many scripts, and even many jobs, can be reused directly; the disadvantage is that sharing data between the tasks of a pipeline is cumbersome and needs extra plug-ins (such as Copy Artifact), so the implementation is not entirely natural.
In practice, full automation (unattended release) is an ideal state, but reality always imposes constraints, so where necessary we bowed to it. What we finally implemented combines one-click deployment with manual triggering for critical environments such as production (the steps marked with the small arrows in the illustration below):
During implementation, managing configuration files is also a very important issue. Configuration files fall into two main categories (a sketch for the first category follows the list):
1. Configuration that cannot be separated from the running program, as in Java EE applications where the configuration files are packaged with the compiled result into a WAR file. Our approach is to store sensitive information (such as database credentials) in a separate Git repository; when Jenkins builds for different environments, it automatically records both the code version and the configuration version;
2. Configuration that can be separated from the running program, as with Nginx. We package the program as an RPM or DEB; the configuration files live on the Puppet master, and each deployment triggers automatic distribution by Puppet.
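For the first category, a minimal sketch of the version-recording idea: capture which code commit and which configuration commit went into each environment's build, so the deployed artifact can be traced and rolled back precisely. The paths and manifest format are assumptions:

```python
import json, pathlib, subprocess

def head(repo):
    """Return the current commit hash of a checked-out repository."""
    out = subprocess.run(["git", "rev-parse", "HEAD"], cwd=repo,
                         check=True, capture_output=True, text=True)
    return out.stdout.strip()

def record_build(code_repo, config_repo, env):
    """Write a manifest tying one build to exact code and config versions."""
    manifest = {
        "environment": env,
        "code_version": head(code_repo),
        "config_version": head(config_repo),
    }
    pathlib.Path(f"build-{env}.json").write_text(json.dumps(manifest, indent=2))

record_build("/ci/src/app", "/ci/src/app-config", "staging")
```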
Throughout the continuous delivery process we can see exactly which version every link and every node is currently at, which is very useful for keeping a clear picture and rolling back quickly. The screenshot shows the per-environment version information for some modules of one project (please ignore the ugly page details; red means a module is in the middle of being released, not yet final on-line):
4 Epilogue
At present, Lenovo Enterprise Network Disk has fully adopted this continuous delivery system for its releases. From the development environment to the test environment to the production environment, everything runs as a pipeline, which guarantees consistency between code and module versions. Integration and release now take only a few mouse clicks, after which we can sip tea while waiting for the release-success notification. Continuous delivery is a process that requires long-term, continuous improvement: company strategy changes, product requirements change, people change, and the process changes with them. What we have done is only a beginning; we still need to keep exploring and refining to build a more complete delivery system. This is something every software development team needs to care about: establish standards, build processes, use scientific tools to put those standards and processes into practice, and move from a small-workshop delivery model to delivering products on time and on demand.
Source: Lenovo Enterprise Network Disk: continuous delivery practices for a clustered SaaS service, http://www.oschina.net/question/2448759_2186294