Stages of devops

Last Update:2018-12-07 Source: Internet

Author: User

This article is based on my talk "devops is a stage, not a specific state" at DevOpsDays in Sweden. If you are interested, you can watch the talk online, but you do not need to have seen it to follow this article.

Over the past few years, the term DevOps has become familiar from articles, talks, and conversations. DevOps promises faster feedback and lower product iteration costs while improving overall system stability. Those goals are impressive, but while the concept was new and had not yet proven that it could deliver them, DevOps initiatives were easily ignored or abandoned. As DevOps has matured, many companies have benefited from it and a large number of organizations built around it have emerged. Now is a good time to investigate and practice DevOps.

To outsiders, adopting DevOps can look like a single switch, like turning on a light. Seen that way, the change is a daunting task that may never get done. As in traditional engineering, trying to build something complex without breaking it down usually ends in failure. Fortunately, DevOps can be broken down into a series of stages, and your organization controls the content and timing of the changes in each stage.

For ease of understanding, picture a timeline: the far left of the horizontal axis represents the traditional operations model, and the far right represents the DevOps model. With this picture, nobody asks "Has your company implemented DevOps?" any more; the question becomes "How far along is DevOps in your company?"

Note that the points and cases in this article assume a specific organizational structure and are not universally applicable. The assumptions come from my personal experience: I have worked with operations and development teams at several companies (where the operations team was responsible for maintaining the development environment) and on several projects. If these assumptions differ from the situation in your organization, the ideas may not apply. On the other hand, they draw on practical experience in multiple work environments, so they should carry over to similar teams.

DevOps Scope

To better understand the stages we pass through when implementing DevOps, we first need to look in detail at what the far left and far right of the timeline represent.

The far left represents the culture and practice of traditional operations.

The extreme form of traditional operations can be described as "black box operations". In this culture, operations and development are separated and generally do not cooperate; when they do, it is with great reluctance. Its defining trait is that the two teams have opposing goals. The development team's job is to add new features and keep upgrading the product, and its performance is judged accordingly. The operations team's goal is stability first. Without enough communication, the two teams come into conflict: developers are eager to ship new features quickly, while operations staff have no desire to deploy them. Any change to a stable system introduces risk, so operations staff do their best to avoid change.

For example, suppose the code contains a bug that causes an infinite loop under a specific boundary condition, and neither QA nor the testers caught it. If operations deploys the change, CPU usage on some servers will soar to 100% and the service will become unstable. If operations does not deploy the change, there is no problem, or at least no new problem. This is the thinking at the far left, in traditional operations.

The far right represents DevOps fully implemented. Here, development and operations are a single role: development is operations, and operations is development. The team's shared goal is to add new features while ensuring a certain level of reliability.

Once you understand what the two ends of the timeline represent (and it bears emphasizing that both ends are extremes), going from one extreme to the other looks impossible. It only looks that way if you assume the transformation happens in one step. If you divide the timeline into manageable sub-stages, DevOps becomes easier to implement, with clearer benefits and more predictable results.

Cultural and technical changes in DevOps

DevOps requires both cultural and technical change within the organization. Culturally, the traditional mindsets of operations and development need to change so that the teams communicate openly and honestly and share the same goals. Technically, developers need to understand how the operations team works and deepen their understanding of the system architecture, while operations staff need to understand the development process and gain a deep understanding of the code.

After breaking DevOps into stages, I found it easier to introduce DevOps by alternating cultural and technical changes, and the stages that follow use this approach. The reason is that change is hard, sometimes almost impossible. When the changes alternate, each one is more readily accepted. So rather than making one big change, we reach DevOps through a series of cultural and technical changes. The team never feels that its environment has suddenly become impossible to adapt to; the changes feel more natural, and the organization is more likely to accept them.

Metric monitoring everywhere

The first step toward DevOps is to enable metric monitoring at the infrastructure and application layers within the organization; I prefer to call this "monitoring everywhere". Many talks have covered this topic, but in the end it comes down to one key question: what did my code do?

Developers are happy to answer this question by showing you the code. Unfortunately, code only describes what it should do, not what it actually does. Code is like a recipe: it records the steps for making a delicious dish, but it cannot guarantee that the dish turns out delicious. We have all followed a recipe and been disappointed by the result. In the same way, code describes the process for reaching the expected goal, but its actual impact on the system cannot be predicted from the code alone. For example, suppose a developer changes the cache expiration time from 3600 seconds to 1800 seconds. The change itself is obvious, but its impact on the whole system is unknown.
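To make the cache example concrete, here is a minimal sketch of what such a change might look like, together with the only thing that can be derived from the code alone. The constant names and the helper `refreshes_per_day` are hypothetical and do not come from any real codebase.

```python
# Hypothetical cache setting; the names are assumptions for illustration only.
OLD_TTL_SECONDS = 3600
NEW_TTL_SECONDS = 1800


def refreshes_per_day(ttl_seconds: int) -> float:
    """Expiry-driven refreshes per day for one key that is requested continuously."""
    return 86400 / ttl_seconds


print(refreshes_per_day(OLD_TTL_SECONDS))  # 24.0
print(refreshes_per_day(NEW_TTL_SECONDS))  # 48.0
```

The arithmetic only says that a constantly requested key now expires twice as often. Whether backend load, CPU, or response times change noticeably depends on hit rates and traffic, which the code cannot tell you; only measurements from the running system can.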

Operations staff can answer the question by logging on to the machine and reading memory utilization, CPU utilization, and similar figures from the running system to determine how the cache expiration change affected the whole system. This is the right approach, because it reflects the real impact of the code change. With more data, operations staff can analyze that impact in detail. The data answers many important questions, such as "What is the impact of this change on the system?" or "Why is service Y slow after service X was deployed?" In the past, developers answering these questions considered only how the code runs in theory. Real data is more convincing than theory, and the data operations staff look at is exactly that: data from the running production system.

Here we need to remember where we are in the DevOps process. We have taken only a small step from the far left and are still in a traditional operations environment. Simply granting developers access to the production environment therefore achieves nothing at this point. Most developers are not comfortable in that new environment, so they naturally retreat to the one they know; that is human nature. When you try to change an organization without considering whether people will accept the change, nobody supports it, and you eventually have to go back to the old way.

A simple answer is to present the data in a developer-friendly form: charts. Graphing tools have existed for many years, but Graphite and StatsD have become popular in the past few years. Collecting system metrics in Graphite and giving developers an API for their own metrics accomplishes two things at once: operations staff can chart system metrics, and developers can chart application metrics. In addition to statistics on events such as application logins and logouts, developers can now see CPU, memory utilization, and other data.

A developer can record a metric with a single line of code.
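The original code sample did not survive in this copy of the article, so the following is only a sketch of what that one line could look like. It sends a counter increment in the StatsD line protocol (`name:value|c`) over UDP. The daemon address, the helper names `counter_line` and `increment`, and the metric name `app.user.login` are assumptions for illustration.

```python
import socket

# Assumption: a StatsD daemon is listening on its default UDP port on this host.
STATSD_ADDR = ("127.0.0.1", 8125)


def counter_line(name: str, value: int = 1) -> bytes:
    """Format a counter increment in the StatsD line protocol: name:value|c"""
    return f"{name}:{value}|c".encode("ascii")


def increment(name: str, value: int = 1, addr=STATSD_ADDR) -> None:
    """Send the counter as a fire-and-forget UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.sendto(counter_line(name, value), addr)
        except OSError:
            # Metrics must never break the application, so send errors are ignored.
            pass


# The single line a developer adds where the event happens (hypothetical metric name):
increment("app.user.login")
```

Because the datagram is sent over UDP and errors are swallowed, the application keeps working even when no StatsD daemon is running, which keeps the cost of adding metrics low for developers.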

From this monitoring data, Graphite can draw charts of the metrics over time.
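The chart from the original article is not reproduced here. As a rough sketch, a chart for a metric can be requested from Graphite's render endpoint by building a URL like the one below. The host name is hypothetical, and the metric path `stats.app.user.login` is an assumption that depends on how StatsD is configured to prefix metrics.

```python
from urllib.parse import urlencode

# Hypothetical Graphite host; replace with the address of your own installation.
GRAPHITE_BASE = "http://graphite.example.com"


def render_url(target: str, since: str = "-1h", fmt: str = "png") -> str:
    """Build a Graphite render URL for one metric over a relative time window."""
    query = urlencode({"target": target, "from": since, "format": fmt})
    return f"{GRAPHITE_BASE}/render?{query}"


print(render_url("stats.app.user.login"))
```

Opening the printed URL in a browser would show the last hour of that metric as an image; passing `fmt="json"` returns the raw data points instead.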

For a traditional operations team, setting up a monitoring system to display these metrics is easy, and calling the StatsD and Graphite interfaces is simple. For the development team, only a few lines of code are required. With these technical changes, developers can see the impact of their code on the system and gain a general understanding of operations work. At this stage, cooperation between development and operations has begun. It covers only a small area, but you can say you have taken a step toward DevOps. We have chosen a good starting point, and the cooperation that builds on it will become broader over time.

Documented infrastructure

Once developers understand the performance and state of the production system, they naturally become interested in the underlying systems. For many developers, a large production system is a black box: a request goes in and a response comes out, but which systems the request passed through is completely unclear.

To solve this problem, document the infrastructure. Early on, a basic high-level flowchart can show how a request is processed and which software handles it at each stage. As the documentation deepens, it should record what each component in the architecture does and why it was chosen over the alternatives. Beyond specific software, the documentation should also cover how new servers are brought online, potential faults and their fixes, UNIX system tools, and so on. The purpose of the documentation is to make it easier for developers to understand the production architecture from a high level.

With this documentation, developers can learn about the system architecture whenever they want. Because the monitoring system deployed earlier already gives them an easy view of the running system, they are more likely to take an interest in the infrastructure. With metrics and documentation in place, the operations black box gradually opens. There is still not much collaboration between the two teams, but the gap between them is closing quickly.

Make the development environment mirror the production environment

So far, development and operations communicate mainly through monitoring data and documentation. With that knowledge, developers will want to run experiments in a real environment to understand how the system works internally. Doing this in production is impractical and puts system stability at risk. A good solution is to give developers sandboxes to experiment in.

To meet this need, a tool such as Vagrant can package and distribute the development environment as a VirtualBox virtual machine. These virtual machines are provisioned with standard configuration management tools such as Chef or Puppet, or with plain shell scripts. Using the same tools, operations staff can quickly build a development environment that matches production. Developers want to work in such an environment because it is essentially the same as production. They also no longer need to configure their development environment by hand, because operations has taken care of that through Vagrant.
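One way to check that a sandbox really is "essentially the same as production" is to compare the package versions installed in each. The sketch below is not part of Vagrant, Chef, or Puppet; the function `parity_drift` and the version numbers are hypothetical and only illustrate the idea.

```python
from typing import Dict, Optional, Tuple

Versions = Dict[str, str]
Drift = Dict[str, Tuple[Optional[str], Optional[str]]]


def parity_drift(prod: Versions, dev: Versions) -> Drift:
    """Return {package: (prod_version, dev_version)} for every package that differs."""
    drift: Drift = {}
    for name in sorted(prod.keys() | dev.keys()):
        prod_version, dev_version = prod.get(name), dev.get(name)
        if prod_version != dev_version:
            drift[name] = (prod_version, dev_version)
    return drift


# Illustrative manifests; the package versions are made up for this example.
prod = {"nginx": "1.2.6", "python": "2.7.3", "haproxy": "1.4.22"}
dev = {"nginx": "1.2.6", "python": "2.7.5"}
print(parity_drift(prod, dev))
```

Here the result reports that `haproxy` is missing from the sandbox and that `python` differs, while `nginx` matches and is left out. An empty result means the two environments agree on every listed package.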

A development environment built from the production configuration is the developer's sandbox version of the real system. If something goes wrong, delete the virtual machine and create another. The sandbox looks simple on the surface, but working with its configuration teaches developers how servers are provisioned, how operations rolls out changes, and what the system architecture actually looks like.

Devops office hours

Developers now have a sandbox in which to experiment on the "real system". They can also use the documentation to gain a deeper understanding of the system, and they have monitoring data from production. Even so, operations work is still intimidating for developers. Fortunately, everyone is friendly, and now is the time for the two teams to really start working together. That cooperation can start in forums, at a help desk, or simply by talking to someone directly.

In practice, I have found office hours to be the best approach. Office hours are a fixed time slot in an operations or development engineer's schedule, set aside for answering any kind of question. The questions can be as simple as "How do I search for a file on a machine?" or as involved as "Can you explain why we configure this HAProxy parameter the way we do?" The advantage of office hours is that nobody judges you, whatever you ask. Engineers can raise any relevant question during office hours without worry.

At this stage an important milestone has been reached: communication! The development and operations teams understand each other. They communicate, cooperate, and interact.

Reducing the risk of developers doing operations work

Before moving on, I should point out that at this stage the DevOps culture in your organization is already much better than in most organizations. So far we have introduced DevOps culture in an orderly, low-risk way. Next we approach the far right of the DevOps timeline described above. Compared with the previous stages, this one is more radical and not yet fully defined. Still, some organizations have completed the transformation to DevOps and are seeing the benefits.

Developers now have the tools to start doing real operations work and taking responsibility for it. Like the transformation as a whole, this stage can be split into smaller steps to reduce risk and make it easier for people to accept.

The first step is to adopt the standard open-source workflow for changes: pull requests and code review. When a developer wants to add something, he or she makes the change and opens a pull request. Developers can test the change on virtual machines provisioned with Vagrant. The pull request gives the operations team the chance to review and test the change. If there is a problem, it is reported back to the developer and caught before it reaches production. Finally, when the pull request is merged, developers can be confident in and proud of having made the change, and operations staff can rest easy because reviewing the change remained their responsibility.

The second step, still experimental at the time of writing, is continuous integration for operations. At its most basic, a continuous integration server such as Jenkins verifies that the scripts operations uses to build the sandbox environment (possibly managed by Vagrant) work correctly. It can also run basic smoke tests, for example checking that the infrastructure can successfully serve an HTTP request.
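A smoke test of this kind can be very small. The sketch below, using only the Python standard library, checks whether a URL answers with a successful HTTP status; the function name `smoke_test` and the sandbox address in the usage note are assumptions, not part of Jenkins or Vagrant.

```python
import urllib.error
import urllib.request


def smoke_test(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with an HTTP 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, DNS failures, and 4xx/5xx responses.
        return False


def exit_code(url: str) -> int:
    """Map the check to a process exit code: 0 on success, 1 on failure."""
    return 0 if smoke_test(url) else 1
```

A build step could run something like `sys.exit(exit_code("http://127.0.0.1:8080/"))` against the freshly provisioned sandbox, where the address is a placeholder for wherever the virtual machine exposes its web server. A non-zero exit code then fails the build.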

When some or all of this is in place, developers can make changes safely. Operations staff are still responsible for reviewing changes, so they can rest easy. From this point on, operations and development truly work together and share responsibility. Some differences between the two teams remain, but they will soon disappear.

Developers: Crazy!

Now we arrive at the far right of the DevOps timeline: the development team is responsible for all operations work. After the earlier stages, the technical and cultural changes have made DevOps possible. In practice the stages form a continuum in which two separate teams work together more and more closely. The operations team can shrink while the development team grows. Developers perform operations work under the guidance of a few operations engineers. For outages, developers are on call first, and operations staff move back to the second tier.
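The on-call arrangement described above can be written down as a small escalation policy. This is only a sketch: the tier names and the 15-minute threshold are assumptions chosen for illustration, not values from the talk.

```python
from typing import List, Tuple

# (tier name, minutes an alert may stay unacknowledged before this tier is paged)
# The 15-minute threshold is an assumption for illustration.
ESCALATION_POLICY: List[Tuple[str, int]] = [
    ("developer-on-call", 0),
    ("operations-second-tier", 15),
]


def tiers_to_page(minutes_unacknowledged: int) -> List[str]:
    """Return every tier that should have been paged by now, first tier first."""
    return [
        tier
        for tier, page_after in ESCALATION_POLICY
        if minutes_unacknowledged >= page_after
    ]


print(tiers_to_page(0))   # ['developer-on-call']
print(tiers_to_page(20))  # ['developer-on-call', 'operations-second-tier']
```

Developers are paged immediately, and operations engineers are only brought in when an alert stays unacknowledged, which matches their role as guides rather than first responders.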

I want to stress again that this stage depends on a solid foundation from the previous stages. Metrics are collected so that the overall impact of code on the system is understood. Documentation lets developers learn about the production architecture and so better understand how different changes affect the system. Virtual machines and workflows built from automated provisioning scripts save operations staff time and give developers a sandbox to explore. Office hours and forums are safe settings in which developers and operations staff learn from each other and can freely ask whatever they want to know. Automated infrastructure testing and code review give both development and operations a safety net, reducing the risk of changes to the infrastructure. The end result is that the teams communicate more freely, trust each other more, and the boundary between them blurs.

Devops

DevOps has many benefits. First, trust and cooperation within the organization grow. New features are delivered more frequently, and the operations team no longer needs to say "no", because more people are involved in operations and developers share responsibility for changes. Believe it or not, DevOps can also improve system stability, because more people are paying attention to the impact of changes on the system. Because features ship quickly, large upgrades that require downtime become rarer. Changes are smaller and more controllable, and may need no downtime at all.

Where are you on this timeline? Where do you want to be? As long as you are not at the far left, your organization has begun moving toward DevOps. Splitting DevOps into these sub-stages is meant to reduce risk and build your confidence. If a change does not go as expected, you can step back to the previous stage and try again later.

View Original: http://www.infoq.com/cn/articles/wide-range-devops

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
