Weighing the Difference Between DevOps and SRE

Source: Internet
Author: User
DevOps and SRE seem to be two sides of the same coin. Both aim to bridge the gap between the development team and the operations team, and both want to improve the efficiency of software deployment and the reliability of software in operation.

In most companies, the responsibilities and capabilities of the development team and the operations team overlap. So what is the difference between DevOps and SRE, and what does each mean? Let's see.

Development, operations, and reliability
Before DevOps was adopted, development and operations were two independent teams, each with its own goals. The differences and lack of communication between them often hurt the product and, ultimately, the user experience and the company's performance.

To improve communication and build better products, many companies have made DevOps one of their most critical functions.

The definition of DevOps is "a culture and practice of software engineering, aimed at unifying development and operations". The term was originally coined by Andrew Shafer and Patrick Debois in 2008. Although it took several years to become a widely known concept, today a large share of enterprises use DevOps in some form.

The concept of Site Reliability Engineering (SRE) has been around since 2003 and is even older than DevOps. It was created by Ben Treynor, who founded Google's Site Reliability team. According to Treynor, SRE is what happens when software engineers take on the tasks of operations staff.

Like DevOps, SRE brings the development team and the operations team together, helping each become familiar with the other's work and tasks while making the entire application lifecycle visible.

Both DevOps and SRE advocate automation and monitoring, and both aim to shorten the time from development to production deployment without compromising the quality of the code or the product.

Google has pointed out that SRE and DevOps are not very different from each other: in software development and operations they are not competitors, but close friends that aim to break down organizational barriers and deliver better software faster.

Difference between DevOps and SRE
As mentioned earlier, the idea of DevOps is to combine development and operations, define how the system should behave, and understand what needs to be done to bridge the "gap" between the two teams. In short, DevOps theory is about what needs to be done to bring the development team and the operations team together.

According to Google, this is the main difference between DevOps and SRE. DevOps only cares about what needs to be done, whereas SRE talks about how it can be done. SRE turns the theory into an effective workflow by using the right methods and tools. This also involves sharing responsibility across the organization and giving everyone the same goals and vision.

To further illustrate the difference between the two, Google has released a series of videos and posts comparing DevOps and SRE. In an article written by two Google employees (Seth Vargo and Liz Fong-Jones), they explained the SRE approach in the following five areas:

1. Reduce organizational silos

In large organizations with complex structures, many teams often work independently. Each team pushes the product in a different direction and does not communicate with the rest of the company, so no team sees the product as a whole. This can cause problems at deployment time.

The job of DevOps is to reduce silos and ensure that different teams share the same end goals, aligning them around a common vision.

SRE is less concerned with how many silos the company has and more with how to involve everyone. It does this by using the same tools and techniques throughout the company, which in turn helps share ownership among everyone.

2. Accept failure
DevOps aims to prevent failures before they occur, but failures cannot be avoided entirely, so DevOps treats them as inevitable.

SRE goes a step further and defines a formula for quantifying failures. In other words, SRE accepts that failures will happen but wants to make sure there are not too many of them.

This formula is built on two key metrics: Service Level Indicators (SLIs) and Service Level Objectives (SLOs).

SLIs measure how requests behave, using values such as request latency, throughput in requests per second, and the number of failures. SLOs, in turn, state the target level of success an SLI should reach within a certain period of time.
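
As a rough illustration of how an SLI and an SLO fit together, the following Python sketch computes an availability SLI and a latency SLI from a list of request records, then checks how many failures are still allowed under a target. The `RequestRecord` shape, the function names, and the 99.5% target are assumptions made for this example and do not come from any specific monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """One observed request (hypothetical shape, chosen for this example)."""
    latency_ms: float
    succeeded: bool


def availability_sli(records):
    """SLI: fraction of requests in the window that succeeded."""
    if not records:
        return 1.0  # assumption: a window with no traffic counts as available
    good = sum(1 for r in records if r.succeeded)
    return good / len(records)


def latency_sli(records, threshold_ms):
    """SLI: fraction of requests answered within threshold_ms."""
    if not records:
        return 1.0  # same assumption as above
    fast = sum(1 for r in records if r.latency_ms <= threshold_ms)
    return fast / len(records)


def error_budget_remaining(records, slo_target):
    """Number of further failures the window can absorb before missing the SLO."""
    allowed = (1.0 - slo_target) * len(records)
    failed = sum(1 for r in records if not r.succeeded)
    return allowed - failed


if __name__ == "__main__":
    # 1000 requests in the window, 3 of them failed (made-up data).
    window = [RequestRecord(120.0, True)] * 997 + [RequestRecord(900.0, False)] * 3
    sli = availability_sli(window)
    print(f"availability SLI: {sli:.4f}")
    print(f"meets 99.5% SLO: {sli >= 0.995}")
    print(f"failures left in budget: {error_budget_remaining(window, 0.995):.1f}")
```

A negative result from `error_budget_remaining` would mean the window has already missed its objective.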

3. Implement gradual change
More and more companies want to release frequently, update and iterate on their products continuously, and keep team members up to date with new and related technologies.

DevOps shares this goal, but the change must be rolled out in a gradual and manageable way. Both DevOps and SRE want to move quickly, and SRE emphasizes reducing the cost of failure while doing so.
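
One common way to make change gradual is a staged rollout, where a new release receives a growing share of traffic only while its error rate stays acceptable. The sketch below is a minimal version of that idea; the stage percentages, the function name `next_rollout_step`, and the rollback-to-zero rule are illustrative assumptions, not something the Google article prescribes.

```python
def next_rollout_step(current_percent, observed_error_rate, max_error_rate,
                      stages=(1, 5, 25, 50, 100)):
    """Decide what share of traffic the new release should receive next.

    Returns 0 (roll back) when the observed error rate exceeds the limit,
    otherwise the next larger stage, or the current value at the final stage.
    """
    if observed_error_rate > max_error_rate:
        return 0  # failure is cheap here: only a small share of users saw it
    for stage in stages:
        if stage > current_percent:
            return stage
    return current_percent  # already fully rolled out


if __name__ == "__main__":
    percent = 0
    # Made-up error rates observed after each step.
    for error_rate in (0.001, 0.002, 0.001):
        percent = next_rollout_step(percent, error_rate, max_error_rate=0.01)
        print(f"error rate {error_rate:.3f} -> serve {percent}% of traffic")
```

Starting with a small percentage limits how many users a bad release can affect, which is one way to keep the cost of failure low.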

4. Tooling and automation
As mentioned earlier, automation is one of the main focuses of both DevOps and SRE. Both encourage as much tooling and automation as possible, reducing errors for developers and operators by eliminating manual work.

5. Measure everything
Automated workflows require continuous monitoring. Both DevOps and SRE teams need to ensure that they are moving in the right direction and do this by measuring everything.

The main difference here is that SRE is built around the idea that "operations is a software problem", so it defines specific ways to measure availability.

SRE also ensures that everyone in the company knows how to measure reliability and what to do in the event of a failure.
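
To make an availability measurement concrete, it can help to translate a target percentage into the downtime it permits. The short Python sketch below does that arithmetic; the function name `allowed_downtime` and the 99.9% example target are assumptions for illustration.

```python
from datetime import timedelta


def allowed_downtime(availability_target, period):
    """Downtime permitted within `period` for a target given as a 0-1 fraction."""
    if not 0.0 <= availability_target <= 1.0:
        raise ValueError("availability_target must be between 0 and 1")
    return period * (1.0 - availability_target)


if __name__ == "__main__":
    budget = allowed_downtime(0.999, timedelta(days=30))
    print(f"99.9% over 30 days allows {budget.total_seconds() / 60:.1f} minutes of downtime")
```

A 99.9% target over 30 days leaves roughly 43 minutes of downtime, which gives everyone in the company the same concrete number to reason about.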

What does reliability mean?
Above, we discussed the division of responsibilities, accepting failure and measuring everything. Now, we need a way to ensure that everything is indeed working and reliable. In other words, there should be a unified method to measure the reliability of each level.

SRE measures reliability with SLIs and SLOs. A DevOps team measures the failure rate and the success rate over a period of time, and the two approaches usually rely on different tools and methods. Reliability is not only about infrastructure, but also about application quality, performance, and security.

Problems can occur in different parts of the application, and when a failure happens, we need reliable data to understand its cause. Broken down, that data includes:

Stack traces
Variable state
JVM state: threads, environment variables
Related log statements (including DEBUG and TRACE in production)
Event analysis (frequency, failure rate, deployment, application)

Since this data is vital, we must ensure that it is reliable and actionable.
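
The kinds of data listed above can be pictured as one record per failure, which then feeds the event analysis. The Python sketch below is one possible shape under stated assumptions: the `FailureEvent` fields, the helper `failure_rate_by_deployment`, and the sample values are all hypothetical and not taken from any particular monitoring product.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FailureEvent:
    """Context captured for one failure (hypothetical field names)."""
    application: str
    deployment: str
    stack_trace: str
    variables: dict = field(default_factory=dict)   # variable state at failure time
    log_lines: list = field(default_factory=list)   # related log statements


def failure_rate_by_deployment(events, requests_per_deployment):
    """Failures divided by total requests, for each deployment with traffic."""
    failures = Counter(event.deployment for event in events)
    return {
        deployment: failures[deployment] / total
        for deployment, total in requests_per_deployment.items()
        if total > 0
    }


if __name__ == "__main__":
    events = [
        FailureEvent("checkout", "v1", "NullPointerException at ...", {"cart": None}),
        FailureEvent("checkout", "v1", "TimeoutException at ..."),
        FailureEvent("checkout", "v2", "NullPointerException at ..."),
    ]
    print(failure_rate_by_deployment(events, {"v1": 100, "v2": 50, "v3": 10}))
```

Grouping failures by deployment is one simple way to see whether a particular release made reliability worse.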

To sum up
SRE has a clear definition and puts forward a series of direct expectations. However, DevOps is more like a "free spirit", and its definition and views vary from organization to organization.

Even so, DevOps and SRE teams are not very different. Both help to integrate development and operations teams, take on similar responsibilities, and focus on achieving automation and reliability.

Most importantly, everything comes down to data. You need data to measure success and failure, and to work out how to achieve continuous reliability across the application.