Google's code base has more than 2 billion lines of code: how is it managed?



Published by IT Blue Panther on 2015/10/22

How big is Google? To answer that question, we could look at its revenue, its stock price, its number of customers, or its influence. But that is not enough. In terms of scale, Google is above all a huge software empire, and one way to show it is to look at the size of its code base.

On Monday, at an engineering conference in Silicon Valley, Google employee Rachel Potvin addressed the question of code volume. By her estimate, the software behind the Google internet services you normally use (including search, email, and maps) totals about 2 billion lines of code. By contrast, Microsoft's Windows operating system, the world's most complex PC operating system, has been developed and refined since the 1980s, and its code size is just 50 million lines.


So rebuilding all of Google's code once is equivalent to building Windows 40 times. The comparison may be questioned: Windows is a single piece of operating system software, whereas Google Search, Gmail, and Google Maps are several separate products, so comparing the code of one product with the code of several seems unreasonable. In fact the comparison does make sense, for the reason explained next.
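The factor of 40 follows directly from the two line-count estimates quoted above. A minimal Python check of that arithmetic, where the constant and function names are illustrative only and the figures are the article's rough estimates rather than measurements:

```python
# Line-count estimates quoted in the article (rough figures, not measurements).
GOOGLE_LINES = 2_000_000_000   # Potvin's estimate for Google's services
WINDOWS_LINES = 50_000_000     # the article's figure for Windows


def size_ratio(larger: int, smaller: int) -> float:
    """Return how many times larger the first code base is than the second."""
    return larger / smaller


ratio = size_ratio(GOOGLE_LINES, WINDOWS_LINES)
print(f"Google's code base is about {ratio:.0f} times the size of Windows")
```

The ratio compares raw line counts only; it says nothing about how long either code base actually takes to build.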

Just as the code underlying Windows forms a single system, Google's 2 billion lines of code drive the entire Google service as a single whole. These 2 billion lines support Google Search, Google Maps, Google Docs, Google Plus, Google Calendar, Gmail, YouTube, and a variety of other Google internet services. Moreover, they are stored in a single code repository that Google's 25,000 engineers all work in. Google treats its code as one "huge operating system". "Although I can't prove it," Potvin says, "I would guess it's the largest single code repository in the world."

Google's situation is a special case. But it shows how complex software has become in the internet age, and how we can cope with that complexity by changing our tools. Google's enormous code repository is open to internal employees only. However, the rest of us have a similar tool: GitHub, an open source repository platform that is open to engineers all over the world, where anyone can share a vast amount of code. Times have changed, and today's engineers are able to work together in a very large code base. This is what lets modern internet services keep evolving so quickly.

"Having 25,000 developers share one code base at a big company like Google means those developers are numerous and diverse and bring different skills," says Sam Lambert, director of systems at GitHub. "However, small businesses can get the same advantages with GitHub and open source. As the old saying goes: a rising tide lifts all boats."

"Building and running 2 billion lines of code all at once is no joke. It must be a technical challenge, a great feat. The 2 billion figure is undoubtedly staggering."

The genius of GitHub is that it lets programmers share and collaborate at low cost. But GitHub, unlike Google, doesn't store everything as a single software project. It stores millions of small projects. Google goes one step further and combines countless small projects into one. Considering how many engineers and how many projects are involved, that sounds a little crazy. But according to Potvin, Google has done it.

Listen to the Piper

In short, Google has built its own "version control system", which it uses to manage all of its code. The system, called Piper, runs on top of the massive online infrastructure that Google built to run all of its online services. According to Potvin, the system is distributed across 10 different Google data centers.

The point is not merely that Google's engineers have access to 2 billion lines of code, but that every Google engineer is free to use and combine the countless projects in the repository. "You create a new project," Potvin told Wired magazine, "and there is an incredibly rich library of code for you to use. Basically everything you need is readily available." Better still, when an engineer modifies code, the change can be deployed right away and is then reflected across all of Google's services. Update one place and you have updated everything.

Of course, there are limits to this system. Potvin says some highly sensitive code, such as Google's PageRank search algorithm, is kept in a separate code repository that only certain authorized employees can see. And because they do not run on the internet and so differ from the rest of the code, the source code for Google's two device operating systems, Android and Chrome, is also kept under separate version control. But Google stores the vast majority of its code as a single whole, and engineers can use it to build modules, propose innovations, and implement solutions.

Robot Factors

Lambert points out that building and running a system like this requires not only know-how but also enormous computing power. Piper holds about 85 TB of data (85,000 GB), and Google's 25,000 engineers make about 45,000 commits per day. That level of activity is no joke. The open source Linux operating system comprises 15 million lines of code in 40,000 files, while Google engineers modify 15 million lines of code in 250,000 files every week.
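To put these figures in perspective, here is a small back-of-the-envelope sketch in Python. It uses only the numbers quoted above and makes two assumptions that are not from the source: 1 TB is taken as 1,000 GB (the decimal convention), and commits are treated as evenly spread over a 24-hour day. All names are illustrative.

```python
# Figures quoted in the article; treat them as rough estimates.
PIPER_TERABYTES = 85
ENGINEERS = 25_000
COMMITS_PER_DAY = 45_000
LINUX_LINES = 15_000_000
LINES_MODIFIED_PER_WEEK = 15_000_000

GB_PER_TB = 1_000            # assumption: decimal (SI) convention
SECONDS_PER_DAY = 24 * 60 * 60

# Repository size expressed in gigabytes.
piper_gigabytes = PIPER_TERABYTES * GB_PER_TB

# Average commits per engineer per day.
commits_per_engineer = COMMITS_PER_DAY / ENGINEERS

# Average gap between commits, assuming an even spread over the day.
seconds_per_commit = SECONDS_PER_DAY / COMMITS_PER_DAY

# Weekly modified lines measured in "whole Linux code bases".
linux_equivalents_per_week = LINES_MODIFIED_PER_WEEK / LINUX_LINES

print(f"Repository size: {piper_gigabytes:,} GB")
print(f"Commits per engineer per day: {commits_per_engineer:.1f}")
print(f"Average seconds between commits: {seconds_per_commit:.2f}")
print(f"Linux-sized volumes modified per week: {linux_equivalents_per_week:.0f}")
```

Real commit traffic is certainly burstier than this uniform average suggests, so the per-second figure is only an order-of-magnitude guide.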

At the same time, Piper has to lighten the load on programmers, so that they can concentrate on changing their own code without stepping on each other's toes. Programmers also need to be able to remove obsolete or faulty code from the code base. This task is very difficult, so it cannot be done entirely by hand. Google has switched from the Perforce version control system it used previously to Piper, and Piper relies on automated bots to handle most submissions.

Of course, this does not mean letting robots write code. Instead, the robots automatically generate the large number of data and configuration files needed to run the software. "In order to keep your code healthy, you need to do a lot of specific work," Potvin said, "and our approach is to let robots share that work rather than leaving it all to people."

A Piper that everyone can use

So can other companies benefit from a system like Piper? Certainly, and some already do. The total code volume of the Facebook app has reached 200 million lines, and Facebook also treats that code as a single project. Other companies do the same, only on a smaller scale. For now, working at this scale is feasible only for companies about the size of Google or Facebook. But Google and Facebook are exploring new ways to let everyone benefit from it.

The two IT giants are now working on an open source version control system that can handle very large code bases and is available to everyone. The work builds on the existing Mercurial system. "We are trying to extend Mercurial to the size of the Google code repository," Potvin pointed out. Google is now working with programming guru Bryan O'Sullivan and other Facebook programmers to achieve a breakthrough.

This may seem a bit extreme. After all, most companies do not work with code at the scale of Google or Facebook. But in the near future, they will.
