Coding CTO Sun Yucong: People, Technology and Processes


Let me introduce myself first. I joined Google in 2007 as an SRE at the Mountain View headquarters, and I joined Coding early this year.

At Google I worked on two projects. The first was YouTube, covering video transcoding and streaming. The scale is very large: every month adds storage on the order of 1 PB that has to be stored and transcoded. We also ran a global CDN whose peak reached ten TB, with 100,000 nodes worldwide, each a 24-core machine running at full load. Then I left the YouTube team to join the Google Cloud Platform team, whose main job is managing Google's machines worldwide, about one million of them. The last thing I worked on before leaving Google was the Omega project, a cluster management system that handles task scheduling and coordination for Google's entire cloud platform. Many of you may say "that's no use to us", because this website does not exist in China. (Laughter)

Coming back from Google to be CTO at Coding was a big change in my life. Recently I saw a good question online: "What is it like to go from a big company to being the CTO of a small one?" Let me quote one good answer: "Under the title of CTO you recruit, train and encourage the programmers; pulling network cable, checking the server room and installing systems are CTO jobs too; so are discussing proposals, pushing proposals, setting plans, tracking progress, delaying progress, appeasing the programmers, cursing the boss and reassuring the boss." That list does not even include coding, and my job includes coding as well, which is very sad. (Laughter)

So I sorted the question into categories.

What is a CTO?
First, he is the company's motivational coach (although my image does not really suit a motivational coach).
Second, he may be the network administrator. I really have done this: plugging in network cables and racking servers.
Third, he may be the punching bag: the programmers are very dissatisfied with me, and so is the boss.

These three roles map onto three responsibilities:

First, managing the R&D staff: how to recruit the right people and how to get them to collaborate better.
Second, managing the technology and the development environment: once you have these people, how do you get them to work better together and develop more efficiently?
Third, managing the development process: how to make the company, as a machine, run more flexibly and smoothly.

So I propose three elements: people, technology and processes. Each is indispensable to a company's R&D system.

Let's start with people. Everyone says a startup needs full-stack engineers. What does a full-stack engineer look like? Generally speaking he is like a warrior on a white horse with a golden spear, who can solve any problem and take on hundreds of opponents single-handedly. But sometimes we cannot get full-stack engineers, so we recruit "do-everything engineers", a term we coined recently. This is someone who does whatever is needed: he writes the website, then iOS, and finally Android. Such people are hard to find too, and it is harder still to arrange these people sensibly so that they collaborate in an organized way.

To explain how R&D at Coding has evolved, let me first talk about Coding's service architecture. This is our architecture as of last May (the first release): very simple, with only one core program. After more than a year of evolution it now looks like this. I deliberately made the text very small because I do not want you to read it clearly; being complex does not make it right. How does a simple architecture evolve, step by step, into a complex one? Here I would like to introduce Conway's law, which says:

"Any organization that designs a system will produce a design whose structure is a copy of the organization's communication structure."

Recall how Coding used to communicate. The boss would say "we want a new feature" and we would start breaking it down: what changes at the front end and who does it, what changes at the back end and who does it, then the service layer, the DB layer, testing and deployment. Everyone did something different and everyone did only one thing; everyone was a screw on the assembly line. What did this cause? Front-end programmers waited on back-end programmers ("I'll start once the interface is ready"), back-end programmers waited on the database ("I'll start once that field is added"), and testing and deployment waited even longer. At every progress meeting the boss asked how far along the feature was; the front end said the back end had not finished the interface, and the back end said the database was not done. It was often like this.

So today Coding organizes itself in a way we call full stack. Our meaning is not quite the usual one: by full stack we mean the full stack of a product, of a feature. How is it implemented? For any reasonably competent programmer, language is not the bottleneck and should not be a constraint. That is the kind of full-stack engineer we want in our company. For any feature, I tell one person: you are responsible for this feature from start to finish; if the front end needs changing you change the front end, and if the back end needs changing you change the back end. This can be hard at first, because back-end engineers are not very familiar with the front end and front-end engineers are not very familiar with the back end, so we use other means to get over that. The process forces knowledge sharing within the organization.

Many engineers at big companies care only about their own small piece. A front-end engineer, for example, writes only front-end code and ignores the back end: "I don't dare touch it, and nobody lets me touch it." In a small company everyone has to understand the technologies the company uses, and we work together to reduce the complexity of the system. Full stack is our company's strategic direction. We hope everyone has a sense of ownership and can carry a feature from the front end to the back end when needed. But to make full stack work, the development environment and the development tools have to be adapted, so next let's talk about the technical means we use to support it.

Technical means in a development environment come down to three elements:
The first is how the code is managed and worked with.
The second is running it: front-end code, for example, has complicated requirements to run, so how do you set that up?
The third is how you deploy and operate what you have written. The development environment and the production environment are often far apart.

Let me start with code. First I will describe what it is like to be a programmer at Google. Google's greatest strength is its R&D system, which after more than ten years of accumulation and tempering is now extremely efficient. So how does this "non-existent" company manage its code?

First, the entire company has only one repository and one version control system. That is easy to say and very hard to do. Many people think it sounds feasible and then ask: are they using SVN? Obviously not. SVN cannot do it, and neither can Git. So how do you have only one repository? The catch is that all the code lives under one directory. If you had a big enough disk you could check out the whole company: one directory with many subdirectories, perhaps dozens of levels deep, and probably hundreds of GB of code in total. How to synchronize, manage and work with that is Google's secret recipe.

What good is this?
First, you can see anyone's code: how a service is implemented, why it throws an exception, what the error is.
Second, it makes reuse much more efficient. Say I write a small program and my Gmail program uses a piece of Bigtable code: I just reference it directly in my code, since everything is in one directory and the build environment is the same. Along with this repository comes a build system that can compile any program with one command. Building Gmail is one line: a build command plus Gmail's path. Go, Python and C++ are all built with the same command, and you do not need to care how it is implemented underneath; it just runs. After that, deployment and development are easy: if my business code depends on a service, I can easily start that service, make minor changes to it and run it again.

Much more could be said, and we all understand the principle, but how do you actually achieve it?

How does Coding organize its code? We do not have Google's centralized code service, so we had to arrange our code structure so that everyone can reuse and reference each other's code. Because we provide Git hosting, we use Git repositories ourselves, but each project is a separate repository: front-end code in one repository, back-end code in another, and each service in its own repository.

This approach creates a problem: how do you keep the repositories in sync?

We use an open-source Google tool from Android called repo; if you have done Android development you may have used it. What does repo do? It defines a workspace with a fixed format and structure, and everyone syncs with the repo tool: this repository should be at this commit, that repository at that commit. The whole company uses the same kind of workspace, which ensures that everyone sees the same code.

The benefits of defining the code structure this way are:
First, code reading: our Coding code-reading feature can work from this workspace, so you can see each project's code and how projects reference each other.
Second, quality analysis: you can run quality analysis over the whole workspace. The workspace has a repo.sh command, and the key piece is default.xml, the configuration file for that command. Everyone opens the workspace and runs repo sync, and it automatically updates every component to the latest version. We despise XML too, but there is no way around it; that is how the program was written. The file is actually very simple: it mainly defines a list of projects, each mapping a repository path to a local path. A more advanced feature is syncing several projects at once: repo.sh sync -j4 uses four threads to synchronize. In practice this is very easy to use, and having it in place laid the groundwork for everyone's development environment.
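
As a rough illustration of what such a manifest contains, here is a minimal Python sketch that reads a default.xml-style file and lists which repository maps to which local path. The remote URL, project names and revisions are invented for the example and are not Coding's real configuration; only the element and attribute names follow the repo manifest format.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest: the URL, project names and revisions are invented.
MANIFEST = """
<manifest>
  <remote name="origin" fetch="https://git.example.com/" />
  <default remote="origin" revision="master" sync-j="4" />
  <project name="frontend" path="src/frontend" />
  <project name="backend" path="src/backend" revision="release" />
</manifest>
"""


def list_projects(manifest_xml):
    """Return (repository name, local path, revision) for every <project>."""
    root = ET.fromstring(manifest_xml.strip())
    default = root.find("default")
    default_revision = default.get("revision") if default is not None else None
    projects = []
    for node in root.findall("project"):
        name = node.get("name")
        # The manifest format uses the project name when no path is given.
        path = node.get("path", name)
        revision = node.get("revision", default_revision)
        projects.append((name, path, revision))
    return projects
```

Calling list_projects(MANIFEST) yields one tuple per repository, which is the same mapping the sync command walks when it updates the workspace.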

With the tools in place, what kind of development environment do we really want?

What I took away from Google is that we want a development environment that is unified, defined in code, replicable and reproducible.

First, whenever a new colleague joins the company, whether he brings his own computer or uses a company one, his tools differ and his environment differs, so getting the company's code running on his machine is actually quite hard. The usual solution is to write documentation, but that is painful and wastes time.

Second, if you cannot replicate and reproduce the development environment, how do you automate your testing? You cannot have someone set it up and run it by hand, with it breaking today and being patched tomorrow.

How do we achieve a unified development environment? Our approach is to define a common interface. I think the internals of the build system do not matter; any implementation will do. The interface is a hundred times more important than the implementation. Imagine what it is like when every program and every component is built and run in the same way. We define a build.sh and a package.sh. Whether a component uses Java, Python or Ruby, its build.sh produces the build result. I do not care how the script underneath is written; I just run the build, and when I change a line of code it builds something new. This is our rather homegrown way of doing it.
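
The value of a shared entry point is that one small driver can build everything without knowing any component's language. The following Python sketch shows the idea under stated assumptions: it assumes each component is a directory directly under the workspace that contains its own build.sh, and the function names are hypothetical, not Coding's actual tooling.

```python
import subprocess
from pathlib import Path


def find_components(workspace):
    """Return every directory directly under the workspace that has a build.sh."""
    return sorted(script.parent for script in Path(workspace).glob("*/build.sh"))


def build(component):
    """Build one component through the shared entry point, whatever its language."""
    result = subprocess.run(["sh", "build.sh"], cwd=component)
    return result.returncode == 0


def build_all(workspace):
    """Build every component and report which ones succeeded."""
    return {component.name: build(component) for component in find_components(workspace)}
```

The driver never inspects what build.sh does, which is the point: a component can switch from Java to Python and nothing above it changes.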

Google recently open-sourced Bazel, a build tool written in Java. You give it a build command followed by a target: two slashes denote the root of the whole workspace, then coding is the project and server is the target within the project, as in bazel build //coding:server. With this, you think about building any project in logical layers rather than physical ones. Logically, I want to build the coding server; it may reference third-party libraries, header files, Ruby programs or Java programs, and none of that matters. I just say "build the coding server" and it does. What makes Bazel better is that it handles recursive dependencies automatically: one rule can depend on another rule.
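
To make "one rule can depend on another rule" concrete, here is a small Python sketch of recursive dependency resolution: given a target, it works out an order in which the target and everything it depends on can be built. This only illustrates the idea and is not how Bazel is implemented; the target labels and the dependency table are hypothetical.

```python
# Hypothetical targets, written in the //package:target label style.
DEPS = {
    "//coding:server": ["//coding:api", "//third_party:json"],
    "//coding:api": ["//third_party:json"],
    "//third_party:json": [],
}


def build_order(target, deps):
    """Return the target's dependencies, then the target, in a buildable order."""
    order = []
    state = {}

    def visit(label):
        if state.get(label) == "done":
            return
        if state.get(label) == "visiting":
            raise ValueError(f"dependency cycle at {label}")
        state[label] = "visiting"
        for dependency in deps[label]:
            visit(dependency)  # dependencies are resolved recursively
        state[label] = "done"
        order.append(label)

    visit(target)
    return order
```

Asking for //coding:server is enough; the shared JSON library is built once, before both of the targets that need it.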

Besides the build system we also need a replicable, reusable development environment. How do we provide it?
We use Vagrant and Docker. Vagrant is a management tool for VMs: it creates a new VM that is defined in code, and on that VM each service runs as a Docker container.

Vagrant does three things:
First, it downloads a base box from a designated location. We build the base box ourselves, for example an Ubuntu image plus some local dependencies.
Second, it supports script definitions: you can run shell scripts to customize the machine.
Third, you choose a so-called provider, for example the local VirtualBox provider; there are also remote providers for many cloud vendors.

After these steps it produces a VM that you can SSH into with one command, with everything already in place; this is your development environment. With it the whole company has a consistent development environment: because it is a VM, it runs the same way on any machine, and all the dependent libraries can be baked in. The end result is an image of a few GB on our intranet. When a new colleague arrives we have him install Vagrant and VirtualBox, then he runs one command that downloads the image and starts it, and with one more command the whole Coding project is running on his machine.

With the reusable development environment in place, we also built what we call one-click run. One-click run is a different level from development: it does not care how things are built, only which services get started, for example which services the Coding development environment needs. We use a script we wrote ourselves to orchestrate Docker. Jobs are defined in its configuration file, and each job specifies an image plus settings such as the version of the program to run and the environment variables. In practice, most of the time we use the go run command to start many services and images, which gives us the unified, code-defined, replicable and reproducible development environment described above.
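
A sketch of what such an orchestration script might do, written in Python purely for illustration: it turns each job definition into a docker run command line. The job names, images and environment variables below are hypothetical, and Coding's own script and configuration format are not shown in this talk.

```python
# Hypothetical job definitions; names, images and variables are invented.
JOBS = [
    {"name": "mysql", "image": "mysql:5.6", "env": {"MYSQL_ROOT_PASSWORD": "dev"}},
    {"name": "coding-server", "image": "example/coding-server:dev", "env": {}},
]


def docker_run_command(job):
    """Translate one job definition into the argument list for `docker run`."""
    command = ["docker", "run", "--detach", "--name", job["name"]]
    for key, value in sorted(job["env"].items()):
        command += ["--env", f"{key}={value}"]
    command.append(job["image"])  # the image tag pins the program version
    return command


def plan(jobs):
    """Return the commands that would start every job, in definition order."""
    return [docker_run_command(job) for job in jobs]
```

Keeping the definitions as data means the same file can drive a developer's VM and an automated test run.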

Technical tools are what I miss most about Google, because they bring many benefits. First, they encourage collaboration within the company. When people write a program, their first thought is not to build some small thing of their own; more often we look at what others in the company have already built, how it is implemented, whether it can be referenced or reused, and whether a common library can be extracted.
Second, they let people get started quickly, new colleagues and old ones alike. More often it is an existing colleague switching projects: someone who was only a user can instantly become a developer. Seamless switching between development environments quietly saves the company a lot of operating cost.
Third, they make automation possible. The Vagrant command we just ran can also be automated: for every Code Review we can automatically start a new VM in the background, download everything, run the tests, and report whether the change is right or wrong and what it affects. I think this is a key point: only once you have this environment can you plug in quality analysis and other productivity tools.

Finally, a few words about managing the process.

As CTOs, our modest dream is to keep delivering better software. The boss says you have to keep plodding forward like an old ox and finish your tasks on time.

There are many practices within the company. The first is Code Review, which is a mix of process and tool:

First, Code Review is definitely a brain-fart detector. Everyone has brain-fart moments, and the review period is a buffer that makes you ask whether you really want to do this. It is good to have someone check for you.
Second, Code Review is a good way to share knowledge. When I write a feature myself and give the code to someone else to read, that person learns about it; in future he may work on your feature and you may move over to do his work. Encouraging knowledge sharing within the company is important.
Third, it generates ideas. While sharing knowledge, people easily notice: I hit this problem somewhere else too and solved it this way, so why didn't you? Is there a better way to solve it? Code Review encourages this kind of communication.

What goes wrong when Code Review is done badly?

The first is the programmer chain of contempt: senior programmers look down on new programmers and say "what you wrote is too bad, I can't be bothered to read it". This is very bad behavior and must be avoided. We set some procedural requirements: every time you review, you must read the whole thing. You cannot say "I read one line, found it too rotten to continue, fix it and I'll look again"; that is not allowed.

The second is ownership: who is responsible for code that has been reviewed? Our rule is that whoever writes the code is responsible for it. If you write rm -rf, you are responsible for it. If I reviewed it, thought it reasonable and missed the bug, that is your problem, not the reviewer's. This is what we practiced at both Google and Coding.

The third is embracing change. An employee may write a program, think it is really good, and say: don't change it, once you change it I can't read it any more. So he resists any change, which is also wrong. At Google and at Coding alike we hold to one principle: if you have a business case, you can change the code. As long as the change makes the code better, it must be allowed. This is key: the code you write belongs to the company, and the question is how the company can use the code to do better things. You have to make your reasoning clear: why this code has to change and what the change gains. Turn it into a technical discussion rather than a discussion about responsibility and authority.

So I think these are the more important things to get right with Code Review. Now let's talk about the Release Schedule.

Startups and big companies differ, and every company has its own Release Schedule. We used to run basically on shouting: someone shouted "we want to go live today", so everything went live; we released whenever we wanted. If you shouted back at the boss that it could not go live today, the boss would think about it and say forget it, tomorrow then. It was not very serious. Our internal reform was to turn the sprint into a long-distance run. We have been a startup for more than a year and cannot stay in sprint mode forever. We want to be long-distance runners who deliver high-quality software continuously, instead of working overtime every day and finishing after midnight.

The method is very simple: a fixed two releases a week. I persuaded the boss that twice a week is enough and three times is too many. Google releases once a month, and big projects release only once a year; for a startup, doing this is already very good. What does twice a week look like? On Monday the build goes to the Staging test environment and on Tuesday it is released; on Wednesday the next build goes to Staging and on Thursday it is released, leaving Friday for everyone to write code and fix bugs. Two releases a week is actually very frequent, but to sustain the pace we need at this stage, this is what we adopted. The benefit of putting the schedule on the table is that it lets you plan: my feature ships next Tuesday or next Thursday. Where you used to fight with the project manager, he now asks "Tuesday or Thursday?", the programmer says Tuesday might not be possible, so Thursday it is. Without any fuss, the sprint has become a long-distance run.
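
The planning question "does my feature ship next Tuesday or next Thursday?" is simple enough to compute. A minimal Python sketch, assuming only the Tuesday and Thursday release days described above; the function name is hypothetical.

```python
from datetime import date, timedelta

# Release days from the schedule above: Tuesday and Thursday (Monday is 0).
RELEASE_DAYS = {1, 3}


def next_release(today):
    """Return the first release day strictly after `today`."""
    day = today + timedelta(days=1)
    while day.weekday() not in RELEASE_DAYS:
        day += timedelta(days=1)
    return day
```

A change that misses Thursday's release therefore waits until the following Tuesday, which is the trade-off the fixed schedule makes explicit.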

Next, we want to distinguish the Feature team from the Infrastructure team; Infrastructure team is a term we used at Google. What does it mean? Although everyone is working on business logic, we must set aside some time for technical evolution. You cannot just show up every day and write code by copy and paste, or everything becomes a mess. Internally we defined three points:

The first is called Coding One: all the projects, technologies and services in our company should run in the same way, for example on the same Java version. This is very hard, and I think many companies cannot do it. Using the same third-party libraries is especially hard: people use different libraries to do the same thing, so everyone's code looks similar but is not the same. Coding One is meant to solve this: the Java version, the versions and kinds of third-party libraries, the build method, the runtime environment and the startup method should all be the same. This reduces friction between programmers and improves everyone's efficiency.

The second is Coding Two. When we started the company we ran everything on one machine; if it broke, everything went down. Every update felt too dangerous, so we did it after midnight, and midnight was not late enough, so we did it at three or four in the morning, because there was no backup and no grayscale release. Coding went from zero to one, and now we are going through the process from one to two. From one to two means the thing can run as multiple copies: if one copy goes down, another keeps serving, and when you release you can release to yourself first and roll out to users once it works. That is the responsible way.

The third, Coding CI, is the last point I want to talk about and our ultimate goal: so-called Push on Green. It means that within a minute of submitting code, as long as all the tests have passed, the change goes straight to production. Ask yourself whether your own company can do this, and if not, why not? We assume programmers write code in good faith; if the change passed Code Review and the business process is fine, why can't it go directly to production? Push on Green is the ultimate test of your development system: if you can do this, you have a good R&D system.
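
The gate itself is a very small rule. A Python sketch, assuming the inputs are a review flag and a mapping from test name to pass or fail; both the function and the data shape are hypothetical, not a description of Coding's pipeline.

```python
def should_push(review_approved, test_results):
    """Return True only for a reviewed change whose tests all passed."""
    if not review_approved:
        return False
    if not test_results:
        # No tests ran, so there is nothing "green" to push on.
        return False
    return all(test_results.values())
```

Everything hard about Push on Green lies outside this function: the tests must be trustworthy enough that a green result really means the change is safe.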

Those are the three points I wanted to share with you: people, technology and processes. I think of them as gears meshing together. A company is a big machine; keeping the machine lubricated and the gears meshing better and more tightly is the goal we pursue. Thank you.

Video Address: http://v.qq.com/page/d/o/z/d0164k35goz.html
Coding: making development easier! https://coding.net
