Rob Pike talks about Google Go: concurrency, the type system, memory management, and GC

Last Update:2014-10-15 Source: Internet

Author: User

1. Rob, you created the Google Go language. What is Google Go? Can you give a brief introduction?

I'd like to talk about why we created the language, which is slightly different from your question. I gave a series of lectures at Google on programming languages, which are on YouTube, about a language I wrote much earlier, in the 1980s, called Newsqueak. During those lectures I began to wonder why some of the ideas in Newsqueak were not available in the C++ environment I was working in. At Google we often have to build very large programs, and just building them takes a long time. Dependency management is a problem too: because things that are not needed get linked in, binaries become very large, link times are very long, and compile times are very long. The way C++ works is also a bit dated. Underneath it is really C; C++ is more than 30 years old and C is more than 40. With today's computing hardware there are many new things to consider: multicore machines, networking, distributed systems, cloud computing, and so on.

2. What are the main features of Go? Which are the important ones?

For most people, the first impression of Go is that it has concurrency as a language primitive, which is very good and very important for dealing with distributed computing and multicore machines. I suspect many people think Go is a simple, uninteresting language with nothing special about it, because its ideas seem obvious at a glance. But you cannot really judge Go by first impressions. Many people who have used Go find that it is a very productive and expressive language, and that it solves the problems we expected it to solve when we designed it.

Go compiles fast, produces smaller binaries, and manages dependencies as part of the language itself. There is more of a story there, but I won't go into it here. The concurrency in the language lets you handle very complex operations and distributed computing environments with very simple patterns. I think the most important feature is probably concurrency. After that we can talk about the type system, which differs greatly from traditional object-oriented type systems such as those of C++ and Java.

3. Before we go any further, can you explain why the Go compiler is able to compile so fast? What is the secret?

There are two reasons it is fast. First, Go has two compilers, two separate implementations. One is a new compiler written in the Plan 9 (http://plan9.bell-labs.com/wiki/plan9/1/) style, which has its own way of working. The other, called gccgo, has a GCC front end and was written later by Ian Taylor. So Go has two compilers, and speed is a feature of both, but the Plan 9 style compiler is about 5 times faster than gccgo because it is completely new from top to bottom and has no GCC back end, which spends a lot of time producing really good code.

The gccgo compiler produces better code, so it is slower. But what really matters is dependency management, which is the real reason Go compiles faster. Look at a C or C++ program: header files describe the libraries, the object code, and so on. The language itself does not enforce any check on dependencies, and every time you must parse the headers again to find out what your functions look like. If you are compiling a C++ program that uses another class, you first have to compile the headers for that class and for everything it depends on, and so on. If a C++ program has many classes that are interrelated, you may compile the same header file hundreds or even thousands of times. Of course, you can use precompiled headers and other tricks to work around the problem.

But the language itself does not help you. Tools can ease the problem, but the biggest issue is that nothing guarantees that what you compile is what the program really needs. Your program may include a header file it does not actually need, and you cannot know, because the language does not enforce a check. Go has a more rigorous dependency model. It has packages, which you can think of as something like Java class files or libraries; they are not the same thing, but the basic idea is similar. The key point is this: if A depends on B, and B depends on C, then you must compile the innermost dependency first. You compile C, then B, and finally A.

But what if A depends on B, and A does not depend on C directly but only transitively? All the information B needs from C is placed in B's object code. So when I compile A, I no longer need C. It is very simple: as you compile a program, type information is carried up the dependency tree, and when you reach the top of the tree you only read the immediate dependencies rather than the rest of the hierarchy. If you do the arithmetic, you will find that in Objective-C, C++, or similar languages, including one simple header file may mean compiling hundreds of thousands of lines because of transitive dependencies. In Go, you open one file that may have only 20 lines, because it describes only the public interface.

If there are only three files in a dependency chain, the benefit of Go may not be obvious, but with thousands of files the speed advantage grows exponentially. We believe that with Go we should be able to compile millions of lines of code in seconds. An equivalent amount of C++ costs far more to compile because of dependency management, and compilation takes minutes. So the root cause of Go's speed is mainly its management of dependencies.

4. Let's start talking about the type system in Go. Go has structs and types, so what is a type in Go?

Types in Go are similar to those in other traditional programming languages. There are integers, strings, struct data structures, and arrays, which we call slices; they resemble C arrays but are easier to use and more robust. You can declare a local type, give it a name, and then use it in the usual way. The difference between Go and object-oriented languages is that a type is just a way of describing data, and methods are a completely independent concept. You can put methods on a struct. Go has no concept of class; instead there are structs, and methods declared for those structs.

Structs should not be confused with classes. You can also put methods on an array, an integer, a floating-point number, or a string; in fact any type can have methods. So the concept of a method here is more general than in Java, where methods are part of a class and that is all. For example, a method on an integer may sound useless, but suppose you want to attach a to_string method to an integer constant called Tuesday so that it prints as a nicely formatted day, or you want a string that can print itself in a different way. Then you see the point. Why put all the methods and other good things only into classes? Why not let them serve a wider range of types?
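A minimal Go sketch of methods on non-struct types. The names Day, Tuesday, and Celsius are illustrative assumptions, and Go's convention is a String method rather than to_string:

```go
package main

import "fmt"

// Day is a hypothetical named integer type for days of the week.
type Day int

const (
	Sunday Day = iota
	Monday
	Tuesday
)

// String makes a Day print as its name instead of a number.
func (d Day) String() string {
	names := [...]string{"Sunday", "Monday", "Tuesday"}
	if d < 0 || int(d) >= len(names) {
		return fmt.Sprintf("Day(%d)", int(d))
	}
	return names[d]
}

// Celsius shows that a floating-point type can carry methods too.
type Celsius float64

// Fahrenheit converts the temperature to degrees Fahrenheit.
func (c Celsius) Fahrenheit() float64 { return float64(c)*9/5 + 32 }

func main() {
	fmt.Println(Tuesday)                   // fmt calls the String method
	fmt.Println(Celsius(100).Fahrenheit()) // 212
}
```

Because Day has a String method, the fmt package prints the name rather than the underlying number.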

5. So these methods are only visible inside the package?

No. What actually happens is that Go only lets you define methods for types that you implement in your own package. I cannot import your type and add my methods to it directly, but I can wrap it using an anonymous field. You cannot add methods wherever you like; you define a type, and then you can put methods on it. Because of this, the package provides another kind of encapsulation, the interface, but interfaces are hard to understand unless you understand the strict boundary on who can add methods to an object.
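The wrapping described above can be sketched as follows. ShoutingBuffer and Shout are hypothetical names; bytes.Buffer stands in for "a type from someone else's package":

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// ShoutingBuffer wraps bytes.Buffer, a type defined in another package,
// by embedding it as an anonymous field.
type ShoutingBuffer struct {
	bytes.Buffer
}

// Shout is a new method. It is declared on the local wrapper type,
// because methods cannot be added to bytes.Buffer from outside its package.
func (s *ShoutingBuffer) Shout(text string) {
	s.WriteString(strings.ToUpper(text))
	s.WriteString("!")
}

func main() {
	var sb ShoutingBuffer
	sb.Shout("hello")
	fmt.Println(sb.String()) // String is promoted from bytes.Buffer
}
```

The wrapper keeps all of the embedded type's methods and adds its own, without modifying the original type.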

6. You mean I can add methods to int, but I have to use a typedef first?

You declare a named type based on the integer type. If you are working with the seven days of the week, you might call it "Day". You can add methods to the type you declared, Day, but you cannot add methods directly to int. The integer type was not defined by you and is not in your package; it is available in your package but not defined there, which means you cannot add methods to it. You cannot add methods to types that are not defined in your package.

7. That is interesting; it borrows from the idea of open classes in Ruby. Ruby's open classes actually let you modify a class and add new methods, which is destructive, but your approach is inherently safe because it creates something new.

It is safe, controlled, and easy to understand. Initially we thought this restriction on types might be inconvenient, and we wanted to be able to add methods the way Ruby does, but that made interfaces harder to understand. So we took that out rather than putting it in. We could not think of a better way, so methods are restricted to local types, and this idea really is easy to understand and use.

8. You also mentioned typedef. Is it called typedef?

It is called "type". The Day type you mentioned is defined by writing "type Day int". That gives you a new type: you can add methods to it and declare variables of it, but it is a different type from int. That is unlike C, where the same thing merely gets another name. In Go you actually create a new type, distinct from int, called "Day", which has the structural properties of int but its own set of methods.

9. Is typedef a preprocessing directive in C? (Editor's note: typedef in C has nothing to do with the preprocessor.)

In C that is really just an alias, but in Go it is not an alias; it is a new type.
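A short Go sketch of the difference between a new type and an alias. The Next method is a hypothetical addition for illustration:

```go
package main

import "fmt"

// Day is a new type whose underlying type is int. It is not an alias.
type Day int

// Next returns the following day, wrapping around after day 6.
func (d Day) Next() Day { return (d + 1) % 7 }

func main() {
	var n int = 6
	// var d Day = n   // would not compile: int and Day are different types
	d := Day(n) // an explicit conversion is required
	fmt.Println(int(d.Next()))
	// n.Next() would not compile either: int has no methods.
}
```

The conversion Day(n) is checked at compile time and costs nothing at run time, since both types share the same representation.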

10. Let's start at the bottom: what are the smallest types in Go?

The smallest type would be the Boolean type (bool). There are bool, int, and float, then the sized types such as int32 and float64, then strings and complex types; that is the basic set. From these you can build structs, arrays, and maps; maps in Go are a built-in type, not a library. Then, I think, come interfaces, and with interfaces the interesting things really begin.

11. But a type such as int is a value type, right?

int is a value type. In Go every type is a value type; as in C, everything is passed by value, but you can also use pointers. If you want to refer to something, you take its address and then you have a pointer. Go's pointers are more restricted than C's, and they are safe because they are type-safe: you cannot cheat the compiler, and there is no pointer arithmetic, so if you have a pointer to something you cannot move it outside the object.
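A minimal sketch of pass-by-value versus passing a pointer. Point, moveCopy, and movePtr are hypothetical names:

```go
package main

import "fmt"

// Point is a hypothetical struct used to show copy-on-pass behavior.
type Point struct{ X, Y int }

// moveCopy receives a copy, so the caller's Point is unchanged.
func moveCopy(p Point) { p.X += 10 }

// movePtr receives the address, so it changes the caller's Point.
func movePtr(p *Point) { p.X += 10 }

func main() {
	p := Point{1, 2}
	moveCopy(p)
	fmt.Println(p.X) // still 1
	movePtr(&p)
	fmt.Println(p.X) // now 11
	// q := &p + 1   // would not compile: Go has no pointer arithmetic
}
```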

12. Are they similar to C++ references?

Yes, much like references, but you can write them the way you would expect. And you can take the address of something in the middle of a struct, such as a buffer, which is different from Java references. In Java you have to allocate the buffer separately, off to the side, which is extra overhead. In Go you allocate the object as part of the struct, in the same memory block, which matters a great deal for performance.

13. So it is a composite object inside the struct.

Yes, if it is a value rather than a pointer. Of course you can also have pointers into and out of a struct, but if you have struct A and put struct B inside struct A, then it is all one piece of memory. That is unlike Java, and it is one of the reasons for Java's performance problems.

14. You mentioned that interfaces are where it gets interesting, so let's talk about them.

Interfaces in Go are really very, very simple. An interface says two different things. First, it expresses the idea of a type: an interface type is a type that lists a set of methods, so if you want to abstract a set of methods that define a behavior, you define an interface and declare those methods. Now you have a type, call it an interface type, and from then on every type that implements the methods in the interface, whether a basic type, a struct, a map, or anything else, implicitly satisfies that interface. Second, and this is the really interesting part, unlike most languages Go has no "implements" declaration.

You do not need to state "my object implements this interface"; as long as it defines the methods in the interface, it implements the interface automatically. Some people worry about this. I think what they mean is that it really matters to know which interfaces you are implementing. If you really want to know which interfaces you implement, there are tricks for that. But our idea is quite different: you should not be thinking about which interface to implement, you should just write down what you want to do, because you do not have to decide in advance which interfaces to implement. Later you may turn out to implement an interface you did not know about, because it had not been designed yet when you wrote the code.

Later you may find that two classes that were never considered related turn out to be related. I said "class" again; I have been thinking about Java too much. Two structs may both implement some very useful subset of methods, and it would be useful to have a way to operate on either of them. So you can declare an interface and you do not have to do anything else, even if those methods are implemented in someone else's code that you cannot edit. In Java, the code has to declare that it implements your interface, so in a sense implementation goes one way only. In Go it goes both ways. There are quite a few beautiful and simple examples of interfaces.
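A small Go sketch of implicit satisfaction. File, User, and Namer are hypothetical, and the compile-time assertion is one common idiom for checking satisfaction; the interview does not say which tricks the speaker has in mind:

```go
package main

import "fmt"

// Two unrelated types, written without any interface in mind.
type File struct{ path string }

func (f File) Name() string { return f.path }

type User struct{ login string }

func (u User) Name() string { return u.login }

// Namer is declared afterwards. File and User satisfy it automatically,
// with no "implements" clause and no change to their code.
type Namer interface {
	Name() string
}

// A compile-time check that File satisfies Namer (a common idiom).
var _ Namer = File{}

// describe works on anything that has a Name method.
func describe(n Namer) string { return "name=" + n.Name() }

func main() {
	fmt.Println(describe(File{"/tmp/a"}))
	fmt.Println(describe(User{"rob"}))
}
```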

One of my favorite real examples is Reader. Go has a package called io, and the io package has a Reader interface with just one method, Read, with the standard declaration, like reading from the operating system or from a file. The interface can be implemented by anything that behaves like the read system call. Obviously files, network connections, buffers, decompressors, decrypters, pipes, and in fact anything that wants to give access to data can provide a Reader for its data, and any program that wants to read from those resources can do so through that interface. It is a bit like Plan 9, which we mentioned earlier, but generalized in a different way.

Similarly, Writer is another easily understood example, implemented by anything that wants to do writes. So in formatted printing, the first argument of Fprintf is not a file but a Writer. That way Fprintf can do formatted I/O on anything that implements the Write method. There are many good examples. Take HTTP: if you are implementing an HTTP server, you just Fprintf to the connection and the data goes to the client, with nothing fancy needed. You can write through a compressor, or through anything I mentioned: compressors, encrypters, buffers, network connections, pipes, files. You can Fprintf to any of them directly, because they all implement the Write method and so implicitly satisfy the Writer interface.

15. That is somewhat like structural typing.

Setting behavior aside, it is somewhat like structural typing. But it is completely abstract: it is not about what something has, but about what it can do. Once a struct is defined, it fixes the memory layout; the methods then describe the behavior of the struct; and an interface abstracts over the methods of that struct and of any other type that implements the same methods. This is duck typing (a dynamic typing style, http://en.wikipedia.org/wiki/Duck_typing), not structural typing.

16. You mentioned classes, but Go has no classes, right?

Go has no classes.

17. But how do you write code without classes?

A struct with methods is much like a class. The interesting difference is that Go has no subtype inheritance; you have to learn Go's alternatives, and Go has more powerful, more expressive things. Java and C++ programmers are often surprised when they start using Go, because they are really writing Java or C++ programs in Go, and the code does not work well. You can do it, but it is a bit awkward. If you step back and ask yourself, "How would I write this in Go?", you find that the patterns are actually different, and that in Go you can express similar ideas in shorter programs because you do not need to repeat the behavior in every subclass. It is a very different environment from what it seems at first glance.

18. If I have some behavior to implement and I want it in multiple structs, how do I share it?

There is a concept called the anonymous field, also known as embedding. It works like this: if you have a struct, and something else implements the behavior you want, you can embed that thing in your struct, and the struct gets not only the embedded thing's data but also its methods. Suppose you have some common behavior, such as a Name method on some type. In Java you would think of a set of subclasses inheriting the method. In Go you just take a type that has the Name method and put it in every struct that should have that method; they all get the Name method automatically, without writing it in every struct. This is a very simple example, but there are many interesting structural things you can do with embedding.

You can also embed multiple things in a single struct. You could think of it as multiple inheritance, but that makes it sound more confusing than it is; in Go it is really very simple. It is just a set: you can put anything in it, and you essentially get the union of all the method sets. One line of code gives you all of that behavior.
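Embedding can be sketched in a few lines. Named, Counter, and Server are hypothetical types chosen for illustration:

```go
package main

import "fmt"

// Named carries the shared Name behavior.
type Named struct{ name string }

func (n Named) Name() string { return n.name }

// Counter is a second, unrelated piece of behavior.
type Counter struct{ n int }

// Inc increments the counter and returns the new value.
func (c *Counter) Inc() int { c.n++; return c.n }

// Server embeds both; one line per embedded type is all it takes.
type Server struct {
	Named
	Counter
	port int
}

func main() {
	s := Server{Named: Named{"web-1"}, port: 8080}
	fmt.Println(s.Name()) // promoted from Named
	s.Inc()
	fmt.Println(s.Inc()) // promoted from Counter, prints 2
}
```

Server never declares Name or Inc itself; both methods are promoted from the embedded fields.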

19. What about naming conflicts, the classic problem with multiple inheritance?

Naming conflicts are really not a problem; Go handles them statically. The rule is that with multiple levels of embedding, the outermost level wins; if two identical names or methods appear at the same level, Go gives a simple static error. You do not have to check for this yourself, just pay attention to the error. Conflicts are checked statically, the rules are very simple, and in practice conflicts do not come up much.
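Both rules can be seen in a small sketch. All type and method names here are hypothetical:

```go
package main

import "fmt"

type Inner struct{}

func (Inner) ID() string { return "inner" }

// Middle embeds Inner and declares its own ID one level further out.
type Middle struct{ Inner }

func (Middle) ID() string { return "middle" }

type Outer struct{ Middle }

// Left and Right both declare Tag, and Both embeds them at the same depth.
type Left struct{}

func (Left) Tag() string { return "left" }

type Right struct{}

func (Right) Tag() string { return "right" }

type Both struct {
	Left
	Right
}

func main() {
	var o Outer
	fmt.Println(o.ID())              // the shallower declaration wins
	fmt.Println(o.Middle.Inner.ID()) // the deeper one is still reachable

	var b Both
	// b.Tag() would not compile: the selector is ambiguous.
	fmt.Println(b.Left.Tag(), b.Right.Tag())
}
```

In this sketch the compile error appears only where the ambiguous name is actually used, so Both itself is a valid type.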

20. Since there is no root object or root class in the system, what do I do if I want a list of structs of different types?

An interesting thing about interfaces is that they are just sets, sets of methods, so there is an empty set: the interface with no methods, which we call the empty interface. Everything in the system satisfies the empty interface. The empty interface is somewhat like Java's Object, except that int, float, and string satisfy it too. Go does not need an actual class, because Go has no concept of class; everything is uniform. It is a bit like void*, except that void* is for pointers, not values.

An empty interface value can represent anything in the system, so it is very general. If you create an array of empty interfaces, you effectively have a polymorphic container. When you want to take things back out, Go has the type switch: you can ask for the type as you unpack, so unpacking is safe.
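A minimal sketch of a polymorphic container and a type switch. The describe helper is hypothetical:

```go
package main

import "fmt"

// describe uses a type switch to safely recover the concrete type.
func describe(v interface{}) string {
	switch x := v.(type) {
	case int:
		return fmt.Sprintf("int %d", x)
	case string:
		return fmt.Sprintf("string %q", x)
	case []byte:
		return fmt.Sprintf("%d bytes", len(x))
	default:
		return fmt.Sprintf("other %T", x)
	}
}

func main() {
	// A slice of empty interfaces holds values of any type.
	items := []interface{}{42, "go", []byte("abc"), 3.5}
	for _, it := range items {
		fmt.Println(describe(it))
	}
}
```

Inside each case, x already has the concrete type, so no unchecked cast is needed.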

21. What is the difference between coroutines and what Go calls goroutines? Are they not the same?

Coroutines and goroutines are different, and the names reflect that. We gave them a new name because there are already too many terms: processes, threads, lightweight threads, chords. These things have countless names. Goroutines are not new; the same concept has existed in other systems. But the concept is different enough from those earlier names that I wanted us to name it ourselves. The idea behind a goroutine is that it is a coroutine, but when it blocks, the other coroutines on the same thread are moved to other threads, so they do not block.

So, fundamentally, goroutines are a kind of coroutine multiplexed onto enough operating system threads that no goroutine is ever blocked by another. If they are just cooperating, one thread is enough. But if there is a lot of I/O, there will be many operating system calls and therefore many threads. Goroutines are very cheap: you can have hundreds of thousands of them, running well and using a reasonable amount of memory. They are cheap to create, they are garbage collected, and everything is very simple.
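To give a feel for how cheap goroutines are, this sketch starts a large number of them and waits for all to finish. The helper runMany and the count 100000 are arbitrary illustrative choices:

```go
package main

import (
	"fmt"
	"sync"
)

// runMany starts n goroutines and returns how many of them ran.
func runMany(n int) int {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		done int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			done++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return done
}

func main() {
	// 100000 is an arbitrary illustrative count.
	fmt.Println(runMany(100000))
}
```

The mutex protects the shared counter; the WaitGroup lets the caller block until every goroutine has finished.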

22. You mentioned using an M:N threading model, that is, M coroutines mapped onto N threads?

Yes, but the number of coroutines and the number of threads are determined dynamically by the work the program is doing.

23. Do goroutines have channels for communication?

Yes. Once you have two independently executing functions, the goroutines need to talk to each other if they are to cooperate. So there is the concept of a channel, which is really a typed message queue that you use to send values. If a goroutine holds one end of a channel, it can send a value of the channel's type, and whoever holds the other end receives it. Channels can be synchronous or asynchronous. We use synchronous channels whenever possible, because they are a very nice idea: you synchronize and communicate at the same time, and everything runs in lockstep.

But sometimes it makes sense to buffer messages, for efficiency or for scheduling reasons. You can send integers, strings, structs, pointers to structs, and so on over a channel, and, very interestingly, you can send another channel over a channel. That way I can send you my means of communicating with someone else, which is a very interesting concept.
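Sending a channel over a channel can be sketched as a request that carries its own reply channel. The request type, doubler, and ask are hypothetical names:

```go
package main

import "fmt"

// request carries a value plus a private reply channel.
type request struct {
	n     int
	reply chan int
}

// doubler answers each request on the channel the request brought along.
func doubler(reqs <-chan request) {
	for r := range reqs {
		r.reply <- r.n * 2
	}
}

// ask sends one request and waits for its answer.
func ask(reqs chan<- request, n int) int {
	reply := make(chan int)
	reqs <- request{n: n, reply: reply}
	return <-reply
}

func main() {
	reqs := make(chan request)
	go doubler(reqs)
	fmt.Println(ask(reqs, 21))
	close(reqs) // lets the doubler goroutine finish its loop
}
```

Each caller gets its answer on its own channel, so several goroutines could share one doubler without mixing up replies.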

24. You mentioned buffered, synchronous, and asynchronous channels.

No, synchronous means unbuffered; asynchronous and buffered mean the same thing, because with a buffer I can put the value in the buffer space and leave it there. Without a buffer I have to wait until someone takes the value, so unbuffered and synchronous mean the same thing.
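The difference can be demonstrated with a non-blocking send. The helper trySend is hypothetical; it uses select with a default case to test whether a send can complete immediately:

```go
package main

import "fmt"

// trySend reports whether a send can complete right now.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan int)  // synchronous: needs a waiting receiver
	buffered := make(chan int, 1) // asynchronous: has room for one value

	fmt.Println(trySend(unbuffered, 1)) // false, nobody is receiving
	fmt.Println(trySend(buffered, 1))   // true, the value sits in the buffer
	fmt.Println(trySend(buffered, 2))   // false, the buffer is full
	fmt.Println(<-buffered)             // 1
}
```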

25. Each goroutine is like a small thread; is that how you would explain it to readers?

Yes, but it's lightweight.

26. They are lightweight. But every thread pre-allocates stack space, which makes threads costly. How do goroutines deal with that?

Yes. When a goroutine is created it has only a very small stack, roughly 4K, and that stack lives on the heap. Of course you know what would happen with such a small stack in C: as soon as you called a function or allocated an array or something similar, the program would overflow. In Go that does not happen. Every function begins with a few instructions that check whether the stack pointer has reached its limit; if it has, a new block is linked on. If you use more than the initial stack, you get a chain of linked stack blocks, which we call a segmented stack.

Because it is only a few instructions, the mechanism is very cheap. Of course you may end up allocating several stack blocks, but the Go compiler prefers to move large things onto the heap, so in typical use you have to make quite a few calls before hitting the 4K boundary, and that does not happen very often. The important point is that goroutines are cheap to create, because there is only one memory allocation and it is very small. You do not specify a stack size when creating a goroutine, which is a good abstraction: you do not have to worry about stack size at all. The stack then grows or shrinks with demand. You do not have to worry about recursion or big buffers or anything of the kind; it is completely invisible to the programmer. The language takes care of everything, and that is the whole idea of the language.
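A sketch of a goroutine whose stack must grow far beyond its small starting size, with no stack size given anywhere. The depth helper and the recursion depth 100000 are illustrative assumptions, and the code relies only on the stack growing on demand, not on how the runtime implements that growth:

```go
package main

import "fmt"

// depth recurses n levels. Go does not eliminate this recursion,
// so each level needs its own stack frame.
func depth(n int) int {
	if n == 0 {
		return 0
	}
	return 1 + depth(n-1)
}

func main() {
	done := make(chan int)
	// No stack size is specified when the goroutine is started.
	go func() { done <- depth(100000) }()
	fmt.Println(<-done)
}
```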

27. Let's talk about automation. Initially you promoted Go as a systems language, and one interesting choice was to use a garbage collector. But garbage collectors are not fast, or they pause intermittently, which is annoying if you are writing an operating system. What do you think about this problem?

I think this is a very hard problem, and we have not solved it yet. Our garbage collector works, but there are some latency issues; the collector may pause. Our view is that, although this is a research topic and not yet solved, we are working hard on it. On today's parallel machines it is possible to dedicate some fraction of the machine's cores to garbage collection as a background task and collect in parallel. A lot of work has been done in this area, with a lot of success, but it is a tricky problem. I do not think we will get the latency down to zero, but I believe we can get it low enough that it is no longer a problem for most systems software. I cannot guarantee that every program will be free of noticeable pauses, but I think we can succeed, and this is one of the more active areas of work on Go.

28. Is there a way to avoid the garbage collector, for example by using some large buffer that we can put data into?

Go lets you get down to the memory layout: you can allocate your own space and do your own memory management if you want. There are no alloc and free calls, but you can declare a buffer to put things in, a technique used to avoid creating unnecessary garbage. It is like C: in C, if you malloc and free all the time, the cost is high. So you allocate an array of objects, link them together into a list, and manage the space yourself; without malloc and free it is fast. You can do the same in Go, because Go lets you work safely with low-level things, so you can do this yourself without cheating the type system.
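The free-list technique described above can be sketched as follows. The node and pool types and their methods are hypothetical, and the sketch is not safe for concurrent use:

```go
package main

import "fmt"

// node is a hypothetical fixed-size object that gets reused.
type node struct {
	val  int
	next *node
}

// pool hands out nodes from one preallocated array through a free list.
type pool struct {
	store []node
	free  *node
}

// newPool makes a single allocation and links all nodes into the free list.
func newPool(n int) *pool {
	p := &pool{store: make([]node, n)}
	for i := range p.store {
		p.store[i].next = p.free
		p.free = &p.store[i]
	}
	return p
}

// get returns a recycled node, or nil when the pool is exhausted.
func (p *pool) get() *node {
	n := p.free
	if n != nil {
		p.free = n.next
		n.next = nil
	}
	return n
}

// put returns a node to the free list for reuse.
func (p *pool) put(n *node) {
	n.val = 0
	n.next = p.free
	p.free = n
}

func main() {
	p := newPool(2)
	a, b := p.get(), p.get()
	fmt.Println(p.get() == nil) // true: both nodes are in use
	p.put(a)
	fmt.Println(p.get() == a) // true: the same memory is handed back
	_ = b
}
```

All nodes live in one slice, so reusing them creates no new garbage for the collector to trace.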

I have already made the point that in Java, whenever you put one thing inside another structure, it is done through a pointer, but in Go you can put it inside the one struct. So if you have a data structure that needs several buffers, you can put the buffers in the struct's own memory. That is not only efficient (you do not reach the buffer through an indirection), it also means the whole structure is allocated and garbage collected in a single step, which reduces cost. So consider the realities of garbage collection: when you are designing something that does not need high performance, you should not be thinking about this all the time. But for high-performance needs, with memory layout in mind, Go gives you the tools to control how much memory you use and how much garbage you generate, even though it is a truly garbage-collected language. I think many people overlook this.

29. Last question: is Go a systems language or an application language?

We designed it as a systems language, because what we do at Google is systems work, right? Web servers, database systems, and storage systems are all systems. But not operating systems; I do not know whether Go would make a good operating system language, and I would not claim that it is one. Interestingly, because of the way we designed the language, Go turned out to be a very good general-purpose language, which was a bit unexpected. I think most users do not actually think of it from a systems standpoint, even though many have written a small web server or something like that.

Go is also very good for a lot of application-type work. It will get better libraries, more and more tools, and other things that make Go more useful. Go is a very good general-purpose language; it is the most productive language I have used.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
