Go Language Mechanics: Design Philosophy on Data and Semantics

## Prelude

This series contains four articles that help you understand the mechanics and design behind pointers, stacks, heaps, escape analysis, and value or pointer semantics in Go. This is the last article, and it focuses on the design philosophy behind using value and pointer semantics in your code.

Index of the series:

1. [Go Language Mechanics: Stacks and Pointers](https://studygolang.com/articles/12443)
2. [Go Language Mechanics: Escape Analysis](https://studygolang.com/articles/12444)
3. [Go Language Mechanics: Memory Profiling](https://studygolang.com/articles/12445)
4. [Go Language Mechanics: Design Philosophy on Data and Semantics](https://studygolang.com/articles/12487)

## Design Philosophies

"Keeping values on the stack reduces pressure on the garbage collector (GC). However, multiple copies of a given value must then be stored, tracked, and maintained. Placing values on the heap adds pressure on the GC. However, it is also efficient, because only one value needs to be stored, tracked, and maintained." - Bill Kennedy

For a given type of data, using value or pointer semantics consistently is critical if you want to maintain integrity and readability throughout your software. Why? Because if you change the semantics of a piece of data as it passes between functions, it becomes difficult to maintain a clear and consistent mental model of the code. The larger the code base and the team become, the more bugs, data races, and other side effects creep unseen into the code base.

I want to start with a set of design philosophies that will guide how to choose one semantic over the other.

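
The trade-off in the quote above can be pictured with a small sketch. The `user` type and the four functions are hypothetical names invented for this illustration, and whether a value really ends up on the stack or the heap is decided by the compiler's escape analysis, so treat the comments as the typical outcome rather than a guarantee.

```go
package main

import "fmt"

// user is a hypothetical type used only for this illustration.
type user struct {
	name  string
	email string
}

// newUserValue uses value semantics: the caller receives its own copy.
func newUserValue(name, email string) user {
	return user{name: name, email: email}
}

// newUserPointer uses pointer semantics: every holder of the pointer
// shares one value, which will usually be placed on the heap.
func newUserPointer(name, email string) *user {
	return &user{name: name, email: email}
}

// renameCopy changes only its own copy and hands a new value back.
func renameCopy(u user, name string) user {
	u.name = name
	return u
}

// renameShared changes the single shared value in place.
func renameShared(u *user, name string) {
	u.name = name
}

func main() {
	v := newUserValue("bill", "bill@example.com")
	v2 := renameCopy(v, "william")
	fmt.Println(v.name, v2.name) // bill william (two values to track)

	p := newUserPointer("bill", "bill@example.com")
	renameShared(p, "william")
	fmt.Println(p.name) // william (one value, visible to every holder)
}
```

With value semantics there are two independent `user` values after the rename; with pointer semantics there is still only one.
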
## Mental Models

(Translator's note: a mental model is the picture of how something works that you build in your mind through experience and learning. Here it means your overall grasp of the code.)

"Let's imagine a project that's going to end up with a million lines of code or more. The probability of those projects being successful in the United States these days is very low, well under 50%. That's debatable." - Tom Love (inventor of Objective-C)

Tom also says that a box of copy paper can hold 100,000 lines of code. Think about that for a moment. What percentage of the code in that box could you keep a mental model of?

I believe that asking a single developer to maintain a mental model of more than one ream of paper from that box (about 10,000 lines of code) is already asking a lot. Still, if we assume 10,000 lines of code per developer, it takes a team of 100 developers to maintain a code base of 1 million lines. That means 100 people who need to be coordinated, grouped, tracked, and kept in constant communication.

Now look at your team of 1 to 10 developers. How well are you doing at this much smaller scale? At 10,000 lines of code per developer, does the size of your team match the size of your code base?

## Debugging

"The hardest bugs are those where your mental model of the situation is just wrong, so you can't see the problem at all." - Brian Kernighan

I don't believe a debugger can solve a problem for you if you lack a mental model of the code. Without one, you are just wasting time and energy trying to understand the problem.

If you have a problem in production, what do you ask for? That's right, the logs. If the logs are useless to you during development, they will certainly be useless when a problem hits production. Logging should be based on a mental model of the code, so that the problem can be found by reading the code.

## Readability

"C is the best balance I've ever seen between power and expressiveness. You can do almost anything you want by programming fairly straightforwardly, and you have a very good mental model of what is going to happen on the machine. You can predict reasonably well how quickly it will run, and you understand what's going on ..." - Brian Kernighan

I believe Brian's words apply just as well to Go. Maintaining this "mental model" is everything. It drives integrity, readability, and simplicity. These are the cornerstones of well-written software, and they are what allow it to be maintained and to keep running over time. Writing code that consistently uses value or pointer semantics for a given type of data is an important way to achieve this.

## Data-Oriented Design

"If you don't understand the data, you don't understand the problem. This is because all problems are unique and specific to the data you are working with. When the data changes, your problems change. When your problems change, the algorithms (data transformations) need to change with them." - Bill Kennedy

Think about it. Every problem you work on is a data transformation problem. Every function you write and every program you run takes some input data and produces some output data. From this perspective, your mental model of the software is an understanding of these data transformations (that is, how they are organized and applied in the code). A "less is more" attitude is important for solving problems with fewer layers, less code, fewer iterations, less complexity, and less effort.

## Type (Is Life)

"Integrity means that every allocation, every read of memory, and every write of memory is accurate, consistent, and efficient. The type system is critical to making sure we have this micro level of integrity." - William Kennedy

If data drives everything you do, then the types that represent the data are critical. In my view, "type is life," because type gives the compiler the ability to ensure the integrity of the data. Type also drives and dictates the semantic rules, and code must follow the semantics of the data it operates on.
This is where the proper use of value or pointer semantics starts: with types.

## Data (With Capability)

"Methods are valid when it is practical or reasonable for a piece of data to have a capability." - William Kennedy

The idea of value or pointer semantics does not really confront Go developers until they have to decide on the receiver type for a method. It is a question I hear a lot: should I use a value receiver or a pointer receiver? As soon as I hear that question, I know the developer does not yet have a good grasp of these semantics.

The purpose of a method is to give a piece of data a capability. Think about that: a piece of data can have the capability to do something. I always want the focus to be on the data, because it is the data that drives the functionality. Data drives the algorithms you write, the encapsulation you put in place, and the performance you can achieve.

## Polymorphism

"Polymorphism means that you write a certain program and it behaves differently depending on the data that it operates on." - Tom Kurtz (inventor of BASIC)

I like what Tom says here. A function can behave differently depending on the data it operates on. It is the behavior of the data that decouples functions from the concrete data types they can accept and work with, and this is the core reason for data to have capability. This idea is the cornerstone of architecting and designing systems that can adapt to change.

## Prototype-First Approach

"Unless the developer has a really good idea of what the software is going to be used for, there's a very high probability that the software will turn out badly. If the developers don't know and understand the application well, then it's crucial to get as much user input and experience as possible." - Brian Kernighan

I want you to always focus on understanding the concrete data and the data transformation algorithms needed to solve the problem.
Take this prototype-first approach and write a concrete implementation that could also be deployed in production (if that is reasonable and practical). Once the concrete implementation works and you have learned what works and what does not, focus on refactoring: decouple the implementation from the concrete data by giving the data capability. (Translator's note: as I understand it, this simply means abstracting the behavior into methods on the data type.)

## Semantic Guidelines

When you declare a type, you must decide which semantic, value or pointer, that particular type of data will use. APIs that accept or return data of that type must respect the semantic chosen for the type. APIs are not allowed to dictate or change the semantic. They must know which semantic is used for the data and conform to it. This is the minimum requirement for achieving consistency across a large code base.

Here are the basic guidelines:

- When you declare a type, you must decide which semantic it uses.
- Functions and methods must respect the semantic chosen for a given type.
- Avoid method receivers that use a different semantic from the one chosen for the type.
- Avoid functions that accept or return data using a different semantic from the one chosen for the type.
- Avoid changing the semantic of a given type.

There are a few exceptions to these guidelines, the biggest of which is unmarshaling. Unmarshaling always requires pointer semantics. Marshaling and unmarshaling always seem to be the exception to the rule.

How do you choose one semantic over the other for a given type? These guidelines help answer that question. Below, we apply them to specific cases.

## Built-In Types

Go's built-in types are the numeric, text, and boolean types. These types should be handled using value semantics. Do not use pointers to share values of these types unless you have a very good reason to do so.
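
A minimal sketch of the built-in type guideline follows. The helper names `shout` and `shoutInPlace` are hypothetical and exist only to contrast the two styles; only `strings.ToUpper` comes from the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// shout follows value semantics: it works on its own copy of the
// string value and returns a new value to the caller.
func shout(s string) string {
	return strings.ToUpper(s) + "!"
}

// shoutInPlace is the pointer-semantic style the guideline advises
// against for built-in types: it changes the caller's variable.
func shoutInPlace(s *string) {
	*s = strings.ToUpper(*s) + "!"
}

func main() {
	msg := "hello"

	loud := shout(msg)
	fmt.Println(msg, loud) // hello HELLO!

	shoutInPlace(&msg)
	fmt.Println(msg) // HELLO!
}
```

With the value-semantic version, a reader of the call site can see that `msg` cannot change. With the pointer version, every call has to be inspected to know.
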
As an example, look at the declarations of these functions from the strings package.

### Listing 1

```go
func Replace(s, old, new string, n int) string
func LastIndex(s, sep string) int
func ContainsRune(s string, r rune) bool
```

All of these functions use value semantics in their API design.

## Reference Types

Go's reference types are the slice, map, interface, function, and channel types. The recommendation is to use value semantics for these types, because they are designed to stay on the stack and minimize pressure on the heap. Each function gets its own copy of the value, instead of each function call causing a potential allocation. This is possible because these values contain a pointer that shares the underlying data structure between calls.

Do not use pointers to share values of these types unless you have a very good reason to do so. Sharing a slice or map down the call stack into an Unmarshal function is one possible exception. As an example, look at these two types declared in the net package.

### Listing 2

```go
type IP []byte
type IPMask []byte
```

Both IP and IPMask are based on a slice of bytes. This means they are reference types, and they should follow value semantics. Here is a method named Mask, declared on the IP type, that accepts a value of type IPMask.

### Listing 3

```go
func (ip IP) Mask(mask IPMask) IP {
	if len(mask) == IPv6len && len(ip) == IPv4len && allFF(mask[:12]) {
		mask = mask[12:]
	}
	if len(mask) == IPv4len && len(ip) == IPv6len && bytesEqual(ip[:12], v4InV6Prefix) {
		ip = ip[12:]
	}
	n := len(ip)
	if n != len(mask) {
		return nil
	}
	out := make(IP, n)
	for i := 0; i < n; i++ {
		out[i] = ip[i] & mask[i]
	}
	return out
}
```

Notice that this method is a mutation operation written in a value-semantic API style. It takes an IP value as the receiver, creates a new IP value based on the IPMask value passed in, and returns it to the caller. The method follows the guideline of using value semantics for mutations on reference types.
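
From the caller's side, the value-semantic style of Mask can be sketched as follows. The helper `networkOf` is a hypothetical wrapper written for this illustration; `net.ParseIP`, `net.CIDRMask`, and `IP.Mask` are the standard library calls.

```go
package main

import (
	"fmt"
	"net"
)

// networkOf masks an IPv4 address with a prefix of the given length and
// returns both the original address and the masked result as strings.
func networkOf(addr string, ones int) (orig, network string) {
	ip := net.ParseIP(addr)
	mask := net.CIDRMask(ones, 32)

	// Mask returns a new IP value; the receiver ip is left as it was.
	masked := ip.Mask(mask)

	return ip.String(), masked.String()
}

func main() {
	orig, network := networkOf("192.168.1.77", 24)
	fmt.Println(orig)    // 192.168.1.77
	fmt.Println(network) // 192.168.1.0
}
```

The caller keeps its original IP value and receives the mutated result as a separate value.
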
This is similar to the built-in append function.

### Listing 4

```go
var data []string
data = append(data, "string")
```

The append function uses value semantics for its mutation operation. You pass the slice value to append, and after the mutation it returns a new slice value.

The exception, as always, is unmarshaling, which requires pointer semantics.

### Listing 5

```go
func (ip *IP) UnmarshalText(text []byte) error {
	if len(text) == 0 {
		*ip = nil
		return nil
	}
	s := string(text)
	x := ParseIP(s)
	if x == nil {
		return &ParseError{Type: "IP address", Text: s}
	}
	*ip = x
	return nil
}
```

The UnmarshalText method implements the encoding.TextUnmarshaler interface. Without pointer semantics, there would be no way to implement it. However, this is acceptable because sharing values is usually safe. Outside of unmarshaling, think twice before using pointer semantics for a reference type.

## User-Defined Types

This is where you need to make most of your decisions. You must decide which semantic to use at the time you declare the type.

Suppose I asked you to write the API for the time package and gave you this type.

### Listing 6

```go
type Time struct {
	sec  int64
	nsec int32
	loc  *Location
}
```

Which semantic would you use?

Look at the implementation of this type in the time package and at its factory function, Now.

### Listing 7

```go
func Now() Time {
	sec, nsec := now()
	return Time{sec + unixToInternal, nsec, Local}
}
```

The factory function is one of the most important functions for a type, because it tells you which semantic was chosen. The Now function makes it clear that value semantics are in use. It creates a value of type Time and returns a copy of that value to the caller. Time values do not need to be shared, and they do not need to live on the heap for their lifetime.

Look at the Add method, which is also a mutation operation.
### Listing 8

```go
func (t Time) Add(d Duration) Time {
	t.sec += int64(d / 1e9)
	nsec := t.nsec + int32(d%1e9)
	if nsec >= 1e9 {
		t.sec++
		nsec -= 1e9
	} else if nsec < 0 {
		t.sec--
		nsec += 1e9
	}
	t.nsec = nsec
	return t
}
```

Once again, you can see that the Add method respects the semantic chosen for the type. It uses a value receiver to operate on its own copy of the Time value used to make the call. It mutates its own copy and returns a new copy of the Time value to the caller.

Here is a function that accepts a Time value:

### Listing 9

```go
func div(t Time, d Duration) (qmod2 int, r Duration)
```

Once again, value semantics are used to accept a value of type Time. The only parts of the Time API that use pointer semantics are these unmarshal-related methods:

### Listing 10

```go
func (t *Time) UnmarshalBinary(data []byte) error
func (t *Time) GobDecode(data []byte) error
func (t *Time) UnmarshalJSON(data []byte) error
func (t *Time) UnmarshalText(data []byte) error
```

Most of the time, your ability to use value semantics is limited. It is often not correct or reasonable to make copies of the data as it passes from function to function. Changes to the data need to be isolated to a single value that is shared. This is when pointer semantics should be used. If you are not 100% sure that making copies is correct and reasonable, use pointer semantics.

Look at the factory function for the File type in the os package.

### Listing 11

```go
func Open(name string) (file *File, err error) {
	return OpenFile(name, O_RDONLY, 0)
}
```

The Open function returns a pointer of type File. This means that for File values you should use pointer semantics and always share the value. Changing the semantic from pointer to value could have a devastating effect on your program. When a function shares a value with you, assume that you are not allowed to make a copy of the value the pointer points to.
Otherwise, the results are unpredictable.

Looking at more of the API, you will see pointer semantics used consistently.

### Listing 12

```go
func (f *File) Chdir() error {
	if f == nil {
		return ErrInvalid
	}
	if e := syscall.Fchdir(f.fd); e != nil {
		return &PathError{"chdir", f.name, e}
	}
	return nil
}
```

The Chdir method uses pointer semantics even though the File value is never mutated. The method must respect the semantic convention of the type.

### Listing 13

```go
func epipecheck(file *File, e error) {
	if e == syscall.EPIPE {
		if atomic.AddInt32(&file.nepipe, 1) >= 10 {
			sigpipe()
		}
	} else {
		atomic.StoreInt32(&file.nepipe, 0)
	}
}
```

This is a function named epipecheck, which accepts File values using pointer semantics. Notice again the consistent use of pointer semantics for File values.

## Conclusion

When I do code reviews, I look for whether value or pointer semantics are used consistently. Consistency helps keep your code predictable over time. It also allows everyone to maintain a clear and consistent mental model of the code. As the code base and the team grow, consistent use of value or pointer semantics becomes increasingly important.

What is fascinating about Go is that the choice between pointer and value semantics goes well beyond the declaration of receivers and function parameters. The mechanics of interfaces, function values, and slices all work according to the same semantics. In future articles, I will explore value and pointer semantics in these different parts of the language.
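
As a closing illustration, here is a sketch of what consistent pointer semantics might look like for a user-defined type, in the spirit of the File examples. The `account` type and all of its functions are hypothetical names made up for this example, not part of any library.

```go
package main

import "fmt"

// account is a hypothetical type. Copying it would duplicate a balance
// that must stay unique, so pointer semantics are chosen for the type.
type account struct {
	owner   string
	balance int
}

// newAccount is the factory function. Returning *account tells callers
// that values of this type are shared, not copied.
func newAccount(owner string) *account {
	return &account{owner: owner}
}

// deposit mutates the shared value.
func (a *account) deposit(amount int) {
	a.balance += amount
}

// summary never modifies the account, but it still uses a pointer
// receiver so the type's semantic stays consistent (compare Chdir).
func (a *account) summary() string {
	return fmt.Sprintf("%s: %d", a.owner, a.balance)
}

// audit is a plain function; it also accepts the value by pointer
// (compare epipecheck).
func audit(a *account) bool {
	return a.balance >= 0
}

func main() {
	acct := newAccount("bill")
	acct.deposit(50)
	acct.deposit(25)
	fmt.Println(acct.summary()) // bill: 75
	fmt.Println(audit(acct))    // true
}
```

In a review, a value receiver or a by-value parameter anywhere in this API would be the thing to question.
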

Via: https://www.ardanlabs.com/blog/2017/06/design-philosophy-on-data-and-semantics.html

Author: William Kennedy | Translator: gogeof | Proofreader: polaris1119

This article was originally translated by GCTT and published by the Go Language Chinese Network.

The translation is published only for learning and exchange, under the terms of the CC-BY-NC-SA license. If our work infringes on your rights, please contact us promptly.
When reprinting under the CC-BY-NC-SA license, please credit and keep the links to the original and the translation, along with the author and translator information.
