Understanding Go programs with go/parser

This article covers the same material as the [justforfunc episode](https://www.youtube.com/watch?v=YRWCa84pykM).

## Previously on justforfunc

In the [previous article](https://studygolang.com/articles/12324) we used `go/scanner` to find the most commonly used identifiers in the standard library.

> That identifier is `v`.

To get more meaningful information, we then considered only identifiers that are three or more characters long. Unsurprisingly, `err` and `nil` came out on top, thanks to the most characteristic statement in Go: `if err != nil {}`.

## Global variables and local variables

What if we want to know the most common local variable names? What about the most common types or functions? `go/scanner` is not enough for these questions because it has no notion of context. With the previous approach we can find the tokens (for example, those of `var a = 3`), but to learn the scope of a declaration (package level, function level, block level) we need context.

A package can contain many declarations. Some of them are function declarations, and inside a function declaration there may be local variable, constant, or function declarations.

But how do we find this structure in a sequence of tokens? Every programming language has rules for turning a token sequence into a syntax tree. They look like this:

```
VarDecl = "var" ( VarSpec | "(" { VarSpec ";" } ")" ) .
VarSpec = IdentifierList ( Type [ "=" ExpressionList ] | "=" ExpressionList ) .
```

These rules tell us that a `VarDecl` (variable declaration) starts with a `var` token, followed by either a single `VarSpec` (variable specification) or a semicolon-separated list of `VarSpec`s surrounded by parentheses.

> Note: semicolons are inserted automatically by the Go scanner, so you will not see them in the source being parsed.

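
To make the parenthesized form of the rule and the semicolon note more concrete, here is a minimal sketch that scans a grouped declaration with `go/scanner` and collects the token names. The `tokenNames` helper and the sample input are made up for illustration; only the `go/scanner` and `go/token` calls come from the standard library.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenNames scans src and returns the name of every token up to EOF.
// It is a hypothetical helper written for this illustration.
func tokenNames(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // nil error handler, comments skipped

	var names []string
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			break
		}
		names = append(names, tok.String())
	}
	return names
}

func main() {
	// Assumed sample input: one VarDecl holding two VarSpecs.
	src := "var (\n\ta = 3\n\tb int\n)\n"
	for _, name := range tokenNames(src) {
		fmt.Println(name)
	}
}
```

Although the input contains no semicolon character, the token stream should contain three `;` tokens: one after each `VarSpec` and one after the closing parenthesis. Those are the separators the `{ VarSpec ";" }` part of the rule refers to.
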
Take `var a = 3` as an example. Using `go/scanner` we get the tokens:

```
[VAR], [IDENT "a"], [ASSIGN], [INT "3"], [SEMICOLON]
```

According to the rules above, this is a `VarDecl` with a single `VarSpec`. Parsing further, we find an `IdentifierList` with one `Identifier`, `a`, no `Type`, and an `ExpressionList` with one `Expression`, the integer `3`.

Represented as a tree, it looks like this:

![Image](https://raw.githubusercontent.com/studygolang/gctt-images/master/go-parser/0_STJNoHjXJBsnWB4x.png)

The rules that let us turn a token sequence into a tree structure are called a grammar, or syntax, and the resulting tree is called an abstract syntax tree, or AST.

## Using go/parser

Now we have enough theory to write some code. Let's see how to parse `var a = 3` and get its AST.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"log"
)

func main() {
	fs := token.NewFileSet()
	f, err := parser.ParseFile(fs, "", "var a = 3", parser.AllErrors)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(f)
}
```

This code compiles, but it fails at run time:

```
1:1: expected 'package', found 'var' (and 1 more errors)
```

To parse this declaration we call `ParseFile`, which expects a complete Go source file, so the source has to begin with a `package` clause (comments may come before it).

> Note: if you are parsing an expression such as `3 + 5`, or any other piece of code that can be considered a value, you can pass it to `ParseExpr` instead. That does not work for a function declaration, though.

Let's add `package main` to the beginning of the source and look at the AST we obtain.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"log"
)

func main() {
	fs := token.NewFileSet()
	f, err := parser.ParseFile(fs, "", "package main; var a = 3", parser.AllErrors)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(f)
}
```

Running it prints the following:

```
$ go run main.go
&{<nil> 1 main [0xc420054100] scope 0xc42000e210 {
	var a
}
 [] [] []}
```

Replace `Println` with `fmt.Printf("%#v", f)` and try again:

```
$ go run main.go
&ast.File{Doc:(*ast.CommentGroup)(nil), Package:1, Name:(*ast.Ident)(0xc42000a060), Decls:[]ast.Decl{(*ast.GenDecl)(0xc420054100)}, Scope:(*ast.Scope)(0xc42000e210), Imports:[]*ast.ImportSpec(nil), Unresolved:[]*ast.Ident(nil), Comments:[]*ast.CommentGroup(nil)}
```

That is better, but still hard to read. We can use `github.com/davecgh/go-spew/spew` to make the output more readable:

```go
package main

import (
	"go/parser"
	"go/token"
	"log"

	"github.com/davecgh/go-spew/spew"
)

func main() {
	fs := token.NewFileSet()
	f, err := parser.ParseFile(fs, "", "package main; var a = 3", parser.AllErrors)
	if err != nil {
		log.Fatal(err)
	}
	spew.Dump(f)
}
```

Running the program again gives much more readable output:

```
$ go run main.go
(*ast.File)(0xc42009c000)({
 Doc: (*ast.CommentGroup)(<nil>),
 Package: (token.Pos) 1,
 Name: (*ast.Ident)(0xc42000a1…)(main),
 Decls: ([]ast.Decl) (len=1 cap=1) {
  (*ast.GenDecl)(0xc420054100)({
   Doc: (*ast.CommentGroup)(<nil>),
   TokPos: (token.Pos) 15,
   Tok: (token.Token) var,
   Lparen: (token.Pos) 0,
   Specs: ([]ast.Spec) (len=1 cap=1) {
    (*ast.ValueSpec)(0xc4200802d0)({
     Doc: (*ast.CommentGroup)(<nil>),
     Names: ([]*ast.Ident) (len=1 cap=1) {
      (*ast.Ident)(0xc42000a140)(a)
     },
     Type: (ast.Expr) <nil>,
     Values: ([]ast.Expr) (len=1 cap=1) {
      (*ast.BasicLit)(0xc42000a160)({
       ValuePos: (token.Pos) 23,
       Kind: (token.Token) INT,
       Value: (string) (len=1) "3"
      })
     },
     Comment: (*ast.CommentGroup)(<nil>)
    })
   },
   Rparen: (token.Pos) 0
  })
 },
 Scope: (*ast.Scope)(0xc42000e2b0)(scope 0xc42000e2b0 {
	var a
}
),
 Imports: ([]*ast.ImportSpec) <nil>,
 Unresolved: ([]*ast.Ident) <nil>,
 Comments: ([]*ast.CommentGroup) <nil>
})
```

I recommend spending some time reading through this tree and matching each part of it to the source code.

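
The note earlier in this section mentions `ParseExpr` for code that can be read as a value. As a rough sketch of what that looks like, the program below parses `3 + 5` and dumps the tree with `ast.Print`, the standard library's own tree printer, used here in place of `spew` so that the example needs no third-party package. The `exprKind` helper is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"log"
)

// exprKind parses src as a single expression and returns the Go type
// of the root AST node. Hypothetical helper for this illustration.
func exprKind(src string) (string, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%T", expr), nil
}

func main() {
	expr, err := parser.ParseExpr("3 + 5")
	if err != nil {
		log.Fatal(err)
	}
	// A nil FileSet makes ast.Print show positions as plain integers.
	if err := ast.Print(nil, expr); err != nil {
		log.Fatal(err)
	}

	kind, err := exprKind("3 + 5")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("root node:", kind)

	// A function declaration is not an expression, so this reports an error.
	if _, err := exprKind("func f() {}"); err != nil {
		fmt.Println("not an expression:", err)
	}
}
```

The root node for `3 + 5` is an `*ast.BinaryExpr` whose operands are two `*ast.BasicLit` values, which mirrors the `Values` entry in the dump above.
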
We will talk about `Scope`, `Obj`, and `Unresolved` in a later section.

## From AST to code

Sometimes it is clearer to print an AST as source code than as a tree. With `go/printer` it is easy to print the source that an AST represents.

```go
package main

import (
	"go/parser"
	"go/printer"
	"go/token"
	"log"
	"os"
)

func main() {
	fs := token.NewFileSet()
	f, err := parser.ParseFile(fs, "", "package main; var a = 3", parser.AllErrors)
	if err != nil {
		log.Fatal(err)
	}
	printer.Fprint(os.Stdout, fs, f)
}
```

Running this code prints the source we parsed. Replacing `parser.AllErrors` with `parser.ImportsOnly` or other values produces different output.

## Navigating the AST

The AST holds all the information we want, but how do we find the parts we care about? This is where the `go/ast` package comes in handy.

We will use `ast.Walk`. This function takes two parameters. The second one is an `ast.Node`, an interface implemented by every node in the AST. The first one is an `ast.Visitor`, which is also an interface. It has a single method:

```go
type Visitor interface {
	Visit(node Node) (w Visitor)
}
```

We already have a node: the `ast.File` returned by `parser.ParseFile`. But we still need to create our own `ast.Visitor`. Let's implement one that prints the type of each node and returns itself.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"log"
)

func main() {
	fs := token.NewFileSet()
	f, err := parser.ParseFile(fs, "", "package main; var a = 3", parser.AllErrors)
	if err != nil {
		log.Fatal(err)
	}
	var v visitor
	ast.Walk(v, f)
}

type visitor struct{}

func (v visitor) Visit(n ast.Node) ast.Visitor {
	fmt.Printf("%T\n", n)
	return v
}
```

Running this program gives us a sequence of nodes, but without any tree structure. And what are those `nil` nodes? The documentation of `ast.Walk` explains them: the visitor we return is used to visit each child of the current node, and once all the children have been visited, `Visit` is called one more time with a `nil` node.

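
The `nil` calls can be observed directly. The following sketch uses a hypothetical `counter` visitor (not part of the original program) that tallies how often `Visit` receives a real node and how often it receives `nil`.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"log"
)

// counter is a hypothetical visitor that counts Visit calls with a
// non-nil node and Visit calls with a nil node.
type counter struct {
	nodes, nils int
}

func (c *counter) Visit(n ast.Node) ast.Visitor {
	if n == nil {
		c.nils++
		return nil
	}
	c.nodes++
	return c
}

// countVisits parses src as a Go file, walks it, and returns both counts.
func countVisits(src string) (nodes, nils int, err error) {
	fs := token.NewFileSet()
	f, err := parser.ParseFile(fs, "", src, parser.AllErrors)
	if err != nil {
		return 0, 0, err
	}
	c := &counter{}
	ast.Walk(c, f)
	return c.nodes, c.nils, nil
}

func main() {
	nodes, nils, err := countVisits("package main; var a = 3")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("nodes: %d, nil calls: %d\n", nodes, nils)
}
```

Because the visitor returns itself for every real node, `ast.Walk` closes each of the six nodes of this tiny file with exactly one `nil` call, so both counts are 6. Each `nil` call marks the point where the walk leaves a node, which is what allows a visitor to track depth.
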
Knowing this, we can print the result as a tree.

```go
type visitor int

func (v visitor) Visit(n ast.Node) ast.Visitor {
	if n == nil {
		return nil
	}
	fmt.Printf("%s%T\n", strings.Repeat("\t", int(v)), n)
	return v + 1
}
```

The rest of the program is unchanged, and running it gives the following output:

```
*ast.File
	*ast.Ident
	*ast.GenDecl
		*ast.ValueSpec
			*ast.Ident
			*ast.BasicLit
```

## What are the most common names per kind of identifier?

Now that we can parse code and visit the nodes of the AST, we can extract the information we are after: which variable names are the most common in a package.

The code starts out much like the one we wrote with `go/scanner`, reading a list of files from the command line.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"log"
	"os"
	"strings"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintf(os.Stderr, "usage:\n\t%s [files]\n", os.Args[0])
		os.Exit(1)
	}

	fs := token.NewFileSet()
	var v visitor

	for _, arg := range os.Args[1:] {
		f, err := parser.ParseFile(fs, arg, nil, parser.AllErrors)
		if err != nil {
			log.Printf("could not parse %s: %v", arg, err)
			continue
		}
		ast.Walk(v, f)
	}
}

type visitor int

func (v visitor) Visit(n ast.Node) ast.Visitor {
	if n == nil {
		return nil
	}
	fmt.Printf("%s%T\n", strings.Repeat("\t", int(v)), n)
	return v + 1
}
```

Running this code prints the AST of every file passed on the command line. We can try it on the `main.go` file we just wrote.

```
$ go build -o parser main.go && parser main.go
# output removed for brevity
```

Now let's change the visitor so that it keeps track of how many times each identifier is used in the different kinds of variable declarations.

We start by tracking short variable declarations, since those always declare local variables.

```go
type visitor struct {
	locals map[string]int
}

func (v visitor) Visit(n ast.Node) ast.Visitor {
	if n == nil {
		return nil
	}
	switch d := n.(type) {
	case *ast.AssignStmt:
		for _, name := range d.Lhs {
			if ident, ok := name.(*ast.Ident); ok {
				if ident.Name == "_" {
					continue
				}
				if ident.Obj != nil && ident.Obj.Pos() == ident.Pos() {
					v.locals[ident.Name]++
				}
			}
		}
	}
	return v
}
```

For each name on the left-hand side of an assignment we skip `_`, and we use the `Obj` field to trace the identifier back to its declaration. If `Obj` is nil, the variable was not declared in this file, so this is not a local variable declaration and we can ignore it. When the position of `Obj` matches the position of the identifier, the identifier sits at its own declaration site, and we count it.

Running this code on the standard library gives:

```
7761 err
6310 x
5446 got
4702 i
3821 c
```

Interestingly, `v` does not show up. Are we missing some way of declaring local variables?

## Considering parameters and range variables

We are missing a couple of node types that also declare local variables:

- function parameters, receivers, and named return values
- range statements

Since most of the code from before is reused, let's define a method for it.

```go
func (v visitor) local(n ast.Node) {
	ident, ok := n.(*ast.Ident)
	if !ok {
		return
	}
	if ident.Name == "_" || ident.Name == "" {
		return
	}
	if ident.Obj != nil && ident.Obj.Pos() == ident.Pos() {
		v.locals[ident.Name]++
	}
}
```

For parameters, return values, and method receivers we get a list of fields, each holding a list of identifiers. Let's define a method that handles such a list:

```go
func (v visitor) localList(fs []*ast.Field) {
	for _, f := range fs {
		for _, name := range f.Names {
			v.local(name)
		}
	}
}
```

With that, we can handle every node type that declares local variables:

```go
	case *ast.AssignStmt:
		if d.Tok != token.DEFINE {
			return v
		}
		for _, name := range d.Lhs {
			v.local(name)
		}
	case *ast.RangeStmt:
		v.local(d.Key)
		v.local(d.Value)
	case *ast.FuncDecl:
		// Recv is nil for plain functions; only methods have a receiver.
		if d.Recv != nil {
			v.localList(d.Recv.List)
		}
		v.localList(d.Type.Params.List)
		if d.Type.Results != nil {
			v.localList(d.Type.Results.List)
		}
```

Now let's run the code:

```shell
$ ./parser ~/go/src/**/*.go
most common local variable names
 12264 err
  9395 t
  91… x
  7442 i
  6127 c
```

## Handling var declarations

Next we need to handle `var` declarations. These can declare either global or local variables, and the only way to tell the two apart is to check whether the declaration sits at the `ast.File` level.

To do that, we create a new visitor for each file. The visitor keeps track of the file's package-level declarations so that we can count identifiers correctly. We add a `pkgDecls` field of type `map[*ast.GenDecl]bool` to the struct and initialize it in a `newVisitor` function that creates the visitor. We also add a `globals` field to count how many times each global variable identifier is declared.

```go
type visitor struct {
	pkgDecls map[*ast.GenDecl]bool
	globals  map[string]int
	locals   map[string]int
}

func newVisitor(f *ast.File) visitor {
	decls := make(map[*ast.GenDecl]bool)
	for _, decl := range f.Decls {
		if d, ok := decl.(*ast.GenDecl); ok {
			decls[d] = true
		}
	}

	return visitor{
		decls,
		make(map[string]int),
		make(map[string]int),
	}
}
```

Our main function now needs to create a new visitor per file and accumulate the results:

```go
locals, globals := make(map[string]int), make(map[string]int)

for _, arg := range os.Args[1:] {
	f, err := parser.ParseFile(fs, arg, nil, parser.AllErrors)
	if err != nil {
		log.Printf("could not parse %s: %v", arg, err)
		continue
	}

	v := newVisitor(f)
	ast.Walk(v, f)
	for k, v := range v.locals {
		locals[k] += v
	}
	for k, v := range v.globals {
		globals[k] += v
	}
}
```

The last piece is to handle `*ast.GenDecl` nodes and find all the variable declarations in them:

```go
	case *ast.GenDecl:
		if d.Tok != token.VAR {
			return v
		}
		for _, spec := range d.Specs {
			if value, ok := spec.(*ast.ValueSpec); ok {
				for _, name := range value.Names {
					if name.Name == "_" {
						continue
					}
					if v.pkgDecls[d] {
						v.globals[name.Name]++
					} else {
						v.locals[name.Name]++
					}
				}
			}
		}
```

We only count declarations whose token is `token.VAR`, so constants, types, and other kinds of identifiers are ignored. For each declared name we check whether it is global or local, count it accordingly, and skip `_`.

The full version of the program is available [here](https://github.com/campoy/justforfunc/blob/master/25-go-parser/main.go). Running it gives:

```shell
$ ./parser ~/go/src/**/*.go
most common local variable names
 12565 err
  9876 x
  9464 t
  7554 i
  6226 b
most common global variable names
    29 errors
    28 signals
    23 failed
    15 tests
    12 debug
```

So there you have it: the most common local variable name is `err`, and the most common global (package-level) variable name is `errors`.

Which constant names are the most common? How would you find them?
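
As one possible starting point for the question about constant names, here is a minimal sketch rather than the author's solution. It applies the same `*ast.GenDecl` check with `token.CONST` instead of `token.VAR`, and it traverses with `ast.Inspect`, a function-based alternative to writing a full `ast.Visitor`. The `constNames` helper and the sample source are assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"log"
)

// constNames counts how many times each name is declared in a const
// declaration in src, at any scope. Hypothetical helper for this sketch.
func constNames(src string) (map[string]int, error) {
	fs := token.NewFileSet()
	f, err := parser.ParseFile(fs, "", src, parser.AllErrors)
	if err != nil {
		return nil, err
	}

	counts := make(map[string]int)
	ast.Inspect(f, func(n ast.Node) bool {
		d, ok := n.(*ast.GenDecl)
		if !ok || d.Tok != token.CONST {
			return true // keep descending into other nodes
		}
		for _, spec := range d.Specs {
			vs, ok := spec.(*ast.ValueSpec)
			if !ok {
				continue
			}
			for _, name := range vs.Names {
				if name.Name != "_" {
					counts[name.Name]++
				}
			}
		}
		return true
	})
	return counts, nil
}

func main() {
	// Assumed sample input with package-level and local constants.
	src := `package p

const debug = false

const (
	a, b = 1, 2
)

func f() {
	const debug = true
	var c = 3
	_ = c
}
`
	counts, err := constNames(src)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(counts)
}
```

The sketch does not separate global constants from local ones. The `pkgDecls` map from the visitor above could be reused for that, in the same way it is used for `var` declarations.
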

via: https://medium.com/@francesc/understanding-go-programs-with-go-parser-c4e88a6edb87

Author: Johnkoepi, Translator: Saberuster, Proofreader: polaris1119

This article was originally translated by GCTT and published by the Go Language Chinese Network. Want to join the translators and contribute to open source? You are welcome to join GCTT!

Translations are published for learning and exchange only, under the terms of the CC-BY-NC-SA license. If our work infringes on your rights, please contact us promptly.

Reposting is welcome under the CC-BY-NC-SA license; please keep the links to the original and the translation, along with the author and translator information.

The article represents only the author's knowledge and views. If you see things differently, please leave a comment below.
