[Translation] Compiler (6): Tokens

Part 1: Introduction
Part 2: Compilation, Translation, and Interpretation
Part 3: Compiler Design Overview
Part 4: Language Design Overview
Part 5: Calc 1 Language Specification

In this article, we can finally begin to immerse ourselves in the code!

Tokens

In the previous articles we discussed the syntax and the set of tokens that need to be scanned. We defined expressions, numbers, and operators. Parentheses are expected to come in matching pairs. The scanner also needs to let the parser know when it reaches the end of the file.

Before we can start scanning, we need to define the tokens in code so that the scanner has something to work with. Tokens are used at every stage of the compiler. If we ever want to build a tool like Go's fmt or vet, we may need to reuse the tokens there as well.

This is the first part of the code: https://github.com/rthornton128/calc/blob/calc1/token/token.go

The constants may look a little odd at first. A few unexported constants, whose names start with a lowercase letter, are mixed in among the exported ones, whose names start with a capital letter. The unexported constants help us write utility functions and allow the language to be extended without modifying any other code.

https://github.com/rthornton128/calc/blob/calc1/token/token.go#L36

Next, each token (Token) is mapped to a string. An array of strings would also work, but I did not use one. With the mapping in place, the query function (Lookup) is easier to write.
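
To make the mapping concrete, here is a minimal sketch of a token-to-string map with a lookup function. It is not the code from the repository: the token set, the name tokStrings, and the body of Lookup are assumptions chosen for illustration.

```go
package main

import "fmt"

// Token identifies a lexical token. This small set is an assumption
// made for the example, not the repository's full list.
type Token int

const (
	ILLEGAL Token = iota
	EOF
	INTEGER
	ADD
	SUB
)

// tokStrings (hypothetical name) maps each token to its textual form.
var tokStrings = map[Token]string{
	ILLEGAL: "Illegal",
	EOF:     "EOF",
	INTEGER: "Integer",
	ADD:     "+",
	SUB:     "-",
}

// String returns the text for t, so a Token prints nicely in messages.
func (t Token) String() string { return tokStrings[t] }

// Lookup returns the token whose text matches s, or ILLEGAL if none does.
func Lookup(s string) Token {
	for t, str := range tokStrings {
		if str == s {
			return t
		}
	}
	return ILLEGAL
}

func main() {
	fmt.Println(Lookup("+") == ADD) // true
	fmt.Println(Lookup("-"))        // -
	fmt.Println(Lookup("?"))        // Illegal
}
```

Because every string in the map is unique, the result of Lookup does not depend on Go's randomized map iteration order.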

https://github.com/rthornton128/calc/blob/calc1/token/token.go#L50

The rest are utility functions. In IsLiteral and IsOperator you can see where the unexported constants come in handy. No matter how many new operators or grammar symbols you add, these functions never need to change. Convenient!
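
The marker technique can be sketched roughly as follows. This is an illustration under assumptions, not a copy of the repository: the marker names (litStart, litEnd, opStart, opEnd) and the exact token list are made up for the example.

```go
package main

import "fmt"

// Token identifies a lexical token (illustrative set only).
type Token int

const (
	EOF Token = iota
	ILLEGAL

	litStart // unexported marker: literals come after this
	INTEGER
	litEnd // unexported marker: literals come before this

	opStart // unexported marker: operators and symbols come after this
	LPAREN
	RPAREN
	ADD
	SUB
	MUL
	QUO
	REM
	opEnd // unexported marker: operators and symbols come before this
)

// IsLiteral reports whether t lies strictly between the literal markers.
func (t Token) IsLiteral() bool { return t > litStart && t < litEnd }

// IsOperator reports whether t lies strictly between the operator markers.
func (t Token) IsOperator() bool { return t > opStart && t < opEnd }

func main() {
	fmt.Println(INTEGER.IsLiteral(), INTEGER.IsOperator()) // true false
	fmt.Println(ADD.IsLiteral(), ADD.IsOperator())         // false true
	fmt.Println(EOF.IsLiteral(), EOF.IsOperator())         // false false
}
```

Adding a new operator only means placing another constant between opStart and opEnd; the range checks pick it up without any change to the functions.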

https://github.com/rthornton128/calc/blob/calc1/token/token.go#L58

Lookup, String, and Valid provide help when generating error messages.

Position

This file may take some time to digest. I'll try to explain it step by step.

Scanning starts at the first character of the input and proceeds from top to bottom, left to right. The offset of the first character is zero.

In contrast, when users want to know where an error was reported, they expect the first character to be at line one, column one. The location of a character therefore needs to be translated into something that is meaningful to the end user.

A position (Pos) is the offset of a character plus the base of its file. If the base is one and the offset of a character is zero, the Pos of that character is one.

A position of zero is illegal because it refers to a place outside the file. Likewise, a position larger than the file's base plus the length of the file is illegal.
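
The arithmetic can be written down in a few lines. This is a sketch under stated assumptions rather than the repository's code: "base" here means the starting value added to every offset in a file, and the names NoPos and valid are hypothetical helpers introduced for the example.

```go
package main

import "fmt"

// Pos is a character's offset plus the base of the file it belongs to.
type Pos int

// NoPos (hypothetical name) is the zero position, which is always illegal.
const NoPos Pos = 0

// valid (hypothetical helper) reports whether p falls inside a file with
// the given base and size in bytes.
func valid(p Pos, base, size int) bool {
	return p != NoPos && int(p) >= base && int(p) <= base+size
}

func main() {
	base, src := 1, "(+ 1 2)"
	first := Pos(base + 0) // offset 0 in a file whose base is 1

	fmt.Println(first)                                       // 1
	fmt.Println(valid(first, base, len(src)))                // true
	fmt.Println(valid(NoPos, base, len(src)))                // false
	fmt.Println(valid(Pos(base+len(src)+1), base, len(src))) // false
}
```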

Why go to all this trouble? When you need to parse multiple files, working out which file an error came from is a hassle without some extra bookkeeping. Pos makes this easier. More on this in a later article.

The Position type is used strictly for error reporting. It lets us print clear messages stating the line, the column, and the file in which an error occurred. At this stage we only need to deal with a single file, but in the future we will be grateful for this code.

File

Strictly speaking, File is not needed to write a compiler, but I still think clear error messages are a vitally important feature. Go does a good job here, but some other compilers do not. The GNU C compiler, for example, has often been frustrating in this respect, although it has improved a lot over the years.

The code lays the groundwork for features that will come later. If we one day need to process more than one file, we will use it.

Its core purpose is error reporting, and it is closely tied to the position code. Again, since we only have one file, the base (the starting position) is one. It can never be smaller than that, but it could be larger in the future, so don't dwell on it for now.

Each time the scanner encounters a newline character, it needs to add that location to the file's list of lines. This allows the Position function to calculate where an error occurred and report the location.
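
As a rough sketch of how a line list can turn a Pos back into a line and column, consider the following. The type layout, the names NewFile and AddLine, and the choice to record the offset just past each newline are assumptions for illustration, not a description of the repository's code.

```go
package main

import "fmt"

type Pos int

// Position is the human-readable form of a Pos.
type Position struct {
	Filename  string
	Line, Col int
}

func (p Position) String() string {
	return fmt.Sprintf("%s:%d:%d", p.Filename, p.Line, p.Col)
}

// File records where each line of a source file starts.
type File struct {
	name  string
	base  int
	lines []int // offset of the first character of each line
}

// NewFile creates a File whose first line starts at offset 0.
func NewFile(name string, base int) *File {
	return &File{name: name, base: base, lines: []int{0}}
}

// AddLine records that a new line starts at offset.
func (f *File) AddLine(offset int) { f.lines = append(f.lines, offset) }

// Position converts p into a 1-based line and column.
func (f *File) Position(p Pos) Position {
	offset := int(p) - f.base
	line := 0
	for i, start := range f.lines {
		if start > offset {
			break
		}
		line = i
	}
	return Position{Filename: f.name, Line: line + 1, Col: offset - f.lines[line] + 1}
}

func main() {
	src := "(+ 1\n   x)"
	f := NewFile("example.calc", 1)
	for i, c := range src {
		if c == '\n' {
			f.AddLine(i + 1) // the next line starts just past the newline
		}
	}
	bad := Pos(f.base + 8) // the 'x' sits at offset 8
	fmt.Println(f.Position(bad)) // example.calc:2:4
}
```

A linear scan over the line list keeps the sketch short; a binary search would be the natural choice for large files.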

Summary

That about covers tokens. As I promised, there is not much explanation of the code itself.

I recommend that you review the code often. Once you understand how the pieces work together, it will make much more sense. This library is used extensively throughout the compiler, so we will keep coming back to it.
