After publishing my previous post, I felt it did not get its point across very well, so I wrote this article to explain it. Consider it a beginner's primer for the previous article, JavaScript lexical grammar (http://www.cnblogs.com/winter-cn/archive/2012/04/17/2454229.html) :)
Every language, whether natural or programming, can be roughly divided into lexical rules and syntax. Words are the smallest meaningful units in a language.
There is a common joke: "I'm sure I can learn English well. I already know all 26 letters!" Everything in English is indeed written with those 26 letters, so why does knowing all of them not let us speak English?
The answer is that a letter is not a word, and a single letter on its own is meaningless. Only when letters form a word can they express meaning. That is why the most important part of learning English is memorizing words.
Suppose someone asks you what h means. It has no meaning; h makes sense only when it appears in a word such as help or hello.
The same holds for computer languages: an individual Unicode or ASCII character means nothing by itself. Characters make sense only once they form a word.
Therefore, the most basic rules of any computer language are its lexical rules. In JavaScript, these "words" are already familiar to everyone. For example:
Keywords such as if, while, else, for, and function
User-defined names for variables and the like, such as Cat, Dog, and play, officially called identifiers
Literals such as "Abc", 13.5, /abc/g, true, and false, which write a value directly in the source
Symbols such as parentheses, square brackets, curly braces, and plus signs
Line breaks such as carriage returns
Whitespace characters such as spaces and tabs
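To make the list above concrete, here is a minimal sketch of a scanner that cuts a line of source into such words. The names RULES, KEYWORDS, LITERAL_WORDS, and scan are made up for this illustration, and the rules are heavily simplified; a real JavaScript lexer also has to handle comments, escapes, regular expression literals, Unicode identifiers, and much more.

```javascript
// Hypothetical, simplified scanner. Each rule is tried at the current position
// using a sticky (y) regular expression, so a match must start exactly there.
const RULES = [
  ["LineTerminator", /\r\n|[\n\r\u2028\u2029]/y],
  ["WhiteSpace", /[ \t]+/y],
  ["Word", /[A-Za-z_$][A-Za-z0-9_$]*/y], // keyword, identifier, or true/false/null
  ["NumericLiteral", /\d+(?:\.\d+)?/y],
  ["StringLiteral", /"[^"\n]*"/y],
  ["Punctuator", /[(){}\[\];,.<>=+\-*\/!]/y],
];

// Only the keywords mentioned in the list above; the real set is larger.
const KEYWORDS = new Set(["if", "while", "else", "for", "function"]);
const LITERAL_WORDS = new Set(["true", "false", "null"]);

function scan(source) {
  const words = [];
  let pos = 0;
  while (pos < source.length) {
    let matched = false;
    for (const [kind, pattern] of RULES) {
      pattern.lastIndex = pos;
      const m = pattern.exec(source);
      if (m === null) continue;
      let type = kind;
      if (kind === "Word") {
        // A word-shaped match is refined into one of three categories.
        if (KEYWORDS.has(m[0])) type = "Keyword";
        else if (LITERAL_WORDS.has(m[0])) type = "Literal";
        else type = "Identifier";
      }
      words.push({ type, text: m[0] });
      pos += m[0].length;
      matched = true;
      break;
    }
    if (!matched) throw new SyntaxError("Unexpected character at position " + pos);
  }
  return words;
}

// Usage: print the category of every non-blank word.
const kinds = scan("if (Cat) play = 13.5")
  .filter((w) => w.type !== "WhiteSpace")
  .map((w) => w.type);
console.log(kinds.join(" "));
// Keyword Punctuator Identifier Punctuator Identifier Punctuator NumericLiteral
```

Trying the rules in a fixed order keeps the sketch short: 13.5 is claimed by the numeric rule before the punctuator rule ever sees the dot.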
Note
After many years of development, the lexical definitions of computer languages have settled into a common pattern. Generally, all valid "words" are collectively called input elements (InputElement).
Among the input elements, all meaningful words are called tokens (there is no widely accepted translation for this term, so it is kept as is). The general understanding is that input elements other than tokens can simply be discarded once they have been scanned (in practice this does not hold for most languages, and it does not hold for JavaScript). That is why most lexical analysis programs are called lexers, though some people prefer the name tokenizer.
Besides the meaningful tokens, the other input elements exist to format the source code or make it more readable. In JS there are only three kinds:
WhiteSpace: whitespace
LineTerminator: line terminators
Comment: comments
All three are easy to understand and are used all the time.
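A small example shows why a JavaScript lexer cannot simply throw LineTerminator away. Because of automatic semicolon insertion, a line break directly after return ends the statement. The function names sameLine and nextLine are made up for this illustration.

```javascript
// The value follows `return` on the same line, so it is returned.
function sameLine() {
  return 42;
}

// A line terminator right after `return` ends the statement:
// the parser reads this as `return;` followed by the statement `42;`.
function nextLine() {
  return
  42;
}

console.log(sameLine()); // 42
console.log(nextLine()); // undefined
```

The two functions differ only in a line break, yet they behave differently, so the line terminator has to survive lexical analysis and reach the parser.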
Most of the lexical differences between programming languages show up in their tokens. JS has only eight kinds of token:
| Grammar symbol | Name | Description | Example |
| --- | --- | --- | --- |
| Token | Lexical token | Every lexical element in JS that carries actual meaning | |
| ┣ IdentifierName | Identifier name | A word that starts with a letter, _, or $. It can be used as a property name. | Abc |
| ┃ ┣ Identifier | Identifier | An IdentifierName that is not a reserved word; can be used as a variable or property name | abc |
| ┃ ┣ Keyword | Keyword | An IdentifierName with special syntactic meaning | while |
| ┃ ┣ NullLiteral | Null literal | Represents the null value | null |
| ┃ ┗ BooleanLiteral | Boolean literal | Represents a Boolean value | true |
| ┣ Punctuator | Punctuator | A punctuation mark with special meaning | * |
| ┣ NumericLiteral | Numeric literal | Represents a value of the Number type | .12e-10 |
| ┣ StringLiteral | String literal | Represents a value of the String type | "Hello world!" |
| ┗ RegularExpressionLiteral | Regular expression literal | Represents an object of the RegExp class | /[a-z]+$/g |
These tokens make up essentially all of the JS lexical grammar. They can be combined to form syntactic structures such as expressions, statements, and function definitions, and ultimately a program with powerful expressive capabilities.
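As a rough illustration of how the IdentifierName family splits into its four kinds, the sketch below classifies a word that has already been recognized as an IdentifierName. The names ES5_KEYWORDS and classifyIdentifierName are hypothetical, and the keyword list assumes the ECMAScript 5 edition; later editions reserve more words.

```javascript
// Keywords as listed in ECMAScript 5 (an assumption of this sketch).
const ES5_KEYWORDS = new Set([
  "break", "case", "catch", "continue", "debugger", "default", "delete",
  "do", "else", "finally", "for", "function", "if", "in", "instanceof",
  "new", "return", "switch", "this", "throw", "try", "typeof", "var",
  "void", "while", "with",
]);

// Takes a string that is already known to be an IdentifierName and
// reports which of the four sub-kinds it belongs to.
function classifyIdentifierName(name) {
  if (name === "null") return "NullLiteral";
  if (name === "true" || name === "false") return "BooleanLiteral";
  if (ES5_KEYWORDS.has(name)) return "Keyword";
  return "Identifier";
}

console.log(classifyIdentifierName("while")); // Keyword
console.log(classifyIdentifierName("abc"));   // Identifier
console.log(classifyIdentifierName("null"));  // NullLiteral
console.log(classifyIdentifierName("true"));  // BooleanLiteral

// Any IdentifierName, including a keyword, can serve as a property name.
const obj = { while: 1 };
console.log(obj.while); // 1
```

The last two lines show the distinction in practice: while cannot name a variable, but it is still a valid IdentifierName and therefore a valid property name.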