BNF (Backus-Naur Form) is a notation for describing context-free grammars, and thus a class of formal languages. Although BNF can also represent the syntax of parts of some natural languages, it is far more widely used to describe the syntax of programming languages, instruction sets, and communication protocols. Most textbooks on programming language theory or formal semantics use BNF. Several variants also appear in the literature, such as Extended Backus-Naur Form (EBNF) and Augmented Backus-Naur Form (ABNF).
A BNF specification is a set of derivation rules (production rules), written:
<symbol> ::= <expression built from symbols>
Here <symbol> is a nonterminal, and the expression consists of one or more sequences of symbols separated by the vertical bar '|', which indicates a choice. Each sequence is a possible replacement for the symbol on the left-hand side. A symbol that never appears on a left-hand side is called a terminal.
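The distinction between nonterminals and terminals can be sketched in a few lines of Python. The toy grammar and the function names below are hypothetical, chosen only for illustration: a symbol counts as a terminal when it is used on some right-hand side but never defined on a left-hand side.

```python
# Hypothetical toy grammar: each key is a nonterminal, each value is a list
# of alternatives, and each alternative is a sequence of symbols.
GRAMMAR = {
    "<integer>": [["<digit>"], ["<digit>", "<integer>"]],
    "<digit>": [["0"], ["1"]],
}


def nonterminals(grammar):
    """Symbols that appear on the left-hand side of some rule."""
    return set(grammar)


def terminals(grammar):
    """Symbols that appear on a right-hand side but never on the left."""
    used = {sym for alts in grammar.values() for seq in alts for sym in seq}
    return used - nonterminals(grammar)


print(sorted(nonterminals(GRAMMAR)))  # ['<digit>', '<integer>']
print(sorted(terminals(GRAMMAR)))     # ['0', '1']
```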
Extended Backus-Naur Form
Extended Backus-Naur Form (EBNF) is a metasyntax notation for formally describing computer programming languages and other formal languages. It is an extension of the basic BNF notation.
It was originally developed by Niklaus Wirth, and the most common EBNF variants are defined by standards, most notably ISO/IEC 14977.
Basics
Source code, such as that of a computer program, consists of terminal symbols: visible characters, digits, punctuation marks, and white space.
EBNF defines production rules that assign sequences of symbols to nonterminals:
digit excluding zero = "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9";
digit = "0" | digit excluding zero;
This production rule defines the nonterminal digit, which appears on the left-hand side of the assignment. The vertical bar denotes an alternative, terminals are enclosed in quotation marks, and a semicolon terminates the rule. A digit is therefore either 0 or a digit excluding zero, which can be 1, 2, 3, and so on up to 9.
A production rule can also contain a sequence of terminals or nonterminals, separated by commas:
twelve = "1", "2";
two hundred one = "2", "0", "1";
three hundred twelve = "3", twelve;
twelve thousand two hundred one = twelve, two hundred one;
Expressions that may be omitted or repeated are written in curly braces { ... }:
natural number = digit excluding zero, { digit };
In this case the strings 1, 2, ..., 10, ..., 12345, ... are all valid expressions. Everything inside the curly braces may be repeated any number of times, including not at all.
An option is written in square brackets [ ... ]:
integer = "0" | [ "-" ], natural number;
An integer is therefore either a zero (0) or a natural number that may be preceded by an optional minus sign.
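The rules for digit, natural number, and integer translate almost directly into hand-written checks. The following is a minimal Python sketch, not a general EBNF engine; the function names are made up for this example, and each function tests a whole string against one rule.

```python
NONZERO = "123456789"    # digit excluding zero
DIGITS = "0" + NONZERO   # digit = "0" | digit excluding zero


def is_natural_number(s):
    """natural number = digit excluding zero, { digit };"""
    if not s or s[0] not in NONZERO:
        return False
    # { digit }: zero or more further digits
    return all(ch in DIGITS for ch in s[1:])


def is_integer(s):
    """integer = "0" | [ "-" ], natural number;"""
    if s == "0":
        return True
    if s.startswith("-"):  # [ "-" ]: optional leading minus sign
        s = s[1:]
    return is_natural_number(s)


for text in ["0", "7", "-12", "12345", "012", "-0", ""]:
    print(repr(text), is_integer(text))
```

Under these rules "012" and "-0" are rejected: a natural number must start with a nonzero digit, and zero is only reachable through the first alternative of integer.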
EBNF also provides notation for repeating something a specified number of times, for excluding part of a production, and for inserting comments into an EBNF grammar.
Extensibility under ISO 14977
The ISO 14977 standard provides two facilities for extending EBNF. The first is the special sequence, which is part of the EBNF grammar: arbitrary text enclosed in question marks, whose interpretation is beyond the scope of the EBNF standard. For example, the space character could be defined by the following rule:
space = ? US-ASCII character 32 ?;
The second relies on the fact that, in EBNF, parentheses cannot directly follow an identifier (the two must be joined by a comma). The following is therefore not valid EBNF:
something = foo ( bar );
An extension of EBNF can therefore use this notation for its own purposes. For example, in a Lisp grammar, function application could be defined by the following rule:
function application = list( symbol, [ { expression } ] );
Motivation for extending BNF
BNF has the problem that options and repetitions cannot be expressed directly. Instead, they require an intermediate rule or a pair of alternative productions. An option is defined as a rule that is either empty or the optional production. A repetition is defined recursively as a rule that is either the repeated production or itself followed by the repeated production. The same constructions remain available in EBNF.
Option:
signed number = [ sign ], number;
can be defined in BNF style as:
signed number = sign, number | number;
or
signed number = optional sign, number;
optional sign = ε | sign; (* ε marks the empty production explicitly *)
Repetition:
number = { digit }, digit;
can be defined in BNF style as:
number = digit | number, digit;
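To see that a recursive rule and a repetition describe the same set of strings, here is a minimal Python sketch with hypothetical function names. It compares the left-recursive rule number = digit | number, digit with a loop that simply requires one or more digits.

```python
DIGITS = "0123456789"


def is_number_recursive(s):
    """number = digit | number, digit  (left-recursive BNF style)."""
    if len(s) == 1:
        return s in DIGITS  # base case: a single digit
    if len(s) > 1:
        # recursive case: a shorter number followed by one digit
        return s[-1] in DIGITS and is_number_recursive(s[:-1])
    return False  # the empty string is not a number


def is_number_iterative(s):
    """One or more digits, checked with a loop (the repetition reading)."""
    return len(s) > 0 and all(ch in DIGITS for ch in s)


for text in ["", "7", "42", "007", "4a2"]:
    print(repr(text), is_number_recursive(text), is_number_iterative(text))
```

Both functions agree on every input; the recursive one mirrors the BNF rule, while the loop mirrors what the curly braces express in EBNF.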
Other additions and modifications
EBNF removes some weaknesses of BNF:
- BNF uses the symbols (<, >, |, ::=) for itself. When these appear in the language being defined, BNF cannot be used without modification or explanation.
- A BNF grammar can represent only one rule per line.
EBNF solves these problems:
- Terminals are strictly enclosed in quotation marks ("..." or '...'). The angle brackets ("<...>") around nonterminals can be omitted.
- A rule is normally terminated by a semicolon.
In addition, EBNF provides mechanisms for specifying a number of repetitions, for excluding alternatives (for example, all characters except quotation marks), and for writing comments.
Despite all these enhancements, EBNF is no more powerful than BNF in terms of the languages it can define. In principle, any grammar defined in EBNF can also be expressed in BNF, although this often requires considerably more rules.
EBNF has been standardized by ISO as ISO/IEC 14977:1996(E).
In some contexts, any extended BNF is called EBNF. For example, the W3C uses one such extended notation to specify XML.
Another example
A simple programming language that allows only assignments can be defined in EBNF as follows:
(* a simple program in EBNF - Wikipedia *)
program = 'PROGRAM', white space, identifier, white space,
          'BEGIN', white space,
          { assignment, ";", white space },
          'END.';
identifier = alphabetic character, [ { alphabetic character | digit } ];
number = [ "-" ], digit, [ { digit } ];
string = '"', { all characters - '"' }, '"';
assignment = identifier, ":=", ( number | identifier | string );
alphabetic character = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z";
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9";
white space = ? white space characters ?;
all characters = ? all visible characters ?;
A syntactically correct program:
PROGRAM DEMO1
BEGIN
  A0:=3;
  B:=45;
  H:=-100023;
  C:=A;
  D123:=B34A;
  BABOON:=GIRAFFE;
  TEXT:="Hello world!";
END.
This language can easily be extended with control flow, arithmetic expressions, and input/output instructions, yielding a small, usable programming language.
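A grammar this small can be checked with a single regular expression. The Python sketch below is an approximation under stated assumptions, not a faithful EBNF implementation: it assumes uppercase keywords (PROGRAM, BEGIN, END.), treats white space as one or more whitespace characters, and approximates "all characters" as anything except a double quote. The names IDENT, NUMBER, STRING, ASSIGN, PROGRAM, and is_program are invented for this example.

```python
import re

IDENT = r"[A-Z][A-Z0-9]*"   # identifier: a letter, then letters or digits
NUMBER = r"-?[0-9]+"        # number = [ "-" ], digit, { digit }
STRING = r'"[^"]*"'         # string: quoted text without a double quote inside
ASSIGN = rf"{IDENT}:=(?:{NUMBER}|{IDENT}|{STRING})"

# program = 'PROGRAM', ws, identifier, ws, 'BEGIN', ws,
#           { assignment, ";", ws }, 'END.'
PROGRAM = re.compile(
    rf"PROGRAM\s+{IDENT}\s+BEGIN\s+(?:{ASSIGN};\s+)*END\."
)


def is_program(text):
    """Return True if the whole text matches the program rule."""
    return PROGRAM.fullmatch(text) is not None


SAMPLE = '''PROGRAM DEMO1
BEGIN
  A0:=3;
  B:=45;
  H:=-100023;
  C:=A;
  D123:=B34A;
  BABOON:=GIRAFFE;
  TEXT:="Hello world!";
END.'''

print(is_program(SAMPLE))                       # True
print(is_program("PROGRAM X BEGIN A:=1 END."))  # False: missing ";"
```

A regular expression is enough here only because the grammar has no nesting; once arithmetic expressions or control flow are added, a real parser becomes necessary.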
The standard proposes the following symbols for the notation:
Usage | Notation
definition | =
concatenation | ,
termination | ;
alternation | |
option | [ ... ]
repetition | { ... }
grouping | ( ... )
terminal string in double quotes | " ... "
terminal string in single quotes | ' ... '
comment | (* ... *)
special sequence | ? ... ?
exception | -
Conventions
The following conventions are used:
- Each meta-identifier of Extended BNF is written as one or more words joined by hyphens.
- A meta-identifier ending in "-symbol" is the name of a terminal symbol of Extended BNF.
The normal character representing each operator of Extended BNF, together with its implied precedence (highest precedence at the top), is:
* repetition-symbol
- except-symbol
, concatenate-symbol
| definition-separator-symbol
= defining-symbol
; terminator-symbol
The normal precedence is overridden by the following bracket pairs:
' first-quote-symbol ... first-quote-symbol '
" second-quote-symbol ... second-quote-symbol "
(* start-comment-symbol ... end-comment-symbol *)
( start-group-symbol ... end-group-symbol )
[ start-option-symbol ... end-option-symbol ]
{ start-repeat-symbol ... end-repeat-symbol }
? special-sequence-symbol ... special-sequence-symbol ?
As examples, the following syntax rules illustrate the ways of expressing repetition:
aa = "A";
bb = 3 * aa, "B";
cc = 3 * [aa], "C";
dd = {aa}, "D";
ee = aa, {aa}, "E";
ff = 3 * aa, 3 * [aa], "F";
gg = {3 * aa}, "D";
The terminal strings defined by these rules are as follows:
aa: A
bb: AAAB
cc: C AC AAC AAAC
dd: D AD AAD AAAD AAAAD etc.
ee: AE AAE AAAE AAAAE AAAAAE etc.
ff: AAAF AAAAF AAAAAF AAAAAAF
gg: D AAAD AAAAAAD etc.
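The strings derived by rules of this kind can be enumerated mechanically. The following Python sketch models each rule as a set of strings and uses small helper functions (concat, times, option, repeat, all hypothetical names for this example). Because { ... } describes an infinite set, repeat is cut off at a chosen limit.

```python
from itertools import product


def concat(*parts):
    """Concatenation: every way of picking one string from each part."""
    return {"".join(combo) for combo in product(*parts)}


def times(n, lang):
    """n * lang: exactly n copies of a string from lang."""
    return concat(*[lang] * n)


def option(lang):
    """[ lang ]: the empty string or one string from lang."""
    return {""} | lang


def repeat(lang, limit):
    """{ lang }: zero up to `limit` copies (truncated; the full set is infinite)."""
    out = set()
    for n in range(limit + 1):
        out |= times(n, lang)
    return out


def by_length(lang):
    """Sort strings by length, then alphabetically, for readable output."""
    return sorted(lang, key=lambda s: (len(s), s))


aa = {"A"}
bb = concat(times(3, aa), {"B"})
cc = concat(times(3, option(aa)), {"C"})
ff = concat(times(3, aa), times(3, option(aa)), {"F"})
gg = concat(repeat(times(3, aa), 2), {"D"})

print(by_length(bb))  # ['AAAB']
print(by_length(cc))  # ['C', 'AC', 'AAC', 'AAAC']
print(by_length(ff))  # ['AAAF', 'AAAAF', 'AAAAAF', 'AAAAAAF']
print(by_length(gg))  # ['D', 'AAAD', 'AAAAAAD']
```

Using sets removes duplicates automatically: 3 * [aa] can produce "A" in three different ways, but the string appears only once in the result.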
Related work
- The W3C uses a different EBNF variant to specify the XML syntax.
- The British Standards Institution published an EBNF standard in 1981: BS 6154.
- The IETF uses Augmented BNF (ABNF), defined in RFC 4234.
P.S. Because the Chinese Wikipedia is not accessible, this article reprints its entries on Backus-Naur Form and Extended Backus-Naur Form.