This is an SGF parser for Go game records, implemented independently by liigo, and this article describes its implementation details. Complete open-source SGF parsers can certainly be found online. I neither used them directly nor consulted their code, and instead wrote my own from scratch, because I wanted to reinvent the wheel myself and believe doing so helps improve my coding ability. (I plan to write a separate article about my still-immature argument that "we must learn to reinvent the wheel".)
This SGF parser developed by liigo uses a simple event-based API, similar to SAX (Simple API for XML) in XML parsers. At its core, the user provides a set of callback functions in advance; during parsing, the parser calls the relevant callbacks in order and passes in the corresponding arguments, and the user program does its processing inside those callbacks. A parser of this type is lightweight: it parses quickly, uses little memory, has a clear structure, and is easy to implement. The trade-off is that it is not as convenient to use as a DOM-based parser.
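To make the callback model concrete, here is a minimal sketch in C. It is not liigo's actual API: the names `DemoHandlers`, `demo_scan`, and `count_node` are hypothetical, and the scanner only reports where nodes start (a `;` outside a `[...]` value) instead of doing a full parse.

```c
#include <stddef.h>

/* Hypothetical callback table for illustration; not the parser's real interface. */
typedef struct DemoHandlers {
    void *user_data;                                  /* handed back to every callback */
    void (*on_node)(void *user_data, size_t offset);  /* fired at each node start */
} DemoHandlers;

/* Walk the text once and fire on_node for every ';' found outside a [value]. */
static void demo_scan(const char *text, const DemoHandlers *h)
{
    int in_value = 0;
    for (size_t i = 0; text[i] != '\0'; i++) {
        char c = text[i];
        if (in_value) {
            if (c == '\\' && text[i + 1] != '\0')
                i++;                                  /* skip the escaped character */
            else if (c == ']')
                in_value = 0;
        } else if (c == '[') {
            in_value = 1;
        } else if (c == ';' && h->on_node != NULL) {
            h->on_node(h->user_data, i);
        }
    }
}

/* Example callback: the user data is assumed to point at an int counter. */
static void count_node(void *user_data, size_t offset)
{
    (void)offset;
    ++*(int *)user_data;
}
```

The caller keeps all of its state behind `user_data`, so the scanner itself stores nothing about the document. That is where the low memory usage of this style comes from, and also why it is less convenient than a DOM: anything the program wants to remember, it has to record itself inside the callbacks.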
SGF, the Smart Game Format, is designed to record games of a variety of common board games. It is most widely adopted in Go, where it is the most important and most common format for describing game records. It is a plain-text, tree-based format that is easy to read, store, and transmit. The format is simple and practical, and very easy to parse programmatically. Official SGF website: http://www.red-bean.com/sgf/. (Speaking of Go, one has to marvel: a single diagram is enough to reconstruct the entire course of a game. By comparison, a diagram of a chess position can only capture one moment of the game.)
The main structure of SGF consists of the tree (GameTree), the node sequence (Sequence), the node (Node), and the property (Property). The property is the most important basic unit; it consists of a property identifier (PropIdent) followed by one or more property values (PropValue). A node is a semicolon (;) followed by zero or more properties. Several nodes in order form a node sequence. A node sequence enclosed in parentheses, "(" and ")", is called a tree, and a tree can contain subtrees. The EBNF definition of SGF is as follows (see http://www.red-bean.com/sgf/sgf4.html#ebnf-def):
Collection = GameTree {GameTree}
GameTree = "(" Sequence {GameTree} ")"
Sequence = Node {Node}
Node = ";" {Property}
Property = PropIdent PropValue {PropValue}
PropIdent = UcLetter {UcLetter}
PropValue = "[" CValueType "]"
CValueType = (ValueType | Compose)
ValueType = (None | Number | Real | Double | Color | SimpleText | Text | Point | Move | Stone)
The following is a short but representative SGF text, to give a feel for the format:
(;FF[4]GM[1]SZ[19]FG[257:Figure 1]PM[1]
PB[Takemiya Masaki]BR[9 dan]PW[Cho Chikun]
WR[9 dan]RE[W+Resign]KM[5.5]TM[28800]DT[1996-10-]
EV[21st Meijin]RO[2 (final)]SO[Go World #78]US[Arno Hollosi]
;B[pd];W[dp];B[pp];W[dd];B[pj];W[nc];B[oe];W[qc];B[pc];W[qd]
(;B[qf];W[rf];B[rg];W[re];B[qg];W[pb];B[ob];W[qb]
(;B[mp];W[fq];B[ci];W[cg];B[dl];W[cn];B[qo];W[ec];B[jp];W[jd]
;B[ei];W[eg];B[kk]LB[qq:a][dj:b][ck:c][qp:d]N[Figure 1]
;W[me]FG[257:Figure 2];B[kf];W[ke];B[lf];W[jf];B[jg]
(;W[mf];B[if];W[je];B[ig];W[mg];B[mj];W[mq];B[SCSI];W[nq]
(;B[lr];W[qq];B[pq];W[pr];B[rq];W[rr];B[rp];W[oq];B[mr];W[oo];B[mn]
(;W[nr];B[qp]LB[kd:a][kh:b]N[Figure 2]
;W[pk]FG[257:Figure 3];B[pm];W[oj];B[ok];W[qr];B[os];W[ol];B[nk];W[qj]
;B[pi];W[pl];B[qm];W[ns];B[sr];W[om];B[op];W[qi];B[oi]
(;W[rl];B[qh];W[rm];B[rn];W[ri];B[ql];W[qk];B[sm];W[sk];B[sh];W[og]
;B[oh];W[np];B[no];W[mm];B[nn];W[lp];B[kp];W[lo];B[ln];W[ko];B[mo]
;W[jo];B[km]N[Figure 3])
(;W[ql]VW[ja:ss]FG[257:Dia. 6]MN[1];B[rm];W[ph];B[oh];W[pg];B[og];W[pf]
;B[qh];W[qe];B[sh];W[of];B[sj]TR[oe][pd][pc][ob]LB[pe:a][sg:b][si:c]
N[di1_6]))
(;W[no]VW[jj:ss]FG[257:Dia. 5]MN[1];B[pn]N[di1_5]))
(;B[pr]FG[257:Dia. 4]MN[1];W[kq];B[lp];W[lr];B[jq];W[jr];B[kp];W[kr];B[ir]
;W[hr]LB[is:a][js:b][or:c]N[di1_4]))
(;W[if]FG[257:Dia. 3]MN[1];B[mf];W[ig];B[weight]LB[ki:a]N[di1_3]))
(;W[oc]VW[aa:sk]FG[257:Dia. 2]MN[1];B[md];W[mc];B[ld]N[di1_2]))
(;B[qe]VW[aa:sj]FG[257:Dia. 1]MN[1];W[re];B[qf];W[rf];B[qg];W[pb];B[ob]
;W[qb]LB[rg:a]N[di1_1]))
Programmers who have written text parsers will see at once that, given the EBNF definition, writing the corresponding parser is fairly simple and intuitive; it is almost a translation job. Implementing this SGF parser confirmed the point once again: for the most part, I simply translated the EBNF into C code step by step.
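As an illustration of how directly the grammar maps to code, the following is a minimal recursive-descent sketch with one function per EBNF rule. It is not the parser described in this article: the `Cursor` type and every function name are hypothetical, property values are skipped without being interpreted, and instead of calling user callbacks it only counts nodes and properties.

```c
#include <ctype.h>

/* Hypothetical parse state for this sketch (not the article's SGFParseContext). */
typedef struct Cursor {
    const char *p;   /* current read position */
    int nodes;       /* nodes seen so far */
    int props;       /* properties seen so far */
} Cursor;

static void skip_ws(Cursor *c)
{
    while (isspace((unsigned char)*c->p)) c->p++;
}

/* PropValue = "[" CValueType "]"; the value text is skipped, not interpreted. */
static int parse_prop_value(Cursor *c)
{
    if (*c->p != '[') return 0;
    c->p++;
    while (*c->p != '\0' && *c->p != ']') {
        if (*c->p == '\\' && c->p[1] != '\0') c->p++;  /* step over the backslash */
        c->p++;
    }
    if (*c->p != ']') return 0;                        /* unterminated value */
    c->p++;
    return 1;
}

/* Property = PropIdent PropValue { PropValue }; PropIdent = UcLetter { UcLetter } */
static int parse_property(Cursor *c)
{
    if (!isupper((unsigned char)*c->p)) return 0;
    while (isupper((unsigned char)*c->p)) c->p++;
    skip_ws(c);
    do {
        if (!parse_prop_value(c)) return 0;
        skip_ws(c);
    } while (*c->p == '[');
    c->props++;
    return 1;
}

/* Node = ";" { Property } */
static int parse_node(Cursor *c)
{
    if (*c->p != ';') return 0;
    c->p++;
    c->nodes++;
    skip_ws(c);
    while (isupper((unsigned char)*c->p))
        if (!parse_property(c)) return 0;
    return 1;
}

/* GameTree = "(" Sequence { GameTree } ")"; Sequence = Node { Node } */
static int parse_game_tree(Cursor *c)
{
    if (*c->p != '(') return 0;
    c->p++;
    skip_ws(c);
    do {
        if (!parse_node(c)) return 0;
    } while (*c->p == ';');
    while (*c->p == '(')
        if (!parse_game_tree(c)) return 0;
    if (*c->p != ')') return 0;
    c->p++;
    skip_ws(c);
    return 1;
}

/* Collection = GameTree { GameTree }; returns 1 only if the whole text matches. */
static int parse_collection(const char *text, Cursor *c)
{
    c->p = text;
    c->nodes = 0;
    c->props = 0;
    skip_ws(c);
    do {
        if (!parse_game_tree(c)) return 0;
    } while (*c->p == '(');
    return *c->p == '\0';
}
```

Each `{ X }` in the grammar becomes a loop and each sequence of symbols becomes a sequence of checks, which is why the work feels like translation. In an event-based parser, the places where this sketch increments a counter are the natural places to invoke the user's callbacks.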
I first designed the SGFParseContext structure, which holds the data the parser needs while it is working:
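typedef struct _tagSGFParseContext
{
    void* pUserData;                /* user-supplied data */
    int treeIndex;
    PFN_ON_TREE pfnOnTree;          /* callback function pointers */
    PFN_ON_TREE_END pfnOnTreeEnd;
    PFN_ON_NODE pfnOnNode;
    PFN_ON_NODE_END pfnOnNodeEnd;
    PFN_ON_PROPERTY pfnOnProperty;
    char idBuffer[16];              /* buffer for a property identifier */
    char* valueBuffer;              /* buffer for a property value */
    int valueBufferSize;            /* size of valueBuffer */
}
SGFParseContext;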
Typedef struct _ tagSGFParseContext
{
Void * pUserData;
Int treeIndex;