WBXML (WMLC) Study Notes

Data Types
============================================
Type      Description
bit       : 1 bit of data
byte      : 8 bits of opaque data
u_int8    : 8 bit unsigned integer
mb_u_int32: 32 bit unsigned integer, encoded in multi-byte integer format.

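Multi-byte integer format: the most significant bit of each byte is a continuation flag (1 = more bytes of the current integer follow, 0 = this is the last byte). The remaining seven bits of each byte carry the value, most significant group first.
    For example, 0xA0 is encoded as the two bytes 0x81 0x20, while 0x60 is encoded as the single byte 0x60.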
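
The encoding rule above can be sketched in Python as follows. This is a minimal illustration, not code from any WBXML library; the function names `encode_mb_u_int32` and `decode_mb_u_int32` are made up for these notes.

```python
from typing import Tuple


def encode_mb_u_int32(value: int) -> bytes:
    """Encode a non-negative integer (up to 32 bits) as a multi-byte integer."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    # Split into 7-bit groups, least significant group first.
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    groups.reverse()  # the most significant group is written first
    # Set the continuation bit (0x80) on every byte except the last one.
    return bytes(g | 0x80 for g in groups[:-1]) + bytes([groups[-1]])


def decode_mb_u_int32(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a multi-byte integer at data[pos:]; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # continuation bit clear: this was the last byte
            return value, pos


print(encode_mb_u_int32(0xA0).hex())  # 8120
print(encode_mb_u_int32(0x60).hex())  # 60
```

The decoder returns the position of the next unread byte so that it can be called repeatedly while walking a document.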

Document Structure (BNF)
========================================================
start = version publicid charset strtbl body
strtbl = length *byte
body = *pi element *pi
element = ([switchPage] stag) [ 1*attribute END ] [ *content END ]
content = element | string | extension | entity | pi | opaque
stag = TAG | (literalTag index)
literalTag = LITERAL | LITERAL_A | LITERAL_C | LITERAL_AC
attribute = attrStart *attrValue
attrStart = ([switchPage] ATTRSTART) | ( LITERAL index )
attrValue = ([switchPage] ATTRVALUE) | string | extension | entity | opaque
extension = [switchPage] (( EXT_I termstr ) | ( EXT_T index ) | EXT)
string = inline | tableref
switchPage = SWITCH_PAGE pageindex
inline = STR_I termstr
tableref = STR_T index
entity = ENTITY entcode
entcode = mb_u_int32 // UCS-4 character code
pi = PI attrStart *attrValue END
opaque = OPAQUE length *byte
version = u_int8 // WBXML version number
publicid = mb_u_int32 | ( zero index )
charset = mb_u_int32
termstr = charset-dependent string with termination
index = mb_u_int32 // integer index into string table.
length = mb_u_int32 // integer length.
zero = u_int8 // containing the value zero (0)
pageindex = u_int8
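
The `start` rule says a document opens with version, publicid, charset and strtbl, followed by the body. A minimal Python sketch of reading that header is shown below. The names `WbxmlHeader`, `read_mb_u_int32` and `parse_header` are hypothetical and exist only for this illustration; no error handling for truncated input is attempted.

```python
from typing import NamedTuple, Optional, Tuple


class WbxmlHeader(NamedTuple):
    version: int                   # raw version byte
    publicid: Optional[int]        # well-known code, or None if given via the string table
    publicid_index: Optional[int]  # string table index, only for the ( zero index ) form
    charset: int                   # IANA MIBenum value
    strtbl: bytes                  # raw string table bytes
    body_offset: int               # position of the first body token


def read_mb_u_int32(data: bytes, pos: int) -> Tuple[int, int]:
    """Read a multi-byte integer; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


def parse_header(data: bytes) -> WbxmlHeader:
    version = data[0]  # version = u_int8
    pos = 1
    publicid, pos = read_mb_u_int32(data, pos)
    publicid_index = None
    if publicid == 0:
        # publicid = ( zero index ): the identifier lives in the string table
        publicid_index, pos = read_mb_u_int32(data, pos)
        publicid = None
    charset, pos = read_mb_u_int32(data, pos)
    length, pos = read_mb_u_int32(data, pos)  # strtbl = length *byte
    strtbl = data[pos:pos + length]
    return WbxmlHeader(version, publicid, publicid_index, charset, strtbl, pos + length)


# Example: version byte 0x01, public id 4, charset 0x6A (106), empty string table.
header = parse_header(bytes([0x01, 0x04, 0x6A, 0x00, 0x7F, 0x01]))
print(header)
```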

Version
=======================================================
version = u_int8 // WBXML version number
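The WBXML version number: the upper four bits hold the major version minus one, and the lower four bits hold the minor version. For example, 0x01 means version 1.1.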
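
As a small Python illustration of the nibble split (the function name `parse_wbxml_version` is made up for these notes):

```python
from typing import Tuple


def parse_wbxml_version(version_byte: int) -> Tuple[int, int]:
    """Split the version byte into (major, minor)."""
    major = (version_byte >> 4) + 1  # upper 4 bits store major - 1
    minor = version_byte & 0x0F      # lower 4 bits store minor
    return major, minor


print(parse_wbxml_version(0x01))  # (1, 1)
print(parse_wbxml_version(0x03))  # (1, 3)
```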

Public Identifier
=============================================================
publicid = mb_u_int32 | ( zero index )
zero = u_int8 // containing the value zero (0)
The defined public identifier values (hex) are:
0 String table index follows; public identifier is encoded as a literal in the string table.
1 Unknown or missing public identifier.
2 "-//WAPFORUM//DTD WML 1.0//EN" (WML 1.0)
3 DEPRECATED "-//WAPFORUM//DTD WTA 1.0//EN" (WTA Event 1.0)
4 "-//WAPFORUM//DTD WML 1.1//EN" (WML 1.1)
5 "-//WAPFORUM//DTD SI 1.0//EN" (Service Indication 1.0)
6 "-//WAPFORUM//DTD SL 1.0//EN" (Service Loading 1.0)
7 "-//WAPFORUM//DTD CO 1.0//EN" (Cache Operation 1.0)
8 "-//WAPFORUM//DTD CHANNEL 1.1//EN" (Channel 1.1)
9 "-//WAPFORUM//DTD WML 1.2//EN" (WML 1.2)
A "-//WAPFORUM//DTD WML 1.3//EN" (WML 1.3)
B "-//WAPFORUM//DTD PROV 1.0//EN" (Provisioning 1.0)
C "-//WAPFORUM//DTD WTA-WML 1.2//EN" (WTA-WML 1.2)
D "-//WAPFORUM//DTD CHANNEL 1.2//EN" (Channel 1.2)
E - 7F Reserved

Charset
===========================================================
charset = mb_u_int32
The character encoding is given as an IANA Charset MIBenum value. Common values:
GBK     : 113
GB2312  : 2025
Big5    : 2026
UTF-8   : 106
UTF-16  : 1015
UTF-16BE: 1013
Full list: http://www.iana.org/assignments/character-sets
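
When decoding strings in Python, the MIBenum value has to be mapped to a codec name. The mapping below is a sketch covering only the values listed above; the codec names are Python's standard ones, while `MIBENUM_TO_CODEC` and `decode_text` are names invented for this example.

```python
# IANA MIBenum value -> Python codec name (only the values from the list above)
MIBENUM_TO_CODEC = {
    113: "gbk",
    2025: "gb2312",
    2026: "big5",
    106: "utf-8",
    1015: "utf-16",
    1013: "utf-16-be",
}


def decode_text(raw: bytes, mibenum: int) -> str:
    """Decode raw string bytes (terminator already removed) using the document charset."""
    codec = MIBENUM_TO_CODEC.get(mibenum)
    if codec is None:
        raise ValueError(f"unsupported charset MIBenum: {mibenum}")
    return raw.decode(codec)


print(decode_text(b"hello", 106))  # hello
```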

String Table
===================================================
strtbl = length *byte
length = mb_u_int32
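
Tokens such as STR_T refer to the string table by byte offset. A minimal Python sketch of such a lookup follows. It assumes every string in the table ends with a single NUL byte, which holds for charsets like UTF-8 but not for UTF-16; the helper name `string_at` is made up for illustration.

```python
def string_at(strtbl: bytes, offset: int, codec: str = "utf-8") -> str:
    """Return the string that starts at byte `offset` in the string table."""
    end = strtbl.index(0, offset)  # assumption: single NUL byte terminator
    return strtbl[offset:end].decode(codec)


# A table holding two strings: "abc" at offset 0 and "X-Y" at offset 4.
table = b"abc\x00X-Y\x00"
print(string_at(table, 0))  # abc
print(string_at(table, 4))  # X-Y
```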

Tokens
===================================================
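TAG token structure:

Bit(s) Description
7      Indicates whether the tag has attributes. If 0, the tag has no attributes.
       If 1, the tag is followed by one or more attributes, terminated by an END token (0x01).
6      Indicates whether the tag is an element with content. If 0, the element has no content and no end tag.
       If 1, the tag is followed by any amount of content, terminated by an END token (0x01).
5 - 0  The tag identity.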
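
The bit layout can be unpacked with two masks and a low-bits extract. A short Python sketch (the function name `split_tag_token` is made up for these notes):

```python
from typing import Tuple


def split_tag_token(token: int) -> Tuple[bool, bool, int]:
    """Return (has_attributes, has_content, tag_identity) for a tag token byte."""
    has_attributes = bool(token & 0x80)  # bit 7
    has_content = bool(token & 0x40)     # bit 6
    identity = token & 0x3F              # bits 5 - 0
    return has_attributes, has_content, identity


print(split_tag_token(0x7F))  # (False, True, 63)
print(split_tag_token(0xE7))  # (True, True, 39)
```

With the WML tag table in these notes, 0x7F is `wml` (identity 3F) with content and no attributes, and 0xE7 is `card` (identity 27) with both attributes and content.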

Global Tokens:
==================================================================================
Token Name  Hex  Description
SWITCH_PAGE 0    Change the code page for the current token state. Followed by a
                 single u_int8 indicating the new code page number.
END         1    Indicates the end of an attribute list or the end of an element.
ENTITY      2    A character entity. Followed by a mb_u_int32 encoding the
                 character entity number.
STR_I       3    Inline string. Followed by a termstr.
LITERAL     4    An unknown attribute name, or unknown tag possessing no
                 attributes or content. Followed by a mb_u_int32 that encodes
                 an offset into the string table.
EXT_I_0    40    Inline string document-type-specific extension token. Token is
                 followed by a termstr.
EXT_I_1    41    Inline string document-type-specific extension token. Token is
                 followed by a termstr.
EXT_I_2    42    Inline string document-type-specific extension token. Token is
                 followed by a termstr.
PI         43    Processing instruction.
LITERAL_C  44    An unknown tag possessing content but no attributes.
EXT_T_0    80    Inline integer document-type-specific extension token. Token is
                 followed by a mb_u_int32.
EXT_T_1    81    Inline integer document-type-specific extension token. Token is
                 followed by a mb_u_int32.
EXT_T_2    82    Inline integer document-type-specific extension token. Token is
                 followed by a mb_u_int32.
STR_T      83    String table reference. Followed by a mb_u_int32 encoding a
                 byte offset from the beginning of the string table.
LITERAL_A  84    An unknown tag possessing attributes but no content.
EXT_0      C0    Single-byte document-type-specific extension token.
EXT_1      C1    Single-byte document-type-specific extension token.
EXT_2      C2    Single-byte document-type-specific extension token.
OPAQUE     C3    Opaque document-type-specific data.
LITERAL_AC C4    An unknown tag possessing both attributes and content.
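
To show how the operands of a few global tokens are read, here is a minimal Python sketch that handles only ENTITY, STR_I and OPAQUE. It assumes inline strings end with a single NUL byte (true for UTF-8, not for UTF-16); `read_mb_u_int32` and `read_content_item` are hypothetical helper names.

```python
from typing import Tuple, Union

ENTITY, STR_I, OPAQUE = 0x02, 0x03, 0xC3


def read_mb_u_int32(data: bytes, pos: int) -> Tuple[int, int]:
    """Read a multi-byte integer; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


def read_content_item(data: bytes, pos: int, codec: str = "utf-8") -> Tuple[Union[str, bytes], int]:
    """Read one ENTITY, STR_I or OPAQUE item; return (value, next_pos)."""
    token = data[pos]
    pos += 1
    if token == ENTITY:
        code, pos = read_mb_u_int32(data, pos)  # entcode = mb_u_int32
        return chr(code), pos
    if token == STR_I:
        end = data.index(0, pos)  # assumption: single NUL byte terminator
        return data[pos:end].decode(codec), end + 1
    if token == OPAQUE:
        length, pos = read_mb_u_int32(data, pos)  # opaque = OPAQUE length *byte
        return data[pos:pos + length], pos + length
    raise ValueError(f"token 0x{token:02X} is not handled in this sketch")


print(read_content_item(b"\x03Hi\x00", 0))  # ('Hi', 4)
```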

WML Tag Tokens
=====================================================================================
TagName  Token (hex)
a        1C
anchor   22
access   23
b        24
big      25
br       26
card     27
do       28
em       29
fieldset 2A
go       2B
head     2C
i        2D
img      2E
input    2F
meta     30
noop     31
p        20
postfield 21
pre      1B
prev     32
onevent  33
optgroup 34
option   35
refresh  36
select   37
setvar   3E
small    38
strong   39
table    1F
td       1D
template 3B
timer    3C
tr       1E
u        3D
wml      3F
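
Putting the tag table and the tag bit layout together, the sketch below decodes a tiny WMLC body back into markup. It is deliberately limited: only a handful of tags from the table are included, only END and STR_I are handled among the global tokens, attributes are rejected, and inline strings are assumed to be NUL-terminated UTF-8. The name `decode_body` is made up for this example.

```python
from typing import List

# Subset of the WML tag table above (tag identity -> name)
WML_TAGS = {0x3F: "wml", 0x27: "card", 0x20: "p", 0x26: "br", 0x24: "b"}
END, STR_I = 0x01, 0x03


def decode_body(body: bytes, codec: str = "utf-8") -> str:
    """Decode a body made only of attribute-less tags and inline strings."""
    out: List[str] = []
    open_tags: List[str] = []
    pos = 0
    while pos < len(body):
        token = body[pos]
        pos += 1
        if token == END:
            out.append(f"</{open_tags.pop()}>")
        elif token == STR_I:
            end = body.index(0, pos)  # assumption: single NUL byte terminator
            out.append(body[pos:end].decode(codec))
            pos = end + 1
        else:
            identity = token & 0x3F
            if identity < 0x05:
                raise ValueError(f"global token 0x{token:02X} is not handled here")
            if token & 0x80:
                raise ValueError("attributes are not handled in this sketch")
            name = WML_TAGS[identity]
            if token & 0x40:  # content follows, closed later by END
                open_tags.append(name)
                out.append(f"<{name}>")
            else:
                out.append(f"<{name}/>")
    return "".join(out)


body = bytes([0x7F, 0x67, 0x60, 0x03]) + b"Hi\x00" + bytes([0x26, 0x01, 0x01, 0x01])
print(decode_body(body))  # <wml><card><p>Hi<br/></p></card></wml>
```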

Attribute Start Tokens
=============================================================================================================
Tokens with a value less than 128 indicate the start of an attribute. The attribute start token fully
identifies the attribute name, e.g., URL=, and may optionally specify the beginning of the attribute value, e.g.,
PUBLIC="TRUE". Unknown attribute names are encoded with the globally unique code LITERAL (see section
5.8.4.5 of the WBXML specification). LITERAL must not be used to encode any portion of an attribute value.

AttributeName  AttributeValuePrefix   Token (hex)
accept-charset                        5
accesskey                             5E
align                                 52
align          bottom                 6
align          center                 7
align          left                   8
align          middle                 9
align          right                  A
align          top                    B
alt                                   C
cache-control  no-cache               64
class                                 54
columns                               53
content                               D
content        application/vnd.wap.wmlc;charset=  5C
domain                                F
emptyok        false                  10
emptyok        true                   11
enctype                               5F
enctype        application/x-www-form-urlencoded  60
enctype        multipart/form-data    61
format                                12
forua          false                  56
forua          true                   57
height                                13
href                                  4A
href           http://                4B
href           https://               4C
hspace                                14
http-equiv                            5A
http-equiv     Content-Type           5B
http-equiv     Expires                5D
id                                    55
ivalue                                15
iname                                 16
label                                 18
localsrc                              19
maxlength                             1A
method         get                    1B
method         post                   1C
mode           nowrap                 1D
mode           wrap                   1E
multiple       false                  1F
multiple       true                   20
name                                  21
newcontext     false                  22
newcontext     true                   23
onenterbackward                       25
onenterforward                        26
onpick                                24
ontimer                               27
optional       false                  28
optional       true                   29
path                                  2A
scheme                                2E
sendreferer    false                  2F
sendreferer    true                   30
size                                  31
src                                   32
src            http://                58
src            https://               59
ordered        true                   33
ordered        false                  34
tabindex                              35
title                                 36
type                                  37
type           accept                 38
type           delete                 39
type           help                   3A
type           password               3B
type           onpick                 3C
type           onenterbackward        3D
type           onenterforward         3E
type           ontimer                3F
type           options                45
type           prev                   46
type           reset                  47
type           text                   48
type           vnd.                   49
value                                 4D
vspace                                4E
width                                 4F
xml:lang                              50
xml:space      preserve               62
xml:space      default                63

Attribute Value Tokens
==================================================================================================================
Tokens with a value of 128 or greater represent a well-known string present in an attribute value.
These tokens may only be used to represent attribute values. Unknown attribute values are encoded with string,
entity or extension codes (see section 5.8.4 of the WBXML specification).
All tokenised attributes must begin with a single attribute start token, which may be followed by zero or more
attribute value, string, entity or extension tokens.

AttributeValue  Token (hex)
.com/           85
.edu/           86
.net/           87
.org/           88
accept          89
bottom          8A
clear           8B
delete          8C
help            8D
http://         8E
http://www.     8F
https://        90
https://www.    91
middle          93
nowrap          94
onenterbackward 96
onenterforward  97
onpick          95
ontimer         98
options         99
password        9A
reset           9B
text            9D
top             9E
unknown         9F
wrap            A0
www.            A1
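
The two attribute tables work together: a start token (below 0x80) names the attribute and may supply a value prefix, and the value tokens (0x80 and above) and inline strings that follow are appended to the value. The Python sketch below decodes one attribute list up to its END token. It includes only a few entries from each table, ignores SWITCH_PAGE, LITERAL, ENTITY, STR_T and extension tokens, and assumes NUL-terminated UTF-8 inline strings; `decode_attributes` is a made-up name.

```python
from typing import Dict, Tuple

# Small subsets of the two tables above
ATTR_START = {0x4A: ("href", ""), 0x4B: ("href", "http://"), 0x21: ("name", ""), 0x36: ("title", "")}
ATTR_VALUE = {0x85: ".com/", 0x8E: "http://", 0x8F: "http://www.", 0xA1: "www."}
END, STR_I = 0x01, 0x03


def decode_attributes(data: bytes, pos: int = 0, codec: str = "utf-8") -> Tuple[Dict[str, str], int]:
    """Decode attributes starting at data[pos] up to END; return (attrs, next_pos)."""
    attrs: Dict[str, str] = {}
    name = None
    while data[pos] != END:
        token = data[pos]
        pos += 1
        if token == STR_I:
            end = data.index(0, pos)  # assumption: single NUL byte terminator
            attrs[name] += data[pos:end].decode(codec)
            pos = end + 1
        elif token >= 0x80:
            attrs[name] += ATTR_VALUE[token]  # well-known value fragment
        else:
            name, prefix = ATTR_START[token]  # a new attribute begins
            attrs[name] = prefix
    return attrs, pos + 1  # skip the END token


data = (bytes([0x4A, 0x8F]) + b"\x03example\x00" + bytes([0x85])
        + bytes([0x36]) + b"\x03Home\x00" + bytes([0x01]))
print(decode_attributes(data))
# ({'href': 'http://www.example.com/', 'title': 'Home'}, 20)
```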

References:
WBXML:
http://www.openmobilealliance.org/release_program/docs/CopyrightClick.asp?pck=Browsing&file=V2_1-20061020-A/WAP-192-WBXML-20010725-a.pdf

WML:
http://www.openmobilealliance.org/release_program/docs/CopyrightClick.asp?pck=Browsing&file=V2_1-20061020-A/WAP-191-WML-20000219-a.pdf

 
