LEB128 (Little Endian Base 128) is a variable-length integer encoding: a value occupies a variable number of bytes. It comes in two forms, unsigned leb128 and signed leb128.
It can store an integer of any size, and small values take only a few bytes.
I. Encoding unsigned leb128
The algorithm is as follows (dwarf-2.0.0.pdf, Appendix 4, page 99):
do {
    byte = low order 7 bits of value;
    value >>= 7;
    if (value != 0)    /* more bytes to come */
        set high order bit of byte;
    emit byte;
} while (value != 0);
Input: unsigned int A = 12857 (0x3239)
Output: the unsigned leb128 encoding of A
Logic:
1>. Convert 0x3239 to binary:
A = 0000 0000 0000 0000 0011 0010 0011 1001
2>. Take the low 7 bits of A (011 1001) to form a new byte:
Byte0 = 0x39
3>. Shift the original data right by 7 bits (the top 7 bits are filled with zeros):
A = 0000 0000 0000 0000 0000 0000 0110 0100 (0x64)
4>. If A != 0, more bytes follow, so set the high bit: byte |= 0x80
Byte0 = 0x39 | 0x80 = 0xb9
5>. Repeat 2> to 4> until A = 0:
Byte1 = 0x64 (A is now 0, so the high bit stays clear)
A = 0000 0000 0000 0000 0000 0000 0000 0000
The final result is the two bytes B9 64.
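The encoding steps above can be sketched in C as follows. This is only a minimal illustration, not the libdwarf implementation: the function name uleb128_encode, the 64-bit value type, and the requirement that the caller supplies a buffer of at least 10 bytes are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode `value` as unsigned LEB128 into `out`.
 * `out` must have room for at least 10 bytes, the worst case for a
 * 64-bit value (ceil(64 / 7) = 10).
 * Returns the number of bytes written. */
size_t uleb128_encode(uint64_t value, uint8_t *out)
{
    size_t n = 0;

    assert(out != NULL);
    do {
        uint8_t byte = (uint8_t)(value & 0x7f); /* low-order 7 bits */
        value >>= 7;
        if (value != 0)
            byte |= 0x80;                       /* more bytes to come */
        out[n++] = byte;                        /* emit byte */
    } while (value != 0);
    return n;
}
```

Called with the value 12857 (0x3239) from the example, this sketch writes the two bytes 0xb9 0x64 and returns 2.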
II. Decoding unsigned leb128
Algorithm (dwarf-2.0.0.pdf, Appendix 4, page 99):
1>. Take the next byte (X is its index, starting from 0):
ByteX = 0xb9
2>. Take its low 7 bits and place them in the Xth 7-bit group of the result:
Result |= (ByteX & 0x7f) << (7 * X);
3>. If (ByteX & 0x80) != 0, more bytes follow, so continue with 1>; otherwise decoding ends.
Byte0 = 0xb9
Result = 0x39
Byte0 & 0x80 = 0x80 (non-zero, so continue)
Byte1 = 0x64
Result = 0x39 | (0x64 << 7) = 0x3239
Byte1 & 0x80 = 0 (decoding ends)
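The decoding steps above can be sketched in C in the same way. Again, this is an illustration under stated assumptions rather than the code in dwarf_leb.c: the name uleb128_decode, the explicit length parameter, and the convention of returning 0 on bad input are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one unsigned LEB128 value from `buf`,
 * reading at most `len` bytes. Stores the value in *result and returns
 * the number of bytes consumed. Returns 0 if the input ends before a
 * terminating byte is seen or if the value runs past 10 bytes.
 * This sketch does not check for overflow bits in the 10th byte. */
size_t uleb128_decode(const uint8_t *buf, size_t len, uint64_t *result)
{
    uint64_t value = 0;
    unsigned shift = 0;
    size_t i;

    assert(buf != NULL && result != NULL);
    for (i = 0; i < len && shift < 64; i++, shift += 7) {
        uint8_t byte = buf[i];                      /* take the next byte */
        value |= (uint64_t)(byte & 0x7f) << shift;  /* low 7 bits into group i */
        if ((byte & 0x80) == 0) {                   /* high bit clear: last byte */
            *result = value;
            return i + 1;
        }
    }
    return 0;
}
```

Given the bytes 0xb9 0x64, this sketch stores 0x3239 in *result and returns 2. Passing an explicit length keeps the loop from reading past the end of a truncated buffer.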
III. Features of leb128
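1. The number of bytes occupied by a leb128 value is not fixed; it depends on the magnitude of the value.
2. The highest bit of each leb128 byte indicates whether the value has ended: 1 means more bytes follow, 0 marks the last byte.
3. An unsigned leb128 value can be converted to and from an unsigned int with the encoding and decoding steps above.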
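The first two features can be illustrated with a short C sketch. Both function names here (uleb128_size and uleb128_encoded_len) are hypothetical helpers invented for this example, not part of libdwarf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Feature 1: the encoded length depends on the value, because each
 * byte carries only 7 payload bits. */
size_t uleb128_size(uint64_t value)
{
    size_t n = 1;                   /* even 0 occupies one byte */
    while ((value >>= 7) != 0)
        n++;
    return n;
}

/* Feature 2: the first byte whose high bit is 0 ends the value.
 * Returns the length of the encoded value starting at `buf`, or 0 if
 * no terminating byte is found within `len` bytes. */
size_t uleb128_encoded_len(const uint8_t *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++) {
        if ((buf[i] & 0x80) == 0)
            return i + 1;
    }
    return 0;
}
```

Under these assumptions, values up to 127 take one byte, 12857 takes two, and a full 64-bit value takes ten. Scanning the byte stream B9 64 stops after the second byte, because 0x64 has its high bit clear.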
IV. References:
1. dwarf-2.0.0.pdf, section 7.6 and Appendix 4 (http://dwarfstd.org/)
2. libdwarf dwarf_leb.c contains the leb128 decoding function.
References:
Blog of micklongen: http://blog.chinaunix.net/uid-20704646-id-95934.html
http://en.wikipedia.org/wiki/LEB128