Lucene's VInt type in detail

Source: Internet
Author: User


Lucene's VInt compression strategy uses the highest bit of each byte as a flag bit and the low 7 bits as payload bits. If the flag bit is 1, the next byte belongs to the same number as the current byte; if it is 0, the current byte is the last byte of the number and the byte after it starts a new number.
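The byte layout described above can be sketched in a few lines of Java. The class and method names here (VIntLayoutSketch, hasContinuation, payload, countValues) are made up for illustration and are not part of Lucene; the sketch also assumes the byte stream is well formed, that is, it ends with a complete value.

```java
final class VIntLayoutSketch {
    /** True when the flag (highest) bit is 1, i.e. the next byte belongs to the same number. */
    static boolean hasContinuation(byte b) {
        return (b & 0x80) != 0;
    }

    /** The low 7 payload bits of a byte, as a non-negative int. */
    static int payload(byte b) {
        return b & 0x7F;
    }

    /** Counts the numbers in a stream: every byte whose flag bit is 0 ends one number. */
    static int countValues(byte[] stream) {
        int count = 0;
        for (byte b : stream) {
            if (!hasContinuation(b)) {
                count++;
            }
        }
        return count;
    }
}
```

For example, in the stream 0xAC 0x02 0x05 only the first byte has its flag bit set, so 0xAC and 0x02 form one number and 0x05 is a second one.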

The Lucene source code writes and reads VInt values in exactly this way. OutputStream is responsible for writing:

/** Writes an int in a variable-length format. Writes between one and
 * five bytes. Smaller values take fewer bytes. Negative numbers are not
 * supported.
 * @see InputStream#readVInt()
 */
public final void writeVInt(int i) throws IOException {
  // While any bit above the low 7 is set, more bytes are needed.
  while ((i & ~0x7F) != 0) {
    // Emit the low 7 bits with the flag bit set to 1.
    writeByte((byte) ((i & 0x7F) | 0x80));
    i >>>= 7;
  }
  // Last byte: flag bit is 0.
  writeByte((byte) i);
}

writeVInt (compression) steps

1. i & ~0x7F

ANDs i with ~0x7F (0xFFFFFF80, every bit set except the low 7). If the result is nonzero, i still has significant bits at bit 8 or above. Those bits are written by later writeByte calls: each loop iteration writes one byte made up of the low 7 bits of i plus a flag bit.

2. writeByte((byte) ((i & 0x7F) | 0x80));

Writes one byte composed of the low 7 bits of i and a flag bit set to 1.

3. i >>>= 7;

Since 7 bits have just been written, i is shifted right by 7 bits (unsigned shift) so that the next higher bits take part in the next write.

4. writeByte((byte) i);

When the loop ends, i has at most 7 significant bits left, so this is the final writeByte call. The 8th bit does not need to be set to 1 this time; the byte is written as it is.
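The four steps above can be tried out in isolation with a minimal sketch that runs the same loop against a java.io.ByteArrayOutputStream instead of Lucene's OutputStream. The class name VIntWriteSketch and the encode helper are assumptions made for this example, not Lucene API.

```java
import java.io.ByteArrayOutputStream;

final class VIntWriteSketch {
    /** Encodes i with the same loop as writeVInt and returns the resulting bytes. */
    static byte[] encode(int i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        // Steps 1-3: while bits above the low 7 remain, emit 7 payload bits plus flag bit 1.
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80); // write(int) stores only the low 8 bits
            i >>>= 7;
        }
        // Step 4: the final byte, flag bit 0.
        out.write(i);
        return out.toByteArray();
    }
}
```

With this sketch, encode(127) yields the single byte 0x7F, encode(128) yields 0x80 0x01, and encode(300) yields 0xAC 0x02, since 300 is 100101100 in binary: the low 7 bits 0101100 go out first with the flag bit set, followed by the remaining value 2.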

=============================================================================================================== ===============================


InputStream is responsible for reading:

public final int readVInt() throws IOException {
  byte b = readByte();
  // The low 7 bits of the first byte are the lowest 7 bits of the result.
  int i = b & 0x7F;
  // Keep reading while the flag bit of the last byte read is 1.
  for (int shift = 7; (b & 0x80) != 0; shift += 7) {
    b = readByte();
    i |= (b & 0x7F) << shift;
  }
  return i;
}

1. byte b = readByte();

First read the lowest-order byte.

2. int i = b & 0x7F;

The low 7 bits (the payload bits) of this byte are assigned to i.

3. for (int shift = 7; (b & 0x80) != 0; shift += 7)

The loop first checks whether the 8th bit (the flag bit) of the current byte b is 1. If it is, another byte follows that is also part of the int. Inside the loop, that next byte is read and its low 7 bits are shifted left by shift and ORed into i. shift is the bit position in i where those 7 payload bits belong; because the low 7 bits of i were already assigned before the loop began, shift starts at 7 and grows by 7 on each iteration.
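The read steps above can be exercised with a small sketch that runs the same loop over a java.io.DataInputStream wrapped around a byte array. The names VIntReadSketch and decode are hypothetical helpers for this example and are not Lucene API; Lucene's own InputStream supplies readByte directly.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class VIntReadSketch {
    /** Same loop as readVInt, reading bytes from a DataInputStream. */
    static int readVInt(DataInputStream in) throws IOException {
        byte b = in.readByte();
        int i = b & 0x7F; // low 7 bits of the result
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.readByte();
            i |= (b & 0x7F) << shift; // place the next 7 payload bits above the previous ones
        }
        return i;
    }

    /** Convenience wrapper: decodes the first VInt found in bytes. */
    static int decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return readVInt(in);
        } catch (IOException e) {
            // Thrown, for instance, when the input ends while a flag bit still promises more bytes.
            throw new UncheckedIOException(e);
        }
    }
}
```

Decoding the bytes 0xAC 0x02 gives 0x2C | (2 << 7) = 44 + 256 = 300, which is the value that the write loop turns into those two bytes.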
