Comparing Java byte arrays with low-level pointers

Source: Internet
Author: User

How can two byte arrays be compared quickly? The problem can be stated as the following method signature:

[Java]
public int compareTo(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);


The most intuitive approach is to walk both arrays at the same time and compare them byte by byte.

[Java]
public int compareTo(byte[] buffer1, int offset1, int length1,
                     byte[] buffer2, int offset2, int length2) {
    // Short circuit equal case
    if (buffer1 == buffer2 && offset1 == offset2
            && length1 == length2) {
        return 0;
    }
    // Bring WritableComparator code local
    int end1 = offset1 + length1;
    int end2 = offset2 + length2;
    for (int i = offset1, j = offset2; i < end1 && j < end2; i++, j++) {
        // Mask with 0xff so that each byte is compared as an unsigned value.
        int a = (buffer1[i] & 0xff);
        int b = (buffer2[j] & 0xff);
        if (a != b) {
            return a - b;
        }
    }
    // All shared bytes are equal: the shorter range sorts first.
    return length1 - length2;
}
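
The `& 0xff` mask in the loop above is what makes the comparison unsigned. A minimal sketch of the difference follows; the class and method names are made up for illustration and are not part of the original code.

```java
final class UnsignedByteDemo {
    // Compares two bytes as unsigned values in the range 0..255.
    static int compareUnsigned(byte x, byte y) {
        return (x & 0xff) - (y & 0xff);
    }

    // Compares two bytes as signed values in the range -128..127.
    static int compareSigned(byte x, byte y) {
        return x - y;
    }

    public static void main(String[] args) {
        byte low = 0x01;
        byte high = (byte) 0x80; // 128 when unsigned, -128 when signed
        // Negative: 0x01 sorts before 0x80 in unsigned order.
        System.out.println(compareUnsigned(low, high));
        // Positive: the signed view reverses the order.
        System.out.println(compareSigned(low, high));
    }
}
```

Without the mask, any byte at or above 0x80 would sort before every byte below it, which is not the lexicographic order expected for raw bytes.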

 

If that were all there was to it, this would be a dull topic.

To improve performance you could unroll the loop or apply similar optimizations by hand, but those are the JVM's job, and recent JVMs do them well. Is there any other way to improve performance?
The bytes can be compared in larger units. In the example above, every byte is widened to an int before it is compared. Instead, eight bytes can be read as a single long and the longs compared. Would that be faster? And what is the fastest way to do the conversion?
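
As a baseline for the idea, here is a minimal sketch that reads eight bytes at a time using only the safe `java.nio.ByteBuffer` API. It is an illustration under assumptions, not the method this article goes on to use: the class name `WordCompareDemo` is hypothetical, and the sketch compares whole arrays rather than offset/length ranges.

```java
import java.nio.ByteBuffer;

final class WordCompareDemo {
    // Lexicographic, unsigned comparison of two byte arrays, eight bytes at a time.
    static int compare(byte[] a, byte[] b) {
        int minLength = Math.min(a.length, b.length);
        int wordBytes = minLength - (minLength % Long.BYTES);
        // ByteBuffer is big-endian by default, so the first byte of each
        // group lands in the most significant position of the long.
        ByteBuffer wa = ByteBuffer.wrap(a);
        ByteBuffer wb = ByteBuffer.wrap(b);
        for (int i = 0; i < wordBytes; i += Long.BYTES) {
            long la = wa.getLong(i);
            long lb = wb.getLong(i);
            if (la != lb) {
                return Long.compareUnsigned(la, lb);
            }
        }
        // Tail: fewer than eight bytes remain, so compare them one by one.
        for (int i = wordBytes; i < minLength; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] x = {0, 0, 0, 0, 0, 0, 0, 1, 9};
        byte[] y = {0, 0, 0, 0, 0, 0, 0, (byte) 0xff, 9};
        System.out.println(compare(x, y) < 0); // true
    }
}
```

Because the longs are read in big-endian order, an unsigned comparison of two longs gives the same answer as comparing their eight bytes from left to right.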

[Java]
long sun.misc.Unsafe.getLong(Object o, long offset)


Java provides a native method that turns bytes into a long as quickly as possible. It reads an object's memory directly: the address is the object pointer plus the offset, and the value stored at that address is returned. Some people say that Java is safe because it cannot manipulate pointers, and that its performance sometimes suffers for it. As its name says, the Unsafe class steps outside those safety guarantees, so ordinary application code is not allowed to obtain its instance directly. That does not matter, because reflection gets around the restriction. The following implementation uses this technique.
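
The reflection step mentioned above can be sketched as follows. This is a minimal illustration assuming an OpenJDK-style runtime, where the instance is held in a private static field named `theUnsafe`; the class `UnsafeReadDemo` and its methods are hypothetical names. Recent JDKs print deprecation warnings for `sun.misc.Unsafe`, and future releases may restrict these methods.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeReadDemo {
    // Fetches the singleton through the private static field "theUnsafe".
    static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    // Reads eight bytes starting at data[index] as one long, in native byte order.
    // No bounds check is performed: the caller must ensure index + 8 <= data.length.
    static long readLong(Unsafe unsafe, byte[] data, int index) {
        // Offset of element 0 from the start of the array object.
        long base = unsafe.arrayBaseOffset(byte[].class);
        return unsafe.getLong(data, base + index);
    }

    public static void main(String[] args) {
        Unsafe unsafe = loadUnsafe();
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        // The printed value depends on the byte order of the machine.
        System.out.println(Long.toHexString(readLong(unsafe, data, 0)));
    }
}
```

On a little-endian machine this prints `807060504030201`, and on a big-endian machine `102030405060708`. Reading past the end of the array is not caught by the JVM, which is exactly why the class is called Unsafe.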

[Java]
// Longs and UnsignedBytes are assumed to be the Guava classes in
// com.google.common.primitives; theUnsafe, BYTE_ARRAY_BASE_OFFSET and
// littleEndian are fields of the enclosing class.
public int compareTo(byte[] buffer1, int offset1, int length1,
                     byte[] buffer2, int offset2, int length2) {
    // Short circuit equal case
    if (buffer1 == buffer2 && offset1 == offset2
            && length1 == length2) {
        return 0;
    }
    int minLength = Math.min(length1, length2);
    int minWords = minLength / Longs.BYTES;
    int offset1Adj = offset1 + BYTE_ARRAY_BASE_OFFSET;
    int offset2Adj = offset2 + BYTE_ARRAY_BASE_OFFSET;

    /*
     * Compare 8 bytes at a time. Benchmarking shows comparing 8
     * bytes at a time is no slower than comparing 4 bytes at a time
     * even on 32-bit. On the other hand, it is substantially faster
     * on 64-bit.
     */
    for (int i = 0; i < minWords * Longs.BYTES; i += Longs.BYTES) {
        long lw = theUnsafe.getLong(buffer1, offset1Adj + (long) i);
        long rw = theUnsafe.getLong(buffer2, offset2Adj + (long) i);
        long diff = lw ^ rw;

        if (diff != 0) {
            if (!littleEndian) {
                // Big-endian: an unsigned comparison of the words is enough.
                return (lw + Long.MIN_VALUE) < (rw + Long.MIN_VALUE) ? -1
                        : 1;
            }

            // Use binary search. Some code is omitted.
            // ..... (the omitted part computes n, which is used below)
            return (int) (((lw >>> n) & 0xFFL) - ((rw >>> n) & 0xFFL));
        }
    }

    // The epilogue to cover the last (minLength % 8) elements.
    for (int i = minWords * Longs.BYTES; i < minLength; i++) {
        int result = UnsignedBytes.compare(buffer1[offset1 + i],
                buffer2[offset2 + i]);
        if (result != 0) {
            return result;
        }
    }
    return length1 - length2;
}


The implementation is somewhat more complex than the original, but it compares 8 bytes at a time. What getLong returns depends on the platform's byte order; the little-endian case takes a bit more work, so that part of the code is omitted here. How much does this help in practice? That has to be measured. Comparing two 1 MB byte arrays takes 2.5499 ms on average with the first version and 0.8359 ms with the second, which is roughly three times faster. For a CPU-bound operation like this, that is a considerable improvement.
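
To see why byte order matters, the following sketch simulates a little-endian read with `ByteBuffer` so that it behaves the same on any machine. It is not the code omitted from the listing: instead of a binary search it locates the first differing byte with `Long.numberOfTrailingZeros`, a different technique chosen here for brevity. The names `ByteOrderDemo`, `readLittleEndian` and `compareLittleEndianWords` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteOrderDemo {
    // Reads eight bytes as a long the way a little-endian machine would:
    // the first byte of the array ends up in the lowest 8 bits.
    static long readLittleEndian(byte[] data) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getLong(0);
    }

    // Given two little-endian words, returns the unsigned difference of
    // the first (lowest-addressed) bytes that differ.
    static int compareLittleEndianWords(long lw, long rw) {
        // The lowest set bit of the XOR lies inside the first differing byte;
        // clearing the low three bits rounds it down to that byte's bit offset.
        int n = Long.numberOfTrailingZeros(lw ^ rw) & ~0x7;
        return (int) (((lw >>> n) & 0xFFL) - ((rw >>> n) & 0xFFL));
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3, 4, 5, 6, 7, 8};
        byte[] b = {1, 2, 9, 0, 0, 0, 0, 0};
        long lw = readLittleEndian(a);
        long rw = readLittleEndian(b);
        // A plain numeric comparison is misleading here: the last byte of the
        // array is the most significant, so lw looks larger although a sorts before b.
        System.out.println(Long.compareUnsigned(lw, rw) > 0); // true
        System.out.println(compareLittleEndianWords(lw, rw)); // -6
    }
}
```

On a big-endian read the first array byte is the most significant one, so the unsigned comparison of the two words already gives the lexicographic answer and no extra step is needed.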

When performance matters, using Unsafe to access memory directly is a good option.
Author: hjm4702192

