Software vulnerabilities: starting with the "binary search" bug

Source: Internet
Author: User

Probably around July, I first saw this article on Javalobby. I was shocked: in JDK 1.5 and earlier, Arrays.binarySearch(int[] a, int key) and Collections.binarySearch (which calls indexedBinarySearch) contain a bug of a kind that went unnoticed in the industry for decades, and the authors of these two methods are two highly accomplished engineers:

 * @author Josh Bloch
 * @author Neal Gafter

I think anyone who has worked in the Java field has heard of them. Josh Bloch is called the "mother of Java" (although he is male) because the Java Collections Framework, java.math, generics, and the Effective Java Programming Language Guide all came from his hands, while Neal Gafter is the main author of javac, the Java compiler we use every day.

Let's first look at the problematic code:

public static int binarySearch(int[] a, int key) {
    int low = 0;
    int high = a.length - 1;
    while (low <= high) {
        int mid = (low + high) >> 1;
        int midVal = a[mid];
        if (midVal < key)
            low = mid + 1;
        else if (midVal > key)
            high = mid - 1;
        else
            return mid; // key found
    }
    return -(low + 1); // key not found.
}

The problematic code is this line:

    int mid = (low + high) >> 1;

As you know, for non-negative values it is equivalent to:

    int mid = (low + high) / 2;

The problem appears when low and high are both very large, for example when the array holds 2^30 or more elements. Then low + high can exceed the maximum int value, 2^31 - 1, and overflow. The sum wraps around to a negative number, so mid is negative as well, and the access a[mid] throws an ArrayIndexOutOfBoundsException.
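The overflow can be reproduced without allocating a huge array. The following is a minimal sketch; the class and method names are hypothetical and chosen only for illustration, and the two index values are simply examples of what low and high could be near the top of an array with more than 2^30 elements.

```java
// Hypothetical demo class; the names are illustrative and not part of the JDK.
class MidpointOverflowDemo {

    // The midpoint computation used by the buggy binarySearch.
    static int naiveMid(int low, int high) {
        return (low + high) >> 1;
    }

    public static void main(String[] args) {
        int low = (1 << 30) + 1;   // 1073741825
        int high = (1 << 30) + 3;  // 1073741827

        // The true sum is 2^31 + 4, which does not fit in an int and wraps around.
        System.out.println(low + high);           // prints -2147483644
        // The signed shift keeps the sign bit, so the "midpoint" is negative.
        System.out.println(naiveMid(low, high));  // prints -1073741822
    }
}
```

Using such a value as an array index is what makes the search fail.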

The correct implementation should be:

    int mid = low + ((high - low) / 2);

Or, more concisely, use Java's unsigned right shift operator:

    int mid = (low + high) >>> 1;
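To see both corrections side by side, here is a sketch of the search with the midpoint fixed. The class name and the two helper methods are hypothetical; only the body of binarySearch follows the method shown earlier. It assumes the usual precondition 0 <= low <= high, under which both helpers return the same index.

```java
// Hypothetical class; only the search loop mirrors the method discussed above.
class SafeBinarySearch {

    // Fix 1: high - low cannot overflow when 0 <= low <= high.
    static int midBySubtraction(int low, int high) {
        return low + ((high - low) / 2);
    }

    // Fix 2: the sum may wrap to a negative int, but >>> treats the bits
    // as an unsigned 32-bit value, so the shifted result is correct.
    static int midByUnsignedShift(int low, int high) {
        return (low + high) >>> 1;
    }

    // Returns the index of key in the sorted array a,
    // or -(insertion point + 1) if key is not present.
    static int binarySearch(int[] a, int key) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = midByUnsignedShift(low, high);
            int midVal = a[mid];
            if (midVal < key)
                low = mid + 1;
            else if (midVal > key)
                high = mid - 1;
            else
                return mid; // key found
        }
        return -(low + 1); // key not found
    }
}
```

For example, SafeBinarySearch.binarySearch(new int[] {1, 3, 5, 7}, 5) returns 2, and searching the same array for 4 returns -3, because 4 would be inserted at index 2.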

Although this problem has been fixed, can we be sure that these dozen or so lines are now correct? Even the two authors remain skeptical.

According to the literature, the first binary search algorithm was published in 1946, but the first implementation considered free of errors did not appear until 1962 (that is, it took more than a decade to get a dozen or so lines of code right). Because data volumes at that time never came close to 2^30, this bug went unnoticed and was only submitted to the Java bug database last year. Human thinking is flawed.

Today, in fields such as search engines and genetic engineering, data of this order of magnitude is probably no longer rare, so if you need to process very large amounts of data in your field, use JDK 6.0.

By the way, in C the fix can be written as follows. Each operand is cast before the addition, because overflow in signed addition is undefined behavior in C:

    mid = ((unsigned int)low + (unsigned int)high) >> 1;

As programmers, we should always stay vigilant and humble!

 
