Joshua Bloch on Bugs

Discussions

Blogs: Joshua Bloch on Bugs

1. Joshua Bloch on Bugs (9 messages)

Is there a bug in your binary search? According to Joshua Bloch there maybe one lurking just outside of the range of your bounds test. What is even more surprising to Josh is that this bug is contained in a program that had been proven correct in chapter 5 of Programming Pearls. Though the condition is well known, it still remains somewhat obscure.
So what's the bug? Here's a standard binary search, in Java. (It's one that I wrote for the java.util.Arrays): 1: public static int binarySearch(int[] a, int key) { 2: int low = 0; 3: int high = a.length - 1; 4: 5: while (low <= high) { 6: int mid = (low + high) / 2; 7: int midVal = a[mid]; 8: 9: if (midVal < key) 10: low = mid + 1; 11: else if (midVal > key) 12: high = mid - 1; 13: else 14: return mid; // key found 15: } 16: return -(low + 1); // key not found. 17: }
The bug, found in line 16, is of course due to arithmetic overflow. If adding low + high > 2^31 – 1 (maximum value of int), then the value will become negative. There is no need to guess at the consequences of using a negative array index. So how does one fix such a bug? Josh suggests the formulation; low + ((high - low) / 2). This has the effect of avoiding overflow but involves several extra operations. He suggests a faster version; (low + high) >>> 1. This formula assumes that right sifting a register fills the left most bit with a 0. Since this story is based on a bad assumption.... Of course Josh address this point.
And now we know the binary search is bug-free, right? Well, we strongly suspect so, but we don't know. It is not sufficient merely to prove a program correct; you have to test it too.
He does point out that it is often not possible to completely test an application for all ranges of input and that the problem is worse if not impossible, when testing concurrent programs. Care to share your favorite hideous bug with us?

2. The culprit must be line 6. Not 16.[ Go to top ]

The faulty line is line 6. Not line 16.
3. Java doesn't leave >>> up to assumptions[ Go to top ]

He suggests a faster version; (low + high) >>> 1. This formula assumes that right sifting a register fills the left most bit with a 0
always fills the left most bit with 0. >> preserves the sign meaning a negative number is filled with 1
But the problem is there is no way to convert arithmethic overflows to exceptions in Java. We are lucky this one resulted in a IndexOutOfBoundsException so it was relatively easy to find and fix. I've started to see code like this (mostly C though):
assert a >= 0; assert b >= 0; if (a + b < 0) { errx(...); } else { ... } </blockquote> But it is a lot of work to find all places where this could be a problem and guard against it.
4. Re: Not a bug in C[ Go to top ]

A bug that in C didn't usually happen because of pointers and their particular arithmetic. The search in C is usually implemented with pointers instead or integers. With such pointers, the operation char *mid = (low + high) / 2; is not allowed, since you can't add two pointers. But you can substract them (and add an integer to a pointer). So the legal operation is also the correct: char *mid = low + (high-low) / 2; as explained in Kernighan & Ritchie's "The C Programming Language"
5. Solution is simple[ Go to top ]

The simplest solution would be as follows: mid = low/2 + high/2... This can never overflow your boundaries....
6. Re: Solution is simple[ Go to top ]

uhm... and u introduce a new bug... low=1, high=3... mid=1 always!
7. Re: Solution is simple[ Go to top ]

Dear Marcelo,
uhm... and u introduce a new bug... low=1, high=3... mid=1 always!
I guess you didnt read the full algorithm.... I am quoting it here for you...
1: public static int binarySearch(int[] a, int key) { 2: int low = 0; 3: int high = a.length - 1; 4: 5: while (low <= high) { 6: int mid = (low + high) / 2; 7: int midVal = a[mid]; 8: 9: if (midVal < key) 10: low = mid + 1; 11: else if (midVal > key) 12: high = mid - 1; 13: else 14: return mid; // key found 15: } 16: return -(low + 1); // key not found. 17: }
Iteration 1: low = 1, high = 3, mid = 1/2 + 3/2 = 0 + 1 = 1 now.. if a[1] = key - we found our man. if a[1] < key low = mid + 1 , i.e. 1 + 1 = 2 and if a[1] > key (i.e. the first element by itself is too large) high = mid - 1, 1 - 1 = 0 - now low > high so quit. Iteration 2: will happen only for a[1] < key. i.e. low = 2 and high = 3 mid = 2/2 + 3/2 = 1 + 1 = 2. now.. if a[2] = key - we found our match if a[2] < key low = mid + 1, i.e. 2+1 = 3 and if a[2] > key, high = mid -1 = 2 -1 =1.. (now a[1] < key < a[2] - no match) also high < low. so the only way we can go to the next iteration is if a[2] < key so in Iteration 3: low = 3 and high = 3 mid = 3/2 + 3/2 = 1 + 1 = 2. if a[3] != key, we know a[2] high -- we have seen all 3 elements and didnt find our key in the array - so quit. I see no reason why this algorithm should get stuck on the same number as you suggest :-) ...
8. Re: Solution is simple[ Go to top ]

in Iteration 3:
low = 3 and high = 3
mid = 3/2 + 3/2 = 1 + 1 = 2.
if a[3] != key
I am a bit of a mathematical genius, aint I?
9. Is it really a practical problem?[ Go to top ]

While I have not thought about this very carefully, I think this bug may occur only when searching a very large array of objects. For example, for the “(low + high) / 2” bug to occur, I think you’d have to have an array of more than one billion elements, and the search’s calculated mid point would have to move to nearly the last object in the array for (low + high) > (2^31 – 1) to occur. And if it is ever possible for the “-(low + 1)” bug to occur, you’d need to have at least 2,147,483,647 array elements, and each element of the example int array would be a sorted increasing number from 0 to 2,147,483,646 and the search key value would have to be 2,147,483,647, so there would be no match in the lower value range. This is not to mention that it would require more than 8 GB of RAM just for an array with 2 billion int values to allow such an approach to work. Am I missing something? I think it is more likely to happen with an array of objects (say String) or byte or char than with int elements. If your int (or other type of) array is that huge, most of us would take a different approach. For example, we might keep an array of missing values rather than an array of existing values, and invert the search so that if you don’t find the search key your search is considered “found.” Or we might avoid the issue altogether and set up a specialized database table to deal with it. My point is that if you consider most applications--and most enterprise business applications at that--we are probably going to have BIG problems because of caching a huge in-memory array long before we encounter problems with the actual limits of the binary search algorithm. This is not to suggest that the bug(s) would never happen. Of course the problem should be fixed, especially for the cases where Java is being used in gigantic memory computing environments (scientific?). Those of us “experienced” enough ;) to remember when the book “Programming Pearls” was first published, and that we used the algorithm in C, and on PCs with 640K RAM limits, know that such limits are never made to last. But most members of this community probably have little to worry about with practical uses of Java’s built in binary search.
10. RE: Is it really a practical problem?[ Go to top ]

I think it is more likely to happen with an array of objects (say String) or byte or char than with int elements.
Oops... of course you'd never have enough unique byte or or char values to hit the overflow situation. Sorry. I was just using another approach found in Programming Pearls, called "Back of the Envelope," and I needed to scratch out part of my conclusion. :)