
Optimize code using binary notation
Code optimization can be more creative than relying on compilers and instruction-level parallelism. Here are specific ways to optimize code in high-level languages by using hardware resources directly.
Optimizations in programming have mostly been associated with more efficient data structures or algorithms. Any optimization that uses hardware resources explicitly is generally considered premature, non-portable or too low-level. Optimization at the level of machine instructions makes little sense nowadays, since compilers are better at arranging instructions to exploit the intrinsic instruction-level parallelism of modern processors.
Nevertheless, an entire class of hardware resources has been largely neglected because some believe these resources are not meant to be used in high-level languages such as Java; in this view, they should be left to languages considered closer to low-level programming, such as C and Zig. That couldn't be further from the truth.
This article focuses on how to exploit hardware resources to design specific solutions in contexts where these resources promote better code efficiency. The main objective is to encourage creative thinking by using binary logic in ways that are not necessarily standard. Do not concentrate too much on the specific algorithms and techniques shown here; rather, apply similar reasoning to other specific solutions using these and other optimizations based on binary logic.
Example of candidate and occupancy sets
A candidate set is a set that contains candidates for a particular task. Its counterpart is an occupancy set: a set that already contains candidates and for which one wants to know whether another candidate can be inserted, provided it is not already in the set.
In high-level languages, these sets are generally handled with loops that iterate through them. In that case, to find out whether a certain candidate is already in an occupancy set, one must search the set element by element. In the worst case, this simple search has O(n) time complexity, where n is the set size. What if one could do this in O(1) time for any candidate? In fact, we can, provided the set is sufficiently small.
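For comparison, a loop-based membership test over a plain array might look like the following minimal Java sketch; the array-based representation and the names are assumptions made for illustration:

// O(n) membership test: scan an array-based occupancy set element by element.
static boolean containsLoop(int[] occupancy, int size, int candidateNumber) {
    for (int i = 0; i < size; i++) {
        if (occupancy[i] == candidateNumber) {
            return true;   // candidate already occupies a slot
        }
    }
    return false;          // scanned the whole set without finding it
}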
One way to approach this problem is to ask whether binary logic could be used to optimize tasks that involve these sets. Integers can be written in many different representations. Binary integers, for example, are just a compact way to represent numbers in base 2, but nothing really differentiates them from representations in other bases. There is no extra information in these representations that would help with the problems we are dealing with here.
Another way to represent this kind of set using binary logic would be to put a 1 at the bit position given by the number we are trying to represent. A zero in that same position would indicate the absence of this number in the set. This idea is not new; bit arrays have been used for this and other purposes since the dawn of binary computers. The advantage of this representation is that it allows binary logic operations to obtain other information and relationships between their elements that are not always obvious to a naïve eye.
If the size of the candidate/occupancy set is sufficiently small, the entire bit array can fit in just one machine word. In this binary representation, a candidate to be inserted in an occupancy set (or each individual candidate already in the set) always corresponds to a power of two, because it is a number with only one bit set to 1, at the position corresponding to the candidate's number.
The table below shows a few candidates and their binary, hexadecimal and decimal representations. The binary representations are bit arrays that do not need to be restricted to 9 bits as shown here. The choice of 9 bits is because this logic can be used in Sudoku game solvers; in fact, these techniques were initially devised for a Sudoku solver. The objective of this article is not to show how to develop a Sudoku solver, but only to show how these optimizations were devised.
Candidate | Binary representation | Hexadecimal | Decimal |
1 | 000000001 | 0x001 | 1 |
2 | 000000010 | 0x002 | 2 |
3 | 000000100 | 0x004 | 4 |
4 | 000001000 | 0x008 | 8 |
5 | 000010000 | 0x010 | 16 |
6 | 000100000 | 0x020 | 32 |
7 | 001000000 | 0x040 | 64 |
8 | 010000000 | 0x080 | 128 |
9 | 100000000 | 0x100 | 256 |
It is simple to convert candidate_number to its binary representation, as follows:
candidate = 1 << (candidate_number - 1).
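As a quick check against the table above, a small Java loop (the variable names are assumptions) reproduces all three representations:

// Print each candidate's single-bit representation in hexadecimal and decimal.
for (int candidateNumber = 1; candidateNumber <= 9; candidateNumber++) {
    int candidate = 1 << (candidateNumber - 1);   // e.g. 6 -> 000100000
    System.out.printf("%d -> 0x%03X (%d)%n", candidateNumber, candidate, candidate);
}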
Checking if a candidate is present in the set
As mentioned above, the set is now just an integer variable (set) wherein each bit represents a different candidate/value. To check if a certain candidate is already in the set, all one needs to do is a binary & (and) operation:
if (( candidate & set ) != 0 ) // then candidate is already in the set
No loop was needed to check this, since all candidates in the set were tested in parallel, bit by bit. This clearly has O(1) complexity.
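In Java, the whole round trip fits in a few lines. The insertion step (set |= candidate) is an assumption for illustration, since the article only shows the test itself:

int set = 0b101011100;            // occupancy set {9,7,5,4,3}
int candidate = 1 << (2 - 1);     // candidate 2 as a power of two

if ((candidate & set) != 0) {
    // candidate 2 is already in the set
} else {
    set |= candidate;             // insert candidate 2 in O(1)
}

After the insertion, set holds 101011110, the occupancy set used in the worked example further below.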
Obtaining the next candidate
When one applies a brute-force method to try out candidates in sequence, a key step is to determine the next candidate, in order, that's not yet in the set.
Next candidate not in the set
Consider a Sudoku game in which a set already has all candidates inserted in the line, in the column and in the 3x3 sub-grid of the cell in which a new candidate will be inserted.
In this context, one simply assumes the first candidate is 1, and if further down in the algorithm candidates can't be inserted, the algorithm simply backtracks and tries the next candidate in sequence. Because we always know what the previous candidate is, we must now determine the next candidate that is not yet in the set.
Again, one would instinctively think of using a loop. Here, another hardware resource is used to deal with this in O(1) time complexity, taking advantage of the processor's automatic carry propagation in the following code:
next = (((candidate + set) ^ set) + candidate) >> 1;
That code has a small problem, however: It might overflow if there are no more candidates not in the set. In a Sudoku game solver, one tests for this overflow beforehand because this condition will trigger the backtracking mechanism as described above.
Let's test the code above with an example: find the next candidate that doesn't yet exist in the set after 2 was already inserted. According to the code above, first we add set and candidate:
101011110 // set: {9,7,5,4,3,2}
+000000010 // candidate: 2
As you can see, the result is 101100000.
The objective in these examples is also to show how the code was devised. Why add set and candidate? Because if there are other candidates in the set after 2, or in other words to the left of 2, adding 2 to the set clears the whole contiguous run of candidates just above 2, and the carry propagation stops at the first zero bit. That is the position we are looking for.
Doing an exclusive or between this result and set exposes all the bits that were changed during the carry propagation, including the bit we want:
101100000 // result of 1st addition
^101011110 // exclusive or with set
The result is 000111110.
Now we add candidate once more, which again causes a carry propagation to one bit beyond the one in question. The goal here is to clear all the other bits and leave just that single bit on.
000111110
+000000010
The result is 001000000.
Because the bit now on is one bit beyond the one that's needed, we simply shift the result right one bit to get the answer: 000100000. According to the table above, that corresponds to 6, which is the correct answer.
This logic enables jumping over existing candidates to find the next candidate that doesn't yet exist in the set. Again, this is an ad hoc solution used in a particular context and should not be considered a generic algorithm. The purpose is to demonstrate how to obtain ingenious solutions that simplify and optimize a certain task by reasoning about the problem in terms of binary logic. These are all ad hoc solutions that take advantage of common hardware resources.
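Put together as a method, a minimal Java sketch of this step could look as follows. The method name and the explicit out-of-range test are assumptions; the article only notes that the overflow condition triggers the backtracking mechanism:

// Next candidate (as a power of two) not yet in the 9-bit set, or 0 if none remains.
// Assumes candidate is a single-bit value, i.e. 1 << (candidate_number - 1).
static int nextCandidateNotInSet(int set, int candidate) {
    int next = (((candidate + set) ^ set) + candidate) >> 1;
    return next > 0x100 ? 0 : next;   // past bit 9: no free candidate left (assumed check)
}

With the example above, nextCandidateNotInSet(0b101011110, 0b000000010) returns 0b000100000, that is, candidate 6.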
Next candidate in the set
Given a candidate already in the set, let's now suppose that we want the next candidate, in numerical order, that exists in the set. Unlike the previous code, this is not something used in any particular problem. It is shown only for educational purposes, to take advantage of the previous logic while obtaining the opposite result:
notset = ~set;
candidate = candidate << 1;
next = (((candidate + notset) ^ notset) + candidate) >> 1;
Let's test the code with a numerical example; again, we assume that the initial known candidate is 2.
101000010 // set: {9,7,2}
010111101 // notset = ~set
Now we add notset and candidate:
010111101 // notset
+000000100 // candidate: 2 << 1
The result is 011000001. Now let's do an exclusive or with notset:
011000001 // result of 1st addition
^010111101 // exclusive or with notset
The result is 001111100. Now let's add candidate again:
001111100
+000000100
The result is 010000000. Shifting it right one bit gives the final answer: 001000000. According to the table above, that corresponds to 7, which is the correct answer.
This logic makes it possible to jump over candidates that don't exist to find the next one that does. The reasoning behind each individual operation can be inferred by comparing this example with the previous one.
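As a minimal Java sketch of this educational variant (the method name is again an assumption):

// Next candidate (as a power of two) after the given one that is already in the set.
// With 32-bit int arithmetic, the expression wraps to 0 when no higher candidate
// exists in a 9-bit set, so 0 serves as the "none found" result in this sketch.
static int nextCandidateInSet(int set, int candidate) {
    int notset = ~set;
    candidate = candidate << 1;
    return (((candidate + notset) ^ notset) + candidate) >> 1;
}

With the example above, nextCandidateInSet(0b101000010, 0b000000010) returns 0b001000000, that is, candidate 7.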
Extensions
We can extend these algorithms to larger sets by implementing a larger bit array. This involves a normal array of integers in which each element holds one word of the bit array that represents the set in the binary representation. The approach afterward is classical, using loops over the words, as sketched below.
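As an illustration of that extension, here is a minimal sketch using an array of 64-bit words; the names and the choice of word size are assumptions:

// A larger occupancy set spread over 64-bit words; each candidate still maps to one bit.
static boolean contains(long[] words, int candidateNumber) {
    int bit = candidateNumber - 1;                 // same mapping as before
    return (words[bit >>> 6] & (1L << (bit & 63))) != 0;
}

static void insert(long[] words, int candidateNumber) {
    int bit = candidateNumber - 1;
    words[bit >>> 6] |= 1L << (bit & 63);
}

Set-wide operations, such as finding the next free candidate, then loop over the words array. In Java, java.util.BitSet packages the same idea, including nextSetBit and nextClearBit.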
Conclusion
The main objective of this exercise is to raise awareness of the use of binary logic to solve and optimize problems that are not necessarily related to binary logic. Do not automatically discard the use of binary logic in these contexts when using high-level languages. The reason is quite simple: Processors not only operate with machine instructions, they also store and operate on data in binary, and many of the processor's low-level binary operators are available in high-level languages as well.
A compiler is better suited than humans to generate code for modern processors. It often arranges instructions so they can be executed out of order, thus exploiting the processor's internal parallelism. One should not confuse machine instructions with binary logic, though. A computer is a binary machine; the more one exploits basic binary hardware resources, the better the generated code should perform.
Nilo Stolte has extensive experience in computer graphics, computer architecture and parallel processing, as well as high-level programming languages including C, C++ and Java.