Integer Arithmetic Continued

I continue my thing about inte­ger arith­metic. The plan is to add so much math to this blog that nobody will read it, although to be fair, the amount of math that takes is “any” and nobody was read­ing to begin with.


Taking the aver­age of two numbers is impor­tant. Super impor­tant. But how do you do it?


No, you can’t just add them up and divide by 2.

It’s been over seven years since Peter Norvig posted “Nearly All Binary Searches and Merge­sorts are Broken” and still, the top result for “binary search in C” has these maths for the indices:

first = 0;
last = n - 1;
middle = (first+last)/2;

Of course, the prob­lem here is that if first+last exceeds 231-1 or 2147483647, then it over­flows into the nega­tive, and using the now-nega­tive middle as an index will cause “unde­fined behav­ior1.”

Before jump­ing into fixes, let’s take a second to appre­ci­ate that page. I didn’t expect anything that bad to come up as the first result. Let’s take in the link to a compiled EXE (!), a screen­shot of it running (?), and this bril­liant program­mers’ admon­ish­ment about using binary search (?!):

Binary search is faster than linear search but list should be sorted, hash­ing is faster than binary search and perform searches in constant time.


Alright then.

In this case, the fix is really very simple: since both first and last are known to be posi­tive, their differ­ence can’t over­flow the range of inte­gers and you can do this:

middle = first + (last - first) / 2;

Guar­an­teed to not over­flow. However, what happens when you need to take an aver­age of two numbers that may not be posi­tive? If first is very large and posi­tive and last is very large and nega­tive, last - first could over­flow.

One naïve solu­tion would be to use both formu­lae, and switch between them based on the inputs. After all, if (a+b)/2 works when a and b have differ­ent signs, and a+(b-a)/2 works when they’re both posi­tive, then we just write a routine that applies them at the right times, right?

Except… I kind of ignored how the expres­sion last - first is guar­an­teed to be posi­tive. But if the order of our avg function’s two input values is not known, then that value can be nega­tive. Why is this a prob­lem?

As you might guess, it’s round­ing. See, divi­sion in C++ is defined to trun­cate towards zero, and in C it’s a frequent default to do the same. So, nega­tive numbers round up and posi­tive numbers round down. You see the prob­lem when you have an order-depen­dent oper­a­tion like subtrac­tion in the expres­sion (b-a)/2?

Let’s try an exam­ple; let’s take the aver­age of 2 and 7.

\begin{aligned} \text{avg}(2, 7) = 2 + \frac{7 - 2}{2} = 2 + \lfloor 2.5\rfloor = 2 + 2 = 4 \\ \text{avg}(7, 2) = 7 + \frac{2 - 7}{2} = 7 + \lceil -2.5\rceil = 7 + -2 = 5 \end{aligned}

Yeah, we ended up with an avg func­tion where avg(a, b) != avg(b, a). Not good. Sure, we could just correct for this with another condi­tional to check if (b-a) < 0 and subtract one from the result if one input is odd and the other even2. But you’d agree that’s pretty compli­cated, right?

// note: this is to prove a point. It's not been tested.
int32_t avg(int32_t a, int32_t b) {
    if ( (a < 0) != (b < 0) ) {
        return (a + b) / 2;
    } else {
        return a + (b - a) / 2 + ( (b - a < 0) & ( (a & 1) != (b & 1) ) );

What the hell, Batman?

Let’s try another approach. Maybe we should divide each input value first, then add the halves together.

int32_t avg(int32_t a, int32_t b) {
    return (a / 2) + (b / 2);

The error here should be obvi­ous now. I’ll show you with exam­ples.

\begin{aligned} \text{avg}(2, 4) = \frac{2}{2} + \frac{4}{2} = 1 + 2 = 3 \\ \text{avg}(3, 5) = \frac{3}{2} + \frac{5}{2} = \lfloor 1.5\rfloor + \lfloor 2.5\rfloor = 1 + 2 = 3 \end{aligned}

Wait. So our avg func­tion works as if avg(a, b) = avg(a + 1, b + 1)? That’s… kind of wrong. But at least avg(a, b) = avg(b, a)!

Let’s see here. So in order to build a func­tion that works simi­larly to (a+b)/2, the gold stan­dard of midpoint­ing, we must satisfy the follow­ing:

  1. avg(a, b) = avg(b, a)
  2. avg(a + 1, b + 1) = avg(a, b) + 1
  3. avg(-a, -b) = -avg(a, b)

That last one actu­ally spec­i­fies a wholly differ­ent and yet unex­plained require­ment: symme­try about zero. I just feel that if you’re gonna be aver­ag­ing signed numbers together, the func­tion ought to work simi­larly on either side of the number line and not bias results in either direc­tion. So it only makes sense to round the results toward zero.

Anyways, so how do with address our prob­lem with the “sum of halves” solu­tion? Can we fix it with­out adding a bunch of branches like with the former “condi­tional half of differ­ence” solu­tion?

Sum of Halves

Well, let’s think­ing about why it happens. The error mani­fests when both halves have a half after­ing halv­ing: that is, when both a / 2 and b / 2 are something.5 (read that as “some­thing point five”). This results in both parts being rounded down before getting summed. But, since both parts were decreased by .5, this causes a total decrease of 1 in the sum!

Of course, x / 2 needs only be rounded if x is odd, so the only time the above error mani­fests is then if both inputs are odd. So it’s a simple matter of adding one to the result when both are odd, right?

int32_t avg(int32_t a, int32_t b) {
    return (a / 2) + (b / 2) + (a & b & 1);
    // can also be           + (a % 2 && b % 2)

Kind of. See, this does take care of most of the errors, but there’s still a subtle bug that causes it to not satisfy require­ment 2. avg(a + 1, b + 1) = avg(a, b) + 1. Can you spot it?

\begin{aligned} \text{avg}(0, 3) = 0 + \frac{3}{2} = 0 + \lfloor1.5\rfloor = 1 \\ \text{avg}(-1, 2) = \frac{-1}{2} + 1 = \lceil-.5\rceil + 1 = 0 + 1 = 1 \end{aligned}

I’ll give it to ya straight: when a and b have oppo­site signs, they are rounded in oppo­site direc­tions. Nega­tives are rounded up, and posi­tives are rounded down. Not a big deal if both inputs are odd or both are even. In fact, the error only mani­fests when one input is odd and the other is even, as well as another condi­tion that I haven’t even yet been able to divine. Suffice to say that it’s not really the right path to go down.

What a mess.

So what can we do? The easi­est thing I figured out is to simply use a divi­sion that consis­tently rounds in the same direc­tion, then round the result as needed. To do that, we’ll use an arith­metic right shift by 1 in order to perform a divide-by-2.

It’s a common miscon­cep­tion that the C expres­sions x / 2 and x >> 1 are equiv­a­lent. In fact, for unsigned inte­gers, they are. However, the major distinc­tion is that the latter rounds towards nega­tive infin­ity on most plat­forms3! This is much more conve­nient for us to use, since correc­tions happen consis­tently across the range of inte­gers.

int32_t avg(int32_t a, int32_t b) {
    return (a >> 1) + (b >> 1) + (a & b & 1);
    // can also be             + (a % 2 && b % 2)

Already, this is a correct midpoint func­tion. However, it’s still just a bit off my mark: it doesn’t meet our third require­ment, symme­try on the number line about zero. That’s a simple fix though. Toss in a 1 when pari­ties don’t match but the sum of halves is nega­tive, and voilà, we have a winner! Rounds towards zero and every­thing!

/* SPDX-License-Identifier: MIT OR Apache-2.0 */
 * Overflow-safe average of two signed integers. The naive average function is
 * erroneous when the sum of the inputs overflows integer limits; this average
 * works by summing the halves of the input values and then correcting the sum
 * for rounding.
 * @param a first value, as a 32-bit signed integer
 * @param b second value, as a 32-bit signed integer
 * @return signed average of the two values, rounded towards zero if their
 * average is not an integer
static inline int32_t avg(int32_t a, int32_t b) {
    // shifts divide by two, rounded towards negative infinity
    const int32_t sumHalves = (a >> 1) + (b >> 1);
    // this has error of magnitude one if both are odd
    const uint32_t bothOdd = (a & b) & 1;
    // round toward zero; add one if one input is odd and sum is negative
    const uint32_t roundToZero = (sumHalves > 0) & (a ^ b);

    // result is sum of halves corrected for rounding
    return sumHalves + bothOdd + roundToZero;
  1. C program­mer catch-all for any sort of prob­lem the people who make C tools don’t care to have already solved for the program­mer. []
  2. (a&1) != (b&1) or (a%2) != (b%2) []
  3. Sadly, this isn’t really defined by the C or C++ stan­dards. []