Integer Arithmetic Continued

I con­tin­ue my thing about inte­ger arith­metic. The plan is to add so much math to this blog that nobody will read it, although to be fair, the amount of math that takes is “any” and nobody was read­ing to begin with.


Tak­ing the aver­age of two num­bers is impor­tant. Super impor­tant. But how do you do it?


No, you can’t just add them up and divide by 2.

It’s been over sev­en years since Peter Norvig post­ed “Near­ly All Bina­ry Search­es and Merge­sorts are Bro­ken” and still, the top result for “bina­ry search in C” has these maths for the indices:

first = 0;
last = n - 1;
middle = (first+last)/2;

Of course, the prob­lem here is that if first+last exceeds 231-1 or 2147483647, then it over­flows into the neg­a­tive, and using the now-neg­a­tive middle as an index will cause “unde­fined behav­ior1.”

Before jump­ing into fix­es, let’s take a sec­ond to appre­ci­ate that page. I did­n’t expect any­thing that bad to come up as the first result. Let’s take in the link to a com­piled EXE (!), a screen­shot of it run­ning (?), and this bril­liant pro­gram­mers’ admon­ish­ment about using bina­ry search (?!):

Bina­ry search is faster than lin­ear search but list should be sort­ed, hash­ing is faster than bina­ry search and per­form search­es in con­stant time.


Alright then.

In this case, the fix is real­ly very sim­ple: since both first and last are known to be pos­i­tive, their dif­fer­ence can’t over­flow the range of inte­gers and you can do this:

middle = first + (last - first) / 2;

Guar­an­teed to not over­flow. How­ev­er, what hap­pens when you need to take an aver­age of two num­bers that may not be pos­i­tive? If first is very large and pos­i­tive and last is very large and neg­a­tive, last - first could overflow.

One naïve solu­tion would be to use both for­mu­lae, and switch between them based on the inputs. After all, if (a+b)/2 works when a and b have dif­fer­ent signs, and a+(b-a)/2 works when they’re both pos­i­tive, then we just write a rou­tine that applies them at the right times, right?

Except… I kind of ignored how the expres­sion last - first is guar­an­teed to be pos­i­tive. But if the order of our avg func­tion’s two input val­ues is not known, then that val­ue can be neg­a­tive. Why is this a problem?

As you might guess, it’s round­ing. See, divi­sion in C++ is defined to trun­cate towards zero, and in C it’s a fre­quent default to do the same. So, neg­a­tive num­bers round up and pos­i­tive num­bers round down. You see the prob­lem when you have an order-depen­dent oper­a­tion like sub­trac­tion in the expres­sion (b-a)/2?

Let’s try an exam­ple; let’s take the aver­age of 2 and 7.

\begin{aligned} \text{avg}(2, 7) = 2 + \frac{7 - 2}{2} = 2 + \lfloor 2.5\rfloor = 2 + 2 = 4 \\ \text{avg}(7, 2) = 7 + \frac{2 - 7}{2} = 7 + \lceil -2.5\rceil = 7 + -2 = 5 \end{aligned}

Yeah, we end­ed up with an avg func­tion where avg(a, b) != avg(b, a). Not good. Sure, we could just cor­rect for this with anoth­er con­di­tion­al to check if (b-a) < 0 and sub­tract one from the result if one input is odd and the oth­er even2. But you’d agree that’s pret­ty com­pli­cat­ed, right?

// note: this is to prove a point. It's not been tested.
int32_t avg(int32_t a, int32_t b) {
    if ( (a < 0) != (b < 0) ) {
        return (a + b) / 2;
    } else {
        return a + (b - a) / 2 + ( (b - a < 0) & ( (a & 1) != (b & 1) ) );

What the hell, Batman?

Let’s try anoth­er approach. Maybe we should divide each input val­ue first, then add the halves together.

int32_t avg(int32_t a, int32_t b) {
    return (a / 2) + (b / 2);

The error here should be obvi­ous now. I’ll show you with examples.

\begin{aligned} \text{avg}(2, 4) = \frac{2}{2} + \frac{4}{2} = 1 + 2 = 3 \\ \text{avg}(3, 5) = \frac{3}{2} + \frac{5}{2} = \lfloor 1.5\rfloor + \lfloor 2.5\rfloor = 1 + 2 = 3 \end{aligned}

Wait. So our avg func­tion works as if avg(a, b) = avg(a + 1, b + 1)? That’s… kind of wrong. But at least avg(a, b) = avg(b, a)!

Let’s see here. So in order to build a func­tion that works sim­i­lar­ly to (a+b)/2, the gold stan­dard of mid­point­ing, we must sat­is­fy the following:

  1. avg(a, b) = avg(b, a)
  2. avg(a + 1, b + 1) = avg(a, b) + 1
  3. avg(-a, -b) = -avg(a, b)

That last one actu­al­ly spec­i­fies a whol­ly dif­fer­ent and yet unex­plained require­ment: sym­me­try about zero. I just feel that if you’re gonna be aver­ag­ing signed num­bers togeth­er, the func­tion ought to work sim­i­lar­ly on either side of the num­ber line and not bias results in either direc­tion. So it only makes sense to round the results toward zero.

Any­ways, so how do with address our prob­lem with the “sum of halves” solu­tion? Can we fix it with­out adding a bunch of branch­es like with the for­mer “con­di­tion­al half of dif­fer­ence” solution?

Sum of Halves

Well, let’s think­ing about why it hap­pens. The error man­i­fests when both halves have a half after­ing halv­ing: that is, when both a / 2 and b / 2 are something.5 (read that as “some­thing point five”). This results in both parts being round­ed down before get­ting summed. But, since both parts were decreased by .5, this caus­es a total decrease of 1 in the sum!

Of course, x / 2 needs only be round­ed if x is odd, so the only time the above error man­i­fests is then if both inputs are odd. So it’s a sim­ple mat­ter of adding one to the result when both are odd, right?

int32_t avg(int32_t a, int32_t b) {
    return (a / 2) + (b / 2) + (a & b & 1);
    // can also be           + (a % 2 && b % 2)

Kind of. See, this does take care of most of the errors, but there’s still a sub­tle bug that caus­es it to not sat­is­fy require­ment 2. avg(a + 1, b + 1) = avg(a, b) + 1. Can you spot it?

\begin{aligned} \text{avg}(0, 3) = 0 + \frac{3}{2} = 0 + \lfloor1.5\rfloor = 1 \\ \text{avg}(-1, 2) = \frac{-1}{2} + 1 = \lceil-.5\rceil + 1 = 0 + 1 = 1 \end{aligned}

I’ll give it to ya straight: when a and b have oppo­site signs, they are round­ed in oppo­site direc­tions. Neg­a­tives are round­ed up, and pos­i­tives are round­ed down. Not a big deal if both inputs are odd or both are even. In fact, the error only man­i­fests when one input is odd and the oth­er is even, as well as anoth­er con­di­tion that I haven’t even yet been able to divine. Suf­fice to say that it’s not real­ly the right path to go down.

What a mess.

So what can we do? The eas­i­est thing I fig­ured out is to sim­ply use a divi­sion that con­sis­tent­ly rounds in the same direc­tion, then round the result as need­ed. To do that, we’ll use an arith­metic right shift by 1 in order to per­form a divide-by‑2.

It’s a com­mon mis­con­cep­tion that the C expres­sions x / 2 and x >> 1 are equiv­a­lent. In fact, for unsigned inte­gers, they are. How­ev­er, the major dis­tinc­tion is that the lat­ter rounds towards neg­a­tive infin­i­ty on most plat­forms3! This is much more con­ve­nient for us to use, since cor­rec­tions hap­pen con­sis­tent­ly across the range of integers.

int32_t avg(int32_t a, int32_t b) {
    return (a >> 1) + (b >> 1) + (a & b & 1);
    // can also be             + (a % 2 && b % 2)

Already, this is a cor­rect mid­point func­tion. How­ev­er, it’s still just a bit off my mark: it does­n’t meet our third require­ment, sym­me­try on the num­ber line about zero. That’s a sim­ple fix though. Toss in a 1 when par­i­ties don’t match but the sum of halves is neg­a­tive, and voilà, we have a win­ner! Rounds towards zero and everything!

/* SPDX-License-Identifier: MIT OR Apache-2.0 */
 * Overflow-safe average of two signed integers. The naive average function is
 * erroneous when the sum of the inputs overflows integer limits; this average
 * works by summing the halves of the input values and then correcting the sum
 * for rounding.
 * @param a first value, as a 32-bit signed integer
 * @param b second value, as a 32-bit signed integer
 * @return signed average of the two values, rounded towards zero if their
 * average is not an integer
static inline int32_t avg(int32_t a, int32_t b) {
    // shifts divide by two, rounded towards negative infinity
    const int32_t sumHalves = (a >> 1) + (b >> 1);
    // this has error of magnitude one if both are odd
    const uint32_t bothOdd = (a & b) & 1;
    // round toward zero; add one if one input is odd and sum is negative
    const uint32_t roundToZero = (sumHalves < 0) & (a ^ b);

    // result is sum of halves corrected for rounding
    return sumHalves + bothOdd + roundToZero;
  1. C pro­gram­mer catch-all for any sort of prob­lem the peo­ple who make C tools don’t care to have already solved for the pro­gram­mer. []
  2. (a&1) != (b&1) or (a%2) != (b%2) []
  3. Sad­ly, this isn’t real­ly defined by the C or C++ stan­dards. []