Ah, the On-Line Encyclopedia of Integer Sequences.

OEIS elmo

The next best thing to magic.

Not only can you usually pick up a nice closed-form expression for whatever numbers you paste in, but more often than not you’ll get a peek into your numbers’ natural habitat in some strange (and often frightening) realm of mathematics. Careful! Don’t think too hard about why your problem’s secret isomorphism to gray-coded Hadamard matrices lest your mind become boggled. Just close that door back up, and you can go about coding in your new top-tier arcana.

For those unfamiliar, OEIS is a database of integer sequences. You plug in a few terms and get back a list of possible generating sequences with references to the literature and links to related sequences. (The New York Times did a nice feature on OEIS a few months back that’s well worth a read.)

Used OEIS today to work some problems related to hereditary stratigraphy, and crossed paths with A290255.

A290255 = [0,1,0,2,1,0,0,3,2,1,1,0,0,0,0,4,3,2,2,1,1,1,1,0,

This sequence ended up being the magic ingredient I needed, but OEIS listed no closed-form expression for the sequence.

The description — though — gave enough aha! to light the way. Turns out, this sequence is just the “number of 0’s following directly the first 1 in the binary representation of n.”

🔗 Implementation

I needed a closed-form expression for my use case, and here’s what I came up with.

def bit_floor(n: int) -> int:
    """Calculate the largest power of two not greater than n.

    If zero, returns zero.
    mask = 1 << (n >> 1).bit_length()
    return n & mask

def bit_drop_msb(n: int) -> int:
    """Drop most significant bit from binary representation of integer n."""
    return n & (~bit_floor(n))

def bit_count_immediate_zeros(x: int) -> int:
    """Count the number of zeros immediately following the first binary digit
    of x."""
    # https://oeis.org/A290255
    return x.bit_length() - bit_drop_msb(x).bit_length() - bool(x)

We can smoosh it all together to get a sufficiently ghastly one-liner.

This all comes together as
def bit_count_immediate_zeros(n: int) -> int:
    return n.bit_length() - (n & (~(1 << (n >> 1).bit_length()))).bit_length() - bool(n)

Note that if you’re using numpy and might have numpy data types floating around, you’ll want to explicitly cast inputs to int to get access to Python’s built-in bit manipulation methods.

🔗 How Does it Work?

How does this work? The trick is dropping the most significant bit. This will expose the immediate 0’s (i.e., directly following the leading 1), if any, and they’ll collapse away in Python’s integer representation. Afterwards, the next following 1 will be left most significant bit if present. We can test how many bits were stripped away by subtracting the new bit length from the old bit length.

Note that we need to account for the leading 1 in the bit length, which is taken care of by the last term. We’d like to subtract one from the bit length of the original number, but only if there was a bit there in the first place. The bool(x) tacked on the end does this for us — it will evaluate as 1 if x is positive and 0 if x is zero.

What about bit_drop_msb and bit_floor?

In bit_drop_msb, the call to bit_floor isolates the leading one in the binary representation of n. This leading bit lets us easily generate an inverse mask — i.e., with all other non-leading bits set to 1. Masking the original n and keeping only masked bits leaves everything else in place except for the most significant bit.

In bit_floor, our goal is to find the largest power of two not greater than n. This amounts to isolating the most significant bit of n. A simple way to do this would be counting the number of bits in n and then shifting 1 up to that power of two. Something like 1 << (n.bit_length() - 1). Note that we have to subtract one because the 1 we are shifting up is already “shifted” into the first (i.e., 0th) position.

The n >> 1 in bit_floor is a trick for zero-handling. If n is zero, n >> 1 will simply nop, and bit_floor will return zero. Otherwise, we can just shift 1 up to the power of two of n’s most significant bit. We no longer have to subtract one due to the n >> 1 downshift.

🔗 Validation

Let’s check against the reference sequence entries.

>>> import itertools as it
>>> juxtaposed = zip(
...    map(bit_count_immediate_zeros, it.count(1)),  # oeis is 1-indexed
...    A290255,
... )
>>> all(it.starmap(int.__eq__, juxtaposed))

Okay, let’s do a little more validation by comparing against a string-based implementation.

def bit_count_immediate_zeros_str(n: int) -> int:
    """Count the number of zeros immediately following the first binary digit
    of x."""
    return f"{n:b}1"[1:].index("1")  # fstring "1" ensures a trailing 1

We can test our alternate implementation against the reference sequence entries.

>>> juxtaposed = zip(
...    map(bit_count_immediate_zeros_str, it.count(1)),  # oeis is 1-indexed
...    A290255,
... )
>>> all(it.starmap(int.__eq__, juxtaposed))

And then test against one another.

>>> test_indices = [
...     0, 1, 2,
...     *map(int, np.linspace(0, 2**32, 10**5, dtype=int)),
...     *map(int, np.geomspace(1, 2**32, 10**5, dtype=int)),
...     *map(int, np.random.randint(0, 2**32, 10**5)),
... ]
>>> juxtaposed = zip(
...    map(bit_count_immediate_zeros, test_indices),
...    map(bit_count_immediate_zeros_str, test_indices),
... )
>>> all(it.starmap(int.__eq__, juxtaposed))

Yup, everything checks out. Both implementations agree over integers sampled up to 2**32!