Is there an easy function from a pair of 32-bit ints to a single 64-bit int that preserves rotational order?

Question

This is a question that came up in the context of sorting points with integer coordinates into clockwise order, but this question is not about how to do that sorting.

This question is about the observation that 2-d vectors have a natural cyclic ordering. Unsigned integers with the usual overflow behavior (or signed integers using two's complement) also have a natural cyclic ordering. Can you easily map from the first ordering to the second?

So, the exact question is this: is there a map from pairs of two's-complement signed 32-bit integers to unsigned (or two's-complement signed) 64-bit integers such that any list of vectors that is in clockwise order maps to integers that are in decreasing (modulo overflow) order?
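One way to make "decreasing (modulo overflow)" concrete is sketched below. The helper name `is_cyclically_decreasing` and the decision to allow ties (since multiples of a vector may share an image) are assumptions of this sketch, not part of the question. It sums the wrapped gaps between neighbours, including the step from the last value back to the first; a list that goes once around the cycle in decreasing order gives a total of exactly 2^64.

```python
def is_cyclically_decreasing(values, bits=64):
    """Check that `values` never increases when read as one trip around
    the cycle of unsigned `bits`-bit integers (ties are allowed)."""
    values = list(values)
    mask = (1 << bits) - 1
    total = 0
    # Pair every value with its successor, closing the loop at the end.
    for current, following in zip(values, values[1:] + values[:1]):
        total += (current - following) & mask  # wrapped downward step
    # 0 means all values are equal; 2**bits means exactly one full turn.
    return total in (0, 1 << bits)


print(is_cyclically_decreasing([5, 3, 1, 2**64 - 2, 7]))
print(is_cyclically_decreasing([1, 2, 3]))
```

The first list wraps past zero once and then continues downward, so it is accepted; the second is increasing and is rejected.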

Some technical cases that people will likely ask about:

Yes, vectors that are multiples of each other should map to the same thing
No, I don't care which vector (if any) maps to 0
No, the images of antipodal vectors don't have to differ by 2^63 (although that is a nice-to-have)

The obvious answer is that since there are only around 0.6*2^64 distinct slopes, the answer is yes, such a map exists, but I'm looking for one that is easily computable. I understand that "easily" is subjective, but I'm really looking for something reasonably efficient and not terrible to implement. So, in particular, no counting every lattice point between the ray and the positive x-axis (unless you know a clever way to do that without enumerating them all).

An important thing to note is that it can be done by mapping to 65-bit integers. Simply project the vector out to where it hits the box bounded by x,y=+/-2^62 and round toward negative infinity. You need 63 bits to represent that integer and two more to encode which side of the box you hit. The implementation needs a little care to make sure you don't overflow, but only has one branch and two divides and is otherwise quite cheap. It doesn't work if you project out to 2^61 because you don't get enough resolution to separate some slopes.
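A minimal Python sketch of this box projection follows. It is not the one-branch, two-divide version described above: the name `box_key`, the rotation loop, and the choice to make the key increase counterclockwise (so that clockwise order maps to decreasing keys) are all assumptions made for illustration. Python integers do not overflow, so the care needed in a fixed-width implementation is not shown.

```python
def box_key(x, y):
    """Map a nonzero integer vector to a 65-bit key that increases
    counterclockwise, starting from the direction (1, -1)."""
    if x == 0 and y == 0:
        raise ValueError("the zero vector has no direction")
    S = 1 << 62  # half the side length of the box
    side = 0
    # Rotate clockwise by 90 degrees until the vector hits the right-hand
    # side of the box, i.e. x > 0 and -x <= y < x.
    while not (x > 0 and -x <= y < x):
        x, y = y, -x
        side += 1
    # Height of the hit point on the line x = S, rounded toward negative
    # infinity and shifted into the range [0, 2**63).
    offset = (y * S) // x + S
    return (side << 63) | offset


# Multiples of a vector share a key; the near-identical slopes from the
# atan2 example still get different keys.
print(box_key(1, 2) == box_key(2, 4))
print(box_key(2147483643, 1073741821) - box_key(2147483641, 1073741820))
```

Two distinct slopes with denominators of at most 2^31 differ by at least 2^-62, so after scaling by 2^62 they differ by at least 1 and their floors cannot coincide.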

Also, before you suggest "just use atan2", compute atan2(1073741821,2147483643) and atan2(1073741820,2147483641)

EDIT: Expansion on the "atan2" comment:

Given two values x_1 and x_2 that are coprime and just less than 2^31 (I used 2^31-5 and 2^31-7 in my example), we can use the extended Euclidean algorithm to find y_1 and y_2 such that y_1/x_1-y_2/x_2 = 1/(x_1*x_2) ~= 2^-62. Since the derivative of arctan is bounded by 1, the difference of the outputs of atan2 on these values is not going to be bigger than that. So, there are lots of pairs of vectors that won't be distinguishable by atan2 as vanilla IEEE 754 doubles.
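The claim can be checked with a short Python sketch. The variable names are illustrative, and `math.ulp` assumes Python 3.9 or later. The exact cross product is 1, so the two directions are different, yet the computed angles are expected to agree to within a rounding error or so; whether they come out bit-identical may depend on the platform's math library.

```python
import math

# The two vectors from the question, written as (x, y).
x1, y1 = 2**31 - 5, 1073741821
x2, y2 = 2**31 - 7, 1073741820

# Exact integer cross product: nonzero means the directions differ.
cross = y1 * x2 - y2 * x1

# Floating-point angles; note that atan2 takes y first.
angle1 = math.atan2(y1, x1)
angle2 = math.atan2(y2, x2)

# Distance between the two computed angles, measured in units of the
# spacing between adjacent doubles near angle1.
gap_in_ulps = abs(angle1 - angle2) / math.ulp(angle1)

print(cross, gap_in_ulps)
```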

If you have 80-bit extended registers and you are sure you can retain residency in those registers throughout the computation (and don't get kicked out by a context switch or just plain running out of extended registers), then you're fine. But, I really don't like the correctness of my code relying on staying resident in extended registers.

What is "clockwise order" in this context? Would bit-wise interleaving of the two 32-bit entities into a 64-bit entity serve your purposes, e.g. pdep (x, 0x5555555555555555) | pdep (y, 0xaaaaaaaaaaaaaaaa)? — njuffa
Could you explain the significance of the comment about atan2()? Using 64-bit IEEE-754 double precision, I get atan2 (1073741821,2147483643) = 0x1.dac67052e881cp-2 and atan2 (1073741820,2147483642) = 0x1.dac6704fb54e9p-2. These two values are easily distinguished even assuming an error of a couple of ulps. There are probably vectors whose respective atan2 is much closer than this, but I assume the examples were picked for a reason. — njuffa
@AubreydaCunha This could be on topic on MSE, too, though I don't suppose the index of a fraction into the Farey sequence counts as "easily computable". — dxiv
Points on a circle or vectors do not have any natural rotational ordering because you can go from any one to any other one by counterclockwise rotation. Angles modulo 2*pi measured from some particular vector do have an order, but why would you want those, and what is the significance of that special vector? Perhaps there's an XY problem hiding in there. — n. 1.8e9-where's-my-share m.
Another comment about atan2 is that you don't really gain anything compared to just using the slope directly - in fact, you lose some precision for angles closer to odd multiples of 45 degrees, because there are more distinct angles in that part of the range due to the domain being a square rather than a disc. — kaya3

Mark Dickinson · Accepted Answer · 2021-04-01T19:04:45

Here's one possible approach, inspired by a comment in your question. (For the tl;dr version, skip down to the definition of point_to_line at the bottom of this answer: that gives a mapping for the first quadrant only. Extension to the whole plane is left as a not-too-difficult exercise.)

Your question says:

in particular, no counting every lattice point between the ray and the positive x-axis (unless you know a clever way to do that without enumerating them all).

There is an algorithm to do that counting without enumerating the points; its efficiency is akin to that of the Euclidean algorithm for finding greatest common divisors. I'm not sure to what extent it counts as either "easily computable" or "clever".

Suppose that we're given a point (p, q) with integer coordinates and both p and q positive (so that the point lies in the first quadrant). We might as well also assume that q < p, so that the point (p, q) lies between the x-axis y = 0 and the diagonal line y = x: if we can solve the problem for the half of the first quadrant that lies below the diagonal, we can make use of symmetry to solve it generally.

Write M for the bound on the size of p and q, so that in your example we want M = 2^31.

Then the number of lattice points strictly inside the triangle bounded by:

the x-axis y = 0
the ray y = (q/p)x that starts at the origin and passes through (p, q), and
the vertical line x = M

is the sum as x ranges over integers in (0, M) of ⌈qx/p⌉ - 1.
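As a sanity check on that formula, here is a small sketch that counts the interior lattice points directly and compares the count with the sum. Both function names are made up for this illustration, and the brute-force version is only usable for small M.

```python
def interior_points_brute(p, q, M):
    """Count lattice points (x, y) with 0 < x < M and 0 < y < (q/p) * x."""
    count = 0
    for x in range(1, M):
        for y in range(1, q * M // p + 1):
            if p * y < q * x:  # strictly below the ray, in exact integers
                count += 1
    return count


def interior_points_formula(p, q, M):
    """Sum of ceil(q * x / p) - 1 over the integers x in (0, M)."""
    return sum(-(-q * x // p) - 1 for x in range(1, M))


print(interior_points_brute(2, 1, 5), interior_points_formula(2, 1, 5))
```

For p = 2, q = 1, M = 5 the only interior points are (3, 1) and (4, 1), so both functions return 2.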

For convenience, I'll drop the -1 and include 0 in the range of the sum; both those changes are trivial to compensate for. And now the core functionality we need is the ability to evaluate the sum of ⌈qx/p⌉ as x ranges over the integers in an interval [0, M). While we're at it, we might also want to be able to compute a closely-related sum: the sum of ⌊qx/p⌋ over that same range of x (and it'll turn out that it makes sense to evaluate both of these together).

For testing purposes, here are slow, naive-but-obviously-correct versions of the functions we're interested in, here written in Python:

def floor_sum_slow(p, q, M):
    """
    Sum of floor(q * x / p) for 0 <= x < M.

    Assumes p positive, q and M nonnegative.
    """
    return sum(q * x // p for x in range(M))


def ceil_sum_slow(p, q, M):
    """
    Sum of ceil(q * x / p) for 0 <= x < M.

    Assumes p positive, q and M nonnegative.
    """
    return sum((q * x + p - 1) // p for x in range(M))

And an example use:

>>> floor_sum_slow(51, 43, 2**28)  # takes several seconds to complete
30377220771239253
>>> ceil_sum_slow(140552068, 161600507, 2**28)
41424305916577422

These sums can be evaluated much faster. The first key observation is that if q >= p, then we can apply the Euclidean "division algorithm" and write q = ap + r for some integers a and r with 0 <= r < p. The sum then simplifies: the ap part contributes a term of a * M * (M - 1) // 2, and we're reduced from computing floor_sum(p, q, M) to computing floor_sum(p, r, M). Similarly, the computation of ceil_sum(p, q, M) reduces to the computation of ceil_sum(p, q % p, M).
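The reduction can be checked numerically with a short sketch. The helper names here are invented for the illustration and use the naive sums, so this is only meant for small inputs.

```python
def naive_floor(p, q, M):
    return sum(q * x // p for x in range(M))


def naive_ceil(p, q, M):
    return sum((q * x + p - 1) // p for x in range(M))


def quotient_step_holds(p, q, M):
    """Check that splitting q = a*p + r peels off the term a*M*(M-1)/2."""
    a, r = divmod(q, p)
    peeled = a * M * (M - 1) // 2  # sum of a*x for 0 <= x < M
    return (naive_floor(p, q, M) == peeled + naive_floor(p, r, M)
            and naive_ceil(p, q, M) == peeled + naive_ceil(p, r, M))


print(quotient_step_holds(5, 17, 12))
```

The identity holds because floor((a*p + r) * x / p) equals a*x + floor(r * x / p), and likewise for the ceiling.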

The second key observation is that we can express floor_sum(p, q, M) in terms of ceil_sum(q, p, N), where N is the ceiling of (q/p)M. To do this, we consider the rectangle [0, M) x (0, (q/p)M), and divide that rectangle into two triangles using the line y = (q/p)x. The number of lattice points within the rectangle that lie on or below the line is floor_sum(p, q, M), while the number of lattice points within the rectangle that lie above the line is ceil_sum(q, p, N). Since the total number of lattice points in the rectangle is (N - 1)M, we can deduce the value of floor_sum(p, q, M) from that of ceil_sum(q, p, N), and vice versa.
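The rectangle argument can be spot-checked in the same way. The sketch below uses invented helper names and the naive sums, and tests the identity in both directions; the exact rectangle sizes, (N - 1) * M for the floor case and N * (M - 1) for the ceiling case, are the ones that appear in the recursive functions.

```python
def naive_floor(p, q, M):
    return sum(q * x // p for x in range(M))


def naive_ceil(p, q, M):
    return sum((q * x + p - 1) // p for x in range(M))


def flip_step_holds(p, q, M):
    """Check the swap of p and q across the line y = (q/p) x, for q > 0."""
    N = (M * q + p - 1) // p  # ceiling of (q/p) * M
    floor_ok = naive_floor(p, q, M) == (N - 1) * M - naive_ceil(q, p, N)
    ceil_ok = naive_ceil(p, q, M) == N * (M - 1) - naive_floor(q, p, N)
    return floor_ok and ceil_ok


print(flip_step_holds(51, 43, 100))
```

The swap turns a sum over M terms into a sum over N terms with the roles of p and q exchanged, which is what lets the Euclidean-style recursion make progress.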

Combining those two ideas, and working through the details, we end up with a pair of mutually recursive functions that look like this:

def floor_sum(p, q, M):
    """
    Sum of floor(q * x / p) for 0 <= x < M.

    Assumes p positive, q and M nonnegative.
    """
    a = q // p
    r = q % p
    if r == 0:
        return a * M * (M - 1) // 2
    else:
        N = (M * r + p - 1) // p
        return a * M * (M - 1) // 2 + (N - 1) * M - ceil_sum(r, p, N)


def ceil_sum(p, q, M):
    """
    Sum of ceil(q * x / p) for 0 <= x < M.

    Assumes p positive, q and M nonnegative.
    """
    a = q // p
    r = q % p
    if r == 0:
        return a * M * (M - 1) // 2
    else:
        N = (M * r + p - 1) // p
        return a * M * (M - 1) // 2 + N * (M - 1) - floor_sum(r, p, N)

Performing the same calculation as before, we get exactly the same results, but this time the result is instant:

>>> floor_sum(51, 43, 2**28)
30377220771239253
>>> ceil_sum(140552068, 161600507, 2**28)
41424305916577422

A bit of experimentation should convince you that the floor_sum and floor_sum_slow functions give the same result in all cases, and similarly for ceil_sum and ceil_sum_slow.
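One way to run that experiment is sketched below. To keep the snippet self-contained it uses a loop-based variant of the same recursion, called `lattice_sum` here; that name, the `use_ceil` flag, and the test helper are assumptions of this sketch rather than part of the answer. The loop tracks a sign that flips each time the floor and ceiling roles swap.

```python
import random


def lattice_sum(p, q, M, use_ceil):
    """Sum of ceil(q*x/p) (or floor, if use_ceil is False) for 0 <= x < M."""
    total, sign = 0, 1
    while True:
        a, r = divmod(q, p)
        total += sign * (a * M * (M - 1) // 2)
        if r == 0:
            return total
        N = (M * r + p - 1) // p
        # Size of the rectangle used when swapping floor and ceiling sums.
        total += sign * (N * (M - 1) if use_ceil else (N - 1) * M)
        p, q, M = r, p, N
        sign, use_ceil = -sign, not use_ceil


def agrees_with_naive(trials=2000, seed=0):
    """Compare lattice_sum with the naive sums on random small inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        p, q, M = rng.randint(1, 60), rng.randint(0, 60), rng.randint(0, 60)
        if lattice_sum(p, q, M, False) != sum(q * x // p for x in range(M)):
            return False
        if lattice_sum(p, q, M, True) != sum(
                (q * x + p - 1) // p for x in range(M)):
            return False
    return True


print(agrees_with_naive())
```

A loop avoids Python's recursion limit, although the recursion depth is only logarithmic in p and q, so the recursive form is also safe for 32-bit inputs.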

Here's a function that uses floor_sum and ceil_sum to give an appropriate mapping for the first quadrant. I failed to resist the temptation to make it a full bijection, enumerating points in the order that they appear on each ray, but you can fix that by simply replacing the + gcd(p, q) term with + 1 in both branches.

from math import gcd

def point_to_line(p, q, M):
    """
    Bijection from [0, M) x [0, M) to [0, M^2), preserving
    the 'angle' ordering.
    """
    if p == q == 0:
        return 0
    elif q <= p:
        return ceil_sum(p, q, M) + gcd(p, q)
    else:
        return M * (M - 1) - floor_sum(q, p, M) + gcd(p, q)

Extending to the whole plane should be straightforward, though just a little bit messy due to the asymmetry between the negative range and the positive range in the two's complement representation.

Here's a visual demonstration for the case M = 7, printed using this code:

M = 7
for q in reversed(range(M)):
    for p in range(M):
        print(" {:02d}".format(point_to_line(p, q, M)), end="")
    print()

Results:

 48 42 39 36 32 28 27
 47 41 37 33 29 26 21
 46 40 35 30 25 20 18
 45 38 31 24 19 16 15
 44 34 23 17 14 12 11
 43 22 13 10 09 08 07
 00 01 02 03 04 05 06
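The grid can also be checked mechanically. The sketch below recomputes the index with naive sums (so it is only suitable for small M) and verifies that the map is a bijection onto [0, M^2) and that increasing index never turns clockwise. The names `naive_index` and `preserves_angle_order` are invented for this check.

```python
from math import gcd


def naive_index(p, q, M):
    """Same mapping as point_to_line, but using slow direct sums."""
    if p == q == 0:
        return 0
    if q <= p:
        return sum(-(-q * x // p) for x in range(M)) + gcd(p, q)
    return M * (M - 1) - sum(p * x // q for x in range(M)) + gcd(p, q)


def preserves_angle_order(M):
    points = [(p, q) for p in range(M) for q in range(M)]
    points.sort(key=lambda pt: naive_index(pt[0], pt[1], M))
    # Bijection: the sorted indices must be exactly 0, 1, ..., M*M - 1.
    indices = [naive_index(p, q, M) for p, q in points]
    if indices != list(range(M * M)):
        return False
    # Skip the origin, then compare each point with its successor.
    for (p1, q1), (p2, q2) in zip(points[1:], points[2:]):
        cross = p1 * q2 - q1 * p2
        if cross < 0:  # successor lies clockwise of its predecessor
            return False
        if cross == 0 and p1 + q1 >= p2 + q2:  # same ray, wrong order
            return False
    return True


print(preserves_angle_order(7))
```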
