3
votes

I was at the high frequency Trading firm interview, they asked me

Find a square whose length size is R with given n points in the 2D plane

conditions:

--parallel sides to the axis and it contains at least 5 of the n points

running complexity is not relative to the R

they told me to give them O(n) algorithm

2
so how did you respond?agentp
Is R given to you ahead of time or are you allowed to select it?jerry
this must be more missing here. The problem posed (assuming R given) may have no solution.agentp
@george that would proably be an acceptable result of the algorithm (find a solution or determine that none exist in O(n) time). I believe the difficult (or impossible) part is the linear time complexity (assuming the points are given as, say, an unsorted list).jerry
correct answer, "this question is ambiguously written and either trivial or impossible depending on interpretation".agentp

2 Answers

4
votes

Interesting problem, thanks for posting! Here's my solution. It feels a bit inelegant but I think it meets the problem definition:

Inputs: R, P = {(x_0, y_0), (x_1, y_1), ..., (x_N-1, y_N-1)}

Output: (u,v) such that the square with corners (u,v) and (u+R, v+R) contains at least 5 points from P, or NULL if no such (u,v) exist

Constraint: asymptotic run time should be O(n)

Consider tiling the plane with RxR squares. Construct a sparse matrix, B defined as

B[i][j] = {(x,y) in P | floor(x/R) = i and floor(y/R) = j}

As you are constructing B, if you find an entry that contains at least five elements stop and output (u,v) = (i*R, j*R) for i,j of the matrix entry containing five points.

If the construction of B did not yield a solution then either there is no solution or else the square with side length R does not line up with our tiling. To test for this second case we will consider points from four adjacent tiles.

Iterate the non-empty entries in B. For each non-empty entry B[i][j], consider the collection of points contained in the tile represented by the entry itself and in the tiles above and to the right. These are the points in entries: B[i][j], B[i+1][j], B[i][j+1], B[i+1][j+1]. There can be no more than 16 points in this collection, since each entry must have fewer than 5. Examine this collection and test if there are 5 points among the points in this collection satisfying the problem criteria; if so stop and output the solution. (I could specify this algorithm in more detail, but since (a) such an algorithm clearly exists, and (b) its asymptotic runtime is O(1), I won't go into that detail).

If after iterating the entries in B no solution is found then output NULL.

The construction of B involves just a single pass over P and hence is O(N). B has no more than N elements, so iterating it is O(N). The algorithm for each element in B considers no more than 16 points and hence does not depend on N and is O(1), so the overall solution meets the O(N) target.

1
votes

Run through set once, keeping the 5 largest x values in a (sorted) local array. Maintaining the sorted local array is O(N) (constant time performed N times at most).

Define xMin and xMax as the x-coordinates of the two points with largest and 5th largest x values respectively (ie (a[0] and a[4]).

Sort a[] again on Y value, and set yMin and yMax as above, again in constant time.

Define deltaX = xMax- xMin, and deltaY as yMax - yMin, and R = largest of deltaX and deltaY.

The square of side length R located with upper-right at (xMax,yMax) meets the criteria.

Observation if R is fixed in advance:

  • O(N) complexity means no sort is allowed except on a fixed number of points, as only a Radix sort would meet the criteria and it requires a constraint on the values of xMax-xMin and of yMax-yMin, which was not provided.
  • Perhaps the trick is to start with the point furthest down and left, and move up and right. The lower-left-most point can be determined in a single pass of the input.
  • Moving up and right in steps and counitng points in the square requries sorting the points on X and Y in advance, which to be done in O(N) time requiress that the Radix sort constraint be met.