2
votes

I know that there are clustering algorithms for points obviously, but I have a different scenario. I have many rays, all whose start points are on a sphere in 3D, and whose direction vectors point inwards into the sphere. Some of the rays are pointing towards a point A, others are pointing towards a point B, etc, with some noise(ie the rays don't perfectly intersect each other). Is there a clustering algorithm that will allow me to cluster the rays based on which point they are pointing towards? I don't know the locations of points A, B, etc, only the start points and direction vectors of the rays.

For example, in this picture is a sample setup, but in 2D, and I don't know which rays are red or blue at the start. How would I cluster the rays into red and blue? Or, how would I find the locations of the points they're pointing towards?

One solution I thought of was taking pairs of rays, finding the closest point between those two rays(in 2D this is the point of intersection if you extend the rays), and doing this for every pair of rays(so I'd get n(n-1)/2 points, where n is the number of rays). Then, I could use the regular clustering algorithm on those points. But, that's not seeming to work - I'm getting just one big lump of points at the origin, I don't know why that's happening.

3

3 Answers

1
votes

Here is something to try, loosely based on https://en.wikipedia.org/wiki/Random_sample_consensus and https://en.wikipedia.org/wiki/K-means_clustering. It attempts to hill-climb to a solution, so you might want to try it a number of times with different random starts.

We try and find the arrangements of points and clusters that minimizes the sum of squared distances from the rays in each cluster (extended to become lines) to the point associated with each cluster. We know that the square of the distance from a point to a line is a quadratic (https://en.wikipedia.org/wiki/Distance_from_a_point_to_a_line) so if you know which lines go into which cluster, you can find the point for that cluster which minimizes the sum of squared distances for it.

If you know the points, but not which lines go in which clusters, you can assign each line to the cluster whose point is closest to it.

So start with a random assignment of lines into clusters. Now find the point for each cluster which minimizes the sum of squared distances. Now you have points, put each line into the cluster whose point is closest, further reducing the sum of squared distances. Now you have a different arrangement of lines into clusters, recalculate the points, reducing the sum of squared distances again. Clearly you can repeat this process, reducing the sum of squared distances at each step, until you converge to some local minimum.

I suggest that you try this a number of times, starting from different random starts each time, and look to see if the same answer appears more than once at the end, in which case you might be finding something close to a global optimum, or at least a very prominent local minimum.

0
votes

You can try an approach like Hough transform

Make 3d grid (inside sphere?) with some cell size (should be chosen as compromiss between precision and computational space/time)

For every ray increment counters for all cells it touches (perhaps add ray id to the ray list of this cell if you have enough available memory)

Choose best candidates cells with the most votes (large counters) (perhaps cluster them).

Build correspondence of cell (or cluster) and rays through it, group these rays (they are already in the list if it has been built)

0
votes

I would suggest the following:

Given a ray, find out its ending point - the point at which the arrow, if extended, would meet the sphere. That can be done using routines, such as this.

Collect all the set of intersection points (x1, y1, z1),...,(xp, yp, zp)

Now, once you have these points, you could run a 3 dimensional version of the multisource Weber problem (MWP).

If you are looking for m clusters, solve the MWP for m sources. This will produce the best m clusters.

BTW: Solving the MWP is an NP-Hard problem. You are likely going to have to use some sort of enumerative algorithm to solve it. Here is one method.