6
votes

The curve-fitting problem for 2D data is well known (LOWESS, etc.), but given a set of 3D data points, how do I fit a 3D curve (e.g., a smoothing/regression spline) to this data?

MORE: I'm trying to find a curve, fitting the data provided by vectors X,Y,Z which have no known relation. Essentially, I have a 3D point cloud, and need to find a 3D trendline.

MORE: I apologize for the ambiguity. I tried several approaches (I still haven't tried modifying the linear fit), and a random NN approach seems to work best. That is, I randomly pick a point from the point cloud, find the centroid of its neighbors (within an arbitrary sphere), and iterate. Connecting the centroids to form a smooth spline is proving difficult, but the centroids obtained are passable.
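For readers who want to try the centroid step, here is a minimal sketch of one possible reading of it, using NumPy. The function name `local_centroids`, the parameters `radius`, `n_seeds` and `n_iter`, and the synthetic noisy helix are all assumptions made for illustration. "Iterate" is interpreted here as repeatedly re-centering each seed on the mean of its neighbors, which may not be exactly what was done above.

```python
import numpy as np

def local_centroids(points, radius, n_seeds, n_iter=10, seed=0):
    """Pick random seed points and repeatedly move each one to the
    centroid of the cloud points lying within `radius` of it."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(points), size=n_seeds, replace=False)
    centers = points[idx].astype(float)
    for _ in range(n_iter):
        for i in range(n_seeds):
            dist = np.linalg.norm(points - centers[i], axis=1)
            near = points[dist <= radius]
            if len(near):  # keep the old center if the sphere is empty
                centers[i] = near.mean(axis=0)
    return centers

# Synthetic stand-in for the real data: a noisy helix (assumed, not the OP's cloud).
rng = np.random.default_rng(1)
t = rng.uniform(0, 4 * np.pi, 2000)
cloud = np.column_stack([np.cos(t), np.sin(t), 0.2 * t])
cloud += rng.normal(scale=0.05, size=cloud.shape)

centers = local_centroids(cloud, radius=0.3, n_seeds=40)
```

The centroids come back unordered, which is why connecting them into a spline is the hard part: some ordering along the curve still has to be chosen.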

To clarify the problem, the data is not a time series, and I'm looking for a smooth spline which best describes the point cloud. I.e., if I were to project this 3D spline onto a plane formed by any two of the variables, the projected spline would be a smooth fit of the projected point cloud.

IMG: I've included an image. The red points represent the centroid obtained from the aforementioned method.

3D Point Cloud and Local Centroids http://img510.imageshack.us/img510/2495/40670529.jpg

5
This question could be improved by including a minimal working example (MWE) of an x,y,z point cloud. -- Brian D

5 Answers

2
votes

A related question is here:

Simple multidimensional curve fitting

In general, you could view a problem like this from a statistical learning point of view. In other words, you have a set of basis functions (e.g., splines) parametrized in a certain way, and then you use least squares or some other regression technique to find the optimal coefficients. I happen to like The Elements of Statistical Learning.
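As a rough illustration of the "basis functions plus least squares" idea, the sketch below fits a cubic regression spline in one variable with NumPy. The truncated-power basis, the knot placement, the helper name `spline_basis`, and the noisy sine data are assumptions chosen for the example; other bases (e.g., B-splines) would work the same way.

```python
import numpy as np

def spline_basis(x, knots):
    """Cubic truncated-power basis: 1, x, x^2, x^3, and (x - k)^3_+ per knot."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

# Synthetic data (assumed): a noisy sine curve.
rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

knots = np.linspace(0, 2 * np.pi, 6)[1:-1]      # 4 interior knots
B = spline_basis(x, knots)                      # design matrix, shape (200, 8)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)    # least-squares coefficients
fit = B @ coef

# RMS error of the fit against the noise-free curve
rms = np.sqrt(np.mean((fit - np.sin(x)) ** 2))
```

For a 3D curve, the same design matrix can be reused with a three-column right-hand side (one column per coordinate), provided some curve parameter plays the role of `x`.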

1
votes

You could try additive models (i.e., single-index models), such as GAM: http://www-stat.stanford.edu/software/gam/index.html

It's a greedy approach, very scalable, and well implemented in several R packages.

1
votes

It depends on what you mean by that. If you have a set of points f(x,y) -> z and you want to find a function that hits them all, you could just use an interpolating spline.

If you have a known function and you want to adjust the parameters to minimize the RMS error, just consider x,y a composite object p (e.g., as if it were a complex or a 2-vector) and use an analog of the 2d case on f(p) -> z.

If you can be more specific about what you're trying to accomplish, I can be more specific with suggestions.

-- MarkusQ

So given the edited problem statement, I'd suggest the following:

  • If it's a time series (implied by your use of the term "trendline") I'd look at treating it as three parametric functions (x(t), y(t), z(t)) and doing 2d fitting on each of them.
  • Alternatively (but still assuming an ordered series), you may want to find a linear fit (a line through the heart of the cloud) and then add to that some sort of (probably polar) function based on the perpendicular projection from the points to the line.
  • If it isn't a time series (implied by the phrases "no known relation" and "point cloud"), you have to define what "curve" you want to fit to the points. Do you want a line? A surface / manifold? Do you want it to be a function of one or two of the variables, or independent of them (say, the convex hull)? Does it have to be smooth, limited in degree, or...?
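A minimal sketch combining the first two suggestions, assuming NumPy: take the principal axis of the cloud as the "line through the heart of the cloud", use each point's projection onto that axis as a stand-in parameter t, and fit x(t), y(t), z(t) separately. Using the projection as t is an assumption that only makes sense when the curve does not double back along that axis; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_curve_along_axis(points, degree=3):
    """Fit x(t), y(t), z(t) as polynomials, with t taken to be the
    projection of each point onto the cloud's principal axis."""
    center = points.mean(axis=0)
    # The first right singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(points - center, full_matrices=False)
    direction = vt[0]
    t = (points - center) @ direction
    # polyfit handles all three coordinates at once when y is 2D.
    coeffs = np.polyfit(t, points, degree)      # shape (degree + 1, 3)
    return center, direction, t, coeffs

def eval_curve(coeffs, t):
    """Evaluate the fitted curve at parameter values t."""
    return np.column_stack(
        [np.polyval(coeffs[:, k], t) for k in range(coeffs.shape[1])]
    )

# Synthetic data (assumed): a gently bent arc with noise.
rng = np.random.default_rng(0)
s = np.linspace(-1, 1, 300)
clean = np.column_stack([s, 0.3 * s**2, 0.1 * s**3])
cloud = clean + rng.normal(scale=0.02, size=clean.shape)

center, direction, t, coeffs = fit_curve_along_axis(cloud, degree=3)
fitted = eval_curve(coeffs, t)
rms = np.sqrt(np.mean(np.sum((fitted - cloud) ** 2, axis=1)))
```

For a cloud that loops or spirals, a single axis will not give a usable ordering, and some other parametrization (arc length along an ordered path, for instance) would be needed.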

Really, the question is still too open ended.

0
votes

I would try using the Spacefilling Curve Heuristic. For example, sort the points by the order they are visited by a spacefilling curve. One solution to your problem would be a spline curve through the points taken in that order. To get a shorter and smoother curve (but greater RMS distance from the points to the curve), you could force the spline to go through only every kth point. You could improve the curve if, after choosing every kth point, you looked for a shorter Hamiltonian path through them (like the Traveling Salesman Problem, but for open paths). You could also adjust the spline knots to decrease the RMS distance. When calculating the RMS distance, I would use the spacefilling curve order to indicate which part of the spline is likely to be closest to a given point.
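The ordering step could be sketched as follows. This uses a Morton (Z-order) curve because it is short to write down; that is a swapped-in choice, and a Hilbert or Sierpinski curve would generally keep neighbors closer together. The names `morton_code` and `spacefilling_order`, the `bits` resolution, and the random test cloud are assumptions for illustration.

```python
import numpy as np

def morton_code(ix, iy, iz, bits=10):
    """Interleave the bits of three non-negative integers (Z-order index)."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code

def spacefilling_order(points, bits=10):
    """Return indices that sort the points along a Morton curve."""
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0                        # avoid division by zero
    # Quantize each coordinate to an integer grid of 2**bits cells.
    grid = ((points - lo) / span * (2**bits - 1)).astype(int)
    codes = [morton_code(int(x), int(y), int(z), bits) for x, y, z in grid]
    return np.argsort(codes, kind="stable")

# Synthetic data (assumed): random points in the unit cube.
rng = np.random.default_rng(0)
cloud = rng.random((500, 3))

order = spacefilling_order(cloud)
k = 25
waypoints = cloud[order][::k]    # every k-th point, in curve order
```

The `waypoints` array is what a spline would then be run through; the shortening and knot-adjustment steps described above are not shown.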

0
votes

There is some very nice recent work by Charles Fefferman (yes, the Fields medalist) and Boaz Klartag.

You can find both of them as PDF files on Klartag's publications page.