1 vote

I have a question about inverse prediction in Machine Learning/Data Science. Here is an example to illustrate it: I have 20 input features X = (x0, x1, ... x19) and 3 output variables Y = (y0, y1, y2). The number of training/test items is usually small, such as <1000 or even <100 in the training set.

In general, using a machine learning toolbox (such as scikit-learn), I can train models (such as random forest, linear/polynomial regression, or a neural network) that map X --> Y. But what I actually want to know is, for example, how I should set X so that y1 falls in a specific range (for example, y1 > 100).

Does anyone know how to solve this kind of "inverse prediction"? Two approaches come to mind:

  1. Train the model in the normal way (X --> Y), then set up a dense mesh over the high-dimensional X space (20 dimensions in this example). Feed every point in this mesh to the trained model and select the input points where the predicted y1 > 100. Finally, use a method such as clustering to look for patterns in the selected points.
  2. Directly learn a model from Y to X. Then set up a dense mesh over the Y space, restricted to y1 > 100, and use the trained model to compute the corresponding X points.
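Method #1 can be sketched as follows. This is only an illustration on synthetic data with the shapes from the question (20 inputs, 3 outputs); the random-forest model, the KMeans clustering, and the random sampling (a literal dense mesh in 20 dimensions would need k^20 points, so sampling stands in for it) are all assumed choices, not a prescribed recipe.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic training set: small, as in the question (<1000 items).
X_train = rng.uniform(-1, 1, size=(200, 20))
# Make y1 depend on x0 and x1 so the inverse problem has structure.
Y_train = np.column_stack([
    X_train[:, 0],                        # y0
    200 * X_train[:, 0] * X_train[:, 1],  # y1
    X_train[:, 2],                        # y2
])

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, Y_train)

# Sample the X space (in place of a dense 20-dimensional mesh),
# predict, and keep the inputs whose predicted y1 exceeds 100.
X_candidates = rng.uniform(-1, 1, size=(50_000, 20))
y1_pred = model.predict(X_candidates)[:, 1]
X_good = X_candidates[y1_pred > 100]

# Cluster the selected inputs to look for patterns.
if len(X_good) >= 2:
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_good)
    print(f"{len(X_good)} candidate inputs; cluster centers in (x0, x1):")
    print(kmeans.cluster_centers_[:, :2].round(2))
```

In this toy setup the selected points should fall into two groups (x0 and x1 both positive, or both negative), which is the kind of pattern the clustering step is meant to surface.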

The second method might be OK when Y is also high-dimensional. But in my application Y is usually very low-dimensional and X is very high-dimensional, which makes me think method 2 is not very practical.

Does anyone have any other ideas? I think this must be fairly common in industry, and maybe some people have run into a similar situation before.

Thank you!

This is generically called an "inverse problem"; a web search will find some resources. In this case, you have F which maps m dimensions into n dimensions, with m > n. If F is differentiable, I'm pretty sure this implies that sets of dimension m − n in X map to single points in Y. That is, all of the points in such a set map to the same point in Y, so any of them is a solution of your inverse problem. You can look for the point which satisfies some additional criterion, such as a minimum-norm criterion. - Robert Dodier
To find a solution, try minimizing (F(X) - Y)^2 + G(X) with respect to X, where Y is your target and G is some criterion applied to X within the equivalence class F(X) = Y, such as the squared norm of X. You can apply a gradient-based approach if F is differentiable, as a neural network is. I'm not sure what the picture looks like if F is non-differentiable, as with a decision tree. - Robert Dodier
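The minimization suggested in the comment above can be sketched like this. For simplicity F is a plain linear map here (so the objective is easy to verify), with `scipy.optimize.minimize` doing the work; with a trained neural network you would differentiate through the model instead. The matrix A, the regularization weight, and the target values are all made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 20))  # stand-in for a trained model F: R^20 -> R^3

def F(x):
    return A @ x

y_target = np.array([0.0, 120.0, 0.0])  # we want y1 well above 100
lam = 1e-3                               # weight of the minimum-norm criterion G(X)

def objective(x):
    # (F(X) - Y)^2 + lam * ||X||^2, as in the comment
    r = F(x) - y_target
    return r @ r + lam * (x @ x)

res = minimize(objective, x0=np.zeros(20), method="L-BFGS-B")
x_star = res.x
print("achieved Y:", F(x_star).round(2))
```

Because the regularizer picks a (near) minimum-norm X out of the equivalence class F(X) = Y, the solver returns one specific solution rather than the whole (m − n)-dimensional solution set.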
Hi Rocco, were you able to solve this problem? I am facing a similar problem and would like to know how your approach went. Thanks! - user3451660

1 Answer

0
votes

From what I understand of your needs, #1 is an excellent fit for this problem. I recommend that you use a simple binary SVM classifier to discriminate good X vectors from bad ones. SVMs work well in high-dimensional spaces, and reading out the coefficients is easy in most SVM interfaces.
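A minimal sketch of this suggestion, on synthetic data: label each X vector by whether its y1 exceeds the threshold, fit a linear SVM, and read out the coefficients to see which features drive the good/bad split. The data-generating rule and the `LinearSVC` settings are illustrative assumptions, not part of the original answer.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(500, 20))
y1 = 150 * (X[:, 0] + X[:, 1])       # synthetic: y1 driven by x0 and x1
labels = (y1 > 100).astype(int)       # 1 = "good" X vector

clf = LinearSVC(C=1.0, max_iter=10_000)
clf.fit(X, labels)

# The largest-magnitude coefficients point at the features that matter
# most for pushing y1 above the threshold (here, x0 and x1).
coef = clf.coef_.ravel()
top = np.argsort(-np.abs(coef))[:3]
print("most influential features:", top)
```

Reading the coefficients this way only works with a linear kernel; with a nonlinear kernel you would need a different interpretation method.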