I want to use Gaussian Processes to solve a regression task. My data is as follows: each X vector has a length of 37, and each Y vector has a length of 8.

I'm using the sklearn package in Python, but trying to use Gaussian processes leads to an Exception:

from sklearn import gaussian_process

print "x :", x__
print "y :", y__

gp = gaussian_process.GaussianProcess(theta0=1e-2, thetaL=1e-4, thetaU=1e-1)
gp.fit(x__, y__) 

x : [[ 136. 137. 137. 132. 130. 130. 132. 133. 134.
135. 135. 134. 134. 1139. 1019. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 70. 24. 55. 0. 9. 0. 0.] [ 136. 137. 137. 132. 130. 130. 132. 133. 134. 135. 135. 134. 134. 1139. 1019. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 70. 24. 55. 0. 9. 0. 0.] [ 82. 76. 80. 103. 135. 155. 159. 156. 145. 138. 130. 122. 122. 689. 569. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 156. 145. 138. 130. 122. 118. 113. 111. 105. 101. 98. 95. 95. 759. 639. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 112. 111. 111. 114. 114. 113. 114. 114. 112. 111. 109. 109. 109. 1109. 989. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 133. 130. 125. 124. 124. 123. 103. 87. 96. 121. 122. 123. 123. 399. 279. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 104. 109. 111. 106. 91. 86. 117. 123. 123. 120. 121. 115. 115. 549. 429. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 144. 138. 126. 122. 119. 118. 116. 114. 107. 105. 106. 119. 119. 479. 359. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

y : [[ 7. 9. 13. 30. 34. 37. 36. 41. ] [ 7. 9. 13. 30. 34. 37. 36. 41. ] [ -4. -9. -17. -21. -27. -28. -28. -20. ] [ -1. -1. -4. -5. 20. 28. 31. 23. ] [ -1. -2. -3. -1. -4. -7. 8. 58. ] [ -1. -2. -14.33333333 -14. -13.66666667 -32. -26.66666667 -1. ] [ 1. 3.33333333 0. -0.66666667 3. 6. 22. 54. ] [ -2. -8. -11. -17. -17. -16. -16. -23. ]]

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
in ()
     11 gp = gaussian_process.GaussianProcess(theta0=1e-2, thetaL=1e-4, thetaU=1e-1)
     12
---> 13 gp.fit(x__, y__)

/usr/local/lib/python2.7/site-packages/sklearn/gaussian_process/gaussian_process.pyc in fit(self, X, y)
    300         if (np.min(np.sum(D, axis=1)) == 0.
    301                 and self.corr != correlation.pure_nugget):
--> 302             raise Exception("Multiple input features cannot have the same"
    303                             " target value.")
    304

Exception: Multiple input features cannot have the same target value.

I've found some topics related to a scikit-learn issue, but my version is up-to-date.

As per the suggestion in the issue, did you try to comment out line 307 in gaussian_process.py? - erip
Thanks @erip, it does solve the problem, at least for the moment! - Julian

1 Answer


This is a known issue, and it still has not actually been resolved.

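It happens because, if your data contains identical points, the covariance matrix has identical rows and is therefore singular (not invertible). This means you cannot calculate A^-1, which is part of the solution for a GP. In your data, the first two rows of x are the same.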
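To see why duplicates are a problem, here is a minimal sketch. It assumes a Gaussian kernel with unit amplitude and length scale, and the helper name `gaussian_kernel_matrix` is made up for illustration (it is not part of sklearn). Two identical inputs produce two identical rows in the covariance matrix, so its rank drops below its size:

```python
import numpy as np

def gaussian_kernel_matrix(X, amplitude=1.0, inv_length=1.0):
    """Pairwise Gaussian kernel matrix for the rows of X (illustrative helper)."""
    # Squared Euclidean distance between every pair of rows
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return amplitude * np.exp(-0.5 * inv_length * sq_dists)

# Toy inputs: the first two rows are identical, as in the question's x
X_dup = np.array([[0.0], [0.0], [1.0]])

K = gaussian_kernel_matrix(X_dup)
rank = np.linalg.matrix_rank(K)

# Rows 0 and 1 of K are equal, so K is rank deficient and has no inverse
print("rank:", rank, "size:", K.shape[0])
```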

To solve it, just add some small Gaussian noise to your examples, or use another GP library.
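A rough sketch of the noise workaround, on toy data rather than your real x. The noise scale `1e-6` and the helper `count_distinct_rows` are assumptions chosen for illustration; a scale that is too small relative to your kernel's length scale can still leave the matrix badly conditioned, so treat the value as something to tune:

```python
import numpy as np

rng = np.random.default_rng(0)

def count_distinct_rows(X):
    """Number of distinct rows in X (illustrative helper)."""
    return np.unique(X, axis=0).shape[0]

# Toy inputs with one duplicated row
X = np.array([[1.0, 2.0],
              [1.0, 2.0],
              [3.0, 4.0]])

# Perturb every example with small Gaussian noise so no two rows coincide
noise_scale = 1e-6  # assumed value, tune for your data
X_noisy = X + rng.normal(scale=noise_scale, size=X.shape)

print(count_distinct_rows(X), "distinct rows before,",
      count_distinct_rows(X_noisy), "after")
```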

You can also implement it yourself; it is actually not that hard. The most important thing in a GP is your kernel function, for example a Gaussian kernel:

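import numpy as np
# Gaussian kernel: params[0] is the amplitude, params[1] the inverse squared length scale
exponential_kernel = lambda x, y, params: params[0] * \
    np.exp(-0.5 * params[1] * np.sum((x - y)**2))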

Now we need to build the covariance matrix, like this:

covariance = lambda kernel, x, y, params: \
    np.array([[kernel(xi, yi, params) for xi in x] for yi in y])

So, when you want to predict at a new point, first calculate the covariance matrix of your training inputs x (theta holds the kernel parameters):

sigma1 = covariance(exponential_kernel, x, x, theta)

and then apply the following:

def predict(x, data, kernel, params, sigma, t):
    k = [kernel(x, y, params) for y in data]
    Sinv = np.linalg.inv(sigma)
    y_pred = np.dot(k, Sinv).dot(t)
    sigma_new = kernel(x, x, params) - np.dot(k, Sinv).dot(k)
    return y_pred, sigma_new
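Putting the pieces together, here is a small end-to-end sketch. The definitions are repeated so the block runs on its own; the toy data (`x_train`, `t`) and the parameter values in `theta` are made up for illustration. Since `t` has two columns, it also shows that the same `predict` handles multi-output targets like your 8-dimensional Y:

```python
import numpy as np

# Same definitions as above, repeated so this block is self-contained
exponential_kernel = lambda x, y, params: params[0] * \
    np.exp(-0.5 * params[1] * np.sum((x - y)**2))

covariance = lambda kernel, x, y, params: \
    np.array([[kernel(xi, yi, params) for xi in x] for yi in y])

def predict(x, data, kernel, params, sigma, t):
    k = [kernel(x, y, params) for y in data]
    Sinv = np.linalg.inv(sigma)
    y_pred = np.dot(k, Sinv).dot(t)
    sigma_new = kernel(x, x, params) - np.dot(k, Sinv).dot(k)
    return y_pred, sigma_new

# Toy data (assumed): 4 distinct 1-D inputs, 2 outputs per input
x_train = np.array([[0.0], [1.0], [2.0], [3.0]])
t = np.column_stack([np.sin(x_train[:, 0]), np.cos(x_train[:, 0])])
theta = [1.0, 1.0]  # assumed kernel parameters

sigma1 = covariance(exponential_kernel, x_train, x_train, theta)

# At a training point the noise-free GP should reproduce the target
y_at_train, var_at_train = predict(x_train[1], x_train, exponential_kernel,
                                   theta, sigma1, t)

# At a new point we get an interpolated mean and a positive variance
y_new, var_new = predict(np.array([1.5]), x_train, exponential_kernel,
                         theta, sigma1, t)
print(y_new, var_new)
```

Note that this only works because all rows of `x_train` are distinct; with duplicated rows `np.linalg.inv` would fail or return garbage.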

This is a very naive implementation, and for large datasets the runtime will be high. The most expensive step here is Sinv = np.linalg.inv(sigma), which takes O(N^3) for N training points.
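One common way to make this a bit less naive is to factor the covariance matrix once with a Cholesky decomposition and reuse the factor for every prediction, instead of inverting inside `predict`. This is a swapped-in technique, not what the code above does, and it is still O(N^3) for the one-off factorization. The names `fit_gp`, `predict_gp` and the `jitter` value below are assumptions for illustration:

```python
import numpy as np

def gaussian_kernel(x, y, params):
    # Same Gaussian kernel as above, written as a regular function
    return params[0] * np.exp(-0.5 * params[1] * np.sum((x - y) ** 2))

def fit_gp(X, t, params, jitter=1e-10):
    """Factor the covariance once; jitter on the diagonal is an assumed safeguard."""
    K = np.array([[gaussian_kernel(a, b, params) for b in X] for a in X])
    L = np.linalg.cholesky(K + jitter * np.eye(len(X)))
    # alpha = K^-1 t, obtained from two solves against the Cholesky factor
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, t))
    return L, alpha

def predict_gp(x, X, params, L, alpha):
    """Predictive mean and variance at x, reusing the stored factor."""
    k = np.array([gaussian_kernel(x, xi, params) for xi in X])
    v = np.linalg.solve(L, k)
    mean = k.dot(alpha)
    var = gaussian_kernel(x, x, params) - v.dot(v)
    return mean, var

# Toy data (assumed), with distinct inputs
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
t_train = np.sin(X_train[:, 0])
params = [1.0, 1.0]

L, alpha = fit_gp(X_train, t_train, params)
mean, var = predict_gp(np.array([1.5]), X_train, params, L, alpha)
print(mean, var)
```

After `fit_gp`, each call to `predict_gp` only needs one solve against `L` rather than a fresh inversion.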