I am new to machine learning and XGBoost, and I am solving a regression problem. My target values are very small (e.g. -1.23e-12). I am using linear regression and an XGBoost regressor, but XGBoost always predicts the same value for every sample, like:
[1.32620335e-05 1.32620335e-05 ... 1.32620335e-05].
I tried tuning some parameters of XGBRegressor, but it still predicted the same values.
I've seen Scaling of target causes Scikit-learn SVM regression to break down, so I tried scaling my target values up (data.target = data.target * (10**12)), and that fixed the problem. But I am not sure whether scaling the target like this is reasonable, and I don't know whether this problem in XGBoost has the same cause as the one with SVR.
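For reference, here is a minimal sketch of the same scale-up/scale-back idea using scikit-learn's TransformedTargetRegressor, which applies the scaling before fitting and inverts it on predictions, so downstream code still sees values in the original ~1e-12 units. The data here is synthetic, standing in for my real features and target:

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for my data: targets on the order of 1e-12.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=1000)) * 1e-12

# Scale the target up before fitting, and scale predictions back down,
# so callers still get values in the original units.
model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=lambda t: t * 1e12,
    inverse_func=lambda t: t * 1e-12,
)
model.fit(X, y)
pred = model.predict(X[:5])
print(pred)  # predictions back on the ~1e-12 scale
```

The same wrapper should accept an XGBRegressor in place of LinearRegression; I used LinearRegression here only to keep the sketch self-contained.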
Here is target value of my data:
count    2.800010e+05
mean    -1.722068e-12
std      6.219815e-13
min     -4.970697e-12
25%     -1.965893e-12
50%     -1.490800e-12
75%     -1.269998e-12
max     -1.111604e-12
And part of my code:
import xgboost
from sklearn import linear_model
from sklearn.model_selection import train_test_split

X = df[feature].values
y = df[target].values * (10**12)  # scale the target up from ~1e-12
X_train, X_test, y_train, y_test = train_test_split(X, y)

xgb = xgboost.XGBRegressor()
LR = linear_model.LinearRegression()
xgb.fit(X_train, y_train)
LR.fit(X_train, y_train)

xgb_predicted = xgb.predict(X_test)
LR_predicted = LR.predict(X_test)
print('xgb predicted:', xgb_predicted[0:5])
print('LR predicted:', LR_predicted[0:5])
print('ground truth:', y_test[0:5])
Output:
xgb predicted: [-1.5407631  -1.49756    -1.9647646  -2.7702322  -2.5296502 ]
LR predicted:  [-1.60908805 -1.51145989 -1.71565321 -2.25043287 -1.65725868]
ground truth:  [-1.6572993  -1.59879922 -2.39709641 -2.26119817 -2.01300088]
And the output with y = df[target].values (i.e., without scaling the target):
xgb predicted: [ 1.32620335e-05  1.32620335e-05  1.32620335e-05  1.32620335e-05  1.32620335e-05]
LR predicted:  [-1.60908805e-12 -1.51145989e-12 -1.71565321e-12 -2.25043287e-12 -1.65725868e-12]
ground truth:  [-1.65729930e-12 -1.59879922e-12 -2.39709641e-12 -2.26119817e-12 -2.01300088e-12]