I use this code to fit a LinearRegression model:

from sklearn.linear_model import LinearRegression
import pandas as pd

def calculate_Intercept_X_Variable():
    list_a=[['2018', '3', 'aa', 'aa', 93,1884.7746222667, 165.36153386251098], ['2018', '3', 'bb', 'bb', 62, 665.6392779848, 125.30386609565328], ['2018', '3', 'cc', 'cc', 89, 580.2259903521, 160.19280253775514]]
    df = pd.DataFrame(list_a)
    X = df.iloc[:, 5]
    y = df.iloc[:, 6]
    clf = LinearRegression()
    clf.fit(X, y)

calculate_Intercept_X_Variable()

But I get this error message:

File "E:\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 181, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [1, 3]

What is wrong, and how should I modify my code?

1 Answer

The scikit-learn documentation for fit says:

fit(X, y, sample_weight=None)

X : array-like or sparse matrix, shape (n_samples, n_features) Training data

y : array_like, shape (n_samples, n_targets) Target values. Will be cast to X’s dtype if necessary

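The problem is that X and y are currently 1D, while fit expects X to be 2D with shape (n_samples, n_features). The scikit-learn version in your traceback appears to interpret a 1D X of length 3 as 1 sample with 3 features, which would explain the [1, 3] in the error.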

X.shape, y.shape
# ((3,), (3,))

Reshape X into a single column. Reshaping y the same way is optional, since fit also accepts a 1D y:

X = X.values.reshape(-1, 1)
y = y.values.reshape(-1, 1)
clf.fit(X,y)
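
Putting it together, here is a minimal sketch of the corrected function. The name fit_line and the choice to return the intercept and slope are my own additions for illustration, not part of the original code; the data rows are the ones from the question.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Sample rows copied from the question.
rows = [
    ['2018', '3', 'aa', 'aa', 93, 1884.7746222667, 165.36153386251098],
    ['2018', '3', 'bb', 'bb', 62, 665.6392779848, 125.30386609565328],
    ['2018', '3', 'cc', 'cc', 89, 580.2259903521, 160.19280253775514],
]


def fit_line(rows):
    """Fit column 6 against column 5 (hypothetical helper for illustration)."""
    df = pd.DataFrame(rows)
    # reshape(-1, 1) turns the 1D column into shape (n_samples, 1).
    X = df.iloc[:, 5].values.reshape(-1, 1)
    # A 1D target is accepted as-is.
    y = df.iloc[:, 6].values
    clf = LinearRegression()
    clf.fit(X, y)
    # With one feature and a 1D y, coef_ has shape (1,).
    return clf.intercept_, clf.coef_[0]


intercept, slope = fit_line(rows)
print(intercept, slope)
```

Selecting the column with a list, as in df.iloc[:, [5]], should also give a 2D input, because it returns a one-column DataFrame rather than a Series.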