I started learning machine learning in Python using pandas and scikit-learn.
I tried to use the LinearRegression().fit method:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
house_data = pd.read_csv(r"C:\Users\yassine\Desktop\ml\OC-tp-ML\house_data.csv")
y = house_data[["price"]]
x = house_data[["surface","arrondissement"]]
X = house_data.iloc[:, 1:3].values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=1)
model = LinearRegression()
model.fit(x_train, y_train)
When I run the code, I get this error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Can you help me, please?
Your input contains NaN values, infinite values, or extremely large values that scikit can't handle. Check for NaN rows in your data and try to remove them. – G. Anderson
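Following up on the comment above, here is a minimal sketch of how one might check for NaN or infinite values and drop the affected rows before fitting. The small DataFrame is a made-up stand-in for house_data.csv (the real file's contents are not shown in the question), and the names columns, nan_counts, and clean are only illustrative. Dropping rows is one option among several; it assumes losing those rows is acceptable for your data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for house_data.csv; the values are made up for illustration.
house_data = pd.DataFrame({
    "price": [1330, 1400, 904, 955, 2545, 1560, 1960, 1000],
    "surface": [37.0, 32.0, np.nan, 30.0, 70.0, np.inf, 50.0, 28.0],
    "arrondissement": [1.0, 1.0, 2.0, 2.0, np.nan, 3.0, 3.0, 4.0],
})

columns = ["price", "surface", "arrondissement"]

# Count missing values per column to see where the problem comes from.
nan_counts = house_data[columns].isna().sum()
print(nan_counts)

# Treat +/-inf as missing as well, then drop every row that has a missing value.
clean = house_data[columns].replace([np.inf, -np.inf], np.nan).dropna()

y = clean[["price"]]
x = clean[["surface", "arrondissement"]]
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=1
)

# With only finite values left, fit() no longer raises the ValueError.
model = LinearRegression()
model.fit(x_train, y_train)
```

With the real file, you would keep your pd.read_csv(...) call and apply the same isna() check and dropna() step to house_data before selecting x and y.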