I am performing a linear regression analysis on bike share data, and I want to predict bikecount from the other factors.
I selected the features and the target like so:
x = df[['rain', 'temp', 'rhum', 'msl', 'wdsp', 'day', 'month', 'monthname', 'season']]
y = df['bikecount']
Then, when I get to this stage: lm.fit(X_train,y_train)
it returns this error: ValueError: could not convert string to float: '07/06/2019'
I tried converting this column to float using df['date'] = float(df['date']), but that returns the error TypeError: cannot convert the series to <class 'float'>
I don't understand why this keeps coming up, since I'm not even using the date column in my analysis. Any help would be appreciated!
Here is the output of df.info(), followed by the first few rows of the data:
0 datetime 6040 non-null datetime64[ns]
1 bikecount 6040 non-null int64
2 rain 6040 non-null float64
3 temp 6040 non-null float64
4 rhum 6040 non-null int64
5 msl 6040 non-null float64
6 wdsp 6040 non-null int64
7 date 6040 non-null object
8 time 6040 non-null object
9 day 6040 non-null object
10 month 6040 non-null int64
11 monthname 6040 non-null object
12 season 6040 non-null object
dtypes: datetime64[ns](1), float64(3), int64(4), object(5)
memory usage: 613.6+ KB
datetime | bikecount | rain | temp | rhum | msl | wdsp | date | datetime.1 | day | month | monthname | season |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2019-01-01 00:00:00 | 1 | 0.0 | 9.9 | 78 | 1036.0 | 4 | 01/01/2019 | 00:00:00 | Tuesday | 1 | January | Winter |
2019-01-01 07:00:00 | 1 | 0.0 | 8.3 | 87 | 1036.8 | 2 | 01/01/2019 | 07:00:00 | Tuesday | 1 | January | Winter |
2019-01-01 11:00:00 | 2 | 0.0 | 9.5 | 89 | 1038.8 | 3 | 01/01/2019 | 11:00:00 | Tuesday | 1 | January | Winter |
2019-01-01 12:00:00 | 4 | 0.0 | 10.1 | 84 | 1038.7 | 3 | 01/01/2019 | 12:00:00 | Tuesday | 1 | January | Winter |
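
Both errors described above can be reproduced on a small toy frame. The sketch below is only an illustration under assumptions: the values are made up, only a few of the columns are included, and one-hot encoding with `pd.get_dummies` is one possible way to handle the string columns, not necessarily the right choice for this dataset. Note that the feature list in the question does not contain `date`, so an error mentioning '07/06/2019' suggests that `X_train` may have been built from a different selection than the `x` shown; that is a guess, since the split code is not shown.

```python
import pandas as pd

# Toy frame mimicking a few of the columns above (values are made up).
df = pd.DataFrame({
    "bikecount": [1, 1, 2, 4],
    "temp": [9.9, 8.3, 9.5, 10.1],
    "rhum": [78, 87, 89, 84],
    "date": ["01/01/2019", "01/01/2019", "02/01/2019", "02/01/2019"],
    "day": ["Tuesday", "Tuesday", "Wednesday", "Wednesday"],
    "season": ["Winter", "Winter", "Winter", "Winter"],
})

# float() converts a single scalar, not a whole Series, hence the TypeError.
try:
    float(df["date"])
except TypeError as exc:
    type_error = str(exc)

# Any string column left among the features fails when the data is cast
# to a float array, which is roughly what the estimator does on fit.
try:
    df[["temp", "day"]].to_numpy(dtype=float)
except ValueError as exc:
    value_error = str(exc)

# List the non-numeric columns that would trigger the error.
string_cols = df.select_dtypes(exclude="number").columns.tolist()

# One possible fix: drop the unused date column and one-hot encode
# the remaining categorical columns so every feature is numeric.
features = df.drop(columns=["bikecount", "date"])
X = pd.get_dummies(features, columns=["day", "season"], dtype=float)
y = df["bikecount"]
```

With the real data, the same idea would apply to `day`, `monthname` and `season`, and the resulting all-numeric `X` is what would go into the train/test split before `lm.fit`.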
01/01/2019, what is the expected result? – Corralien