0
votes

I updated my R version today, and now I get an error from the lm() function. I am now running R version 4.0.3 (2020-10-10).

This is my data structure saved in df:

DataFrame

(If the picture isn't loading: it's a data frame with 2 columns, 'Date' and 'Value'. The dates are stored as Date objects created with as.Date().)

I want to know the slope of the linear regression line, so I use the following function:

trend <- lm(formula = Date~Value, data=df)

It used to return the intercept and slope of the trend line, but since updating R I get the following error:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'y'

There are no NA values in my data frame, and I have not found a way to fix this error. Does anyone have a suggestion for how to fix it, or another way to get the slope of the trend line? I think the cause is the date variable, because the function doesn't return an error if I use 2 numeric variables.

Thanks in advance for your time and help!

1
Probably you are looking for lm(Value ~ Date, df). There is no way a date can be a response variable. – Onyambu
It did indeed fix the error, but now I get the coefficients per day. Is there any way to get the slope of the trend line over the whole data frame? – Brecht Sreukers

1 Answer

2
votes

If you want to use a simple linear regression with one variable, you could convert the date into running days (the number of days elapsed since the first observation) and regress Value on that.
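A minimal sketch of the running-days idea, assuming a data frame with the columns 'Date' and 'Value' described in the question. The data below is made up for illustration and is not the asker's; the names Days, slope_per_day and total_change are hypothetical.

```r
# Hypothetical stand-in for the asker's df: 10 consecutive days, perfectly linear values.
df <- data.frame(
  Date  = as.Date("2020-01-01") + 0:9,
  Value = 5 + 2 * (0:9)
)

# Running days: number of days elapsed since the first observation.
df$Days <- as.numeric(df$Date - min(df$Date))

# Value is the response, elapsed days the predictor.
trend <- lm(Value ~ Days, data = df)

# Slope of the trend line, in Value units per day.
slope_per_day <- unname(coef(trend)["Days"])

# Change along the fitted line over the whole observed period.
total_change <- slope_per_day * max(df$Days)
```

The fitted slope is still expressed per day. Multiplying it by the number of days spanned, as in total_change, is one way to read it as a change over the whole data frame.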

Looking at your data, this would not gain you much, as it does not look very "linear". So you could instead deconstruct the date into relevant components (and possibly use the running days from above as well):

  • day of the month
  • day of the week
  • week
  • month
  • quarter/semester
  • year

(or a selection of these) and perform a multiple linear regression on these integers.
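The decomposition above could be sketched as follows, using only base R. The data is random placeholder data, not the asker's, and the column names (DayOfMonth, DayOfWeek, and so on) as well as the choice of predictors in the model are assumptions made for illustration.

```r
# Hypothetical placeholder data: two years of daily dates with random values.
set.seed(1)
df <- data.frame(Date = seq(as.Date("2019-01-01"), by = "day", length.out = 730))
df$Value <- rnorm(nrow(df))

# Deconstruct the date into integer components.
df$DayOfMonth <- as.integer(format(df$Date, "%d"))
df$DayOfWeek  <- as.integer(format(df$Date, "%u"))  # 1 = Monday, ..., 7 = Sunday
df$Week       <- as.integer(format(df$Date, "%V"))  # ISO 8601 week of the year
df$Month      <- as.integer(format(df$Date, "%m"))
df$Quarter    <- (df$Month - 1L) %/% 3L + 1L
df$Year       <- as.integer(format(df$Date, "%Y"))

# Multiple linear regression on a selection of the components.
fit <- lm(Value ~ DayOfMonth + DayOfWeek + Month + Year, data = df)
coef(fit)
```

Week and Quarter are computed but left out of the model here because they overlap heavily with Month; which components to keep depends on the data.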

Last but not least, time-series prediction would be the most correct way to go (e.g. auto.arima() from the forecast package).