
I am checking whether there are annual trends in my data. I am running linear regressions of R on year and of H on year, as well as between R and H.

However, when I regress R against year (with year as the predictor), I get NA for the F-statistic and the p-value. When year is the response (y), the code works. I would like to know why the linear model only works one way, and whether using year as y is valid for data analysis in this case. Thank you in advance.

DATA:

year R H
2000 160 140
2001 178 153
2002 149 138
2003 161 149
2004 180 173
2005 150 142
2006 158 130
2007 149 190
2008 167 200
2009 172 204

Code:

#this has lots of NA outputs
linearmodel<-lm(data$R ~ data$year)
linearmodel
summary(linearmodel)

#this gives output statistics
linearmodel<-lm(data$year ~ data$R)
linearmodel
summary(linearmodel)

Thank you again.

1
If I construct the data frame from scratch with the same data, I get no problems, so the issue may be in the data frame you are working with. Are you sure the year values are numeric and not strings? - Giulio Mattolin
What is the output of str(data)? If year is a factor, then it is not possible to run lm with year as the dependent variable. - Basti
Thank you for trying to replicate. The output of str() shows every column as numeric, but year has this... $ year: chr [1:10]. I converted the data frame into a new CSV file and now get no errors. I assume the issue was with the data frame, as it had been rearranged and reformatted a number of times. Thanks again both! - Emma

1 Answer


I can't reproduce this. Neither model produces NA values in its output, and the same is true for the summary() of both models.

data_68544559 <- data.frame(
  year = 2000:2009,
  R = c(160, 178, 149, 161, 180, 150, 158, 149, 167, 172)
)

lm(R ~ year, data_68544559)
#> 
#> Call:
#> lm(formula = R ~ year, data = data_68544559)
#> 
#> Coefficients:
#> (Intercept)         year  
#>   259.58788     -0.04848
lm(year ~ R, data_68544559)
#> 
#> Call:
#> lm(formula = year ~ R, data = data_68544559)
#> 
#> Coefficients:
#> (Intercept)            R  
#>   2.005e+03   -3.316e-03
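
The comments on the question report that year was stored as character (chr) in the original data frame. The following is a minimal sketch of how that could lead to the NA statistics, assuming the column looked like the one built here; the names d_chr, fit_chr and fit_num are made up for illustration. lm() treats a character predictor as a factor, so ten distinct years use up all ten observations and leave no residual degrees of freedom.

```r
# Hypothetical reconstruction: same R values, but year stored as character
d_chr <- data.frame(
  year = as.character(2000:2009),
  R = c(160, 178, 149, 161, 180, 150, 158, 149, 167, 172),
  stringsAsFactors = FALSE
)

str(d_chr)  # the year column is listed as chr

# A character predictor is converted to a factor with one level per year,
# so the model estimates as many coefficients as there are rows
fit_chr <- lm(R ~ year, data = d_chr)
length(coef(fit_chr))
df.residual(fit_chr)  # no residual degrees of freedom remain

# Converting year to numeric gives a single slope for the annual trend
d_chr$year <- as.numeric(d_chr$year)
fit_num <- lm(R ~ year, data = d_chr)
df.residual(fit_num)
summary(fit_num)
```

If year had been a factor rather than character, as.numeric(as.character(year)) would be needed instead, because as.numeric() on a factor returns the internal level codes.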

Changing the format of the time column to POSIXct also gives no NAs in the fit, but summary() throws an error because it tries to square a difftime object.

summary(lm(ISOdate(year, 1, 1) ~ R, data_68544559))
#> Error in Ops.difftime((f - mean(f)), 2) : 
#>   '^' not defined for "difftime" objects
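
One possible way around that error, sketched here under the assumption that a plain numeric time axis is acceptable for the analysis, is to convert the POSIXct values with as.numeric() before fitting. The column name time_num and the object names are made up for illustration; the response is then measured in seconds since 1970-01-01, so the coefficients are on that scale rather than in years.

```r
# Same data as in the reproduction above, rebuilt so the block is self-contained
d_time <- data.frame(
  year = 2000:2009,
  R = c(160, 178, 149, 161, 180, 150, 158, 149, 167, 172)
)

# Hypothetical helper column: POSIXct converted to seconds since the epoch
d_time$time_num <- as.numeric(ISOdate(d_time$year, 1, 1))

# With a numeric response, summary() can compute its statistics
fit_time <- lm(time_num ~ R, data = d_time)
s_time <- summary(fit_time)
s_time$fstatistic
```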