I would like to run a time series regression in which each column of a data frame is a separate dependent variable, regressing each column on the same set of independent variables. I know you can just use
lm(dataframe~independent variables)
because if the dependent variable is a matrix, lm fits a separate regression for each column.
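As a minimal sketch of that behaviour on complete data (the data and the names `x1`, `x2`, `Y` are made up for illustration, not taken from my real dataset):

```r
# Simulated, complete (no NA) data; all names here are hypothetical.
set.seed(1)
n  <- 20
x1 <- rnorm(n)
x2 <- rnorm(n)

# Two dependent variables stored as the columns of a matrix.
Y <- cbind(a =  1 + 2.0 * x1 + rnorm(n),
           b = -1 + 0.5 * x2 + rnorm(n))

# With a matrix on the left-hand side, lm fits one regression per column.
# (A data frame would first need as.matrix().)
fit <- lm(Y ~ x1 + x2)

# One column of coefficients per dependent variable.
coef(fit)
```

The result is a multi-response fit, and `coef(fit)` has one row per term and one column per dependent variable.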
However, my dependent variables are information about stocks through time, and sometimes information is not available for every single stock at every time point, so I have some NA values. The problem is that lm has to omit the NA values, i.e. it removes the whole row when running the regression. This is fine if I only want to run a regression on one dependent variable, but I have a list (1000+) of dependent variables that I would like to run my regression on. Because my dataset covers 15+ years, there are missing values at every single time point, so when I run my lm regression I get an error because lm has removed every single row. The only way I can think of to solve this problem is to write a for loop and run a separate regression for each stock, which I think will take a very long time to compute. For example, the following is a sample of my data:
135081(P) 135084(P) 135090(P)
1994-12-30 NA NA NA
1995-01-02 NA NA NA
1995-01-03 06864935 NA NA
1995-01-04 NA NA -0.05474644
1995-01-05 NA NA 0.20894900
1995-01-06 NA -0.45672832 -0.02378632
So if I run a time series regression on this, I get an error because lm drops every single row.
So my question is, would there be another way to run a time series regression across a data frame with different DEPENDENT variables where the regression "skips" the NA for just the one particular dependent variable instead of skipping it for every other dependent variable as well?
I don't think using na.omit is correct because it removes the time series properties of my dataset, and using na.action=NULL doesn't work because I have NA values in my dataset. Thank you very much for your help.
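To make the problem and the per-stock loop concrete, here is a rough sketch on simulated data. The names `X`, `Y`, `s1` to `s3`, `x1` and `x2` are hypothetical, and using `na.exclude` is only an assumption about one way to keep residuals aligned with the original dates. It is not necessarily the best approach.

```r
# Simulated stand-in for the real data; all names are hypothetical.
set.seed(42)
n <- 30
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- data.frame(s1 =  0.5 * X$x1 + rnorm(n),
                s2 = -0.3 * X$x2 + rnorm(n),
                s3 = rnorm(n))

# Punch holes so that every row has at least one NA, as in my data.
Y$s1[1:10]           <- NA
Y$s2[c(1:5, 20:30)]  <- NA
Y$s3[11:19]          <- NA

# The matrix-response call fails: row-wise NA removal leaves no rows.
all_at_once <- try(lm(as.matrix(Y) ~ x1 + x2, data = X), silent = TRUE)
failed <- inherits(all_at_once, "try-error")

# Per-column fits: each regression drops only the rows where its own
# dependent variable is NA. na.exclude pads residuals back to length n.
fits <- lapply(Y, function(y) {
  lm(y ~ x1 + x2, data = X, na.action = na.exclude)
})

coefs  <- sapply(fits, coef)  # terms in rows, stocks in columns
n_used <- sapply(fits, nobs)  # observations actually used per stock
```

Here `coefs` collects the estimates in one matrix, and `n_used` shows that each stock keeps its own set of non-missing rows instead of losing rows because of NAs in other stocks.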
na.action=NULL within the lm function? - holzben
apply(dataframe, 2, function(x) x[!is.na(x)]). This will return a list with the non-NA values for each column, but then you must also index your independent variables according to the dependent.... - holzben
lm and ended up working with a consultant to run a Mann-Kendall test, for which there is now a package: Kendall. - Nazer