I'm currently migrating from MATLAB to R, and I'm trying to find out whether what I want to do is possible.
I want to estimate a non-linear model in R where the observations are US states. The wrinkle is that one of the independent variables is a state-level index over counties, calculated using a parameter to be estimated, i.e. the model looks like this:
log(Y_s) = log(phi) + log(f(theta, X_cs)) + u_s
where Y_s is a state-level variable and X_cs is a vector containing county-level observations of a variable within the state, and f() returns a scalar value of the index calculated for the state.
So far I've tried using R's nls function, transforming the data as it's passed to the function. Abstracting from the details of the index, a simpler version of the code looks like this:
library(dplyr)

state <- c("AK", "AK", "CA", "CA", "MA", "MA", "NY", "NY")
Y <- c(3, 3, 5, 5, 6, 6, 4, 4)
X <- c(4, 5, 2, 3, 3, 5, 3, 7)
Sample <- data.frame(state, Y, X)

f <- function(data, theta) {
  output <- data %>%
    group_by(state) %>%
    summarise(index = mean(X**theta),
              Y = mean(Y))
}

model <- nls(Y ~ log(phi) + log(index),
             data = f(Sample, theta),
             start = list(phi = exp(3), theta = 1.052))
This returns an error, telling me that the gradient is singular. My guess is it's because R can't see how the parameter theta should be used in the formula.
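A quick check that seems consistent with this guess: the formula handed to nls never mentions theta, because the index is computed once in the data argument before any fitting happens. The snippet below only inspects the formula from my code above; it is an illustration of the suspicion, not a confirmed diagnosis of the error.

```r
# Names that appear in the model formula (function names such as log are excluded).
vars <- all.vars(Y ~ log(phi) + log(index))
print(vars)

# theta is not among them, so the formula itself gives nls nothing to
# differentiate with respect to theta.
"theta" %in% vars
```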
Is there a way to do this using nls? I know I could define the criterion function manually, i.e. the sum over states of the squared residuals log(Y_s) - log(phi) - log(f(theta, X_cs)), and use a minimisation routine to estimate the parameter values. But I want to use the post-estimation features of nls, such as confidence intervals for the parameter estimates. Any help much appreciated.
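For reference, here is a minimal sketch of the manual fallback I have in mind, using base R's optim on the toy data. Several things here are my own assumptions for illustration: the helper names state_index and ssr, estimating log(phi) directly rather than phi (to avoid constraining phi to be positive), the arbitrary starting values, and taking the log of the state-level Y to match the model as written.

```r
# Toy data from the question: two counties per state.
state <- c("AK", "AK", "CA", "CA", "MA", "MA", "NY", "NY")
Y <- c(3, 3, 5, 5, 6, 6, 4, 4)
X <- c(4, 5, 2, 3, 3, 5, 3, 7)

# Hypothetical helper: simplified state-level index for a given theta,
# i.e. the mean of X^theta within each state.
state_index <- function(theta) tapply(X^theta, state, mean)

# State-level response (Y is constant within a state in this toy data).
log_Y <- log(tapply(Y, state, mean))

# Hypothetical criterion: sum of squared residuals, with par = c(log(phi), theta).
ssr <- function(par) {
  resid <- log_Y - par[1] - log(state_index(par[2]))
  sum(resid^2)
}

# Arbitrary starting values, chosen only for illustration.
start <- c(log_phi = 1, theta = 1)
fit <- optim(start, ssr, method = "BFGS")
fit$par
```

This gives point estimates only, which is exactly why I would prefer to stay within nls if it can be made to work.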