0
votes

I want to run a monte carlo simulation for a multiple regression model that predicts mpg and then evaluate how many times each car has a better performance (lower mpg) than the other. This is what I got so far

library(pacman)
pacman::p_load(data.table, fixest, stargazer, dplyr, magrittr)

df <- mtcars
fit <- lm(mpg~cyl + hp, data = df)
fit$coefficients[1]

beta_0 = fit$coefficients[1] # Intercept 
beta_1 = fit$coefficients[2] # Slope (cyl)
beta_2 = fit$coefficients[3] # slope (hp)
set.seed(1)  # Seed
n = 1000     # Sample size
M = 500      # Number of experiments/iterations

## Storage 
slope_DT <- rep(0,M)
slope_DT_2 <- rep(0,M)
intercept_DT <- rep(0,M)

## Begin Monte Carlo

for (i in 1:M){ #  M is the number of iterations
  
  # Generate data
  U_i = rnorm(n, mean = 0, sd = 2) # Error
  X_i = rnorm(n, mean = 5, sd = 5) # Independent variable
  Y_i = beta_0 + beta_1*X_i + beta_2*X_i +U_i  # Dependent variable
  
  # Formulate data.table
  data_i = data.table(Y = Y_i, X = X_i)
  
  # Run regressions
  ols_i <- fixest::feols(data = data_i, Y ~ X)
  
  # Extract slope coefficient and save
  slope_DT_2[i] <- ols_i$coefficients[3]
  slope_DT[i] <- ols_i$coefficients[2]
  intercept_DT[i] <- ols_i$coefficients[1]
  
}


# Summary statistics
estimates_DT <- data.table(beta_2 = slope_DT_2,beta_1 = slope_DT, beta_0 = intercept_DT)

This code does not create any coefficients for hp I want to know how to add coefficients to the model and then predict results and test how many times one car has lower mpg than the other. For example how many times Mazda RX4 has a lower predicted mpg than Datsun 710. Some idea on how can make this work? Thank you

1
I think its just the typo. At the end of each iteration, you overwrite slope_DT. One of these has to be slope_DT_2.Jonas
Hi. Thanks for the observation I fix the type but still doesn't give the results I need. Any idea on how to solve this problem?Pastor Soto
Your are creating only one independnet variable X_i. I think you should create two of those, one for cyl (X_i_1) and one fpr hp (X_i_2). Then you can do data_i = data.table(Y = Y_i, X1 = X_i_1,X2=X_i_2) and the formula becomes ols_i <- fixest::feols(data = data_i, Y ~ X1 + X2). Then you also get three coefficeints instead of two which is the case in the current code.Jonas
That works thank you!. Now how can I use this to make predictions and evaluate how many times one car has a lower mpg than the other?Pastor Soto
So do you want to compare every pair of available cars?Jonas

1 Answers

1
votes

Like ive noted in my comment, you shuld use two independent variables. Moreover, I would like to sugest you the lapply-function, which makes code more short, since you don't need the initialization/Storage part.

estimates_DT <- do.call("rbind",lapply(1:M, function(i) {
  # Generate data
  U_i = rnorm(n, mean = 0, sd = 2) # Error
  X_i_1 = rnorm(n, mean = 5, sd = 5) # First independent variable
  X_i_2 = rnorm(n, mean = 5, sd = 5) #Second ndependent variable
  Y_i = beta_0 + beta_1*X_i_1 + beta_2*X_i_2 + U_i  # Dependent variable

  # Formulate data.table
  data_i = data.table(Y = Y_i, X1 = X_i_1, X2 = X_i_2)
  
  # Run regressions
  ols_i <- fixest::feols(data = data_i, Y ~ X1 + X2)  
  ols_i$coefficients
}))

estimates_DT <- setNames(data.table(estimates_DT),c("beta_0","beta_1","beta_2"))

To compare the predictions of the two cars, define the following function taking the two carnames you want to comapre as arguemnt:

compareCarEstimations <- function(carname1="Mazda RX4",carname2="Datsun 710") {
  car1data <- mtcars[rownames(mtcars) == carname1,c("cyl","hp")]
  car2data <- mtcars[rownames(mtcars) == carname2,c("cyl","hp")]
  
  predsCar1 <- estimates_DT[["beta_0"]] + car1data$cyl*estimates_DT[["beta_1"]]+car1data$hp*estimates_DT[["beta_2"]]
  predsCar2 <- estimates_DT[["beta_0"]] + car2data$cyl*estimates_DT[["beta_1"]]+car2data$hp*estimates_DT[["beta_2"]]
  
  list(
    car1LowerCar2 = sum(predsCar1 < predsCar2),
    car2LowerCar1 = sum(predsCar1 >= predsCar2)
  )
}

Make sure the names provided as argument are valid names, e.g. are in rownames(mtcars).