I am trying to fit an lm/glm relating two variables, "area" and "intensity".

I fitted a linear regression on all rows combined and got the summary below. I want to fit the same model separately for each city (A/B/C/D/E). How can I modify or loop the script so that I do not have to run it 5 times, and so that the r-squared value and model results are added to the data frame?

R1 <- lm(formula = area ~ intensity, data = df1)

Call:
lm(formula = area ~ intensity, data = df1)

Residuals:
    Min      1Q  Median      3Q     Max
-2716.1 -1540.5  -684.3  1588.8  2686.8

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1646.30     569.73   2.890  0.00976 **
intensity    -333.10      42.73  -7.795 3.54e-07 ***

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1790 on 18 degrees of freedom
Multiple R-squared:  0.7715, Adjusted R-squared:  0.7588
F-statistic: 60.77 on 1 and 18 DF,  p-value: 3.537e-07

1 Answer

Here is a way to store the full model output for each city in a list, and also a way to put the results into the original data frame:

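results <- list()                          # one model summary per city, keyed by city name
cities <- unique(as.character(df1$city))   # as.character so a factor column also gives usable names
df1$predicted <- NA_real_                  # columns to be filled in city by city
df1$r_square <- NA_real_
# Assumes df1$city itself has no missing values
for (city in cities) {
  idx <- df1$city == city                  # rows belonging to this city
  # na.exclude pads fitted() with NA for rows that have missing area/intensity
  R1 <- lm(area ~ intensity, data = df1[idx, ], na.action = na.exclude)
  s <- summary(R1)
  results[[city]] <- s                     # if you want to store everything
  # Assign by row index so each row gets its own fitted value;
  # merging on city alone would pair every row with every fitted value of that city
  df1$predicted[idx] <- fitted(R1)
  df1$r_square[idx] <- s$r.squared         # the city's R-squared, repeated on each of its rows
}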
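
If a compact table with one row per city is more useful than repeating the values on every row, something along the lines of the base R sketch below might work. The `toy` data frame is made-up data for illustration only (it is not the data from the question), and the helper name `fit_one` and the column names `intercept`, `slope`, `r_squared` and `p_value` are my own choices, so adjust them as needed.

```r
# Made-up example data (NOT the data from the question): 3 cities, 6 rows each
set.seed(1)
toy <- data.frame(
  city      = rep(c("A", "B", "C"), each = 6),
  intensity = rep(1:6, times = 3)
)
toy$area <- 100 - 5 * toy$intensity + rnorm(nrow(toy))

# Hypothetical helper: fit one model on a single city's rows and
# return the key numbers as a one-row data frame
fit_one <- function(d) {
  s <- summary(lm(area ~ intensity, data = d))
  data.frame(
    city      = d$city[1],
    intercept = s$coefficients["(Intercept)", "Estimate"],
    slope     = s$coefficients["intensity", "Estimate"],
    r_squared = s$r.squared,
    p_value   = s$coefficients["intensity", "Pr(>|t|)"]
  )
}

# Split by city, fit each piece, and stack the one-row results
city_stats <- do.call(rbind, lapply(split(toy, toy$city), fit_one))
rownames(city_stats) <- NULL
city_stats
```

Because `city_stats` has exactly one row per city, `merge(toy, city_stats, by = "city")` would attach these columns to the original rows without changing the row count. For the data in the question, `df1` would take the place of `toy`.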