Stan Polynomial Regression Parameter Estimation Model Review

Question

I have the following polynomial regression model:

Image Version

LaTeX Version

$Y_i | \mu_i, \sigma^2 \sim \text{Normal}(\mu_i, \sigma^2), i = 1, \dots, n \ \text{independent}$

$\mu_i = \alpha + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i1}^2 + \beta_4 x_{i2}^2 + \beta_5 x_{i1} x_{i2}$

$\alpha \sim \text{some suitable prior}$

$\beta_1, \dots, \beta_5 \sim \text{some suitable priors}$

$\sigma^2 \sim \text{some suitable prior}$

I want to take as input the sample size and the vectors of observations on $y_i$, $x_{i1}$, and $x_{i2}$. The code for this is as follows:

data{
  int<lower=1> n;
  vector[n] x1;
  vector[n] x2;
  vector[n] y;
}

I want to standardise (centre and scale) the two input variables to obtain the standardised regressor variables x1_std and x2_std. The code for this is in the transformed data block, as follows:

transformed data{
  real bar_x1;
  real x1_sd;
  vector[n] x1_std;
  real bar_x2;
  real x2_sd;
  vector[n] x2_std;
  real y_sd;

  bar_x1 = mean(x1);
  x1_sd = sd(x1);
  x1_std = (x1 - bar_x1)/x1_sd; // centered and scaled

  bar_x2 = mean(x2);
  x2_sd = sd(x2);
  x2_std = (x2 - bar_x2)/x2_sd; // centered and scaled

  y_sd = sd(y);
}

I then want to fit the above polynomial regression model using the standardised regressor variables and return estimates for the regression parameters $\alpha$, $\beta_1$ and $\dots, \beta_5$, on both the original and standardised scale.

Based on this, if I am not mistaken, the transformation formulae from the standardised parameters to the original scale are as follows:

Image Version

LaTeX Version

$\alpha = \tilde{\alpha} - \dfrac{\gamma_1}{s_1}\bar{x}_1 - \dfrac{\gamma_2}{s_2}\bar{x}_2 + \dfrac{\gamma_3}{s_1^2}\bar{x}_1^2 + \dfrac{\gamma_4}{s_2^2}\bar{x}_2^2 + \dfrac{\gamma_5}{s_1 s_2}\bar{x}_1\bar{x}_2$

$\beta_1 = \left( \dfrac{\gamma_1}{s_1} - 2\dfrac{\gamma_3}{s_1^2}\bar{x}_1 - \dfrac{\gamma_5}{s_1 s_2}\bar{x}_2 \right)$

$\beta_2 = \left( \dfrac{\gamma_2}{s_2} - 2\dfrac{\gamma_4}{s_2^2}\bar{x}_2 - \dfrac{\gamma_5}{s_1 s_2}\bar{x}_1 \right)$

$\beta_3 = \dfrac{\gamma_3}{s_1^2}$

$\beta_4 = \dfrac{\gamma_4}{s_2^2}$

$\beta_5 = \dfrac{\gamma_5}{s_1 s_2}$

The code implementing this is contained in the generated quantities block as follows:

alpha = alpha_std - beta1_std*bar_x1/x1_sd - beta2_std*bar_x2/x2_sd
      + (beta3_std*bar_x1^2)/x1_sd^2 + (beta4_std*bar_x2^2)/x2_sd^2
      + (beta5_std*bar_x2*bar_x1)/(x1_sd*x2_sd);

beta1 = beta1_std/x1_sd - 2*beta3_std*bar_x1/x1_sd^2
      - beta5_std*bar_x2/(x1_sd*x2_sd);

beta2 = beta2_std/x2_sd - 2*beta4_std*bar_x2/x2_sd^2
      - beta5_std*bar_x1/(x1_sd*x2_sd);

beta3 = beta3_std/x1_sd^2;

beta4 = beta4_std/x2_sd^2;

beta5 = beta5_std/(x1_sd*x2_sd);

My entire model is as follows:

data{
  int<lower=1> n;
  vector[n] x1;
  vector[n] x2;
  vector[n] y;
}
transformed data{
  real bar_x1;
  real x1_sd;
  vector[n] x1_std;
  real bar_x2;
  real x2_sd;
  vector[n] x2_std;
  real y_sd;

  bar_x1 = mean(x1);
  x1_sd = sd(x1);
  x1_std = (x1 - bar_x1)/x1_sd; // centered and scaled

  bar_x2 = mean(x2);
  x2_sd = sd(x2);
  x2_std = (x2 - bar_x2)/x2_sd; // centered and scaled

  y_sd = sd(y);
}
parameters{
  real<lower=0> sigma;
  real alpha_std;
  real beta1_std;
  real beta2_std;
  real beta3_std;
  real beta4_std;
  real beta5_std;
}
transformed parameters {
  real mu[n];

  for(i in 1:n) {
    mu[i] = alpha_std + beta1_std*x1_std[i]
      + beta2_std*x2_std[i] + beta3_std*x1_std[i]^2
      + beta4_std*x2_std[i]^2 + beta5_std*x1_std[i]*x2_std[i];
  }
}
model{
  alpha_std ~ normal(0, 10);
  beta1_std ~ normal(0, 2.5);
  beta2_std ~ normal(0, 2.5);
  beta3_std ~ normal(0, 2.5);
  beta4_std ~ normal(0, 2.5);
  beta5_std ~ normal(0, 2.5);
  sigma ~ exponential(1 / y_sd);

  y ~ normal(mu, sigma);
}
generated quantities {
  real alpha;
  real beta1;
  real beta2;
  real beta3;
  real beta4;
  real beta5;
  
  alpha = alpha_std - beta1_std*bar_x1/x1_sd - beta2_std*bar_x2/x2_sd
      + (beta3_std*bar_x1^2)/x1_sd^2 + (beta4_std*bar_x2^2)/x2_sd^2
      + (beta5_std*bar_x2*bar_x1)/(x1_sd*x2_sd);

  beta1 = beta1_std/x1_sd - 2*beta3_std*bar_x1/x1_sd^2
      - beta5_std*bar_x2/(x1_sd*x2_sd);

  beta2 = beta2_std/x2_sd - 2*beta4_std*bar_x2/x2_sd^2
      - beta5_std*bar_x1/(x1_sd*x2_sd);

  beta3 = beta3_std/x1_sd^2;

  beta4 = beta4_std/x2_sd^2;

  beta5 = beta5_std/(x1_sd*x2_sd);
}

I am using the hills data set from R's MASS package:

library(MASS)
hills[18, 3] <- 18.65 # Fixing transcription error
x1 <- hills$dist
x2 <- hills$climb
y <- hills$time
n <- length(x1)
data.in <- list(x1 = x1, x2 = x2, y = y, n = n)
model.fit <- sampling(example, data.in)

And now I output the standardised (alpha_std, beta1_std, beta2_std, beta3_std, beta4_std, beta5_std) and original scale (alpha, beta1, beta2, beta3, beta4, beta5) regression parameters:

print(model.fit, pars = c("alpha_std", "alpha", "beta1_std", "beta2_std", "beta3_std", "beta4_std", "beta5_std", "beta1", "beta2", "beta3", "beta4", "beta5", "sigma"), probs = c(0.05, 0.5, 0.95), digits = 5)

Have I gone about this correctly? I've also double- and triple-checked the mathematics, so I think it should be correct. Despite this, one thing that I am nervous about is the fact that beta4 is 0.00000. Is this an indication that I have made an error? As I said, I've gone over all of my code and mathematics, so, as far as I can tell, everything seems to be fine.

With beta4 = beta4_std/x2_sd^2;, given that beta4_std is non-zero, the only way beta4 is going to round to zero is if x2_sd^2 is very large. Is that what you'd expect? Usually when we standardize, we use a standard normal z ~ normal(0, 1) and then translate and scale by mean and sd, mu + sigma * z. I'm not sure why you're dividing by a square if you want to standardize. The original formulation is relatively simple, so I'd suggest starting there and then reparameterizing the model piece by piece. — Bob Carpenter
@BobCarpenter Hey Bob. Thanks for the response. I managed to find the error (or, rather, non-error); see my answer to the same post on the Stan forum. By the way, just a friendly heads-up, numerous people in my Bayesian class were experiencing the same Stan bug on their Mac machines: github.com/stan-dev/rstan/issues/524 So it seems to be a somewhat widespread bug. Not sure if you'll find this information useful, but I thought I'd just bring it up. — The Pointer
Thanks for letting me know where people are having problems. There's a solution at the end of that thread, but these kinds of install problems with Mac and Windows can be a nightmare because of the indirection. RStudio may soon build in the devtools, including a modern C++ toolchain based on Rtools, which should solve this problem for most users. — Bob Carpenter

The Pointer The Pointer · Accepted Answer · 2018-05-17T20:00:04

Ok, I just found out that the issue was that I wasn’t printing the values with enough digits (5 is insufficient) to see that the value is not 0.00000. Everything else is fine.

Stan Polynomial Regression Parameter Estimation Model Review

1 Answers