1
votes

I'm trying to replicate the ordered probit JAGS model in John Kruschke's "Doing Bayesian Analysis" (p. 676) in Stan:

JAGS model:

model {
    for ( i in 1:Ntotal ) {
      y[i] ~ dcat( pr[i,1:nYlevels] )
      pr[i,1] <- pnorm( thresh[1] , mu , 1/sigma^2 )
      for ( k in 2:(nYlevels-1) ) {
        pr[i,k] <- max( 0 ,  pnorm( thresh[ k ] , mu , 1/sigma^2 )
                           - pnorm( thresh[k-1] , mu , 1/sigma^2 ) )
      }
      pr[i,nYlevels] <- 1 - pnorm( thresh[nYlevels-1] , mu , 1/sigma^2 )
    }
    mu ~ dnorm( (1+nYlevels)/2 , 1/(nYlevels)^2 )
    sigma ~ dunif( nYlevels/1000 , nYlevels*10 )
    for ( k in 2:(nYlevels-2) ) {  # 1 and nYlevels-1 are fixed, not stochastic
      thresh[k] ~ dnorm( k+0.5 , 1/2^2 )
    }
  }

So far, I have the following that runs, but isn't producing the same results as what's in the book. Stan model:

data{
  int<lower=1> n; // number of obs
  int<lower=3> n_levels; // number of categories

  int y[n]; // outcome var 
}

parameters{
  real mu; // latent mean
  real<lower=0> sigma; // latent sd
  ordered[n_levels] thresh; // thresholds

}

model{
  vector[n_levels] pr[n];

  mu ~ normal( (1+n_levels)/2 , 1/(n_levels)^2 );
  sigma ~ uniform( n_levels/1000 , n_levels*10 );


  for ( k in 2:(n_levels-2) ) // 1 and nYlevels-1 are fixed, not stochastic
    thresh[k] ~ normal( k+0.5 , 1/2^2 );

  for(i in 1:n) {

    pr[i, 1] = normal_cdf(thresh[1], mu, 1/sigma^2);

    for (k in 2:(n_levels-1)) {
      pr[i, k] = max([0.0, normal_cdf(thresh[k], mu, 1/sigma^2) - normal_cdf(thresh[k-1], mu, 1/sigma^2)]);
    }

    pr[i, n_levels] = 1 - normal_cdf(thresh[n_levels - 1], mu, 1/sigma^2);

    y[i] ~ categorical(pr[i, 1:n_levels]);
  }

}

The data is here:

list(n = 100L, n_levels = 7, y = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 7L))

Should recover a mu of 1.0 and sigma of 2.5. Instead, I'm getting mu of 3.98 and sigma of 1.25.

I'm sure I'm doing something wrong in the Stan model, but I'm very much a beginner and not sure what to do next. Thanks!

1
One basic thing to check is whether you're specifying your normal distributions right - "unlike JAGS, Stan defines the normal distribution in terms of the mean and standard deviation, not the mean and precision" (from: ling.uni-potsdam.de/~vasishth/JAGSStanTutorial/…), so you want to be doing something like normal(mean, sd) instead of normal(mean, 1 / sd^2).Marius
You also have to be careful with identifiability in these models. You can't have an intercept and completely varying cutpoints. Did you run multiple chains and get Rhat near 1 and decent effective sample size?Bob Carpenter
Thanks @Marius ! I've posted a new model as an answer that uses your suggestions.Nick DiQuattro
Thanks @BobCarpenter ! I've posted a new model as an answer that uses your suggestions.Nick DiQuattro

1 Answers

0
votes

Update: After scouring the internet (special thanks to Conor Goold) I've come up with this solution that replicates the book results pretty closely. Of course, any feedback/better model factorization would earn an accepted answer still!

data {
  real L;                     // Lower fixed thresholds
  real<lower=L> U;            // Upper fixed threshold

  int<lower=2> J;             // Number of outcome levels

  int<lower=0> N;             // Data length

  int<lower=1,upper=J> y[N];  // Ordinal responses
}

transformed data {
  real<lower=0> diff;         // difference between upper and lower fixed thresholds
  int<lower=1> K;             // Number of thresholds

  K = J - 1;
  diff = U - L;
}

parameters {
  simplex[K - 1] thresh_raw;      // raw thresholds
  real mu; // latent mean
  real<lower=0> sigma; // latent sd
}

transformed parameters {
  ordered[K] thresh;     // new thresholds with fixed first and last

  thresh[1] = L;
  thresh[2:K] = L + diff * cumulative_sum(thresh_raw);
  thresh[K] = U; // Overwrite last value to fix it
}

model{
  vector[J] theta;                  // local parameter for ordinal categories

  //priors
  mu ~ normal( (1+J)/2.0 , J );
  sigma ~ uniform( J/1000.0 , J*10 );

  for (i in 2:K-2)
    thresh[i] ~ normal(i + .05, 2);

  // likelihood statement
  for(n in 1:N) {

    // probability of ordinal category definitions
    theta[1] = normal_cdf( thresh[1] , mu, sigma );

    for (l in 2:K)
      theta[l] = fmax(0, normal_cdf(thresh[l], mu, sigma ) - normal_cdf(thresh[l-1], mu, sigma));

    theta[J] = 1 - normal_cdf(thresh[K] , mu, sigma);

    y[n] ~ categorical(theta);
  }
}