Monthly Archives: September 2012

rjags Updated and Numbering Corrected

On September 18th, Martyn Plummer updated rjags. Normally, this wouldn’t be a big deal, but a question came across the R-sig-debian mailing list about the package at the same time, and I noticed that my Ubuntu 12.04 (precise) machine was using version 3.3 from the Ubuntu universe repository and not the 3.5 version that is available at the RRutter or c2d4u PPAs.

Turns out the version numbering system used by Martyn is not compatible with the Ubuntu numbering system used for the package in the universe repository. The universe package is numbered “3.3” while Matryn’s numbering format uses all dashes, “3-5”. The debian version of the package has not been updated since 3.3, so the source package used to create my update was genereated via the cran2deb4ubuntu project and kept Martyn’s “-” version numbering.

Even though rjags 3-5 was available in the PPA, rjags 3.3 was considered a more recent version by Ubuntu and the package was never upgraded. It is possible to use apt pinning or synaptic to force the “newer” version, but I have correct the numbering issue and the newest version of r-cran-jags is numbered “3.7” and will update automatically.

Not sure if any other packages on c2d4u have similar issues. If anybody spots one, please let me know.

Comparing Stan to JAGS for Bayesian Inference (Part 1?)

Stan is a new, open source, Bayesian inference tool. Stan is based on the the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. In order to compare Stan with JAGS, a gibbs sampler approach to Bayesian inference, I used the classic WinBUGS rats example. The rats example uses a hierarchical model to look at the growth curves of 30 rats.

In this example, I used R with RStan and R2jags packages. I used the code on the RStan Getting Started and the examples from the JAGS sourceforge page as starting points. I used a very large number of samples to test run times. Data and model files needed are below.

rats.csv

rats.bug

rats.txt

rats.stan

Both RStan and R2jags use the same list structure for the model data. In in R2jags, you need to specify which variables you wish to monitor, while RStan defaults to all of the variables. Both require an external model file and have very similar commands to fit the model. So, within R, both packages are very similar.

When you run the code, you will note that JAGS starts the sampling immediately, while Stan requires a lengthy compilation step. You will also see that I used two different calls to “stan”.

 stan.fit <- stan(file = 'rats.stan', data = rats_dat, verbose = FALSE, warmup=1000,iter = 11000, n_chains = 3) stan.fit2 <- stan(fit=stan.fit, data = rats_dat, verbose = FALSE, warmup=1000,iter = 11000, n_chains = 3) 

The first compiles the model and runs the sampler, while the second uses the model from the first command. This eliminates the need to recompile the model, saving a good chunk of time. Notice the first argument has changed from “file=” to “fit=”. I also discovered that adding the “thin” option to either “stan” call creates a segfault, which I have reported the the Stan maintainers.

The answers that each program produced were identical (to the first decimal, at least). Both gave means, standard deviations, and key quantiles of the posterior distributions of the parameters, as well as R-hat, a measure of convergence you would like close to 1. The plots are slightly different, with RStan color coding the R-hat value vs. reporting it for R2jags. The output from RStan does not include a DIC value, at least by default.

R2jags Output

Inference for Bugs model at "rats.bug", fit using jags, 3 chains, each with 10000 iterations (first 1000 discarded), n.thin = 9 n.sims = 3000 iterations saved mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat alpha.c 242.390 3.712 237.001 240.654 242.431 244.237 247.839 1.129 alpha0 106.316 4.508 99.444 103.915 106.331 108.864 113.246 1.043 beta.c 6.185 0.109 5.968 6.116 6.185 6.258 6.394 1.002 sigma 6.136 1.182 5.266 5.776 6.065 6.395 7.088 1.001 tau.alpha 0.005 0.001 0.003 0.004 0.005 0.006 0.008 1.001 tau.beta 4.137 1.538 1.904 3.073 3.902 4.904 7.857 1.001 tau.c 0.027 0.004 0.020 0.024 0.027 0.030 0.036 1.001 deviance 967.610 23.168 941.121 957.316 966.043 976.319 998.467 1.001 n.eff alpha.c 3000 alpha0 3000 beta.c 1400 sigma 3000 tau.alpha 3000 tau.beta 3000 tau.c 3000 deviance 3000 For each parameter, n.eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor (at convergence, Rhat=1). DIC info (using the rule, pD = var(deviance)/2) pD = 268.5 and DIC = 1236.1 DIC is an estimate of expected predictive error (lower deviance is better). 

plot of chunk unnamed-chunk-1

RStan output

Inference for Stan model: anon_model. 3 chains: each with iter=11000; warmup=1000; thin=1; 11000 iterations saved. mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff alpha[1] 239.9 0.0 2.7 234.6 238.1 239.9 241.7 245.1 15672 alpha[2] 247.8 0.0 2.7 242.6 246.0 247.8 249.6 253.1 16865 alpha[3] 252.4 0.0 2.7 247.1 250.6 252.4 254.2 257.7 20956 alpha[4] 232.6 0.0 2.7 227.3 230.8 232.5 234.3 237.8 21050 alpha[5] 231.6 0.0 2.7 226.4 229.8 231.6 233.4 236.8 20250 alpha[6] 249.7 0.0 2.7 244.5 247.9 249.7 251.5 255.1 18570 alpha[7] 228.7 0.0 2.7 223.4 226.9 228.7 230.5 234.0 21257 alpha[8] 248.4 0.0 2.7 243.0 246.6 248.4 250.2 253.7 16278 alpha[9] 283.3 0.0 2.7 278.0 281.5 283.3 285.1 288.6 19244 alpha[10] 219.2 0.0 2.7 213.9 217.4 219.2 221.0 224.6 17471 alpha[11] 258.2 0.0 2.7 252.9 256.4 258.2 260.1 263.6 21935 alpha[12] 228.2 0.0 2.7 222.9 226.4 228.2 230.0 233.5 19933 alpha[13] 242.4 0.0 2.7 237.1 240.6 242.4 244.2 247.7 14641 alpha[14] 268.3 0.0 2.7 262.9 266.4 268.3 270.1 273.7 20241 alpha[15] 242.8 0.0 2.7 237.6 241.0 242.8 244.6 248.0 14321 alpha[16] 245.3 0.0 2.7 239.9 243.5 245.3 247.1 250.5 15314 alpha[17] 232.2 0.0 2.7 227.0 230.4 232.2 234.0 237.4 20874 alpha[18] 240.5 0.0 2.7 235.2 238.6 240.5 242.3 245.6 16301 alpha[19] 253.8 0.0 2.7 248.5 252.0 253.8 255.6 259.1 22001 alpha[20] 241.6 0.0 2.7 236.2 239.8 241.6 243.4 246.9 14330 alpha[21] 248.5 0.0 2.7 243.3 246.8 248.5 250.3 253.8 19178 alpha[22] 225.2 0.0 2.7 220.0 223.4 225.2 227.1 230.5 19358 alpha[23] 228.5 0.0 2.7 223.2 226.7 228.5 230.3 233.8 19742 alpha[24] 245.1 0.0 2.7 239.9 243.3 245.1 246.9 250.2 14794 alpha[25] 234.5 0.0 2.7 229.2 232.7 234.5 236.2 239.7 18778 alpha[26] 254.0 0.0 2.7 248.7 252.2 254.0 255.8 259.2 19957 alpha[27] 254.4 0.0 2.7 249.1 252.6 254.4 256.2 259.7 21171 alpha[28] 243.0 0.0 2.7 237.7 241.2 243.0 244.8 248.3 14643 alpha[29] 217.9 0.0 2.7 212.6 216.1 217.9 219.7 223.3 18663 alpha[30] 241.4 0.0 2.6 236.3 239.6 241.4 243.2 246.6 16161 beta[1] 6.1 0.0 0.2 5.6 5.9 6.1 6.2 6.5 1717 beta[2] 7.1 0.0 0.3 6.6 6.9 7.1 7.2 7.5 1052 beta[3] 6.5 0.0 0.2 6.0 6.3 6.5 6.6 7.0 1218 beta[4] 5.3 0.0 0.3 4.8 5.2 5.3 5.5 5.8 1187 beta[5] 6.6 0.0 0.2 6.1 6.4 6.6 6.7 7.0 1936 beta[6] 6.2 0.0 0.2 5.7 6.0 6.2 6.3 6.7 1651 beta[7] 6.0 0.0 0.2 5.5 5.8 6.0 6.1 6.4 1702 beta[8] 6.4 0.0 0.2 5.9 6.3 6.4 6.6 6.9 1926 beta[9] 7.1 0.0 0.3 6.6 6.9 7.1 7.2 7.6 1444 beta[10] 5.8 0.0 0.2 5.4 5.7 5.8 6.0 6.3 1514 beta[11] 6.8 0.0 0.2 6.3 6.6 6.8 7.0 7.3 780 beta[12] 6.1 0.0 0.2 5.7 6.0 6.1 6.3 6.6 1924 beta[13] 6.2 0.0 0.2 5.7 6.0 6.2 6.3 6.6 1283 beta[14] 6.7 0.0 0.3 6.2 6.5 6.7 6.9 7.2 1523 beta[15] 5.4 0.0 0.3 4.9 5.2 5.4 5.6 5.9 1000 beta[16] 5.9 0.0 0.2 5.4 5.8 5.9 6.1 6.4 1920 beta[17] 6.3 0.0 0.2 5.8 6.1 6.3 6.4 6.8 2247 beta[18] 5.8 0.0 0.2 5.4 5.7 5.8 6.0 6.3 1694 beta[19] 6.4 0.0 0.2 5.9 6.3 6.4 6.6 6.9 1184 beta[20] 6.1 0.0 0.2 5.6 5.9 6.1 6.2 6.5 1666 beta[21] 6.4 0.0 0.2 5.9 6.2 6.4 6.6 6.9 1618 beta[22] 5.9 0.0 0.2 5.4 5.7 5.9 6.0 6.3 2248 beta[23] 5.8 0.0 0.2 5.3 5.6 5.8 5.9 6.2 1597 beta[24] 5.9 0.0 0.2 5.4 5.7 5.9 6.0 6.4 2203 beta[25] 6.9 0.0 0.2 6.4 6.7 6.9 7.1 7.4 1778 beta[26] 6.5 0.0 0.2 6.1 6.4 6.5 6.7 7.0 1912 beta[27] 5.9 0.0 0.2 5.4 5.7 5.9 6.1 6.4 1546 beta[28] 5.8 0.0 0.2 5.4 5.7 5.8 6.0 6.3 1776 beta[29] 5.7 0.0 0.2 5.2 5.5 5.7 5.8 6.2 1514 beta[30] 6.1 0.0 0.2 5.7 6.0 6.1 6.3 6.6 1630 mu_alpha 242.4 0.0 2.8 236.9 240.6 242.4 244.3 247.9 12061 mu_beta 6.2 0.0 0.1 6.0 6.1 6.2 6.3 6.4 2855 sigmasq_y 37.3 0.1 5.7 27.8 33.2 36.7 40.7 50.0 2879 sigmasq_alpha 219.1 1.2 64.6 125.7 173.1 208.1 252.0 374.3 2934 sigmasq_beta 0.3 0.0 0.1 0.1 0.2 0.3 0.3 0.5 408 sigma_y 6.1 0.0 0.5 5.3 5.8 6.1 6.4 7.1 2864 sigma_alpha 14.7 0.0 2.1 11.2 13.2 14.4 15.9 19.3 2966 sigma_beta 0.5 0.0 0.1 0.4 0.5 0.5 0.6 0.7 402 alpha0 106.3 0.1 3.7 99.0 103.9 106.3 108.8 113.6 5365 lp__ -438.4 0.1 6.9 -453.0 -442.7 -437.9 -433.5 -426.2 2614 Rhat alpha[1] 1 alpha[2] 1 alpha[3] 1 alpha[4] 1 alpha[5] 1 alpha[6] 1 alpha[7] 1 alpha[8] 1 alpha[9] 1 alpha[10] 1 alpha[11] 1 alpha[12] 1 alpha[13] 1 alpha[14] 1 alpha[15] 1 alpha[16] 1 alpha[17] 1 alpha[18] 1 alpha[19] 1 alpha[20] 1 alpha[21] 1 alpha[22] 1 alpha[23] 1 alpha[24] 1 alpha[25] 1 alpha[26] 1 alpha[27] 1 alpha[28] 1 alpha[29] 1 alpha[30] 1 beta[1] 1 beta[2] 1 beta[3] 1 beta[4] 1 beta[5] 1 beta[6] 1 beta[7] 1 beta[8] 1 beta[9] 1 beta[10] 1 beta[11] 1 beta[12] 1 beta[13] 1 beta[14] 1 beta[15] 1 beta[16] 1 beta[17] 1 beta[18] 1 beta[19] 1 beta[20] 1 beta[21] 1 beta[22] 1 beta[23] 1 beta[24] 1 beta[25] 1 beta[26] 1 beta[27] 1 beta[28] 1 beta[29] 1 beta[30] 1 mu_alpha 1 mu_beta 1 sigmasq_y 1 sigmasq_alpha 1 sigmasq_beta 1 sigma_y 1 sigma_alpha 1 sigma_beta 1 alpha0 1 lp__ 1 Sample were drawn using NUTS2 at Wed Sep 5 21:55:07 2012. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1). 

plot of chunk unnamed-chunk-1

One interesting difference between the two outputs is the R-hat value for the alpha0 variable. Alpha0 is generated quantity, representing the value of the intercept when non-centered data is used. For R2jags, the value of R-hat is 1.228, while R-hat is 1 for RStan. A quick look at the output indicates that R2jags used a thin value of 9, while RStan defaults to 1 (and creates a segfault when changed).

There is a major speed difference between the two methods. Now, it should be noted that JAGS can be faster on some problems and Stan can be faster on others. On my machine, the JAGS sampler took 7.1 seconds, while the Stan sampler took 25 seconds, which was a bit shocking to me. And this was timed using the second “stan” call, when the Stan model was pre-compiled. This is something that needs to be explored further.

Installing RStan in Ubuntu

Based on RStan Getting Started

R packages

Assuming you have the most up to date version of R, the following packages need to be installed. This assumes you have the c2d4u PPA available. See here for more information.

 sudo apt-get install r-cran-rcpp r-cran-inline r-cran-rcpp 

C++ Compiler

You probably have a C++ compiler installed, but just in case:

 sudo apt-get install build-essential 

The “RStan Getting Started” page suggests modifying the C++ compiler optimization level. On my install of Ubuntu, the optimization was set as the page suggested. You can check with:

 grep "CXXFLAGS =" /usr/lib/R/etc/Makeconf 

Among the output of grep, you should see:

 CXXFLAGS = -O3 -pipe -g $(LTO) 

If you do not, use your favorite editor to edit “/ust/lib/R/etc/Makeconf” so that the CXXFLAGS line has “-03” as well as everything else that is there.

Install RStan

From within R, run the following commands. You will need to select a mirror, although it shouldn’t actually use it.

 options(repos = c(getOption("repos"), rstan = "http://wiki.stan.googlecode.com/git/R")) install.packages('rstan', type = 'source') 

It is a large download (for R, 2.3 Mb) and will take some time to compile.

Testing RStan

Now that RStan is installed, you can test it with the following code. RStan compiles the model, then runs it. It will take much longer to compile the code than it takes to generate the samples.