### 3.4 Interface to Stan

Stan is another open-source program similar to JAGS and OpenBUGS, but with two important differences: (i) the language it uses is slightly more complex, but also more flexible, and (ii) it uses specialized sampling algorithms designed to work efficiently on hierarchical models. Despite these differences, Stan also takes as inputs data and a model specification file (written in Stan’s own language) and either draws samples from the posterior distribution of the model’s parameters or latent variables, or maximizes the respective likelihood function.

BayES’ stan() function provides a convenient interface to Stan, which allows the user to:

- pass BayES matrices as input data to Stan
- request Stan to draw samples from the posterior distribution of the model’s parameters and latent data or maximize the likelihood function, given a model specification file
- retrieve the draws or other results from Stan, summarize them and print summary statistics on the BayES console
- store the draws or maximum-likelihood estimates from Stan in a BayES model item, making them available for post-estimation analysis

The general syntax of the stan() function is the following:^{4}

[<model name> = ] stan( <model specification file>

[, "method"="sample"|"variational"|"optimize" ]

[, "data"=<list of matrices to pass to Stan> ]

[, "inits"=<structure of initial values> ]

[, "output"=<file where Stan should store its results> ]

[, "diagnostic"=<file where Stan should store results for diagnostics> ]

[, "options"=<additional command-line options> ]

[, "summarize"=true|false ]

[, "diagnose"=true|false ]

[, "chains"=<positive integer> ]

[, "burnin"=<positive integer> ]

[, "draws"=<positive integer> ]

[, "thin"=<positive integer> ]

[, "seed"=<positive integer> ]

[, "refresh"=<positive integer> ]

);

where:

- <model name> is a BayES id value which will be associated with the model resulting from executing the stan() function. If no model name is provided the results from Stan will still be returned to BayES and summarized, but they will not be stored for further analysis. jags(), openbugs() and stan() are the three interface functions that provide the highest level of integration with BayES: the results from these functions are stored in BayES model items, on which all BayES functions which operate on models can be used.
- <model specification file> is a string pointing to the file which contains the specification of the Stan model. If the specification file is not in the current working directory, the file name must be preceded by the path to the file, either in absolute terms (e.g. "C:/MyFiles/myModel.stan") or relative to the current directory (e.g. "./myModel.stan"). This is the only mandatory argument of the stan() function.
- "method" specifies the Stan method to be used. This can be one of the strings "sample", "variational" or "optimize", each one of them invoking the respective Stan method. Note that Stan’s "diagnose" method cannot be accessed in BayES directly, but diagnostic tests can be performed within Stan by setting the "diagnose" option to true in the stan() function. The default value of the "method" argument is "sample", in which case Stan samples from the posterior distribution of the model’s parameters using either a Hamiltonian Monte Carlo (HMC) algorithm or fixed-parameter sampling (depending on other options).
- "data" specifies the data matrices that will be passed as input to Stan. <list of matrices> is a list of the id values of matrices (comma-separated names inside curly brackets), as they appear in the Stan model specification file. These matrices must be defined in the current workspace.
- "inits" specifies the initial values per chain used by Stan. <structure of initial values> is a BayES structure, the elements of which could be structures themselves. Each element of the chain-specific structure corresponds to a parameter or latent variable, using the same id values as the ones used in the Stan model specification file. It is possible to provide initial values for all parameters/latent variables or only a subset of them. It is also possible to leave entire chains uninitialized. In such cases Stan will generate initial values for the chains/parameters/latent variables which are not initialized by the user, using its default options.
- "output" specifies the file in which Stan should store its results. If the output file is not in the current directory, the file name must be preceded by the path to the file, either in absolute terms (e.g. "C:/MyFiles/myResults.csv") or relative to the current working directory (e.g. "./myResults.csv"). If the output file is not specified then BayES will create temporary files in the current working directory. If, however, the user provides a name for the output file(s), these will persist even after exiting BayES.
- "diagnostic" specifies the file in which Stan should store results that can be used for post-estimation diagnostics. If the diagnostic file is not in the current directory, the file name must be preceded by the path to the file, either in absolute terms (e.g. "C:/MyFiles/myDgnstcs.csv") or relative to the current directory (e.g. "./myDgnstcs.csv"). If the user provides a name for the diagnostic file(s), these will persist even after exiting BayES.
- "summarize" indicates whether the results produced by Stan (draws or values at which the likelihood function is maximized) should be summarized within Stan, before returning control to BayES. The default value of "summarize" is true.
- "diagnose" indicates whether Stan should run diagnostic tests on the results it produced (draws or values at which the likelihood function is maximized), before returning control to BayES. The default value of "diagnose" is true. Note that this optional argument effectively replaces Stan’s "diagnose" method.
- "options" can be used to pass additional options to Stan, using its extensive argument tree. These options should be provided to BayES’ stan() function as a string, which is then passed verbatim to Stan. For example, setting the right-hand side of the "options" argument to "algorithm=fixed_param" requests Stan to use the fixed-parameter sampler under its "sample" method.
- "chains" specifies the number of chains that Stan will run in parallel. If the "method" argument of the stan() function is set to "sample" (default), BayES spawns as many Stan processes as the number of chains, runs them in parallel, and mutes Stan’s output on the console. If, however, the "method" argument is set to either "variational" or "optimize", the "chains" argument is ignored.
- "burnin" performs different functions under different Stan methods. If the "method" argument of the stan() function is set to "sample" (default), "burnin" specifies the number of draws from the posterior that will be discarded (per chain) to avoid dependence of the results on initial values. If the "method" argument is set to "variational", "burnin" specifies the maximum number of ADVI iterations. This argument is ignored when the "method" argument is set to "optimize". The right-hand side must be a positive integer and the default value is 10,000.
- "draws" performs different functions under different Stan methods. If the "method" argument of the stan() function is set to "sample" (default), or "variational", "draws" specifies the number of draws from the posterior that will be retained, per chain. If the "method" argument is set to "optimize", "draws" specifies the maximum number of iterations of the algorithm that is used to maximize the likelihood. The right-hand side must be a positive integer and the default value is 20,000.
- "thin" specifies, when the "method" argument of the stan() function is set to "sample" (default), the thinning interval applied after the burn-in phase: only one out of every "thin" consecutive draws from the posterior is retained, to avoid high autocorrelation of the retained draws. For example, if the thinning parameter is set to 3, then only one in three consecutive draws will be retained and become available for inference and post-estimation analysis. The "thin" argument is ignored when the "method" argument is set to "variational" or "optimize". The right-hand side must be a positive integer and the default value is 1.
- "seed" specifies the seed for the random-number generator used by Stan. The right-hand side must be a positive integer and the default value is 42.
- "refresh" specifies the rate at which Stan prints information about its progress on the console. For example, if "refresh" is set equal to 100, Stan will print information every 100 iterations of the respective algorithm. The right-hand side must be a positive integer and the default value is 1000.
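
The combined effect of "burnin" and "thin" under the "sample" method can be illustrated outside BayES. The following Python sketch is not BayES or Stan code and makes no claim about how either program implements these steps internally; the function name `retained_draws` and the integer "chain" are hypothetical stand-ins used only to show which draws survive.

```python
def retained_draws(chain, burnin, thin):
    """Drop the first `burnin` draws, then keep one draw out of every `thin`.

    `chain` is any sequence of successive draws from a single chain.
    This is a generic illustration, not the BayES/Stan implementation.
    """
    if burnin < 0:
        raise ValueError("burnin must be non-negative")
    if thin < 1:
        raise ValueError("thin must be a positive integer")
    # Slice: start after the burn-in phase, then step by the thinning interval.
    return chain[burnin::thin]


# Stand-in for 20 successive draws, labelled 1..20 (assumed data).
chain = list(range(1, 21))

# Discard the first 5 draws, then keep one in every 3 of the remaining ones.
kept = retained_draws(chain, burnin=5, thin=3)
print(kept)  # [6, 9, 12, 15, 18]
```

With a thinning value of 1 every post-burn-in draw is kept, which matches the default behaviour described above.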

The path to the Stan model specification file must not contain any spaces.
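
Since a path with spaces is not accepted, it may be convenient to check a candidate path before passing it to stan(). The Python sketch below is illustrative only and is not part of BayES; the helper name `path_has_no_spaces` and the second example path are assumptions made for the illustration.

```python
def path_has_no_spaces(path: str) -> bool:
    """Return True if the path contains no space characters."""
    return " " not in path


# A path of the form used in this section passes the check.
print(path_has_no_spaces("C:/MyFiles/myModel.stan"))   # True

# A hypothetical path with a space in a folder name fails it.
print(path_has_no_spaces("C:/My Files/myModel.stan"))  # False
```

If the check fails, moving or renaming the folder that contains the space is one way to satisfy the restriction.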

As the stan() function executes, Stan attempts to print output on the system’s command console. BayES captures this output and redirects it to the BayES main console in real time. This output is entirely determined by Stan and it includes information on the model specification file used in the current run, all the options used, any errors or warnings and, most importantly, information on the progress of the algorithm being used. Note that when multiple chains are run in parallel (under Stan’s "sample" method), BayES mutes Stan’s output on the console.

Many of the sample script files in "$BayESHOME/Samples/3-JAGS-OpenBUGS-Stan" contain examples of using the stan() function, along with Stan model specification files for simple models. The Stan interface is also accessible from the BayES main menu via Interfaces → Stan.

^{4}Arguments inside square brackets are optional. Optional arguments passed to the stan() function can be provided in any order, but always after the mandatory argument (model specification file). Optional arguments always come in pairs (e.g. "chains"=1).