3.5 Interface to R
R is a free software environment designed for statistical computing and graphics. Compared to JAGS and OpenBUGS, R has a much more complete programming language and encompasses a vast array of computational and statistical techniques. Nevertheless, it too operates using script files written in its native language, which makes running it in batch mode feasible.
BayES’ rproject() function provides a convenient interface to R, which allows the user to:
- pass BayES matrices and datasets as input to R
- request R to execute code written in its native language
- retrieve output from R and store it in BayES dataset and matrix items; all data to be returned from R are stored inside a BayES structure item
The general syntax of the rproject() function is the following (see footnote 5):
<structure name> = rproject( <R script file>
[, "data"=<list of matrices/datasets to pass to R>]
[, "return"=<list of matrices/dataframes to retrieve from R>]
);
where:
- <structure name> is a BayES id value which will be associated with the BayES structure that the rproject() function returns. This structure will contain any R matrices or data frames that the user requests to be returned to BayES (using the "return" option) after execution of the R script completes.
- <R script file> is a string pointing to the file which contains the code (written in R’s language) that R will be requested to execute. If this file is not in the current directory then the file name must be preceded by the path to the file, either in absolute terms (e.g. "C:/MyFiles/myScript.R") or relative to the current working directory (e.g. "../myScript.R"). This is the only mandatory argument of the rproject() function.
- "data" specifies the data that will be passed as input to R. These can be either BayES datasets or matrices. <list of matrices/datasets to pass to R> is a list of the id values of matrices or datasets (comma-separated names inside curly brackets). These matrices or datasets must be defined in the current workspace, and the R script file must refer to them by the same names. A BayES matrix passed as input becomes available in R as an R matrix, while a BayES dataset becomes available as an R data frame.
- "return" specifies the R objects that will be returned to BayES when execution of the R script completes. <list of matrices/dataframes to retrieve from R> is a list of id values (comma-separated id values inside curly brackets) that specify the names of the matrices/data frames that should be returned from R, as they appear in the R script file. Any R matrix that is returned will be stored in BayES as a matrix, while any R data frame will be stored as a BayES dataset. These matrices/datasets are grouped together into a BayES structure. Passing other data types (structures, lists, strings, etc.) between BayES and R is not supported.
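To make the roles of the "data" and "return" options more concrete, the following is a minimal sketch of what an R script file run through rproject() might look like. All object names (myMatrix, myDataset, outMatrix, outData) and the column names x and y are hypothetical and chosen only for illustration; the sketch assumes that the script is invoked with "data"={myDataset,myMatrix} and "return"={outMatrix,outData}. The fallback definitions at the top are not required by BayES; they are included only so that the script can also be run on its own in R.

```r
# myScript.R: hypothetical R script intended to be run through rproject().
# Assumption: BayES has passed a matrix named myMatrix and a dataset named
# myDataset, which are available here as an R matrix and an R data frame.

# Fallback values (illustrative only) so the script also runs outside BayES.
if (!exists("myMatrix")) {
  myMatrix <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2)
}
if (!exists("myDataset")) {
  # Assumed column names x and y; a real dataset will have its own names.
  myDataset <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))
}

# Anything printed by R is ordinary console output.
print(dim(myMatrix))
print(summary(myDataset))

# An R matrix to be returned: column means of myMatrix as a 1 x ncol matrix.
outMatrix <- matrix(colMeans(myMatrix), nrow = 1)

# An R data frame to be returned: fitted values from a regression of y on x.
fit <- lm(y ~ x, data = myDataset)
outData <- data.frame(x = myDataset$x, fitted = as.numeric(fitted(fit)))
```

Under these assumptions, outMatrix would be stored in the returned BayES structure as a matrix and outData as a dataset. Intermediate objects such as fit are R lists, so they would not be eligible for the "return" option.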
As the rproject() function executes, R attempts to print output on the system’s command console. BayES grabs this output and redirects it to the BayES main console in real time. This output is entirely determined by R and the commands contained in the R script file provided to rproject().
The sample script file in "$BayESHOME/Samples/5Interfaces/rproject" contains an example of using the rproject() function, along with a simple R script file. The R interface is also accessible from the BayES main menu via Interfaces → R project.
5 Arguments inside square brackets are optional. Optional arguments passed to the rproject() function can be provided in any order, but always after the mandatory argument (R script file). Optional arguments always come in pairs (e.g. "data"={myDataset,myMatrix}).