Bayesian Econometrics Software

10.1 Simple Seemingly Unrelated Regressions (SUR)

Mathematical representation

\begin{matrix} y_{1 i} & = & x_{1 i}^{'} β_{1} & + & 𝜀_{1 i} \\ y_{2 i} & = & x_{2 i}^{'} β_{2} & + & 𝜀_{2 i} \\ ⋮ & ⋮ & ⋮ \\ y_{M i} & = & x_{M i}^{'} β_{M} & + & 𝜀_{M i} \end{matrix}

the model consists of $M$ equations
the model is estimated using $N$ observations ( $i = 1, 2, \dots, N$ )
$y_{m i}$ is the value of equation $m$ ’s dependent variable for observation $i$
$x_{m i}$ is a $K_{m} \times 1$ vector that stores the values of the $K_{m}$ independent variables for observation $i$ , as they appear in equation $m$
the same independent variable can appear in multiple equations, associated with different coefficients
$β_{m}$ is a $K_{m} \times 1$ vector of parameters associated with equation $m$ ’s independent variables
in total, there are $K = \sum_{m = 1}^{M} K_{m}$ $β$ slope parameters to be estimated
the $M$ error terms jointly follow a multivariate Normal distribution with mean $0$ and precision matrix $Ω$

An equivalent and more compact representation of the model is:

y_{i} = X_{i} β + 𝜀_{i}, 𝜀_{i} \sim N (0, Ω^{- 1})

where:

\underset{M \times 1}{y_{i}} = [\begin{matrix} y_{1 i} \\ y_{2 i} \\ ⋮ \\ y_{M i} \end{matrix}], \underset{M \times K}{X_{i}} = [\begin{matrix} x_{1 i}^{'} & 0 & \dots & 0 \\ 0 & x_{2 i}^{'} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & x_{M i}^{'} \end{matrix}], \underset{K \times 1}{β} = [\begin{matrix} β_{1} \\ β_{2} \\ ⋮ \\ β_{M} \end{matrix}], \underset{M \times 1}{𝜀_{i}} = [\begin{matrix} 𝜀_{1 i} \\ 𝜀_{2 i} \\ ⋮ \\ 𝜀_{M i} \end{matrix}]
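The stacking in the compact representation can be checked numerically. The following is a minimal sketch in plain Python (not BayES code); the function names and the sample numbers are hypothetical and serve only to show that the block-diagonal $X_{i}$ reproduces the equation-by-equation products $x_{m i}^{'} β_{m}$.

```python
# Illustrative only: plain-Python sketch of the stacked SUR design matrix X_i.
# Function names and sample values are hypothetical, not part of BayES.

def stack_observation(x_rows):
    """Build the M x K block-diagonal matrix X_i from the M vectors x_{mi}."""
    K = sum(len(x) for x in x_rows)
    X = []
    offset = 0
    for x in x_rows:
        row = [0.0] * K
        row[offset:offset + len(x)] = x  # place x_{mi}' in its own column block
        X.append(row)
        offset += len(x)
    return X


def matvec(X, beta):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, beta)) for row in X]


# M = 2 equations with K_1 = 3 and K_2 = 2 regressors (first entry: constant)
x_i = [[1.0, 2.0, 3.0], [1.0, 4.0]]
beta_blocks = [[0.5, 1.0, -1.0], [2.0, 0.25]]
beta = [b for block in beta_blocks for b in block]  # stacked K x 1 vector

X_i = stack_observation(x_i)
fitted = matvec(X_i, beta)  # X_i * beta, one entry per equation

# Same quantities computed equation by equation: x_{mi}' * beta_m
per_equation = [sum(a * b for a, b in zip(x, b_m))
                for x, b_m in zip(x_i, beta_blocks)]
```

Here `fitted` and `per_equation` hold the same $M$ values, which is the sense in which the two representations are equivalent.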

Priors


Parameter	Probability density function	Default hyperparameters
$β$	$p (β) = \frac{| P |^{1 / 2}}{{(2 π)}^{K / 2}} \exp \{- \frac{1}{2} {(β - m)}^{'} P (β - m)\}$	$m = 0_{K}$ , $P = 0.001 \cdot I_{K}$
$Ω$	$p (Ω) = \frac{| Ω |^{\frac{n - M - 1}{2}} | V^{- 1} |^{n / 2}}{2^{n M / 2} Γ_{M} (\frac{n}{2})} \exp \{- \frac{1}{2} \operatorname{tr} (V^{- 1} Ω)\}$	$n = M^{2}$ , $V = \frac{100}{M} \cdot I_{M}$
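To make the default hyperparameters concrete, the sketch below builds them in plain Python (not BayES code) for a given list of $K_{m}$ values. The function name is hypothetical. The last quantity uses the standard result that a Wishart density of the form shown above has mean $n V$, so the defaults imply a prior mean of $100 M \cdot I_{M}$ for $Ω$.

```python
# Illustrative only: default prior hyperparameters for a SUR model with
# K_per_equation[m] regressors in equation m. Names are hypothetical.

def scaled_identity(size, scale):
    """Return scale * I_size as a list of rows."""
    return [[scale if r == c else 0.0 for c in range(size)]
            for r in range(size)]


def default_hyperparameters(K_per_equation):
    M = len(K_per_equation)        # number of equations
    K = sum(K_per_equation)        # total number of slope parameters
    m = [0.0] * K                  # prior mean of beta: 0_K
    P = scaled_identity(K, 0.001)  # prior precision of beta: 0.001 * I_K
    n = M ** 2                     # Wishart degrees of freedom
    V = scaled_identity(M, 100.0 / M)  # Wishart scale: (100 / M) * I_M
    # Mean of a Wishart(V, n) distribution is n * V
    prior_mean_Omega = [[n * v for v in row] for row in V]
    return {"m": m, "P": P, "n": n, "V": V,
            "prior_mean_Omega": prior_mean_Omega}


defaults = default_hyperparameters([3, 2])  # M = 2, K = 5
```

Because $n = M^{2} \geq M$ for every $M \geq 1$, the default degrees of freedom always satisfy the lower bound required of the "n" option.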

Syntax

[ <model name> = ] sur( {
    y1 ~ x11 x12 … x1K$_{1}$,
    y2 ~ x21 x22 … x2K$_{2}$,
    …,
    yM ~ xM1 xM2 … xMK$_{M}$
    } [ , <options> ] );

where:

y1, y2, …, yM are the dependent variable names, as they appear in the dataset used for estimation
xm1 xm2 $\dots$ xmK $_{m}$ is a list of the $K_{m}$ independent variable names for equation $m = 1, 2, \dots, M$ , as they appear in the dataset used for estimation; when a constant term is to be included in an equation, this must be requested explicitly; $M$ such lists must be provided

The optional arguments for the Seemingly Unrelated Regressions model are:¹

Gibbs parameters

"chains"	number of chains to run in parallel (positive integer); the default value is 1
"burnin"	number of burn-in draws per chain (positive integer); the default value is 10000
"draws"	number of retained draws per chain (positive integer); the default value is 20000
"thin"	value of the thinning parameter (positive integer); the default value is 1
"seed"	value of the seed for the random-number generator (positive integer); the default value is 42
Hyperparameters

"m"	mean vector of the prior for $β$ ( $K \times 1$ vector); the default value is $0_{K}$
"P"	precision matrix of the prior for $β$ ( $K \times K$ symmetric and positive-definite matrix); the default value is $0.001 \cdot I_{K}$
"V"	scale matrix of the prior for $Ω$ ( $M \times M$ symmetric and positive-definite matrix); the default value is $\frac{100}{M} \cdot I_{M}$
"n"	degrees-of-freedom parameter of the prior for $Ω$ (real number greater than or equal to $M$ ); the default value is $M^{2}$
Dataset and log-marginal likelihood

"dataset"	the id value of the dataset that will be used for estimation; the default value is the first dataset in memory (in alphabetical order)
"logML_CJ"	boolean indicating whether the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood should be calculated (true | false); the default value is false

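The restrictions on the optional arguments listed above can be summarised as a set of checks. The following plain-Python sketch (not BayES code, and not a description of how BayES validates its input) encodes the documented defaults of the Gibbs parameters and the documented dimension, symmetry and lower-bound requirements. All names are hypothetical, and positive-definiteness of "P" and "V" is deliberately not checked here.

```python
# Sketch only: mirrors the documented defaults and argument restrictions.
DEFAULT_GIBBS = {"chains": 1, "burnin": 10000, "draws": 20000,
                 "thin": 1, "seed": 42}


def is_symmetric(A, size):
    """True if A is a size x size symmetric matrix (list of rows)."""
    return (len(A) == size
            and all(len(row) == size for row in A)
            and all(A[r][c] == A[c][r]
                    for r in range(size) for c in range(size)))


def check_options(M, K, gibbs=None, m=None, P=None, V=None, n=None):
    """Merge user-supplied Gibbs parameters with the defaults and validate."""
    options = dict(DEFAULT_GIBBS)
    options.update(gibbs or {})
    for name, value in options.items():
        if name not in DEFAULT_GIBBS:
            raise ValueError("unknown Gibbs parameter: " + name)
        if not isinstance(value, int) or value < 1:
            raise ValueError(name + " must be a positive integer")
    if m is not None and len(m) != K:
        raise ValueError("m must be a K x 1 vector")
    if P is not None and not is_symmetric(P, K):
        raise ValueError("P must be a symmetric K x K matrix")
    if V is not None and not is_symmetric(V, M):
        raise ValueError("V must be a symmetric M x M matrix")
    if n is not None and n < M:
        raise ValueError("n must be greater than or equal to M")
    return options


# Three equations, 19 slope parameters, three chains instead of one
options = check_options(3, 19, gibbs={"chains": 3})
```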
Reported Parameters


$β$	variable_name	vector of parameters associated with the independent variables; these are broken into groups according to the equation in which the independent variables appear

Stored values and post-estimation analysis
If a left-hand-side id value is provided when a SUR model is created, then the following results are saved in the model item and are accessible via the ‘.’ operator:

Samples	a matrix containing the draws from the posterior of $β$ (across all equations, starting from the first equation) and the unique elements of $Ω$
ym$xm1, $\dots$ , ym$xmK $_{m}$	vectors containing the draws from the posterior of the parameters associated with variables xm1, $\dots$ ,xmK $_{m}$ , for $m = 1, 2, \dots, M$ (the names of these vectors are the names of the variables that were included in the right-hand side of equation $m$ , prepended by ym$, where ym is the name of the dependent variable in equation $m$ ; this is done so that the samples on the parameters associated with a variable that appears in more than one equation can be distinguished)
Omega_i_j	vectors containing the draws from the posterior of the unique elements of $Ω$ ; because $Ω$ is symmetric, only $\frac{(M - 1) M}{2} + M$ of its elements are stored (instead of all $M^{2}$ elements); i and j index the row and column of $Ω$ , respectively, at which the corresponding element is located
Omega	$M \times M$ matrix that stores the posterior mean of $Ω$
logML	the Lewis & Raftery (1997) approximation of the log-marginal likelihood
logML_CJ	the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood; this is available only if the model was estimated with the "logML_CJ"=true option
nchains	the number of chains that were used to estimate the model
nburnin	the number of burn-in draws per chain that were used when estimating the model
ndraws	the total number of retained draws from the posterior ( $=$ chains $\cdot$ draws)
nthin	value of the thinning parameter that was used when estimating the model
nseed	value of the seed for the random-number generator that was used when estimating the model
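The naming scheme for the stored draws can be illustrated with a short plain-Python sketch (not BayES code). The ym$xmk names follow the rule stated above. For the Omega_i_j names, the sketch assumes 1-based indices and the upper triangle of $Ω$ (row index no larger than column index), listed row by row; this convention is an assumption made for illustration and is not stated in this section.

```python
# Illustrative only: names of the stored posterior-draw vectors for a SUR model.

def stored_names(equations):
    """equations: list of (dependent variable name, list of regressor names)."""
    # Slope parameters: regressor name prefixed by "<dependent name>$"
    beta_names = [y + "$" + x for y, xs in equations for x in xs]
    M = len(equations)
    # ASSUMPTION: 1-based indices, upper triangle (i <= j), row by row
    omega_names = ["Omega_%d_%d" % (i, j)
                   for i in range(1, M + 1) for j in range(i, M + 1)]
    return beta_names, omega_names


# Hypothetical three-equation specification
equations = [
    ("y1", ["constant"] + ["x%d" % k for k in range(1, 11)]),
    ("y2", ["constant", "x1", "x2", "x3"]),
    ("y3", ["constant", "x1", "x2", "x3"]),
]
beta_names, omega_names = stored_names(equations)

M = len(equations)
unique_omega = (M - 1) * M // 2 + M  # number of stored elements of Omega

# Total retained draws, as reported in ndraws: chains * draws
chains, draws = 2, 20000
ndraws = chains * draws
```

With this specification there are 19 slope-parameter vectors and 6 vectors for $Ω$, and names such as y1$x1 and y2$x1 keep apart the coefficients of a variable that enters more than one equation.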

Additionally, the following functions are available for post-estimation analysis (see section B.14):

diagnostics()
test()
pmp()

Examples

Example 1

Example 2

myData = import("$BayESHOME/Datasets/dataset5.csv");
myData.constant = ones(rows(myData), 1);

model1 = sur( {
        y1 ~ constant x1 x2 x3 x4 x5 x6 x7 x8 x9 x10,
        y2 ~ constant x1 x2 x3,
        y3 ~ constant x1 x2 x3
        }, "logML_CJ"=true );

print(mean([model1.y1$x1-model1.y2$constant, model1.y1$x2-model1.y3$constant]));

model2 = sur( {
        y1 ~ constant x1 x2 x3 x4 x5 x6 x7 x8 x9 x10,
        y2 ~ constant x1 x2 x3,
        y3 ~ constant x1 x2 x3
        },
    "constraints" = {
        y1$x1-y2$constant=0, y1$x5-0.5*y2$x1=0, y1$x6-y2$x2=0, y1$x7-y2$x3=0,
        y1$x2-y3$constant=0, y1$x6-y3$x1=0, y1$x8-0.5*y3$x2=0, y1$x9-y3$x3=0,
    },
    "Xi" = 1e7*eye(8,8), "logML_CJ"=true );

print(mean([model2.y1$x1-model2.y2$constant, model2.y1$x2-model2.y3$constant]));

pmp( {model1, model2} );
pmp( {model1, model2}, "logML_CJ" = true);
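For intuition about what a posterior model probability is, the sketch below applies Bayes' theorem to a set of log-marginal likelihoods in plain Python (not BayES code). It assumes equal prior model probabilities unless others are supplied; the function name and the two logML values are hypothetical, and nothing here should be read as a description of the internals or options of pmp().

```python
# Illustrative only: posterior model probabilities from log-marginal likelihoods.
import math


def posterior_model_probabilities(log_mls, prior=None):
    """Return P(model j | data) for each model j.

    log_mls: log-marginal likelihood of each model
    prior:   prior model probabilities (ASSUMPTION: equal if omitted)
    """
    if prior is None:
        prior = [1.0 / len(log_mls)] * len(log_mls)
    log_post = [lml + math.log(p) for lml, p in zip(log_mls, prior)]
    # Subtract the maximum before exponentiating to avoid underflow
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log-marginal likelihoods for two competing models
probs = posterior_model_probabilities([-1520.3, -1518.3])
```

Only differences in log-marginal likelihood matter: in this example the second model is favoured because its value is larger by 2.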

¹Optional arguments are always given in option-value pairs (e.g. "chains"=3).
