Bayesian Econometrics Software

5.1 Simple stochastic frontier

Mathematical representation

$y_{i} = x_{i}^{'} β + v_{i} \pm u_{i}, \qquad v_{i} \sim N (0, \frac{1}{τ}), \qquad u_{i} \sim D (𝜃)$	(5.1)

the model is estimated using $N$ observations
$y_{i}$ is the value of the dependent variable for observation $i$
$x_{i}$ is a $K \times 1$ vector that stores the values of the $K$ independent variables for observation $i$
$β$ is a $K \times 1$ vector of parameters
$τ$ is the precision of the noise component of the error term: $σ_{v}^{2} = \frac{1}{τ}$
$u_{i}$ is the inefficiency component of the error term and it can have any non-negative distribution, represented in the equation above by $D (𝜃)$; BayES supports the following distributions for $u_{i}$:
- exponential: $p (u_{i}) = λ e^{- λ u_{i}}$
- half normal: $p (u_{i}) = \frac{2 ϕ^{1 ∕ 2}}{{(2 π)}^{1 ∕ 2}} exp \{- \frac{ϕ}{2} u_{i}^{2}\}$
- truncated normal: $p (u_{i}) = \frac{ϕ^{1 ∕ 2} exp \{- \frac{ϕ}{2} {(u_{i} - μ)}^{2}\}}{{(2 π)}^{1 ∕ 2} Φ (ϕ^{1 ∕ 2} μ)}$
- gamma: $p (u_{i}) = \frac{λ^{κ}}{Γ (κ)} u_{i}^{κ - 1} e^{- λ u_{i}}$
- log-Normal: $p (u_{i}) = \frac{ϕ^{1 ∕ 2}}{{(2 π)}^{1 ∕ 2} u_{i}} exp \{- \frac{ϕ}{2} {(log u_{i} - μ)}^{2}\}$
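As a quick sanity check on the five densities listed above, the following Python sketch codes each one and integrates it numerically over the positive half-line. It is an independent illustration, not part of BayES: the function names, the parameter values and the midpoint-rule integrator are all assumptions made for this example. The truncated-normal density is normalized by $Φ (ϕ^{1 ∕ 2} μ)$, the probability that the untruncated normal is positive.

```python
import math

def norm_cdf(z):
    """Standard-normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_exponential(u, lam):
    return lam * math.exp(-lam * u)

def p_half_normal(u, phi):
    return 2.0 * math.sqrt(phi / (2.0 * math.pi)) * math.exp(-0.5 * phi * u * u)

def p_truncated_normal(u, mu, phi):
    kernel = math.sqrt(phi / (2.0 * math.pi)) * math.exp(-0.5 * phi * (u - mu) ** 2)
    return kernel / norm_cdf(math.sqrt(phi) * mu)

def p_gamma(u, kappa, lam):
    # Work on the log scale to avoid overflow in lam**kappa / Gamma(kappa).
    log_p = (kappa * math.log(lam) - math.lgamma(kappa)
             + (kappa - 1.0) * math.log(u) - lam * u)
    return math.exp(log_p)

def p_log_normal(u, mu, phi):
    return (math.sqrt(phi / (2.0 * math.pi)) / u
            * math.exp(-0.5 * phi * (math.log(u) - mu) ** 2))

def integrate(pdf, upper=40.0, steps=100_000):
    """Midpoint rule on (0, upper); midpoints never touch u = 0."""
    h = upper / steps
    return h * sum(pdf((i + 0.5) * h) for i in range(steps))

# Illustrative parameter values (assumptions, not BayES defaults).
checks = {
    "exponential": integrate(lambda u: p_exponential(u, 2.0)),
    "half normal": integrate(lambda u: p_half_normal(u, 4.0)),
    "truncated normal": integrate(lambda u: p_truncated_normal(u, 0.5, 4.0)),
    "gamma": integrate(lambda u: p_gamma(u, 1.5, 3.0)),
    "log-normal": integrate(lambda u: p_log_normal(u, -1.5, 2.0)),
}
for name, area in checks.items():
    print(f"{name}: integral = {area:.6f}")
```

Each integral should come out very close to one, which confirms that every expression is a proper density on the non-negative reals.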

When $u_{i}$ enters the specification with a plus sign then the model represents a cost frontier, while when $u_{i}$ enters with a minus sign the model represents a production frontier. For the efficiency scores generated by a stochastic frontier model to be meaningful, the dependent variable in both cases must be in logarithms.
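The sign convention can be illustrated by simulating data from equation (5.1) with exponentially distributed inefficiency. The sketch below is plain Python rather than BayES code; the function names `simulate_frontier` and `mean_composed_error`, the single regressor and all parameter values are hypothetical choices made for this illustration.

```python
import math
import random

def simulate_frontier(n, beta, tau, lam, production=True, seed=0):
    """Draw n observations from y = x'beta + v -/+ u with exponential u."""
    rng = random.Random(seed)
    sign = -1.0 if production else 1.0            # minus: production, plus: cost
    data = []
    for _ in range(n):
        x = [1.0, rng.gauss(0.0, 1.0)]            # constant plus one regressor
        v = rng.gauss(0.0, 1.0 / math.sqrt(tau))  # noise with variance 1/tau
        u = rng.expovariate(lam)                  # inefficiency, never negative
        y = sum(b * xk for b, xk in zip(beta, x)) + v + sign * u
        data.append((y, x, u))
    return data

def mean_composed_error(data, beta):
    """Average of y - x'beta, that is, of v -/+ u."""
    resid = [y - sum(b * xk for b, xk in zip(beta, x)) for y, x, _ in data]
    return sum(resid) / len(resid)

beta = [1.0, 0.5]
prod = simulate_frontier(20_000, beta, tau=100.0, lam=5.0, production=True)
cost = simulate_frontier(20_000, beta, tau=100.0, lam=5.0, production=False)
print("production frontier, mean composed error:", mean_composed_error(prod, beta))
print("cost frontier, mean composed error:", mean_composed_error(cost, beta))
```

In a production frontier the composed error has a negative mean (observed output lies below the frontier on average), while in a cost frontier the mean is positive (observed cost lies above it). Because $y_{i}$ is in logarithms, $e^{- u_{i}}$ is a ratio between 0 and 1 and can be read as an efficiency score.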

Priors


Parameter	Probability density function	Default hyperparameters

Common to all models
$β$	$p (β) = \frac{\| P \|^{1 ∕ 2}}{{(2 π)}^{K ∕ 2}} exp \{- \frac{1}{2} {(β - m)}^{'} P (β - m)\}$	$m = 0_{K}$ , $P = 0.001 \cdot I_{K}$
$τ$	$p (τ) = \frac{b_{τ}^{a_{τ}}}{Γ (a_{τ})} τ^{a_{τ} - 1} e^{- τ b_{τ}}$	$a_{τ} = 0.001$ , $b_{τ} = 0.001$
Exponential model
$λ$	$p (λ) = \frac{b_{λ}^{a_{λ}}}{Γ (a_{λ})} λ^{a_{λ} - 1} e^{- λ b_{λ}}$	$a_{λ} = 1$ , $b_{λ} = 0.15$
Half normal model
$ϕ$	$p (ϕ) = \frac{b_{ϕ}^{a_{ϕ}}}{Γ (a_{ϕ})} ϕ^{a_{ϕ} - 1} e^{- ϕ b_{ϕ}}$	$a_{ϕ} = 7$ , $b_{ϕ} = 0.5$
Truncated normal model
$μ$	$p (μ) = \frac{t_{μ}^{1 ∕ 2}}{{(2 π)}^{1 ∕ 2}} exp \{- \frac{t_{μ}}{2} {(μ - m_{μ})}^{2}\}$	$m_{μ} = 0$ , $t_{μ} = 1$
$ϕ$	$p (ϕ) = \frac{b_{ϕ}^{a_{ϕ}}}{Γ (a_{ϕ})} ϕ^{a_{ϕ} - 1} e^{- ϕ b_{ϕ}}$	$a_{ϕ} = 5$ , $b_{ϕ} = 0.5$
Gamma model
$κ$	$p (κ) = \frac{b_{κ}^{a_{κ}}}{Γ (a_{κ})} κ^{a_{κ} - 1} e^{- κ b_{κ}}$	$a_{κ} = 3$ , $b_{κ} = 2$
$λ$	$p (λ) = \frac{b_{λ}^{a_{λ}}}{Γ (a_{λ})} λ^{a_{λ} - 1} e^{- λ b_{λ}}$	$a_{λ} = κ$ , $b_{λ} = 0.2$
Log-normal model
$μ$	$p (μ) = \frac{t_{μ}^{1 ∕ 2}}{{(2 π)}^{1 ∕ 2}} exp \{- \frac{t_{μ}}{2} {(μ - m_{μ})}^{2}\}$	$m_{μ} = - 1.5$ , $t_{μ} = 1$
$ϕ$	$p (ϕ) = \frac{b_{ϕ}^{a_{ϕ}}}{Γ (a_{ϕ})} ϕ^{a_{ϕ} - 1} e^{- ϕ b_{ϕ}}$	$a_{ϕ} = 2$ , $b_{ϕ} = 1$
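
To give a sense of what the default hyperparameters imply, the following derivation works out the prior median of the efficiency score $e^{- u_{i}}$ in the exponential model. It uses only the densities stated above: with $a_{λ} = 1$ the prior for $λ$ is itself exponential, so $λ$ can be integrated out analytically. The symbols $c$, $r$ and $r^{*}$ are introduced here for the illustration.

```latex
\begin{align*}
\Pr (u_{i} > c) &= \int_{0}^{\infty} e^{- λ c} \, b_{λ} e^{- λ b_{λ}} \, dλ = \frac{b_{λ}}{b_{λ} + c} , \\
\Pr (e^{- u_{i}} < r) &= \Pr (u_{i} > - \log r) = \frac{b_{λ}}{b_{λ} - \log r} , \\
\frac{b_{λ}}{b_{λ} - \log r^{*}} = \frac{1}{2} \;&\Longrightarrow\; r^{*} = e^{- b_{λ}} = e^{- 0.15} \approx 0.861 .
\end{align*}
```

Under these assumptions the default prior places the median efficiency at roughly 86%. More generally, with $a_{λ} = 1$, setting $b_{λ} = - \log r^{*}$ centers the prior on a median efficiency of $r^{*}$.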

Syntax

[ <model name> = ] sf( y ~ x1 x2 $\dots$ xK [ , <options> ] );

where:

y is the dependent variable name, as it appears in the dataset used for estimation
x1 x2 $\dots$ xK is a list of the $K$ independent variable names, as they appear in the dataset used for estimation; when a constant term is to be included in the model, this must be requested explicitly

The optional arguments for the simple stochastic frontier model are:¹

Gibbs parameters

"chains"	number of chains to run in parallel (positive integer); the default value is 1
"burnin"	number of burn-in draws per chain (positive integer); the default value is 10000
"draws"	number of retained draws per chain (positive integer); the default value is 20000
"thin"	value of the thinning parameter (positive integer); the default value is 1
"seed"	value of the seed for the random-number generator (positive integer); the default value is 42
Model specification

"udist"	speciﬁcation of the distribution of the ineﬃciency component of the error term; the following options are available, corresponding to the distributions presented at the beginning of this section: "exp" "hnorm" "tnorm" "gamma" "lognorm" the default value is "exp"
"production"	boolean specifying the type of frontier (production/cost); it could be set to either true (production) or false (cost); the default value is true
Hyperparameters

Common to all models
"m"	mean vector of the prior for $β$ ( $K \times 1$ vector); the default value is $0_{K}$
"P"	precision matrix of the prior for $β$ ( $K \times K$ symmetric and positive-deﬁnite matrix); the default value is $0.001 \cdot I_{K}$
"a_tau"	shape parameter of the prior for $τ$ (positive number); the default value is $0.001$
"b_tau"	rate parameter of the prior for $τ$ (positive number); the default value is $0.001$
Exponential model
"a_lambda"	shape parameter of the prior for $λ$ (positive number); the default value is $1$
"b_lambda"	rate parameter of the prior for $λ$ (positive number); the default value is $0.15$
Half normal model
"a_phi"	shape parameter of the prior for $ϕ$ (positive number); the default value is $7$
"b_phi"	rate parameter of the prior for $ϕ$ (positive number); the default value is $0.5$
Truncated normal model
"m_mu"	location parameter of the prior for $μ$ (real number); the default value is $0$
"t_mu"	precision parameter of the prior for $μ$ (positive number); the default value is $1$
"a_phi"	shape parameter of the prior for $ϕ$ (positive number); the default value is $5$
"b_phi"	rate parameter of the prior for $ϕ$ (positive number); the default value is $0.5$
Gamma model
"a_kappa"	shape parameter of the prior for $κ$ (positive number); the default value is $3$
"b_kappa"	rate parameter of the prior for $κ$ (positive number); the default value is $2$
"b_lambda"	rate parameter of the prior for $λ$ (positive number); the default value is $0.2$
Log-normal model
"m_mu"	location parameter of the prior for $μ$ (real number); the default value is $- 1.5$
"t_mu"	precision parameter of the prior for $μ$ (positive number); the default value is $1$
"a_phi"	shape parameter of the prior for $ϕ$ (positive number); the default value is $2$
"b_phi"	rate parameter of the prior for $ϕ$ (positive number); the default value is $1$
Dataset and log-marginal likelihood

"dataset"	the id value of the dataset that will be used for estimation; the default value is the ﬁrst dataset in memory (in alphabetical order)
"logML_CJ"	boolean indicating whether the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood should be calculated (true $\|$ false); the default value is false

Reported Parameters


Common to all models

$β$	variable_name	vector of parameters associated with the independent variables

$τ$	tau	precision parameter of the noise component of the error term, $v_{i}$

$σ_{v}$	sigma_v	standard deviation of the noise component of the error term, $σ_{v} = 1 ∕ τ^{1 ∕ 2}$

Exponential model

$λ$	lambda	rate parameter of the distribution of the ineﬃciency component of the error term, $u_{i}$

$σ_{u}$	sigma_u	scale parameter of the ineﬃciency component of the error term: $σ_{u} = 1 ∕ λ$ . For the exponential model the standard deviation of $u_{i}$ is equal to the scale parameter.

Half normal model

$ϕ$	phi	precision parameter of the distribution of the ineﬃciency component of the error term, $u_{i}$

$σ_{u}$	sigma_u	scale parameter of the ineﬃciency component of the error term: $σ_{u} = 1 ∕ ϕ^{1 ∕ 2}$ . The standard deviation of $u_{i}$ for the half-normal model can be obtained as $σ_{u} \sqrt{1 - \frac{2}{π}}$ .

Truncated normal model

$μ$	mu	location parameter of the distribution of the ineﬃciency component of the error term, $u_{i}$

$ϕ$	phi	precision parameter of the distribution of the ineﬃciency component of the error term, $u_{i}$

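$σ_{u}$	sigma_u	scale parameter of the inefficiency component of the error term: $σ_{u} = 1 ∕ ϕ^{1 ∕ 2}$ . The standard deviation of $u_{i}$ for the truncated-normal model can be obtained as $σ_{u} \sqrt{1 - \frac{μ}{σ_{u}} h - h^{2}}$ , where $h = ϕ (\frac{μ}{σ_{u}}) ∕ Φ (\frac{μ}{σ_{u}})$ and, in this expression only, $ϕ (\cdot)$ and $Φ (\cdot)$ denote the standard-normal density and distribution functions. For $μ = 0$ this reduces to the half-normal expression.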

Gamma model

$κ$	kappa	shape parameter of the distribution of the ineﬃciency component of the error term, $u_{i}$

$λ$	lambda	rate parameter of the distribution of the ineﬃciency component of the error term, $u_{i}$

$𝜃$	theta	scale parameter of the ineﬃciency component of the error term: $𝜃 = 1 ∕ λ$ . The standard deviation of $u_{i}$ for the Gamma model can be obtained as $𝜃 \sqrt{κ}$ .

Log-normal model

$μ$	mu	location parameter of the distribution of the ineﬃciency component of the error term, $u_{i}$

$ϕ$	phi	precision parameter of the distribution of the ineﬃciency component of the error term, $u_{i}$

$σ_{u}$	sigma_u	scale parameter of the ineﬃciency component of the error term: $σ_{u} = 1 ∕ ϕ^{1 ∕ 2}$ . The standard deviation of $u_{i}$ for the log-normal model can be obtained as $\sqrt{(e^{σ_{u}^{2}} - 1) e^{2 μ + σ_{u}^{2}}}$ .
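
The conversions from the reported parameters to the standard deviation of $u_{i}$ can be collected in a few lines of Python. This is a sketch for checking values by hand, not BayES code, and the function names are made up for the illustration. For the truncated-normal case it uses the general textbook variance of a normal distribution truncated below at zero, $σ_{u}^{2} (1 - z h - h^{2})$ with $z = μ ∕ σ_{u}$ and $h$ the ratio of the standard-normal density to the standard-normal distribution function at $z$.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sd_exponential(lam):
    return 1.0 / lam                               # equals the scale sigma_u

def sd_half_normal(phi):
    sigma_u = 1.0 / math.sqrt(phi)
    return sigma_u * math.sqrt(1.0 - 2.0 / math.pi)

def sd_truncated_normal(mu, phi):
    sigma_u = 1.0 / math.sqrt(phi)
    z = mu / sigma_u
    h = norm_pdf(z) / norm_cdf(z)                  # density over distribution function
    return sigma_u * math.sqrt(1.0 - z * h - h * h)

def sd_gamma(kappa, lam):
    theta = 1.0 / lam                              # scale parameter
    return theta * math.sqrt(kappa)

def sd_log_normal(mu, phi):
    s2 = 1.0 / phi                                 # sigma_u squared
    return math.sqrt((math.exp(s2) - 1.0) * math.exp(2.0 * mu + s2))

# Illustrative parameter values (assumptions, not estimates from any dataset).
print("exponential:", sd_exponential(5.0))
print("half normal:", sd_half_normal(4.0))
print("truncated normal:", sd_truncated_normal(0.5, 4.0))
print("gamma:", sd_gamma(1.5, 3.0))
print("log-normal:", sd_log_normal(-1.5, 2.0))
```

Two limiting cases are useful checks: the truncated-normal result equals the half-normal one when $μ = 0$, and the Gamma result equals the exponential one when $κ = 1$.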

Stored values and post-estimation analysis
If a left-hand-side id value is provided when a simple stochastic frontier model is created, then the following results are saved in the model item and are accessible via the ‘.’ operator:

Samples	a matrix containing the draws from the posterior of $β$ and $τ$ , and, depending on the estimated model, $λ$ , $μ$ , $ϕ$ , $κ$
x1, $\dots$ ,xK	vectors containing the draws from the posterior of the parameters associated with variables x1, $\dots$ ,xK (the names of these vectors are the names of the variables that were included in the right-hand side of the model)
tau	vector containing the draws from the posterior of $τ$
lambda	vector containing the draws from the posterior of $λ$ (available after the estimation of the exponential and Gamma models)
mu	vector containing the draws from the posterior of $μ$ (available after the estimation of the truncated-normal and log-normal models)
phi	vector containing the draws from the posterior of $ϕ$ (available after the estimation of the half-normal, truncated-normal and log-normal models)
kappa	vector containing the draws from the posterior of $κ$ (available after the estimation of the Gamma model)
logML	the Lewis & Raftery (1997) approximation of the log-marginal likelihood
logML_CJ	the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood; this is available only if the model was estimated with the "logML_CJ"=true option
eff_i	$N \times 1$ vector that stores the expected values of the observation-speciﬁc eﬃciency scores, $E (e^{- u_{i}})$ ; the values in this vector are not guaranteed to be in the same order as the order in which the observations appear in the dataset used for estimation; use the store() function to associate the values in eff_i with the observations in the dataset
nchains	the number of chains that were used to estimate the model
nburnin	the number of burn-in draws per chain that were used when estimating the model
ndraws	the total number of retained draws from the posterior ( $=$ chains $\cdot$ draws)
nthin	value of the thinning parameter that was used when estimating the model
nseed	value of the seed for the random-number generator that was used when estimating the model
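
The quantity stored in eff_i is an expectation of $e^{- u_{i}}$, which is not the same as $e^{- E (u_{i})}$. The Python sketch below illustrates the difference with a Monte Carlo average over simulated draws of a single $u_{i}$. It is only an illustration: the draws are generated from an exponential distribution with a made-up rate, and how BayES computes eff_i internally is not described in this section.

```python
import math
import random

def efficiency_score(u_draws):
    """Monte Carlo estimate of E(exp(-u_i)) from draws of u_i."""
    return sum(math.exp(-u) for u in u_draws) / len(u_draws)

rng = random.Random(1)
lam = 5.0                                               # hypothetical rate
u_draws = [rng.expovariate(lam) for _ in range(50_000)]  # stand-in for posterior draws

score = efficiency_score(u_draws)
plug_in = math.exp(-sum(u_draws) / len(u_draws))         # exp of the mean of u_i

print("E(exp(-u)) estimate:", score)
print("exp(-E(u)) plug-in:", plug_in)
print("exact value lam / (lam + 1):", lam / (lam + 1.0))
```

Because $e^{- u}$ is convex, the average of $e^{- u_{i}}$ is never smaller than the plug-in value (Jensen's inequality), so evaluating $e^{- u}$ at a point estimate of $u_{i}$ would understate the expected efficiency.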

Additionally, the following functions are available for post-estimation analysis (see section B.14):

diagnostics()
test()
pmp()
store()
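
Of these functions, pmp() is the one used in Example 2 below to compare competing models on the basis of their log-marginal likelihoods. The standard calculation behind posterior model probabilities can be sketched in Python as follows. This is a generic illustration assuming equal prior model probabilities unless others are given; the function name and the log-marginal-likelihood values are made up, and the exact behavior of pmp() is documented in section B.14 rather than here.

```python
import math

def posterior_model_probabilities(log_ml, prior=None):
    """Posterior model probabilities from log-marginal likelihoods."""
    if prior is None:
        prior = [1.0 / len(log_ml)] * len(log_ml)   # equal prior probabilities
    log_w = [lm + math.log(p) for lm, p in zip(log_ml, prior)]
    top = max(log_w)                                # subtract the max for stability
    weights = [math.exp(w - top) for w in log_w]
    total = sum(weights)
    return [w / total for w in weights]

# Made-up log-marginal likelihoods for two competing models.
probs = posterior_model_probabilities([-512.3, -514.1])
print("posterior model probabilities:", probs)
```

Subtracting the largest log weight before exponentiating keeps the calculation stable, since log-marginal likelihoods are typically large negative numbers whose exponentials would underflow.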

The simple stochastic frontier model uses the store() function to associate the estimates of the eﬃciency scores (eff_i) with speciﬁc observations and store their values in the dataset used for estimation. The generic syntax for a statement involving the store() function after estimation of a simple stochastic frontier model is:

store( eff_i, <new variable name> [ , "model"=<model name> ] );

Examples

Example 1

myData = import("$BayESHOME/Datasets/dataset1.csv");
myData.constant = ones(rows(myData), 1);

sf( y ~ constant x1 x2 x3, "logML_CJ" = true );

Example 2

myData = import("$BayESHOME/Datasets/dataset1.csv");
myData.constant = ones(rows(myData), 1);

expSF = sf( y ~ constant x1 x2 x3,
    "a_lambda"=-log(0.8), "b_lambda"=1.0,
    "logML_CJ"=true );

hnSF = sf( y ~ constant x1 x2 x3,
    "udist" = "hnorm",
    "a_phi"=7.0, "b_phi"=0.5,
    "logML_CJ"=true );

pmp( { expSF, hnSF } );
pmp( { expSF, hnSF }, "logML_CJ"=true);

store( eff_i, eff_exp, "model"=expSF );
store( eff_i, eff_hn, "model"=hnSF );

hist(myData.eff_exp,
    "title"="Efficiency scores from the exponential model",
    "grid"="on");
hist(myData.eff_hn,
    "title"="Efficiency scores from the half-normal model",
    "grid"="on");

¹Optional arguments are always given in option-value pairs (e.g. "chains"=3).
