### 5.2 Inefficiency-effects stochastic frontier

Mathematical representation

 $y_{i}=\mathbf{x}_{i}^{\prime}\beta + v_{i} \pm u_{i}, \qquad v_{i}\sim \mathrm{N}\left(0,\frac{1}{\tau}\right), \quad u_{i}\sim \mathrm{D}\left(\theta,\mathbf{z}_{i}\right)$ (5.2)
• the model is estimated using $N$ observations
• ${y}_{i}$ is the value of the dependent variable for observation $i$
• $\mathbf{x}_{i}$ is a $K \times 1$ vector that stores the values of the $K$ independent variables for observation $i$
• $\beta$ is a $K \times 1$ vector of parameters
• $\mathbf{z}_{i}$ is an $L \times 1$ vector that stores the values of the $L$ determinants of inefficiency for observation $i$
• $\tau$ is the precision of the noise component of the error term: ${\sigma }_{v}^{2}=\frac{1}{\tau }$
• $u_{i}$ is the inefficiency component of the error term and it can have any non-negative distribution, represented in the equation above by $\mathrm{D}\left(\theta,\mathbf{z}_{i}\right)$; BayES supports the following distributions for $u_{i}$:
  • exponential: $p\left(u_{i}\right)=\lambda_{i}e^{-\lambda_{i}u_{i}}$, with $\lambda_{i}=e^{\mathbf{z}_{i}^{\prime}\delta}$ and $\delta$ being an $L \times 1$ vector of parameters to be estimated
  • truncated normal: $p\left(u_{i}\right)=\frac{\varphi^{1/2}\exp\left\{-\frac{\varphi}{2}\left(u_{i}-\mu_{i}\right)^{2}\right\}}{\left(2\pi\right)^{1/2}\,\Phi\left(\varphi^{1/2}\mu_{i}\right)}$, with $\mu_{i}=\mathbf{z}_{i}^{\prime}\delta$, $\delta$ being an $L \times 1$ vector of parameters and $\varphi$ a scalar parameter to be estimated; $\Phi$ is the standard normal cumulative distribution function
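As a numerical check on the two densities listed above, the following Python sketch evaluates them and confirms that each integrates to approximately one over $u_{i}\ge 0$. It is an illustration only and not part of BayES: the function names, the values of $\mathbf{z}_{i}$, $\delta$ and $\varphi$, and the trapezoidal integrator are all assumptions made for this example. The truncated-normal density is normalised by $\Phi\left(\varphi^{1/2}\mu_{i}\right)$, the probability that a $\mathrm{N}\left(\mu_{i},1/\varphi\right)$ variable is non-negative.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def std_normal_cdf(x):
    """Standard normal CDF, Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exponential_pdf(u, z, delta):
    """Exponential density with rate lambda_i = exp(z_i' delta)."""
    if u < 0:
        return 0.0
    lam = math.exp(dot(z, delta))
    return lam * math.exp(-lam * u)


def truncated_normal_pdf(u, z, delta, phi):
    """N(mu_i, 1/phi) density truncated to u >= 0, with mu_i = z_i' delta."""
    if u < 0:
        return 0.0
    mu = dot(z, delta)
    kernel = math.sqrt(phi / (2.0 * math.pi)) * math.exp(-0.5 * phi * (u - mu) ** 2)
    return kernel / std_normal_cdf(math.sqrt(phi) * mu)


def integrate(f, upper, steps=100000):
    """Trapezoidal rule on [0, upper]."""
    h = upper / steps
    total = 0.5 * (f(0.0) + f(upper))
    for k in range(1, steps):
        total += f(k * h)
    return total * h


# Hypothetical values, chosen only for illustration
z = [1.0, 0.5]
delta = [0.3, -0.4]
phi = 2.0

exp_total = integrate(lambda u: exponential_pdf(u, z, delta), 60.0)
tn_total = integrate(lambda u: truncated_normal_pdf(u, z, delta, phi), 60.0)
print(exp_total, tn_total)
```

Both totals should be very close to one, which is a quick way to verify a hand-coded density before using it elsewhere.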

 When $u_{i}$ enters the specification with a plus sign, the model represents a cost frontier; when $u_{i}$ enters with a minus sign, the model represents a production frontier. For the efficiency scores generated by a stochastic frontier model to be meaningful, the dependent variable must be in logarithms in both cases.
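The sign convention can be made concrete by simulating data from equation (5.2) with exponential inefficiency. The Python sketch below is a minimal illustration under stated assumptions: the function `simulate_frontier`, the data and all parameter values are hypothetical, and none of it is BayES code. With the same random seed, the cost and production versions of $y_{i}$ differ by exactly $2u_{i}$, because only the sign in front of $u_{i}$ changes.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def simulate_frontier(x_rows, z_rows, beta, delta, tau, production=True, seed=42):
    """Draw y_i = x_i' beta + v_i -/+ u_i with exponential u_i.

    production=True subtracts u_i (production frontier);
    production=False adds u_i (cost frontier).
    """
    rng = random.Random(seed)
    sign = -1.0 if production else 1.0
    sigma_v = 1.0 / math.sqrt(tau)            # sigma_v^2 = 1 / tau
    y, u = [], []
    for x_i, z_i in zip(x_rows, z_rows):
        lam_i = math.exp(dot(z_i, delta))     # rate of the exponential
        u_i = rng.expovariate(lam_i)          # non-negative inefficiency
        v_i = rng.gauss(0.0, sigma_v)         # symmetric noise
        y.append(dot(x_i, beta) + v_i + sign * u_i)
        u.append(u_i)
    return y, u


# Hypothetical data: a constant plus one regressor in each list
x_rows = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.7], [1.0, 0.9]]
z_rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
beta, delta, tau = [1.0, 0.5], [1.0, 0.8], 25.0

y_prod, u_prod = simulate_frontier(x_rows, z_rows, beta, delta, tau, production=True)
y_cost, u_cost = simulate_frontier(x_rows, z_rows, beta, delta, tau, production=False)

# With y in logarithms, exp(-u_i) is the efficiency score of observation i
efficiency = [math.exp(-u_i) for u_i in u_prod]
print(efficiency)
```

Because $y_{i}$ is in logarithms, $e^{-u_{i}}$ lies between 0 and 1 and can be read as a proportion, which is why the logarithmic specification is required for the scores to be meaningful.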

Priors

| Parameter | Probability density function | Default hyperparameters |
|---|---|---|
| **Common to all models** | | |
| $\beta$ | $p\left(\beta\right)=\frac{\lvert\mathbf{P}_{\beta}\rvert^{1/2}}{\left(2\pi\right)^{K/2}}\exp\left\{-\frac{1}{2}\left(\beta-\mathbf{m}_{\beta}\right)^{\prime}\mathbf{P}_{\beta}\left(\beta-\mathbf{m}_{\beta}\right)\right\}$ | $\mathbf{m}_{\beta}=\mathbf{0}_{K}$, $\mathbf{P}_{\beta}=0.001\cdot\mathbf{I}_{K}$ |
| $\tau$ | $p\left(\tau\right)=\frac{b_{\tau}^{a_{\tau}}}{\Gamma\left(a_{\tau}\right)}\tau^{a_{\tau}-1}e^{-\tau b_{\tau}}$ | $a_{\tau}=0.001$, $b_{\tau}=0.001$ |
| **Exponential model** | | |
| $\delta$ | $p\left(\delta\right)=\frac{\lvert\mathbf{P}_{\delta}\rvert^{1/2}}{\left(2\pi\right)^{L/2}}\exp\left\{-\frac{1}{2}\left(\delta-\mathbf{m}_{\delta}\right)^{\prime}\mathbf{P}_{\delta}\left(\delta-\mathbf{m}_{\delta}\right)\right\}$ | $\mathbf{m}_{\delta}=\mathbf{0}_{L}$, $\mathbf{P}_{\delta}=0.01\cdot\mathbf{I}_{L}$ |
| **Truncated normal model** | | |
| $\delta$ | $p\left(\delta\right)=\frac{\lvert\mathbf{P}_{\delta}\rvert^{1/2}}{\left(2\pi\right)^{L/2}}\exp\left\{-\frac{1}{2}\left(\delta-\mathbf{m}_{\delta}\right)^{\prime}\mathbf{P}_{\delta}\left(\delta-\mathbf{m}_{\delta}\right)\right\}$ | $\mathbf{m}_{\delta}=\mathbf{0}_{L}$, $\mathbf{P}_{\delta}=0.01\cdot\mathbf{I}_{L}$ |
| $\varphi$ | $p\left(\varphi\right)=\frac{b_{\varphi}^{a_{\varphi}}}{\Gamma\left(a_{\varphi}\right)}\varphi^{a_{\varphi}-1}e^{-\varphi b_{\varphi}}$ | $a_{\varphi}=4$, $b_{\varphi}=0.5$ |
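To get a feel for what the default hyperparameters imply, the following Python sketch computes the prior mean and variance of a Gamma density in the shape/rate form used above (mean $a/b$, variance $a/b^{2}$) and the prior standard deviation implied by a diagonal precision. The helper names are hypothetical and the code is an illustration, not part of BayES.

```python
import math


def gamma_moments(shape, rate):
    """Mean and variance of a Gamma(shape, rate) distribution."""
    return shape / rate, shape / rate ** 2


def gamma_logpdf(x, shape, rate):
    """Log of b^a / Gamma(a) * x^(a-1) * exp(-x b), for x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def sd_from_precision(precision):
    """Prior standard deviation of one coefficient under precision * I."""
    return 1.0 / math.sqrt(precision)


tau_mean, tau_var = gamma_moments(0.001, 0.001)   # default prior for tau
phi_mean, phi_var = gamma_moments(4.0, 0.5)       # default prior for phi
beta_sd = sd_from_precision(0.001)                # default prior for beta
delta_sd = sd_from_precision(0.01)                # default prior for delta

print(tau_mean, tau_var, phi_mean, phi_var, beta_sd, delta_sd)
```

Under these defaults the prior for $\tau$ is very diffuse (variance far larger than its mean), while the prior for $\varphi$ is considerably more concentrated, so it may be worth revisiting `"a_phi"` and `"b_phi"` when the scale of inefficiency in the data is expected to be very different.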

Syntax

[<model name> = ] sf( y ~ x1 x2 ... xK | z1 z2 ... zL [, <options>] );

where:

• y is the dependent variable name, as it appears in the dataset used for estimation
• x1 x2 $\dots$ xK is a list of the $K$ independent variable names, as they appear in the dataset used for estimation; when a constant term is to be included in the model, this must be requested explicitly
• z1 z2 $\dots$ zL is a list of the $L$ variable names that affect $u_{i}$ (determinants of inefficiency), as they appear in the dataset used for estimation; when a constant term is to be included in the model, this must be requested explicitly

The optional arguments for the inefficiency-effects stochastic frontier model are:²

| Option | Description |
|---|---|
| **Gibbs parameters** | |
| "chains" | number of chains to run in parallel (positive integer); the default value is 1 |
| "burnin" | number of burn-in draws per chain (positive integer); the default value is 10000 |
| "draws" | number of retained draws per chain (positive integer); the default value is 20000 |
| "thin" | value of the thinning parameter (positive integer); the default value is 1 |
| "seed" | value of the seed for the random-number generator (positive integer); the default value is 42 |
| **Model specification** | |
| "udist" | specification of the distribution of the inefficiency component of the error term; the available options, corresponding to the distributions presented at the beginning of this section, are "exp" and "tnorm"; the default value is "exp" |
| "production" | boolean specifying the type of frontier: true (production) or false (cost); the default value is true |
| **Hyperparameters: common to all models** | |
| "m_beta" | mean vector of the prior for $\beta$ ($K \times 1$ vector); the default value is $\mathbf{0}_{K}$ |
| "P_beta" | precision matrix of the prior for $\beta$ ($K \times K$ symmetric and positive-definite matrix); the default value is $0.001\cdot\mathbf{I}_{K}$ |
| "a_tau" | shape parameter of the prior for $\tau$ (positive number); the default value is $0.001$ |
| "b_tau" | rate parameter of the prior for $\tau$ (positive number); the default value is $0.001$ |
| **Hyperparameters: exponential model** | |
| "m_delta" | mean vector of the prior for $\delta$ ($L \times 1$ vector); the default value is $\mathbf{0}_{L}$ |
| "P_delta" | precision matrix of the prior for $\delta$ ($L \times L$ symmetric and positive-definite matrix); the default value is $0.01\cdot\mathbf{I}_{L}$ |
| **Hyperparameters: truncated normal model** | |
| "m_delta" | mean vector of the prior for $\delta$ ($L \times 1$ vector); the default value is $\mathbf{0}_{L}$ |
| "P_delta" | precision matrix of the prior for $\delta$ ($L \times L$ symmetric and positive-definite matrix); the default value is $0.01\cdot\mathbf{I}_{L}$ |
| "a_phi" | shape parameter of the prior for $\varphi$ (positive number); the default value is $4$ |
| "b_phi" | rate parameter of the prior for $\varphi$ (positive number); the default value is $0.5$ |
| **Dataset and log-marginal likelihood** | |
| "dataset" | the id value of the dataset that will be used for estimation; the default value is the first dataset in memory (in alphabetical order) |
| "logML_CJ" | boolean indicating whether the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood should be calculated (true or false); the default value is false |

Reported Parameters

| Parameter | Reported name | Description |
|---|---|---|
| **Common to all models** | | |
| $\beta$ | variable_name | vector of parameters associated with the independent variables in the x list |
| $\tau$ | tau | precision parameter of the noise component of the error term, $v_{i}$ |
| $\sigma_{v}$ | sigma_v | standard deviation of the noise component of the error term, $\sigma_{v}=1/\tau^{1/2}$ |
| **Exponential model** | | |
| $\delta$ | variable_name | vector of parameters associated with the independent variables in the z list |
| **Truncated normal model** | | |
| $\delta$ | variable_name | vector of parameters associated with the independent variables in the z list |
| $\varphi$ | phi | precision parameter of the distribution of the inefficiency component of the error term, $u_{i}$ |
| $\sigma_{u}$ | sigma_u | scale parameter of the inefficiency component of the error term, $\sigma_{u}=1/\varphi^{1/2}$ |

Stored values and post-estimation analysis
If a left-hand-side id value is provided when an inefficiency-effects stochastic frontier model is created, then the following results are saved in the model item and are accessible via the ‘.’ operator:

| Stored value | Description |
|---|---|
| `Samples` | a matrix containing the draws from the posterior of $\beta$ and $\tau$, and, depending on the estimated model, $\delta$, or $\delta$ and $\varphi$ |
| `y$x1`, ..., `y$xK` | vectors containing the draws from the posterior of the parameters associated with variables x1, ..., xK (the names of these vectors are the names of the variables that were included in the right-hand side of the model, prepended by `y$`, where y is the name of the dependent variable; this is done so that the samples on the parameters associated with a variable that appears in both x and z lists can be distinguished) |
| `tau` | vector containing the draws from the posterior of $\tau$ |
| `u$z1`, ..., `u$zL` | vectors containing the draws from the posterior of the parameters associated with variables z1, ..., zL (the names of these vectors are the names of the variables that were included in the z list, in the right-hand side of the model, prepended by `u$`; this is done so that the samples on the parameters associated with a variable that appears in both x and z lists can be distinguished) |
| `phi` | vector containing the draws from the posterior of $\varphi$ (available after the estimation of the truncated-normal model) |
| `logML` | the Lewis & Raftery (1997) approximation of the log-marginal likelihood |
| `logML_CJ` | the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood; this is available only if the model was estimated with the "logML_CJ"=true option |
| `eff_i` | $N \times 1$ vector that stores the expected values of the observation-specific efficiency scores, $E\left(e^{-u_{i}}\right)$; the values in this vector are not guaranteed to be in the same order as the order in which the observations appear in the dataset used for estimation; use the store() function to associate the values in eff_i with the observations in the dataset |
| `nchains` | the number of chains that were used to estimate the model |
| `nburnin` | the number of burn-in draws per chain that were used when estimating the model |
| `ndraws` | the total number of retained draws from the posterior (= chains $\cdot$ draws) |
| `nthin` | value of the thinning parameter that was used when estimating the model |
| `nseed` | value of the seed for the random-number generator that was used when estimating the model |
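The efficiency scores in eff_i are expected values of $e^{-u_{i}}$, which is not the same as $e^{-E\left(u_{i}\right)}$. The Python sketch below illustrates the difference with a handful of made-up posterior draws of $u_{i}$; the draws, the observation labels and the helper name are hypothetical, and how BayES computes eff_i internally is not described in this section, so the code should be read as an illustration of the quantity rather than of the implementation.

```python
import math
import statistics

# Hypothetical posterior draws of u_i for three observations
u_draws = {
    "obs_1": [0.05, 0.08, 0.03, 0.10],
    "obs_2": [0.40, 0.55, 0.35, 0.60],
    "obs_3": [0.01, 0.02, 0.01, 0.04],
}


def expected_efficiency(draws):
    """Monte Carlo estimate of E(exp(-u_i)): average exp(-u) over the draws."""
    return statistics.fmean([math.exp(-u) for u in draws])


def plug_in_efficiency(draws):
    """exp(-E(u_i)): transforms the average draw instead of averaging."""
    return math.exp(-statistics.fmean(draws))


eff = {obs: expected_efficiency(d) for obs, d in u_draws.items()}
plug_in = {obs: plug_in_efficiency(d) for obs, d in u_draws.items()}

for obs in u_draws:
    print(obs, eff[obs], plug_in[obs])
```

Because $e^{-u}$ is convex in $u$, averaging the transformed draws never gives a smaller value than transforming the average draw, and the gap widens as the posterior of $u_{i}$ becomes more dispersed.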

Additionally, the following functions are available for post-estimation analysis (see section B.14):

• diagnostics()
• test()
• pmp()
• store()
• mfx()

The inefficiency-effects stochastic frontier model uses the store() function to associate the estimates of the efficiency scores (eff_i) with specific observations and store their values in the dataset used for estimation. The generic syntax for a statement involving the store() function after estimation of an inefficiency-effects stochastic frontier model is:

store( eff_i, <new variable name> [, "model"=<model name>] );

The inefficiency-effects stochastic frontier model uses the mfx() function to calculate and report the marginal effects of the variables in the z list on the expected value of $u$ and on the expected value of the efficiency score, $e^{-u}$. The type of marginal effect is selected by setting the "type" argument of the mfx() function to 1 (effects on $E\left(u\right)$) or 2 (effects on $E\left(e^{-u}\right)$). The generic syntax for a statement involving the mfx() function after estimation of an inefficiency-effects stochastic frontier model is:

mfx( ["type"=1] [, "point"=<point of calculation>] [, "model"=<model name>] );

and:

mfx( "type"=2 [, "point"=<point of calculation>] [, "model"=<model name>] );

for calculation of the marginal effects on $E\left(u\right)$ and on $E\left(e^{-u}\right)$, respectively. The default value of the "type" option is 1. See the general documentation of the mfx() function (section B.14) for details on the other optional arguments.
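For the exponential model the two expectations have closed forms that follow directly from the density given at the beginning of this section: $E\left(u_{i}\right)=1/\lambda_{i}=e^{-\mathbf{z}_{i}^{\prime}\delta}$ and $E\left(e^{-u_{i}}\right)=\lambda_{i}/\left(1+\lambda_{i}\right)$. The Python sketch below differentiates these with respect to each element of $\mathbf{z}_{i}$, treating every variable as continuous. It is a sketch under assumptions: the function names and the values of $\mathbf{z}$ and $\delta$ are hypothetical, and whether mfx() uses exactly these expressions (or how it averages over posterior draws) is not stated in this section.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def expected_u(z, delta):
    """E(u) = 1 / lambda = exp(-z' delta) for the exponential model."""
    return math.exp(-dot(z, delta))


def expected_eff(z, delta):
    """E(exp(-u)) = lambda / (1 + lambda) for the exponential model."""
    lam = math.exp(dot(z, delta))
    return lam / (1.0 + lam)


def mfx_expected_u(z, delta):
    """dE(u)/dz_l = -delta_l * exp(-z' delta)."""
    return [-d * expected_u(z, delta) for d in delta]


def mfx_expected_eff(z, delta):
    """dE(exp(-u))/dz_l = delta_l * lambda / (1 + lambda)^2."""
    lam = math.exp(dot(z, delta))
    return [d * lam / (1.0 + lam) ** 2 for d in delta]


# Hypothetical point of calculation and parameter values
z_bar = [1.0, 0.6, 2.3]
delta = [0.8, -0.5, 0.2]

mfx_u = mfx_expected_u(z_bar, delta)
mfx_eff = mfx_expected_eff(z_bar, delta)
print(mfx_u)
print(mfx_eff)
```

The two sets of effects always have opposite signs: a variable with a positive $\delta_{l}$ raises the rate $\lambda_{i}$, which lowers expected inefficiency and raises expected efficiency. The entry for a constant term has no interpretation as a marginal effect and is included here only to keep the indexing simple.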

Examples

Example 1

myData = import("$BayESHOME/Datasets/dataset3.csv", ",");
myData.constant = ones(rows(myData), 1);

sf( y ~ constant x2 x3 x4 | constant z2 z3, "production"=false );

Example 2

myData = import("$BayESHOME/Datasets/dataset3.csv", ",");
myData.constant = ones(rows(myData), 1);

expSF = sf( y ~ constant x2 x3 x4 | constant z2 z3,
"udist"="exp", "production"=false );

tnormSF = sf( y ~ constant x2 x3 x4 | constant z2 z3,
"udist"="tnorm", "production"=false );

store( eff_i, eff_exp, "model" = expSF );
store( eff_i, eff_tnorm, "model" = tnormSF );

mfx( "point"="mean", "model"=expSF, "type"=1 );
mfx( "point"="mean", "model"=tnormSF, "type"=1 );

mfx( "point"="mean", "model"=expSF, "type"=2 );
mfx( "point"="mean", "model"=tnormSF, "type"=2 );

pmp( { expSF, tnormSF } );
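Posterior model probabilities are conventionally obtained by weighting each model's marginal likelihood by its prior model probability and normalising. The Python sketch below shows that calculation on the log scale; it is an illustration only, the log-marginal likelihood values are invented, equal prior model probabilities are an assumption, and nothing here is claimed about how pmp() is implemented.

```python
import math


def posterior_model_probabilities(log_mls, prior_probs=None):
    """Normalise prior * exp(logML) across models, working on the log scale."""
    n = len(log_mls)
    if prior_probs is None:
        prior_probs = [1.0 / n] * n                 # assumption: equal priors
    log_w = [lm + math.log(p) for lm, p in zip(log_mls, prior_probs)]
    shift = max(log_w)                              # guards against underflow
    weights = [math.exp(v - shift) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log-marginal likelihoods for two competing models
log_mls = [-512.3, -510.1]
probs = posterior_model_probabilities(log_mls)
print(probs)
```

Subtracting the largest log weight before exponentiating matters in practice, because log-marginal likelihoods are often large negative numbers whose exponentials underflow to zero.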

² Optional arguments are always given in option-value pairs (e.g. "chains"=3).