### 9.2 Type II Tobit

Mathematical representation

 $\begin{aligned} y_i^{\ast} &= \mathbf{x}_i^{\prime}\beta + \varepsilon_i \\ y_i &= \begin{cases} y_i^{\ast} & \text{if } s_i = 1 \\ - & \text{if } s_i = 0 \end{cases} \end{aligned} \qquad\qquad \begin{aligned} s_i^{\ast} &= \mathbf{z}_i^{\prime}\delta + v_i \\ s_i &= \begin{cases} 1 & \text{if } s_i^{\ast} > 0 \\ 0 & \text{if } s_i^{\ast} \le 0 \end{cases} \end{aligned}$ (9.2)

with:

 $\begin{bmatrix} \varepsilon_i \\ v_i \end{bmatrix} \sim \mathrm{N}\left(\mathbf{0}, \Omega^{-1}\right), \qquad \text{where} \qquad \Omega^{-1} \equiv \Sigma = \begin{bmatrix} \xi + \gamma^{2} & \gamma \\ \gamma & 1 \end{bmatrix}$ (9.3)
• the model is estimated using $N$ observations, for ${N}_{1}$ of which the outcome variable is observed (${s}_{i}=1$) and for the remaining ${N}_{0}$ the outcome variable is missing (${s}_{i}=0$)
• ${y}_{i}$ is the value of the dependent variable in the outcome equation for observation $i$ and it is observed only if ${s}_{i}=1$
• ${s}_{i}^{\ast }$ is the value of the latent dependent variable in the selection equation for observation $i$
• $\mathbf{x}_i$ is a $K \times 1$ vector that stores the values of the $K$ independent variables in the outcome equation for observation $i$
• $\mathbf{z}_i$ is an $L \times 1$ vector that stores the values of the $L$ independent variables in the selection equation for observation $i$
• the same variable could appear in both $\mathbf{x}_i$ and $\mathbf{z}_i$
• $\beta$ is a $K \times 1$ vector of parameters
• $\delta$ is an $L \times 1$ vector of parameters
• $\Omega$ is the precision matrix of the error vector, $\begin{bmatrix} \varepsilon_i & v_i \end{bmatrix}^{\prime}$, and $\Sigma$ is the covariance matrix of the error vector
• $\xi$ is the variance of $\varepsilon_i$ conditional on $v_i$
• $\gamma$ is the covariance of $\varepsilon_i$ and $v_i$
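
The error structure in (9.3) can be generated by writing $\varepsilon_i = \gamma v_i + u_i$, with $v_i \sim \mathrm{N}(0,1)$ and $u_i \sim \mathrm{N}(0,\xi)$ independent of each other; this gives a variance of $\xi + \gamma^2$ for $\varepsilon_i$, a covariance of $\gamma$ with $v_i$ and a conditional variance of $\xi$. The following is a minimal Python sketch (not BayES code) that simulates data from the model in this way. The function name `simulate_tobit2`, the way regressors are generated and all parameter values are assumptions made for illustration.

```python
import math
import random

def simulate_tobit2(n, beta, delta, xi, gamma, seed=0):
    """Simulate n observations from the type II Tobit model in (9.2)-(9.3).

    Illustrative only: regressors are a constant plus standard Normal draws.
    Returns a list of (y, s, x, z) tuples; y is NaN when s == 0.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [1.0] + [rng.gauss(0.0, 1.0) for _ in beta[1:]]
        z = [1.0] + [rng.gauss(0.0, 1.0) for _ in delta[1:]]
        v = rng.gauss(0.0, 1.0)             # selection error, variance 1
        u = rng.gauss(0.0, math.sqrt(xi))   # part of eps that is independent of v
        eps = gamma * v + u                 # Var(eps) = xi + gamma^2, Cov(eps, v) = gamma
        s_star = sum(zj * dj for zj, dj in zip(z, delta)) + v
        s = 1 if s_star > 0 else 0
        y_star = sum(xj * bj for xj, bj in zip(x, beta)) + eps
        y = y_star if s == 1 else float("nan")  # outcome is missing when not selected
        data.append((y, s, x, z))
    return data

sample = simulate_tobit2(20000, beta=[1.0, 0.5], delta=[0.3, 1.0],
                         xi=1.0, gamma=0.5, seed=42)
n1 = sum(s for _, s, _, _ in sample)
print(n1, len(sample) - n1)  # N_1 observed outcomes, N_0 missing outcomes
```

Data simulated like this mimic the structure the model expects: a numerical outcome for the selected observations and a missing value for the rest.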

Priors

| Parameter | Probability density function | Default hyperparameters |
|---|---|---|
| $\beta$ | $p(\beta) = \frac{\lvert \mathbf{P}_\beta \rvert^{1/2}}{(2\pi)^{K/2}} \exp\left\{ -\frac{1}{2} (\beta - \mathbf{m}_\beta)^{\prime} \mathbf{P}_\beta (\beta - \mathbf{m}_\beta) \right\}$ | $\mathbf{m}_\beta = \mathbf{0}_K$, $\mathbf{P}_\beta = 0.001 \cdot \mathbf{I}_K$ |
| $\delta$ | $p(\delta) = \frac{\lvert \mathbf{P}_\delta \rvert^{1/2}}{(2\pi)^{L/2}} \exp\left\{ -\frac{1}{2} (\delta - \mathbf{m}_\delta)^{\prime} \mathbf{P}_\delta (\delta - \mathbf{m}_\delta) \right\}$ | $\mathbf{m}_\delta = \mathbf{0}_L$, $\mathbf{P}_\delta = 0.001 \cdot \mathbf{I}_L$ |
| $\xi$ | $p\left(\frac{1}{\xi}\right) = \frac{b_\xi^{a_\xi}}{\Gamma(a_\xi)} \left(\frac{1}{\xi}\right)^{a_\xi - 1} e^{-b_\xi / \xi}$ | $a_\xi = 0.001$, $b_\xi = 0.001$ |
| $\gamma$ | $p(\gamma \mid \xi) = \frac{(t_\gamma / \xi)^{1/2}}{(2\pi)^{1/2}} \exp\left\{ -\frac{t_\gamma}{2\xi} (\gamma - m_\gamma)^{2} \right\}$ | $m_\gamma = 0$, $t_\gamma = 1$ |

 Because $\xi$ is a variance parameter, a Gamma prior is placed on the corresponding precision parameter, $\frac{1}{\xi}$. This is equivalent to placing an inverse-Gamma prior on $\xi$ directly.
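
 The equivalence follows from a change of variables. A short derivation, using only the Gamma density given for $\frac{1}{\xi}$ in the table of priors and the Jacobian $\left|\frac{d(1/\xi)}{d\xi}\right| = \xi^{-2}$:

```latex
p(\xi)
  = p\!\left(\tfrac{1}{\xi}\right)\left|\frac{d(1/\xi)}{d\xi}\right|
  = \frac{b_\xi^{a_\xi}}{\Gamma(a_\xi)}
    \left(\frac{1}{\xi}\right)^{a_\xi-1} e^{-b_\xi/\xi}\cdot\frac{1}{\xi^{2}}
  = \frac{b_\xi^{a_\xi}}{\Gamma(a_\xi)}\,\xi^{-(a_\xi+1)}\,e^{-b_\xi/\xi}
```

 The last expression is the density of an inverse-Gamma distribution with shape $a_\xi$ and scale $b_\xi$.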

 The prior for $\gamma$ depends on the value of $\xi$: given $\xi$, $\gamma$ follows a Normal distribution with mean ${m}_{\gamma }$ and precision $\frac{{t}_{\gamma }}{\xi }$. This is done so that the prior uncertainty around $\gamma$ scales along with the prior uncertainty around $\xi$.
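
 The hierarchical structure of this prior can be illustrated by sampling from it: first draw the precision $\frac{1}{\xi}$ from its Gamma prior, then draw $\gamma$ from a Normal distribution whose variance is $\frac{\xi}{t_\gamma}$. The Python sketch below is not BayES code; the function name `draw_xi_gamma` is hypothetical and the hyperparameter values are chosen for illustration rather than being the defaults.

```python
import math
import random

def draw_xi_gamma(a_xi, b_xi, m_gamma, t_gamma, rng):
    """Draw (xi, gamma) from the joint prior described above.

    1/xi ~ Gamma(shape=a_xi, rate=b_xi); gamma | xi ~ N(m_gamma, xi / t_gamma).
    """
    # random.gammavariate takes (shape, scale), so scale = 1 / rate
    precision = rng.gammavariate(a_xi, 1.0 / b_xi)
    xi = 1.0 / precision
    gamma = rng.gauss(m_gamma, math.sqrt(xi / t_gamma))
    return xi, gamma

rng = random.Random(1)
# Illustrative hyperparameters; the defaults (0.001, 0.001) give a very
# diffuse prior whose draws are numerically extreme.
draws = [draw_xi_gamma(3.0, 2.0, 0.0, 1.0, rng) for _ in range(50000)]
mean_precision = sum(1.0 / xi for xi, _ in draws) / len(draws)
print(round(mean_precision, 2))  # close to a_xi / b_xi = 1.5
```

 Larger draws of $\xi$ produce more dispersed draws of $\gamma$, which is the scaling behaviour described above.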

Syntax

$\left[$<model name> = $\right]$ tobitII( y ~ x1 x2 $\dots$ xK $|$ z1 z2 $\dots$ zL $\left[$, <options> $\right]$ );

where:

• y is the dependent variable name in the outcome equation, as it appears in the dataset used for estimation
• x1 x2 $\dots$ xK is a list of the $K$ independent variable names in the outcome equation, as they appear in the dataset used for estimation; when a constant term is to be included in the model, this must be requested explicitly
• z1 z2 $\dots$ zL is a list of the $L$ independent variable names in the selection equation, as they appear in the dataset used for estimation; when a constant term is to be included in the model, this must be requested explicitly

 The dependent variable, y, in the dataset used for estimation must contain both numerical values and missing values ("nans"); an observation with a missing value for y is treated as not selected (${s}_{i}=0$). For these observations the values of the variables in the x list are not used during estimation and could themselves be missing. However, observations with missing values in the z list are dropped prior to estimation.
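
 The data rules above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not BayES code: the function name `prepare_rows` and the dictionary layout of an observation are assumptions, and the selection indicator is taken to be 1 exactly when y is observed.

```python
import math

def prepare_rows(rows):
    """Apply the missing-value rules to a list of observations.

    rows: list of dicts with keys "y" (float, NaN if missing),
          "x" (list of floats) and "z" (list of floats).
    Assumption: s = 1 when y is observed, s = 0 when y is NaN.
    """
    kept = []
    for row in rows:
        if any(math.isnan(value) for value in row["z"]):
            continue  # missing selection covariates: the observation is dropped
        s = 0 if math.isnan(row["y"]) else 1
        kept.append({**row, "s": s})
    return kept

nan = float("nan")
raw = [
    {"y": 2.1, "x": [1.0, 0.4], "z": [1.0, 0.4, 1.3]},  # selected
    {"y": nan, "x": [1.0, nan], "z": [1.0, 0.2, 0.7]},  # not selected; x is not needed
    {"y": 1.7, "x": [1.0, 0.9], "z": [1.0, nan, 0.5]},  # dropped: missing value in z
]
clean = prepare_rows(raw)
print([row["s"] for row in clean])  # [1, 0]
```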

The optional arguments for the type II Tobit model are:$^{2}$

| Option | Description |
|---|---|
| *Gibbs parameters* | |
| "chains" | number of chains to run in parallel (positive integer); the default value is 1 |
| "burnin" | number of burn-in draws per chain (positive integer); the default value is 10000 |
| "draws" | number of retained draws per chain (positive integer); the default value is 20000 |
| "thin" | value of the thinning parameter (positive integer); the default value is 1 |
| "seed" | value of the seed for the random-number generator (positive integer); the default value is 42 |
| *Hyperparameters* | |
| "m_beta" | mean vector of the prior for $\beta$ ($K \times 1$ vector); the default value is $\mathbf{0}_K$ |
| "P_beta" | precision matrix of the prior for $\beta$ ($K \times K$ symmetric and positive-definite matrix); the default value is $0.001 \cdot \mathbf{I}_K$ |
| "m_delta" | mean vector of the prior for $\delta$ ($L \times 1$ vector); the default value is $\mathbf{0}_L$ |
| "P_delta" | precision matrix of the prior for $\delta$ ($L \times L$ symmetric and positive-definite matrix); the default value is $0.001 \cdot \mathbf{I}_L$ |
| "a_xi" | shape parameter of the prior for $\frac{1}{\xi}$ (positive number); the default value is $0.001$ |
| "b_xi" | rate parameter of the prior for $\frac{1}{\xi}$ (positive number); the default value is $0.001$ |
| "m_gamma" | mean of the prior for $\gamma$; the default value is $0$ |
| "t_gamma" | precision scaling parameter of the prior for $\gamma$ (positive number); the default value is $1$ |
| *Dataset and log-marginal likelihood* | |
| "dataset" | the id value of the dataset that will be used for estimation; the default value is the first dataset in memory (in alphabetical order) |
| "logML_CJ" | boolean indicating whether the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood should be calculated (true or false); the default value is false |

Reported Parameters

| Parameter | Reported name | Description |
|---|---|---|
| $\beta$ | variable_name | vector of parameters associated with the independent variables in the outcome equation |
| $\delta$ | variable_name | vector of parameters associated with the independent variables in the selection equation |
| $\xi$ | xi | variance of the error term in the outcome equation, $\varepsilon_i$, conditional on $v_i$ |
| $\gamma$ | gamma | covariance of the error terms in the outcome and selection equations |
| $\sigma_\varepsilon$ | sigma_e | standard deviation of the error term in the outcome equation: $\sigma_\varepsilon = \left(\xi + \gamma^{2}\right)^{1/2}$ |
| $\rho$ | rho | correlation coefficient between the error terms in the outcome and selection equations: $\rho = \gamma / \left(\xi + \gamma^{2}\right)^{1/2}$ |
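
 The two derived quantities follow from the covariance matrix in (9.3), where the variance of the outcome error is $\xi + \gamma^2$ and the variance of the selection error is 1. A minimal Python sketch of the mapping (the function name `derived_parameters` is hypothetical and the input pairs are made-up values, not output from any model):

```python
import math

def derived_parameters(xi, gamma):
    """Return (sigma_e, rho) implied by the covariance matrix in (9.3)."""
    sigma_e = math.sqrt(xi + gamma ** 2)  # sqrt of Var(eps) = xi + gamma^2
    rho = gamma / sigma_e                 # Var(v) = 1, so rho = Cov / sd(eps)
    return sigma_e, rho

# Applied draw by draw, this turns draws of (xi, gamma)
# into draws of (sigma_e, rho).
pairs = [(3.0, 1.0), (0.75, -0.5)]
print([derived_parameters(xi, g) for xi, g in pairs])  # [(2.0, 0.5), (1.0, -0.5)]
```

 Because $\xi > 0$, the implied correlation always lies strictly between $-1$ and $1$.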

Stored values and post-estimation analysis
If a left-hand-side id value is provided when a type II Tobit model is created, then the following results are saved in the model item and are accessible via the ‘.’ operator:

| Stored value | Description |
|---|---|
| Samples | a matrix containing the draws from the posterior of $\beta$, $\delta$, $\xi$ and $\gamma$ |
| y\$x1, $\dots$, y\$xK | vectors containing the draws from the posterior of the parameters associated with variables x1, $\dots$, xK (the names of these vectors are the names of the variables that were included in the right-hand side of the outcome equation of the model, prepended by y\$, where y is the name of the dependent variable; this is done so that the samples on the parameters associated with a variable that appears in both x and z lists can be distinguished) |
| s\$z1, $\dots$, s\$zL | vectors containing the draws from the posterior of the parameters associated with variables z1, $\dots$, zL (the names of these vectors are the names of the variables that were included in the z list, in the right-hand side of the selection equation of the model, prepended by s\$; this is done so that the samples on the parameters associated with a variable that appears in both x and z lists can be distinguished) |
| xi | vector containing the draws from the posterior of $\xi$ |
| gamma | vector containing the draws from the posterior of $\gamma$ |
| logML | the Lewis & Raftery (1997) approximation of the log-marginal likelihood |
| logML_CJ | the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood; this is available only if the model was estimated with the "logML_CJ"=true option |
| nchains | the number of chains that were used to estimate the model |
| nburnin | the number of burn-in draws per chain that were used when estimating the model |
| ndraws | the total number of retained draws from the posterior ($=$ chains $\cdot$ draws) |
| nthin | value of the thinning parameter that was used when estimating the model |
| nseed | value of the seed for the random-number generator that was used when estimating the model |

Additionally, the following functions are available for post-estimation analysis (see section B.14):

• diagnostics()
• test()
• pmp()
• mfx()

Usually the marginal effects of primary importance in a type II Tobit model are the effects of changes in the independent variables in the outcome equation on the expected value of the dependent variable in the same equation, for the entire population (whether selected or not). These effects, at least for variables included linearly in the model, are the corresponding $\beta$s. Nevertheless, there are two additional types of marginal effects that could be of interest and which are not linear functions of the model’s parameters. The type II Tobit model uses the mfx() function to calculate and report the marginal effects of:

• the variables in the z list on the probability of selection: $\frac{\partial \mathrm{Prob}\left({s}_{i}=1 \mid {\mathbf{z}}_{i}\right)}{\partial {z}_{i\ell }}$
• the variables in the x list on the expected value of the response variable for the part of the population that is selected$^{3}$: $\frac{\partial E\left({y}_{i} \mid {\mathbf{x}}_{i},{\mathbf{z}}_{i},{s}_{i}=1\right)}{\partial {x}_{ik}}$

The two types of marginal effects can be requested by setting the "type" argument of the mfx() function equal to 1 or 2, respectively. The generic syntax for a statement involving the mfx() function after estimation of a type II Tobit model is:

mfx( $\left[$"type"=1$\right]$ $\left[$, "point"=<point of calculation>$\right]$ $\left[$, "model"=<model name>$\right]$ );

and:

mfx( "type"=2 $\left[$, "point"=<point of calculation>$\right]$ $\left[$, "model"=<model name>$\right]$ );

for calculation of the marginal effects on $\mathrm{Prob}\left(s=1\right)$ and on $E\left(y \mid s=1\right)$, respectively. The default value of the "type" option is 1. See the general documentation of the mfx() function (section B.14) for details on the other optional arguments.
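
Under the model in (9.2) and (9.3), both quantities have standard closed forms: the probability of selection is $\Phi(\mathbf{z}_i^{\prime}\delta)$ and the expected outcome for a selected observation is $\mathbf{x}_i^{\prime}\beta + \gamma\,\lambda(\mathbf{z}_i^{\prime}\delta)$, where $\lambda(w) = \phi(w)/\Phi(w)$ is the inverse Mills ratio. The Python sketch below differentiates these expressions at a single parameter value. It is an illustration under these assumptions, not a description of how mfx() is implemented, and the function names and numerical inputs are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def inverse_mills(w):
    """lambda(w) = phi(w) / Phi(w) for a standard Normal."""
    return _STD_NORMAL.pdf(w) / _STD_NORMAL.cdf(w)

def mfx_selection(z, delta, l):
    """d Prob(s=1 | z) / d z_l = phi(z'delta) * delta_l."""
    w = sum(zj * dj for zj, dj in zip(z, delta))
    return _STD_NORMAL.pdf(w) * delta[l]

def mfx_selected_outcome(z, delta, gamma, beta_k, delta_k=0.0):
    """d E(y | x, z, s=1) / d x_k, from E(y | s=1) = x'beta + gamma * lambda(z'delta).

    delta_k is the coefficient of the same variable in the selection
    equation (0 if the variable does not appear there).
    """
    w = sum(zj * dj for zj, dj in zip(z, delta))
    lam = inverse_mills(w)
    # d lambda / d w = -lambda * (w + lambda)
    return beta_k - gamma * delta_k * lam * (w + lam)

z_point = [1.0, 0.0]   # point of calculation (illustrative)
delta = [0.0, 1.0]
effect_on_prob = mfx_selection(z_point, delta, 1)
effect_shared = mfx_selected_outcome(z_point, delta, gamma=0.5, beta_k=0.5, delta_k=1.0)
effect_x_only = mfx_selected_outcome(z_point, delta, gamma=0.5, beta_k=0.5)
print(effect_on_prob, effect_shared, effect_x_only)
```

When the variable does not enter the selection equation, the second effect reduces to the corresponding coefficient in $\beta$; otherwise it also depends on $\gamma$ and on the point of calculation.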

Examples

Example 1

myData = import("$BayESHOME/Datasets/dataset8.csv");
myData.constant = ones(rows(myData), 1);

tobitII( y2 ~ constant x1 x2 x3 x4 | constant x1 x2 x3 z1 z2 );

Example 2

myData = import("$BayESHOME/Datasets/dataset8.csv");
myData.constant = ones(rows(myData), 1);

myModel = tobitII( y2 ~ constant x1 x2 x3 x4 | constant x1 x2 x3 x4 z1 z2,
"m_beta"=ones(5,1), "P_beta" = 0.1*eye(5,5),
"m_delta"=ones(7,1), "P_delta" = 0.1*eye(7,7),
"a_xi"=0.01, "b_xi"=0.01, "m_gamma"=0.0, "t_gamma"=0.1,
"burnin"=10000, "draws"=40000, "thin"=4, "chains"=2,
"logML_CJ" = true, "dataset"=myData);

diagnostics("model"=myModel);

mfx("type"=1,"point"="mean","model"=myModel);
mfx("type"=2,"point"="mean","model"=myModel);

$^{2}$ Optional arguments are always given in option-value pairs (e.g. "chains"=3).

$^{3}$ If the $k$-th independent variable in the outcome equation does not also appear as an independent variable in the selection equation, then its marginal effect is simply ${\beta }_{k}$.