### 8.2 Negative-Binomial model

Mathematical representation

 $y_i \sim \text{NBinom}\left(p_i, \gamma\right), \qquad p_i = \frac{\mu_i}{\mu_i + \gamma} \quad \text{and} \quad \mu_i = e^{\mathbf{x}_i^{\prime}\beta}$ (8.2)

With this parameterization, $E\left(y_i|\mathbf{x}_i\right)=\mu_i$ and $V\left(y_i|\mathbf{x}_i\right)=\mu_i+\frac{\mu_i^{2}}{\gamma}$.

• the model is estimated using $N$ observations
• $y_i$ is the value of the dependent variable for observation $i$; it takes non-negative integer values
• $\mathbf{x}_i$ is a $K \times 1$ vector that stores the values of the $K$ independent variables for observation $i$
• $\beta$ is a $K \times 1$ vector of parameters
• $\gamma$ is the over-dispersion parameter; as $\gamma \to \infty$ the variance approaches $\mu_i$ and the negative-Binomial model tends towards the Poisson model
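
The mean and variance stated above can be checked numerically. The sketch below is plain Python, not BayES code. It assumes the probability mass function $P(y)=\frac{\Gamma(y+\gamma)}{\Gamma(\gamma)\,y!}\,p^{y}(1-p)^{\gamma}$, which this section does not write out but which is the form consistent with the stated moments. The function names, the truncation point `y_max` and the numerical values are illustrative assumptions.

```python
import math

def nbinom_pmf(y, mu, gamma):
    """P(y) assuming the pmf Gamma(y+g)/(Gamma(g) y!) * p^y * (1-p)^g,
    with p = mu / (mu + gamma) as in equation (8.2)."""
    p = mu / (mu + gamma)
    log_pmf = (math.lgamma(y + gamma) - math.lgamma(gamma) - math.lgamma(y + 1)
               + y * math.log(p) + gamma * math.log1p(-p))
    return math.exp(log_pmf)

def moments(mu, gamma, y_max=2000):
    """Total probability, mean and variance from a truncated sum over y."""
    probs = [nbinom_pmf(y, mu, gamma) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * q for y, q in enumerate(probs))
    var = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
    return total, mean, var

# Illustrative values (not taken from any dataset).
mu, gamma = 3.0, 2.0
total, mean, var = moments(mu, gamma)
print(total, mean, var)         # compare with 1, mu and mu + mu**2 / gamma
print(mu + mu ** 2 / gamma)     # closed-form variance

# A very large gamma should give a variance close to the mean (Poisson limit).
print(moments(mu, 1e6))
```

With a small $\gamma$ the variance is well above the mean, while a very large $\gamma$ brings the two together, which matches the Poisson limit described in the last bullet.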

Priors

| Parameter | Probability density function | Default hyperparameters |
|---|---|---|
| $\beta$ | $p\left(\beta\right)=\frac{\lvert\mathbf{P}\rvert^{1/2}}{\left(2\pi\right)^{K/2}}\exp\left\{-\frac{1}{2}\left(\beta-\mathbf{m}\right)^{\prime}\mathbf{P}\left(\beta-\mathbf{m}\right)\right\}$ | $\mathbf{m}=\mathbf{0}_K$, $\mathbf{P}=0.001\cdot\mathbf{I}_K$ |
| $\gamma$ | $p\left(\gamma\right)=\frac{b_\gamma^{a_\gamma}}{\Gamma\left(a_\gamma\right)}\gamma^{a_\gamma-1}e^{-\gamma b_\gamma}$ | $a_\gamma=0.001$, $b_\gamma=0.001$ |
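
To get a feel for how diffuse the default priors are, the implied prior moments can be computed directly. The sketch below is plain Python rather than BayES code, and the helper names are hypothetical. It uses only standard results: a Gamma density with shape $a_\gamma$ and rate $b_\gamma$ has mean $a_\gamma/b_\gamma$ and variance $a_\gamma/b_\gamma^{2}$, and a Normal prior with precision matrix $c\cdot\mathbf{I}_K$ has variance $1/c$ for each coefficient.

```python
import math

def gamma_prior_moments(a, b):
    """Mean and variance of a Gamma prior with shape a and rate b."""
    return a / b, a / b ** 2

def beta_prior_sd(precision):
    """Prior standard deviation of one coefficient when P = precision * I."""
    return math.sqrt(1.0 / precision)

# Default hyperparameters from the table above.
mean, var = gamma_prior_moments(0.001, 0.001)
print(mean, var)             # prior mean and variance of gamma
print(beta_prior_sd(0.001))  # prior standard deviation of each element of beta
```

Larger values of $b_\gamma$ or of the diagonal of $\mathbf{P}$ tighten the corresponding prior.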

Syntax

[<model name> = ] nbinom( y ~ x1 x2 $\dots$ xK [, <options>] );

where:

• y is the dependent variable name, as it appears in the dataset used for estimation
• x1 x2 $\dots$ xK is a list of the $K$ independent variable names, as they appear in the dataset used for estimation; a constant term is not added automatically and must be requested explicitly, for example by adding a column of ones to the dataset and listing it among the independent variables

The dependent variable, y, in the dataset used for estimation must contain non-negative integer values. Observations with missing values in y are dropped during estimation; a non-integer or negative value in y produces an error.

The optional arguments for the negative-Binomial model are:²

| Option | Description |
|---|---|
| **Gibbs parameters** | |
| "chains" | number of chains to run in parallel (positive integer); the default value is 1 |
| "burnin" | number of burn-in draws per chain (positive integer); the default value is 10000 |
| "draws" | number of retained draws per chain (positive integer); the default value is 20000 |
| "thin" | value of the thinning parameter (positive integer); the default value is 1 |
| "seed" | value of the seed for the random-number generator (positive integer); the default value is 42 |
| **Hyperparameters** | |
| "m" | mean vector of the prior for $\beta$ ($K \times 1$ vector); the default value is $\mathbf{0}_K$ |
| "P" | precision matrix of the prior for $\beta$ ($K \times K$ symmetric and positive-definite matrix); the default value is $0.001\cdot\mathbf{I}_K$ |
| "a_gamma" | shape parameter of the prior for $\gamma$ (positive number); the default value is $0.001$ |
| "b_gamma" | rate parameter of the prior for $\gamma$ (positive number); the default value is $0.001$ |
| **Dataset and log-marginal likelihood** | |
| "dataset" | the id value of the dataset that will be used for estimation; the default value is the first dataset in memory (in alphabetical order) |
| "logML_CJ" | boolean indicating whether the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood should be calculated (true or false); the default value is false |

Reported Parameters

| Parameter | Reported name | Description |
|---|---|---|
| $\beta$ | variable_name | vector of parameters associated with the independent variables |
| $\gamma$ | gamma | over-dispersion parameter |

Stored values and post-estimation analysis
If a left-hand-side id value is provided when a negative-Binomial model is created, then the following results are saved in the model item and are accessible via the ‘.’ operator:

| Name | Description |
|---|---|
| Samples | a matrix containing the draws from the posterior of $\beta$ and $\gamma$ |
| x1,$\dots$,xK | vectors containing the draws from the posterior of the parameters associated with variables x1,$\dots$,xK (the names of these vectors are the names of the variables that were included in the right-hand side of the model) |
| gamma | vector containing the draws from the posterior of $\gamma$ |
| logML | the Lewis & Raftery (1997) approximation of the log-marginal likelihood |
| logML_CJ | the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood; this is available only if the model was estimated with the "logML_CJ"=true option |
| nchains | the number of chains that were used to estimate the model |
| nburnin | the number of burn-in draws per chain that were used when estimating the model |
| ndraws | the total number of retained draws from the posterior ($=$ chains $\cdot$ draws) |
| nthin | value of the thinning parameter that was used when estimating the model |
| nseed | value of the seed for the random-number generator that was used when estimating the model |

Additionally, the following functions are available for post-estimation analysis (see section B.14):

• diagnostics()
• test()
• pmp()
• mfx()

The negative-Binomial model uses the mfx() function to calculate and report the marginal effects of the variables in the x list on:

• the expected value of the dependent variable: $\frac{\partial E\left({y}_{i}|{\mathbf{x}}_{i}\right)}{\partial {x}_{ik}}$
• the variance of the dependent variable: $\frac{\partial V\left({y}_{i}|{\mathbf{x}}_{i}\right)}{\partial {x}_{ik}}$
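
For a continuous regressor, the two derivatives follow from $\mu_i=e^{\mathbf{x}_i^{\prime}\beta}$ and the variance formula: $\frac{\partial E}{\partial x_{ik}}=\beta_k\mu_i$ and $\frac{\partial V}{\partial x_{ik}}=\beta_k\mu_i\left(1+\frac{2\mu_i}{\gamma}\right)$. The sketch below evaluates these expressions in plain Python at a single parameter value and compares them with a finite-difference approximation. It is an illustration of the formulas only, not a description of how mfx() is implemented; the function names and all numerical values are hypothetical, and the treatment of discrete regressors or of posterior draws is not covered.

```python
import math

def mean_and_variance(x, beta, gamma):
    """E(y|x) = mu and V(y|x) = mu + mu^2 / gamma, with mu = exp(x'beta)."""
    mu = math.exp(sum(xk * bk for xk, bk in zip(x, beta)))
    return mu, mu + mu ** 2 / gamma

def marginal_effects(x, beta, gamma):
    """Analytic derivatives of E(y|x) and V(y|x) with respect to each x_k."""
    mu, _ = mean_and_variance(x, beta, gamma)
    on_mean = [bk * mu for bk in beta]
    on_var = [bk * mu * (1.0 + 2.0 * mu / gamma) for bk in beta]
    return on_mean, on_var

def finite_difference(x, beta, gamma, k, h=1e-6):
    """Central-difference approximation of both derivatives for regressor k."""
    x_up, x_dn = list(x), list(x)
    x_up[k] += h
    x_dn[k] -= h
    m_up, v_up = mean_and_variance(x_up, beta, gamma)
    m_dn, v_dn = mean_and_variance(x_dn, beta, gamma)
    return (m_up - m_dn) / (2 * h), (v_up - v_dn) / (2 * h)

# Hypothetical point of calculation and parameter values.
x = [1.0, 0.5, -1.2]       # first element plays the role of a constant
beta = [0.2, 0.7, -0.3]
gamma = 2.5

on_mean, on_var = marginal_effects(x, beta, gamma)
print(on_mean[1], on_var[1])                 # analytic effects of the second regressor
print(finite_difference(x, beta, gamma, 1))  # numerical check of the same effects
```

The derivative with respect to the constant is computed here only for completeness; it has no interpretation as a marginal effect.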

These two types of marginal effects can be requested by setting the "type" argument of the mfx() function equal to 1 or 2, respectively. The generic syntax for a statement involving the mfx() function after estimation of a negative-Binomial model is:

mfx( ["type"=1] [, "point"=<point of calculation>] [, "model"=<model name>] );

and:

mfx( "type"=2 [, "point"=<point of calculation>] [, "model"=<model name>] );

for calculation of the marginal effects on $E\left(y\right)$ and on $V\left(y\right)$, respectively. The default value of the "type" option is 1. See the general documentation of the mfx() function (section B.14) for details on the other optional arguments.

Examples

Example 1

myData = import("$BayESHOME/Datasets/dataset9.csv");
myData.constant = ones(rows(myData), 1);

nbinom( y ~ constant x1 x2 x3 x4 );

Example 2

myData = import("$BayESHOME/Datasets/dataset9.csv");
myData.constant = ones(rows(myData), 1);

myModel = nbinom( y ~ constant x1 x2 x3 x4,
"m"=zeros(5,1), "P" = 0.1*eye(5,5),
"a_gamma"=0.01, "b_gamma"=0.1,
"burnin"=10000, "draws"=30000, "thin"=3, "chains"=2,
"logML_CJ" = true, "dataset"=myData);

diagnostics("model"=myModel);

mfx("type"=1,"point"="median","model"=myModel);
mfx("type"=2,"point"="median","model"=myModel);

² Optional arguments are always given in option-value pairs (e.g. "chains"=3).