4.6 Latent-class linear model with panel data
Mathematical representation
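$$y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta}_j + \varepsilon_{it}, \qquad \varepsilon_{it} \sim \mathrm{N}\left(0, \tfrac{1}{\tau_j}\right), \qquad \text{with probability } \pi_{ij} \tag{4.7}$$
- the model is estimated using observations from $N$ groups, each group observed for $T_i$ periods (balanced or unbalanced panels); the total number of observations is $\sum_{i=1}^{N} T_i$
- the model involves $M$ classes, indexed by $j = 0, 1, \dots, M-1$ (counting starts at zero), and each group, $i$, is restricted to belong to the same class for all $T_i$ observations
- $y_{it}$ is the value of the dependent variable for group $i$, observed in period $t$
- $\mathbf{x}_{it}$ is a $K \times 1$ vector that stores the values of the independent variables for group $i$, observed in period $t$
- $\boldsymbol{\beta}_j$ is a $K \times 1$ vector of parameters for class $j$
- $\tau_j$ is the precision of the error term for class $j$: $\sigma_{\varepsilon_j}^2 = 1/\tau_j$
- each group, $i$, belongs to class $j$ with prior probability $\pi_{ij}$ (before seeing the data).
BayES supports two types of models:
- unconditional prior class-membership probabilities, in which case:
$$\pi_{ij} = \pi_j \quad \text{for all } i$$
With this specification $\boldsymbol{\pi}$ is an $M \times 1$ vector of parameters to be estimated, with $\pi_j \geq 0$ and $\sum_{j=0}^{M-1} \pi_j = 1$.
- conditional prior class-membership probabilities, in which case:
$$\pi_{ij} = \frac{\exp\left(\mathbf{z}_i'\boldsymbol{\delta}_j\right)}{\sum_{\ell=0}^{M-1} \exp\left(\mathbf{z}_i'\boldsymbol{\delta}_\ell\right)}$$
where:
  - $\mathbf{z}_i$ is an $L \times 1$ vector that stores the values of the determinants of class membership for group $i$; these variables vary by group only (not within group)
  - $\boldsymbol{\delta}_j$ is an $L \times 1$ vector of parameters to be estimated
In this specification class-membership probabilities are determined by a multinomial Logit model, where, for identification purposes, $\boldsymbol{\delta}_0$ is normalized to an $L \times 1$ vector of zeros.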
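The conditional specification can be illustrated outside BayES with a short Python/NumPy sketch of multinomial-Logit class-membership probabilities. The function name `prior_class_probs` and the array layout (one row of `Z` per group, one column of `delta` per class) are assumptions made for this example and are not part of BayES.

```python
import numpy as np

def prior_class_probs(Z, delta):
    """Multinomial-Logit prior class-membership probabilities.

    Z     : (N, L) array, determinants of class membership, one row per group
    delta : (L, M) array, one column per class; column 0 is forced to zero
            (the identification restriction on the base class)
    Returns an (N, M) array whose rows sum to one.
    """
    Z = np.asarray(Z, dtype=float)
    delta = np.array(delta, dtype=float)          # copy, so the caller's array is untouched
    delta[:, 0] = 0.0                             # class 0 is the base class
    scores = Z @ delta                            # (N, M) linear indices z_i' delta_j
    scores -= scores.max(axis=1, keepdims=True)   # stabilise the exponentials
    expo = np.exp(scores)
    return expo / expo.sum(axis=1, keepdims=True)

# Three groups, a constant plus one determinant, two classes (made-up numbers)
Z = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 2.0]])
delta = np.array([[0.0, 0.5], [0.0, 1.0]])
probs = prior_class_probs(Z, delta)
```

Because the probabilities depend only on group-level variables, each group gets a single row of prior probabilities that applies to all of its periods.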
Priors
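Parameter | Probability density function | Default hyperparameters |
Common to both model types
$\boldsymbol{\beta}_j$ | $p(\boldsymbol{\beta}_j) = \frac{|\mathbf{P}|^{1/2}}{(2\pi)^{K/2}} \exp\left\{-\tfrac{1}{2}(\boldsymbol{\beta}_j-\mathbf{m})'\mathbf{P}(\boldsymbol{\beta}_j-\mathbf{m})\right\}$ | |
$\tau_j$ | $p(\tau_j) = \frac{b_\tau^{a_\tau}}{\Gamma(a_\tau)} \tau_j^{a_\tau-1} e^{-b_\tau \tau_j}$ | |
Model with unconditional class-membership probabilities
$\boldsymbol{\pi}$ | $p(\boldsymbol{\pi}) = \frac{\Gamma\left(\sum_j a_j\right)}{\prod_j \Gamma(a_j)} \prod_j \pi_j^{a_j-1}$ | $\mathbf{a}$: vector of ones |
Model with conditional class-membership probabilities
$\boldsymbol{\delta}_j$ | $p(\boldsymbol{\delta}_j) = \frac{|\mathbf{P}_\delta|^{1/2}}{(2\pi)^{L/2}} \exp\left\{-\tfrac{1}{2}(\boldsymbol{\delta}_j-\mathbf{m}_\delta)'\mathbf{P}_\delta(\boldsymbol{\delta}_j-\mathbf{m}_\delta)\right\}$ | |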
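The hyperparameter options in this section are stated in terms of a precision matrix and a rate parameter, which differs from the covariance and scale conventions used by many numerical libraries. The Python/NumPy sketch below shows the conversion, assuming a multivariate-Normal prior for the coefficient vector of one class and a Gamma prior for its precision. All numerical values are hypothetical and are not BayES defaults.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical hyperparameter values, chosen only for illustration
K = 3
m = np.zeros(K)                 # prior mean of beta_j
P = 0.5 * np.eye(K)             # prior precision matrix of beta_j
a_tau, b_tau = 4.0, 2.0         # shape and rate of the Gamma prior on tau_j

# NumPy parameterises the Normal by its covariance and the Gamma by its scale,
# so the precision matrix is inverted and the rate is converted to scale = 1/rate.
beta_draws = rng.multivariate_normal(m, np.linalg.inv(P), size=50_000)
tau_draws = rng.gamma(shape=a_tau, scale=1.0 / b_tau, size=50_000)

prior_mean_tau = tau_draws.mean()          # close to a_tau / b_tau
prior_var_beta = beta_draws.var(axis=0)    # close to the diagonal of inv(P)
```

A larger precision therefore means a tighter prior, and a larger rate pulls the prior mean of the error precision towards zero.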
Syntax
where:
- y is the dependent variable name, as it appears in the dataset used for estimation
- x1 x2 … xK is a list of the independent variable names, as they appear in the dataset used for estimation; when a constant term is to be included in the model, this must be requested explicitly
- z1 z2 … zL is an optional list of the variable names that enter the specification of the class-membership probabilities (determinants of class membership), as they appear in the dataset used for estimation; when a constant term is to be included in the model, this must be requested explicitly. When this list is provided, the conditional latent-class model is estimated; when it is not provided, the unconditional model is estimated. The values of the variables in this list must be constant within each group.
If the dataset used for estimation has not been previously declared as a panel dataset, or if this structure has been removed (by a call to the set_cs() function), then the model estimated is the one documented in the preceding section. In that model, different time observations from the same group are allowed to belong to different classes.
The optional arguments for the latent-class linear model with panel data are:
Gibbs parameters
"chains" | number of chains to run in parallel (positive integer); the default value is 1 |
"burnin" | number of burn-in draws per chain (positive integer); the default value is 10000 |
"draws" | number of retained draws per chain (positive integer); the default value is 20000 |
"thin" | value of the thinning parameter (positive integer); the default value is 1 |
"seed" | value of the seed for the random-number generator (positive integer); the default value is 42 |
Model specification
"classes" | specification of the number of classes to be used in the model (positive integer); the default value is 2 |
Hyperparameters
Common to both model types
"m" | mean vector of the prior for each $\boldsymbol{\beta}_j$ ($K \times 1$ vector); the default value is |
"P" | precision matrix of the prior for each $\boldsymbol{\beta}_j$ ($K \times K$ symmetric and positive-definite matrix); the default value is |
"mj" | mean vector of the prior for $\boldsymbol{\beta}_j$, $j = 0, 1, \dots$ ($K \times 1$ vector); this mean overwrites the generic mean ("m") for class $j$ only |
"Pj" | precision matrix of the prior for $\boldsymbol{\beta}_j$, $j = 0, 1, \dots$ ($K \times K$ symmetric and positive-definite matrix); this precision matrix overwrites the generic precision matrix ("P") for class $j$ only |
"a_tau" | shape parameter of the prior for each $\tau_j$ (positive number); the default value is |
"b_tau" | rate parameter of the prior for each $\tau_j$ (positive number); the default value is |
"a_tauj" | shape parameter of the prior for $\tau_j$, $j = 0, 1, \dots$ (positive number); this shape parameter overwrites the generic shape parameter ("a_tau") for class $j$ only |
"b_tauj" | rate parameter of the prior for $\tau_j$, $j = 0, 1, \dots$ (positive number); this rate parameter overwrites the generic rate parameter ("b_tau") for class $j$ only |
Model with unconditional class-membership probabilities
"a" | vector of concentration parameters for the Dirichlet prior on $\boldsymbol{\pi}$ ($M \times 1$ vector with positive entries, $M$ being the number of classes); the default value is a vector of ones |
Model with conditional class-membership probabilities
"m_delta" | mean vector of the prior for each $\boldsymbol{\delta}_j$ ($L \times 1$ vector); the default value is |
"P_delta" | precision matrix of the prior for each $\boldsymbol{\delta}_j$ ($L \times L$ symmetric and positive-definite matrix); the default value is |
Dataset and log-marginal likelihood
"dataset" | the id value of the dataset that will be used for estimation; the default value is the first dataset in memory (in alphabetical order) |
"logML_CJ" | boolean indicating whether the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood should be calculated (true or false); the default value is false |
Reported Parameters
Common to both model types
$\boldsymbol{\beta}_j$ | variable_name | vector of parameters associated with the independent variables for class $j$
$\tau_j$ | tau | precision parameter of the error term, $\varepsilon_{it}$, for class $j$
$\sigma_{\varepsilon_j}$ | sigma_e | standard deviation of the error term for class $j$: $\sigma_{\varepsilon_j} = 1/\sqrt{\tau_j}$
Model with unconditional class-membership probabilities
$\pi_j$ | pi | prior class-membership probability for class $j$
Model with conditional class-membership probabilities
$\boldsymbol{\delta}_j$ | variable_name | vector of parameters associated with the determinants of class membership for class $j$; for identification purposes, these parameters for class 0 are normalized to zero
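Since sigma_e is a nonlinear function of tau, a summary of sigma_e should be computed from transformed draws, not by transforming a summary of tau. The Python/NumPy sketch below illustrates the difference using simulated stand-in draws; the draws are not BayES output and the Gamma parameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for posterior draws of tau_j (simulated here; not BayES output)
tau_draws = rng.gamma(shape=50.0, scale=0.08, size=20_000)

# Apply sigma = 1/sqrt(tau) to every draw, then summarise
sigma_draws = 1.0 / np.sqrt(tau_draws)
sigma_mean = sigma_draws.mean()

# Plugging the mean of tau into the transformation gives a different number:
# 1/sqrt(.) is convex, so by Jensen's inequality the plug-in value is smaller
sigma_plugin = 1.0 / np.sqrt(tau_draws.mean())
```

The gap between the two numbers shrinks as the posterior of the precision becomes more concentrated.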
Stored values and post-estimation analysis
If a left-hand-side id value is provided when a latent-class linear model with panel data is created,
then the following results are saved in the model item and are accessible via the ‘.’ operator:
Samples | a matrix containing the draws from the posterior of $\boldsymbol{\beta}_j$ and $\tau_j$ for $j = 0, 1, \dots, M-1$ (where $M$ is the number of classes) and, depending on the estimated model, $\boldsymbol{\pi}$ or $\boldsymbol{\delta}_j$. |
cj$x1,…,cj$xK | vectors containing the draws from the posterior of the parameters associated with variables x1,…,xK, for $j = 0, 1, \dots, M-1$ (the names of these vectors are the names of the variables that were included in the right-hand side of the model, prepended by ‘c’, the class index and a dollar sign; in this way ‘cj’ can be used to distinguish among parameters across different classes) |
cj$tau | vectors containing the draws from the posterior of each $\tau_j$, for $j = 0, 1, \dots, M-1$ (‘tau’ is prepended by ‘c’, the class index and the dollar sign; in this way ‘cj’ can be used to distinguish among precision parameters in different classes) |
pi_j | vectors containing the draws from the posterior of each $\pi_j$, for $j = 0, 1, \dots, M-1$ (these vectors are available only after the estimation of the model with unconditional class-membership probabilities) |
pi_j$z1,…,pi_j$zL | vectors containing the draws from the posterior of the parameters associated with variables z1,…,zL, for $j = 1, 2, \dots, M-1$ (the names of these vectors are the names of the variables that were included in the z list, in the right-hand side of the model, prepended by ‘pi_j’ and the dollar sign; in this way ‘pi_j’ can be used to distinguish among parameters associated with variables with different roles in the model, for example the same variable appearing in both x and z lists, as well as among parameters associated with a variable in the z list, but corresponding to different classes; these vectors are available only after the estimation of the model with conditional class-membership probabilities) |
logML | the Lewis & Raftery (1997) approximation of the log-marginal likelihood |
logML_CJ | the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood; this is available only if the model was estimated with the "logML_CJ"=true option |
pi_i | matrix that stores the expected values of the posterior class-membership probabilities for each group and for each of the classes |
nchains | the number of chains that were used to estimate the model |
nburnin | the number of burn-in draws per chain that were used when estimating the model |
ndraws | the total number of retained draws from the posterior (chains × draws) |
nthin | value of the thinning parameter that was used when estimating the model |
nseed | value of the seed for the random-number generator that was used when estimating the model |
nclasses | number of classes used during the estimation of the model |
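To make the meaning of pi_i concrete, the Python/NumPy sketch below computes class-membership probabilities for each group at one fixed set of parameter values, under the unconditional specification. The function name and argument layout are assumptions for this example. The stored pi_i is an expectation over the posterior, which would presumably correspond to averaging such a matrix over the retained draws; the sketch shows a single evaluation only.

```python
import numpy as np

def posterior_class_probs(y, X, group, beta, tau, pi):
    """Class-membership probabilities for each group at fixed parameter values.

    y     : (n,) dependent variable, all observations stacked
    X     : (n, K) independent variables
    group : (n,) integer group labels 0..N-1
    beta  : (K, M) one column of coefficients per class
    tau   : (M,) error precisions
    pi    : (M,) prior class-membership probabilities
    """
    resid = y[:, None] - X @ beta                          # (n, M) residuals per class
    # Normal log-density of every observation under every class
    logdens = 0.5 * np.log(tau / (2 * np.pi)) - 0.5 * tau * resid**2
    n_groups = group.max() + 1
    loglik = np.zeros((n_groups, beta.shape[1]))
    np.add.at(loglik, group, logdens)                      # sum over periods within each group
    logpost = np.log(pi) + loglik
    logpost -= logpost.max(axis=1, keepdims=True)          # stabilise before exponentiating
    post = np.exp(logpost)
    return post / post.sum(axis=1, keepdims=True)

# Made-up data: 4 groups, 5 periods each, two well-separated classes
rng = np.random.default_rng(1)
group = np.repeat(np.arange(4), 5)
X = np.column_stack([np.ones(20), np.tile(np.linspace(-1, 1, 5), 4)])
beta = np.array([[0.0, 5.0], [1.0, -1.0]])
tau = np.array([4.0, 4.0])
true_class = np.array([0, 1, 0, 1])
mu = (X @ beta)[np.arange(20), true_class[group]]
y = mu + rng.normal(scale=0.5, size=20)
post = posterior_class_probs(y, X, group, beta, tau, np.array([0.5, 0.5]))
```

Summing the log-densities over all periods of a group before normalising is what enforces the panel restriction that a group stays in one class for every observation.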
Additionally, the following functions are available for post-estimation analysis (see section B.14):
- diagnostics()
- test()
- pmp()
- store()
- mfx()
The latent-class linear model with panel data uses the store() function to associate the estimates of the posterior class-membership probabilities (pi_i) with specific observations and store their values in the dataset used for estimation. The generic syntax for a statement involving the store() function after estimation of a latent-class linear model with panel data is:
This statement will generate additional variables in the dataset used for estimation of the model, with names constructed by appending the class index ($j = 0, 1, \dots$) to the prefix provided as the second argument to store().
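The association of group-level probabilities with individual observations can be pictured with the following Python/NumPy sketch. It is only an illustration of the naming and row-expansion logic described above; the function name `expand_to_observations`, the prefix `cp_` and the numbers are hypothetical, and the sketch does not reproduce how BayES implements store().

```python
import numpy as np

def expand_to_observations(pi_i, group, prefix):
    """Repeat each group's row of pi_i for all of that group's observations.

    pi_i   : (N, M) group-level class-membership probabilities
    group  : (n,) integer group labels 0..N-1, one per observation
    prefix : string used to build the new variable names
    Returns a dict mapping prefix + class index to an (n,) array.
    """
    pi_i = np.asarray(pi_i, dtype=float)
    per_obs = pi_i[np.asarray(group)]          # (n, M): row lookup by group label
    return {f"{prefix}{j}": per_obs[:, j] for j in range(pi_i.shape[1])}

pi_i = np.array([[0.9, 0.1], [0.2, 0.8]])
group = np.array([0, 0, 0, 1, 1])              # unbalanced panel: 3 and 2 periods
new_vars = expand_to_observations(pi_i, group, "cp_")
```

Every observation of a group receives the same values, because class membership is defined at the group level.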
The latent-class linear model with panel data and conditional class-membership probabilities uses the mfx() function to calculate and report the marginal effects of the variables in the z list on the prior class-membership probabilities that come from the multinomial-Logit part of the model. The generic syntax for a statement involving the mfx() function after estimation of a latent-class linear model with panel data and conditional class-membership probabilities is:
See the general documentation of the mfx() function (section B.14) for details on the optional arguments.
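For a continuous determinant, the marginal effect on a multinomial-Logit probability has a closed form: the derivative of the probability of class j with respect to the determinant equals that probability times the difference between the class-j coefficient and the probability-weighted average coefficient. The Python/NumPy sketch below evaluates this at a single point; the function names and numbers are assumptions for the example, and it makes no claim about the point or averaging scheme mfx() uses.

```python
import numpy as np

def logit_probs(z, delta):
    """Multinomial-Logit probabilities for one group; delta is (L, M), column 0 zero."""
    s = z @ delta
    e = np.exp(s - s.max())
    return e / e.sum()

def logit_mfx(z, delta):
    """Marginal effects d pi_j / d z_l, returned as an (L, M) array.

    Uses d pi_j / d z = pi_j * (delta_j - sum_k pi_k * delta_k).
    """
    p = logit_probs(z, delta)
    delta_bar = delta @ p                      # (L,) probability-weighted average coefficient
    return p[None, :] * (delta - delta_bar[:, None])

z = np.array([1.0, 0.3])                       # constant and one determinant
delta = np.array([[0.0, 0.4, -0.2],
                  [0.0, 1.5, 0.7]])            # three classes, class 0 normalised to zero
effects = logit_mfx(z, delta)
```

The effects of any one variable sum to zero across classes, since the probabilities must continue to sum to one.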
Examples
Example 1
myData.constant = 1;
set_pd( year, id, "dataset" = myData);
lm_lc(y ~ constant x1 x2);
Example 2
myData.constant = 1;
set_pd( year, id, "dataset" = myData);
myModel = lm_lc(y ~ constant x1 x2,
"m0"=[2;0.6;0.3], "P" = 10*eye(3,3),
"burnin"=10000, "draws"=30000, "thin"=2, "chains"=2, "classes"=2,
"logML_CJ" = true, "dataset"=myData);
diagnostics("model"=myModel);
plot([myModel.c0$x1, myModel.c1$x1],
"title"="\beta_2 for the two classes");
plot([myModel.pi_0, myModel.pi_1],
"title"="Prior class-membership probabilities");