BayES

4.5 Latent-class linear model

Mathematical representation

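$$ y_i \mid c = \mathbf{x}_i'\boldsymbol{\beta}_c + \varepsilon_{i|c}, \qquad \varepsilon_{i|c} \sim \mathrm{N}\!\left(0, \tfrac{1}{\tau_c}\right), \qquad c = 0, 1, \dots, C-1 \tag{4.6} $$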
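The data-generating process in (4.6) can be illustrated outside BayES with a short Python sketch. It is only an illustration under stated assumptions: the sample size, the coefficient values, the precisions and the class-membership probabilities are all hypothetical, and NumPy is used in place of BayES.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical dimensions: observations, regressors (incl. constant), classes
N, K, C = 500, 3, 2

beta = np.array([[2.0, 0.6, 0.3],     # beta_0 (hypothetical values)
                 [-1.0, 0.2, 0.9]])   # beta_1 (hypothetical values)
tau = np.array([4.0, 1.0])            # error precisions tau_c (hypothetical)
pi = np.array([0.6, 0.4])             # unconditional class-membership probabilities

# Regressors: a constant plus K-1 standard-normal variables
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])

# Latent class of each observation, drawn with probabilities pi
c = rng.choice(C, size=N, p=pi)

# The error in class c has variance 1/tau_c, so its sd is tau_c^(-1/2)
sigma = 1.0 / np.sqrt(tau)
eps = rng.normal(size=N) * sigma[c]

# y_i = x_i' beta_c + eps_i, using the coefficients of the class of observation i
y = np.einsum("ij,ij->i", X, beta[c]) + eps
```

In the estimated model the class labels `c` are not observed; they are simulated here only to make the structure of (4.6) explicit.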

Priors




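| Parameter | Probability density function | Default hyperparameters |
|---|---|---|
| Common to both model types | | |
| $\beta_c$ | $p(\beta_c) = \frac{\lvert P_c\rvert^{1/2}}{(2\pi)^{K/2}} \exp\left\{-\frac{1}{2}(\beta_c - m_c)' P_c (\beta_c - m_c)\right\}$ | $m_c = 0_K$, $P_c = 0.001\cdot I_K$ |
| $\tau_c$ | $p(\tau_c) = \frac{b_{\tau c}^{a_{\tau c}}}{\Gamma(a_{\tau c})}\,\tau_c^{a_{\tau c}-1} e^{-\tau_c b_{\tau c}}$ | $a_{\tau c} = 0.001$, $b_{\tau c} = 0.001$ |
| Model with unconditional class-membership probabilities | | |
| $\pi$ | $p(\pi) = \frac{1}{B(a)} \prod_{c=0}^{C-1} \pi_c^{a_c - 1}$ | $a_0 = a_1 = \dots = a_{C-1} = 1$ |
| Model with conditional class-membership probabilities | | |
| $\delta$ | $p(\delta) = \frac{\lvert P_\delta\rvert^{1/2}}{(2\pi)^{L(C-1)/2}} \exp\left\{-\frac{1}{2}(\delta - m_\delta)' P_\delta (\delta - m_\delta)\right\}$ | $m_\delta = 0_{L(C-1)}$, $P_\delta = 0.001\cdot I_{L(C-1)}$ |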
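To get a feel for how diffuse the default priors are, the following Python sketch computes a few of their moments from standard formulas (Gamma with shape/rate parameterization, Normal with precision parameterization, Dirichlet). It is written outside BayES and only for illustration; the variable names are hypothetical and C = 2 is assumed.

```python
import math

# Default Gamma prior on each tau_c (shape a, rate b)
a_tau, b_tau = 0.001, 0.001
prior_mean_tau = a_tau / b_tau        # mean of Gamma(shape, rate) is a/b
prior_var_tau = a_tau / b_tau ** 2    # variance of Gamma(shape, rate) is a/b^2

# Default Normal prior on each element of beta_c: precision 0.001
precision_beta = 0.001
prior_sd_beta = 1.0 / math.sqrt(precision_beta)   # sd = precision^(-1/2)

# Default Dirichlet prior on pi with all concentration parameters equal to 1
C = 2                                  # assumed number of classes
a = [1.0] * C
prior_mean_pi = [a_c / sum(a) for a_c in a]       # Dirichlet mean is a_c / sum(a)

print(prior_mean_tau, prior_var_tau, prior_sd_beta, prior_mean_pi)
```

With these defaults the prior on each τc has a very large variance relative to its mean, and a Dirichlet prior with all concentration parameters equal to one is uniform over the probability simplex.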



Syntax

[<model name> = ] lm_lc( y ~ x1 x2 … xK [ | z1 z2 … zL] [, <options> ] );

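where:

y is the dependent variable

x1 x2 … xK are the K independent variables

z1 z2 … zL are the L determinants of class membership; when the z list is provided the model with conditional class-membership probabilities is estimated, otherwise the model with unconditional class-membership probabilities is estimated

Note: If the dataset used for estimation has been previously declared as a panel dataset (typically, by a call to the set_pd() function), then the model estimated is the one documented in the following section. Each group in that model is restricted to belong to the same class for the entire period over which it is observed.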

The optional arguments for the latent-class linear model are:[5]

Gibbs parameters


"chains"

number of chains to run in parallel (positive integer); the default value is 1

"burnin"

number of burn-in draws per chain (positive integer); the default value is 10000

"draws"

number of retained draws per chain (positive integer); the default value is 20000

"thin"

value of the thinning parameter (positive integer); the default value is 1

"seed"

value of the seed for the random-number generator (positive integer); the default value is 42

Model specification


"classes"

specification of the number of classes to be used in the model (positive integer); the default value is 2

Hyperparameters


Common to both model types

"m"

mean vector of the prior for each βc (K×1 vector); the default value is 0_K

"P"

precision matrix of the prior for each βc (K×K symmetric and positive-definite matrix); the default value is 0.001·I_K

"mj"

mean vector of the prior for βj, j = 0, 1, …, C−1 (K×1 vector); this mean overwrites the generic mean ("m") for class j only

"Pj"

precision matrix of the prior for βj, j = 0, 1, …, C−1 (K×K symmetric and positive-definite matrix); this precision matrix overwrites the generic precision matrix ("P") for class j only

"a_tau"

shape parameter of the prior for each τc (positive number); the default value is 0.001

"b_tau"

rate parameter of the prior for each τc (positive number); the default value is 0.001

"a_tauj"

shape parameter of the prior for τj, j = 0, 1, …, C−1 (positive number); this shape parameter overwrites the generic shape parameter ("a_tau") for class j only

"b_tauj"

rate parameter of the prior for τj, j = 0, 1, …, C−1 (positive number); this rate parameter overwrites the generic rate parameter ("b_tau") for class j only

Model with unconditional class-membership probabilities

"a"

vector of concentration parameters for the Dirichlet prior on π (C ×1 vector with positive entries); the default value is a C ×1 vector of ones

Model with conditional class-membership probabilities

"m_delta"

mean vector of the prior for δ (L·(C−1)×1 vector); the default value is 0_{L(C−1)}

"P_delta"

precision matrix of the prior for δ (L·(C−1)×L·(C−1) symmetric and positive-definite matrix); the default value is 0.001·I_{L(C−1)}

Dataset and log-marginal likelihood


"dataset"

the id value of the dataset that will be used for estimation; the default value is the first dataset in memory (in alphabetical order)

"logML_CJ"

boolean indicating whether the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood should be calculated (true|false); the default value is false

Reported Parameters




| Symbol | Reported name | Description |
|---|---|---|
| Common to both model types | | |
| $\beta_c$ | variable_name | vector of parameters associated with the independent variables for class c |
| $\tau_c$ | tau | precision parameter of the error term for class c, $\varepsilon_{i|c}$ |
| $\sigma_{\varepsilon,c}$ | sigma_e | standard deviation of the error term for class c: $\sigma_{\varepsilon,c} = 1/\tau_c^{1/2}$ |
| Model with unconditional class-membership probabilities | | |
| $\pi_c$ | pi | prior class-membership probability for class c |
| Model with conditional class-membership probabilities | | |
| $\delta_c$ | variable_name | vector of parameters associated with the determinants of class membership for class c; for identification purposes, these parameters for class 0 are normalized to zero |
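The role of δ and of the class-0 normalization can be sketched in Python. The sketch assumes the usual multinomial-Logit form for the prior class-membership probabilities, with the probability of class c proportional to exp(z'δc) and δ0 = 0; the function name and the numerical values are hypothetical and the code is not part of BayES.

```python
import numpy as np

def class_probabilities(Z, delta_free):
    """Prior class-membership probabilities from a multinomial-Logit form.

    Z          : N x L matrix of determinants of class membership
    delta_free : L x (C-1) matrix holding delta_1, ..., delta_{C-1}
    returns    : N x C matrix whose rows sum to one
    """
    L = Z.shape[1]
    # Class 0 is the reference class: its parameters are fixed at zero
    delta = np.column_stack([np.zeros(L), delta_free])
    scores = Z @ delta
    # Subtract the row maximum before exponentiating, for numerical stability
    scores = scores - scores.max(axis=1, keepdims=True)
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

# Hypothetical example: a constant and one determinant, two classes
Z = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]])
delta_free = np.array([[0.3], [-0.8]])
P = class_probabilities(Z, delta_free)
print(P)
```

Because only differences between the δc vectors affect the probabilities, fixing δ0 at zero is what makes the remaining C−1 vectors identifiable.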




Stored values and post-estimation analysis
If a left-hand-side id value is provided when a latent-class linear model is created, then the following results are saved in the model item and are accessible via the ‘.’ operator:

Samples

a matrix containing the draws from the posterior of βc and τc, for c = 0, 1, …, C−1, and, depending on the estimated model, π or δ

cj$x1, …, cj$xK

vectors containing the draws from the posterior of the parameters associated with variables x1, …, xK, for j = 0, 1, …, C−1 (the names of these vectors are the names of the variables that were included in the right-hand side of the model, prepended by ‘c’, the class index and a dollar sign; in this way ‘cj’ can be used to distinguish among parameters across different classes)

cj$tau

vectors containing the draws from the posterior of each τj, for j = 0, 1, …, C−1 (‘tau’ is prepended by ‘c’, the class index and a dollar sign; in this way ‘cj’ can be used to distinguish among precision parameters in different classes)

pi_j

vectors containing the draws from the posterior of each πj, for j = 0, 1, …, C−1 (these vectors are available only after the estimation of the model with unconditional class-membership probabilities)

pi_j$z1, …, pi_j$zL

vectors containing the draws from the posterior of the parameters associated with variables z1, …, zL, for j = 1, …, C−1 [6] (the names of these vectors are the names of the variables that were included in the z list, in the right-hand side of the model, prepended by ‘pi_j’ and a dollar sign; in this way ‘pi_j’ distinguishes among parameters associated with variables that play different roles in the model, for example the same variable appearing in both the x and z lists, as well as among parameters associated with a variable in the z list but corresponding to different classes; these vectors are available only after the estimation of the model with conditional class-membership probabilities)

logML

the Lewis & Raftery (1997) approximation of the log-marginal likelihood

logML_CJ

the Chib (1995)/Chib & Jeliazkov (2001) approximation to the log-marginal likelihood; this is available only if the model was estimated with the "logML_CJ"=true option

pi_i

N ×C matrix that stores the expected values of the posterior class-membership probabilities for each observation and for each of the C classes

nchains

the number of chains that were used to estimate the model

nburnin

the number of burn-in draws per chain that were used when estimating the model

ndraws

the total number of retained draws from the posterior (= chains × draws)

nthin

value of the thinning parameter that was used when estimating the model

nseed

value of the seed for the random-number generator that was used when estimating the model

nclasses

number of classes used during the estimation of the model

Additionally, the following functions are available for post-estimation analysis (see section B.14):

The latent-class linear model uses the store() function to associate the estimates of the posterior class-membership probabilities (pi_i) with specific observations and store their values in the dataset used for estimation. The generic syntax for a statement involving the store() function after estimation of a latent-class linear model is:

store( pi_i, <new variable name prefix> [, "model"=<model name>] );

This statement will generate C additional variables in the dataset used for estimation of the model, with names constructed by appending the class index (0, 1, …, C−1) to the prefix provided as the second argument to store().
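What pi_i contains, and how the stored variables are named, can be illustrated with a Python sketch. It applies Bayes' rule for a single set of parameter values, whereas pi_i holds the expected values of these probabilities over the posterior; the function name, the prefix "pcm_" and all numerical values are hypothetical, and the code is not part of BayES.

```python
import numpy as np

def posterior_membership(y, X, beta, tau, pi):
    """Posterior class-membership probabilities for given parameter values.

    y: (N,) outcomes, X: (N, K) regressors, beta: (C, K) coefficients,
    tau: (C,) error precisions, pi: (C,) prior class probabilities.
    Returns an N x C matrix whose rows sum to one.
    """
    resid = y[:, None] - X @ beta.T                  # N x C residuals per class
    # log of pi_c * Normal(y_i | x_i' beta_c, 1/tau_c), dropping common constants
    log_w = np.log(pi) + 0.5 * np.log(tau) - 0.5 * tau * resid ** 2
    log_w = log_w - log_w.max(axis=1, keepdims=True) # numerical stability
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)

# Hypothetical example: constant-only model with two well-separated classes
y = np.array([2.0, -1.0])
X = np.ones((2, 1))
beta = np.array([[2.0], [-1.0]])
tau = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])
P = posterior_membership(y, X, beta, tau, pi)

# Variable names as described for store(): prefix followed by the class index
prefix = "pcm_"                                      # hypothetical prefix
names = [f"{prefix}{c}" for c in range(P.shape[1])]
print(names, P)
```

Averaging such matrices over the retained draws of the parameters would give an N × C matrix of the kind described for pi_i.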

The latent-class linear model with conditional class-membership probabilities uses the mfx() function to calculate and report the marginal effects of the variables in the z list on the prior class-membership probabilities that come from the multinomial-Logit part of the model. The generic syntax for a statement involving the mfx() function after estimation of a latent-class linear model with conditional class-membership probabilities is:

mfx( ["type"=1] [, "point"=<point of calculation>] [, "model"=<model name>] );

See the general documentation of the mfx() function (section B.14) for details on the optional arguments.
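For intuition about what such marginal effects measure, the textbook multinomial-Logit expression for a continuous variable, ∂πc/∂z = πc (δc − Σj πj δj), can be sketched in Python. This is an assumption about the general form only: it is not a description of how mfx() is implemented or of what its "type" and "point" options do, and the function names and numbers are hypothetical.

```python
import numpy as np

def probs(z, delta):
    """Class probabilities at point z; delta is L x C with a zero first column."""
    s = delta.T @ z
    s = s - s.max()                      # numerical stability
    w = np.exp(s)
    return w / w.sum()

def marginal_effects(z, delta):
    """L x C matrix whose (l, c) entry is d pi_c / d z_l evaluated at z."""
    p = probs(z, delta)
    delta_bar = delta @ p                # probability-weighted average of the delta_c
    return p[None, :] * (delta - delta_bar[:, None])

# Hypothetical point of calculation and parameters (class 0 normalized to zero)
z = np.array([1.0, 0.5])
delta = np.array([[0.0, 0.3],
                  [0.0, -0.8]])
mfx_at_z = marginal_effects(z, delta)
print(mfx_at_z)
```

Each row of the result sums to zero across classes, because an increase in the probability of one class must be offset by decreases in the others. The row that corresponds to a constant term has no substantive interpretation.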

Examples

Example 1

myData = import("$BayESHOME/Datasets/dataset2.csv"); 
myData.constant = 1; 
 
lm_lc(y ~ constant x1 x2);

Example 2

myData = import("$BayESHOME/Datasets/dataset2.csv"); 
myData.constant = 1; 
 
myModel = lm_lc(y ~ constant x1 x2, 
    "m0"=[2;0.6;0.3], "P" = 10*eye(3,3), 
    "burnin"=10000, "draws"=30000, "thin"=2, "chains"=2, "classes"=2, 
    "logML_CJ" = true, "dataset"=myData); 
 
diagnostics("model"=myModel); 
 
plot([myModel.c0$x1, myModel.c1$x1], 
    "title"="beta2 for the two classes"); 
plot([myModel.pi_0, myModel.pi_1], 
    "title"="Prior class-membership probabilities");

Example 3

myData = import("$BayESHOME/Datasets/dataset2.csv"); 
myData.constant = 1; 
 
myModel = lm_lc(y ~ constant x1 x2 | constant x3, 
    "m0"=[2;0.6;0.3], "P" = 10*eye(3,3), 
    "burnin"=10000, "draws"=30000, "thin"=2, "chains"=2, "classes"=2, 
    "logML_CJ" = true, "dataset"=myData); 
 
diagnostics("model"=myModel); 
mfx("model"=myModel); 
 
plot([myModel.c0$x1, myModel.c1$x1], 
    "title"="beta2 for the two classes"); 
plot(myModel.pi_1$x3, 
    "title"="delta2 for class 1");

[5] Optional arguments are always given in option-value pairs (e.g., "chains"=3).

[6] Indexing starts at 1 because the parameters of the multinomial-Logit part of the model associated with class 0 are normalized to zero for identification purposes.

© 2016–20 Grigorios Emvalomatis