International Journal on Advanced Science, Engineering and Information Technology, Vol. 11 (2021) No. 5, pages: 1832-1840, DOI:10.18517/ijaseit.11.5.14910

A Latent Class Model for Multivariate Binary Data Subject to Missingness

Samah Zakaria, Mai Sherif Hafez, Ahmed M. Gad


When researchers want to measure a social phenomenon that cannot be captured by a single variable, the appropriate statistical tool is a latent variable model, in which a number of manifest variables are used to define the latent phenomenon. The manifest variables may be incomplete due to different forms of non-response, which may or may not be random. In such cases, especially when the missingness is nonignorable, a missingness mechanism must be included in the model to obtain valid parameter estimates. In social surveys, categorical items are arguably the most common type of variable. We thus propose a latent class model in which two categorical latent variables are defined: one represents the latent phenomenon of interest, and the other represents a respondent’s propensity to respond to survey items. All manifest items are treated as categorical. The proposed model incorporates a missingness mechanism that accounts for forms of missingness that may not be random by allowing the latent response propensity class to depend on the latent phenomenon under consideration, given a set of covariates. The Expectation-Maximization (EM) algorithm is used to estimate the proposed model. The model is applied to data from the 2014 Egyptian Demographic and Health Survey (EDHS14). Missing data are artificially created in order to study the results under the three types of missingness: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR).
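The distinction between the three missingness types can be illustrated with a minimal sketch of how missing values might be created artificially in complete binary data. This is not the authors' procedure. The function name `make_missing`, the base rate, the 1.5 and 0.5 multipliers, and the use of a single binary covariate and a single binary latent class are all assumptions made for illustration. Under MCAR every item has the same deletion probability. Under MAR the probability depends only on a fully observed covariate. Under MNAR it depends on the unobserved latent class.

```python
import random


def make_missing(items, covariate, latent_class, mechanism, rate=0.3, seed=0):
    """Return a copy of `items` with some entries replaced by None.

    items        : list of rows, each a list of 0/1 responses (complete data)
    covariate    : list of 0/1 values, fully observed (hypothetical)
    latent_class : list of 0/1 class labels, unobserved in practice (hypothetical)
    mechanism    : "MCAR", "MAR" or "MNAR"
    rate         : base deletion probability (illustrative choice)
    """
    rng = random.Random(seed)
    result = []
    for row, x, c in zip(items, covariate, latent_class):
        if mechanism == "MCAR":
            # Same deletion probability for every respondent.
            p = rate
        elif mechanism == "MAR":
            # Deletion probability depends only on the observed covariate.
            p = 1.5 * rate if x == 1 else 0.5 * rate
        elif mechanism == "MNAR":
            # Deletion probability depends on the unobserved latent class.
            p = 1.5 * rate if c == 1 else 0.5 * rate
        else:
            raise ValueError("mechanism must be 'MCAR', 'MAR' or 'MNAR'")
        result.append([None if rng.random() < p else v for v in row])
    return result


if __name__ == "__main__":
    # Toy data: four respondents, three binary items.
    items = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 1]]
    covariate = [1, 0, 1, 0]
    latent_class = [1, 1, 0, 0]
    for mechanism in ("MCAR", "MAR", "MNAR"):
        print(mechanism, make_missing(items, covariate, latent_class, mechanism))
```

In the MNAR case the deletion probability is driven by a quantity the analyst never observes, which is why such missingness cannot be ignored and has to be modelled explicitly.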


binary variables; latent class model; item non-response; non-random missingness; response propensity.
