Generalized Estimating Equations in longitudinal data analysis in the presence of missing data

Authors

  • Edralyn Marie Rufino, Philippine Statistics Authority, 8800 Municipality of Nabunturan, Davao de Oro, Philippines
  • Bernadette Tubo, Department of Mathematics and Statistics, MSU-Iligan Institute of Technology, 9200 Iligan City, Philippines

DOI:

https://doi.org/10.62071/tmjm.v7i1.721

Keywords:

Selection Criteria, Correlation Structures, Missing Data, Generalized Estimating Equations, Generalized Linear Model

Abstract

Generalized Estimating Equations (GEE) are a statistical approach for estimating the parameters of Generalized Linear Models (GLMs) when observations may be correlated, particularly repeated measurements taken at different time points. GEE adjusts for within-cluster correlation when fitting regression models, which allows more accurate and efficient parameter estimation, and correctly specifying the correlation structure further improves the efficiency of the estimates. However, missing data, which are common in many studies, can substantially affect the reliability of inferences drawn from GEE-based models. This paper examines recently developed selection criteria for identifying the underlying correlation structure, focusing on longitudinal studies with varying degrees of missingness ($\Delta m \in \{5\%, 10\%, 15\%\}$). The criteria under investigation are: (a) Rotnitzky and Jewell Criterion (RJ), (b) Gaussian Pseudolikelihood Criterion (GP), (c) Quasi-likelihood under Independence Model Criterion (QIC), (d) Correlation Information Criterion (CIC), (e) Pardo and Alonso Criterion (PAC), and (f) Gaussian Bayesian Information Criterion (GBIC). The study examines their performance across varying cluster sizes and highlights the importance of accounting for different degrees of correlation in both complete and incomplete datasets. Across all scenarios with positive results, the findings show that GBIC performs robustly and consistently, even in the presence of missing observations.
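
To make the notion of a correlation structure concrete, the following is a minimal sketch of how candidate working correlation matrices for one cluster might be built. The abstract does not say which structures the paper compares, so the three used here (independence, exchangeable, AR(1)) are assumed as common textbook choices, and the function name `working_correlation` and the parameter `alpha` are hypothetical. Restricting `alpha` to the range [0, 1) is a simplification for illustration.

```python
from typing import List

Matrix = List[List[float]]


def working_correlation(structure: str, size: int, alpha: float = 0.0) -> Matrix:
    """Build a size x size working correlation matrix for one cluster.

    Illustrative only: the three structures and the names used here are
    assumptions, not taken from the paper.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    if not 0.0 <= alpha < 1.0:
        # Simplifying assumption: non-negative correlation only.
        raise ValueError("alpha must lie in [0, 1)")

    if structure == "independence":
        # No within-cluster correlation: identity matrix.
        def off_diagonal(i: int, j: int) -> float:
            return 0.0
    elif structure == "exchangeable":
        # Every pair of observations shares the same correlation alpha.
        def off_diagonal(i: int, j: int) -> float:
            return alpha
    elif structure == "ar1":
        # Correlation decays with the time lag: alpha ** |i - j|.
        def off_diagonal(i: int, j: int) -> float:
            return alpha ** abs(i - j)
    else:
        raise ValueError(f"unknown structure: {structure!r}")

    return [
        [1.0 if i == j else off_diagonal(i, j) for j in range(size)]
        for i in range(size)
    ]


if __name__ == "__main__":
    # Print each candidate structure for a cluster of three time points.
    for name in ("independence", "exchangeable", "ar1"):
        print(name)
        for row in working_correlation(name, size=3, alpha=0.5):
            print("  ", row)
```

A selection criterion such as QIC or CIC would then be evaluated for the model fitted under each candidate structure, and the candidate with the best value retained. That fitting step is omitted here.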

Published

2025-05-31

How to Cite

Rufino, E. M., & Tubo, B. (2025). Generalized Estimating Equations in longitudinal data analysis in the presence of missing data. The Mindanawan Journal of Mathematics, 7(1), 17–36. https://doi.org/10.62071/tmjm.v7i1.721

Issue

Vol. 7 No. 1 (2025)

Section

Articles