2. Statistical Method in Interval-censored Failure Time Data with Missing Covariates
Missing data can arise under many circumstances, and in general their analysis depends heavily on the missing-data mechanism. In the motivating example from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) study, the study design yields only interval-censored data on the time of AD conversion. Our interest lies in identifying significant baseline prognostic factors for AD conversion among patients in the mild cognitive impairment group. However, the ADNI data contain a large number of subjects with both missing covariates and interval-censored failure times. If the covariates are not missing completely at random, an analysis based only on the fully observed part of the data will lead to biased and inefficient estimation. To deal with this problem, we propose a sieve maximum likelihood estimation approach that uses I-spline functions to approximate the unknown cumulative baseline hazard function in the survival model. To implement the proposed method, we develop an EM algorithm based on a two-stage data augmentation. Furthermore, we show that the proposed estimators of the regression parameters are consistent and asymptotically normal. When applied to simulated data and to the ADNI dataset, the proposed method yields more accurate and efficient estimates than the complete case (CC) method and the multiple imputation method.
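To make the sieve construction concrete, the following is a minimal sketch of how an I-spline approximation typically enters an interval-censored likelihood. It is written under assumptions that are not stated above: a proportional hazards form for the survival model, and illustrative notation ($k_n$ basis functions $I_l$, coefficients $\gamma_l$, covariate vector $Z_i$, and observed interval $(L_i, R_i]$). It shows only the fully observed covariate case and is not the exact formulation used in this work.

```latex
% Sieve approximation (assumed notation): each I-spline basis function I_l is
% nondecreasing, so nonnegative coefficients give a nondecreasing \Lambda_n.
\Lambda_0(t) \approx \Lambda_n(t) = \sum_{l=1}^{k_n} \gamma_l \, I_l(t),
\qquad \gamma_l \ge 0 .

% Assumed proportional hazards form of the survival function given covariates Z.
S(t \mid Z) = \exp\!\left\{ -\Lambda_0(t) \, e^{\beta^{\top} Z} \right\} .

% Likelihood contribution of subject i, whose failure time is known only to
% lie in (L_i, R_i], with the conventions S(0 \mid Z) = 1 and S(\infty \mid Z) = 0.
\mathcal{L}_i(\beta, \gamma) = S(L_i \mid Z_i) - S(R_i \mid Z_i)
= \exp\!\left\{ -\Lambda_n(L_i) \, e^{\beta^{\top} Z_i} \right\}
- \exp\!\left\{ -\Lambda_n(R_i) \, e^{\beta^{\top} Z_i} \right\} .
```

Under this sketch, the constraint $\gamma_l \ge 0$ is what keeps the estimated cumulative baseline hazard monotone, and the infinite-dimensional parameter $\Lambda_0$ is replaced by the finite vector $(\gamma_1, \dots, \gamma_{k_n})$. When components of $Z_i$ are missing, the contribution above would additionally be integrated or summed over the missing components under a covariate model.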