The divide-and-combine approaches for multivariate survival analysis and multistate survival analysis in big data

Wang, Wei

doi:10.7282/t3-s5sq-vq68

RUcore: Rutgers University Community Repository

Simple citation

Wang, Wei. The divide-and-combine approaches for multivariate survival analysis and multistate survival analysis in big data. Retrieved from https://doi.org/doi:10.7282/t3-s5sq-vq68


Description

Title: The divide-and-combine approaches for multivariate survival analysis and multistate survival analysis in big data

Name: Wang, Wei (author); Lu, Shou-En (chair); Lin, Yong (internal member); Wang, Yaqun (internal member); Kim, Sinae (internal member); Xie, Minge (outside member); Rutgers University; School of Graduate Studies

Date Created: 2020

Other Date: 2020-10 (degree)

Subject: Confidence distribution, Public Health

Extent: 1 online resource (xi, 144 pages)

Description: Multivariate failure time data can be unordered or ordered; unordered data are analyzed with multivariate survival analysis and ordered data with multistate survival analysis. When sample sizes are extraordinarily large, both analyses can face computational challenges. In this dissertation, we propose divide-and-combine approaches for analyzing large-scale multivariate failure time data in both settings. Our approaches are motivated by the Myocardial Infarction Data Acquisition System (MIDAS), a New Jersey statewide database that includes 73,725,160 admissions to non-federal hospitals and emergency rooms (ERs) from 1995 to 2017. We propose to randomly divide the full data into multiple subsets and to combine the estimators obtained from the individual subsets using a weighted method. Within each subset, we calculate estimated regression parameters for multivariate survival analysis and estimated cumulative hazards for multistate survival analysis. Under mild conditions, we show that the combined estimators are asymptotically equivalent to the estimators that would be obtained by analyzing the full data all at once. In addition, to screen out risk factors with weak signals in multivariate survival analysis, we propose performing regularized estimation on the combined estimators using their combined confidence distributions. We study the theoretical properties of the proposed approaches, including asymptotic equivalence between divide-and-combine analysis and full-data analysis, estimation consistency, selection consistency, and oracle properties. The performance of the proposed estimators is investigated in simulation studies, and the MIDAS data are used to illustrate the proposed methodologies.
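
The divide-and-combine idea in the description can be illustrated with a small sketch. It is not the dissertation's method: it swaps in a one-parameter exponential survival model with administrative censoring in place of the regression and cumulative-hazard estimators, and it assumes inverse-variance weights, since the abstract says only that a "weighted method" is used. All function names, the sample size, the rate, and the number of subsets are hypothetical choices made for illustration.

```python
import random


def simulate(n, rate, censor_time, rng):
    """Return (observed_time, event_indicator) pairs with censoring at censor_time."""
    data = []
    for _ in range(n):
        t = rng.expovariate(rate)
        data.append((min(t, censor_time), 1 if t <= censor_time else 0))
    return data


def fit_exponential(data):
    """Maximum likelihood estimate of the exponential rate and its estimated variance."""
    events = sum(d for _, d in data)
    exposure = sum(t for t, _ in data)
    rate_hat = events / exposure
    var_hat = rate_hat ** 2 / events  # inverse of the observed information
    return rate_hat, var_hat


def divide_and_combine(data, n_subsets, rng):
    """Randomly split the data, fit each subset, and combine the estimates.

    The inverse-variance weights are an assumption of this sketch.
    """
    shuffled = data[:]
    rng.shuffle(shuffled)
    subsets = [shuffled[k::n_subsets] for k in range(n_subsets)]
    fits = [fit_exponential(s) for s in subsets]
    weights = [1.0 / var for _, var in fits]
    combined = sum(w * est for w, (est, _) in zip(weights, fits)) / sum(weights)
    combined_var = 1.0 / sum(weights)
    return combined, combined_var


rng = random.Random(2020)
data = simulate(20000, rate=0.5, censor_time=3.0, rng=rng)
full_rate, full_var = fit_exponential(data)
dc_rate, dc_var = divide_and_combine(data, n_subsets=10, rng=rng)
print(f"full data: {full_rate:.4f}  divide-and-combine: {dc_rate:.4f}")
```

In this toy setting the combined estimate and its variance should land close to the full-data values, which mirrors (but does not prove) the asymptotic equivalence claimed in the description. Each subset fit needs only its own data, so the fits could run in parallel or on data too large to hold in memory at once.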

Note: Ph.D.

Note: Includes bibliographical references

Genre: theses, ETD doctoral

Persistent URL: https://doi.org/doi:10.7282/t3-s5sq-vq68

Language: English

Collection: School of Graduate Studies Electronic Theses and Dissertations

Organization Name: Rutgers, The State University of New Jersey

Rights: The author owns the copyright to this work.
