Three essays on large panel data econometrics

Choi, Sung Hoon

doi:10.7282/t3-2k11-pq98

RUcore: Rutgers University Community Repository

Choi, Sung Hoon. Three essays on large panel data econometrics. Retrieved from https://doi.org/doi:10.7282/t3-2k11-pq98

Description

Title: Three essays on large panel data econometrics

Name: Choi, Sung Hoon (author); Liao, Yuan (chair); Swanson, Norman (internal member); Landon-Lane, John (internal member); Cheng, Xu (outside member); Rutgers University; School of Graduate Studies

Date Created: 2021

Other Date: 2021-05 (degree)

Subject: Economics, Econometric theory, Econometrics, Heteroscedasticity, Big data

Extent: 1 online resource (xii, 153) : illustrations

Description: My dissertation consists of three chapters that develop new tools for big data, machine learning, and forecasting. In particular, I employ regularization methods from the machine learning literature to improve the estimation of standard errors, the efficient estimation of coefficients, and big data forecasts. Chapter 1 considers large factor models and proposes a new principal component analysis method that increases estimation accuracy and efficiency for big data forecasts. Chapters 2 and 3 consider linear panel data models: Chapter 2 develops standard errors for the ordinary least squares (OLS) estimator, and Chapter 3 develops a feasible generalized least squares (GLS) estimator. Both allow for general dependence in the idiosyncratic error terms.

All three chapters highlight the importance of cross-sectional heteroskedasticity and correlations of the error terms when the cluster structure is unknown. For example, idiosyncratic error variances can vary remarkably among different companies and the errors can be correlated among companies. In addition, the knowledge of cluster structure may be unavailable in practice. Taking these aspects into account is essential for obtaining more precise results for predictions or causal inference.

In Chapter 1, I propose a feasible weighted projected principal component analysis (FPPC) for factor models in which observable characteristics partially explain the latent factors. This novel method provides more efficient and accurate estimators than existing methods. To increase efficiency, I take into account both cross-sectional dependence and heteroskedasticity by using a consistent estimator of the inverse error covariance matrix as the weight matrix. To improve accuracy, I employ a projection approach using the additionally observed characteristics because the projection removes noise components in high-dimensional factor analysis. By using the FPPC method, estimators of the factors and loadings have faster rates of convergence than those of the conventional factor analysis. Moreover, I propose an FPPC-based diffusion index forecasting model. The limiting distribution of the parameter estimates and the rate of convergence for forecast errors are obtained. Using U.S. bond market and macroeconomic data, I demonstrate that the proposed model outperforms models based on conventional principal component estimators. I also show that the proposed model performs well among a large group of machine learning techniques in forecasting excess bond returns.
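The projection step described above can be illustrated with a minimal sketch. It is not the FPPC estimator itself: it omits the weighting by the inverse error covariance matrix and uses a plain linear projection onto the characteristics. All names (`projected_pca`, `Y`, `X`, `K`) are hypothetical and the data are simulated. Assuming an N x T data matrix and an N x d characteristics matrix, the sketch projects the data onto the column space of the characteristics and takes the leading eigenvectors of the projected T x T second-moment matrix as factor estimates, normalized so that F'F/T is the identity.

```python
import numpy as np

def projected_pca(Y, X, K):
    """Illustrative, unweighted projected PCA (a simplification, not FPPC).

    Y : (N, T) panel of observations
    X : (N, d) observed characteristics (hypothetical input)
    K : number of factors to extract
    Returns factors F (T, K) with F.T @ F / T = I and loadings G (N, K).
    """
    N, T = Y.shape
    # Projection matrix onto the column space of X: P = X (X'X)^{-1} X'
    P = X @ np.linalg.solve(X.T @ X, X.T)
    # Symmetric T x T matrix built from the projected data
    M = Y.T @ P @ Y
    # eigh returns eigenvalues in ascending order; keep the K largest
    _, vecs = np.linalg.eigh(M)
    F = np.sqrt(T) * vecs[:, -K:][:, ::-1]
    # Part of the loadings explained by the characteristics
    G = P @ Y @ F / T
    return F, G

# Simulated example (all values are made up for illustration)
rng = np.random.default_rng(0)
N, T, d, K = 50, 40, 3, 2
X = rng.normal(size=(N, d))
F_true = rng.normal(size=(T, K))
Y = X @ rng.normal(size=(d, K)) @ F_true.T + rng.normal(size=(N, T))
F_hat, G_hat = projected_pca(Y, X, K)
print(F_hat.shape, G_hat.shape)
```

A weighted variant would replace the plain second-moment matrix with one that incorporates an estimate of the inverse error covariance matrix; the exact form used in the dissertation is not given in this abstract.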

Chapter 2, joint work with Jushan Bai and Yuan Liao, develops a new standard-error estimator for linear panel data models. The proposed estimator is robust to heteroskedasticity, serial correlation, and cross-sectional correlation of unknown forms. Serial correlation is handled by the Newey-West method. To handle cross-sectional correlations, we propose a thresholding method that does not require the clusters to be known. We establish the consistency of the proposed estimator, and Monte Carlo simulations show that the method works well. We illustrate the method in an application to the effects of U.S. divorce law reforms and find that cross-sectional correlations are non-negligible.
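The two ingredients named in this paragraph can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the estimator developed in the chapter: `bartlett_weights` computes the standard Newey-West (Bartlett kernel) lag weights, and `hard_threshold` applies a single universal cutoff `tau` to the off-diagonal entries of a covariance matrix. The function names, the cutoff rule, and the example matrix are hypothetical; the chapter's actual thresholding rule may differ.

```python
import numpy as np

def bartlett_weights(max_lag):
    """Newey-West (Bartlett) weights w_h = 1 - h / (max_lag + 1), h = 0..max_lag."""
    lags = np.arange(max_lag + 1)
    return 1.0 - lags / (max_lag + 1.0)

def hard_threshold(cov, tau):
    """Set off-diagonal entries with |value| < tau to zero; keep the diagonal."""
    cov = np.asarray(cov, dtype=float)
    out = np.where(np.abs(cov) >= tau, cov, 0.0)
    np.fill_diagonal(out, np.diag(cov))
    return out

# Made-up 3 x 3 cross-sectional covariance matrix for illustration
S = np.array([[1.0, 0.05, 0.6],
              [0.05, 2.0, -0.02],
              [0.6, -0.02, 1.5]])
weights = bartlett_weights(3)
S_thr = hard_threshold(S, tau=0.1)
print(weights)
print(S_thr)
```

Thresholding keeps only the cross-sectional covariances that appear large, so no cluster membership has to be specified in advance.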

In Chapter 3, co-authored with Jushan Bai and Yuan Liao, we consider GLS estimation for linear panel data models. By estimating the large error covariance matrix consistently, the proposed feasible GLS estimator is more efficient than the OLS estimator in the presence of heteroskedasticity and both serial and cross-sectional correlations. The covariance matrix used for the feasible GLS estimator is estimated by banding and thresholding. We establish the limiting distribution of the proposed estimator and conduct a Monte Carlo study. Applying the method to U.S. divorce rate data, we find that our more efficient estimators identify significant effects of divorce law reforms on the divorce rate and provide tighter confidence intervals than existing methods.
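As a rough illustration of the GLS idea, the sketch below computes the textbook estimator (X' Ω⁻¹ X)⁻¹ X' Ω⁻¹ y for a given error covariance matrix Ω, together with a simple banding operator that zeroes entries more than `k` positions from the diagonal. It is a sketch under stated assumptions rather than the chapter's procedure: the names `band` and `gls` are hypothetical, the data are simulated, and Ω is taken as known here, whereas the feasible estimator replaces it with a regularized estimate.

```python
import numpy as np

def band(mat, k):
    """Keep entries with |i - j| <= k and set all others to zero."""
    mat = np.asarray(mat, dtype=float)
    i, j = np.indices(mat.shape)
    return np.where(np.abs(i - j) <= k, mat, 0.0)

def gls(X, y, omega):
    """Textbook GLS: solve (X' Omega^{-1} X) beta = X' Omega^{-1} y."""
    Oi_X = np.linalg.solve(omega, X)  # Omega^{-1} X
    Oi_y = np.linalg.solve(omega, y)  # Omega^{-1} y
    return np.linalg.solve(X.T @ Oi_X, X.T @ Oi_y)

# Simulated heteroskedastic regression (values made up for illustration)
rng = np.random.default_rng(1)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
sigma2 = np.linspace(0.5, 3.0, n)  # error variances differ by observation
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * np.sqrt(sigma2)
beta_gls = gls(X, y, np.diag(sigma2))
beta_ols = gls(X, y, np.eye(n))  # identity weighting reduces to OLS
print(beta_gls, beta_ols)
```

A banded or thresholded covariance estimate is not guaranteed to be positive definite in finite samples, so a practical implementation would need to check or enforce this before inverting it.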

Note: Ph.D.

Note: Includes bibliographical references

Genre: theses, ETD doctoral

Persistent URL: https://doi.org/doi:10.7282/t3-2k11-pq98

Language: English

Collection: School of Graduate Studies Electronic Theses and Dissertations

Organization Name: Rutgers, The State University of New Jersey

Rights: The author owns the copyright to this work.
