Large covariance estimation by thresholding principal orthogonal complements

Author

Listed:
  • Fan, Jianqing
  • Liao, Yuan
  • Mincheva, Martina

Abstract

This paper deals with the estimation of a high-dimensional covariance matrix with a conditional sparsity structure, which is the composition of a low-rank matrix plus a sparse matrix. By assuming a sparse error covariance matrix in a multi-factor model, we allow for cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights into when factor analysis is approximately the same as principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms, including the spectral norm. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies.
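
As a rough illustration of the low-rank-plus-sparse idea in the abstract, the Python sketch below keeps the leading principal components of the sample covariance matrix and thresholds what is left over (the principal orthogonal complement). It is a minimal sketch and not the authors' implementation: the function name `poet`, the parameters `K` and `tau`, the soft-thresholding rule, and the scaling of the threshold by residual standard deviations are all assumptions made for illustration. The threshold choice used in the paper is not reproduced here.

```python
import numpy as np


def poet(X, K, tau):
    """Illustrative low-rank-plus-thresholded-residual covariance estimate.

    X   : (T, p) array, one observation per row.
    K   : number of leading principal components kept (assumed known here).
    tau : hypothetical threshold level on the residual correlation scale.
    """
    X = np.asarray(X, dtype=float)
    S = np.cov(X, rowvar=False)  # p x p sample covariance matrix

    # Eigen-decomposition of the symmetric matrix S, sorted in descending order.
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Low-rank part spanned by the first K principal components.
    V = eigvecs[:, :K]
    low_rank = (V * eigvals[:K]) @ V.T

    # Principal orthogonal complement: what the first K components leave behind.
    R = S - low_rank

    # Soft-threshold the off-diagonal entries (an assumed rule); the cutoff for
    # entry (i, j) is tau * sqrt(R_ii * R_jj). Diagonal entries are kept as-is.
    d = np.sqrt(np.clip(np.diag(R), 0.0, None))
    cutoff = tau * np.outer(d, d)
    R_thr = np.sign(R) * np.maximum(np.abs(R) - cutoff, 0.0)
    np.fill_diagonal(R_thr, np.diag(R))

    return low_rank + R_thr
```

Under these assumptions, `tau = 0` returns the sample covariance matrix unchanged, and `K = 0` leaves a plain thresholding estimator of the sample covariance, which echoes the special cases listed in the abstract.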

Suggested Citation

  • Fan, Jianqing & Liao, Yuan & Mincheva, Martina, 2011. "Large covariance estimation by thresholding principal orthogonal complements," MPRA Paper 38697, University Library of Munich, Germany.
  • Handle: RePEc:pra:mprapa:38697

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/38697/1/MPRA_paper_38697.pdf
    File Function: original version
    Download Restriction: no

    References listed on IDEAS

    1. Peter Hall & J. S. Marron & Amnon Neeman, 2005. "Geometric representation of high dimension, low sample size data," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(3), pages 427-444, June.
    2. H. Wang, 2012. "Factor profiled sure independence screening," Biometrika, Biometrika Trust, vol. 99(1), pages 15-28.
    3. Forni, Mario & Hallin, Marc & Lippi, Marco & Reichlin, Lucrezia, 2004. "The generalized dynamic factor model consistency and rates," Journal of Econometrics, Elsevier, vol. 119(2), pages 231-255, April.
    4. Ledoit, Olivier & Wolf, Michael, 2004. "A well-conditioned estimator for large-dimensional covariance matrices," Journal of Multivariate Analysis, Elsevier, vol. 88(2), pages 365-411, February.
    5. Alexei Onatski, 2009. "Testing Hypotheses About the Number of Factors in Large Factor Models," Econometrica, Econometric Society, vol. 77(5), pages 1447-1479, September.
    6. Stephen A. Ross, 2013. "The Arbitrage Theory of Capital Asset Pricing," World Scientific Book Chapters, in: Leonard C MacLean & William T Ziemba (ed.), HANDBOOK OF THE FUNDAMENTALS OF FINANCIAL DECISION MAKING Part I, chapter 1, pages 11-30, World Scientific Publishing Co. Pte. Ltd.
    7. Doz, Catherine & Giannone, Domenico & Reichlin, Lucrezia, 2011. "A two-step estimator for large approximate dynamic factor models based on Kalman filtering," Journal of Econometrics, Elsevier, vol. 164(1), pages 188-205, September.
    8. Jianqing Fan & Jingjin Zhang & Ke Yu, 2012. "Vast Portfolio Selection With Gross-Exposure Constraints," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(498), pages 592-606, June.
    9. Lam, Clifford & Yao, Qiwei & Bathia, Neil, 2011. "Estimation of latent factors for high-dimensional time series," LSE Research Online Documents on Economics 31549, London School of Economics and Political Science, LSE Library.
    10. Kapetanios, George, 2010. "A Testing Procedure for Determining the Number of Factors in Approximate Factor Models With Large Datasets," Journal of Business & Economic Statistics, American Statistical Association, vol. 28(3), pages 397-409.
    11. Chamberlain, Gary & Rothschild, Michael, 1983. "Arbitrage, Factor Structure, and Mean-Variance Analysis on Large Asset Markets," Econometrica, Econometric Society, vol. 51(5), pages 1281-1304, September.
    12. Lam, Clifford & Fan, Jianqing, 2009. "Sparsistency and rates of convergence in large covariance matrix estimation," LSE Research Online Documents on Economics 31540, London School of Economics and Political Science, LSE Library.
    13. Ahn, Seung Chan & Hoon Lee, Young & Schmidt, Peter, 2001. "GMM estimation of linear panel data models with time-varying individual effects," Journal of Econometrics, Elsevier, vol. 101(2), pages 219-255, April.
    14. Ledoit, Olivier & Wolf, Michael, 2003. "Improved estimation of the covariance matrix of stock returns with an application to portfolio selection," Journal of Empirical Finance, Elsevier, vol. 10(5), pages 603-621, December.
    15. Mario Forni & Marc Hallin & Marco Lippi & Lucrezia Reichlin, 2000. "The Generalized Dynamic-Factor Model: Identification And Estimation," The Review of Economics and Statistics, MIT Press, vol. 82(4), pages 540-554, November.
    16. Carvalho, Carlos M. & Chang, Jeffrey & Lucas, Joseph E. & Nevins, Joseph R. & Wang, Quanli & West, Mike, 2008. "High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics," Journal of the American Statistical Association, American Statistical Association, vol. 103(484), pages 1438-1456.
    17. Alexander Chudik & M. Hashem Pesaran & Elisa Tosetti, 2011. "Weak and strong cross‐section dependence and estimation of large panels," Econometrics Journal, Royal Economic Society, vol. 14(1), pages 45-90, February.
    18. Johnstone, Iain M. & Lu, Arthur Yu, 2009. "On Consistency and Sparsity for Principal Components Analysis in High Dimensions," Journal of the American Statistical Association, American Statistical Association, vol. 104(486), pages 682-693.
    19. James H. Stock & Mark W. Watson, 2005. "Implications of Dynamic Factor Models for VAR Analysis," NBER Working Papers 11467, National Bureau of Economic Research, Inc.
    20. Shen, Haipeng & Huang, Jianhua Z., 2008. "Sparse principal component analysis via regularized low rank matrix approximation," Journal of Multivariate Analysis, Elsevier, vol. 99(6), pages 1015-1034, July.
    21. Shen, Dan & Shen, Haipeng & Marron, J.S., 2013. "Consistency of sparse PCA in High Dimension, Low Sample Size contexts," Journal of Multivariate Analysis, Elsevier, vol. 115(C), pages 317-333.
    22. Jianqing Fan & Jingjin Zhang & Ke Yu, 2008. "Asset Allocation and Risk Assessment with Gross Exposure Constraints for Vast Portfolios," Papers 0812.2604, arXiv.org.
    23. Jushan Bai & Serena Ng, 2002. "Determining the Number of Factors in Approximate Factor Models," Econometrica, Econometric Society, vol. 70(1), pages 191-221, January.
    24. Clifford Lam & Qiwei Yao & Neil Bathia, 2011. "Estimation of latent factors for high-dimensional time series," Biometrika, Biometrika Trust, vol. 98(4), pages 901-918.
    25. Hallin, Marc & Liska, Roman, 2007. "Determining the Number of Factors in the General Dynamic Factor Model," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 603-617, June.
    26. Jushan Bai & Shuzhong Shi, 2011. "Estimating High Dimensional Covariance Matrices and its Applications," Annals of Economics and Finance, Society for AEF, vol. 12(2), pages 199-215, November.
    27. Jushan Bai, 2003. "Inferential Theory for Factor Models of Large Dimensions," Econometrica, Econometric Society, vol. 71(1), pages 135-171, January.
    28. M. Hashem Pesaran, 2006. "Estimation and Inference in Large Heterogeneous Panels with a Multifactor Error Structure," Econometrica, Econometric Society, vol. 74(4), pages 967-1012, July.
    29. Stock, James H & Watson, Mark W, 2002. "Macroeconomic Forecasting Using Diffusion Indexes," Journal of Business & Economic Statistics, American Statistical Association, vol. 20(2), pages 147-162, April.
    30. Baik, Jinho & Silverstein, Jack W., 2006. "Eigenvalues of large sample covariance matrices of spiked population models," Journal of Multivariate Analysis, Elsevier, vol. 97(6), pages 1382-1408, July.
    31. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.
    32. Karim M. Abadir & Walter Distaso & Filip Žikeš, 2010. "Model-Free Estimation of Large Variance Matrices," Working Paper series 17_10, Rimini Centre for Economic Analysis.
    33. Pesaran, M. Hashem & Yamagata, Takashi, 2012. "Testing CAPM with a Large Number of Assets," IZA Discussion Papers 6469, Institute of Labor Economics (IZA).
    34. Fan, Jianqing & Fan, Yingying & Lv, Jinchi, 2008. "High dimensional covariance matrix estimation using a factor model," Journal of Econometrics, Elsevier, vol. 147(1), pages 186-197, November.
    35. Enrique Sentana, 2009. "The econometrics of mean-variance efficiency tests: a survey," Econometrics Journal, Royal Economic Society, vol. 12(3), pages 65-101, November.
    36. Bouveyron, C. & Girard, S. & Schmid, C., 2007. "High-dimensional data clustering," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 502-519, September.
    37. Jung, Sungkyu & Sen, Arusharka & Marron, J.S., 2012. "Boundary behavior in High Dimension, Low Sample Size asymptotics of PCA," Journal of Multivariate Analysis, Elsevier, vol. 109(C), pages 190-203.
    38. Forni, Mario & Lippi, Marco, 2001. "The Generalized Dynamic Factor Model: Representation Theory," Econometric Theory, Cambridge University Press, vol. 17(6), pages 1113-1141, December.
    39. Cai, Tony & Liu, Weidong, 2011. "Adaptive Thresholding for Sparse Covariance Matrix Estimation," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 672-684.
    40. Alexei Onatski, 2010. "Determining the Number of Factors from Empirical Distribution of Eigenvalues," The Review of Economics and Statistics, MIT Press, vol. 92(4), pages 1004-1016, November.
    41. repec:bla:jfinan:v:58:y:2003:i:4:p:1651-1684 is not listed on IDEAS
    42. Rothman, Adam J. & Levina, Elizaveta & Zhu, Ji, 2009. "Generalized Thresholding of Large Covariance Matrices," Journal of the American Statistical Association, American Statistical Association, vol. 104(485), pages 177-186.
    43. Alessi, Lucia & Barigozzi, Matteo & Capasso, Marco, 2010. "Improved penalization for determining the number of factors in approximate factor models," Statistics & Probability Letters, Elsevier, vol. 80(23-24), pages 1806-1813, December.
    44. Efron, Bradley, 2007. "Correlation and Large-Scale Simultaneous Significance Testing," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 93-103, March.
    45. Hallin, Marc & Liska, Roman, 2011. "Dynamic factors in the presence of blocks," Journal of Econometrics, Elsevier, vol. 163(1), pages 29-41, July.
    46. Fama, Eugene F & French, Kenneth R, 1992. "The Cross-Section of Expected Stock Returns," Journal of Finance, American Finance Association, vol. 47(2), pages 427-465, June.
    47. Kourtis, Apostolos & Dotsis, George & Markellos, Raphael N., 2012. "Parameter uncertainty in portfolio selection: Shrinking the inverse covariance matrix," Journal of Banking & Finance, Elsevier, vol. 36(9), pages 2522-2531.
    48. Kaufman, Cari G. & Schervish, Mark J. & Nychka, Douglas W., 2008. "Covariance Tapering for Likelihood-Based Estimation in Large Spatial Data Sets," Journal of the American Statistical Association, American Statistical Association, vol. 103(484), pages 1545-1555.
    49. Boivin, Jean & Ng, Serena, 2006. "Are more data always better for factor analysis?," Journal of Econometrics, Elsevier, vol. 132(1), pages 169-194, May.
    50. Xi Luo, 2011. "Recovering Model Structures from Large Low Rank and Sparse Covariance Matrix Estimation," Papers 1111.1133, arXiv.org, revised Mar 2013.
    51. Joong-Ho Won & Johan Lim & Seung-Jean Kim & Bala Rajaratnam, 2013. "Condition-number-regularized covariance estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(3), pages 427-450, June.
    52. Liu, Yufeng & Hayes, David Neil & Nobel, Andrew & Marron, J. S, 2008. "Statistical Significance of Clustering for High-Dimension, Low–Sample Size Data," Journal of the American Statistical Association, American Statistical Association, vol. 103(483), pages 1281-1293.
    53. repec:hal:journl:peer-00844811 is not listed on IDEAS
    54. Lingzhou Xue & Shiqian Ma & Hui Zou, 2012. "Positive-Definite ℓ1-Penalized Estimation of Large Covariance Matrices," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(500), pages 1480-1491, December.
    55. Fama, Eugene F. & French, Kenneth R., 1993. "Common risk factors in the returns on stocks and bonds," Journal of Financial Economics, Elsevier, vol. 33(1), pages 3-56, February.
    56. William F. Sharpe, 1964. "Capital Asset Prices: A Theory Of Market Equilibrium Under Conditions Of Risk," Journal of Finance, American Finance Association, vol. 19(3), pages 425-442, September.
    57. Tibor F. Liska, 2007. "The Liska model," Society and Economy, Akadémiai Kiadó, Hungary, vol. 29(3), pages 363-381, December.
    58. John Stephen Yap & Jianqing Fan & Rongling Wu, 2009. "Nonparametric Modeling of Longitudinal Covariance Structure in Functional Mapping of Quantitative Trait Loci," Biometrics, The International Biometric Society, vol. 65(4), pages 1068-1077, December.
    59. Bai, Jushan & Ng, Serena, 2008. "Large Dimensional Factor Analysis," Foundations and Trends(R) in Econometrics, now publishers, vol. 3(2), pages 89-163, June.
    60. Lam, Clifford & Yao, Qiwei, 2012. "Factor modeling for high-dimensional time series: inference for the number of factors," LSE Research Online Documents on Economics 45684, London School of Economics and Political Science, LSE Library.
    61. Stock J.H. & Watson M.W., 2002. "Forecasting Using Principal Components From a Large Number of Predictors," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 1167-1179, December.
    62. Efron, Bradley, 2010. "Correlated z-Values and the Accuracy of Large-Scale Statistical Estimates," Journal of the American Statistical Association, American Statistical Association, vol. 105(491), pages 1042-1055.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Aït-Sahalia, Yacine & Xiu, Dacheng, 2017. "Using principal component analysis to estimate a high dimensional factor model with high-frequency data," Journal of Econometrics, Elsevier, vol. 201(2), pages 384-399.
    2. Jianqing Fan & Yuan Liao & Han Liu, 2016. "An overview of the estimation of large covariance and precision matrices," Econometrics Journal, Royal Economic Society, vol. 19(1), pages 1-32, February.
    3. Bodnar, Taras & Reiß, Markus, 2016. "Exact and asymptotic tests on a factor model in low and large dimensions with applications," Journal of Multivariate Analysis, Elsevier, vol. 150(C), pages 125-151.
    4. Bai, Jushan & Liao, Yuan, 2012. "Efficient Estimation of Approximate Factor Models," MPRA Paper 41558, University Library of Munich, Germany.
    5. Bai, Jushan & Liao, Yuan, 2016. "Efficient estimation of approximate factor models via penalized maximum likelihood," Journal of Econometrics, Elsevier, vol. 191(1), pages 1-18.
    6. Poncela, Pilar & Ruiz, Esther & Miranda, Karen, 2021. "Factor extraction using Kalman filter and smoothing: This is not just another survey," International Journal of Forecasting, Elsevier, vol. 37(4), pages 1399-1425.
    7. Matteo Luciani, 2015. "Monetary Policy and the Housing Market: A Structural Factor Analysis," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 30(2), pages 199-218, March.
    8. Karim Barhoumi & Olivier Darné & Laurent Ferrara, 2014. "Dynamic factor models: A review of the literature," OECD Journal: Journal of Business Cycle Measurement and Analysis, OECD Publishing, Centre for International Research on Economic Tendency Surveys, vol. 2013(2), pages 73-107.
    9. Jörg Breitung & In Choi, 2013. "Factor models," Chapters, in: Nigar Hashimzade & Michael A. Thornton (ed.), Handbook of Research Methods and Applications in Empirical Macroeconomics, chapter 11, pages 249-265, Edward Elgar Publishing.
      • In Choi & Jorg Breitung, 2011. "Factor models," Working Papers 1121, Nam Duck-Woo Economic Research Institute, Sogang University (Former Research Institute for Market Economy), revised Dec 2011.
    10. Lütkepohl, Helmut, 2014. "Structural vector autoregressive analysis in a data rich environment: A survey," SFB 649 Discussion Papers 2014-004, Humboldt University Berlin, Collaborative Research Center 649: Economic Risk.
    11. Freyaldenhoven, Simon, 2022. "Factor models with local factors — Determining the number of relevant factors," Journal of Econometrics, Elsevier, vol. 229(1), pages 80-102.
    12. Pilar Poncela & Esther Ruiz, 2016. "Small- Versus Big-Data Factor Extraction in Dynamic Factor Models: An Empirical Assessment," Advances in Econometrics, in: Dynamic Factor Models, volume 35, pages 401-434, Emerald Group Publishing Limited.
    13. Dai, Chaoxing & Lu, Kun & Xiu, Dacheng, 2019. "Knowing factors or factor loadings, or neither? Evaluating estimators of large covariance matrices with noisy and asynchronous data," Journal of Econometrics, Elsevier, vol. 208(1), pages 43-79.
    14. Gagliardini, Patrick & Ossola, Elisa & Scaillet, Olivier, 2019. "A diagnostic criterion for approximate factor structure," Journal of Econometrics, Elsevier, vol. 212(2), pages 503-521.
    15. Matteo Barigozzi & Marc Hallin, 2023. "Dynamic Factor Models: a Genealogy," Papers 2310.17278, arXiv.org, revised Jan 2024.
    16. Barigozzi, Matteo & Trapani, Lorenzo, 2020. "Sequential testing for structural stability in approximate factor models," Stochastic Processes and their Applications, Elsevier, vol. 130(8), pages 5149-5187.
    17. Catherine Doz & Peter Fuleky, 2019. "Dynamic Factor Models," Working Papers 2019-4, University of Hawaii Economic Research Organization, University of Hawaii at Manoa.
    18. Stock, J.H. & Watson, M.W., 2016. "Dynamic Factor Models, Factor-Augmented Vector Autoregressions, and Structural Vector Autoregressions in Macroeconomics," Handbook of Macroeconomics, in: J. B. Taylor & Harald Uhlig (ed.), Handbook of Macroeconomics, edition 1, volume 2, chapter 0, pages 415-525, Elsevier.
    19. Choi, Sung Hoon & Kim, Donggyu, 2023. "Large volatility matrix analysis using global and national factor models," Journal of Econometrics, Elsevier, vol. 235(2), pages 1917-1933.
    20. Matteo Barigozzi & Antonio M. Conti & Matteo Luciani, 2014. "Do Euro Area Countries Respond Asymmetrically to the Common Monetary Policy?," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 76(5), pages 693-714, October.

    More about this item

    Keywords

    High dimensionality; approximate factor model; unknown factors; principal components; sparse matrix; low-rank matrix; thresholding; cross-sectional correlation;

    JEL classification:

    • C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
    • C01 - Mathematical and Quantitative Methods - - General - - - Econometrics

