Reconstruction of a matrix of genotypic correlations between variants within a gene for joint analysis of imputed and sequenced data

Abstract

Combining imputed and sequenced data in a single gene-based association analysis requires reconstructing the matrix of genotypic correlations between variants. For a given gene, the correlations between the genotypes of all imputed variants are known, as are those between the genotypes of all sequenced variants, but the correlations between pairs of variants in which one is imputed and the other is sequenced are not. To recover these correlations, we propose an efficient method based on maximising the determinant of the matrix. The method has a number of useful properties and admits an analytical solution for this task. We validated it by comparing reconstructed correlation matrices with real ones computed from individual genotypes in the UK Biobank. We also compared the results of gene-based association analysis performed with the SKAT, BT and PCA methods on reconstructed and real matrices, using both simulated summary statistics and summary statistics calculated for real phenotypes. These comparisons showed high reconstruction quality and robustness of the method to different gene structures.
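
To make the idea of determinant-maximising completion concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes, beyond what the abstract states, that some variants are present in both data sets (a "shared" block), so that the matrix splits into imputed-only, shared and sequenced-only groups. Under that assumption the standard closed-form maximum-determinant completion of the unknown block is R13 = R12 R22^{-1} R23. With no shared variants, the determinant is maximised by a zero cross-block. The function name `maxdet_completion` and its arguments are hypothetical.

```python
import numpy as np

def maxdet_completion(r_a, r_b, n_shared):
    """Fill the unknown cross-correlations by determinant maximisation.

    r_a      : correlation matrix of data set A, ordered [A-only, shared]
    r_b      : correlation matrix of data set B, ordered [shared, B-only]
    n_shared : number of variants present in both sets (an assumption
               of this sketch, not taken from the abstract)
    """
    s = n_shared
    p = r_a.shape[0] - s          # variants only in A
    q = r_b.shape[0] - s          # variants only in B

    if s == 0:
        # No overlap: the determinant is largest when the cross-block is zero.
        r13 = np.zeros((p, q))
    else:
        r12 = r_a[:p, p:]         # A-only vs shared
        r22 = r_a[p:, p:]         # shared vs shared
        r23 = r_b[:s, s:]         # shared vs B-only
        # Closed form R13 = R12 @ inv(R22) @ R23, computed via a linear solve.
        r13 = r12 @ np.linalg.solve(r22, r23)

    n = p + s + q
    full = np.empty((n, n))
    full[:p + s, :p + s] = r_a    # known block for data set A
    full[p:, p:] = r_b            # known block for data set B
    full[:p, p + s:] = r13        # reconstructed cross-block
    full[p + s:, :p] = r13.T
    return full

# Toy check on an AR(1)-type matrix, whose inverse is tridiagonal, so the
# completion is expected to reproduce the true cross-block closely.
idx = np.arange(6)
true_r = 0.7 ** np.abs(idx[:, None] - idx[None, :])
rec = maxdet_completion(true_r[:4, :4], true_r[2:, 2:], n_shared=2)
print(np.max(np.abs(rec - true_r)))
```

The toy matrix is chosen only because its structure makes the completion easy to verify. Real gene-level correlation matrices will generally be recovered approximately, and a near-singular shared block may need regularisation before the solve.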

About the authors

G. Svishcheva

Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences; Vavilov Institute of General Genetics, Russian Academy of Sciences

Author for correspondence.
Email: gulsvi@mail.ru
Russian Federation, 630090, Novosibirsk; 119991, Moscow

A. Kirichenko

Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences

Email: gulsvi@mail.ru
Russian Federation, 630090, Novosibirsk

N. Belonogova

Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences

Email: gulsvi@mail.ru
Russian Federation, 630090, Novosibirsk

E. Elgaeva

Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University

Email: gulsvi@mail.ru
Russian Federation, 630090, Novosibirsk; 630090, Novosibirsk

A. Tsepilov

Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences

Email: gulsvi@mail.ru
Russian Federation, 630090, Novosibirsk

I. Zorkoltseva

Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences

Email: gulsvi@mail.ru
Russian Federation, 630090, Novosibirsk

T. Axenovich

Federal Research Centre, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences

Email: gulsvi@mail.ru
Russian Federation, 630090, Novosibirsk

Copyright © Russian Academy of Sciences, 2024