Artificial intelligence and classical methods in animal genetics and breeding

Abstract

The article reviews basic methods of population genetics and animal breeding, together with the mathematical methods of machine learning used in animal breeding. Models from the CatBoost library were trained on two domesticated species as examples: the domestic horse (Equus caballus) and the reindeer (Rangifer tarandus). The training data came from microsatellite panels of 16 and 17 loci, respectively, and covered domesticated and wild reindeer as well as European and Russian horse breeds. Model performance was assessed with the standard metrics: accuracy, precision, recall and F1. Confusion matrices were constructed. New possibilities for identifying the breed affiliation of animals are shown.
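As an illustration of the metrics named in the abstract, the following minimal sketch builds a confusion matrix and derives accuracy, precision, recall and F1 from it. It is not the authors' code: the function names, the two class labels and the toy predictions are assumptions made for this example, and no real genotype data or CatBoost model is involved.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


def classification_metrics(matrix, labels):
    """Return overall accuracy and per-class precision, recall and F1."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    accuracy = correct / total if total else 0.0
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]                          # true positives
        fn = sum(matrix[i]) - tp                   # rest of the row
        fp = sum(row[i] for row in matrix) - tp    # rest of the column
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Hypothetical toy data: labels and predictions are invented for illustration.
labels = ["domestic", "wild"]
y_true = ["domestic"] * 4 + ["wild"] * 4
y_pred = ["domestic", "domestic", "domestic", "wild",
          "wild", "wild", "wild", "domestic"]

matrix = confusion_matrix(y_true, y_pred, labels)
accuracy, report = classification_metrics(matrix, labels)
print(matrix)
print(accuracy)
print(report)
```

In a real workflow the predicted labels would come from a trained classifier; the same functions apply unchanged to the multiclass case, such as a set of breeds, by extending the list of labels.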

About the authors

A. Soloshenkov

Vavilov Institute of General Genetics, Russian Academy of Sciences; Russian State Agrarian University – Moscow Timiryazev Agricultural Academy

Corresponding author
Email: alesol@rgau-msha.ru
Russia, 119991, Moscow; 127434, Moscow

E. Soloshenkova

Vavilov Institute of General Genetics, Russian Academy of Sciences

Email: alesol@rgau-msha.ru
Russia, 119991, Moscow

M. Semina

Vavilov Institute of General Genetics, Russian Academy of Sciences

Email: alesol@rgau-msha.ru
Russia, 119991, Moscow

N. Spasskaya

Zoological Museum, Moscow State University

Email: alesol@rgau-msha.ru
Russia, 125009, Moscow

V. Voronkova

Vavilov Institute of General Genetics, Russian Academy of Sciences

Email: alesol@rgau-msha.ru
Russia, 119991, Moscow

Y. Stolpovsky

Vavilov Institute of General Genetics, Russian Academy of Sciences

Email: alesol@rgau-msha.ru
Russia, 119991, Moscow

Supplementary files

1. JATS XML
2. Figure 1. Heat map of allele frequencies for 54 horse breeds. The color, from yellow to red, indicates the frequency of the allele in the population. A private allele at the 14th locus, HTG7, is identified for the Russian riding breed (RVP).
3. Figure 2. Population structure of stud horse breeds. Orange: Akhal-Teke; blue: Budyonny; red: feral horses of Vodny Island; blue: Don; green: Russian heavy draft; pink: Russian riding; yellow: Soviet heavy draft.
4. Figure 3. Distribution of riding and heavy draft horse breeds in the space of two principal components, compared with the feral horses of Vodny Island to clarify the origin of this population. Wild: feral horses; Buden: Budyonny breed; Shaelteke: Akhal-Teke horses of the Shael stud farm; Don: Don horse; rvp2, rvpstar, rvp3: samples of the Russian riding breed; Rustyazh: Russian heavy draft breed; Sovtyazh: Soviet heavy draft breed.
5. Figure 4. Tree construction by the UPGMA method. a: dichotomous clustering of chum salmon samples; b: principal component method.
6. Figure 5. Confusion matrix.
7. Figure 6. Confusion matrix of the example.
8. Figure 7. VGG-16 architecture.
9. Figure 8. Confusion matrix for each pair of studied breeds.
10. Figure 9. Confusion matrix of the binary classification model for domestic and wild reindeer. 0: domestic; 1: wild.
11. Figure 10. Confusion matrix of the model for domestic reindeer breeds and their wild populations.

Copyright © Russian Academy of Sciences, 2024