Evaluation of the minimum number of markers for individual ancestry estimation in an Argentinean population sample

Authors

  • María Gabriela Russo, Universidad Maimónides
  • Francisco Di Fabio Rocca, Universidad Maimónides
  • Patricio Doldán, Universidad Maimónides
  • Darío Gonzalo Cardozo, Universidad Maimónides
  • Cristina Beatriz Dejean, Universidad Maimónides
  • Verónica Seldes, Universidad de Buenos Aires. Facultad de Filosofía y Letras. Instituto de Ciencias Antropológicas. Sección de Antropología Biológica
  • Sergio Avena, Universidad Maimónides

DOI:

https://doi.org/10.31048/1852.4826.v9.n1.12579

Keywords:

number of AIMs, individual ancestry, Argentinean population

Abstract

Estimating individual ancestry is highly relevant for studying population composition in regions such as South America, where intensive admixture has occurred, and it is also important in the biomedical sciences. It is therefore important to assess the factors that may affect the reliability of the results. In this work, we investigate the minimum number of ancestry informative markers (AIMs) needed to obtain acceptable ancestry estimates. As an example, we use individuals from a population sample drawn from different Argentinean regions. Assuming a three-component model (Native American, Eurasian and Sub-Saharan), we calculated the ancestry of 441 individuals using 10, 20, 30 and 50 AIMs. The results indicate that the number of markers affects ancestry estimation and that accuracy increases with the number of AIMs. Compared with previous estimates obtained from 99 AIMs, the results show that at least 30 markers are needed to achieve good correlation values for the minority component (Sub-Saharan in this case). For individual ancestry studies, we suggest taking into account not only the number of markers but also their informativeness and the background of the studied population.
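
The comparison described in the abstract, in which ancestry estimates from a reduced marker panel are correlated with those from the 99-AIM panel for each component, could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which correlation coefficient was used (Pearson is assumed here), the function and variable names are hypothetical, and the proportions are made-up toy values, not data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / sqrt(vx * vy)


# Component order used in every tuple below (three-component model).
COMPONENTS = ("native_american", "eurasian", "sub_saharan")


def correlation_by_component(reference, reduced):
    """Correlate per-individual ancestry proportions, one component at a time.

    Both arguments are lists of (native_american, eurasian, sub_saharan)
    tuples, one tuple per individual, listed in the same individual order.
    """
    return {
        name: pearson([row[i] for row in reference],
                      [row[i] for row in reduced])
        for i, name in enumerate(COMPONENTS)
    }


# Toy, made-up proportions for four hypothetical individuals (NOT study data).
reference_panel = [(0.30, 0.65, 0.05), (0.10, 0.88, 0.02),
                   (0.55, 0.40, 0.05), (0.20, 0.70, 0.10)]
reduced_panel = [(0.35, 0.63, 0.02), (0.05, 0.90, 0.05),
                 (0.60, 0.38, 0.02), (0.25, 0.67, 0.08)]

result = correlation_by_component(reference_panel, reduced_panel)
for name, r in result.items():
    print(f"{name}: r = {r:.3f}")
```

Reporting the correlation separately for each component matters because a minority component with low variance across individuals can correlate poorly even when the majority components agree well, which is the situation the abstract describes for the Sub-Saharan component.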

Author Biographies

María Gabriela Russo, Universidad Maimónides

Licentiate in Biological Sciences and doctoral student (UBA). CONICET fellow, Universidad Maimónides.

Francisco Di Fabio Rocca, Universidad Maimónides

Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides. CONICET.

Patricio Doldán, Universidad Maimónides

Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides.

Darío Gonzalo Cardozo, Universidad Maimónides

Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides. Sección de Antropología Biológica, ICA, Facultad de Filosofía y Letras, Universidad de Buenos Aires. CONICET.

Cristina Beatriz Dejean, Universidad Maimónides

Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides. Sección de Antropología Biológica, ICA, Facultad de Filosofía y Letras, Universidad de Buenos Aires, Argentina.

Verónica Seldes, Universidad de Buenos Aires. Facultad de Filosofía y Letras. Instituto de Ciencias Antropológicas. Sección de Antropología Biológica

Consejo Nacional de Investigaciones Científicas y Técnicas. 

Sergio Avena, Universidad Maimónides

Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides. Sección de Antropología Biológica, ICA, Facultad de Filosofía y Letras, Universidad de Buenos Aires. CONICET.

Published

2016-06-22

How to Cite

Russo, M. G., Di Fabio Rocca, F., Doldán, P., Cardozo, D. G., Dejean, C. B., Seldes, V., & Avena, S. (2016). Evaluation of the minimum number of markers for individual ancestry estimation in an Argentinean population sample. Revista Del Museo De Antropología, 9(1), 49–56. https://doi.org/10.31048/1852.4826.v9.n1.12579

Section

Biological Anthropology