Evaluation of the minimum number of markers for individual ancestry estimation in an Argentinean population sample

Authors

  • María Gabriela Russo, Universidad Maimónides
  • Francisco Di Fabio Rocca, Universidad Maimónides
  • Patricio Doldán, Universidad Maimónides
  • Darío Gonzalo Cardozo, Universidad Maimónides
  • Cristina Beatriz Dejean, Universidad Maimónides
  • Verónica Seldes, Universidad de Buenos Aires. Facultad de Filosofía y Letras. Instituto de Ciencias Antropológicas. Sección de Antropología Biológica
  • Sergio Avena, Universidad Maimónides

DOI:

https://doi.org/10.31048/1852.4826.v9.n1.12579

Keywords:

number of AIMs, individual ancestry, Argentinean population

Abstract

Estimating individual ancestry is highly relevant to the study of population composition in regions such as South America, where intensive admixture has occurred, and it is also important in the biomedical sciences. It is therefore important to assess the factors that may affect the reliability of the results. In this work, we investigate the minimum number of ancestry informative markers (AIMs) needed to obtain acceptable ancestry estimates. As an example, we use a population sample of individuals from different regions of Argentina. Assuming a three-component model (Native American, Eurasian and Sub-Saharan), we calculated the ancestry of 441 individuals using 10, 20, 30 and 50 AIMs. The results indicate that the number of markers affects ancestry estimation and that accuracy increases with the number of AIMs. Compared with previous estimates obtained from 99 AIMs, the results show that at least 30 markers are needed to achieve good correlation values for the minority component (Sub-Saharan in this case). For individual ancestry studies, we suggest taking into account not only the number of markers but also their informativeness and the background of the studied population.
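
The comparison described in the abstract (ancestry estimated from reduced marker sets, then correlated with the estimates from the full 99-marker panel) can be illustrated with a small simulation. The Python sketch below is not the authors' pipeline. The allele frequencies (0.9 in one ancestral population, 0.1 in the others), the simulated ancestry proportions, the simple EM estimator with known ancestral frequencies, and all function names are assumptions made purely for illustration.

```python
import math
import random


def simulate_genotype(q, freqs, rng):
    """Draw 0/1/2 allele counts per marker for one admixed individual."""
    geno = []
    for f in freqs:
        p = sum(qk * fk for qk, fk in zip(q, f))  # allele frequency under admixture
        geno.append(sum(rng.random() < p for _ in range(2)))
    return geno


def estimate_ancestry(geno, freqs, n_iter=50):
    """EM estimate of ancestry proportions, assuming known ancestral frequencies."""
    k = len(freqs[0])
    q = [1.0 / k] * k
    for _ in range(n_iter):
        counts = [0.0] * k
        for g, f in zip(geno, freqs):
            # g copies of the scored allele, 2 - g copies of the alternative allele
            for copies, probs in ((g, f), (2 - g, [1.0 - x for x in f])):
                if copies == 0:
                    continue
                w = [qk * pk for qk, pk in zip(q, probs)]
                total = sum(w)
                for i in range(k):
                    counts[i] += copies * w[i] / total
        q = [c / sum(counts) for c in counts]
    return q


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def compare_panels(n_individuals=60, n_markers=99, subsets=(10, 20, 30, 50), seed=1):
    """Correlate reduced-panel estimates with full-panel estimates, per component."""
    rng = random.Random(seed)
    k = 3
    # Hypothetical AIMs: each marker is frequent (0.9) in one random ancestral
    # population and rare (0.1) in the other two.
    freqs = []
    for _ in range(n_markers):
        f = [0.1] * k
        f[rng.randrange(k)] = 0.9
        freqs.append(f)
    # Hypothetical ancestry proportions; component 2 is the minority one.
    people = []
    for _ in range(n_individuals):
        draws = [rng.gammavariate(a, 1.0) for a in (6.0, 3.0, 0.5)]
        q = [d / sum(draws) for d in draws]
        people.append(simulate_genotype(q, freqs, rng))
    full = [estimate_ancestry(g, freqs) for g in people]
    result = {}
    for n in subsets:
        # Reduced panel: the first n markers (marker order is random here).
        part = [estimate_ancestry(g[:n], freqs[:n]) for g in people]
        result[n] = [pearson([p[i] for p in part], [r[i] for r in full])
                     for i in range(k)]
    return result


if __name__ == "__main__":
    for n, rs in compare_panels().items():
        print(n, [round(r, 2) for r in rs])
```

Each printed row gives, for one reduced panel size, the correlation with the full-panel estimate for each of the three simulated components. Real AIM panels differ in informativeness from marker to marker, so the values from this sketch should not be read as a reproduction of the reported results.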

Author Biographies

  • María Gabriela Russo, Universidad Maimónides
    Licentiate in Biological Sciences and doctoral student (UBA). CONICET fellow, Universidad Maimónides.
  • Francisco Di Fabio Rocca, Universidad Maimónides
    Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides. CONICET.
  • Patricio Doldán, Universidad Maimónides
    Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides.
  • Darío Gonzalo Cardozo, Universidad Maimónides
    Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides. Sección de Antropología Biológica, ICA, Facultad de Filosofía y Letras, Universidad de Buenos Aires. CONICET.
  • Cristina Beatriz Dejean, Universidad Maimónides
    Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides. Sección de Antropología Biológica, ICA, Facultad de Filosofía y Letras, Universidad de Buenos Aires, Argentina.
  • Verónica Seldes, Universidad de Buenos Aires. Facultad de Filosofía y Letras. Instituto de Ciencias Antropológicas. Sección de Antropología Biológica
    Consejo Nacional de Investigaciones Científicas y Técnicas.
  • Sergio Avena, Universidad Maimónides
    Equipo de Antropología Biológica, Departamento de Cs. Naturales y Antropológicas, CEBBAD, Fundación de Historia Natural Félix de Azara, Universidad Maimónides. Sección de Antropología Biológica, ICA, Facultad de Filosofía y Letras, Universidad de Buenos Aires. CONICET.

Published

2016-06-22

Section

Biological Anthropology

How to Cite

Russo, M. G., Di Fabio Rocca, F., Doldán, P., Cardozo, D. G., Dejean, C. B., Seldes, V., & Avena, S. (2016). Evaluation of the minimum number of markers for individual ancestry estimation in an Argentinean population sample. Revista del Museo de Antropología, 9(1), 49-56. https://doi.org/10.31048/1852.4826.v9.n1.12579
