Genetic Admixture in the Mexican Mestizo Population and implications for Future Medical and Anthropological Studies.
Ricardo M. Cerda-Flores and Augusto Rojas-Martínez. Facultad de Enfermería y Centro de Investigación y Desarrollo en Ciencias de la Salud, Universidad Autónoma de Nuevo León. Calle Carlos Canseco, S.N. Colonia Mitras Centro. Monterrey, Mexico. C.P. 64460. E-mail:
[email protected].
Common complex diseases like cardiovascular disorders, cancer, asthma, diabetes, high blood pressure, schizophrenia, cleft lip and palate, etc. are the main focus of current genomic and epidemiological research. For most of these illnesses, the ethnic background is suspected to play an important ethiopathogenic role. Genomewide association studies have been conducted to find genetic sequences or genes associated to several of these multifactorial disorders, but most of them have found a small effect size and low heredability to be valuable for genetic probability testing.1 It is clear than in a near future; better planning of translational-focused genomics research and better bioinformatic approaches will improve our knowledge on gene-gene and gene-environment interactions, and genome tests will find their value to predict clinical outcomes. 2 Most of these genome analyzes have been conducted in subjects from particular ethnic groups, mainly Caucasians, Africans, and East Asians for whom specific and defined panel of polymorphic markers have been developed after the conclusion of the HapMap project 3,
1
but conducting these studies in populations with ethnic admixture represent a challenge, as well an opportunity to understand the genetic contributions to mixed racial phenotypes. Because of their recent history, the admixed population of Latin America constitutes an interesting case-study for clinical genomics research.
Latin America contributes to about 8.3% of the global population4 and is an active growing population, not just in the subcontinent, but in industrialized countries like the US, Canada and some European countries. The main ethnic population of this area is currently known “Hispanic” in the medical literature, a very confusing word to describe a mixed population descending from at least three main parental groups: Native Americans, Caucasians from Spain, and Africans. In its most precise definition, Hispanic means native or belonging to Spain, but this term is used to define Spanish speakers, or subjects with Spanish-derived last name, although this can include Philippine, a group clearly different than the groups populating Latin America. The Medical Subject Headings of the US National Library of Medicine (MeSH) define Hispanic Americans as: “any person living in the United States of Mexican (Mexican Americans), Puerto Rican, Cuban, Central or South American, or other Spanish culture or origin. The concept does not include Brazilian Americans or Portuguese Americans”.5 This concept was introduced in this index in 1978, probably based on a former concept of the US Census Bureau, which recently states that: “U.S. federal government agencies must adhere to standards issued by the Office of Management and Budget (OMB) in October 1997, which specify that race and Hispanic origin (also known as ethnicity) are two separate and distinct concepts. These standards generally reflect a
2
social definition of race and ethnicity recognized in this country and they do not conform to any biological, anthropological, or genetic criteria. The standards include five minimum categories for data on race: "American Indian or Alaska Native," "Asian," "Black or African American," "Native Hawaiian or Other Pacific Islander," and "White." There are two minimum categories for data on ethnicity: "Hispanic or Latino" and "Not Hispanic or Latino." The concept of race reflects self-identification by people according to the race or races with which they most closely identify”. 6 As reflected, Hispanic is still a vague notwell defined concept, even for two agencies of the same government; but in the world medical literature, this definition has become operational in the clinical and biomedical research. Although debatable, it may be preferable to use the Spanish term “Mestizo”, which describes the descendents of mating resulting from the above described parental ethnic groups, to conclude that the large proportion of the Latin American population is constituted by Mestizos.
Proportions of original ethnic groups that arrived to the Americas and Mestizos differ from each Latin American country, and even from different regions inside some countries. Per example, Mexican population represents a particular intermingle of racial admixture with a predominant Mestizo population and a representative proportion of American aborigines (5.6%) representing 62 linguistic groups that are unevenly distributed across the country. 7 Mestizos constitute the predominant group in the Mexican urban areas and the main focus of healthcare coverage. Several anthropologic and genetic studies have shown a gradient distribution of the Mestizo to Native American populations along the
3
North-South axis of the country, with relevant African contributions in communities of the Gulf and Pacific coasts. There is also some presence of descendants from other geographic areas, mainly from the Middle East, Central and South Europe, Portugal, and East Asia that mostly arrived during the 20th century.
Initial works on the genetic structure of the Mexican population, based on blood groups, polymorphic proteins, hemoglobin variants, HLA and DNA markers, where crucial to define the inter-racial variability, mainly among the Caucasian, Native American, and African components from diverse Mexican communities;8 but along the time, the issue of the ethnic admixture in the Mestizo started to generate interest among the scholars. 9 An intriguing question for population geneticists concerns about the homogeneity of the Mexican population, or more precisely, the Mestizo Mexican population. The appropriate answer to this question is fundamental for the design of genome-based population studies, genomewide association studies for complex disease, and pharmacogenomics, among others, because there is always the risk of structure subestimation.
To develop the topic of genetic homogeneity in the Mexican population, this chapter will focus on reports of open population genetic studies, preferable on those carried out with anonymous genetic markers, to avoid confounding factors associated to possible genetic markers associated with prevalent complex diseases or particular premorbid life styles. We also discarded studies performed with non-randomly selected subjects, as university students, groups with defined income, religious communities, etc. We also will follow a
4
historic perspective of the described contributions in order to correlate results with the advancements in the methodologies for genetic marker screening applied to population studies.
Monterrey, Nuevo León is the third largest urban conglomerate in Mexico and the main metropolitan area at the northeast of the country. Cerda-Flores RM, et al (1987) analyzed the genetic admixture in a randomly selected sample of 400 inhabitants from Monterrey and the genetic distances from parental populations.10 Allele frequencies of the ABO, Rh, MNSs, Duffy, P, Kidd and Lutheran blood group systems within populations from Monterrey were determined and grouped in accordance with the grandparent’s place of birth. Data from parental populations was obtained from published data on Tlaxcaltecan Native Americans and Mestizos11, West Europeans (Spaniard, Portuguese, French, German, and Sephardic Jews)12,13, and Africans12. After estimations of genetic admixture and distances, it was found that the gene frequencies in inhabitants of the metropolitan area were intermediate to those found in their putative ancestral populations, confirming that they were Mestizo. The genetic structure resulted different from that reported by Lisker R, et al. (1980) for the Mestizo population from Central Mexico, who found a predominant genetic contribution of Native American,8 but similar to the MexicanAmerican population from South Texas reported by Hanis CL, et al. (1986). 14 Cerda-Flores R, et al. also found that the European genetic contribution was higher in the subpopulation descending from grandparents born in Northeastern Mexico, than in the subpopulation descending from grandparents born in Central or South Mexico.
5
Studies of genetic admixture in the Mexican-American population of Starr County, Texas, are relevant contributions to understand Mexican admixture. Mexican Mestizos account for 93.8% of the population of this county located at the shared border between South Texas and Northeast Mexico. In a study reported by Hanis CL, el al. (1986), using a collection of 1,004 randomly selected samples from Mestizo subjects of this county, it was originally established that the gene pool of this community had contributions of 61, 31, and 8% from the Spanish, Native American, and West African parental populations, respectively.14 In a subsequent study of subpopulations of Mexican-Americans divided by birth place (Mexico and Texas) using the same sample collection, Cerda-Flores RM, et al (1992) aimed at establishing the genetic diversity and to estimate genetic admixture with 21 polymorphic erythrocyte enzymes and plasma proteins as genetic markers. They found that both subpopulations were genetically homogeneous and similar, with more than 99% of the total average gene diversity (HT) attributed to individual variation within the population. Again, the genetic admixture analysis showed a predominant influence coming from the Spanish, followed by a lesser contribution from Amerindians, and a slight one from the West Africans. Contributions from the ancestral populations to both subpopulations were similar, establishing that Mexican-Americans of Starr County were similar to northeastern Mexicans.15 An additional study of genetic variation by birth cohorts in Mexican Americans of the same Texan county, grouped in three cohorts (18961925, 1926-1955, 1956-1985) and using 21 genetic markers, was conducted to ascertain the extent of birth cohort-related genetic variation within this population and the genetic
6
differences from the Monterrey population. The three cohort groups were genetically indistinguishable. Gene diversity analysis attributed more than 99.8% of the total gene diversity to variation between individuals within the birth cohort populations. Subdivision by birth cohort contributed with only 0.18% to the total gene diversity. Once more, the genetic admixture analysis suggested a predominant influence from the Spaniards. The genetic structure of the Mexican American population of Starr County was found again similar to the population of Monterrey.16
After the initial study of genetic admixture in the Monterrey area, a subsequent report was published by Cerda-Flores RM, et al. which describes the genetic structure of the populations migrating to the Monterrey metropolitan area, from the central states of San Luis Potosí and Zacatecas.17 4,680 residents of Monterrey were grouped by generation and birthplace of the four grandparents (Monterrey, San Luis Potosí, and Zacatecas) to determine the extent of genetic variation within this population and the genetic distances among Monterrey natives and the immigrant populations from the above mentioned states. Nine blood group systems were analyzed as genetic markers. The genetic distance analysis indicated that immigrants from San Luis Potosí and Zacatecas were similar to the Monterrey population, irrespective of birthplace and generation. More than 96% of the total gene diversity was attributed to individual variation within the population. The genetic admixture analysis suggested that populations from the three Mexican locations, stratified by birthplace and generation, received a predominant Spanish contribution (78.5%), followed by a Native American contribution (21.5%). An admixture analysis of the
7
population from Nuevo León stratified by generation, indicated a substantial contribution from Monterrey (64.6%), followed by Zacatecas (22.1%) and San Luis Potosí (13.3%).
After the description of DNA profiling by Jeffreys AJ, et al. (1985),18 a genotyping technique based in the analysis of polymorphic DNA sequences, classical serological genetic markers, like blood groups, HLA, and polymorphic proteins and enzymes were become obsolete, mainly because of their restricted informative power and the simplicity of the DNA genotyping techniques. Genotyping based on DNA polymorphic markers and improvements in marker screening methodologies impacted positively in the understanding of the genetic structure of the human populations and these tools were rapidly assimilated for the study of admixture in Mexico. A genetic admixture study of three Mexican Mestizo populations (Nuevo León, Jalisco, and Mexico City) based on polymorphisms of the D1S80 and HLA-DQA1 loci,19 showed that allele frequency distributions were relatively homogenous in the three samples; with only the Mexico City population showing minor differences of the HLA-DQA1 allele frequencies. Genetic admixture analyses demonstrated contributions ranging from 60-70% from SpanishEuropean, 37-49% from Native American, and 1-3% from African parental populations. These estimates were virtually the same as those reported earlier from blood group loci. A further study of 143 Mestizos from Northeast Mexican subjects in whom 13 short tandem repeat (STR) loci were genotyped (D3S1358, vWA, FGA, D8S1179, D21S11, D18S51, D5S818, D13S317, D7S820, CSF1PO, TPOX, TH01, and D16S539), maximum likelihood estimates of admixture components generated a trihybrid model with
8
admixture proportions of 54.99% +/- 3.44, 39.99% +/- 2.57, and 5.02% +/- 2.82 for the Spanish, Native American, and African contributions, respectively (not different from those obtained using D1S80 and HLA-DQA1 loci). Figure 1 shows the admixture contributions to Mestizos from these locations. Nuevo León 3% 37% 60%
Mexico City 1%
Jalisco 1% 49%
50%
43% 56%
Spanish Native American African
Figure 1. Contributions from parental populations to three Mestizo populations using 13 STR markers.19
Larger panels of DNA polymorphic markers resulted after the conclusion of the Human Genome Project, as well as new techniques for hightroughput genotyping, like the microarray genotyping and the next generation sequencing platforms. A large study to understand the genomic diversity of the Mexican Mestizos was carried out by the National Institute of Genomic Medicine of Mexico in 300 samples of unrelated Mestizos from the 9
Mexican states of the north (Sonora and Zacatecas), Central (Guanajuato), Gulf and Pacific coasts (Veracruz and Guerrero, respectively), and south (Oaxaca and Yucatán) and 30 Native Americans (Zapoteco from Oaxaca).20 A microarray platform was used for the screening of 99,953 single nucleotide polymorphisms (SNPs) and a subset of 1,814 ancestry informative markers was generated to analyze genomewide distribution and minimize linkage disequilibrium between SNPs. Data on ancestral contributions came from the International Hap Map Project (Northern Europeans, Africans, Chinese, and Japanese data sets).3 Mean ancestries were 55.2 ± 15.4% for Native American, 41.8 ± 15.5% for European, 1.8 ± 3.5 for African, and 1.2 ± 1.8% for East Asian. The highest and lowest estimates of mean European ancestry were 61.6 ± 8.5% for Sonora and 28.5 ± 12.0 % for Guerrero. This study establish that genetic differences among Mexican Mestizos are due to regional differences in the European and Native American parental contributions and also describes an admixture gradient, in which the European contribution predominates in the Northern population, while populations in Central states are closer to the Native American studied community. Some new private alleles are found in the study that clearly shows a Native American origin. Despite the fact that the African contribution is low (