Languages, geography and HLA haplotypes in Native American and Asian populations

M. V. Monsalve1,2*, A. Helgason3 and D. V. Devine2

1Department of Anthropology and Sociology and 2Department of Pathology, University of British Columbia, Vancouver, BC, Canada V6T 2B5 and 3Department of Biological Anthropology, University of Oxford, Oxford OX2 6QS, UK

A number of studies based on linguistic, dental and genetic data have proposed that the colonization of the New World took place in three separate waves of migration from North-East Asia. Recently, other studies have suggested that only one major migration occurred. It is the aim of this study to assess these opposing migration hypotheses using molecular-typed HLA class II alleles to compare the relationships between linguistic and genetic data in contemporary Native American populations. Our results suggest that gene flow and genetic drift have been important factors in shaping the genetic landscape of Native American populations. We report significant correlations between genetic and geographical distances in Native American and East Asian populations. In contrast, a less clear-cut relationship seems to exist between genetic distances and linguistic affiliation. In particular, the close genetic relationship of the neighbouring Na-Dene Athabaskans and Amerindian Salishans suggests that geography is the more important factor. Overall, our results are most congruent with the single migration model.

Keywords: human evolution; HLA class II haplotypes; Native Americans; Asian populations

1. INTRODUCTION

The distribution of allele frequencies of classical genetic markers indicates that America, in particular South America, is one of the most genetically variable regions in the world (Cavalli-Sforza et al. 1994). Studies based on mitochondrial DNA (mtDNA), archaeological, linguistic and dental data have suggested that the Americas were populated through three separate waves of migration (Amerind, Na-Dene and Eskimo) derived from only four Asian ancestral populations (Torroni et al. 1992). It has been hypothesized that (i) the first migration occurred 15 000-30 000 years ago and its descendants are the Amerinds, (ii) the Na-Dene are the descendants of the second migration 10 000-15 000 years ago and (iii) the last migration occurred 6000-9000 years ago and gave rise to the Aleut-Eskimos (Greenberg et al. 1986). A study of mtDNA in the Na-Dene-speaking Haida and two Amerindian groups from the Pacific North-West coast indicated a reduced amount of diversity in the Haida group (Ward et al. 1993). This reduced diversity could be explained by the Haida having originated more recently than the Amerindians and, thus, belonging to a separate wave of migration into the continent. On the basis of these findings, it has been suggested that there may have been two separate migrations into the Americas involving a single source population. Hence, it is supposed that the Na-Dene and Eskimo-Aleut populations originated from an ancestral Asiatic source population that also gave rise to the Amerind mtDNA gene pool (Shields et al. 1993).

*Author for correspondence ([email protected]).

Proc. R. Soc. Lond. B (1999) 266, 2209-2216. Received 28 June 1999. Accepted 28 July 1999. © 1999 The Royal Society

A number of other studies have supported a single wave of migration to the New World. Thus, for example, the distribution of the four founding mtDNA haplogroups in Native Americans has been interpreted as evidence for a single wave of migration into the New World, which included multiple variants of three of the lineages (Merriwether et al. 1995; Easton et al. 1996). Several mtDNA studies trace all Native American populations to a single ancestral founder population that lived in the region of Mongolia/North China (Merriwether et al. 1995; Kolman et al. 1996). It has been suggested that a major wave of migration of a population ancestral to the Amerinds from north-eastern Siberia to America occurred 20 000-25 000 years ago (Forster et al. 1996). A rapid expansion of Beringian source populations ca. 11 300 years ago is subsequently held to have given rise to the present Eskimo and Na-Dene populations (Forster et al. 1996). Yet another study posited the separation of Amerinds from the Na-Dene, Eskimo and probably from the Siberian Chukchi as a result of glacial blocking of the Alberta land corridor some 14 000-20 000 years ago (Bonatto & Salzano 1997). A recent mtDNA study in a prehistoric Native American population also suggested a single wave of migration with considerable mtDNA diversity, which bears signals of a population expansion 23 000-37 000 years ago (Stone & Stoneking 1998). An analysis of the genetic variation in the Y chromosome in Amerindian and Na-Dene populations placed the age of the ancestral founder haplotype at 22 770 years (minimum 13 500 years and maximum 58 700 years) (Bianchi et al. 1998). The finding of this haplotype in high frequencies in Amerindian, Na-Dene and Eskimo-Aleut linguistic groups seems to favour the hypothesis of a common origin for all Native Americans based on studies of classic genetic markers (Szathmary 1984). A recent study with Y-chromosome markers suggests that separate migrations brought at least two of the major paternal founders to the New World (Karafet et al. 1999).

Although mtDNA and Y-chromosome markers provide important information about human dispersal to the Americas, they only represent two independent loci. Individual loci sampled from the same population can have very different genetic histories that are not necessarily informative about the population's genetic history. Further studies of the genetic patterns exhibited by nuclear molecular genetic markers are essential to verify and add to the emerging picture of human colonization of the New World.

The histocompatibility system (HLA) is the most polymorphic system in humans and has been typed extensively in a number of populations (Bodmer et al. 1992). The HLA system is highly polymorphic because of balancing selection, which maintains a few allelic lines over very long periods (Harpending et al. 1998). The extensive variation in HLA markers makes this system highly useful for determining genealogical relationships between populations. To date, the DRB1 locus has been most extensively surveyed in Native American populations. Many of these studies indicate a reduced number of DRB1 alleles in these groups when compared to populations outside the New World (Cerna et al. 1993; Yunis et al. 1994; Monsalve et al. 1998). The observation that certain allelic lineages are missing in all Amerindian groups lends support to the idea of an 'into America' population bottleneck (Erlich et al. 1997).

The aims of the present study are twofold. The first is to explore the genetic relationships of British Columbian populations to other New World and East Asian populations.
Second and more generally, the study aims to establish whether the combination of linguistic, geographical and genetic data from HLA class II loci can contribute to current knowledge about the peopling of the Americas. In a previous study the allelic variation of HLA class I and class II loci was determined in Athabaskans (Na-Dene) and Penutians (Amerindian) from British Columbia (Monsalve et al. 1998). The current study combines linguistic, geographical and HLA DRB1-DQA1-DQB1 haplotype frequency data for Athabaskans (Na-Dene) and Salishans (Amerindians) from British Columbia with previously published data from a number of different Native American and East Asian populations.

2. MATERIAL AND METHODS

Blood samples were collected from Salishans from the Soowahlie band as part of the recruitment of native people from British Columbia onto the Canadian Unrelated Bone Marrow Donor Registry (UBMDR). The participants completed a written questionnaire on linguistic, ethnic and family heritage and consented in writing to the use of blood for the HLA analysis. The second exons of the DRB1 and DQB1 alleles were amplified using the primers derived from the 11th International Histocompatibility Workshop. For detection, the DRB1 group and subgroup alleles and the DQB1 alleles were typed using non-radioactive sequence-specific oligonucleotide probes as described previously (Monsalve et al. 1998). The DQA1 alleles were typed using the amplification refractory mutation system.

Haplotype frequencies in the British Columbian natives were determined by direct counting of unrelated individuals, that is, individuals with native parents and grandparents and no first- or second-degree relative in the study group. Family data and linkage disequilibrium data were used to infer the most probable DRB1-DQA1-DQB1 haplotypes. In addition, haplotype frequencies were estimated using the maximum-likelihood method (Imanishi et al. 1992a).

DRB1-DQA1-DQB1 haplotype frequencies for other Native American and East Asian populations were obtained from a search of the literature. The Kogui and Kogi data represent the same linguistic group, but were included in our analysis as separate groups because they derive from two different publications (Yunis et al. 1994; Trachtenberg et al. 1996). Table 1 shows the number of chromosomes scored for each Native American and Asian population, the number of different haplotypes found and estimated heterozygosity values. Figure 1 shows a partial world map of the groups included in this study. Figure 2 shows Ruhlen's (1987) language classification of the Native American and Asian groups analysed in this study.

As genotyping rarely reveals all DRB1, DQA1 and DQB1 alleles in a population, the sum of the resultant haplotype frequencies was less than one in a number of populations. However, as many of the methods used to calculate genetic distances and phylogenetic trees require allele frequencies to add up to one, the haplotype frequencies were corrected prior to analysis by dividing by the sum of the haplotype frequencies for each population. Genetic distances based on the haplotype frequencies were calculated using the chord distance method incorporated in the GENDIST program within the PHYLIP v. 3.5c package (Felsenstein 1993).
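The frequency correction and chord distance described above can be sketched as follows. This is a minimal illustration, not the GENDIST implementation: the function names and the example frequencies are hypothetical, and the scaling 4(1 - Σ√(p q))/(a - 1) is one commonly quoted form of the Cavalli-Sforza chord measure that is assumed here rather than taken from the paper.

```python
import math


def normalize(freqs):
    """Rescale haplotype frequencies so that they sum to one."""
    total = sum(freqs.values())
    if total <= 0:
        raise ValueError("frequencies must have a positive sum")
    return {hap: f / total for hap, f in freqs.items()}


def chord_distance_sq(p, q, n_haplotypes=None):
    """Squared chord distance between two normalized frequency profiles.

    Assumed form: 4 * (1 - sum(sqrt(p_i * q_i))) / (a - 1), where a is the
    number of haplotypes. If n_haplotypes is not given, a is taken as the
    number of haplotypes seen in the two populations being compared.
    """
    haplotypes = set(p) | set(q)
    a = n_haplotypes if n_haplotypes is not None else len(haplotypes)
    if a < 2:
        raise ValueError("at least two haplotypes are needed")
    cos_theta = sum(math.sqrt(p.get(h, 0.0) * q.get(h, 0.0)) for h in haplotypes)
    cos_theta = min(cos_theta, 1.0)  # guard against rounding slightly above 1
    return 4.0 * (1.0 - cos_theta) / (a - 1)


# Hypothetical populations whose observed frequencies sum to less than one.
pop_a = normalize({"H1": 0.5, "H2": 0.3})
pop_b = normalize({"H1": 0.2, "H2": 0.2, "H3": 0.4})
print(chord_distance_sq(pop_a, pop_b))
```

Dividing by the per-population sum, as in `normalize`, leaves the relative frequencies of the observed haplotypes unchanged, which is why the correction can be applied before any distance is computed.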
Phylogenetic trees were constructed using the maximum-likelihood (CONTML program) and neighbour-joining (NEIGHBOR program) methods implemented in PHYLIP v. 3.5c. Haplotype frequencies of Hottentots, an African population belonging to the Khoisan family group, were used as an outgroup (Imanishi et al. 1992b). Multidimensional scaling (MDS) was employed as an alternative method of simplifying the multidimensional matrices of genetic distance between populations through a graphical presentation in two-dimensional space. MDS analyses were run using the SPSS 8.0 package.

Correlations between genetic and geographical distances were calculated using Mantel tests (MANTEL program by A. Rogers, available at ftp.utah.edu). Geographical distances were calculated as geodesic (great circle) distances using the geographical coordinates of the population locations as defined in The world atlas (Jones 1978).

A useful method of assaying the relative influence of isolation, migration and genetic drift on the gene pools of a number of related populations is to plot heterozygosity against the distance from the gene frequency centroid (defined as the mean allele frequencies of all the populations included in the analysis; Harpending & Ward 1982). As genetic drift results in both increasing population divergence and decreasing heterozygosity, a strong negative relationship is expected between these variables. Given this negative relationship, it is assumed that populations that plot above the expected regression line have received above average levels of gene flow, while those that fall below the regression line have drifted in greater isolation.
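The Mantel test on genetic and great-circle distances can be illustrated with a short permutation sketch. It is not the MANTEL program used in the study: the haversine formula, the Earth radius, the one-sided permutation p-value and all names and coordinates below are assumptions chosen for illustration.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Off-diagonal entries of a symmetric matrix after reordering rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(a, b, permutations=999, seed=0):
    """Correlation between two symmetric distance matrices and a permutation p-value."""
    identity = list(range(len(a)))
    b_vals = upper_triangle(b, identity)
    r_obs = pearson(upper_triangle(a, identity), b_vals)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of `a` together
        if pearson(upper_triangle(a, perm), b_vals) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical coordinates (latitude, longitude) for four populations.
coords = [(49.0, -122.0), (55.0, -125.0), (5.0, -75.0), (-20.0, -60.0)]
geo = [[great_circle_km(*p, *q) for q in coords] for p in coords]
# Hypothetical genetic distances, here simply proportional to geography.
gen = [[d / 10000.0 for d in row] for row in geo]
r, p_value = mantel(gen, geo)
print(r, p_value)
```

Permuting rows and columns together, rather than shuffling individual cells, respects the fact that the entries of a distance matrix are not independent of one another.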


Table 1. DRB1-DQA1-DQB1 haplotypes and heterozygosity index in Asian populations, South America and British Columbia natives
(n, chromosomes scored. The linguistic classification is according to Ruhlen 1987.)

population            n    number of haplotypes   heterozygosity   reference

Asian populations
Buyi                  126  12                     0.8880           Imanishi et al. (1992b)
Chinese Singapore     140  13                     0.9113           Imanishi et al. (1992b)
Chukchi               116  37                     0.8927           Grahovac et al. (1998)
Eskimos               160  42                     0.9071           Grahovac et al. (1998)
Evenks                70   37                     0.9522           Grahovac et al. (1998)
Guan                  182  19                     0.9113           Gao et al. (1991)
Kets                  44   25                     0.9306           Grahovac et al. (1998)
Korean                190  17                     0.9263           Imanishi et al. (1992b)
Udegeys               42   24                     0.9193           Grahovac et al. (1998)
Koryaks               184  39                     0.8921           Grahovac et al. (1998)
Wajin                 610  29                     0.9274           Imanishi et al. (1992b)

South America natives
Arhuaco               52   12                     0.8386           Yunis et al. (1994)
Arsario               36   4                      0.6500           Yunis et al. (1994)
Bari                  116  6                      0.6928           Layrisse et al. (1995)
Cayapa                200  23                     0.8421           Trachtenberg et al. (1995)
Coreguaje             60   6                      0.6770           Trachtenberg et al. (1996)
East Toba             270  9                      0.8251           Cerna et al. (1993)
Embera                40   10                     0.8135           Trachtenberg et al. (1996)
Ijka                  60   10                     0.6040           Trachtenberg et al. (1996)
Ingano                22   14                     0.9087           Trachtenberg et al. (1996)
Kogi                  28   5                      0.6995           Yunis et al. (1994)
Kogui                 60   4                      0.5842           Trachtenberg et al. (1996)
Mataco Wichi          98   7                      0.7796           Cerna et al. (1993)
Nukak                 40   5                      0.5468           Trachtenberg et al. (1996)
Sikuani               54   8                      0.6750           Trachtenberg et al. (1996)
Ticuna                98   10                     0.7803           Mack & Erlich (1997)
Toba Pilaga           38   7                      0.8094           Cerna et al. (1993)
Tule                  58   12                     0.8754           Trachtenberg et al. (1996)
Waunana               60   9                      0.7811           Trachtenberg et al. (1996)
Wayu                  176  15                     0.8808           Yunis et al. (1994)
Xavante               148  5                      0.7514           Cerna et al. (1993)

British Columbia natives
Athabaskans           124  15                     0.8136           Monsalve et al. (1998)
Salishans             36   11                     0.8371           this study
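As a rough illustration of how the heterozygosity column and the centroid distance used in the heterozygosity-versus-centroid plot relate to haplotype frequencies, the sketch below computes both. The formulas are assumptions, not the authors' exact procedure: heterozygosity is taken as 1 - Σp² (with an optional n/(n - 1) sample-size correction), the centroid as the unweighted mean frequency across populations, and the centroid distance as the average over haplotypes of (p - P)²/(P(1 - P)), with the expected heterozygosity following the commonly quoted Harpending & Ward form H_t(1 - r). All names and frequencies are hypothetical.

```python
def heterozygosity(freqs, n_chromosomes=None):
    """Expected heterozygosity under Hardy-Weinberg: 1 - sum of squared frequencies."""
    h = 1.0 - sum(f * f for f in freqs.values())
    if n_chromosomes is not None and n_chromosomes > 1:
        h *= n_chromosomes / (n_chromosomes - 1)  # optional small-sample correction
    return h


def centroid(populations):
    """Unweighted mean frequency of every haplotype across populations."""
    haplotypes = sorted({h for freqs in populations.values() for h in freqs})
    k = len(populations)
    return {h: sum(f.get(h, 0.0) for f in populations.values()) / k for h in haplotypes}


def centroid_distance(freqs, mean_freqs):
    """Average over haplotypes of (p - P)^2 / (P * (1 - P)); an assumed form."""
    terms = [
        (freqs.get(h, 0.0) - pbar) ** 2 / (pbar * (1.0 - pbar))
        for h, pbar in mean_freqs.items()
        if 0.0 < pbar < 1.0
    ]
    if not terms:
        raise ValueError("no polymorphic haplotypes in the centroid")
    return sum(terms) / len(terms)


def expected_heterozygosity(total_het, r_ii):
    """Expected heterozygosity at distance r_ii from the centroid: H_t * (1 - r_ii)."""
    return total_het * (1.0 - r_ii)


# Hypothetical normalized frequencies for three populations.
pops = {
    "P1": {"H1": 0.6, "H2": 0.4},
    "P2": {"H1": 0.5, "H2": 0.3, "H3": 0.2},
    "P3": {"H1": 0.9, "H3": 0.1},
}
mean = centroid(pops)
total_het = heterozygosity(mean)
for name, freqs in pops.items():
    r_ii = centroid_distance(freqs, mean)
    print(name, heterozygosity(freqs), r_ii, expected_heterozygosity(total_het, r_ii))
```

A population whose observed heterozygosity exceeds the expected value for its centroid distance would plot above the line, which is the pattern the Methods interpret as above-average gene flow.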

3. RESULTS

The analysis of 11 DRB1-DQA1-DQB1 haplotypes in the Salishans identified haplotypes shared (i) with the other Amerindian groups, (ii) with the Athabaskan Na-Dene from British Columbia, Amerindians from South America and Asian populations, (iii) with the Athabaskans from British Columbia and Asian populations and (iv) with Amerindians from South America and Asian populations. The most common haplotype in the Salishans (DRB1*0404-DQB1*0301-DQA1*0302) accounted for 33.3% of the haplotypes in our sample. The DRB1*1402-DQA1*0501-DQB1*0301 haplotype, which was found in high frequencies in Athabaskans (Monsalve et al. 1998), was also found in Native Americans (Troup 1994) and more recently has been reported in Eastern Siberian populations and East Asian populations (Grahovac et al. 1998). Another common haplotype in North American aboriginals and Asian populations (DRB1*0802-DQA1*0401-DQB1*0402) was the second most common haplotype in the Salishans.

DRB1*1602-DQA1*0501-DQB1*0301 had a frequency of 8.3% among the Salishans. DRB1*1602 has previously been reported to be in linkage disequilibrium with DQA1*0501 and DQB1*0301 (Liu et al. 1988), but only in North American aboriginals (Troup 1994). This haplotype has also recently been found in North-East Asian populations (Grahovac et al. 1998). A different arrangement occurs in Asian-Oceanic populations (Gao & Serjeantson 1991) and East Asian populations (Gao et al. 1991; Imanishi et al. 1992b), where DRB1*1602 is linked with the DQB1*0502 allele.

Table 1 shows the sample sizes for each population used in our study, along with the number of haplotypes found and the heterozygosity values estimated assuming Hardy-Weinberg equilibrium. This information supports previous observations of reduced HLA diversity in American populations, with Asian populations having both a significantly greater number of haplotypes and higher heterozygosity (t-test, p < 0.001 in both cases).

Figure 1. Pacific Rim and Bering Sea region with approximate location of Asian and American populations for which HLA DRB1-DQA1-DQB1 haplotypes were analysed.

Both the neighbour-joining and maximum-likelihood trees generated similar overall branching patterns for the Na-Dene, Amerindian and East Asian groups and indicated two major clusters. While the Asians seem to cluster relatively tightly on a shared major branch, the Native American populations show greater interpopulation diversity, occupying a large part of the genetic space described by the maximum-likelihood tree in figure 3. The MDS plot representing the genetic distances calculated from the haplotype frequencies gave a similar population distribution to the maximum-likelihood tree (data not shown). Here we present the results obtained with the maximum-likelihood tree because it is a useful way of presenting genetic distance. It indicates that the two British Columbia populations (the Athabaskans and the Salishans, belonging to the Na-Dene and Amerindian language families, respectively) seem to fall outside the major Amerindian branch, instead occupying an intermediate position between the Asian and other American populations. This observation and the more general configuration of populations seem to suggest an association between geographical and genetic distances. The MDS plot also indicates that the Athabaskans and Salishans cluster together and take up an intermediate position between the Asian populations and other American populations.

The groups located north of the equator in South America cluster together, with the exception of the Arsario and Tule. The latter cluster with groups located south of the equator in South America, with the exception of the Xavante. According to linguistic classification, most of the Chibchan-Paezan language members, with the exception of two populations, cluster with the Equatorial-Tucanoan and Andean languages. Most of the Ge-Pano-Carib language family members are in a different clade. In general, a similar pattern of population relationships was observed in the MDS plot.

All the Siberian Asian groups cluster together and the rest of the Asian groups share a second cluster. The maximum-likelihood tree indicates that the Evenks and Kets from Central Siberia share a clade and cluster together with the Eastern Siberian groups, while the Eastern and South-East Asian populations cluster in a different clade. Following linguistic classification, the Chukchi-Kamchatkan and the Eskimo-Aleut cluster together with the Yeniseian and two members of the Altaic family, while the other two members of the Altaic family, the Austric and the Sino-Tibetan language families, cluster together.

Figure 2. Classification of languages studied here according to Ruhlen (1987). The top line indicates the names of the largest seven linguistic families of the groups. Each linguistic family is split into subgroups that appear from the top to the bottom in vertical sequence forming the branches of a tree.

Figure 3. Maximum-likelihood tree of Native American aboriginals and Asian groups based on HLA DRB1-DQA1-DQB1 haplotype frequencies. The tree consists of Salishans and Athabaskans from British Columbia in North America, 20 Native American groups from South America and 11 groups from Asia. The HLA haplotype frequencies for the Hottentots (Hotts) were used as an outgroup. Linguistic families are represented by shapes.

Mantel tests were used to assay the correlations between genetic and geographical distances quantitatively. Significant correlations were revealed when (i) all the populations in the study were included (n = 33, r = 0.632147 and p < 0.001), (ii) just the Asian populations were included (n = 11, r = 0.438087 and p < 0.001) and (iii) just Native American populations were included (n = 22, r = 0.209837 and p < 0.001). In each case, geography can be seen to explain a significant proportion of the variation in genetic distances between populations, although it is notable that the correlation for New World populations is roughly half of that found for Asian populations.

Figure 4 presents a heterozygosity versus distance from genetic centroid plot for the Native American populations. There is a clear negative relationship between heterozygosity levels and distance from the genetic centroid (R² = 0.82), as predicted for populations that have diverged through genetic drift (Harpending & Ward 1982). Evidently, then, drift has been an important factor in shaping the genetic landscape of Native American populations. Interestingly, the Amerindian Salishans (who are genetically closer to the Na-Dene Athabaskans than to other Amerindian groups) are among four populations identified by the centroid plot as having received above average levels of gene flow (the others are the Ingano, Tule and Ticuna). However, the deviation of the Salishans from the regression line is not marked enough to suggest that their close genetic relationship with the Na-Dene Athabaskans and distance from other Amerindian groups can be explained by gene flow alone.

Figure 4. Plot of heterozygosity (vertical axis) versus distance from genetic centroid (horizontal axis), with populations marked by language family (Na-Dene, Amerindian, total population). The diagonal line is the expected relationship predicted by the model of Harpending & Ward (1982). The Athabaskans (ATH) and Salishans (SAL) are from British Columbia and the Arhuaco (ARHU), Arsario (ARSA), Bari, Cayapa (CAYA), Coreguaje (COR), Embera (EMB), Eastern Toba (EAST), Ijka, Ingano (ING), Kogi, Kogui, Mataco-Wichi (MAT), Nukak, Sikuani, Ticuna, Toba, Tule, Wayu (WAYU), Waunana (WAUN) and Xavante (XAVA) Amerindian groups are from South America.

4. DISCUSSION

There are two alternative explanatory frameworks within which our results could be interpreted. The first supposes that there was a single migration to the New World and that the current genetic differentiation in the Americas can be explained by the fissioning, spread and isolation of groups as they gradually colonized the continent, onto which the ongoing processes of gene flow and genetic drift are superimposed. The second postulates three waves of migration at different times, involving three genetically distinct groups belonging to different language families. Under the first model, one would expect linguistic affiliation to be incidental to genetic relationships between populations and, hence, geographical location and other demographic factors would be expected to be the sole explanatory factors. Under the second model, the current genetic landscape of the Americas should still reflect this historical correlation of language families with genes.

Our results do not provide conclusive support for either of these two competing hypotheses. However, they do seem to provide more evidence in favour of the single migration hypothesis. If the multiple migration model were true, then the Salishan gene pool would be characterized by marked traces of gene flow that would differentiate it from those of other Amerindian populations. As the centroid plot in figure 4 demonstrates, this is not the case. There is nothing to suggest that the level of change in the Salishan gene pool has been unusually great relative to other Amerindian populations, or that the Salishans have received substantially greater amounts of gene flow. The Athabaskans and Salishans, who, according to the three waves of migration model, belong to separate migrational events and, thus, ought to be clearly differentiated genetically, exhibit a closer genetic relationship to one another than the Salishans do to other Amerindian populations. This suggests either that substantial gene flow between these neighbouring populations has eradicated earlier genetic differences resulting from their belonging to separate migrational groups to the Americas, or that there were no major genetic differences aligned with linguistic groups to begin with. Hence, the first model seems the more plausible, given the observed genetic variation in the HLA class II loci.
This suggests that the current genetic landscape of the continent has primarily been shaped by the interaction of the evolutionary forces of drift, gene flow and selection acting on the background of genetic variation from a single ancestral population as it fissioned and spread across the continent. Such an interpretation is supported generally by the significant correlations between geographical and genetic distances found using Mantel tests and, more specifically, by the observation that what most clearly differentiates the Salishans from other Amerindian groups is their geographical position. These findings are further supported by the fact that Native American populations exhibit greater interpopulation differences than do Asian populations and that these differences are likely to have been largely the result of genetic drift.

Shields et al. (1993) observed a greater mtDNA sequence diversity within Native American groups than Asian populations. Moreover, their results indicated a close genetic relationship between contemporary Circumarctic populations, including Haida and Athabaskans from the Na-Dene linguistic group and the Chukchi-Kamchatkan and Eskimo-Aleut groups. These findings correspond closely to our study of HLA class II haplotypes, which indicates that the Athabaskan Na-Dene and Salishan Amerindian groups occupy an intermediate position between the Eskimo-Aleut group and other Amerindians. Recent linguistic evidence suggests that Na-Dene languages derive from the Yeniseian of Central Siberia (Ruhlen 1998). Yet our HLA data analyses do not find the Na-Dene Athabaskans genetically closer to the Yeniseian Kets than to other Siberian groups. Further genetic studies in other branches of the Na-Dene family will be required in order to provide conclusive evidence as to their historical relationship to Siberian and Amerindian populations. Analysis of autosomal polymorphisms such as Alu insertions and of Y-chromosome polymorphisms in Na-Dene and Amerindian groups in North America is likely to provide important additional insights about settlement in the Americas.

A genetic association with linguistic grouping has been found in some studies based on classical polymorphisms. A study with 42 populations showed considerable correspondence between genetic and linguistic clustering (Cavalli-Sforza et al. 1988). In a study of 11 genetic systems in 130 populations, geographical distance was significantly correlated with genetic distance and linguistic data (Chen et al. 1995). Principal components analysis using seven genetic systems in 21 South American aboriginal groups indicated separation of the cluster of native groups speaking Tropical Forest, Palaeo-American and Andean languages (Salzano & Callegari-Jacques 1988). Geographical proximity appears to be a predominant factor in the formation of clusters, but linguistic affinity also has an important role. A recent study combining serological HLA-A, HLA-B and HLA-C typing from 39 South American aboriginal groups indicated clearly significant longitude and latitude clines, probably indicating ancient migration routes (Rothhammer et al. 1997).

On the basis of our current results, it would seem unlikely that genetic association with linguistic grouping will be a prominent feature in Native American data sets. Our results are more in line with the expectations of the single migration model. This conclusion is supported by some previous studies of molecular genetic data that failed to find congruence between language and genes (Ward et al. 1993).
Our study suggests that gene flow between geographically proximate populations and genetic drift resulting from small population sizes and isolation between geographically distant groups have been the primary determinants of the Native American genetic landscape. The close genetic relationship between the Salishans and Athabaskans is perhaps the most important piece of evidence supporting this interpretation of the results presented. However, while suggestive, this evidence must be treated with some caution until more extensive sampling of North American aboriginal populations is undertaken. The addition of HLA class II haplotypes and other genetic markers for a wider range of North American populations is likely to provide the basis for a more definitive answer to the riddle of how and when the American continent was originally colonized.

We thank the First Nations People of British Columbia for their cooperation, Sheena Wilkie (Canadian Red Cross Society Blood Services) for assistance in collecting the samples and Naoyuki Takahata (Graduate University for Advanced Studies, Hayama, Japan) for comments on the manuscript. Glen Edin of the University of British Columbia provided technical help. Sofia Hashemi of the Canadian Red Cross, National Testing Laboratory, Ottawa, kindly provided control samples. This work was supported by a grant from the W. J. Van Dusen Foundation to D.V.D. and M.V.M.


REFERENCES
Bianchi, N. O., Catanesi, C. I., Bailliet, G., Martinez-Marignac, V. L., Bravi, C. M., Vidal-Rioja, L. B., Herrera, R. J. & Lopez-Camelo, J. S. 1998 Characterization of ancestral and derived Y-chromosome haplotypes of New World native populations. Am. J. Hum. Genet. 63, 1862–1871.
Bodmer, J. C. (and 13 others) 1992 Nomenclature for factors of the HLA system, 1991. Tissue Antigens 39, 161–173.
Bonatto, S. L. & Salzano, F. M. 1997 A single and early migration for the peopling of the Americas supported by mitochondrial DNA sequence data. Proc. Natl Acad. Sci. USA 94, 1866–1871.
Cavalli-Sforza, L. L., Piazza, A., Menozzi, P. & Mountain, J. 1988 Reconstruction of human evolution: bringing together genetic, archaeological, and linguistic data. Proc. Natl Acad. Sci. USA 85, 6002–6006.
Cavalli-Sforza, L. L., Menozzi, P. & Piazza, A. 1994 The history and geography of human genes. Princeton University Press.
Cerna, M., Falco, M., Friedman, H., Raimondi, E., Maccagno, A., Fernandez-Vina, M. & Stastny, P. 1993 Differences in HLA class II alleles of isolated South American Indian populations from Brazil and Argentina. Hum. Immunol. 37, 213–220.
Chen, J., Sokal, R. R. & Ruhlen, M. 1995 Worldwide analysis of genetic and linguistic relationship of human populations. Hum. Biol. 67, 595–612.
Easton, R. D., Merriwether, D. A., Crews, D. E. & Ferrell, R. E. 1996 mtDNA variation in the Yanomami: evidence for additional New World founding lineages. Am. J. Hum. Genet. 59, 213–225.
Erlich, H. A., Mack, S. J., Bergstrom, T. & Gyllensten, U. B. 1997 HLA class II alleles in Amerindian populations: implications for the evolution of HLA polymorphism and the colonization of the Americas. Hereditas 127, 19–24.
Felsenstein, J. 1993 PHYLIP (phylogeny inference package), v. 3.5c. Seattle: Department of Genetics, University of Washington.
Forster, P., Harding, R., Torroni, A. & Bandelt, H. 1996 Origin and evolution of Native American mtDNA variation. Am. J. Hum. Genet. 59, 935–945.
Gao, X. & Serjeantson, S. W. 1991 Heterogeneity in HLA DR2-related DR, DQ haplotypes in eight populations of Asia–Oceania. Immunogenetics 34, 401–408.
Gao, X., Sun, Y., An, J., Fernandez-Vina, M. A., Qou, J., Lin, L. & Stastny, P. 1991 DNA typing for HLA-DR, and -DP alleles in a Chinese population using the polymerase chain reaction (PCR) and oligonucleotide probes. Tissue Antigens 38, 24–30.
Grahovac, B. (and 10 others) 1998 Polymorphism of the HLA class II loci in Siberian populations. Hum. Genet. 102, 27–43.
Greenberg, J. H., Turner II, C. G. & Zegura, S. L. 1986 The settlement of the Americas: a comparison of the linguistic, dental, and genetic evidence. Curr. Anthropol. 27, 477–497.
Harpending, H. C. & Ward, R. H. 1982 Chemical systematics and human populations. In Biochemical aspects of evolutionary biology (ed. M. H. Nitecki), pp. 213–256. University of Chicago Press.
Harpending, H. C., Batzer, M. A., Gurven, M., Jorde, L. B., Rogers, A. L. & Sherry, S. T. 1998 Genetic traces of ancient demography. Proc. Natl Acad. Sci. USA 95, 1961–1967.
Imanishi, T., Wakisaka, A. & Gojobori, T. 1992a Genetic relationships among various human populations indicated by MHC polymorphisms. In HLA 1991a. Proceedings of the 11th Histocompatibility Workshop and Conference, vol. 1 (ed. K. Tsuji, M. Aizawa & T. Sasazuki), pp. 627–632. Oxford University Press.
Imanishi, T., Akaza, T., Kimura, A., Tokunaga, K. & Gojobori, T. 1992b Allele and haplotype frequencies for HLA complement loci in various ethnic groups. In HLA 1991b. Proceedings of the 11th Histocompatibility Workshop and Conference, vol. 1 (ed. K. Tsuji, M. Aizawa & T. Sasazuki), pp. 1065–1220. Oxford University Press.

Jones, E. (ed.) 1978 The world atlas. London: Hennerwood Publications Ltd.
Karafet, T. M. (and 13 others) 1999 Ancestral Asian source(s) of New World Y-chromosome founder haplotypes. Am. J. Hum. Genet. 64, 817–831.
Kolman, C. J., Sambuughin, N. & Bermingham, E. 1996 Mitochondrial DNA analysis of Mongolian populations and implications for the origin of the New World founders. Genetics 142, 1321–1334.
Layrisse, Z., Guedez, Y., Dominguez, E., Herrera, F., Soto, M., Balbas, O., Matos, M., Alfonzo, J. C., Granados, J. & Scorza, J. 1995 Extended HLA haplotypes among the Bari Amerindians of the Perija range. Relationship to other tribes based on four-loci haplotype frequencies. Hum. Immunol. 44, 228–235.
Liu, C. P., Bach, F. H. & Wu, S. 1988 Molecular studies of rare DR2/LD-5a/DQw3 HLA class II haplotype. J. Immunol. 140, 3661–3639.
Mack, S. J. & Erlich, H. A. 1997 HLA class II polymorphism in the Ticuna of Brazil: evolutionary implications of the DRB1*0807 allele. Tissue Antigens 51, 41–50.
Merriwether, D. A., Rothhammer, F. & Ferrell, R. E. 1995 Distribution of the four founding lineage haplotypes in Native Americans suggests a single wave of migration for the New World. Am. J. Phys. Anthropol. 98, 411–430.
Monsalve, M. V., Edin, G. & Devine, D. V. 1998 Analysis of HLA class I and class II in Na-Dene and Amerindian populations from British Columbia, Canada. Hum. Immunol. 59, 48–55.
Rothhammer, F., Silva, C., Callegari-Jacques, S. M., Loop, E. & Salzano, F. M. 1997 Gradients of HLA diversity in South American Indians. Ann. Hum. Biol. 24, 197–208.
Ruhlen, M. 1987 A guide to the world's languages, vol. 1. Stanford University Press.
Ruhlen, M. 1998 The origin of the Na-Dene. Proc. Natl Acad. Sci. USA 95, 13 994–13 996.
Salzano, F. M. & Callegari-Jacques, S. M. 1988 South American Indians: a case study in evolution. Oxford, UK: Clarendon Press.

Proc. R. Soc. Lond. B (1999)

Shields, G. F., Schmiechen, A. M., Frazier, B. L., Redd, A., Voevoda, M. I., Reed, J. K. & Ward, R. H. 1993 mtDNA sequences suggest a recent evolutionary divergence for Beringian and northern North American populations. Am. J. Hum. Genet. 53, 549–562.
Stone, A. C. & Stoneking, M. 1998 mtDNA analysis of a prehistoric Oneota population: implications for the peopling of the New World. Am. J. Hum. Genet. 62, 1153–1170.
Szathmary, E. J. 1984 Peopling of northern North America: clues from genetic studies. Acta Anthropogenet. 8, 79–109.
Torroni, A. (and 10 others) 1992 Native American mitochondrial DNA analysis indicates that the Amerind and the Nadene populations were founded by two independent migrations. Genetics 130, 153–162.
Trachtenberg, E. A., Erlich, H. A., Rickards, O., Destefano, G. F. & Klitz, W. 1995 HLA class II linkage disequilibrium and haplotype evolution in the Cayapa Indians of Ecuador. Am. J. Hum. Genet. 57, 415–424.
Trachtenberg, E. A., Keyeux, G., Bernal, J. E., Rhodas, M. C. & Erlich, H. A. 1996 Results of Expedicion Humana. I. Analysis of HLA class II (DRB1–DQA1–DQB1–DPB1) alleles and DR–DQ haplotypes in nine Amerindian populations from Colombia. Tissue Antigens 48, 174–181.
Troup, G. M. 1994 The HLA system in Native Americans. ASHI Q., autumn, 10–12.
Ward, R. H., Redd, A., Valencia, D., Frazier, B. & Paabo, S. 1993 Genetic and linguistic differentiation in the Americas. Proc. Natl Acad. Sci. USA 90, 10 663–10 667.
Yunis, J. J. (and 10 others) 1994 Major histocompatibility complex class II alleles and haplotypes and blood groups of four Amerindian tribes of northern Colombia. Hum. Immunol. 41, 248–258.

As this paper exceeds the maximum length normally permitted, the authors have agreed to contribute to production costs.