AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY 128:670–677 (2005)
Y-Chromosome and Mitochondrial DNA Studies on the Population Structure of the Christmas Island Community
Cheryl A. Wise,1* Sheena G. Sullivan,1 Michael L. Black,1 Wendy N. Erber,2 and Alan H. Bittles1
1 Centre for Human Genetics, Edith Cowan University, Perth, Western Australia 6027, Australia
2 Haematology Department, Western Australian Centre for Pathology and Medical Research, Perth, Western Australia 6907, Australia
KEY WORDS
genetic variation; contemporary population; southern China; Southeast Asia
ABSTRACT Christmas Island is a remote Australian territory located close to the main Indonesian island of Java. Y-chromosome and mitochondrial DNA (mtDNA) markers were used to investigate the genetic structure of the population, which comprises communities of mixed ethnic origin. Analysis of 12 Y-chromosome biallelic polymorphisms revealed a high level of gene diversity and haplotype frequencies that were consistent with source populations in southern China and Southeast Asia. mtDNA hypervariable segment I (HVS-I) sequences displayed high levels of haplotype diversity and nucleotide diversity that were comparable to various Asian populations. Genetic distances revealed extremely low mtDNA differentiation among Christmas Islanders and Asian populations. This was supported by the relatively high
proportion of sequence types shared among these populations. The most common mtDNA haplogroups were M* and B, followed by D and F, which are prevalent in East/Southeast Asia. Christmas Islanders of European descent were characterized by the Eurasian haplogroup R*, and a limited degree of admixture was observed. In general, analysis of the genetic data indicated population affinities to southern Chinese (in particular from the Yunnan Province) and Southeast Asia (Thailand, Malaysia, and Cambodia), which was consistent with historical records of settlement. The combined use of these different marker systems provides a useful and appropriate model for the study of contemporary populations derived from different ethnic origins. Am J Phys Anthropol 128:670–677, 2005. © 2005 Wiley-Liss, Inc.
Christmas Island is an Australian territory with a multicultural population of 2,195 (1998 estimate) inhabitants, located at 10°S 105°E, approximately 360 km south of Jakarta, the capital of Indonesia, and 1,540 km from the coast of Western Australia. The island was annexed by Great Britain in 1888 after the discovery of rich deposits of phosphate, and a settlement was established to collect timber and supplies for the growing industry on neighboring Cocos and Keeling Islands. Phosphate extraction eventually became the impetus behind the island’s development, and as there was no indigenous population, indentured Chinese laborers were recruited to mine the phosphate. Christmas Island became a dependency of Singapore in 1946, and additional laborers were hired from Malaysia. In 1957, Australia purchased sovereignty of the island from Singapore for around £2.9 million, and the present population comprises approximately 61% Chinese (speaking Mandarin, Cantonese, and other Chinese dialects), 25% Malay, and 14% European or other, with Buddhism, Islam, and Christianity as the main religions (Commonwealth of Australia, 2002). With depletion of the phosphate deposits, substantial numbers of Christmas Islanders moved to the Australian mainland in search of employment. During the last decade, a number of commercial ventures and government establishments were introduced with varying levels of success, including a casino and a refugee holding center, with plans for a space port currently on hold. Despite these changes, it is believed that marriage has continued to occur largely within individual ethnic communities. To
address these issues, Y-chromosome biallelic polymorphisms, mtDNA hypervariable segment I (HVS-I) sequences, and the COII/tRNALys intergenic 9-bp deletion were used to investigate the genetic structure of the constituent island communities. The data are compared to previously published data on Y-chromosome and mtDNA variation in relevant reference populations.
SUBJECTS AND METHODS
Subjects
As part of an ongoing health-based study organized through Sir Charles Gairdner Hospital, Perth during 1998–1999, blood samples were obtained from volunteer Christmas Island residents and individuals born on the island who were living in Port Hedland, Western Australia. Individuals were classified into broad ethnic groups: Chinese (n = 75), Malay (n = 11), European (n = 6), and unknown (n = 6), based on survey information. The samples were forwarded to the Western Australian Centre for Pathology and Medical Research, Haematology Department, and DNA was obtained by standard phenol/chloroform extraction. The present study was performed with the approval of the Human Research Ethics Committees of Sir Charles Gairdner Hospital and Edith Cowan University.
GENETIC STRUCTURE OF CHRISTMAS ISLAND
Grant sponsor: Edith Cowan University, PerthCentre; Grant sponsor: Health Department, Western Australia.
*Correspondence to: Dr. Cheryl Wise, Neurogenetics Unit, Level 2 North Block, Royal Perth Hospital, Wellington St., Perth WA 6000, Australia. E-mail: [email protected]
Received 19 December 2003; accepted 15 September 2004.
DOI: 10.1002/ajpa.20193
Published online 29 April 2005 in Wiley InterScience (www.interscience.wiley.com).
Y-chromosome analyses
Twelve biallelic markers on the Y-chromosome (M9, M175, M122, M134, M159, M119, M50, M95, M88, M45, M173, and M17; Underhill et al., 2000, 2001) were analyzed in 60 male Christmas Island samples. Genotypes were determined using restriction site polymorphisms (RSPs) or single-nucleotide primer extension with MALDI-TOF mass spectrometry, as described elsewhere (Wang et al., 2003; Wise et al., 2003). Haplotypes were defined according to the evolutionary relationship of the markers and the standard Y Chromosome Consortium (YCC) mutation-based nomenclature (Y Chromosome Consortium, 2002). For example, individuals typed as M9G, M122C, and M134delG were classified as haplotype M134. For comparisons with reference populations, 10 major haplotypes were defined as non-M9, M9* (includes M175), M122* (includes M159), M134, M119, M50, M95, M88, M45* (includes M173), and M17. Haplotype frequencies were determined by direct counting, with gene diversity calculated according to Equation 8.5 of Nei (1987).
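The hierarchical haplotype assignment described above can be illustrated with a minimal sketch. It assumes a marker tree in which M175 and M45 descend from M9; M122, M119, and M95 descend from M175; M134 and M159 descend from M122; M50 descends from M119; M88 descends from M95; and M17 descends from M173, which descends from M45. This branching order is inferred from the groupings named in the text, and the names PARENT, REPORTING_CLASS, lineage, and classify are hypothetical and not part of the original typing protocol.

```python
# Hypothetical marker tree: each marker maps to its parent marker
# (None marks the deepest marker typed). The branching order is an
# assumption inferred from the haplotype groupings named in the text.
PARENT = {
    "M9": None,
    "M175": "M9", "M45": "M9",
    "M122": "M175", "M119": "M175", "M95": "M175",
    "M134": "M122", "M159": "M122",
    "M50": "M119",
    "M88": "M95",
    "M173": "M45",
    "M17": "M173",
}

# Markers that are folded into a broader reporting class,
# e.g. "M9* (includes M175)" in the list of 10 major haplotypes.
REPORTING_CLASS = {
    "M9": "M9*", "M175": "M9*",
    "M122": "M122*", "M159": "M122*",
    "M45": "M45*", "M173": "M45*",
}


def lineage(marker):
    """Return the marker followed by all of its ancestors."""
    path = []
    while marker is not None:
        path.append(marker)
        marker = PARENT[marker]
    return path


def classify(derived_markers):
    """Assign one of the 10 reporting classes from the derived markers."""
    derived = set(derived_markers)
    if not derived:
        return "non-M9"
    # The most deeply nested derived marker defines the haplotype.
    terminal = max(derived, key=lambda m: len(lineage(m)))
    # Every derived marker must lie on the path from the terminal marker to M9.
    if not derived <= set(lineage(terminal)):
        raise ValueError("derived markers fall on incompatible branches")
    return REPORTING_CLASS.get(terminal, terminal)


print(classify(["M9", "M122", "M134"]))  # M134
```

Under these assumptions, a sample with no derived markers is reported as non-M9, and a sample derived only at M9 and M175 is reported as M9*.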
Mitochondrial DNA analyses
The mtDNA hypervariable segment I (HVS-I) was amplified in 98 Christmas Island samples using primers L15996 and H16401 (Vigilant et al., 1991), with M13 (−21) and M13-reverse sequence tags attached to the 5′ end of each primer, respectively. Polymerase chain reaction (PCR) was performed in 50 µl volumes containing 50 ng of genomic DNA, 0.5 unit HotStarTaq® DNA polymerase and associated buffer (Qiagen), 1.5 mM MgCl2, 200 µM of each dNTP, and 0.2 µM of each primer. Thermal cycling conditions were 95°C for 15 min, followed by 35 cycles of 95°C for 45 sec, 66°C for 45 sec, and 72°C for 1 min, completed by a 5-min extension at 72°C. PCR products were purified using the MinElute™ 96 UF PCR purification kit, as per the manufacturer’s protocol (Qiagen). Sequences were generated using the M13 primers and ABI PRISM® BigDye™ Terminator mix, and run on an ABI 377 DNA sequencer (Applied Biosystems). Sequences were aligned and edited from nucleotide position (np) 16024 to 16400 (relative to the Cambridge reference sequence (CRS); Anderson et al., 1981), using ABI Sequence Navigator software. Christmas Island samples were screened for the mtDNA 9-bp deletion in the COII/tRNALys intergenic region which defines haplogroup B, using conditions described elsewhere (Qian et al., 2001). Additional mtDNA haplogroups were broadly inferred from HVS-I motifs, using previous coding polymorphism-based definitions as a guide (Yao et al., 2002a). The degree of resolution of mtDNA haplogroups depends on the region of the world in which the study populations reside. European (Macaulay et al., 1999; Richards et al., 2000), West Asian (Richards et al., 2000; Quintana-Murci et al., 2004), African (Salas et al., 2002), and North Asian/American
mtDNA haplogroups (Melton et al., 2001) have been defined in great detail and resolution. It was only recently that comprehensive investigations of haplogroup definitions and diversity commenced in East Asia (Kivisild et al., 2002; Yao and Zhang, 2002; Yao et al., 2002a). It is the haplogroup phylogenies developed from these East Asian investigations that are referred to in the present study. Published HVS-I sequence data from additional human populations were included for comparative analyses. Where published data were used, analyses were restricted to a 344-bp segment (np 16048–16391) to minimize the impact of missing sequence data. Due to the limited demographic information available for the present samples, analyses were largely performed by separating the Christmas Island samples into three broad ethnic groups: Chinese, Malay, and European. HVS-I haplotype diversity (h) was determined for each population using Equation 8.5 of Nei (1987). The average number of nucleotide substitutions per site between pairs of sequences (nucleotide diversity, π) was calculated in MEGA version 2.1 (Kumar et al., 2001), using the substitution model of Tamura and Nei (1993) and a gamma parameter value of 0.26 (Meyer et al., 1999). Pairwise population distances (dA), pairwise FST values, and analysis of molecular variance (AMOVA; Excoffier et al., 1992) were computed in ARLEQUIN version 2.000 (Schneider et al., 2000), using the distance method of Tamura and Nei (1993). Nonmetric multidimensional scaling (MDS; Kruskal, 1964) was performed on both the FST and dA distance matrices, using the software package STATISTICA 6 (Statsoft, Inc., 2001). This nonparametric ordination technique represents the dissimilarity among populations as an n-dimensional graph, where the interpoint distances in the graph space correspond to the observed genetic differences between populations.
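The two within-population diversity measures named above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure run in MEGA or ARLEQUIN: haplotype diversity uses the widely used unbiased estimator h = n(1 − Σx_i²)/(n − 1), which is assumed here to correspond to the cited equation of Nei (1987), and nucleotide diversity uses the uncorrected proportion of differing sites in place of the Tamura and Nei (1993) model with gamma correction. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Unbiased diversity estimate: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences.

    Uses the uncorrected p-distance, a simplification of the
    Tamura-Nei distance with gamma correction named in the text.
    """
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) / length for s, t in pairs)
    return total / len(pairs)


# Toy data for illustration only (not from the study).
toy_haplotypes = ["A", "A", "B", "C"]
toy_sequences = ["ACGT", "ACGA", "TCGA"]
print(round(haplotype_diversity(toy_haplotypes), 4))
print(round(nucleotide_diversity(toy_sequences), 4))
```

Because the uncorrected distance ignores multiple substitutions at the same site, values from this sketch would be expected to run slightly below model-corrected estimates for the same data.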
RESULTS
Y-chromosome variation
The Christmas Island population exhibited high Y-chromosome gene diversity (0.872) for the markers studied, similar to values observed in Southeast Asian populations such as Cambodian (0.886), Malaysian (0.871), Batak (0.817), and Javanese (0.890) (Su et al., 1999). By comparison, gene diversity was reported to be slightly lower in the southern Han Chinese, from whom many Christmas Island males are believed to be descended, with an overall value of 0.795 (Su et al., 1999, 2000). The evolutionary relationship and frequencies of the major Y-chromosome haplotypes in selected populations from southern China and Southeast Asia are shown in Figure 1. The Christmas Island population contained moderate frequencies of haplotype M122 (36.7%, with M134 accounting for 13.3%), followed by M119 (23.3%, with M50 accounting for 5.0%) and M95 (15.0%, with M88 accounting for 8.3%). M122 and M134 are seen at moderate to high frequencies throughout Northeast Asia (NEAS) and Southeast Asia (SEAS), and are present in the majority of Han Chinese populations. M119 is also found throughout East Asia, with higher frequencies in SEAS and southern China, while M50 is rare in China and has varying frequencies in SEAS. M95 occurs at moderate to high frequencies in SEAS and Chinese ethnic populations, but is virtually absent in NEAS. M88 (phylogenetically equivalent to M111) occurs in a smaller
C.A. WISE ET AL.
Fig. 1. Evolutionary tree of major Y-chromosome haplotypes and their frequency distributions in selected populations from southern China and Southeast Asia (Su et al., 1999). Haplotype designation conforms to mutation-based nomenclature recommended by Y Chromosome Consortium (2002). M9* haplotype includes M175, and M45* haplotype includes M173.
number of SEAS and Chinese minority populations (Su et al., 1999, 2000; Karafet et al., 2001). Only three male Christmas Island samples of European descent were available for analysis; these carried non-M9, M173, and M17 haplotypes, which are consistent with haplotypes present in European populations (Semino et al., 2000).
Mitochondrial DNA variation
The sequence variation in the HVS-I region (np 16024–16400 in the CRS; Anderson et al., 1981), and inferred mtDNA haplogroups in the 98 Christmas Island samples, are shown in Table 1. There were 70 HVS-I sequence types defined by 94 polymorphic positions, and 18 sequence types also contained the 9-bp deletion. The haplogroups for this study were broadly defined primarily with HVS-I motifs, with the 9-bp deletion defining haplogroup B. The 9-bp deletion (haplogroup B, including B4 and B5) occurred at a relatively high frequency in the Christmas Island population (29.6%). It is present at varying frequencies in Chinese ethnic populations (0–32%) (Yao et al., 2000), and was found in a number of SEAS populations, including Thai (15.6%), Vietnamese (17.9%), Malaysian Chinese (7.1%), Malay (14.3%), and Indonesian (21%) (Ballinger et al., 1992; Redd et al., 1995; Yao et al., 2000). The macrohaplogroup M* (including subhaplogroups M7b, M8a, and C) accounted for 21.4% of the Christmas Island mtDNAs. Haplogroup D (including D4a) was present at a frequency of 14.3%. Haplogroup F (F1) accounted for 14.3%, and haplogroup R* (subhaplogroup of N shared between Europe and East Asia) accounted for 16.3% of the mtDNAs. The remaining 4.1% belonged to haplogroups N9a and A. Genetic variation within the Christmas Island and 12 reference populations was assessed by calculating haplotype diversity (h) values and the average number of pairwise differences (or nucleotide diversity, π) (Table 2). HVS-I haplotype diversity for the total Christmas Island population (0.992) was comparable to Chinese and SEAS
populations. The Christmas Island population and a range of Asian populations showed similar levels of nucleotide diversity (0.021–0.026), at least 1.6 times higher than the British population (0.013). This higher level of mtDNA diversity in Asians compared to Europeans was previously observed (Oota et al., 2002).
Population comparisons
Genetic differences (FST values) among the three Christmas Island ethnic groups (Chinese, Malay, and European) and 12 reference populations are shown in Table 3. The FST values were significant between the Chinese Christmas Islanders and Sali (FST = 0.040; P = 0.001), Bai (FST = 0.019; P = 0.042), Taiwan Han (FST = 0.050; P = 0.000), Philippine (FST = 0.032; P = 0.002), Indonesian (FST = 0.014; P = 0.039), and British (FST = 0.076; P = 0.000). The FST values between the Chinese and Malay Christmas Islanders (0.023) and European Christmas Islanders (0.036) were also relatively high, but not statistically significant, and should be interpreted with caution due to the small sample sizes. Based on FST values, Chinese Christmas Islanders appeared most genetically similar to the Yunnan Han (0.002), Dai Chinese (0.005), Thai (0.006), and Malaysian (0.007), followed by the Guangdong Han (0.009) and Hubei Han (0.010). The Malay Christmas Islanders showed the lowest FST values with the Dai Chinese (0.018) and Philippine (0.020), followed by the Chinese Han and Malaysian populations. The European Christmas Islanders were closest to the British (0.029), but further interpretations were difficult due to the very small sample size. These genetic affinities were closely reflected in the two-dimensional multidimensional scaling (MDS) plot based on the pairwise FST values (Fig. 2) and dA distances (not shown). The Chinese Christmas Islanders are located in a central group consisting of the southern Chinese populations, particularly the Yunnan Han and Dai (Yunnan Province, People's Republic of China), and the Southeast Asian populations of Malaysia and Thailand. The Malay and
TABLE I. mtDNA sequence variation in 98 Christmas Island samples

Type  Sample number1    HVS-I polymorphic sites2                 Haplogroup 9-bp deletion3
1     1 MCI*            92-171C-223-278                          M*         2
2     2 CCI             223-274-278-295-311                      M*         2
3     2 CCI             183C-189-209-223                         M*         2
4     2 CCI             66-223-311                               M*         2
5     1 CCI             166-223-311                              M*         2
6     1 CCI             184-193-223                              M*         2
7     2 CCI             223-327                                  M*         2
8     1 CCI             192-209-223-233-274-304-311-365          M*         2
9     1 CCI             93-129-223-311-357                       M*         2
10    1 unknown         129-223-297                              M7b*       2
11    1 CCI             129-192-223-297                          M7b1       2
12    1 CCI             92-129-192-223-254-297                   M7b1       2
13    2 CCI             184-223-298-319                          M8a        2
14    2 CCI             129-148-223-298-327                      C          2
15    1 CCI             92-223-298-327-390                       C          2
16    1 CCI             223-234-248-265C-316-362                 D or M9    2
17    2 unknown         209-223-362                              D          2
18    1 CCI             173-223-295-362                          D          2
19    1 CCI             223-362                                  D          2
20    1 CCI             104-223-284-295-297-362                  D          2
21    1 MCI             124-187-223-274-289-319-320-362          D or G     2
22    1 ECI             111-129-213-223-235-300-362              D4a        2
23    1 CCI, 1 unknown  129-223-270-362                          D4a        2
24    2 CCI             86-129-223-362                           D4a        2
25    1 CCI             129-223-249-278-311-362                  D4a        2
26    1 CCI             129-223-362                              D4a        2
27    1 CCI             176-223-257A-261                         N9a        2
28    1 CCI             172-223-257A-261                         N9a        2
29    2 CCI             126-223-235-290-319                      A          2
30    1 CCI             182C-183C-189-266A                       B          1
31    2 CCI             182C-183C-188-189-193+CC-261-266A        B          1
32    1 CCI             129-182C-183C-189-228-261-316            B          1
33    2 CCI             140-182C-183C-189-217-274-335            B (B4?)    1
34    1 CCI             51-140-182C-183C-189-217-274-335         B (B4?)    1
35    1 MCI             147-183C-184A-189-217-235-394            B4         1
36    2 CCI             129-150-167-183C-189-217-234             B4         1
37    3 CCI, 1 MCI      182C-183C-189-217-261                    B4a        1
38    1 MCI             182C-183C-189-217-224-261                B4a        1
39    1 CCI             93-182C-183C-189-217-261                 B4a        1
40    1 CCI             182C-183C-189-217-261-299                B4a        1
41    1 CCI             126-182C-183C-189-217-261                B4a        1
42    1 unknown         93-136-179-182C-183C-189-217             B4b        1
43    3 CCI, 2 MCI      140-183C-189-266A                        B5a        1
44    1 CCI             140-172-187-189-256-266G                 B5a        1
45    1 MCI             92-140-183C-189-266A-294                 B5a        1
46    1 CCI             92-140-183C-189-193+C-266A-294           B5a        1
47    2 CCI             140-153-183C-189-243-311-319             B5b        1
48    1 CCI             93-157-304                               F          2
49    2 CCI             129-304-362                              F          2
50    2 CCI             129-172-293-304                          F1a        2
51    2 CCI             51-129-172-304                           F1a        2
52    1 CCI             129-172-242-304                          F1a        2
53    1 CCI             34C-129-162-172-188-304                  F1a        2
54    2 CCI             129-162-172-188-304                      F1a        2
55    1 CCI             129-172-179-274-304                      F1a        2
56    1 CCI             129-162-172-304                          F1a        2
57    1 CCI             129-145-182C-183C-189-232A-249-304-311   F1b        2
58    1 CCI             260-298-355-362                          R9a        2
59    1 CCI             304-309-390                              R10        2
60    1 unknown         86-129-297-324-362                       R*         2
61    1 MCI             298                                      R*         2
62    1 CCI             295-319                                  R*         2
63    1 CCI, 2 MCI      CRS                                      R*         2
64    2 CCI             51-209-239-244-319-352-353               R*         2
65    1 CCI             93                                       R*         2
66    1 ECI             293-311                                  R*         2
67    1 ECI             168                                      R*         2
68    1 ECI             93-298                                   R*         2
69    1 ECI             126-292-294-296                          R*         2
70    1 ECI             126-223-294-296-304                      R*(?)      2

1 Chinese, Malay, and European Christmas Islanders are abbreviated as CCI, MCI, and ECI, respectively. Numbers prefixing sample identification codes indicate sample frequency for each haplotype.
2 Sites are numbered (+16000) according to CRS of Anderson et al. (1981). Suffixes A, G, C, and T indicate transversions. Plus sign (+) indicates insertion recorded at last possible site.
3 "1" denotes presence of intergenic COII/tRNALys 9-bp (CCCCCTCTA) deletion. "2" denotes nondeletion (i.e., two repeats of 9-bp fragment).
TABLE II. mtDNA (HVS-I) diversity values and COII/tRNALys 9-bp deletion frequencies

Population (N)1  Haplotype diversity2  Nucleotide diversity3  9-bp del (%)4  Reference
CIS (98)         0.992 ± 0.003         0.023 ± 0.012          29.6           Present study
GD Han (30)      0.995 ± 0.010         0.024 ± 0.013          23.5           Yao et al., 2000, 2002a
YN Han (43)      0.992 ± 0.008         0.026 ± 0.014          16.3           Yao et al., 2000, 2002a
HB Han (42)      1.000 ± 0.005         0.024 ± 0.013          17.5           Yao et al., 2000, 2002a
Dai (38)         0.996 ± 0.007         0.023 ± 0.012          23.5           Yao et al., 2000, 2002b
Bai (31)         0.996 ± 0.009         0.021 ± 0.011          n.d.           Yao et al., 2002b
Sali (31)        0.989 ± 0.011         0.021 ± 0.011          15.6           Yao et al., 2000, 2002b
TW Han (52)      0.993 ± 0.006         0.023 ± 0.012          40.0           Ballinger et al., 1992; Horai et al., 1996
Thai (32)        0.994 ± 0.009         0.023 ± 0.012          15.6           Yao et al., 2000, 2002b
MAL (52)         0.985 ± 0.008         0.024 ± 0.013          14.3           Ballinger et al., 1992; Tajima et al., 2004
PHIL (59)        0.989 ± 0.006         0.025 ± 0.013          n.d.           Tajima et al., 2004
IND (54)         0.988 ± 0.008         0.024 ± 0.013          21.0           Redd et al., 1995; Tajima et al., 2004
British (98)     0.967 ± 0.011         0.013 ± 0.007          n.d.           Piercy et al., 1993

1 CIS, Christmas Island; GD Han, Guangdong Han; YN Han, Yunnan Han; HB Han, Hubei Han; TW Han, Taiwan Han; MAL, Malaysian; PHIL, Philippine; IND, Indonesian.
2 Calculated using Equation 8.5 of Nei (1987).
3 Estimated using substitution model of Tamura and Nei (1993), with α = 0.26. Standard deviation calculated using Equation 10.9 of Nei (1987).
4 n.d., not determined.
European Christmas Islanders occupy separate positions, with the European Christmas Islanders located between the British and Asian populations. The genetic structure of the Chinese and Malay Christmas Islanders and the 11 Asian reference populations was further investigated by AMOVA (Excoffier et al., 1992). For mtDNA HVS-I data, 97.4% of the genetic variation was found within populations, and 2.6% of the variation was attributed to differences among populations (FST = 0.026). By comparison, the average FST between 12 continental European populations (0.066) (Oota et al., 2002) was more than twice the variation observed between the Christmas Islanders and Asian populations.
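The AMOVA figures quoted above can be checked with a short calculation. The sketch assumes that the fixation index is read as the among-population share of the total variance, so that the reported percentages translate directly into FST; the helper name fixation_index is hypothetical.

```python
def fixation_index(var_among, var_within):
    """Return the among-population share of the total variance."""
    total = var_among + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_among / total


# Percentages reported for the Christmas Island and Asian reference populations.
fst_asia = fixation_index(2.6, 97.4)

# Average FST reported for 12 continental European populations (Oota et al., 2002).
fst_europe = 0.066

print(round(fst_asia, 3), round(fst_europe / fst_asia, 2))
```

On these numbers the European average is roughly 2.5 times the value obtained here, in line with the statement that it is more than twice as large.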
Shared mtDNA haplotypes
Table 4 presents the number of HVS-I sequence types shared between the three Christmas Island populations and several reference groups. The Chinese Christmas Islanders share 3 types with the Malay Christmas Islanders, 6 with the combined Chinese Han (Yunnan, Guangdong, and Hubei), 5 with the Chinese ethnic minorities (Dai, Sali, and Bai), 7 with Taiwan Chinese Han, 7 with Southeast Asian populations (Malaysian, Thai, Indonesian, and Philippine), and 1 with the British population. Overall, 16 sequence types (out of 54; 29.6%) were shared between Chinese Christmas Islanders and Asian populations. Of the three types shared with the Malay Christmas Islanders, one was also shared between the Chinese Han, Chinese ethnic minorities, and Southeast Asian populations. Another one was shared only with the Southeast Asian populations, and one was shared with the British population. The close proximity of the Chinese Christmas Islanders and South/Southeast Asian populations, which is apparent in the MDS plot, was supported by the large number of sequence types shared between these populations, and demonstrates the high level of mtDNA homogeneity in Asian populations (Oota et al., 2002). Only a very small number of sequence types were shared with the British population, with the European Christmas Islanders having none in common (although the sample size was small; n = 6).
DISCUSSION
The present study uses insights from Y-chromosome and mtDNA variation to investigate the genetic structure of the Christmas Island population. Of particular interest is the hypothesis, based on historical data, that the constituent communities stem from southern China and Southeast Asia, with very little European admixture. The frequency distributions of Y-chromosome haplotypes (Fig. 1) suggest male migrations to the island from diverse populations of southern China and Southeast Asia. In particular, M122 was present at moderate frequencies in the Christmas Island, southern Han Chinese, and Malaysian populations. M134 occurred at similar frequencies in the Christmas Island, Malaysian, and Cambodian populations, with a higher frequency in the southern Han. The Y-haplogroup M122* is believed to be the genetic trace of the initial colonization of mainland Southeast Asia, followed by expansions to other parts of
TABLE III. Pairwise FST values (below diagonal) and P values (above diagonal) between Christmas Island and reference populations1

          CCI     MCI     HB Han  YN Han  GD Han  Dai     Sali    Thai    TW Han  MAL     PHIL    IND     Bai     ECI     British
CCI       -       0.1172  0.0889  0.5400  0.1367  0.2139  0.0010  0.1885  0.0000  0.1172  0.0020  0.0391  0.0420  0.1094  0.0000
MCI       0.0226  -       0.0361  0.0459  0.0234  0.1592  0.0186  0.0068  0.0000  0.0225  0.1582  0.0049  0.0107  0.0107  0.0000
HB Han    0.0100  0.0421  -       0.1846  0.2022  0.1777  0.0313  0.0703  0.0000  0.1416  0.0010  0.0098  0.1416  0.0537  0.0000
YN Han    0.0019  0.0426  0.0067  -       0.4102  0.5625  0.0098  0.3682  0.0000  0.2529  0.0137  0.0703  0.1025  0.1133  0.0000
GD Han    0.0093  0.0544  0.0059  0.0002  -       0.2061  0.0645  0.0488  0.0068  0.4805  0.0449  0.2920  0.2949  0.0820  0.0000
Dai       0.0049  0.0181  0.0069  0.0027  0.0072  -       0.0147  0.1104  0.0000  0.0440  0.0117  0.0107  0.0322  0.0244  0.0000
Sali      0.0402  0.0827  0.0204  0.0307  0.0207  0.0372  -       0.0029  0.0000  0.0664  0.0039  0.0029  0.2842  0.0381  0.0000
Thai      0.0063  0.0747  0.0145  0.0009  0.0214  0.0102  0.0537  -       0.0029  0.1699  0.0010  0.1299  0.0479  0.0586  0.0000
TW Han    0.0502  0.1646  0.0468  0.0369  0.0380  0.0772  0.0536  0.0299  -       0.0020  0.0000  0.0156  0.0156  0.0606  0.0000
MAL       0.0073  0.0549  0.0071  0.0031  0.0015  0.0165  0.0155  0.0074  0.0289  -       0.0098  0.4922  0.2207  0.0596  0.0000
PHIL      0.0317  0.0202  0.0423  0.0295  0.0219  0.0306  0.0439  0.0668  0.1115  0.0274  -       0.0010  0.0000  0.0303  0.0000
IND       0.0138  0.0741  0.0264  0.0136  0.0040  0.0291  0.0357  0.0101  0.0213  0.0009  0.0431  -       0.1367  0.1475  0.0000
Bai       0.0185  0.0670  0.0087  0.0139  0.0042  0.0249  0.0039  0.0201  0.0248  0.0059  0.0470  0.0106  -       0.0977  0.0000
ECI       0.0365  0.1201  0.0565  0.0396  0.0498  0.0735  0.0690  0.0531  0.0534  0.0493  0.0936  0.0379  0.0391  -       0.1924
British   0.0757  0.1485  0.1270  0.1061  0.1291  0.1331  0.1624  0.1142  0.1364  0.1076  0.1402  0.0966  0.1143  0.0289  -

1 Chinese, Malay, and European Christmas Islanders are abbreviated as CCI, MCI, and ECI, respectively. HB Han, Hubei Han; YN Han, Yunnan Han; GD Han, Guangdong Han; TW Han, Taiwan Han; MAL, Malaysian; PHIL, Philippine; IND, Indonesian. Bold FST values are nonsignificant at 5% level.
eastern Asia (Su et al., 1999). M119 was also present at moderate frequencies in the Christmas Island and southern Han populations, with lower frequencies in SEAS. M50 was absent in the southern Han, and occurred at low frequencies in the Christmas Island, Cambodian, and northeastern Thai populations, with a higher frequency in Malaysians. M95 occurred at low frequencies in the Christmas Island, southern Han, and Malaysian populations, with moderate to high frequencies in Cambodian and Thai populations. M88 was present in Christmas Island, Cambodian, and Thai populations, but was virtually absent in southern Han and Malaysian. Very little contribution from European Y-chromosome haplotypes was observed (potentially non-M9 and M45 haplotypes), although these are also present in Asian populations at low frequencies.
The mtDNA profile of the Christmas Island population is similar to the Han Chinese (in particular from Yunnan Province) and Southeast Asian populations from Malaysia and Thailand (Fig. 2). Christmas Islanders also showed a high degree of genetic similarity to the Dai Chinese ethnic minority, who trace their origins to the ancient southern Pai-Yuei tribe. The Pai-Yuei tribe was widely distributed along the southeast coast of China up to Yunnan Province and the northern part of Southeast Asia, including north Thailand (Yao et al., 2002b). By comparison, the Bai and Sali populations trace their origins to the ancient Di-Qiang tribe in northwest China (Yao et al., 2002b), and displayed less genetic resemblance to Christmas Islanders. The low mtDNA differentiation between Christmas Islanders and Asian populations was confirmed by AMOVA, with a mean interpopulation variance of 2.6%, similar to the average interpopulation variance estimated in Asia (3.2%; Cambodian, Chinese, Japanese, Malay, and Vietnamese) (Jorde et al., 2000). To put this variance into a global perspective, the average interpopulation variance in mtDNA is 4.5% (Europe) and 8.8% (Africa) (Jorde et al., 2000), further supporting the homogeneity of Asian populations. This conclusion is substantiated by the large number of HVS-I sequence types shared between these populations.
The Christmas Island population contained a significant number of mtDNAs with the COII/tRNALys 9-bp deletion (haplogroup B, Table 1). Only 18 HVS-I haplotypes were present in 29 samples containing the 9-bp deletion, with a haplotype diversity of 0.951 ± 0.023 compared to 0.992 ± 0.004 for 69 samples (52 haplotypes) without the deletion. Nucleotide diversity among mtDNA samples with the deletion was also slightly lower (0.017 ± 0.009) than those without the deletion (0.020 ± 0.010), which is consistent with founder events associated with the expansion of the deletion from China (Redd et al., 1995; Yao et al., 2000). At least two 9-bp deletion haplogroups were identified and are distinguished by the 16217T/C polymorphism (haplogroup B4) which is widespread in Asia, and the 16140T/C polymorphism (haplogroup B5) which presents increasing frequencies in Southeast Asia (Yao et al., 2000; Schurr and Wallace, 2002). The remaining mtDNA haplogroups for this study were only broadly inferred from HVS-I motifs (Kivisild et al., 2002; Yao et al., 2002a). This was applied to initial surveys of haplogroup status in East Asia (Yao and Zhang, 2002; Tajima et al., 2003). The definition of haplogroups in East Asia is still developing, and therefore the phylogeny used in this study is by no means conclusive. The majority of predicted haplogroups in the Christmas Island sample (haplogroups M*, B, and F) are common in southern China and Southeast Asia (Schurr and Wallace, 2002). Haplogroup F1a is the main branch of F in southern mainland China and Southeast Asia, and accounts for the majority of F haplogroups in the Christmas Islanders (Kivisild et al., 2002; Yao et al., 2002a). Haplogroup D (including D4a) was also observed at moderate frequencies in the Christmas Island sample. The D4 frequency tends to increase from southern to northern China, and is also found in the Taiwan Han, who received a heavy influx of Han people from mainland China after World War II (Yao et al., 2002a). Most of the remaining samples belong to the Eurasian founder haplogroup R* and are possibly of European origin. Of the six European Christmas Islanders, five were potentially classified as haplogroup R* (types 66–70, Table 1). Five Chinese and three Malay Christmas Islanders (types 61–65, Table 1) were also potentially classified as European, and may represent maternal admixture.

Fig. 2. Two-dimensional MDS plot based on pairwise FST values (calculated from mtDNA HVS-I sequence data) between three Christmas Island populations and 12 reference populations from southern China, Southeast Asia, and Europe.

TABLE IV. mtDNA sequence types shared between Christmas Island and reference groups1

            CCI  MCI  ECI  Ch. Han  Ch. minor  Tw. Han  SEAS
CCI
MCI         3
ECI         0    0
Ch. Han     6    1    0
Ch. minor   5    3    0    10
Tw. Han     7    0    0    6        6
SEAS        7    2    0    7        10         5
British     1    2    0    0        1          0        0

1 CCI, Chinese Christmas Islanders; MCI, Malay Christmas Islanders; ECI, European Christmas Islanders; Ch. Han, Chinese Han (Yunnan, Guangdong, and Hubei); Ch. minor, Chinese ethnic minority (Dai, Bai, and Sali); Tw. Han, Taiwan Han; SEAS, Southeast Asian (Malaysian, Thai, Indonesian, and Philippine).

CONCLUSIONS
The combined mtDNA and Y-chromosome analyses are consistent with genetic origins for the Christmas Island population in mainland Asia, especially southern China, and Southeast Asian countries such as Cambodia, Thailand, and Malaysia. Very little admixture was observed between Christmas Islanders of Asian ancestry and Europeans, although these conclusions are constrained by the limited sample size of the European Christmas Islanders. The present study provides a useful model for the investigation of other contemporary populations derived from different origins, and could play an important role in the study of inherited diseases. Complementary studies of this nature are in progress to investigate the distribution of β- and α-thalassaemia mutations in the constituent Christmas Island communities.
ACKNOWLEDGMENTS
This study was supported by an Edith Cowan University-Industry Collaboration Grant with PathCentre and the Health Department, Western Australia. W. Musk (Department of Respiratory Medicine, Sir Charles Gairdner Hospital, Perth) organized the collection of Christmas Island blood samples.
LITERATURE CITED
Anderson S, Bankier AT, Barrell BG, de Bruijn MHL, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJH, Staden R, Young IG. 1981. Sequence and organization of the human mitochondrial genome. Nature 290:457–465.
Ballinger SW, Schurr TG, Torroni A, Gan YY, Hodge JA, Hassan K, Chen KH, Wallace DC. 1992. Southeast Asian mitochondrial DNA analysis reveals genetic continuity of ancient Mongoloid migrations. Genetics 130:139–152.
Commonwealth of Australia. 2002. Basic community profile: territory of Christmas Island. In: Census of Population and Housing. Canberra: Commonwealth of Australia.
Excoffier L, Smouse PE, Quattro JM. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479–491.
Horai S, Murayama K, Hayasaka K, Matsubayashi S, Hattori Y, Fucharoen G, Harihara S, Park KS, Omoto K, Pan IH. 1996. mtDNA polymorphism in East Asian Populations, with special reference to the peopling of Japan. Am J Hum Genet 59:579–590.
Jorde LB, Watkins WS, Bamshad MJ, Dixon ME, Ricker CE, Seielstad MT, Batzer MA. 2000. The distribution of human genetic diversity: a comparison of mitochondrial, autosomal, and Y-chromosome data. Am J Hum Genet 66:979–988.
Karafet T, Xu L, Du R, Wang W, Feng S, Wells RS, Redd AJ, Zegura SL, Hammer MF. 2001. Paternal population history of East Asia: sources, patterns, and microevolutionary processes. Am J Hum Genet 69:615–628.
Kivisild T, Tolk HV, Parik J, Wang Y, Papiha SS, Bandelt HJ, Villems R. 2002. The emerging limbs and twigs of the East Asian mtDNA tree. Mol Biol Evol 19:1737–1751.
Kruskal JB. 1964. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29:1–27.
Kumar S, Tamura K, Jakobsen IB, Nei M. 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17:1244–1245.
Macaulay V, Richards M, Hickey E, Vega E, Cruciani F, Guida V, Scozzari R, Bonné-Tamir B, Sykes B, Torroni A. 1999. The emerging tree of West Eurasian mtDNAs: a synthesis of control-region sequences and RFLPs. Am J Hum Genet 64:232–249.
Melton T, Clifford S, Kayser M, Nasidze I, Batzer M, Stoneking M. 2001. Diversity and heterogeneity in mitochondrial DNA of North American populations. J Forensic Sci 46:46–52.
Meyer S, Weiss G, von Haeseler A. 1999. Pattern of nucleotide substitution and rate heterogeneity in the hypervariable regions I and II of human mtDNA. Genetics 152:1103–1110.
Nei M. 1987. Molecular evolutionary genetics. New York: Columbia University Press.
Oota H, Kitano T, Jin F, Yuasa I, Wang L, Ueda S, Saitou N, Stoneking M. 2002. Extreme mtDNA homogeneity in continental Asian populations. Am J Phys Anthropol 118:146–153.
Piercy R, Sullivan KM, Benson N, Gill P. 1993. The application of mitochondrial DNA typing to the study of white Caucasian genetic identification. Int J Leg Med 106:85–90.
Qian YP, Chu ZT, Dai Q, Wei CD, Chu JY, Tajima A, Horai S. 2001. Mitochondrial DNA polymorphisms in Yunnan nationalities in China. J Hum Genet 46:211–220.
Quintana-Murci L, Chaix R, Wells RS, Behar DM, et al. 2004. Where West meets East: the complex mtDNA landscape of the Southwest and Central Asian corridor. Am J Hum Genet 74:827–845.
Redd AJ, Takezaki N, Sherry ST, McGarvey ST, Sofro ASM, Stoneking M. 1995. Evolutionary history of the COII/tRNALys intergenic 9 base pair deletion in human mitochondrial DNAs from the Pacific. Mol Biol Evol 12:604–615.
Richards M, Macaulay V, Hickey E, Vega E, et al. 2000. Tracing European founder lineages in the Near Eastern mtDNA pool. Am J Hum Genet 67:1251–1276.
Salas A, Richards M, De la Fe T, Lareu MV, Sobrino B, Sanchez-Diz P, Macaulay V, Carracedo A. 2002. The making of the African mtDNA landscape. Am J Hum Genet 71:1082–1111.
Schneider S, Roessli D, Excoffier L. 2000. Arlequin version 2.000: a software for population genetics data analysis. Geneva: Genetics and Biometry Laboratory, University of Geneva.
Schurr TG, Wallace DC. 2002. Mitochondrial DNA diversity in Southeast Asian populations. Hum Biol 74:431–452.
Semino O, Passarino G, Oefner PJ, Lin AA, Arbuzova S, Beckman LE, De Benedictis G, Francalacci P, Kouvatsi A, Limborska S, Marcikiae M, Mika A, Mika B, Primorac D, Santachiara-Benerecetti AS, Cavalli-Sforza LL, Underhill PA. 2000. The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective. Science 290:1155–1159.
StatSoft, Inc. 2001. STATISTICA (data analysis software system) version 6. www.statsoft.com.
Su B, Xiao J, Underhill P, Deka R, Zhang W, Akey J, Huang W, Shen D, Lu D, Luo J, Chu J, Tan J, Shen P, Davis R, Cavalli-Sforza L, Chakraborty R, Xiong M, Du R, Oefner P, Chen Z, Jin L. 1999. Y-chromosome evidence for a northward migration of modern humans into eastern Asia during the last Ice Age. Am J Hum Genet 65:1718–1724.
Su B, Xiao C, Deka R, Seielstad MT, Kangwanpong D, Xiao J, Lu D, Underhill P, Cavalli-Sforza L, Chakraborty R, Jin L. 2000. Y chromosome haplotypes reveal prehistorical migrations to the Himalayas. Hum Genet 107:582–590.
Tajima A, Sun C-S, Pan I-H, Ishida T, Saitou N, Horai S. 2003. Mitochondrial DNA polymorphisms in nine aboriginal groups of Taiwan: implications for the population history of aboriginal Taiwanese. Hum Genet 113:24–33.
Tajima A, Hayami M, Tokunaga K, Juji T, Matsuo M, Marzuki S, Omoto K, Horai S. 2004. Genetic origins of the Ainu inferred from combined DNA analyses of maternal and paternal lineages. J Hum Genet 49:187–193.
Tamura K, Nei M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 10:512–526.
Underhill PA, Shen P, Lin AA, Jin L, Passarino G, Yang WH, Kauffman E, Bonné-Tamir B, Bertranpetit J, Francalacci P, Ibrahim M, Jenkins T, Kidd JR, Mehdi SQ, Seielstad MT, Wells RS, Piazza A, Davis RW, Feldman MW, Cavalli-Sforza LL, Oefner PJ. 2000. Y chromosome sequence variation and the history of human populations. Nat Genet 26:358–361.
Underhill P, Passarino G, Lin AA, Shen P, Lahr MM, Foley RA, Oefner PJ, Cavalli-Sforza LL. 2001. The phylogeography of Y chromosome binary haplotypes and the origins of modern human populations. Ann Hum Genet 65:43–62.
Vigilant L, Stoneking M, Harpending H, Hawkes K, Wilson AC. 1991. African populations and the evolution of human mitochondrial DNA. Science 253:1503–1507.
Wang W, Wise C, Baric T, Black ML, Bittles AH. 2003. The origins and genetic structure of three co-resident Chinese Muslim populations: the Salar, Bo’an and Dongxiang. Hum Genet 113:244–252.
Wise CA, Paris M, Morar B, Wang W, Kalaydjieva L, Bittles AH. 2003. A standard protocol for single nucleotide primer extension in the human genome using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Rapid Commun Mass Spectrom 17:1195–1202.
Yao YG, Zhang YP. 2002. Phylogeographic analysis of mtDNA variation in four ethnic populations from Yunnan Province: new data and a reappraisal. J Hum Genet 47:311–318.
Yao YG, Watkins WS, Zhang YP. 2000. Evolutionary history of the mtDNA 9-bp deletion in Chinese populations and its relevance to the peopling of East and Southeast Asia. Hum Genet 107:504–512.
Yao YG, Kong QP, Bandelt HJ, Kivisild T, Zhang YP. 2002a. Phylogeographic differentiation of mitochondrial DNA in Han Chinese. Am J Hum Genet 70:635–651.
Yao YG, Nie L, Harpending H, Fu YX, Yuan ZG, Zhang YP. 2002b. Genetic relationship of Chinese ethnic populations revealed by mtDNA sequence diversity. Am J Phys Anthropol 118:63–76.
Y Chromosome Consortium. 2002. A nomenclature system for the tree of human Y-chromosomal binary haplogroups. Genome Res 12:339–348.