J Inherit Metab Dis DOI 10.1007/s10545-013-9623-1

ORIGINAL ARTICLE

A frequent splicing mutation and novel missense mutations color the updated mutational spectrum of classic galactosemia in Portugal

Ana I. Coelho & Ruben Ramos & Ana Gaspar & Cláudia Costa & Anabela Oliveira & Luísa Diogo & Paula Garcia & Sandra Paiva & Esmeralda Martins & Elisa Leão Teles & Esmeralda Rodrigues & M. Teresa Cardoso & Elena Ferreira & Sílvia Sequeira & Margarida Leite & Maria João Silva & Isabel Tavares de Almeida & João B. Vicente & Isabel Rivera

Received: 7 March 2013 / Revised: 16 May 2013 / Accepted: 17 May 2013
© SSIEM and Springer Science+Business Media Dordrecht 2013

Abstract Classic galactosemia is an autosomal recessive disorder caused by deficient galactose-1-phosphate uridylyltransferase (GALT) activity. Patients develop symptoms in the neonatal period, which can be ameliorated by dietary restriction of galactose. Many patients develop long-term complications, with a broad range of clinical symptoms whose pathophysiology is poorly understood. The high allelic heterogeneity of the GALT gene that characterizes this disorder is thought to play a determinant role in biochemical and clinical phenotypes. We aimed to characterize the mutational spectrum of GALT deficiency in Portugal and to assess potential genotype-phenotype correlations. Direct sequencing of the GALT gene and in silico analyses were employed to evaluate the impact of uncharacterized mutations upon GALT functionality. Molecular characterization of 42 galactosemic Portuguese patients revealed a mutational spectrum comprising 14 nucleotide substitutions: ten missense, two nonsense and two putative splicing mutations. Sixteen different genotypic combinations were detected, half of the patients being p.Q188R homozygotes. Notably, the second most frequent variation is a splicing mutation. In silico predictions complemented by a close-up on the mutations in the protein structure suggest that uncharacterized missense mutations have cumulative point effects on protein stability, oligomeric state, or substrate binding. One splicing mutation is predicted to cause an alternative splicing event. This study reinforces the difficulty in establishing a genotype-phenotype correlation in classic galactosemia, a monogenic disease whose complex pathogenesis and clinical features emphasize the need to expand the knowledge on this “cloudy” disorder.

Communicated by: Ertan Mayatepek

Electronic supplementary material The online version of this article (doi:10.1007/s10545-013-9623-1) contains supplementary material, which is available to authorized users.

A. I. Coelho · R. Ramos · M. Leite · M. J. Silva · I. T. de Almeida · J. B. Vicente · I. Rivera (*)
Metabolism & Genetics Group, Research Institute for Medicines and Pharmaceutical Sciences (iMed.UL), Faculty of Pharmacy, University of Lisbon, Av. Prof. Gama Pinto, 1643-009, Lisbon, Portugal
e-mail: [email protected]

A. I. Coelho · R. Ramos · M. Leite · M. J. Silva · I. T. de Almeida · J. B. Vicente · I. Rivera
Department of Biochemistry and Human Biology, Faculty of Pharmacy, University of Lisbon, Lisbon, Portugal

L. Diogo · P. Garcia · S. Paiva
Metabolic Clinics, Pediatric Hospital, CHUC, Coimbra, Portugal

E. Martins
Department of Pediatrics, Hospital Santo António, Porto, Portugal

E. L. Teles · E. Rodrigues · M. T. Cardoso
Metabolic Diseases Unit, Integrated Pediatric Hospital, Hospital São João, Porto, Portugal

A. Gaspar · C. Costa
Department of Pediatrics, Hospital Santa Maria, Lisbon, Portugal

E. Ferreira
Hospital Centre, Funchal, Madeira, Portugal

A. Oliveira
Department of Medicine, Hospital Santa Maria, Lisbon, Portugal

S. Sequeira
Department of Pediatrics, Hospital D. Estefânia, Lisbon, Portugal

Introduction

Classic galactosemia or type I galactosemia (OMIM #230400) is an inborn error of galactose metabolism caused by deficient galactose-1-phosphate uridylyltransferase (GALT; EC 2.7.7.12) activity. This condition is inherited in an autosomal recessive pattern (Reichardt and Woo 1991) and occurs with a frequency of 1:30,000–60,000 live births (Fridovich-Keil and Walter 2008). Generally asymptomatic at birth, galactosemic neonates become ill within days to weeks after starting milk ingestion. Initial symptoms include poor feeding, vomiting and diarrhea with poor weight gain, progressing to liver failure, renal tubular disease, Escherichia coli sepsis, coma, and death (Fridovich-Keil and Walter 2008). Lifelong dietary galactose restriction, the current standard of care, partially relieves or prevents these acute and potentially lethal symptoms. However, many patients develop severe long-term complications (Fridovich-Keil 2006).

The human GALT gene (GenBank Accession NG_009029.1) is located on chromosome 9p13, spanning ∼4 kb of DNA arranged in 11 exons (Reichardt and Berg 1988; Leslie et al 1992). The 1295 bp-long ORF is translated into a 379-amino acid polypeptide, the active enzyme being an 88 kDa homodimer (Fridovich-Keil and Walter 2008). GALT belongs to the histidine triad family of transferases, displaying a double-displacement mechanism (McCorvie and Timson 2011). First, UDP-glucose (UDP-Glc) enters the active site, its UMP moiety binds covalently to His186, and glucose-1-phosphate (Glc-1-P) is released. Then, galactose-1-phosphate (Gal-1-P) enters the active site, and the enzyme-bound UMP is transferred to Gal-1-P, forming and releasing UDP-galactose (UDP-Gal) (Fridovich-Keil and Walter 2008).

Classic galactosemia patients show phenotypic variability, ranging from a severe to a milder phenotype, which can be partially explained by the characteristic high allelic heterogeneity (Marabotti and Facchiano 2005).
Thus far 264 different variations have been described at the GALT locus (Tyfield 2000; Calderon et al 2007), the majority being missense mutations, and most being infrequent (Fridovich-Keil and Walter 2008). The most frequent mutations are p.Q188R, p.K285N, p.S135L and p.L195P. The predominant GALT mutation in patients of European descent is p.Q188R (∼64 % of all GALT mutations) (Flanagan et al 2010). Homozygous p.Q188R patients have essentially no GALT activity in red blood cells (RBC), displaying a poor outcome (Sommer et al 1995; Tyfield et al 1999). p.K285N is the second most common European mutation, particularly in central and eastern Europe (Lukac-Bajalo et al 2007). p.K285N homozygotes completely lack RBC GALT activity, presenting a severe clinical phenotype (Sommer et al 1995; Tyfield et al 1999). p.S135L is almost exclusively found in individuals of African origin (∼91 % of African GALT mutant alleles) (Henderson et al 2002; Crushell et al 2009), and appears to be tissue specific: homozygous patients have essentially no RBC GALT activity but present low residual activity in leukocytes, and usually display mild clinical outcomes (Lai et al 1996; Tyfield 2000). Mutations that do not result in galactosemia have been reported (Suzuki et al 2001): ‘Duarte’ (or D2), having p.N314D in linkage disequilibrium with three intronic substitutions and a 4-nucleotide promoter deletion (decreased GALT activity); and ‘Los Angeles’ (or D1) with the p.N314D and the c.652C→T (p.L218L) mutations (normal or above normal GALT activity) (Fridovich-Keil and Walter 2008).

An essential question regarding classic galactosemia concerns the identification of predictive factors to distinguish patients who will thrive in the long run from those who will experience complications. GALT genotyping provides valuable prognostic information about the biochemical and clinical outcome (Tyfield 2000), contributing to a more effective therapy (Woodcock 2007). Considering the wide global mutational spectrum as well as the functional consequences of the mutations, molecular analysis can improve the understanding of the prognosis of the Portuguese galactosemic population.

Previous studies on classic galactosemia in Portugal proposed p.Q188R to be the only frequent mutation in Portuguese patients (Gort et al 2009). The aim of this study is to characterize and update the GALT locus mutational spectrum in a cohort of 42 Portuguese galactosemic patients. An in silico approach is employed herein to evaluate the impact of uncharacterized mutations. Finally, comparing the modeling data with patients’ phenotypic parameters, we assess potential genotype-phenotype correlations.

Methods

Patients

Forty-two galactosemic Portuguese patients, including four pairs of siblings, were investigated. The 19 female (45 %) and 23 male (55 %) individuals originated from all regions in Portugal, including Madeira and Azores Islands, being representative of the whole population. Diagnosis of south/center regions and Madeira Island patients was confirmed by absent or reduced […]

[…] c.820+13a>g (IVS8+13a>g) intronic variation was found in six independent mutant alleles, making it the second most frequent mutation in Portuguese patients (8.0 %). This mutation was identified in a homozygous patient and in five compound heterozygotes, two of them being siblings, all of them originating from central regions of Portugal.

The third most frequent mutations were p.S135L and p.G175D (4.0 % each). p.S135L was identified in three independent mutant alleles, two in heterozygosity with p.Q188R and one with p.F171C (c.512t>g). Another substitution in this nucleotide (c.512t>c, p.F171S) is quite frequent in African-Americans (Tyfield et al 1999; Crews et al 2000). Both parents of the p.S135L/p.F171C patient are from São Tomé and Príncipe, in line with the African origin of p.S135L and p.F171S. p.G175D was exclusively detected in Portugal. Previous studies reported a patient carrying this mutation in only one allele (Gort et al 2006, 2009). We identified two additional independent mutant alleles in a pair of homozygous siblings from Madeira Island.

p.R148Q, p.P185S and p.R259W were detected at a 2.6 % frequency. p.R148Q, previously described as a common Iberian Peninsula mutation, was only found in two mutant alleles, whereas seven alleles were identified in the Spanish population (Gort et al 2006, 2009). On the other hand, p.P185S has been found exclusively in the Portuguese population, in a homozygous patient. Similarly, p.R259W was detected in homozygosity in one patient. The remaining mutations, namely p.R80X, c.328+33g>a (IVS3+33g>a), p.F171C, p.S192G, p.R204X, p.P295T and p.R333G, occurred at a 1.3 % frequency, corresponding to one allele each.

Two additional individuals with phenotypes suggestive of classic galactosemia were initially enrolled in this study. Their molecular characterization revealed they were heterozygotes carrying p.N314D along with the associated promoter deletion and intronic variations. Their GALT activity showed values compatible with the Duarte variant carrier status (Elsas et al 2001), and they were therefore excluded from this study. Moreover, the p.N314D variation was screened among all galactosemic patients and no such allele was identified. Accordingly, all the investigated patients carried only two mutant GALT alleles in trans.
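The statement that c.512t>g (p.F171C) and c.512t>c (p.F171S) hit the same nucleotide of the same codon can be checked with simple arithmetic on the coding position. The following is a minimal sketch, assuming coding positions are numbered from 1 at the first base of the start codon; the helper names (`codon_number`, `position_in_codon`, `parse_substitution`) are hypothetical and not part of any tool used in the study.

```python
import re


def codon_number(cds_position: int) -> int:
    """Return the 1-based codon that contains a 1-based coding position."""
    if cds_position < 1:
        raise ValueError("coding positions start at 1")
    return (cds_position + 2) // 3


def position_in_codon(cds_position: int) -> int:
    """Return 1, 2 or 3: the place of the nucleotide within its codon."""
    return (cds_position - 1) % 3 + 1


def parse_substitution(variant: str) -> tuple[int, str, str]:
    """Split a simple exonic substitution such as 'c.512t>g' into (512, 't', 'g')."""
    match = re.fullmatch(r"c\.(\d+)([acgt])>([acgt])", variant)
    if match is None:
        raise ValueError(f"not a simple exonic substitution: {variant}")
    return int(match.group(1)), match.group(2), match.group(3)


# The two substitutions at nucleotide 512 mentioned in the text.
for variant, protein in [("c.512t>g", "p.F171C"), ("c.512t>c", "p.F171S")]:
    position, ref, alt = parse_substitution(variant)
    print(variant, protein, "codon", codon_number(position),
          "base", position_in_codon(position))
```

Both variants map to the second base of codon 171, in agreement with the residue number in the protein-level names.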

Table 1 Characterization of GALT mutations identified in 42 Portuguese galactosemic patients, corresponding to 76 independent mutant alleles

Sequence variation  GALT mutation  Exon/Intron  Alleles (n)  Allele frequency (%)  In vitro GALT activity
c.238c>t            p.R80X         E2           1            1.3                   n.a.
c.328+33g>a         IVS3+33g>a     I3           1            1.3                   n.a.
c.404c>t            p.S135L        E5           3            4.0                   5 % (Fridovich-Keil et al 1995, Wells and Fridovich-Keil 1997)
c.443g>a            p.R148Q        E5           2            2.6                   n.a.
c.512t>g            p.F171C        E6           1            1.3                   n.a.
c.524g>a            p.G175D        E6           3            4.0                   n.a.
c.553c>t            p.P185S        E6           2            2.6                   14 % (Quimby et al 1996)
c.563a>g            p.Q188R        E6           51           67.1                  0 %
c.574a>g            p.S192G        E7           1            1.3                   n.a.
c.610c>t            p.R204X        E7           1            1.3                   0 % (Chhay et al 2008)
c.775c>t            p.R259W        E8           2            2.6                   …
c.820+13a>g         IVS8+13a>g     I8           6            8.0                   …
c.883c>a            p.P295T        E9           1            1.3                   …
c.997c>g            p.R333G        E10          1            1.3                   …
Total                                           76           100

n.a. not analyzed

Table 2 […]

Patient  Year  Sex  Allele 1    Allele 2    RBC GALT activity  RBC Gal-1-P  Outcomes
[…]
…        …     …    …           …           0.9                22/22.9      LD, SI, WMA
…        …     …    …           p.S135L     1.05               255/66.1     C, AMF, SI
…        …     …    …           p.S135L     n.d.               n.d.         SI
…        …     …    …           p.R148Q     0                  n.a./79.8    ID, OD
…        …     …    …           p.R148Q     0.6                466/95.2     LD, SI
…        …     …    …           p.G175D     0.6                235/76.7     C, ID, SI
…        …     …    …           p.S192G     0                  82/73        Absent
…        …     …    …           p.R204X     0                  1320/102     LD
…        2002  F    p.Q188R     IVS8+13a>g  n.d.               352/62.2     ID, SI
32a      2010  M    p.Q188R     IVS8+13a>g  4.4                307/78.5     Absent
33       2000  F    p.Q188R     IVS8+13a>g  0.1                132/46.2     SI
34       1989  F    p.Q188R     IVS8+13a>g  0                  n.a./65.9    ID, OD
35       2007  F    p.Q188R     p.R333G     0.17               1644/53.8    Absent
36       2010  M    p.S135L     p.F171C     n.d.               140/71.0     SI
37a      1988  M    p.G175D     p.G175D     0                  127/75.1     Absent
38a      1994  F    p.G175D     p.G175D     1.2                495/95.1     WMA
39       1996  M    p.P185S     p.P185S     4.65               n.a./50.6    ID, SI
40       1991  F    p.R259W     p.R259W     n.d.               n.d.         C, LD, OD
41       1991  M    IVS8+13a>g  IVS8+13a>g  0                  125/37.3     LD
42       1988  M    IVS8+13a>g  p.P295T     0                  125/52.6     C, LD

C cataracts; AMF anomalies of motor function; ID intellectual disability; LD learning disabilities; OD ovarian dysfunction; SI speech impairment; WMA white matter anomalies; n.a. not available; n.d. not determined
a Siblings
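The allele frequencies in Table 1 follow from dividing each allele count by the 76 independent mutant alleles. The sketch below reproduces that arithmetic from the counts listed in the table. The function `independent_alleles` encodes an assumption that is not stated in the text: that each of the four sibling pairs shares both mutant alleles, so that one sibling per pair is left out of the independent count.

```python
# Allele counts as listed in Table 1.
ALLELE_COUNTS = {
    "p.R80X": 1, "IVS3+33g>a": 1, "p.S135L": 3, "p.R148Q": 2,
    "p.F171C": 1, "p.G175D": 3, "p.P185S": 2, "p.Q188R": 51,
    "p.S192G": 1, "p.R204X": 1, "p.R259W": 2, "IVS8+13a>g": 6,
    "p.P295T": 1, "p.R333G": 1,
}


def independent_alleles(patients: int, sibling_pairs: int) -> int:
    """Two alleles per patient, minus the two alleles repeated in each sibling pair.

    Assumption for illustration: affected siblings carry the same two alleles.
    """
    return 2 * patients - 2 * sibling_pairs


def allele_frequencies(counts: dict[str, int]) -> dict[str, float]:
    """Percentage of each allele among all counted alleles, to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_alleles = sum(ALLELE_COUNTS.values())
frequencies = allele_frequencies(ALLELE_COUNTS)

print(total_alleles, independent_alleles(patients=42, sibling_pairs=4))
for name, value in sorted(frequencies.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {value:5.1f} %")
```

Direct rounding gives 7.9 % for the allele counted six times and 3.9 % for those counted three times, whereas Table 1 lists 8.0 % and 4.0 %; the difference may reflect an adjustment so that the column totals 100.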

Notably, two mutations that are frequent in European countries (including Spain), p.K285N and p.L195P, remain undetected amongst Portuguese patients. Interestingly, there are only five mutations in common between the Spanish and the Portuguese populations, despite the 21 and 16 different alleles respectively identified in either population. Other genetic disorders, including phenylketonuria, also present remarkably different mutational spectra between both countries (Arnaiz-Villena et al 1997; Perez et al 1997; Rivera et al 1998).

Prediction of mutations’ impact upon GALT stability and functionality

Most of the mutations identified in the Portuguese galactosemic population affect strictly conserved residues in GALT from higher eukaryotes to prokaryotes (Fig. S1), attesting the relevance of the substituted positions. While the effect of the newly described mutations upon GALT structural and functional properties is under investigation and out of the scope of this paper, we employed bioinformatic tools to predict their impact (Table S2). We generated a structural model of human GALT (Fig. S2) using as template the E. coli GALT structure (PDB code 1GUP (Thoden et al 1997)), a choice supported by their high sequence identity (51 %). To increase the robustness of the predictions, we employed four different servers, always querying the most prevalent pathogenic mutations: p.S135L, p.Q188R, and p.K285N.

Although there is a reasonable degree of consensus between prediction servers (Table S3), there are relevant point discrepancies, which may be due to these servers analyzing the overall structural perturbations and overlooking essential details such as dimer assembly, metal binding, active site and other functionally relevant residues. A striking example is p.Q188R, which is not considered disease-causing or destabilizing by every server. However, Q188 is a key site for the stabilization of reaction intermediates (Geeganage and Frey 1998), and the high impact of the p.Q188R mutation cannot be discriminated by the servers. These limitations weaken the conclusions drawn from the prediction servers, prompting us to generate structural models of the GALT mutants studied herein, to inspect local structural and functional effects (summarized in Table S3).

p.F171C typifies the prediction servers’ limitations: whereas two programs suggest a stabilizing effect, the other two classify it as a destabilizing mutation (Table S2). F171 is located near the active site, composing a hydrophobic patch with W190 from the same monomer and Y339 from the opposing monomer (Fig. S3). F171 is tightly associated with Q188, which stabilizes the bound intermediate, through H-bonds between both main-chain carbonyl and amine groups. F171 substitutions by Ser, Leu and Tyr alter the Q188 position, disturbing the dimer interface and hexose binding (Crews et al 2000). In F171C (Fig. S3), H-bonds between C171 and Q188 are slightly shorter than in wild-type, possibly yielding a distortion in the Q188 side-chain position that affects hexose binding. C171 may also disturb the monomer–monomer interaction by weakening the hydrophobic patch with W190 and Y339 (Table S3).

p.G175D is predicted as destabilizing by the four servers (Table S2). G175 is strictly conserved (Fig. S1) and inserted in a coil region near the active site and bound substrate (Fig. S4). The striking structural perturbation in this mutant is the predicted H-bond between the Asp side-chain carboxylate and the M177 main-chain amine (2.73 Å; Table S3), which may restrict mobility in a coil region near the dimer interface, not far from the active site (Table S3).

p.P185S affects the H-P-H active site, most likely severely impairing GALT activity. This mutation has been functionally studied by expressing the human GALT (wild-type and mutants) in a yeast GALT-null system (Quimby et al 1996): p.P185S was produced in lower abundance, and displayed impaired conformational stability and ∼7-fold decreased activity as compared to wild-type. The prediction servers concur in assigning a destabilizing and disease-causing effect to this mutation (Table S2). The substituting serine may disturb the active site (Fig. S5) and also introduce rigidity in the H-(P)S-H tripeptide, since the S185 side-chain hydroxyl can form a strong H-bond with the V137 main-chain carbonyl (2.48 Å; Table S3).

p.S192G affects a poorly conserved position (Fig. S1) located near the active site. Servers’ predictions diverge, considering it either benign/neutral or destabilizing (Table S2). The substituted Ser is likely H-bonded through its side-chain -OH to the -OH of Y339 from the opposing monomer (Y339(B) in Fig. S6; 3.45 Å), an interaction that should be lost upon substitution by a Gly, thus disturbing the dimer interface (Table S3). This Y339 is strictly conserved, and is the same residue responsible for the dimer interface perturbation caused by the p.F171C mutation.

The p.R259W mutation affects a non-conserved position amidst a highly conserved motif (Fig. S1). Prediction servers diverged in ranking this mutation (Table S2). R259 is located at the GALT surface, its side-chain guanidinium protruding into the solvent (Fig. S7). The most striking perturbation is a 3.24 Å displacement of the adjacent E271 side-chain carboxylate promoted by the bulky Trp indole moiety, which enables the formation of two H-bonds (2.98 Å and 3.01 Å) with the side-chain -OH and main-chain amine of T268. These extra H-bonds will likely cause rigidity between the E271 helix and the T268 coil region, highlighting how subtle structural perturbations in the GALT surface may propagate into functionally relevant sites and affect the overall GALT activity.

p.P295T occurs in a strictly conserved sequence motif (Fig. S1) in a coil region. Prediction servers score it as a destabilizing and disease-causing mutation (Table S2). The T295 side-chain -OH is within H-bonding distance of the M177 side-chain sulfur atom (3.70 Å; sulfur H-bonds may display longer distances (Gregoret et al 1991)) (Fig. S8). M177 is located near the dimer interface, and the rigidity derived from an extra H-bond may disturb the monomer–monomer interaction (Table S3).

p.R333G occurs in a strictly conserved position in a highly conserved motif (Fig. S1) and is ranked as destabilizing by the prediction servers (Table S2). R333 is in a coil region distant from the H-P-H active site or the bound substrate (Fig. S9). The R333 guanidinium moiety is at the monomer surface close to the dimer interface, and appears to be involved in an H-bond network with the M177 and K334 main-chain carbonyls, which is lost upon substitution by Gly, likely disturbing this monomer–monomer interface region (Table S3).

In summary, the predicted structural and functional effects of the described mutations reveal that apparently innocuous substitutions may cause point disturbances in the active site, substrate binding, and monomer–monomer interface. Altogether, these subtle perturbations may account for impaired conformational stability and/or an increased propensity to aggregate, targeting the misfolded mutant GALT for degradation, resulting in lower levels of less active GALT protein, and ultimately leading to low GALT activity in patients. The inspection of local effects of the mutations, together with the knowledge of the wild-type structure and function, provides a more valuable tool than prediction servers, which overlook relevant information about the analyzed proteins. These predictions should be carefully considered, since they rely upon the observation of structural models based on the crystallographic structure of the E. coli enzyme.

GALT variations affecting splice events

Bioinformatic programs were used to evaluate possible missplicing effects of the two intronic variations identified in Portuguese patients, c.820+13a>g (IVS8+13a>g) and c.328+33g>a (IVS3+33g>a), neither of which was identified in 100 control alleles (Gort et al 2006, 2009).
To predict whether the c.820+13a>g mutation directly affects the GALT gene pre-mRNA splicing or if it is just a marker linked to another causative mutation, wild-type and mutant gene sequences comprising exon 8, intron 8 and exon 9 (320 bp) were analyzed with two splice site prediction programs, which gave similar results. The intron 8 authentic 5′ splice site presents the same or similar scores in both wild-type and mutant sequences in each program (0.64 and 0.86 in NNSplice and NetGene2, respectively), suggesting this is not a weak splicing donor site. Both programs also predict a new GT donor site (c.820+14_15) in the presence of the mutation. In the NNSplice program, this cryptic splice site is only recognized in the mutant sequence, scoring 0.95, whereas in the NetGene2 program it scores 0.68 and 0.99 in the wild-type and mutant sequences, respectively. This GT is located immediately downstream of the variant nucleotide site and both programs assign it a higher strength than the canonical GT. These results suggest that in vivo the c.820+14_15 splice site prevails over the canonical donor splice site, being used by the spliceosome machinery and leading to the inclusion of the first 13 nucleotides of intron 8 in the coding sequence.

The authentic 3′ splice acceptor site of intron 8 scores very low in both programs (0.45 in NNSplice and 0.04 in NetGene2), revealing its intrinsic weakness. Notably, both programs indicate the mutation creates a new splice acceptor site immediately upstream of the activated cryptic donor site (c.820+12_13) which, despite being also extremely weak, shows considerably higher scores than the canonical one (0.51 and 0.09 in NNSplice and NetGene2, respectively).

Possible modifications of exonic splicing enhancer (ESE) patterns caused by this nucleotide variation were assessed with two programs, which yielded contradictory results. ESEfinder 3.0 scores nearly identical values for three of the SR proteins (SRSF2, SRSF5, SRSF6) in both wild-type and mutant sequences. However, the results for SRSF1 show that its binding score almost doubles from 1.986666 in the wild-type sequence (close to the threshold value, 1.956) to 3.906278 in the mutant sequence (CCCAGGT, positions 9–15 of intron 8). On the other hand, the RESCUE-ESE program identifies in the same region a binding sequence (CAAGTA, positions 11–16 of intron 8) in the wild-type sequence, but not in the mutant one.

Concerning c.328+33g>a, wild-type and mutant gene sequences comprising exon 3, intron 3 and exon 4 (213 bp) were scanned with NNSplice and NetGene2, which attribute similar scores (in the presence and absence of this variation) to the natural donor (0.90 and 0.96) and acceptor (0.95 and 0.73) splice sites, respectively, and recognized no cryptic splice sites in either sequence. ESEfinder shows the intronic change presents the same binding scores to all motifs covered by this software, with the exception of those for SRSF1, for which the wild-type sequence presents a motif (AGGAGGG, positions 28–34 of intron 3) scoring 2.05955, which is no longer recognized in the presence of the mutation (AGGAGAG). In contrast, RESCUE-ESE only identifies a new ESE motif for this same SR protein in the presence of the mutation (AGGAGA, positions 28–33 of intron 3).

Altogether, these data strongly suggest the c.820+13a>g variation is most probably a splicing mutation, whereas the effect of the c.328+33g>a variation remains to be clarified. Efforts are underway to study these mutations in greater detail, particularly the resulting aberrant splicing, which has so far been prevented by the lack of any material other than genomic DNA from these patients or their parents.
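The sequence-level reasoning in this section can be retraced on the short motifs quoted above. The sketch below is only an illustration under stated assumptions: the wild-type fragment for intron 8 positions 9–16 (`CCCAAGTA`) is assembled here from the two motifs given in the text (CCCAGGT in the mutant, CAAGTA in the wild-type), and all function names are hypothetical. It does not reproduce NNSplice, NetGene2, ESEfinder or RESCUE-ESE scoring.

```python
def substitute(fragment: str, first_position: int, position: int,
               ref: str, alt: str) -> str:
    """Apply a single-base substitution at a 1-based intron position."""
    index = position - first_position
    if fragment[index] != ref:
        raise ValueError(f"expected {ref} at +{position}, found {fragment[index]}")
    return fragment[:index] + alt + fragment[index + 1:]


def gt_positions(fragment: str, first_position: int) -> list[int]:
    """Intron positions of the G in every GT dinucleotide of the fragment."""
    return [first_position + i for i in range(len(fragment) - 1)
            if fragment[i:i + 2] == "GT"]


def retained_intronic_bases(donor_g_position: int) -> int:
    """Intronic bases kept in the mRNA if splicing occurs just before this GT."""
    return donor_g_position - 1


def shifts_reading_frame(inserted_bases: int) -> bool:
    """An insertion that is not a multiple of three changes the reading frame."""
    return inserted_bases % 3 != 0


# Intron 8, positions 9-16 (assumed wild-type fragment, see lead-in).
INTRON8_WT = "CCCAAGTA"
intron8_mut = substitute(INTRON8_WT, 9, 13, "A", "G")   # c.820+13a>g
cryptic_donor = gt_positions(intron8_mut, 9)[0]          # G of the +14_15 GT
retained = retained_intronic_bases(cryptic_donor)

# Intron 3, positions 28-34 (wild-type SRSF1 motif quoted in the text).
INTRON3_WT = "AGGAGGG"
intron3_mut = substitute(INTRON3_WT, 28, 33, "G", "A")   # c.328+33g>a

# SRSF1 scores reported by ESEfinder 3.0 for the intron 8 region.
SRSF1_WT, SRSF1_MUT = 1.986666, 3.906278
fold_change = SRSF1_MUT / SRSF1_WT

print(intron8_mut, cryptic_donor, retained, shifts_reading_frame(retained))
print(intron3_mut, round(fold_change, 2))
```

On these fragments the GT dinucleotide at +14_15 is present in both wild-type and mutant sequences; what the substitution changes is the base immediately before it. Using that GT as donor retains 13 intronic nucleotides, which is not a multiple of three and therefore shifts the reading frame.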


Assessing potential genotype-phenotype correlations in classic galactosemia

To evaluate the functional significance of GALT mutations, we searched for correlations between genotypes and phenotypic manifestations of classic galactosemia, namely cataracts, motor function anomalies, intellectual and learning disabilities, ovarian dysfunction, speech impairment and white matter anomalies. Genotype-phenotype correlations are difficult to establish since most patients are compound heterozygotes, the resulting phenotype depending on the interactions between products of two different alleles. Moreover, genotype-phenotype correlations usually rely on the knowledge of the residual enzymatic activity of each mutation, determined by heterologous expression of recombinant proteins. Few GALT mutations have so far been characterized (Fridovich-Keil et al 1995; Wells and Fridovich-Keil 1997; Shield et al 2000; Riehman et al 2001; Chhay et al 2008), which led us to employ an in silico approach to predict the effect of the mutations on GALT functional properties.

p.Q188R is the most common mutation in populations of European descent (Tyfield 2000). Homozygous p.Q188R patients have undetectable RBC activity, consistent with in vitro enzymatic assays. Indeed, half of the Portuguese galactosemic patients are p.Q188R homozygotes and their RBC Gal-1-P always lies in the upper levels of the advocated therapeutic range, presenting a severe clinical phenotype with negative outcomes (Table 2). On the other hand, the 14 patients heterozygous for p.Q188R present biochemical and clinical phenotypes strongly dependent on the second mutation. Severe or null mutations such as p.R80X, p.R148Q, p.S192G and p.R204X reinforce the severe phenotype, whereas milder mutations like p.S135L alleviate the phenotype and strongly reduce the negative outcomes.

Some missense mutations appear to partially retain enzyme activity, being associated with a milder biochemical phenotype confirmed by in vitro studies (Tyfield 2000). Heterologous p.S135L GALT expression in yeast yielded ∼5 % of wild-type activity (Wells and Fridovich-Keil 1997), consistent with three p.S135L patients presenting a mild phenotype. Despite all prediction servers estimating a strong impact on GALT activity by p.P185S, the recombinant protein expressed in yeast displays 14 % of wild-type activity (Quimby et al 1996). Notably, the homozygous p.P185S patient indeed presents some residual RBC GALT activity and his RBC Gal-1-P values approach the therapeutic range lower limit. Moreover, this patient presents only slight speech and learning disabilities, overall displaying a mild phenotype. p.R259W was also identified in a homozygous patient whose biochemical and clinical phenotypes concur with p.R259W being a destabilizing mutation causing structural perturbations.

p.F171C was identified exclusively in a Portuguese patient, carrying p.S135L in the other allele. The rarity of this mutation, together with its being in trans with a mild mutation, makes it difficult to estimate the relative contributions of each mutation to the patient’s mild phenotype. p.F171C has never been expressed in vitro and prediction servers estimate an ambiguous effect for this mutation (Table S2). A closer inspection of its location suggests possible effects on hexose binding and/or the monomer–monomer interaction.

The predicted destabilizing effect of p.G175D contrasts with the variability of the observed phenotypes. Although both siblings homozygous for p.G175D have a normal school career, the sister presents white matter anomalies, while the brother displays no negative outcomes. On the other hand, the heterozygous p.G175D/p.Q188R patient presents a severe phenotype, supporting our hypothesis that the outcome of heterozygous patients carrying p.Q188R strongly depends on the other mutation. Therefore, a potentially severe mutation like p.G175D emphasizes the severe phenotype conferred by p.Q188R.

A genotype-phenotype correlation was also attempted on patients carrying the potential splicing mutations. The effect of the c.328+33g>a variation remains to be clarified. The patient bearing this mutation in heterozygosity with p.Q188R displays a mild clinical phenotype with no negative outcomes, with RBC Gal-1-P values slightly above the normal range, which suggests that this intronic variation should be a mild mutation.

Concerning the c.820+13a>g transition, all bioinformatic tools predict a severe splicing mutation, inducing a frameshift that leads to a premature stop codon. Accordingly, it is anticipated that the mRNA should be directed to the nonsense-mediated decay system, with no detected enzymatic activity in biological samples. However, despite displaying null GALT activity in RBC, the patient homozygous for this mutation presents a mild clinical phenotype. No negative outcomes other than moderate learning disabilities were observed, and these have not prevented completion of a secondary education level. On the other hand, heterozygotes also carrying p.Q188R display mild phenotypes but negative outcomes like speech impediments and learning disabilities. Finally, the patient who carries c.820+13a>g in heterozygosity with p.P295T also reveals a mild phenotype. Altogether, these results suggest that c.820+13a>g, despite its potentially severe effect, may alleviate severe negative outcomes caused by highly deleterious mutations, like p.Q188R. The mild phenotype afforded by this mutation could be due to tissue availability of specific spliceosomal factors, which may prevent the full expression of the alternative splicing effect.
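The outcomes discussed for carriers of c.820+13a>g can be tallied from the corresponding Table 2 rows. The sketch below is a simple illustration: the records are transcribed from the Table 2 rows in which IVS8+13a>g appears, the outcome codes follow the table legend, and the function names are hypothetical. It is a counting aid, not a statistical analysis.

```python
from collections import Counter

# (allele 1, allele 2, outcome codes) for carriers of IVS8+13a>g, as read from
# Table 2; an empty list stands for "Absent" (no negative outcomes recorded).
CARRIERS = [
    ("p.Q188R", "IVS8+13a>g", ["ID", "SI"]),
    ("p.Q188R", "IVS8+13a>g", []),
    ("p.Q188R", "IVS8+13a>g", ["SI"]),
    ("p.Q188R", "IVS8+13a>g", ["ID", "OD"]),
    ("IVS8+13a>g", "IVS8+13a>g", ["LD"]),
    ("IVS8+13a>g", "p.P295T", ["C", "LD"]),
]


def tally_outcomes(records: list[tuple[str, str, list[str]]]) -> Counter:
    """Count how many patients show each outcome code."""
    counts: Counter = Counter()
    for _allele_1, _allele_2, outcomes in records:
        counts.update(outcomes)
    return counts


def without_outcomes(records: list[tuple[str, str, list[str]]]) -> int:
    """Number of patients with no negative outcome recorded."""
    return sum(1 for _a1, _a2, outcomes in records if not outcomes)


outcome_counts = tally_outcomes(CARRIERS)
print(dict(outcome_counts), without_outcomes(CARRIERS))
```

Keeping the genotype in each record makes it straightforward to repeat the tally for a subset, for example only the compound heterozygotes with p.Q188R.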


Conclusion

The present work expands the knowledge on the mutational spectrum of classic galactosemia in Portugal, and also predicts the structural and functional effects of mutations identified in Portuguese patients which were not yet characterized. The establishment of potential genotype-phenotype correlations is not trivial for several reasons. Firstly, in silico analysis tools are rather limited since they evaluate overall structural perturbations, overlooking relevant functional information, such as oligomeric arrangements and cofactor and ligand binding. These limitations were tentatively overcome by closely inspecting the local effects of the mutations and attempting to envisage structural–functional impairments. Secondly, although classic galactosemia is a monogenic disorder, the resulting phenotype is not straightforward. There is a paucity of structural information on the Leloir pathway enzymes. It has been hypothesized they could form supra-molecular complexes, which would imply that a change in GALT protein could also affect the entire metabolic pathway (McCorvie and Timson 2011). Moreover, galactose metabolites are implicated in several physiological pathways, including the intricate glycosylation reactions, which are reflected on many distinct levels. Finally, the influence of other genetic modifiers, epigenetic alterations and environmental factors must be considered (Waisbren et al 2012).

Acknowledgments We wish to acknowledge the patients and families enrolled in this study. This work was supported by SPDM Grant to IR, SFRH/BD/48259/2008 FCT Grant to AIC, and PEst-OE/SAU/UI4013/2011.

Conflict of interest None.

References
Adzhubei IA, Schmidt S, Peshkin L et al (2010) A method and server for predicting damaging missense mutations. Nat Methods 7(4):248–249
Arnaiz-Villena A, Martinez-Laso J, Gomez-Casado E et al (1997) Relatedness among Basques, Portuguese, Spaniards, and Algerians studied by HLA allelic frequencies and haplotypes. Immunogenetics 47(1):37–43
Arnold K, Bordoli L, Kopp J, Schwede T (2006) The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling. Bioinformatics 22(2):195–201
Brunak S, Engelbrecht J, Knudsen S (1991) Prediction of human mRNA donor and acceptor sites from the DNA sequence. J Mol Biol 220(1):49–65
Calderon FR, Phansalkar AR, Crockett DK, Miller M, Mao R (2007) Mutation database for the galactose-1-phosphate uridyltransferase (GALT) gene. Hum Mutat 28(10):939–943
Cartegni L, Wang J, Zhu Z, Zhang MQ, Krainer AR (2003) ESEfinder: a web resource to identify exonic splicing enhancers. Nucleic Acids Res 31(13):3568–3571

Chhay JS, Openo KK, Eaton JS, Gentile M, Fridovich-Keil JL (2008) A yeast model reveals biochemical severity associated with each of three variant alleles of galactose-1P uridylyltransferase segregating in a single family. J Inherit Metab Dis 31(1):97–107
Crews C, Wilkinson KD, Wells L, Perkins C, Fridovich-Keil JL (2000) Functional consequence of substitutions at residue 171 in human galactose-1-phosphate uridylyltransferase. J Biol Chem 275(30):22847–22853
Crushell E, Chukwu J, Mayne P, Blatny J, Treacy EP (2009) Negative screening tests in classical galactosaemia caused by S135L homozygosity. J Inherit Metab Dis 32(3):412–415
Dehouck Y, Kwasigroch JM, Gilis D, Rooman M (2011) PoPMuSiC 2.1: a web server for the estimation of protein stability changes upon mutation and sequence optimality. BMC Bioinforma 12:151
Elsas LJ, Lai K, Saunders CJ, Langley SD (2001) Functional analysis of the human galactose-1-phosphate uridyltransferase promoter in Duarte and LA variant galactosemia. Mol Genet Metab 72(4):297–305
Fairbrother WG, Yeh RF, Sharp PA, Burge CB (2002) Predictive identification of exonic splicing enhancers in human genes. Science 297(5583):1007–1013
Flanagan JM, McMahon G, Brendan Chia SH et al (2010) The role of human demographic history in determining the distribution and frequency of transferase-deficient galactosaemia mutations. Heredity (Edinb) 104(2):148–154
Fridovich-Keil JL (2006) Galactosemia: the good, the bad, and the unknown. J Cell Physiol 209(3):701–705
Fridovich-Keil J, Walter JH (2008) Galactosemia. In: Valle D, Beaudet AL, Vogelstein B et al (eds) The online metabolic and molecular bases of inherited diseases (OMMBID). McGraw Hill, New York, pp 1–108
Fridovich-Keil JL, Langley SD, Mazur LA, Lennon JC, Dembure PP, Elsas JL 2nd (1995) Identification and functional analysis of three distinct mutations in the human galactose-1-phosphate uridyltransferase gene associated with galactosemia in a single family. Am J Hum Genet 56(3):640–646
Geeganage S, Frey PA (1998) Transient kinetics of formation and reaction of the uridylyl-enzyme form of galactose-1-P uridylyltransferase and its Q168R-variant: insight into the molecular basis of galactosemia. Biochemistry 37(41):14500–14507
Gitzelmann R (1969) Estimation of galactose-I-phosphate in erythrocytes: a rapid and simple enzymatic method. Clin Chim Acta 26(2):313–316
Gonnelli G, Rooman M, Dehouck Y (2012) Structure-based mutant stability predictions on proteins of unknown structure. J Biotechnol 161(3):287–293
Gort L, Boleda MD, Tyfield L et al (2006) Mutational spectrum of classical galactosaemia in Spain and Portugal. J Inherit Metab Dis 29(6):739–742
Gort L, Quintana E, Moliner S, Gonzalez-Quereda L, Lopez-Hernandez T, Briones P (2009) An update on the molecular analysis of classical galactosaemia patients diagnosed in Spain and Portugal: 7 new mutations in 17 new families. Med Clin (Barc) 132(18):709–711
Gregoret LM, Rader SD, Fletterick RJ, Cohen FE (1991) Hydrogen bonds involving sulfur atoms in proteins. Proteins 9(2):99–107
Hebsgaard SM, Korning PG, Tolstrup N, Engelbrecht J, Rouze P, Brunak S (1996) Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res 24(17):3439–3452
Henderson H, Leisegang F, Brown R, Eley B (2002) The clinical and molecular spectrum of galactosemia in patients from the Cape Town region of South Africa. BMC Pediatr 2:7
Kiefer F, Arnold K, Kunzli M, Bordoli L, Schwede T (2009) The SWISS-MODEL repository and associated resources. Nucleic Acids Res 37(Database issue):D387–D392

Lai K, Langley SD, Singh RH, Dembure PP, Hjelm LN, Elsas LJ 2nd (1996) A prevalent mutation for galactosemia among black Americans. J Pediatr 128(1):89–95
Leslie ND, Immerman EB, Flach JE, Florez M, Fridovich-Keil JL, Elsas LJ (1992) The human galactose-1-phosphate uridyltransferase gene. Genomics 14(2):474–480
Lindhout M, Rubio-Gozalbo ME, Bakker JA, Bierau J (2010) Direct non-radioactive assay of galactose-1-phosphate:uridyltransferase activity using high performance liquid chromatography. Clin Chim Acta 411(13–14):980–983
Lukac-Bajalo J, Kuzelicki NK, Zitnik IP, Mencej S, Battelino T (2007) Higher frequency of the galactose-1-phosphate uridyl transferase gene K285N mutation in the Slovenian population. Clin Biochem 40(5–6):414–415
Marabotti A, Facchiano AM (2005) Homology modeling studies on human galactose-1-phosphate uridylyltransferase and on its galactosemia-related mutant Q188R provide an explanation of molecular effects of the mutation on homo- and heterodimers. J Med Chem 48(3):773–779
McCorvie TJ, Timson DJ (2011) The structural and molecular biology of type I galactosemia: enzymology of galactose 1-phosphate uridylyltransferase. IUBMB Life 63(9):694–700
Parthiban V, Gromiha MM, Schomburg D (2006) CUPSAT: prediction of protein stability upon point mutations. Nucleic Acids Res 34(Web Server issue):W239–W242
Parthiban V, Gromiha MM, Abhinandan M, Schomburg D (2007) Computational modeling of protein mutant stability: analysis and optimization of statistical potentials and structural features reveal insights into prediction model development. BMC Struct Biol 7:54
Perez B, Desviat LR, Ugarte M (1997) Analysis of the phenylalanine hydroxylase gene in the Spanish population: mutation profile and association with intragenic polymorphic markers. Am J Hum Genet 60(1):95–102
Quimby BB, Wells L, Wilkinson KD, Fridovich-Keil JL (1996) Functional requirements of the active site position 185 in the human enzyme galactose-1-phosphate uridylyltransferase. J Biol Chem 271(43):26835–26842
Reese MG, Eeckman FH, Kulp D, Haussler D (1997) Improved splice site detection in Genie. J Comput Biol 4(3):311–323
Reichardt JK, Berg P (1988) Cloning and characterization of a cDNA encoding human galactose-1-phosphate uridyl transferase. Mol Biol Med 5(2):107–122
Reichardt JK, Woo SL (1991) Molecular basis of galactosemia: mutations and polymorphisms in the gene encoding human galactose-1-phosphate uridylyltransferase. Proc Natl Acad Sci U S A 88(7):2633–2637
Riehman K, Crews C, Fridovich-Keil JL (2001) Relationship between genotype, activity, and galactose sensitivity in yeast expressing patient alleles of human galactose-1-phosphate uridylyltransferase. J Biol Chem 276(14):10634–10640
Rivera I, Leandro P, Lichter-Konecki U, Tavares de Almeida I, Lechner MC (1998) Population genetics of hyperphenylalaninaemia resulting from phenylalanine hydroxylase deficiency in Portugal. J Med Genet 35(4):301–304
Shield JP, Wadsworth EJ, MacDonald A et al (2000) The relationship of genotype to cognitive outcome in galactosaemia. Arch Dis Child 83(3):248–250
Smith PJ, Zhang C, Wang J, Chew SL, Zhang MQ, Krainer AR (2006) An increased specificity score matrix for the prediction of SF2/ASF-specific exonic splicing enhancers. Hum Mol Genet 15(16):2490–2508
Sommer M, Gathof BS, Podskarbi T, Giugliani R, Kleinlein B, Shin YS (1995) Mutations in the galactose-1-phosphate uridyltransferase gene of two families with mild galactosaemia variants. J Inherit Metab Dis 18(5):567–576
Suzuki M, West C, Beutler E (2001) Large-scale molecular screening for galactosemia alleles in a pan-ethnic population. Hum Genet 109(2):210–215
Thoden JB, Ruzicka FJ, Frey PA, Rayment I, Holden HM (1997) Structural analysis of the H166G site-directed mutant of galactose-1-phosphate uridylyltransferase complexed with either UDP-glucose or UDP-galactose: detailed description of the nucleotide sugar binding site. Biochemistry 36(6):1212–1222
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25(24):4876–4882
Tyfield LA (2000) Galactosaemia and allelic variation at the galactose-1-phosphate uridyltransferase gene: a complex relationship between genotype and phenotype. Eur J Pediatr 159(Suppl 3):S204–S207
Tyfield L, Reichardt J, Fridovich-Keil J et al (1999) Classical galactosemia and mutations at the galactose-1-phosphate uridyl transferase (GALT) gene. Hum Mutat 13(6):417–430
Waisbren SE, Potter NL, Gordon CM et al (2012) The adult galactosemic phenotype. J Inherit Metab Dis 35(2):279–286
Wells L, Fridovich-Keil JL (1997) Biochemical characterization of the S135L allele of galactose-1-phosphate uridylyltransferase associated with galactosaemia. J Inherit Metab Dis 20(5):633–642
Woodcock J (2007) The prospects for “personalized medicine” in drug development and drug therapy. Clin Pharmacol Ther 81(2):164–169
Worth CL, Bickerton GR, Schreyer A et al (2007) A structural bioinformatics approach to the analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) and their relation to disease. J Bioinforma Comput Biol 5(6):1297–1318