[Cell Cycle 4:8, 1026-1029; August 2005]; ©2005 Landes Bioscience

Extra View

Functional Diversity in the Gene Network Controlled by the Master Regulator p53 in Humans

ABSTRACT

Received 06/04/05; Accepted 06/07/05

Biological systems require the coordination of vast numbers of gene functions to accomplish the many tasks needed to grow, develop and survive as single cells or complex organisms. Over millions of years, organisms have evolved intricate networks of response genes to minimize the biological consequences of changing environmental conditions and stresses such as infection or DNA damage. The coordinated control of genes within a network is accomplished by signal transduction pathways that sense and interpret specific changes in the microenvironment and modulate the activity of sequence-specific transcription factors known as “master regulators”.1-5 Master regulators assure an orderly activation, modulation or repression of genes so that the proper timing and integration of cellular events can take place during development or in response to diverse environmental changes and stresses such as infection or DNA damage. An essential feature of master regulators is the ability to discriminate and bind sequences known as response elements (REs) within the genome in order to control transcription of the target genes. The REs of the individual gene targets are not identical, but are variants of consensus sequences that are typically 6 to 20 bases in length.5-7 They are usually identified through in vitro protein/DNA binding studies and sequence analysis of bound fragments. Several bioinformatics tools have been developed to identify potential transcription factor binding sites within genomes7-10 based on lists of consensus sequences. Variation in RE sequences among different target genes can alter transcriptional control and contribute to transactivation specificity in response to stress signals. As a result, individual variations in a specific RE sequence could result in biological diversity within a population if they lead to differential transactivation of target genes. 
In a general sense, genetic variation in environmental response genes in a population is likely to increase the opportunities for species survival in the face of changing environmental insults. Loss of function or increased activity of some response genes, as mediated by variant REs, might become advantageous under certain conditions. One source of this genetic diversity could be DNA polymorphisms such as deletions, insertions and single nucleotide polymorphisms (SNPs) in gene regulatory sequences. We recently addressed potential human diversity in the large stress response network controlled by p53, a master regulator that directly controls the expression of over 100 genes.11 In response to many types of DNA damage and environmental stresses, p53 protein levels increase greatly, resulting in changes in expression of the genes that it targets.12-14 While p53 is dispensable for development, mutations in this gene are associated with nearly half of all cancers. p53 mutations often lead to loss of control of the entire network of

Previously published online as a Cell Cycle E-publication: http://www.landesbioscience.com/journals/cc/abstract.php?id=1904

*Correspondence to: Michael A Resnick; NIEHS; 111 Alexander Drive; Research Triangle Park, North Carolina 27709 USA; Tel.: 919.541.4480; Fax: 919.541.7593; Email: [email protected] / Douglas Bell; NIEHS; 111 Alexander Drive; Research Triangle Park, North Carolina 27709 USA; Tel.: 919.541.7686; Fax: 919.541.7593; Email: [email protected]

1Laboratory of Molecular Genetics; National Institute of Environmental Health Sciences; NIH; Research Triangle Park, North Carolina USA

2Laboratory of Experimental Oncology B; National Institute for Cancer Research; Genoa, Italy

KEY WORDS

transcription, mutation, SNP, master regulator, p53

Individual differences in susceptibility to exposure-induced diseases are likely due to variation in the DNA sequences of “environmental response” genes, many of which are arranged in complex regulatory networks. Among ~10 million inherited DNA variations, called single nucleotide polymorphisms (SNPs), perhaps only a few thousand will actually influence human disease risk. We have combined bioinformatics and laboratory approaches to investigate genetic variation within the p53 stress response network. p53, a prominent tumor suppressor protein, is a master regulator that targets over a hundred genes for transcriptional upregulation or repression through sequence-specific interactions with DNA response elements (REs). We identified many human genes in the network that contain SNPs in REs that can be transactivated by p53. The discovery of these individual differences has implications for variation in human responses to environmental stresses, risk of disease, and responsiveness to drug therapies. The findings also provide insight into the evolution of complex networks and the role of master regulatory genes, such as p53, in such networks.

Michael A. Resnick1,* Dan Tomso1 Alberto Inga1,2 Daniel Menendez1 Douglas Bell1,*

p53-regulated genes or to changes in the spectra of genes controlled by p53.12-14 We asked whether SNPs exist in the REs targeted by p53 and whether they could result in variation in the regulation of corresponding target genes as described in Figure 1. If so, this would suggest related variation in individual abilities to cope with environmental stresses.

Figure 1. Impact of a single nucleotide polymorphism (SNP) in a p53 response element (RE) on transcription. The presence of the SNP in the promoter region of a target gene can affect how p53 mediates its expression. In this example, the C/G SNP had been identified in the promoter region of the ADAR2 gene. The SNP appears in the second half-site. The G allele is a perfect match with the consensus sequence for a p53 half-site, resulting in efficient transactivation of the ADAR2 gene. The C allele is not recognized well, resulting in less efficient expression of the target gene.

SNPs influence biological activity through their impact on gene activity and can affect either regulation or function. For example, SNPs have been identified in the N-acetyltransferase gene that alter the amino acid sequence of the protein. As a result, about 50% of the U.S. population exhibits a slow acetylation phenotype, which in turn alters the metabolism of carcinogenic aromatic amines and results in elevated risk of smoking-inducible bladder cancer.15 [As part of an initiative known as the Environmental Genome Project, the NIEHS has undertaken a large resequencing effort to identify SNPs that might alter genes important in a variety of environmental responses (http://egp.gs.washington.edu/).] SNPs in gene regulatory sequences that affect gene expression levels are an important but relatively unexplored class of genetic variation. There are, however, examples of SNPs in regulatory regions causing either complete elimination of the natural transcription factor binding site16-18 or formation of a novel site.19,20 Until our recent report,11 there had been no studies that attempted to systematically identify response element SNPs in a large regulatory network and then to evaluate function in vivo. Although RE SNPs in regulatory networks would be expected to have important biological consequences, there is a general problem in identifying such regulatory SNPs. This is because RE consensus sequences are often not well-defined and functional REs can differ considerably from consensus. As a result, it has been hard to predict the impact, if any, that a particular SNP will have on transactivation. This point is exemplified by the reported 20 base consensus of p53: RRRCWWGYYY (n) RRRCWWGYYY, where R = purine, Y = pyrimidine, W = A or T, and “n” is proposed to be a spacer that can be 0 to 13 bases.21

The challenge in identifying biologically relevant SNPs among p53-targeted REs was to (1) develop rules that would identify the sequences that are likely to function as REs for p53; (2) establish the position of candidate REs relative to RNA polymerase start sites and choose sites that could modulate p53-regulated expression; (3) develop computational approaches that would characterize SNPs in potential REs and identify those in which one allele would be expected to be stronger than the other for p53 transactivation; and (4) develop approaches to assess functional biological differences for the pairs of alleles. We initially considered the possibility that there might be SNPs in the ~40 p53 REs previously characterized in the human genome. Our search determined that was not the case; however, sparse resequencing data for upstream regulatory regions of genes may limit our ability to identify candidates in some of these areas.

We pursued the generally-held view that there may be hundreds of targeted genes within the p53 network.12 Corresponding REs and possible SNPs were identified initially through computational genomic analyses, and RE function was subsequently assessed by several laboratory methods.14 The rules we developed for identifying potential p53 REs were based on a combination of published studies, including our own. We previously examined p53 transactivation from REs using a model system that we created in the budding yeast Saccharomyces cerevisiae. The system was based in part on earlier reports22,23 but included a tightly-regulated promoter to control expression of human p53.

Using this precise yeast model, we investigated the ability of p53 to transactivate from over 50 different REs linked to a reporter gene.4,6 [While yeast has no p53, the transcriptional machinery can support transactivation by p53, as well as several other mammalian transcription factors.]24 This led to an improved consensus sequence model, with a substantially narrowed definition of sites that support transactivation. For example, the spacer between half-sites must be two or fewer bases in order to mediate efficient transactivation (this is also the case in human cells using a comparable reporter construct) (AI, DM and MAR, unpublished). The ‘C’ and ‘G’ positions in the core consensus are strictly required, and the ‘WW’ should be ‘AT’ for high activity, while ‘AA’ and ‘TT’ result in lower functionality and ‘TA’ inhibits function. Thus, while nearly all p53 REs have mismatches from consensus, SNPs at some positions (e.g., ‘C’, ‘G’, or ‘AT’) would be expected to markedly affect transactivation. Very few REs are located more than 5000 bases from a polymerase start site, and we excluded potential REs that fell outside of this proximity window. It is possible that this approach excluded bona fide REs that control expression of uncharacterized transcriptional targets, including unidentified genes or regulatory RNAs. Such well-defined RE sites in gene-poor regions of the genome represent an intriguing and potentially important area for further investigation. High-throughput computational strategies for identifying genomic sequences of interest must incorporate these and other emerging considerations. The ‘top-down’ approach we describe for screening SNPs is an important example of genome-scale analysis, and further work will certainly refine and improve the strategies described here.
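The rules described in the text (a half-site of the form RRRCWWGYYY, strictly required core ‘C’ and ‘G’, a ranking of the central ‘WW’ dinucleotide, and a spacer of two or fewer bases) can be illustrated with a minimal Python sketch. This is not the authors' scoring method: the function names, the numeric weights, and the example sequence are all hypothetical and chosen only to show how two alleles of an RE SNP might be ranked under these stated assumptions.

```python
# Degenerate codes as defined in the text: R = purine, Y = pyrimidine, W = A or T.
CODES = {"R": "AG", "Y": "CT", "W": "AT", "C": "C", "G": "G"}
HALF_SITE = "RRRCWWGYYY"

# Toy weights for the central WW dinucleotide. The numbers are hypothetical;
# only the ordering AT > AA = TT > TA is taken from the text.
WW_WEIGHT = {"AT": 3, "AA": 2, "TT": 2, "TA": 0}


def score_half_site(half):
    """Return a toy strength score for one 10-base half-site (0 = inactive)."""
    half = half.upper()
    if len(half) != len(HALF_SITE):
        raise ValueError("a half-site must be 10 bases long")
    # The core C (4th base) and G (7th base) are treated as strictly required.
    if half[3] != "C" or half[6] != "G":
        return 0
    # Assumption: any dinucleotide not listed in WW_WEIGHT scores 0.
    ww = WW_WEIGHT.get(half[4:6], 0)
    if ww == 0:
        return 0
    # Count matches at the flanking purine (1-3) and pyrimidine (8-10) positions.
    flank = sum(half[i] in CODES[HALF_SITE[i]] for i in (0, 1, 2, 7, 8, 9))
    return ww * 10 + flank


def score_re(seq, spacer=0, max_spacer=2):
    """Score a full RE: two half-sites separated by `spacer` bases."""
    if spacer > max_spacer:
        return 0  # spacers longer than two bases are treated as inactive
    if len(seq) != 2 * len(HALF_SITE) + spacer:
        raise ValueError("sequence length does not match two half-sites plus spacer")
    first, second = seq[:10], seq[10 + spacer:]
    return score_half_site(first) + score_half_site(second)


def compare_alleles(seq, snp_index, alleles, spacer=0):
    """Score each allele of a SNP at 0-based position `snp_index` within the RE."""
    return {
        allele: score_re(seq[:snp_index] + allele + seq[snp_index + 1:], spacer)
        for allele in alleles
    }


# Hypothetical 20-base RE (not the real ADAR2 sequence) with a C/G SNP at the
# core G of the second half-site, loosely mirroring the situation in Figure 1.
REF_RE = "GGGCATGTCC" + "AGACATGCCT"
print(compare_alleles(REF_RE, 16, "GC"))  # {'G': 72, 'C': 36}
```

In this toy example the G allele keeps both half-sites intact, whereas the C allele disrupts the required core ‘G’ of the second half-site, so only the first half-site contributes to the score. Summing the half-site scores, rather than zeroing the whole RE, is a design choice meant to reflect the statement that the weaker allele gives less efficient, not absent, expression.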

Among the ~2 million SNPs available at the start of our study, we identified several hundred that might be expected to mediate differential p53 transactivation from the associated genes. A small sample of the SNPs was characterized further for their actual effects on transcription. In every case, the predicted stronger allele of the SNP pair led to greater transactivation from the associated reporter when examined in either yeast or human cells. Functional SNPs were characterized in the promoters of the following genes: ADAR2, a member of a family of adenosine deaminases acting on RNA; SCGB1D2 a lipophilin protein that is a marker for breast cancer; SEI-1 an antagonist to p16 function in the G1 checkpoint; EOMES (Eomesodermin), a T-box transcription factor that plays an important role in T-cell differentiation; DCC (Deleted in Colorectal Carcinoma), a proposed tumor suppressor gene that also acts as a transmembrane receptor for netrin-1 in the nervous system; ARHGEF7 that may play a role in tumor invasion by regulating changes in the cytoskeleton of cancer cells; and TLR8, a toll-like receptor involved in innate immune response. Using our approach, we established the principle that SNPs in REs may introduce considerable individual variation in the network controlled by the p53 master regulator. Our recent search of ~10 million SNPs in the dbSNP database suggests that there are many additional candidate genes with polymorphic REs that may differentially affect regulation by p53 (unpublished data). In addition to RE sequence and transactivation capacity, other simple characteristics may play a role in governing the relative importance of any pair of SNP alleles. For example, impact may depend in part on the expression level of the associated gene. p53 induction may have a much greater relative effect on a gene with a low basal level of expression (i.e., in the absence of p53 induction) compared to a gene with high basal expression levels. 
Conversely, highly-expressed genes may play prominent roles or be required in high copy numbers and thus be particularly sensitive to subtle variation in expression levels. Evaluating these and other biological characteristics is an area of active follow-up research.

Differences in responses for individual genes in the p53 network can potentially lead to a large variety of phenotypes. Theoretically, just one pair of SNPs in a network might yield two distinctly different networks. For 10 SNP pairs with distinct response phenotypes, the number of different p53 networks could be as high as ~1000 (2^10 = 1024). We have previously described the p53 network using a piano metaphor4 in which the “hand” (p53) that plays the “keys” (the genes) results in a “chord” (the phenotype). Changes in the intensities with which the keys are struck (the level of transactivation) would lead to different sounding chords or phenotypes. Thus, a small number of SNPs can yield many different sounding chords.

Our findings for SNPs in the p53 network14 suggest that there may be considerable variation in the ability of cells to deal with the many internal and external environmental stresses that humans experience. Since p53 is a tumor suppressor, the SNPs may impact individual susceptibility or prognosis for cancer or other diseases. For example, the SEI-1 gene is important in the G1 checkpoint, and SNP-mediated differences in p53 control might result in altered cell cycle arrest following DNA damage. In addition, since many tumor-associated p53 mutations exhibit altered sequence-specific transactivation patterns25,26 (see the IARC database at www-p53.iarc.fr/index.html and the Universal Mutation Database at www.umd.be:2072/W_TP53), there may be differences between individuals in the effects of specific p53 mutations. In support of this idea, we have recently observed that mutant p53 proteins in human cells can differentially influence target gene transactivation, resulting in a variety of biological consequences which, in turn, might be expected to influence tumor development and therapeutic efficacy (MAR, DM and AI, unpublished data). SNPs at REs could add an additional level of diversity to this highly variable response, leading to complex individual responses to single p53 mutations.

The overall approach that we have taken to identify and characterize SNPs in p53 REs can be applied to many master regulatory gene systems using a combination of computational analysis and direct functional evaluation. Yeast-based systems have recently been developed to investigate transactivation from various REs that are targets for human master regulators, including NKX2-5, a homeodomain protein associated with heart development.27 Like the p53 network, the NF-κB network is also very large. A functional SNP has been identified in the promoter region28 and the impact of RE sequence variation has been elegantly established with two REs using an in vivo functional assay.29 While in vitro binding has been used to address the ability of NF-κB to interact with many possible REs and SNP variants,30 it will be interesting to assess their ability to support transactivation in vivo.

On a larger scale, variation in regulatory sequences might provide the opportunity to rapidly evolve gene networks. Because all the components for transcriptional activation or suppression, including the master regulator, are already present, and because there is considerable flexibility in sequence specificity, response element variation could rapidly fine-tune expression of important target genes via single-base mutations. Inclusion of new genes into existing networks, or exclusion of existing genes from the same, is also possible and would provide opportunity for rapid evolutionary change. Our ongoing studies provide intriguing suggestions that this type of evolutionary process is at work within the p53 network (unpublished).
An interesting example along this line comes from studies of pharynx development in Caenorhabditis elegans.31 A set of diverse genes was found to be controlled by a single master regulator. The timing of expression in organ development was shown to correlate with the binding strength of the upstream REs.

References

1. Hanahan D, Weinberg RA. The hallmarks of cancer. Cell 2000; 100:57-70.
2. Darnell Jr JE. Transcription factors as targets for cancer therapy. Nat Rev Cancer 2002; 2:740-9.
3. Futcher B. Transcriptional regulatory networks and the yeast cell cycle. Curr Opin Cell Biol 2002; 14:676-83.
4. Resnick MA, Inga A. Functional mutants of the sequence-specific transcription factor p53 and implications for master genes of diversity. Proc Natl Acad Sci USA 2003; 100:9934-9.
5. Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA. Structure and evolution of transcriptional regulatory networks. Curr Opin Struct Biol 2004; 14:283-91.
6. Inga A, Storici F, Darden TA, Resnick MA. Differential transactivation by the p53 transcription factor is highly dependent on p53 level and promoter target sequence. Mol Cell Biol 2002; 22:8612-25.
7. Kim JT, Gewehr JE, Martinetz T. Binding matrix: A novel approach for binding site recognition. J Bioinform Comput Biol 2004; 2:289-307.
8. Long F, Liu H, Hahn C, Sumazin P, Zhang MQ, Zilberstein A. Genome-wide prediction and analysis of function-specific transcription factor binding sites. In Silico Biol 2004; 4:395-410.
9. Wingender E. TRANSFAC, TRANSPATH and CYTOMER as starting points for an ontology of regulatory networks. In Silico Biol 2004; 4:55-61.
10. Marinescu VD, Kohane IS, Riva A. MAPPER: A search engine for the computational identification of putative transcription factor binding sites in multiple genomes. BMC Bioinformatics 2005; 6:79.
11. Tomso DJ, Inga A, Menendez D, Pittman GS, Campbell MR, Storici F, Bell DA, Resnick MA. Functionally distinct polymorphic sequences in the human genome that are targets for p53 transactivation. Proc Natl Acad Sci USA 2005; 102:6431-6.
12. Vogelstein B, Lane D, Levine AJ. Surfing the p53 network. Nature 2000; 408:307-10.
13. Meek DW. The p53 response to DNA damage. DNA Repair (Amst) 2004; 3:1049-56.
14. Harris SL, Levine AJ. The p53 pathway: Positive and negative feedback loops. Oncogene 2005; 24:2899-908.

15. Marcus PM, Hayes RB, Vineis P, Garcia-Closas M, Caporaso NE, Autrup H, Branch RA, Brockmoller J, Ishizaki T, Karakaya AE, Ladero JM, Mommsen S, Okkels H, Romkes M, Roots I, Rothman N. Cigarette smoking, N-acetyltransferase 2 acetylation status, and bladder cancer risk: A case-series meta-analysis of a gene-environment interaction. Cancer Epidemiol Biomarkers Prev 2000; 9:461-7.
16. Boccia LM, Lillicrap D, Newcombe K, Mueller CR. Binding of the Ets factor GA-binding protein to an upstream site in the factor IX promoter is a critical event in transactivation. Mol Cell Biol 1996; 16:1929-35.
17. Vasiliev GV, Merkulov VM, Kobzev VF, Merkulova TI, Ponomarenko MP, Kolchanov NA. Point mutations within 663-666 bp of intron 6 of the human TDO2 gene, associated with a number of psychiatric disorders, damage the YY-1 transcription factor binding site. FEBS Lett 1999; 462:85-8.
18. Liu X, Campbell MR, Pittman GS, Faulkner EC, Watson MA, Bell DA. Expression-based discovery of variation in the human glutathione S-transferase M3 promoter and functional analysis in a glioma cell line using allele-specific chromatin immunoprecipitation. Cancer Res 2005; 65:99-104.
19. Piedrafita FJ, Molander RB, Vansant G, Orlova EA, Pfahl M, Reynolds WF. An Alu element in the myeloperoxidase promoter contains a composite SP1-thyroid hormone-retinoic acid response element. J Biol Chem 1996; 271:14412-20.
20. Knight JC, Udalova I, Hill AV, Greenwood BM, Peshu N, Marsh K, Kwiatkowski D. A polymorphism that affects OCT-1 binding to the TNF promoter region is associated with severe malaria. Nat Genet 1999; 22:145-50.
21. el-Deiry WS, Kern SE, Pietenpol JA, Kinzler KW, Vogelstein B. Definition of a consensus binding site for p53. Nat Genet 1992; 1:45-9.
22. Scharer E, Iggo R. Mammalian p53 can function as a transcription factor in yeast. Nucleic Acids Res 1992; 20:1539-45.
23. Flaman JM, Frebourg T, Moreau V, Charbonnier F, Martin C, Chappuis P, Sappino P, Limacher JM, Bron L, Benhattar J, Tada M, Van Meir EG, Estreicher A, Iggo RD. A simple p53 functional assay for screening cell lines, blood, and tumors. Proc Natl Acad Sci USA 1995; 92:3963-7.
24. Kennedy BK. Mammalian transcription factors in yeast: Strangers in a familiar land. Nat Rev Mol Cell Biol 2002; 3:41-9.
25. Olivier M, Eeles R, Hollstein M, Khan MA, Harris CC, Hainaut P. The IARC TP53 database: New online mutation analysis and recommendations to users. Hum Mutat 2002; 19:607-14.
26. Soussi T, Kato S, Levy PP, Ishioka C. Reassessment of the TP53 mutation database in human disease by data mining with a library of TP53 missense mutations. Hum Mutat 2005; 25:6-17.
27. Inga A, Reamon-Buettner SM, Borlak J, Resnick MA. Functional dissection of sequence-specific NKX2-5 DNA binding domain mutations associated with human heart septation defects using a yeast-based system. Hum Mol Genet 2005; in press.
28. Udalova IA, Richardson A, Denys A, Smith C, Ackerman H, Foxwell B, Kwiatkowski D. Functional consequences of a polymorphism affecting NF-kappaB p50-p50 binding to the TNF promoter region. Mol Cell Biol 2000; 20:9113-9.
29. Leung TH, Hoffmann A, Baltimore D. One nucleotide in a kappaB site can determine cofactor specificity for NF-kappaB dimers. Cell 2004; 118:453-64.
30. Udalova IA, Mott R, Field D, Kwiatkowski D. Quantitative prediction of NF-kappa B DNA-protein interactions. Proc Natl Acad Sci USA 2002; 99:8167-72.
31. Gaudet J, Mango SE. Regulation of organogenesis by the Caenorhabditis elegans FoxA protein PHA-4. Science 2002; 295:821-5.
