Downloaded from rstb.royalsocietypublishing.org on November 12, 2013

Deep conservation of cis-regulatory elements in metazoans Ignacio Maeso, Manuel Irimia, Juan J. Tena, Fernando Casares and José Luis Gómez-Skarmeta Phil. Trans. R. Soc. B 2013 368, 20130020, published 11 November 2013

References

This article cites 122 articles, 40 of which can be accessed free

http://rstb.royalsocietypublishing.org/content/368/1632/20130020.full.html#ref-list-1

Subject collections

Articles on similar topics can be found in the following collections

developmental biology (135 articles) evolution (657 articles) genomics (69 articles)


Deep conservation of cis-regulatory elements in metazoans

Ignacio Maeso1,†, Manuel Irimia2,†, Juan J. Tena3, Fernando Casares3 and José Luis Gómez-Skarmeta3

rstb.royalsocietypublishing.org

Review

Cite this article: Maeso I, Irimia M, Tena JJ, Casares F, Gómez-Skarmeta JL. 2013 Deep conservation of cis-regulatory elements in metazoans. Phil Trans R Soc B 368: 20130020. http://dx.doi.org/10.1098/rstb.2013.0020

One contribution of 12 to a Theme Issue ‘Molecular and functional evolution of transcriptional enhancers in animals’.

Subject Areas: developmental biology, evolution, genomics

Keywords: cis-regulatory element, gene regulatory networks, evolution, development

Author for correspondence: José Luis Gómez-Skarmeta e-mail: [email protected]



† These authors contributed equally to this article.

1 Department of Zoology, University of Oxford, Oxford, UK

2 The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada

3 Centro Andaluz de Biología del Desarrollo (CABD), Consejo Superior de Investigaciones Científicas/Universidad Pablo de Olavide, Sevilla, Spain

Despite the vast morphological variation observed across phyla, animals share multiple basic developmental processes orchestrated by a common ancestral gene toolkit. These genes interact with each other building complex gene regulatory networks (GRNs), which are encoded in the genome by cis-regulatory elements (CREs) that serve as computational units of the network. Although GRN subcircuits involved in ancient developmental processes are expected to be at least partially conserved, identification of CREs that are conserved across phyla has remained elusive. Here, we review recent studies that revealed such deeply conserved CREs do exist, discuss the difficulties associated with their identification and describe new approaches that will facilitate this search.

1. Deeply conserved metazoan features

Animals greatly exemplify how enormous morphological diversity can arise through evolution. However, despite their profound variation in forms, there are clear commonalities in how animal body plans are built. The early formation of distinct germ layers [1–4], the establishment of body axes [5–9], the differentiation of photoreceptors [10] or the deployment of similar neuronal architectures [11] are examples attesting to deep homologies at the tissue and cellular level. Accordingly, whole genome sequencing projects and gene family evolution studies revealed that all animals share a basic set of genes [12–16], establishing a link between this basic shared genetic toolkit and the core of common structures, cell types and developmental processes. Studies focusing on the structure and evolution of developmental gene regulatory networks (GRNs) have further suggested that regulatory gene interactions can also be extremely stable through long periods of evolutionary history [17], which is mirrored by a large collection of deeply conserved gene expression patterns during embryonic development of many animal phyla. These range from the paradigmatic anterior–posterior Hox patterning [18] to commonalities on nervous system regionalization [19–23] and cell-type-specific gene expression patterns [11]. However, in stark contrast to the deep conservation of the protein-coding sequences and expression patterns of the genes involved in common developmental programmes, comparisons of whole genome sequences across metazoans showed that the cis-regulatory elements (CREs) responsible for the expression of these genes do not seem to be conserved among phyla. With a few recently reported exceptions (see below), all highly conserved regulatory modules found to date seem strictly circumscribed to specific taxonomic groups within phyla [24].
Nonetheless, confident identification of conserved CREs is much more complex than that of protein-coding regions, since the structure of the regulatory regions itself—normally consisting of a set of semi-independent short transcription factor binding sites (TFBSs) that can be located at almost unlimited distances from their target genes—may preclude their identification at deep evolutionary distances. Here, we review studies searching for deeply conserved CREs and discuss whether their results (i.e. widespread lack of deep conservation of CREs) are

© 2013 The Author(s) Published by the Royal Society. All rights reserved.


With the advent of whole genome sequencing projects for many vertebrate species, proper genome-wide comparisons and phylogenetic footprinting of non-coding regions became readily available (table 1). These comparisons revealed the presence of thousands of highly conserved non-coding regions (HCNRs) among vertebrate genomes, sometimes with levels of sequence conservation higher than those of protein-coding regions [25,26,37–39]. The vast majority of these HCNRs flank genes that play key roles in embryonic development and have very complex expression patterns [26,39]. Importantly, functional studies using transgenic assays in mouse, Xenopus and zebrafish carried out by various groups, including ours, showed that a large fraction of the analysed HCNRs act as transcriptional enhancers, driving expression of their flanking developmental genes to spatio-temporal domains consistent with their endogenous expression patterns [25,26,40–53]. Moreover, similar patterns of HCNRs around key developmental regulators were subsequently observed for flies and nematodes [34,35,54–56], and their frequent role in regulating transcription could also be established. After the success of the initial studies, identification of HCNRs thus became the gold standard for the search for functional conserved CREs across model organisms in the past decade. Within mammals, around 30 000 HCNRs were identified [57,58], and over 8000 HCNRs were traced back to the origin of gnathostome vertebrates [27,59]. A set of approximately 2000– 5000 of these HCNRs are currently present in most jawed vertebrates [27] and were thus suggested to contain much of the regulatory information essential to build the conserved vertebrate body plan [24,28], providing a unifying view of conserved embryonic development and body plan, gene toolkit and gene expression and its underlying regulatory circuitry. However, a major challenge to this view came from the sequencing of the slow-evolving invertebrate chordate amphioxus. 
Compared to over 5000 HCNRs present in gnathostome ancestors, only around 50 non-coding regions showed recognizable homology by sequence similarity between the amphioxus and human genome [32], despite extensive body plan and developmental similarities within the chordate phylum [60,61]. Recently, we performed a wide survey for conservation of these HCNRs in multiple metazoan lineages using local alignments of these sequences to conserved syntenic regions combined with classic phylogenetic footprinting. We identified eight deeply conserved HCNRs, two of them already present in the eumetazoan ancestors, and six of them in the last common ancestor of deuterostomes [36]. Functional studies of these sequences in zebrafish, sea urchin and Drosophila revealed that they promote expression in similar tissues (figure 1), as would be expected for homologous CREs read by deeply conserved regulatory states [36]. In a similar

3. Challenges of in silico detection of transphyletic cis-regulatory elements based on sequence similarity

The shorter length and lower similarity of the HCNRs containing the few described transphyletic CREs place them on the edge of the cut-offs that have been traditionally used to assess homology based on sequence alignment scores [62]. Therefore, this approach is likely to lead to a much higher rate of false negatives when looking for conservation of CREs between extremely divergent lineages, amplifying the effects of common confounding factors of phylogenetic footprinting, such as species selection [62]. For instance, in the recently discovered bilaterian CREs in protostome genomes, the sequences from amphioxus and zebrafish were the only deuterostome queries that were able to recover their protostome orthologues. In fact, in all cases but two, amphioxus transphyletic CREs displayed the highest degree of similarity with their orthologues in other phyla, highlighting the importance of the inclusion of slow-evolving taxa [30]. Along the same lines, Taher and collaborators [63] have recently shown the utility of applying conservation ‘tunnelling’ to identify divergent orthologous CNRs. This method consists of a series of iterative pairwise alignments among sequences from multiple species diverged at different times and under distinct evolutionary rates. By ‘tunnelling’ human–zebrafish comparisons through the inclusion of frog sequences, the authors were able to identify 1500 pairs of sequences with no clear similarity between human and zebrafish, but that were clearly orthologous (more than 70% over more than 100 bp) to the frog counterparts. Unfortunately, these difficulties are probably not solvable by simply relaxing the cut-offs of length and sequence similarity used to define conserved elements: the rate of false positives would increase dramatically and, owing to the specific nature of CREs, other detection artefacts may appear.
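As a rough illustration of the ‘tunnelling’ logic described above, the sketch below takes precomputed sets of pairwise hits and reports human–zebrafish pairs that are connected only through a frog sequence. It is a minimal sketch, not the implementation of Taher and collaborators [63]: the function name `tunnel`, the element identifiers and the representation of alignment hits as sets of pairs are all assumptions made for illustration.

```python
def tunnel(direct, first_leg, second_leg):
    """Return pairs (a, c) linked through an intermediate b but absent from `direct`.

    direct     -- set of (a, c) pairs already detected by direct alignment
    first_leg  -- set of (a, b) pairs, e.g. human-frog hits (hypothetical data)
    second_leg -- set of (b, c) pairs, e.g. frog-zebrafish hits (hypothetical data)
    """
    # Index the second leg by the intermediate sequence for quick lookup.
    by_intermediate = {}
    for mid, c in second_leg:
        by_intermediate.setdefault(mid, set()).add(c)

    tunnelled = set()
    for a, mid in first_leg:
        for c in by_intermediate.get(mid, ()):
            if (a, c) not in direct:
                tunnelled.add((a, c))
    return tunnelled


# Hypothetical identifiers: h* human, f* frog, z* zebrafish elements.
human_frog = {("h1", "f1"), ("h2", "f2")}
frog_zebrafish = {("f1", "z1"), ("f2", "z2")}
human_zebrafish = {("h1", "z1")}  # found by direct comparison

covert_pairs = tunnel(human_zebrafish, human_frog, frog_zebrafish)
```

Here `covert_pairs` contains only the pair that direct comparison missed; in a real analysis each set would come from an alignment step with its own cut-offs.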
For instance, the presence of combinations of relatively long TFBS (e.g. the 15 –20 bp long HMG/POU cassette bound cooperatively by Sox2 and Oct4 proteins [64]) around different genes sharing a similar regulatory syntax could lead to the appearance of multiple instances of similar non-homologous sequence stretches in a given genome. These kinds of


2. Deeply conserved cis-regulatory elements: beyond the limits of phylogenetic footprinting?

study, Clarke et al. [30] extended the conservation of two of the deuterostome-specific HCNRs to the origin of bilaterians and showed their ability to drive expression in different animal systems. Altogether, these analyses suggest that identification of deep HCNRs is largely beyond the range of traditional phylogenetic footprinting analyses. The reported sequences, although clearly homologous, showed a lower degree of conservation than that observed within vertebrates (usually a core of only 75–100 bp with 60–70% similarity, compared with often more than 500 bp within vertebrates), making them very difficult to detect computationally, especially given that in many other cases transphyletic CREs are likely not to be configured as HCNRs (table 1). Nevertheless, these initial studies show that ancient CREs conserved across metazoan phyla can and do exist, raising the question of whether they may represent rare exceptions or only the tip of the iceberg emerging from a sea of technical and perhaps also conceptual limitations.
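To make the effect of alignment cut-offs concrete, the following minimal sketch scans a gapless pairwise alignment with a fixed window and reports the windows that reach a given percentage identity. The function names, the toy sequences and the specific cut-offs (a 100 bp window at 70% versus 60% identity) are illustrative assumptions, not parameters taken from the studies cited above.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two equal-length aligned sequences ('-' marks a gap)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def conserved_windows(a: str, b: str, window: int, min_identity: float):
    """Return (start, identity) for every window that meets the identity cut-off."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(len(a) - window + 1):
        ident = percent_identity(a[start:start + window], b[start:start + window])
        if ident >= min_identity:
            hits.append((start, ident))
    return hits


# Toy alignment (assumption): 100 columns, 65 identical positions.
toy_a = "A" * 100
toy_b = "A" * 65 + "C" * 35

strict = conserved_windows(toy_a, toy_b, window=100, min_identity=70.0)
relaxed = conserved_windows(toy_a, toy_b, window=100, min_identity=60.0)
```

With these toy inputs the stricter cut-off reports nothing, whereas the relaxed one reports the single window, which mirrors how an element with 60–70% similarity can fall just outside a conventional threshold.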


due to true evolutionary peculiarities of CREs, experimental limitations or a mixture of both. We then discuss how new experimental approaches based on high-throughput sequencing, combined with data on conserved positional information, are likely to provide new insights into these questions.

vertebrates, amphioxus, sea urchin, acorn worm, anemone

vertebrates, amphioxus, sea urchin, acorn worm, tick, gastropods

acorn worm

D. virilis, D. mojavensis C. elegans, C. briggsae, C. remanei

pseudoobscura,

2

2

2084

56 6779

1299

1/2

2/2c

2/8

2/2c

12/n.s.b

4/8 n.a.

10/22

a Probably an underestimation; only a subset of gnathostome CNRs was searched in the incomplete lamprey genome assembly.

b Data obtained by overlap with previously published results.

c The elements are the same.

d Owing to incompleteness of the lamprey assembly only one syntenic block could be checked.

[36]

[36]

acorn worm vertebrates, amphioxus, sea urchin,

[30]

nematodes Deuterostomes

Eumetazoans

vertebrates, amphioxus, sea urchin,

[35]

Caenorhabditis

[30]

8

human, amphioxus D. melanogaster, D. ananassae, D.

[32,33] [34]

fruitflies

Bilaterians

4

vertebrates, amphioxus

[31]

yes

yes

yes

yes

n.s.

yes yes

partial (gene linkage only)

yes

5

2/2c

vertebrates, amphioxus

[30]

chordates

yesd no

2/2 2/3

73a 183

vertebrates olfactores

tetrapods, teleosts, lamprey vertebrates, sea squirts

[28] [29]

jawed vertebrates

n.s. n.s.

yes

positional conservation

23/25 307/564b

57/137

no. elements positive/tested

1373 8452

3124

human, fugu

human, rodents, chicken, fugu elephant shark, tetrapods, teleosts

[25]

[26] [27]

no. elements

bony vertebrates

species/lineages compared

57

55

55

59

100 in 30 bp

60 98

n.s.

66

63 52

74 74

70

min. similarity (%)


study

ave./med. length (bp) n.s. 199 (a) 175– 226 (a) 127 (a) 45 (a) 108 (a), 112 (m) 54 (m) 63 (m) 59 (m)

69 (a), 59 (m) 104 (a), 108 (m) 103 (a), 119 (m) 99 (a,m) 117 (a,m)

ave. similarity (%) n.s. 84 82 80 55 71 n.s. 64 n.s.

96 65 65 61 63


phylogenetic range

Table 1. Examples of studies searching deeply conserved non-coding regions at different animal evolutionary timescales. In each study, the number of CNRs identified, tested and positive, as well as the percentage similarity and lengths of these CNRs is depicted. n.a., not assessed; n.s., not specified; (a), average; (m) median.


[Figure 1 labels: human SoxB2 (SOX21) locus; sequences from human, zebrafish, amphioxus, Saccoglossus, sea urchin and Nematostella; conservation plots scaled 50–100%; annotations 100 bp, 70%; coordinates 0.05–6.05 k.]

Figure 1. A deeply conserved HCNR located downstream of the SoxB2 gene from several species promotes GFP expression in similar neural territories in stable zebrafish transgenic lines.

phenomena could indeed provide an explanation for the striking differences in the complements of chordate HCNRs reported by several studies. In stark contrast with the scarcity of ancient HCNRs identified by us and others [30,32,36], two reports using the same methodology [65] uncovered 1299 and 183 putative HCNRs dating back to the last common ancestor of chordates (i.e. conserved between amphioxus and vertebrates [31]) and of the olfactores lineage (the clade comprising tunicates plus vertebrates [29]), respectively. The majority of these sequences were extremely short, with an average of 45 bp in the olfactores sequences and a median of 54 bp in the case of the chordate regions. Similarly, the percentage identities were also very modest (55% on average between tunicates and vertebrates). Finally, the relative location of these non-coding sequences was only partially conserved between vertebrates and invertebrates. In the case of the amphioxus–vertebrates study, although the putative HCNRs were linked to the same orthologous genes in both chordate lineages, the conservation of their relative position and orientation with respect to these genes (i.e. intronic, intergenic upstream/downstream or within a nearby bystander gene) was not taken into account, thus weakening the arguments supporting the homology of the amphioxus and vertebrate regions.
For the putative olfactores HCNRs, the lack of positional conservation was particularly extreme: in 182 of the 183 reported cases it was not possible to identify a single orthologous gene that was syntenic to the noncoding sequences in both vertebrates and tunicates, even when using 2 Mb windows, which span an average of 351 genes in Ciona intestinalis [66] (in the remaining case, the HCNR was located within an intron of FoxP1 in both vertebrates and tunicates). Thus, only one of the putative olfactores HCNRs could have the same target in vertebrates and invertebrates, casting doubt on the homology (and hence conservation) of these sequences. In the absence of positional conservation, establishing homology based on the presence of very short stretches of moderate sequence similarity seems, at best, risky. Therefore, although some of the sequences reported by Hufton et al. [31] and Sanges et al. [29] could correspond to bona-fide conserved homologous

non-coding sequences, it is possible that the number of HCNRs reported in these studies is an overestimation. In summary, using current computational tools, the number of unambiguous transphyletic HCNRs identified to date is surprisingly small. Indeed, given the challenges faced by HCNRs search, the question may no longer be why we are not able to detect transphyletic CREs using sequence conservation but, rather, why we can actually identify (at least some of ) them as HCNRs.
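The positional-conservation argument above can be expressed as a simple check: does a pair of putatively homologous non-coding elements share at least one orthologous gene within a fixed genomic window in both species? The sketch below is a minimal illustration under stated assumptions. The gene names, coordinates and orthology table are hypothetical, and the 2 Mb window is interpreted here as centred on the element, which the studies discussed may define differently.

```python
def genes_in_window(genes, chrom, pos, window):
    """Names of genes on `chrom` overlapping a window of total size `window` centred on `pos`."""
    half = window // 2
    return {
        name
        for name, (g_chrom, start, end) in genes.items()
        if g_chrom == chrom and start <= pos + half and end >= pos - half
    }


def shared_syntenic_orthologues(elem_a, genes_a, elem_b, genes_b,
                                orthologues, window=2_000_000):
    """Orthologous gene pairs that lie near the element in both species.

    elem_*      -- (chromosome, position) of the non-coding element
    genes_*     -- {gene name: (chromosome, start, end)}
    orthologues -- {gene in species A: orthologous gene in species B}
    """
    near_a = genes_in_window(genes_a, *elem_a, window)
    near_b = genes_in_window(genes_b, *elem_b, window)
    return {(ga, gb) for ga in near_a for gb in near_b
            if orthologues.get(ga) == gb}


# Hypothetical annotations for two species.
genes_a = {"geneA1": ("chr1", 1_000_000, 1_050_000),
           "geneA2": ("chr1", 9_000_000, 9_100_000)}
genes_b = {"geneB1": ("scaf7", 200_000, 230_000)}
orthologues = {"geneA1": "geneB1"}

supported = shared_syntenic_orthologues(
    ("chr1", 1_500_000), genes_a, ("scaf7", 260_000), genes_b, orthologues)
unsupported = shared_syntenic_orthologues(
    ("chr1", 5_000_000), genes_a, ("scaf7", 260_000), genes_b, orthologues)
```

An empty result, as for `unsupported`, corresponds to the situation described for most putative olfactores HCNRs: sequence similarity without any shared syntenic gene to support homology.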

4. Why so few transphyletic cis-regulatory elements?

The difficulties described above raise the question of whether the nearly complete lack of known CREs conserved across phyla represents true evolutionary peculiarities of cis-regulatory modules compared to protein-coding sequences, reflects the technical and conceptual challenges associated with CRE identification and/or comparison based on sequence conservation, or something in between. At least three major evolutionary frameworks can be proposed for this differential pattern of conservation:

(i) The lack of conservation of CREs may parallel the specific differences in body plans across phyla [24]. In this scenario, a conserved set of protein-coding sequences would have been recurrently redeployed for different developmental programmes by evolving new CREs in the different lineages; therefore, the conservation of CREs would be minimal across phyla and would be represented by the few transphyletic HCNRs identified to date. Although this may be the case for a large fraction of phylum-specific CREs, it does not provide a satisfactory explanation for the CREs associated with the large number of deeply conserved expression patterns and developmental programmes described above.

(ii) Rampant turnover of CREs may have fully eroded the conservation of the ancestral set of CREs, while keeping


5. How to look for cis-regulatory elements regardless of sequence information: the epigenomics revolution

Applications of NGS have proved to be the most efficient methods for genome-wide identification of CREs [79–97]. These studies, performed in various model systems, have shown the association between a precise chromatin environment and transcriptional enhancers (i.e. CREs that promote transcription, in contrast to silencers or other types of CREs). Enhancers that correlate with enrichment of lysine 4 monomethylation of histone H3 (H3K4me1) and lysine 27 acetylation of histone H3 (H3K27ac) modifications are located within DNAse hypersensitive sites and are able to recruit transcriptional activators, such as p300 and CBP [80–83,85–89]. Moreover, enhancers can be subdivided into distinct functional classes based on their epigenetic makeup: whereas H3K4me1 alone marks poised enhancers, active ones are enriched in both H3K4me1 and H3K27ac modifications [83,86–89]. Importantly, since these combinations of histone marks have been repeatedly found in different bilaterian species, ranging from flies to mammals, they probably represent evolutionarily conserved signatures of enhancers. Moreover, the coding sequences of histones are extremely conserved, which allows the use of commercially available ChIP-grade antibodies in almost any animal species, facilitating genome-wide mapping of enhancers across many lineages and biological systems. For instance, NGS-related approaches have been used to determine the dynamics of enhancer activation during cell differentiation, using either embryonic stem cell differentiation strategies in cell culture experiments [86,88,95–99] or developing embryos [83,89]. In one of these studies, we have shown that deposition and removal of H3K27ac is very dynamic during early zebrafish development and correlates well with the expression levels of the putative target genes of these marked enhancers [83].
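The distinction between poised and active enhancers drawn above amounts to a simple decision rule on two histone marks. The sketch below is a minimal illustration of that rule only: the region names, the enrichment values and the threshold of 2.0 are hypothetical, and real analyses would call enriched regions with dedicated peak-calling software rather than a fixed cut-off.

```python
def classify_enhancer(h3k4me1: bool, h3k27ac: bool) -> str:
    """Apply the rule from the text: H3K4me1 alone = poised, both marks = active."""
    if h3k4me1 and h3k27ac:
        return "active"
    if h3k4me1:
        return "poised"
    # The text does not assign a class to other combinations.
    return "unclassified"


def classify_regions(enrichment, threshold=2.0):
    """Classify regions from {region: {mark: enrichment}} using an assumed threshold."""
    classes = {}
    for region, marks in enrichment.items():
        classes[region] = classify_enhancer(
            marks.get("H3K4me1", 0.0) >= threshold,
            marks.get("H3K27ac", 0.0) >= threshold,
        )
    return classes


# Hypothetical enrichment values for three candidate regions.
example = {
    "region1": {"H3K4me1": 5.0, "H3K27ac": 4.0},
    "region2": {"H3K4me1": 3.0, "H3K27ac": 0.5},
    "region3": {"H3K4me1": 0.2, "H3K27ac": 0.1},
}
classes = classify_regions(example)
```

Running the same rule on profiles from successive developmental stages would give a crude view of enhancers switching between the poised and active states.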
Genomic regions enriched in H3K27ac at early stages are bound by pluripotent factors and other control genes involved in early developmental processes. These marks are removed later from these regions and substituted by others in regions that will be bound by tissue-specific transcription factors and/or controlling the expression of patterning genes. In addition, we found that the putative enhancers that are turned on at late gastrula have the highest sequence conservation within vertebrates, compared with those activated at other developmental stages. Interestingly, the segmentation stages that follow gastrulation show the highest degree of transcriptome conservation across vertebrate development, supporting the ‘hourglass’ model of developmental evolution, which states that maximum morphological similarity between vertebrates occurs at the phylotypic stage [100–103]. Our results thus suggest that such a conserved transcriptome at the phylotypic stage is controlled by the most evolutionarily conserved set of CREs. Moreover, this high sequence and, presumably, functional conservation makes late gastrula stage enhancers arguably the best candidates to search for functionally equivalent CREs in other lineages. Another powerful NGS-based method is DNAse I footprint, which allows the identification of transcription factor


Under this last framework, however, two major issues will be encountered when trying to define the complement of ancestral transphyletic CREs. First, the methods used for detecting CREs have to be fully independent of sequence similarity (i.e. phylogenetic footprinting), and based on specific functional and biochemical properties; fortunately, approaches based on next-generation sequencing (NGS) can currently provide unprecedented power for CRE detection. Second, assessing whether two genes inherited ancestral CREs or independently evolved functionally equivalent CREs relies to a large extent on positional information and genome structure, which needs to be taken into consideration in combination with more sophisticated sequence comparison methods based on conserved regulatory encryption [63].


the ancestral functions [67–69]. If so, an analogous, but no longer homologous, set of CREs would be responsible for the similar gene expression patterns and functions in different phyla. This idea has been boosted by a series of recent genome-wide studies of CRE conservation that showed a very high TFBS turnover, even between relatively closely related species [69,70]. Behind this rapid rate of TFBS gain/loss is probably the high degree of (partial) functional redundancy observed for many regulatory inputs, by which several CREs often drive expression of the same gene to similar spatio-temporal locations [42,43,71–73]. However, although this rampant turnover may be widespread for simple TFBSs, particularly in adult tissues, it may not be so extreme for more complex CREs associated with genes involved in regulation of embryonic development [74], especially those CREs forming intricate enhanceosomes (i.e. long arrays of TFBSs that work cooperatively and behave as functional and evolutionary blocks, often resulting in HCNRs) [75,76]. Furthermore, it is still a matter of debate how many of the whole set of TFBSs identified by ChIP-based studies are actually biologically functional (see [77]) and to what extent they contribute to the transcriptional regulation of their surrounding genes [78]. (iii) Deeply conserved ancestral CREs do exist in various phyla, but, as discussed above, we cannot recognize their orthology by mere sequence similarity in alignment procedures, either because they are not configured as HCNRs and have diverged too much, or because of technical limitations of current computational tools and/or an insufficient and biased taxon sampling. In this scenario, conserved ancestral CREs would have evolved from a common set of ancestral instructions that have diverged in sequence beyond detection level, but preserved their functional properties to a large extent. 
Therefore, while in (ii) a newly created CRE takes over the ancestral function and the ancestral CRE disappears, in this case, although mutations can erase the ancestral sequence similarity, these mutations occur within the same orthologous CRE that has maintained its functionality uninterruptedly over evolutionary time. Suggesting that this process may be widespread, hundreds of orthologous ‘covert CREs’ (i.e. elements with conserved regulatory components and syntax embedded within divergent sequences that drive expression to similar anatomical domains) have been identified between human and zebrafish in conserved syntenic positions [63].


With the lowering of sequencing costs, the universal application of these NGS-based methods is expected to soon yield a wealth of putative CREs in a wide range of metazoan lineages. The evolutionary challenge, then, will be to assess the putative conservation of functionally equivalent CREs (i.e. driving expression to similar spatio-temporal domains) associated with orthologous genes. In addition to insights provided from improving our understanding of TFBS syntax and its evolution, an obvious reference should be conserved positional, or syntenic, information. Within lineages, HCNRs are located at conserved syntenic positions in different genomes [35,38,39]; this is also true for confidently identified transphyletic HCNRs [30,36]. Therefore, it is expected that most, if not all, ancestral CREs will be found in similar relative positions in divergent animal species. Favouring this hypothesis,

7. Concluding remarks

Current data suggest that a few ancestral CREs exist in present-day animal genomes configured as HCNRs. With these data in hand, the long-term evolution of HCNRs seems radically different from that of the gene complement: the number of HCNRs decreases with the phylogenetic distance, but this decrease is not linear, dropping abruptly at certain evolutionary transitions, often coincidental with transphyletic boundaries, in a pattern highly reminiscent of the evolution of microRNAs [28,30].


6. Where to look for functionally conserved cis-regulatory elements: positional information

functionally equivalent enhancers that are not conserved at the sequence level have been found at similar positions relative to their target genes in distantly related insects, supporting their common origin [116]. Similarly, ‘covert CREs’ with no sequence similarity but conserved arrangement of putative TFBSs have been identified in the same relative syntenic positions in human and zebrafish, using coding genes and HCNRs as syntenic anchors [63]. The localization of CREs far apart from the genes they regulate, particularly in the introns of nearby genes (known as bystanders), is known to influence the conservation of gene synteny within phyla, establishing genomic regulatory blocks (GRBs) around developmental genes [34,117]. We have recently shown that this is true even across phyla. A genome-wide survey of 17 different species revealed nearly 600 pairs of unrelated genes that have remained tightly physically linked in diverse lineages over hundreds of million years of evolution [118–120]. Importantly, a significant fraction of these conserved associations may correspond to ancient GRBs [119]. Therefore, it is plausible that some of the introns of the conserved bystander genes contain deeply conserved CREs, despite sequence similarity having been eroded. Some preliminary data from our laboratory support this possibility (figure 2). Combining enhancer-associated epigenetic marks in zebrafish embryos, enhancer transgenic assays and 3C experiments, we have shown that an otx1 brain enhancer element lies within one specific intron of the nearby bystander ehbp1 gene [119]. The expression promoted by this enhancer perfectly matches the anterior expression of otx1 in all vertebrates [121]. This expression domain is also very similar to that observed in amphioxus [122], with very limited non-coding sequence conservation with vertebrates [32,36], but with hundreds of conserved gene associations, including between the Otx and Ehbp1 orthologues [119]. 
Despite no clear sequence conservation being detected in the orthologous Ehbp1 intron between amphioxus and vertebrates, transient transgenic enhancer assays in zebrafish embryos with the amphioxus orthologous intron sequence reveal an enhancer activity highly similar to that of the zebrafish enhancer (figure 2). Therefore, these results suggest that functionally equivalent CREs can be found in similar syntenic positions, strongly arguing for their homology. If so, the finding of hundreds of deeply conserved syntenic associations due to cis-regulatory constraints, presumably from shared ancestral CREs, would provide another indication that conservation of ancestral CREs may be much more widespread than currently assumed. Although these observations are strongly suggestive, it should be noted that, in the absence of sequence similarity or exact conserved TFBS arrangements, whether functionally equivalent enhancers in conserved syntenic positions of two different species are homologous or analogous cannot be fully resolved with our current knowledge of CRE evolution, which poses an exciting new challenge for evolutionary genome biologists.
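The kind of survey mentioned in this section, which looks for pairs of unrelated genes that stay physically linked in many lineages, can be sketched as a count of adjacent gene-family pairs across genomes. This is only a simplified illustration and not the procedure used in [118–120]: the species labels and gene orders are hypothetical, linkage is reduced to immediate adjacency, and gene names are assumed to have already been mapped to orthologous families.

```python
from collections import Counter


def linked_pairs(gene_order, max_gap=1):
    """Unordered pairs of distinct gene families at most `max_gap` positions apart.

    gene_order -- list of chromosomes, each a list of gene-family names in order
    """
    pairs = set()
    for chrom in gene_order:
        for i, gene in enumerate(chrom):
            for neighbour in chrom[i + 1:i + 1 + max_gap]:
                if gene != neighbour:
                    pairs.add(frozenset((gene, neighbour)))
    return pairs


def conserved_linkages(genomes, min_species):
    """Pairs linked in at least `min_species` of the genomes provided."""
    counts = Counter()
    for gene_order in genomes.values():
        counts.update(linked_pairs(gene_order))
    return {pair for pair, n in counts.items() if n >= min_species}


# Hypothetical gene orders; only the otx-ehbp1 association is taken from the text.
genomes = {
    "species1": [["otx", "ehbp1", "geneX"]],
    "species2": [["geneA", "ehbp1", "otx"]],
    "species3": [["otx", "geneB", "ehbp1"]],
}
conserved = conserved_linkages(genomes, min_species=2)
```

In this toy data set only the otx and ehbp1 families are adjacent in two species, so they form the single conserved linkage; the intron of such a conserved bystander would then be a candidate region to test for enhancer activity.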


occupancy sites at the nucleotide resolution and without the use of specific antibodies [93,104,105]. Strikingly, the sequence conservation of the occupied binding sites is much higher than that detected across the much broader ChIP-seq peaks that contain them [93]. This suggests that even within largely non-conserved regulatory regions, specific conserved binding sites for particular transcription factors may be identified. Moreover, this technique has been used to identify occupied TF binding sites within CREs in a variety of cell lines, allowing the inference of cell-type-specific GRNs [106]. Since specific antibodies are not required for this method, it is likely that DNAse I footprinting in multiple animal models will be a particularly powerful strategy to identify evolutionarily conserved GRNs operating along different development stages even in distantly related species. The combination of all these epigenomic strategies will help to precisely define the collection of CREs active in a particular cell type, tissue or embryonic stage of a given species. Nevertheless, to infer which genes are actually regulated by these CREs is not trivial, and needs to be assessed by different approaches. The extent of DNA containing all regulatory elements acting on a particular gene has been denominated the regulatory landscape of such gene [47,107]. These landscapes can be very large, especially for developmental genes, and composed of multiple regulatory elements scattered across several megabase pairs [108]. Various genetic techniques, including mouse genetic engineering, BAC (Bacterial Artificial Chromosome) recombineering coupled to transgenesis and Chromosome Conformation Capture (3C), have been successfully used to define regulatory landscapes of different developmental genes (e.g. [42,47,109–112]). Recently, a new variant of the 3C technique coupled to NGS (4C-seq) has been shown to be very effective in defining gene regulatory landscapes [113,114]. 
Indeed, the combination of 4C-seq with epigenomic data, such as those described above, has been used to identify not only tissue-specific Hox CREs, but also the topological organization of Hox regulatory landscapes along the anterior–posterior axis of mouse embryos [113,114] and during limb development [115]. Therefore, the integration of multiple NGS techniques is likely to facilitate the identification of potential evolutionarily equivalent CREs in different model organisms.
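As a rough illustration of how such datasets might be integrated, the sketch below assigns candidate CREs (for example, H3K27ac peaks) to a gene whenever they fall inside that gene's regulatory landscape (for example, a contact domain delimited by 4C-seq). It is a minimal sketch under stated assumptions, not the pipeline used in the cited studies: the `Interval` type, the function names, the gene and peak names and all coordinates are hypothetical, and real analyses would start from peak-calling and contact-profile files rather than hand-written intervals.

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    """A genomic interval with half-open coordinates [start, end)."""
    chrom: str
    start: int
    end: int
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def assign_peaks_to_landscapes(
    peaks: List[Interval], landscapes: List[Interval]
) -> Dict[str, List[str]]:
    """Map each landscape (named after its gene) to the peaks that overlap it."""
    return {
        land.name: [p.name for p in peaks if overlaps(p, land)]
        for land in landscapes
    }


# Hypothetical inputs: three candidate CREs and one regulatory landscape.
peaks = [
    Interval("chr1", 1_000, 1_500, "peak_a"),    # inside the landscape
    Interval("chr1", 90_000, 90_400, "peak_b"),  # same chromosome, outside
    Interval("chr2", 1_200, 1_600, "peak_c"),    # different chromosome
]
landscapes = [Interval("chr1", 0, 50_000, "geneX")]

assignment = assign_peaks_to_landscapes(peaks, landscapes)
print(assignment)
```

Using the contact domain rather than the nearest gene as the assignment criterion reflects the point made above that regulatory landscapes can extend over large distances; a nearest-gene rule would be simpler but could miss long-range elements.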


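[Figure 2 graphic. Recoverable labels: zebrafish otx1b locus (scale bar 100 kb) with H3K27me3, H3K27ac and H3K4me3 tracks at 24 hpf and 3C contacts; amphioxus otx1 locus; bystander gene ehbp1; additional species labels: medaka, stickleback, tetraodon, fugu, X. tropicalis, mouse, human.]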

Figure 2. Functionally conserved otx1 enhancers are located at syntenic positions in the same intron of the bystander gene ehbp1 in vertebrates and amphioxus, despite no sequence conservation being detected in this genomic area between the two lineages.

However, it is likely that the vast majority of conserved transphyletic CREs are much less constrained in sequence than these few described examples, and we may be able to identify them through a combination of epigenetic profiling, genome architecture comparisons and functional studies. Functional CRE testing is currently a limiting factor: putative orthologous CREs should ideally be tested in their multiple endogenous contexts, a very uncommon practice to date [30,36], and reliable techniques for doing so are not yet established in many systems. Even so, it is likely that new methods combining NGS with computational techniques will not only predict CRE activity, but also accurately infer the expression patterns that CREs drive; the first efforts in this direction are promising [123]. Finding the set of ancestral CREs and their functions opens up the extremely exciting possibility of unravelling GRNs shared across phyla. These networks would define the basic developmental operations common to all metazoans, and the foundation for the further diversification of their body organization.

Acknowledgements. The authors would like to thank Gill Bejerano and Shoa L. Clarke for providing information necessary for the completion of table 1. We would like to apologize to all researchers whose work was not included in this review.

Funding statement. J.L.G-S. and F.C. acknowledge the Spanish and the Andalusian Governments and the Feder program for grants that funded this study (BFU2010-14839, BFU2009-07044, BFU2012-34324, CSD2007-00008 and Proyecto de Excelencia CVI-3488). I.M. holds a postdoctoral contract funded by the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007–2013)/ERC grant (268513)11. M.I. is supported by a postdoctoral fellowship from the Human Frontiers Science Program Organization.

References

1. Green SA, Norris RP, Terasaki M, Lowe CJ. 2013 FGF signaling induces mesoderm in the hemichordate Saccoglossus kowalevskii. Development 140, 1024–1033. (doi:10.1242/dev.083790)
2. Röttinger E, Dahlin P, Martindale MQ. 2012 A framework for the establishment of a cnidarian gene regulatory network for ‘Endomesoderm’ specification: the inputs of B-Catenin/TCF signaling. PLoS Genet. 8, e1003164. (doi:10.1371/journal.pgen.1003164)
3. Martindale MQ, Hejnol A. 2009 A developmental perspective: changes in the position of the blastopore during bilaterian evolution. Dev. Cell 17, 162–174. (doi:10.1016/j.devcel.2009.07.024)
4. Davidson EH. 2006 The regulatory genome: gene regulatory networks in development and evolution, 1st edn. San Diego, CA: Academic Press/Elsevier.
5. Lowe CJ et al. 2006 Dorsoventral patterning in hemichordates: insights into early chordate evolution. PLoS Biol. 4, e291. (doi:10.1371/journal.pbio.0040291)
6. Lowe CJ, Wu M, Salic A, Evans L, Lander E, Stange-Thomann N, Gruber CE, Gerhart J, Kirschner M. 2003 Anteroposterior patterning in hemichordates and the origins of the chordate nervous system. Cell 113, 853–865. (doi:10.1016/S0092-8674(03)00469-0)
7. Yu J-K, Satou Y, Holland ND, Shin-I T, Kohara Y, Satoh N, Bronner-Fraser M, Holland LZ. 2007 Axial patterning in cephalochordates and the evolution of the organizer. Nature 445, 613–617. (doi:10.1038/nature05472)
8. Molina MD, Neto A, Maeso I, Gomez-Skarmeta JL, Salo E, Cebria F. 2011 Noggin and noggin-like genes control dorsoventral axis regeneration in planarians. Curr. Biol. 21, 300–305. (doi:10.1016/j.cub.2011.01.016)
9. Gaviño MA, Reddien PW. 2011 A Bmp/Admp regulatory circuit controls maintenance and regeneration of dorsal-ventral polarity in planarians. Curr. Biol. 21, 294–299. (doi:10.1016/j.cub.2011.01.017)
10. Vopalensky P, Kozmik Z. 2009 Eye evolution: common use and independent recruitment of genetic components. Phil. Trans. R. Soc. B 364, 2819–2832. (doi:10.1098/rstb.2009.0079)
11. Tessmar-Raible K, Raible F, Christodoulou F, Guy K, Rembold M, Hausen H, Arendt D. 2007 Conserved sensory-neurosecretory cell types in annelid and fish forebrain: insights into hypothalamus evolution. Cell 129, 1389–1400. (doi:10.1016/j.cell.2007.04.041)
12. Ryan J, Burton P, Mazza M, Kwong G, Mullikin J, Finnerty J. 2006 The cnidarian-bilaterian ancestor possessed at least 56 homeoboxes: evidence from the starlet sea anemone, Nematostella vectensis. Genome Biol. 7, R64. (doi:10.1186/gb-2006-7-7-r64)
13. Kusserow A et al. 2005 Unexpected complexity of the Wnt gene family in a sea anemone. Nature 433, 156–160. (doi:10.1038/nature03158)
14. Putnam NH et al. 2007 Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science 317, 86–94. (doi:10.1126/science.1139158)
15. Simakov O et al. 2013 Insights into bilaterian evolution from three spiralian genomes. Nature 493, 526–531. (doi:10.1038/nature11696)
16. D’Aniello S, Irimia M, Maeso I, Pascual-Anaya J, Jimenez-Delgado S, Bertrand S, Garcia-Fernandez J. 2008 Gene expansion and retention leads to a diverse tyrosine kinase superfamily in amphioxus. Mol. Biol. Evol. 25, 1841–1854. (doi:10.1093/molbev/msn132)
17. Peter IS, Davidson EH. 2011 Evolution of gene regulatory networks controlling body plan development. Cell 144, 970–985. (doi:10.1016/j.cell.2011.02.017)
18. Lemons D, McGinnis W. 2006 Genomic evolution of Hox gene clusters. Science 313, 1918–1922. (doi:10.1126/science.1132040)
19. Santagata S, Resh C, Hejnol A, Martindale M, Passamaneck Y. 2012 Development of the larval anterior neurogenic domains of Terebratalia transversa (Brachiopoda) provides insights into the diversification of larval apical organs and the spiralian nervous system. EvoDevo 3, 3. (doi:10.1186/2041-9139-3-3)
20. Irimia M, Pineiro C, Maeso I, Gomez-Skarmeta J, Casares F, Garcia-Fernandez J. 2010 Conserved developmental expression of Fezf in chordates and Drosophila and the origin of the Zona Limitans Intrathalamica (ZLI) brain organizer. EvoDevo 1, 7. (doi:10.1186/2041-9139-1-7)
21. Pani AM, Mullarkey EE, Aronowicz J, Assimacopoulos S, Grove EA, Lowe CJ. 2012 Ancient deuterostome origins of vertebrate brain signalling centres. Nature 483, 289–294. (doi:10.1038/nature10838)
22. Puelles L, Ferran JL. 2012 Concept of neural genoarchitecture and its genomic fundament. Front. Neuroanat. 6, 47. (doi:10.3389/fnana.2012.00047)
23. Denes AS, Jekely G, Steinmetz PR, Raible F, Snyman H, Prud’homme B, Ferrier DE, Balavoine G, Arendt D. 2007 Molecular architecture of annelid nerve cord supports common origin of nervous system centralization in bilateria. Cell 129, 277–288. (doi:10.1016/j.cell.2007.02.040)
24. Vavouri T, Lehner B. 2009 Conserved noncoding elements and the evolution of animal body plans. Bioessays 31, 727–735. (doi:10.1002/bies.200900014)
25. Pennacchio LA et al. 2006 In vivo enhancer analysis of human conserved non-coding sequences. Nature 444, 499–502. (doi:10.1038/nature05295)
26. Woolfe A et al. 2005 Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol. 3, e7. (doi:10.1371/journal.pbio.0030007)
27. Lee AP, Kerk SY, Tan YY, Brenner S, Venkatesh B. 2011 Ancient vertebrate conserved noncoding elements have been evolving rapidly in teleost fishes. Mol. Biol. Evol. 28, 1205–1215. (doi:10.1093/molbev/msq304)
28. McEwen GK, Goode DK, Parker HJ, Woolfe A, Callaway H, Elgar G. 2009 Early evolution of conserved regulatory sequences associated with development in vertebrates. PLoS Genet. 5, e1000762. (doi:10.1371/journal.pgen.1000762)
29. Sanges R et al. 2013 Highly conserved elements discovered in vertebrates are present in nonsyntenic loci of tunicates, act as enhancers and can be transcribed during development. Nucleic Acids Res. 41, 3600–3618. (doi:10.1093/nar/gkt030)
30. Clarke SL, VanderMeer JE, Wenger AM, Schaar BT, Ahituv N, Bejerano G. 2012 Human developmental enhancers conserved between deuterostomes and protostomes. PLoS Genet. 8, e1002852. (doi:10.1371/journal.pgen.1002852)
31. Hufton AL, Mathia S, Braun H, Georgi U, Lehrach H, Vingron M, Poustka AJ, Panopoulou G. 2009 Deeply conserved chordate noncoding sequences preserve genome synteny but do not drive gene duplicate retention. Genome Res. 19, 2036–2051. (doi:10.1101/gr.093237.109)
32. Putnam NH et al. 2008 The amphioxus genome and the evolution of the chordate karyotype. Nature 453, 1064–1071. (doi:10.1038/nature06967)
33. Holland LZ et al. 2008 The amphioxus genome illuminates vertebrate origins and cephalochordate biology. Genome Res. 18, 1100–1111. (doi:10.1101/gr.073676.107)
34. Engstrom PG, Ho Sui SJ, Drivenes O, Becker TS, Lenhard B. 2007 Genomic regulatory blocks underlie extensive microsynteny conservation in insects. Genome Res. 17, 1898–1908. (doi:10.1101/gr.6669607)
35. Vavouri T, Walter K, Gilks WR, Lehner B, Elgar G. 2007 Parallel evolution of conserved noncoding elements that target a common set of developmental regulatory genes from worms to humans. Genome Biol. 8, R15. (doi:10.1186/gb-2007-8-2-r15)
36. Royo JL et al. 2011 Transphyletic conservation of developmental regulatory state in animal evolution. Proc. Natl Acad. Sci. USA 108, 14 186–14 191. (doi:10.1073/pnas.1109037108)
37. Aparicio S, Morrison A, Gould A, Gilthorpe J, Chaudhuri C, Rigby P, Krumlauf R, Brenner S. 1995 Detecting conserved regulatory elements with the model genome of the Japanese puffer fish, Fugu rubripes. Proc. Natl Acad. Sci. USA 92, 1684–1688. (doi:10.1073/pnas.92.5.1684)
38. Bejerano G, Pheasant M, Makunin I, Stephen S, Kent WJ, Mattick JS, Haussler D. 2004 Ultraconserved elements in the human genome. Science 304, 1321–1325. (doi:10.1126/science.1098119)
39. Sandelin A, Bailey P, Bruce S, Engstrom PG, Klos JM, Wasserman WW, Ericson J, Lenhard B. 2004 Arrays of ultraconserved non-coding regions span the loci of key developmental genes in vertebrate genomes. BMC Genomics 5, 99. (doi:10.1186/1471-2164-5-99)
40. Royo JL, Hidalgo C, Roncero Y, Seda MA, Akalin A, Lenhard B, Casares F, Gomez-Skarmeta JL. 2011 Dissecting the transcriptional regulatory properties of human chromosome 16 highly conserved noncoding regions. PLoS ONE 6, e24824. (doi:10.1371/journal.pone.0024824)
41. Bagheri-Fam S, Ferraz C, Demaille J, Scherer G, Pfeifer D. 2001 Comparative genomics of the SOX9 region in human and Fugu rubripes: conservation of short regulatory sequence elements within large intergenic regions. Genomics 78, 73–82. (doi:10.1006/geno.2001.6648)
42. Tena JJ, Alonso ME, de la Calle-Mustienes E, Splinter E, de Laat W, Manzanares M, Gomez-Skarmeta JL. 2011 An evolutionarily conserved three-dimensional structure in the vertebrate Irx clusters facilitates enhancer sharing and coregulation. Nat. Commun. 2, 310. (doi:10.1038/ncomms1301)
43. de la Calle-Mustienes E, Feijoo CG, Manzanares M, Tena JJ, Rodriguez-Seguel E, Letizia A, Allende ML, Gomez-Skarmeta JL. 2005 A functional survey of the enhancer activity of conserved non-coding sequences from vertebrate Iroquois cluster gene deserts. Genome Res. 15, 1061–1072. (doi:10.1101/gr.4004805)
44. Ghanem N, Jarinova O, Amores A, Long Q, Hatch G, Park BK, Rubenstein JL, Ekker M. 2003 Regulatory roles of conserved intergenic domains in vertebrate Dlx bigene clusters. Genome Res. 13, 533–543. (doi:10.1101/gr.716103)
45. Gottgens B et al. 2000 Analysis of vertebrate SCL loci identifies conserved enhancers. Nat. Biotechnol. 18, 181–186. (doi:10.1038/72635)
46. Nobrega MA, Ovcharenko I, Afzal V, Rubin EM. 2003 Scanning human gene deserts for long-range enhancers. Science 302, 413. (doi:10.1126/science.1088328)
47. Spitz F, Gonzalez F, Duboule D. 2003 A global control region defines a chromosomal regulatory landscape containing the HoxD cluster. Cell 113, 405–417. (doi:10.1016/S0092-8674(03)00310-6)
48. Visel A et al. 2008 Ultraconservation identifies a small subset of extremely constrained developmental enhancers. Nat. Genet. 40, 158–160. (doi:10.1038/ng.2007.55)
49. Ariza-Cosano A, Visel A, Pennacchio LA, Fraser HB, Gomez-Skarmeta JL, Irimia M, Bessa J. 2012 Differences in enhancer activity in mouse and zebrafish reporter assays are often associated with changes in gene expression. BMC Genomics 13, 713. (doi:10.1186/1471-2164-13-713)
50. Navratilova P, Fredman D, Lenhard B, Becker TS. 2010 Regulatory divergence of the duplicated chromosomal loci sox11a/b by subpartitioning and sequence evolution of enhancers in zebrafish. Mol. Genet. Genomics 283, 171–184. (doi:10.1007/s00438-009-0503-1)
51. Navratilova P, Fredman D, Hawkins TA, Turner K, Lenhard B, Becker TS. 2009 Systematic human/zebrafish comparative identification of cis-regulatory activity around vertebrate developmental transcription factor genes. Dev. Biol. 327, 526–540. (doi:10.1016/j.ydbio.2008.10.044)
52. Komisarczuk AZ, Kawakami K, Becker TS. 2009 Cis-regulation and chromosomal rearrangement of the fgf8 locus after the teleost/tetrapod split. Dev. Biol. 336, 301–312. (doi:10.1016/j.ydbio.2009.09.029)
53. Ritter DI, Li Q, Kostka D, Pollard KS, Guo S, Chuang JH. 2010 The importance of being cis: evolution of orthologous fish and mammalian enhancer activity. Mol. Biol. Evol. 27, 2322–2332. (doi:10.1093/molbev/msq128)
54. Glazov EA, Pheasant M, McGraw EA, Bejerano G, Mattick JS. 2005 Ultraconserved elements in insect genomes: a highly conserved intronic sequence implicated in the control of homothorax mRNA splicing. Genome Res. 15, 800–808. (doi:10.1101/gr.3545105)
55. Siepel A et al. 2005 Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050. (doi:10.1101/gr.3715005)
56. Sahagun V, Ranz JM. 2012 Characterization of genomic regulatory domains conserved across the genus Drosophila. Genome Biol. Evol. 4, 1054–1060. (doi:10.1093/gbe/evs089)
57. Waterston RH et al. 2002 Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562. (doi:10.1038/nature01262)
58. Pennacchio LA. 2003 Insights from human/mouse genome comparisons. Mamm. Genome 14, 429–436. (doi:10.1007/s00335-002-4001-1)
59. Venkatesh B et al. 2007 Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) Genome. PLoS Biol. 5, e101. (doi:10.1371/journal.pbio.0050101)
60. Bertrand S, Escriva H. 2011 Evolutionary crossroads in developmental biology: amphioxus. Development 138, 4819–4830. (doi:10.1242/dev.066720)
61. Garcia-Fernandez J et al. 2009 From the American to the European amphioxus: towards experimental evo-devo at the origin of chordates. Int. J. Dev. Biol. 53, 1359–1366. (doi:10.1387/ijdb.072436jg)
62. Boffelli D, Nobrega MA, Rubin EM. 2004 Comparative genomics at the vertebrate extremes. Nat. Rev. Genet. 5, 456–465. (doi:10.1038/nrg1350)
63. Taher L, McGaughey DM, Maragh S, Aneas I, Bessling SL, Miller W, Nobrega MA, McCallion AS, Ovcharenko I. 2011 Genome-wide identification of conserved regulatory function in diverged sequences. Genome Res. 21, 1139–1149. (doi:10.1101/gr.119016.110)
64. Chakravarthy H, Boer B, Desler M, Mallanna SK, McKeithan TW, Rizzino A. 2008 Identification of DPPA4 and other genes as putative Sox2:Oct-3/4 target genes using a combination of in silico analysis and transcription-based assays. J. Cell. Physiol. 216, 651–662. (doi:10.1002/jcp.21440)
65. Sanges R, Kalmar E, Claudiani P, D’Amato M, Muller F, Stupka E. 2006 Shuffling of cis-regulatory elements is a pervasive feature of the vertebrate lineage. Genome Biol. 7, R56. (doi:10.1186/gb-2006-7-7-r56)
66. Irimia M, Maeso I, Garcia-Fernandez J. 2008 Convergent evolution of clustering of Iroquois homeobox genes across metazoans. Mol. Biol. Evol. 25, 1521–1525. (doi:10.1093/molbev/msn109)
67. Dermitzakis ET, Clark AG. 2002 Evolution of transcription factor binding sites in mammalian gene regulatory regions: conservation and turnover. Mol. Biol. Evol. 19, 1114–1121. (doi:10.1093/oxfordjournals.molbev.a004169)
68. Costas J, Casares F, Vieira J. 2003 Turnover of binding sites for transcription factors involved in early Drosophila development. Gene 310, 215–220. (doi:10.1016/S0378-1119(03)00556-0)
69. Odom DT et al. 2007 Tissue-specific transcriptional regulation has diverged significantly between human and mouse. Nat. Genet. 39, 730–732. (doi:10.1038/ng2047)
70. Schmidt D et al. 2010 Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science 328, 1036–1040. (doi:10.1126/science.1186176)
71. Hong JW, Hendrix DA, Levine MS. 2008 Shadow enhancers as a source of evolutionary novelty. Science 321, 1314. (doi:10.1126/science.1160631)
72. Frankel N, Davis GK, Vargas D, Wang S, Payre F, Stern DL. 2010 Phenotypic robustness conferred by apparently redundant transcriptional enhancers. Nature 466, 490–493. (doi:10.1038/nature09158)
73. Jeong JY, Einhorn Z, Mathur P, Chen L, Lee S, Kawakami K, Guo S. 2007 Patterning the zebrafish diencephalon by the conserved zinc-finger protein Fezl. Development 134, 127–136. (doi:10.1242/dev.02705)
74. Cotney J, Leng J, Yin J, Reilly SK, DeMare LE, Emera D, Ayoub AE, Rakic P, Noonan JP. 2013 The evolution of lineage-specific regulatory activities in the human embryonic limb. Cell 154, 185–196. (doi:10.1016/j.cell.2013.05.056)
75. Carey M. 1998 The enhanceosome and transcriptional synergy. Cell 92, 5–8. (doi:10.1016/S0092-8674(00)80893-4)
76. Gotea V, Visel A, Westlund JM, Nobrega MA, Pennacchio LA, Ovcharenko I. 2010 Homotypic clusters of transcription factor binding sites are a key component of human promoters and enhancers. Genome Res. 20, 565–577. (doi:10.1101/gr.104471.109)
77. Sakabe NJ, Nobrega MA. 2013 Beyond the ENCODE project: using genomics and epigenomics strategies to study enhancer evolution. Phil. Trans. R. Soc. B 368, 20130022. (doi:10.1098/rstb.2013.0022)
78. Fisher WW et al. 2012 DNA regions bound at low occupancy by transcription factors do not drive patterned reporter gene expression in Drosophila. Proc. Natl Acad. Sci. USA 109, 21 330–21 335. (doi:10.1073/pnas.1209589110)
79. Maher B. 2012 ENCODE: the human encyclopaedia. Nature 489, 46–48. (doi:10.1038/489046a)
80. Heintzman ND et al. 2007 Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat. Genet. 39, 311–318. (doi:10.1038/ng1966)
81. Heintzman ND et al. 2009 Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108–112. (doi:10.1038/nature07829)
82. Visel A, Rubin EM, Pennacchio LA. 2009 Genomic views of distant-acting enhancers. Nature 461, 199–205. (doi:10.1038/nature08451)
83. Bogdanovic O, Fernandez-Minan A, Tena JJ, de la Calle-Mustienes E, Hidalgo C, van Kruysbergen I, van Heeringen SJ, Veenstra GJ, Gomez-Skarmeta JL. 2012 Dynamics of enhancer chromatin signatures mark the transition from pluripotency to cell specification during embryogenesis. Genome Res. 22, 2043–2053. (doi:10.1101/gr.134833.111)
84. Shen Y et al. 2012 A map of the cis-regulatory sequences in the mouse genome. Nature 488, 116–120. (doi:10.1038/nature11243)
85. Kim TK et al. 2010 Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182–187. (doi:10.1038/nature09033)
86. Creyghton MP et al. 2010 Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl Acad. Sci. USA 107, 21 931–21 936. (doi:10.1073/pnas.1016071107)
87. Hawkins RD et al. 2011 Dynamic chromatin states in human ES cells reveal potential regulatory sequences and genes involved in pluripotency. Cell Res. 21, 1393–1409. (doi:10.1038/cr.2011.146)
88. Rada-Iglesias A, Bajpai R, Swigut T, Brugmann SA, Flynn RA, Wysocka J. 2010 A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 278–283. (doi:10.1038/nature09692)
89. Bonn S et al. 2012 Tissue-specific analysis of chromatin state identifies temporal signatures of enhancer activity during embryonic development. Nat. Genet. 44, 148–156. (doi:10.1038/ng.1064)
90. Thurman RE et al. 2012 The accessible chromatin landscape of the human genome. Nature 489, 75–82. (doi:10.1038/nature11232)
91. Dunham I et al. 2012 An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74. (doi:10.1038/nature11247)
92. Gerstein MB et al. 2012 Architecture of the human regulatory network derived from ENCODE data. Nature 489, 91–100. (doi:10.1038/nature11245)
93. Neph S et al. 2012 An expansive human regulatory lexicon encoded in transcription factor footprints. Nature 489, 83–90. (doi:10.1038/nature11212)
94. Yip KY et al. 2012 Classification of human genomic regions based on experimentally determined binding sites of more than 100 transcription-related factors. Genome Biol. 13, R48. (doi:10.1186/gb-2012-13-9-r48)
95. Paige SL et al. 2012 A temporal chromatin signature in human embryonic stem cells identifies regulators of cardiac development. Cell 151, 221–232. (doi:10.1016/j.cell.2012.08.027)
96. Wamstad JA et al. 2012 Dynamic and coordinated epigenetic regulation of developmental transitions in the cardiac lineage. Cell 151, 206–220. (doi:10.1016/j.cell.2012.07.035)
97. Rada-Iglesias A, Bajpai R, Prescott S, Brugmann SA, Swigut T, Wysocka J. 2012 Epigenomic annotation of enhancers predicts transcriptional regulators of human neural crest. Cell Stem Cell 11, 633–648. (doi:10.1016/j.stem.2012.07.006)
98. Gifford CA et al. 2013 Transcriptional and epigenetic dynamics during specification of human embryonic stem cells. Cell 153, 1149–1163. (doi:10.1016/j.cell.2013.04.037)
99. Xie W et al. 2013 Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell 153, 1134–1148. (doi:10.1016/j.cell.2013.04.022)
100. Domazet-Loso T, Tautz D. 2010 A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature 468, 815–818. (doi:10.1038/nature09632)
101. Irie N, Kuratani S. 2011 Comparative transcriptome analysis reveals vertebrate phylotypic period during organogenesis. Nat. Commun. 2, 248. (doi:10.1038/ncomms1248)
102. Levin M, Hashimshony T, Wagner F, Yanai I. 2012 Developmental milestones punctuate gene expression in the Caenorhabditis embryo. Dev. Cell 22, 1101–1108. (doi:10.1016/j.devcel.2012.04.004)
103. Kalinka AT, Varga KM, Gerrard DT, Preibisch S, Corcoran DL, Jarrells J, Ohler U, Bergman CM, Tomancak P. 2010 Gene expression divergence recapitulates the developmental hourglass model. Nature 468, 811–814. (doi:10.1038/nature09634)
104. Galas DJ, Schmitz A. 1978 DNAse footprinting: a simple method for the detection of protein–DNA binding specificity. Nucleic Acids Res. 5, 3157–3170. (doi:10.1093/nar/5.9.3157)
105. Hesselberth JR et al. 2009 Global mapping of protein–DNA interactions in vivo by digital genomic footprinting. Nat. Methods 6, 283–289. (doi:10.1038/nmeth.1313)
106. Neph S, Stergachis AB, Reynolds A, Sandstrom R, Borenstein E, Stamatoyannopoulos JA. 2012 Circuitry and dynamics of human transcription factor regulatory networks. Cell 150, 1274–1286. (doi:10.1016/j.cell.2012.04.040)
107. Spitz F, Duboule D. 2008 Global control regions and regulatory landscapes in vertebrate development and evolution. Adv. Genet. 61, 175–205. (doi:10. S0065-2660(07)00006-5)
108. Montavon T, Duboule D. 2012 Landscapes and archipelagos: spatial organization of gene regulation in vertebrates. Trends Cell Biol. 22, 347–354. (doi:10.1016/j.tcb.2012.04.003)
109. de Kok YJ et al. 1996 Identification of a hot spot for microdeletions in patients with X-linked deafness type 3 (DFN3) 900 kb proximal to the DFN3 gene POU3F4. Hum. Mol. Genet. 5, 1229–1235. (doi:10.1093/hmg/5.9.1229)
110. Spitz F, Herkenne C, Morris MA, Duboule D. 2005 Inversion-induced disruption of the Hoxd cluster leads to the partition of regulatory landscapes. Nat. Genet. 37, 889–893. (doi:10.1038/ng1597)
111. Jeong Y, El-Jaick K, Roessler E, Muenke M, Epstein DJ. 2006 A functional screen for sonic hedgehog regulatory elements across a 1 Mb interval identifies long-range ventral forebrain enhancers. Development 133, 761–772. (doi:10.1242/dev.02239)
112. Kleinjan DA, Seawright A, Mella S, Carr CB, Tyas DA, Simpson TI, Mason JO, Price DJ, van Heyningen V. 2006 Long-range downstream enhancers are essential for Pax6 expression. Dev. Biol. 299, 563–581. (doi:10.1016/j.ydbio.2006.08.060)
113. Noordermeer D, Leleu M, Splinter E, Rougemont J, De Laat W, Duboule D. 2011 The dynamic architecture of Hox gene clusters. Science 334, 222–225. (doi:10.1126/science.1207194)
114. Montavon T, Soshnikova N, Mascrez B, Joye E, Thevenet L, Splinter E, de Laat W, Spitz F, Duboule D. 2011 A regulatory archipelago controls Hox genes transcription in digits. Cell 147, 1132–1145. (doi:10.1016/j.cell.2011.10.023)
115. Andrey G, Montavon T, Mascrez B, Gonzalez F, Noordermeer D, Leleu M, Trono D, Spitz F, Duboule D. 2013 A switch between topological domains underlies HoxD genes collinearity in mouse limbs. Science 340, 1234167. (doi:10.1126/science.1234167)
116. Cande J, Goltsev Y, Levine MS. 2009 Conservation of enhancer location in divergent insects. Proc. Natl Acad. Sci. USA 106, 14 414–14 419. (doi:10.1073/pnas.0905754106)
117. Kikuta H et al. 2007 Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates. Genome Res. 17, 545–555. (doi:10.1101/gr.6086307)
118. Maeso I et al. 2012 An ancient genomic regulatory block conserved across bilaterians and its dismantling in tetrapods by retrogene replacement. Genome Res. 22, 642–655. (doi:10.1101/gr.132233.111)
119. Irimia M et al. 2012 Extensive conservation of ancient microsynteny across metazoans due to cis-regulatory constraints. Genome Res. 22, 2356–2367. (doi:10.1101/gr.139725.112)
120. Irimia M, Maeso I, Roy SW, Fraser HB. 2013 Ancient cis-regulatory constraints and the evolution of genome architecture. Trends Genet. 29, 521–528. (doi:10.1016/j.tig.2013.05.008)
121. Suda Y, Kurokawa D, Takeuchi M, Kajikawa E, Kuratani S, Amemiya C, Aizawa S. 2009 Evolution of Otx paralogue usages in early patterning of the vertebrate head. Dev. Biol. 325, 282–295. (doi:10.1016/j.ydbio.2008.09.018)
122. Wada H, Saiga H, Satoh N, Holland PW. 1998 Tripartite organization of the ancestral chordate brain and the antiquity of placodes: insights from ascidian Pax-2/5/8, Hox and Otx genes. Development 125, 1113–1122.
123. Wilczynski B, Liu YH, Yeo ZX, Furlong EE. 2012 Predicting spatial and temporal gene expression using an integrative model of transcription factor occupancy and chromatin state. PLoS Comput. Biol. 8, e1002798. (doi:10.1371/journal.pcbi.1002798)