Accelerated Evolution of Enhancer Hotspots in the Mammal Ancestor
Mammals have evolved remarkably different sensory, reproductive, metabolic, and skeletal systems. To explore the genetic basis for these differences, we developed a comparative genomics approach to scan whole-genome multiple sequence alignments to identify regions that evolved rapidly in an ancestral lineage but are conserved within extant species. This pattern suggests that ancestral changes in function were maintained in descendants. After applying this test to therian mammals, we identified 4,797 accelerated regions, many of which are noncoding and located near developmental transcription factors. We then used mouse transgenic reporter assays to test if noncoding accelerated regions are enhancers and to determine how therian-specific substitutions affect their activity in vivo. We discovered enhancers with expression specific to the therian version in brain regions involved in the hormonal control of milk ejection, uterine contractions, blood pressure, temperature, and visual processing. This work underscores the idea that changes in developmental gene expression are important for mammalian evolution, and it pinpoints candidate genes for unique aspects of mammalian biology.
We used phyloP combined with the phastCons program in PHAST (Pollard et al. 2010; Hubisz et al. 2011) to scan whole-genome alignments in vertebrates for sequences that are present in therian and nontherian vertebrates but changed significantly in the therian mammal ancestor and remained highly conserved during therian diversification. We identified 177,346 vertebrate genomic regions that are conserved among therians (therian conserved regions; TCRs), of which 4,797 have a strong signature for accelerated evolution in the therian ancestor (false discovery rate <1%; supplementary table S1, Supplementary Material online). We call these 4,797 sequences Therian-Specific Accelerated Regions (TSARs).
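To illustrate how a false discovery rate threshold of 1% turns per-region P values into a set of accelerated regions, the following minimal sketch applies the Benjamini-Hochberg step-up procedure. The choice of Benjamini-Hochberg is an assumption made for illustration, since the text above states only the threshold, and the function name and P values are hypothetical rather than taken from the TSAR analysis.

```python
def benjamini_hochberg(pvalues, alpha=0.01):
    """Return one boolean per p-value: True if rejected at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up control; the source text reports
    only "false discovery rate <1%" and does not name the procedure.
    """
    m = len(pvalues)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical acceleration p-values for six conserved regions.
pvals = [0.00001, 0.0004, 0.02, 0.3, 0.0031, 0.9]
accelerated = benjamini_hochberg(pvals, alpha=0.01)
print(accelerated)  # [True, True, False, False, True, False]
```

Because the procedure is step-up, a region is retained whenever its P value ranks at or below the largest rank that meets the threshold, even if its own P value exceeds the per-rank cutoff.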
Using simulations (see Materials and Methods), we established that the power of this method is sufficient to identify TSARs and that specificity is high for elements over 150 bp. For longer elements, power and specificity are remarkably high (supplementary fig. S1C, Supplementary Material online).
The distribution of TSARs with respect to human gene features is similar to that of TCRs (fig. 1B). TSARs are largely within genic regions (supplementary table S2, Supplementary Material online), in contrast to HARs and the mammalian-conserved elements from which HARs were identified, both of which occur most frequently in intergenic DNA (Pollard et al. 2006; Lindblad-Toh et al. 2011). This difference in distribution may arise because intergenic sequences typically evolve more rapidly and are therefore more difficult to align in the TSAR analysis, which adds a requirement of syntenic alignment to nonmammalian vertebrates (see Materials and Methods).
To explore the hypothesis that TSARs are uncharacterized regulatory elements, we used functional genomics data from the ENCODE project (Dunham et al. 2012) and other sources (Griffith et al. 2008; Visel et al. 2009; Blow et al. 2010; Rada-Iglesias et al. 2011; Shen et al. 2012) to investigate how many TSARs overlap features indicative of enhancer or promoter activity in various human and mouse cell types (supplementary table S1, Supplementary Material online). We found widespread evidence that TSARs are active regulatory sequences. First, from human ENCODE data, 89% of TSARs overlap DNaseI hypersensitive sites, H3K27 acetylation marks, or P300 peaks, all of which indicate regulatory activity. Furthermore, the chromHMM analysis of human ENCODE data (Ernst and Kellis 2012) predicts that 14% of TSARs are promoters and 35% are enhancers, and only 12.7% of TSARs are in intergenic regions without any predicted regulatory function.
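The overlap percentages reported above come down to asking, for each region, whether it intersects at least one annotated feature. The sketch below shows one way such a fraction could be computed. It is an illustration under stated assumptions, not the pipeline used in this study: coordinates are assumed to be BED-style (0-based, half-open), and all function names and intervals are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(features):
    """Index (chrom, start, end) features per chromosome for fast lookup."""
    by_chrom = defaultdict(list)
    for chrom, start, end in features:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [s for s, _ in intervals]
        # max_ends[i] is the largest end among the first i + 1 sorted features.
        max_ends = []
        running = float("-inf")
        for _, end in intervals:
            running = max(running, end)
            max_ends.append(running)
        index[chrom] = (starts, max_ends)
    return index


def overlaps_any(region, index):
    """True if the half-open region shares at least 1 bp with any feature."""
    chrom, start, end = region
    if chrom not in index:
        return False
    starts, max_ends = index[chrom]
    k = bisect_left(starts, end)  # number of features that start before `end`
    # Among those features, one overlaps if its end lies beyond `start`.
    return k > 0 and max_ends[k - 1] > start


def fraction_overlapping(regions, features):
    """Fraction of regions that overlap at least one feature."""
    if not regions:
        return 0.0
    index = build_index(features)
    hits = sum(overlaps_any(region, index) for region in regions)
    return hits / len(regions)


# Hypothetical regions and regulatory features (not real TSAR coordinates).
regions = [("chr1", 100, 250), ("chr1", 500, 700),
           ("chr2", 50, 120), ("chr2", 900, 1000)]
features = [("chr1", 200, 300), ("chr1", 700, 800), ("chr2", 0, 60)]
print(fraction_overlapping(regions, features))  # 0.5
```

Under the half-open convention, the region chr1:500-700 does not overlap the feature chr1:700-800, because the two intervals only touch at position 700.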
Moreover, 98% of TSARs that overlap chromHMM-predicted enhancer sequences are supported by at least one additional experiment showing evidence of enhancer activity, and 89% are supported by multiple additional experiments. These include 366 TSARs that overlap elements from experiments designed to identify cis-regulatory regions in murine tissues (Shen et al. 2012), and 129 that overlap putative regulatory regions in the Open Regulatory Annotation database (Griffith et al. 2008).
A small number of TSARs have been characterized experimentally in previous studies. TSARs overlap 17 early developmental enhancers in humans (Rada-Iglesias et al. 2011) and 105 putative heart enhancers in mouse (Blow et al. 2010). Transgenic enhancer assays from the VISTA Enhancer Browser (Visel et al. 2007) show that 51 TSARs have enhancer activity in e11.5 mouse embryos. These enhancers are primarily active in the brain, eye, nose, and limbs (Visel et al. 2007). Overall, 4,285 of the 4,797 TSARs display evidence of regulatory function, yet they are largely uncharacterized. Future studies of these loci may offer clues about therian-specific biology.
We thank M.J. Hubisz for implementing the --branch test in phyloP; J.D. Wythe and W.P. Devine for the transgenic reporter constructs and advice; C. Miller (Gladstone Histology Core) for performing optical projection tomography and virtual sections; F. Kreitzer for advice on incubating and dissecting the chicken egg; D. Kostka for discussions and scripts for testing distance between TSARs; and S. Orkin and M. Salminen for in situ hybridization probes. Transgenic mice were generated by Cyagen Biosciences, Inc. This work was supported by the Gladstone Institutes, NIH R01-GM82901 (A.K.H., K.S.P.), the Lawrence J. and Florence A. DeGeorge Charitable Trust/American Heart Association Established Investigator Award (B.G.B.), NIH/NCRR grant (C06 RR018928) to the Gladstone Institutes, William H. Younger, Jr (B.G.B.), and Nina Ireland and NIMH R37-MH049428 (J.L.R.).