New Technologies, New Findings, and New Concepts in the Study of Vertebrate cis-Regulatory Sequences
Figure 1. Highly conserved noncoding regions (HCNRs) spanning the loci of genes for key transcription factors involved in development and differentiation (adapted from Sandelin et al., [2004b]). Each of the three sequence windows covers 4 Mb of genomic sequence around HCNRs. Yellow triangles represent individual HCNRs. Different structural classes of transcription factor genes are shown in different colors (red, homeobox; green, C2H2 zinc finger; purple, other TF; light gray, other [non-TF] genes). Meis2 is spanned by the largest array, comprising 84 HCNRs. DACH is flanked by two gene deserts that contain HCNRs. The HoxD cluster is controlled by HCNRs that span the loci of several neighboring genes.

Figure 2. Representing transcription factor binding site motifs, using the cAMP response element-binding protein (CREB) as an example. A: Consensus sequence using IUPAC symbols for degenerate positions. B: An equivalent regular expression. C: Sequence logo, showing the contributions of individual positions, and of individual nucleotides at each position, to the overall specificity of the binding site. D: A position-specific score matrix, used for direct scoring of DNA sequences.

Figure 3. An example of de novo motif discovery. A large number of genes whose products are involved in amino acid biosynthesis in the yeast Saccharomyces cerevisiae are induced under conditions of amino acid starvation. A Gibbs sampling algorithm implemented in the program AlignACE (Hughes et al., [2000]) correctly locates the motif corresponding to the binding site of the protein Gcn4p, a known master regulator of the amino acid starvation response (Natarajan et al., [2001]).

Figure 4. The principle of phylogenetic footprinting. A,B: A genomic region of interest is aligned with one or more orthologous regions of another, suitably distant species (A), and the most conserved regions are identified using a scoring function that produces a conservation profile (B).
Regions exceeding a selected conservation score are extracted for further analysis.

Figure 5. An example of phylogenetic footprinting in action. A: The result of scanning a single sequence (the region from -1,400 to +100 of human muscle creatine kinase) with a subset of the most specific (information content >10 bits) transcription factor matrix models from the JASPAR database (Sandelin et al., [2004a]). Previously characterized binding sites are marked by red boxes. B: The same region after phylogenetic footprinting with the orthologous mouse sequence, shown along with the corresponding conservation profile. In this case, all known sites are retained, whereas most of the false predictions are filtered out because they are absent at the corresponding positions in the mouse sequence. The analysis was performed using ConSite (Sandelin et al., [2004c]).

Figure 6. Multiple highly conserved noncoding regions (HCNRs) within the IrxB cluster harbor cis-regulatory elements. A: VISTA view of the occurrence of conserved sequence domains in the two gene deserts between the Irx genes of the IrxB cluster. Shown from top to bottom are Human vs. Mouse, Human vs. Chick, Human vs. Xenopus tropicalis, and Human vs. Fugu global alignments. Colored peaks (purple, coding; pink, noncoding) indicate regions of at least 100 bp and 75% similarity. The number of HCNRs increases as evolutionarily closer species are compared. B: The gene deserts between Irx genes contain multiple HCNRs. Most of them show enhancer activity in zebrafish transgenic assays and activate expression in subdomains of the territories that express Irx genes (green boxes). Because in many cases these subdomains are common to more than one Irx gene in the complex, these enhancers are likely to be shared. Although not proven, some of these regions may also harbor negative elements (red boxes). Three examples of the enhancer activity of some of these regions are shown.
These enhancers are located within the HCNRs marked with a green asterisk in A. In these cases, the conserved noncoding genomic regions drive enhanced green fluorescent protein (EGFP) expression in transgenic zebrafish assays. As a result of the action of positive and negative cis-regulatory elements on more than one IrxB gene, all Irx genes in this complex show relatively similar expression patterns.

Figure 7. Enhancer detection integrations in a zebrafish gene desert. Four integrations were recovered in a 320-kb interval upstream of the sox11b gene. A highly conserved noncoding region (HCNR) was mapped to the corresponding gene desert in the human and mouse genomes. Note how the insertions closest to the HCNR exhibit the same expression pattern (which corresponds to sox11b expression), whereas the most distal one (CLGY205) retains only weak expression in the telencephalon. Numbers under the photomicrographs denote the distance in base pairs from the start codon of sox11b. See Ellingsen et al. ([2005]) for further details.

Footprinting, DNase: DNA with bound protein is resistant to digestion by DNase. When a sequencing reaction is performed on such DNA, a protected area representing the footprint of the bound protein is detected. This permits identification of the protein-binding regions of the DNA.
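
The motif representations listed in the caption of Figure 2 (IUPAC consensus, regular expression, position-specific score matrix) can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the procedure used to produce the figure: the consensus `TGACGYMA` and the four example sites are hypothetical, chosen only to resemble the palindromic CRE core TGACGTCA; the function names, the pseudocount of 1, and the uniform background of 0.25 per base are likewise illustrative choices.

```python
import math
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def consensus_to_regex(consensus):
    """Turn an IUPAC consensus (panel A) into a regular expression (panel B)."""
    parts = []
    for symbol in consensus.upper():
        bases = IUPAC[symbol]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position-specific score matrix (panel D).

    `sites` is a list of equal-length aligned binding sites. Each column
    becomes a dict mapping base -> log2(observed frequency / background).
    """
    total = len(sites) + 4 * pseudocount
    pssm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm):
    """Score every window of an A/C/G/T-only sequence; return (start, score) pairs."""
    width = len(pssm)
    return [
        (start, sum(pssm[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


# Hypothetical data, for illustration only.
example_sites = ["TGACGTCA", "TGACGTAA", "TGACGCAA", "TGACGTCA"]
target = "AATGACGTCATT"

pattern = consensus_to_regex("TGACGYMA")
regex_hits = re.findall(pattern, target)
best_start, best_score = max(scan(target, build_pssm(example_sites)), key=lambda hit: hit[1])
```

The regular expression gives a yes/no answer for each window, whereas the matrix assigns a graded score, which is why matrix models can be ranked by specificity and thresholded when scanning longer sequences.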