Moonglow
Diamond Member
Production Phase Results
In September 2012, the project released a much more extensive set of results in 30 papers published simultaneously in several journals: six in Nature, six in Genome Biology, and 18 in a special issue of Genome Research.[17]
The authors described the production and initial analysis of 1,640 data sets designed to annotate functional elements in the entire human genome, integrating results from diverse experiments within cell types, related experiments involving 147 different cell types, and all ENCODE data with other resources, such as candidate regions from genome-wide association studies (GWAS) and evolutionarily constrained regions. Together, these efforts revealed important features about the organization and function of the human genome, which were summarized in an overview paper as follows:[18]
- The vast majority (80.4%) of the human genome participates in at least one biochemical RNA- and/or chromatin-associated event in at least one cell type. Much of the genome lies close to a regulatory event: 95% of the genome lies within 8 kb of a DNA-protein interaction (as assayed by bound ChIP-seq motifs or DNaseI footprints), and 99% is within 1.7 kb of at least one of the biochemical events measured by ENCODE.
- Primate-specific elements as well as elements without detectable mammalian constraint show, in aggregate, evidence of negative selection; thus some of them are expected to be functional.
- Classifying the genome into seven chromatin states suggests an initial set of 399,124 regions with enhancer-like features and 70,292 regions with promoter-like features, as well as hundreds of thousands of quiescent regions. High-resolution analyses further subdivide the genome into thousands of narrow states with distinct functional properties.
- It is possible to quantitatively correlate RNA sequence production and processing with both chromatin marks and transcription factor (TF) binding at promoters, indicating that promoter functionality can explain the majority of RNA expression variation.
- Many non-coding variants in individual genome sequences lie in ENCODE-annotated functional regions; this number is at least as large as those that lie in protein-coding genes.
- SNPs associated with disease by GWAS are enriched within non-coding functional elements, with a majority residing in or near ENCODE-defined regions that are outside of protein-coding genes. In many cases, the disease phenotypes can be associated with a specific cell type or TF.
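
To make the "95% of the genome lies within 8 kb" style of statistic concrete, here is a minimal sketch of how such a fraction could be computed from a list of annotated intervals. It is not ENCODE's actual pipeline: the function name `fraction_within`, the half-open coordinate convention, and the toy coordinates are all assumptions made for illustration.

```python
def fraction_within(intervals, genome_length, distance):
    """Fraction of positions in [0, genome_length) lying within
    `distance` bp of at least one interval.

    `intervals` is an iterable of half-open (start, end) pairs; this
    coordinate convention is an assumption of the sketch.
    """
    # Pad every interval by `distance` on both sides, clipped to the genome.
    padded = sorted(
        (max(0, start - distance), min(genome_length, end + distance))
        for start, end in intervals
    )

    covered = 0
    cur_start = cur_end = None
    for start, end in padded:
        if cur_end is None or start > cur_end:
            # Disjoint from the current merged block: close it, open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start

    return covered / genome_length


# Toy example on a hypothetical 1,000 bp "genome" with three annotated elements.
elements = [(100, 200), (150, 300), (900, 950)]
print(fraction_within(elements, 1000, 0))
print(fraction_within(elements, 1000, 50))
```

Merging the padded intervals before summing avoids counting a position twice when it lies near more than one element, which matters when elements are dense.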