Blogs written by Savita Jayaram, Ph.D., Bioinformatics Scientist

A single cell shows a distinct division of labor, with specific sub-cellular components, or organelles, performing specialized functions. This segregation becomes even more apparent in multi-cellular organisms, where different organs have evolved to perform different functions. The Encyclopedia of DNA Elements (ENCODE) project, whose data are hosted at UCSC, now reports that the same holds across the vast landscape of our genomic architecture, in what appears to be yet another milestone after the first draft of the human genome. According to ENCODE, the great majority of our DNA has a distinct structure, purpose and function, dispelling the long-held view that most of the human genome, being non-coding, is ‘junk’ DNA.

The scale of the data ENCODE analyzed is considerable: 1,640 genome-wide datasets prepared from 147 cell types. The findings were published in 30 papers across multiple journals, 6 of them in Nature, along with 6 review articles in Science, Cell and other journals.

The ENCODE papers exposed prodigious numbers of cis-regulatory elements, such as enhancers, promoters, insulators, silencers and locus control regions, that fill this landscape, along with regions encoding various non-coding RNAs that have regulatory roles. Scientists had already figured out that DNA is not static: the dynamic opening of DNA bubbles, known as DNA breathing, is thought to be crucial for biological functions such as transcription initiation and the interaction of DNA with proteins that selectively bind single-stranded DNA. Nor are genes simply organized linearly on chromosomes. Chromatin loops and twists to bring together regions separated by thousands of base pairs, so that distal elements such as enhancers can communicate their information.

ENCODE identified more than 200,000 DNase I hypersensitive sites (DHSs) per cell type, and 2.9 million sites genome-wide overall. These sites are not protected by nucleosomes (the basic units of chromatin) and are hence accessible to enzymatic cleavage by nucleases. Of these, 580,000 distal DHSs were found to be associated with promoters, revealing a possible role as enhancers of target genes. But this leaves nearly 2 million putative enhancers without known target genes. This, the authors say, is enough data to chew over for some time to come. They also found 243,037 CpG islands falling within these DHSs in 19 different cell types, and provided additional evidence that transcription factor binding sites are on average far less methylated and that increased methylation is negatively associated with chromatin accessibility.

In addition, ENCODE data provide evidence that variants identified by Genome Wide Association Studies (GWAS) are enriched within these non-coding functional units, in a cell-type-specific manner that is consistent with certain traits, suggesting a link to disease. We therefore risk missing crucial information by focusing only on variation in the coding regions of the genome, or the ‘exome’.

All the ENCODE-related papers can be retrieved online via a specially designed visualization tool, the Nature ENCODE explorer (http://www.nature.com/encode/#/threads), which lets users access all the relevant papers on a given topic.
