Regulation of gene expression by small non-coding RNAs
Small non-coding RNAs regulate gene expression at the post-transcriptional level by binding target mRNAs and either inhibiting their translation or inducing degradation of the message. Regulatory RNAs were first discovered in prokaryotes but are widespread in both prokaryotes and eukaryotes. In E. coli, over 260 non-coding RNA genes have been characterized; they are listed on the Rfam Sanger Institute website. Most prokaryotic regulatory RNAs are encoded in intergenic regions of the genome, and expression of RNA genes is generally induced by environmental signals. The mechanisms of post-transcriptional regulation by small RNAs, their abundance in various bacteria, their evolutionary origins, and the mechanism of transcriptional activation of regulatory RNA genes all pose challenging problems.
For the past 25 years, our research has concentrated on elucidating the activation and suppression of the non-coding micF RNA gene, characterizing the RNA transcript, and determining its functional role. This RNA was discovered by Masayori Inouye and Takeshi Mizuno in the early 1980s. Our lab, partly in collaboration with the Inouye lab, characterized micF, its transcript, and its biological function. micF has the distinction of being the first non-coding regulatory RNA gene to have been discovered and characterized. Recently, other labs, led by Gerhart Wagner in Sweden and Jörg Vogel in Germany, have shown that micF has a greatly expanded role in cell physiology: it regulates multiple target genes, including a global regulator that controls ~10% of all protein genes (Mol. Microbiol. (2012) 84:414 and 428). In addition to the several hundred small regulatory RNAs found in prokaryotes, the number in eukaryotes is striking, with humans having more than 1000 small regulatory RNAs. The major difference between prokaryotic small non-coding RNAs (termed sRNAs) and their eukaryotic counterparts (termed miRNAs) is that prokaryotic sRNAs are largely primary transcripts, whereas eukaryotic miRNAs are processed into small ~22 nt polynucleotides. Both act by base-pairing to a target RNA, usually with imperfect pairing.
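As a rough illustration of what imperfect pairing between a regulatory RNA and its target looks like, the sketch below aligns two equal-length RNA segments in antiparallel orientation and marks each position as a Watson-Crick pair, a G-U wobble pair, or a mismatch. The function name and the example sequences are hypothetical and are not taken from micF or any of its targets. The sketch also assumes a gapless duplex, whereas real sRNA/mRNA duplexes commonly contain bulges and internal loops.

```python
# Minimal sketch: position-by-position pairing between a regulatory RNA and a
# target segment. Assumes equal-length, gapless, antiparallel alignment.

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pairing_profile(regulatory_rna: str, target: str) -> str:
    """Return a string with one symbol per position.

    Both sequences are given 5'->3'. The target is reversed so that the two
    strands are compared in antiparallel orientation.
    '|' = Watson-Crick pair, ':' = G-U wobble pair, '.' = mismatch.
    """
    if len(regulatory_rna) != len(target):
        raise ValueError("this sketch only handles equal-length segments")
    symbols = []
    for r, t in zip(regulatory_rna.upper(), reversed(target.upper())):
        if (r, t) in WATSON_CRICK:
            symbols.append("|")
        elif (r, t) in WOBBLE:
            symbols.append(":")
        else:
            symbols.append(".")
    return "".join(symbols)


if __name__ == "__main__":
    # Hypothetical sequences chosen only to show one mismatch and one wobble.
    print(pairing_profile("GCGAUU", "AGACGC"))
```

A profile with a few '.' or ':' symbols among the '|' symbols corresponds to the imperfect duplexes described above. A realistic analysis would use a dedicated RNA-RNA interaction tool rather than this position-wise comparison.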
Non-autonomous transposable elements and other repetitive sequences in prokaryotes and eukaryotes
Prokaryotic and eukaryotic genomes contain multiple copies of repeat sequences, many of which are miniature inverted-repeat transposable elements (MITEs), a class of non-autonomous transposable elements. These repeats take part in various molecular processes, such as carrying promoter sequences and regulating mRNA stability. We previously outlined the rich variety of prokaryotic repeat sequences (Impact of small repeat sequences on bacterial genome evolution. Delihas N. Genome Biol Evol. (2011) 3:959-973). Higher eukaryotes such as primates have a far more complex set of DNA repeats in intergenic non-coding regions, including AT-rich segments, repeats of large sequences, and the non-autonomous transposable Alu sequences.
The major differences between the human and chimpanzee genomes lie in non-coding regions rather than in protein-coding sequences. We are investigating intergenic nucleotide sequences of human chromosome 22q11 (the DiGeorge/velocardiofacial/conotruncal syndrome genetic abnormality region). These intergenic regions serve a variety of cellular functions; for example, Beverly Emanuel and co-workers (Current Opinion in Genetics and Development (2012) 22:221-228) have proposed that AT-rich regions are hot spots of DNA translocation. Using bioinformatics, we are currently comparing human and chimp non-coding regions of 22q11 for genetic drift, additions/deletions, sequence variations in AT-rich regions, prevalence of Alu elements, and duplications. Our aim is to determine the degree of variation in non-coding DNA between humans and other primates. Several AT-rich regions display imperfect long stem-loop structures containing internal looped-out positions; if these regions are transcribed, this may signal a function at the RNA level. Some large segments of AT-rich sequence lie in intergenic regions, e.g., those found ~5000 bp downstream of the 3' end of the DGCR6 protein gene (a gene in the DiGeorge syndrome critical region of chromosome 22). These AT-rich sequences are repeated many times in 22q11 and elsewhere, and the possible function of these repeats is intriguing. We find that in the chimp, part of this sequence differs significantly from the human sequence.
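One simple way to locate AT-rich segments in an intergenic sequence is a sliding-window scan of AT content. The sketch below is a minimal illustration of that idea, not the pipeline used in this work. The function names, the window size, the step, and the 80% threshold are all assumptions chosen for the example, and the input is a synthetic sequence rather than 22q11 data.

```python
# Minimal sketch: flag AT-rich segments of a DNA sequence with a sliding window.
# Window size, step, and threshold are illustrative assumptions.


def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq, window=50, step=10, threshold=0.8):
    """Return (start, end, fraction) for each window at or above threshold."""
    hits = []
    for start in range(0, max(len(seq) - window + 1, 0), step):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, start + window, frac))
    return hits


def merge_windows(hits):
    """Merge overlapping or adjacent windows into (start, end) segments."""
    merged = []
    for start, end, _ in hits:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    # Synthetic example: a 100-base AT-only block flanked by GC-only blocks.
    example = "GC" * 50 + "AT" * 50 + "GC" * 50
    print(merge_windows(at_rich_windows(example)))
```

Running the same scan on aligned human and chimp intergenic sequences and comparing the merged segments would give a first, coarse view of where AT-rich regions differ between the two genomes. Coordinates are 0-based with an exclusive end.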