Control of gene expression: Characterization of the first regulatory RNA gene, micF, and its transcript
Over the past decades, our research has concentrated on elucidating the transcriptional activation and repression of the non-coding micF RNA gene, characterizing its RNA transcript, and determining the functional role of the RNA. The micF gene was first proposed to regulate gene expression by Masayori Inouye and Takeshi Mizuno in the early 1980s. Our lab subsequently characterized the micF gene, identified its transcript, determined the structure of the MicF RNA/target ompF mRNA duplex, and showed that the RNA functions in response to environmental and internal cellular stress conditions. MicF RNA has the distinction of being the first non-coding regulatory RNA to have been discovered and characterized. More recently, other labs, led by Gerhart Wagner in Sweden and Jörg Vogel in Germany, have shown that MicF RNA has a greatly expanded role in cell physiology: it regulates multiple target genes, including a global regulator that controls ~10% of all protein genes (Mol. Microbiol. (2012) 84:414 and 428). The initial findings on MicF RNA opened the door to revealing a major principle of biology, the regulation of gene expression by RNA.
Non-coding regions of the human genome: analysis of structure and function
The functions of primate genomic regions that do not encode proteins or RNAs are for the most part unknown. We have been investigating a small 10,000 bp intergenic nucleotide sequence of human chromosome 22q11.2 (the DiGeorge/velocardiofacial/conotruncal syndrome genetic abnormality region) and have compared it to the analogous chimpanzee genomic region. The DiGeorge disorder involves a ~3 Mb deletion of the chromosomal 22q11.2 region, arising from aberrant recombination, that results in the loss of ~40 protein genes and an appreciable (though undetermined) number of miRNA and lncRNA genes. The 22q11.2 region is one of the regions of the human genome most susceptible to genetic rearrangements; thus, an analysis of its non-coding segments for functional roles is of major interest.
Via bioinformatic analyses and comparisons of human and chimpanzee sequences, we find that part of the 10,000 bp non-coding region is evolutionarily reserved for highly biased mutations. These mutations lead to the formation of palindromic sequences with very high A+T content, an overabundance of the sequence TATAATATA, and the ability of the A+T sequences to form very long stem-loop DNA secondary structures and cruciforms. It has been proposed by others that certain A+T-rich regions are hot spots for DNA translocation, and these regions are termed translocation breakpoint sequences [Chromosomal translocations and palindromic AT-rich repeats. Kato et al., Current Opinion in Genetics & Development 22:221-228 (2012); Palindrome-Mediated Translocations in Humans: A New Mechanistic Model for Gross Chromosomal Rearrangements. Inagaki et al., Front. Genet. (2016) 7:125. doi: 10.3389/fgene.2016.00125]. We hypothesize that the cell reserves non-coding regions of the genome for highly biased mutations, and that by trial and error, these non-coding regions can eventually form viable breakpoint secondary structures that may function in translocation.
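The kinds of sequence measurements mentioned above (A+T content, abundance of the TATAATATA motif, and palindromic character) can be illustrated with a minimal Python sketch. This is not the pipeline used in the lab's analyses; the function names and the toy sequence are hypothetical, and "palindromic" is taken here in the usual DNA sense of a sequence that equals its own reverse complement.

```python
# Illustrative sketch only: helper names and the toy sequence are hypothetical,
# not taken from the published analyses.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def count_motif(seq: str, motif: str = "TATAATATA") -> int:
    """Count occurrences of motif in seq, allowing overlapping matches."""
    seq = seq.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)  # advance by one base to catch overlaps
    return count


def is_palindromic(seq: str) -> bool:
    """True if seq equals its reverse complement (a perfect DNA palindrome)."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    toy = "GCGCTATAATATAATATAGCGC"  # hypothetical toy sequence, not real 22q11.2 data
    print("A+T fraction:", at_fraction(toy))
    print("TATAATATA count:", count_motif(toy))
    print("perfect palindrome:", is_palindromic(toy))
```

A perfect palindrome is the simplest case of a sequence able to fold back on itself; long stem-loops in real genomic DNA typically tolerate mismatches and a central spacer, which this sketch does not attempt to model.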
The 10,000 bp segment shows additional complexity: the translocation breakpoint sequences, which are repeated several times within the segment, carry non-coding RNA exon sequences and protein gene intron sequences, and these sequences in turn have proliferated to different parts of the genome via the spread of breakpoint sequences. We are currently concentrating on determining the abundance of A+T-rich sequences and translocation breakpoint sequences in other parts of chromosome 22 as well as in other chromosomes, and are also trying to understand the rationale behind the prolific duplication of these particular exon and intron motifs in the genome.
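One simple way to survey a longer sequence for A+T-rich stretches is a sliding-window scan, sketched below in Python. The window size, step, and threshold are arbitrary illustrative values, not parameters from our work, and the function name is hypothetical.

```python
# Illustrative sketch only: window, step, and threshold are assumed values.

def at_rich_windows(seq: str, window: int = 100, step: int = 50,
                    threshold: float = 0.9) -> list[tuple[int, int, float]]:
    """Return (start, end, A+T fraction) for each window at or above threshold.

    Coordinates are 0-based and half-open, i.e. the window is seq[start:end].
    """
    seq = seq.upper()
    hits = []
    last_start = len(seq) - window  # last position where a full window fits
    for start in range(0, max(last_start + 1, 0), step):
        chunk = seq[start:start + window]
        fraction = sum(base in "AT" for base in chunk) / window
        if fraction >= threshold:
            hits.append((start, start + window, fraction))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: a short A+T-only block flanked by G and C runs.
    toy = "G" * 10 + "AT" * 5 + "C" * 10
    for start, end, fraction in at_rich_windows(toy, window=10, step=5):
        print(start, end, fraction)
```

Overlapping windows (a step smaller than the window) reduce the chance that an A+T-rich stretch is split across two windows and missed; adjacent hits would normally be merged into a single region in a follow-up step.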