“In addition to the 20,000 or so genes in the human genome that give rise to proteins, our DNA contains another 20,000 sequences that resemble those genes in many ways, but do not code for proteins,” says Dr. Igor Ulitsky of the Weizmann Institute’s Biological Regulation Department. While the protein-coding genes are copied out into strands of messenger RNA, these other sequences give rise to the so-called long non-coding RNAs, or lncRNAs (pronounced link-RNAs).
Scientists have not quite figured out what – if anything – most of these enigmatic RNAs do, but mutations or alterations in lncRNA sequences have been implicated in a number of recent genomic studies of diseases, markers for susceptibility to disease and physical traits, hinting at active involvement in cellular activities. Ulitsky, members of his lab and colleagues from the Whitehead Institute and Harvard Medical School looked at lncRNAs from a new angle: how they evolve. The group’s findings, which appeared in Cell Reports, have pinpointed specific sequences that may be relevant to human health.
What is clear is that at least some lncRNAs perform regulatory tasks, turning gene expression up or down. The most famous lncRNA, called Xist, silences the extra X chromosome in females. But the fact that the genetic sequences encoding lncRNAs generally mutate and evolve rapidly, combined with their tendency to be expressed in only a limited number of cells and at low levels, makes them hard to locate or study. In comparison, scientists regularly work with protein-encoding genes that have been conserved from fruit flies to humans, or even from yeast cells to humans, whereas any lncRNA that regulates the expression of those genes has mutated, died out or evolved into a completely new form several times over in the evolutionary interval.
Nonetheless, those lncRNA sequences that have been conserved longer than the others tend to be the ones that are worth investigating. That is why Ulitsky and his team set out to map the evolution of lncRNAs. As a computational biologist, Ulitsky faced the challenge of creating algorithms that could identify lncRNAs in the genomes of 17 different species and compare them. “Since non-coding RNAs are primarily defined by what they are not, finding them was an arduous process of elimination, beginning with the coding sequences,” says Ulitsky. The species list began with the genome of a sea urchin – an invertebrate whose distance from humans on the evolutionary tree would mean that the lncRNA sequences would have had a fairly long time to diverge – and continued through fish, birds, rodents and primates.
In each species, the researchers identified at least 1,000 new lncRNAs. The next step, the comparison, revealed that there was practically no overlap between the human and sea urchin lncRNAs, that only about 100 were conserved between fish and humans, and that some 300 were conserved between chickens and humans. And those lncRNAs that could be traced to common origins over 300 million years ago had undergone extensive evolutionary changes, leaving “small islands” of DNA that remained virtually the same from species to species. These islands, says Ulitsky, should be the functional part of the lncRNAs, and, in some cases, reading the sequences of these short bits may be enough to divulge what they do. In other cases, he says, the lncRNAs are completely changed but the location in the genome is conserved. Even the fact that a “neighborhood was set aside” for lncRNAs may contain hints as to their functionality.
Ulitsky points out that this evolutionary map of lncRNAs sheds new light on the differences among species: “We share about 70% of our protein-coding genes with fish, but less than 1% of our lncRNAs. We think that some lncRNAs may even be species-specific, making brief appearances during evolution. This ‘fast-tracked’ evolution may facilitate diversification between organisms, allowing rapid adaptation.” Ulitsky and his lab group now plan to pursue the investigation of several of the new lncRNAs they identified, especially those with conserved sequences that suggest they play important roles in human health and disease.
Dr. Igor Ulitsky was born in Russia and immigrated with his family to Israel at the age of 10. He completed a PhD in computer science at Tel Aviv University. It was during his postdoctoral research at the Whitehead Institute, in Cambridge, Mass., that he switched to experimental biology, combining his computer skills with lab work to begin investigating lncRNAs in the lab of Prof. David Bartel. He joined the Weizmann Institute faculty in 2013. Ulitsky is married to Liad, who is completing a master’s degree in education, and he is the father of two children, aged 4 and 7, with whom he greatly enjoys spending his spare time.
Dr. Igor Ulitsky's research is supported by the Willner Family Leadership Institute for the Weizmann Institute of Science; the Abramson Family Center for Young Scientists; the Rising Tide Foundation; and the Fritz Thyssen Stiftung. Dr. Ulitsky is the incumbent of the Sygnet Career Development Chair for Bioinformatics.