Long noncoding RNAs (lncRNAs) represent a vast unexplored genetic space that may hold missing drivers of tumourigenesis, but few such “driver lncRNAs” are known. Until now, they have been discovered through changes in expression, which makes it difficult to distinguish causative roles from passenger effects.
Researchers at the Barcelona Institute of Science and Technology have developed a different approach to driver lncRNA discovery, based on mutational patterns in tumour DNA. Their pipeline, ExInAtor, identifies genes with an excess load of somatic single nucleotide variants (SNVs) across panels of tumour genomes. Heterogeneity in mutational signatures between cancer types and individuals is accounted for using a simple local trinucleotide background model, which yields high precision and low computational demands. They use ExInAtor to predict drivers from the GENCODE annotation across 1112 entire genomes from 23 cancer types. Using a stratified approach, the researchers identify 15 high-confidence candidates: 9 novel and 6 known cancer-related genes, including MALAT1, NEAT1 and SAMMSON. Both known and novel driver lncRNAs are distinguished by elevated gene length, evolutionary conservation and expression. The result is a first catalogue of mutated lncRNA genes driving cancer, which will grow and improve as ExInAtor is applied to future tumour genome projects.
Outline of the ExInAtor method
(A) The steps of gene definition, subsampling and analysis performed to quantify exonic and background mutations. Sampling is performed in such a way that, at the end, the trinucleotide frequency of the background region is identical to that of the exonic region. (B) The numbers of mutations in the background and exonic regions are compared by a contingency table analysis.
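The two steps in the outline can be illustrated with a minimal Python sketch. It is not the ExInAtor code: the function names, the integer scaling used to match trinucleotide frequencies, and the choice of a one-sided Fisher's exact test for the contingency table are all assumptions made for illustration, and each position is treated as either mutated or not.

```python
import random
from collections import Counter
from math import comb


def trinucleotide_counts(seq):
    """Count the trinucleotide centred on each interior position of seq."""
    return Counter(seq[i - 1:i + 2] for i in range(1, len(seq) - 1))


def matched_background_positions(exon_seq, bg_seq, rng):
    """Subsample background positions so that their trinucleotide
    frequencies equal those of the exon (hypothetical helper)."""
    exon_counts = trinucleotide_counts(exon_seq)
    # Group background positions by the trinucleotide centred on them.
    by_context = {}
    for i in range(1, len(bg_seq) - 1):
        by_context.setdefault(bg_seq[i - 1:i + 2], []).append(i)
    # Largest integer multiple of the exon counts the background can supply.
    scale = min(len(by_context.get(t, [])) // c for t, c in exon_counts.items())
    sampled = []
    for t, c in exon_counts.items():
        sampled.extend(rng.sample(by_context.get(t, []), scale * c))
    return sorted(sampled)


def fisher_exact_greater(exon_mut, exon_len, bg_mut, bg_len):
    """One-sided Fisher's exact P-value for an excess of mutated positions
    in the exon, from the 2x2 table
    [[exon_mut, exon_len - exon_mut], [bg_mut, bg_len - bg_mut]]."""
    total_mut = exon_mut + bg_mut
    denom = comb(exon_len + bg_len, total_mut)
    upper = min(total_mut, exon_len)
    tail = sum(
        comb(exon_len, k) * comb(bg_len, total_mut - k)
        for k in range(exon_mut, upper + 1)
    )
    return tail / denom


# Toy data (made up): a short "exon" and a longer "background" sequence.
exon_seq = "ACGTACGTAC"
bg_seq = "ACGTACGT" * 10
rng = random.Random(0)  # fixed seed so the subsample is reproducible
positions = matched_background_positions(exon_seq, bg_seq, rng)

# Made-up mutation counts: 5 mutated exonic positions, none in the background.
p_value = fisher_exact_greater(5, 8, 0, len(positions))
print(len(positions), p_value)
```

Using an integer multiple of the exonic counts keeps the sampled background composition exactly proportional to the exon's, at the cost of discarding some background positions; a real pipeline may handle this differently.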
Availability - The latest ExInAtor version is freely available for download here: https://github.com/alanzos/ExInAtor/