Abstract
Bioinformatics methods for RNA-seq data analysis are evolving rapidly as sequencing technologies improve. However, many challenges remain in processing RNA-seq data efficiently to obtain accurate and comprehensive results. Here we review strategies for improving diverse transcriptomic studies and the annotation of genetic variants based on RNA-seq data. Mapping RNA-seq reads to the genome and mapping them to the transcriptome are two distinct methods for quantifying the expression of genes and transcripts. Beyond the known genes annotated in current databases, many novel genes and transcripts (especially long noncoding RNAs) can still be identified on the reference genome using RNA-seq. Moreover, because current reference genomes are incomplete, some novel genes are missing from them. Genome-guided and de novo transcriptome reconstruction are two effective and complementary strategies for identifying novel genes and transcripts on or beyond the reference genome. In addition, integrating genes from distinct databases in transcriptomic and genetic studies can improve the results of the corresponding analyses.
| Original language | English |
|---|---|
| Pages (from-to) | 116-125 |
| Number of pages | 10 |
| Journal | Science China Life Sciences |
| Volume | 60 |
| Issue number | 2 |
| State | Published - 1 Feb 2017 |
Keywords
- RNA-seq
- de novo assembly
- genetic variants
- genome-guided transcriptome reconstruction
- long noncoding RNA