TY - JOUR
T1 - NGSNGS
T2 - next-generation simulator for next-generation sequencing data
AU - Henriksen, Rasmus Amund
AU - Zhao, Lei
AU - Korneliussen, Thorfinn Sand
N1 - Publisher Copyright: © The Author(s) 2023.
PY - 2023/1/1
Y1 - 2023/1/1
AB - With the rapid expansion of the capabilities of DNA sequencers across the different sequencing generations, the quantity of generated data has likewise increased. This evolution has also led to new bioinformatic methods, for which in silico data have become crucial when verifying the accuracy of a model or the robustness of a genomic analysis pipeline. Here, we present a multithreaded next-generation simulator for next-generation sequencing data (NGSNGS), which simulates reads faster than currently available methods and programs. NGSNGS can simulate reads with platform-specific characteristics based on nucleotide quality score profiles, and it includes a post-mortem damage model that is relevant for simulating ancient DNA. The simulated sequences are sampled (with replacement) from a reference DNA genome, which can represent a haploid genome, polyploid assemblies or even population haplotypes, allowing the user to simulate known variable sites directly. The program is implemented in a multithreading framework and is several times faster than currently available tools while extending their feature set and possible output formats.
UR - https://www.scopus.com/pages/publications/85147318129
U2 - 10.1093/bioinformatics/btad041
DO - 10.1093/bioinformatics/btad041
M3 - Article
C2 - 36661298
AN - SCOPUS:85147318129
SN - 1367-4803
VL - 39
JO - Bioinformatics
JF - Bioinformatics
IS - 1
M1 - btad041
ER -