SCSIM: Jointly simulating correlated single-cell and bulk next-generation DNA sequencing data

Abstract

Recently, it has become possible to collect next-generation DNA sequencing data sets composed of multiple samples from multiple biological units, where each sample may come from a single cell or from bulk tissue. However, no tool yet exists for simulating DNA sequencing data from such a nested sampling arrangement of single-cell and bulk samples, which developers of analysis methods need in order to assess accuracy and precision. We have developed a tool that simulates DNA sequencing data from hierarchically grouped (correlated) samples, where each sample is designated as bulk or single-cell. Our tool uses a simple configuration file to define the experimental arrangement and can be integrated into software pipelines for testing variant callers or other genomic tools. The DNA sequencing data generated by our simulator is representative of real data and integrates seamlessly with standard downstream analysis tools.

Publication
BMC Bioinformatics