A Bayesian Nonparametric Model for Inferring Subclonal Populations from Structured DNA Sequencing Data posted to bioRxiv

Our paper, with Shai He and Aaron Schien as co-first authors, is now posted to bioRxiv. The paper is about learning genetic subpopulations from DNA sequencing data that has experimental structure, for example when multiple biopsies come from the same individual or multiple replicates come from the same biological sample. We use the augment-and-marginalize trick to reformulate the model as a Gamma-Poisson hierarchy, which yields analytical sampling steps in the Gibbs sampler and fast inference of the posterior distribution. Thanks to NIH grant 5R01GM135931 for funding this work.
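
For readers unfamiliar with why a Gamma-Poisson hierarchy gives analytical Gibbs steps, here is a minimal sketch of the underlying conjugacy. It is not the model from the paper: the function name, the prior parameters, and the counts are all made up for illustration. It assumes a single Poisson rate with a Gamma prior, in which case the conditional distribution of the rate is again a Gamma and can be sampled directly.

```python
import random

def sample_rate_posterior(counts, shape=1.0, rate=1.0, rng=random):
    """Draw a Poisson rate from its exact Gamma conditional.

    Assumed toy model (not the paper's):
      prior       lam ~ Gamma(shape, rate)
      likelihood  x_i ~ Poisson(lam)
    Conjugacy gives lam | x ~ Gamma(shape + sum(x), rate + n).
    """
    post_shape = shape + sum(counts)
    post_rate = rate + len(counts)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    return rng.gammavariate(post_shape, 1.0 / post_rate)

rng = random.Random(0)
counts = [3, 5, 4, 6, 2]  # hypothetical counts, for illustration only
draws = [sample_rate_posterior(counts, rng=rng) for _ in range(5000)]
posterior_mean = sum(draws) / len(draws)
print(posterior_mean)
```

With these made-up numbers the exact posterior is Gamma(21, 6), whose mean is 21 / 6 = 3.5, so the printed average should land close to that value. Each draw needs no accept/reject step or tuning, which is what makes a sampler built from updates of this kind fast.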