The characterization of mutational processes in terms of their signatures of activity relies largely on the assumption that the mutations in a given cancer genome are independent of one another. Recently, it was discovered that certain segments of mutations, termed processive groups, occur on the same DNA strand and are generated by a single process or signature. Here we provide a first probabilistic model of mutational signatures that accounts for their observed stickiness and strand coordination. The model conditions on the observed strand of each mutation and allows the same signature to generate a run of consecutive mutations. It can either use known signatures or learn new ones. We show that this model describes the properties of mutagenic processes more accurately than independent-mutation models, achieving substantially higher likelihood on held-out data. We apply the model to characterize the processivity of mutagenic processes across multiple types of cancer.
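To make the idea of a "sticky" signature chain concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple hidden Markov formulation in which the hidden state is the active signature, each mutation category is emitted from that signature's distribution, and, when two consecutive mutations lie on the same strand, the previous signature is retained with probability `stick` (otherwise a signature is redrawn from the exposure vector `start`). The function name, the parameter names, and the exact way strand enters the transition are all illustrative assumptions.

```python
import numpy as np

def sticky_log_likelihood(obs, strands, emissions, start, stick):
    """Forward-algorithm log-likelihood under a hypothetical sticky signature chain.

    obs       : sequence of mutation-category indices, in genomic order
    strands   : sequence of strand labels (e.g. '+' or '-'), one per mutation
    emissions : array (n_signatures, n_categories); each row sums to 1
    start     : array (n_signatures,); signature exposures, sums to 1
    stick     : probability of reusing the previous signature when the
                strand is unchanged (assumption: no stickiness across strands)
    """
    emissions = np.asarray(emissions, dtype=float)
    start = np.asarray(start, dtype=float)

    # Initialise with the first mutation and normalise to avoid underflow.
    alpha = start * emissions[:, obs[0]]
    norm = alpha.sum()
    log_lik = np.log(norm)
    alpha = alpha / norm

    for t in range(1, len(obs)):
        s = stick if strands[t] == strands[t - 1] else 0.0
        # Transition: keep the signature with probability s, else redraw.
        # alpha sums to 1, so the redraw term reduces to (1 - s) * start.
        predicted = s * alpha + (1.0 - s) * start
        alpha = predicted * emissions[:, obs[t]]
        norm = alpha.sum()
        log_lik += np.log(norm)
        alpha = alpha / norm

    return log_lik

# Toy data: two signatures, two mutation categories, one strand.
emissions = np.array([[0.9, 0.1],
                      [0.1, 0.9]])
start = np.array([0.5, 0.5])
obs = [0, 0, 0, 1, 1, 1]
strands = ["+"] * 6

independent = sticky_log_likelihood(obs, strands, emissions, start, stick=0.0)
sticky = sticky_log_likelihood(obs, strands, emissions, start, stick=0.8)
```

Setting `stick=0.0` recovers the independent-mutation mixture, so the two log-likelihoods can be compared directly; on this toy sequence, which consists of two runs, the sticky variant scores higher.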