Abstract
Motivation: Controlling the outcomes of CRISPR editing is crucial for the success of gene therapy. Since donor template-based editing is often inefficient, alternative strategies have emerged that leverage mutagenic end-joining repair instead. Existing machine learning models can accurately predict end-joining repair outcomes; however, generalisability beyond the specific cell line used for training remains a challenge, and interpretability is typically limited by suboptimal feature representation and model architecture.

Results: We propose X-CRISP, a flexible and interpretable neural network for predicting repair outcome frequencies based on a minimal set of outcome and sequence features, including microhomologies (MH). X-CRISP outperformed prior models on detailed and aggregate outcome predictions, and prioritised MH location over MH sequence properties such as GC content for deletion outcomes. Through transfer learning, we adapted X-CRISP pre-trained on wild-type mESC data to the human cell lines K562, HAP1, and U2OS, and to mESC lines with altered DNA repair function. Adapted X-CRISP models improved over direct training on target data with as few as 50 samples, suggesting that this strategy could be leveraged to build models for new domains using a fraction of the data required to train models from scratch.

Availability: An implementation of X-CRISP is available at https://github.com/joanagoncalveslab/xcrisp.
Original language | English |
---|---|
Publication status | Published - 8 Feb 2025 |
Datasets
- Data and code underlying the publication: X-CRISP: Domain-Adaptable and Interpretable CRISPR Repair Outcome Prediction
Gonçalves, J. (Creator) & Seale, C. F. (Creator), TU Delft - 4TU.ResearchData, 18 Feb 2025
DOI: 10.4121/709D3D46-3D48-478C-A246-970FE325E236
Dataset/Software: Software