ML Activity Prediction
ML-predicted on-target activity — the highest-weighted component of the assay score (weight 0.30).
Overview
The ml_activity component uses pre-trained neural network models to estimate on-target cleavage activity from spacer sequence features. It carries the highest default weight (0.30) among the nine assay score components, reflecting the strong correlation between predicted and experimentally measured guide activity. SPACER automatically selects the correct model for the configured enzyme family: EasyDesign for Cas12, ADAPT for Cas13.
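The enzyme-family-to-model pairing described above amounts to a simple lookup. The sketch below is illustrative only: `MODEL_BY_ENZYME_FAMILY` and `select_activity_model` are hypothetical names, not part of SPACER's documented API, and only the Cas12/Cas13 pairing itself comes from this page.

```python
# Hypothetical lookup table; only the family-to-model pairing is taken
# from the documentation. The names used here are illustrative assumptions.
MODEL_BY_ENZYME_FAMILY = {
    "Cas12": "EasyDesign",
    "Cas13": "ADAPT",
}


def select_activity_model(enzyme_family: str) -> str:
    """Return the activity model name paired with an enzyme family."""
    try:
        return MODEL_BY_ENZYME_FAMILY[enzyme_family]
    except KeyError:
        # Families without a documented model are rejected explicitly.
        raise ValueError(
            f"No activity model for enzyme family {enzyme_family!r}"
        ) from None
```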
Score Range
Both models produce a raw prediction on an internal log scale that typically ranges from approximately -4 to 0. SPACER shifts every raw prediction by +4.0, so shifted values typically span 0 to 4, with higher values indicating greater predicted activity. Shifted values that fall outside the configured range are clamped during normalization.
Normalization
The shifted prediction is mapped to the unit range [0, 1] using the ml_activity_range parameter, which defaults to (0.0, 4.0). The formula is:
normalized = clamp((shifted - min) / (max - min), 0.0, 1.0)
For optimizer-generated spacers (BADGERS), a narrower range of (2.0, 4.0) is used because their fitness values cluster in that band. Under the default range, those values would be compressed into the upper half of [0, 1].
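The shift and normalization steps can be sketched in a few lines of Python. This is a minimal illustration of the formula above, not SPACER's actual implementation: the function name `normalize_ml_activity` and the constant names are assumptions, while the +4.0 shift and the two ranges are taken from this page.

```python
# Values from the documentation; the constant names are assumptions.
RAW_SHIFT = 4.0
DEFAULT_RANGE = (0.0, 4.0)
BADGERS_RANGE = (2.0, 4.0)


def normalize_ml_activity(raw, ml_activity_range=DEFAULT_RANGE):
    """Shift a raw model prediction and map it onto [0, 1]."""
    lo, hi = ml_activity_range
    if hi <= lo:
        raise ValueError("ml_activity_range must satisfy min < max")
    shifted = raw + RAW_SHIFT
    scaled = (shifted - lo) / (hi - lo)
    # Clamp so out-of-range predictions cannot leave the unit interval.
    return min(max(scaled, 0.0), 1.0)
```

For example, a raw prediction of -1.0 shifts to 3.0, which normalizes to 0.75 under the default range and to 0.5 under the narrower BADGERS range.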
Activity Interpretation
| Shifted Value | Normalized | Interpretation |
|---|---|---|
| 0.0 | 0.00 | Inactive — model predicts negligible cleavage |
| 1.0 | 0.25 | Weak to moderate activity |
| 2.0 | 0.50 | Good activity — typical of functional guides |
| 4.0+ | 1.00 | Highly active — top-performing predictions |
When Disabled
ML activity prediction is optional. When it is not enabled, the ml_activity weight (0.30) is redistributed proportionally among the remaining active assay score components, so each keeps its share relative to the others. This keeps the total score on the full [0, 1] range regardless of which pipeline stages are enabled.
ML scoring can also be skipped for an individual spacer: if the spacer is too close to the sequence boundary to extract the required flanking context, it is not scored and the InsufficientContext quality flag is raised.
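Proportional redistribution of a disabled component's weight can be read as renormalizing the remaining weights so they sum to 1 again. The sketch below assumes that reading and assumes the full set of weights sums to 1. `redistribute_weights` is a hypothetical helper, and apart from `ml_activity` (0.30) the component names and weights in the usage example are placeholders, since the other components are not listed here.

```python
def redistribute_weights(weights, disabled):
    """Drop disabled components and rescale the rest to sum to 1."""
    active = {name: w for name, w in weights.items() if name not in disabled}
    total = sum(active.values())
    if total <= 0:
        raise ValueError("at least one active component needs a positive weight")
    # Dividing by the remaining total preserves the ratios between components.
    return {name: w / total for name, w in active.items()}


# Placeholder components; only the ml_activity weight is from the docs.
example_weights = {"ml_activity": 0.30, "component_a": 0.40, "component_b": 0.30}
rescaled = redistribute_weights(example_weights, disabled={"ml_activity"})
```

In this example the two remaining components keep their 4:3 ratio while their weights grow to fill the gap left by `ml_activity`.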