ML Activity Prediction

ML-predicted on-target activity: the highest-weighted component of the assay score (default weight 0.30).

Overview

The ml_activity component uses pre-trained neural network models to estimate on-target cleavage activity from spacer sequence features. It carries the highest default weight (0.30) among the nine assay score components, reflecting the strong correlation between predicted and experimentally measured guide activity. SPACER automatically selects the correct model for the configured enzyme family: EasyDesign for Cas12, ADAPT for Cas13.
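The family-to-model selection described above can be pictured as a simple lookup. The sketch below is illustrative only: the names `MODEL_BY_FAMILY` and `select_model`, and the lowercase family keys, are assumptions made for this example and are not taken from SPACER's code. Only the pairing (EasyDesign for Cas12, ADAPT for Cas13) comes from this page.

```python
# Hypothetical lookup table; only the Cas12/Cas13 pairing is from the docs.
MODEL_BY_FAMILY = {
    "cas12": "EasyDesign",
    "cas13": "ADAPT",
}


def select_model(enzyme_family: str) -> str:
    """Return the activity model name for an enzyme family (illustrative)."""
    key = enzyme_family.strip().lower()
    if key not in MODEL_BY_FAMILY:
        raise ValueError(f"no ML activity model for enzyme family {enzyme_family!r}")
    return MODEL_BY_FAMILY[key]
```

Failing loudly on an unknown family is a design choice of this sketch, not documented SPACER behavior.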

Score Range

Both models produce a raw prediction on an internal log scale that typically ranges from about -4 to 0. SPACER shifts every raw prediction by +4.0, so shifted values typically fall in [0, 4], with higher values indicating greater predicted activity. Raw predictions above 0 produce shifted values above 4.0, which the normalization step clamps.

Normalization

The shifted prediction is mapped to the unit range [0, 1] using the ml_activity_range parameter, which defaults to (0.0, 4.0). With min and max denoting the lower and upper bounds of that range, the formula is:

normalized = clamp((shifted - min) / (max - min), 0.0, 1.0)

For optimizer-generated spacers (BADGERS), a narrower range of (2.0, 4.0) is used because their fitness values cluster in that band. Under the default range, those values would be compressed into the upper half of the unit interval.
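A minimal sketch of the shift and normalization steps, assuming nothing beyond the +4.0 offset, the clamp formula, and the two ranges given on this page. The function and constant names (`shift_prediction`, `normalize_activity`, `DEFAULT_RANGE`, `BADGERS_RANGE`) are hypothetical and do not reflect SPACER's actual API.

```python
SHIFT = 4.0                  # offset added to every raw prediction
DEFAULT_RANGE = (0.0, 4.0)   # default ml_activity_range
BADGERS_RANGE = (2.0, 4.0)   # narrower range for optimizer-generated spacers


def shift_prediction(raw: float) -> float:
    """Move a raw log-scale prediction onto the shifted scale."""
    return raw + SHIFT


def normalize_activity(shifted: float, activity_range=DEFAULT_RANGE) -> float:
    """Apply clamp((shifted - min) / (max - min), 0.0, 1.0)."""
    lo, hi = activity_range
    if hi <= lo:
        raise ValueError("activity_range must satisfy min < max")
    fraction = (shifted - lo) / (hi - lo)
    return min(1.0, max(0.0, fraction))
```

For example, a shifted value of 3.0 normalizes to 0.75 under the default range but to 0.5 under the BADGERS range, which shows how the narrower range spreads out values that cluster between 2.0 and 4.0.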

Activity Interpretation

The normalized values below assume the default ml_activity_range of (0.0, 4.0).

Shifted Value	Normalized	Interpretation
0.0	0.00	Inactive — model predicts negligible cleavage
1.0	0.25	Weak to moderate activity
2.0	0.50	Good activity — typical of functional guides
4.0+	1.00	Highly active — top-performing predictions

When Disabled

ML activity prediction is optional. When it is not enabled, the ml_activity weight (0.30) is redistributed proportionally among the remaining active assay score components. This ensures the total score always uses the full [0, 1] range regardless of which pipeline stages are enabled.
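One way to read "redistributed proportionally" is that each remaining weight is scaled by the same factor so the total is unchanged. The sketch below illustrates that reading; it is an interpretation, not SPACER's confirmed implementation. The component names `component_a` and `component_b` and their weights are placeholders (the real score has nine components whose names and weights are not listed on this page); only the 0.30 weight for `ml_activity` comes from the text.

```python
# Placeholder weights: only ml_activity = 0.30 is taken from the docs.
HYPOTHETICAL_WEIGHTS = {
    "ml_activity": 0.30,
    "component_a": 0.40,
    "component_b": 0.30,
}


def redistribute_weights(weights: dict, disabled: set) -> dict:
    """Drop disabled components and rescale the rest to keep the same total."""
    active = {name: w for name, w in weights.items() if name not in disabled}
    active_total = sum(active.values())
    if active_total <= 0:
        raise ValueError("no active components with positive weight")
    # Uniform scaling preserves the ratios between the remaining components.
    scale = sum(weights.values()) / active_total
    return {name: w * scale for name, w in active.items()}
```

With these placeholder weights, disabling `ml_activity` scales the remaining weights by 1.0 / 0.70, so they still sum to 1.0 and keep their 4:3 ratio.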

Similarly, if a spacer is too close to the sequence boundary to extract the required flanking context, ML scoring is skipped for that spacer and the InsufficientContext quality flag is raised.

Info

Both models require 10 nt of flanking context on each side of the target site. See the EasyDesign and ADAPT model pages for architecture details.
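The boundary check behind the InsufficientContext flag can be sketched as follows. This is an illustration under stated assumptions: 0-based coordinates, a target site given by its start index and length, and the hypothetical function name `extract_context`. Only the 10 nt flank requirement comes from this page; returning `None` stands in for skipping ML scoring and raising the flag.

```python
FLANK_NT = 10  # flanking context required on each side (from this page)


def extract_context(sequence: str, start: int, site_len: int, flank: int = FLANK_NT):
    """Return the target site plus flanks, or None if a flank would run off the sequence.

    Assumes a 0-based `start` index into `sequence` (an assumption of this sketch).
    """
    left = start - flank
    right = start + site_len + flank
    if left < 0 or right > len(sequence):
        return None  # caller would skip ML scoring and flag InsufficientContext
    return sequence[left:right]
```

In a 50 nt sequence, a 20 nt site starting at index 10 or 20 has full context on both sides, while one starting at index 5 or 21 does not.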
