Multi-Target Scoring

Score a single guide RNA against multiple variant target sequences simultaneously to evaluate cross-strain detection capability.

Overview

Multi-target scoring evaluates how a single CRISPR guide performs against N variant target sequences at once. For each variant, SPACER runs the ML activity model (ADAPT or EasyDesign) with the variant's specific target sequence and flanking context, producing an independent activity prediction. The result is a vector of per-variant scores that reveals whether the guide can reliably detect all known strains of a pathogen.

How It Works

Given a guide sequence and N target sequences extracted from an MSA column:

1. For each variant, extract the target region and upstream/downstream flanking context from the aligned sequence.
2. Pass the guide + target + context triple to the ML activity predictor. The model outputs a raw activity score on the [-4, 0] scale (shifted to [0, 4+] in the API).
3. Classify each variant as covered (activity above threshold), below threshold, or skipped (too many gaps in the alignment).
4. Aggregate per-variant scores into coverage statistics: coverage fraction, mean/median/min/max activity, percentiles, and standard deviation.
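
The classification step can be sketched in a few lines of Python. This is an illustrative sketch, not SPACER's implementation: `Variant`, `VariantResult`, `score_variants`, and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in that simply subtracts 2.0 per mismatch rather than calling ADAPT or EasyDesign. The skip rule (gap fraction strictly greater than `max_gap_fraction`) is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Variant:
    target_sequence: str
    upstream_context: str
    downstream_context: str


@dataclass
class VariantResult:
    variant: Variant
    status: str                # "covered", "below_threshold", or "skipped"
    activity: Optional[float]  # None for skipped variants


def toy_predict(guide: str, variant: Variant) -> float:
    """Stand-in predictor (NOT the real ML model): 4.0 minus 2.0 per mismatch, floored at 0."""
    mismatches = sum(g != t for g, t in zip(guide, variant.target_sequence))
    return max(0.0, 4.0 - 2.0 * mismatches)


def score_variants(
    guide: str,
    variants: List[Variant],
    predict: Callable[[str, Variant], float] = toy_predict,
    activity_threshold: float = 0.0,
    max_gap_fraction: float = 0.0,
) -> List[VariantResult]:
    results = []
    for v in variants:
        gaps = v.target_sequence.count("-")
        # Assumed skip rule: too many alignment gaps in the target region.
        if gaps / max(len(v.target_sequence), 1) > max_gap_fraction:
            results.append(VariantResult(v, "skipped", None))
            continue
        activity = predict(guide, v)
        # Strict inequality: activity must exceed the threshold to count as covered.
        status = "covered" if activity > activity_threshold else "below_threshold"
        results.append(VariantResult(v, status, activity))
    return results


guide = "ACGTACGT"
variants = [Variant(t, "GG", "CC") for t in ("ACGTACGT", "ACGTACGA", "TTTTACGT", "ACG-ACGT")]
results = score_variants(guide, variants)
statuses = [r.status for r in results]
```

With the toy predictor, the four example variants come out as covered, covered, below threshold, and skipped respectively.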

Each variant's result includes its target_sequence, upstream_context, and downstream_context so you can inspect exactly what the model saw. Variants with identical target+context regions are deduplicated internally; their frequency and member_sequence_ids fields track the original sequences they represent.
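
One way to picture the deduplication is as a grouping by the (target, upstream, downstream) triple. The sketch below is an assumption about the behaviour, not SPACER's code: the `deduplicate` function and the input tuple layout are hypothetical, and `frequency` is treated here as a plain count of member sequences.

```python
from typing import Dict, List, Tuple


def deduplicate(records: List[Tuple[str, str, str, str]]) -> List[dict]:
    """Group (sequence_id, target, upstream, downstream) records by identical regions."""
    groups: Dict[Tuple[str, str, str], List[str]] = {}
    for seq_id, target, upstream, downstream in records:
        groups.setdefault((target, upstream, downstream), []).append(seq_id)
    return [
        {
            "target_sequence": target,
            "upstream_context": upstream,
            "downstream_context": downstream,
            "frequency": len(ids),        # assumed to be a count, not a fraction
            "member_sequence_ids": ids,
        }
        for (target, upstream, downstream), ids in groups.items()
    ]


records = [
    ("strain_A", "ACGTACGT", "GG", "CC"),
    ("strain_B", "ACGTACGT", "GG", "CC"),  # identical to strain_A
    ("strain_C", "ACGTACGA", "GG", "CC"),
]
unique_variants = deduplicate(records)
```

Only the unique variants need to be passed to the model; the grouped IDs let each score be mapped back to every original sequence it represents.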

Mismatch Tolerance

The ADAPT and EasyDesign ML models were trained on guide-target pairs with 0 to 2 mismatches. SPACER does not impose a hard mismatch cap — the model will score any guide-target pair regardless of mismatch count — but prediction accuracy degrades as mismatches increase beyond the training distribution.

Mismatches	Expected Accuracy	Notes
0	Highest	Perfect match — model’s core training regime
1–2	High	Within training distribution
3–4	Moderate	Extrapolation; scores directionally useful
≥5	Low	Significant extrapolation; interpret with caution

Per-variant results include a mismatch_count field so you can assess whether scores for high-mismatch variants should be trusted. The gap_count field similarly tracks alignment gaps in the target region.
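
As a rough illustration of how `mismatch_count` and `gap_count` might be used downstream, the sketch below counts both for an aligned guide/target pair and maps the mismatch count to the accuracy tiers in the table above. The helper names are hypothetical, the guide and target are assumed to be written in the same orientation, and excluding gap positions from the mismatch count is an assumption.

```python
def mismatch_count(guide: str, target: str) -> int:
    """Count substituted positions; gap positions are assumed to be tracked separately."""
    return sum(1 for g, t in zip(guide, target) if t != "-" and g != t)


def gap_count(target: str) -> int:
    """Count alignment gaps ('-') in the target region."""
    return target.count("-")


def expected_accuracy(mismatches: int) -> str:
    """Map a mismatch count to the tier names used in the table above."""
    if mismatches == 0:
        return "Highest"
    if mismatches <= 2:
        return "High"
    if mismatches <= 4:
        return "Moderate"
    return "Low"


guide = "ACGTACGTAC"
target = "ACGAACG-AC"
mm = mismatch_count(guide, target)   # one substitution (T -> A at position 4)
gaps = gap_count(target)             # one gap
tier = expected_accuracy(mm)
```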

Use Case: Pathogen Variant Detection

The primary use case is CRISPR-based diagnostic design against pathogens with multiple circulating strains. When designing a SHERLOCK or DETECTR assay, you need guides that detect all known variants of the target gene, not just the reference strain. A guide that is fully active on the reference but inactive on a common variant will miss that variant in the field, which makes it unsuitable for diagnostics.

Multi-target scoring answers the question: "If I deploy this guide, what fraction of known strains will it detect?"

Coverage Assessment

After scoring, SPACER computes aggregate coverage statistics from the per-variant activity scores. A variant is covered when its activity exceeds the configured threshold (strict inequality).

coverage_fraction = covered_variants / (total_variants - skipped_variants)

Skipped variants (those with excessive gaps) are excluded from the denominator so they do not penalize guides for missing alignment data.
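
A small worked example of the formula, assuming the default behaviour in which skipped variants leave the denominator. The function name is hypothetical, and returning 0.0 when no variant is scorable is an assumption made only to avoid division by zero.

```python
def coverage_fraction(covered_variants: int, total_variants: int, skipped_variants: int) -> float:
    """coverage_fraction = covered / (total - skipped)."""
    scorable = total_variants - skipped_variants
    if scorable <= 0:
        return 0.0  # assumed fallback when nothing could be scored
    return covered_variants / scorable


# 20 variants in the alignment, 2 skipped for gaps, 17 of the remaining 18 covered.
fraction = coverage_fraction(covered_variants=17, total_variants=20, skipped_variants=2)
```

Here the fraction is 17/18 (about 0.944); had the two skipped variants stayed in the denominator it would have been 17/20 = 0.85.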

Statistic	Description
coverage_fraction	Fraction of scorable variants above the activity threshold
mean_activity	Mean activity across all scored variants
median_activity	Median activity (robust central tendency)
min_activity	Worst-case variant activity
percentile_5 / percentile_95	Robust worst-case and best-case bounds
std_activity	Standard deviation (consistency across variants)
low_signal_variants	Variants above threshold but below signal ratio cutoff
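
The activity statistics in the table can be approximated with the standard library. This sketch is illustrative only: the source does not say which percentile interpolation or which standard deviation (population or sample) SPACER uses, so linear interpolation and the population standard deviation are assumptions, and `summarize` and `percentile` are hypothetical helpers.

```python
import math
import statistics
from typing import Dict, List


def percentile(sorted_values: List[float], q: float) -> float:
    """Linear-interpolation percentile over an already sorted list (assumed method)."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * (pos - lower)


def summarize(activities: List[float]) -> Dict[str, float]:
    """Aggregate the activities of scored (non-skipped) variants."""
    if not activities:
        raise ValueError("no scored variants to summarize")
    ordered = sorted(activities)
    return {
        "mean_activity": statistics.fmean(ordered),
        "median_activity": statistics.median(ordered),
        "min_activity": ordered[0],
        "percentile_5": percentile(ordered, 5),
        "percentile_95": percentile(ordered, 95),
        "std_activity": statistics.pstdev(ordered),  # population std, an assumption
    }


stats = summarize([1.0, 2.0, 3.0, 4.0])
```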

Guides are ranked by a configurable ranking_strategy:

Strategy	Ranking Formula	Best For
coverage_first (default)	coverage_fraction × 1000 + mean_activity	Diagnostics: maximize strain breadth
maximize_minimum	min_activity	Reliability: worst-case must be acceptable
maximize_mean	mean_activity	Average performance matters most
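
The three formulas translate directly into a sort key. The sketch below assumes each guide is summarized as a dictionary carrying the statistic names from the table above; `ranking_key`, `rank_guides`, and the guide records are hypothetical.

```python
from typing import Dict, List


def ranking_key(stats: Dict[str, float], strategy: str = "coverage_first") -> float:
    """Return the ranking score for one guide; higher is better."""
    if strategy == "coverage_first":
        return stats["coverage_fraction"] * 1000 + stats["mean_activity"]
    if strategy == "maximize_minimum":
        return stats["min_activity"]
    if strategy == "maximize_mean":
        return stats["mean_activity"]
    raise ValueError(f"unknown ranking_strategy: {strategy}")


def rank_guides(guides: List[dict], strategy: str = "coverage_first") -> List[dict]:
    return sorted(guides, key=lambda g: ranking_key(g, strategy), reverse=True)


guides = [
    {"name": "A", "coverage_fraction": 1.00, "mean_activity": 1.0, "min_activity": 0.5},
    {"name": "B", "coverage_fraction": 0.90, "mean_activity": 3.5, "min_activity": 0.0},
    {"name": "C", "coverage_fraction": 0.95, "mean_activity": 2.0, "min_activity": 0.0},
]
by_coverage = [g["name"] for g in rank_guides(guides, "coverage_first")]
by_mean = [g["name"] for g in rank_guides(guides, "maximize_mean")]
by_min = [g["name"] for g in rank_guides(guides, "maximize_minimum")]
```

Because coverage is multiplied by 1000 and activity tops out around 4, `coverage_first` behaves like a lexicographic sort: mean activity only breaks ties between guides whose coverage differs by less than a few thousandths. In the example, guide A wins on coverage despite having the lowest mean, while `maximize_mean` puts B first.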

Configuration

Parameter	API Field	Default	Description
Activity threshold	activity_threshold	0.0 (shifted scale)	Activity a variant must exceed (strict inequality) to count as covered
Min coverage	min_coverage_fraction	0.95	Minimum fraction of variants that must be covered
Gap handling	gap_handling	skip_gapped	How to treat gaps: skip_gapped, include_in_denominator, fill_with_n
Max gap fraction	max_gap_fraction	0.0	Max gap ratio before a variant is skipped
Ranking strategy	ranking_strategy	coverage_first	How guides are ranked: coverage_first, maximize_minimum, maximize_mean
Signal ratio cutoff	signal_ratio_cutoff	None	Optional signal-to-noise filter (see Signal Ratio Filtering)
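
For client code, the defaults above could be collected in a small configuration object. The field names and defaults come from the table; the `MultiTargetConfig` class, its validation, and the idea of serializing it with `asdict` are assumptions for illustration, not part of SPACER's API.

```python
from dataclasses import asdict, dataclass
from typing import Optional

GAP_HANDLING_OPTIONS = {"skip_gapped", "include_in_denominator", "fill_with_n"}
RANKING_STRATEGIES = {"coverage_first", "maximize_minimum", "maximize_mean"}


@dataclass(frozen=True)
class MultiTargetConfig:
    activity_threshold: float = 0.0          # shifted scale
    min_coverage_fraction: float = 0.95
    gap_handling: str = "skip_gapped"
    max_gap_fraction: float = 0.0
    ranking_strategy: str = "coverage_first"
    signal_ratio_cutoff: Optional[float] = None

    def __post_init__(self) -> None:
        # Hypothetical client-side validation of the enumerated fields.
        if self.gap_handling not in GAP_HANDLING_OPTIONS:
            raise ValueError(f"invalid gap_handling: {self.gap_handling}")
        if self.ranking_strategy not in RANKING_STRATEGIES:
            raise ValueError(f"invalid ranking_strategy: {self.ranking_strategy}")
        if not 0.0 <= self.min_coverage_fraction <= 1.0:
            raise ValueError("min_coverage_fraction must be in [0, 1]")


defaults = MultiTargetConfig()
strict = MultiTargetConfig(ranking_strategy="maximize_minimum", min_coverage_fraction=1.0)
payload = asdict(strict)  # plain dict, e.g. for building a request body
```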

Tip

The API uses a shifted activity scale where 0.0 corresponds to the classifier boundary — any ML-active guide covers the variant. The internal Rust engine uses the raw [-4, 0] scale (default threshold -4.0). You only need to work with the shifted scale when configuring via the API.
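
The relationship between the two scales can be written as a constant offset. The value 4.0 is inferred from the text (raw -4.0 corresponds to shifted 0.0) rather than taken from SPACER's source, and the function names are hypothetical.

```python
SCALE_OFFSET = 4.0  # inferred: raw [-4, 0] maps onto shifted [0, 4]


def raw_to_shifted(raw_activity: float) -> float:
    """Convert an engine-scale score to the API's shifted scale."""
    return raw_activity + SCALE_OFFSET


def shifted_to_raw(shifted_activity: float) -> float:
    """Convert an API-scale score or threshold back to the engine scale."""
    return shifted_activity - SCALE_OFFSET


api_default_threshold = 0.0
engine_threshold = shifted_to_raw(api_default_threshold)  # -4.0, the engine default
```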

Related

For automated guide design from a set of variant sequences rather than scoring individual guides, see MSA Guide Design. To apply signal-to-noise filtering on top of coverage, see Signal Ratio Filtering. Coverage feeds into the assay score as a weighted component; see Coverage & Specificity for how it integrates with the composite ranking.
