Multi-Target Scoring
Score a single guide RNA against multiple variant target sequences simultaneously to evaluate cross-strain detection capability.
Overview
Multi-target scoring evaluates how a single CRISPR guide performs against N variant target sequences at once. For each variant, SPACER runs the ML activity model (ADAPT or EasyDesign) with the variant's specific target sequence and flanking context, producing an independent activity prediction. The result is a vector of per-variant scores that reveals whether the guide can reliably detect all known strains of a pathogen.
How It Works
Given a guide sequence and N target sequences extracted from an MSA column:
- For each variant, extract the target region and upstream/downstream flanking context from the aligned sequence.
- Pass the guide + target + context triple to the ML activity predictor. The model outputs a raw activity score on the [-4, 0] scale (shifted to [0, 4+] in the API).
- Classify each variant as covered (activity above threshold), below threshold, or skipped (too many gaps in the alignment).
- Aggregate per-variant scores into coverage statistics: coverage fraction, mean/median/min/max activity, percentiles, and standard deviation.
Each variant's result includes its target_sequence, upstream_context, and downstream_context so you can inspect exactly what the model saw. Variants with identical target+context regions are deduplicated internally; their frequency and member_sequence_ids fields track the original sequences they represent.
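The deduplication and classification steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not SPACER's implementation: the `Variant` class, the function names, and the status labels are hypothetical, the +4 offset is inferred from the [-4, 0] to [0, 4+] mapping, and the rule that a variant is skipped when its gap fraction exceeds `max_gap_fraction` is a guess at the documented behavior.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed offset between the raw model scale [-4, 0] and the shifted API scale [0, 4+].
RAW_TO_SHIFTED_OFFSET = 4.0


@dataclass(frozen=True)
class Variant:  # hypothetical container, not a SPACER type
    sequence_id: str
    upstream_context: str
    target_sequence: str
    downstream_context: str


def deduplicate(variants):
    """Group variants whose target + context regions are identical."""
    groups = defaultdict(list)
    for v in variants:
        key = (v.upstream_context, v.target_sequence, v.downstream_context)
        groups[key].append(v.sequence_id)
    return [
        {
            "upstream_context": up,
            "target_sequence": target,
            "downstream_context": down,
            "frequency": len(ids),
            "member_sequence_ids": ids,
        }
        for (up, target, down), ids in groups.items()
    ]


def to_shifted(raw_score):
    """Convert a raw model score to the shifted scale (assumed to be raw + 4)."""
    return raw_score + RAW_TO_SHIFTED_OFFSET


def classify(shifted_score, gap_fraction, activity_threshold=0.0, max_gap_fraction=0.0):
    """Return 'skipped', 'covered', or 'below_threshold' for one variant."""
    if gap_fraction > max_gap_fraction:
        return "skipped"
    if shifted_score > activity_threshold:  # strict inequality
        return "covered"
    return "below_threshold"
```

Scoring each unique target + context group once and carrying `frequency` alongside it avoids repeated model calls for identical inputs while still recording how many original sequences each group represents.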
Mismatch Tolerance
The ADAPT and EasyDesign ML models were trained on guide-target pairs with 0 to 2 mismatches. SPACER does not impose a hard mismatch cap — the model will score any guide-target pair regardless of mismatch count — but prediction accuracy degrades as mismatches increase beyond the training distribution.
| Mismatches | Expected Accuracy | Notes |
|---|---|---|
| 0 | Highest | Perfect match — model’s core training regime |
| 1–2 | High | Within training distribution |
| 3–4 | Moderate | Extrapolation; scores directionally useful |
| ≥5 | Low | Significant extrapolation; interpret with caution |
Per-variant results include a mismatch_count field so you can assess whether scores for high-mismatch variants should be trusted. The gap_count field similarly tracks alignment gaps in the target region.
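As a rough illustration of how `mismatch_count` and `gap_count` relate to the table above, the sketch below counts both for one aligned guide-target pair and maps the mismatch count to the accuracy tier. The function names are hypothetical. It assumes the guide and target are written in the same orientation and have equal length, that `-` marks an alignment gap, and that gap positions are counted separately rather than as mismatches; SPACER may define these fields differently.

```python
def count_mismatches_and_gaps(guide, target):
    """Return (mismatch_count, gap_count) for an aligned guide/target pair."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same aligned length")
    guide, target = guide.upper(), target.upper()
    gap_count = target.count("-")
    # Gap positions are excluded from the mismatch tally (an assumption).
    mismatch_count = sum(
        1 for g, t in zip(guide, target) if t != "-" and g != t
    )
    return mismatch_count, gap_count


def expected_accuracy(mismatch_count):
    """Map a mismatch count to the accuracy tier from the table above."""
    if mismatch_count == 0:
        return "Highest"
    if mismatch_count <= 2:
        return "High"
    if mismatch_count <= 4:
        return "Moderate"
    return "Low"
```

A filter built on this could, for example, flag any variant whose tier is "Low" for manual review rather than trusting its score.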
Use Case: Pathogen Variant Detection
The primary use case is CRISPR-based diagnostic design against pathogens with multiple circulating strains. When designing a SHERLOCK or DETECTR assay, you need guides that detect all known variants of the target gene, not just the reference strain. A guide that is highly active on the reference but inactive on a common variant is useless for diagnostics.
Multi-target scoring answers the question: "If I deploy this guide, what fraction of known strains will it detect?"
Coverage Assessment
After scoring, SPACER computes aggregate coverage statistics from the per-variant activity scores. A variant is covered when its activity exceeds the configured threshold (strict inequality).
coverage_fraction = covered_variants / (total_variants - skipped_variants)

Skipped variants (those with excessive gaps) are excluded from the denominator so they do not penalize guides for missing alignment data.
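The formula can be checked with a small sketch. The status labels and the function name are hypothetical, and two behaviors are assumptions: returning 0.0 when no variant is scorable, and treating the `include_in_denominator` gap-handling mode as counting skipped variants in the denominator.

```python
def coverage_fraction(statuses, include_skipped_in_denominator=False):
    """Compute the covered fraction from per-variant status labels.

    statuses: list of 'covered', 'below_threshold', or 'skipped'.
    """
    total = len(statuses)
    skipped = statuses.count("skipped")
    covered = statuses.count("covered")
    denominator = total if include_skipped_in_denominator else total - skipped
    if denominator == 0:
        return 0.0  # assumption: no scorable variants means no coverage
    return covered / denominator
```

With 20 variants of which 18 are covered, 1 is below threshold, and 1 is skipped, this gives 18 / 19 ≈ 0.947, which falls just short of the default `min_coverage_fraction` of 0.95.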
| Statistic | Description |
|---|---|
| coverage_fraction | Fraction of scorable variants above the activity threshold |
| mean_activity | Mean activity across all scored variants |
| median_activity | Median activity (robust central tendency) |
| min_activity | Worst-case variant activity |
| percentile_5 / percentile_95 | 5th and 95th percentile activity (robust worst-case and best-case bounds) |
| std_activity | Standard deviation (consistency across variants) |
| low_signal_variants | Variants above threshold but below signal ratio cutoff |
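
Most of these statistics can be reproduced from the per-variant scores with the Python standard library, as in the sketch below. The `summarize` function is hypothetical. The percentile interpolation method, the use of population rather than sample standard deviation, and the lack of weighting by variant `frequency` are all assumptions, so values may differ slightly from what SPACER reports. `low_signal_variants` is omitted because it depends on the signal ratio cutoff.

```python
import statistics


def summarize(scores):
    """Aggregate shifted activity scores for all scored (non-skipped) variants."""
    if len(scores) < 2:
        raise ValueError("need at least two scored variants")
    # n=20 yields 19 cut points at 5% steps; the first and last are the
    # 5th and 95th percentiles.
    cuts = statistics.quantiles(scores, n=20, method="inclusive")
    return {
        "mean_activity": statistics.fmean(scores),
        "median_activity": statistics.median(scores),
        "min_activity": min(scores),
        "max_activity": max(scores),
        "percentile_5": cuts[0],
        "percentile_95": cuts[-1],
        "std_activity": statistics.pstdev(scores),  # population std (assumption)
    }
```

Comparing `min_activity` with `percentile_5` is a quick way to see whether a single outlier variant is driving the worst case.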
Guides are ranked by a configurable ranking_strategy:
| Strategy | Ranking Formula | Best For |
|---|---|---|
| coverage_first (default) | coverage_fraction × 1000 + mean_activity | Diagnostics: maximize strain breadth |
| maximize_minimum | min_activity | Reliability: worst-case must be acceptable |
| maximize_mean | mean_activity | Average performance matters most |
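
The three ranking formulas can be written out as a short sketch. The function names and the shape of the stats dictionaries are hypothetical, and the sketch assumes that a higher ranking score is better.

```python
def ranking_score(stats, strategy="coverage_first"):
    """Compute the ranking score for one guide from its coverage statistics."""
    if strategy == "coverage_first":
        return stats["coverage_fraction"] * 1000 + stats["mean_activity"]
    if strategy == "maximize_minimum":
        return stats["min_activity"]
    if strategy == "maximize_mean":
        return stats["mean_activity"]
    raise ValueError(f"unknown ranking_strategy: {strategy!r}")


def rank_guides(guide_stats, strategy="coverage_first"):
    """Return guide ids ordered best first.

    guide_stats: dict mapping guide id -> statistics dict.
    """
    return sorted(
        guide_stats,
        key=lambda guide_id: ranking_score(guide_stats[guide_id], strategy),
        reverse=True,
    )
```

If mean activity stays within roughly [0, 4] on the shifted scale, the × 1000 weight means mean activity only reorders guides whose coverage fractions differ by less than about 0.004, so under `coverage_first` it acts mostly as a tie-breaker.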
Configuration
| Parameter | API Field | Default | Description |
|---|---|---|---|
| Activity threshold | activity_threshold | 0.0 (shifted scale) | Minimum activity for a variant to count as covered |
| Min coverage | min_coverage_fraction | 0.95 | Minimum fraction of variants that must be covered |
| Gap handling | gap_handling | skip_gapped | How to treat gaps: skip_gapped, include_in_denominator, fill_with_n |
| Max gap fraction | max_gap_fraction | 0.0 | Max gap ratio before a variant is skipped |
| Ranking strategy | ranking_strategy | coverage_first | How guides are ranked: coverage_first, maximize_minimum, maximize_mean |
| Signal ratio cutoff | signal_ratio_cutoff | None | Optional signal-to-noise filter (see Signal Ratio Filtering) |
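
For reference, the defaults in the table can be collected into a single configuration object, as in the sketch below. The field names and allowed values come from the table; the dictionary layout, the `build_config` helper, and the range checks on the two fraction fields are assumptions, and the actual request format accepted by the API may differ.

```python
# Defaults copied from the configuration table; the layout is an assumption.
DEFAULT_CONFIG = {
    "activity_threshold": 0.0,  # shifted scale
    "min_coverage_fraction": 0.95,
    "gap_handling": "skip_gapped",
    "max_gap_fraction": 0.0,
    "ranking_strategy": "coverage_first",
    "signal_ratio_cutoff": None,
}

ALLOWED_VALUES = {
    "gap_handling": {"skip_gapped", "include_in_denominator", "fill_with_n"},
    "ranking_strategy": {"coverage_first", "maximize_minimum", "maximize_mean"},
}


def build_config(**overrides):
    """Merge overrides into the defaults and validate the result."""
    unknown = set(overrides) - set(DEFAULT_CONFIG)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    config = {**DEFAULT_CONFIG, **overrides}
    for field, allowed in ALLOWED_VALUES.items():
        if config[field] not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    for field in ("min_coverage_fraction", "max_gap_fraction"):
        if not 0.0 <= config[field] <= 1.0:
            raise ValueError(f"{field} must be between 0 and 1")
    return config
```

For example, `build_config(ranking_strategy="maximize_minimum", activity_threshold=1.0)` would describe a run that ranks guides by their worst-case variant and requires a shifted activity above 1.0 for coverage.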
[-4, 0] scale (default threshold -4.0). You only need to work with the shifted scale when configuring via the API.
Related
For automated guide design from a set of variant sequences rather than scoring individual guides, see MSA Guide Design. To apply signal-to-noise filtering on top of coverage, see Signal Ratio Filtering. Coverage feeds into the assay score as a weighted component — see Coverage & Specificity for how it integrates with the composite ranking.