
MSA Guide Design

Automated pan-variant guide RNA design from Multiple Sequence Alignments, producing guides ranked by cross-strain coverage and activity.

Overview

MSA guide design is SPACER's end-to-end workflow for designing CRISPR guides that detect all known variants of a target. Instead of scoring individual guides one at a time, you provide a set of variant sequences (as an MSA or unaligned FASTA), and SPACER automatically finds candidate guides from the reference sequence, scores each against every variant, and returns guides ranked by their variant coverage.

The workflow consists of three stages:

  1. Find spacers in the reference (first) sequence of the MSA.
  2. Score each spacer against all variant sequences using the ML activity model, producing per-variant activity predictions.
  3. Rank by coverage: order guides by how many variants they detect above the activity threshold, using the configured ranking strategy.
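The three stages can be sketched end to end on a toy alignment. This is a minimal illustration, not SPACER's implementation: `toy_activity` is a hypothetical stand-in for the ML activity model (it just counts mismatches), `design_guides` is an invented name, and the tiny spacer length is chosen only so the example fits on a few lines.

```python
from typing import Callable


def toy_activity(spacer: str, target: str) -> float:
    """Hypothetical stand-in for the ML model: 1.0 minus the mismatch fraction."""
    mismatches = sum(a != b for a, b in zip(spacer, target))
    return 1.0 - mismatches / len(spacer)


def design_guides(
    alignment: list[str],
    spacer_len: int = 4,
    threshold: float = 0.75,
    score: Callable[[str, str], float] = toy_activity,
) -> list[dict]:
    reference, variants = alignment[0], alignment[1:]
    results = []
    # Stage 1: enumerate candidate windows in the reference (first) sequence.
    for start in range(len(reference) - spacer_len + 1):
        spacer = reference[start:start + spacer_len]
        if "-" in spacer:
            continue  # a gapped reference window is not a usable spacer
        # Stage 2: score the spacer against the same columns of every variant.
        scores = [score(spacer, v[start:start + spacer_len]) for v in variants]
        covered = sum(s >= threshold for s in scores)
        results.append({
            "start": start,
            "spacer": spacer,
            "coverage_fraction": covered / len(variants),
            "min_activity": min(scores),
        })
    # Stage 3: rank by coverage, breaking ties on worst-case activity (assumed).
    results.sort(key=lambda r: (r["coverage_fraction"], r["min_activity"]),
                 reverse=True)
    return results


ranked = design_guides(["ACGTAC", "ACGTAC", "ACGTTT"])
```

Here the window starting at position 0 ranks first because both variants match it perfectly, while the window at position 2 covers only one of the two variants.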

Input Format

The input is a FASTA file containing at least 2 sequences. SPACER supports two input modes:

| Mode | Detection | Behavior |
|---|---|---|
| Pre-aligned MSA | All sequences have equal length | Used directly; gap characters (‘-’) preserved |
| Unaligned sequences | Sequences differ in length | Auto-aligned with MAFFT before analysis |

The first sequence in the FASTA is treated as the reference. Candidate spacers are identified in this reference sequence, then scored against every other sequence in the alignment.
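The detection rule described above (equal lengths mean pre-aligned, the first record is the reference) can be sketched as follows. The function names `parse_fasta` and `detect_input_mode` and the mode labels are hypothetical, chosen for illustration; SPACER's own parser may differ.

```python
def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs, preserving order."""
    records: list[tuple[str, str]] = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def detect_input_mode(records: list[tuple[str, str]]) -> str:
    """Return 'pre_aligned' if all sequences share one length, else 'unaligned'."""
    if len(records) < 2:
        raise ValueError("MSA guide design needs at least 2 sequences")
    lengths = {len(seq) for _, seq in records}
    return "pre_aligned" if len(lengths) == 1 else "unaligned"


records = parse_fasta(">ref\nACG-T\n>v1\nACGAT\n")
reference_id, reference_seq = records[0]  # first record is the reference
mode = detect_input_mode(records)
```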

Info
MAFFT must be installed for auto-alignment of unaligned sequences. SPACER searches common install locations (system PATH, Conda, Homebrew, Bioconda) automatically. If your input is already aligned, MAFFT is not required.
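To check ahead of time whether auto-alignment will work, you can look for the `mafft` executable yourself. The sketch below only checks the system PATH with the standard library; it does not reproduce SPACER's wider search of Conda or Homebrew locations. The function names are hypothetical, and `--auto` is MAFFT's own option for choosing an alignment strategy automatically.

```python
import shutil
import subprocess
from typing import Optional


def find_mafft() -> Optional[str]:
    """Return the path to the mafft executable on PATH, or None if absent."""
    return shutil.which("mafft")


def align_with_mafft(fasta_path: str) -> str:
    """Align an unaligned FASTA file and return the aligned FASTA text."""
    mafft = find_mafft()
    if mafft is None:
        raise RuntimeError(
            "MAFFT not found on PATH; install it or supply a pre-aligned MSA"
        )
    # MAFFT writes the alignment to stdout and progress messages to stderr.
    result = subprocess.run(
        [mafft, "--auto", fasta_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```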

Configuration

MSA guide design uses the same configuration as multi-target scoring, plus a site extraction parameter that controls which alignment columns are considered:

| Parameter | Default | Range | Description |
|---|---|---|---|
| activity_threshold | 0.0 (shifted) | [0, 4+] | Minimum activity for a variant to count as covered |
| min_coverage_fraction | 0.95 | [0.0, 1.0] | Minimum fraction of variants that must be covered |
| gap_handling | skip_gapped | n/a | Strategy for variants with gaps in the target region |
| max_gap_fraction | 0.0 | [0.0, 1.0] | Maximum gap ratio before a variant is skipped (0.0 = ADAPT compatibility) |
| ranking_strategy | coverage_first | n/a | Guide ranking: coverage_first, maximize_minimum, maximize_mean |
| signal_ratio_cutoff | None | [0.0, 1.0] | Optional signal-to-noise filter for coverage |
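As an illustration, the defaults and ranges in the table can be captured in a small validated configuration object. The class name `MsaDesignConfig` is hypothetical and is not SPACER's API; only the field names, defaults, and ranges come from the table.

```python
from dataclasses import dataclass
from typing import Optional

GAP_STRATEGIES = {"skip_gapped", "include_in_denominator", "fill_with_n"}
RANKING_STRATEGIES = {"coverage_first", "maximize_minimum", "maximize_mean"}


@dataclass(frozen=True)
class MsaDesignConfig:
    """Hypothetical container mirroring the documented parameters."""
    activity_threshold: float = 0.0
    min_coverage_fraction: float = 0.95
    gap_handling: str = "skip_gapped"
    max_gap_fraction: float = 0.0
    ranking_strategy: str = "coverage_first"
    signal_ratio_cutoff: Optional[float] = None

    def __post_init__(self) -> None:
        if self.activity_threshold < 0:
            raise ValueError("activity_threshold must be >= 0")
        for name in ("min_coverage_fraction", "max_gap_fraction"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must be in [0.0, 1.0]")
        cutoff = self.signal_ratio_cutoff
        if cutoff is not None and not 0.0 <= cutoff <= 1.0:
            raise ValueError("signal_ratio_cutoff must be in [0.0, 1.0]")
        if self.gap_handling not in GAP_STRATEGIES:
            raise ValueError(f"unknown gap_handling: {self.gap_handling}")
        if self.ranking_strategy not in RANKING_STRATEGIES:
            raise ValueError(f"unknown ranking_strategy: {self.ranking_strategy}")


default_config = MsaDesignConfig()
strict_config = MsaDesignConfig(min_coverage_fraction=0.99,
                                ranking_strategy="maximize_minimum")
```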

Conservation Threshold (min_valid_fraction)

When extracting candidate sites from an MSA, SPACER filters alignment columns by the fraction of sequences that have valid (non-gap) nucleotides at each position. The min_valid_fraction parameter controls this filter:

| Property | Value |
|---|---|
| Parameter | min_valid_fraction |
| Default | 0.80 (80%) |
| Range | 0.0–1.0 |
| Effect | Alignment columns where fewer than this fraction of sequences have valid nucleotides are excluded from site extraction |

A value of 0.80 means an alignment column must have a valid nucleotide in at least 80% of the input sequences for a site to be considered. Increasing this value makes site extraction more conservative, restricting it to regions present in nearly all sequences; decreasing it admits sites in regions where more sequences have gaps, so more variable regions are evaluated.
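The column filter can be sketched as below. This assumes that any character other than the gap symbol `-` counts as valid; SPACER may treat ambiguity codes differently. The helper names are hypothetical.

```python
def valid_fraction_per_column(alignment: list[str]) -> list[float]:
    """Fraction of sequences with a non-gap character at each alignment column."""
    n_seqs = len(alignment)
    width = len(alignment[0])
    return [
        sum(seq[col] != "-" for seq in alignment) / n_seqs
        for col in range(width)
    ]


def site_passes(alignment: list[str], start: int, length: int,
                min_valid_fraction: float = 0.80) -> bool:
    """True if every column of the candidate site meets the threshold."""
    fractions = valid_fraction_per_column(alignment)
    return all(f >= min_valid_fraction for f in fractions[start:start + length])


alignment = ["ACGT", "ACGT", "AC-T", "A--T", "ACGT"]
fractions = valid_fraction_per_column(alignment)
```

In this toy alignment the third column is valid in only 3 of 5 sequences (0.6), so any site spanning it is rejected at the default 0.80 but accepted at 0.5.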

Tip
For diagnostic design against highly divergent targets (e.g., RNA virus quasispecies), consider lowering min_valid_fraction to 0.5 to explore less-conserved regions. For stable targets, the default 0.80 is appropriate.

Output

Each guide in the output includes:

| Field | Description |
|---|---|
| coverage_fraction | Fraction of scorable variants above the activity threshold |
| strains_covered / strains_total | Absolute count of covered vs. total variants |
| mean_activity | Mean predicted activity across all scored variants |
| median_activity | Median activity (robust central tendency) |
| min_activity | Worst-case variant activity |
| max_activity | Best-case variant activity |
| std_activity | Standard deviation of activity scores |
| percentile_5 / percentile_95 | Robust worst-case and best-case bounds |
| low_activity_strains | IDs of variants that fell below the activity threshold |
| low_signal_variants | Count of variants above threshold but below signal ratio cutoff |
| variant_scores | Per-variant detail: activity, mismatch/gap counts, signal class |
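To make the summary fields concrete, the sketch below derives several of them from a mapping of variant ID to predicted activity. It is an illustration under stated assumptions: a variant counts as covered when its activity is at or above the threshold, the standard deviation is the population form, and percentiles and signal-ratio fields are omitted. The function name `summarize` is hypothetical.

```python
import statistics


def summarize(activities: dict[str, float],
              activity_threshold: float = 0.0,
              min_coverage_fraction: float = 0.95) -> dict:
    """Compute coverage and activity statistics for one guide."""
    values = list(activities.values())
    low = [sid for sid, a in activities.items() if a < activity_threshold]
    covered = len(values) - len(low)
    coverage = covered / len(values)
    return {
        "coverage_fraction": coverage,
        "strains_covered": covered,
        "strains_total": len(values),
        "mean_activity": statistics.mean(values),
        "median_activity": statistics.median(values),
        "min_activity": min(values),
        "max_activity": max(values),
        "std_activity": statistics.pstdev(values),  # population form (assumed)
        "low_activity_strains": low,
        "meets_coverage": coverage >= min_coverage_fraction,
    }


summary = summarize({"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": -1.0})
```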

Guides are returned sorted by their ranking score (descending). A meets_coverage flag indicates whether the guide satisfies the configured min_coverage_fraction.
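One plausible reading of the three ranking strategies is shown below. The source does not spell out the exact ranking score or tie-breaking rules, so the sort keys here are assumptions made for illustration, and `rank_guides` is a hypothetical name.

```python
# Assumed sort keys: the primary criterion follows the strategy name,
# the secondary criterion is an illustrative tie-breaker.
RANKING_KEYS = {
    "coverage_first": lambda g: (g["coverage_fraction"], g["mean_activity"]),
    "maximize_minimum": lambda g: (g["min_activity"], g["coverage_fraction"]),
    "maximize_mean": lambda g: (g["mean_activity"], g["coverage_fraction"]),
}


def rank_guides(guides: list[dict], strategy: str = "coverage_first",
                min_coverage_fraction: float = 0.95) -> list[dict]:
    """Sort guides in descending order and attach a meets_coverage flag."""
    if strategy not in RANKING_KEYS:
        raise ValueError(f"unknown ranking_strategy: {strategy}")
    ranked = sorted(guides, key=RANKING_KEYS[strategy], reverse=True)
    return [
        dict(g, meets_coverage=g["coverage_fraction"] >= min_coverage_fraction)
        for g in ranked
    ]


guides = [
    {"id": "g1", "coverage_fraction": 0.98, "mean_activity": 1.2, "min_activity": -0.5},
    {"id": "g2", "coverage_fraction": 0.90, "mean_activity": 2.0, "min_activity": 0.4},
]
by_coverage = rank_guides(guides, "coverage_first")
by_minimum = rank_guides(guides, "maximize_minimum")
```

In this example g1 wins under coverage_first because it covers more variants, while g2 wins under maximize_minimum because its worst-case activity is higher.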

Coverage as an Assay Score Component

When MSA data is provided, the coverage fraction feeds directly into the composite assay score as the coverage component, with a default weight of 0.25. Variant coverage therefore accounts for 25% of the final guide ranking under the default weight preset, the second-highest weight after ML activity (0.30).
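As a rough worked example, assuming the composite assay score is a weighted sum of its components (the source gives the weights but not the full formula), the coverage term contributes its weight multiplied by the coverage fraction. The helper name is hypothetical.

```python
import math

COVERAGE_WEIGHT = 0.25  # default weight from the text above


def coverage_contribution(coverage_fraction: float,
                          weight: float = COVERAGE_WEIGHT) -> float:
    """Coverage term of the composite score under a weighted-sum assumption."""
    return weight * coverage_fraction


full = coverage_contribution(1.0)     # guide covering every variant
partial = coverage_contribution(0.8)  # guide covering 80% of variants
gap = full - partial                  # difference in composite score
```

Under this assumption, dropping from full coverage to 80% coverage costs a guide 0.05 of composite score, all else being equal.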

See Coverage & Specificity for details on how coverage integrates with the assay score, including weight rebalancing when specificity components are activated.

Gap Handling Strategies

When extracting spacer regions from an MSA, some variants may have gaps (insertions or deletions) in the target region. The gap_handling parameter controls how these are treated:

| Strategy | Behavior | Coverage Effect |
|---|---|---|
| skip_gapped (default) | Skip variants with any gaps in the target region | Excluded from both numerator and denominator |
| include_in_denominator | Skip scoring but count in denominator | Reduces coverage fraction for gapped variants |
| fill_with_n | Replace gaps with N and score anyway | All variants scored; gaps may reduce activity |

The default skip_gapped with max_gap_fraction = 0.0 matches the behavior of the original ADAPT Python implementation, which excludes any sequence with gaps in the target+context region from scoring.
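The effect of each strategy on the coverage fraction can be sketched as follows. This is a simplified illustration: it treats any gap as disqualifying (equivalent to max_gap_fraction = 0.0), and `mismatch_score` is a hypothetical stand-in for the ML activity model.

```python
def mismatch_score(spacer: str, target: str) -> float:
    """Hypothetical stand-in for the activity model: 1.0 minus mismatch fraction."""
    mismatches = sum(a != b for a, b in zip(spacer, target))
    return 1.0 - mismatches / len(spacer)


def coverage_fraction(spacer: str, targets: list[str],
                      gap_handling: str = "skip_gapped",
                      activity_threshold: float = 0.7) -> float:
    """Coverage across variant target regions under a gap-handling strategy."""
    if gap_handling not in {"skip_gapped", "include_in_denominator", "fill_with_n"}:
        raise ValueError(f"unknown gap_handling: {gap_handling}")
    covered = 0
    denominator = 0
    for target in targets:
        if "-" in target:
            if gap_handling == "skip_gapped":
                continue          # excluded from numerator and denominator
            if gap_handling == "include_in_denominator":
                denominator += 1  # counted, but never scored or covered
                continue
            target = target.replace("-", "N")  # fill_with_n: score anyway
        denominator += 1
        if mismatch_score(spacer, target) >= activity_threshold:
            covered += 1
    return covered / denominator if denominator else 0.0


targets = ["ACGT", "AC-T", "TTTT"]
skip = coverage_fraction("ACGT", targets, "skip_gapped")
count = coverage_fraction("ACGT", targets, "include_in_denominator")
fill = coverage_fraction("ACGT", targets, "fill_with_n")
```

With these three targets the strategies give 1/2, 1/3, and 2/3 respectively: the gapped variant is ignored, counted as uncovered, or scored with an N that costs it one mismatch.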

SPACER

Open-source CRISPR guide RNA design and scoring for Cas12 and Cas13 diagnostic systems.

Developed at Fiocruz Parana (Fundacao Oswaldo Cruz - Parana), Instituto Carlos Chagas

© 2026 SPACER · v0.1.0