SPACER
spacer-web v0.1.0
BADGERS Optimizer

An evolutionary algorithm for generating optimized—and potentially novel—Cas13a spacer sequences from multiple sequence alignments.

Overview

Unlike standard spacer finding, which identifies and scores subsequences already present in the input, the BADGERS optimizer generates novel spacer sequences that may not exist in any natural sequence. It uses the ADAPT CNN models as a frozen fitness oracle and applies an evolutionary algorithm to maximize predicted spacer activity across the sequence variants in the alignment.

The algorithm is based on Mantena et al., Nature Biotechnology 2024. Input is a multiple sequence alignment (MSA) of pathogen variants in FASTA format. All spacer sequences are 28 nt (Cas13a).

Warning
The optimizer generates potentially novel/synthetic sequences not present in the input alignment. These spacers are computationally designed and should be experimentally validated before use in diagnostic assays.

Two Objective Modes

The optimizer supports two distinct objectives, each with its own fitness function and default hyperparameters.

| Mode | Use Case | Fitness Objective |
| --- | --- | --- |
| Multi-target detection | Detect all variants of a pathogen | Maximize frequency-weighted mean activity across all targets |
| Variant identification | Distinguish variant A from variant B | Maximize on-target activity while minimizing off-target activity (sigmoidal cost) |

Workflow

The optimizer processes each eligible site in the MSA through a five-step pipeline.

| Step | Operation | Output |
| --- | --- | --- |
| 1. Extract sites | Slide a 48 nt window (10 nt flanking + 28 nt spacer + 10 nt flanking) across the MSA. Keep positions where ≥80% of sequences have valid ACGT-only windows. | Vec<GenomicSite>, one per eligible position |
| 2. Build fitness | Construct a MultiTargetFitness or VariantIdFitness evaluator wrapping the ADAPT predictor and target set for the site. | Fitness function for this site |
| 3. Evolve | Initialize population via Boltzmann sampling from seed sequences. Each generation: sample parents, mutate, replace worst. Repeat until evaluation budget is exhausted. Optional local search around top spacers. | OptimizationResult with ranked population |
| 4. Diversity filter | Greedily remove spacers within a Hamming distance threshold of a higher-fitness spacer. | Deduplicated spacer set |
| 5. Score & return | Convert evolutionary fitness to ScoredSpacerCandidate with full quality flags, assay score (using for_optimizer() weights), and tier classification. | SiteOptimResult per site, aggregated into OptimizerOutput |
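Step 1 can be illustrated with a minimal sketch. It assumes the MSA is already loaded as equal-length strings and returns only the start column of each eligible window; the constant and function names are hypothetical, and GenomicSite itself is not modeled here.

```rust
// Hypothetical constants mirroring the window layout described above.
const FLANK: usize = 10;
const SPACER_LEN: usize = 28;
const WINDOW: usize = FLANK + SPACER_LEN + FLANK; // 48 nt

/// Returns true if every byte in the window is an unambiguous base.
fn is_acgt_only(window: &[u8]) -> bool {
    window.iter().all(|&b| matches!(b, b'A' | b'C' | b'G' | b'T'))
}

/// Returns the start columns of windows where at least `min_valid_frac`
/// of the aligned sequences are ACGT-only (gaps and ambiguity codes fail).
fn eligible_site_starts(msa: &[&str], min_valid_frac: f64) -> Vec<usize> {
    // Use the shortest row so slicing never runs past the end of a sequence.
    let aln_len = msa.iter().map(|s| s.len()).min().unwrap_or(0);
    if msa.is_empty() || aln_len < WINDOW {
        return Vec::new();
    }
    (0..=aln_len - WINDOW)
        .filter(|&start| {
            let valid = msa
                .iter()
                .filter(|s| is_acgt_only(&s.as_bytes()[start..start + WINDOW]))
                .count();
            valid as f64 / msa.len() as f64 >= min_valid_frac
        })
        .collect()
}
```

Calling `eligible_site_starts(&msa, 0.8)` applies the 80% validity rule from the table. How the real pipeline handles gaps inside a window is not specified above, so this sketch simply treats any non-ACGT character as invalid.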

Fitness Functions

Multi-Target Detection

Maximizes expected Cas13a activity across all sequence variants. The fitness of a spacer is the frequency-weighted average of its combined activity against all unique targets:

fitness(spacer) = Σ(freq_t × combined_activity(spacer, target_t))

Where combined_activity = classify_prob × (regression + 4.0) − 4.0. This joint classification-regression score is the ADAPT model's native output format.
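The two formulas above translate directly into code. The sketch below assumes the classifier probability and regression output for each unique target have already been obtained from the predictor; the `TargetPrediction` struct and its field names are hypothetical and are not part of the SPACER API.

```rust
/// One unique target with its relative frequency and the two model outputs
/// for a given spacer. Struct and field names are hypothetical.
struct TargetPrediction {
    freq: f64,          // fraction of sequences carrying this target
    classify_prob: f64, // classifier probability that the spacer is active
    regression: f64,    // regression output for the spacer-target pair
}

/// combined_activity = classify_prob * (regression + 4.0) - 4.0
fn combined_activity(classify_prob: f64, regression: f64) -> f64 {
    classify_prob * (regression + 4.0) - 4.0
}

/// Frequency-weighted sum of combined activity over all unique targets.
fn multi_target_fitness(targets: &[TargetPrediction]) -> f64 {
    targets
        .iter()
        .map(|t| t.freq * combined_activity(t.classify_prob, t.regression))
        .sum()
}
```

Note that a classifier probability of 0 yields a combined activity of −4.0 regardless of the regression value, so targets the classifier rejects pull the weighted sum toward that floor.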

After evolution, the optimizer also computes perc_highly_active for each top-k spacer: the frequency-weighted fraction of targets where the spacer is classified as "highly active" (both classification probability and regression score above their respective thresholds).
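A minimal sketch of this metric follows. The threshold values are not given on this page, so they are passed in as parameters; the tuple layout and the function signature are assumptions made for illustration.

```rust
/// Frequency-weighted fraction of targets that pass both thresholds.
/// Each tuple is (freq, classify_prob, regression) for one unique target.
/// Both thresholds are caller-supplied placeholders, not documented defaults.
fn perc_highly_active(
    targets: &[(f64, f64, f64)],
    classify_threshold: f64,
    regression_threshold: f64,
) -> f64 {
    let total: f64 = targets.iter().map(|t| t.0).sum();
    if total == 0.0 {
        return 0.0; // no targets, or all frequencies are zero
    }
    let active: f64 = targets
        .iter()
        .filter(|t| t.1 > classify_threshold && t.2 > regression_threshold)
        .map(|t| t.0)
        .sum();
    active / total
}
```

Dividing by the summed frequencies keeps the result in [0, 1] even if the input frequencies are not normalized.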

Variant Identification

Maximizes activity against an on-target partition (t1) while minimizing activity against an off-target partition (t2), using sigmoidal cost functions:

t2_cost = c / (1 + a × exp(k × (t2_activity − o)))
t1_cost = c − c / (1 + a × exp(k × (t1_activity − o)))
fitness = −(t2w × t2_cost + t1_cost)

| Hyperparameter | Default | Role |
| --- | --- | --- |
| c | 1.0 | Sigmoid amplitude |
| a | 5.897 | Sigmoid scale factor |
| k | −2.858 | Sigmoid steepness |
| o | −2.511 | Sigmoid midpoint offset |
| t2w | 1.737 | Off-target cost weight |
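The cost functions and defaults above can be sketched as follows. The `SigmoidParams` struct and function names are hypothetical; the sketch takes the mean activity against each partition as input and treats t1 as the on-target and t2 as the off-target partition, which is inferred from the role of t2w.

```rust
/// Sigmoid hyperparameters. The struct name is hypothetical; the default
/// values are copied from the hyperparameter table.
struct SigmoidParams {
    c: f64,
    a: f64,
    k: f64,
    o: f64,
    t2w: f64,
}

impl Default for SigmoidParams {
    fn default() -> Self {
        Self { c: 1.0, a: 5.897, k: -2.858, o: -2.511, t2w: 1.737 }
    }
}

/// Shared sigmoid term: c / (1 + a * exp(k * (activity - o))).
fn sigmoid(p: &SigmoidParams, activity: f64) -> f64 {
    p.c / (1.0 + p.a * (p.k * (activity - p.o)).exp())
}

/// fitness = -(t2w * t2_cost + t1_cost)
fn variant_id_fitness(p: &SigmoidParams, t1_activity: f64, t2_activity: f64) -> f64 {
    let t2_cost = sigmoid(p, t2_activity); // grows with off-target activity
    let t1_cost = p.c - sigmoid(p, t1_activity); // shrinks with on-target activity
    -(p.t2w * t2_cost + t1_cost)
}
```

Because k is negative, the sigmoid term rises toward c as activity increases. Fitness therefore approaches 0 for a spacer that is active on t1 and inactive on t2, and approaches −(t2w + c) in the opposite case.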

Diversity Filter

After evolution, a greedy Hamming distance filter ensures sequence diversity in the output. Spacers are iterated in descending fitness order; each spacer is kept only if its Hamming distance to all previously kept spacers exceeds the threshold (default: 3).

Setting the minimum distance to 0 disables filtering entirely. After filtering, results are truncated to top_k_per_site (default: 5).
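The greedy filter can be sketched as below, operating on plain (sequence, fitness) pairs rather than the real candidate type. The function names are hypothetical. The sketch follows the text literally: a spacer is kept when its distance to every kept spacer exceeds the threshold, and a threshold of 0 is special-cased to skip filtering.

```rust
/// Number of mismatched positions between two equal-length sequences.
fn hamming(a: &str, b: &str) -> usize {
    a.bytes().zip(b.bytes()).filter(|(x, y)| x != y).count()
}

/// Greedy diversity filter over (sequence, fitness) pairs.
fn diversity_filter(
    mut spacers: Vec<(String, f64)>,
    min_distance: usize,
    top_k: usize,
) -> Vec<(String, f64)> {
    // Sort by descending fitness; total_cmp gives a total order on f64.
    spacers.sort_by(|x, y| y.1.total_cmp(&x.1));
    let mut kept: Vec<(String, f64)> = Vec::new();
    for cand in spacers {
        // A threshold of 0 disables filtering; otherwise the candidate must
        // be farther than the threshold from every spacer kept so far.
        let far_enough = min_distance == 0
            || kept.iter().all(|k| hamming(&k.0, &cand.0) > min_distance);
        if far_enough {
            kept.push(cand);
        }
    }
    // Truncation happens after filtering, as described above.
    kept.truncate(top_k);
    kept
}
```

With the defaults described above this would be called as `diversity_filter(candidates, 3, 5)`. Since the highest-fitness spacer is always kept first, the filter never discards the best candidate at a site.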

Output

The optimizer produces an OptimizerOutput containing per-site results (SiteOptimResult). Each site result includes:

| Field | Description |
| --- | --- |
| spacers | Optimized spacers as ScoredSpacerCandidates with full quality flags and tier |
| shannon_entropy | Average Shannon entropy across the spacer region at this site |
| consensus_fitness | Fitness of the consensus seed spacer (baseline for improvement) |
| num_targets / num_valid_seqs | Unique targets and total valid sequences at the site |
| mean_on/off_target_activity | Weighted combined activity against each partition (variant-id only) |
| site_targets | Per-target sequences with frequencies and partition labels |
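To make the shannon_entropy field concrete, here is a minimal sketch of a per-column entropy averaged over a region. It assumes base-2 logarithms, unweighted sequence counts, and equal-length rows, and it ignores non-ACGT characters; none of these choices is confirmed by this page, and the function names are hypothetical.

```rust
/// Shannon entropy (in bits) of one alignment column, counting only A/C/G/T.
fn column_entropy(column: &[u8]) -> f64 {
    let mut counts = [0usize; 4];
    for &b in column {
        match b {
            b'A' => counts[0] += 1,
            b'C' => counts[1] += 1,
            b'G' => counts[2] += 1,
            b'T' => counts[3] += 1,
            _ => {} // gaps and ambiguity codes are ignored in this sketch
        }
    }
    let total: usize = counts.iter().sum();
    if total == 0 {
        return 0.0;
    }
    counts
        .iter()
        .filter(|&&n| n > 0)
        .map(|&n| {
            let p = n as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

/// Mean per-column entropy over `len` columns starting at `start`.
/// Assumes every sequence is at least `start + len` bytes long.
fn mean_entropy(seqs: &[&str], start: usize, len: usize) -> f64 {
    if len == 0 {
        return 0.0;
    }
    let sum: f64 = (start..start + len)
        .map(|col| {
            let column: Vec<u8> = seqs.iter().map(|s| s.as_bytes()[col]).collect();
            column_entropy(&column)
        })
        .sum();
    sum / len as f64
}
```

Under these assumptions a fully conserved column contributes 0 bits and a column split evenly between two bases contributes 1 bit, so higher values indicate a more variable site.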

Cross-site convenience methods include best_spacer(), all_spacers_ranked(), and summary() for aggregated statistics (total spacers, novel count, best fitness, mean improvement over consensus).

Optimizer Weight Preset

Optimized spacers use the for_optimizer() assay score weight preset, which differs from the standard default in two ways: it shifts 0.05 of weight from heuristic_quality to ml_activity, and it narrows ml_activity_range.

| Component | Default Weight | Optimizer Weight |
| --- | --- | --- |
| ml_activity | 0.30 | 0.35 |
| heuristic_quality | 0.10 | 0.05 |
| ml_activity_range | (0.0, 4.0) | (2.0, 4.0) |

The narrower ML activity range of (2.0, 4.0) is used because optimizer fitness values (shifted by +4.0) cluster in that band. The default (0.0, 4.0) range would compress their spread, making it hard to differentiate top candidates. See the Assay Score page for the full weight breakdown.
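The effect of the narrower range can be shown with a small worked example. It assumes the range is applied as a clamped linear rescaling to [0, 1]; the actual normalization used by the assay score is not described on this page, and the `normalize` function is hypothetical.

```rust
/// Clamp `value` into `range` and rescale it linearly to [0, 1].
/// A clamped min-max mapping is an assumption made for illustration.
fn normalize(value: f64, range: (f64, f64)) -> f64 {
    let (lo, hi) = range;
    ((value - lo) / (hi - lo)).clamp(0.0, 1.0)
}
```

Under this assumption, two candidates with shifted activities of 3.0 and 3.5 map to 0.75 and 0.875 with the default (0.0, 4.0) range, but to 0.5 and 0.75 with the optimizer range (2.0, 4.0). The gap between them doubles from 0.125 to 0.25, which is the extra separation the narrower range is meant to provide.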

Open-source CRISPR guide RNA design and scoring for Cas12 and Cas13 diagnostic systems.

Developed at Fiocruz Parana, Instituto Carlos Chagas

© 2026 SPACER·v0.1.0
hwalflor (GitHub)