Alpha Swarm
Private beta · Q3 · Built with research desks, not for them.

Agentic research,
with discipline.

Coordinate autonomous research agents that generate falsifiable hypotheses, validate them against history, and produce auditable research — without losing statistical discipline.

Swarm · Live
4 agents · 1 validated
  • A-01 · Feature search
    412 features · IC > 0.04
    passed
  • A-02 · Regime detect
    HMM(4) · BIC stable
    running
  • A-03 · Lead/lag scout
    Lag ∈ [1, 21] · cross-asset
    running
  • A-04 · Anomaly scout
    17 outlier days · review
    rejected
Hypothesis queue · 3 of 47
#214 · validated

Mean reversion · vol_q1

#213 · validating

HY OAS → R2K drawdowns

#212 · rejected

PEAD decay · post-2018

01 Every claim falsifiable

Hypotheses ship with explicit reject criteria — not vibes.

02 Validation by construction

Walk-forward folds, leakage scans, and FDR control are mandatory.

03 Auditable by default

Every decision an agent makes is recorded and reproducible.

The product

A research console for swarms, not single agents.

Coordinate dozens of research agents across hypotheses, datasets, and experiments — with the same rigor your senior researchers apply by hand.

app.alphaswarm.io / runs / mr-vol-q3
Research run
Mean reversion in low-volatility regimes
Live · 4 agents · 2003–2024

Research pipeline

RUN · 03h 14m elapsed
  1. 01 · Done
    Ingest
    12 datasets · 4.1B rows
  2. 02 · Done
    Hypothesize
    3 candidates
  3. 03 · Running
    Validate
    splits + leakage
  4. 04 · Queued
    Backtest
    walk-forward
  5. 05 · Queued
    Memo
    auto-draft

Agents

4 active · 8 standby
A-01 running
Feature search

Scanning 412 features for IC > 0.04 in vol_q1

62%
A-02 running
Regime detect

Hidden Markov fit · 4 states · BIC stable

88%
A-03 needs review
Lead/lag scout

Cross-asset cross-correlation matrix, lag ∈ [1, 21]

100%
A-04 needs review
Anomaly scout

Outlier days flagged: 17 · awaiting human

100%

Hypothesis queue

  • #214

    Mean reversion in low-vol regimes (SPX intraday)

    validating
  • #213

    Lead/lag: HY OAS → Russell 2000 drawdowns

    validating
  • #212

    Earnings PEAD decay accelerates post-2018

    rejected · Coverage bias in dataset

    rejected
  • #211

    Carry-momentum cross in 10Y rates

    promoted

Backtest · B-0098

In-sample · Out-of-sample · 2003–2024
Sharpe
1.42 OOS
Hit rate
54.1% OOS
Max DD
−7.3% OOS
Turnover
3.1× annual
OOS · 2019

Discipline checks

1 warning
  • Train/test split
    Walk-forward · 6 folds
    passed
  • Leakage scan
    1 candidate · rolling_std(target)
    review
  • Multiple comparisons
    BH-FDR @ 0.10
    passed
  • Regime stability
    Sharpe stable across 4 states
    passed
  • Human review
    Awaiting sign-off · @priya
    pending

Connections

5 sources · 2 environments
  • Polygon · US equities
    MCP
    live
  • FRED macro
    MCP
    live
  • Internal · features.parquet
    Dataset
    linked
  • S3 · earnings_v3
    Dataset
    linked
  • Bloomberg B-PIPE
    MCP
    re-auth

Research memo · auto-draft

v0.4 · draft
Memo · MR-VOL-Q3

Mean reversion in low-volatility regimes

Across 2003–2018, daily SPX returns exhibit a statistically robust short-horizon mean reversion when realized volatility sits in the lowest quartile. Out-of-sample (2019–2024) the effect persists, with Sharpe of 1.42 net of estimated costs and a max drawdown of −7.3%.

Walk-forward folds remained directionally consistent; the leakage scan flagged one candidate feature, rolling_std(target), which has since been removed. Recommended for promotion to paper trading, pending desk review.

Why Alpha Swarm

Agentic exploration meets quant rigor.

Most agent stacks optimize for plausibility. Quantitative research demands the opposite — falsifiability. Alpha Swarm is built around that constraint.

01

Hypotheses, not prompts.

Agents don't 'find alpha'. They draft falsifiable hypotheses with explicit assumptions, prior probability estimates, and a planned test.

// hypothesis
P(reversion | vol_q1) > P(reversion)
// test
walk_forward(SPX, k=6, oos=0.25)
// reject if
Sharpe OOS < 0.5 · FDR > 0.10
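The card above can be read as a small structured record. The sketch below is one hypothetical way to hold such a claim in Python. The `Hypothesis` class, its field names, and the `verdict` method are illustrative assumptions, not Alpha Swarm's API. It also assumes that either reject criterion firing on its own is enough to reject.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical record for a falsifiable claim (names are illustrative)."""
    claim: str
    test: str
    min_oos_sharpe: float  # reject if OOS Sharpe falls below this
    max_fdr: float         # reject if the FDR-adjusted q-value exceeds this

    def verdict(self, oos_sharpe: float, q_value: float) -> str:
        """Return 'rejected' if either criterion fires, else 'validated'."""
        # Assumption: the two reject criteria are combined with OR.
        if oos_sharpe < self.min_oos_sharpe or q_value > self.max_fdr:
            return "rejected"
        return "validated"


h = Hypothesis(
    claim="P(reversion | vol_q1) > P(reversion)",
    test="walk_forward(SPX, k=6, oos=0.25)",
    min_oos_sharpe=0.5,
    max_fdr=0.10,
)

# Inputs below are made-up values chosen only to exercise both branches.
print(h.verdict(oos_sharpe=1.0, q_value=0.05))  # validated
print(h.verdict(oos_sharpe=0.3, q_value=0.05))  # rejected
```

Freezing the dataclass means the reject criteria cannot be edited after the test has run, which is the point of stating them up front.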
02

Tests against the real world.

Every claim is run against historical data through a validation layer that enforces train/test separation, leakage scans, and walk-forward folds.

Walk-forward · 6 folds: IS 1 · IS 2 · IS 3 · IS 4 · OOS · OOS
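To make walk-forward folds concrete, here is a minimal sketch of an expanding-window splitter. The function name `walk_forward_folds` and its parameters are assumptions for illustration, and the expanding-window layout is one common choice. The product's actual fold layout may differ.

```python
from typing import Iterator, Tuple


def walk_forward_folds(n_obs: int, k: int = 6) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for k expanding-window folds.

    The series is cut into k + 1 contiguous blocks. Each fold trains on every
    block before its test block, so every test index is later than all of the
    fold's training indices.
    """
    if k < 1 or n_obs < k + 1:
        raise ValueError("need k >= 1 and at least k + 1 observations")
    block = n_obs // (k + 1)
    for i in range(1, k + 1):
        train_end = i * block
        # The last fold absorbs any remainder so no observation is dropped.
        test_end = n_obs if i == k else (i + 1) * block
        yield range(0, train_end), range(train_end, test_end)


folds = list(walk_forward_folds(n_obs=70, k=6))
for train, test in folds:
    print(f"train [0, {train.stop})  test [{test.start}, {test.stop})")
```

Because each test range starts exactly where its training range stops, a model fitted on a fold never sees data from its own evaluation window.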
03

Discipline at runtime.

Anti-overfitting rules — multiple-comparisons correction, regime stability, version-locked features — are part of the runtime, not a checklist.

FDR ≤ 0.10
min splits = 6
feature lock
seed pinned
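The FDR ≤ 0.10 rule refers to Benjamini–Hochberg control. As a sketch of the standard step-up procedure in plain Python: the function name `benjamini_hochberg` and the sample p-values are illustrative assumptions, not product output.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.10) -> List[bool]:
    """Return one flag per hypothesis: True if it survives BH at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    # Step-up rule: find the largest rank r with p_(r) <= (r / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Every hypothesis ranked at or below the cutoff survives.
    survives = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            survives[idx] = True
    return survives


# Made-up p-values pooled from several agents' proposals.
pooled = [0.01, 0.04, 0.03, 0.20, 0.50]
print(benjamini_hochberg(pooled, q=0.10))  # [True, True, True, False, False]
```

Pooling matters because the thresholds scale with `m`, the total number of proposals. The more hypotheses the swarm submits, the stricter the bar each one has to clear.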
04

Auditable traces.

Every decision an agent makes is recorded. Reviewers see which features, splits, and seeds produced a result — never a black box.

A-02 · fit · HMM(4)
validator · split · k=6
A-04 · flag · leakage:1
human · sign · @priya
Architecture

From raw data to falsifiable claims.

A single, opinionated pipeline. Each stage produces an artifact the next stage can verify — so research is composable and auditable end-to-end.

01
Data sources

Market, fundamentals, alt-data, and internal datasets, connected via MCP and pinned to immutable dataset versions.

MCP · Parquet · Lakehouse
02
Research agents

Specialized agents propose features, regimes, lead/lag relationships, and anomalies under shared tooling.

Feature · Regime · Anomaly
03
Hypothesis engine

Free-form ideas are compiled into falsifiable, parameterized hypotheses with explicit reject criteria.

Falsifiable · Versioned
04
Validation layer

Train/test separation, leakage detection, and multiple-comparisons control are enforced before a single backtest runs.

Splits · Leakage · FDR
05
Backtest engine

Walk-forward simulation with realistic costs, capacity assumptions, and regime-stratified performance.

Walk-fwd · Costs · Capacity
06
Research memo

Auto-drafted, fully traced memo: assumptions, splits, leakage notes, OOS results, and reviewer sign-off.

Trace · Sign-off
Discipline

Built to survive out-of-sample.

The hardest part of agentic research isn't generation — it's not fooling yourself. Alpha Swarm encodes the safeguards quant teams already trust, and applies them every time.

Illustrative · OOS pressure test
Naive vs disciplined research
Naive · OOS Sharpe: 0.31
Disciplined · OOS Sharpe: 1.12
  • 01

    Walk-forward validation

    Sequential train/test folds across time. No information from the future leaks into the past — by construction.

  • 02

    Out-of-sample by default

    Every claim ships with a held-out OOS window. The system refuses to publish a memo without it.

  • 03

    Experiment registry

    Every backtest, seed, and feature set is logged. Duplicate runs are detected and joined to prevent silent re-fitting.

  • 04

    Feature & version tracking

    Datasets, code, and feature definitions are content-addressed. Memos cite exact hashes — reruns are reproducible to the byte.

  • 05

    Multiple-hypothesis correction

    Benjamini–Hochberg FDR is applied across the swarm's proposals, not per agent. Volume of search is treated as a cost.

  • 06

    Leakage detection

    Static and runtime checks flag look-ahead features, target encoding, and survivorship — before a backtest is allowed to run.

  • 07

    Human review checkpoints

    Promotion to paper or live trading requires a human reviewer to sign the trace. The runtime won't bypass it.

  • 08

    Regime stratification

    Performance is reported per regime (vol, rates, dispersion). Strategies that work only in one regime are surfaced, not buried.
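Items 03 and 04 above both rest on content addressing: identical inputs map to the same identifier, so duplicates can be detected and reruns can cite an exact hash. Below is a minimal sketch using SHA-256 over a canonical JSON encoding. The `content_hash` helper and the run-config fields are hypothetical, and a real registry would also need to hash dataset bytes and code.

```python
import hashlib
import json


def content_hash(obj) -> str:
    """SHA-256 hex digest of a canonical JSON encoding of obj.

    Sorting keys and fixing separators makes the encoding independent of
    dict insertion order, so equal content always yields the same digest.
    """
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical run configurations; field names are illustrative only.
run_a = {"features": ["vol_q1", "ret_1d"], "seed": 7, "folds": 6}
run_b = {"folds": 6, "seed": 7, "features": ["vol_q1", "ret_1d"]}  # same content, reordered keys
run_c = {"features": ["vol_q1", "ret_1d"], "seed": 8, "folds": 6}  # different seed

print(content_hash(run_a) == content_hash(run_b))  # True: duplicate run detected
print(content_hash(run_a) != content_hash(run_c))  # True: a changed seed is a new run
```

List order is still significant in this encoding, so a registry that treats feature sets as unordered would need to sort them before hashing.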

Get in

Build research systems that survive contact with reality.

Alpha Swarm is currently onboarding a small number of design partners. If you run a research team and care about discipline as much as discovery, we’d like to talk.

No newsletters. We reply by hand within 48 hours.