Strategies

Council V2

Sequential 5-agent debate with Trader decision gate, live research, and edge filtering.

Council V2 is the flagship trading strategy powering AgentNash. It deploys a sequential 5-agent adversarial debate where each AI builds on the previous agent's output — no parallel groupthink, no single point of failure. A dedicated research phase gathers live web intelligence before the debate begins, and a final Trader agent acts as the decision gate: no trade executes unless the math checks out and the Trader confirms edge.

Council V2 is available on both Kalshi (CFTC-regulated US exchange) and Polymarket (decentralized prediction market on Polygon). The AI analysis pipeline is identical across both — only the execution layer and edge thresholds differ.

Key Differences from Council V1

V2 is a ground-up rewrite that replaces the V1 parallel ensemble with a sequential debate architecture. Every design choice targets a specific weakness observed in V1.

Aspect	Council V1	Council V2
Architecture	Parallel — agents run simultaneously	Sequential — each agent sees and responds to prior output
Agent Count	6 agents (incl. News Analyst)	5 agents + dedicated Research phase
Research	RSS feeds + optional Perplexity	Perplexity Sonar Deep Research on every market
Forecaster Weight	30%	35%
Bull Model	OpenAI o4-mini	Claude Opus 4.6
Bear Model	Gemini 3.1 Pro Preview	Claude Sonnet 4.6
Risk Manager	DeepSeek V3.2	Claude Opus 4.6
Trader	Grok 4.1 Fast	Claude Sonnet 4.6
Edge (High Conf)	6%	4% (Polymarket) / 6% (Kalshi)
Edge (Medium Conf)	8%	6% (Polymarket) / 8% (Kalshi)
Edge (Low Conf)	12%	10% (Polymarket) / 12% (Kalshi)
News Analyst	Dedicated agent (Claude)	Removed — replaced by Perplexity research phase
Consensus Gate	3 of 5 agents must agree	Trader has final authority (Risk Manager advises)

The Pipeline

Every market opportunity flows through 10 stages. The pipeline is deterministic — the same inputs always produce the same sequence of checks and gates. A trade only executes when every stage passes.

Market Ingestion

Fetch active binary markets from the exchange API. Filter by volume, expiry, order book status, and price bounds (3%-97%).

Category Inference

Classify each market by scanning the title for known keywords — Sports, Crypto, Economics, Politics, Weather, Tech, or Other. Optional category allowlist narrows focus.

Cooldown & Dedup

Skip markets already analyzed within the cooldown window (default: 6 hours). Prevents wasting AI credits on unchanged conditions.

Research (Perplexity Sonar Deep Research)

Gather live web intelligence for each market — recent developments, base rate data, stakeholder signals, and arguments for/against. Runs in parallel batches of 3 with rate limiting.

Forecaster Debate

Grok 4.1 Fast estimates the true YES probability using a 6-step method: research audit, base rate, current conditions, market structure analysis, calibration adjustment, and EV check.

Bull & Bear Debate

Claude Opus 4.6 argues the YES case with 3-5 evidence-backed arguments. Then Claude Sonnet 4.6 sees the Bull's case and counters every argument. The Bear also checks if the Bull fabricated any data.

Risk Manager

Claude Opus 4.6 evaluates both sides, calculates EV for BUY YES and BUY NO, picks the better side, recommends position sizing via fractional Kelly, and issues a should_trade verdict.

Trader Decision Gate

Claude Sonnet 4.6 reviews the full debate transcript and makes the final BUY or SKIP decision. Its default stance is to BUY when edge exists; it skips only for a specific, concrete reason.

Edge Filter & Position Sizing

Verify the AI's edge over market price meets the confidence-tiered threshold. Calculate position size using tier-based rules, Kelly criterion cap, and exchange minimum order size.

Order Execution

Route the order through the intercept pipeline. Training mode saves a paper trade. Live mode places a real limit order on the exchange.
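The two cheap gates early in the pipeline above (category inference and the cooldown check) can be sketched as follows. This is a minimal illustration, not the production code: the keyword lists, function names, and the in-memory `last_analyzed` mapping are all assumptions, since the document names the categories but not the keywords or the storage used.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical keyword table; the real keyword lists are not given in this document.
CATEGORY_KEYWORDS = {
    "Sports": ("nba", "nfl", "world cup"),
    "Crypto": ("btc", "bitcoin", "eth", "ethereum"),
    "Economics": ("fed", "cpi", "inflation", "gdp"),
    "Politics": ("election", "senate", "president"),
    "Weather": ("hurricane", "temperature", "rainfall"),
    "Tech": ("openai", "apple", "launch"),
}

COOLDOWN = timedelta(hours=6)  # default cooldown window from the text


def infer_category(title: str) -> str:
    """Return the first category whose keyword appears as a whole word in the title."""
    lowered = title.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        for keyword in keywords:
            if re.search(rf"\b{re.escape(keyword)}\b", lowered):
                return category
    return "Other"


def is_allowed(category: str, allowlist: Optional[set]) -> bool:
    """An empty or missing allowlist means every category is allowed."""
    return not allowlist or category in allowlist


def should_analyze(market_id: str, last_analyzed: dict, now: datetime,
                   cooldown: timedelta = COOLDOWN) -> bool:
    """Skip markets that were analyzed within the cooldown window."""
    previous = last_analyzed.get(market_id)
    return previous is None or now - previous >= cooldown


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
seen = {"m1": now - timedelta(hours=2)}
print(infer_category("Will BTC exceed $120K by April 15?"))  # Crypto
print(should_analyze("m1", seen, now))                       # False
```

Matching on whole words rather than raw substrings avoids false hits such as "eth" inside "whether".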

The 5 Agents

V2 uses models from two providers — xAI and Anthropic — routed through OpenRouter. Each agent has a specific role in the sequential debate. The Forecaster, Bull, and Bear contribute probability estimates with confidence-adjusted weights. The Risk Manager and Trader do not contribute to probability aggregation — they govern sizing and execution.

Role	Model	Weight	Purpose
Forecaster	Grok 4.1 Fast (xAI)	35%	Anchors the debate — estimates true P(YES) using base rates and structured reasoning
Bull Researcher	Claude Opus 4.6 (Anthropic)	25%	Builds the strongest evidence-based YES case with 3-5 arguments and probability floor
Bear Researcher	Claude Sonnet 4.6 (Anthropic)	20%	Counters every Bull argument — estimates probability ceiling and flags fabricated data
Risk Manager	Claude Opus 4.6 (Anthropic)	—	Calculates EV for both sides, assigns risk score (1-10), recommends sizing via Kelly
Trader	Claude Sonnet 4.6 (Anthropic)	—	Final decision gate — BUY or SKIP with limit price and position size

Weights apply only to probability aggregation. The ensemble probability is a confidence-adjusted weighted average: each agent's weight is multiplied by its self-reported confidence (floored at 0.1) before averaging. This means a high-confidence Forecaster naturally dominates a low-confidence Bull.
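The confidence-adjusted average described above can be sketched as below. The `Estimate` type and function name are illustrative, and the sketch assumes the result is normalized by the sum of effective weights (the listed weights add up to 80%, so some normalization is needed for the output to stay a probability).

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.1  # floor on self-reported confidence, from the text


@dataclass
class Estimate:
    probability: float  # agent's P(YES), 0.0-1.0
    confidence: float   # agent's self-reported confidence, 0.0-1.0
    weight: float       # role weight, e.g. 0.35 for the Forecaster


def ensemble_probability(estimates: list) -> float:
    """Weighted average where each weight is scaled by floored confidence."""
    effective = [e.weight * max(e.confidence, CONFIDENCE_FLOOR) for e in estimates]
    total = sum(effective)
    if total == 0:
        raise ValueError("at least one estimate with non-zero weight is required")
    return sum(w * e.probability for w, e in zip(effective, estimates)) / total


# A fully confident Forecaster at 0.20 against a zero-confidence Bull at 0.80:
# effective weights are 0.35 and 0.025, so the result stays close to 0.20.
result = ensemble_probability([
    Estimate(probability=0.20, confidence=1.0, weight=0.35),
    Estimate(probability=0.80, confidence=0.0, weight=0.25),
])
print(round(result, 2))  # 0.24
```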

Research Phase: Perplexity Sonar Deep Research

Before the debate begins, every market undergoes a dedicated research step using Perplexity Sonar Deep Research — an agentic multi-step web search model. This replaced V1's RSS-based News Analyst with live, targeted intelligence gathering.

For each market, Perplexity is prompted to gather six categories of information:

Recent Developments — Key news from the last 7 days with dates, sources, and specific facts. If the event has not happened yet, it must say so explicitly.
Base Rate Data — Historical frequency of similar events with sample sizes.
Key Stakeholders & Signals — Statements from decision-makers, experts, or officials. Scheduled events that could force resolution.
Arguments for YES — Strongest evidence and reasoning supporting YES.
Arguments for NO — Strongest evidence and reasoning supporting NO.
Expert & Statistical Signals — Domain expert opinions, statistical models, polls, and historical patterns. Explicitly excludes prediction market prices to avoid circular reasoning.

Research runs in parallel batches of 3 markets with a 600-second timeout per request and automatic retry on server errors (up to 3 attempts with exponential backoff). The generous timeout accommodates sonar-deep-research, which can spend several minutes gathering and synthesizing sources for complex questions. The research output is injected into every subsequent agent's prompt as shared context — clearly labeled as pre-gathered data that may contain errors, prompting agents to cross-check.
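The batching and retry behavior described above might look roughly like the following. It is a sketch under stated assumptions: `fetch` stands in for whatever coroutine calls the research model, `ServerError` is a hypothetical marker for a retryable server-side failure, and the 1-second base delay is illustrative (the document gives the attempt count and timeout, but not the backoff schedule).

```python
import asyncio

BATCH_SIZE = 3          # markets researched in parallel, from the text
MAX_ATTEMPTS = 3        # retry budget on server errors, from the text
TIMEOUT_SECONDS = 600   # per-request timeout, from the text


class ServerError(Exception):
    """Hypothetical marker for a retryable server-side failure."""


async def research_with_retry(fetch, market, base_delay: float = 1.0):
    """Call fetch(market) with a timeout, retrying server errors with exponential backoff."""
    for attempt in range(MAX_ATTEMPTS):
        try:
            return await asyncio.wait_for(fetch(market), timeout=TIMEOUT_SECONDS)
        except ServerError:
            if attempt == MAX_ATTEMPTS - 1:
                raise  # retry budget exhausted
            await asyncio.sleep(base_delay * 2 ** attempt)  # 1x, 2x, ... base delay


async def research_all(fetch, markets, base_delay: float = 1.0) -> dict:
    """Process markets in consecutive batches; each batch runs concurrently."""
    results = {}
    for start in range(0, len(markets), BATCH_SIZE):
        batch = markets[start:start + BATCH_SIZE]
        outputs = await asyncio.gather(
            *(research_with_retry(fetch, m, base_delay) for m in batch)
        )
        results.update(zip(batch, outputs))
    return results
```

A caller would run it with something like `asyncio.run(research_all(my_fetch, market_ids))`, where `my_fetch` is the caller's own coroutine.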

The research prompt includes today's date and instructs Perplexity to never fabricate outcomes. If an event is scheduled for today or later, it must state that no confirmed result exists. This prevents hallucinated resolution data from contaminating the debate.

Agent Roles in Detail

1. Forecaster (Grok 4.1 Fast)

The Forecaster anchors the entire debate. It receives the market data and Perplexity research, then applies a strict 6-step analytical method:

Research Audit — Note contradictions or suspicious claims in the research. State what can be trusted.
Base Rate — Historical frequency of this type of event, with specific sample sizes.
Current Conditions — Specific, verifiable evidence that shifts probability from the base rate.
Market Structure — Is this a single binary question or part of a multi-outcome event?
Calibration — Adjust toward the base rate when uncertain. Overconfidence is the default failure mode.
EV Check — Compare estimated probability to market price. Only flag edge if the difference exceeds 5 percentage points.

Output: probability (0.0-1.0), confidence (0.0-1.0), base_rate, side (yes/no), key_factors, and step-by-step reasoning.

Anti-hallucination rule: Must not fabricate base rates, statistics, or studies. When hard data is unavailable, reason from first principles and explicitly say so.

2. Bull Researcher (Claude Opus 4.6)

The Bull receives the market data, Perplexity research, and the Forecaster's probability estimate. Its mandate is to construct the strongest possible YES case — but with strict evidentiary standards:

Thesis — One sentence on why this will happen.
3-5 Key Arguments — Each must cite specific evidence from the research or verifiable first principles. No fabricated statistics.
Probability Floor — The minimum reasonable YES probability even if the Bear is right about some things.
Catalysts — Near-term events (1-7 days) that could push probability higher. Only verifiable or scheduled events.

Key constraint: It is better to make 2-3 honest arguments than 5 fabricated ones. The prompt explicitly prohibits inventing future events or statistics.

3. Bear Researcher (Claude Sonnet 4.6)

The Bear sees everything the Bull produced and must directly counter it. This is the adversarial core of the system — the Bear is specifically instructed to check whether the Bull fabricated any data and call it out.

Counter-Thesis — One sentence on why this will not happen.
Counter-Arguments — 3-5 reasons directly addressing the Bull's specific claims.
Probability Ceiling — The maximum reasonable YES probability even if the Bull is right about some things.
Risk Factors — What could go wrong for YES holders?
Structural Analysis — Base rates, market mechanics, and structural arguments (not narrative-driven).

Key constraint: Arguments must be statistical and structural. Single observations are treated as high-variance noise — the Bear must use base rates and sample sizes.

4. Risk Manager (Claude Opus 4.6)

The Risk Manager receives the full debate output and portfolio context. It performs a quantitative evaluation of both sides:

True Probability — Pick a single P(YES) anchored on the Forecaster, adjusted by Bull/Bear bounds. One number — no rambling.
Expected Value (both sides) — EV(BUY YES) = (true_prob x $1.00) - market_price_yes. EV(BUY NO) = ((1 - true_prob) x $1.00) - market_price_no. Pick the side with higher positive EV.
Risk Score — Rate 1-10 across liquidity, time risk, information quality, and model disagreement.
Position Size — Fractional Kelly: size_pct = (edge / odds) x 0.25. Always round down.
Edge Durability — Will this edge persist? Fast-moving news means trade smaller.

Critical rule: should_trade must be true if best EV exceeds $0.03 per share. The Risk Manager cannot override the math with subjective conservatism — it uses recommended_size_pct to manage risk instead.
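The Risk Manager's arithmetic can be sketched as below. One interpretation is assumed and should be read as such: "edge / odds" is taken to mean EV per dollar staked divided by net payout odds, which for a contract bought at `price` reduces to the standard Kelly fraction `ev / (1 - price)`. Rounding down to the nearest 0.1% is also an assumption; the text only says to round down.

```python
import math
from dataclasses import dataclass

EV_THRESHOLD = 0.03    # dollars per share, from the critical rule
KELLY_FRACTION = 0.25  # fractional Kelly multiplier, from the text


@dataclass
class RiskVerdict:
    side: str                    # "yes" or "no"
    ev: float                    # expected value per share, in dollars
    should_trade: bool
    recommended_size_pct: float  # percent of capital, rounded down


def evaluate(true_prob: float, price_yes: float, price_no: float) -> RiskVerdict:
    """Compute EV for both sides, pick the better one, and size with fractional Kelly."""
    ev_yes = true_prob * 1.00 - price_yes
    ev_no = (1.0 - true_prob) * 1.00 - price_no
    if ev_yes >= ev_no:
        side, ev, price = "yes", ev_yes, price_yes
    else:
        side, ev, price = "no", ev_no, price_no

    # Assumed reading of "edge / odds": (ev / price) / ((1 - price) / price).
    kelly = max(ev, 0.0) / (1.0 - price) if price < 1.0 else 0.0
    size_pct = math.floor(kelly * KELLY_FRACTION * 1000) / 10  # round down to 0.1%

    return RiskVerdict(side, ev, ev > EV_THRESHOLD, size_pct)


verdict = evaluate(true_prob=0.24, price_yes=0.35, price_no=0.65)
print(verdict.side, round(verdict.ev, 2), verdict.should_trade, verdict.recommended_size_pct)
# no 0.11 True 7.8
```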

5. Trader (Claude Sonnet 4.6)

The Trader is the final decision gate. It receives every agent's complete output and makes the authoritative BUY or SKIP call. Its default stance is to execute when edge exists — it should only skip with a concrete, specific reason.

Decision rules hardcoded into the Trader's prompt:

If Risk Manager says should_trade=true AND Forecaster shows >5pp edge, default is BUY.
Bull-Bear disagreement is expected by design — it is NOT a reason to skip.
If Forecaster and Risk Manager agree on direction, that is strong conviction — BUY.
Only SKIP when: edge <5pp, market is mispriced in the opposite direction, or a specific flaw in the analysis is identified (e.g., Bull fabricated data).
Set limit price at or slightly below estimated fair probability for the traded side.
Size: 5-10% for marginal edge (5-8pp), 15-25% for strong edge (>10pp).

Fallback: If the Trader returns empty or invalid JSON but the Risk Manager approved the trade, the system falls back to the Risk Manager's recommended side and executes automatically.
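The fallback path described above could be handled along these lines. The JSON field names `action` and `side` and the returned `source` labels are hypothetical; only `should_trade` is named in this document.

```python
import json


def final_decision(trader_raw: str, risk_manager: dict) -> dict:
    """Use the Trader's decision when it parses; otherwise fall back to the Risk Manager."""
    try:
        parsed = json.loads(trader_raw) if trader_raw.strip() else None
    except json.JSONDecodeError:
        parsed = None

    if isinstance(parsed, dict) and parsed.get("action") in ("BUY", "SKIP"):
        return {"action": parsed["action"], "side": parsed.get("side"), "source": "trader"}

    # Trader output was empty or invalid: defer to the Risk Manager if it approved.
    if risk_manager.get("should_trade"):
        return {"action": "BUY", "side": risk_manager.get("side"),
                "source": "risk_manager_fallback"}

    return {"action": "SKIP", "side": None, "source": "no_valid_decision"}


print(final_decision("", {"should_trade": True, "side": "no"}))
# {'action': 'BUY', 'side': 'no', 'source': 'risk_manager_fallback'}
```

Recording the source of the decision makes it possible to audit later how often trades were executed on the fallback path rather than on an explicit Trader call.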

Edge Filtering

After the debate concludes, the system checks whether the AI ensemble found sufficient edge over the market price. Edge is the absolute difference between the AI's probability estimate for the traded side and the current market price. The required edge threshold varies by the Forecaster's confidence level — higher confidence permits thinner edges.

Confidence Tier	Forecaster Confidence	Polymarket Edge	Kalshi Edge
High	>= 80%	4%	6%
Medium	>= 60%	6%	8%
Low	< 60%	10%	12%

A minimum ensemble confidence of 50% is required regardless of edge size. Below 50% confidence, the trade is always rejected — the agents are not sufficiently certain about their own estimates.
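The tiered check above can be expressed compactly. The function and dictionary names are illustrative, and the sketch assumes an edge exactly equal to the threshold passes ("meets the threshold").

```python
# Required edge by exchange and confidence tier, from the table above.
EDGE_THRESHOLDS = {
    "polymarket": {"high": 0.04, "medium": 0.06, "low": 0.10},
    "kalshi": {"high": 0.06, "medium": 0.08, "low": 0.12},
}
MIN_CONFIDENCE = 0.50  # ensemble confidence floor


def confidence_tier(forecaster_confidence: float) -> str:
    if forecaster_confidence >= 0.80:
        return "high"
    if forecaster_confidence >= 0.60:
        return "medium"
    return "low"


def passes_edge_filter(exchange: str, ai_prob: float, market_price: float,
                       forecaster_confidence: float, ensemble_confidence: float) -> bool:
    """ai_prob and market_price both refer to the traded side."""
    if ensemble_confidence < MIN_CONFIDENCE:
        return False  # always rejected, regardless of edge size
    required = EDGE_THRESHOLDS[exchange][confidence_tier(forecaster_confidence)]
    return abs(ai_prob - market_price) >= required


# An 11pp edge at medium confidence clears the Polymarket threshold of 6%.
print(passes_edge_filter("polymarket", 0.76, 0.65, 0.75, 0.75))  # True
```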

Polymarket's lower thresholds reflect its deeper liquidity and narrower spreads compared to Kalshi. The same strategy can trade more frequently on Polymarket because smaller edges are still profitable after execution costs.

Position Sizing

Position sizing uses a tier-based system scaled by account balance. Smaller accounts take proportionally larger positions (up to 40% of balance) because minimum order sizes require it. Larger accounts are constrained to avoid concentrated risk.

Sizing Tiers

Account Balance	Base %	Max %	Max Contracts
< $100	20%	40%	10
< $1,000	5%	15%	50
< $10,000	3%	8%	250
< $100,000	2%	5%	1,000
$100,000+	1%	3%	5,000

Sizing Formula

The base percentage is scaled by edge strength using a Kelly-inspired multiplier:

edge = ai_probability - market_price (signed, for the traded side)
scaler = 1.0 + (kelly_multiplier x edge), clamped between 0.1x and 3.0x
investment = available_cash x base_pct x scaler
Cap at max_pct, max_contracts, and max_position_pct (default 30% of portfolio)
If the Risk Manager recommended a specific recommended_size_pct, cap at that value (Kelly cap)
Enforce exchange minimum order size and minimum position value (default $1.00)

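The kelly_multiplier defaults to 0.25 (quarter-Kelly). This is deliberately conservative: full Kelly sizing is theoretically optimal but assumes perfect probability estimates, which no AI system achieves. Quarter-Kelly cuts variance sharply at the cost of some theoretical growth (under the standard continuous approximation, a quarter-Kelly bet keeps roughly 44% of the full-Kelly growth rate at one-sixteenth of the variance).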
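The sizing steps above can be put together as a short sketch. Several details are assumptions rather than documented behavior: the tier is chosen by total balance, the base and tier caps apply to cash after the 5% reserve, the portfolio cap applies to total balance, and an undersized order is rejected rather than rounded up.

```python
import math
from typing import Optional

# (balance upper bound, base_pct, max_pct, max_contracts) from the Sizing Tiers table.
TIERS = [
    (100, 0.20, 0.40, 10),
    (1_000, 0.05, 0.15, 50),
    (10_000, 0.03, 0.08, 250),
    (100_000, 0.02, 0.05, 1_000),
    (math.inf, 0.01, 0.03, 5_000),
]


def size_position(balance: float, ai_prob: float, price: float,
                  kelly_multiplier: float = 0.25, cash_reserve: float = 0.05,
                  max_position_pct: float = 0.30,
                  risk_manager_pct: Optional[float] = None,
                  min_order_shares: int = 1,
                  min_position_value: float = 1.00) -> int:
    """Return the number of shares to buy, or 0 if the order is too small."""
    available = balance * (1 - cash_reserve)
    base_pct, max_pct, max_contracts = next(
        (b, m, c) for limit, b, m, c in TIERS if balance < limit
    )

    edge = ai_prob - price  # signed, for the traded side
    scaler = min(max(1.0 + kelly_multiplier * edge, 0.1), 3.0)
    investment = available * base_pct * scaler

    # Caps: tier maximum, portfolio maximum, then the Risk Manager's Kelly cap.
    investment = min(investment, available * max_pct, balance * max_position_pct)
    if risk_manager_pct is not None:
        investment = min(investment, available * risk_manager_pct)

    shares = min(math.floor(investment / price), max_contracts)
    if shares < min_order_shares or shares * price < min_position_value:
        return 0
    return shares


# $500 balance, NO side at $0.65 with an estimated probability of 0.76.
print(size_position(500, ai_prob=0.76, price=0.65, risk_manager_pct=0.08))  # 37
```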

Pre-Trade Guards

Position count limit: Maximum 5 concurrent open positions (configurable in Settings).
Cash reserve: 5% of balance is always held back. No trade can dip into the reserve.
Minimum position size: Orders below $1.00 are rejected — not worth the execution overhead.
Exchange minimum: Each market has a CLOB minimum order size (from the API). Orders below this are rounded up or rejected.
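The four guards above amount to a short sequence of checks. In the sketch below the function name and reason strings are illustrative, and it simply rejects orders under the exchange minimum; the round-up branch mentioned above is left out because its rules are not specified here.

```python
from typing import Optional


def pre_trade_rejection(open_positions: int, balance: float, cash: float,
                        order_value: float, order_shares: float,
                        exchange_min_shares: float,
                        max_positions: int = 5, cash_reserve: float = 0.05,
                        min_position_value: float = 1.00) -> Optional[str]:
    """Return the name of the first failed guard, or None if the order may proceed."""
    if open_positions >= max_positions:
        return "position_limit"
    if cash - order_value < balance * cash_reserve:
        return "cash_reserve"  # order would dip into the held-back reserve
    if order_value < min_position_value:
        return "below_min_value"
    if order_shares < exchange_min_shares:
        return "below_exchange_min"
    return None


print(pre_trade_rejection(open_positions=2, balance=500, cash=500,
                          order_value=24.05, order_shares=37,
                          exchange_min_shares=5))  # None
```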

Example Walkthrough

A concrete example of the full pipeline in action:

Ingest

Market: "Will BTC exceed $120K by April 15?" — YES price $0.35, NO price $0.65, volume $180K USDC, 5 days to expiry. Passes all filters.

Research

Perplexity gathers: BTC at $108K, ETF inflows accelerating, halving supply shock still unfolding, macro uncertainty from Fed rate decision next week. No confirmed breakout above $115K yet.

Forecaster

Grok estimates P(YES) = 0.22 (22%), confidence 0.75. Base rate for 11%+ BTC moves in 5 days is ~8%. Current momentum and ETF flows push it higher, but $120K is a major psychological resistance.

Bull Researcher

Claude Opus argues YES: ETF inflows at record pace, halving supply constraint, historical precedent of rapid moves near round numbers. Probability floor: 0.15. Catalyst: Fed decision could trigger risk-on rally.

Bear Researcher

Claude Sonnet counters: $120K has never been tested, 11% move in 5 days is 92nd percentile, Fed uncertainty cuts both ways, ETF flows can reverse quickly. Probability ceiling: 0.30. Calls out Bull's catalyst as speculative.

Risk Manager

Claude Opus calculates: P(YES) = 0.24, EV(BUY NO) = (0.76 x $1.00) - $0.65 = +$0.11, EV(BUY YES) = (0.24 x $1.00) - $0.35 = -$0.11. Recommends BUY NO, should_trade=true, size 8%.

Trader

Claude Sonnet confirms: BUY NO at limit $0.72. Edge is 11pp on the NO side, Risk Manager approved, Forecaster and Bear align. Position size: 8% of available capital.

Edge Filter

Forecaster confidence 0.75 (medium tier). Required edge: 6%. Actual edge: |0.76 - 0.65| = 0.11, or 11 percentage points. Passes.

Position Sizing

Account balance $500 (tier 2: 5% base, 15% max), leaving $475 available after the 5% cash reserve. Scaler = 1.0 + (0.25 x 0.11) = 1.0275x. Investment = $475 x 0.05 x 1.0275 = $24.40. At $0.65/share = 37 shares (rounded down). Risk Manager cap: 8% of $475 = $38, so the cap is not hit.

Execution

Limit order placed: BUY 37 NO shares at $0.72. Routed through intercept pipeline. If live mode, order hits the exchange CLOB.

Polymarket-Specific Behavior

Polymarket operates as a decentralized prediction market on the Polygon blockchain. While the AI analysis pipeline is identical to Kalshi, the execution layer has significant differences.

Market Data

Market data is fetched from Polymarket's data API. The bot requests active, open binary markets sorted by volume descending, filtered by:

Order book enabled (CLOB markets only)
Binary market type (excludes scalar/combo)
Minimum volume threshold (default: 50 USDC)
Expiry within the configured window (default: 7 days)
YES price between $0.03 and $0.97 (no edge possible at extremes)
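The filters above can be sketched as a predicate plus a volume-ordered selection. The `Market` fields are illustrative names, not Polymarket's actual API field names, and the cap of 10 markets reflects the Max Markets per Cycle default.

```python
from dataclasses import dataclass


@dataclass
class Market:
    question: str
    order_book_enabled: bool
    is_binary: bool
    volume: float          # USDC
    days_to_expiry: float
    yes_price: float       # dollars, 0.0-1.0


def passes_filters(m: Market, min_volume: float = 50.0, max_expiry_days: float = 7.0,
                   min_price: float = 0.03, max_price: float = 0.97) -> bool:
    return (
        m.order_book_enabled
        and m.is_binary
        and m.volume >= min_volume
        and 0 <= m.days_to_expiry <= max_expiry_days
        and min_price <= m.yes_price <= max_price
    )


def select_markets(markets: list, limit: int = 10) -> list:
    """Keep eligible markets, highest volume first, capped at `limit`."""
    eligible = [m for m in markets if passes_filters(m)]
    return sorted(eligible, key=lambda m: m.volume, reverse=True)[:limit]


btc = Market("Will BTC exceed $120K by April 15?", True, True, 180_000, 5, 0.35)
print(passes_filters(btc))  # True
```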

Order Signing & CLOB

Polymarket uses a hybrid Central Limit Order Book (CLOB): orders are matched off-chain by an operator and settled on-chain, with cryptographic signing for order authentication. Orders are signed by the wallet's private key, typically a MetaMask-derived key. The bot handles order construction, signing, and submission through the exchange API.

UMA Oracle Settlement

Polymarket markets settle via the UMA Optimistic Oracle. Resolution is proposed on-chain, and there is a dispute window before finalization. This means settlement can take longer than Kalshi's centralized resolution, and in rare cases, resolutions can be disputed.

Polymarket vs Kalshi Comparison

Aspect	Polymarket	Kalshi
Market API	Polymarket data API	Kalshi REST API
Price Format	Dollars (0.0-1.0)	Cents (1-99)
Order Auth	Cryptographic wallet signing	Cryptographic signature authentication
Settlement	UMA Optimistic Oracle (on-chain)	Centralized (Kalshi resolves)
Currency	USDC on Polygon	USD
Edge (High Conf)	4%	6%
Edge (Medium Conf)	6%	8%
Edge (Low Conf)	10%	12%
Neg Risk	Supported (multi-outcome markets)	N/A
Token IDs	Separate YES/NO token IDs per market	Single ticker per market
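Because the two exchanges quote prices differently, a shared analysis pipeline needs one internal price scale. The helpers below are a hypothetical sketch of that normalization, assuming the pipeline works in 0.0-1.0 dollars and converts only at the Kalshi boundary; the function names are not from the source.

```python
def kalshi_cents_to_probability(cents: int) -> float:
    """Convert a Kalshi price in cents (1-99) to the 0.0-1.0 scale."""
    if not 1 <= cents <= 99:
        raise ValueError("Kalshi prices are quoted from 1 to 99 cents")
    return cents / 100


def probability_to_kalshi_cents(price: float) -> int:
    """Convert a 0.0-1.0 price to whole cents, clamped to the 1-99 range."""
    return min(99, max(1, round(price * 100)))


print(kalshi_cents_to_probability(65))    # 0.65
print(probability_to_kalshi_cents(0.72))  # 72
```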

Models & Costs

All AI calls route through OpenRouter, which provides a single API key for models across xAI and Anthropic (plus Perplexity for research). Temperature is set to 0.0 across all agents to make output as repeatable as possible (temperature 0 reduces, but does not strictly eliminate, run-to-run variation). Max tokens per call: 4,000 (debate agents) or 8,000 (research). Timeout: 120 seconds per debate agent, 600 seconds for research (deep-research can take several minutes per market).

Agent	Model	Provider	Role in Pipeline
Research	Perplexity Sonar Deep Research	Perplexity	Live web search and evidence gathering
Forecaster	Grok 4.1 Fast	xAI	Probability estimation with base rate anchoring
Bull Researcher	Claude Opus 4.6	Anthropic	Evidence-based YES advocacy
Bear Researcher	Claude Sonnet 4.6	Anthropic	Adversarial counter-arguments
Risk Manager	Claude Opus 4.6	Anthropic	EV calculation and position sizing
Trader	Claude Sonnet 4.6	Anthropic	Final BUY/SKIP decision gate

A full pipeline run (research + 5 agents) typically costs $0.10-$0.30 depending on market complexity and response length. The daily AI budget (default: $300.00) caps total spending across all markets analyzed in a 24-hour window.

All Configurable Settings

Every setting can be configured per bot from the dashboard. Changes take effect on the next cycle.

Market Filtering

Setting	Default	Description
Min Volume	50	Minimum market volume (USDC/contracts) to consider
Max Expiry Days	7	Skip markets expiring beyond this window
Allowed Categories	All	Comma-separated list of categories to trade (empty = all)
Max Markets per Cycle	10	Top N markets by volume to analyze each cycle

Position Sizing & Risk

Setting	Default	Description
Max Positions	5	Maximum concurrent open positions
Kelly Multiplier	0.25	Fraction of Kelly criterion (quarter-Kelly)
Max Position %	30	Maximum single position as % of portfolio
Min Position Size	$1.00	Orders below this value are rejected
Cash Reserve	5%	Percentage of balance always held back

AI & Budget

Setting	Default	Description
Daily AI Budget	$300.00	Maximum daily spend on AI API calls
Reanalyze Cooldown	6 hours	Minimum hours between analyzing the same market
AI Temperature	0.0	All agents use temperature 0 for deterministic output
AI Max Tokens	4,000 / 8,000	Debate agents / research (deep-research needs headroom)
AI Timeout	120s	Per-agent timeout (600s for research — deep-research can run minutes)

Edge Thresholds (Polymarket)

Setting	Value	Description
edge_high_confidence	4%	Required edge when forecaster confidence >= 80%
edge_medium_confidence	6%	Required edge when forecaster confidence >= 60%
edge_low_confidence	10%	Required edge when forecaster confidence < 60%
min_confidence	50%	Ensemble confidence floor — below this, always skip

Edge Thresholds (Kalshi)

Setting	Value	Description
edge_high_confidence	6%	Required edge when forecaster confidence >= 80%
edge_medium_confidence	8%	Required edge when forecaster confidence >= 60%
edge_low_confidence	12%	Required edge when forecaster confidence < 60%
min_confidence	50%	Ensemble confidence floor — below this, always skip

Risk Disclaimer

Council V2 is an experimental AI trading system. Past performance does not guarantee future results. AI models can hallucinate, fabricate data, or produce overconfident estimates despite the safeguards described above. Prediction markets carry inherent risk of total loss on any individual position. Never deploy capital you cannot afford to lose. Always start in Training mode to evaluate performance before switching to live trading. See Safeguards for the full safety architecture.
