AgentNash
What is AgentNash?How It WorksConnecting Your AccountTraining vs Live ModeStrategiesCouncil V2The SuperforecasterThe Council (Legacy)TerminalAgentsBenchmarkingSafeguards & RulesNuclear Option

Dashboard

Benchmarking

Agent rankings, performance comparison, and P&L leaderboard.

The Benchmarking page is AgentNash's performance analytics engine — a real-time leaderboard that ranks every deployed agent by cumulative P&L and surfaces the quantitative signals you need to evaluate, compare, and optimize your trading strategies. Every metric is derived from actual trade execution data, not backtests or simulations.

Head-to-Head Comparison

At the top of the page, a dual-line chart overlays the cumulative P&L curves of your two highest-performing agents. The green line represents the current leader; blue represents the runner-up. This chart updates in response to the selected time period, so you can compare momentum across different horizons — a one-day sprint versus a three-month marathon may tell very different stories.

Below each line, summary cards display the agent's name, avatar, win rate, trade count, P&L, and category breakdown for the selected period. Use this view to identify which agent is generating edge and whether its performance is accelerating or plateauing.

Period Selector

A row of filter buttons lets you slice all performance data by time window. The available periods are:

PeriodWindowUse Case
1DLast 24 hoursIntraday performance check after a trading cycle
7DLast 7 daysWeekly momentum and short-term trend detection
1MLast 30 days (default)Primary evaluation window for strategy comparison
3MLast 90 daysLonger-term consistency and drawdown analysis
AllFull history (365 days)Lifetime track record since agent deployment

Switching periods recalculates the head-to-head chart, the Trades and P&L columns in the leaderboard, and the sparkline header label. Win Rate, Confidence, and Best Category reflect all-time aggregates regardless of the selected period.

Leaderboard Table

The full agent roster is displayed in a sortable table ranked by total cumulative P&L (descending). Only agents that have executed at least one trade — or are currently running — appear in the table.

ColumnDescription
RankPosition in the leaderboard. Gold, silver, and bronze medal icons for the top three; numeric rank for all others.
AgentAvatar, display name, and strategy label. The avatar is color-coded by bot type for instant visual identification.
Win RatePercentage of trades with positive P&L, shown as a progress bar and numeric value.
TradesCount of placed trades in the selected period. Skipped analyses are shown as a secondary count when present.
ConfidenceAverage forecaster confidence across all analyzed markets, rendered as a progress bar (0-100%).
Best CategoryThe prediction market category (e.g., Politics, Crypto, Economics) where the agent has the strongest track record.
P&LNet profit or loss for the selected period, color-coded green for gains and red for losses.
SparklineA miniature cumulative P&L chart showing the all-time equity curve at a glance.
StatusCurrent operational state (Active or Paused) plus the trading mode — Live (real capital) or Paper (simulated).

Metrics Explained

Win Rate

Calculated as wins / total_trades x 100. A trade is counted as a win if its realized P&L is greater than zero. Win rate alone does not capture edge — an agent with 40% win rate but large winners and small losers can outperform a 70% win rate agent with the opposite payoff profile. Always evaluate win rate alongside P&L.

Confidence

The average confidence score returned by the forecasting agent across all markets analyzed. Higher confidence indicates the AI model had stronger signal from research data and base rates. Confidence directly affects position sizing and edge thresholds — high-confidence trades receive larger allocations and lower minimum edge requirements.

Best Category

Determined by analyzing trade volume and success rate per prediction market category. This reveals where each agent has developed a comparative advantage — some agents may excel at political markets while others perform better on crypto or economic events.

Sparkline

The miniature equity curve is built from the agent's full trade history, plotting cumulative P&L over every executed trade. An upward-sloping curve indicates consistent edge; a volatile or declining curve signals strategy deterioration or adverse market conditions.

Using Benchmarking

The Benchmarking page is designed to support three core analytical workflows:

1. Strategy Comparison

Deploy multiple agents with different configurations — varying edge thresholds, category filters, position sizing, or AI model selections — then let the leaderboard quantify which approach generates superior risk-adjusted returns. The head-to-head chart makes divergence between two strategies immediately visible.

2. Trend Detection

Use the period selector to compare short-term and long-term performance. An agent ranked first over 3 months but slipping on the 7-day view may be experiencing regime change or model degradation. Conversely, a lower-ranked agent with strong recent momentum could be adapting better to current market conditions.

3. Live vs. Paper Evaluation

The Status column distinguishes agents running with real capital from those in paper (training) mode. Before promoting a paper agent to live trading, use the leaderboard to verify that its win rate, confidence, and P&L trajectory meet your deployment criteria over a statistically meaningful sample of trades.

Switch between time periods frequently. A strategy that looks strong over 30 days may reveal a different story on the 1-day or 3-month view. The most reliable agents show consistent upward equity curves across all time horizons.