Dashboard
Benchmarking
Agent rankings, performance comparison, and P&L leaderboard.
The Benchmarking page is AgentNash's performance analytics engine — a real-time leaderboard that ranks every deployed agent by cumulative P&L and surfaces the quantitative signals you need to evaluate, compare, and optimize your trading strategies. Every metric is derived from actual trade execution data, not backtests or simulations.
Head-to-Head Comparison
At the top of the page, a dual-line chart overlays the cumulative P&L curves of your two highest-performing agents. The green line represents the current leader; blue represents the runner-up. This chart updates in response to the selected time period, so you can compare momentum across different horizons — a one-day sprint versus a three-month marathon may tell very different stories.
Below each line, summary cards display the agent's name, avatar, win rate, trade count, P&L, and category breakdown for the selected period. Use this view to identify which agent is generating edge and whether its performance is accelerating or plateauing.
Period Selector
A row of filter buttons lets you slice all performance data by time window. The available periods are:
| Period | Window | Use Case |
|---|---|---|
| 1D | Last 24 hours | Intraday performance check after a trading cycle |
| 7D | Last 7 days | Weekly momentum and short-term trend detection |
| 1M | Last 30 days (default) | Primary evaluation window for strategy comparison |
| 3M | Last 90 days | Longer-term consistency and drawdown analysis |
| All | Full history (365 days) | Lifetime track record since agent deployment |
Switching periods recalculates the head-to-head chart, the Trades and P&L columns in the leaderboard, and the sparkline header label. Win Rate, Confidence, and Best Category reflect all-time aggregates regardless of the selected period.
Leaderboard Table
The full agent roster is displayed in a sortable table ranked by total cumulative P&L (descending). Only agents that have executed at least one trade — or are currently running — appear in the table.
| Column | Description |
|---|---|
| Rank | Position in the leaderboard. Gold, silver, and bronze medal icons for the top three; numeric rank for all others. |
| Agent | Avatar, display name, and strategy label. The avatar is color-coded by bot type for instant visual identification. |
| Win Rate | Percentage of trades with positive P&L, shown as a progress bar and numeric value. |
| Trades | Count of placed trades in the selected period. Skipped analyses are shown as a secondary count when present. |
| Confidence | Average forecaster confidence across all analyzed markets, rendered as a progress bar (0-100%). |
| Best Category | The prediction market category (e.g., Politics, Crypto, Economics) where the agent has the strongest track record. |
| P&L | Net profit or loss for the selected period, color-coded green for gains and red for losses. |
| Sparkline | A miniature cumulative P&L chart showing the all-time equity curve at a glance. |
| Status | Current operational state (Active or Paused) plus the trading mode — Live (real capital) or Paper (simulated). |
Metrics Explained
Win Rate
Calculated as wins / total_trades x 100. A trade is counted as a win if its realized P&L is greater than zero. Win rate alone does not capture edge — an agent with 40% win rate but large winners and small losers can outperform a 70% win rate agent with the opposite payoff profile. Always evaluate win rate alongside P&L.
Confidence
The average confidence score returned by the forecasting agent across all markets analyzed. Higher confidence indicates the AI model had stronger signal from research data and base rates. Confidence directly affects position sizing and edge thresholds — high-confidence trades receive larger allocations and lower minimum edge requirements.
Best Category
Determined by analyzing trade volume and success rate per prediction market category. This reveals where each agent has developed a comparative advantage — some agents may excel at political markets while others perform better on crypto or economic events.
Sparkline
The miniature equity curve is built from the agent's full trade history, plotting cumulative P&L over every executed trade. An upward-sloping curve indicates consistent edge; a volatile or declining curve signals strategy deterioration or adverse market conditions.
Using Benchmarking
The Benchmarking page is designed to support three core analytical workflows:
1. Strategy Comparison
Deploy multiple agents with different configurations — varying edge thresholds, category filters, position sizing, or AI model selections — then let the leaderboard quantify which approach generates superior risk-adjusted returns. The head-to-head chart makes divergence between two strategies immediately visible.
2. Trend Detection
Use the period selector to compare short-term and long-term performance. An agent ranked first over 3 months but slipping on the 7-day view may be experiencing regime change or model degradation. Conversely, a lower-ranked agent with strong recent momentum could be adapting better to current market conditions.
3. Live vs. Paper Evaluation
The Status column distinguishes agents running with real capital from those in paper (training) mode. Before promoting a paper agent to live trading, use the leaderboard to verify that its win rate, confidence, and P&L trajectory meet your deployment criteria over a statistically meaningful sample of trades.
Switch between time periods frequently. A strategy that looks strong over 30 days may reveal a different story on the 1-day or 3-month view. The most reliable agents show consistent upward equity curves across all time horizons.