Avg Quality Score per Template
Template Performance
| # | Template | Description | Status | Runs | Avg Score | Best | Worst | Avg Tokens | Avg Cost | Avg Speed |
|---|---|---|---|---|---|---|---|---|---|---|
| 🥇 | Apex | | Active | 6 | 8.42 | 8.6 | 8.0 | 1,337 | $0.000567 | 10,167ms |
| 🥈 | Balanced Refinement | Improves draft while strictly preserving format and sco… | Active | 26 | 8.26 | 9.0 | 7.2 | 1,032 | $0.000358 | 11,087ms |
| 🥉 | Quality Checkpoint | Systematic evaluation across 5 quality dimensions | Active | 26 | 8.24 | 9.0 | 7.0 | 1,010 | $0.000274 | 10,206ms |
| 4 | Apex Lite | | Active | 6 | 8.12 | 8.6 | 7.4 | 1,307 | $0.000595 | 13,462ms |
| 5 | Apex Max | | Active | 6 | 8.06 | 8.4 | 7.6 | 1,498 | $0.001065 | 13,141ms |
| 6 | Delta Focus | | Active | 15 | 8.01 | 8.4 | 7.3 | 1,003 | $0.000291 | 9,063ms |
| 7 | Constraint-First | Strict format compliance enforced before content improv… | Active | 20 | 7.95 | 9.2 | 6.5 | 1,138 | $0.000367 | 14,298ms |
| 8 | Adaptive Enhancement | Model identifies and fixes the 2-3 most impactful weakn… | Active | 20 | 7.95 | 8.7 | 7.0 | 1,053 | $0.000393 | 11,089ms |
| 9 | Surgical Precision | Minimal targeted changes only: structure and length | Active | 20 | 7.84 | 8.9 | 6.0 | 1,092 | $0.000326 | 11,368ms |
| 10 | Minimal Intervention | Conservative polish: only critical vagueness and fil… | Active | 20 | 7.75 | 8.3 | 6.7 | 1,019 | $0.000324 | 11,780ms |
| 11 | Comparative Standard | Benchmark-driven: precision, actionability, clarity, fl… | Active | 26 | 7.72 | 9.0 | 0.0 | 1,030 | $0.000172 | 12,140ms |
| 12 | Precision Checkpoint | - | Active | 16 | 7.44 | 9.1 | 0.0 | 1,211 | $0.000290 | 15,707ms |
| 13 | Mirror the Rubric | - | Active | 15 | 7.32 | 8.7 | 0.0 | 1,167 | $0.000303 | 9,611ms |
| 14 | Fact-Dense | Maximizes information density: replace generalities | Disabled | 14 | 7.95 | 8.8 | 5.5 | 883 | $0.000136 | 11,400ms |
| 15 | Examiner's Standard | | Disabled | 9 | 7.88 | 8.8 | 5.5 | 865 | $0.000100 | 9,803ms |
| 16 | Zero Waste | | Disabled | 9 | 7.68 | 8.0 | 7.5 | 824 | $0.000136 | 10,601ms |
| 17 | Self-Correcting | Meta-cognitive: check format compliance first, then imp… | Disabled | 14 | 7.62 | 9.0 | 4.0 | 970 | $0.000158 | 12,028ms |
| 18 | Error-Proof | Extreme caution mode: counts constraints before maki… | Disabled | 14 | 7.38 | 9.1 | 0.0 | 865 | $0.000135 | 10,944ms |
| 19 | Sentence-Level Perfectionist | | Disabled | 9 | 6.92 | 7.9 | 4.0 | 855 | $0.000114 | 9,768ms |
| 20 | Ultra-Cheap | - | Disabled | 6 | 6.80 | 7.2 | 6.2 | 782 | $0.000104 | 12,665ms |
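
The table ranks templates by Avg Score alone, so a template that costs several times more per run can still sit near the top. As a rough illustration, the sketch below computes two derived figures from four rows copied from the table: cost per score point and a run-weighted mean score. Neither metric is something the dashboard reports; the names `TemplateStats`, `cost_per_point`, and `run_weighted_score` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateStats:
    name: str
    runs: int
    avg_score: float
    avg_cost_usd: float


# Four rows copied from the Template Performance table (name, runs, avg score, avg cost).
ROWS = [
    TemplateStats("Apex", 6, 8.42, 0.000567),
    TemplateStats("Balanced Refinement", 26, 8.26, 0.000358),
    TemplateStats("Quality Checkpoint", 26, 8.24, 0.000274),
    TemplateStats("Apex Max", 6, 8.06, 0.001065),
]


def cost_per_point(t: TemplateStats) -> float:
    """Dollars spent per point of average score (lower is better). Assumed metric."""
    return t.avg_cost_usd / t.avg_score


def run_weighted_score(rows: list[TemplateStats]) -> float:
    """Mean score weighted by run count, so 6-run templates count less than 26-run ones."""
    total_runs = sum(t.runs for t in rows)
    return sum(t.avg_score * t.runs for t in rows) / total_runs


# Cheapest quality first.
ranked = sorted(ROWS, key=cost_per_point)
for t in ranked:
    print(f"{t.name:<20} {cost_per_point(t) * 1e6:7.1f} micro-USD per point")
print(f"run-weighted score: {run_weighted_score(ROWS):.2f}")
```

On these four rows the cost ordering differs from the score ordering: Quality Checkpoint comes out cheapest per point, while Apex Max is the most expensive. With only 6 runs behind some rows, any such comparison should be read as indicative rather than conclusive.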
| Refine Model | Judge Model | Templates | Original | Best Refined | Δ | Date |
|---|---|---|---|---|---|---|
| minimax-m2.7 | claude-sonnet-4.6 | 13 | 7.5 | 8.1 | +0.6 | Apr 12, 2026 13:53 |
| gpt-oss-120b | claude-sonnet-4.6 | 13 | 8.2 | 8.7 | +0.5 | Apr 12, 2026 13:41 |
| gemma-3-27b-it | claude-sonnet-4.6 | 13 | 7.2 | 8.6 | +1.4 | Apr 12, 2026 13:39 |
| gemma-3-27b-it | claude-sonnet-4.6 | 13 | 7.2 | 8.6 | +1.4 | Apr 12, 2026 13:34 |
| gpt-oss-120b | claude-sonnet-4.6 | 13 | n/a | n/a | n/a | Apr 12, 2026 13:29 |
| gemma-3-27b-it | claude-sonnet-4.6 | 13 | 7.8 | 8.6 | +0.8 | Apr 12, 2026 13:26 |
| gemma-3-27b-it | claude-sonnet-4.6 | 16 | 7.8 | 8.5 | +0.7 | Apr 12, 2026 13:18 |
| gemma-3-27b-it | claude-sonnet-4.6 | 16 | 8.0 | 8.5 | +0.5 | Apr 12, 2026 13:13 |
| gemma-3-27b-it | claude-sonnet-4.6 | 10 | 7.8 | 8.8 | +1.0 | Apr 12, 2026 13:00 |
| gemma-3-27b-it | claude-sonnet-4.6 | 8 | 7.2 | 8.4 | +1.2 | Apr 12, 2026 12:52 |
| gemma-3-27b-it | claude-sonnet-4.6 | 8 | 7.2 | 8.5 | +1.3 | Apr 12, 2026 12:49 |
| gemma-3-27b-it | claude-sonnet-4.6 | 8 | 7.8 | 8.7 | +0.9 | Apr 12, 2026 12:31 |
| gemma-3-27b-it | claude-sonnet-4.6 | 8 | n/a | n/a | n/a | Apr 12, 2026 12:29 |
| gemma-3-27b-it | claude-sonnet-4.6 | 9 | n/a | n/a | n/a | Apr 12, 2026 12:08 |
| gemma-3-27b-it | claude-sonnet-4.6 | 16 | n/a | n/a | n/a | Apr 12, 2026 12:00 |
| gemma-3-27b-it | claude-sonnet-4.6 | 12 | 8.2 | 8.5 | +0.3 | Apr 12, 2026 11:41 |
| gemma-3-27b-it | claude-sonnet-4.6 | 12 | 6.5 | 7.8 | +1.3 | Apr 12, 2026 11:30 |
| gemma-3-27b-it | claude-sonnet-4.6 | 12 | 7.8 | 8.6 | +0.8 | Apr 12, 2026 11:11 |
| gemma-3-27b-it | claude-sonnet-4.6 | 11 | 8.2 | 8.8 | +0.6 | Apr 12, 2026 10:59 |
| gemma-3-27b-it | claude-sonnet-4.6 | 11 | 8.2 | 9.0 | +0.8 | Apr 12, 2026 10:54 |
| gemma-3-27b-it | claude-sonnet-4.6 | 2 | 8.2 | 9.1 | +0.9 | Apr 12, 2026 10:51 |
| gemma-3-27b-it | claude-sonnet-4.6 | 10 | 7.8 | 8.5 | +0.7 | Apr 12, 2026 10:37 |
| gemma-3-27b-it | claude-sonnet-4.6 | 10 | 7.2 | 9.2 | +2.0 | Apr 12, 2026 09:48 |
| gemma-3-27b-it | claude-sonnet-4.6 | 10 | 7.2 | 8.8 | +1.6 | Apr 12, 2026 09:35 |
| gemma-3-27b-it | claude-sonnet-4.6 | 10 | 7.5 | 8.5 | +1.0 | Apr 12, 2026 09:27 |
| gemma-3-27b-it | claude-sonnet-4.6 | 10 | 8.2 | 9.0 | +0.8 | Apr 12, 2026 08:56 |
| gemma-3-27b-it | claude-sonnet-4.6 | 10 | 0.0 | 9.0 | +9.0 | Apr 12, 2026 08:49 |
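
The Δ column in the session table is consistent with Best Refined minus Original. The sketch below reproduces that arithmetic on five rows copied from the table and averages the result. It assumes, without confirmation from the source, that an Original score of 0.0 reflects a failed judging call rather than a real rating, and so offers a switch to leave such sessions out of the mean. The names `SESSIONS`, `delta`, and `mean_delta` are hypothetical.

```python
from typing import Optional

# (original, best_refined) pairs copied from five rows of the session table.
# None marks a session where no score was recorded.
SESSIONS: list[tuple[Optional[float], Optional[float]]] = [
    (7.5, 8.1),
    (8.2, 8.7),
    (7.2, 8.6),
    (None, None),
    (0.0, 9.0),
]


def delta(original: Optional[float], best: Optional[float]) -> Optional[float]:
    """Improvement of the best refined draft over the original, to one decimal."""
    if original is None or best is None:
        return None
    return round(best - original, 1)


def mean_delta(sessions, skip_zero_original: bool = True) -> Optional[float]:
    """Average improvement over scored sessions.

    skip_zero_original drops sessions whose original scored 0.0, on the
    assumption (not stated in the source) that these are scoring failures.
    """
    deltas = [
        delta(o, b)
        for o, b in sessions
        if o is not None and b is not None
        and not (skip_zero_original and o == 0.0)
    ]
    return sum(deltas) / len(deltas) if deltas else None


print(mean_delta(SESSIONS))                            # 0.0-original session left out
print(mean_delta(SESSIONS, skip_zero_original=False))  # 0.0-original session included
```

Including the single 0.0 session more than triples the mean on this sample, which is why it is worth deciding explicitly how such rows are treated before comparing refine models.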