Configure a comparison session using the seeded research fixtures and available model adapters.

Drizzle + SQLite
Bun workflow
Proxy-task MVP
Model adapters available
5 ready, 2 need setup.
Open model setup and test panel
Openai.gpt Oss 120b: ready
Anthropic.claude Sonnet 4 6: ready
Moonshotai.kimi K2.5: ready
Minimax.minimax M2.5: ready
Gemini 2.5 Pro: setup needed
Mistral Large Latest: setup needed
Local Model: ready
Compose a run
Select curated tasks, pick the routing strategy, and launch the run into the background.
Include baseline runs

Keep solo and random comparison runs in the same session.

Tasks

19 of 19 selected

Proxy set
Model set
Select the model adapters to compare. The first selected model becomes the fallback judge.
Selected models: Bedrock OpenAI, Bedrock Claude, Bedrock Kimi K2.5, Bedrock MiniMax M2.5, LM Studio local-model
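The fallback-judge rule can be illustrated with a minimal sketch. It assumes the selection is an ordered list; the `ModelAdapter` shape, its field names, the `id` values, and the `pickFallbackJudge` helper are all hypothetical and not taken from the application's code.

```typescript
// Hypothetical shape of a selected model adapter; field names are assumptions.
interface ModelAdapter {
  id: string;
  label: string;
}

// Returns the fallback judge: the first adapter in selection order.
// Throws if nothing is selected, since no judge could then be assigned.
function pickFallbackJudge(selected: ModelAdapter[]): ModelAdapter {
  const first = selected[0];
  if (first === undefined) {
    throw new Error("Select at least one model adapter before launching a run.");
  }
  return first;
}

// Labels mirror the selection shown on this page; the ids are made up.
const selectedModels: ModelAdapter[] = [
  { id: "bedrock-openai", label: "Bedrock OpenAI" },
  { id: "bedrock-claude", label: "Bedrock Claude" },
  { id: "bedrock-kimi", label: "Bedrock Kimi K2.5" },
  { id: "bedrock-minimax", label: "Bedrock MiniMax M2.5" },
  { id: "lmstudio-local", label: "LM Studio local-model" },
];

const judge = pickFallbackJudge(selectedModels);
console.log(`Fallback judge: ${judge.label}`);
```

Under this assumption, reordering the selection changes the fallback judge, so the model you want as judge should be selected first.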
Launch
The run persists checkpoints, audit logs, and summary metrics in SQLite.

Keep the public corpus aligned with the research scope and curated fixture policy.

Need to configure a provider or test a local LM Studio server? Open model setup
Need to import tasks first? Open task library