The evaluation and governance layer for regulated AI teams. Compare 37+ frontier LLMs on your own data, produce compliance-grade audit trails, and run the entire workflow inside the EU. Purpose-built for AI platform leads, governance teams, and CISOs who can't deploy LLMs on vibes.
Upload a dataset. Fan out across every model in the Brain Orchestra catalog (30+ today, growing as Brain Orchestra adds them). Get a scorecard covering latency, token count, credits, and quality, sortable by any column. Export to PDF / CSV / JSON.
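
As a rough illustration of the scorecard described above, the sketch below sorts per-model results by any column and exports them to CSV and JSON. The `ScoreRow` field names, the helper functions, and the sample values are assumptions made for this example; they are not the product's actual export schema or real measurements.

```python
import csv
import io
import json
from dataclasses import dataclass, asdict, fields


@dataclass
class ScoreRow:
    # Hypothetical column names; the real scorecard schema may differ.
    model: str
    latency_ms: float
    tokens: int
    credits: int
    quality: float


def sort_scorecard(rows, column, descending=False):
    """Return a new list of rows sorted by the named column."""
    return sorted(rows, key=lambda r: getattr(r, column), reverse=descending)


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(ScoreRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()


def to_json(rows):
    """Serialize rows to a JSON array of objects."""
    return json.dumps([asdict(row) for row in rows], indent=2)


# Made-up sample values, for illustration only.
rows = [
    ScoreRow("model-a", 820.0, 412, 30, 0.91),
    ScoreRow("model-b", 310.0, 388, 12, 0.84),
    ScoreRow("model-c", 540.0, 450, 18, 0.88),
]
by_quality = sort_scorecard(rows, "quality", descending=True)
by_latency = sort_scorecard(rows, "latency_ms")
```
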
LLM-as-judge scoring with a configurable rubric. Claude Opus is the default judge; swap in a different judge model for domain-specific grounding. Every model gets a quality score for each row.
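
One plausible way to turn a rubric into a per-row quality score is a weighted mean of the judge's per-criterion scores. The sketch below assumes that approach; the `Criterion` type, the criterion names, the weights, and the 0 to 1 score range are all hypothetical and are not taken from the product's rubric format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float  # relative importance; weights need not sum to 1


def row_quality(rubric, judge_scores):
    """Weighted mean of per-criterion judge scores, each assumed to be in [0, 1]."""
    total_weight = sum(c.weight for c in rubric)
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive value")
    missing = [c.name for c in rubric if c.name not in judge_scores]
    if missing:
        raise KeyError(f"judge returned no score for: {missing}")
    return sum(c.weight * judge_scores[c.name] for c in rubric) / total_weight


# Hypothetical rubric and judge output for a single row of a single model.
rubric = [
    Criterion("accuracy", 3.0),
    Criterion("groundedness", 2.0),
    Criterion("tone", 1.0),
]
judge_scores = {"accuracy": 1.0, "groundedness": 0.5, "tone": 0.0}
score = row_quality(rubric, judge_scores)  # (3*1.0 + 2*0.5 + 1*0.0) / 6
```
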
Every eval run produces an admissible audit artifact: dataset hash, model versions, hosting tier, per-row I/O, timestamps, EU-region attestation. Covers AI Act Art. 15 accuracy-logging requirements.
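
To make the audit-artifact fields above concrete, here is a minimal sketch of a tamper-evident record built with SHA-256. The record layout, key names, and the `build_audit_record` and `verify` helpers are assumptions for illustration; the actual artifact format is not specified here, and the region string is a placeholder.

```python
import hashlib
import json
from datetime import datetime, timezone


def dataset_hash(data: bytes) -> str:
    """SHA-256 hex digest of the raw dataset bytes."""
    return hashlib.sha256(data).hexdigest()


def _canonical(body: dict) -> bytes:
    # Sorted keys and fixed separators give a stable byte representation.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def build_audit_record(dataset_bytes, model_versions, hosting_tier, rows, region):
    """Assemble a record and seal it with a hash over its canonical JSON."""
    record = {
        "dataset_sha256": dataset_hash(dataset_bytes),
        "model_versions": model_versions,
        "hosting_tier": hosting_tier,
        "rows": rows,  # per-row inputs and outputs
        "region": region,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    record["record_sha256"] = hashlib.sha256(_canonical(record)).hexdigest()
    return record


def verify(record) -> bool:
    """Recompute the seal and compare it with the stored one."""
    body = {k: v for k, v in record.items() if k != "record_sha256"}
    return hashlib.sha256(_canonical(body)).hexdigest() == record["record_sha256"]


# Placeholder values, for illustration only.
record = build_audit_record(
    b"question,answer\n1,2\n",
    {"model-a": "2025-01"},
    "EU Cloud",
    [{"input": "1", "output": "2"}],
    "eu-north-1",
)
```

Any later change to a sealed field makes `verify` return `False`, which is the property an auditor would check.
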
Save suites. Detect prompt regressions before deploy. Compare model upgrades objectively. Wire into CI/CD via REST API. Prompt changes caught in dev, not by customers.
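
A regression gate in CI can be as simple as comparing a saved baseline against the scores from a candidate prompt or model. The sketch below assumes the scores have already been fetched; the function names, the case IDs, the sample scores, and the 0.02 tolerance are hypothetical, and no real REST endpoint is shown.

```python
import sys


def find_regressions(baseline, candidate, tolerance=0.02):
    """Return {case_id: (old, new)} for cases that dropped by more than tolerance.

    A case missing from the candidate run also counts as a regression.
    """
    regressions = {}
    for case_id, old in baseline.items():
        new = candidate.get(case_id)
        if new is None or old - new > tolerance:
            regressions[case_id] = (old, new)
    return regressions


def gate(baseline, candidate, tolerance=0.02):
    """Return a CI exit code: 0 if no regressions, 1 otherwise."""
    regressions = find_regressions(baseline, candidate, tolerance)
    for case_id, (old, new) in sorted(regressions.items()):
        print(f"REGRESSION {case_id}: {old} -> {new}", file=sys.stderr)
    return 1 if regressions else 0


# Made-up quality scores from a saved suite, for illustration only.
baseline = {"refund-policy": 0.92, "gdpr-request": 0.88, "greeting": 0.95}
candidate = {"refund-policy": 0.93, "gdpr-request": 0.71, "greeting": 0.94}
exit_code = gate(baseline, candidate)
# In a CI step, one would finish with: sys.exit(exit_code)
```
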
| Provider | Models | Hosting tier eligibility |
|---|---|---|
| Mistral (France, EU-owned) | Small · Medium · Large · Codestral · Devstral · Pixtral (Large + 12B) · Magistral (Small + Medium) | EU Strict · EU Cloud |
| Anthropic Claude | Haiku · Sonnet · Opus (latest) | EU Cloud · Unrestricted |
| OpenAI | GPT-4o · GPT-4o mini · GPT-4.1 (full + mini + nano) · GPT-5 (full + mini + nano) · GPT-o3 · GPT-o4 mini | Unrestricted |
| Google | Gemini Flash · Gemini Pro | Unrestricted |
| Amazon (Bedrock Stockholm + Frankfurt) | Nova Lite Sweden · Devstral Sweden | EU Cloud · Unrestricted |
| Moonshot (PRC, opt-in) | Kimi K2.6 | Unrestricted |
Live catalog at justsmarter.ai/eval — auto-syncs from Brain Orchestra every 48h, with the full per-change history at /changelog.
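
The hosting-tier column in the catalog table can be read as a simple eligibility filter. The sketch below encodes only what the table lists, treating each tier label in a cell as a separate tier; the `CATALOG` mapping, the `eligible_providers` helper, and the "Google" key for the Gemini row are assumptions for illustration, and the live catalog may differ.

```python
# Provider -> hosting tiers listed for it in the table above.
# Assumption: a cell showing two labels means the provider is eligible for both.
CATALOG = {
    "Mistral": {"EU Strict", "EU Cloud"},
    "Anthropic Claude": {"EU Cloud", "Unrestricted"},
    "OpenAI": {"Unrestricted"},
    "Google": {"Unrestricted"},
    "Amazon": {"EU Cloud", "Unrestricted"},
    "Moonshot": {"Unrestricted"},
}


def eligible_providers(tier):
    """Return the providers whose table row lists the given hosting tier."""
    return sorted(p for p, tiers in CATALOG.items() if tier in tiers)


eu_strict = eligible_providers("EU Strict")
eu_cloud = eligible_providers("EU Cloud")
```
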
| Plan | Price | Monthly credits | Includes |
|---|---|---|---|
| Free | €0 · 14-day trial | 50,000 | Open compliance tier only · credit card required · hard stop at trial end · trial data purged 90 days after expiry |
| Pro | €29 / month | 500,000 | All hosting tiers (incl. EU Strict) · top-up packs · drift dashboard (30-day history) · BYOB key supported |
| Team | €299 / month | 15,000,000 | Slack support · top-up at 10% off · drift dashboard with extended history · 99.5% target SLA |
| Business | €799 / month | 100,000,000 | Workspace SSO (Google / Microsoft) · admin console · top-up at 15% off · 99.5% target SLA · contact sales |
| Enterprise | from €30,000 / year | 500M+ (custom) | SAML / SCIM · counter-signed DPA + SCCs · dedicated EU region · custom eval kits · dedicated CSM · 99.9% target SLA |
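
For a quick comparison of the paid plans, the list price can be divided by the included monthly credits. The sketch below does that arithmetic for the three fixed-price plans in the table; it ignores top-up packs and discounts, leaves out Enterprise because its pricing is custom, and the `eur_per_million_credits` helper is a name introduced only for this example.

```python
# (monthly price in EUR, included monthly credits), taken from the pricing table.
PLANS = {
    "Pro": (29, 500_000),
    "Team": (299, 15_000_000),
    "Business": (799, 100_000_000),
}


def eur_per_million_credits(price_eur, credits):
    """List price per one million included credits."""
    return price_eur * 1_000_000 / credits


unit_prices = {
    name: round(eur_per_million_credits(price, credits), 2)
    for name, (price, credits) in PLANS.items()
}
# Pro: 58.0, Team: 19.93, Business: 7.99 (EUR per million credits)
```
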