Patronus AI: AI Tools & Models

Learn About Patronus AI's AI Products

📋 About Patronus AI

Updated June 29, 2026

Patronus AI is a San Francisco AI lab building evaluation, security, and simulation infrastructure that helps companies use large language models and AI agents with confidence. Founded in 2023 by former Meta AI researchers Anand Kannappan and Rebecca Qian, Patronus set out to solve a problem that grew alongside generative AI: teams were shipping models they could not reliably test. Its platform automatically scores AI outputs for hallucinations, safety, and policy violations, and benchmarks them against custom, domain-specific criteria — turning "does this model behave?" from a manual spot-check into continuous, automated measurement.

The company's products span the evaluation stack. Lynx is a hallucination-detection model; Glider is a compact open-weight evaluator model that scores text against user-defined criteria and has outperformed much larger general models on judging tasks; and Percival is an evaluation copilot for agentic systems that inspects an agent's execution trace and flags failure modes such as bad planning, tool misuse, and reasoning errors. Patronus has also published widely used evaluation benchmarks, including FinanceBench for financial-domain accuracy and CopyrightCatcher for detecting regurgitated copyrighted text.

In June 2026 Patronus raised a $50 million Series B led by Greenfield Partners, with participation from Lightspeed, Notable Capital, Datadog, and Samsung, bringing total funding to roughly $70 million. Alongside the round it introduced Digital World Models: large simulated environments that reproduce realistic failure conditions so AI agents can be stress-tested before they touch live tools and data. The bet behind the company is straightforward: as organizations hand agents real authority over systems, money, and customer data, the ability to catch costly mistakes in a sandbox first becomes a core part of the AI stack. Patronus says its customers already include many of the leading AI labs and cloud providers.

🛠️ Products & Tools (1)

Patronus AI · Freemium · AI Infrastructure

AI evaluation and agent-testing platform: scores LLM outputs for hallucinations and safety, benchmarks them on custom criteria, and stress-tests AI agents in simulated "Digital World" environments before they reach production.
