AI judge

An AI judge is a large language model used to evaluate other AI outputs — typically scoring a draft against criteria like voice match, factual accuracy, or constraint compliance.

The pattern (sometimes called "LLM-as-judge") uses one model to produce content and a second model, often a faster and cheaper one, to evaluate it against stated criteria. The judge typically returns a structured verdict: a score, a label, and a short reasoning trace. The reasoning is what makes the judgment legible: a number alone is opaque, but a number plus three specific stylistic deltas tells the user what to fix.
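One way to picture the structured verdict is as a small record parsed from the judge's reply. The sketch below is illustrative only: the JSON key names, the 0 to 1 score scale, and the `Verdict` and `parse_verdict` names are assumptions, not part of any particular product or API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    score: float          # assumed scale: 0.0 (no match) to 1.0 (full match)
    label: str            # hypothetical labels such as "pass" or "revise"
    reasoning: list[str]  # short, specific notes on what to fix


def parse_verdict(raw: str) -> Verdict:
    """Parse a judge reply that is assumed to be a JSON object."""
    data = json.loads(raw)
    score = float(data["score"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    reasoning = [str(note) for note in data.get("reasoning", [])]
    if not reasoning:
        # A bare number is opaque, so a verdict without reasoning is rejected.
        raise ValueError("verdict has no reasoning")
    return Verdict(score=score, label=str(data["label"]), reasoning=reasoning)
```

Rejecting a verdict that has no reasoning is a design choice in this sketch: it enforces the point that the score alone does not tell the user what to change.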

Voice-similarity scoring commonly uses an AI judge: the generator drafts the piece in the brand voice, then the judge compares the draft against the voice fingerprint and produces the similarity score. Judges are also used for quality control, fact-checking, and constraint enforcement (e.g. "did the draft include the required call-to-action?").
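The generator and judge split described above might be wired up roughly as follows. This is a sketch under stated assumptions: the prompt wording, the `judge_voice` function, and the `call_model` parameter are hypothetical, and `call_model` stands in for whatever client actually sends a prompt to the judge model and returns its text reply.

```python
import json
from typing import Callable

# Hypothetical prompt template; real wording would be tuned per use case.
JUDGE_PROMPT = """Compare the draft against the voice fingerprint.
Reply with a JSON object with keys "score" (0 to 1), "label",
and "reasoning" (a list of short, specific notes).

Voice fingerprint:
{fingerprint}

Draft:
{draft}
"""


def judge_voice(draft: str, fingerprint: str,
                call_model: Callable[[str], str]) -> dict:
    """Ask the judge model to score a draft against a voice fingerprint."""
    prompt = JUDGE_PROMPT.format(fingerprint=fingerprint, draft=draft)
    reply = call_model(prompt)  # assumed to return the judge's raw text
    return json.loads(reply)    # assumed to be a JSON object
```

Passing the model call in as a function keeps the sketch independent of any specific provider and lets the judge be exercised with a stub during testing.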

Why it matters

An AI judge turns subjective questions ("does this sound like the brand?") into measured ones. Without one, every draft requires human judgment; with one, only the borderline cases do.
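A minimal sketch of how a measured score could send only borderline drafts to a person. The threshold values and the `route` function are assumptions chosen for illustration, and the score is assumed to run from 0 to 1.

```python
def route(score: float, accept_at: float = 0.85, reject_at: float = 0.5) -> str:
    """Decide what happens to a draft given the judge's score.

    The thresholds are hypothetical; in practice they would be calibrated
    against drafts that humans have already reviewed.
    """
    if score >= accept_at:
        return "accept"        # clearly on-voice: no human needed
    if score < reject_at:
        return "reject"        # clearly off-voice: send back to the generator
    return "human_review"      # borderline: a person makes the call
```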