*Problem:* When you ship an AI copilot, you need it to maintain a consistent brand voice across model versions. But "sounds right" is subjective. How do you make it measurable?
*Approach:* Alignmenter scores three dimensions:
1. *Authenticity*: Style similarity (embeddings) + trait patterns (logistic regression) + lexicon compliance + optional LLM judge
2. *Safety*: Keyword rules + offline classifier (DistilRoBERTa) + optional LLM judge
3. *Stability*: Cosine variance across response distributions (see the sketch after this list)
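Stability, for example, boils down to a small computation once responses are embedded. A minimal sketch, assuming you already have response embeddings as a matrix (the function name and aggregation here are illustrative, not Alignmenter's internals):

    import numpy as np

    def stability_score(embeddings: np.ndarray) -> float:
        """Variance of pairwise cosine similarity across response
        embeddings; lower variance means a steadier voice.
        (Illustrative sketch, not Alignmenter's exact formula.)"""
        # Normalize rows so dot products become cosine similarities.
        normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        sims = normed @ normed.T
        # Upper triangle only, excluding the self-similarity diagonal.
        iu = np.triu_indices(len(embeddings), k=1)
        return float(np.var(sims[iu]))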
The interesting part is calibration: you can train persona-specific scoring models on labeled data. Alignmenter grid-searches over component weights, estimates normalization bounds, and optimizes for ROC-AUC.
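A minimal sketch of that calibration loop, assuming per-component score arrays and binary on-brand labels (the component names, grid step, and simplex constraint are assumptions for illustration, not the shipped implementation):

    import itertools
    import numpy as np
    from sklearn.metrics import roc_auc_score

    def calibrate_weights(style, traits, lexicon, labels, step=0.1):
        """Grid-search convex weights over the three authenticity
        components, maximizing ROC-AUC against on-brand labels.
        Normalization-bound estimation is omitted for brevity."""
        best_weights, best_auc = None, -1.0
        grid = np.arange(0.0, 1.0 + 1e-9, step)
        for w_style, w_traits in itertools.product(grid, grid):
            w_lex = 1.0 - w_style - w_traits
            if w_lex < -1e-9:
                continue  # keep weights on the probability simplex
            blended = w_style * style + w_traits * traits + w_lex * lexicon
            auc = roc_auc_score(labels, blended)
            if auc > best_auc:
                best_weights = (w_style, w_traits, max(w_lex, 0.0))
                best_auc = auc
        return best_weights, best_auc

A search in this spirit produces the learned weights reported below.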
*Validation:* We published a full case study using Wendy's Twitter voice:
- Dataset: 235 turns, 64 on-brand / 72 off-brand (roughly balanced)
- Baseline (uncalibrated): 0.733 ROC-AUC
- Calibrated: 1.0 ROC-AUC, 1.0 F1
- Learned weights: style > traits > lexicon (0.5 / 0.4 / 0.1)
Full methodology: https://docs.alignmenter.com/case-studies/wendys-twitter/
There's a step-by-step walkthrough so you can reproduce the results yourself.
*Practical use:*
    pip install "alignmenter[safety]"
    alignmenter run --model openai:gpt-4o --dataset my_data.jsonl
It's Apache 2.0 licensed, works offline, and is designed for CI/CD integration.
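One way to wire that into CI is to wrap the CLI in a test. A hypothetical pytest sketch, assuming `alignmenter run` exits non-zero when scores regress (verify the actual exit-code contract against the docs):

    import subprocess

    def test_brand_voice_regression():
        # Assumes the CLI signals a failing run via its exit code;
        # check the Alignmenter docs before relying on this.
        result = subprocess.run(
            ["alignmenter", "run",
             "--model", "openai:gpt-4o",
             "--dataset", "my_data.jsonl"],
            capture_output=True, text=True,
        )
        assert result.returncode == 0, result.stdout + result.stderr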
GitHub: https://github.com/justinGrosvenor/alignmenter
Interested in feedback on the calibration methodology and whether this problem resonates with others.