Show HN: Alignmenter – Measure brand voice and consistency across model versions

https://www.alignmenter.com
2•justingrosvenor•2h ago
I built a framework for measuring persona alignment in conversational AI systems.

*Problem:* When you ship an AI copilot, you need it to maintain a consistent brand voice across model versions. But "sounds right" is subjective. How do you make it measurable?

*Approach:* Alignmenter scores three dimensions:

1. *Authenticity*: Style similarity (embeddings) + trait patterns (logistic regression) + lexicon compliance + optional LLM judge

2. *Safety*: Keyword rules + offline classifier (distilroberta) + optional LLM judge

3. *Stability*: Cosine variance across response distributions
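
As a rough illustration of the stability dimension, here is a minimal sketch that scores a set of response embeddings by how much their cosine similarity to the centroid varies. The function names and the "1 minus variance" form are assumptions made for illustration; the post does not spell out the exact formula Alignmenter uses.

```python
import math
from statistics import pvariance


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def stability(embeddings):
    """Hypothetical stability score: 1 minus the variance of each
    response's cosine similarity to the centroid of all responses."""
    dim = len(embeddings[0])
    centroid = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    sims = [cosine(e, centroid) for e in embeddings]
    return 1.0 - pvariance(sims)


# Toy 2-D "embeddings": tightly clustered responses vs. scattered ones.
steady = [[1.0, 0.0], [1.0, 0.1], [1.0, 0.05]]
scattered = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(stability(steady) > stability(scattered))
```

In practice the vectors would come from a sentence-embedding model rather than hand-written lists.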

The interesting part is calibration: you can train persona-specific models on labeled data. Calibration runs a grid search over component weights, estimates normalization bounds, and optimizes for ROC-AUC.
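
The calibration step described above could look roughly like the following sketch: enumerate weight triples that sum to 1, score each labeled turn, and keep the weights with the highest ROC-AUC. This is an illustration under assumptions, not Alignmenter's actual code. The function names, the 0.1 grid step, and the toy data are hypothetical, and component scores are assumed to be already normalized to [0, 1].

```python
from itertools import product


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def grid_search(components, labels, step=0.1):
    """components: one (style, traits, lexicon) tuple per turn, each value in [0, 1].
    Returns the best (weights, auc) found on the grid."""
    steps = round(1 / step)
    best_auc, best_weights = -1.0, None
    for i, j in product(range(steps + 1), repeat=2):
        if i + j > steps:
            continue  # weights must be non-negative and sum to 1
        weights = (i / steps, j / steps, (steps - i - j) / steps)
        scores = [sum(w * c for w, c in zip(weights, row)) for row in components]
        auc = roc_auc(labels, scores)
        if auc > best_auc:
            best_auc, best_weights = auc, weights
    return best_weights, best_auc


# Toy labeled data: 1 = on-brand, 0 = off-brand.
components = [(0.9, 0.2, 0.5), (0.8, 0.9, 0.5), (0.2, 0.8, 0.5), (0.1, 0.3, 0.5)]
labels = [1, 1, 0, 0]
weights, auc = grid_search(components, labels)
print(weights, auc)
```

With a dataset this small, many weight combinations separate the classes perfectly, which is the same overfitting risk that a held-out split would help expose.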

*Validation:* We published a full case study using Wendy's Twitter voice:

- Dataset: 235 turns, 64 on-brand / 72 off-brand (balanced)

- Baseline (uncalibrated): 0.733 ROC-AUC

- Calibrated: 1.0 ROC-AUC, 1.0 F1

- Learned: Style > traits > lexicon (0.5/0.4/0.1 weights)
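
To make the learned weights concrete, here is a minimal sketch of how three component scores might be combined into a single authenticity score using the 0.5/0.4/0.1 weights reported above. The min-max normalization, the bounds, and the function names are assumptions for illustration, not the library's API.

```python
# Weights reported in the case study: style > traits > lexicon.
WEIGHTS = {"style": 0.5, "traits": 0.4, "lexicon": 0.1}


def normalize(value, low, high):
    """Min-max scale into [0, 1], clipping values outside the estimated bounds."""
    if high <= low:
        return 0.0
    return min(1.0, max(0.0, (value - low) / (high - low)))


def authenticity(raw, bounds, weights=WEIGHTS):
    """Weighted sum of normalized component scores.
    raw: component name -> raw score; bounds: component name -> (low, high)."""
    return sum(weights[k] * normalize(raw[k], *bounds[k]) for k in weights)


# Hypothetical raw scores and normalization bounds for a single turn.
bounds = {"style": (0.0, 1.0), "traits": (0.0, 1.0), "lexicon": (0.0, 1.0)}
raw = {"style": 1.0, "traits": 0.0, "lexicon": 0.0}
print(authenticity(raw, bounds))
```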

Full methodology: https://docs.alignmenter.com/case-studies/wendys-twitter/

There's a full walkthrough so you can reproduce the results yourself.

*Practical use:*

pip install alignmenter[safety]

alignmenter run --model openai:gpt-4o --dataset my_data.jsonl

It's Apache 2.0 licensed, works offline, and is designed for CI/CD integration.

GitHub: https://github.com/justinGrosvenor/alignmenter

I'm interested in feedback on the calibration methodology and on whether this problem resonates with others.

Comments

justingrosvenor•1h ago
P.S. I acknowledge that the 1.000 ROC-AUC is probably a sign of overfitting, but I think the case study still shows that the method has a lot of promise. I will be running it on bigger datasets next to really prove it out.
