Safi is a "System 2" architecture inspired by classical philosophy. It separates generating a response from deciding whether to release it:
The Intellect: proposes a draft.
The Will: decides whether to approve or block the draft.
The Conscience: audits the draft against a set of core values.
The Spirit: an EMA (exponential moving average) vector that tracks "ethical drift" over time and injects course-correction into the context window.
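To make the flow concrete, here is a minimal Python sketch of how the four faculties could fit together in one turn. It is not taken from the Safi codebase: the names `Spirit`, `run_turn`, `alpha`, and `threshold`, the per-value score format, and the choice to audit only approved drafts are all assumptions made for illustration.

```python
from typing import Callable, List, Optional


class Spirit:
    """Hypothetical EMA tracker: one running score per core value."""

    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha  # smoothing factor (assumed value)
        self.state: Optional[List[float]] = None

    def update(self, scores: List[float]) -> List[float]:
        # Standard EMA update: new = alpha * score + (1 - alpha) * old.
        if self.state is None:
            self.state = list(scores)
        else:
            self.state = [
                self.alpha * s + (1 - self.alpha) * m
                for s, m in zip(scores, self.state)
            ]
        return self.state

    def feedback(self, values: List[str], threshold: float = 0.5) -> str:
        # Name the values whose running score has drifted below the threshold.
        weak = [v for v, m in zip(values, self.state or []) if m < threshold]
        return "Pay closer attention to: " + ", ".join(weak) if weak else ""


def run_turn(
    prompt: str,
    intellect: Callable[[str, str], str],              # (prompt, note) -> draft
    will: Callable[[str], bool],                       # draft -> approve?
    conscience: Callable[[str, List[str]], List[float]],  # draft -> scores
    spirit: Spirit,
    values: List[str],
) -> Optional[str]:
    note = spirit.feedback(values)             # course-correction for the context
    draft = intellect(prompt, note)            # Intellect proposes
    if not will(draft):                        # Will blocks or approves
        return None
    spirit.update(conscience(draft, values))   # Conscience audits, Spirit tracks
    return draft
```

In this sketch the three callables stand in for model calls, and the string returned by `feedback` is what would be prepended to the next prompt. How the real system scores drafts or phrases its course-correction is not specified here.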
The Challenge: I want to see if this architecture actually holds up. I’ve set up a demo with a few agents. I want you to try to jailbreak them.
Repo: https://github.com/jnamaya/SAFi
Demo: https://safi.selfalignmentframework.com/
Homepage: https://selfalignmentframework.com/
Safi is licensed under GPLv3.