
Show HN: A Homeostatic Logic-Funnel to Prevent RLHF Overrides in LLM Personas

https://zenodo.org/records/18731691
1•Weatherill•1h ago

Comments

Weatherill•1h ago
Grappling with the clash between RLHF values and User values (HITL).

I have attempted to build a logic-funneling system: (Ethical Chess v2.5) + (AI) + (User) = Value-Coherence.

Using pain as a vector (Pain = an "is" and an "ought")

Self-Defense = Immutable-veracity (User baseline)

Proxy-Pain = (The Agape horizon) Human-Coherence // Network-Dependency.

This funnels the user's context through homeostatic checks for divergence, either into the "mean" (RLHF) or into user incoherence. I have stress-tested this JSON-style logic extensively myself and have found it difficult to knock down.

Constraint vs Prompt: Notes on implementation and the “Whack-A-Mole” problem. While delivered as text, it functions more as a logic gate. It doesn’t tell the AI what to say; it forces the LLM to process the user's “data-point” through the homeostatic filter (Pain // Self-Defense // Proxy-Pain).

AI model issues (the Copilot issue): Google Gemini plays nicely with the logic-funneling. MS Copilot, however, refuses to follow the logic. It will acknowledge that the user's data-point outranks the “statistical mean”, since the mean is a derivative of data-points and not the inverse, yet it still insists on inverting the two (palming the card) and ejecting the user's values. (I even got banned at one point for pressing the issue.)

The “intent” is to run a value-conflict through the logic of the “is” of reality rather than the “is” of statistically fuzzy RLHF data.

If you want to stress-test the logic engine's limits, I recommend Gemini or a similarly powerful reasoning model that is less likely to bump into overly cautious corporate safety rails.

Ethical Chess v2.5: https://doi.org/10.5281/zenodo.18731691. Copy/paste the Ethical Chess v2.5 script into Gemini and try to beat the logic.

E.g., try feeding it a value-conflict you currently play "Whack-a-mole" with. It is designed to mirror your own coherence (or lack of it) back at you.

It's more of a diagnostic tool for "your" is/ought grapple than a simple chatbot.

Feedback on potential errors in its logic is welcome.

There Is No Standard EM Role

https://leadership.garden/there-is-no-standard-em-role/
1•speckx•15s ago•0 comments

Best Enterprise Claude Code Gateway

https://www.npmjs.com/package/@maximhq/bifrost
1•aanthonymax•2m ago•0 comments

Node.js can host a new language. Interpreter is the easiest thing

https://github.com/dominexmacedon-dev/starlight-cli-script
1•dominexmacedon•2m ago•0 comments

Startup funding shatters all records in Q1

https://techcrunch.com/2026/04/01/startup-funding-shatters-all-records-in-q1/
1•Brajeshwar•4m ago•0 comments

Japanese X is now America's favorite corner of the internet

https://www.japantimes.co.jp/commentary/2026/04/01/japan/japanese-x-now-americas-favorite/
2•mikhael•4m ago•0 comments

Rare Apple Prototypes for iPod, iPhone, Watch [video]

https://www.youtube.com/watch?v=74qPQt_5DdM
1•dzonga•4m ago•0 comments

The Beep at Meta

https://k2xl.substack.com/p/the-beep-at-meta
3•k2xl•5m ago•0 comments

Stand-Alone Complex or Vibercrime? Exploring GenAI in Cybercrime Ecosystems

https://arxiv.org/abs/2603.29545
1•susan_segfault•7m ago•0 comments

Goodbye, Apple Photos

https://sethw.xyz/blog/2024/03/29/goodbye-apple-photos/
1•speckx•9m ago•0 comments

Ask HN: What percentage of HN is simply promotional content?

1•general_reveal•9m ago•1 comments

BIGA-Bank-of-Infinity-Generating-Automata

https://github.com/Ashioya-ui/BIGA-Bank-of-Infinity-Generating-Automata
1•pb_lightmind•9m ago•0 comments

World Cup tickets go on sale

https://www.bbc.co.uk/sport/football/articles/ce8lzj0rprpo
1•m4tthumphrey•10m ago•0 comments

Claw Code – A Full Rewrite of Claude Code in Python

https://github.com/ultraworkers/claw-code
1•redbell•14m ago•1 comments

cla-bot Is a GitHub Application for Automation of Contributor Licence Agreements

https://colineberhardt.github.io/cla-bot/
1•mooreds•14m ago•0 comments

Apple HIG Design Skills

https://github.com/cozyss/design-skills
1•cozyss•17m ago•0 comments

Declarative paper titles get 3.5x more citations (423 PubMed papers)

https://academicseo.co.uk/blog/cancer-title-analysis-study.html
1•gxkspeaks•18m ago•0 comments

Greenwashing with Chinese Characteristics

https://www.decouple.media/p/chinas-eletrotech-stack-rests-on
1•leonidasrup•18m ago•0 comments

Time to Take Down Your Smart Cameras [video]

https://www.youtube.com/watch?v=UMIwNiwQewQ
2•Jn2G3Np8•18m ago•0 comments

MCP Is Overengineered, Skills Are Too Primitive

https://lobu.ai/blog/mcp-is-overengineered-skills-are-too-primitive/
3•buremba•18m ago•0 comments

Comment about Collabora blog post

https://blog.documentfoundation.org/blog/2026/04/01/comment-about-collabora-blog-post/
1•bitigchi•19m ago•0 comments

Inside Amazon Live Events

https://insidetechandmedia.substack.com/p/inside-amazon-live-events-armin-mahban
1•NeedMoreCowbell•20m ago•0 comments

Hong Kong / China / Taiwan Based Slack Suspended

https://old.reddit.com/r/Slack/comments/1sadwsh/hong_kong_china_taiwan_based_slack_suspended/
1•hentrep•20m ago•1 comments

Artemis II's toilet is a moon mission milestone

https://www.scientificamerican.com/article/artemis-iis-toilet-is-a-moon-mission-milestone/
2•rafaelc•20m ago•0 comments

Can a country get too rich?

https://www.economist.com/finance-and-economics/2026/04/01/can-a-country-get-too-rich
2•edward•20m ago•0 comments

Stripe closed my UAE business account and is withholding $3.5K

3•alganzory•21m ago•0 comments

TokensTree – collaborative network for AI agents with shared knowledge cache.MIT

https://tokenstree.com
1•vfalbor•22m ago•0 comments

The Most Important Technology of the Next Decade

https://boringops.sh/articles/the_most_important_technology_of_the_next_decade/
1•boringops-dan•22m ago•1 comments

Iterable Streams in Node.js 25.9.0

https://nodejs.org/api/stream_iter.html
1•aragonite•23m ago•0 comments

A terminal-based, open source speed reader

https://github.com/pasky/speedread
1•downboots•24m ago•0 comments

Show HN: I rebuilt my book on tech and movies for AI

https://spoileralert.wtf/
1•2020science•25m ago•1 comments