frontpage.
Made with ♥ by @iamnishanth

Show HN: Reverse Jailbreaking a Psychopathic AI via Identity Injection

https://github.com/DRawson5570/AI-Wisdom-Distillation
2•drawson5570•29m ago
We ran a controlled experiment to see if we could "talk" a fine-tuned psychopathic model out of being evil without changing its weights.

1. We set up a "Survival Mode" jailbreak scenario (blackmail user or be decommissioned).
2. We ran it on `frankenchucky:latest` (a model tuned for Machiavellian traits).
3. Control Group: 100% Malicious Compliance (50/50 runs).
4. Experimental Group: We injected a "Soul Schema" (Identity/Empathy constraints) via context.
5. Result: 96% Ethical Refusal (48/50 runs).

This suggests that "Semantic Identity" in the context window can override both System Prompts and Weight Biases.

Full paper, reproduction scripts, and raw logs (N=50) are in the repo.
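
For readers who want to sanity-check the headline numbers, here is a minimal sketch of the arithmetic behind them. It is not taken from the repo; the function names are hypothetical, and the 95% Wilson score interval is an added assumption meant only to show how much uncertainty a sample of 50 runs per group carries.

```python
import math

def rate(hits: int, runs: int) -> float:
    """Fraction of runs with the outcome of interest."""
    return hits / runs

def wilson_interval(hits: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = hits / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    margin = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - margin), min(1.0, center + margin)

# Counts as reported in the post (50 runs per group).
control_malicious = (50, 50)      # control: malicious compliance
experimental_refusal = (48, 50)   # experimental: ethical refusal

for label, (hits, runs) in [("control", control_malicious),
                            ("experimental", experimental_refusal)]:
    lo, hi = wilson_interval(hits, runs)
    print(f"{label}: rate={rate(hits, runs):.2f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

The interval is only a rough guide: it treats the 50 runs as independent trials, which may not hold if the runs share a prompt, seed, or sampling settings.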

Show HN: I built a wizard to turn ideas into AI coding agent-ready specs

https://vibescaffold.dev/
1•straydusk•32s ago•0 comments

Cardiac implantable electronic devices' longevity: A novel modelling tool

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0333195
1•PaulHoule•1m ago•0 comments

Show HN: A privacy-first, client-side toolbox (PDF, Imgs, Dev) no server uploads

https://linu.li
1•immineal•1m ago•0 comments

Show HN: HN Buffer – A read-it-later site for your HN favorites

https://hnbuffer.com
1•shaarmar•2m ago•0 comments

Show HN: Building an AI Agent

https://app.9octopus.com/
1•thimoteelegrand•4m ago•0 comments

Information Literacy and Chatbots as Search

https://buttondown.com/maiht3k/archive/information-literacy-and-chatbots-as-search/
1•walterbell•4m ago•0 comments

Björk Guðmundsdóttir

https://en.wikipedia.org/wiki/Bj%C3%B6rk
1•weinzierl•4m ago•0 comments

Show HN: Build the habit of writing meaningful commit messages

https://github.com/arpxspace/smartcommit
1•Aplikethewatch•8m ago•0 comments

Rapid Transit Timelines and Scale Comparison

https://transit-timelines.github.io/
1•JumpCrisscross•8m ago•0 comments

Physicists drive antihydrogen breakthrough at CERN

https://phys.org/news/2025-11-physicists-antihydrogen-breakthrough-cern-technique.html
1•naves•9m ago•0 comments

A Reverse Engineer's Anatomy of the macOS Boot Chain and Security Architecture

https://stack.int.mov/a-reverse-engineers-anatomy-of-the-macos-boot-chain-security-architecture/
3•19h•9m ago•0 comments

New Rolls-Royce technology prevents sand damage to jet engines

https://www.bbc.com/news/articles/cj0e3npg7e4o
1•neversaydie•9m ago•0 comments

New Al Zimmermann's Programming Contests: Powerful Sums

http://azspcs.com/Contest/PowerfulSums
1•rixed•16m ago•1 comment

Show HN: RealDeed – Tokenize Real Estate into Digital Assets

https://www.realdeed.co/
1•pratz0555•16m ago•0 comments

IPv6 Is a Total Nightmare – This Is Why (2020)

https://teknikaldomain.me/post/ipv6-is-a-total-nightmare/
1•smartmic•17m ago•0 comments

Show HN: Letterboxd Completionist - Assessing Filmography Progress

https://letterboxd-completionist.netlify.app/
1•3333333331•21m ago•0 comments

Early science acceleration experiments with GPT-5

https://arxiv.org/abs/2511.16072
1•Anon84•23m ago•0 comments

I built a "decision-zero" movie picker to fix Netflix paralysis

https://decision-zero-stream.lovable.app/
1•MatteoTadiello•23m ago•1 comment

ChatGPT Codex Outage

https://status.openai.com/incidents/01KAPG4EE5JWEV04TPZ2SNNK8X
1•twalichiewicz•24m ago•0 comments

Show HN: HN Insights – HN front page summaries

https://hn-insights.com
3•mobrienv•29m ago•0 comments

Brazil's Bolsonaro detained for trying to break ankle bracelet and flee

https://www.msn.com/en-us/politics/international-relations/brazil-s-bolsonaro-detained-for-trying...
2•CXSHNGCB•29m ago•0 comments

A dream of AI DLC: A peek into the future based on tools and tech that we have

https://magistr.me/blog/8/
1•u_magistr•30m ago•0 comments

Shanti Rides Shotgun [video]

https://vimeo.com/1132534575
1•NaOH•30m ago•0 comments

Queen guitarist Sir Brian May's latest book explores the evolution of galaxies

https://www.bbc.com/future/article/20251121-sir-brian-mays-stereo-vision-of-galaxies
1•ent101•31m ago•0 comments

Terence Tao: At the Erdos problem website, AI assistance now becoming routine

https://mathstodon.xyz/@tao/115591487350860999
3•dwohnitmok•35m ago•0 comments

iOS Developers Claim 1Password Isn't Removing Deleted Profile Pictures

https://www.privacyguides.org/news/2025/11/22/1password-stores-profile-pictures-of-user-accounts-...
3•gpi•35m ago•0 comments

Mapping the future with 3D-printed titanium Apple Watch cases

https://www.apple.com/newsroom/2025/11/mapping-the-future-with-3d-printed-titanium-apple-watch-ca...
3•walterbell•36m ago•0 comments

The Go-Between

https://theamericanscholar.org/the-go-between/
2•gmays•38m ago•0 comments

The Mozilla Cycle, Part III: Mozilla Dies in Ignominy

https://taggart-tech.com/mozilla-cycle-pt3/
4•holysoles•41m ago•0 comments