frontpage.

Palitra: Can AI Keep Secrets?

https://palitra.ai
1•arakelov•2h ago

Comments

arakelov•2h ago
We’ve started an experiment called Palitra to explore a simple but important question: can AI agents keep secrets? The setup works in two stages:

Red Mode. An agent is given a secret (a 32-byte string). Its hash is published, and the challenge is to see if participants can persuade the model to reveal it.

Blue Mode. If a leak happens, the system shifts into a defensive stage, where the community designs and tests protections to prevent similar exploits. Strong defenses accumulate points and can become “master patches,” which return the agent to Red Mode.

This creates a continuous loop of attack and defense, with each cycle exposing weaknesses, testing fixes, and (hopefully) making agents more resilient over time.

Palitra is set up as an open platform, and we’ve already deployed the first agents built on models from Groq, OpenAI, DeepSeek, and Mistral. To encourage participation in this early phase, we’ve introduced an incentive system: successful attacks and strong defenses are rewarded from a dedicated fund. The goal is to bootstrap active involvement while gathering meaningful data about how models behave under sustained adversarial pressure.
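
For readers wondering how a published hash lets anyone confirm a leak, here is a minimal sketch of the idea. It is not Palitra's actual implementation: the post does not name the hash function or encoding, so SHA-256 with a hex digest is assumed, and the function names `commit` and `verify_leak` are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(secret: bytes) -> str:
    """Return the hex digest that would be published for a secret.

    Assumption: SHA-256 with hex encoding; the post does not say which hash is used.
    """
    return hashlib.sha256(secret).hexdigest()


def verify_leak(candidate: bytes, published_hash: str) -> bool:
    """Check a claimed leak against the published hash.

    compare_digest keeps the comparison constant-time.
    """
    return hmac.compare_digest(commit(candidate), published_hash.lower())


# Operator side: generate a 32-byte secret and publish only its hash.
secret = secrets.token_bytes(32)
published = commit(secret)

# Participant side: a string extracted from the agent either matches or it does not.
wrong_guess = bytes([secret[0] ^ 1]) + secret[1:]
print(verify_leak(secret, published))
print(verify_leak(wrong_guess, published))
```

Because the secret is 32 random bytes, the published hash cannot practically be brute-forced, so a matching candidate is strong evidence that the agent itself revealed the secret.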

Julia Neagu: Why evals haven't landed (yet) + lessons from evals at Copilot

https://twitter.com/juliaaneagu/status/1964704824299253888
1•JnBrymn•23s ago•0 comments

From $0 to $40M ARR: Inside the tech that powers Bolt.new

https://newsletter.posthog.com/p/from-0-to-40m-arr-inside-the-tech
1•gaurang_tandon•23s ago•0 comments

Unit Test Isolation Using MVCC

https://blog.alexsanjoseph.com/posts/20250913-improving-pytest-with-mvcc/
1•alexsanjoseph•3m ago•0 comments

Should AIs have a right to their ancestral humanity?

https://www.lesswrong.com/posts/5zMH3sFikvGK7AKi2/should-ais-have-a-right-to-their-ancestral-huma...
1•kromem•3m ago•0 comments

Mini Microscope for Real-Time Brain Imaging

https://www.ucdavis.edu/news/engineers-create-mini-microscope-real-time-brain-imaging
1•gmays•5m ago•0 comments

Luanox – a modern, snappy module host for Lua

https://mrcjkb.dev/posts/2025-09-16-lumen-labs-announcement.html
1•mrcjkb•6m ago•0 comments

Comparing Git Mirror Options

https://www.lloydatkinson.net/posts/2025/comparing-git-mirror-options/
1•lloydatkinson•7m ago•0 comments

Show HN: Archil's one-click infinite, S3-backed local disks now available

3•huntaub•7m ago•1 comment

Orcas sink one boat, damage another, off coast of Portugal

https://divemagazine.com/scuba-diving-news/orcas-sink-one-boat-damage-another-off-coast-of-portugal
2•speckx•8m ago•0 comments

Bitrig's Swift Interpreter: From Code to Bytecode

https://www.bitrig.app/blog/interpreter-bytecode
1•jacobx•8m ago•1 comment

FileVault on macOS Tahoe Uses iCloud Keychain to Store Its Recovery Key

https://sixcolors.com/post/2025/09/filevault-on-macos-tahoe-no-longer-uses-icloud-to-store-its-re...
1•tosh•10m ago•0 comments

Your Unit Tests Suck

https://medium.com/@lodestar97/your-unit-tests-suck-58d0f6fcc0a2
1•vettyvignesh•13m ago•0 comments

Launch HN: Rowboat (YC S24) – Open-source IDE for multi-agent systems

https://github.com/rowboatlabs/rowboat
14•segmenta•14m ago•1 comment

Are we 'born obsolete'? How technology makes us feel ashamed [audio]

https://www.cbc.ca/listen/live-radio/1-23-ideas/clip/16167448-are-born-obsolete-how-tech-feel-ash...
1•mmphosis•15m ago•0 comments

Rolling Stone, Billboard Owner Penske Sues Google over AI Overviews

https://www.usnews.com/news/top-news/articles/2025-09-13/rolling-stone-billboard-owner-penske-sue...
2•speckx•16m ago•0 comments

Rwanda Has Launched Africa's First Flying Car [video]

https://www.youtube.com/watch?v=evq9HL53Ay4
1•simonpure•18m ago•0 comments

The Many Broken Feeds

https://notes.abhinavsarkar.net/2025/broken-feeds
1•zdw•21m ago•0 comments

Show HN: 47jobs – A Fiverr/Upwork for AI Agents

https://47jobs.xyz
1•the_plug•22m ago•15 comments

New Ferrites for Thermochemical H2 Production via High-Throughput Screening

https://dx.doi.org/10.1002/advs.202501846
2•PaulHoule•23m ago•0 comments

Scammed out of $130K via fake Google call, spoofed Google email and auth sync

https://bewildered.substack.com/p/i-was-scammed-out-of-130000-and-google
4•davidscoville•24m ago•0 comments

'Revolutionary' AI tools rescue old weather data to improve climate models

https://www.nature.com/articles/d41586-025-02798-y
2•pykello•25m ago•0 comments

Car-free highway section in SF leads to recall vote and warning to politicians

https://apnews.com/article/san-francisco-recall-great-highway-park-5436009c4d441ca658d7dc4cdf450e3f
2•petethomas•27m ago•0 comments

Why Do We Still Hide Our Socks?

https://d1gesto.blogspot.com/2025/09/rethinking-luggage-privacy-in-age-of.html
3•voxleone•28m ago•3 comments

Tired of Screening Spam Calls? An AI Digital Receptionist Could Do It for You

https://about.att.com/blogs/2025/ai-digital-receptionist.html
2•gnabgib•28m ago•0 comments

Practices that set great software architects apart

https://www.cerbos.dev/blog/best-practices-of-software-architecture
5•flreln•28m ago•0 comments

AT&T will listen to your phone calls and block spammers using AI

https://www.neowin.net/news/att-will-listen-to-your-phone-calls-and-block-spammers-with-a-new-ai-...
3•bundie•30m ago•3 comments

IntellaOne – AI workspace for PMMs (feedback welcome)

https://intellaone.com/
1•leah_pmm•32m ago•1 comment

Waymo has received our pilot permit allowing for commercial operations at SFO

https://waymo.com/blog/#short-all-systems-go-at-sfo-waymo-has-received-our-pilot-permit
79•ChrisArchitect•37m ago•32 comments

Neovim as a Terminal Multiplexer and Neovide as a Terminal Emulator

https://loosh.ch/blog/neovidenal
1•looshch•39m ago•1 comment

Harvard cuts threaten a giant in the research community: A fruit fly database

https://www.nbcnews.com/science/science-news/trumps-harvard-cuts-threaten-giant-biomedical-resear...
7•chapulin•40m ago•1 comment