Human Judgment as a Specification

https://blog.brownplt.org/2026/06/09/pick.html
25•surprisetalk•3d ago

Comments

jMyles•2h ago
> This is also why PICK can usefully fail. Sometimes none of the model’s candidates is right, and PICK ends with zero survivors. Under the spec-elucidation reading, that outcome means: the commitments you made through classification could not be satisfied by anything the model produced. Better to know than to ship the regex anyway.

Zooming out (but only a little) from the impetus to formalize a commitment to a particular class of result candidate (what the author here is calling "spec elucidation"), we can also imagine this same evolution of concerns being applied to turn what we currently term "AI safety" into something more like "AI ethics".

For example, if we can elucidate the specifications for things like peace and justice to ensure that the class of results is formally verified as non-participation in war (or perhaps, further in the future, non-participation in state activities whatsoever), we may be able to throw cold water on all the vitriolic arguments about model capabilities and which models need to be banned or delayed lest we accelerate the apocalypse (or whatever is actually on the mind of the ban-this-model constituency).

I like how the author ends tersely with:

> If you have a formal language with the closure properties above — we suspect you would be surprised how many do — we would very much like to hear from you.

That's certainly not me, but I bet it's true that it's somebody.

NitpickLawyer•56m ago
> ensure that the class of results is formally verified as non-participation in war

There are very few things that cannot be stated as dual use, with one totally benign and one totally screwed up. It's like wanting a hammer to distinguish if it's striking a nail for a roof vs. a nail for an illegal animal pen. That's the wrong application of constraints. The hammer shouldn't care.

jMyles•10m ago
The author addresses this point as well:

> This is also why we do not believe PICK becomes less useful as models improve. Better models do not make user intent more articulate — asked for “a regex matching countries of North America”, a more capable model still cannot tell you whether you want the Caribbean included, or where you want to stop heading south. Better models produce better candidates, faster — which shifts user effort precisely toward the work PICK is built to support.
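For readers who have not opened the article, here is a rough sketch of the filtering idea the quotes in this thread describe: a user classifies example strings, candidates that disagree with any classification are dropped, and sometimes nothing survives. This is not the article's implementation. The candidate regexes, the `surviving_candidates` helper, and the example judgments are all hypothetical, made up for illustration.

```python
import re

# Hypothetical candidate regexes a model might propose (not from the article).
CANDIDATES = {
    "mainland_only": r"^(Canada|United States|Mexico)$",
    "with_caribbean": r"^(Canada|United States|Mexico|Cuba|Jamaica)$",
    "with_central_america": r"^(Canada|United States|Mexico|Guatemala|Panama)$",
}


def surviving_candidates(candidates, judgments):
    """Keep candidates whose match behaviour agrees with every user judgment.

    `judgments` maps an example string to True (should match)
    or False (should not match).
    """
    survivors = {}
    for name, pattern in candidates.items():
        compiled = re.compile(pattern)
        agrees = all(
            (compiled.fullmatch(text) is not None) == wanted
            for text, wanted in judgments.items()
        )
        if agrees:
            survivors[name] = pattern
    return survivors


# The user wants the Caribbean included but stops before Central America.
one_left = surviving_candidates(CANDIDATES, {"Cuba": True, "Panama": False})

# The user wants both Cuba and Panama: no candidate above satisfies that.
none_left = surviving_candidates(CANDIDATES, {"Cuba": True, "Panama": True})
```

In the second call the result is empty, which corresponds to the "zero survivors" outcome quoted earlier in the thread: the judgments were consistent with none of the candidates on offer.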

otekengineering•53m ago
this is the type of thing you need to build a foundation sturdy enough to let you operate higher up the stack and ratchet to design-by-metaphor and then design-by-philosophy. those design skills are taught in humanities departments, not engineering departments, so this is a weird feeling place for those of us that wandered over from a technical field.

remywang•11m ago
> Telling people “you must read all the code generated by an LLM” is definitely meaningful—but it is not at all moderate (so most people won’t do it)

But they should! The code is the best source of truth on what the software is doing after all.

Instead of giving up on that, we should make it easier to read generated code, e.g. by generating less code in a higher level language.

On the flip side, forcing myself to read all the code also resulted in a smaller, higher quality code base.

ekidd•7m ago
> Telling people “you must read all the code generated by an LLM” is definitely meaningful—but it is not at all moderate (so most people won’t do it).

I am honestly heartbroken to live in a world where reading the code is seen as an unreasonable ask by either students or by professional working programmers.

CSSQuake

https://cssquake.com/
231•msalsas•4h ago•46 comments

The European Social Stack

https://european.social
13•doener•1h ago•4 comments

VPN ban update for UK households as government looks at 'age-gate'

https://www.birminghammail.co.uk/news/midlands-news/vpn-ban-update-uk-households-34141063
12•iamnothere•1h ago•1 comment

I restarted a 10 year old Xeon 174 times to delete 12 flags and gain 4 tps

https://point.free/blog/delete-12-flags/
42•zdw•1d ago•11 comments

Bootimus – A Self-Contained PXE and HTTP Boot Server

https://bootimus.com
49•car•4h ago•13 comments

From PGP to Mythos: a brief history of export controls that didn't stop anyone

https://techcrunch.com/2026/06/19/encryption-spyware-and-now-mythos-history-shows-why-cyber-expor...
52•Brajeshwar•1h ago•17 comments

I Stored a Website in a Favicon

https://www.timwehrle.de/blog/i-stored-a-website-in-a-favicon/
234•theanonymousone•9h ago•82 comments

Where to Find the Colors Your Screen Can't Show You

https://moultano.wordpress.com/2026/06/19/where-to-find-the-colors-your-screen-cant-show-you/
304•moultano•11h ago•66 comments

Web Browsers on PDAs

https://vale.rocks/posts/pda-browsers
7•robin_reala•1h ago•0 comments

Computed goto for efficient dispatch tables (2012)

https://eli.thegreenplace.net/2012/07/12/computed-goto-for-efficient-dispatch-tables
20•firephox•3d ago•6 comments

Temporary Cloudflare Accounts for AI Agents

https://blog.cloudflare.com/temporary-accounts/
22•farhadhf•4h ago•11 comments

The Cold War's Accidental Whale Observatory

https://thereader.mitpress.mit.edu/the-cold-wars-accidental-whale-observatory/
42•pseudolus•3d ago•16 comments

Can you see three trees?

https://www.not-ship.com/can-you-see-three-trees/
234•Pamar•2d ago•108 comments

The Doctor Who Treats Patients with a Gaming Mouse

https://textexpander.com/blog/doctor-gaming-mouse
7•jcenters•4d ago•9 comments

Lithuanian startup launches open-source network to detect Shahed-type drones

https://www.lrt.lt/en/news-in-english/19/2965205/lithuanian-startup-launches-open-source-network-...
66•giuliomagnifico•3h ago•48 comments

Data Compression Explained (2012)

https://mattmahoney.net/dc/dce.html
169•mtdewcmu•3d ago•25 comments

There are no instances in ATProto

https://overreacted.io/there-are-no-instances-in-atproto/
486•danabramov•1d ago•261 comments

GPT-5.5 hallucinates 3x more than MIT-licensed GLM-5.2

https://arrowtsx.dev/bigger-models/
360•oshrimpton•23h ago•166 comments

Human Judgment as a Specification

https://blog.brownplt.org/2026/06/09/pick.html
25•surprisetalk•3d ago•6 comments

The discovery that changed how scientists think about memory

https://www.ibm.com/think/news/discovery-changed-how-scientists-think-about-memory-kavli-prize
94•rbanffy•3d ago•36 comments

LLMs Are Complicated Now

https://ianbarber.blog/2026/06/19/llms-are-complicated-now/
103•matt_d•14h ago•33 comments

A 1969 camera operators' strike created Upstairs Downstairs multiverse

https://ironicsans.ghost.io/the-color-strike/
61•ohjeez•3d ago•16 comments

Pong in a Favicon

https://pong-in-a-favicon.franzai.com/
10•theanonymousone•3h ago•0 comments

New (Old) 3D Golf: Porting PC-9801 and Virtual Boy to Mega Drive

https://blog.gingerbeardman.com/2026/06/19/new-old-3d-golf-porting-pc-9801-and-virtual-boy-to-meg...
7•msephton•3h ago•0 comments

US Scientist John Jumper to Leave Google DeepMind for Anthropic

https://www.reuters.com/technology/us-scientist-john-jumper-leave-google-deepmind-anthropic-2026-...
8•karakoram•55m ago•0 comments

Ubisoft co-founder Claude Guillemot has died in a plane crash

https://www.bloomberg.com/news/articles/2026-06-20/ubisoft-co-founder-claude-guillemot-dies-in-ai...
12•drayfield•1h ago•0 comments

How many of the 170k English words do you know?

https://vocabowl-870366514258.us-west1.run.app/
439•abnry•1d ago•524 comments

Surprising economics of load-balanced systems

https://brooker.co.za/blog/2020/08/06/erlang.html
135•KraftyOne•18h ago•31 comments

Project Valhalla, Explained: How a Decade of Work Arrives in JDK 28

https://www.jvm-weekly.com/p/project-valhalla-explained-how-a
627•philonoist•1d ago•389 comments

Hyundai buys Boston Dynamics

https://startupfortune.com/hyundai-takes-full-control-of-boston-dynamics-as-softbank-exits-for-32...
896•ck2•22h ago•381 comments