
Five disciplines discovered the same math independently – none of them knew

https://freethemath.org
1•energyscholar•46s ago•0 comments

We Scanned an AI Assistant for Security Issues: 12,465 Vulnerabilities

https://codeslick.dev/blog/openclaw-security-audit
1•vitorlourenco•1m ago•0 comments

Amazon no longer defends cloud customers against video patent infringement claims

https://ipfray.com/amazon-no-longer-defends-cloud-customers-against-video-patent-infringement-cla...
1•ffworld•2m ago•0 comments

Show HN: Medinilla – an OCPP compliant .NET back end (partially done)

https://github.com/eliodecolli/Medinilla
2•rhcm•5m ago•0 comments

How Does AI Distribute the Pie? Large Language Models and the Ultimatum Game

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6157066
1•dkga•5m ago•1 comments

Resistance Infrastructure

https://www.profgalloway.com/resistance-infrastructure/
2•samizdis•9m ago•0 comments

Fire-juggling unicyclist caught performing on crossing

https://news.sky.com/story/fire-juggling-unicyclist-caught-performing-on-crossing-13504459
1•austinallegro•10m ago•0 comments

Restoring a lost 1981 Unix roguelike (protoHack) and preserving Hack 1.0.3

https://github.com/Critlist/protoHack
2•Critlist•12m ago•0 comments

GPS and Time Dilation – Special and General Relativity

https://philosophersview.com/gps-and-time-dilation/
1•mistyvales•15m ago•0 comments

Show HN: Witnessd – Prove human authorship via hardware-bound jitter seals

https://github.com/writerslogic/witnessd
1•davidcondrey•15m ago•1 comments

Show HN: I built a clawdbot that texts like your crush

https://14.israelfirew.co
2•IsruAlpha•17m ago•2 comments

Scientists reverse Alzheimer's in mice and restore memory (2025)

https://www.sciencedaily.com/releases/2025/12/251224032354.htm
1•walterbell•20m ago•0 comments

Compiling Prolog to Forth [pdf]

https://vfxforth.com/flag/jfar/vol4/no4/article4.pdf
1•todsacerdoti•21m ago•0 comments

Show HN: Cymatica – an experimental, meditative audiovisual app

https://apps.apple.com/us/app/cymatica-sounds-visualizer/id6748863721
1•_august•23m ago•0 comments

GitBlack: Tracing America's Foundation

https://gitblack.vercel.app/
2•martialg•23m ago•0 comments

Horizon-LM: A RAM-Centric Architecture for LLM Training

https://arxiv.org/abs/2602.04816
1•chrsw•23m ago•0 comments

We just ordered shawarma and fries from Cursor [video]

https://www.youtube.com/shorts/WALQOiugbWc
1•jeffreyjin•24m ago•1 comments

Correctio

https://rhetoric.byu.edu/Figures/C/correctio.htm
1•grantpitt•24m ago•0 comments

Trying to make an Automated Ecologist: A first pass through the Biotime dataset

https://chillphysicsenjoyer.substack.com/p/trying-to-make-an-automated-ecologist
1•crescit_eundo•28m ago•0 comments

Watch Ukraine's Minigun-Firing, Drone-Hunting Turboprop in Action

https://www.twz.com/air/watch-ukraines-minigun-firing-drone-hunting-turboprop-in-action
1•breve•29m ago•0 comments

Free Trial: AI Interviewer

https://ai-interviewer.nuvoice.ai/
1•sijain2•29m ago•0 comments

FDA intends to take action against non-FDA-approved GLP-1 drugs

https://www.fda.gov/news-events/press-announcements/fda-intends-take-action-against-non-fda-appro...
21•randycupertino•31m ago•12 comments

Supernote e-ink devices for writing like paper

https://supernote.eu/choose-your-product/
3•janandonly•33m ago•0 comments

We are QA Engineers now

https://serce.me/posts/2026-02-05-we-are-qa-engineers-now
1•SerCe•33m ago•0 comments

Show HN: Measuring how AI agent teams improve issue resolution on SWE-Verified

https://arxiv.org/abs/2602.01465
2•NBenkovich•33m ago•0 comments

Adversarial Reasoning: Multiagent World Models for Closing the Simulation Gap

https://www.latent.space/p/adversarial-reasoning
1•swyx•34m ago•0 comments

Show HN: Poddley.com – Follow people, not podcasts

https://poddley.com/guests/ana-kasparian/episodes
1•onesandofgrain•42m ago•0 comments

Layoffs Surge 118% in January – The Highest Since 2009

https://www.cnbc.com/2026/02/05/layoff-and-hiring-announcements-hit-their-worst-january-levels-si...
13•karakoram•42m ago•0 comments

Papyrus 114: Homer's Iliad

https://p114.homemade.systems/
1•mwenge•42m ago•1 comments

DicePit – Real-time multiplayer Knucklebones in the browser

https://dicepit.pages.dev/
1•r1z4•42m ago•1 comments

Type-Compliant Adaptation Cascades: Adapting Programmatic LM Workflows to Data

https://arxiv.org/abs/2508.18244
2•liliumregale•5mo ago

Comments

aigobie24•5mo ago
I also came across this pragmatic and grounded paper on building reliable, multi-step LLM programs. Instead of just chaining prompts, it treats the entire workflow as a single, typed probabilistic program where each step is a small, trainable PEFT module. The goal is to enforce correctness through gradient-based adaptation rather than just prompt optimization.
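
To make that structure concrete, here is a minimal Python sketch of the idea as I read it. The names (Question, Reasoning, Answer, Step) and the adapter dicts are my own illustration, not the paper's actual API; the point is just that each step declares its input/output types and owns a small set of parameters that would be the trainable part.

  # Hypothetical sketch: a two-step LM workflow as a typed program,
  # where each step carries its own small "adapter" (stand-in for
  # trainable PEFT weights) and must produce its declared output type.
  from dataclasses import dataclass
  from typing import Callable

  @dataclass
  class Question:
      text: str

  @dataclass
  class Reasoning:        # latent intermediate type (e.g. chain-of-thought)
      steps: list[str]

  @dataclass
  class Answer:
      value: int

  @dataclass
  class Step:
      name: str
      fn: Callable        # in TACS this would be an LM call routed
      adapter: dict       # through this step's PEFT adapter

  def plan(q: Question) -> Reasoning:
      return Reasoning(steps=[f"parse '{q.text}'", "compute"])

  def solve(r: Reasoning) -> Answer:
      return Answer(value=len(r.steps))

  pipeline = [Step("plan", plan, {"lora_rank": 8}),
              Step("solve", solve, {"lora_rank": 8})]

  def run(q: Question) -> Answer:
      x = q
      for step in pipeline:
          x = step.fn(x)  # each hop must respect the declared types
      return x

  print(run(Question("2 + 2")))   # -> Answer(value=2)

The interesting part is that only the per-step adapters would be trained, with the type signatures acting as the contract each module has to satisfy.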

The paper highlights a few key results that make this seem particularly practical:

* Significant gains over prompt-optimization: On a structured symbolic math generation task (MGSM-SymPy), their method achieved 75.9% accuracy, while a strong DSPy baseline with constrained decoding scored 57.1%. The paper shows that TACS consistently outperforms prompt-optimization baselines, especially on smaller models or highly structured tasks.

* Makes smaller models viable: It shows how a 7B model that initially produced invalid, unparsable outputs 83% of the time was trained to be type-compliant. After just one epoch, the parsing failure rate dropped to 1%. This suggests adaptation can enforce correctness where prompting alone fails (a toy version of this kind of check is sketched after the list).

* A more principled approach: The core idea is to move away from brittle "prompt-hacking". You define a workflow graph with explicit input/output types, and the framework trains the lightweight adapters to respect those types. This allows for principled training on latent variables (like chain-of-thought steps) without needing direct supervision for them.
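
To illustrate what "type-compliant" means operationally, here is a toy parse-failure check, again not the paper's code: it validates raw model outputs against a declared output type and reports the failure rate, which is the metric the 83% -> 1% result refers to. SymPyExpr and the JSON shape are invented for the example.

  # Toy compliance check (illustrative only): parse each raw output
  # against the declared output type and count type violations.
  import json
  from dataclasses import dataclass

  @dataclass
  class SymPyExpr:          # hypothetical output type for a structured task
      expression: str

  def parse_output(raw: str):
      try:
          obj = json.loads(raw)
          return SymPyExpr(expression=str(obj["expression"]))
      except (json.JSONDecodeError, KeyError, TypeError):
          return None       # a type violation, counted as a failure

  outputs = ['{"expression": "x + 2"}', 'Sure! The answer is x + 2.']
  failures = sum(parse_output(o) is None for o in outputs)
  print(f"parse failure rate: {failures / len(outputs):.0%}")   # -> 50%

In the paper's framing, adaptation drives this failure rate down by training the step's adapter, rather than by adding more instructions to the prompt.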

It seems like a solid step towards making complex LLM compositions more of an engineering discipline. It's less about finding the "magic prompt" and more about training small, specialized modules to be verifiably correct components in a larger system.