ERCP: Self-Correcting LLM Reasoning Using NLI-Based Neuro-Symbolic Constraints

https://zenodo.org/records/17602891
1•hemanm•2mo ago

Comments

hemanm•2mo ago
I'm sharing a summary of my recent research on a method for controlling large language models (LLMs) called Evo-Recursive Constraint Prompting (ERCP). We achieved a 20-percentage-point absolute accuracy gain on the PIQA commonsense reasoning task. This approach goes beyond simple prompting: it uses a neuro-symbolic optimization loop designed to enforce logical consistency.

*Key Results on PIQA:*
- *Baseline Accuracy:* 70.0%
- *ERCP Final Accuracy:* 90.0%
- *Absolute Gain:* 20.0 percentage points (a 28.6% relative boost)
- *Efficiency:* Achieved in an average of 3.9 iterations.
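As a quick check of the arithmetic above, the relative boost is the absolute gain divided by the baseline. The helper name `gains` below is only illustrative and does not come from the whitepaper.

```python
def gains(baseline: float, final: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = final - baseline
    relative = absolute / baseline * 100.0
    return absolute, relative


# Accuracy figures quoted in the post, in percent.
absolute, relative = gains(70.0, 90.0)
print(f"absolute: {absolute:.1f} points, relative: {relative:.1f}%")
# prints "absolute: 20.0 points, relative: 28.6%"
```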

*Methodology: Self-Correcting Logic*

The core novelty of our approach lies in the use of external symbolic tools to oversee the LLM's neural output:

1. *Diagnosis:* Our system employs a DeBERTa NLI Oracle to autonomously identify logical contradictions and ambiguities within the LLM's reasoning chain.
2. *Constraint Generation:* These detected errors are immediately translated into formal, actionable constraints (the symbolic step).
3. *Refinement:* The LLM is re-prompted to solve the task, explicitly conditioned on these new constraints (the neuro step).

ERCP systematically transforms reasoning errors into performance gains by enabling the model to self-correct based on verifiable logical rules.
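The three steps above might be sketched roughly as the loop below. This is a minimal illustration under stated assumptions, not the protocol from the whitepaper: `refine`, `solve`, and `contradicts` are hypothetical names, the LLM and the DeBERTa NLI oracle are replaced by toy stand-in functions so the example runs on its own, and the constraint wording is invented for the demo.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not the paper's API):
#   Solver: (task, constraints) -> reasoning steps; stands in for the LLM.
#   Oracle: (step_a, step_b) -> True if the two steps contradict; stands in
#           for the NLI model.
Solver = Callable[[str, List[str]], List[str]]
Oracle = Callable[[str, str], bool]


def refine(task: str, solve: Solver, contradicts: Oracle,
           max_iters: int = 5) -> Tuple[List[str], List[str], int]:
    """Run diagnose -> constrain -> re-prompt until no contradiction remains.

    Returns the final steps, the accumulated constraints, and the number of
    diagnosis passes performed. If max_iters is exhausted, the last answer
    is returned without a final check.
    """
    constraints: List[str] = []
    steps = solve(task, constraints)
    for iteration in range(1, max_iters + 1):
        # 1. Diagnosis: find pairs of steps the oracle flags as contradictory.
        conflicts = [(a, b)
                     for i, a in enumerate(steps)
                     for b in steps[i + 1:]
                     if contradicts(a, b)]
        if not conflicts:
            return steps, constraints, iteration
        # 2. Constraint generation: turn each conflict into an explicit rule.
        for a, b in conflicts:
            rule = f"Do not assert both '{a}' and '{b}'."
            if rule not in constraints:
                constraints.append(rule)
        # 3. Refinement: solve again, conditioned on the constraints so far.
        steps = solve(task, constraints)
    return steps, constraints, max_iters


# Toy stand-ins so the sketch is runnable without any model.
def toy_solve(task: str, constraints: List[str]) -> List[str]:
    if constraints:
        return ["the cup is dry", "use the cup"]
    return ["the cup is dry", "the cup is not dry"]


def toy_contradicts(a: str, b: str) -> bool:
    negate = lambda s: s.replace(" is ", " is not ")
    return b == negate(a) or a == negate(b)


steps, constraints, iterations = refine(
    "carry flour in a cup", toy_solve, toy_contradicts)
print(steps, len(constraints), iterations)
```

In a real setup the two stand-ins would be replaced by calls to an LLM and to an NLI classifier; how the paper scores and evolves constraints is not shown here.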

*The Real Research Challenge: The Convergence Problem*

While a 90% accuracy rate is strong, our results showed that only 30% of runs fully converged to a high-quality constraint set (Score > 0.8).

- *Initial Constraint Score:* 0.198
- *Final Constraint Score:* 0.377

This indicates that 70% of runs finished without a high-quality constraint set (Score ≤ 0.8), so most results were obtained with suboptimal constraint guidance. The next frontier is refining our optimizer to ensure constraint quality and guarantee convergence across all runs.
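The convergence figure can be read as the fraction of runs whose final constraint score exceeds the 0.8 threshold. A minimal sketch of that calculation follows; the function name `convergence_rate` is hypothetical, and the per-run scores are illustrative placeholders, not data from the experiments.

```python
from typing import Sequence

THRESHOLD = 0.8  # from the post: a run has converged when Score > 0.8


def convergence_rate(final_scores: Sequence[float],
                     threshold: float = THRESHOLD) -> float:
    """Fraction of runs whose final constraint score exceeds the threshold."""
    if not final_scores:
        raise ValueError("need at least one run")
    converged = sum(1 for score in final_scores if score > threshold)
    return converged / len(final_scores)


# Placeholder scores for ten imaginary runs (NOT the paper's results).
scores = [0.91, 0.85, 0.82, 0.40, 0.30, 0.25, 0.10, 0.08, 0.04, 0.02]
rate = convergence_rate(scores)
print(f"{rate:.0%} of runs converged")
```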

The whitepaper detailing the full protocol is linked in the submission. I look forward to hearing your thoughts on building truly robust, self-correcting LLM systems with this level of precision.