The unexpected finding: the model discovered Occam's Razor on its own.
Starting accuracy: 51.3% (zero-shot baseline)
After learning: 78.0% (+26.7 percentage points)
But the numbers don't tell the full story. The learning journals reveal something profound:
Phase 1: The model hallucinated complex solutions ("use interval trees!", "apply graph theory!"). Accuracy stayed low (~35%).
Phase 2: Journal entries started showing doubt: "Since the problem is straightforward, focusing on basic interval checking..."
Phase 3: The breakthrough - the model wrote: "This suggests a fundamental misunderstanding of how to handle overlapping intervals."
It admitted it was wrong. From that point on, accuracy climbed toward the final 78.0%.
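For anyone wondering how "learning" happens here without weight updates: as I understand the setup, each journal entry is fed back into the context of the next attempt, so a reflection like the Phase 3 one directly steers later answers. A minimal sketch; the `model.solve`/`model.reflect` interface is my assumption, not the actual API:

```python
def run_episode(model, task, journal):
    """One learning step: attempt the task with the journal in-context,
    then reflect and append the reflection for the next attempt."""
    context = "\n".join(journal)                  # accumulated reflections
    answer = model.solve(task, context=context)   # attempt, journal in-context
    entry = model.reflect(task, answer)           # self-critique in plain text
    journal.append(entry)                         # carried into the next episode
    return answer
```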
The distillation process acts as evolutionary selection: simple ideas that work survive, complex ideas that fail get filtered out.
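Here is a minimal sketch of what that selection pressure could look like; `evaluate` (held-out accuracy with a strategy prepended to the prompt) and the revision step are my assumptions, not the repo's actual code:

```python
def distill(candidates, evaluate, rounds=5, keep=3):
    """Evolutionary selection over plain-text strategy documents.

    candidates: list[str] of strategies proposed by the model
    evaluate:   callable(str) -> float, accuracy on a held-out set
    """
    pool = list(candidates)
    for _ in range(rounds):
        # Score every strategy; simple ideas that work float to the top,
        # complex ideas that fail get filtered out.
        pool.sort(key=evaluate, reverse=True)
        pool = pool[:keep]
        # The real loop would have the model revise each survivor here;
        # a string marker stands in for that step.
        pool += [s + "\n(revised)" for s in pool]
    return max(pool, key=evaluate)
```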
Key advantages:
- Fully interpretable (read the complete thought process)
- Runs on consumer hardware (no GPU training)
- Strategies are transferable text documents (see the sketch below)
- Models learn to doubt themselves (AI safety implication)
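On transferability: since a distilled strategy is just a text file, moving it to a different model is a matter of prepending it to the prompt. The file name and prompt layout below are illustrative, not from the repo:

```python
from pathlib import Path

# Hypothetical path; any distilled strategy document works the same way.
strategy = Path("strategies/interval_overlap.txt").read_text()

problem = "Given a list of intervals, return True if any two overlap."
prompt = f"{strategy}\n\nProblem:\n{problem}"
# The prompt can now go to any model or API; no weights or GPUs involved:
# answer = client.complete(prompt)
```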
All code and papers are open source. The experiment takes ~40 minutes to reproduce on a laptop.
Happy to answer questions about the approach, results, or implementation!