frontpage.

OpenClaw Partners with VirusTotal for Skill Security

https://openclaw.ai/blog/virustotal-partnership
1•zhizhenchi•27s ago•0 comments

Goal: Ship 1M Lines of Code Daily

2•feastingonslop•10m ago•0 comments

Show HN: Codex-mem, 90% fewer tokens for Codex

https://github.com/StartripAI/codex-mem
1•alfredray•13m ago•0 comments

FastLangML: Context-aware lang detector for short conversational text

https://github.com/pnrajan/fastlangml
1•sachuin23•16m ago•1 comment

LineageOS 23.2

https://lineageos.org/Changelog-31/
1•pentagrama•19m ago•0 comments

Crypto Deposit Frauds

2•wwdesouza•20m ago•0 comments

Substack makes money from hosting Nazi newsletters

https://www.theguardian.com/media/2026/feb/07/revealed-how-substack-makes-money-from-hosting-nazi...
2•lostlogin•21m ago•0 comments

Framing an LLM as a safety researcher changes its language, not its judgement

https://lab.fukami.eu/LLMAAJ
1•dogacel•23m ago•0 comments

Is anyone interested in a creator economy startup

1•Nejana•24m ago•0 comments

Show HN: Skill Lab – CLI tool for testing and quality scoring agent skills

https://github.com/8ddieHu0314/Skill-Lab
1•qu4rk5314•25m ago•0 comments

2003: What is Google's Ultimate Goal? [video]

https://www.youtube.com/watch?v=xqdi1xjtys4
1•1659447091•25m ago•0 comments

Roger Ebert Reviews "The Shawshank Redemption"

https://www.rogerebert.com/reviews/great-movie-the-shawshank-redemption-1994
1•monero-xmr•27m ago•0 comments

Busy Months in KDE Linux

https://pointieststick.com/2026/02/06/busy-months-in-kde-linux/
1•todsacerdoti•27m ago•0 comments

Zram as Swap

https://wiki.archlinux.org/title/Zram#Usage_as_swap
1•seansh•40m ago•0 comments

Green’s Dictionary of Slang - Five hundred years of the vulgar tongue

https://greensdictofslang.com/
1•mxfh•42m ago•0 comments

Nvidia CEO Says AI Capital Spending Is Appropriate, Sustainable

https://www.bloomberg.com/news/articles/2026-02-06/nvidia-ceo-says-ai-capital-spending-is-appropr...
1•virgildotcodes•45m ago•2 comments

Show HN: StyloShare – privacy-first anonymous file sharing with zero sign-up

https://www.styloshare.com
1•stylofront•46m ago•0 comments

Part 1 the Persistent Vault Issue: Your Encryption Strategy Has a Shelf Life

1•PhantomKey•50m ago•0 comments

Show HN: Teleop_xr – Modular WebXR solution for bimanual robot teleoperation

https://github.com/qrafty-ai/teleop_xr
1•playercc7•52m ago•1 comment

The Highest Exam: How the Gaokao Shapes China

https://www.lrb.co.uk/the-paper/v48/n02/iza-ding/studying-is-harmful
2•mitchbob•57m ago•1 comment

Open-source framework for tracking prediction accuracy

https://github.com/Creneinc/signal-tracker
1•creneinc•59m ago•0 comments

India's Sarvam AI launches Indic-language focused LLMs

https://x.com/SarvamAI
2•Osiris30•1h ago•0 comments

Show HN: CryptoClaw – open-source AI agent with built-in wallet and DeFi skills

https://github.com/TermiX-official/cryptoclaw
1•cryptoclaw•1h ago•0 comments

Show HN: Make OpenClaw respond in Scarlett Johansson’s AI Voice from the Film Her

https://twitter.com/sathish316/status/2020116849065971815
1•sathish316•1h ago•2 comments

CReact Version 0.3.0 Released

https://github.com/creact-labs/creact
1•_dcoutinho96•1h ago•0 comments

Show HN: CReact – AI Powered AWS Website Generator

https://github.com/creact-labs/ai-powered-aws-website-generator
1•_dcoutinho96•1h ago•0 comments

The rocky 1960s origins of online dating (2025)

https://www.bbc.com/culture/article/20250206-the-rocky-1960s-origins-of-online-dating
1•1659447091•1h ago•0 comments

Show HN: Agent-fetch – Sandboxed HTTP client with SSRF protection for AI agents

https://github.com/Parassharmaa/agent-fetch
1•paraaz•1h ago•0 comments

Why there is no official statement from Substack about the data leak

https://techcrunch.com/2026/02/05/substack-confirms-data-breach-affecting-email-addresses-and-pho...
15•witnessme•1h ago•4 comments

Effects of Zepbound on Stool Quality

https://twitter.com/ScottHickle/status/2020150085296775300
3•aloukissas•1h ago•1 comment

Show HN: Entropy-Guided Loop – How to make small models reason

https://github.com/monostate/weave-logprobs-reasoning-loop
33•andrewmonostate•5mo ago
TLDR: A small, vendor-agnostic inference loop that turns token logprobs/perplexity/entropy into at most one extra refinement pass for LLMs.

- Captures logprobs/top-k during generation, computes perplexity and token-level entropy.

- Triggers at most one refine when simple thresholds fire; passes a compact “uncertainty report” (uncertain tokens + top-k alts + local context) back to the model.

- In our tests on technical Q&A / math / code, a small model recovered much of the quality of “reasoning” models at ~⅓ the cost, while refining only ~⅓ of outputs.
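The signal computation in the first bullet can be sketched as follows. This is a minimal stand-alone version, not the repo's actual code; the token/logprob values are invented for illustration:

```python
import math

def sequence_metrics(token_logprobs):
    """Compute perplexity and max token entropy from captured logprobs.

    token_logprobs: list of (token, logprob, top_k), where top_k maps
    alternative tokens -> their logprobs at that position.
    """
    logps = [lp for _, lp, _ in token_logprobs]
    # Perplexity = exp of the negative mean log-probability of the sequence.
    perplexity = math.exp(-sum(logps) / len(logps))
    # Shannon entropy over the renormalized top-k distribution at each step.
    entropies = []
    for _, _, top_k in token_logprobs:
        probs = [math.exp(lp) for lp in top_k.values()]
        total = sum(probs)
        probs = [p / total for p in probs]
        entropies.append(-sum(p * math.log(p) for p in probs))
    return perplexity, max(entropies)

# Toy example: three tokens with top-2 alternatives each (values made up).
tokens = [
    ("The", -0.05, {"The": -0.05, "A": -3.2}),
    ("answer", -0.4, {"answer": -0.4, "result": -1.3}),
    ("42", -1.6, {"42": -1.6, "41": -1.7}),
]
ppl, max_ent = sequence_metrics(tokens)
```

Here the "42" position is nearly a coin flip between two alternatives, so its entropy approaches ln 2 and dominates the max; that is exactly the kind of token the uncertainty report would surface.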

I kept seeing “reasoning” models behave like expensive black boxes. Meanwhile, standard inference already computes useful signals, both before softmax normalization and after it (logprobs), which we usually throw away. This loop tries the simplest thing you could think of: use those signals to decide when (and where) to think again.

GitHub (notebook + minimal code): https://github.com/monostate/weave-logprobs-reasoning-loop

Paper (short, engineer-written): https://arxiv.org/abs/2509.00079

Blog (more context): https://monostate.ai/blog/entropy-refinement-blog

Requirements: Python and an API that exposes logprobs (tested with OpenAI's non-reasoning GPT-4.1). Set OPENAI_API_KEY, plus WEAVE for observability. Run the notebook; it prints metrics and shows which tokens triggered refinement.

- Python, simple loop (no retraining).

- Uses Responses API logprobs/top-k; metrics: perplexity, max token entropy, low-confidence counts.

- Weave for lightweight logging/observability (optional).

- Passing alternatives (not just “this looks uncertain”) prevents over-correction.

- A simple OR rule (ppl / max-entropy / low-confidence count) catches complementary failure modes.

- Numbers drift across vendors; keeping the method vendor-agnostic is better than chasing fragile pairings.

- Needs APIs that expose logprobs/top-k.

- Results are indicative—not a leaderboard; focus is on within-model gains (single-pass vs +loop).

- Thresholds might need light tuning per domain.

- One pass only; not a chain-of-thought replacement.
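The OR rule mentioned in the bullets could look something like this; the threshold values here are illustrative placeholders, not the repo's defaults:

```python
def should_refine(perplexity, max_entropy, low_conf_count,
                  ppl_thresh=1.5, ent_thresh=0.7, low_conf_thresh=3):
    """Fire a single refinement pass if ANY signal crosses its threshold.

    The three signals catch complementary failure modes:
    - high perplexity: the whole answer was sampled with low confidence
    - high max token entropy: one decision point was nearly a coin flip
    - many low-confidence tokens: uncertainty is spread across the output
    """
    return (perplexity > ppl_thresh
            or max_entropy > ent_thresh
            or low_conf_count > low_conf_thresh)
```

Because the loop triggers at most once, cost stays deterministic: zero or one extra pass, never an open-ended search.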

- Run it on your own models and ideas (e.g., 4o-mini, v3, Llama variants with logprobs) and, if you'd like, share logs in a PR against our README on GitHub. PRs welcome - I’ll credit and link.

Overall, let me know if you find this way of making small models reason useful!

Comments

mountainriver•5mo ago
Deep Entropix vibes
andrewmonostate•5mo ago
Thanks for bringing this up! Good catch on the similarities! Yes, both use entropy/uncertainty to allocate compute intelligently.

From what I understand, Entropix is an entropy-aware decoder - it monitors token entropy during generation and dynamically adjusts sampling or spawns parallel CoT branches at high-uncertainty points. It's a decoding-time intervention.

My approach doesn't touch decoding at all. I:

1. Generate normally (standard sampling)

2. Capture logprobs + top-k alternatives

3. Check if perplexity/entropy/confidence triggers exceed thresholds

4. If yes, do ONE refinement pass with an "uncertainty report" showing the model exactly which tokens were uncertain + their alternatives + context
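Steps 2 and 4 hinge on the uncertainty report. A minimal sketch of how one could be assembled (this is a hypothetical helper, not the repo's code; LOGPROB_FLOOR is an illustrative threshold):

```python
import math

# Tokens below roughly 30% probability count as low-confidence (illustrative).
LOGPROB_FLOOR = -1.2

def build_uncertainty_report(token_logprobs, context_window=2):
    """List each uncertain token with its top-k alternatives and local context.

    token_logprobs: list of (token, logprob, top_k) captured during generation.
    The resulting text is appended to the single refinement prompt, so the
    model sees exactly where it was unsure and what the alternatives were.
    """
    lines = []
    toks = [t for t, _, _ in token_logprobs]
    for i, (tok, lp, top_k) in enumerate(token_logprobs):
        if lp < LOGPROB_FLOOR:
            ctx = " ".join(toks[max(0, i - context_window): i + context_window + 1])
            alts = ", ".join(f"{a} ({math.exp(alp):.0%})" for a, alp in top_k.items())
            lines.append(f"token '{tok}' in '...{ctx}...' -- alternatives: {alts}")
    return "\n".join(lines)

# Toy data (values invented): only the near-coin-flip token makes the report.
tokens = [
    ("The", -0.05, {"The": -0.05, "A": -3.2}),
    ("answer", -0.4, {"answer": -0.4, "result": -1.3}),
    ("42", -1.6, {"42": -1.6, "41": -1.7}),
]
report = build_uncertainty_report(tokens)
```

Passing the concrete alternatives (not just "this token looks uncertain") is what keeps the refinement pass from over-correcting parts of the answer that were fine.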

The key difference: Entropix steers the ship while sailing; my loop reviews the voyage log and decides whether to make one correction pass. No branching, no custom samplers, deterministic cost (0 or 1 extra pass).

They're actually complementary - you could use Entropix entropy-aware sampling for initial generation and still apply a refinement loop afterward. Same underlying signal (entropy), different control points! The result of combining both should be outstanding! I will test it soon.

mountainriver•5mo ago
this is very cool!
andrewmonostate•4mo ago
Thanks, please do try it when you get some time! https://github.com/monostate/weave-logprobs-reasoning-loop or https://colab.research.google.com/github/monostate/weave-log...