frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

India's Sarvan AI LLM launches Indic-language focused models

https://x.com/SarvamAI
1•Osiris30•36s ago•0 comments

Show HN: CryptoClaw – open-source AI agent with built-in wallet and DeFi skills

https://github.com/TermiX-official/cryptoclaw
1•cryptoclaw•3m ago•0 comments

ShowHN: Make OpenClaw Respond in Scarlett Johansson’s AI Voice from the Film Her

https://twitter.com/sathish316/status/2020116849065971815
1•sathish316•5m ago•1 comments

CReact Version 0.3.0 Released

https://github.com/creact-labs/creact
1•_dcoutinho96•7m ago•0 comments

Show HN: CReact – AI Powered AWS Website Generator

https://github.com/creact-labs/ai-powered-aws-website-generator
1•_dcoutinho96•7m ago•0 comments

The rocky 1960s origins of online dating (2025)

https://www.bbc.com/culture/article/20250206-the-rocky-1960s-origins-of-online-dating
1•1659447091•13m ago•0 comments

Show HN: Agent-fetch – Sandboxed HTTP client with SSRF protection for AI agents

https://github.com/Parassharmaa/agent-fetch
1•paraaz•14m ago•0 comments

Why there is no official statement from Substack about the data leak

https://techcrunch.com/2026/02/05/substack-confirms-data-breach-affecting-email-addresses-and-pho...
5•witnessme•18m ago•1 comments

Effects of Zepbound on Stool Quality

https://twitter.com/ScottHickle/status/2020150085296775300
2•aloukissas•21m ago•1 comments

Show HN: Seedance 2.0 – The Most Powerful AI Video Generator

https://seedance.ai/
2•bigbromaker•24m ago•0 comments

Ask HN: Do we need "metadata in source code" syntax that LLMs will never delete?

1•andrewstuart•30m ago•1 comments

Pentagon cutting ties w/ "woke" Harvard, ending military training & fellowships

https://www.cbsnews.com/news/pentagon-says-its-cutting-ties-with-woke-harvard-discontinuing-milit...
6•alephnerd•33m ago•2 comments

Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? [pdf]

https://cds.cern.ch/record/405662/files/PhysRev.47.777.pdf
1•northlondoner•33m ago•1 comments

Kessler Syndrome Has Started [video]

https://www.tiktok.com/@cjtrowbridge/video/7602634355160206623
2•pbradv•36m ago•0 comments

Complex Heterodynes Explained

https://tomverbeure.github.io/2026/02/07/Complex-Heterodyne.html
4•hasheddan•36m ago•0 comments

EVs Are a Failed Experiment

https://spectator.org/evs-are-a-failed-experiment/
3•ArtemZ•48m ago•5 comments

MemAlign: Building Better LLM Judges from Human Feedback with Scalable Memory

https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
1•superchink•49m ago•0 comments

CCC (Claude's C Compiler) on Compiler Explorer

https://godbolt.org/z/asjc13sa6
2•LiamPowell•51m ago•0 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
10•duxup•53m ago•1 comments

Actors with Tokio (2021)

https://ryhl.io/blog/actors-with-tokio/
1•vinhnx•55m ago•0 comments

Can graph neural networks for biology realistically run on edge devices?

https://doi.org/10.21203/rs.3.rs-8645211/v1
1•swapinvidya•1h ago•1 comments

Deeper into the shareing of one air conditioner for 2 rooms

1•ozzysnaps•1h ago•0 comments

Weatherman introduces fruit-based authentication system to combat deep fakes

https://www.youtube.com/watch?v=5HVbZwJ9gPE
3•savrajsingh•1h ago•0 comments

Why Embedded Models Must Hallucinate: A Boundary Theory (RCC)

http://www.effacermonexistence.com/rcc-hn-1-1
1•formerOpenAI•1h ago•2 comments

A Curated List of ML System Design Case Studies

https://github.com/Engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies
3•tejonutella•1h ago•0 comments

Pony Alpha: New free 200K context model for coding, reasoning and roleplay

https://ponyalpha.pro
1•qzcanoe•1h ago•1 comments

Show HN: Tunbot – Discord bot for temporary Cloudflare tunnels behind CGNAT

https://github.com/Goofygiraffe06/tunbot
2•g1raffe•1h ago•0 comments

Open Problems in Mechanistic Interpretability

https://arxiv.org/abs/2501.16496
2•vinhnx•1h ago•0 comments

Bye Bye Humanity: The Potential AMOC Collapse

https://thatjoescott.com/2026/02/03/bye-bye-humanity-the-potential-amoc-collapse/
3•rolph•1h ago•0 comments

Dexter: Claude-Code-Style Agent for Financial Statements and Valuation

https://github.com/virattt/dexter
1•Lwrless•1h ago•0 comments
Open in hackernews

LLMs Tend to Be Overconfident

https://link.springer.com/article/10.3758/s13421-025-01755-4#Sec62
3•giuliomagnifico•6mo ago

Comments

giuliomagnifico•6mo ago
> We do find some evidence that LLMs—particularly those produced by Claude—are perhaps slightly more accurate than humans when estimating their overall performance. Similarly, LLMs may be slightly more accurate than humans when estimating item-level confidence. Interestingly, however, we find that LLMs are not consistently capable of updating their metacognitive judgments based on their experiences. We also find that, like humans, LLMs tend to be overconfident. We believe that these conclusions can help human users better understand the extent to which they should trust LLMs’ confidence judgments and hope that these results spark new interest in studying the metacognitive capacities of LLMs