
See and hear galaxies evolving in new simulations

https://earthsky.org/space/see-and-hear-galaxies-evolving-new-simulations/
1•ganitam•1m ago•0 comments

What I Learned from Setting Up an Online Bookstore with WordPress Plugins

https://www.dilmandila.com/cheap-and-easy-online-bookstore-with-wordpress-plugins/
1•severine•4m ago•0 comments

Monthly Overview for Developer Tools – April 2026

https://semanticed.online/monthly-developer-tools-2026-04
1•alihassaanmug•4m ago•0 comments

Fundamentals of CuTe Layout Algebra and Category-Theoretic Interpretation [video]

https://www.youtube.com/watch?v=MVh_guNbWMA
1•matt_d•5m ago•0 comments

Show HN: Sostactic – polynomial inequalities using sums-of-squares in Lean

https://github.com/mmaaz-git/sostactic
1•mmaaz•7m ago•0 comments

We beat Google's zero-knowledge proof of quantum cryptanalysis

https://blog.trailofbits.com/2026/04/17/we-beat-googles-zero-knowledge-proof-of-quantum-cryptanal...
1•da-bacon•7m ago•0 comments

Homeland Security's New Task Force Website Sanitizes Trump's Deportation Agenda

https://www.motherjones.com/politics/2026/04/homeland-security-task-force-new-website-sanitizes-t...
1•cdrnsf•8m ago•0 comments

What Centuries of Mistakes Can Teach Us About Saving for Retirement

https://archive.is/Eyc7s
2•akyuu•8m ago•0 comments

Inferena: Local benchmark of PyTorch vs. Llama.cpp vs. Rust frameworks

http://inferena.tech/
1•kvark•9m ago•0 comments

HIPPO Turns One Master Password into Many Without Storing Any

https://spectrum.ieee.org/storeless-password-manager
1•u1hcw9nx•10m ago•0 comments

Our Longing for Inconvenience

https://www.newyorker.com/culture/essay/our-longing-for-inconvenience
1•cdrnsf•11m ago•0 comments

David Sklansky, the 'First Nerd to Enter Poker,' Dies at 78

https://www.nytimes.com/2026/04/11/us/david-sklansky-dead.html
1•indigodaddy•11m ago•0 comments

Launching Ising, open models to accelerate the path to useful quantum computers

https://nvidianews.nvidia.com/news/nvidia-launches-ising-the-worlds-first-open-ai-models-to-accel...
3•hhs•12m ago•0 comments

What Is Llms.txt and Does Your Business Need One?

https://semarkglobal.com/blog/what-is-llms-txt-does-your-business-need-one
2•alihassaan•14m ago•1 comments

Dad brains: How fatherhood rewires the male mind

https://www.bbc.com/future/article/20260417-fatherhood-how-the-male-brain-and-body-prepare-for-ch...
2•tchalla•18m ago•0 comments

Show HN: AWS's Kiro just got an Open source Codex

https://github.com/thabti/kirodex
2•sovietism•22m ago•0 comments

Pupil dilation suggests people start solving before all numbers are in

https://phys.org/news/2026-04-mental-math-shortcut-pupil-dilation.html
2•y1n0•24m ago•0 comments

Classic Papers: Articles That Have Stood the Test of Time

https://scholar.googleblog.com/2017/06/classic-papers-articles-that-have-stood.html
2•gregsadetsky•25m ago•0 comments

Why Zip drives dominated the 90s, then vanished almost overnight

https://www.xda-developers.com/zip-drives-dominated-90s-vanished-almost-overnight/
2•y1n0•29m ago•1 comments

The man who saw the future: the legacy of cultural theorist Mark Fisher

https://www.theguardian.com/film/2026/apr/17/we-are-making-a-film-about-mark-fisher-capitalist-re...
2•mellosouls•31m ago•0 comments

Robots learn: A brief, contemporary history

https://www.technologyreview.com/2026/04/17/1135416/how-robots-learn-brief-contemporary-history/
3•billybuckwheat•32m ago•0 comments

20000 Gates and 20 MIPS [pdf]

https://bitsavers.org/pdf/amdahl/history/20000_Gates_and_20_MIPS_199011.pdf
2•ingve•34m ago•1 comments

Tiny Go and Rust programs appear to start equally fast (on some machines)

https://utcc.utoronto.ca/~cks/space/blog/programming/GoVsRustStartupDelays
2•ingve•43m ago•1 comments

AI writes code 100x faster – why hasn't productivity?

https://deeptils.github.io/blog/ai-writes-code-100x-faster-productivity-hasnt/
2•deeplstm•45m ago•1 comments

British Empire: How a Small Island Took over the World

https://sheets.works/data-viz/british-empire
2•akashwadhwani35•48m ago•0 comments

Meshcore: Architecture for a Decentralized P2P LLM Inference Network

1•elyawhoo•49m ago•1 comments

My first impressions on ROCm and Strix Halo

https://blog.marcoinacio.com/posts/my-first-impressions-rocm-strix-halo/
3•random_•52m ago•0 comments

Let Sleeping CPUs Lie – S0ix

https://freebsdfoundation.org/our-work/journal/browser-based-edition/laptop-desktop/let-sleeping-...
1•birdculture•53m ago•0 comments

Singapore Tourism Board Launches AI-Powered Robodog Guides at Sentosa

https://www.stb.gov.sg/about-stb/media-publications/media-centre/singapore-tourism-board-launches...
1•mmarian•54m ago•0 comments

Code → Eval → HLD → LLD → Code

https://p10q.com/presentations/code_hld_lld/
1•tmsh•56m ago•0 comments

Show HN: Reliably Incorrect – explore LLM reliability with data visualizations

https://adamsohn.com/reliably-incorrect/
2•dataviz1000•1h ago

Comments

kilakulusu•1h ago
curious what you're measuring as 'reliability' here. is it consistency across runs, accuracy vs confidence, or something else? the visualization angle is cool, I'd like to see how you handle the variance across different model sizes
dataviz1000•1h ago
There are some problems reasoning about it.

The thinking output isn't really discrete, so I can't strictly apply series-system reliability formulas. But they are a very good metaphor. In the thinking output I can see [1 + 1 = 2], [10 + 1 = 11], [20 + 1 = 21], ... [N]. The metaphor is that the probability of every bracket being correct is the probability of the agent solving the equation. If each bracket is right 95% of the time, a 10-step chain finishes correctly 0.95^10 ≈ 60% of the time.
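
The series-system arithmetic above can be sketched in a few lines of Python. This is only an illustration of the metaphor, assuming every step succeeds independently with the same probability; the function names `chain_reliability` and `max_steps_for_target` are made up for this example and are not taken from the project.

```python
import math


def chain_reliability(p_step: float, n_steps: int) -> float:
    """Probability that all n_steps succeed, assuming independent steps
    that each succeed with probability p_step (series-system model)."""
    if not 0.0 <= p_step <= 1.0:
        raise ValueError("p_step must be in [0, 1]")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return p_step ** n_steps


def max_steps_for_target(p_step: float, target: float) -> int:
    """Largest N for which p_step**N is still at least target.
    Exact boundary cases may be off by one due to floating point."""
    if not 0.0 < p_step < 1.0:
        raise ValueError("p_step must be strictly between 0 and 1")
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    return math.floor(math.log(target) / math.log(p_step))


print(round(chain_reliability(0.95, 10), 3))  # 0.599
print(max_steps_for_target(0.95, 0.5))        # 13
```

Under this model, a 95%-reliable step supports only 13 chained steps before the whole chain is more likely to fail than to succeed.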

So I started a climb at 5 digits × 5 digits, 6 digits × 6 digits, ..., N digits × N digits. There is a clean decrease in the reliability with which the agent gets the correct answer, ending in a cliff beyond which it always fails.

  Model    Last pass    First fail
  ------   ---------    ----------
  Haiku    10 digits    12 digits
  Sonnet   30 digits    33 digits
  Opus     50 digits    52 digits
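
A climb like this could be driven by a loop along the following lines. It is a minimal sketch, not the author's harness: `climb` is a hypothetical helper, and `stub_model` is a deterministic stand-in that merely mimics the shape of the Haiku row instead of calling a real model.

```python
from typing import Callable, Iterable, Optional, Tuple


def climb(
    passes: Callable[[int], bool], digit_counts: Iterable[int]
) -> Tuple[Optional[int], Optional[int]]:
    """Walk increasing digit counts and stop at the first failure.
    Returns (last_pass, first_fail); either may be None."""
    last_pass: Optional[int] = None
    for n in digit_counts:
        if passes(n):
            last_pass = n
        else:
            return last_pass, n
    return last_pass, None


def stub_model(n_digits: int) -> bool:
    """Hypothetical stand-in for 'ask the model to multiply two
    n-digit numbers and check the answer'."""
    return n_digits <= 10


print(climb(stub_model, [5, 8, 10, 12, 15]))  # (10, 12)
```

Because real runs are probabilistic, a single pass or fail per size is noisy; repeating each size several times and recording a pass rate would give a steadier picture of where the cliff sits.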

I have an agent that reverse engineers any website, creating an API that is optimized to use the least amount of resources to interact with that website. The agent also rewrites itself every iteration: a recursive agent.

The agent will update its own instructions and run an evaluation (I hate that they stole the word harness, because it is a test harness) against 5 different websites that are extremely difficult to reverse engineer, like Ticketmaster, YouTube, Twitch, etc.

For each evaluation I track whether it passes (finding all the endpoints, including streaming, GraphQL, and WebSockets), the number of tokens used, and the amount of time taken. There is nothing deterministic about it, but it IS PROBABILISTIC: with the same prompt, the chance of passing and the number of tokens used each follow a distribution.
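
One way to treat the pass rate as a distribution and not a single number is to put a confidence interval around it. The sketch below uses the standard Wilson score interval for a binomial proportion; the counts in the usage line are placeholders, not results from the evaluation described here.

```python
import math
from typing import Tuple


def wilson_interval(passes: int, runs: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for the true pass probability, given
    `passes` successes in `runs` repeated evaluations of one prompt.
    z = 1.96 corresponds to roughly 95% confidence."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    if not 0 <= passes <= runs:
        raise ValueError("passes must be between 0 and runs")
    p_hat = passes / runs
    denom = 1.0 + z * z / runs
    center = (p_hat + z * z / (2 * runs)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / runs + z * z / (4 * runs * runs)) / denom
    )
    return center - half_width, center + half_width


# Placeholder counts: 8 passes out of 10 runs of the same prompt.
low, high = wilson_interval(8, 10)
print(f"pass rate 0.80, 95% interval about [{low:.2f}, {high:.2f}]")
```

With only 10 runs the interval spans roughly 0.49 to 0.94, which is a reminder that a handful of evaluations says little about whether a prompt change actually helped.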

I am trying very hard to build a mental model of how this all works.

renan_warmling•1h ago
The framing of p_step^N is useful, but it points to a deeper architectural problem: verification fails because it samples from the same distribution as the generator. The real fix isn't better prompting; it's independent verification with uncorrelated error distributions.

This maps directly to institutional governance problems. A decision made by a single agent with no memory of prior decisions, no reputation weight, and no contextual history of outcomes will fail the same way: not randomly, but systematically, in the same direction.

Persistent memory reduces N by eliminating context reconstruction at each session. Reputation-weighted voting creates genuinely independent verification: an agent with a strong track record samples from a different distribution than a new one. And outcome contextualization feeds results back into the next cycle rather than discarding them.

The author identifies the problem precisely. The solution isn't a better prompt; it's a different architecture.
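
The point about correlated verification can be made concrete with a toy model. This is an assumption-laden sketch, not something the comment specifies: it supposes that with probability `overlap` the verifier shares the generator's blind spot and always misses the error, and otherwise misses it independently with probability `verifier_miss`. All names are hypothetical.

```python
def undetected_error_rate(
    gen_error: float, verifier_miss: float, overlap: float
) -> float:
    """Probability that the generator is wrong AND the verifier fails
    to catch it, under a simple mixture model of shared blind spots."""
    for name, value in (
        ("gen_error", gen_error),
        ("verifier_miss", verifier_miss),
        ("overlap", overlap),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    # With probability `overlap` the verifier repeats the generator's
    # mistake; otherwise it misses independently.
    miss_given_error = overlap + (1.0 - overlap) * verifier_miss
    return gen_error * miss_given_error


for overlap in (0.0, 0.5, 1.0):
    rate = undetected_error_rate(gen_error=0.05, verifier_miss=0.05, overlap=overlap)
    print(f"overlap={overlap:.1f} -> undetected error rate {rate:.4f}")
```

In this model a fully independent verifier cuts the undetected error rate from 5% to 0.25%, while a fully overlapping one leaves it at 5%, which is the sense in which verification from the same distribution adds nothing.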