
Has anyone else found Google's AI overview to be oddly error prone?

43•ckemere•9mo ago
I've been quite impressed by Google's AI overviews. This past week, though, I was interested in what I thought was a fairly simple question - to calculate compound interest.

Specifically, I was curious about how Harvard's endowment has grown from its initial £780 in 1638, so I asked Google to calculate compound interest for me. A variety of searches all yield a reasonable formula which is then calculated to be quite wrong. For example:

- {calculate the present value of $100 compounded annually for 386 years at 3% interest} yields $0.736.
- {how much would a 100 dollar investment in 1638 be worth in 2025 if invested} yields $3,903.46.
- {100 dollars compounded annually for 386 years at 3 percent} yields "The future value of the investment after 386 years is approximately $70,389."
- And my favorite: {100 dollars compounded since 1638} gives a variety of outcomes for different interest rates: "A = 100 * (1 + 0.06)^387, A ≈ 8,090,950.14; A = 100 * (1 + 0.05)^387, A ≈ 10,822,768.28; A = 100 * (1 + 0.04)^387, A ≈ 14,422,758.11".

How can we be so reasonable and yet so bad!?
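For reference, the arithmetic the overview keeps fumbling is a one-liner. A quick sanity check (a sketch of the standard formula, not anything Google runs) shows the right ballpark for the third query above:

```python
def future_value(pv: float, rate: float, years: int) -> float:
    """Future value of pv compounded annually: FV = PV * (1 + r)**n."""
    return pv * (1 + rate) ** years

# $100 at 3% for 386 years comes to roughly $9.0 million,
# nowhere near the $0.736, $3,903.46, or $70,389 the overview produced.
fv = future_value(100, 0.03, 386)
print(f"${fv:,.0f}")
```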

Comments

joegibbs•9mo ago
It's terrible. Gemini 2.5 Pro is great, but the AI overviews must be using a smaller model. I hate it when I look up something niche and it smugly tells me that I must be mistaken because there is no such thing. Also it gives annoyingly family-friendly responses to questions that it would be better off not responding to. The other day I was trying to find a Sopranos quote about two kinds of businesses being recession-proof, one of which being "certain aspects of entertainment" (i.e. prostitution) and it was telling me the certain aspects were filmmaking and music because they make people happy.
cma•9mo ago
Why wouldn't they use 2.5 flash first, and then if an identical query is made by lots of people rerun it with 2.5 pro? Sometimes it seems much more error prone than 2.5 pro or even 2.0 even on common searches.
cratermoon•9mo ago
LLMs can't do math.
3np•9mo ago
This. People need to manage their expectations.
Spivak•9mo ago
We're giving them calculators though, surely Google could provide a limited set of tools given Search already has a fairly sophisticated calculator.

I've been having my AI stuff successfully do math since early gp3 days with this method— even before "tool calling."
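A minimal sketch of that pattern, for illustration only: the `CALC(...)` marker and helper names below are invented, and real systems use structured tool-call APIs. The idea is simply that the model emits the expression instead of the answer, and the host evaluates it deterministically.

```python
import ast
import operator
import re

# Operators we allow in model-emitted arithmetic expressions.
OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate a pure-arithmetic expression without using eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp):
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed expression: {expr!r}")
    return walk(ast.parse(expr, mode="eval"))

def answer(model_output: str) -> str:
    """Replace CALC(...) markers in model output with computed values."""
    return re.sub(
        r"CALC\(([^)]*)\)",
        lambda m: f"{safe_eval(m.group(1)):,.2f}",
        model_output,
    )

print(answer("The future value is $CALC(100 * 1.03 ** 386)."))
```

The model still decides *which* expression to emit, so parsing mistakes remain possible, but the arithmetic itself stops being a source of error.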

ianks•9mo ago
LLMs and tempered expectations, like oil and water
ckemere•9mo ago
Why offer a solution then? Seems fairly easy for google to avoid giving the final number?
cratermoon•9mo ago
That's not how LLMs work. These systems take a stream of tokens, do some linear algebra to find a stream of tokens within the parameters of their vector similarity, and spit out the result.
jbs789•9mo ago
Expectations can only be managed by someone who has sufficient understanding - in this case Google by not providing the result.
scarface_74•9mo ago
LLMs can’t do math. But that’s a solved problem. ChatGPT has had a built in Python runtime that can do math for years - at least the paid version.
cratermoon•9mo ago
Oh boy, python but now it costs $10,000 per formula.

Why?

scarface_74•9mo ago
It’s $20/month for ChatGPT+.
zacksiri•9mo ago
I recently used Gemini and Google Search (with overview) to check whether a snack I bought from Japan had expired. I used Gemini to take a picture of the label, which was written in Japanese.

One item said 25/7/25, the other said 25/7/24. As you can imagine, I was sure the first one was safe, but the second one was confusing.

It told me it was safe to eat because the Japanese date format is Year/Month/Day.

I looked up the Japanese date format in Google (with overview) just to confirm. I guess we'll find out. Will report back soon.

elicksaur•9mo ago
I think I’d call these examples “predictable” failures instead of “odd”.
mergy•9mo ago
They are often awful for me. Examples: recommending installation of packages and software that don't exist, or settings changes that don't exist in applications, etc. They fill the page, but it's sadly noise, so it cheapens the whole experience when I would have just preferred a link to a page from a person who knows what the hell they're talking about.
whatamidoingyo•9mo ago
> recommending installation of packages and software that doesn't exist

"slopsquatting" is the term coined for this.

Essentially, bad actors are registering these packages and uploading malware. If you happen to just blindly follow the AI, there's a chance your system gets infected.

namaria•9mo ago
Why would you use an LLM for this? A simple spreadsheet can do this sort of calculation easily and deterministically.

Also, the assumption of '3% interest' is wrong. There are records of stretches achieving 15% returns for several years and reaching 23% in 2007, for example.

https://www.bloomberg.com/news/articles/2005-01-11/harvard-l...

https://www.wsj.com/articles/SB118771455093604172

This was 2 minutes of old school search, no LLM needed.

amanaplanacanal•9mo ago
Long term interest rates over hundreds of years are a lot closer to 3% than 15%. You can't extrapolate a few good years like that.
namaria•9mo ago
You can't compound 780 seventeenth-century pounds for 400 years and get a dollar amount either. The whole exercise is spurious. But that's beside the point.

What I am saying is that asking an LLM to do interest calculations is absurd in itself, let alone the absurd setting of trying to calculate interest rates across four centuries and different denominations.

It would be much more rational, in seeking to understand the growth of the Harvard endowment, to search for factual information about its modern history; that is my point. And if you want to do abstract financial-modelling exercises, just use spreadsheets. Either way, LLMs are a hilariously bad fit.

780 compounded by 3% per year for ~400 years is about 100 million by the way. So ignoring all else, off by at least two orders of magnitude.

ckemere•9mo ago
Of course! I was originally imagining google would give me a website with an embedded calculator. I was most surprised by how everything was beautifully accurate up until the end when the number felt suspicious. (Much less suspicious than the examples I posted, actually.)
namaria•9mo ago
You can just search for "interest calculator"; there are many such sites.

Also, I don't quite get it:

> everything was beautifully accurate up until the end when the number felt suspicious

The LLM generated text about compounding interest over 400 years from early modern british pounds to modern dollars was accurate? How is it possible to be accurate about an absurd operation?

snypher•9mo ago
I don't think they intended to use AI; they tried to search and were presented with Google's AI summary, i.e. time wasted before the actual search results.
AznHisoka•9mo ago
I clicked on the first tab in the search results and had no idea it now redirects to "AI mode". So yep, that's possible.
ZeroGravitas•9mo ago
I saw a report via Simon Willison that if you make up a phrase and add "meaning" to the end of your Google search, it'll invent a meaning for it.

His example was "A swan won't prevent a hurricane meaning"

https://simonwillison.net/2025/Apr/23/meaning-slop/

nitwit005•9mo ago
It seems expectedly error prone.

Aside from the general limitations of this technology, Google needs this to be quite cheap if it runs for every request.

There is not a lot of revenue for a single search, and right now the AI results are actually pushing the links people are paying Google to display further down the page.

drpixie•9mo ago
We're all sadly gullible.

We're all in IT. We know what an LLM is. But still we're fooled!?

bhouston•9mo ago
I think it is because it is using a "mini" model with the search results as a RAG source so they can afford to use it on every single query. Thus it doesn't know very much and doesn't have much context to work with.
throwaway290•9mo ago
LLMs are all bad at math. But there are worse ways Google fails.

For example, people asked "does Lululemon use <name of some chinese company> to make its products" and Google said "yes", with no source except one TikTok video that falsely claims this to boost sales in the face of tariffs (ignoring that the company isn't in the actual supplier list published by Lululemon on its site).

Which means basically people would see that tiktok, go to fact check on google if it's true, and google overview will say "yes" (+ paragraphs of text that no one reads) citing that tiktok.

A vicious circle of LLM fact-checking. Google used to be immune to it until it started shoving chatbot output into people's faces.

minecraft001•9mo ago
I recently searched for a person and it concatenated the lives of several different people with the same name together like “X is a senator… He is also a professional baseball player…”
Hojojo•9mo ago
There was this Reddit post yesterday where it completely makes up being able to plant flowers in Elden Ring https://www.reddit.com/r/Eldenring/comments/1k6hupy/thanks_g...

The AI overview is worse than useless. It either hallucinates things or it treats shitposts as equally valid information-wise as anything else.

rsynnott•9mo ago
Our robot overlords are _terrible_ at anything financial. I'm on a financial forum where, lately, people are _constantly_ posting stuff that they got from The Oracle and asking what they're doing wrong because they don't understand the result, and the answer is inevitably that ChatGPT or whatever fed them plausible-looking rubbish (this is the _real_ AI safety problem; laypeople tend to take the output as infallible, even though it's usually rubbish.)

Though also as a sidenote Harvard's endowment probably wasn't put in a bank account with a flat 3% interest rate for a few hundred years...

mindslight•9mo ago
It's not "AI" - its an LLM that is doing the equivalent of a college freshman padding out a paper for length. Confident, verbose, polished, but ultimately based on little hard reasoning - aka bullshitting. When it's wrong, and if you notice it and call it out, it will happily apologize and "correct" itself with the same well-written prose while making another mistake (or even the same exact one). LLMs certainly have utility, but it's more as generating inputs to some verifying processes rather than as a standalone oracle that competently answers questions.
potbelly83•9mo ago
It's not the AI doing math wrong as a lot of people are commenting. It's the way it's parsing your sentence. When it reads 'the present value of $100' it thinks that today's value of the investment is $100, and it needs to determine what the investment was worth 386 years ago (assuming a 3% interest rate).
malfist•9mo ago
Almost like there's a reason we use highly specific languages when telling a cpu what to do
dankwizard•9mo ago
No, you're wrong too. If it were doing that, the answer would not have been $0.736, since $0.736 compounded at 3% over 386 years grows to far more than $100.
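The present-value reading can be checked directly (a two-line sketch, using the 3% rate assumed in the original query):

```python
# Present value of $100 discounted at 3% over 386 years: PV = FV / (1 + r)**n
pv = 100 / 1.03 ** 386
print(f"${pv:.4f}")  # roughly $0.0011, so $0.736 matches neither reading
```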
NikkiA•9mo ago
Just invent a 'common' saying and add 'explanation' at the end.
alissa_v•9mo ago
Totally resonate with this. It feels less like a helpful overview and more like a confidently wrong pop-up I have to dismiss or fact-check before I can get to the actual search results (the links below).

I saw one example where someone asked about the fastest way to boil water, and the AI overview confidently stated that adding salt lowers the boiling point significantly, making it boil faster. It sounds vaguely scientific but gets a fundamental concept completely backward! That's the kind of error that's more worrying than just bad math – it confidently misrepresents basic, easily verifiable science.

It's a strange feeling having to approach Google search results with a layer of skepticism now, which used to be the gold standard for getting quickly pointed to reliable info. The AI Overview feels like a glossy, sometimes misleading, advertisement for the links I actually wanted in the first place.
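The salt claim is indeed backward: dissolved salt *raises* the boiling point (boiling-point elevation, ΔT = i·Kb·m), and only slightly. A back-of-envelope check, where the quantities (about one tablespoon, ~17 g of NaCl, in 1 kg of water) are assumptions for the sake of the estimate:

```python
# Boiling-point elevation: dT = i * Kb * m
Kb = 0.512          # ebullioscopic constant of water, degC * kg / mol
i = 2               # van 't Hoff factor for fully dissociated NaCl
grams_salt = 17.0   # assumed: roughly one tablespoon
molar_mass = 58.44  # g/mol, NaCl
kg_water = 1.0

molality = (grams_salt / molar_mass) / kg_water   # mol solute per kg solvent
delta_t = i * Kb * molality
print(f"boiling point rises by about {delta_t:.2f} degC")  # ~0.30 degC
```

So a typical pinch or spoonful of salt changes the boiling point by a fraction of a degree, in the opposite direction from what the overview claimed.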

CM30•9mo ago
It's hilariously bad. For one thing, it seems to try to find an 'explanation' for anything you type in, no matter how ridiculous it might be or whether there's even any info on it online at all. It's become a common meme online to come up with random fake idioms, throw them into Google, and see the nonsense it comes up with in its desperate attempt to make sense of the non-phrase.

And even if you do ask a legitimate question, you have to then hope the system knows what you actually mean rather than taking every word in your question literally and returning a complete non answer. So you might ask "was [actor name] in Chicago (the movie)?", only for Google to say "no, [actor name] doesn't live in Chicago".

Add in the dangerous misinformation, the occasionally extremist answers, and its attempts to make up sources when it can't find any, and, well, it's basically useless for just about everything.

babyent•9mo ago
Call me old fashioned but I’ve been searching and using Google list results instead of the AI Overview.

Btw Google, you’re welcome I’m clicking links and making you money. For my needs, you’re a great search engine.

xp84•9mo ago
As far as I can tell, these aren't made by asking a competent model to answer the "question". Based on what they seem to do, it looks like they take a model (a "mini" type of model?), pipe in the contents of the first 5 or 10 results from the slopfest that is Google's current search results, and tell it to summarize THAT.

This is why it tells you to eat rocks. It is a very narrow sample of webpages and suffers from not even contextualizing each page it is reading to wonder if it’s a troll, satire, propaganda, fiction, or fact.

I have taken to ignoring them completely. I’d rather ask ChatGPT directly than trust these - and often I do just that. It’s much more accurate.

What’s frustrating is that the real estate these occupy was until a few years ago where they’d put text extracts quoted directly from a short-ish list of reputable sites. Same purpose, different content. While it was arguably a bit abusive of the sites to extract and display their contents there, the information used to be pretty reliable as a result.