Tell HN: Gemini 3.5 Flash breaks in stupid ways

5•XCSme•54m ago
I thought I was going crazy: I was using Gemini 3.5 Flash to rate some answers, and it kept giving 7 instead of 10 for correct answers.

Apparently, once you add a "Grading criteria" text to the prompt, the model collapses into a "compressed toward the center of the scale" hallucination (or it is overfitting to its training set).
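If you want to check whether your own grader shows the same effect, here is a minimal sketch. It makes no assumption about any real API: `grade` is a hypothetical callable you supply that sends a prompt to the model and returns the integer score it replied with, and the `CRITERIA` wording and prompt layout are placeholders, not the prompt from my chat. It compares how far known-correct answers fall short of the top score with and without the criteria text.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple

# Hypothetical: any function that sends a prompt to a model and returns
# the integer score found in its reply. No real client library is assumed.
Grader = Callable[[str], int]

# Placeholder wording, not the exact text from the original chat.
CRITERIA = "Grading criteria: judge correctness, completeness and clarity."


def build_prompt(question: str, answer: str, with_criteria: bool) -> str:
    """Build a grading prompt, optionally including the criteria text."""
    parts = [f"Question: {question}", f"Answer: {answer}"]
    if with_criteria:
        parts.append(CRITERIA)
    parts.append("Reply with a single integer score from 1 to 10.")
    return "\n".join(parts)


def compression_gap(
    grade: Grader,
    correct_pairs: Sequence[Tuple[str, str]],
    top_score: int = 10,
) -> float:
    """Extra shortfall from top_score caused by adding the criteria text.

    Every (question, answer) pair is assumed to be fully correct, so any
    shortfall is the grader's doing. A positive result means scores drop
    once the criteria text is present.
    """
    def shortfall(with_criteria: bool) -> float:
        return mean(
            top_score - grade(build_prompt(q, a, with_criteria))
            for q, a in correct_pairs
        )

    return shortfall(True) - shortfall(False)
```

A gap near 0 means the criteria text makes no difference. A gap of around 3 on answers you know are correct would match the "7 instead of 10" behavior described above.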

Someone on X asked me to try to reproduce it, and I got it on the first try in their Gemini chat:

https://x.com/XCSme/status/2057613611959279988

I am not sure what to make of this model (or of most SOTA models). They got a lot smarter at coding and tool usage, but a lot dumber in other ways...

Comments

XCSme•52m ago
Direct link to the chat (ignore the story, it's just filler tokens): https://gemini.google.com/share/244af1e74841
