Show HN: PixelPointingBenchmark – Simple tests reveal surprising gaps

https://autodevice.github.io/PixelPointingBenchmark/
3•myrausman•2h ago
We built a small open-source benchmark to test how well vision-enabled LLMs handle pixel-level pointing on screens. Instead of complex UI screenshots, we use synthetic images with basic shapes and clean backgrounds to isolate spatial reasoning and coordinate accuracy.

The results were surprising:

- Many top models miss by tens to hundreds of pixels on trivial tasks (e.g., center of a purple circle or red square).
- High run-to-run variance in some models (different answers on the same image/prompt).
- Performance flips dramatically with resolution or aspect ratio changes.
- Claude Sonnet and Claude Haiku are consistently near-perfect (0–1px error), while others show clear gaps.

We wrote a detailed blog post about the findings: https://autodevice.io/blog/wheres-the-pixel-part-1
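For anyone wondering how a "pixel error" and "run-to-run variance" might be scored, here is a minimal sketch. It is not taken from the benchmark repo: the function names `pixel_error` and `run_to_run_spread`, the sample coordinates, and the choice of Euclidean distance as the error measure are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pixel_error(predicted, target):
    """Euclidean distance in pixels between a predicted point and the true point.

    Assumption: the benchmark may score errors differently (e.g. per axis).
    """
    return math.dist(predicted, target)


def run_to_run_spread(predictions):
    """Largest distance between any two answers given for the same image and prompt."""
    if len(predictions) < 2:
        return 0.0
    return max(math.dist(a, b) for a, b in combinations(predictions, 2))


# Hypothetical data: the target is the center of a circle drawn at (100, 200),
# and the same model was asked three times.
target = (100, 200)
runs = [(100, 200), (103, 204), (100, 201)]

errors = [pixel_error(p, target) for p in runs]
print(errors)                   # [0.0, 5.0, 1.0]
print(run_to_run_spread(runs))  # 5.0
```

Because the synthetic images have a known ground-truth coordinate, a score like this needs no human labeling, which is presumably part of the appeal of using basic shapes on clean backgrounds.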

Repo (easy to run, add tests, try new models): https://autodevice.github.io/PixelPointingBenchmark/

Curious to see how the latest vision LLMs do on this. If you run it, share your results or feedback.

Happy to discuss improvements or extensions!

#VisionLLM #LLM #Benchmark #SpatialReasoning #GUI #ComputerUse #AI

How supermarkets turned loyalty cards into a data treasure trove

https://www.rte.ie/brainstorm/2025/1227/1407120-supermarket-loyalty-cards-customers-data-behaviou...
1•austinallegro•1m ago•0 comments

The world wants more Ube

https://www.nytimes.com/2025/12/29/world/asia/philippines-ube-purple-yam.html
1•lawgimenez•2m ago•0 comments

My Writing Isn't AI Slop–and It Hurts That You Think It Is

https://news.ycombinator.com/submitted?id=haebom
1•haebom•5m ago•1 comment

Show HN: Your backend will fail eventually

https://www.zoyla.app
1•behnamazimi•5m ago•0 comments

Google Doppl

https://labs.google/doppl/
2•ms7892•7m ago•0 comments

Windows VTL2 Technical Exploration

https://howknows.github.io/roooot.github.io/VTL2/Windows_VTL2_Technical_Exploration.html
1•PaulHoule•8m ago•0 comments

The Honey Files Expose Major Fraud

https://www.youtube.com/watch?v=qCGT_CKGgFE
1•garyng•8m ago•0 comments

Who Invented the Transistor?

https://people.idsia.ch/~juergen/who-invented-the-transistor.html
1•todsacerdoti•11m ago•0 comments

Group Customer Interviewing Practice

https://adrianhoward.com/posts/group-user-interviewing-practice/
1•adrianhoward•13m ago•0 comments

AI Futures Model: Dec 2025 Update (to the AI 2027 forecast)

https://blog.ai-futures.org/p/ai-futures-model-dec-2025-update
1•ta_u•13m ago•1 comment

Bitwise ETFs Expansion: Firm Files for 11 New Altcoin Strategy Funds

https://timescrypto.com/cryptonews/regulation-and-policy/bitwise-etfs-expansion-firm-files-for-11...
1•Alan_Rada•15m ago•0 comments

Ask HN: What tiny tool do you use every day but never talk about?

2•puildupO•17m ago•1 comment

Show HN: SWOTPal – Analyze LinkedIn profiles and websites into SWOT charts

https://swotpal.elevenapril.com
1•elevenapril•24m ago•1 comment

US stock market returns – 1870 to present

https://themeasureofaplan.com/us-stock-market-returns-1870s-to-present/
1•simonebrunozzi•25m ago•0 comments

How to Be More Agentic

https://usefulfictions.substack.com/p/how-to-be-more-agentic
1•vidyesh•30m ago•0 comments

Show HN: JSciPy – A Java port of SciPy's signal processing module

https://github.com/hissain/jscipy
1•hissain•31m ago•0 comments

Show HN: Generate your personal HN recap for 2025

https://hn-2025.userjam.com
10•giladvdn•32m ago•1 comment

Removing CapCut Watermarks Using Video Inpainting and Temporal Consistency

https://blog.videowatermarkremove.com/remove-capcut-watermark-ai
1•ilmj8426•33m ago•0 comments

Swiss Federal Council includes apprentices in the official end of year photo

https://www.20min.ch/story/guy-parmelin-bundesratsfoto-2026-soll-authentische-landesregierung-zei...
1•theanonymousone•33m ago•0 comments

AI Futures Model

https://www.aifuturesmodel.com/
1•zielmicha•34m ago•0 comments

Give your agentic processes a name

https://simonhartcher.com/posts/2025-12-31-give-your-agentic-processes-a-name/
1•deevus•35m ago•1 comment

Lithopedion

https://en.wikipedia.org/wiki/Lithopedion
1•ZeljkoS•35m ago•0 comments

Graphics API is irrelevant [video] – Shader code to video with C and FFmpeg

https://www.youtube.com/watch?v=xNX9H_ZkfNE
1•nopakos•35m ago•0 comments

Show HN: Shadowlight, a voice-driven murder mystery and heist inside Minecraft

https://www.playshadowlight.com/
2•marcsimon42•40m ago•1 comment

Anycrap: Infinite Weed

https://anycrap.shop/product/infinite-weed
1•tempodox•41m ago•1 comment

Show HN: Reload processes based on File changes and pluggable Agentic hooks

https://github.com/system32-ai/wip
1•debarshri•43m ago•0 comments

A Story Painted in Data

https://art.r3t.io
1•mxplusb•43m ago•1 comment

Show HN: AdvanceGG – A high-performance 2D graphics library for Go

https://github.com/GrandpaEJ/advancegg
2•ZOROX•48m ago•0 comments

Switching off AI's ability to lie makes it more likely to claim it's conscious

https://www.livescience.com/technology/artificial-intelligence/switching-off-ais-ability-to-lie-m...
5•binning•50m ago•3 comments

The latest AI news we announced in December

https://blog.google/technology/ai/google-ai-updates-december-2025/
3•clarkmaxwell•51m ago•0 comments