Show HN: Open dataset of real-world LLM performance on Apple Silicon

https://devpadapp.com/anubis-oss.html
1•uncSoft•1h ago
Why open-source local AI benchmarking on Apple Silicon matters - and why your benchmark submission is more valuable than you think.

The narrative around AI has been almost entirely cloud-centric. You send a prompt to a data center, tokens come back, and you try not to think about the latency, cost, or privacy implications. For a long time, that was the only game in town.

Apple Silicon - from M1 through the M4 Pro/Max shipping today, with M5 on the horizon - has quietly become one of the most capable local AI compute platforms on the planet. The unified memory architecture means an M4 Max with 128GB can run models that would require a dedicated GPU workstation elsewhere. At laptop wattages. Offline. Without sending a single token to a third party.

This shift is genuinely good for everyone (except cloud providers that want your money), but it comes with an unsolved problem: we don't have good, community-driven data on how these machines actually perform in the wild.

That's why I built Anubis OSS.

The Fragmented Local LLM Ecosystem

If you've run local models on macOS, you've felt this friction. Chat wrappers like Ollama and LM Studio are great for conversation but not built for systematic testing. Hardware monitors like asitop show GPU utilization but have no concept of what model is loaded or what the prompt context is. Eval frameworks like promptfoo require terminal fluency that puts them out of reach for many practitioners.

None of these tools correlate hardware behavior with inference performance. You can watch your GPU spike during generation, but you can't easily answer: Is Gemma 3 12B Q4_K_M more watt-efficient than Mistral Small 3.1 on an M3 Pro? How does TTFT scale with context length on 32GB vs. 64GB?

Anubis answers those questions. It's a native SwiftUI app - no Electron, no Python runtime, no external dependencies - that runs benchmark sessions against any OpenAI-compatible backend (Ollama, LM Studio, mlx-lm, and more) while simultaneously pulling real hardware telemetry via IOReport: GPU/CPU utilization, power draw in watts, ANE activity, memory including Metal allocations, and thermal state.
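To make the metrics concrete, here is a minimal sketch of how TTFT, decode throughput, and watt-efficiency could be derived from one run. It is not Anubis's implementation (the app is written in Swift); the function name, the field names, and the assumption that power samples are evenly spaced across the whole run are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RunMetrics:
    ttft_s: float            # time to first token, in seconds
    decode_tok_per_s: float  # generation rate after the first token
    avg_power_w: float       # mean of the sampled power readings
    tokens_per_joule: float  # generated tokens divided by energy used


def summarize_run(
    request_start_s: float,
    token_times_s: Sequence[float],
    power_samples_w: Sequence[float],
) -> RunMetrics:
    """Summarize one benchmark run (hypothetical helper, not Anubis's API).

    token_times_s: arrival time of each generated token, same clock as
        request_start_s.
    power_samples_w: power readings in watts, assumed evenly spaced over
        the run so that a plain mean is a fair average.
    """
    if len(token_times_s) < 2:
        raise ValueError("need at least two tokens to compute a decode rate")
    if not power_samples_w:
        raise ValueError("need at least one power sample")

    ttft = token_times_s[0] - request_start_s
    decode_window = token_times_s[-1] - token_times_s[0]
    if decode_window <= 0:
        raise ValueError("token timestamps must be increasing")

    # Tokens after the first one, divided by the time they took to arrive.
    decode_rate = (len(token_times_s) - 1) / decode_window

    avg_power = sum(power_samples_w) / len(power_samples_w)
    total_time = token_times_s[-1] - request_start_s
    energy_j = avg_power * total_time  # watts * seconds = joules
    if energy_j <= 0:
        raise ValueError("energy must be positive")

    return RunMetrics(ttft, decode_rate, avg_power,
                      len(token_times_s) / energy_j)


# Illustrative numbers only, not measurements from any real machine.
metrics = summarize_run(0.0, [0.5, 0.6, 0.7, 0.8, 0.9], [20.0, 30.0, 25.0])
print(metrics)
```

Separating TTFT from the decode rate matters because prompt processing and token generation stress the hardware differently, so a single "tokens per second" figure over the whole request would blur the two.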

Why the Open Dataset Is the Real Story

The leaderboard submissions aren't a scoreboard - they're the start of a real-world, community-sourced performance dataset across diverse Apple Silicon configs, model families, quantizations, and backends.

This data is hard to get any other way. Formal chipmaker benchmarks are synthetic. Reviewer benchmarks cover a handful of models. Nobody has the hardware budget to run a full cross-product matrix. But collectively, the community does.

For backend developers, the dataset surfaces which chip/memory configurations are underperforming their theoretical bandwidth, where TTFT degrades under long contexts, and what the real-world power envelope looks like under sustained load. For quantization authors, it shows efficiency curves across real hardware, ANE utilization patterns, and whether a quantization actually reduces memory pressure or just the file size on disk.
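One common way to reason about "underperforming theoretical bandwidth" is a back-of-the-envelope ceiling: if generating each token requires reading all model weights from memory once, decode speed cannot exceed memory bandwidth divided by model size. The sketch below assumes that simplified model; it ignores KV-cache reads, compute limits, and sparse architectures, and the function names and numbers are hypothetical rather than taken from the dataset.

```python
def decode_ceiling_tok_per_s(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on decode speed, assuming every generated token
    reads all model weights from memory exactly once."""
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / model_size_gb


def bandwidth_utilization(
    measured_tok_per_s: float,
    bandwidth_gb_s: float,
    model_size_gb: float,
) -> float:
    """Fraction of the rough ceiling that a measured run achieved."""
    ceiling = decode_ceiling_tok_per_s(bandwidth_gb_s, model_size_gb)
    return measured_tok_per_s / ceiling


# Illustrative inputs only: 400 GB/s of bandwidth, an 8 GB quantized model,
# and a measured 30 tokens/s. None of these are real benchmark results.
print(decode_ceiling_tok_per_s(400.0, 8.0))
print(bandwidth_utilization(30.0, 400.0, 8.0))
```

A low utilization figure does not prove a backend is at fault, but comparing it across submissions for the same chip and model is one way to spot configurations worth investigating.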

Running a benchmark takes about two minutes. Submitting takes one click.

Your hardware is probably underrepresented. The matrix of chip × memory × backend × thermal environment is enormous — every submission fills a cell nobody else may have covered.
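The size of that matrix is easy to see with a small sketch. The axis values and the submission records below are hypothetical placeholders, not the leaderboard's actual schema, and a plain cross-product also counts combinations that may not exist as real hardware.

```python
from itertools import product


def uncovered_cells(
    axes: dict[str, list[str]],
    submissions: list[dict[str, str]],
) -> list[tuple[str, ...]]:
    """Return every combination of axis values with no submission yet."""
    names = list(axes)
    # Each submission collapses to one cell of the matrix.
    seen = {tuple(sub[name] for name in names) for sub in submissions}
    all_cells = product(*(axes[name] for name in names))
    return [cell for cell in all_cells if cell not in seen]


# Hypothetical axes; thermal environment is left out to keep this short.
axes = {
    "chip": ["M1", "M3 Pro", "M4 Max"],
    "memory": ["16GB", "32GB", "64GB"],
    "backend": ["Ollama", "LM Studio", "mlx-lm"],
}
submissions = [
    {"chip": "M3 Pro", "memory": "32GB", "backend": "Ollama"},
    {"chip": "M4 Max", "memory": "64GB", "backend": "mlx-lm"},
]

missing = uncovered_cells(axes, submissions)
print(f"{len(missing)} cells still have no data")
```

Even with three values on each of three axes there are 27 cells, and adding model family and quantization multiplies that further, which is why individual submissions carry real weight.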

The dataset is open. This isn't data disappearing into a corporate analytics pipeline. It's a community resource for anyone building tools, writing research, or optimizing for the platform.

Anubis OSS is working toward 75 GitHub stars to qualify for Homebrew Cask distribution, which would make installation dramatically easier. A star is a genuinely meaningful contribution.

1. Download the latest GitHub release (notarized macOS app, no build required)
2. Run a benchmark against any model in your preferred backend
3. Submit results to the community leaderboard
4. Star the repo at github.com/uncSoft/anubis-oss

Comments

uncSoft•1h ago
Addendum: Anubis OSS is GPL-3.0 licensed. It is built fully in Swift and signed with a developer certificate for safety (if you don't want to clone the source and compile it yourself), with no external dependencies except Sparkle for auto-updates if you want them. It is privacy-first: benchmark data is submitted voluntarily and never includes anything beyond hardware specs and model performance metrics.
