We built MetalRT from scratch in 48 hours: pure C++ straight to Metal, no abstractions, no compromises. The result is the fastest decode performance available today on Apple Silicon.
658 tokens per second on Qwen3-0.6B (4-bit) using a single M4 Max.
We benchmarked against the strongest competitors on the exact same hardware (M4 Max, 64 GB, macOS 26.3):
- MetalRT
- uzu (Rust production engine)
- mlx-lm (Apple's official MLX framework)
- llama.cpp
- Ollama (REST API)
MetalRT is fastest on 3 of 4 models and wins the only clean apples-to-apples comparison: 1.10–1.19× faster than Apple's own MLX using identical model files.
Average 1.67× faster than llama.cpp, 1.59× faster than Ollama.
TTFT (time to first token) on Qwen3-0.6B: 6.6 ms.
Same model weights = same output quality. Only the speed is different.
Public access coming soon as part of MetalRT by RunAnywhere Team.
SilverElfin•2h ago
If you built it that quick - was it generated using AI?
sanchitmonga•2h ago
Models: Qwen3-0.6B, Qwen3-4B, Llama-3.2-3B, LFM2.5-1.2B (all 4-bit quantized, greedy decoding, 5 runs, best reported).