Show HN: Bonsai 1.7B ternary model at 442T/s on M4 Max

https://agents2agents.ai/bonsai
9•hhuytho•2h ago
We took the recently released Bonsai 1.7B ternary model from PrismML (https://github.com/PrismML-Eng/Bonsai-demo) and ran our agentic evolution search on it for 6 hours to optimize the Metal kernels. The search was fully autonomous.

Measured against unmodified upstream llama.cpp at the same Bonsai/Q2_0 commit, same M4 Max:

- tg128: 309.82 → 442.42 t/s (+42.8%)

- pp512: 4250.32 → 4622.63 t/s (+8.8%)
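
For anyone who wants to check the percentages, here is a minimal sketch that recomputes the relative change from the two throughput pairs quoted above. The function name `percent_change` and the `results` dictionary are illustrative only and are not part of llama.cpp or the benchmark tooling.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Throughput figures (tokens/s) as quoted in the post: (upstream, optimized).
results = {
    "tg128": (309.82, 442.42),
    "pp512": (4250.32, 4622.63),
}

for name, (before, after) in results.items():
    print(f"{name}: {before} -> {after} t/s ({percent_change(before, after):+.1f}%)")
```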

Comments

dsecurity49•1h ago
That performance jump is incredible. Curious to know if the evolution search found any specific optimizations that were counter-intuitive to how we normally write Metal kernels?
hhuytho•1h ago
Yes, a few interesting observations:

- Conventional fusion wisdom says "fuse early, fuse aggressively", but the search does the opposite for Q. It fuses K's RMSNorm at K-cache-write time (one normalization for the whole K matrix) but defers Q's RMSNorm to the attention kernel's prologue.

- The Q2_0 kernel for result_output was rewritten to process 2 output rows per SIMD lane instead of 1, with nsg=8. This goes against the common Metal advice to maximize occupancy so that simdgroups stay busy. The advantage is that each y vector is reused across two accumulators, halving DRAM bandwidth for the y operand.

We didn't suggest either of these. The agent had the upstream code, a benchmark, and a correctness check.
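
The two-rows-per-lane idea can be illustrated with a scalar sketch. This is not the Metal kernel from the post: it is a plain Python model, with hypothetical function names, that only counts how often an element of y is read when a matrix-vector product handles one output row at a time versus two rows sharing each read. Real DRAM traffic on a GPU also depends on caching and threadgroup layout, which this ignores.

```python
def matvec_one_row(W, y):
    """One accumulator per pass: every row re-reads all of y."""
    y_loads = 0
    out = []
    for row in W:
        acc = 0.0
        for j, w in enumerate(row):
            yj = y[j]          # one read of y per (row, column) pair
            y_loads += 1
            acc += w * yj
        out.append(acc)
    return out, y_loads


def matvec_two_rows(W, y):
    """Two accumulators per pass: each read of y feeds two output rows."""
    assert len(W) % 2 == 0, "sketch assumes an even number of rows"
    y_loads = 0
    out = []
    for r in range(0, len(W), 2):
        row0, row1 = W[r], W[r + 1]
        acc0 = acc1 = 0.0
        for j in range(len(y)):
            yj = y[j]          # read once, reused by both accumulators
            y_loads += 1
            acc0 += row0[j] * yj
            acc1 += row1[j] * yj
        out.extend([acc0, acc1])
    return out, y_loads


# Ternary weights (-1, 0, +1), as in a ternary model; the values are made up.
W = [[1, 0, -1], [0, 1, 1], [-1, -1, 0], [1, 1, 1]]
y = [2.0, 3.0, 5.0]

out1, loads1 = matvec_one_row(W, y)
out2, loads2 = matvec_two_rows(W, y)
print(out1 == out2, loads1, loads2)
```

Both variants produce the same output vector, while the paired version reads y half as many times. The trade-off described in the comment is that each lane holds more state, which tends to lower occupancy.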

Show HN: Muesli – If Granola and Wisprflow had an open source on device baby

https://freedspeech.xyz
5•pHequals7•1h ago•3 comments

Show HN: Ableton Live MCP

https://github.com/bschoepke/ableton-live-mcp
105•bschoepke•1d ago•73 comments

Show HN: Apple's SHARP running in the browser via ONNX runtime web

https://github.com/bring-shrubbery/ml-sharp-web
178•bring-shrubbery•1d ago•43 comments

Show HN: Pytest plugin that classifies why your CI failed

https://github.com/ahmad212o/pytest-cloudreport
3•ahmad212o•4h ago•0 comments

Show HN: Replacing spec-driven development with just facts

https://github.com/av/facts
6•everlier•4h ago•0 comments

Show HN: Privacy-First PDF Converter

https://privapdf.net
3•omertt27•4h ago•5 comments

Show HN: I built a RISC-V emulator that runs DOOM

https://github.com/lalitshankarch/rvcore
45•Flex247A•1d ago•2 comments

Show HN: State of the Art of Coding Models, According to Hacker News Commenters

https://hnup.date/hn-sota
153•yunusabd•1d ago•86 comments

Show HN: Pollen – distributed WASM runtime, no control plane, single binary

https://github.com/sambigeara/pollen
129•sambigeara•4d ago•59 comments

Show HN: DAC – open-source dashboard as code tool for agents and humans

https://github.com/bruin-data/dac
112•karakanb•5d ago•35 comments

Show HN: Software Engineer to Novelist: Writing a Book Like Coding

https://frequal.com/forwriters/
20•TeaVMFan•1d ago•3 comments

Show HN: Parrot – a fun, skeuomorphic audio recorder to hear yourself

https://www.zkhrv.com/parrot
14•zkhrv•1d ago•0 comments

Show HN: WhatCable, a tiny menu bar app for inspecting USB-C cables

https://github.com/darrylmorley/whatcable
554•sleepingNomad•3d ago•166 comments

Show HN: Mljar Studio – local AI data analyst that saves analysis as notebooks

https://mljar.com/
70•pplonski86•2d ago•16 comments

Show HN: AI CAD Harness

https://fusion.adam.new/install
97•zachdive•3d ago•95 comments

Show HN: Browser-based light pollution simulator using real photometric data

https://iesna.eu/?wasm=skyglow_demo
42•holg•2d ago•16 comments

Show HN: Kula – a family health platform that makes sense of your data

6•samuraikmc•13h ago•8 comments

Show HN: Piruetas – A self-hosted diary app I built for my girlfriend

https://piruet.app
70•patillacode•2d ago•48 comments

Show HN: Filling PDF forms with AI using client-side tool calling

https://copilot.simplepdf.com/?share=a7d00ad073c75a75d493228e6ff7b11eb3f2d945b6175913e87898ec96ca...
56•nip•2d ago•25 comments

Show HN: Large Scale Article Extract of Newspapers 1730s-1960s

https://snewpapers.com/
50•brettnbutter•2d ago•20 comments

Show HN: Tyche: An experimental distributed trading pipeline in Go and Java

https://github.com/ItsArnavSh/Tyche
3•itsarnavsh•22h ago•5 comments

Show HN: Hello, World in many different languages

https://languages.jdunn.dev/
10•jdironman•1d ago•11 comments

Show HN: Stop playing my matchstick puzzles, start building your own in seconds

https://mathstick.github.io
36•trangram•2d ago•33 comments

Show HN: ReflowPDF – wrote a layout engine because every PDF library failed

https://reflowpdf.com
6•exsol•17h ago•2 comments

Show HN: Security Scanner for Agent Skills and MCP

https://github.com/snyk/agent-scan
7•lirantal•1d ago•0 comments

Show HN: Site Mogging

https://sitemogging.com
68•jilles•3d ago•76 comments

Show HN: Loopsy, a way for terminals and AI agents on different machines to talk

https://github.com/leox255/loopsy
57•todience•3d ago•12 comments

Show HN: Rust library for Undo/Redo using deltas, snapshots or commands

https://github.com/mikwielgus/undoredo
23•mikolajw•1d ago•4 comments

Show HN: Triggering anti-cheats with just a browser tab title

https://github.com/elliott-diy/DontTrustTitles
8•Elliott-Diy•20h ago•2 comments