frontpage.

Made with ♥ by @iamnishanth

Open Source @Github


All elementary functions from a single binary operator

https://arxiv.org/abs/2603.21852
360•pizza•7h ago•100 comments

The Economics of Software Teams: Why Most Engineering Orgs Are Flying Blind

https://www.viktorcessan.com/the-economics-of-software-teams/
120•kiyanwang•3h ago•62 comments

Taking on CUDA with ROCm: 'One Step After Another'

https://www.eetimes.com/taking-on-cuda-with-rocm-one-step-after-another/
160•mindcrime•10h ago•119 comments

DIY Soft Drinks

https://blinry.org/diy-soft-drinks/
431•_Microft•16h ago•115 comments

Bring Back Idiomatic Design (2023)

https://essays.johnloeber.com/p/4-bring-back-idiomatic-design
554•phil294•20h ago•305 comments

Show HN: boringBar – a taskbar-style dock replacement for macOS

https://boringbar.app/
364•a-ve•15h ago•199 comments

Optimization of 32-bit Unsigned Division by Constants on 64-bit Targets

https://arxiv.org/abs/2604.07902
75•mpweiher•1d ago•7 comments

Ask HN: What Are You Working On? (April 2026)

217•david927•16h ago•673 comments

A perfectable programming language

https://alok.github.io/lean-pages/perfectable-lean/
121•yuppiemephisto•12h ago•42 comments

Most people can't juggle one ball

https://www.lesswrong.com/posts/jTGbKKGqs5EdyYoRc/most-people-can-t-juggle-one-ball
349•surprisetalk•3d ago•109 comments

I gave every train in New York an instrument

https://www.trainjazz.com/
275•joshuawolk•2d ago•52 comments

Apple's accidental moat: How the "AI Loser" may end up winning

https://adlrocha.substack.com/p/adlrocha-how-the-ai-loser-may-end
180•walterbell•6h ago•173 comments

Show HN: Oberon System 3 runs natively on Raspberry Pi 3 (with ready SD card)

https://github.com/rochus-keller/OberonSystem3Native/releases
194•Rochus•20h ago•44 comments

Tell HN: Docker pull fails in Spain due to football Cloudflare block

876•littlecranky67•20h ago•326 comments

We have a 99% email reputation, but Gmail disagrees

https://blogfontawesome.wpcomstaging.com/we-have-a-99-email-reputation-gmail-disagrees/
250•em-bee•20h ago•227 comments

A Canonical Generalization of OBDD

https://arxiv.org/abs/2604.05537
13•luu•4h ago•6 comments

Is math big or small?

https://chessapig.github.io/talks/Big-Small
39•robinhouston•1d ago•12 comments

Caffeine, cocaine, and painkillers detected in sharks from The Bahamas

https://www.sciencedirect.com/science/article/abs/pii/S0269749126001880
8•LostMyLogin•1h ago•2 comments

Exploiting the most prominent AI agent benchmarks

https://rdi.berkeley.edu/blog/trustworthy-benchmarks-cont/
523•Anon84•1d ago•132 comments

Google removes "Doki Doki Literature Club" from Google Play

https://bsky.app/profile/serenityforge.com/post/3mj3r4nbiws2t
448•super256•13h ago•222 comments

Seven countries now generate nearly all their electricity from renewables (2024)

https://www.the-independent.com/tech/renewable-energy-solar-nepal-bhutan-iceland-b2533699.html
575•mpweiher•19h ago•333 comments

JVM Options Explorer

https://chriswhocodes.com/vm-options-explorer.html
193•0x54MUR41•22h ago•86 comments

I ran Gemma 4 as a local model in Codex CLI

https://blog.danielvaughan.com/i-ran-gemma-4-as-a-local-model-in-codex-cli-7fda754dc0d4
57•dvaughan•12h ago•19 comments

How long-distance couples use digital games to facilitate intimacy (2025)

https://arxiv.org/abs/2505.09509
96•radeeyate•17h ago•30 comments

Uncharted island soon to appear on nautical charts

https://www.awi.de/en/about-us/service/press/single-view/unkartierte-insel-demnaechst-auf-seekart...
85•tannhaeuser•12h ago•43 comments

Haunt, the 70s text adventure game, is now playable on a website

https://haunt.madebywindmill.com
55•jscalo•5h ago•18 comments

Phyphox – Physical Experiments Using a Smartphone

https://phyphox.org/
215•_Microft•1d ago•34 comments

Pro Max 5x quota exhausted in 1.5 hours despite moderate usage

https://github.com/anthropics/claude-code/issues/45756
635•cmaster11•20h ago•563 comments

The peril of laziness lost

https://bcantrill.dtrace.org/2026/04/12/the-peril-of-laziness-lost/
386•gpm•13h ago•128 comments

A Tour of Oodi

https://blinry.org/oodi/
138•zdw•3d ago•42 comments

I ran Gemma 4 as a local model in Codex CLI

https://blog.danielvaughan.com/i-ran-gemma-4-as-a-local-model-in-codex-cli-7fda754dc0d4
57•dvaughan•12h ago

Comments

anactofgod•8h ago
Amazing. Thanks for your detailed posts on the bake-off between the Mac and GB10, Daniel, and for sharing what you learned. I had trying something similar on both compute platforms on my to-do list. Your post should save me a lot of debugging, sweat, and tears.
fortyseven•8h ago
I've been VERY impressed with Gemma4 (26B at the moment). It's the first time I've been able to use OpenCode via a llamacpp server reliably and actually get shit done.

In fact, I started using it as a coding partner while learning how to use the Godot game engine (and some custom 'skills' I pulled together from the official docs). I purposely avoided Claude and friends entirely, and just used Gemma4 locally this week... and it's really helped me figure out not just coding issues I was encountering, but also helped me sift through the documentation quite readily. I never felt like I needed to give in and use Claude.

Very, very pleased.

blackmanta•7h ago
With an Nvidia Spark or a 128GB+ memory machine, you can get a good speedup on the 31B model if you use the 26B MoE as a draft model. It uses more memory, but I've seen acceptance rates around 70%+ using Q8 on both models.
foobar10000•7h ago
1 token ahead or 2?

It's interesting - IMO we'll soon have draft models specifically post-trained for denser, more complicated models. I wouldn't be surprised if diffusion models made a comeback for this - they can draft many tokens at once, and learning curves seem to top out at 90+% match for auto-regressive ones, so it's quite interesting.

ehtbanton•7h ago
This is genuinely very helpful. I'm planning a MacBook Pro purchase with local inference in mind and now see I'll have to aim for a slightly higher memory option, because the Gemma 4 26B MoE is not all that!
tomr75•1h ago
Pretty sure an Nvidia GPU is better bang for the buck because of usable inference speed.
egorfine•49m ago
I upgraded my M4 Pro 24GB to an M5 Pro 48GB yesterday. The same Gemma 4 MoE model (4-bit, don't remember which version) runs about 8x faster on the M5 Pro and loads into memory about 2x faster.

So yes, do purchase that new MacBook Pro.

brcmthrowaway•7h ago
Nothing about omlx?
vsrinivas•6h ago
Hey - I use the same setup, with both gemma4 and gpt-oss-*; some things I have to do for a good experience:

1) Pin to an earlier version of codex (sorry) - 0.55 is the best experience IME, but YMMV (see https://github.com/openai/codex/issues/11940, https://github.com/openai/codex/issues/8272).

2) Use the older completions endpoint (llama.cpp's responses support is incomplete - https://github.com/ggml-org/llama.cpp/issues/19138)
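Concretely, a setup along those lines might look like the sketch below. The provider id, model filename, profile name, and port are illustrative assumptions, and Codex's config.toml field names can differ across versions (especially on a pinned build like 0.55); `wire_api = "chat"` is what selects the older completions-style endpoint over the responses API.

```shell
# Serve the model locally over llama.cpp's OpenAI-compatible API (port is arbitrary).
llama-server -m gemma-4-26b-a4b-Q6_K.gguf -c 32768 --port 8080

# Point Codex at the local server via ~/.codex/config.toml.
cat >> ~/.codex/config.toml <<'EOF'
[model_providers.llamacpp]
name = "llama.cpp"
base_url = "http://localhost:8080/v1"
wire_api = "chat"

[profiles.local-gemma]
model = "gemma-4-26b-a4b"
model_provider = "llamacpp"
EOF
```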

tuzemec•2h ago
I'm currently experimenting with running google/gemma-4-26b-a4b with LM Studio (https://lmstudio.ai/) and Opencode on an M3 Ultra with 48GB RAM. And it seems to be working. I had to increase the context size to 65536 so the prompts from Opencode would work, but no other problems so far.

I tried running the same on an M3 Max with less memory, but couldn't increase the context size enough to be useful with Opencode.

It's also easy to integrate it with Zed via ACP. For now it's mostly simple code review tasks and generating small front-end related code snippets.

zihotki•1h ago
For coding it makes no sense to use any quantization worse than Q6_K, in my experience. More heavily quantized models make more mistakes; for text processing that can still be fine, but for coding it isn't.
mhitza•1h ago
> The finding I did not expect: model quality matters more than token speed for agentic coding.

I'm really surprised that wasn't obvious.

Also, instead of limiting the context size to something like 32k, you can offload the MoE weights to the CPU with --cpu-moe, at the cost of roughly halving token-generation speed.
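As a concrete sketch of that trade-off (model filename and context size are assumptions), it is a one-flag change when launching llama.cpp's server:

```shell
# Keep the full 64k context but host the MoE expert weights in system RAM;
# attention and shared layers stay on the GPU. Slower generation, less VRAM.
llama-server -m gemma-4-26b-a4b-Q6_K.gguf -c 65536 --cpu-moe

# Or offload only the experts of the first N layers for a middle ground.
llama-server -m gemma-4-26b-a4b-Q6_K.gguf -c 65536 --n-cpu-moe 20
```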

Havoc•1h ago
You can also try speculative decoding with the E2B model. Under some conditions it can result in a decent speedup.
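A hedged sketch of that setup with llama.cpp (flag names as in recent llama-server builds; the filenames are assumptions):

```shell
# Use the small E2B model to draft tokens that the 26B model then verifies,
# keeping the longest accepted prefix of each draft window.
llama-server \
  -m gemma-4-26b-a4b-Q6_K.gguf \
  -md gemma-4-e2b-Q8_0.gguf \
  --draft-max 8 --draft-min 1
```

The speedup depends on the acceptance rate: drafted tokens the big model rejects are wasted work, which is why it only helps under some conditions.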
danilop•54m ago
Nice walkthrough and interesting findings! The difference between the MoE and the dense models seems to be bigger than what benchmarks report. It makes sense, because a small gain in tool planning and handling can have a large influence on results.
egorfine•47m ago
Related: I upgraded my M4 Pro 24GB to an M5 Pro 48GB yesterday. The same Gemma 4 MoE model (Q4) runs at about 8x more t/s on the M5 Pro and loads from disk to memory about 2x faster.

Gonna run some more tests later today.

Confiks•10m ago
> The same Gemma 4 MoE model (Q4)

As you have so much RAM I would suggest running Q8_0 directly. It's not slower, and might even be faster, while being almost identical in quality to the original model.

And just to be sure: you're running the MLX version, right? The mlx-community quantization seemed to be broken when I tried it last week (it spat out garbage), so I downloaded the unsloth version instead. That too was broken in mlx-lm (it crashed), but has since been fixed on the main branch of https://github.com/ml-explore/mlx-lm.

I unfortunately only have 16 GiB of RAM on a MacBook M1, but I just tried running the Q8_0 GGUF version on a 2023 AMD Framework 13 with 64 GiB of RAM using only the CPU, and that works surprisingly well, with tokens/s much faster than I can read the output.
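For anyone checking the MLX path, a minimal smoke test with mlx-lm might look like this sketch (the model repo is left as a placeholder rather than guessed; the main-branch install picks up the crash fix mentioned above):

```shell
# Install mlx-lm from the main branch, where the Gemma 4 crash is fixed.
pip install git+https://github.com/ml-explore/mlx-lm.git

# Generate a few tokens to confirm the quantization isn't emitting garbage.
mlx_lm.generate --model <gemma-4 MLX repo> --prompt "Hello" --max-tokens 32
```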

egorfine•3m ago
> As you have so much RAM I would suggest running Q8_0 directly

On the 48GB Mac - absolutely. The 24GB one cannot run Q8, hence the comparison.

> And just to be sure: you're are running the MLX version, right?

Nah, not yet. I have only tested in LM Studio and they don't have MLX versions recommended yet.

> but has since been fixed on the main branch

That's good to know, I will play around with it.

karpetrosyan•30m ago
I think local models are not yet that good or fast for complex things, so I am just using local Gemma 4 for some dummy refactorings or something really simple.
dajonker•10m ago
I don't really have the hardware to try it out, but I'm curious to see how Qwen3.5 stacks up against Gemma 4 in a comparison like this. Especially this model that was fine tuned to be good at tool calling that has more than 500k downloads as of this moment: https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-...