frontpage.

Made with ♥ by @iamnishanth

Open Source @Github


Cloud VM benchmarks 2026

https://devblog.ecuadors.net/cloud-vm-benchmarks-2026-performance-price-1i1m.html
185•dkechag•7h ago•86 comments

"Warn about PyPy being unmaintained"

https://github.com/astral-sh/uv/pull/17643
129•networked•6h ago•34 comments

CasNum

https://github.com/0x0mer/CasNum
267•aebtebeten•11h ago•35 comments

MonoGame: A .NET framework for making cross-platform games

https://github.com/MonoGame/MonoGame
56•azhenley•5h ago•23 comments

From RGB to L*a*b* color space (2024)

https://kaizoudou.com/from-rgb-to-lab-color-space/
8•kqr•3d ago•0 comments

Emacs internals: Deconstructing Lisp_Object in C (Part 2)

https://thecloudlet.github.io/blog/project/emacs-02/
59•thecloudlet•2d ago•2 comments

How to run Qwen 3.5 locally

https://unsloth.ai/docs/models/qwen3.5
79•Curiositry•8h ago•16 comments

A decade of Docker containers

https://cacm.acm.org/research/a-decade-of-docker-containers/
284•zacwest•14h ago•200 comments

Dumping Lego NXT firmware off of an existing brick (2025)

https://arcanenibble.github.io/dumping-lego-nxt-firmware-off-of-an-existing-brick.html
192•theblazehen•2d ago•11 comments

The Editor Who Helped Build a Golden Age of American Letters

https://newrepublic.com/article/205583/editor-helped-build-golden-age-american-letters
8•samclemens•2d ago•0 comments

Yoghurt delivery women combatting loneliness in Japan

https://www.bbc.com/travel/article/20260302-the-yoghurt-delivery-women-combatting-loneliness-in-j...
267•ranit•18h ago•147 comments

Autoresearch: Agents researching on single-GPU nanochat training automatically

https://github.com/karpathy/autoresearch
104•simonpure•11h ago•28 comments

Show HN: A weird thing that detects your pulse from the browser video

https://pulsefeedback.io/
64•kilroy123•3d ago•32 comments

The surprising whimsy of the Time Zone Database

https://muddy.jprs.me/links/2026-03-06-the-surprising-whimsy-of-the-time-zone-database/
99•jprs•13h ago•26 comments

In 1985 Maxell built a bunch of life-size robots for its bad floppy ad

https://buttondown.com/suchbadtechads/archive/maxell-life-size-robots/
103•rfarley04•3d ago•13 comments

Best performance of a C++ singleton

https://andreasfertig.com/blog/2026/03/best-performance-of-a-cpp-singleton/
24•jandeboevrie•1d ago•16 comments

To the Polypropylene Makers

https://www.lesswrong.com/posts/HQTueNS4mLaGy3BBL/here-s-to-the-polypropylene-makers
14•raldi•1h ago•1 comment

Ten Years of Deploying to Production

https://brandonvin.github.io/2026/03/04/ten-years-of-deploying-to-production.html
16•mooreds•2d ago•2 comments

FLASH radiotherapy's bold approach to cancer treatment

https://spectrum.ieee.org/flash-radiotherapy
202•marc__1•16h ago•62 comments

macOS code injection for fun and no profit (2024)

https://mariozechner.at/posts/2024-07-20-macos-code-injection-fun/
87•jstrieb•3d ago•15 comments

A Grand Vision for Rust

https://blog.yoshuawuyts.com/a-grand-vision-for-rust/
38•todsacerdoti•3d ago•33 comments

Lisp-style C++ template meta programming

https://github.com/mistivia/lmp
41•mistivia•9h ago•6 comments

How important was the Battle of Hastings?

https://www.historytoday.com/archive/head-head/how-important-was-battle-hastings
26•benbreen•4d ago•27 comments

Compiling Prolog to Forth [pdf]

https://vfxforth.com/flag/jfar/vol4/no4/article4.pdf
107•PaulHoule•4d ago•9 comments

LLM Writing Tropes.md

https://tropes.fyi/tropes-md
169•walterbell•10h ago•68 comments

Files are the interface humans and agents interact with

https://madalitso.me/notes/why-everyone-is-talking-about-filesystems/
204•malgamves•21h ago•116 comments

Re-creating the complex cuisine of prehistoric Europeans

https://arstechnica.com/science/2026/03/recreating-the-complex-cuisine-of-prehistoric-europeans/
74•apollinaire•1d ago•31 comments

Bourdieu's theory of taste: a grumbling abrégé (2023)

https://dynomight.net/bourdieu/
54•sebg•2d ago•17 comments

Hidden Overheads (2023)

https://blog.xoria.org/hidden-overheads/
17•surprisetalk•1d ago•5 comments

The influence of anxiety: Harold Bloom and literary inheritance

https://thepointmag.com/examined-life/the-influence-of-anxiety/
30•apollinaire•4d ago•2 comments

How to run Qwen 3.5 locally

https://unsloth.ai/docs/models/qwen3.5
79•Curiositry•8h ago

Comments

Twirrim•4h ago
I've been finding it very practical to run the 35B-A3B model on an 8GB RTX 3050. It's pretty responsive and does a good job on the coding tasks I've thrown at it. I need to grab the freshly updated models; the older one occasionally gets stuck in a loop with tool use, which they say they've fixed.
ufish235•3h ago
Can you give an example of some coding tasks? I had no idea local was that good.
hooch•1h ago
I recently changed into a project directory, fired up the qwen code CLI, and gave it two prompts: "so what's this then?", to which it gave a good summary of the stack and product, and then "think you can find something todo in the TODO?". While I was busy in Claude Code on another project, it neatly finished three HTML & CSS tasks that I had been procrastinating on for weeks.

This was a qwen3-coder-next 35B model on an M4 Max with 64GB; the model is 51GB according to ollama. I have not yet tried the variants from TFA.

manmal•14m ago
3.5 seems to be better at coding than 3-coder-next; I'd check it out.
fragmede•3h ago
Which models would that be?
fy20•1h ago
I guess you are offloading to system RAM? What tokens per second do you get? I've got an old gaming laptop with an RTX 3060; sounds like it could work well as a local inference server.
manmal•18m ago
In the article, they claim up to 25 t/s for the LARGEST model with a 24GB VRAM card. You need a lot of system RAM, obviously.
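The bandwidth point above can be made concrete with some napkin math: decode speed on a memory-bound setup is roughly memory bandwidth divided by the bytes of weights read per token, and an MoE model like 35B-A3B only reads its active parameters each token. All figures below (active-parameter count, quantization, bandwidth) are illustrative assumptions, not measurements from the article.

```python
# Napkin math: tokens/sec ~= memory bandwidth / bytes of weights read
# per token. For an MoE model only the *active* parameters are read
# each token; a dense model reads everything. All numbers here are
# assumptions for illustration, not measured specs.

def tokens_per_sec(active_params_b, bits_per_weight, bandwidth_gbs):
    """Rough upper bound on decode speed for a memory-bound model."""
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gbs * 1e9 / bytes_per_token

# Hypothetical 35B-A3B: ~3B active params at 4-bit, weights streamed
# from system RAM at an assumed ~50 GB/s (DDR5-class bandwidth).
moe = tokens_per_sec(3, 4, 50)
# Dense 27B at 4-bit reads all weights every token at the same bandwidth.
dense = tokens_per_sec(27, 4, 50)

print(f"MoE ~{moe:.0f} tok/s, dense ~{dense:.0f} tok/s")
```

Under these assumed numbers the MoE model comes out roughly 9x faster than the dense one, which is the kind of gap the Apple-silicon reports further down the thread also show.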
Curiositry•1h ago
Qwen3.5 9B seems to be fairly competent at OCR and text-formatting cleanup running in llama.cpp on CPU, albeit slow. However, I have compiled it umpteen ways and still haven't gotten GPU offloading working properly on an old 1650 Ti with 4GB VRAM (it tries to allocate too much memory), which I did have working with Ollama.
acters•1h ago
I have a 1660 Ti, and the cachyos + aur/llama.cpp-cuda package is working fine for me. With about 5.3 GB of usable memory, I find that the 35B model is by far the most capable one, and it performs just as fast as the 4B model that fits entirely on my GPU. I did try the 9B model, and it was surprisingly capable; however, 35B is still better in some of my own anecdotal test cases. Very happy with the improvement, though I notice that Qwen 3.5 is about half the speed of Qwen 3.
WhyNotHugo•23m ago
If you’re building from source, the vulkan backend is the easiest to build and use for GPU offloading.
Curiositry•19m ago
Yes, that's what I tried first. Same issue with trying to allocate more memory than was available.
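One workaround for the over-allocation problem described above is partial offload: llama.cpp's `-ngl` flag caps how many layers go to VRAM. A rough way to pick a value is sketched below; the layer count and model size are assumed figures for a hypothetical 9B model at 4-bit, not numbers from the article.

```python
# Sketch: when full GPU offload over-allocates (as on a 4GB card),
# pick a smaller -ngl so only some transformer layers live in VRAM.
# Layer count and model size below are assumptions for a hypothetical
# 9B model at 4-bit quantization.

def layers_that_fit(vram_gb, n_layers, model_gb, reserve_gb=1.0):
    """How many layers fit in VRAM, keeping a reserve for the
    KV cache, scratch buffers, and the display server."""
    per_layer_gb = model_gb / n_layers
    usable = max(vram_gb - reserve_gb, 0)
    return min(n_layers, int(usable / per_layer_gb))

# Hypothetical: ~5 GB of weights across 36 layers, on a 4 GB card.
ngl = layers_that_fit(vram_gb=4, n_layers=36, model_gb=5)
print(f"try: llama-cli -m model.gguf -ngl {ngl}")
```

`-ngl` is a real llama.cpp flag; the split it suggests here is only a starting point, since actual per-layer sizes vary and the runtime needs extra headroom for compute buffers.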
moqizhengz•1h ago
Running 3.5 9B on my ASUS 5070 Ti 16G with LM Studio gives a stable ~100 tok/s. This outperforms the majority of online LLM services, and the actual quality of the output matches the benchmarks. This model is really something: it's the first time I've ever had a usable model on consumer-grade hardware.
yangikan•1h ago
Do you point Claude Code at this? The orchestration seems to be very important.
lukan•49m ago
What exact model are you using?

I have a 16GB GPU as well, but have never run a local model so far. According to the table in the article, 9B at 8-bit (13 GB) and 27B at 3-bit seem to fit inside the memory. Or is there more space required for context etc.?

throwdbaaway•46m ago
There are Qwen3.5 27B quants in the range of 4 bits per weight, which fit into 16G of VRAM. The quality is comparable to Sonnet 4.0 from summer 2025. Inference speed is very good with ik_llama.cpp, and still decent with mainline llama.cpp.
teaearlgraycold•7m ago
Qwen3.5 35B A3B is much, much faster and fits if you get a 3-bit version. How fast are you getting 27B to run?

On my M3 Air with 24GB of memory, 27B runs at 2 tok/s, but 35B A3B gets 14-22 tok/s, which is actually usable.
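On the "is there more space required for context?" question upthread: yes, beyond the weights the KV cache grows linearly with context length. A rough estimate is sketched below; the architecture numbers (layers, KV heads, head dimension) are assumptions for a hypothetical 27B dense model, not Qwen3.5's actual config.

```python
# Rough KV-cache sizing: K and V are stored per layer, per KV head,
# per position. Architecture numbers here are assumptions for a
# hypothetical 27B dense model with grouped-query attention, not
# the real Qwen3.5 config.

def kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """KV cache size in GB; bytes_per_elem=2 assumes an f16 cache."""
    total = 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem
    return total / 1e9

# Assumed config: 48 layers, 8 KV heads, head_dim 128, f16 cache.
print(f"{kv_cache_gb(48, 8, 128, 32768):.1f} GB at 32k context")
print(f"{kv_cache_gb(48, 8, 128, 4096):.1f} GB at 4k context")
```

Under these assumptions a long f16 context alone costs several GB on top of the weights, which is why a 4-bit 27B that nominally fits in 16GB can still overflow; shrinking the context, or quantizing the KV cache where the runtime supports it, reclaims that headroom.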