What's in a GGUF, besides the weights – and what's still missing?

https://nobodywho.ooo/posts/whats-in-a-gguf/
31•bashbjorn•2h ago

Comments

ge96•1h ago
Nice, I recently pulled down TheBloke's 7B Mistral to try out. I have a 4070.
ganelonhb•1h ago
I have a 2070 and can confirm it works amazingly fast.

I love TheBloke. I wish he still made stuff.

ge96•1h ago
What do you use it for? I'm still trying to use agents, I barely use copilot, only at work when I have to.

I didn't want to get personal with an LLM unless it was local, so that's why I was setting this up. So far, research is all I've been looking at.

bashbjorn•1h ago
Yeah, the TheBloke era of local LLMs was a good time. TBF Unsloth are doing a fantastic job of publishing quants of the major models quickly - they just don't have nearly the volume of "weird" models that TheBloke did.
bashbjorn•1h ago
I love mistral, but that model is... not the best. Maybe try out Gemma 4 e4b, it's a similar size to Mistral 7B, and should run great on your 4070 ("E4B" is slightly misleading naming).
ge96•1h ago
Thanks for the tip, what do you use Gemma 4 e4b for?
redanddead•57m ago
some say it’s a miniaturized gemini model

it’s good at writing, coding, decently intelligent

you can try it on nvidia nim

mixtureoftakes•25m ago
7B Mistral is quite outdated. On a 12 GB 4070 you can run Qwen 3.5 9B Q4_K_M or Qwen 3.6 35B. The latter will be a lot smarter but also a lot slower due to RAM offload.

Try both in lm studio, they really are surprisingly capable
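
A rough sketch of the arithmetic behind that trade-off, for anyone wondering why the 35B model spills into system RAM on a 12 GB card. The bits-per-weight figure (about 4.85 for Q4_K_M) and the fixed overhead for KV cache and buffers are assumptions, not numbers from this thread, and the 35B model is assumed to use the same quant. Real usage depends on context length and runtime.

```python
# Back-of-envelope estimate of whether quantized weights fit in VRAM.
# ASSUMPTIONS (not from the thread): ~4.85 bits per weight for Q4_K_M,
# and a flat 1.5 GB allowance for KV cache and runtime buffers.
BITS_PER_WEIGHT_Q4_K_M = 4.85
OVERHEAD_GB = 1.5


def weights_gb(params_billion, bits_per_weight=BITS_PER_WEIGHT_Q4_K_M):
    """Approximate size of the weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion, vram_gb, overhead_gb=OVERHEAD_GB):
    """True if weights plus the assumed overhead fit entirely on the GPU."""
    return weights_gb(params_billion) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for label, size in [("9B", 9), ("35B", 35)]:
        verdict = "fits" if fits_in_vram(size, vram_gb=12) else "needs RAM offload"
        print(f"{label}: ~{weights_gb(size):.1f} GB of weights, {verdict} on 12 GB")
```

Whatever does not fit on the GPU has to be read from system RAM, which is why memory bandwidth ends up mattering for the larger model.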

ge96•7m ago
I have 80 GB of RAM, but it's slow. Either it's capped by the i9 CPU or this specific ASUS mobo sucks; I think it only runs at 2400 MHz despite being DDR4.

Tried all the stuff: BIOS, volting.

kenreidwilson•1h ago
>Published May 18, 2026

hmmm...

bashbjorn•1h ago
whoops, my bad. Just a typo in the markdown. Fixed :)
badsectoracula•38m ago
> not to be confused with the somewhat baffling llama_chat_apply_template exposed in the libllama API, which hardcodes a handful of chat formats directly in C++

As someone who is tinkering with a desktop-based inference app in FLTK[0], i wish this used the actual Jinja2 template parser llama.cpp uses (or there was another C function that did that since AFAICT for "proper" parsing you need to be able to pass a bunch of data to the template so it knows if you, e.g., do tool calling). Currently i'm using this adhocky function, but i guess i'll either write a Jinja2 interpreter or copy/paste the one from llama.cpp's code (depending on how i feel at the time :-P).

But yeah, GGUF's "all-in-one" approach is very convenient. And i agree that it feels odd to have the projection models as separate files - i remember when i first downloaded a vision-capable model, i just grabbed whatever GGUF looked appropriate, then llama.cpp told me it couldn't do model and it took me a bit to realize that i had to download an extra file. Literally my thought once i did was "wasn't GGUF supposed to contain everything?" :-P

[0] https://i.imgur.com/GiTBE1j.png

Removing the modem and GPS from my 2024 RAV4 hybrid

https://arkadiyt.com/2026/05/13/removing-the-modem-and-gps-from-my-rav4/
283•arkadiyt•3h ago•135 comments

RTX 5090 and M4 MacBook Air: Can It Game?

https://scottjg.com/posts/2026-05-05-egpu-mac-gaming/
347•allenleee•4h ago•88 comments

New Nginx Exploit

https://github.com/DepthFirstDisclosures/Nginx-Rift
163•hetsaraiya•2h ago•40 comments

First public macOS kernel memory corruption exploit on Apple M5

https://blog.calif.io/p/first-public-kernel-memory-corruption
48•quadrige•1h ago•7 comments

The AI Zombification of Universities

https://www.thenewcritic.com/p/the-great-zombification
50•rmdmphilosopher•1h ago•19 comments

The Power of a Free Popsicle (2018)

https://www.gsb.stanford.edu/insights/power-free-popsicle
30•NaOH•1h ago•6 comments

WinUI 3 Performance: A Leap Forward

https://github.com/microsoft/microsoft-ui-xaml/discussions/11096
15•whatever3•1h ago•0 comments

HDD Firmware Hacking

https://icode4.coffee/?p=1465
65•jsploit•3h ago•6 comments

A message from President Kornbluth about funding and the talent pipeline

https://president.mit.edu/writing-speeches/video-transcript-message-president-kornbluth-about-fun...
511•dmayo•5h ago•534 comments

Computer Hobby Movement in Canada

https://museum.eecs.yorku.ca/exhibits/show/hobby_canada/hobby_canada
154•rbanffy•7h ago•47 comments

Understanding the Linux Kernel: The Linux Kernel Startup

https://internals-for-interns.com/posts/linux-kernel-startup/
29•valyala•1h ago•0 comments

Terranox AI (YC W26) Is Hiring a Founding AI/ML Engineer and Summer AI/ML Intern

https://www.workatastartup.com/companies/terranox-ai
1•jadecheclair•3h ago

AI is making me dumb

https://jpain.io/god-damn-ai-is-making-me-dumb/
229•Eighth•1h ago•154 comments

Int a = 5; a = a++ + ++a; a =? (2011)

https://gynvael.coldwind.pl/?id=372
35•e-topy•2d ago•60 comments

Germany's Sovereign Tech Fund Backs KDE with €1.3M

https://www.theregister.com/oses/2026/05/14/kde-bags-13m-as-europe-realizes-it-might-need-an-os-o...
53•Lihh27•52m ago•4 comments

You Don't Align an AI, You Align with It

https://danieltan.weblog.lol/2026/05/you-dont-align-an-ai-you-align-with-it
27•danieltanfh95•1h ago•7 comments

Fossils show millipede and centipede ancestors evolved legs underwater

https://phys.org/news/2026-05-ancient-sea-fossils-millipede-centipede.html
53•gmays•2d ago•2 comments

German intelligence offices snub Palantir software

https://www.dw.com/en/german-intelligence-offices-snub-us-based-palantir-software/a-77160897
44•abawany•1h ago•6 comments

The conflation of money and things

https://lithub.com/is-it-even-real-on-the-conflation-of-money-and-things/
49•bookofjoe•4h ago•16 comments

60fps Video on a CGA? – The GlyphBlaster

https://martypc.blogspot.com/2026/05/60fps-video-on-cga-glyphblaster.html
51•tambourine_man•4d ago•7 comments

EditLens: Quantifying the extent of AI editing in text (2025)

https://arxiv.org/abs/2510.03154
24•horseradish•1d ago•2 comments

DIY open-source ultrasound hardware on the rp2040/rp2350

http://un0rick.cc/pic0rick
14•kelu124•2h ago•1 comment

Rewrite Bun in Rust has been merged

https://github.com/oven-sh/bun/pull/30412
317•Chaoses•11h ago•380 comments

Show HN: Running the second public ODoH relay

https://numa.rs/blog/posts/odoh-anonymous-dns-without-an-account.html
103•rdme•9h ago•36 comments

Green Card Holders Targeted for Deportation by New 'Removal Apparatus'

https://www.nytimes.com/2026/05/14/us/politics/green-cards-immigration-deportation-trump.html
19•donohoe•1h ago•6 comments

Myths about /dev/urandom (2014)

https://www.2uo.de/myths-about-urandom/
76•signa11•8h ago•40 comments

Leaving the Physical World

https://www.eff.org/pages/leaving-physical-world
168•andsoitis•4d ago•78 comments

The Tree House: A voyage to the source of a backyard dream

https://www.laphamsquarterly.org/roundtable/tree-house
61•Caiero•3d ago•11 comments

OpenData Vector: MIT-Licensed Vector Search on Object Storage

https://www.opendata.dev/blog/introducing-vector/
6•apurvamehta•1h ago•0 comments