frontpage.

France's homegrown open source online office suite

https://github.com/suitenumerique
291•nar001•2h ago•144 comments

British drivers over 70 to face eye tests every three years

https://www.bbc.com/news/articles/c205nxy0p31o
50•bookofjoe•35m ago•23 comments

Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
391•theblazehen•2d ago•141 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
70•AlexeyBrin•4h ago•14 comments

First Proof

https://arxiv.org/abs/2602.05192
20•samasblack•1h ago•13 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
758•klaussilveira•18h ago•236 comments

Reinforcement Learning from Human Feedback

https://arxiv.org/abs/2504.12501
44•onurkanbkrc•3h ago•2 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
1013•xnx•1d ago•574 comments

Coding agents have replaced every framework I used

https://blog.alaindichiappari.dev/p/software-engineering-is-back
125•alainrk•3h ago•141 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
15•vinhnx•1h ago•1 comment

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
147•jesperordrup•8h ago•55 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
95•videotopia•4d ago•23 comments

A Fresh Look at IBM 3270 Information Display System

https://www.rs-online.com/designspark/a-fresh-look-at-ibm-3270-information-display-system
10•rbanffy•3d ago•0 comments

Making geo joins faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
148•matheusalmeida•2d ago•40 comments

Ga68, a GNU Algol 68 Compiler

https://fosdem.org/2026/schedule/event/PEXRTN-ga68-intro/
30•matt_d•4d ago•8 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
256•isitcontent•18h ago•27 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
267•dmpetrov•19h ago•144 comments

Google staff call for firm to cut ties with ICE

https://www.bbc.com/news/articles/cvgjg98vmzjo
63•tartoran•1h ago•8 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
536•todsacerdoti•1d ago•260 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
413•ostacke•1d ago•105 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
355•vecti•21h ago•161 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
59•helloplanets•4d ago•58 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
329•eljojo•21h ago•199 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
453•lstoll•1d ago•297 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
368•aktau•1d ago•192 comments

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev

https://github.com/sandys/kappal
12•sandGorgon•2d ago•3 comments

Cross-Region MSK Replication: K2K vs. MirrorMaker2

https://medium.com/lensesio/cross-region-msk-replication-a-comprehensive-performance-comparison-o...
7•andmarios•4d ago•1 comment

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
57•gmays•13h ago•23 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
298•i5heu•21h ago•253 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
107•quibono•5d ago•34 comments

What even is a small language model now?

https://jigsawstack.com/blog/what-even-is-a-small-language-model-now--ai
109•yoeven•8mo ago

Comments

zellyn•8mo ago
I think of “fits on the overpowered M1/2/3/4 64GB MacBook Pro my employer gave me” as the dividing line. We’re getting to within spitting distance of models that can code well at that size.
api•8mo ago
I want my next laptop to be the 128GB M-series monster. That will run not-quite-frontier models that come close in performance, and run them fast.
danielbln•8mo ago
And, also quite important, leave your system enough RAM to do anything else.
onecommentman•8mo ago
I don’t know about that. Could a cheaper second MacBook always be used as an outboard unMath/unTPU coprocessor to handle those other tasks? (I have no clue.)
adgjlsfhk1•8mo ago
are you sure? LPDDR5 is somewhere in the range of ~0.25 W/GB (for some reason good values for this stat are hard to find), so 128GB of RAM will mean your laptop idles at >25 watts.
api•8mo ago
True. But I am usually near an outlet, and with the M series even the overpowered ones just bring you back to the battery life of a good Intel laptop (3-4 hours) instead of 8+ for smaller ones.

That and running local LLMs pretty much requires an outlet. The GPU goes up to 50 watts. Battery life drops from many hours to less than one.

adgjlsfhk1•8mo ago
that 25 watts was just for the RAM. You still need another ~5 watts for CPU and display at idle, so you're talking about a laptop that lasts ~3 hours with the screen on while doing literally nothing. For comparison, my 2021 XPS 13 (with a fairly small 55 Wh battery) lasts at least 4 hours under actual use (video playback, browsing GitHub, compiling code, etc.).

Also, when your laptop is using 25 W just for the RAM, that's ~20 watts the CPU/GPU don't get when you want to power them up.
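That arithmetic is easy to check. A quick sketch using the figures quoted in this thread (the per-GB draw is the estimate above, not a datasheet value, and the 100 Wh battery is an assumption):

    # Rough sanity check of the idle-power math (illustrative numbers only)
    ram_gb = 128
    watts_per_gb = 0.25                  # LPDDR5 estimate from the comment above
    ram_watts = ram_gb * watts_per_gb    # 32 W just for RAM
    idle_watts = ram_watts + 5           # plus ~5 W for CPU and display at idle
    battery_wh = 100                     # assume a large ~100 Wh laptop battery
    print(f"RAM draw: {ram_watts:.0f} W")                    # 32 W
    print(f"Idle runtime: {battery_wh / idle_watts:.1f} h")  # ~2.7 h

Which indeed lands near the ~3 hours claimed.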

Maxious•8mo ago
https://mistral.ai/news/devstral and https://huggingface.co/nvidia/AceReason-Nemotron-14B were released in just the last couple of days and work just fine on 24GB 4090 GPUs / 32GB MacBook Pros
mark_l_watson•8mo ago
+1, that is my experience. Devstral 24B on my Mac does very well at designing code. I am writing a book on AI-first software development, and I have been exploring small models in a process of separate steps for analysis, design, implementation, etc.
nickpsecurity•8mo ago
The term is too overloaded.

I'll add one more: a LLM small enough that it can be trained from scratch on one A100 in 24 hours. Is it really small if it takes $10,000 to train? Or leave that term for $200 models?

Back to your definitions, there are sub-1B models people are using. I think I saw one in the 400-600M range for audio. Another person posted here a 100M-200M model for extracting data from web pages. We told them to just use a rules-based approach where possible but they believed the SLM worked better.

Then, there are projects like BabyLM that can be useful at 10M:

https://babylm.github.io/

GardenLetter27•8mo ago
But you only have to train the foundational model once - so with open weights it's not really a problem.

Maybe resources needed for fine-tuning would be nice to see.

nickpsecurity•8mo ago
Most have been trained on illegally-distributed, copyrighted works. They might output them, too. People might want untainted models. Additionally, some have weaknesses due to tokenizers, pre-training data, or moral alignment (political bias).

For those reasons, users might want to train a new model from scratch.

Researchers of training methods have a different problem. They need to see whether a new technique, like an optimization algorithm, gets better results. They can try techniques more quickly and cheaply if they have small training runs that are representative of what larger models do. If BabyLM-10M were representative, they could test each technique at the FLOPS/$ of a 10M model instead of a 1B model.

So, both researchers and users might want new models trained from scratch. The cheaper to train, the better.

monkeyisland•8mo ago
> Another person posted here a 100M-200M model for extracting data from web pages

Could you post a link to this comment or thread? I can't seem to find this model by searching but would love to try it out.

nickpsecurity•8mo ago
I think I found it. I could be getting the numbers mixed up with another SLM. That example's smaller model was 500M:

https://news.ycombinator.com/item?id=41515730

srikz•8mo ago
I want to see more models that can be streamed to a browser and run locally via Wasm, in the <100MB range. That would be my hope for small models.
firejake308•8mo ago
After experimenting with 1B models, I am starting to think that any model with 1B parameters or fewer will lack much of the general intelligence we observe in frontier models, because it seems physically impossible to encode that much information in so few parameters. I believe that among very small models, the winners will be models fine-tuned to a narrow range of tasks or domains, such as a model that translates between English and any other language, or a legal-summarization model.
relaxing•8mo ago
Why? Just so user data stays local?
dainiusse•8mo ago
Yes. And also, cost to run it.
relaxing•8mo ago
Sounds like a case where egress could easily cost more than compute.
vindex10•8mo ago
Have you heard of Transformers.js? It runs ONNX models inside the browser:

https://huggingface.co/docs/transformers.js/en/index

KasianFranks•8mo ago
This is also where MoE shines with a mixture of small and large language models.
alexpham14•8mo ago
I appreciate how it redefines “small” not by parameter count but by practical impact and deployability.
lblume•8mo ago
I do not — parameter count is objective, practical impact depends on such a multitude of factors that any comparison becomes virtually meaningless.
kergonath•8mo ago
The standard for parameter count is evolving rapidly. Something large now will be small tomorrow; there is no point using such a moving target as a criterion.
lblume•8mo ago
Sure, but nonetheless whether the model is called "small" at some time t should depend on the parameter count and t, not some arbitrarily specified metric of deployability.
antirez•8mo ago
Very small: can run on the edge, letting something like a Raspberry Pi make basic decisions for your appliance even when disconnected from the internet. Example: here are some time-series parameters and instructions, decide whether to water the plants or not; vision models that can watch a camera and transcribe what they see in a basic way, ...

Small: runs on an average laptop not optimized for LLM inference, like Gemma 3 4B.

Medium: runs on a very high-spec computer that people can buy for less than $5k. 30B-70B dense models, or larger MoEs.

Large: models that the big LLM providers sell as "mini", "flash", ...

Extra Large / SOTA: Gemini 2.5 Pro, Claude 4 Opus, OpenAI o3, ...

mnahkies•8mo ago
I'm not sure if you're implying that very small language models would be what runs in your Raspberry Pi example, but for use cases like the time-series one, wouldn't something like an LSTM or the TiDE architecture make more sense than a language model?

These are typically small and performant, both in compute and in accuracy/utility, from what I've seen.

I think with all the hype at the moment, AI/ML has sometimes become too synonymous with LLM.
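For a sense of the size gap being described, here is a minimal sketch of such a forecaster in PyTorch: a one-layer LSTM that maps a window of past values to the next value. The sizes are illustrative, and this is not the TiDE architecture, just the same "small and performant" class of model:

    import torch
    import torch.nn as nn

    class TinyForecaster(nn.Module):
        """One-layer LSTM: a window of past values in, next value out."""
        def __init__(self, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)

        def forward(self, x):              # x: (batch, window, 1)
            out, _ = self.lstm(x)
            return self.head(out[:, -1])   # predict the value after the window

    model = TinyForecaster()
    print(sum(p.numel() for p in model.parameters()))  # a few thousand params

A few thousand parameters against billions: several orders of magnitude, as the production anecdote further down the thread also suggests.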

greenavocado•8mo ago
He's talking about general-purpose zero-shot models.
antirez•8mo ago
Sure, if you have a specific need you can specialize some NN with the right architecture: collect the data, run the training several times, test the performance, ... Or: you can download an already-built LLM and write a prompt.
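The "write a prompt" path really can be that short. A minimal sketch, assuming a local Ollama server with a small model already pulled (the model name, readings, and phrasing here are hypothetical, not a tested recipe):

    import json
    import urllib.request

    prompt = (
        "Soil moisture over the last 6 hours: 41, 39, 36, 33, 31, 29 (percent). "
        "No rain forecast for the next 24 hours. Answer only WATER or WAIT."
    )
    body = json.dumps({"model": "qwen3:0.6b", "prompt": prompt, "stream": False})
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",   # Ollama's generate endpoint
        data=body.encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])  # e.g. "WAIT"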
galangalalgol•8mo ago
What zero shot would you suggest for that task on an rpi? A temporal fusion thing?
antirez•8mo ago
The small Gemma 3 and Qwen 3 models can do wonders for simple tasks, used as a bag of algorithms.
galangalalgol•8mo ago
Wouldn't those use more RAM than most RPis have? Gemma uses 4GB, right?
antirez•8mo ago
Nope, Gemma 3 and Qwen 3 come in many sizes, including very small ones that, quantized to 4 bits, can run on very small systems. Qwen3-0.6B, 1.7B, ... imagine quantizing those to 4 bits. But you also need space for the KV cache, if we don't want to limit runs to very small prompts.
nolist_policy•8mo ago
Gemma 3 4B QAT int4-quantized from bartowski should barely fit on a 4GB Raspberry Pi, but without the vision encoder.

However, the brand-new Gemma 3n E2B and E4B models might fit with vision.

antirez•8mo ago
Yep, the Gemma 3 1B would be 815MB, with enough margin for a longer prompt. Probably more realistic.
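The back-of-the-envelope sizing goes like this (round, illustrative shapes, not Gemma's exact ones):

    # 4-bit weights for a 1B model, plus an fp16 KV cache
    params = 1.0e9
    weights_mb = params * 0.5 / 2**20    # 4 bits = 0.5 bytes/param -> ~477 MB

    layers, kv_heads, head_dim, ctx = 24, 4, 128, 4096
    kv_bytes = 2 * layers * kv_heads * head_dim * ctx * 2  # K and V, 2 bytes each
    print(f"weights ~{weights_mb:.0f} MB, KV cache ~{kv_bytes / 2**20:.0f} MB")

Real GGUF files land higher than the raw 4-bit estimate (hence the 815MB figure above), because embeddings and some layers are kept at higher precision.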
mnahkies•8mo ago
So one of the use cases we're serving in production is predicting energy consumption for a home. Whilst I've not tried it, I'm very confident that giving an LLM the historical consumption and asking it to predict future consumption would underperform our forecasting model. Our model also requires several orders of magnitude less compute than an LLM.
layer8•8mo ago
For “very small”, I would add “can be passively cooled” as a criterion.
mnky9800n•8mo ago
Why in the world do you need such sophistication to know whether to water the plants or not?
kovezd•8mo ago
There are places where a) weather predictions are unreliable, and b) water is scarce. Just making the right decision about what hour to water adds up to a huge monthly water saving.
1over137•8mo ago
None of which need AI hype crap. Some humidity sensors, photosensors, etc. will do the job.
kovezd•8mo ago
Need is a very strong word. We don't need a lot of what we have today.

But as a hobbyist I would rather prompt an LLM than learn a bunch of algorithms and sensor-reading details. It's also very similar to how I would think about the problem myself, which makes it easier to debug.

thenthenthen•8mo ago
Or a farmer
onecommentman•8mo ago
In a greenhouse operation with high-value crops, perhaps. Automated control technologies in those applications have been around for decades, and AI is competing with today’s sophisticated control technology, designed, operated, and continually improved by agriculturists with detailed site-specific knowledge of water (quality, availability, etc.), cultivars, markets, disease pressures, and so on. The marginal improvements AI can make, given poor data quality and availability, an existing finely tuned and functioning control system, and the vagaries of managing dynamic living systems, are…tiny.

The solution for water-constrained operations in the Americas is to move to a location with more water, not AI.

For field crops in the Americas, land and water are too cheap and crop prices too low to be optimized with AI in the present era. The Americas (10% of world population) could meet 70% of world food demand if pressed with today’s technologies…40% without breaking a sweat. The Americas are blessed.

Talk to the Saudis, Israel, etc., but even there you will lose more production by interfering with the motivations, engagement levels, and cultures of working farmers than can be gained by optimizing with any complex, opaque technological scheme, AI or no. New cultivars, new chemicals, even new machinery…few problems (but see India for counterexamples). Changing millennia of farming practice with expensive, not-locally-maintainable, opaque technology…just no. A great truth learned over the last 70 years of development.

mnky9800n•8mo ago
I think there are two schools of thought. One: the models get so big that everyone everywhere uses them for everything, and the vendors make lots of money on API calls. Two: inference gets so computationally cheap that running models at the edge costs nothing, so an LLM ends up in everything. Then every computational device will have one, as long as you pay a license fee to the people who trained it.
ithkuil•8mo ago
Does it have to be computed at the edge by every person?
kovezd•8mo ago
As in the other comment, "have to" is a very strong word. But there are benefits to it: a) adaptability to local weather patterns, b) no access to Wi-Fi on large properties.
ithkuil•8mo ago
I see. I guess it all boils down to how low-power you can make this.

Keep in mind that there are other wireless communication systems, long-range and low-power ones, specifically designed to handle this scenario.

collingreen•8mo ago
When you have a golden hammer everything starts to look like a nail
dainiusse•8mo ago
this
amelius•8mo ago
In this case, "sophistication" meaning throwing insane amounts of compute power and data at the problem? In older times we'd probably call that "brute forcing".
hugh-avherald•8mo ago
Today, I asked a colleague to pass me a pen. Was that an egregiously simple task for such a powerful intelligence?
SkiFire13•8mo ago
> Example: here are some time-series parameters and instructions, decide whether to water the plants or not

How is that a "language model"?

tayo42•8mo ago
Is "language model" used here to mean a neural net with transformers and attention that takes in a series of tokens and outputs a prediction as a value?

Working with time series data would work in that case.

tough•8mo ago
https://github.com/google-research/timesfm
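A toy sketch of exactly that framing: quantize a series into discrete tokens and let a tiny transformer predict the next one. Illustrative only; this is not TimesFM's actual architecture:

    import torch
    import torch.nn as nn

    n_bins, dim = 64, 32

    class TinySeriesLM(nn.Module):
        def __init__(self):
            super().__init__()
            self.embed = nn.Embedding(n_bins, dim)
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(dim, n_bins)

        def forward(self, tokens):               # tokens: (batch, seq)
            h = self.encoder(self.embed(tokens))
            return self.head(h[:, -1])           # logits over the next bin

    series = torch.rand(1, 128)                  # values scaled to [0, 1)
    tokens = (series * n_bins).long()            # discretize into 64 bins
    print(TinySeriesLM()(tokens).argmax(-1))     # predicted next bin (untrained)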
oezi•8mo ago
What do we call the models beyond extra large, the ones so big they can't be served publicly because their inference cost is too high? Do such models exist?
lloydatkinson•8mo ago
> Example: here are some time-series parameters and instructions, decide whether to water the plants or not; vision models that can watch a camera and transcribe what they see in a basic way, ...

This is the problem I have with the general discourse around "AI", even on Hacker News of all places. Nothing you listed is an example of a *language model*.

All of those can be implemented as a simple "if", a decision tree, or a decision table, and finally with actual ML for the camera and time-series prediction examples.

Using an LLM is not just ridiculous here but totally the wrong fit and a waste of resources.

bdzr•8mo ago
> Using an LLM is not just ridiculous here but totally the wrong fit and a waste of resources.

Time and labor are resources too. There's a whole host of problems where "good enough" is tremendously valuable.

croes•8mo ago
How can a Large Language Model be a small language model?
kelseyfrog•8mo ago
Because words are arbitrary. See Saussure.
tialaramex•8mo ago
See also the Little Giant Girl who is part of The Sultan's Elephant and several other Royal de Luxe performances. She's clearly a little girl, but she's also clearly a giant.
baq•8mo ago
Why wouldn’t there be any? Right now there are large large language models, medium large language models and small large language models. You can say there are also tiny large language models and extra large large language models. Nothing confusing about it.
Dwedit•8mo ago
These terms are all relative, but there's also "BabyLlama", which measures its parameter count in millions rather than billions.
armcat•8mo ago
There is a "small language model", and then there is a "small LARGE language model". In late 2018, BERT (110 million params) would've been considered a "large" language model. A "small" LM would be some markov chain or a topic model (e.g. latent dirichlet allocation) - technically they would be considered generative language models since they learn joint distributions of params and data (words), and can then sample from that distribution. But today, we usually map "small" LMs to "small" LLMs, so in that sense a small LLM would be anything from BERT to around 3-4B params.
option•8mo ago
whatever fits into a gaming GPU such as a GeForce 3080
MiddleEndian•8mo ago
Just ask my ex-wife!
stephantul•8mo ago
This post is 100% rewritten or fully generated by GPT-4o. It has the GPT smell all over it.
gwern•8mo ago
> In a world chasing ever-bigger models, small ones are quietly doing more with less—and that's exactly what makes them powerful.

100%. It has enough technical details that maybe a human did something. But who knows.

maksimur•8mo ago
Is there a problem with that? If so, what is it? I don't mind as long as it's not the boilerplate AI spits out by default.
stephantul•8mo ago
Nah not really, the information content is what counts of course. It’s just a bit cringe to see it happen.
rickstanley•8mo ago
On this topic, I've been wondering whether models are capable of recommending other models for a given machine spec. For example: which model, if any, would be recommended for a laptop with a Ryzen 9 6000S and an RTX 3060m (random spec)?
breckinloggins•8mo ago
Maybe we should appropriate the old DOS/x86 memory model names and give them “class-relative” sizes.

“tiny” can run on a microcontroller, “compact” on a Raspberry Pi, “small” on a phone, “medium” on a single-GPU machine, “large” on AI-class workstation hardware, and “huge” on a data center cluster.

GolDDranks•8mo ago
A traditional Markov model trained (rather, just "fitted") on tokens or words is a small language model.
GolDDranks•8mo ago
(To share a recent personal experience with Markov models: I recently bootstrapped an HMM with hand-assigned weights. It was around 15x15 class transitions, 225 weights. That's small. Or rather, microscopic. Then I ran it against real data, picked out examples of wrong classifications, and made them auxiliary training data. Of course, it was not a language model; a language model is impossible to fit in such a small space. It was a model of transitions between chapter "types" in novels, where the types are things like "Epilogue", "Prologue", "Chapter 23", "Table of Contents", "Afterword", etc.)
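For concreteness, the entire "fitting" step for such a model is just counting transitions. A minimal bigram sketch:

    import random
    from collections import Counter, defaultdict

    words = "the cat sat on the mat and the cat ran".split()

    transitions = defaultdict(Counter)           # word -> next-word counts
    for a, b in zip(words, words[1:]):
        transitions[a][b] += 1

    def generate(start, n=6):
        out = [start]
        for _ in range(n):
            nxt = transitions.get(out[-1])
            if not nxt:
                break
            choices, weights = zip(*nxt.items())
            out.append(random.choices(choices, weights=weights)[0])
        return " ".join(out)

    print(generate("the"))                       # e.g. "the cat sat on the mat and"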
mcswell•8mo ago
> Small models used to mean tiny. Now they mean "runs without drama."

Does this mean without a dedicated electric power plant?

I wanted to say "Right, big-sized. Do you want fries with that?", but I couldn't figure out how to work that in, so I won't say it.

Havoc•8mo ago
It’s always been a little arbitrary. “Can it fit on a 3090?” seems like a reasonable cutoff to me for now.