
Why domain specific LLMs won't exist: an intuition

https://simianwords.bearblog.dev/why-domain-specific-llms-wont-exist-an-intuition/
4•simianwords•3h ago

Comments

scrpgil•3h ago
The author assumes specialization only happens at the model layer. But there's a third option: general model + specialized context.

I built an MCP server that feeds a user's real schedule, tasks, and goals into Claude/ChatGPT. The model isn't specialized — but the output is, because the context is. No fine-tuning, no domain-specific training. Just structured data at inference time.

Domain-specific LLMs won't exist not because specialization is useless, but because it's cheaper to specialize the input than the model.
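A minimal sketch of the pattern described above (all names hypothetical; a real MCP server would fetch live data from a calendar or task tracker): specialize the input, not the model, by serializing structured user data into the prompt at inference time.

```python
import json

def build_specialized_prompt(context: dict, question: str) -> str:
    """Render structured user data (schedule, tasks, goals) into a
    plain-text prompt for a general-purpose model."""
    blob = json.dumps(context, indent=2, sort_keys=True)
    return (
        "Structured context (machine-generated):\n"
        f"{blob}\n\n"
        f"User question: {question}"
    )

# Toy context standing in for what an MCP server would supply live.
ctx = {
    "schedule": [{"time": "09:00", "event": "standup"}],
    "tasks": ["ship MCP server", "write docs"],
}
prompt = build_specialized_prompt(ctx, "What should I focus on this morning?")
```

The model stays fully general; only the prompt carries the domain.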

nickpsecurity•2h ago
We're already using domain-specific LLMs. The only lawfully trained LLM I know of, KL3M, is also domain-specific. So the title is already wrong.

https://www.kl3m.ai/

The author is correct that intelligence is compounding. That's why domain-specific models are usually general models converted to domain-specific ones by continued pretraining. Even general models, like H2O's, have been improved by constraining them to domain-supporting, general knowledge in a second phase of pretraining. But they're eventually domain-specific.

Outside LLMs, I think most models are domain-specific: genetics, stock prices, ECG/EKG scans, transmission shifting, seismic, climate, etc. LLMs trying to do everything are an exception to the rule that most ML is domain-specific.

simianwords•2h ago
> We're already using domain-specific LLMs. The only lawfully trained LLM I know of, KL3M, is also domain-specific. So the title is already wrong.

This looks like an "ethical" LLM, but not a domain-specific one. What is the domain here?

> That's why domain-specific models are usually general models converted to domain-specific models by continued pretraining

I've also wondered about this, as with the Codex models. My hunch is that a good general model trumps a continued-pretrained one given just an appropriate system prompt, which is why even OpenAI sort of recommends using GPT-5.4 over any Codex model.
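What "specialize via system prompt" means in practice can be sketched as a chat-completions-style message list (the domain instruction text here is purely illustrative):

```python
def domain_messages(domain_prompt: str, user_msg: str) -> list[dict]:
    """Steer a general model toward a domain with a system message
    instead of continued pretraining."""
    return [
        {"role": "system", "content": domain_prompt},
        {"role": "user", "content": user_msg},
    ]

msgs = domain_messages(
    "You are a coding assistant. Prefer minimal, well-tested diffs.",
    "Refactor this function to be iterative.",
)
```

Swapping the system message swaps the "domain" with zero training cost, which is the economic argument against dedicated domain models.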

teleforce•2h ago
>Why domain specific LLMs won’t exist: an intuition

>We would have a healthcare model, economics model, mathematics model, coding model and so on.

It's not a question of whether there will ever be specialized models; it's a matter of when.

This will democratize almost all work and professions, including programmers, architects, lawyers, engineers, medical doctors, etc.

Glass-half-empty people will call this a catastrophe of machines replacing humans. Glass-half-full people will say it's good for society and humanity, making work more efficient, faster, and much cheaper.

Imagine: instead of waiting a few months for your CVD diagnostic procedure due to the worldwide shortage of cardiologists, AI/LLM-assisted diagnostics with an expert cardiologist in the loop would probably take only a few days, provided the sensitivity is high enough.

It's a win-win for patients, medical doctors, and hospitals. It would lead to earlier detection of CVDs, and hence fewer complications and less suffering, whether acute or chronic.

At one extreme, foundation models are generic by nature, trained on HPC clusters with GPUs/TPUs inside AI data centers.

The other extreme is RAG, with vector databases and file systems for context prompting, as sibling comments mention.
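The RAG extreme can be sketched in a few lines: embed documents, then pick the one nearest the query by cosine similarity. The toy 3-dimensional vectors below stand in for real embedding-model output; a production system would use a vector database instead of a list.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, corpus):
    """corpus: list of (text, embedding) pairs.
    Return the text whose embedding is most similar to the query."""
    return max(corpus, key=lambda item: cosine(query_vec, item[1]))[0]

docs = [
    ("cardiology guidelines", [0.9, 0.1, 0.0]),
    ("tax law summary",       [0.0, 0.2, 0.9]),
]
best = retrieve([1.0, 0.0, 0.1], docs)  # query vector close to the first doc
```

The retrieved text is then prepended to the prompt, so the general model answers from domain material it was never trained on.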

The best trade-off, the Goldilocks option, is model fine-tuning. Specifically, the promising self-distillation fine-tuning (SDFT) recently proposed by MIT and ETH Zurich [1], [2]. Unlike conventional supervised fine-tuning (SFT), which suffers from forgetting, SDFT is not forgetful, which makes fine-tuning practical rather than wasteful. The SDFT fine-tuning process used only 4x H200 GPUs.

Apple is reporting the same with their embarrassingly simple self-distillation (SSD) for LLM coding specialization [3], [4]. They used 8x B200 GPUs for model fine-tuning, which any company can afford for local fine-tuning of the open-weight LLMs available from Google, Meta, Nvidia, OpenAI, DeepSeek, etc.

[1] Self-Distillation Enables Continual Learning:

https://arxiv.org/abs/2601.19897

[2] Self-Distillation Enables Continual Learning:

https://self-distillation.github.io/SDFT.html

[3] Embarrassingly simple self-distillation improves code generation:

https://arxiv.org/abs/2604.01193

[4] Embarrassingly simple self-distillation improves code generation (185 comments):

https://news.ycombinator.com/item?id=47637757

Ask HN: Folks with disabilities, what is it like in this LLM/scraper age?

1•eventualcomp•1m ago•0 comments

Andy Weir Apologizes to 'Star Trek' for Calling Shows 'S–': 'Trying to Be Funny'

https://variety.com/2026/tv/news/andy-weir-apologizes-star-trek-1236702791/
1•randycupertino•2m ago•0 comments

China Creates New Aviation Mystery with Offshore Warning Zones

https://www.wsj.com/world/china/china-creates-new-aviation-mystery-with-offshore-warning-zones-12...
1•mudil•2m ago•0 comments

UFO Time Travel Physics and the Nature of Consciousness

1•uncanny2•5m ago•0 comments

Paste your writing, see which sentences lose readers

https://app.manuscript.no/try
1•issaafk•7m ago•0 comments

Gemma 4 Uncensored (autoresearch results)

https://huggingface.co/collections/TrevorJS/gemma-4-uncensored
3•adefa•9m ago•1 comments

Injectable peptides touted as new fountain of youth. But the science isn't there

https://www.cbc.ca/lite/story/9.7151279
1•colinprince•11m ago•0 comments

Hitachi Ltd, Part I – By Bradford Morgan White

https://www.abortretry.fail/p/hitachi-ltd-part-i
1•rbanffy•12m ago•0 comments

Ask HN: Where are all the disruptive software that AI promised?

6•p-o•15m ago•1 comments

China Built the World's Drone Industry. Now It's Locking Down the Skies.

https://www.nytimes.com/2026/04/05/world/asia/china-drone-regulations.html
2•bookofjoe•17m ago•1 comments

Is All Software Converging

https://jry.io/writing/is-all-software-converging/
2•jryio•17m ago•0 comments

IRL Streaming Map

https://googlemapsmania.blogspot.com/2026/03/the-irl-streaming-map.html
2•gnabgib•18m ago•1 comments

AI agents pay USDC for API data via x402 micropayments – no API keys

https://x402.aigregator.com
3•ybonda•21m ago•0 comments

Tuple for Linux

https://tuple.app/linux/
1•kitallis•23m ago•0 comments

A Textual widget for beautiful diffs in the terminal

https://github.com/batrachianai/textual-diff-view
1•willm•25m ago•0 comments

Why Over-Engineering Happens

https://yusufaytas.com/why-over-engineering-happens/
12•zuhayeer•26m ago•1 comments

PS3 emulator makes Cell CPU breakthrough that improves performance in all games

https://www.tomshardware.com/video-games/playstation/rpcs3-ps3-emulator-gets-cell-cpu-breakthroug...
4•gloxkiqcza•28m ago•0 comments

Do you remember usability testing?

https://www.userium.com/
1•calmnordic•30m ago•0 comments

Agent Governance Toolkit: Open-source runtime security for AI agents

https://opensource.microsoft.com/blog/2026/04/02/introducing-the-agent-governance-toolkit-open-so...
1•tcbrah•30m ago•0 comments

The Melanesian: Dark-skinned people with blonde hair region of Oceania

https://guardian.ng/life/the-melanesian-dark-skinned-people-with-blonde-hair/
4•thunderbong•33m ago•0 comments

OpenNMC is an open network management card platform for APC SmartSlot UPS units

https://gitlab.com/netcube-systems-austria/opennmc
3•zdw•34m ago•0 comments

A all CLIs tokens and context reducer by 97%

https://www.squeezr.es/
1•sergioramosv•35m ago•1 comments

How we feel about AI (2025)

https://goauthentik.io/blog/2025-12-10-how-we-really-feel-about-ai/
1•walterbell•39m ago•0 comments

Show HN: Gecit – DPI bypass using eBPF sock_ops, no proxy or VPN

https://github.com/boratanrikulu/gecit
3•boratanrikulu•40m ago•0 comments

How to Get Better at Guitar

https://www.jakeworth.com/posts/how-to-get-better-at-guitar/
1•jwworth•41m ago•0 comments

Iran internet blackout now longest nation-scale shutdown on record

https://mastodon.social/@netblocks/116350984373909468
2•ukblewis•44m ago•0 comments

Show HN: Stablemount, a response to EmDash, a prototype for a future CMS

https://github.com/jhyolm/stablemount
2•jhyolm•44m ago•1 comments

Watch 'S4 – The Bob Lazar Story' online: Here's where to watch the UFO doc

https://www.tomsguide.com/entertainment/streaming/watch-s4-the-bob-lazar-story-online
2•evo_9•46m ago•0 comments

Show HN: YardSard – Inventory Management

https://apps.apple.com/us/app/yardsard/id6759114903
2•prithsr•51m ago•0 comments

Show HN: Imladri – Cryptographic enforcement and semantic monitoring for your AI

https://imladri.com/
3•osama872•53m ago•0 comments