Show HN: Mermaid Formatter – CLI and library to auto-format Mermaid diagrams

https://github.com/chenyanchen/mermaid-formatter
1•astm•14m ago•0 comments

RFCs vs. READMEs: The Evolution of Protocols

https://h3manth.com/scribe/rfcs-vs-readmes/
2•init0•20m ago•1 comment

Kanchipuram Saris and Thinking Machines

https://altermag.com/articles/kanchipuram-saris-and-thinking-machines
1•trojanalert•20m ago•0 comments

Chinese chemical supplier causes global baby formula recall

https://www.reuters.com/business/healthcare-pharmaceuticals/nestle-widens-french-infant-formula-r...
1•fkdk•23m ago•0 comments

I've used AI to write 100% of my code for a year as an engineer

https://old.reddit.com/r/ClaudeCode/comments/1qxvobt/ive_used_ai_to_write_100_of_my_code_for_1_ye...
1•ukuina•26m ago•1 comment

Looking for 4 Autistic Co-Founders for AI Startup (Equity-Based)

1•au-ai-aisl•36m ago•1 comment

AI-native capabilities, a new API Catalog, and updated plans and pricing

https://blog.postman.com/new-capabilities-march-2026/
1•thunderbong•36m ago•0 comments

What changed in tech from 2010 to 2020?

https://www.tedsanders.com/what-changed-in-tech-from-2010-to-2020/
2•endorphine•41m ago•0 comments

From Human Ergonomics to Agent Ergonomics

https://wesmckinney.com/blog/agent-ergonomics/
1•Anon84•45m ago•0 comments

Advanced Inertial Reference Sphere

https://en.wikipedia.org/wiki/Advanced_Inertial_Reference_Sphere
1•cyanf•46m ago•0 comments

Toyota Developing a Console-Grade, Open-Source Game Engine with Flutter and Dart

https://www.phoronix.com/news/Fluorite-Toyota-Game-Engine
1•computer23•49m ago•0 comments

Typing for Love or Money: The Hidden Labor Behind Modern Literary Masterpieces

https://publicdomainreview.org/essay/typing-for-love-or-money/
1•prismatic•49m ago•0 comments

Show HN: A longitudinal health record built from fragmented medical data

https://myaether.live
1•takmak007•52m ago•0 comments

CoreWeave's $30B Bet on GPU Market Infrastructure

https://davefriedman.substack.com/p/coreweaves-30-billion-bet-on-gpu
1•gmays•1h ago•0 comments

Creating and Hosting a Static Website on Cloudflare for Free

https://benjaminsmallwood.com/blog/creating-and-hosting-a-static-website-on-cloudflare-for-free/
1•bensmallwood•1h ago•1 comment

"The Stanford scam proves America is becoming a nation of grifters"

https://www.thetimes.com/us/news-today/article/students-stanford-grifters-ivy-league-w2g5z768z
3•cwwc•1h ago•0 comments

Elon Musk on Space GPUs, AI, Optimus, and His Manufacturing Method

https://cheekypint.substack.com/p/elon-musk-on-space-gpus-ai-optimus
2•simonebrunozzi•1h ago•0 comments

X (Twitter) is back with a new X API Pay-Per-Use model

https://developer.x.com/
3•eeko_systems•1h ago•0 comments

Zlob.h 100% POSIX and glibc compatible globbing lib that is faster and better

https://github.com/dmtrKovalenko/zlob
3•neogoose•1h ago•1 comment

Show HN: Deterministic signal triangulation using a fixed .72% variance constant

https://github.com/mabrucker85-prog/Project_Lance_Core
2•mav5431•1h ago•1 comment

Scientists Discover Levitating Time Crystals You Can Hold, Defy Newton’s 3rd Law

https://phys.org/news/2026-02-scientists-levitating-crystals.html
3•sizzle•1h ago•0 comments

When Michelangelo Met Titian

https://www.wsj.com/arts-culture/books/michelangelo-titian-review-the-renaissances-odd-couple-e34...
1•keiferski•1h ago•0 comments

Solving NYT Pips with DLX

https://github.com/DonoG/NYTPips4Processing
1•impossiblecode•1h ago•1 comment

Baldur's Gate to be turned into TV series – without the game's developers

https://www.bbc.com/news/articles/c24g457y534o
3•vunderba•1h ago•0 comments

Interview with 'Just use a VPS' bro (OpenClaw version) [video]

https://www.youtube.com/watch?v=40SnEd1RWUU
2•dangtony98•1h ago•0 comments

EchoJEPA: Latent Predictive Foundation Model for Echocardiography

https://github.com/bowang-lab/EchoJEPA
1•euvin•1h ago•0 comments

Disabling Go Telemetry

https://go.dev/doc/telemetry
1•1vuio0pswjnm7•1h ago•0 comments

Effective Nihilism

https://www.effectivenihilism.org/
1•abetusk•1h ago•1 comment

The UK government didn't want you to see this report on ecosystem collapse

https://www.theguardian.com/commentisfree/2026/jan/27/uk-government-report-ecosystem-collapse-foi...
5•pabs3•1h ago•0 comments

No 10 blocks report on impact of rainforest collapse on food prices

https://www.thetimes.com/uk/environment/article/no-10-blocks-report-on-impact-of-rainforest-colla...
3•pabs3•1h ago•0 comments

DeepSeek-v3.1-Terminus

https://api-docs.deepseek.com/news/news250922
101•meetpateltech•4mo ago

Comments

sbinnee•4mo ago
> What’s improved? Language consistency: fewer CN/EN mix-ups & no more random chars.

It's good that they made this improvement. But are there any advantages at this point to using DeepSeek over Qwen?

IgorPartola•4mo ago
I wish there were some easy resource to keep up with the latest models. The best I have come up with so far is asking one model to research the others. Realistically, I want to know the latest versions, best use cases, performance (in terms of speed) relative to some baseline, and the hardware requirements to run them.
exe34•4mo ago
> asking one model to research the others.

that's basically choosing at random with extra steps!

throwup238•4mo ago
Research, not spit out the answer based on weights. Just ask Gemini/Claude to do deep research on /r/LocalLLama and HN posts.
Jgoauh•4mo ago
have you tried https://artificialanalysis.ai/
JimDugan•4mo ago
Dumb collation of benchmarks that the big labs are essentially training on. Livebench.ai is the industry standard - non-contaminated, new questions every few months.
IgorPartola•4mo ago
Thanks! Are the scores in some way linear here? As in, if model A is rated at 25 and model B at 50, does that mean I will have half the mistakes with model B? Get answers that are 2x more accurate? Or is it subjective?
esafak•4mo ago
I believe the score represents the fraction of correct answers, so yes.
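
As a sketch of the arithmetic, assuming the score really is the fraction of correct answers: going from 25 to 50 doubles the correct answers, but the error rate falls from 75% to 50%, not by half.

    # If score = fraction of correct answers:
    score_a, score_b = 0.25, 0.50
    err_a, err_b = 1 - score_a, 1 - score_b
    print(err_a / err_b)  # 1.5 -> model A makes 1.5x B's mistakes, not 2x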
alexeiz•4mo ago
It says the best "coding index" is held by Grok 4 and Gemini 2.5 Pro. Give me a break. Nobody uses those models for serious coding. It's dominated by Sonnet 4/Opus 4.1 and GPT-5.
__mharrison__•4mo ago
I use Aider heavily and find their benchmark to be pretty good. It is updated relatively frequently (a month ago, which may be an eternity in AI time).

https://aider.chat/docs/leaderboards/

comrade1234•4mo ago
MIT license that lets you run it on your own hardware and make money off of it.
coder543•4mo ago
Qwen3 models (including their 235B and 480B models) use the Apache-2.0 license, so it’s not like that’s a big difference here.
coder543•4mo ago
They seem fairly competitive with each other. You would have to benchmark them for your specific use case.
twotwotwo•4mo ago
The fast Cerebras thing got me to try the Qwen3 models. I couldn't get them working all that well: they had trouble using the required output format and following instructions. On the other hand, benchmarks say they should be great, and it sounds like maybe some people use them OK via different tools.

I'm curious if my experience was unusual (it very much could be!) and I'd be interested to hear from anyone who's used both.

yu3zhou4•4mo ago
I see no article at the link, just a "news250922" header with some layout
meetpateltech•4mo ago
It’s up again, check it.

Twitter/X post link: https://twitter.com/deepseek_ai/status/1970117808035074215

Also Hugging Face model link: https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus

bratao•4mo ago
The link is off. This link works: https://api-docs.deepseek.com/updates#deepseek-v31-terminus
esafak•4mo ago
Notable performance improvement in agentic tool use: https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus

The DeepSeek provider may train on your prompts: https://openrouter.ai/deepseek/deepseek-v3.1-terminus
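
If the training-on-prompts point matters to you, OpenRouter's provider routing preferences can exclude such providers. A minimal sketch, assuming the documented data_collection preference and the model slug from the link above:

    import requests

    # Ask OpenRouter to route only to providers that do not
    # collect/train on prompts.
    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": "Bearer <OPENROUTER_API_KEY>"},
        json={
            "model": "deepseek/deepseek-v3.1-terminus",
            "messages": [{"role": "user", "content": "Hello"}],
            "provider": {"data_collection": "deny"},
        },
    )
    print(resp.json()["choices"][0]["message"]["content"])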

storus•4mo ago
I tried V3.1, but it was driving me crazy by ignoring parts of user input, which R1 never did. I had many such instances: e.g., when asking about running DeepSeek 671B, it would instead answer about DeepSeek 67B, reasoning that 671B is too large to exist and I must have made a mistake. I concluded that despite benchmarking better than R1, it was essentially useless due to this characteristic, and I started using R1 at OpenRouter instead. Not sure why deepseek.com removed R1 and left only V3.1 without any ability to switch back; I guess it's cheaper to run.
Grimblewald•4mo ago
Matches my experience in general as well. I find benchmarks largely useless for comparing current models. Many, despite improved metrics, are strictly worse than their predecessors. What little gains they show in some areas, like agentic use here, are often offset by far broader and often catastrophic losses.
binary132•4mo ago
sure would be neat if these companies would release models that could run on consumer hardware
edude03•4mo ago
So there are two ways to look at this - both hinge on how you define "consumer":

1) We haven't managed to distill models enough to get good enough performance to fit in the typical gaming desktop (say, 7B-24B class models). Even then though - most consumers don't have high-end desktops, so even a 3060-class GPU requirement would exclude a lot of people.

2) Nothing is stopping you/anyone from buying 24ish 5090s (a consumer hardware product) to get the required ~600GB-1TB of VRAM to run unquantized DeepSeek except time/money/know-how. Sure, it's unreasonably expensive, but it's not like the labs are conspiring to prevent people from running these models; it's just expensive for everyone, and the common person doesn't have the funding to get into it.

regularfry•4mo ago
> 1) We haven't managed to distill models enough to get good enough performance to fit in the typical gaming desktop (say, 7B-24B class models).

That really depends on what "good enough" means. Qwen3-30b runs absolutely fine at q4 on a 24GB card, although that's also stretching "typical gaming desktop". It's competent as a code completion or aider-type coding agent model in that scenario.

But really we need both. Yes it would be nice to have things targeted to our own particular niche, but there are only so many labs cranking these things out. Small models will only get better from here.
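
Back-of-the-envelope weight memory for the sizes in this subthread (a sketch that counts weights only; KV cache, activations, and runtime overhead come on top):

    def weight_gb(params_billion, bits_per_weight):
        # Approximate footprint of the weights alone, in GB.
        return params_billion * bits_per_weight / 8

    print(weight_gb(671, 8))  # ~671 GB: DeepSeek at native FP8 -> ~24x 32GB 5090s
    print(weight_gb(30, 4))   # ~15 GB: Qwen3-30B at q4 fits a 24GB card with headroom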

__mharrison__•4mo ago
I'm using Qwen3-Next on my MBP. It uses around 42GB of memory and, according to Aider benchmarks, has similar performance to GPT-4.1.

https://huggingface.co/mlx-community/Qwen3-Next-80B-A3B-Inst...
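
For reference, a minimal mlx-lm sketch of that setup (load/generate are mlx-lm's standard entry points, though exact keyword arguments vary by version; the repo id below is a placeholder since the link above is truncated):

    from mlx_lm import load, generate

    # Substitute the actual mlx-community Qwen3-Next conversion here.
    model, tokenizer = load("mlx-community/<qwen3-next-repo>")
    print(generate(model, tokenizer,
                   prompt="Write a binary search in Python.",
                   max_tokens=256))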

binary132•4mo ago
Just waiting on llama.cpp support :)

I usually use GPT-oss-120B with CPU MoE offloading. It writes at about 10 tps, which is useful enough for the limited things I use it for. But I'm curious how Qwen3-Next will work (or whether I'll be able to offload and run it with GPU acceleration at all).

(4090)
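
For scale, a rough bandwidth estimate for that setup (a sketch assuming gpt-oss-120b's published ~5.1B active parameters and ~4.25-bit MXFP4 weights; actual decode speed also depends on the KV cache and which experts stay on the GPU):

    # Each generated token reads roughly the active parameters once.
    active_params = 5.1e9        # gpt-oss-120b active params per token
    bytes_per_weight = 4.25 / 8  # MXFP4: 4-bit values + shared scales
    gb_per_token = active_params * bytes_per_weight / 1e9
    print(gb_per_token)          # ~2.7 GB read per token
    print(10 * gb_per_token)     # ~27 GB/s sustained for 10 tps

That is within reach of desktop DDR5 bandwidth, which is consistent with the ~10 tps reported above.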

qwertytyyuu•4mo ago
Seems it saturated really quickly, huh? Less than one year.
twotwotwo•4mo ago
Interesting--I'd seen Chinese characters surprise-inserted when it was just repeating back input with one provider, but not others. (I'd also occasionally seen tokens surprise-translated to Chinese.)

There's a GitHub bug about it that leads to more discussion here: https://github.com/deepseek-ai/DeepSeek-V3/issues/849

Good to see a fix and that it goes with some benchmark gains!

nojs•4mo ago
The language mix-up thing seems to be an issue across all LLMs: as soon as you put some Chinese in the prompt, they will often randomly respond in Chinese.

Also, given a partly Chinese prompt, Qwen will sometimes run its whole thinking trace in Chinese, which anecdotally seems to perform slightly worse for the same prompt versus an English thinking trace.