
Low-Level Optimization with Zig

https://alloc.dev/2025/06/07/zig_optimization
108•Retro_Dev•4h ago•28 comments

The FAIR Package Manager: Decentralized WordPress infrastructure

https://joost.blog/path-forward-for-wordpress/
107•twapi•6h ago•24 comments

Researchers develop ‘transparent paper’ as alternative to plastics

https://japannews.yomiuri.co.jp/science-nature/technology/20250605-259501/
285•anigbrowl•13h ago•143 comments

The time bomb in the tax code that's fueling mass tech layoffs

https://qz.com/tech-layoffs-tax-code-trump-section-174-microsoft-meta-1851783502
900•booleanbetrayal•2d ago•562 comments

Falsehoods programmers believe about aviation

https://flightaware.engineering/falsehoods-programmers-believe-about-aviation/
282•cratermoon•13h ago•109 comments

How we decreased GitLab repo backup times from 48 hours to 41 minutes

https://about.gitlab.com/blog/2025/06/05/how-we-decreased-gitlab-repo-backup-times-from-48-hours-to-41-minutes/
437•immortaljoe•19h ago•184 comments

A year of funded FreeBSD development

https://www.daemonology.net/blog/2025-06-06-A-year-of-funded-FreeBSD.html
280•cperciva•15h ago•81 comments

Why are smokestacks so tall?

https://practical.engineering/blog/2025/6/3/why-are-smokestacks-so-tall
82•azeemba•10h ago•20 comments

Sharing everything I could understand about gradient noise

https://blog.pkh.me/p/42-sharing-everything-i-could-understand-about-gradient-noise.html
73•ux•20h ago•3 comments

The Illusion of Thinking: Understanding the Limitations of Reasoning LLMs [pdf]

https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf
205•amrrs•17h ago•108 comments

Highly efficient matrix transpose in Mojo

https://veitner.bearblog.dev/highly-efficient-matrix-transpose-in-mojo/
106•timmyd•16h ago•36 comments

Medieval Africans had a unique process for purifying gold with glass (2019)

https://www.atlasobscura.com/articles/medieval-african-gold
98•mooreds•13h ago•51 comments

NASA delays next flight of Boeing's alternative to SpaceX Dragon

https://theedgemalaysia.com/node/758199
31•bookmtn•8h ago•21 comments

Ziina (YC W21), the Series A fintech, is hiring product engineers

https://ziina.notion.site/Senior-Backend-Engineer-8b6642ec52ac45869656c135e07c6e86
1•faisaltoukan•4h ago

Sandia turns on brain-like storage-free supercomputer

https://blocksandfiles.com/2025/06/06/sandia-turns-on-brain-like-storage-free-supercomputer/
177•rbanffy•20h ago•67 comments

I Read All of Cloudflare's Claude-Generated Commits

https://www.maxemitchell.com/writings/i-read-all-of-cloudflares-claude-generated-commits/
134•maxemitchell•12h ago•98 comments

Getting Past Procrastination

https://spectrum.ieee.org/getting-past-procastination
124•WaitWaitWha•8h ago•55 comments

A masochist's guide to web development

https://sebastiano.tronto.net/blog/2025-06-06-webdev/
223•sebtron•21h ago•29 comments

Show HN: AI game animation sprite generator

https://www.godmodeai.cloud/ai-sprite-generator
89•lyogavin•16h ago•68 comments

Odyc.js – A tiny JavaScript library for narrative games

https://odyc.dev
214•achtaitaipai•21h ago•49 comments

Reverse Engineering Cursor's LLM Client

https://www.tensorzero.com/blog/reverse-engineering-cursors-llm-client/
20•paulwarren•8h ago•1 comment

Smalltalk, Haskell and Lisp

https://storytotell.org/smalltalk-haskell-and-lisp
88•todsacerdoti•14h ago•37 comments

Workhorse LLMs: Why Open Source Models Dominate Closed Source for Batch Tasks

https://sutro.sh/blog/workhorse-llms-why-open-source-models-win-for-batch-tasks
68•cmogni1•16h ago•16 comments

Wendelstein 7-X sets new fusion record

https://www.heise.de/en/news/Wendelstein-7-X-sets-new-fusion-record-10422955.html
152•doener•4d ago•26 comments

Too Many Open Files

https://mattrighetti.com/2025/06/04/too-many-files-open
125•furkansahin•20h ago•98 comments

Curate your shell history

https://esham.io/2025/05/shell-history
121•todsacerdoti•21h ago•69 comments

What you need to know about EMP weapons

https://www.aardvark.co.nz/daily/2025/0606.shtml
135•flyingkiwi44•1d ago•161 comments

Series C and scale

https://www.cursor.com/en/blog/series-c
78•fidotron•18h ago•54 comments

Meta: Shut down your invasive AI Discover feed

https://www.mozillafoundation.org/en/campaigns/meta-shut-down-your-invasive-ai-discover-feed-now/
488•speckx•20h ago•210 comments

Weaponizing Dependabot: Pwn Request at its finest

https://boostsecurity.io/blog/weaponizing-dependabot-pwn-request-at-its-finest
98•chha•1d ago•46 comments

Workhorse LLMs: Why Open Source Models Dominate Closed Source for Batch Tasks

https://sutro.sh/blog/workhorse-llms-why-open-source-models-win-for-batch-tasks
68•cmogni1•16h ago

Comments

ramesh31•14h ago
Flash is just so obscenely cheap at this point that it's hard to justify the headache of self-hosting, though. Really only applies to sensitive data, IMO.
behnamoh•13h ago
You're getting downvoted, but what you said is true. The cost of self-hosting (and achieving 70+ tok/sec consistently across the entire context window) has never been low enough to make open source a viable competitor to the proprietary models from OpenAI, Google, and Anthropic.
grepfru_it•11h ago
I'm curious: why the need for 70 tok/sec?
Aeolun•10h ago
Waiting minutes for your call to succeed is too frustrating?
ekianjo•6h ago
Depends entirely on the use case. Not every LLM workflow is a chatbot
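
To put numbers on the 70 tok/sec question, a quick back-of-the-envelope (Python; the token counts and decode speeds are illustrative assumptions):

  # Wall-clock time to generate a completion at different decode speeds.
  for tokens in (500, 2_000, 8_000):        # assumed response sizes
      for tok_per_sec in (70, 20, 5):       # hosted vs. modest local speeds
          secs = tokens / tok_per_sec
          print(f"{tokens:>5} tokens @ {tok_per_sec:>2} tok/s -> {secs / 60:5.1f} min")

At 70 tok/s a 2,000-token reply streams in under half a minute; at 5 tok/s the same reply takes nearly seven minutes. Fine for overnight batch work, painful for anything interactive.
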
jacob019•13h ago
That's true for Flash 2.0 at $0.40/Mtok output. GPT-4.1-nano is the same price and also surprisingly capable. I can spend real money with 2.5 Flash, with those $3.50/Mtok thinking tokens; worth it, though. OP is an inference provider, so there may be some bias.

Open source can't compete on context length either; nothing touches 2.5 Flash for the price with long context. I've experimented with this a lot for my agentic pricing system. Open source models are improving, but they aren't really any cheaper right now. R1, for example, does quite well performance-wise, but it uses a LOT of tokens to get there, further eating into its shorter context window.

There's still value in the open source models: each model has unique strengths and they're advancing quickly. But the frontier labs are moving fast too, and they have very compelling "workhorse" offers.
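
For a concrete sense of the price gap jacob019 is weighing, a back-of-the-envelope cost comparison at the $/Mtok figures quoted above (Python; the 50M-token batch size is a hypothetical workload):

  # Cost of a hypothetical batch job at the output prices quoted in this thread.
  PRICES_USD_PER_MTOK = {
      "gemini-2.0-flash": 0.40,            # quoted above
      "gpt-4.1-nano": 0.40,                # quoted above
      "gemini-2.5-flash-thinking": 3.50,   # thinking-token price quoted above
  }
  output_tokens = 50_000_000  # assumed 50M-token batch job

  for model, price in PRICES_USD_PER_MTOK.items():
      print(f"{model:<28} ${output_tokens / 1e6 * price:,.2f}")

At $20 per 50M output tokens on the cheap tiers, a dedicated GPU box takes a long time to pay for itself, which is the crux of ramesh31's point above.
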
mkl•11h ago
With tools like Ollama, self-hosting is easier than hosted: no sign-up, no API keys, no permission to spend money, no worries about data security; just an easy install, then import a Python library. Qwen2.5-VL 7B is proving useful even on a work laptop with insufficient VRAM. I just leave it running overnight or over a weekend, and it's saving me dozens of hours of work (that I then get to spend on other, higher-value work).
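
For reference, the workflow mkl describes looks roughly like this with the `ollama` Python library (a minimal sketch; the `qwen2.5vl:7b` model tag and the image path are assumptions, so check `ollama list` or the registry for the exact tag):

  import ollama  # pip install ollama; assumes a local Ollama server is running

  # One-time download of the model weights (tag is an assumption)
  ollama.pull("qwen2.5vl:7b")

  # Vision-language request: messages can attach images via the `images` field
  response = ollama.chat(
      model="qwen2.5vl:7b",
      messages=[{
          "role": "user",
          "content": "Transcribe the table in this scan as CSV.",
          "images": ["scan_001.png"],  # hypothetical input file
      }],
  )
  print(response["message"]["content"])
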
mgraczyk•11h ago
It does not take dozens of hours to get an API key for Gemini.
mkl•10h ago
I never claimed that it did. Gemini would probably save me the same dozens of hours, but it comes with ongoing costs and additional startup hurdles (some near insurmountable in my organisation, like data security for some of what I'm doing).
shmoogy•10h ago
Gemini Flash or any free LLM on OpenRouter would be orders of magnitude faster and effectively free. Unless you're concerned about the privacy of the conversation, it's really just about being able to say you did it locally.

I definitely do appreciate and believe in the value of open source / open weight LLMs, but inference is so cheap right now for non-frontier models.

cortesoft•7h ago
They weren’t saying getting the API key would take that long, just getting permission from their company to use it.
genewitch•10h ago
I got the 70B Qwen/Llama distill; I have 24 GB of VRAM.

I opened aider and gave a small prompt, roughly:

  Implement a JavaScript 2048 game that exists as flat file(s) and does not require a server, just the game HTML, CSS, and js. Make it compatible with firefox, at least.
That's it. Several hours later, it finished. The game ran. It was worth it because this was in the winter and it heated my house a bit, yay. I think the resulting 1-shot output is on my GitHub.

I know it was in the training set, etc, but I wanted to see how big of a hassle it was, if it would 1-shot with such a small prompt, how long it would take.

Makes me want to try DeepSeek 671B, but I don't have any machines with >1 TB of memory.

I do take donations of hardware.
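
The >1 TB figure is easy to sanity-check from the parameter count alone (Python; weights only, so KV cache and activations come on top):

  # Rough memory needed just to hold 671B parameters at different precisions.
  params = 671e9
  for name, bytes_per_param in (("fp16/bf16", 2), ("fp8", 1), ("4-bit", 0.5)):
      print(f"{name:>9}: {params * bytes_per_param / 1e12:.2f} TB")

So full-precision weights alone are around 1.34 TB; even an aggressive 4-bit quantization still needs roughly 340 GB before any KV cache.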

xfalcox•9h ago
You'd be surprised how often people in enterprise can be left waiting months to get an API key approved for an LLM provider.
diggan•2h ago
Are you saying that it's faster for them to get the hardware to run the weights themselves? Otherwise I'm not sure what the relevance is.
cortesoft•6h ago
There is a wide range of opinions on what should be considered sensitive data. Many people would classify a vast majority of their data as sensitive.
delichon•13h ago
Pass the choices through, please. It's so context-dependent that I want a <dumber> and a <smarter> button, with units of $/M tokens. And another setting to send a particular prompt to "[x] batch" and email me the answer later. For most things I'll start dumb and fast, but switch to smart and slow when the going gets rough.
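
What delichon is asking for amounts to a tiered router. A minimal sketch of the idea (Python; the model names and $/Mtok prices are hypothetical placeholders):

  from dataclasses import dataclass

  @dataclass
  class Tier:
      model: str           # hypothetical model identifier
      usd_per_mtok: float  # output price in $/M tokens

  # <dumber> ... <smarter>, cheapest first; entries are placeholders
  TIERS = [Tier("small-fast", 0.40), Tier("mid", 3.50), Tier("frontier", 15.00)]

  def pick(level: int) -> Tier:
      """Clamp the <dumber>/<smarter> dial into the tier list."""
      return TIERS[max(0, min(level, len(TIERS) - 1))]

  level = 0  # start dumb and fast; bump when the going gets rough
  tier = pick(level)
  print(f"Routing to {tier.model} at ${tier.usd_per_mtok}/Mtok")

The "[x] batch" mode would then just be one more tier that enqueues the prompt for later instead of calling an API synchronously.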