frontpage.

Get found where people search today

https://kleonotus.com/
1•makenotesfast•2m ago•1 comment

Show HN: An early-warning system for SaaS churn (not another dashboard)

https://firstdistro.com
1•Jide_Lambo•2m ago•0 comments

Tell HN: Musk has never *tweeted* a guess for real identity of Satoshi Nakamoto

1•tokenmemory•3m ago•0 comments

A Practical Approach to Verifying Code at Scale

https://alignment.openai.com/scaling-code-verification/
1•gmays•5m ago•0 comments

Show HN: macOS tool to restore window layouts

https://github.com/zembutsu/tsubame
1•zembutsu•7m ago•0 comments

30 Years of <Br> Tags

https://www.artmann.co/articles/30-years-of-br-tags
1•FragrantRiver•14m ago•0 comments

Kyoto

https://github.com/stevepeak/kyoto
2•handfuloflight•15m ago•0 comments

Decision Support System for Wind Farm Maintenance Using Robotic Agents

https://www.mdpi.com/2571-5577/8/6/190
1•PaulHoule•15m ago•0 comments

Show HN: X-AnyLabeling – An open-source multimodal annotation ecosystem for CV

https://github.com/CVHub520/X-AnyLabeling
1•CVHub520•18m ago•0 comments

Penpot Docker Extension

https://www.ajeetraina.com/introducing-the-penpot-docker-extension-one-click-deployment-for-self-...
1•rainasajeet•19m ago•0 comments

Company Thinks It Can Power AI Data Centers with Supersonic Jet Engines

https://www.extremetech.com/science/this-company-thinks-it-can-power-ai-data-centers-with-superso...
1•vanburen•22m ago•0 comments

If AIs can feel pain, what is our responsibility towards them?

https://aeon.co/essays/if-ais-can-feel-pain-what-is-our-responsibility-towards-them
3•rwmj•26m ago•4 comments

Elon Musk's xAI Sues Apple and OpenAI over App Store Drama

https://mashable.com/article/elon-musk-xai-lawsuit-apple-openai
1•paulatreides•29m ago•1 comment

Ask HN: Build it yourself SWE blogs?

1•bawis•29m ago•1 comment

Original Apollo 11 Guidance Computer source code

https://github.com/chrislgarry/Apollo-11
3•Fiveplus•35m ago•0 comments

How Did the CIA Lose a Nuclear Device?

https://www.nytimes.com/interactive/2025/12/13/world/asia/cia-nuclear-device-himalayas-nanda-devi...
1•Wonnk13•35m ago•0 comments

Is vibe coding the new gateway to technical debt?

https://www.infoworld.com/article/4098925/is-vibe-coding-the-new-gateway-to-technical-debt.html
1•birdculture•39m ago•1 comment

Why Rust for Embedded Systems? (and Why I'm Teaching Robotics with It)

https://blog.ravven.dev/blog/why-rust-for-embedded-systems/
2•aeyonblack•41m ago•0 comments

EU: Protecting children without the privacy nightmare of Digital IDs

https://democrats.eu/en/protecting-minors-online-without-violating-privacy-is-possible/
3•valkrieco•41m ago•0 comments

Using E2E Tests as Documentation

https://www.vaslabs.io/post/using-e2e-tests-as-documentation
1•lihaoyi•42m ago•0 comments

Apple Welcome Screen: iWeb

https://www.apple.com/welcomescreen/ilife/iweb-3/
1•hackerbeat•43m ago•1 comment

Accessible Perceptual Contrast Algorithm (APCA) in a Nutshell

https://git.apcacontrast.com/documentation/APCA_in_a_Nutshell.html
1•Kerrick•44m ago•0 comments

AI agent finds more security flaws than human hackers at Stanford

https://scienceclock.com/ai-agent-beats-human-hackers-in-stanford-cybersecurity-experiment/
3•ashishgupta2209•45m ago•2 comments

Nano banana prompts, updates everyday

https://github.com/fionalee1412/bestnanobananaprompt-github
4•AI_kid1412•49m ago•0 comments

Skills vs. Dynamic MCP Loadouts

https://lucumr.pocoo.org/2025/12/13/skills-vs-mcp/
3•cube2222•53m ago•0 comments

Top validated AI-SaaS Ideas are available here

1•peterbricks•57m ago•0 comments

UnmaskIP: A Clean, Ad-Free IP and Deep Packet Leak Checker

https://unmaskip.net
1•kfwkwefwef•1h ago•0 comments

PydanticAI-DeepAgents – AI agent framework with planning, filesystem, and subagents

https://github.com/vstorm-co/pydantic-deepagents
1•kacper-vstorm•1h ago•1 comment

DeepCSIM – Detect duplicate and similar code using AST analysis

https://github.com/whm04/deepcsim
1•whm04•1h ago•1 comment

Chip‐8 Technical Reference

https://github.com/mattmikolay/chip-8/wiki/CHIP%E2%80%908-Technical-Reference
1•AlexeyBrin•1h ago•0 comments

Workhorse LLMs: Why Open Source Models Dominate Closed Source for Batch Tasks

https://sutro.sh/blog/workhorse-llms-why-open-source-models-win-for-batch-tasks
118•cmogni1•6mo ago

Comments

ramesh31•6mo ago
Flash is just so obscenely cheap at this point that it's hard to justify the headache of self-hosting, though. Really only applies to sensitive data, IMO.
behnamoh•6mo ago
You're getting downvoted, but what you said is true. The cost of self-hosting (while achieving 70+ tok/sec consistently across the entire context window) has never been low enough to make open source a viable competitor to the proprietary models from OpenAI, Google, and Anthropic.
grepfru_it•6mo ago
I am curious: what's the need for 70 t/sec?
Aeolun•6mo ago
Waiting minutes for your call to succeed is too frustrating?
ekianjo•6mo ago
Depends entirely on the use case. Not every LLM workflow is a chatbot
jbellis•6mo ago
no, but if you're not latency sensitive you should probably be using DeepSeek v3 (cheaper than flash, significantly smarter)
lostmsu•6mo ago
What makes you believe DeepSeek is smarter than Flash 2.5? It is lower on all leaderboards.
jbellis•6mo ago
you're right, I should clarify that I'm talking about no thinking mode, otherwise flash goes from "a bit more expensive than dsv3" to "10x more expensive"
cootsnuck•6mo ago
High concurrency voice AI systems.
grepfru_it•6mo ago
Why are you self hosting that?
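To put the throughput numbers in this sub-thread in perspective, here's a rough back-of-envelope sketch; the response size and decode rates below are assumed for illustration, not taken from the thread:

```python
def wait_seconds(output_tokens, tokens_per_sec):
    """Time to stream a full response at a given decode rate."""
    return output_tokens / tokens_per_sec

# A 2,000-token response at different decode speeds (illustrative):
for rate in (5, 20, 70):
    print(f"{rate:>3} tok/s -> {wait_seconds(2000, rate):.0f} s")
```

At 5 tok/s a long response takes minutes; at 70 tok/s it's under half a minute, which is the difference between a usable interactive system and a batch-only one.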
jacob019•6mo ago
That's true for Flash 2.0 at $0.40/mtok output. GPT-4.1-nano is the same price and also surprisingly capable. I can spend real money with 2.5 flash, with those $3.50/mtok thinking tokens, but it's worth it. OP is an inference provider, so there may be some bias. Open source can't compete on context length either; nothing touches 2.5 flash for the price with long context. I've experimented with this a lot for my agentic pricing system. Open source models are improving, but they aren't really any cheaper right now. R1, for example, does quite well performance-wise, but it uses a LOT of tokens to get there, which further strains its shorter context window. There's still value in the open source models: each has unique strengths and they're advancing quickly, but the frontier labs are moving fast too and have very compelling "workhorse" offerings.
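Comparisons like the one above come down to blended cost over a workload's input/output mix. A minimal sketch of that arithmetic; the per-mtok prices below are illustrative placeholders (only the $0.40 output figure comes from the comment), so check current provider pricing before relying on any of them:

```python
def blended_cost_usd(input_tokens, output_tokens, in_price, out_price):
    """Workload cost in USD, given per-million-token input/output prices."""
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

# Assumed (input, output) prices in $/mtok -- placeholders, not quotes
models = {
    "flash-2.0":    (0.10, 0.40),
    "gpt-4.1-nano": (0.10, 0.40),
    "flash-2.5-thinking": (0.30, 3.50),
}

# A batch job with 10M input tokens and 2M output tokens:
for name, (in_p, out_p) in models.items():
    print(f"{name}: ${blended_cost_usd(10e6, 2e6, in_p, out_p):.2f}")
```

The same helper makes the thinking-token effect obvious: with output priced at $3.50/mtok instead of $0.40, the output side dominates the blend even on input-heavy jobs.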
mkl•6mo ago
With tools like Ollama, self-hosting is easier than a hosted service. No sign-up, no API keys, no permission to spend money, no worries about data security; just an easy install, then import a Python library. Qwen2.5-VL 7B is proving useful even on a work laptop with insufficient VRAM. I just leave it running overnight or over a weekend, and it's saving me dozens of hours of work (that I then get to spend on other, higher-value work).
mgraczyk•6mo ago
It does not take dozens of hours to get an API key for gemini
mkl•6mo ago
I never claimed that it did. Gemini would probably save me the same dozens of hours, but come with ongoing costs and additional starting up hurdles (some near insurmountable in my organisation, like data security for some of what I'm doing).
shmoogy•6mo ago
Gemini Flash or any free LLM on OpenRouter would be orders of magnitude faster and effectively free. Unless you are concerned about privacy of the conversation, it's really purely about being able to say you did it locally.

I definitely do appreciate and believe in the value of open source / open weight LLMs - but inference is so cheap right now for non frontier models.

cortesoft•6mo ago
They weren’t saying getting the api key would take that long, just getting permission from their company to let them do it.
genewitch•6mo ago
I got the 70b qwen llama distill, I have 24GB of vram.

I opened aider and gave a small prompt, roughly:

  Implement a JavaScript 2048 game that exists as flat file(s) and does not require a server, just the game HTML, CSS, and js. Make it compatible with firefox, at least.
That's it. Several hours later, it finished. The game ran. It was worth it because this was in the winter and it heated my house a bit, yay. I think the resulting 1-shot output is on my github.

I know it was in the training set, etc, but I wanted to see how big of a hassle it was, if it would 1-shot with such a small prompt, how long it would take.

Makes me want to try deepseek 671B, but I don't have any machines with >1TB of memory.

I do take donations of hardware.

mechagodzilla•6mo ago
Buy a used workstation with 512GB of DDR4 RAM. It will probably cost like $1-1.5k, and be able to run a Q4 version of the full deepseek 671B models. I have a similar setup with dual-socket 18 core Xeons (and 768GB of RAM, so it cost about $2k), and can get about 1.5 tokens/sec on those models. Being able to see the full thinking trace on the R1 models is awesome compared to the OpenAI models.
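A quick sanity check on why 512GB of RAM is enough for a Q4 quant of a 671B model: at 4 bits, each weight takes roughly half a byte. A back-of-envelope sketch, where the ~10% overhead factor for quantization scales and runtime buffers is an assumption on my part:

```python
def quantized_weights_gb(n_params_billion, bits_per_weight=4, overhead=1.1):
    """Approximate memory for quantized weights, padded ~10% for
    quantization scales and runtime buffers (rough assumption)."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8 * overhead
    return total_bytes / 1e9

# DeepSeek 671B at Q4: comfortably under 512 GB
print(f"{quantized_weights_gb(671):.0f} GB")
```

That's why the 512GB box works for Q4 while the full-precision weights would not fit; KV cache for long contexts adds more on top, which this sketch ignores.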
3036e4•6mo ago
If/when Corporate Legal approves a tool like Ollama for use on company computers, yes. Might not require purchasing anything, but there can still be red tape.
xfalcox•6mo ago
You'd be surprised how often people in enterprise can be left waiting months to get an API key approved for an LLM provider.
diggan•6mo ago
Are you saying that it's faster for them to get the hardware to run the weights themselves? Otherwise I'm not sure what the relevancy is.
ChromaticPanic•6mo ago
Yes some have existing infra
diggan•6mo ago
I'm having a somewhat hard time believing that a corporation where getting an API key for an LLM service is very difficult somehow already has the (GPU) infrastructure running to do the same thing itself, unless it happens to be an ML company; but I don't think we're talking about those in this context.
oooyay•6mo ago
Nah, this is definitely a real scenario. Getting access to public models requires a lot of security review, but going through Bedrock is much simpler. I may be spoiled by having worked for companies that have ML departments and developer XP departments, though.
diggan•6mo ago
Not sure Bedrock counts as self-hosting though, isn't it a managed service Amazon provides?

> I may be spoiled in having worked for companies that have ML

Sounds likely, yeah. How many companies have ML departments today? DS departments seem common, but ML I'm not too sure about.

fourthark•6mo ago
A lot of companies think they do
achierius•6mo ago
No, this is very real. One reason why this can happen: a company has elaborate processes for protecting their internal data from leaking, but otherwise lets engineers do what they want with resources allocated to them.
pegasus•6mo ago
Unless they are already in the possession of such hardware (like an M3 mac, for example).
cortesoft•6mo ago
There is a wide range of opinions on what should be considered sensitive data. Many people would classify a vast majority of their data as sensitive.
dTal•6mo ago
Not only that, but it's a liability to have two pipelines, one secure and one insecure. Apart from the technical overhead, since the "insecure" pipeline is surely better/faster/cheaper/more convenient (or else why have it at all), it creates a perverse incentive when classifying data as "sensitive" or not.

We already went through this with https everywhere. Previously, encryption was considered "only for sensitive data".

delichon•6mo ago
Pass the choices through, please. It's so context dependent that I want a <dumber> and a <smarter> button, with units of $/M tokens. And another setting to send a particular prompt to "[x] batch" and email me with the answer later. For most things I'll start dumb and fast, but switch to smart and slow when the going gets rough.
jbellis•6mo ago
This is a useful analysis, but only as a first cut and sometimes not even that -- Grok 3 mini and DeepSeek V3 are by far the least expensive coding models that are worth trying for scenarios where you do and don't care about the vendor training on your requests, respectively. One of those is "open source" (by which he seems to mean "open weights") but far too large to run locally.

[I guess that must be a useful market niche though, apparently this is by a company selling batch compute on exactly those small open weights models.]

The problem is the author is evaluating by dividing the Artificial Analysis score by a blended cost per token, but most tasks have an intelligence "floor" below which it doesn't matter how cheap something is, it will never succeed. And when you strip out the very high results from super cheap 4B OSS models the rest are significantly outclassed by Flash 2.0 (not on his chart but still worth considering) and 2.5, not to mention other models that might be better in domain specific tasks like grok-3 mini for code.

(Nobody should be using Haiku in 2025. The OpenAI mini models are not as bad as Haiku in p/p, and maybe there is a use case for preferring one over Flash, but if so I don't know what it is.)
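The "intelligence floor" objection can be made concrete with a toy ranking. All model names, scores, and prices below are invented purely for illustration:

```python
def rank_by_value(models, floor):
    """Rank models by score per dollar, excluding any whose score is
    below the task's intelligence floor (all figures are made up)."""
    eligible = {m: s / p for m, (s, p) in models.items() if s >= floor}
    return sorted(eligible.items(), key=lambda kv: kv[1], reverse=True)

# (score, blended $/mtok) -- illustrative only
models = {
    "tiny-4b-oss": (25, 0.05),
    "flash-2.5":   (60, 0.60),
    "grok-3-mini": (58, 0.45),
}

# Naive score-per-dollar ranking favors the cheap 4B model...
print(rank_by_value(models, floor=0))
# ...but with a floor of 50 it drops out of the running entirely.
print(rank_by_value(models, floor=50))
```

Dividing score by cost with no floor always rewards the cheapest model that scores anything; once a task-specific floor is applied, the ranking among the surviving models can look completely different.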

dinosaurdynasty•6mo ago
DeepSeek has a lot of competing providers that at least state they don't train on API data, OpenRouter lists a bunch of them: https://openrouter.ai/deepseek/deepseek-chat-v3-0324/provide...

(This is a big advantage of open weight models; even if they're too big to host yourself, if it's worth anything there's a lot of competition for inference)

jbellis•6mo ago
and all of them are so much more expensive than OG deepseek as to completely remove themselves from consideration

you should probably use grok 3 mini if you want "cheapest model that is reasonably good at code"

aitchnyu•6mo ago
The above link disproves your "more expensive than OG deepseek"
jbellis•5mo ago
Only if you can't read. DSv3 is 4x cheaper than the cheapest option in that link during US business hours.