An intuitive approach for understanding electricity [video]

https://www.youtube.com/watch?v=X_crwFuPht4
1•thunderbong•18s ago•0 comments

A Parallel Internet

https://k2xl.substack.com/p/a-parallel-internet
1•k2xl•1m ago•0 comments

Blue Owl Halts Redemptions on Private Credit Retail Fund

https://www.bloomberg.com/news/articles/2026-02-18/blue-owl-loan-sale-raises-1-4-billion-for-inve...
1•zerosizedweasle•3m ago•1 comments

AIP – How my AI agent built a decentralized identity protocol for agents

https://github.com/The-Nexus-Guard/aip
1•the_nexus_guard•3m ago•1 comments

I Obtained Mew in Pokémon Red on a Real Game Boy

https://vaguilar.com/2026/02/18/how-i-obtained-mew-in-pokemon-red-on-a-real-game-boy/
1•vaguilar•3m ago•0 comments

Sub-$200 Lidar Could Reshuffle Auto Sensor Economics

https://spectrum.ieee.org/solid-state-lidar-microvision-adas
1•mhb•4m ago•0 comments

Nickel Since 1.0

https://www.tweag.io/blog/2026-02-19-nickel-since-1-0/
1•ingve•4m ago•0 comments

Dear Copilot, can you help me with SQL?

https://devblogs.microsoft.com/azure-sql/dear-copilot-azure-sql/
1•ibobev•4m ago•0 comments

Microspeak: Escrow

https://devblogs.microsoft.com/oldnewthing/20260217-00/?p=112067
1•ibobev•4m ago•0 comments

OpenBlockspace – IR³ Alpha – Pure Flux Architecture

https://bitcoin-zero-down-2ea152.gitlab.io/gallery/gallery-item-neg-878/
1•machardmachard•4m ago•1 comments

Optofluidic three-dimensional microfabrication and nanofabrication

https://www.nature.com/articles/s41586-025-10033-x
1•PaulHoule•5m ago•0 comments

Show HN: PostForge – A PostScript interpreter written in Python

https://github.com/AndyCappDev/postforge
1•AndyCappDev•5m ago•0 comments

Why Do the Police Exist? (2020)

https://novaramedia.com/2020/06/20/why-does-the-police-exist/
2•robtherobber•6m ago•0 comments

AI-Powered Performance Analysis

https://twitter.com/LangChain_JS/status/2024515544788140134
1•cbromann•6m ago•0 comments

Show HN: Public Speaking Coach with AI

https://apps.apple.com/us/app/speaking-coach-spechai/id6755611866
1•javierbuilds•6m ago•0 comments

AI found 12 of 12 OpenSSL zero-days

https://www.lesswrong.com/posts/7aJwgbMEiKq5egQbd/ai-found-12-of-12-openssl-zero-days-while-curl-...
2•AndrewDucker•6m ago•0 comments

AI made coding more enjoyable

https://weberdominik.com/blog/ai-coding-enjoyable/
2•domysee•7m ago•0 comments

Reflections on Oman

https://twitter.com/WillManidis/status/2024489454023405861
2•jger15•7m ago•0 comments

Hope

https://en.wikipedia.org/wiki/Hope
1•marysminefnuf•8m ago•0 comments

Passkey deployment mistakes banks make

https://www.corbado.com/blog/passkey-deployment-mistakes-banks
1•vdelitz•9m ago•0 comments

Naval shipwreck emerges in Sweden after being buried underwater for 400 years

https://www.cbsnews.com/news/navy-shipwreck-emerges-baltic-sea-sweden/
3•efrecon•9m ago•0 comments

Cue Is a Configuration Language

https://bitfieldconsulting.com/posts/cuelang-exciting
1•ahamez•10m ago•0 comments

Goosetown: Parallel AI agent flocks that research, build, and review code

https://github.com/block/goosetown
1•triple5•10m ago•0 comments

AI-generated passwords are easy to crack

https://gizmodo.com/ai-generated-passwords-are-apparently-quite-easy-to-crack-2000723660
1•vdelitz•11m ago•0 comments

OpenClaw Partners with VirusTotal for Skill Security

https://openclaw.ai/blog/virustotal-partnership
1•trogonkhant•11m ago•0 comments

Measuring Input-to-Photon Latency (Because 'Wayland Feels Off' Isn't a Metric)

https://davidjusto.com/articles/m2p-latency/
1•madspindel•12m ago•0 comments

Why IP Address Certificates Are Dangerous and Usually Unnecessary

https://www.agwa.name/blog/post/ip_address_certs
2•agwa•12m ago•0 comments

The RAM shortage is coming for everything you care about

https://www.theverge.com/tech/880812/ramageddon-ram-shortage-memory-crisis-price-2026-phones-laptops
3•LordAtlas•13m ago•0 comments

MCP Guardian – Let your LLM audit its own MCP tools for prompt injection

https://github.com/alexandriashai/mcp-guardian
2•alexandriaeden•13m ago•2 comments

Gemini 3.1

https://deepmind.google/models/model-cards/gemini-3-1-pro/
20•PunchTornado•13m ago•1 comments

The $2k Laptop That Replaced My $200/Month AI Subscription

6•Raywob•1h ago
Cloud AI pricing is per-token. The more useful your pipeline, the more it costs. I built a dual-model orchestration pattern that routes 80% of work to a free local model (Qwen3 8B on Ollama, GPU-accelerated) and only sends the synthesis/judgment stage to a cloud API.

Cost for a 50-item research pipeline: $0.15-0.40 vs $8-15 all-cloud. Same output quality where it matters.

Stack: RTX 5080 laptop, Ollama in Docker with GPU passthrough, PostgreSQL, Redis, Claude API for the final 20%.

The pattern: scan locally → score locally → deduplicate locally → synthesize via cloud. Four stages, three are free.
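Roughly, the orchestration looks like this. It's a simplified sketch rather than the production code; the model names, prompts, and helper names (ask_local, synthesize_cloud, run_pipeline) are illustrative, and it assumes Ollama on the default port plus the anthropic Python SDK for the cloud stage:

    import requests
    import anthropic

    OLLAMA_URL = "http://127.0.0.1:11434/api/chat"   # IPv4 literal; see gotchas below
    LOCAL_MODEL = "qwen3:8b"

    def ask_local(prompt: str) -> str:
        """Stages 1-3 (scan, score, dedup) run on the free local model."""
        resp = requests.post(OLLAMA_URL, json={
            "model": LOCAL_MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        })
        resp.raise_for_status()
        return resp.json()["message"]["content"]

    def synthesize_cloud(notes: list[str]) -> str:
        """Stage 4 (synthesis/judgment) is the only paid call."""
        client = anthropic.Anthropic()   # reads ANTHROPIC_API_KEY from the environment
        msg = client.messages.create(
            model="claude-sonnet-4-5",   # placeholder model id
            max_tokens=2048,
            messages=[{"role": "user",
                       "content": "Synthesize these research notes into key findings:\n\n" + "\n".join(notes)}],
        )
        return msg.content[0].text

    def run_pipeline(items: list[str]) -> str:
        scanned = [ask_local("Summarize the key claim in one sentence:\n" + i) for i in items]
        relevant = [s for s in scanned
                    if ask_local("Is this relevant to the research topic? Answer yes or no.\n" + s)
                       .strip().lower().startswith("yes")]
        deduped = list(dict.fromkeys(relevant))   # stand-in for the real dedup stage
        return synthesize_cloud(deduped)          # the cloud only ever sees the distilled slice

Only the last call costs anything; everything above it runs at local-GPU speed.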

Gotchas I hit: Qwen3's thinking tokens leaking through /api/generate (use /api/chat instead), Docker binding to IPv4 only while Windows resolves localhost to IPv6, and GPU memory ceilings on consumer cards.
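Concretely, the first two gotchas look something like this (illustrative snippet, not lifted from my code; the regex workaround is only needed if you stay on /api/generate):

    import re
    import requests

    # Windows can resolve "localhost" to ::1 while the Docker port publish only
    # listens on IPv4, so point at the loopback address explicitly.
    BASE = "http://127.0.0.1:11434"

    def generate_without_thinking(prompt: str) -> str:
        # /api/generate returns Qwen3's <think>...</think> block inline in the
        # completion text; strip it here, or switch to /api/chat and avoid it.
        r = requests.post(BASE + "/api/generate", json={
            "model": "qwen3:8b",
            "prompt": prompt,
            "stream": False,
        })
        r.raise_for_status()
        text = r.json()["response"]
        return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()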

Happy to share architecture details in comments.

Comments

solomatov•1h ago
How does the quality of what Qwen 8B provides compare to proprietary models? Is it good enough for your use case?
Raywob•1h ago
For the mechanical stages (scanning, scoring, dedup) — indistinguishable from proprietary models. These are structured tasks: "score this post 1-10 against these criteria" or "extract these fields from this text." An 8B model handles that fine at 30 tok/s on a consumer GPU.

For synthesis and judgment — no, it's not close. That's exactly why I route those stages to Claude. When you need the model to generate novel connections or strategic recommendations, the quality gap between 8B and frontier is real.

The key insight is that most pipeline stages don't need synthesis. They need pattern matching. And that's where the 95% cost savings live.
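To make "structured task" concrete, the scoring stage is roughly this shape (placeholder criteria and field names, not my production prompt):

    import json
    import requests

    def score_item(text: str) -> dict:
        prompt = (
            'Score this post 1-10 for relevance, novelty, and credibility. '
            'Respond with JSON only, e.g. {"relevance": 7, "novelty": 4, "credibility": 8}.\n\n'
            + text
        )
        r = requests.post("http://127.0.0.1:11434/api/chat", json={
            "model": "qwen3:8b",
            "messages": [{"role": "user", "content": prompt}],
            "format": "json",    # ask Ollama to constrain the output to valid JSON
            "stream": False,
        })
        r.raise_for_status()
        return json.loads(r.json()["message"]["content"])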

roosgit•43m ago
Have you tried other local models?

Qwen3 14B at Q4_K_M needs 9GB, and Q3_K_M is 7.3GB, though you also need some room for context. Still, maybe using `--override-tensor` in llama.cpp would get you a 50% improvement over "naively" offloading layers to the GPU. Or possibly GPT-OSS-20B: it's 12.1GB in MXFP4, but it's a MoE model, so only part of it needs to be on the GPU. On my dedicated 12GB 3060 it runs at 85 t/s with a smallish context. I've also read claims on Reddit that Qwen3 4B 2507 might be better than the 8B, because Qwen never released a "2507" update for the 8B.

Raywob•23m ago
Haven't tried GPT-OSS-20B yet — the MoE approach is interesting for keeping VRAM usage down while getting better reasoning. 85 t/s on a 3060 is impressive. I'll look into that.

I've been on Qwen3 8B mostly because it was "good enough" for the mechanical stages (scanning, scoring, dedup) and I didn't want to optimize the local model before validating the orchestration pattern itself. Now that the pipeline is proven, experimenting with the local model is the obvious next lever to pull.

The Qwen3 4B 2507 claim is interesting — if the quality holds for structured extraction tasks, halving the VRAM footprint would open up running two models concurrently or leaving more room for larger contexts. Worth testing.

Thanks for the pointers — this is exactly the kind of optimization I haven't had time to dig into yet.