frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

Phone Is a Snitch – Untraceable Digital Dissident

https://untraceabledigitaldissident.com/your-phone-is-a-snitch/
1•janandonly•1m ago•0 comments

Why reliability is hard at scale: learnings from infrastructure outages

https://newsletter.pragmaticengineer.com/p/why-reliability-is-hard-at-scale
1•Khaine•3m ago•0 comments

Running Gaming Workloads Through AMD's Zen 5

https://chipsandcheese.com/p/running-gaming-workloads-through
1•rbanffy•4m ago•0 comments

Theory of Scale-Relative Time: Derivations of the Galactic Scale Factor

https://zenodo.org/records/16099052
3•Bluestein•5m ago•0 comments

The best companies are dictatorships

https://writing.nikunjk.com/p/the-best-companies-are-dictatorships
1•hetdv•5m ago•0 comments

Show HN: Fetchet – A compact, promise-based, HTTP fetch wrapper

https://github.com/brysonbw/fetchet
2•Brysonbw•11m ago•0 comments

Are prompts the new unit of work for applications?

https://www.archgw.com/blogs/are-prompts-the-new-unit-of-work
1•honorable_coder•17m ago•0 comments

AI Thinking, Fast and Slow

https://danmu.nz/blog/ai-thinking-fast-and-slow/
2•mooreds•26m ago•0 comments

Quantum Interference 1: A Simple Example

https://profmattstrassler.com/2025/03/18/quantum-interference-1-a-simple-example/
1•mhb•27m ago•0 comments

UK gets first female Astronomer Royal in 350 years

https://www.bbc.com/news/articles/c741lll88q5o
3•mooreds•27m ago•0 comments

Multi-cloud migration startup FluidCloud emerges from stealth

https://www.networkworld.com/article/4030429/multi-cloud-migration-startup-fluidcloud-emerges-from-stealth.html
1•mooreds•30m ago•0 comments

A third of Chinese provinces now spend their entire revenue on debt repayments

https://old.reddit.com/r/China/comments/1mfnjd2/a_third_of_provinces_now_spend_their_entire/
4•decimalenough•31m ago•0 comments

No Gravity – FOSS space game classic from 2005 ported for the web

https://midzer.de/wasm/nogravity/
1•midzer•33m ago•1 comments

Google indexing ChatGPT convos, potentially exposing sensitive user data

https://www.fastcompany.com/91376687/google-indexing-chatgpt-conversations
4•xpe•34m ago•0 comments

Should we treat rivers as living things?

https://www.nature.com/articles/d41586-025-02263-w
2•gnabgib•35m ago•0 comments

Arch-Router: Aligning LLM Routing with Human Preferences

https://arxiv.org/abs/2506.16655
6•handfuloflight•41m ago•2 comments

Lina Khan points to Figma IPO as vindication of M&A scrutiny

https://techcrunch.com/2025/08/02/lina-khan-points-to-figma-ipo-as-vindication-for-ma-scrutiny/
3•bingden•44m ago•0 comments

Someone on GitHub filed a bug report against reality, says P=NP cause causality

https://github.com/tasteburger/owning-physics-with-srt/issues/1
2•bastischmidt•44m ago•0 comments

<IsAgent/>

https://stytch.com/blog/introducing-is-agent/
2•benswerd•46m ago•1 comments

Cyberpunk Is Now Our Reality

https://danieltan.weblog.lol/2025/07/cyberpunk-is-now-our-reality
3•tempodox•51m ago•1 comments

Show HN: Rudys.ai, Scale Google Ads Globally in Any Language

https://rudys.ai
1•nasir•1h ago•0 comments

Show HN: NameFast – Generate names for your SaaS idea in seconds

1•skyzouw•1h ago•0 comments

Exfiltrating Your ChatGPT Chat History and Memories with Prompt Injection

https://embracethered.com/blog/posts/2025/chatgpt-chat-history-data-exfiltration/
3•wunderwuzzi23•1h ago•0 comments

As a linguist, I want to find the words to measure chronic illness

https://thesicktimes.org/2025/08/01/as-a-linguist-i-want-to-find-the-words-to-measure-chronic-illness/
2•Avshalom•1h ago•0 comments

Show HN: Fast Elevation API with memory mapped tiles

https://www.terraintap.com
2•anaj123•1h ago•0 comments

A New Jersey Racing Institution Could Be Destroyed for Housing Development

https://www.thedrive.com/news/a-new-jersey-racing-institution-could-be-destroyed-for-housing-development
3•PaulHoule•1h ago•1 comments

Immigration officers smash car windows to speed up arrests

https://projects.propublica.org/trump-ice-smashed-windows-deportation-arrests/
4•heavyset_go•1h ago•0 comments

Pentomino Configurations and Solutions

https://isomerdesign.com/Pentomino/
2•cabidaher•1h ago•0 comments

Vibe-Coding Yourself into Irrelevance

https://www.osnews.com/story/142956/vibe-coding-yourself-into-irrelevance/
3•ta988•1h ago•0 comments

UK Energy Trading Market Infographic

https://a115.co.uk/uk-energy-trading-market-2025-infographic/
1•jd115•1h ago•0 comments
Open in hackernews

Ask HN: Tips for reducing LLM token usage?

1•vmt-man•2h ago
I've been using Claude Code with Serena MCP, but for the past few weeks it's been compressing the context more often. I have two Pro accounts, and it's still not enough for my daily needs anymore :(

Also, Claude Code tends to make very broad search requests, and I keep getting an error from MCP about exceeding 25,000 characters. It happens quite often.

What would you recommend?

Comments

bigyabai•2h ago
> What would you recommend?

Invest in a local inference server and run Qwen3. At this point it will still cost less than two pro accounts.

vmt-man•1h ago
What hardware do you suggest? :)
bigyabai•1h ago
Iunno, whatever you can afford?

Nvidia hardware is cheap as chips right now. If you got 2x 3060 12gb cards (or a 24gb 4090), you'd have 24gb of CUDA-accelerated VRAM to play with for inference and finetuning. It should be plenty to fit the smaller SOTA models like GLM-4.5 Air, Qwen3 30b A3B, and Llama Scout, and definitely enough to start layering the giant 100b+ parameter options.

That's what I'd get, at least.

vmt-man•1h ago
> GLM-4.5 Air, Qwen3 30b A3B, and Llama Scout

Are they good enough compared to Sonnet 4?

I’ve also used Gemini 2.5 Pro and Flash, and they’re worse. But they’re much bigger, not just 30B.

bigyabai•1h ago
In my opinion? Qwen3 does live up to the benchmarks, it leaves Sonnet 4 in the dust quality-wise if you can get a fast enough tok/s to use it. I haven't tried GLM or Llama Scout yet, nor do I have a particularly big frame of reference for the quality of Opus 4.

You might be able to try out Qwen3 via API to see if it suits your needs. Their 30b MOE is really impressive, and the 480b one can only be better (presumably).