Show HN: Llmswap – Python package to reduce LLM API costs by 50-90% with caching

https://pypi.org/project/llmswap
12•sreenathmenon•2d ago
I built llmswap to solve a problem I kept hitting in hackathons - burning through API credits while testing the same prompts repeatedly during development.

It's a simple Python package that provides a unified interface for OpenAI, Anthropic, Google Gemini, and local models (Ollama), with built-in response caching that can cut API costs by 50-90%.

Key features:
- Intelligent caching with TTL and memory limits
- Context-aware caching for multi-user apps
- Auto-fallback between providers when one fails
- Zero configuration - works with environment variables

  from llmswap import LLMClient

  client = LLMClient(cache_enabled=True)
  response = client.query("Explain quantum computing")
  # Second identical query returns from cache instantly (free)

Caching is disabled by default for security. When enabled, it is thread-safe and includes context isolation for multi-user applications.
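
For readers wondering what "TTL, memory limits, thread safety and context isolation" could look like in practice, here is a minimal sketch. It is not llmswap's actual implementation: the `TTLCache` class, its method names and its parameters are all hypothetical, and an entry-count cap stands in for a real memory limit.

```python
import hashlib
import threading
import time
from collections import OrderedDict


class TTLCache:
    """Hypothetical in-memory response cache; NOT llmswap's real code."""

    def __init__(self, ttl_seconds=3600.0, max_entries=1000, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max = max_entries      # entry-count cap, a stand-in for a memory limit
        self._clock = clock          # injectable so expiry can be tested
        self._lock = threading.Lock()
        self._store = OrderedDict()  # key -> (expires_at, response)

    @staticmethod
    def make_key(prompt, context=None):
        # Fold the context (e.g. a user or tenant id) into the key so that
        # identical prompts from different users never share an entry.
        raw = repr((context, prompt)).encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get(self, prompt, context=None):
        key = self.make_key(prompt, context)
        with self._lock:
            entry = self._store.get(key)
            if entry is None:
                return None
            expires_at, response = entry
            if self._clock() >= expires_at:
                del self._store[key]      # expired: drop it and report a miss
                return None
            self._store.move_to_end(key)  # mark as most recently used
            return response

    def set(self, prompt, response, context=None):
        key = self.make_key(prompt, context)
        with self._lock:
            self._store[key] = (self._clock() + self._ttl, response)
            self._store.move_to_end(key)
            while len(self._store) > self._max:
                self._store.popitem(last=False)  # evict least recently used
```

A caller would check `cache.get(prompt, context=user_id)` before hitting the provider and call `cache.set(...)` after a miss. Note that the key is an exact match on the prompt text, so only byte-identical prompts count as repeats.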

Built this from components of a hackathon project. Already at 2.2k downloads on PyPI. Hope it helps others save on API costs during development.

GitHub: https://github.com/sreenathmmenon/llmswap
PyPI: https://pypi.org/project/llmswap/

Comments

rav•2d ago
How is it "50-90%" savings? If a given application doesn't repeat its queries, surely there's nothing to save by caching the responses?
sreenathmenon•1d ago
Hey, thanks for the great feedback! You're raising a valid point.

This package actually grew out of a hackathon project, a RAG system over internal documentation plus MCP, where I was burning through Anthropic API credits.

Some questions were repeated several times, and the 50%+ figure comes from that experience. Based on this, I was thinking of use cases like these:

Multi-User Support/FAQ Systems:
- How do I reset my password?
- Reset password steps?
- Forgot my password help
- Password reset procedure

RAG based:
- How to configure VM?
- How to deploy?
- How to create a network?

Educational/Training Apps, Developer Testing scenarios, etc.

You're absolutely right that apps with unique queries won't see these benefits. This won't help with:
- Personalized Content
- Real-Time Data
- User-Specific Queries
- Creative Generation

and other scenarios like these.

I think I should clarify this in the docs. Thanks for the great feedback. This is my first open-source package and my first conversation on Hacker News. Great to interact with and learn from all of you.
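
To make the savings question concrete, here is a rough back-of-the-envelope sketch. It assumes an exact-match cache with no expiry or eviction and a uniform cost per API call. The function name is made up for illustration and the numbers are examples, not measurements from llmswap.

```python
def estimated_savings(total_queries, unique_queries):
    """Fraction of API calls avoided by an ideal exact-match cache.

    Assumes every repeat of an already-seen prompt is a cache hit
    (no TTL expiry, no eviction) and that each call costs the same.
    """
    if total_queries <= 0:
        raise ValueError("total_queries must be positive")
    if not 0 < unique_queries <= total_queries:
        raise ValueError("unique_queries must be in 1..total_queries")
    # Only the first occurrence of each unique prompt reaches the API.
    return (total_queries - unique_queries) / total_queries


# Half of the prompts are repeats: half of the calls are saved.
print(estimated_savings(1000, 500))   # 0.5
# Heavy repetition, e.g. rerunning the same test prompts during development.
print(estimated_savings(1000, 100))   # 0.9
# Every prompt is unique: caching saves nothing.
print(estimated_savings(1000, 1000))  # 0.0
```

So the saving is simply the cache hit rate. Under exact matching, paraphrases such as the four password questions above would count as four unique queries unless the prompts are normalized first.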

0points•2d ago
I hate to be that guy, but your AI should have suggested you use one of the off-the-shelf in-memory key-value databases.

The most popular probably being Redis.

sreenathmenon•1d ago
Fair point! Redis would be better for production. I went with in-memory for zero-config simplicity, but should add Redis as an option. Thanks!
wasabi991011•2d ago
How does this compare to decorating with @functools.cache?
sreenathmenon•1d ago
Hey, functools.cache is definitely simpler and would be sufficient for most basic cases. But I was thinking of multi-tenant and context-aware scenarios, which is why I went with a different strategy.
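
For comparison, a small sketch of what the `functools.cache` route looks like. `fake_llm`, `ask` and `ask_for_user` are hypothetical stand-ins for a real provider call. The point is that `functools.cache` keys purely on the function arguments, so per-user isolation only happens if the user id is passed in as an argument, and there is no TTL or size bound (`functools.lru_cache(maxsize=...)` adds a bound, but still no expiry).

```python
import functools

calls = []  # records every prompt that actually reaches the "provider"


def fake_llm(prompt):
    # Hypothetical stand-in for a paid API call.
    calls.append(prompt)
    return f"answer to: {prompt}"


@functools.cache
def ask(prompt):
    # Cached on `prompt` alone: every caller shares the same entry.
    return fake_llm(prompt)


@functools.cache
def ask_for_user(user_id, prompt):
    # `user_id` is part of the cache key, so each user gets a separate entry.
    return fake_llm(prompt)


ask("Explain quantum computing")
ask("Explain quantum computing")   # served from the cache
print(len(calls))                  # 1

ask_for_user("alice", "Explain quantum computing")
ask_for_user("bob", "Explain quantum computing")    # different key, new call
ask_for_user("alice", "Explain quantum computing")  # served from the cache
print(len(calls))                  # 3
```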
