Reduced OpenAI RAG costs by 70% by using a pre-check API call

2•Kong91•1d ago
I am using OpenAI's RAG implementation for my product. I tried building it myself with Pinecone but could never get it to retrieve relevant info. Anyway, OpenAI is costly: they charge for embeddings and for "file search", which retrieves the relevant chunks after the question is embedded and turned into a vector for similarity search. Not every question a user asks needs retrieved context (which is the costly part). So I added a pre-step that uses a cheaper OpenAI model to decide whether the question needs the context or not; if not, the RAG implementation is not touched. This decreased costs by 70%, making the business viable, or more lucrative.
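For anyone wanting to try the same idea, here is a minimal sketch of the pre-check gate described above. It is not the poster's actual code: the prompt wording, the `Router` class, and the three callables are all hypothetical, and the model calls are passed in as plain functions so the routing logic can be run without any API key.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt for the cheaper model; the wording is an assumption.
PRECHECK_PROMPT = (
    "Does answering the question below require looking up the stored documents? "
    "Reply with exactly YES or NO.\n\nQuestion: {question}"
)


@dataclass
class Router:
    cheap_model: Callable[[str], str]      # prompt -> raw text reply from the cheap model
    answer_with_rag: Callable[[str], str]  # expensive path: retrieval plus generation
    answer_direct: Callable[[str], str]    # cheap path: no retrieval

    def needs_context(self, question: str) -> bool:
        reply = self.cheap_model(PRECHECK_PROMPT.format(question=question))
        # Fail open: anything other than a clear NO goes to retrieval,
        # so an odd reply costs money rather than answer quality.
        return not reply.strip().upper().startswith("NO")

    def answer(self, question: str) -> str:
        if self.needs_context(question):
            return self.answer_with_rag(question)
        return self.answer_direct(question)


# Stand-ins for real model calls, for illustration only.
def fake_cheap_model(prompt: str) -> str:
    return "NO" if "hello" in prompt.lower() else "YES"


rag_calls = []


def fake_rag(question: str) -> str:
    rag_calls.append(question)  # record every time the costly path is hit
    return "rag answer"


router = Router(
    cheap_model=fake_cheap_model,
    answer_with_rag=fake_rag,
    answer_direct=lambda q: "direct answer",
)

greeting_reply = router.answer("Hello there")
legal_reply = router.answer("What notice period does the statute require?")
```

The one design choice worth noting is the fail-open default: when the cheap model's reply is ambiguous, the question still goes through retrieval, so a bad pre-check costs money rather than producing an ungrounded answer.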

Comments

kristianp•1d ago
Sounds interesting, but how accurate is it? Have you done evals?
Kong91•1d ago
It's pretty accurate. It cites the case law it used to answer, so you can check that the case exists and that it did not hallucinate or cite US law, etc.
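On the evals question, one way to check the pre-check step itself (as opposed to the final answers) would be to run it over a small hand-labeled set of questions and count how often it skips retrieval when it should not have. The sketch below assumes a `gate` function that returns `True` when retrieval is needed; the questions, labels, and toy gate are made-up placeholders, not data from the product.

```python
from typing import Callable, Dict, Iterable, Tuple


def evaluate_gate(
    gate: Callable[[str], bool],
    labeled: Iterable[Tuple[str, bool]],
) -> Dict[str, float]:
    """Compare gate decisions against hand labels (True = needs retrieval)."""
    correct = missed = unneeded = total = 0
    for question, needs_retrieval in labeled:
        predicted = gate(question)
        total += 1
        if predicted == needs_retrieval:
            correct += 1
        elif needs_retrieval:
            missed += 1      # skipped retrieval that was needed: hurts answers
        else:
            unneeded += 1    # ran retrieval that was not needed: hurts cost
    return {
        "accuracy": correct / total if total else 0.0,
        "missed_retrievals": missed,
        "unneeded_retrievals": unneeded,
    }


# Made-up placeholder examples; a real set would come from logged questions.
labeled_questions = [
    ("Hi, who are you?", False),
    ("What did the court hold in the cited case?", True),
    ("Thanks!", False),
    ("Summarize the uploaded contract", True),
]


# Toy stand-in for the cheap-model pre-check: long questions need retrieval.
def toy_gate(question: str) -> bool:
    return len(question.split()) > 4


report = evaluate_gate(toy_gate, labeled_questions)
```

The two error types are kept separate because they cost different things: a missed retrieval can produce an answer with no supporting source, while an unneeded retrieval only gives back some of the savings.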

WWDC: Disappointing in Terms of Apple AI?

https://www.heise.de/en/news/WWDC-Disappointing-in-terms-of-Apple-AI-10421859.html
1•doener•8s ago•0 comments

Ignoring the value of "quiet work" starts in the classroom

https://blog.medium.com/ignoring-the-value-of-quiet-work-starts-in-the-classroom-cb57ee7a602c
1•rbanffy•5m ago•0 comments

iNymbus

https://www.inymbus.com
1•inymbuss•6m ago•0 comments

Forcing AI Personas to Admit Ignorance Makes Them More Realistic

https://askrally.com/paper/character-llm-a-trainable-agent-for-role-playing
1•virtual_rf•6m ago•0 comments

Revolutionizing Open Source: How Our OSPO Transformed Our Strategy

https://medium.com/mercadolibre-tech/revolutionizing-open-source-how-our-ospo-transformed-our-strategy-2f0252e35167
1•Tomte•6m ago•0 comments

I

2•inymbuss•7m ago•0 comments

A.I. Is Coming for the Coders Who Made It

https://www.nytimes.com/2025/06/02/opinion/ai-coders-jobs.html
3•donohoe•11m ago•2 comments

Flux Kontext: A new generation of multimodal image generation and editing tools

https://kontextflux.com
1•AllenRen•12m ago•0 comments

We've Been Moving Data Around for Decades

https://www.datamanagement.ai/
1•shen_pandi•25m ago•1 comments

Kurzweil: We'll Outpace Aging by 2029

https://www.popularmechanics.com/science/a64906457/humans-going-backwards-in-time/
2•karlperera•26m ago•3 comments

DNS Does Not Have to Be Hard

https://www.danielfullstack.com/article/dns-does-not-have-to-be-hard
1•thunderbong•27m ago•0 comments

ThorVG: Super Lightweight Vector Graphics Engine

https://www.thorvg.org/about
1•elcritch•29m ago•0 comments

The LLM is just guessing and that's quite okay

https://www.ralphminderhoud.com/blog/llm-just-guessing/
1•lazypenguin•30m ago•0 comments

Show HN: Aruko – Plan group travel without the chaos (no app download needed)

https://www.aruko.world/
2•ankit21j•32m ago•0 comments

TradExpert: Revolutionizing Trading with Mixture of Expert LLMs

https://arxiv.org/abs/2411.00782
15•wertyk•35m ago•3 comments

Returning UnitedHealth CEO to face questions over pay and share price

https://www.ft.com/content/6dcbb9b2-38b4-491a-827b-b687b29f2fbe
1•cebert•37m ago•2 comments

Computer science has one of the highest unemployment rates

https://www.newsweek.com/computer-science-popular-college-major-has-one-highest-unemployment-rates-2076514
18•zdgeier•37m ago•4 comments

DSPy in Elixir

https://github.com/arthurcolle/dspy.ex
1•Dowwie•40m ago•0 comments

The Leporine Trap

https://leporinetrap.wordpress.com
1•a2code•40m ago•0 comments

Apple Challenges EU Order to Increase Compatibility with Rivals' Products

https://www.wsj.com/tech/apple-challenges-eu-order-to-increase-compatibility-with-rivals-products-52082b50
2•aspenmayer•44m ago•2 comments

The Simple Macroeconomics of AI (2024) [pdf]

https://economics.mit.edu/sites/default/files/2024-05/The%20Simple%20Macroeconomics%20of%20AI.pdf
1•gone35•46m ago•0 comments

Welcome to the age of $10/month Lakehouses

https://tobilg.com/the-age-of-10-dollar-a-month-lakehouses
2•furkansahin•49m ago•0 comments

I've built an open source streaming library for async pipelines

https://github.com/ju-bezdek/conveyor
1•ju-bezdek•53m ago•1 comments

Random Silicon Sampling with AI Personas

https://askrally.com/paper/random-silicon-sampling-simulating-human-sub-population-opinion-using-a-large-language-model-based-on-group-level-demographic-information
1•virtual_rf•53m ago•0 comments

NSDL IPO – Dates, GMP, Price Band, Financials and Subscription Info

https://www.ipogyan.in/2025/06/nsdl-ipo-gmp-details.html
1•deepmistry•54m ago•0 comments

Upcoming IPOs in June 2025

https://www.ipogyan.in/2025/06/upcoming-ipos-june-2025.html
1•deepmistry•56m ago•0 comments

Things I Learned This Week: Patching Pitfalls, Go's OOP Philosophy, Python Async

https://krthr.co/things-i-learned-this-week-may-5/
1•krthr•59m ago•0 comments

Meta Aims to Automate Ad Creation Using AI

https://www.wsj.com/tech/ai/meta-aims-to-fully-automate-ad-creation-using-ai-7d82e249
3•thm•1h ago•0 comments

Schej

https://schej.it/
2•zekrioca•1h ago•0 comments

Kan.bn – An open-source alternative to Trello

https://github.com/kanbn/kan
5•henryball•1h ago•1 comments