Retrieval Augmented Generation Based on SQLite

https://github.com/ggozad/haiku.rag
71•emzo•13h ago

Comments

wredcoll•7h ago
This looks cool, and I'm interested in these keywords, but I read the entire readme and I'm still unsure what problem it's actually solving.

Anyone want to help out?

webstrand•7h ago
This is for LLMs. In general RAG takes a user prompt and uses it to find potentially relevant documents in the database. It then enriches the original prompt with those documents so that the LLM has context that wasn't in its training dataset.
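
A minimal sketch of that flow, not haiku.rag's actual code: the word-overlap retriever is a stand-in for a real search backend, and the names `retrieve`, `build_prompt`, and `DOCS` are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query (toy stand-in for a real retriever)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]  # drop documents with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def build_prompt(query, documents, k=2):
    """Enrich the user's question with the retrieved documents."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Made-up documents the LLM would not have seen in training.
DOCS = [
    "The cafeteria opens at 8 am on weekdays.",
    "Vacation requests must be filed two weeks in advance.",
    "The VPN requires a hardware token.",
]

prompt = build_prompt("When does the cafeteria open?", DOCS)
print(prompt)
```

The resulting `prompt` string is what would be sent to the LLM instead of the bare question.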
Octplane•7h ago
RAG -> vector search: your documents are indexed not as full text but as vectors (embeddings), which means you can search by concept instead of the exact strings you would use with a regular full-text search.

This makes the search less precise and more powerful at the same time (i.e., it can look clever to some extent).
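
A toy illustration of searching by concept, assuming hand-made 3-dimensional vectors in place of a real embedding model (the vectors, `INDEX`, and `search` are all made up for the example):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical embeddings; the axes loosely mean (animals, vehicles, food).
INDEX = {
    "A kitten sleeping on the sofa": [0.9, 0.0, 0.1],
    "Truck maintenance schedule": [0.0, 1.0, 0.0],
    "Recipe for tomato soup": [0.1, 0.0, 0.9],
}

def search(query_vec, index, k=1):
    """Return the k texts whose vectors are most similar to the query vector."""
    ranked = sorted(index, key=lambda text: cosine(query_vec, index[text]), reverse=True)
    return ranked[:k]

# Pretend embedding of the query "cat": it shares no word with the kitten document.
cat_vec = [1.0, 0.0, 0.0]
top = search(cat_vec, INDEX)
print(top)
```

The query never mentions "kitten", so an exact-string search would miss the document that the vector comparison ranks first.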

almosthere•7h ago
Sqlite has an embedding search? Or is that being provided by this tool?
Octplane•7h ago
It's provided via https://github.com/asg017/sqlite-vec
ethan_smith•2h ago
SQLite itself doesn't have native embedding search, but extensions like sqlite-vss and sqlite-vec add vector similarity search capabilities to SQLite.
rcarmo•4h ago
This and sqlite-vec (or whatever extension is trendy these days) can do a lot with very little compute.
anoojb•4h ago
Would love to see a system that blends cheap lexical (full-text) search and semantic/vector search using SQLite, and chooses the best approach given the input.
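
One common way to blend the two, rather than pick one per query, is reciprocal rank fusion. The sketch below assumes each backend has already returned a ranked list of document IDs; the IDs and the constant `k=60` are illustrative, not taken from any tool in this thread.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists; each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a full-text query and a vector query.
lexical = ["doc_a", "doc_b", "doc_c"]
semantic = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([lexical, semantic])
print(fused)
```

Documents that rank well in both lists rise to the top, and neither backend's raw scores need to be on a comparable scale.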
bob1029•3h ago
If you want the best possible solution for most businesses, I'd look at using Lucene for FTS duty.

Having the FTS engine provide a Google-style snippet of the most relevant document chunk is the holy grail for RAG applications. Lucene does this kind of thing better than anyone else:

https://lucene.apache.org/core/8_0_0/highlighter/org/apache/...

It is also very easy to customize this engine and align the document tokenization & indexing concerns with your specific retrieval scenarios.
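
To make the snippet idea concrete, here is a toy sketch in plain Python. It is not Lucene's highlighter and makes no claim about how Lucene scores passages; it just picks the window of words containing the most query terms. The function name `best_snippet` and the sample text are made up.

```python
import re

def normalize(word):
    """Lowercase a word and strip punctuation so 'Search.' matches 'search'."""
    return re.sub(r"\W+", "", word.lower())

def best_snippet(text, query, width=8):
    """Return the width-word window with the most query-term hits, hits in [brackets]."""
    words = text.split()
    terms = {normalize(t) for t in query.split()}
    best_start, best_score = 0, -1
    for start in range(max(1, len(words) - width + 1)):
        score = sum(1 for w in words[start:start + width] if normalize(w) in terms)
        if score > best_score:  # strict '>' keeps the earliest best window
            best_start, best_score = start, score
    window = words[best_start:best_start + width]
    return " ".join(f"[{w}]" if normalize(w) in terms else w for w in window)

text = ("SQLite is a small embedded database. With an extension it can "
        "also run vector search over stored embeddings.")
snippet = best_snippet(text, "vector search", width=5)
print(snippet)
```

In a RAG pipeline, a passage like this could be handed to the LLM in place of the whole document.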
