Valori – A Python-native Vector Database I built from scratch

4•varshith17•4h ago
I’ve been working on a project called Valori, a Python-native vector database I built from the ground up — not by reinventing every algorithm, but by wiring together efficient, well-known indexing and search techniques into a cohesive, hackable framework.

The idea came from my frustration with existing vector DBs that were either too heavy for experimentation or too opaque to modify. I wanted something simple, modular, and extensible — so I built it.

What it does:

Lets you store, index, and search high-dimensional vectors

Supports multiple indices (Flat, HNSW, IVF, LSH, Annoy)

Has memory, disk, and hybrid storage backends

Includes a full document processing pipeline (parsing, cleaning, chunking, embedding)

Offers quantization, persistence, and plugin-based extensibility
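For readers unfamiliar with the simplest of the index types listed above, here is a minimal sketch of what a flat (brute-force) index does: it stores every vector and scores the query against all of them. This is not Valori's API. The `FlatIndex` class, its method names, and the choice of cosine similarity are assumptions made for illustration, and it uses only the standard library so it runs without any installation.

```python
import heapq
import math


class FlatIndex:
    """Hypothetical brute-force index: compares the query against every stored vector."""

    def __init__(self, dim):
        self.dim = dim
        self._ids = []
        self._vectors = []

    def add(self, vec_id, vector):
        # Reject vectors whose length does not match the index dimension.
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        self._ids.append(vec_id)
        self._vectors.append([float(x) for x in vector])

    @staticmethod
    def _cosine(a, b):
        # Cosine similarity; defined here as 0.0 when either vector is all zeros.
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    def search(self, query, k=3):
        # Score every stored vector, then keep the k highest-scoring ids.
        if len(query) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(query)}")
        scored = [(vec_id, self._cosine(query, vec))
                  for vec_id, vec in zip(self._ids, self._vectors)]
        return heapq.nlargest(k, scored, key=lambda pair: pair[1])


index = FlatIndex(dim=3)
index.add("a", [1.0, 0.0, 0.0])
index.add("b", [0.0, 1.0, 0.0])
index.add("c", [0.9, 0.1, 0.0])
print(index.search([1.0, 0.0, 0.0], k=2))
```

A flat index is exact but costs one comparison per stored vector on every query, which is the trade-off that approximate indices such as HNSW, IVF, and LSH are designed to avoid.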

All written in Python, integrated with NumPy, and production-tested with logging and monitoring built in.
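As a rough illustration of the chunking step in the document pipeline mentioned in the feature list, the sketch below splits text into overlapping word windows before embedding. It is an assumption about how such a step could look, not Valori's implementation. The `chunk_text` function and its `chunk_size` and `overlap` parameters are hypothetical names.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Hypothetical chunker: split text into word windows that overlap by `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_text(sample, chunk_size=4, overlap=1):
    print(chunk)
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of storing and embedding some words twice.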

Install:

pip install valori

GitHub: https://github.com/varshith-Git/valori

PyPI: https://pypi.org/project/valori

I’d love to hear your thoughts —

What’s missing for you in current vector DBs?

If you’ve built LLM or RAG systems, what do you wish a lightweight, pure Python DB like this handled better?

Would you prefer tighter integrations (LangChain, Haystack, etc.) or a more “build-it-yourself” style?

Feedback, criticism, or collaboration ideas are all welcome.

Varshith (varshith.gudur17@gmail.com)

Comments

bendtb•3h ago
What’s the advantage of this being in Python?

redskyluan•3h ago
Dude, you already missed the window.

Nothing is better than SQLite as a library, and don't use high performance as your selling point for a Python product.
