Ragged – Leveraging Video Container Formats for Efficient Vector DB Distribution

1•loaderchips•5h ago
Hello HN community,

Longtime lurker and really happy to be writing this post. I'm excited to share a proof of concept I've been working on for efficient vector database distribution called Ragged. In my paper and PoC, I explore leveraging the MP4 video container format to store and distribute high-dimensional vectors for semantic search applications.

The idea behind Ragged is to encode vectors and their metadata into MP4 files using custom tracks, so they can be distributed through existing Content Delivery Networks (CDNs). This approach stays compatible with standard video infrastructure while achieving search performance comparable to traditional vector databases.
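To make the encoding idea concrete, here is a minimal sketch of packing float32 vectors into a single MP4-style box (a 4-byte big-endian size, a 4-byte type, then the payload). This is an illustration under stated assumptions, not Ragged's actual format: the box type `vecs`, the count/dimension header, and the function names are all hypothetical.

```python
import struct

# Hypothetical four-character box type, chosen for illustration only.
BOX_TYPE = b"vecs"


def pack_box(box_type: bytes, payload: bytes) -> bytes:
    """Wrap a payload in a basic box: 32-bit size (header included) + type."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly 4 bytes")
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def encode_vectors(vectors: list[list[float]]) -> bytes:
    """Serialize equal-length vectors as big-endian float32 values.

    Assumed payload layout: vector count (uint32), dimension (uint32),
    then the vectors back to back.
    """
    dim = len(vectors[0]) if vectors else 0
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    payload = struct.pack(">II", len(vectors), dim)
    for v in vectors:
        payload += struct.pack(f">{dim}f", *v)
    return pack_box(BOX_TYPE, payload)


def decode_vectors(box: bytes) -> list[list[float]]:
    """Inverse of encode_vectors for a buffer holding exactly one box."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != BOX_TYPE or size != len(box):
        raise ValueError("not a well-formed vector box")
    count, dim = struct.unpack_from(">II", box, 8)
    vectors = []
    offset = 16  # 8-byte box header + 8-byte count/dimension header
    for _ in range(count):
        vectors.append(list(struct.unpack_from(f">{dim}f", box, offset)))
        offset += 4 * dim
    return vectors
```

A real file would also need the surrounding container structure (file type, track, and fragment boxes) that players and CDNs expect. The sketch only shows that fixed-width vectors map cleanly onto a length-prefixed box.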

Key highlights of my work include:

- A novel encoding scheme for high-dimensional vectors and metadata into MP4 container formats.
- CDN-optimized architecture with HTTP range requests, fragment-based access patterns, and intelligent prefetching.
- Comprehensive evaluation showing significant improvements in cold-start latency and global accessibility.
- An open-source implementation to facilitate reproduction and adoption.
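The range-request part of the list above can be sketched as follows. With fixed-width vectors, a client can compute the byte span of any vector and ask the CDN for just those bytes. The layout (float32 values stored back to back from a known `data_offset`), the function names, and the URL are assumptions for illustration, not details taken from the Ragged implementation.

```python
import urllib.request


def vector_byte_range(index: int, dim: int, data_offset: int,
                      bytes_per_value: int = 4) -> tuple[int, int]:
    """Return the inclusive (start, end) byte positions of one vector.

    Assumes vectors of `dim` fixed-width values are stored contiguously,
    starting at `data_offset` bytes into the file.
    """
    length = dim * bytes_per_value
    start = data_offset + index * length
    return start, start + length - 1  # HTTP Range end positions are inclusive


def build_range_request(url: str, start: int, end: int) -> urllib.request.Request:
    """Build (but do not send) a GET request for bytes start..end."""
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})


# Hypothetical URL; fetch the third 384-dimensional vector.
start, end = vector_byte_range(index=2, dim=384, data_offset=16)
request = build_range_request("https://cdn.example.com/index.mp4", start, end)
```

Passing `request` to `urllib.request.urlopen` would perform the fetch. A server that honors the header answers with `206 Partial Content` and only the requested bytes, which is what lets a client read part of an index without downloading the whole file.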

I was inspired by the innovative work of Memvid (https://github.com/Olow304/memvid), which demonstrated the potential of using video formats for data storage. My project builds on this concept with a focus on CDNs and semantic search.

I believe Ragged offers a promising solution for deploying semantic search capabilities in edge computing and serverless environments, leveraging the mature video distribution ecosystem. Sharing indexed knowledge bases as offline MP4 files could also unlock a new class of applications.

I'm eager to hear your thoughts, feedback, and any potential use cases you envision for this approach. You can find the full paper and implementation details [here](https://github.com/nikitph/ragged).

Thank you for your time, fellows.

Nikit
