Traditional RAG systems use vectors to find relevant context via semantic search, but then throw those vectors away and re-tokenize the retrieved text before passing it to the LLM! REFRAG instead feeds the LLM these pre-computed vectors directly, achieving massive gains in long-context processing and LLM inference speed!
REFRAG makes Time-To-First-Token (TTFT) up to 31x faster and Time-To-Iterative-Token (TTIT) 3x faster, boosting overall LLM throughput by 7x while also handling much longer contexts!
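A rough sketch of why this speeds things up: instead of expanding every retrieved chunk back into hundreds of token embeddings, each chunk's stored vector is projected into the LLM's embedding space and consumed as a single input position. The dimensions, random data, and projection matrix below are illustrative assumptions for intuition only, not REFRAG's actual architecture:

```python
import numpy as np

# Hypothetical sizes: vector-DB embedding dim vs. LLM token-embedding dim.
DB_DIM, LLM_DIM = 384, 1024

rng = np.random.default_rng(0)

def standard_rag_context(chunk_token_counts, llm_dim=LLM_DIM):
    """Standard RAG: retrieved vectors are used only for search; each chunk's
    text is re-tokenized, so the LLM sees one embedding per token."""
    total_tokens = sum(chunk_token_counts)
    return rng.normal(size=(total_tokens, llm_dim))

def refrag_style_context(chunk_vectors, projection):
    """REFRAG-style: reuse the precomputed chunk vectors, projecting each one
    into the LLM embedding space, so a whole chunk costs one input position."""
    return chunk_vectors @ projection  # shape: (num_chunks, LLM_DIM)

chunk_token_counts = [128, 256, 192]              # three retrieved chunks
chunk_vectors = rng.normal(size=(3, DB_DIM))      # their stored embeddings
projection = rng.normal(size=(DB_DIM, LLM_DIM))   # assumed learned projector

dense = standard_rag_context(chunk_token_counts)
compressed = refrag_style_context(chunk_vectors, projection)

print(dense.shape)       # one row per retrieved token
print(compressed.shape)  # one row per retrieved chunk
```

Since attention cost grows with sequence length, shrinking the retrieved context from hundreds of token positions to a handful of chunk positions is where the TTFT and throughput gains come from.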
This is such an exciting evolution for Vector Database applications, and for Weaviate’s mission to weave AI and database systems together! I loved diving into the details of REFRAG with Xiaoqiang, and I hope you enjoy the podcast!
YouTube: https://www.youtube.com/watch?v=yi7v-UXMg0U
Spotify: https://spotifycreators-web.app.link/e/RWvmvMgRZXb