A common pattern I kept seeing is to split the problem into two stages:
1. Retrieve a small set of relevant candidates
2. Re-rank them using a model
Instead of running model inference across all items, I built a small prototype around this idea.
The flow looks like this:
- Store embeddings in a vector database (ChromaDB)
- Retrieve the Top-K most similar items/users based on vector similarity
- Run a TensorFlow.js model to re-rank the candidates
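The two stages above can be sketched in plain JavaScript. This is a minimal stand-in, not the actual prototype: the in-memory array replaces ChromaDB, and the scoring function replaces the TensorFlow.js model, but the shape of the flow (vector retrieval first, re-ranking on the small candidate set second) is the same.

```javascript
// Cosine similarity between two embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Stage 1: retrieve the Top-K most similar items from the store
// (a real system would ask the vector database for this instead of scanning).
function retrieveTopK(queryEmbedding, store, k) {
  return store
    .map(item => ({ ...item, sim: cosine(queryEmbedding, item.embedding) }))
    .sort((x, y) => y.sim - x.sim)
    .slice(0, k);
}

// Stage 2: re-rank only the K candidates with a model score
// (here a stand-in function; in the prototype, a TF.js model's prediction).
function rerank(candidates, scoreFn) {
  return candidates
    .map(c => ({ ...c, score: scoreFn(c) }))
    .sort((x, y) => y.score - x.score);
}

// Tiny example corpus with 2-d embeddings.
const store = [
  { id: "a", embedding: [1, 0] },
  { id: "b", embedding: [0.9, 0.1] },
  { id: "c", embedding: [0, 1] },
];

const candidates = retrieveTopK([1, 0], store, 2);
const ranked = rerank(candidates, c => c.sim);
console.log(ranked.map(r => r.id)); // [ 'a', 'b' ]
```

The point of the split is visible even here: the model only ever sees K candidates, so inference cost stops depending on corpus size.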
The goal is to reduce the search space before applying inference, which seems necessary when latency and scale matter.
What I found interesting is that once you move to this approach, a lot of the complexity shifts from the model itself to the retrieval layer:
- choosing K
- filtering candidates
- embedding quality
- latency vs recall trade-offs
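One way to make the K choice empirical rather than a guess (a sketch, assuming you can afford a brute-force full scan offline on a sample of queries): measure the retrieval stage's recall@K against the full-scan ranking, then pick the smallest K that meets your recall target.

```javascript
// Fraction of the "true" top results (from an offline full scan)
// that the Top-K vector retrieval actually surfaced.
function recallAtK(trueTopIds, retrievedIds) {
  const retrieved = new Set(retrievedIds);
  const hits = trueTopIds.filter(id => retrieved.has(id)).length;
  return hits / trueTopIds.length;
}

// Example: full scan says a, b, c matter; K=3 retrieval returned a, b, d.
console.log(recallAtK(["a", "b", "c"], ["a", "b", "d"])); // ≈ 0.667
```

Sweeping K and plotting recall against retrieval latency makes the trade-off concrete instead of a gut call.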
Curious how others approach this in real systems:
- How do you decide on K?
- Do you rely purely on vector similarity or add heuristics?
- How do you handle re-ranking at scale?
Project: https://github.com/ftonato/recommendation-system-chromadb-tf...