1.5B LLM routing model that aligns to preferences, not leaderboards

https://huggingface.co/katanemo/Arch-Router-1.5B
3•honorable_coder•4h ago

Comments

honorable_coder•4h ago
Hi HN, we're the team behind Arch (an open-source edge and service proxy for agents)[1], and today we're releasing Arch-Router (https://huggingface.co/katanemo/Arch-Router-1.5B), a 1.5B LLM router model designed to align to user-defined preferences, not public benchmarks and leaderboards.

As teams integrate multiple LLMs, each with different strengths, styles, or cost/latency profiles, routing the right prompt to the right model becomes a critical part of application design. But it's still an open problem. Most routing systems fall into two camps:

- Embedding-based routers use intent classifiers: they label a prompt as “support,” “SQL,” or “math,” then route it to a matching model. This works for simple tasks but breaks down in real conversations. Users shift topics mid-conversation, task boundaries blur, and product changes require retraining classifiers.

- Performance-based routers pick models based on benchmarks like MMLU or MT-Bench, or based on latency or cost curves. But benchmarks often can't capture what matters in production: domain-specific quality or subjective evaluation criteria. These routers are often opaque, difficult to debug, and their quality judgments can feel arbitrary, failing to capture the subjective nuance of what a “good” response actually means for a specific user’s intent.

Arch-Router takes a different approach: route to LLMs based on preferences written as policies in plain ol' English.

You write policies like “contract clauses → GPT-4o” or “quick travel tips → Gemini Flash.” The router maps the prompt (and the full conversation context) to those policies using a lightweight 1.5B auto-regressive model. The model handles intent drift, supports multi-turn conversations, and lets you swap models in or out with a one-line change to the routing policy. To read more about the strengths of our model, check out our research paper: https://arxiv.org/abs/2506.16655

Essentially, Arch-Router splits the routing process into two distinct parts:

- Route Selection: This is the what. The system defines a set of human-readable routing policies using a “Domain-Action Taxonomy.” Think of it as a clear API contract written in plain English. A policy isn’t just intent_123; it’s a descriptive label like Domain: ‘finance’, Action: ‘analyze earnings report’. The router’s only job is to match the user’s query to the best-fit policy description.

- Model Assignment: This is the how. A separate, simple mapping configuration connects each policy to a specific LLM. The finance/"analyze earnings report" policy might map to a powerful model like GPT-4o, while a simpler general/"greeting" policy maps to a faster, cheaper model.
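
To make the split concrete, here is a rough Python sketch of the two parts. It is not Arch's actual config format or API: the `RoutePolicy` class, the policy names, the model identifier strings, and both functions are hypothetical, made up for illustration. In particular, `select_route` is only a word-overlap placeholder standing in for the 1.5B router model, so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RoutePolicy:
    name: str         # hypothetical "domain/action" identifier
    description: str  # plain-English text the router matches against


# Part 1 (the "what"): human-readable policies, with no model names in them.
POLICIES = [
    RoutePolicy("finance/analyze_earnings_report",
                "analyze a company earnings report or financial statement"),
    RoutePolicy("legal/contract_clauses",
                "draft or review contract clauses"),
    RoutePolicy("general/greeting",
                "greeting or small talk"),
]

# Part 2 (the "how"): a separate mapping from policy to model.
# The model identifiers are illustrative strings, not verified API names.
MODEL_FOR_POLICY = {
    "finance/analyze_earnings_report": "gpt-4o",
    "legal/contract_clauses": "gpt-4o",
    "general/greeting": "gemini-flash",
}
DEFAULT_MODEL = "gemini-flash"


def select_route(conversation: List[str],
                 policies: List[RoutePolicy]) -> Optional[str]:
    """Placeholder for the router model.

    Scores each policy by how many words of its description appear anywhere
    in the conversation and returns the best-scoring policy name, or None
    when nothing overlaps. The real router is a language model, not this.
    """
    words = set(" ".join(conversation).lower().split())
    best_name, best_score = None, 0
    for policy in policies:
        score = len(words & set(policy.description.lower().split()))
        if score > best_score:
            best_name, best_score = policy.name, score
    return best_name


def assign_model(policy_name: Optional[str],
                 mapping: Dict[str, str]) -> str:
    """Look up the model for a policy, falling back to a default."""
    if policy_name is None:
        return DEFAULT_MODEL
    return mapping.get(policy_name, DEFAULT_MODEL)


if __name__ == "__main__":
    chat = ["Hello there", "Can you analyze this earnings report for me?"]
    route = select_route(chat, POLICIES)
    print(route, "->", assign_model(route, MODEL_FOR_POLICY))
```

Because the mapping lives apart from the policies, swapping the model behind a policy means editing one entry in `MODEL_FOR_POLICY`; the policies and the route-selection step stay untouched.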
Specs:

- 1.5B params — runs on a single GPU (or CPU for testing)

- No retraining needed — point it at any mix of LLMs

- Outperforms larger closed models on our conversational routing benchmarks (details in the paper)

Links:

[1] Arch Proxy: https://github.com/katanemo/archgw
