Show HN: Arch-Router – Aligning LLM Routing with Human Preferences

https://arxiv.org/abs/2506.16655
1•honorable_coder•2h ago
Hi HN, we're the team behind ArchGW [1], a Rust-based edge and service proxy for agents, and we recently published our research on LLM routing: https://arxiv.org/abs/2506.16655

As teams integrate multiple LLMs, each with different strengths, styles, or cost/latency profiles, routing the right prompt to the right model becomes a critical part of application design. But it's still an open problem. Most routing systems fall into two camps:

- Embedding-based routers use intent classifiers: they label a prompt as “support,” “SQL,” or “math,” then route it to a matching model. This works for simple tasks but breaks down in real conversations. Users shift topics mid-conversation, task boundaries blur, and product changes require retraining the classifiers.

- Performance-based routers pick models based on benchmarks like MMLU or MT-Bench, or based on latency or cost curves. But benchmarks often miss what matters in production: domain-specific quality or subjective preferences like “Will legal accept this clause?”

Arch-Router takes a different approach: it decouples route selection from model assignment. Developers write route policies using a domain-action taxonomy (for example, the domain "engineering" or the action "image editing"), and the router maps the prompt (and conversation context) to those policies using a lightweight 1.5B autoregressive model. No retraining, no fragile if/else chains. We built this with input from teams at Twilio and Atlassian. Arch-Router handles intent drift, supports multi-turn conversations, and lets you swap models in or out with a one-line change to the routing policy. Full details are in our paper, but here's a snapshot:

Specs:

- 1.5B params — runs on a single GPU (or CPU for testing)

- No retraining needed — point it at any mix of LLMs

- Routing can be cost, latency or quality aware based on your preferences

- Outperforms larger closed models on our conversational routing benchmarks (benchmarks in the paper)
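
To make the "decouple route selection from model assignment" idea concrete, here is a minimal Python sketch. It is not ArchGW's configuration format or Arch-Router's API: the policy names, keywords, and model names are hypothetical, and the keyword matcher is only a placeholder standing in for the 1.5B router model. The point it illustrates is that the policy-to-model table is separate from the selection step, so swapping a model is a one-line change.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class RoutePolicy:
    """A route policy: a name plus hints used by the placeholder selector."""
    name: str
    description: str
    keywords: Tuple[str, ...]  # hypothetical; the real router reads the description


# Hypothetical policies, loosely following the domain/action examples in the post.
POLICIES = [
    RoutePolicy("code_generation", "engineering: writing or debugging code",
                ("code", "bug", "function")),
    RoutePolicy("image_editing", "action: editing or transforming images",
                ("image", "crop", "photo")),
]

# Model assignment lives in its own table (model names are made up).
MODEL_FOR_ROUTE: Dict[str, str] = {
    "code_generation": "model-a",
    "image_editing": "model-b",
}
DEFAULT_MODEL = "model-default"


def select_route(conversation: List[str],
                 policies: List[RoutePolicy]) -> Optional[str]:
    """Placeholder for the router model: scan turns from newest to oldest
    and return the first policy whose keywords appear in a turn."""
    for turn in reversed(conversation):
        text = turn.lower()
        for policy in policies:
            if any(keyword in text for keyword in policy.keywords):
                return policy.name
    return None  # no policy matched


def pick_model(conversation: List[str],
               policies: List[RoutePolicy],
               assignment: Dict[str, str],
               default: str) -> str:
    """Step 1: select a route policy. Step 2: look up the assigned model."""
    route = select_route(conversation, policies)
    return assignment.get(route, default)


if __name__ == "__main__":
    chat = ["Can you fix this Python bug?", "Now crop the image to a square"]
    print(pick_model(chat, POLICIES, MODEL_FOR_ROUTE, DEFAULT_MODEL))

    # Swapping the model behind a route touches only the assignment table.
    swapped = {**MODEL_FOR_ROUTE, "image_editing": "model-c"}
    print(pick_model(chat, POLICIES, swapped, DEFAULT_MODEL))
```

Because the selector falls back to earlier turns when the latest one matches nothing, a vague follow-up such as "make it a bit wider" stays on the route chosen earlier in the conversation. That is a crude stand-in for the multi-turn handling described above.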

Links:

- ArchGW (open source edge and service proxy for agents): https://github.com/katanemo/archgw

- Model + code: https://huggingface.co/katanemo/Arch-Router-1.5B

- Paper: https://arxiv.org/abs/2506.16655

Does this look like a real woman? AI model in Vogue

https://www.bbc.com/news/articles/cgeqe084nn4o
1•pieterr•6m ago•0 comments

Portability of Tar Features

https://cdrtools.sourceforge.net/private/portability-of-tar-features.html
1•fanf2•6m ago•0 comments

How FastAPI Works

https://fastlaunchapi.dev/blog/how-fastapi-works/
1•sh_tomer•9m ago•0 comments

ChatGPT launches study mode to encourage 'responsible' academic use

https://www.theguardian.com/technology/2025/jul/29/chatgpt-openai-chatbot-study-mode-universities-students-education
1•01-_-•12m ago•0 comments

Claude Code and shipping stuff to prod

https://boliv.substack.com/p/claude-code-usage-patterns
1•brunooliv•16m ago•0 comments

Study reveals that 12-year-olds see OnlyFans as an alternative to work

https://www.psypost.org/teens-as-young-as-12-see-onlyfans-as-an-appealing-alternative-to-traditional-work-study-finds/
1•01-_-•20m ago•0 comments

Chroma: Open-source search and retrieval database for AI applications

https://www.trychroma.com/
1•teleforce•28m ago•0 comments

What'll happen if we spend nearly $3T on data centres no one needs?

https://www.ft.com/content/7052c560-4f31-4f45-bed0-cbc84453b3ce
2•cmsefton•33m ago•0 comments

Grok Explores Browardlocals.com Impact

http://browardlocals.com/
1•rogermaragh•39m ago•1 comments

Machine took control of my brain and eyeballs [video]

https://www.youtube.com/shorts/9Om2X6QcTgw
1•RicoElectrico•42m ago•0 comments

Three bad things: threads, garbage collection, and nondeterministic destructors

https://apenwarr.ca/log/20100810
1•porridgeraisin•44m ago•0 comments

Scheme-dql: S-expression data query language module

https://lists.nongnu.org/archive/html/guile-user/2025-07/msg00039.html
1•ynzoqn•47m ago•0 comments

How the Martian Was Written

https://www.youtube.com/watch?v=EXD3b6OLtsg
1•kamphey•50m ago•0 comments

Agent2Agent – Samples

https://github.com/yogananda-muthaiah/A2A
2•yogananda•51m ago•0 comments

Why did Anthropic choose an anus for Claude's logo?

1•xucian•53m ago•1 comments

Microsoft researchers have revealed the 40 jobs most exposed to AI

https://fortune.com/2025/07/31/microsoft-research-generative-ai-occupational-impact-jobs-most-and-least-likely-to-impact-teaching-office-jobs-college-gen-z-grads/
1•BerislavLopac•53m ago•0 comments

I got Wan 2.2 working in ComfyUI with just 8GB VRAM – here's the workflow

https://www.youtube.com/watch?v=7hUO6KhUsvQ
2•aitechtutorials•1h ago•1 comments

US Military's squad of satellite trackers is now routinely going on alert

https://arstechnica.com/space/2025/08/the-militarys-squad-of-satellite-trackers-is-now-routinely-going-on-alert/
6•xrayarx•1h ago•0 comments

Things I miss about civilization

https://www.nature.com/articles/d41586-025-02248-9
1•Bluestein•1h ago•0 comments

Royal Society right to keep Elon Musk as member, says new astronomer royal

https://www.theguardian.com/science/2025/aug/01/royal-society-elon-musk-astronomer-royal-michele-dougherty
2•Bluestein•1h ago•0 comments

Show HN: Team Timezone Wall (100% offline, single file)

1•jharohit•1h ago•4 comments

Show HN: Trivia Player

https://triviaplayer.com/
1•indest•1h ago•0 comments

Show HN: AI tool that builds and deploys n8n workflows from a single prompt

https://csworkflow.consciousstage.com/
3•abdulhak•1h ago•2 comments

AxxSolder

https://github.com/AxxAxx/AxxSolder
1•fk_fk•1h ago•0 comments

Ask HN: Am I Alone Here?

4•kinj28•1h ago•5 comments

Kadag Security – AI-driven security testing by running your app

https://kadagsecurity.com/
1•valentin_k•1h ago•1 comments

Has anyone tried FakeFind? It analyzes reviews like Fakespot used to

https://fakefind.ai/
1•UniJen•1h ago•2 comments

Banana Pi BPI-R4 Lite Released with MediaTek MT7987A and Wi-Fi 7 Support

https://linuxgizmos.com/banana-pi-bpi-r4-lite-released-with-mediatek-mt7987a-and-wi-fi-7-support/
1•teleforce•1h ago•0 comments

(NSFW) Google Search LLM Hallucinating

https://www.google.com/search?q=home+gym+in+a+got+garage&oq=home+gym+in+a+got+garage&gs_lcrp=EgZjaHJvbWUyBggAEEUYOTIHCAEQIRiPAtIBCDEyODhqMGo3qAIBsAIB&client=ms-android-att-us-rvc3&sourceid=chrome-mobile&ie=UTF-8
1•llmlapd•1h ago•0 comments

Modos Paper Monitor Brings High-Speed E-Paper to Developers

https://linuxgizmos.com/modos-paper-monitor-brings-high-speed-e-paper-to-developers/
3•teleforce•1h ago•0 comments