
Model literals, semantic aliases, and preference-aligned routing for LLMs

https://docs.archgw.com/guides/llm_router.html
1•honorable_coder•1h ago

Comments

honorable_coder•1h ago
Today we’re shipping a major update to ArchGW (an edge and service proxy for agents [1]): a unified router that supports three strategies for directing traffic to LLMs — from explicit model names, to semantic aliases, to dynamic preference-aligned routing. Here’s how each works on its own, and how they come together.

Preference-aligned routing decouples task detection (e.g., code generation, image editing, Q&A) from LLM assignment. This approach captures the preferences developers establish when testing and evaluating LLMs on their domain-specific workflows and tasks. So, rather than relying on an automatic router trained to beat abstract benchmarks like MMLU or MT-Bench, developers can dynamically route requests to the most suitable model based on internal evaluations, and easily swap out the underlying model for specific actions and workflows. This is powered by our 1.5B Arch-Router LLM [2]. We also recently published our research on this [3].
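To make the decoupling concrete, here is a minimal Python sketch. It is not ArchGW's API: `ROUTE_PREFERENCES`, `keyword_detector`, and `route` are hypothetical names, and the keyword check is only a stand-in for a real task detector such as a router model. The point is that the detector produces a label, and a separate, editable table decides which model serves that label.

```python
from typing import Callable, Dict

# Hypothetical preference table: task label -> model picked from internal evals.
# Changing a model for a task means editing this table, not the detector.
ROUTE_PREFERENCES: Dict[str, str] = {
    "code_generation": "anthropic/claude-3-5-sonnet-20241022",
    "general_qa": "openai/gpt-4o-mini",
}
DEFAULT_MODEL = "openai/gpt-4o"  # assumed fallback for unknown labels


def keyword_detector(prompt: str) -> str:
    """Placeholder task detector (assumption); a real setup would ask a router model."""
    text = prompt.lower()
    if "function" in text or "code" in text:
        return "code_generation"
    return "general_qa"


def route(prompt: str, detect: Callable[[str], str] = keyword_detector) -> str:
    """Detect the task label first, then look up the preferred model for it."""
    label = detect(prompt)
    return ROUTE_PREFERENCES.get(label, DEFAULT_MODEL)
```

Because `route` only depends on the label, the detector can be replaced (for example by a call to a hosted router model) without touching the preference table, and vice versa.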

Model aliases provide semantic, version-controlled names for models. Instead of using provider-specific model names like gpt-4o-mini or claude-3-5-sonnet-20241022 in your client, you can create meaningful aliases like "fast-model" or "arch.summarize.v1". This lets you test new models and swap them out in config safely, without a code-wide search/replace every time you want to use a new model for a specific workflow or task.
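As a rough illustration of the alias idea (again a sketch, not ArchGW's config format or API), the mapping below is a hypothetical `ALIASES` table and `resolve_alias` is a made-up helper. Client code only ever mentions the alias; the concrete model lives in one place.

```python
from typing import Dict

# Hypothetical alias table; which model each alias points to is an assumption.
ALIASES: Dict[str, str] = {
    "fast-model": "openai/gpt-4o-mini",
    "arch.summarize.v1": "anthropic/claude-3-5-sonnet-20241022",
}


def resolve_alias(name: str, aliases: Dict[str, str] = ALIASES) -> str:
    """Return the concrete model for an alias; names that are not aliases pass through."""
    return aliases.get(name, name)
```

Pointing "fast-model" at a different model is then a one-line change to the table, and every caller that asks for "fast-model" picks it up.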

Model literals (nothing new) let you specify exact provider/model combinations (e.g., openai/gpt-4o, anthropic/claude-3-5-sonnet-20241022), giving you full control and transparency over which model handles each request.
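For reference, a literal in the provider/model form can be split on the first slash. The sketch below assumes exactly that format; `ModelLiteral` and `parse_model_literal` are hypothetical names and not part of ArchGW.

```python
from typing import NamedTuple


class ModelLiteral(NamedTuple):
    provider: str
    model: str


def parse_model_literal(literal: str) -> ModelLiteral:
    """Split 'provider/model' on the first '/', rejecting strings missing either part."""
    provider, sep, model = literal.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {literal!r}")
    return ModelLiteral(provider, model)
```

Splitting on the first slash only means any later slashes stay inside the model part.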

P.S. We routinely get asked why we didn't build semantic/embedding models for routing use cases or use some form of clustering technique. Clustering/embedding routers tend to miss context and negation, and they struggle with short elliptical queries. An autoregressive approach conditions on the full context, letting the model reason about the task and generate an explicit label that can be used to match to an agent, task or LLM. In practice, this generalizes better to unseen or low-frequency intents and stays robust as conversations drift, without brittle thresholds or post-hoc cluster tuning.

[1] https://github.com/katanemo/archgw [2] https://huggingface.co/katanemo/Arch-Router-1.5B [3] https://arxiv.org/abs/2506.16655

Impact of Zelda and Ghibli on Young People's Exploration and Happiness

https://pmc.ncbi.nlm.nih.gov/articles/PMC12357126/
1•zufallsheld•56s ago•0 comments

Someone gave their consciousness to Gemini

https://open.substack.com/pub/mackenziesharp/p/i-gave-5-years-of-my-journals-to
1•gpucpufarmer•1m ago•1 comments

CATL: The Missed Empire and the Playbook for the Next Industrial VC

https://maggiexiao.com/catl/
1•walterbell•7m ago•0 comments

H-1B visas will cost $100K for new petitions; but could lead to more offshoring

https://www.theregister.com/2025/09/22/h1b_visa_changes/
2•rntn•8m ago•0 comments

Convert Google Maps Saved Places to Apple Maps

https://www.gotoapplemaps.com
2•ruslandautov•8m ago•1 comments

Nvidia and United Kingdom Build Nation's AI Infrastructure

https://nvidianews.nvidia.com/news/nvidia-and-united-kingdom-build-nations-ai-infrastructure-and-...
2•andrewstetsenko•10m ago•0 comments

Tell HN: You gave us pricing feedback, we're testing it

1•pedalpete•11m ago•0 comments

Three crashes in the first day: Tesla in Austin

https://arstechnica.com/cars/2025/09/teslas-robotaxi-test-three-crashes-in-only-7000-miles/
2•worik•12m ago•0 comments

Is Life a Form of Computation?

https://thereader.mitpress.mit.edu/is-life-a-form-of-computation/
1•anarbadalov•12m ago•3 comments

Libghostty Is Coming

https://mitchellh.com/writing/libghostty-is-coming
1•pbardea•12m ago•0 comments

Ask HN: Is anyone building mental health support for vibe coders?

1•mbm•12m ago•0 comments

Confessions of a 'Professional' Narcissist Influencer

https://nymag.com/intelligencer/article/diagnosed-narcissists-npd-disorder-coaching-hustle-influe...
2•rendx•17m ago•0 comments

LinkedIn will soon use your data to train AI. Here's what you can do to opt out

https://proton.me/blog/linkedin-ai-training
2•LopRabbit•18m ago•3 comments

We vs It: How AI is shifting power from humans to models

https://bisi.org.uk/reports/artificial-intelligence-power-dynamics-who-controls-ai
1•BigVan•19m ago•0 comments

Oracle's Ellison joins Musk and Zuckerberg in controlling platforms billions see

https://boingboing.net/2025/09/22/oracles-ellison-joins-musk-and-zuckerberg-in-controlling-platfo...
3•KittenInABox•20m ago•0 comments

Unsupervised Instance Segmentation with Superpixels

https://arxiv.org/abs/2509.05352
1•PaulHoule•20m ago•0 comments

Aposd-vs-clean-code: A discussion between John Ousterhout and Robert Martin

https://github.com/johnousterhout/aposd-vs-clean-code
1•Bogdanp•21m ago•0 comments

The Cost of Progressive Rollout

https://surfingcomplexity.blog/2025/09/13/the-hidden-trade-offs-of-fine-grained-progressive-rollo...
1•ijidak•22m ago•0 comments

Amiberry-Lite is an optimized Amiga emulator for ARM and RISC-V platforms

https://github.com/BlitterStudio/amiberry-lite
1•doener•24m ago•0 comments

Show HN: Spatialbound – Turn any location into an interactive 3D playgrounds

https://www.spatialbound.com
3•mibrahimSB•24m ago•2 comments

Bring your Launchpad back in MacOS26+

https://github.com/RoversX/LaunchNext
1•amazonhut•27m ago•0 comments

The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs

https://arxiv.org/abs/2509.09677
1•speckx•27m ago•0 comments

Tech Report: Winning CRS from Team Atlanta (DARPA AIxCC)

https://arxiv.org/abs/2509.14589
14•tsgates•27m ago•0 comments

Tenants Seek to Unionize One Private Equity Firm's Entire Housing Portfolio

https://www.bloomberg.com/news/articles/2025-09-22/capital-realty-group-tenants-seek-to-unionize-...
3•petethomas•29m ago•0 comments

Wake Forest University goes tuition free for families making less than 200k/year

https://news.wfu.edu/2025/09/17/wake-forest-university-will-be-tuition-free-for-admitted-students...
3•kogus•29m ago•2 comments

New Olympic calendar likely because of climate change

https://www.bbc.com/sport/athletics/articles/c5yj0wyje7lo
2•campuscodi•30m ago•0 comments

Vogte: Agentic TUI for Go projects with LLM integration

https://github.com/piqoni/vogte
1•amazonhut•33m ago•0 comments

SEAL Showdown

https://showdown.scale.com/showdown
1•ej88•34m ago•0 comments

Printer Tracking Dots

https://en.wikipedia.org/wiki/Printer_tracking_dots
1•tomas789•37m ago•1 comments

CADBase for engineers and designers updated to v0.3

1•mnnxp•38m ago•0 comments