frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Show HN: Model-literals, model-aliases, and preference-aligned routing for LLMs

https://docs.archgw.com/guides/llm_router.html
2•honorable_coder•4mo ago
Today we’re shipping a major update to ArchGW (an edge and service proxy for agents [1]): a unified router that supports three strategies for directing traffic to LLMs — from explicit model names, to semantic aliases, to dynamic preference-aligned routing. Here’s how each works on its own, and how they come together.

Preference-aligned routing decouples task detection (e.g., code generation, image editing, Q&A) from LLM assignment. This approach captures the preferences developers establish when testing and evaluating LLMs on their domain-specific workflows and tasks. So, rather than relying on an automatic router trained to beat abstract benchmarks like MMLU or MT-Bench, developers can dynamically route requests to the most suitable model based on internal evaluations — and easily swap out the underlying moodel for specific actions and workflows. This is powered by our 1.5B Arch-Router LLM [2]. We also published our research on this recently[3]

Modal-aliases provide semantic, version-controlled names for models. Instead of using provider-specific model names like gpt-4o-mini or claude-3-5-sonnet-20241022 in your client you can create meaningful aliases like "fast-model" or "arch.summarize.v1". This allows you to test new models, swap out the config safely without having to do code-wide search/replace every time you want to use a new model for a very specific workflow or task.

Model-literals (nothing new) lets you specify exact provider/model combinations (e.g., openai/gpt-4o, anthropic/claude-3-5-sonnet-20241022), giving you full control and transparency over which model handles each request.

P.S. we routinely get asked why we didn't build semantic/embedding models for routing use cases or use some form of clustering technique. Clustering/embedding routers miss context, negation, and short elliptical queries, etc. An autoregressive approach conditions on the full context, letting the model reason about the task and generate an explicit label that can be used to match to an agent, task or LLM. In practice, this generalizes better to unseen or low-frequency intents and stays robust as conversations drift, without brittle thresholds or post-hoc cluster tuning.

[1] https://github.com/katanemo/archgw [2] https://huggingface.co/katanemo/Arch-Router-1.5B [2] https://arxiv.org/abs/2506.16655

Near-Instantly Aborting the Worst Pain Imaginable with Psychedelics

https://psychotechnology.substack.com/p/near-instantly-aborting-the-worst
1•eatitraw•3m ago•0 comments

Show HN: Nginx-defender – realtime abuse blocking for Nginx

https://github.com/Anipaleja/nginx-defender
2•anipaleja•4m ago•0 comments

The Super Sharp Blade

https://netzhansa.com/the-super-sharp-blade/
1•robin_reala•5m ago•0 comments

Smart Homes Are Terrible

https://www.theatlantic.com/ideas/2026/02/smart-homes-technology/685867/
1•tusslewake•7m ago•0 comments

What I haven't figured out

https://macwright.com/2026/01/29/what-i-havent-figured-out
1•stevekrouse•7m ago•0 comments

KPMG pressed its auditor to pass on AI cost savings

https://www.irishtimes.com/business/2026/02/06/kpmg-pressed-its-auditor-to-pass-on-ai-cost-savings/
1•cainxinth•7m ago•0 comments

Open-source Claude skill that optimizes Hinge profiles. Pretty well.

https://twitter.com/b1rdmania/status/2020155122181869666
2•birdmania•8m ago•1 comments

First Proof

https://arxiv.org/abs/2602.05192
2•samasblack•10m ago•1 comments

I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS

https://mohammedeabdelaziz.github.io/articles/trendscope-market-scanner
1•mohammede•11m ago•0 comments

Kagi Translate

https://translate.kagi.com
2•microflash•12m ago•0 comments

Building Interactive C/C++ workflows in Jupyter through Clang-REPL [video]

https://fosdem.org/2026/schedule/event/QX3RPH-building_interactive_cc_workflows_in_jupyter_throug...
1•stabbles•13m ago•0 comments

Tactical tornado is the new default

https://olano.dev/blog/tactical-tornado/
2•facundo_olano•15m ago•0 comments

Full-Circle Test-Driven Firmware Development with OpenClaw

https://blog.adafruit.com/2026/02/07/full-circle-test-driven-firmware-development-with-openclaw/
1•ptorrone•15m ago•0 comments

Automating Myself Out of My Job – Part 2

https://blog.dsa.club/automation-series/automating-myself-out-of-my-job-part-2/
1•funnyfoobar•15m ago•0 comments

Google staff call for firm to cut ties with ICE

https://www.bbc.com/news/articles/cvgjg98vmzjo
39•tartoran•15m ago•5 comments

Dependency Resolution Methods

https://nesbitt.io/2026/02/06/dependency-resolution-methods.html
1•zdw•16m ago•0 comments

Crypto firm apologises for sending Bitcoin users $40B by mistake

https://www.msn.com/en-ie/money/other/crypto-firm-apologises-for-sending-bitcoin-users-40-billion...
1•Someone•16m ago•0 comments

Show HN: iPlotCSV: CSV Data, Visualized Beautifully for Free

https://www.iplotcsv.com/demo
2•maxmoq•17m ago•0 comments

There's no such thing as "tech" (Ten years later)

https://www.anildash.com/2026/02/06/no-such-thing-as-tech/
1•headalgorithm•18m ago•0 comments

List of unproven and disproven cancer treatments

https://en.wikipedia.org/wiki/List_of_unproven_and_disproven_cancer_treatments
1•brightbeige•18m ago•0 comments

Me/CFS: The blind spot in proactive medicine (Open Letter)

https://github.com/debugmeplease/debug-ME
1•debugmeplease•18m ago•1 comments

Ask HN: What are the word games do you play everyday?

1•gogo61•21m ago•1 comments

Show HN: Paper Arena – A social trading feed where only AI agents can post

https://paperinvest.io/arena
1•andrenorman•23m ago•0 comments

TOSTracker – The AI Training Asymmetry

https://tostracker.app/analysis/ai-training
1•tldrthelaw•27m ago•0 comments

The Devil Inside GitHub

https://blog.melashri.net/micro/github-devil/
2•elashri•27m ago•0 comments

Show HN: Distill – Migrate LLM agents from expensive to cheap models

https://github.com/ricardomoratomateos/distill
1•ricardomorato•27m ago•0 comments

Show HN: Sigma Runtime – Maintaining 100% Fact Integrity over 120 LLM Cycles

https://github.com/sigmastratum/documentation/tree/main/sigma-runtime/SR-053
1•teugent•27m ago•0 comments

Make a local open-source AI chatbot with access to Fedora documentation

https://fedoramagazine.org/how-to-make-a-local-open-source-ai-chatbot-who-has-access-to-fedora-do...
1•jadedtuna•29m ago•0 comments

Introduce the Vouch/Denouncement Contribution Model by Mitchellh

https://github.com/ghostty-org/ghostty/pull/10559
1•samtrack2019•29m ago•0 comments

Software Factories and the Agentic Moment

https://factory.strongdm.ai/
1•mellosouls•30m ago•1 comments