Show HN: role-model, a router for hybrid local/cloud AI

https://github.com/try-works/role-model
1•try-working•1h ago
Hey everyone, I'm launching role-model today: a routing protocol, a reference router runtime, and an extension for Pi that allows for better informed routing decisions.

role-model is a mostly deterministic router, with fallback to a controller model, that routes requests according to a chosen routing strategy. The protocol is structured around assigning domains and roles to models, and requests sent by consumer applications like Pi carry task types that enrich the routing metadata and thereby improve accuracy. You can run the built-in benchmark to compare models on speed, quality and cost, alongside observed performance on real tasks. I have a diagram of how routing works in [0].
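
As a rough illustration of "mostly deterministic, with fallback to a controller model", here is a minimal sketch. It is not the actual role-model implementation: the `Model` type, the `role_for_task` table, the model ids and the `stub_controller` callable are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Model:
    # Hypothetical pool entry: an id plus the domains and roles assigned to it.
    model_id: str
    domains: frozenset
    roles: frozenset


# Hypothetical controller signature: (task_type, domain, pool) -> model id.
Controller = Callable[[str, str, List[Model]], str]


def route(task_type: str, domain: str, pool: List[Model],
          role_for_task: Dict[str, str], controller: Controller) -> str:
    """Pick a model by table lookup; defer to the controller when unsure."""
    role = role_for_task.get(task_type)  # None for unknown task types
    matches = [m for m in pool if domain in m.domains and role in m.roles]
    if len(matches) == 1:
        # Deterministic path: exactly one model fits the domain and role.
        return matches[0].model_id
    # Zero or several matches: let the controller decide.
    return controller(task_type, domain, pool)


POOL = [
    Model("local-small", frozenset({"code", "prose"}), frozenset({"drafter"})),
    Model("cloud-large", frozenset({"code"}), frozenset({"planner", "reviewer"})),
]
ROLE_FOR_TASK = {"autocomplete": "drafter", "refactor_plan": "planner"}


def stub_controller(task_type: str, domain: str, pool: List[Model]) -> str:
    # Stand-in for a controller model call; here it just picks the last entry.
    return pool[-1].model_id
```

With this setup, `route("autocomplete", "code", POOL, ROLE_FOR_TASK, stub_controller)` takes the deterministic path, while an unknown task type falls through to the controller.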

The runtime supports local models, either by connecting directly to your local endpoint (LM Studio, llama.cpp, etc.) or by routing between multiple local models via vendored llama-swap.

Since there was another model router post yesterday where people discussed the basics of routing, I will focus on some of the interesting things I've learned while building and testing this:

1. Model routing is essentially trying to predict the future: which model will perform optimally (based on criteria defined by the user) on this request?

2. After you have routed the request, you want to evaluate if it was the right decision or if some other model would have performed better

3. You also realize that having the router assess difficulty (among other things) to make decisions by itself is far from ideal - we'd prefer to have the consumer application work with the router to define what the request needs

4. You also realize that routing becomes much easier, decisions become much more accurate, and the outcomes of routing become more impactful when there is more of a distinction between models

For point 2, I will be launching evals that you can run locally to benchmark models in your pool on the same requests. The outcomes here can then be used for point 1, as input when routing new requests.
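
One way eval outcomes could feed back into routing is sketched below. It assumes, hypothetically, that each eval result is a `(task_type, model_id, score)` tuple with higher scores being better. The planned evals may use a different format and a different aggregation.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple

# Hypothetical eval record: (task_type, model_id, score), higher is better.
EvalResult = Tuple[str, str, float]


def best_model_by_task(results: Iterable[EvalResult]) -> Dict[str, str]:
    """Average scores per (task type, model) and keep the top model per task."""
    scores = defaultdict(list)
    for task_type, model_id, score in results:
        scores[(task_type, model_id)].append(score)

    best: Dict[str, Tuple[str, float]] = {}
    for (task_type, model_id), values in scores.items():
        avg = mean(values)
        if task_type not in best or avg > best[task_type][1]:
            best[task_type] = (model_id, avg)
    return {task_type: model_id for task_type, (model_id, _) in best.items()}


# Made-up scores for two models run on the same requests.
RESULTS = [
    ("bugfix", "model-a", 0.9),
    ("bugfix", "model-b", 0.6),
    ("bugfix", "model-b", 0.8),
    ("docs", "model-a", 0.5),
    ("docs", "model-b", 0.7),
]
```

The resulting table (task type to preferred model) is the kind of prior a router could consult before routing a new request of the same type.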

For point 3, I've built the pi-role-model package for Pi, which lets the Pi agent inject role_model.intent metadata including difficulty, preferred roles or even specific model ids, required capabilities (say tool use or image input) and so on. You should be able to customize this further in Pi, and route in additional ways by changing the metadata. This is why I've also built the role-model routing protocol.
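
To make the intent metadata concrete, here is a minimal sketch of how a router might use it. Only the name role_model.intent comes from the post. The field names (`required_capabilities`, `model_ids`, `preferred_roles`) and the pool layout are assumptions for illustration, not the protocol's actual schema.

```python
from typing import Any, Dict, List


def candidates_for_intent(intent: Dict[str, Any],
                          pool: List[Dict[str, Any]]) -> List[str]:
    """Return eligible model ids, best candidates first (hypothetical schema)."""
    # Hard filter: drop models that lack any required capability.
    required = set(intent.get("required_capabilities", []))
    eligible = [m for m in pool if required <= set(m["capabilities"])]

    # Explicitly requested model ids win if any of them is eligible.
    pinned = set(intent.get("model_ids", []))
    pinned_ids = [m["id"] for m in eligible if m["id"] in pinned]
    if pinned_ids:
        return pinned_ids

    # Soft preference: models holding a preferred role sort first (stable sort).
    preferred = set(intent.get("preferred_roles", []))
    ranked = sorted(eligible,
                    key=lambda m: 0 if preferred & set(m["roles"]) else 1)
    return [m["id"] for m in ranked]


POOL = [
    {"id": "local-small", "roles": ["drafter"], "capabilities": ["tool_use"]},
    {"id": "cloud-large", "roles": ["planner"],
     "capabilities": ["tool_use", "image_input"]},
]

# Hypothetical intent payload a consumer application might attach to a request.
intent = {
    "difficulty": "hard",
    "preferred_roles": ["planner"],
    "required_capabilities": ["tool_use"],
}
```

Here `candidates_for_intent(intent, POOL)` ranks `cloud-large` ahead of `local-small`, and adding `"image_input"` to the required capabilities removes `local-small` entirely.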

For point 4, what model routing really does as a second order effect is create a market for specialized models - models that may or may not be smaller, could be cheaper or more expensive, and may be locally runnable. It makes little sense to route between two frontier models (GPT 5.5 and Opus 4.8); it makes more sense to route between models where one of the factors of quality, speed or cost is a multiple of that of the other candidate models, and it makes even more sense to have specialized domain models: code, prose, math and science, visuals and so on. It is at this stage that model routing becomes really valuable.
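
The "one factor is a multiple of the other" test can be sketched as a quick check on a candidate pair. The 2x threshold and the numbers below are made up for illustration. They are not measurements and not something role-model prescribes.

```python
from typing import Dict, List

FACTORS = ("quality", "speed", "cost")


def distinct_factors(a: Dict[str, float], b: Dict[str, float],
                     min_ratio: float = 2.0) -> List[str]:
    """List the factors on which one model is at least min_ratio times the other."""
    found = []
    for name in FACTORS:
        hi, lo = max(a[name], b[name]), min(a[name], b[name])
        if lo > 0 and hi / lo >= min_ratio:
            found.append(name)
    return found


# Illustrative profiles: quality score, tokens per second, cost per million tokens.
frontier = {"quality": 0.9, "speed": 40.0, "cost": 10.0}
small = {"quality": 0.8, "speed": 120.0, "cost": 1.0}
other_frontier = {"quality": 0.92, "speed": 45.0, "cost": 12.0}
```

With these numbers, `distinct_factors(frontier, small)` reports a clear gap on speed and cost, while `distinct_factors(frontier, other_frontier)` comes back empty, which is the case where routing has little to work with.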

While role-model has a reference runtime that I'm continuously building out (there's lots to do to improve routing, give users more granular control over routing decisions, improve cross-model caching, and add techniques like FastContext), the ultimate goal of role-model is a standard protocol for inference requests that consumer applications use. With it, the provider, be it a router middleware or an inference provider, can route to a model that strikes the best balance between cost, speed and quality while respecting user choices, and users can even control these preferences to use local models for some tasks and allow cloud for others.

Links:

[0] role-model - the case for a model routing protocol: https://try.works/role-model-the-case-for-a-model-routing-pr...

[1] GitHub: https://github.com/try-works/role-model

[2] Docs: https://role-model.dev/

Comments

try-working•1h ago
For those interested, feel free to take a look at my write-up on the need for a standard routing protocol [0], or point your agent at the GitHub repo and let it install the role-model runtime. I suggest using Pi for now; I plan to also launch an extension/plugin for OpenCode. The runtime is an alpha release, so expect bugs and such. I would welcome feedback on how to make configuration more intuitive for first-time users.

I will also add that, based on my current experiments, the ideal number of models in a routing pool is probably 2, following point 4 above. The models need to differ significantly in quality, speed, or cost; otherwise routing decisions are hard to make and become less accurate, and the benefit of routing is also smaller. For coding, the ideal pool in my opinion is GPT 5.4 and DeepSeek V4 Pro, to extend the GPT quota by routing some of the medium and easy requests to DeepSeek.

[0] role-model - the case for a model routing protocol: https://try.works/role-model-the-case-for-a-model-routing-pr...
