Debug Mode for LLMs in vLLora

48•mrun1729•1mo ago

Comments

kappuchino•1mo ago

Until https://github.com/vllora/vllora/tree/v0.1.6 it was Apache licensed. Then Elastic Search 2. Nah.

IMHO the "don't remove anything with a licensekey ever" part in the license is the kind of potential poison that I would never recommend this to my or any other company. More than a few fellow engineers consider nagware an insult and see the potential to twist your arm late in the game making former free functions part of a new "optimized pay package", which you need because you can't fix the bug in the goddamn license part that is a security risk. LOL. (Not saying that you ever do. See below)

And there is no moat, debugging AI flows is a few prompts and a claude code max, google gemini pro or codex whatever for a couple of days while doing the usual things will happen.

Note: Its not about this software specific. I learned that the cuts and bruises of incidents before you come along are the ones that shape behaviour of your partners/colleagues/peers. You may have the purest intentions and best approaches, but someone longe before you ruined it. Its not you, its you chosing the same path.

v3g42•1mo ago

Hey, I’m one of the builders behind vLLora, so let me clarify the reasoning.

We split the project intentionally: everything embeddable (the Rust crate you ship inside your own product) is released separately under Apache 2.0. So if you’re embedding it, you’re not inheriting license-key / “licensing baggage” concerns in your codebase. (https://crates.io/crates/vllora_llm)

The parts under the fair-code license are the local debugging UI/tooling. Will always be free to use, we just don’t want it copied and resold.

Any paid, advanced observability lives in a separate cloud offering under a different name so there is no confusion whatsover.

We use it to build deeper agentic workflows. it’s been extremely useful for iterating and we want to share this free to use with everyone. Happy to share our experiences if you want to know more.

Re: "no moat, just a few prompts + Claude/Codex". I’ll be a bit cheeky you’re entitled to that view, but we’re in different camps. Some folks vibe code everything; We believe in having proper tools. You still want a screwdriver for screws.

_pdp_•1mo ago

interesting but ... why not debug the actual code that is invoking the API.. like break point at the right place, edit state, step over, resume... it seems that the toolchain is a lot more mature and it will fit right into the specific programming environment that is targeted

suprjami•1mo ago

Because this is way easier. It's effectively a printf debugger and editor you can just slot in the middle of the data stream.

v3g42•1mo ago

You can still use normal debuggers for the code path, but we found it really valuable to isolate and inspect the agent data stream itself: the exact prompts, model outputs, tool inputs/outputs, and how that impacts cost, time, and behavior over long runs. That visibility has been a big lever for improving overall product quality for some of the deeper agentic experiences we are building. Ability to modify and change models has been useful too.

omneity•1mo ago

What a strange naming choice, mixing two things (vLLM and LoRA) while being related to neither..

v3g42•1mo ago

Haha. One of our objectives is to allow for local debuggging but not just pure debugging; Also enabling users to fine tune a version of the model that performs better. We are working on that feature set and involves Lora :) Hence the name. I guess its a future vision ? :)

Was going to share my work

Pitchfork: A devilishly good process manager for developers

You Are Here

Why social apps need to become proactive, not reactive

How patient are AI scrapers, anyway? – Random Thoughts

Vouch: A contributor trust management system

I built a terminal monitoring app and custom firmware for a clock with Claude

Tiny C Compiler

Y Combinator Founder Organizes 'March for Billionaires'

Ask HN: Need feedback on the idea I'm working on

OpenClaw Addresses Security Risks

Apple finalizes Gemini / Siri deal

Italy Railways Sabotaged

Emacs-tramp-RPC: high-performance TRAMP back end using MsgPack-RPC

Nintendo Wii Themed Portfolio

"There must be something like the opposite of suicide "

Ask HN: Why doesn't Netflix add a “Theater Mode” that recreates the worst parts?

Show HN: Engineering Perception with Combinatorial Memetics

Show HN: Steam Daily – A Wordle-like daily puzzle game for Steam fans

The Anthropic Hive Mind

Just Started Using AmpCode

LLM as an Engineer vs. a Founder?

Crosstalk inside cells helps pathogens evade drugs, study finds

Show HN: Design system generator (mood to CSS in <1 second)

Show HN: 26/02/26 – 5 songs in a day

Toroidal Logit Bias – Reduce LLM hallucinations 40% with no fine-tuning

Top AI models fail at >96% of tasks

The Science of the Perfect Second (2023)

Bob Beck (OpenBSD) on why vi should stay vi (2006)

Show HN: a glimpse into the future of eye tracking for multi-agent use

Was going to share my work

Pitchfork: A devilishly good process manager for developers

You Are Here

Why social apps need to become proactive, not reactive

How patient are AI scrapers, anyway? – Random Thoughts

Vouch: A contributor trust management system

I built a terminal monitoring app and custom firmware for a clock with Claude

Tiny C Compiler

Y Combinator Founder Organizes 'March for Billionaires'

Ask HN: Need feedback on the idea I'm working on

OpenClaw Addresses Security Risks

Apple finalizes Gemini / Siri deal

Italy Railways Sabotaged

Emacs-tramp-RPC: high-performance TRAMP back end using MsgPack-RPC

Nintendo Wii Themed Portfolio

"There must be something like the opposite of suicide "

Ask HN: Why doesn't Netflix add a “Theater Mode” that recreates the worst parts?

Show HN: Engineering Perception with Combinatorial Memetics

Show HN: Steam Daily – A Wordle-like daily puzzle game for Steam fans

The Anthropic Hive Mind

Just Started Using AmpCode

LLM as an Engineer vs. a Founder?

Crosstalk inside cells helps pathogens evade drugs, study finds

Show HN: Design system generator (mood to CSS in <1 second)

Show HN: 26/02/26 – 5 songs in a day

Toroidal Logit Bias – Reduce LLM hallucinations 40% with no fine-tuning

Top AI models fail at >96% of tasks

The Science of the Perfect Second (2023)

Bob Beck (OpenBSD) on why vi should stay vi (2006)

Show HN: a glimpse into the future of eye tracking for multi-agent use

Debug Mode for LLMs in vLLora

Comments