frontpage.

Show HN: Local task classifier and dispatcher on RTX 3080

https://github.com/resilientworkflowsentinel/resilient-workflow-sentinel

8•Shubham_Amb•4h ago

Hi HN, I am shubham a 3d artist who learned coding in college as an I.T. graduate know logics but not an expert as i just wanna try my hands on to ai

So i built Resilient Workflow Sentinel this is offline ai agent which classify urgency (Low,Medium and HIgh) and dispatches to the candidates based on availability Well i want an offline system like a person can trust with its sensitive data to stay completely locally

Did use ai to code for speeding and cutting labor.

Its works on RTX 3080 system (this is an basic affordable setup not heavy ai machinery) which i want it to make it reliable without heavy upgrade This is full system doesn't require ollama(I am not against it)

I see in companies tickets are raised on jira and slack. Currently people or manager (self) have to sort those things either manually read one by one or send them to the cloud. But the issue is you can't send everything like there is a lot of sensitive data out there which they do not trust and makes it harder and manual sorting through thousands is likely a nightmare.

But then just imagine u get all the task classified like its urgency and distribution u can selectively see which task is urgent and needs immediate attention and last of all information doesn't leave your building totally secure Also Api sending is not the only issue u are paying per token cost for task for each may be monthly 100$ to 1000$ which can like save hassle for startup a lot or companies as well

There was several biases like positional bias also json out put bias also have issues in attention At start i tried just prompting things like Chain of thoughts,RISE(evaluate negative first), given negative examples,Positive examples, somewhere it was struggling with commonsense issue so examples for that (Later changed the approach)

Well prompting did give the output and worked well but took too much time to process for single task like 70 to 90secs for a task

Then i tried batching and the biases got worst like it got stronger it always use to like favour alice also more prompts are like ignored and more

For json output i used constrain so model can only generate json and if fails there is a as well parser i used when i implemented prompting only

This reduce time from 90sec to nearly 15 to 30secs per task I used steering vector to correct the attention i seen issues happening

Stack: Language: Python 3.10 Model: qwen2.5-7b-instruct Libraries: Pytorch, Hugging Face Transformers (No Langchain, No Ollama) API: Fast API UI: NiceGUI Hardware: Ryzen 5, 16Gb ram RTX 3080

Implementation:

Quantization: Load model in nf4 quantization so models like 7b can fit on vram of 10gb which is on rtx 3080 also my hardware

Steering Vectors: Standard prompting wasn't enough. I need to block or direct certain things on a certain layer of llm to make it reliable.

Json Constraints: Used constraint to make model strictly give json and also stop from over explanation this happens at logits level where token are blocked which are not required etc

github : https://github.com/resilientworkflowsentinel/resilient-workf...

Youtube: https://youtu.be/tky3eURLzWo

Claude Opus 4.6

GPT-5.3-Codex

My AI Adoption Journey

We tasked Opus 4.6 using agent teams to build a C Compiler

Recreating Epstein PDFs from raw encoded attachments

Pong Cam – My ESP32S3 Thinks It's a WebCam

Animated Knots

Review of 1984 by Isaac Asimov (1980)

GitHub Actions Is Slowly Killing Your Engineering Team

MenuetOS – a GUI OS that boots from a single floppy disk

I reversed Tower of Fantasy's anti-cheat driver: a BYOVD toolkit never loaded

LinkedIn checks for 2953 browser extensions

Claude Opus 4.6 extra usage promo

Launching My Side Project as a Solo Dev: The Walkthrough

India's female workers watching hours of abusive content to train AI

Hypernetworks: Neural Networks for Hierarchical Data

You Still Struggle with CORS Even After Reading Docs

What if writing tests was a joyful experience? (2023)

Orchestrate teams of Claude Code sessions

Flock CEO calls Deflock a “terrorist organization” (2025) [video]

Show HN: Calfkit – an SDK to build distributed, event-driven AI agents on Kafka

What's wrong with bunny hands on dinosaurs? (2018)

There Will Come Soft Rains (1950) [pdf]

Show HN: Local task classifier and dispatcher on RTX 3080

Housman's Introductory Lecture (1892)

150 MB Minimal FreeBSD Installation

The RCE that AMD won't fix

Maihem (YC W24): hiring senior robotics perception engineer (London, on-site)

What's at the Other End of 8.8.8.8?

OpenClaw: When AI Agents Get Full System Access. Security nightmare?

Show HN: Local task classifier and dispatcher on RTX 3080

Claude Opus 4.6

GPT-5.3-Codex

My AI Adoption Journey

We tasked Opus 4.6 using agent teams to build a C Compiler

Recreating Epstein PDFs from raw encoded attachments

Pong Cam – My ESP32S3 Thinks It's a WebCam

Animated Knots

Review of 1984 by Isaac Asimov (1980)

GitHub Actions Is Slowly Killing Your Engineering Team

MenuetOS – a GUI OS that boots from a single floppy disk

I reversed Tower of Fantasy's anti-cheat driver: a BYOVD toolkit never loaded

LinkedIn checks for 2953 browser extensions

Claude Opus 4.6 extra usage promo

Launching My Side Project as a Solo Dev: The Walkthrough

India's female workers watching hours of abusive content to train AI

Hypernetworks: Neural Networks for Hierarchical Data

You Still Struggle with CORS Even After Reading Docs

What if writing tests was a joyful experience? (2023)

Orchestrate teams of Claude Code sessions

Flock CEO calls Deflock a “terrorist organization” (2025) [video]

Show HN: Calfkit – an SDK to build distributed, event-driven AI agents on Kafka

What's wrong with bunny hands on dinosaurs? (2018)

There Will Come Soft Rains (1950) [pdf]

Show HN: Local task classifier and dispatcher on RTX 3080

Housman's Introductory Lecture (1892)

150 MB Minimal FreeBSD Installation

The RCE that AMD won't fix

Maihem (YC W24): hiring senior robotics perception engineer (London, on-site)

What's at the Other End of 8.8.8.8?

OpenClaw: When AI Agents Get Full System Access. Security nightmare?