frontpage.

US to bankroll far-right think tanks in Europe against digital laws

https://www.brusselstimes.com/1957195/us-to-fund-far-right-forces-in-europe-tbtb
1•saubeidl•55s ago•0 comments

Ask HN: Have AI companies replaced their own SaaS usage with agents?

1•tuxpenguine•3m ago•0 comments

pi-nes

https://twitter.com/thomasmustier/status/2018362041506132205
1•tosh•5m ago•0 comments

Show HN: Crew – Multi-agent orchestration tool for AI-assisted development

https://github.com/garnetliu/crew
1•gl2334•6m ago•0 comments

New hire fixed a problem so fast, their boss left to become a yoga instructor

https://www.theregister.com/2026/02/06/on_call/
1•Brajeshwar•7m ago•0 comments

Four horsemen of the AI-pocalypse line up capex bigger than Israel's GDP

https://www.theregister.com/2026/02/06/ai_capex_plans/
1•Brajeshwar•8m ago•0 comments

A free Dynamic QR Code generator (no expiring links)

https://free-dynamic-qr-generator.com/
1•nookeshkarri7•8m ago•1 comment

nextTick but for React.js

https://suhaotian.github.io/use-next-tick/
1•jeremy_su•10m ago•0 comments

Show HN: I Built an AI-Powered Pull Request Review Tool

https://github.com/HighGarden-Studio/HighReview
1•highgarden•10m ago•0 comments

Git-am applies commit message diffs

https://lore.kernel.org/git/bcqvh7ahjjgzpgxwnr4kh3hfkksfruf54refyry3ha7qk7dldf@fij5calmscvm/
1•rkta•13m ago•0 comments

ClawEmail: 1min setup for OpenClaw agents with Gmail, Docs

https://clawemail.com
1•aleks5678•20m ago•1 comment

UnAutomating the Economy: More Labor but at What Cost?

https://www.greshm.org/blog/unautomating-the-economy/
1•Suncho•26m ago•1 comment

Show HN: Gettorr – Stream magnet links in the browser via WebRTC (no install)

https://gettorr.com/
1•BenaouidateMed•27m ago•0 comments

Statin drugs safer than previously thought

https://www.semafor.com/article/02/06/2026/statin-drugs-safer-than-previously-thought
1•stareatgoats•29m ago•0 comments

Handy when you just want to distract yourself for a moment

https://d6.h5go.life/
1•TrendSpotterPro•31m ago•0 comments

More States Are Taking Aim at a Controversial Early Reading Method

https://www.edweek.org/teaching-learning/more-states-are-taking-aim-at-a-controversial-early-read...
1•lelanthran•32m ago•0 comments

AI will not save developer productivity

https://www.infoworld.com/article/4125409/ai-will-not-save-developer-productivity.html
1•indentit•37m ago•0 comments

How I do and don't use agents

https://twitter.com/jessfraz/status/2019975917863661760
1•tosh•43m ago•0 comments

BTDUex Safe? The Back End Withdrawal Anomalies

1•aoijfoqfw•46m ago•0 comments

Show HN: Compile-Time Vibe Coding

https://github.com/Michael-JB/vibecode
6•michaelchicory•48m ago•1 comment

Show HN: Ensemble – macOS App to Manage Claude Code Skills, MCPs, and Claude.md

https://github.com/O0000-code/Ensemble
1•IO0oI•52m ago•1 comment

PR to support XMPP channels in OpenClaw

https://github.com/openclaw/openclaw/pull/9741
1•mickael•52m ago•0 comments

Twenty: A Modern Alternative to Salesforce

https://github.com/twentyhq/twenty
1•tosh•54m ago•0 comments

Raspberry Pi: More memory-driven price rises

https://www.raspberrypi.com/news/more-memory-driven-price-rises/
2•calcifer•59m ago•0 comments

Level Up Your Gaming

https://d4.h5go.life/
1•LinkLens•1h ago•1 comment

Di.day is a movement to encourage people to ditch Big Tech

https://itsfoss.com/news/di-day-celebration/
3•MilnerRoute•1h ago•0 comments

Show HN: AI generated personal affirmations playing when your phone is locked

https://MyAffirmations.Guru
4•alaserm•1h ago•3 comments

Show HN: GTM MCP Server- Let AI Manage Your Google Tag Manager Containers

https://github.com/paolobietolini/gtm-mcp-server
1•paolobietolini•1h ago•0 comments

Launch of X (Twitter) API Pay-per-Use Pricing

https://devcommunity.x.com/t/announcing-the-launch-of-x-api-pay-per-use-pricing/256476
1•thinkingemote•1h ago•0 comments

Facebook seemingly randomly bans tons of users

https://old.reddit.com/r/facebookdisabledme/
1•dirteater_•1h ago•2 comments

Show HN: I made a human-in-the-loop system for tuning LLMs in beta

https://www.joinoneshot.com/
2•gitpullups•1mo ago
OneShot is an API that routes failed LLM outputs to trained humans, returns corrected outputs or prompt injections, and stores the edits as structured training data.

Privacy Note: This product is not built for privacy yet. The current use case is internal tools or beta features where users aren’t promised privacy. This tool is NOT meant for production.

In the future, there will be a feature for anonymizing all private information automatically.

Problem: My project this year was an AI-assisted tool that helps pediatricians file their insurance claims.

Niche industries like this require a ton of examples, fine-tuning, and re-prompting before the product actually works. Then the output has to be monitored to some extent (with the hospital’s consent, of course) so that small model changes or edge cases don’t break outputs, at least for the first couple of months.

This monitoring takes months of being distracted from building new features. And every new feature I wanted to ship required the same constant beta monitoring to reach a reliable state, including internal tools and automations I needed to work dependably. That is when I started wishing I had an AI engineer/architect monitoring outputs 24/7 for every new feature’s first month. In real-world software, programs need to break less. Like almost never. And current AI models often don’t get us quite there: from 90% to 100, or 95 to 100. We waste months before shipping new features trying to tweak things internally, without the hybrid of the model being improved live in the real world.

In niche agent environments, you sometimes need an actual human to jump in.

How it works: First, a beta deployment. You deploy your AI to do X business use case in beta or internally.

Each step of your pipeline queries our API with what models you prefer, etc.

Then, a human in charge of a batch of outputs sees a flagged output when something goes wrong (we agree up front on what that means). They can then use human judgment to tweak the prompt, try a different model, or provide added context, over and over across multiple parallel threads, until the correct output comes out.
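As a rough sketch of what one pipeline step’s call might look like — the endpoint URL, field names, and flagging rule below are all invented for illustration, not OneShot’s actual API:

```python
import json
import urllib.request

ONESHOT_URL = "https://api.example.com/v1/review"  # hypothetical endpoint

def build_review_request(step_name, prompt, model_output, preferred_models):
    """Package one pipeline step's output for human review.

    Field names are illustrative; the real API may differ.
    """
    return {
        "step": step_name,
        "prompt": prompt,
        "output": model_output,
        "preferred_models": preferred_models,
        # Flagging criteria are agreed on up front; as a stand-in,
        # treat empty or suspiciously short outputs as failures.
        "flagged": len(model_output.strip()) < 20,
    }

def submit_for_review(payload):
    """POST the payload; a human only intervenes if it was flagged."""
    req = urllib.request.Request(
        ONESHOT_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Returns either the corrected output or a pass-through of the original.
    return urllib.request.urlopen(req)
```

Most payloads would come back untouched; only the flagged ones wait on a reviewer.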

Second, fine-tuning. You now own a dataset of which prompt changes and which output edits produced that magical output. Thousands of changes and tweaks that can take your model to the next level for each feature sit in your db. This data lets you ship faster, with better guarantees and far less of the manual testing that the real world never rewards or punishes.
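To make that concrete, here is a minimal sketch of turning stored corrections into fine-tuning pairs — the record shape (`final_prompt`, `accepted_output`, etc.) is my invention for illustration, not the product’s actual schema:

```python
import json

def to_finetune_examples(edit_records):
    """Convert stored human corrections into (prompt, ideal output) pairs.

    Each record keeps the prompt as the human last tweaked it and the
    output they finally accepted.
    """
    examples = []
    for rec in edit_records:
        examples.append({
            "prompt": rec["final_prompt"],
            "completion": rec["accepted_output"],
            # How many tweaks it took is itself useful training signal.
            "metadata": {"feature": rec["feature"],
                         "edits": len(rec["prompt_history"])},
        })
    return examples

records = [{
    "feature": "claim-summarizer",
    "prompt_history": ["v1", "v2", "v3"],
    "final_prompt": "Summarize this claim for a pediatric billing reviewer: ...",
    "accepted_output": "CPT 99213, modifier 25, ...",
}]
jsonl = "\n".join(json.dumps(e) for e in to_finetune_examples(records))
```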

Who are the humans? For now, I’m a developer doing the tickets manually with technical friends I’m paying out of pocket (yes, it IS available 24/7!!!). This is intentionally manual during beta, with clear review guidelines, so we understand the process before trying to hire.

How slow is it? Most of the time no human will touch it, and sometimes a human will take a quick, unnoticeable automated action. In some edge cases you’ll feel noticeable slowing (10s+); we’re looking to accelerate those as well, and the alternative is fully broken output.

Who is it not for? This is not meant for consumer apps, privacy-sensitive production systems, or teams expecting zero human involvement.

Comments

vmitro•1mo ago
Don't laugh, but I think in the (near) future more and more emphasis will be put on the HITL concept as private or self-hosted AI workflows gain interest; it's hard not to hope for the emergence of a movement similar to GNU in the software space, where freely available tooling allows for collaborative, federated, HITL-powered fine-tuning of ML models.

As I'm also working on a similar concept, where HITL is a first-class citizen, can you tell us a bit more about the underlying technology stack, whether users can host their own models for inference and fine-tuning, how pipelines are defined, and so on?

gitpullups•1mo ago
1. Pipelines are defined on your end. I want to build another option, but for now it is still just queried as an API endpoint.

2. Same as 1, so yes, you can definitely use your own models; you can just send outputs, you don't have to send prompts.
gitpullups•1mo ago
I'm a bit curious what you're working on, and if there might be some interesting connections there. Would you like to speak? You can just book in my calendar through the site.
vmitro•1mo ago
Sorry for the late reply, I'm juggling family / working full time as a senior resident / final-year specialty trainee in a German hospital / maintaining three side projects. I've looked at your calendar and the timezones are a huge problem: either I get up at 4 AM or book it after the late shift (11 PM here)...

Anyway:

It's an open-source licensed, distributed data orchestration framework designed from the ground up for HITL workflows where correctness matters more than speed (primary field is medicine, but law, etc. could also benefit). It sounds like we're attacking the same problem from complementary angles. You're building the human routing API, I'm building the pipeline infrastructure that defines when and how humans get routed into the loop.

The core idea: pipelines are YAML-defined state machines with explicit correction steps. When a worker (e.g. your LLM endpoint) produces output, the pipeline can pause, send results to a human reviewer, and wait for either approval or corrected data, all as first-class custom protocol messages (based on the Majordomo protocol). The correction protocol has timeout handling, strike counting for repeated failures, and an audit trail that captures every decision point. The YAML can also define how to "steer" the pipeline when a correction arrives: it can continue, store the correction, route to a specific step, fail, etc. (combinations are possible too, e.g. store the correction, then continue or jump to another step). A bit of feature creep that's currently itching: a parser for a heavily reduced syntax set of Lucid (the dataflow language) that transpiles into YAML pipeline definitions.
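A pipeline definition in that style might look roughly like this — step names, field names, and values are invented for illustration, not the framework's actual schema:

```yaml
pipeline: claim-extraction
steps:
  - name: extract
    worker: llm-endpoint
    on_correction:          # steering when a human sends corrected data
      store: true           # keep the immutable correction record
      action: continue      # alternatives: route_to: <step>, fail, ...
      timeout_s: 600        # escalate if no reviewer responds in time
      max_strikes: 3        # repeated failures fail the pipeline
  - name: review
    type: human
    wait_for: approval_or_correction
  - name: summarize
    worker: llm-endpoint
```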

What might interest you: every message in a pipeline shares a UUID, and each correction creates an immutable record of what was changed and why. This is essentially your "structured training data" as a (useful) byproduct of the architecture: you don't extract it after the fact, it's the communication protocol itself. The intended workflow philosophy is iterative fine-tuning, I guess, with the training data accumulating as you go.
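A minimal sketch of such a record, assuming an append-only store — the class and field names here are mine, not the framework's:

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)  # immutable: corrections are appended, never edited
class CorrectionRecord:
    pipeline_id: str  # UUID shared by every message in one pipeline run
    step: str         # which step the reviewer corrected
    before: str       # worker output as produced
    after: str        # output as corrected by the human
    reason: str       # the "why", kept for the audit trail
    at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

run_id = str(uuid.uuid4())
rec = CorrectionRecord(run_id, "extract",
                       "CPT 99212", "CPT 99213",
                       "undercoded visit level")
```

Because the dataclass is frozen, any attempt to mutate a record raises, which is what makes the trail trustworthy as training data.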

The framework uses ZeroMQ for binary messaging (sub-millisecond routing overhead) and can run anywhere from edge devices to datacenters. If it speaks TCP/IP and can run Python 3.11+, you can plug it in. Workers are pluggable: your existing model endpoints could be wrapped as the framework's workers in about 20 lines of Python, receiving tasks and returning results through the same correction-aware pipeline. All components of the framework have lifecycle-aware "hooks": when you design your workers in Python, for example, you define them as a class and decorate their methods with @hook("async_init") or @hook("process_message"), and those hooks get executed at each lifecycle event.
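A toy sketch of a worker in that style — the `hook` decorator and dispatcher below are a stand-in I wrote to illustrate the shape, not the framework's real lifecycle machinery:

```python
# Stand-in for the framework's lifecycle hooks: methods are tagged with
# an event name and invoked when that lifecycle event fires.
def hook(event):
    def wrap(fn):
        fn._hook_event = event
        return fn
    return wrap

class KeywordWorker:
    """A pluggable worker: receives a task message, returns a result."""

    @hook("async_init")
    def setup(self):
        self.ready = True  # e.g. open model connections here

    @hook("process_message")
    def handle(self, message):
        # A real worker would call a model endpoint here; we fake
        # keyword extraction to keep the sketch self-contained.
        return {"keywords": sorted(set(message["text"].lower().split()))}

def dispatch(worker, event, *args):
    """Find the method tagged for `event` and invoke it."""
    for name in dir(worker):
        fn = getattr(worker, name)
        if getattr(fn, "_hook_event", None) == event:
            return fn(*args)

w = KeywordWorker()
dispatch(w, "async_init")
result = dispatch(w, "process_message",
                  {"text": "HITL pipelines need HITL review"})
```

The point of the pattern is that the worker never touches transport or correction handling; it only declares which lifecycle events it cares about.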

So in your project, instead of clients defining pipelines on their end and querying your API, the framework could provide the orchestration layer that routes between your clients' models, your human review queue, and back, with the pipeline definition living in a YAML file rather than scattered across client code. Your humans would interact with a well-defined correction protocol rather than ad hoc intervention.

No HTTP endpoint (yet); you'd need to implement a worker that relays e.g. REST API calls and translates them into the framework's messages.

It's LGPL-licensed, intended for federated machine learning and self-hosted scenarios, and the initial "spartan philosophy" (by now fairly more complex) means the core stays minimal while complexity lives in pluggable workers.

But it's not MVP-ready; some things are still broken, and I'm trying to hit version 0.1.0 with a simple demo: take a WAV file and transcribe it to text, have another model extract keywords from the text (including intent and basic context), pass it all to another model that generates a TinkerPop/Gremlin query from that, then have the client execute the query and send the results along to a final worker that summarizes the (reduced) knowledge graph. That'd show a multimodal pipeline in action.

If you're interested, find me on github, the username is the same.