Show HN: Improving Prompt Injection Detection with Weighted Ensembles

https://github.com/appleroll-research/promptforest
2•appleroll•2h ago

Comments

appleroll•1h ago
Hi HN — I’m the author.

This project started as an open-source system for detecting prompt injections in LLMs. The goal is to flag adversarial prompts before they reach a model, while keeping latency low and probabilities well-calibrated.

The main insight came from ensembles: not all models are equally good at every case. Instead of just averaging outputs, I:

1. Benchmarked each candidate model first to see what it actually contributes.

2. Removed models that don’t improve the ensemble, based on ablation studies (e.g., ProtectAI's DeBERTa finetune was dropped because it contributed only 0.5% to ECE and actually decreased accuracy).

3. Weighted predictions by each model’s accuracy, letting models specialize in what they’re good at.
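To make step 3 concrete, here is a minimal sketch of accuracy-weighted averaging. It is not the code from the repository: the function name, the model names, and the numbers are hypothetical, and it assumes each model emits a probability that the prompt is an injection and that each model's accuracy was measured on a held-out set.

```python
from typing import Dict


def weighted_ensemble(probs: Dict[str, float], accuracy: Dict[str, float]) -> float:
    """Average per-model injection probabilities, weighted by each model's accuracy."""
    total = sum(accuracy[name] for name in probs)
    if total <= 0:
        raise ValueError("accuracies must sum to a positive value")
    return sum(probs[name] * accuracy[name] for name in probs) / total


# Hypothetical model names and numbers, for illustration only.
accuracy = {"model_a": 0.9, "model_b": 0.6}
probs = {"model_a": 0.8, "model_b": 0.2}

score = weighted_ensemble(probs, accuracy)
plain_mean = sum(probs.values()) / len(probs)
```

With these made-up numbers the weighted score lands closer to the more accurate model's output than the plain mean does, which is the effect the weighting is after.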

With this approach, the ensemble is smaller (~237M parameters vs ~600M for the leading baseline), 2x faster, and better calibrated (lower Expected Calibration Error), while still achieving competitive accuracy. Lower confidence on wrong predictions makes it safer for “human-in-the-loop” fallback systems, since uncertain cases can be routed to a reviewer instead of being trusted outright.
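For readers unfamiliar with the metric, Expected Calibration Error can be sketched as follows. This is the common equal-width-bin formulation, which is an assumption here; the project may use a different bin count or variant. The function name and sample data are illustrative only.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Weighted average of |accuracy - mean confidence| over equal-width bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    conf_sum = [0.0] * n_bins
    hits = [0] * n_bins
    count = [0] * n_bins
    for c, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hits[idx] += int(ok)
        count[idx] += 1

    ece = 0.0
    for i in range(n_bins):
        if count[i] == 0:
            continue
        gap = abs(hits[i] / count[i] - conf_sum[i] / count[i])
        ece += (count[i] / n) * gap
    return ece


# Made-up predictions: two confident and correct, two near 50/50 with one miss.
ece = expected_calibration_error([0.95, 0.95, 0.55, 0.55], [True, True, True, False])
```

A lower value means the stated confidence tracks the observed accuracy more closely, so a model that is wrong but unsure is penalized less than one that is wrong and confident.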

For more info, you can check it out here: https://github.com/appleroll-research/promptforest

Contributions of all kinds are welcome, and I’d love to hear feedback from the HN community, especially on ideas to further improve calibration, robustness, or ensemble design.
