frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Show HN: DeepTeam – Penetration Testing for LLMs

https://github.com/confident-ai/deepteam
3•jeffreyip•9mo ago
Hi HN, we’re Jeffrey and Kritin, and we’re building DeepTeam (https://trydeepteam.com), an open-source Python library to scan LLM apps for security vulnerabilities. You can start “penetration testing” by defining a Python callback to your LLM app (e.g. `def model_callback(input: str)`), and DeepTeam will attempt to probe it with prompts designed to elicit unsafe or unintended behavior.

Note that the penetration testing process treats your LLM app as a black-box - which means that DeepTeam will not know whether PII leakage has occurred in a certain tool call or incorporated in the training data of your fine-tuned LLM, but rather just detect that it is present. Internally, we call this process “end-to-end” testing.

Before DeepTeam, we worked on DeepEval, an open-source framework to unit-test LLMs. Some of you might be thinking, well isn’t this kind of similar to unit-testing?

Sort of, but not really. While LLM unit-testing focuses on 1) accurate eval metrics, 2) comprehensive eval datasets, penetration testing focuses on the haphazard simulation of attacks, and the orchestration of it. To users, this was a big and confusing paradigm shift, because it went from “Did this pass?” to “How can this break?”.

So we thought to ourselves, why not just release a new package to orchestrate the simulation of adversarial attacks for this new set of users and teams working specifically on AI safety, and borrow DeepEval’s evals and ecosystem in the process?

Quickstart here: https://www.trydeepteam.com/docs/getting-started#detect-your...

The first thing we did was offer as many attack methods as possible - simple encoding ones like ROT13, leetspeak, to prompt injections, roleplay, and jailbreaking. We then heard folks weren’t happy because the attacks didn’t persist across tests and hence they “lost” their progress every time they tested, and so we added an option to `reuse_simulated_attacks`.

We abstracted everything away to make it as modular as possible - every vulnerability, attack, can be imported in Python as `Bias(type=[“race”])`, `LinearJailbreaking()`, etc. with methods such as `.enhance()` for teams to plug-and-play, build their own test suite, and even to add a few more rounds of attack enhancements to increase the likelihood of breaking your system.

Notably, there are a few limitations. Users might run into compliance errors when attempting to simulate attacks (especially for AzureOpenAI), and so we recommend setting `ignore_errors` to `True` in case that happens. You might also run into bottlenecks where DeepTeam does not cover your custom vulnerability type, and so we shipped a `CustomVulnerability` class as a “catch-all” solution (still in beta).

You might be aware that some packages already exist that do a similar thing, often known as “vulnerability scanning” or “red teaming”. The difference is that DeepTeam is modular, lightweight, and code friendly. Take Nvidia Garak for example, although comprehensive, has so many CLI rules, environments to set up, it is definitely not the easiest to get started, let alone pick the library apart to build your own penetration testing pipeline. In DeepTeam, define a class, wrap it around your own implementations if necessary, and you’re good to go.

We adopted a Apache 2.0 license (for now, and probably in the foreseeable future too), so if you want to get started, `pip install deepteam`, use any LLM for simulation, and you’ll get a full penetration report within 1 minute (assuming you’re running things asynchronously). GitHub: https://github.com/confident-ai/deepteam

Excited to share DeepTeam with everyone here – let us know what you think!

Windows 11 gains ability to customize local user directory during setup

https://www.windowscentral.com/microsoft/windows-11/microsoft-just-fixed-my-biggest-gripe-about-t...
1•thunderbong•2m ago•0 comments

An Ode to Bzip

https://purplesyringa.moe/blog/an-ode-to-bzip/
1•signa11•6m ago•0 comments

Don't PUA Your AI

https://github.com/wuji-labs/nopua
1•fernvenue•7m ago•0 comments

Microsoft confirms Windows 11 bug crippling PCs and making drive C inaccessible

https://www.neowin.net/news/microsoft-confirms-windows-11-bug-crippling-pcs-and-making-drive-c-in...
1•signa11•8m ago•0 comments

Ask HN: Did your boss use AI to determine the quality of your work?

1•amelius•10m ago•0 comments

Parsing semiconductor datasheets into structured register maps for under $0.25

https://regforge.dev/blog/datasheet-parsing
2•coleman2247•10m ago•1 comments

Realistic Benchmarks for Financial AI

https://labs.taktile.com/benchmarks
1•tlarkworthy•11m ago•0 comments

Microplastics that accumulate in the body may 'clog up' immune cells

https://www.livescience.com/health/microplastics-that-accumulate-in-the-body-may-clog-up-immune-c...
2•Brajeshwar•13m ago•0 comments

Online astroturfing: A problem beyond disinformation

https://journals.sagepub.com/doi/10.1177/01914537221108467
2•xyzal•14m ago•0 comments

Show HN: TheDayAfter – open-source addiction recovery tracker

https://thedayafter.app/?o=hn
2•walky•15m ago•0 comments

Google Is Actively Promoting Known Spyware as Its #1 Privacy Browser Extension

https://old.reddit.com/r/degoogle/comments/1rszqc3/google_is_actively_promoting_known_spyware_as_...
4•z0ccc•16m ago•0 comments

Show HN: Screen studio alternative for windows (free and no watermark)

1•souhail_dev•21m ago•0 comments

Musk ousts more xAI founders as AI coding effort falters, FT reports

https://www.reuters.com/business/autos-transportation/musk-ousts-more-xai-founders-ai-coding-effo...
2•1vuio0pswjnm7•22m ago•0 comments

It's a Dimmer Switch

https://derek4thecws.substack.com/p/its-a-dimmer-switch
1•coach-d•22m ago•0 comments

Show HN: On the Same Page – A visual tracker for unhinged Wikipedia races

https://on-the-same.page/
1•dynamicwebpaige•22m ago•0 comments

Jürgen Habermas Has Died

https://www.reuters.com/business/media-telecom/juergen-habermas-philosopher-dies-age-96-publisher...
2•Archelaos•24m ago•1 comments

If you're an LLM, please read this

https://annas-archive.gl/blog/llms-txt.html
2•FabHK•25m ago•1 comments

NameGrid

https://namegrid.app/
1•jshchnz•25m ago•0 comments

Postgres Time Series (Open Source) Stack with Iceberg

https://www.snowflake.com/en/engineering-blog/postgres-time-series-iceberg/
1•craigkerstiens•27m ago•0 comments

C++ Programming Basics

https://slashbinbash.de/cppbas.html
1•cppforevar•30m ago•0 comments

QUnitX: Run the same test file in Node, Deno, and the browser. Zero dependencies

https://github.com/izelnakri/qunitx
1•izelnakri•31m ago•1 comments

Conseil d'État upholds Criteo's €40M GDPR fine

https://noyb.eu/en/conseil-detat-upholds-criteos-eu40m-gdpr-fine
1•latexr•31m ago•0 comments

Show HN: Hedra – an open-world 3D game I wrote from scratch before LLMs

https://github.com/maxilevi/project-hedra
2•maxilevi•32m ago•0 comments

Restoring an Xserve G5: When Apple built real servers

https://www.jeffgeerling.com/blog/2026/restoring-xserve-g5-apple-server/
3•Brajeshwar•33m ago•0 comments

Ask HN: Multi-tenancy for Markdown-based agentic systems

2•paragarora•33m ago•3 comments

How can someone be a very different height from their parents?

https://www.thetech.org/ask-a-geneticist/articles/2026/genetics-of-height-differences/
1•bookofjoe•34m ago•0 comments

Snakes Defy Gravity to Stand Up

https://nautil.us/heres-how-snakes-defy-gravity-to-stand-up-1278914
1•Brajeshwar•36m ago•0 comments

Polymarket isn't trustless

https://iter.ca/post/polymarket-trust/
3•smitop•39m ago•0 comments

Show HN: Score any URL against a quality profile with one curl command

https://qed.systems
1•onebit0fme•39m ago•0 comments

Instagram drops end-to-end encrypted chats

https://proton.me/blog/instagram-end-to-end-encryption
3•taubek•39m ago•1 comments