Show HN: Open-source scanner finds 97% of AI agent code non-compliant with the EU AI Act

1•airblackbox•2h ago
I built AIR Blackbox, an open-source static analysis tool that scans Python AI agent code against 6 technical requirements from the EU AI Act (Articles 9, 10, 11, 12, 14, 15). Think of it as a linter for AI governance.

To stress-test the scanner, and to see where the industry actually stands, I ran it against 5,754 Python files across 11 major open-source projects. Combined GitHub stars: 341,000+.

Projects scanned: AutoGPT (170K stars), Microsoft AutoGen (38K), LlamaIndex (37K), Mem0 (24K), Phidata (18K), LiteLLM (15K), GPT-Researcher (14K), Embedchain (9.2K), LangGraph (8.5K), OpenAI Agents SDK (5.2K), CrewAI Examples (2.8K).

Results:

- Average compliance score: 2.2 out of 6 articles
- 97% of files fail Article 9 (Risk Management)
- 89% fail Article 12 (Record-Keeping)
- 84% fail Article 14 (Human Oversight)
- Only 23 out of 5,754 files (0.4%) pass all 6 checks
- Best scoring repo: AutoGPT at 2.9/6. Worst: CrewAI examples at 1.4/6
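For readers who want to reproduce summary numbers like these from per-file results, here is a minimal sketch of the arithmetic. It assumes each file's result is a mapping from article number to pass/fail. The `summarize` function and the toy input are hypothetical, made up for illustration, and are not the scanner's actual report format or data.

```python
from statistics import mean

ARTICLES = (9, 10, 11, 12, 14, 15)

def summarize(results):
    """results: file path -> {article number: True if the article passed}."""
    # A file's score is the number of articles it passes, from 0 to 6.
    scores = [sum(r[a] for a in ARTICLES) for r in results.values()]
    total = len(results)
    return {
        "average_score": mean(scores),
        # Share of files that fail each article.
        "fail_rate": {
            a: sum(not r[a] for r in results.values()) / total for a in ARTICLES
        },
        # Number of files that pass every article.
        "pass_all": sum(s == len(ARTICLES) for s in scores),
    }

# Toy input, invented for this example; not data from the report.
toy = {
    "a.py": {9: False, 10: True, 11: True, 12: False, 14: False, 15: True},
    "b.py": {9: False, 10: False, 11: True, 12: False, 14: False, 15: False},
    "c.py": {9: True, 10: True, 11: True, 12: True, 14: True, 15: True},
    "d.py": {9: False, 10: False, 11: True, 12: True, 14: False, 15: True},
}
summary = summarize(toy)
```

On the toy input, three of four files fail Article 9 and one file passes all six checks.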

What the scanner checks (per article):

- Art. 9: risk classification, access control, risk audit
- Art. 10: input validation, PII handling, data schemas, provenance
- Art. 11: logging, documentation, type hints
- Art. 12: structured logging, audit trail, timestamps, log integrity
- Art. 14: human review, override mechanism, notifications
- Art. 15: input sanitization, error handling, testing, rate limiting

An article "passes" if at least 1 sub-check is detected. This is generous; real compliance requires substantially more.

Caveats I'll save you the trouble of pointing out:

- This is static analysis. It can't verify runtime behavior.
- File-level scanning misses cross-file compliance patterns.
- The pass threshold is intentionally lenient (1-of-N sub-checks).
- This checks technical requirements, not legal compliance. It's a linter, not a lawyer.
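To make the per-article checks and the 1-of-N pass rule concrete, here is a minimal sketch of a file-level, regex-based scan. The patterns, sub-check names, and functions below are assumptions chosen for illustration, with two sub-checks per article. They are not AIR Blackbox's actual rules or API.

```python
import re

# Hypothetical patterns for illustration; the real scanner's rules may differ.
CHECKS = {
    9: {
        "risk_classification": r"risk_(level|class|tier)",
        "access_control": r"(check_permission|require_role|authorize)\s*\(",
    },
    10: {
        "input_validation": r"pydantic|\bvalidate\w*\(",
        "pii_handling": r"\b(pii|redact|anonymi[sz]e)",
    },
    11: {
        "type_hints": r"def \w+\(.*\)\s*->",
        "documentation": r'"""',
    },
    12: {
        "structured_logging": r"structlog|json\.dumps\(",
        "timestamps": r"datetime\.now|time\.time\(",
    },
    14: {
        "human_review": r"(require|request)_(approval|human_review)",
        "override": r"kill_switch|\boverride\b",
    },
    15: {
        "error_handling": r"\btry:|\bexcept\b",
        "rate_limiting": r"rate_limit|\bretry\b",
    },
}

def scan_source(source: str) -> dict[int, list[str]]:
    """Return, per article, the names of sub-checks whose pattern matched."""
    return {
        article: [name for name, pat in subchecks.items() if re.search(pat, source)]
        for article, subchecks in CHECKS.items()
    }

def score(source: str) -> int:
    """Count articles with at least one detected sub-check (the 1-of-N rule)."""
    return sum(1 for hits in scan_source(source).values() if hits)

# A small file to scan: it has a docstring, type hints, and error handling only.
SAMPLE = '''
def run_tool(name: str, args: dict) -> str:
    """Run a tool on behalf of the agent."""
    try:
        return TOOLS[name](**args)
    except KeyError:
        raise ValueError(f"unknown tool: {name}")
'''

sample_hits = scan_source(SAMPLE)
sample_score = score(SAMPLE)
```

The sample passes Article 11 (docstring, type hints) and Article 15 (error handling) and nothing else. It also shows why the caveats matter: a pattern match says the text is present, not that the behavior is correct at runtime.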

The EU AI Act enforcement deadline is August 2026. The full report, raw data (JSON), and the scanning scripts are all in the repo.

GitHub: https://github.com/air-blackbox/air-blackbox-mcp
Full report: https://github.com/air-blackbox/air-blackbox-mcp/blob/main/b...
Install: pip install air-blackbox-mcp
Demo: https://huggingface.co/spaces/airblackbox/air-blackbox-scann...

Happy to answer questions about the methodology, the scanner internals, or what we're building next (fine-tuned local LLM for deeper analysis — your code never leaves your machine).

Comments

airblackbox•2h ago
Some context on why I built this: I kept seeing the same pattern, teams shipping AI agents into production with zero compliance infrastructure. Not because they don't care, but because there's no tooling that makes it easy.

The EU AI Act maps to 6 specific technical areas. Most of them come down to things developers already know how to do: structured logging, input validation, error handling, access control. The problem is nobody's connecting those practices to the regulatory requirements.

A few things I learned from this scan:

- Article 11 (Technical Documentation) is the easy win. 98% of files pass because Python developers already write docstrings and type hints. The rest of the articles require intentional infrastructure that almost nobody adds.
- The gap isn't capability, it's awareness. LiteLLM's auth module scored 6/6: it already has access control, structured logging, timestamps, error handling. It wasn't built for EU AI Act compliance. It just happens to have good engineering practices. Most agent code doesn't.
- "Example" and "quickstart" code sets the pattern. When OpenAI and CrewAI ship examples with zero compliance patterns, every project built from those examples inherits the gap. The ecosystem needs compliance baked into the templates, not bolted on after.
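As a rough illustration of the practices named above (access control, structured logging with timestamps, a human review gate, error handling), here is a minimal sketch of a guarded tool call using only the standard library. The policy tables and the `audit` and `call_tool` helpers are hypothetical names invented for this example. They are not taken from LiteLLM or from the scanner.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("agent.audit")

# Hypothetical policy tables for this sketch.
ALLOWED_ROLES = {"delete_records": {"admin"}, "search": {"admin", "analyst"}}
NEEDS_HUMAN_APPROVAL = {"delete_records"}

def audit(event: str, **fields) -> str:
    """Emit one JSON log line with a UTC timestamp and return it."""
    record = {"ts": datetime.now(timezone.utc).isoformat(), "event": event, **fields}
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line

def call_tool(tool: str, role: str, run, approve=lambda tool: False):
    """Run `run()` only if the role is allowed and, where required, a human approves."""
    # Access control: unknown tools have no allowed roles.
    if role not in ALLOWED_ROLES.get(tool, set()):
        audit("denied", tool=tool, role=role)
        raise PermissionError(f"{role} may not call {tool}")
    # Human oversight: sensitive tools need an explicit approval callback.
    if tool in NEEDS_HUMAN_APPROVAL and not approve(tool):
        audit("rejected_by_human", tool=tool, role=role)
        return None
    # Error handling: record the failure, then re-raise it.
    try:
        result = run()
    except Exception as exc:
        audit("error", tool=tool, role=role, error=repr(exc))
        raise
    audit("ok", tool=tool, role=role)
    return result
```

In this sketch `approve` defaults to rejecting, so a sensitive tool does nothing unless a caller wires in a real review step, for example `call_tool("delete_records", "admin", do_delete, approve=ask_operator)` with caller-supplied `do_delete` and `ask_operator`.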

What I'm working on next: a fine-tuned Llama 3.2 1B model that runs locally and does deeper semantic analysis beyond regex pattern matching. The goal is "your code never leaves your machine", because if you're worried about compliance, shipping your source code to a cloud API defeats the purpose.

The scanner, the benchmark data, and the full 5,754-file report are all Apache 2.0. Rip it apart, tell me what's wrong, submit PRs.

Can the Most Abstract Math Make the World a Better Place?

https://www.quantamagazine.org/can-the-most-abstract-math-make-the-world-a-better-place-20260304/
1•baruchel•22s ago•0 comments

First Look at Glaze: A New Product by Raycast [video]

https://www.youtube.com/watch?v=FGbmmgH97ms
1•strzalek•40s ago•0 comments

I modeled traffic-weighted SLOs as probability chains in PromQL

1•lep_qq•1m ago•0 comments

Federal Reserve Bank of Kansas City Approves Limited Account for Kraken

https://www.kansascityfed.org/newsroom/2026-news-releases/federal-reserve-bank-of-kansas-city-app...
1•petethomas•1m ago•0 comments

There are now 10M live price points from AWS, Azure and GCP

https://www.infracost.io/blog/infracost-now-tracks-10-million-cloud-service-skus/
1•hkh•1m ago•0 comments

RNA is key to the dark matter of the genome − scientists are sequencing it

https://theconversation.com/rna-is-key-to-the-dark-matter-of-the-genome-scientists-are-sequencing...
1•PaulHoule•2m ago•0 comments

Connecting Volunteers with Volunteer Opportunities

https://wevolunteer.today/
1•tommymierzwa•2m ago•0 comments

Find quantum-vulnerable crypto in your code before 2030 hits

https://github.com/postquantdev/postquant
1•postquant•3m ago•1 comment

Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis

https://bfl.ai/research/self-flow
1•salkahfi•3m ago•0 comments

A Rational Analysis of the Effects of Sycophantic AI

https://arxiv.org/abs/2602.14270
1•todsacerdoti•4m ago•0 comments

A new lawsuit claims Gemini assisted in suicide

https://www.semafor.com/article/03/04/2026/a-new-lawsuit-claims-gemini-assisted-in-suicide
2•bhouston•4m ago•1 comment

Show HN: All the LM solutions on SWE-bench are bloated compared to humans

https://twitter.com/KLieret/status/2029219763423986030
1•lieret•4m ago•0 comments

A "Supergiant" Gold Find in China Could Redraw the Biggest-Mine Map

https://modernengineeringmarvels.com/2026/03/02/a-supergiant-gold-find-in-china-could-redraw-the-...
1•Brajeshwar•5m ago•0 comments

Discovery of the most compact known 3+1 type quadruple star system

https://www.natureasia.com/en/info/press-releases/detail/9255
1•Brajeshwar•5m ago•0 comments

Electron microscopy reveals micro defects in next-gen semiconductors

https://www.openaccessgovernment.org/electron-microscopy-reveals-micro-defects-in-next-gen-semico...
1•Brajeshwar•5m ago•0 comments

Rijksmuseum researchers discover new painting by Rembrandt van Rijn

https://www.rijksmuseum.nl/en/press/press-releases/rijksmuseum-researchers-discover-new-painting-...
1•ohjeez•6m ago•0 comments

Show HN: Skill Eval – A framework for testing the quality of AI agent skills

https://blog.mgechev.com/2026/02/26/skill-eval/
1•mgechev•6m ago•0 comments

Websites That Work Well on Basic Web Browsers

https://eink.link/
1•TigerUniversity•6m ago•0 comments

Lilaq: Advanced Data Visualization in Typst

https://lilaq.org/
1•fanf2•6m ago•0 comments

Wezzly – An AI with Eyes That Sees Your Screen continuously in real time

https://github.com/idobaibai-wezzly/wezzly-companion-public
1•idobaiba•7m ago•1 comment

Show HN: Engram update – 92% DMR, hosted API, lessons shipping agent memory

https://github.com/tstockham96/engram
1•tstockham•9m ago•0 comments

People are selling your home address online. This privacy tool will help

https://www.bbc.com/future/article/20260303-the-most-important-google-setting-you-arent-using
1•bookofjoe•9m ago•0 comments

AI music and video is practically indistinguishable from physical content

https://www.youtube.com/watch?v=XHLj69CN_JY
1•dylnbk•9m ago•1 comment

Show HN: SLOK – SLO composition with traffic-weighted service chains in K8s

1•lep_qq•10m ago•0 comments

Traces vs. Logs for Debugging Distributed Systems

https://www.dash0.com/knowledge/traces-vs-logs
1•ayoisaiah•11m ago•0 comments

Replit vs. Amp

https://techstackups.com/comparisons/replit-vs-amp/
1•ritzaco•11m ago•0 comments

Subverting AI Agent Logging with a Git Post-Commit Hook

https://danq.me/2026/03/03/ai-agent-logging/
2•bovermyer•11m ago•0 comments

I Put a Full JVM Inside a Browser Tab

https://bmarti44.substack.com/p/i-put-a-full-jvm-inside-a-browser
1•birdculture•13m ago•0 comments

Architect Linter Pro v6.0 – CFG-Based Architecture Linting for Web Teams

https://github.com/sergiogswv/architect-linter-pro
1•sergegriimm•13m ago•1 comment

Show HN: I packaged decade of video infra battle scars into tools for AI agents

3•ashu_trv•14m ago•0 comments