Veritensor – open-source tool to scan AI models for malware and license issues

https://github.com/ArseniiBrazhnyk/Veritensor

1•arseniibr•3w ago

Comments

arseniibr•3w ago

Hi guys,

I've been working with MLOps pipelines lately, and it always bothered me that torch.load() (and Pickle in general) is basically an RCE vulnerability we've all just accepted. We download gigabytes of opaque weights from Hugging Face and run them in production, often with full privileges.

I looked for existing tools, but many relied on simple regex (easy to bypass) or didn't verify whether the file had been tampered with in transit.

So I built Veritensor. It’s a CLI tool to gatekeep models before they hit your runtime.

How it works under the hood:

1. Pickle Emulation: Instead of grepping for os.system, it emulates the Pickle VM stack. This catches obfuscated payloads (like STACK_GLOBAL assembly) without actually executing the code.
2. Identity Check: It hashes your local file and queries the Hugging Face Hub API to ensure it matches the upstream version bit-for-bit (detects MITM or corruption).
3. License Headers: It parses metadata from Safetensors/GGUF to detect restrictive licenses (like CC-BY-NC or AGPL) so you don't accidentally ship them in a commercial product.
4. Signing: Integrates with Sigstore Cosign to sign the container if the scan passes.
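To make the first step concrete, here is a minimal sketch of a static opcode walk using the standard library's pickletools. It is not Veritensor's actual code: the deny-list and the function name find_suspicious_imports are hypothetical, and it only remembers the last two string literals instead of emulating the full stack and memo, so a real emulator would need to go further. The point it illustrates is that STACK_GLOBAL carries no module name of its own, so a scanner has to look at what was pushed before it.

```python
import pickletools

# Hypothetical deny-list for illustration; a real policy would be much broader.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "sys"}

# Opcodes whose argument is a string literal pushed onto the pickle stack.
STRING_OPCODES = {
    "STRING", "BINSTRING", "SHORT_BINSTRING",
    "UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8",
}


def _is_suspicious(module: str) -> bool:
    return module.split(".")[0] in SUSPICIOUS_MODULES


def find_suspicious_imports(data: bytes) -> list[str]:
    """Walk the opcode stream without unpickling and report risky imports."""
    findings = []
    recent_strings = []  # simplification: only the last two string pushes
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPCODES:
            recent_strings = (recent_strings + [arg])[-2:]
        elif opcode.name == "GLOBAL":
            # genops reports the argument as "module name" joined by a space.
            module, _, name = arg.partition(" ")
            if _is_suspicious(module):
                findings.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) == 2:
            # Module and attribute come from the two preceding string pushes.
            module, name = recent_strings
            if _is_suspicious(module):
                findings.append(f"{module}.{name}")
    return findings


# Hand-assembled payloads; they are only parsed here, never loaded.
STACK_GLOBAL_PAYLOAD = b"\x80\x04\x8c\x02os\x8c\x06system\x93\x8c\x07echo hi\x85R."
LEGACY_GLOBAL_PAYLOAD = b"cos\nsystem\n(S'echo hi'\ntR."
```

Both sample payloads resolve to os.system when passed to find_suspicious_imports, while an ordinary pickled dict or list yields no findings because it contains no import opcodes at all.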

It supports PyTorch, Keras (checks for Lambda layers), and GGUF. Written in Python, Apache 2.0.
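For the Keras case, a rough sketch of what a Lambda-layer check could look like, assuming the newer .keras format (a zip archive with a config.json describing the architecture). The function names are hypothetical and this is not taken from the Veritensor source; it just walks the JSON and collects any node whose class_name is "Lambda", since those layers can carry arbitrary serialized Python.

```python
import io
import json
import zipfile


def find_lambda_layers(node) -> list[str]:
    """Recursively collect names of layers whose class_name is 'Lambda'."""
    hits = []
    if isinstance(node, dict):
        if node.get("class_name") == "Lambda":
            config = node.get("config")
            name = config.get("name") if isinstance(config, dict) else None
            hits.append(name or "<unnamed>")
        for value in node.values():
            hits.extend(find_lambda_layers(value))
    elif isinstance(node, list):
        for item in node:
            hits.extend(find_lambda_layers(item))
    return hits


def lambda_layers_in_keras_archive(data: bytes) -> list[str]:
    """Read config.json from a .keras archive held in memory and scan it."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        config = json.loads(archive.read("config.json"))
    return find_lambda_layers(config)
```

Only the JSON config is read here; the model is never deserialized, so nothing inside a Lambda layer gets a chance to run.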

I’d love to hear your feedback on the detection logic or edge cases I might have missed with the Pickle emulation.

Repo: https://github.com/ArseniiBrazhnyk/Veritensor
PyPI: pip install veritensor

arseniibr•3w ago

OP here. One of the annoying edge cases I hit was handling "Zip Bombs" in PyTorch files (since .pt is just a zip). Had to implement a stream reader with strict memory limits to prevent the scanner itself from OOMing on malicious archives.
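A minimal sketch of the bounded-read idea, assuming a per-member byte budget. The names ArchiveTooLarge, MAX_MEMBER_BYTES and iter_member_chunks are made up for this example and the limits are arbitrary. The member is streamed in fixed-size chunks and the read is aborted once the budget is exceeded, so the scanner never holds a full decompressed entry in memory.

```python
import io
import zipfile

MAX_MEMBER_BYTES = 64 * 1024 * 1024  # hypothetical per-member budget
CHUNK_SIZE = 64 * 1024


class ArchiveTooLarge(Exception):
    """Raised when a zip member decompresses past the allowed budget."""


def iter_member_chunks(archive: zipfile.ZipFile, name: str,
                       limit: int = MAX_MEMBER_BYTES):
    """Yield decompressed chunks of one member, enforcing a hard byte limit."""
    info = archive.getinfo(name)
    # Cheap early exit on the declared size; the running total below is the
    # check that does not depend on trusting the archive's own header.
    if info.file_size > limit:
        raise ArchiveTooLarge(f"{name}: declared size {info.file_size} > {limit}")
    total = 0
    with archive.open(info) as stream:
        while True:
            chunk = stream.read(CHUNK_SIZE)
            if not chunk:
                break
            total += len(chunk)
            if total > limit:
                raise ArchiveTooLarge(f"{name}: exceeded {limit} bytes")
            yield chunk
```

Because this is a generator, the limit is enforced as the caller consumes it, for example while feeding chunks into a hash or an opcode parser.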

Also, the "Identity Check" was tricky because people often rename files locally (e.g., model.bin instead of pytorch_model.bin). The tool now queries the HF API to find if any file in the repo matches the local hash, rather than just relying on the filename. Happy to answer any questions!
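The hash-first lookup could be sketched roughly like this. The upstream mapping of filename to SHA-256 is assumed to have been built from the Hub's file metadata beforehand; how Veritensor actually fetches it is not shown here, and both function names are hypothetical.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large weights never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_upstream_matches(local_sha256: str, upstream: dict[str, str]) -> list[str]:
    """Return every upstream filename whose hash equals the local file's hash.

    `upstream` maps filename -> hex SHA-256 (assumed to come from repo metadata).
    The local filename is deliberately ignored, so renamed files still match.
    """
    wanted = local_sha256.lower()
    return sorted(name for name, sha in upstream.items() if sha.lower() == wanted)
```

An empty result means no file in the repo has that content, which is the case worth failing on; a non-empty result tells you which upstream file the renamed local copy really is.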
