Here's a demo: https://www.youtube.com/watch?v=8dQV3JbLrRE.
I built this because while tools like YARA are powerful, managing rule sets and decoding hex strings is slow. AI is great at explaining malware signatures, but I couldn't use ChatGPT for my work because pasting potential malware or sensitive logs into a web form is a massive security risk. I needed the intelligence of an LLM but with the privacy of an air-gapped machine.
Under the hood, it’s built on Python 3. I use subprocess to manage the heavy lifting of the scanning engines so the UI (built with CustomTkinter) doesn't freeze. The "secret sauce" isn't the AI itself, but the parser I wrote that converts the unstructured text output from YARA into a structured JSON format that the local LLM can actually understand and reason about.
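Roughly, the scan path looks like this (a simplified sketch with made-up names, not the actual Syd code): the YARA process runs in a worker thread and posts its raw output to a queue, so the Tk main loop never blocks.

    import queue
    import subprocess
    import threading

    results = queue.Queue()

    def run_yara_scan(rules_path, target_dir):
        # Child process does the heavy lifting; -r recurses, -s prints matched strings.
        proc = subprocess.run(
            ["yara", "-r", "-s", rules_path, target_dir],
            capture_output=True, text=True,
        )
        results.put(proc.stdout)

    def start_scan(rules_path, target_dir):
        # Worker thread keeps the CustomTkinter main loop responsive.
        threading.Thread(
            target=run_yara_scan, args=(rules_path, target_dir), daemon=True
        ).start()

The UI side just checks the queue on a timer (tkinter's after()) and renders whatever shows up.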
I’ve been using it to triage files for my own learning. In one case, Syd flagged a file matching a "SilentBanker" rule and the AI pointed out specific API calls for keylogging, saving me about 20 minutes of manual hex-editing. In the demo video linked, you can see this workflow: scanning a directory, hitting on a custom YARA rule, and having the local AI immediately analyze the strings.
Through this process, I learned that "AI wrappers" are easy, but AI orchestration is hard—getting the tools to output clean data for the LLM is the real challenge. I'd love to hear if there are other static analysis tools (like PEStudio or Capa) you consider essential for a workstation like this, or how you currently handle the privacy risk of using AI for log analysis.
paul2495•39m ago
A bit more context on how Syd works: it uses Dolphin Llama 3 (dolphin-2.9-llama3-8b) running locally via llama-cpp-python. You'll need about 12-14GB RAM when the model is loaded, plus ~8GB disk space for the base system (models, FAISS index, CVE database). The full exploit database is an optional 208GB add-on.
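For anyone who wants to picture the model side, it's roughly this with llama-cpp-python (illustrative path and parameters, not Syd's exact config):

    from llama_cpp import Llama

    # Quantized GGUF build; tune n_ctx / n_gpu_layers for your hardware.
    llm = Llama(
        model_path="models/dolphin-2.9-llama3-8b.Q4_K_M.gguf",
        n_ctx=8192,
        n_gpu_layers=-1,
    )

    out = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": "You are a malware analysis assistant."},
            {"role": "user", "content": "Explain this YARA match: SilentBanker, strings GetAsyncKeyState / SetWindowsHookExA"},
        ],
        max_tokens=512,
    )
    print(out["choices"][0]["message"]["content"])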
What makes this different from just wrapping an LLM: the core challenge wasn't the AI, it was making security tools produce data an LLM can actually understand. Tools like YARA, Volatility, and Nmap emit unstructured text in inconsistent formats, so I built parsers that convert it into structured JSON the LLM can reason about. Without that layer, you get hallucinations and garbage analysis.
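To make that concrete, here's the shape of the YARA parser, heavily simplified (the real one handles tags, metadata, and a lot more edge cases):

    import json
    import re

    MATCH_RE = re.compile(r"^(?P<rule>\S+) (?P<path>.+)$")
    STRING_RE = re.compile(r"^0x(?P<offset>[0-9a-f]+):\$(?P<name>\S+): (?P<data>.*)$")

    def yara_text_to_json(raw):
        # Turn YARA's line-oriented output (run with -s) into JSON for the LLM prompt.
        matches, current = [], None
        for line in raw.splitlines():
            s = STRING_RE.match(line)
            m = MATCH_RE.match(line)
            if s:
                if current is not None:
                    current["strings"].append(
                        {"offset": int(s["offset"], 16), "id": s["name"], "data": s["data"]}
                    )
            elif m:
                current = {"rule": m["rule"], "file": m["path"], "strings": []}
                matches.append(current)
        return json.dumps({"tool": "yara", "matches": matches}, indent=2)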
Current tool integrations:

- Red Team: Nmap (with CVE correlation), Metasploit, Sliver C2, exploit database lookup
- Blue Team: Volatility 3 (memory forensics), YARA (malware detection), Chainsaw (Windows event log analysis), PCAP analysis, Zeek, Suricata
- Cross-tool intelligence: YARA detection → CVE lookup → patching steps; Nmap scan → ready-to-run Metasploit module commands
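Once everything is JSON, the chaining step is conceptually simple. A stripped-down sketch of the YARA → CVE lookup (table and column names here are illustrative, not Syd's actual schema):

    import json
    import sqlite3

    def enrich_with_cves(yara_json, cve_db="cve.sqlite"):
        # Attach CVE records from the local database to each YARA match.
        report = json.loads(yara_json)
        conn = sqlite3.connect(cve_db)
        for match in report["matches"]:
            rows = conn.execute(
                "SELECT cve_id, summary FROM rule_cve_map WHERE rule_name = ?",
                (match["rule"],),
            ).fetchall()
            match["cves"] = [{"id": c, "summary": s} for c, s in rows]
        conn.close()
        return report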
The privacy angle exists because I couldn't paste potential malware samples, memory dumps, or customer network scans into ChatGPT without violating every security policy. Everything runs on localhost:11434—no data ever leaves your machine. For blue teamers handling sensitive investigations or red teamers on client networks, this is non-negotiable.
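In practice that means every analysis request is a call to the loopback interface, something like this (a sketch assuming an Ollama-compatible endpoint on 11434; the actual client code differs):

    import json
    import urllib.request

    def ask_local_llm(prompt):
        # POST to the local model server; nothing leaves 127.0.0.1.
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",
            data=json.dumps({"model": "dolphin-llama3:8b", "prompt": prompt, "stream": False}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]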
Real-world example from the demo: Syd scans a directory with YARA, hits on a custom ransomware rule, automatically looks up which CVE was exploited (EternalBlue/MS17-010), explains the matched API calls, and generates an incident response workflow, all in about 15 seconds. That beats manual analysis by a significant margin.
What I'd love feedback on:
1. Tool suggestions: What other security tools would you want orchestrated this way? I'm looking at adding Capa (malware capability detection) and potentially Ghidra integration.

2. For SOC/IR folks: How are you currently balancing AI utility with operational security? Are you just avoiding LLMs entirely, or have you found other solutions?

3. Beta testers: If you're actively doing red/blue team work and want to try this on real investigations, I'm looking for people to test and provide feedback. I'm especially interested in hearing what breaks or what features are missing.