Nice to see Strix using GPT-5.3 to finish those HTB machines. On our side we run Strix with --output commands.json and pipe that command list into a tiny replay harness before we accept a vuln. The harness replays each recorded HTTP request or shell command inside a sandbox and compares exit codes against the recorded ones; only the steps that reproduce the same signal make it into the report. That keeps stray hallucinated CVEs out of the compliance narrative while still letting the agent explore freely. Have you tried re-running the recorded steps for the hits you liked best?
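For anyone who wants to try the idea, here is a minimal sketch of that kind of replay check. It is not our actual harness, and the file layout is an assumption: it treats commands.json as a JSON list of objects with hypothetical "argv" and "exit_code" fields, which may not match what Strix really writes. It only covers shell commands (not HTTP requests) and does no sandboxing itself, so it should be run inside a throwaway container or VM.

```python
import json
import subprocess


def replay(steps, timeout=30):
    """Re-run each recorded step; keep those whose exit code matches.

    `steps` is a list of dicts with the hypothetical keys
    "argv" (list of strings) and "exit_code" (int).
    """
    survivors = []
    for step in steps:
        try:
            result = subprocess.run(
                step["argv"], capture_output=True, timeout=timeout
            )
        except (OSError, subprocess.TimeoutExpired):
            # Binary missing or step hung: treat as not reproduced.
            continue
        if result.returncode == step["exit_code"]:
            survivors.append(step)
    return survivors


def load_and_replay(path):
    """Load a recorded command list from `path` and return the surviving steps."""
    with open(path) as fh:
        recorded = json.load(fh)
    return replay(recorded)
```

Calling load_and_replay("commands.json") returns only the steps that reproduced. Exit codes are a coarse signal, so a stricter version would probably also diff stdout or the HTTP status before accepting a finding.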
guerython•1h ago