Author here. I'm a software engineer with zero cybersecurity experience. I entered a beginner CTF at MWC Barcelona mostly to stress-test Pi (a coding agent) on something I knew nothing about.
The most interesting part for me was reviewing the full conversation logs afterward to figure out whether my steering actually helped or hurt. Turns out about 4 of my 24 interventions were counterproductive, and the agent solved the last two phases completely on its own.
The repo has the full writeup, all the exploit scripts, and a table rating every single human message I sent: https://github.com/kafkasl/ctf
Happy to answer questions about the process, the agent, or the competition.
pol_avec•1h ago
For those who don't know, Pi is also the minimal agent harness powering Open Claw.