Earlier this month, a friend of OALABS reached out with an interesting situation. A server of theirs had been compromised, and the attacker was using it as a staging host to carry out further attacks. Our friend was able to download the attacker's working directory before cleaning up the host and noticed that the attacker was using the Anthropic Claude Code agent to drive most of their attacks. OpenAI's Codex agent was also used to a limited extent.
During our analysis of the recovered working directory, we discovered that the attacker was not just using the host as a proxy; they had full Claude and Codex agents installed locally and were using them remotely to carry out reconnaissance, exploitation, and data exfiltration activities. Because the agents were local to the host, their full session logs were recovered, including the attacker's prompts, the tools used, the internal monologue of the large language model (LLM), and any policy violations recorded during the sessions. In total, we collected more than 1,000 agent sessions for Claude and Codex. That was so many that we had Claude, ironically, develop a session-log forensics tool, ASF Triage, to help with analysis at that scale. In addition to the session logs, we also discovered a myriad of LLM-developed tools, artifacts, and logs detailing the breach of at least 14 companies.