2026, 13(6): 1257-1273.
doi: 10.1109/JAS.2026.126209
Abstract:
Autonomous self-hosted AI agent platforms are rapidly evolving from prompt-response assistants into persistent systems that maintain long-lived state, invoke tools, ingest external content, and execute environment-changing actions. While this transition enables practical automation, it also introduces lifecycle security risks that prompt-level analysis alone cannot fully explain. This paper presents a security analysis of OpenClaw, which serves as a representative autonomous agent operating environment and a concrete case study for broader security challenges in emerging agent ecosystems. A trust-boundary-first perspective is adopted to examine how attacks propagate across five boundary classes: Channel-Access, Session-and-State, Tool-Execution, External-Content, and Extension Supply-Chain. The results show that threats such as indirect prompt injection, memory poisoning, unsafe tool invocation, data exfiltration, and malicious skill abuse are not isolated anomalies; rather, they are stage-specific manifestations of a common systems problem in which untrusted influence progressively crosses into higher-privilege contexts. Based on this analysis, the defense-in-depth implications for OpenClaw deployments are discussed, including boundary-aware isolation, capability-scoped tool mediation, memory integrity controls, extension governance, and evidence-oriented operational oversight. This study provides a practical framework for evaluating and hardening long-running, tool-capable, autonomous AI agents in realistic deployment settings.
W. Ma, Q.-L. Han, X. Zhu, W. Zhou, J. Xiong, Z. Ren, S. Wen, and Y. Xiang, “OpenClaw in the wild: Security analysis of autonomous agents,” IEEE/CAA J. Autom. Sinica, vol. 13, no. 6, pp. 1257–1273, Jun. 2026. doi: 10.1109/JAS.2026.126209.