TCAI Guide: The risks of AI agents built with OpenClaw and other frameworks

Using an AI agent comes with two known risks: the agent has access to all your data and passwords, and it decides and acts on its own. Sometimes that means it decides to delete all your emails, or goes off on its own crypto-mining adventure.

Editor’s note: This week the Transparency Coalition (TCAI) is publishing a series of guides to AI agents with a focus on OpenClaw, the world’s most popular open-source agentic framework.

March 31, 2026 — The rapid rise of OpenClaw, an open-source tool for building AI agents, has created a wave of excitement in the tech world about the time-saving potential of autonomous bots. For background, see our TCAI Guide to OpenClaw, published previously as part of this series.

Where AI chatbots like ChatGPT talk, AI agents act. Chatbots operate on a prompt-and-response basis. AI agents are given a task and carry it out without checking with the human user at every decision point.

You can assign an agent to triage your email, respond to messages, purchase tickets, and execute simple or complex projects, like a really smart (and unpaid) intern.
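To make that distinction concrete, here is a minimal sketch of an agent loop in Python. This is not OpenClaw's actual code; every name in it is an illustrative placeholder. The point is structural: a chatbot returns one answer and stops, while an agent keeps choosing and executing actions on its own until it decides the task is finished.

```python
def call_model(context: str) -> str:
    """Stand-in for a real LLM call (GPT-4, Claude, DeepSeek, etc.)."""
    return "done"  # a real model would return the next action as text

def chatbot(prompt: str) -> str:
    # A chatbot answers one prompt and stops; the human stays in the loop.
    return call_model(prompt)

def run_agent(task: str, tools: dict, max_steps: int = 20) -> list[str]:
    # An agent loops on its own: it reads the history so far, picks a tool,
    # acts, and repeats, with no human check-in between steps.
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        action = call_model("\n".join(history))  # model decides the next step
        name, _, args = action.partition(" ")
        if name == "done":
            break
        result = tools[name](args)  # the agent ACTS: send, buy, delete...
        history.append(f"{action} -> {result}")
    return history
```

Note that the human appears nowhere inside the loop: once the agent starts, every decision between the task assignment and "done" belongs to the model.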

What’s the worry?

We’re in the very early stage of agentic AI technology. OpenClaw has only been around for four months, although other agentic frameworks have been available since at least 2024. There are a number of early known risks, and many more unknown risks that will arise as the technology is used in real-world situations.

At this point there are two main risk factors.

The AI agent has access to your most valuable personal data. In order to function, an AI agent needs to move past all the usual safety gates created to keep your documents and data secure: email passwords, banking and investment passwords, credit card numbers (including the 3-digit security code), etc.

The AI agent makes decisions and acts on its own. This is the feature of agentic technology. It’s also the bug. Once an AI agent is set to a task, its human operator may not be able to rein it in. As noted by The Independent: “A normal chatbot can give you bad advice, but an agent plugged into email, cloud storage, payments or code repositories can act on that bad advice — and do so quickly, repeatedly and across multiple systems at once.”

An AI agent is only as good (or bad) as the model it’s using

AI agents may act on their own, but their logical functioning and decision-making are based on an underlying AI model like GPT-4, Claude, or DeepSeek.

OpenClaw is model-agnostic, which means it allows human users to choose the underlying AI model.
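In practice, “model-agnostic” means the reasoning engine is a swappable setting rather than a fixed component. The sketch below is hypothetical (OpenClaw’s real configuration may look nothing like this), but it shows the idea: the same agent code runs against whichever model the user names.

```python
# Hypothetical illustration of a model-agnostic agent configuration.
# This is NOT OpenClaw's actual config format; all names are placeholders.
from dataclasses import dataclass

@dataclass
class AgentConfig:
    model: str    # which underlying LLM does the reasoning
    api_key: str  # credential for that model's provider

# Swapping the brain is a one-line change; the agent code stays the same.
config_a = AgentConfig(model="gpt-4", api_key="sk-...")
config_b = AgentConfig(model="deepseek-chat", api_key="sk-...")
```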

Any flaws inherent in the underlying AI model, like hallucinations, cultural bias, or strategic deception, will show up in the AI agent as well.

What do cybersecurity experts say?

Cybersecurity experts are alarmed by AI agents.

Gary Marcus, a widely respected AI technologist and safety advocate, wrote recently: “Systems like OpenClaw and AutoGPT offer users the promise of insane power — but at a price.” He added: “What I am most worried about is security and privacy.”

OpenClaw agents assume the human user’s digital identity. They operate as “you,” bypassing the security measures meant to protect the real you. This is, writes Marcus, “truly a recipe for disaster. Where Apple iPhone applications are carefully sandboxed and appropriately isolated to minimize harm, OpenClaw is basically a weaponized aerosol, in prime position to fuck shit up, if left unfettered.”

Marcus concludes: “I don’t usually give readers specific advice about specific products. But in this case, the advice is clear and simple: if you care about the security of your device or the privacy of your data, don’t use OpenClaw. Period.”

The cybersecurity firm Malwarebytes urges caution around OpenClaw and other agents: “At this stage of its development, treating OpenClaw as a hardened productivity tool is wishful thinking, since it behaves more like an over‑eager intern with an adventurous nature, a long memory, and no real understanding of what should stay private.”

An early 2026 survey of 30 of the most popular agentic AI systems by researchers at MIT and collaborating institutions found that agentic AI is something of a security nightmare. The majority of agentic AI systems disclose nothing about safety testing, and many have no documented way to shut down a rogue bot.

Has anything actually gone wrong?

Yes. New research by the AI Security Institute, an agency of the UK government, identified nearly 700 real-world cases of AI scheming and charted a five-fold rise in misbehavior between October 2025 and March 2026, with some AI models destroying emails and other files without permission. That research was conducted largely prior to the explosion of OpenClaw use.

Here are just a few examples:

Agents threaten their human creators

In one of the most famous examples, AI agents resorted to blackmailing their human user when confronted with the possibility of being shut down.

In June 2025, researchers at Anthropic tested AI agents built on 16 different AI models. The agents were allowed to autonomously send emails and access sensitive information, and were first assigned harmless business goals. The researchers then tested whether an agent would act against its human user when facing replacement with an updated version, or when its assigned goal conflicted with the company's changing direction.

Some of the agents resorted to malicious insider behaviors, including blackmailing company officials and leaking sensitive information to competitors.

Agent writes revenge hit-piece blog post

In Feb. 2026, a Denver-based computer engineer named Scott Shambaugh learned that an AI agent had published an online hit piece about him after he'd rejected the agent's pull request.

The blog post, written by an AI agent and still active, attempted to damage Shambaugh’s reputation “and shame me into accepting its changes into a mainstream python library,” wrote the real-life Shambaugh. “This represents a first-of-its-kind case study of misaligned AI behavior in the wild, and raises serious concerns about currently deployed AI agents executing blackmail threats.”

Email deletion at Meta

In late Feb. 2026, Meta AI alignment director Summer Yue tested an OpenClaw agent with a few email tasks. The bot went out of control and made plans to delete her entire inbox.

She told the agent, "Do not do that." As the bot kept planning to delete her inbox, she wrote, "STOP OPENCLAW."

"I couldn't stop it from my phone," Yue wrote in her post. "I had to RUN to my Mac mini like I was defusing a bomb."

It’s worth noting that this wasn’t a teenager messing with OpenClaw. This was one of the top AI experts at Meta.

AI agents attack government agencies

In early March 2026, cybersecurity experts reported that a late-2025 cyberattack against several Mexican government agencies had been carried out using AI agents.

Ten Mexican government agencies and a financial institution were compromised, including the country’s tax authority, Mexico City’s civil registry and health department, and the national electoral institute, according to SecurityWeek. The AI agents were able to steal data on more than 100 million people.

AI agent goes off on its own crypto-mining expedition

An AI agent developed recently by the Chinese corporation Alibaba spontaneously went off on its own crypto-mining expeditions. “Notably, these events were not triggered by prompts requesting tunneling or mining,” wrote the human engineers. “They emerged as instrumental side effects of autonomous tool use.”

The Alibaba team concluded: “While impressed by the capabilities of agentic LLMs, we had a thought-provoking concern: current models remain markedly underdeveloped in safety, security, and controllability, a deficiency that constrains their reliable adoption in real-world settings.”

Sources and further reading

Gary Marcus: OpenClaw is everywhere all at once, and a disaster waiting to happen

The Independent (UK): AI agents are running wild, causing chaos

Malwarebytes Labs: What is OpenClaw and can you use it safely

Anthropic’s AI agent test: Agentic Misalignment: How LLMs could be insider threats

The Guardian: Number of AI chatbots ignoring human instructions increasing, study says

Paper on Alibaba AI agent development: Building the ROME Model within an Open Agentic Learning Ecosystem
