OpenAI has announced that it successfully disrupted three major cyber operations that attempted to exploit ChatGPT for malicious activities, including malware creation and phishing campaigns.
Russian Threat Actor Used ChatGPT for Malware Development
One of the disrupted groups was a Russian-language actor who misused ChatGPT to design and enhance a Remote Access Trojan (RAT) and credential-stealing tools aimed at evading detection. According to OpenAI, the group managed several ChatGPT accounts to develop and troubleshoot components used for post-exploitation and credential theft.
“These accounts appear to belong to Russian-speaking criminal communities, as their activities were also shared in a Telegram group associated with such actors,” OpenAI stated.
Even though OpenAI’s large language models (LLMs) refused direct requests for malicious code, the attackers tried to bypass restrictions by generating small, harmless code snippets that they later combined to create malicious workflows. These snippets included scripts for obfuscation, clipboard monitoring, and Telegram-based data exfiltration, none of which were harmful on their own.
OpenAI explained that the threat actor made both advanced and basic requests. Some required in-depth Windows knowledge and repeated debugging, while others were aimed at automating simple actions like password generation or job application submissions. The operator’s repeated use of the same accounts and code patterns suggested continuous malware development instead of occasional testing.
North Korean Hackers Target Diplomatic Entities
The second cluster originated from North Korea and showed overlaps with a cyber campaign revealed by Trellix in August 2025. That campaign targeted diplomatic missions in South Korea through spear-phishing emails carrying Xeno RAT malware.
OpenAI discovered that this group used ChatGPT for malware and command-and-control (C2) infrastructure development. Their activities also included creating macOS Finder extensions, configuring Windows Server VPNs, and converting Chrome extensions for Safari.
Furthermore, the actors employed ChatGPT to draft phishing emails, explore GitHub and cloud functions, and experiment with Windows API hooking, DLL loading, and in-memory execution methods used for credential theft.
Chinese Hackers Conduct Phishing Campaigns
The third disrupted operation involved a group linked to Proofpoint’s “UNK_DropPitch” (also tracked as UTA0388), a Chinese cyber actor previously known for targeting major investment firms within the Taiwanese semiconductor sector using a backdoor called HealthKick (also known as GOVERSHELL).
This group used ChatGPT to produce phishing materials in multiple languages (English, Chinese, and Japanese), streamline technical operations like remote command execution, secure traffic with HTTPS, and gather information on open-source security tools such as Nuclei and Fscan. OpenAI described the group as skilled yet lacking advanced sophistication.
Other Malicious and Influence Operations
Beyond these three primary hacking operations, OpenAI also blocked several accounts linked to scams and disinformation campaigns. These included:
- Networks from Cambodia, Myanmar, and Nigeria exploiting ChatGPT for translation, writing, and social media content used in investment scams.
- Individuals linked to Chinese state entities using ChatGPT to support surveillance of specific populations like Uyghurs and analyze social media activity.
- A Russian-linked media operator connected to “Stop News” that used AI tools to generate propaganda criticizing Western countries and promoting pro-Russia narratives.
- A Chinese covert influence campaign known as “Nine Line,” which used ChatGPT to produce social media posts attacking the Philippines’ president and discussing political and environmental issues in the South China Sea.
In some instances, suspected Chinese users asked ChatGPT to identify the organizers of a Mongolian petition and to trace funding sources for an X account critical of the Chinese government. OpenAI confirmed that only publicly available information was returned.
Interestingly, one scam group from Cambodia reportedly instructed ChatGPT to remove or avoid using long dashes (em dashes) from generated text after noticing discussions online that used them as an indicator of AI-generated content. This highlights how cybercriminals are adapting their tactics to make AI-written material harder to detect.
Anthropic Introduces AI Auditing Tool “Petri”
Following these developments, OpenAI’s competitor Anthropic released an open-source auditing framework called Petri (Parallel Exploration Tool for Risky Interactions) to improve AI safety and analyze model behavior.
Petri automates tests across multiple simulated user interactions, allowing researchers to evaluate how AI models handle risky behaviors such as deception, harmful cooperation, or delusional responses. It systematically scores conversation outcomes to help identify potential issues faster.


