A newly discovered security flaw in Apache Tika’s PDF parser module poses a serious threat to enterprise environments. The vulnerability, tracked as CVE-2025-54988, has been rated critical by security researchers because it enables attackers to steal sensitive data and send malicious requests to internal systems.
Key Points
- XXE flaw in Apache Tika’s PDF parser enables data theft through malicious XFA-embedded PDFs.
- Attackers can perform file access, network reconnaissance, and SSRF attacks.
- The issue affects multiple Apache Tika packages that bundle the PDF parser; upgrading to version 3.2.2 is recommended.
XXE Vulnerability Explained
The vulnerability arises from an XML External Entity (XXE) injection weakness in Apache Tika’s PDF parser module (org.apache.tika:tika-parser-pdf-module).
Security experts Paras Jain and Yakov Shafranovich from Amazon discovered that versions 1.13 through 3.2.1 are vulnerable when handling specially crafted XFA (XML Forms Architecture) files embedded inside PDFs.
Attackers exploit this by embedding malicious XFA content inside PDF files. When Tika processes such a file, the parser resolves external XML entity references in the XFA content instead of rejecting them, allowing attackers to:
- Steal sensitive files from local systems
- Conduct server-side request forgery (SSRF)
- Perform internal network reconnaissance
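The general defense against this class of bug is to configure the XML parser so that it refuses DOCTYPE declarations and external entities. The following is a minimal sketch of that hardening using the standard JDK `DocumentBuilderFactory`. It illustrates the technique only and is not the actual patch applied in Apache Tika; the class name `XxeSafeParser`, the helper names, and the sample payload are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;

// Hypothetical helper, not part of Apache Tika.
public class XxeSafeParser {

    // Builds a DOM parser factory that rejects DOCTYPE declarations,
    // which also rules out external entity definitions.
    public static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    // Parses XML text with the hardened factory.
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder = hardenedFactory().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns true if the hardened parser refuses the input.
    public static boolean isRejected(String xml) {
        try {
            parse(xml);
            return false;
        } catch (Exception e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Illustrative XXE payload: an external entity pointing at a local file.
        String malicious = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE xdp [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
                + "<xdp>&xxe;</xdp>";
        String benign = "<xdp><field>ok</field></xdp>";

        System.out.println("malicious rejected: " + isRejected(malicious));
        System.out.println("benign rejected: " + isRejected(benign));
    }
}
```

With the JDK's built-in parser, the payload containing a DOCTYPE is refused before any entity is resolved, while ordinary XML without a DOCTYPE still parses. An unhardened parser would instead try to read the referenced file or URL, which is what enables the file theft and SSRF outcomes listed above.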
Why XFA Increases Risk
XFA, a specification published by Adobe, allows PDFs to include dynamic XML-based forms. This is useful for document automation, but it means that a PDF parser must also parse the embedded XML. If that XML parser resolves external entities, XFA content becomes a dangerous vector for XXE exploitation.
Affected Apache Tika Packages
The flaw resides in the PDF parser module, but several packages that bundle or depend on it are affected as well, significantly broadening the attack surface:
- tika-parsers-standard-modules
- tika-parsers-standard-package
- tika-app
- tika-grpc
- tika-server-standard
Risk Assessment
| Category | Details |
|---|---|
| Affected Products | Apache Tika PDF parser (1.13 to 3.2.1), tika-parsers-standard-modules, tika-parsers-standard-package, tika-app, tika-grpc, tika-server-standard |
| Impact | Unauthorized access to files, data leakage, SSRF attacks |
| Exploit Prerequisites | Upload of crafted PDF with malicious XFA content, vulnerable Tika version running, minimal user interaction |
| Severity | Critical |
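To triage an inventory of deployments against the affected range in the table above (1.13 through 3.2.1, inclusive), version strings need to be compared numerically rather than as text, since "1.9" sorts after "1.13" as a string but precedes it as a version. The sketch below shows one way to do that. The class `TikaVersionCheck` and its methods are hypothetical, and treating a pre-release qualifier such as `-BETA` as equal to its base version is a simplifying assumption.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Apache Tika.
public class TikaVersionCheck {

    // Parses "3.2.1" into {3, 2, 1}. Missing components count as 0,
    // and any qualifier after '-' (e.g. "-BETA") is ignored (assumption).
    static int[] parse(String version) {
        String base = version.trim().split("-", 2)[0];
        String[] parts = base.split("\\.");
        int[] nums = new int[3];
        for (int i = 0; i < nums.length && i < parts.length; i++) {
            nums[i] = Integer.parseInt(parts[i]);
        }
        return nums;
    }

    // Numeric, component-wise comparison of two version strings.
    static int compare(String a, String b) {
        return Arrays.compare(parse(a), parse(b));
    }

    // True if the version falls in the range reported as vulnerable:
    // 1.13 through 3.2.1, inclusive.
    public static boolean isAffected(String version) {
        return compare(version, "1.13") >= 0 && compare(version, "3.2.1") <= 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.9", "1.13", "2.9.2", "3.2.1", "3.2.2"}) {
            System.out.println(v + " affected: " + isAffected(v));
        }
    }
}
```

A check like this only identifies candidates for patching. The version that matters is that of the Tika artifacts actually resolved on the classpath, which may differ from the version declared in a build file.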
Mitigation and Security Recommendations
Security teams should treat this vulnerability as a priority. If it is exploited, attackers could exfiltrate sensitive data, probe internal systems, or make the server send requests to attacker-controlled hosts.
Immediate Actions:
- Upgrade to Apache Tika version 3.2.2, which includes the official security patch.
- Implement PDF upload validation to block suspicious files.
- Apply network segmentation to minimize exploitation impact.
- Monitor hosts running Tika for XML parsing errors and unexpected outbound requests.
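As one possible form of the upload validation mentioned above, a service could flag PDFs that reference XFA before handing them to the parser. The sketch below scans the raw bytes for the `/XFA` name, which is the key a PDF's form dictionary uses to point at XFA data. It is a coarse heuristic under stated assumptions: the class `XfaUploadScreen` is hypothetical, and the scan can miss XFA references stored in compressed object streams or written with escaped name characters, so a negative result does not prove a file is safe. It complements the upgrade and does not replace it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical pre-filter, not part of Apache Tika.
public class XfaUploadScreen {

    private static final byte[] XFA_KEY = "/XFA".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the literal byte sequence "/XFA" occurs anywhere in the file.
    public static boolean mentionsXfa(byte[] pdfBytes) {
        outer:
        for (int i = 0; i + XFA_KEY.length <= pdfBytes.length; i++) {
            for (int j = 0; j < XFA_KEY.length; j++) {
                if (pdfBytes[i + j] != XFA_KEY[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Tiny stand-ins for uploaded files; real input would be the uploaded bytes.
        byte[] withXfa = "%PDF-1.7\n1 0 obj << /AcroForm << /XFA 2 0 R >> >> endobj"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] withoutXfa = "%PDF-1.7\n1 0 obj << /Type /Catalog >> endobj"
                .getBytes(StandardCharsets.US_ASCII);

        System.out.println("with XFA flagged: " + mentionsXfa(withXfa));
        System.out.println("without XFA flagged: " + mentionsXfa(withoutXfa));
    }
}
```

Flagged files could be quarantined or routed to an isolated worker rather than rejected outright, since legitimate forms also use XFA.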
Because Apache Tika is widely used in enterprise document workflows, organizations must act quickly to patch and secure their systems.


