Understanding and Preventing XML External Entity (XXE) Attacks
1. XXE attacks work by getting an XML parser to process DOCTYPE declarations that contain external entities, for example to read /etc/passwd.
2. Common consequences include local file disclosure, SSRF, and denial of service.
3. The most common targets are APIs that accept XML input, file parsers, and legacy XML libraries.
4. Prevention centers on disabling DTDs and external entities, preferring JSON over XML where possible, strictly validating input, and keeping libraries updated.

XML External Entity (XXE) attacks happen when an application processes XML input that contains references to external entities—often malicious ones—leading to data exposure, server-side request forgery (SSRF), or even remote code execution in rare cases. Understanding how XXE works and how to prevent it is essential for secure application development.

What Is an XXE Attack?
At its core, XXE exploits the way XML parsers handle DOCTYPE declarations. These declarations can define custom entities that reference local or remote resources. For example:
    <?xml version="1.0"?>
    <!DOCTYPE foo [
      <!ENTITY xxe SYSTEM "file:///etc/passwd">
    ]>
    <user>&xxe;</user>
If the XML parser processes this input without safeguards, it may read the contents of /etc/passwd and return them in the response, exposing sensitive system files.
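To make the mechanism concrete, the sketch below uses Python's standard-library expat bindings to list the external entities a document declares, without resolving any of them. The function name `declared_external_entities` is hypothetical, and the sketch assumes no external-entity handler is installed on the parser, so the referenced file is never opened.

```python
import xml.parsers.expat


def declared_external_entities(xml_bytes: bytes) -> dict:
    """Map each external entity name to the system identifier it points at.

    Nothing is fetched: the parser only reports the declarations it sees.
    """
    found = {}

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # Internal entities carry a literal value; external ones carry a
        # system identifier (a URI) and their value is None.
        if system_id is not None:
            found[name] = system_id

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_bytes, True)
    return found


payload = b"""<?xml version="1.0"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<user>&xxe;</user>"""

print(declared_external_entities(payload))
```

A listing like this is only an inspection aid. A parser that goes on to resolve the `xxe` entity is what turns the declaration into a file read.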

Common outcomes of XXE:
- Reading local files (e.g., config files, passwords)
- Accessing internal services via SSRF (e.g., http://127.0.0.1:8080/admin)
- Denial of service via "billion laughs" attacks (entity expansion bombs)
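The entity-expansion case comes down to simple arithmetic: each level of nesting multiplies the output size. Here is a rough sketch, assuming the classic layout in which a short base string is referenced a fixed number of times per level. The function name and parameters are illustrative, not taken from any library.

```python
def expanded_length(base_length: int, refs_per_level: int, levels: int) -> int:
    """Characters produced when the outermost entity is fully expanded.

    Level 0 is the base string; every further level repeats the previous
    level `refs_per_level` times.
    """
    return base_length * refs_per_level ** levels


# A 3-character string nested 9 levels deep, 10 references per level.
size = expanded_length(base_length=3, refs_per_level=10, levels=9)
print(f"{size:,} characters")  # 3,000,000,000 characters
```

The declarations for such a document fit in well under a kilobyte, which is why a parser that expands entities without limits can be exhausted by a very small request.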
Where XXE Happens Most
- APIs accepting XML uploads (e.g., SOAP, REST with XML payloads)
- File parsers (e.g., DOCX, ODF, or SVG files that embed XML)
- Legacy systems using older XML libraries with unsafe defaults
How to Prevent XXE
The key is to disable external entity processing in your XML parser. Here's how:
✅ Disable DTDs Entirely (Best Practice)
Most applications don't need DOCTYPE declarations. Just turn them off:
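Java (SAX, DOM, etc.):
    import javax.xml.parsers.DocumentBuilderFactory;

    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    // Reject any document that contains a DOCTYPE declaration (disables DTDs completely)
    dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // If DOCTYPEs must stay enabled, at least turn off external entities:
    dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
    dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
Python (lxml):
    from lxml import etree

    # Do not expand entities and do not fetch anything over the network
    parser = etree.XMLParser(resolve_entities=False, no_network=True)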
.NET (System.Xml):
    using System.Xml;

    XmlReaderSettings settings = new XmlReaderSettings();
    settings.DtdProcessing = DtdProcessing.Prohibit; // or Parse only if needed
    settings.XmlResolver = null;
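For Python code that relies only on the standard library, a comparable "no DOCTYPE at all" policy can be sketched with the expat bindings. The names `DoctypeForbidden` and `ensure_no_doctype` are hypothetical. The idea is simply to abort parsing the moment a DOCTYPE declaration starts, before any entity it declares can be used.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def ensure_no_doctype(xml_bytes: bytes) -> None:
    """Parse the document once and fail if it declares a DOCTYPE."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called by expat as soon as it sees "<!DOCTYPE name ...".
        raise DoctypeForbidden(f"DOCTYPE '{name}' is not allowed")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)  # malformed XML raises ExpatError here


ensure_no_doctype(b"<user>alice</user>")  # passes silently

try:
    ensure_no_doctype(
        b'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>'
        b"<user>&xxe;</user>"
    )
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

Running a check like this before the document reaches the application's main parser keeps the policy in one place, at the cost of parsing the input twice.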
✅ Use Simpler Data Formats (If Possible)
If you don't need XML, switch to JSON—it's less prone to parsing pitfalls and doesn't support entities by design.
✅ Validate and Sanitize Input
Even with safe parsing, treat XML input like any untrusted data:
- Whitelist expected elements/attributes
- Reject XML with DOCTYPE if not required
- Monitor for suspicious payloads (e.g., file://, http://internal)
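The allowlisting step above might look like the following sketch. The `ALLOWED` schema (a `user` element with an `id` attribute, plus `name` and `email` children) is invented for illustration, and the code assumes the document has already been parsed by a hardened parser.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical schema: element name -> set of attribute names it may carry.
ALLOWED = {
    "user": {"id"},
    "name": set(),
    "email": set(),
}


def find_violations(root: ET.Element) -> List[str]:
    """Return a description of every element or attribute not on the allowlist."""
    problems = []
    for elem in root.iter():  # document order, starting with the root
        allowed_attrs = ALLOWED.get(elem.tag)
        if allowed_attrs is None:
            problems.append(f"unexpected element <{elem.tag}>")
            continue
        for attr in elem.attrib:
            if attr not in allowed_attrs:
                problems.append(f"unexpected attribute '{attr}' on <{elem.tag}>")
    return problems


good = ET.fromstring('<user id="1"><name>alice</name></user>')
bad = ET.fromstring('<user id="1" role="admin"><script/></user>')

print(find_violations(good))  # []
print(find_violations(bad))
```

Rejecting anything outside the expected shape is stricter than searching for known-bad strings, which is why the allowlist comes first in the list above.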
✅ Keep Libraries Updated
Older versions of libraries like Apache Xerces, older PHP XML parsers, or outdated .NET frameworks had unsafe defaults. Always use current, patched versions.
Bottom Line
XXE is a classic vulnerability that's easy to prevent—if you know to look for it. The biggest mistake? Assuming your XML parser is safe by default. It often isn't.
Disable DTDs, avoid unnecessary XML complexity, and test your apps with known XXE payloads during security reviews. That's it: no magic, just solid defaults and awareness.
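One way to turn "test with known XXE payloads" into a repeatable check is a canary test: write a random secret to a temporary file, reference it from an external entity, and see whether the secret appears in the parser's output. The sketch below is a minimal version. `leaks_local_file` and `parse_with_elementtree` are hypothetical names, and the ElementTree wrapper is only a stand-in for whatever XML handling the application really uses.

```python
import os
import pathlib
import tempfile
import xml.etree.ElementTree as ET


def leaks_local_file(parse_to_text) -> bool:
    """Return True if `parse_to_text` exposes a local file via an external entity.

    `parse_to_text` is any callable that takes XML bytes and returns the text
    the application would send back to the client.
    """
    with tempfile.TemporaryDirectory() as tmp:
        canary = pathlib.Path(tmp) / "canary.txt"
        secret = "xxe-canary-" + os.urandom(8).hex()
        canary.write_text(secret)
        payload = (
            '<?xml version="1.0"?>\n'
            f'<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "{canary.as_uri()}"> ]>\n'
            "<user>&xxe;</user>"
        ).encode("utf-8")
        try:
            output = parse_to_text(payload)
        except Exception:
            # Refusing the document outright counts as safe behaviour.
            return False
        return secret in output


def parse_with_elementtree(data: bytes) -> str:
    # Stand-in for the application's real XML handling code.
    return ET.tostring(ET.fromstring(data), encoding="unicode")


print(leaks_local_file(parse_with_elementtree))
```

A result of `True` means the parser resolved the entity and the configuration needs attention. A parser that refuses or ignores external entities should yield `False`.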