How to Parse XML Files in Python with ElementTree
ElementTree makes parsing XML in Python simple: 1. Use ET.parse() to read a file or ET.fromstring() to parse a string; 2. Use .find() to get the first matching child element and .findall() to get all matching children, then read attributes with .get() and text content with .text; 3. Compare the result of find() against None to check whether an element exists, or use findtext() with a default value; 4. Use the supported XPath subset, such as './/title' or './/book[@id="1"]', for deeper searches; 5. Add new elements with ET.SubElement() and, after modifying the content, call tree.write() to save to a file. It is also worth catching ParseError with try-except, using the {namespace}tag syntax when the XML has namespaces, and using iterparse() on large files to save memory. This approach needs no external dependencies and suits common scenarios such as reading configuration files and data exchange.

Parsing XML files in Python is straightforward using the built-in xml.etree.ElementTree module, commonly referred to as ElementTree. It's lightweight, easy to use, and well suited to reading, modifying, and creating XML data.

Here's how to work with XML files using ElementTree in real-world scenarios.
1. Load and Parse an XML File
Start by importing the module and parsing an XML file from disk.

import xml.etree.ElementTree as ET

# Parse the XML file
tree = ET.parse('data.xml')
root = tree.getroot()  # Get the root element
If your XML is in a string instead of a file:
xml_string = '''
<books>
<book id="1">
<title>Python Basics</title>
<author>John Doe</author>
</book>
</books>
'''
root = ET.fromstring(xml_string)
✅ Use ET.parse() for files, ET.fromstring() for strings.
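The difference between the two entry points is easy to miss: ET.parse() returns an ElementTree object, while ET.fromstring() returns the root Element directly. The sketch below is a minimal illustration of that, assuming a temporary file stands in for data.xml; the variable names are hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

xml_string = """<books>
  <book id="1">
    <title>Python Basics</title>
    <author>John Doe</author>
  </book>
</books>"""

# ET.fromstring() returns the root Element directly.
root_from_string = ET.fromstring(xml_string)

# ET.parse() returns an ElementTree; getroot() gives the root Element.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.xml")  # stand-in for a real file
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(xml_string)
    tree = ET.parse(path)
    root_from_file = tree.getroot()

print(root_from_string.tag, root_from_file.tag)
```

If you later need to call write(), a root obtained from a string can be wrapped with ET.ElementTree(root).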
2. Navigate and Access XML Elements
Once you have the root, you can traverse the tree using methods like .find(), .findall(), and .iter().
Example XML:
<library>
  <book id="1">
    <title>Learning Python</title>
    <author>Mark Smith</author>
  </book>
  <book id="2">
    <title>Data Science with Python</title>
    <author>Anna Lee</author>
  </book>
</library>
Access elements:
# Get the first <book> element
first_book = root.find('book')

# Get all <book> elements
books = root.findall('book')
for book in books:
    title = book.find('title').text
    author = book.find('author').text
    book_id = book.get('id')  # Get attribute
    print(f"ID: {book_id}, Title: {title}, Author: {author}")
Note: .find() returns the first matching child; .findall() returns a list.
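The .iter() method is mentioned above but not shown. As a rough sketch of how it differs from .findall(): with a plain tag name, .findall() looks only at direct children, while .iter() walks the whole subtree. The <shelf> wrapper here is a hypothetical addition to the example XML, used only to make the difference visible.

```python
import xml.etree.ElementTree as ET

library_xml = """<library>
  <book id="1"><title>Learning Python</title></book>
  <shelf>
    <book id="2"><title>Data Science with Python</title></book>
  </shelf>
</library>"""
root = ET.fromstring(library_xml)

# findall('book') matches only direct children of root.
direct_ids = [book.get("id") for book in root.findall("book")]

# iter('book') matches <book> elements at any depth.
all_ids = [book.get("id") for book in root.iter("book")]

print(direct_ids)
print(all_ids)
```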
3. Handle Attributes and Text Content
XML elements can have attributes and text. Use .get() for attributes and .text for content.
for book in root.findall('book'):
    print("ID:", book.get('id'))  # Attribute
    print("Title:", book.find('title').text)  # Text inside child
If a tag might be missing, avoid errors by checking:
title_elem = book.find('title')
title = title_elem.text if title_elem is not None else "Unknown"
Or use a default:
title = book.findtext('title', default='No Title')
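One detail worth knowing when relying on defaults: findtext() uses the default only when the element is missing. If the element exists but is empty, it returns an empty string instead. The sketch below illustrates this with a made-up <book> that has an empty <title> and no <author>; the "isbn" attribute is also hypothetical.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring('<book id="7"><title></title></book>')

# Missing element: the default is returned.
author = book.findtext("author", default="Unknown")

# Element present but empty: an empty string is returned, not the default.
title = book.findtext("title", default="No Title")

# .get() also accepts a fallback for missing attributes.
isbn = book.get("isbn", "n/a")

print(repr(author), repr(title), repr(isbn))
```

If an empty title should also fall back, something like `title or "No Title"` covers both cases.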
4. Search with XPath-Like Expressions
ElementTree supports basic XPath expressions for deeper searches.
# Find all titles under any book
titles = root.findall('.//title')

# Find books with a specific attribute
special_books = root.findall('.//book[@id="1"]')

# Find any element with attribute 'id'
elements_with_id = root.findall('.//*[@id]')
Note: Only a subset of XPath is supported, but it's enough for most use cases.
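To give a feel for what the supported subset covers, here is a small sketch that runs a few more predicate forms against the library example: an attribute match followed by a child step, a match on a child's text, and a 1-based position. This is an illustration, not a complete list of the supported syntax.

```python
import xml.etree.ElementTree as ET

library_xml = """<library>
  <book id="1">
    <title>Learning Python</title>
    <author>Mark Smith</author>
  </book>
  <book id="2">
    <title>Data Science with Python</title>
    <author>Anna Lee</author>
  </book>
</library>"""
root = ET.fromstring(library_xml)

# Attribute predicate, then step down to the child <title>.
title_of_two = root.find('.//book[@id="2"]/title').text

# Predicate on the text of a child element.
by_author = root.findall("book[author='Anna Lee']")

# Position predicates are 1-based.
first_book = root.find("book[1]")

print(title_of_two, [b.get("id") for b in by_author], first_book.get("id"))
```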
5. Modify and Write Back to File
You can also edit the XML and save it.
# Add a new book
new_book = ET.SubElement(root, 'book', attrib={'id': '3'})
ET.SubElement(new_book, 'title').text = 'Web Scraping with Python'
ET.SubElement(new_book, 'author').text = 'Jane Cole'

# Modify an existing element
for book in root.findall('book'):
    if book.find('author').text == 'Anna Lee':
        book.find('author').text = 'A. Lee'

# Write changes to a new file
tree.write('updated_data.xml', encoding='utf-8', xml_declaration=True)
Note: Always call .write() on the tree, not the root.
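A quick way to check an edit is to write the tree out and parse the result again. The sketch below does this in memory, assuming an io.BytesIO buffer in place of updated_data.xml. The ET.indent() call is optional pretty-printing that exists only on Python 3.9 and later, so it is guarded.

```python
import io
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<library><book id="1"><title>Learning Python</title></book></library>'
)
tree = ET.ElementTree(root)  # wrap the root so that write() is available

# Add a new book, as in the example above.
new_book = ET.SubElement(root, "book", attrib={"id": "3"})
ET.SubElement(new_book, "title").text = "Web Scraping with Python"

# Optional pretty-printing (Python 3.9+ only).
if hasattr(ET, "indent"):
    ET.indent(tree, space="  ")

# Write to an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
tree.write(buffer, encoding="utf-8", xml_declaration=True)
output = buffer.getvalue().decode("utf-8")
print(output)

# Parse the written bytes again to confirm the change survived.
reloaded = ET.fromstring(buffer.getvalue())
titles = [t.text for t in reloaded.iter("title")]
```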
Final Tips
Handle malformed XML with try-except:
try:
    tree = ET.parse('data.xml')
except ET.ParseError as e:
    print(f"XML parsing error: {e}")
Namespaces? Use {namespace}tagname in searches if needed.
For large files, consider iterparse() to stream and save memory.
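The three tips above can be sketched together in one short example. The feed document, the namespace URI http://example.com/ns, and the prefix "f" are all made up for illustration. An in-memory buffer stands in for a large file.

```python
import io
import xml.etree.ElementTree as ET

feed_xml = b"""<feed xmlns="http://example.com/ns">
  <entry><name>first</name></entry>
  <entry><name>second</name></entry>
</feed>"""

# Namespaces: use the {uri}tag form, or pass a prefix-to-URI mapping.
ns = {"f": "http://example.com/ns"}  # the prefix "f" is an arbitrary choice
root = ET.fromstring(feed_xml)
names_braced = [
    e.text
    for e in root.findall("{http://example.com/ns}entry/{http://example.com/ns}name")
]
names_prefixed = [e.text for e in root.findall("f:entry/f:name", ns)]

# iterparse(): handle each <entry> as soon as it is complete, then clear it
# so the children already processed do not stay in memory.
streamed = []
for event, elem in ET.iterparse(io.BytesIO(feed_xml), events=("end",)):
    if elem.tag == "{http://example.com/ns}entry":
        streamed.append(elem.findtext("f:name", namespaces=ns))
        elem.clear()

# ParseError: the exception carries a (line, column) position.
error_position = None
try:
    ET.fromstring("<a><b></a>")  # deliberately malformed
except ET.ParseError as err:
    error_position = err.position
    print(f"XML parsing error: {err}")

print(names_prefixed, streamed)
```

Note that when a document declares a default namespace, as this one does, every tag carries that namespace, so a plain 'entry' search would find nothing.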
In short, ElementTree gives you a clean way to work with XML without external dependencies. It's not as powerful as a full XPath engine, but for most tasks, such as reading config files, processing feeds, or simple data exchange, it's more than enough.