How to Parse XML Files in Python with ElementTree-XML/RSS Tutorial-php.cn

Table of Contents

1. Load and Parse an XML File

2. Navigate and Access XML Elements

Example XML:

Access elements:

3. Handle Attributes and Text Content

4. Search with XPath-Like Expressions

5. Modify and Write Back to File

Final Tips

Home

Backend Development

XML/RSS Tutorial

How to Parse XML Files in Python with ElementTree

Karen Carpenter

Sep 17, 2025 am 04:12 AM

Use ElementTree to easily parse XML files: 1. Use ET.parse() to read the file or ET.fromstring() to parse the string; 2. Use .find() to get the first matching child element, .findall() to get all matching elements, and obtain attributes and .text to get text content; 3. Use find() to determine whether it exists or use findtext() to set the default value; 4. Support basic XPath syntax such as './/title' or './/book[@id="1"]' for in-depth searches; 5. Add new elements through ET.SubElement(), and after modifying the content, call tree.write() to save to the file; it is also recommended to use try-except to catch ParseError exception, pay attention to syntax when handling XML with namespace, and large files can use iterparse() to save memory. This method does not require external dependencies and is suitable for common scenarios such as configuration file reading and data exchange.

How to Parse XML Files in Python with ElementTree

Parsing XML files in Python is straightforward using the built-in xml.etree.ElementTree module, commonly referred to as ElementTree . It's lightweight, easy to use, and perfect for reading, modifying, and creating XML data.

Here's how to work with XML files using ElementTree in real-world scenarios.

1. Load and Parse an XML File

Start by importing the module and parsing an XML file from disk.

 import xml.etree.ElementTree as ET

# Parse the XML file
tree = ET.parse(&#39;data.xml&#39;)
root = tree.getroot() # Get the root element

If your XML is in a string instead of a file:

 xml_string = &#39;&#39;&#39;
<books>
  <book id="1">
    <title>Python Basics</title>
    <author>John Doe</author>
  </book>
</books>
&#39;&#39;&#39;

root = ET.fromstring(xml_string)

✅ Use ET.parse() for files, ET.fromstring() for strings.

2. Navigate and Access XML Elements

Once you have the root, you can traverse the tree using methods like .find() , .findall() , and .iter() .

Example XML:

 <library>
  <book id="1">
    <title>Learning Python</title>
    <author>Mark Smith</author>
  </book>
  <book id="2">
    <title>Data Science with Python</title>
    <author>Anna Lee</author>
  </book>
</library>

Access elements:

 # Get the first <book> element
first_book = root.find(&#39;book&#39;)

# Get all <book> elements
books = root.findall(&#39;book&#39;)

for book in books:
    title = book.find(&#39;title&#39;).text
    author = book.find(&#39;author&#39;).text
    book_id = book.get(&#39;id&#39;) # Get attribute
    print(f"ID: {book_id}, Title: {title}, Author: {author}")

? .find() returns the first matching child; .findall() returns a list.

3. Handle Attributes and Text Content

XML elements can have attributes and text. Use .get() for attributes and .text for content.

 for book in root.findall(&#39;book&#39;):
    print("ID:", book.get(&#39;id&#39;)) # Attribute
    print("Title:", book.find(&#39;title&#39;).text) # Text inside child

If a tag might be missing, avoid errors by checking:

 title_elem = book.find(&#39;title&#39;)
title = title_elem.text if title_elem is not None else "Unknown"

Or use a default:

 title = book.findtext(&#39;title&#39;, default=&#39;No Title&#39;)

4. Search with XPath-Like Expressions

ElementTree supports basic XPath expressions for deeper searches.

 # Find all titles under any book
titles = root.findall(&#39;.//title&#39;)

# Find books with a specific attribute
special_books = root.findall(&#39;.//book[@id="1"]&#39;)

# Find any element with attribute &#39;id&#39;
elements_with_id = root.findall(&#39;.//*[@id]&#39;)

? Only a subset of XPath is supported, but it's enough for most use cases.

5. Modify and Write Back to File

You can also edit the XML and save it.

 # Add a new book
new_book = ET.SubElement(root, &#39;book&#39;, attrib={&#39;id&#39;: &#39;3&#39;})
ET.SubElement(new_book, &#39;title&#39;).text = &#39;Web Scraping with Python&#39;
ET.SubElement(new_book, &#39;author&#39;).text = &#39;Jane Cole&#39;

# Modify an existing element
for book in root.findall(&#39;book&#39;):
    if book.find(&#39;author&#39;).text == &#39;Anna Lee&#39;:
        book.find(&#39;author&#39;).text = &#39;A. Lee&#39;

# Write changes to a new file
tree.write(&#39;updated_data.xml&#39;, encoding=&#39;utf-8&#39;, xml_declaration=True)

? Always call .write() on the tree , not the root .

Final Tips

Handle malformed XML with try-except:

 try:
    tree = ET.parse(&#39;data.xml&#39;)
except ET.ParseError as e:
    print(f"XML parsing error: {e}")

Namespaces? Use {namespace}tagname in searches if needed.
For large files, consider iterparse() to stream and save memory.

Basically, ElementTree gives you a clean, independent way to work with XML without external dependencies. It's not as powerful as full XPath engines, but for most tasks — reading config files, processing feeds, or simple data exchange — it's more than enough.

The above is the detailed content of How to Parse XML Files in Python with ElementTree. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undress AI Tool

Undress images for free

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undresser.AI Undress

AI-powered app for creating realistic nude photos

ArtGPT

AI image generator for creative art from text prompts.

Stock Market GPT

AI powered investment research for smarter decisions

Hot Article

The Notepad upgrade, cheaper YouTube TV, and Nova Launcher's new owner: News roundup

4 weeks ago By DDD

How to get Iron Ore in Pokémon Pokopia

1 months ago By Jack chen

How to apply the facade pattern (Facade) in Golang Go language simplifies the API of complex systems

4 weeks ago By DDD

Solve the error of multidict build failure when installing Python package

1 months ago By DDD

Catch a Monster Best Mounts and Locations

4 weeks ago By DDD

Popular tool

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Douyin level price list 1-75

20521

wifi shows no ip assigned

13634

Virtual mobile phone number to receive verification code

11966

Where is the login entrance for gmail email?

9005

How to turn off windows security center

8505

Related knowledge

How to convert XML to YAML for DevOps? (Configuration Management) Mar 12, 2026 am 12:11 AM

xmltodict PyYAMListhesafestcomboforDevOpsconfigfilesbecauseitpreservescomments,CDATA,namespaces,andattributesaccurately,unlikerawXML-to-YAMLtoolsorCLIutilitieslikeyqandxmllintwhichsilentlydropcriticalmetadata.

How to format and beautify XML code in Notepad ? (Pretty Print) Mar 07, 2026 am 12:20 AM

Notepad needs to manually install and enable the XMLTools plug-in to format XML; if the tags are messed up or the content is lost after formatting, it means that the XML itself is illegal, and there are problems such as unclosed tags or illegal characters.

How to minify XML files for faster web loading? (Performance Optimization) Mar 08, 2026 am 12:16 AM

RunningminifyonXMLwithoutunderstandingitsrulesbreaksparsingoralterssemanticsbecausewhitespacecanbemeaningful;safeminificationrequiresdata-orientedXML,controlledgeneration/consumption,andstrictparserawareness.

How to convert an XML file to a Word document? (Reporting) Mar 09, 2026 am 01:05 AM

python-docx does not support direct reading of XML files. You need to use xml.etree.ElementTree or lxml to parse the XML extraction fields first, and then write them into the Document object segment by segment. Explicit declaration of prefixes is required to process namespaces, and manual manipulation of the underlying XML is required for table merging and styling. Chinese paths should be avoided when saving.

How to parse XML data from a URL API? (Rest Services) Mar 13, 2026 am 12:06 AM

To parse remote XML API in Python, you need to use requests to get the response and then check the status code and Content-Type. Prioritize using r.text with xml.etree.ElementTree to parse; when encountering a namespace, you need to pass the namespace dictionary; use iterparse to stream large files and clear them manually; front-end JS requires CORS support or proxy.

How to use Attributes vs Elements in XML? (Design Best Practices) Mar 16, 2026 am 12:26 AM

You should use attributes to store short metadata (such as id, type), and use elements to store scalable content data; because attributes do not support namespaces, duplication, nesting, and internationalization, their parsing is error-prone and maintenance is difficult.

How to open and view XML files in Windows 11? (Beginner Guide) Mar 12, 2026 am 01:02 AM

The XML file cannot be opened by double-clicking because it is associated with Notepad by default, causing confusion in the display. You should use Notepad, VSCode or Edge instead; Edge can format and report errors, while VSCode requires the installation of extensions such as RedHatXML for normal highlighting, indentation and verification.

How to read XML data in C# using LINQ? (.NET Development) Mar 15, 2026 am 12:43 AM

XDocument.Load() is the preferred method for reading local XML files and automatically handles encoding, BOM and format exceptions; absolute or correct relative paths are required; namespaces must be explicitly declared and participate in queries; Elements() and Descendants() behave differently and should be selected as needed; string parsing must capture XmlException and verify the source.