What is XPath and how to use it with an XML document?
XPath is a query language for navigating XML documents and selecting elements and attributes through path expressions. It treats XML as a tree of nodes, supports axes (such as child, parent, and attribute) and functions (such as contains() and position()), and lets you select exactly the data you need. For example, /books/book[@id='1'] selects the book element whose id attribute is 1, and //author[text()='Jane Smith'] selects the author element whose text is Jane Smith. In Python, the lxml library runs queries through tree.xpath(). XPath is widely used in XSLT, XML Schema, web crawlers, and other fields.

XPath is a query language used to navigate and select elements and attributes within an XML document. It provides a way to locate specific nodes—like elements, attributes, text, or other parts of the XML structure—by using path expressions. XPath is widely used in technologies like XSLT, XML Schema, and web scraping tools to extract or manipulate data from XML.
Understanding XPath Basics
XPath treats an XML document as a tree of nodes. These nodes can be elements, attributes, text, comments, processing instructions, or the root node. To select parts of the document, you write path expressions similar to file system paths.
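For example, given an XML document whose root books element contains book elements, each with an id attribute and title and author children, you can use the following XPath expressions: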
- /books/book – selects all book elements under books
- /books/book/title – selects all title elements inside each book
- /books/book[@id='1'] – selects the book with attribute id equal to "1"
- //author[text()='Jane Smith'] – selects the author element with text "Jane Smith"
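The four expressions above can be tried out with a short script. This is a minimal sketch: the sample document is an assumption built to match the element names and results used in this article (the first author, "John Doe", is made up), and run_basic_queries is a hypothetical helper name.

```python
from lxml import etree

# Hypothetical sample document; the first author's name is an assumption.
SAMPLE = b"""<books>
  <book id="1">
    <title>Learning XML</title>
    <author>John Doe</author>
  </book>
  <book id="2">
    <title>Advanced XPath</title>
    <author>Jane Smith</author>
  </book>
</books>"""

root = etree.fromstring(SAMPLE)


def run_basic_queries(root):
    """Evaluate the four expressions from the list and summarize the matches."""
    return {
        # All book elements directly under the root books element
        "books": [b.get("id") for b in root.xpath("/books/book")],
        # The title element inside each book
        "titles": [t.text for t in root.xpath("/books/book/title")],
        # Only the book whose id attribute equals "1"
        "book_1": [b.get("id") for b in root.xpath("/books/book[@id='1']")],
        # Author elements whose text is exactly "Jane Smith"
        "jane": [a.text for a in root.xpath("//author[text()='Jane Smith']")],
    }


results = run_basic_queries(root)
for name, value in results.items():
    print(name, value)
```

Each query returns a list, so an expression that matches nothing yields an empty list rather than raising an error.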
Navigating with XPath Axes and Functions
XPath supports axes, which define the direction of navigation relative to the current node (e.g., parent, child, ancestor, following). Common axes include:
- child:: – selects direct children (default axis)
- parent:: – selects the parent node
- attribute:: or @ – selects attributes
- descendant:: – selects all descendants at any level
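As a rough illustration of these axes, the sketch below runs one query per axis with lxml. The sample document is an assumption (a small books catalog with made-up content), not data from a real source.

```python
from lxml import etree

# Hypothetical sample document used only for illustration.
SAMPLE = b"""<books>
  <book id="1">
    <title>Learning XML</title>
    <author>John Doe</author>
  </book>
  <book id="2">
    <title>Advanced XPath</title>
    <author>Jane Smith</author>
  </book>
</books>"""

root = etree.fromstring(SAMPLE)

# child:: is the default axis, so the explicit and abbreviated forms match the same nodes.
explicit = root.xpath("/child::books/child::book")
implicit = root.xpath("/books/book")
same = [b.get("id") for b in explicit] == [b.get("id") for b in implicit]

# parent:: steps from a title up to the book element that contains it.
parent_ids = [
    b.get("id")
    for b in root.xpath("//title[text()='Advanced XPath']/parent::book")
]

# attribute:: and @ both select attribute nodes; lxml returns their values as strings.
ids_long = root.xpath("/books/book/attribute::id")
ids_short = root.xpath("/books/book/@id")

# descendant:: reaches elements at any depth below the context node.
descendant_tags = [e.tag for e in root.xpath("/books/descendant::*")]

print(same, parent_ids, ids_long, ids_short, descendant_tags)
```

Results come back in document order, which is why the descendant query lists each book followed by its own title and author.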
XPath also includes built-in functions:
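- contains(text(), 'XML') – matches nodes whose text contains "XML" (the comparison is case-sensitive)
- position() – returns the position of the current node within the node-set being evaluated, counting from 1
- last() – returns the position of the last node in that node-set, which equals its size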
Example: /books/book[position()=1] selects the first book.
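The three functions can be combined with predicates as in the following sketch. It assumes lxml and a made-up sample document with two books; the variable names are illustrative only.

```python
from lxml import etree

# Hypothetical sample document used only for illustration.
SAMPLE = b"""<books>
  <book id="1">
    <title>Learning XML</title>
    <author>John Doe</author>
  </book>
  <book id="2">
    <title>Advanced XPath</title>
    <author>Jane Smith</author>
  </book>
</books>"""

root = etree.fromstring(SAMPLE)

# contains(): keep only titles whose text includes the substring "XML".
titles_with_xml = root.xpath("//title[contains(text(), 'XML')]/text()")

# position(): pick the first book among its siblings (positions start at 1).
first_id = root.xpath("/books/book[position()=1]/@id")

# last(): compare each node's position with the size of the node-set
# to pick the final book.
last_id = root.xpath("/books/book[position()=last()]/@id")

print(titles_with_xml)  # ['Learning XML']
print(first_id)         # ['1']
print(last_id)          # ['2']
```

Note that "Advanced XPath" is not matched by the contains() query, because it contains "XP" but not the full substring "XML".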
Using XPath in Practice
To use XPath, you typically employ it within a programming language or tool that supports XML processing. For instance, in Python using the lxml library:
from lxml import etree
# Parse XML
tree = etree.parse("books.xml")
# Use XPath to find all titles
titles = tree.xpath("//title/text()")
print(titles)  # Output: ['Learning XML', 'Advanced XPath']
# Find book by author name
book = tree.xpath("//book[author='Jane Smith']")
if book:
    print(book[0].get("id"))  # Output: 2
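The snippet above reads books.xml from disk and embeds the author name directly in the expression. A self-contained variant might parse the XML from a string and pass the name as an XPath variable, which lxml accepts as a keyword argument to xpath(). This is a sketch under assumptions: the sample document and the helper name find_book_ids_by_author are hypothetical.

```python
from lxml import etree

# Hypothetical sample document standing in for books.xml.
SAMPLE = b"""<books>
  <book id="1">
    <title>Learning XML</title>
    <author>John Doe</author>
  </book>
  <book id="2">
    <title>Advanced XPath</title>
    <author>Jane Smith</author>
  </book>
</books>"""


def find_book_ids_by_author(root, author):
    """Return the id attribute of every book written by the given author."""
    # $name is an XPath variable; lxml binds it from the keyword argument,
    # so quotes inside the author name do not break the expression.
    books = root.xpath("//book[author=$name]", name=author)
    return [b.get("id") for b in books]


root = etree.fromstring(SAMPLE)
print(find_book_ids_by_author(root, "Jane Smith"))  # ['2']
print(find_book_ids_by_author(root, "O'Brien"))     # []
```

Binding the value as a variable avoids building the expression with string formatting, which matters when the search term comes from user input.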
In web development or testing frameworks like Selenium, XPath helps locate elements in HTML (which is XML-like), especially when IDs or classes are missing.
Basically, XPath gives precise control over selecting and querying XML content. With practice, you can write efficient expressions to extract exactly the data you need.