Using SAX to parse XML in Java
Java offers two classic built-in ways to parse XML documents: DOM parsing and SAX parsing.
DOM parsing is powerful: it supports adding, deleting, modifying, and querying nodes. It treats the XML document as a document object and reads the whole thing into memory, so it is best suited to small documents.
SAX parsing reads the content sequentially, element by element, from beginning to end. It is inconvenient for modification, but it is well suited to reading large documents that only need to be read.
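For contrast, here is a minimal sketch of the DOM style, assuming a small in-memory document. The class name DomSketch and the sample XML string are made up for this illustration; the calls themselves are the standard javax.xml.parsers and org.w3c.dom APIs. It shows the two things DOM makes easy: jumping straight to any node, and modifying the tree.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DomSketch {
    // Hypothetical sample document, used only for this illustration.
    static final String XML =
          "<books><book id=\"001\"><title>Harry Potter</title></book>"
        + "<book id=\"002\"><title>Learning XML</title></book></books>";

    // Returns { text of the second title, id of the first book after changing it }.
    static String[] demo() throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The whole document is read into memory as a tree.
        Document doc = builder.parse(
                new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));

        // Random access: go straight to the second <title>.
        NodeList titles = doc.getElementsByTagName("title");
        String secondTitle = titles.item(1).getTextContent();

        // Modification: change an attribute on the in-memory tree.
        Element firstBook = (Element) doc.getElementsByTagName("book").item(0);
        firstBook.setAttribute("id", "100");

        return new String[] { secondTitle, firstBook.getAttribute("id") };
    }

    public static void main(String[] args) throws Exception {
        String[] result = demo();
        System.out.println(result[0] + " / " + result[1]); // prints "Learning XML / 100"
    }
}
```

Neither operation is available in SAX, which only hands you each piece of the document once as it streams past.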
This article focuses on SAX parsing; DOM parsing is left for a later article.
SAX parses documents in an event-driven way. Put simply, it is like watching a movie in a cinema: you watch from beginning to end and cannot go back (DOM, by contrast, can move back and forth).
While watching, every plot point, every tear, every brush of shoulders prompts your brain and nerves to receive or process that information.
Similarly, while SAX parses, reaching the start or end of the document, or the start or end of an element, triggers a callback method, and you can handle the corresponding event inside that callback.
These four methods are: startDocument(), endDocument(), startElement(), and endElement().
Reading the nodes alone is not enough, though. We also need the characters() method to process the text content inside an element.
Collecting these callback methods in one class gives us the handler we need (called the trigger in this article).
Generally, the document is read from the main method, while it is processed in the trigger. This is the so-called event-driven parsing approach.
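To make the callback order concrete, here is a small sketch that records each event in a list instead of printing it. The class name EventOrderSketch and the tiny input string are assumptions for the illustration; the handler is an anonymous subclass of DefaultHandler, and the parser comes from the standard SAXParserFactory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EventOrderSketch {
    // Parses the given XML string and returns the callbacks in the order they fired.
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                log.add("startDocument");
            }

            @Override
            public void endDocument() {
                log.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // The caller only starts the parse; all processing happens in the callbacks.
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        // prints [startDocument, start:a, start:b, end:b, end:a, endDocument]
        System.out.println(events("<a><b/></a>"));
    }
}
```

The calling code never loops over elements itself; it hands the handler to the parser and the parser drives everything.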

To summarize the flow inside the trigger: the document is started first, and then the elements are parsed one by one. The content of each element is passed to the characters() method.
Then the element ends. After all elements have been read, the document ends and parsing is complete.
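One caveat about characters(): the SAX API allows a parser to deliver the text of a single element in several chunks, and the method is also called for the whitespace between elements. A common way to cope, sketched below, is to append every chunk to a StringBuilder and only read the result in endElement(). The class name TitleCollector and the sample input (including the title "Tom & Jerry") are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TitleCollector extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inTitle;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        if ("title".equals(qName)) {
            inTitle = true;
            text.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per element, so append rather than assign.
        if (inTitle) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(text.toString().trim());
            inTitle = false;
        }
    }

    // Convenience method for the illustration: parse a string, return the titles.
    static List<String> titlesOf(String xml) throws Exception {
        TitleCollector handler = new TitleCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books>\n"
                + "  <book><title>Harry Potter</title></book>\n"
                + "  <book><title>Tom &amp; Jerry</title></book>\n"
                + "</books>";
        System.out.println(titlesOf(xml)); // prints [Harry Potter, Tom & Jerry]
    }
}
```

Because text outside a title element is ignored, the whitespace between elements never reaches the result.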
Now let's create the trigger class. To do so, it first needs to extend DefaultHandler.
Create SaxHandler and override the corresponding methods:
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxHandler extends DefaultHandler {

    /*
     * This method has three parameters:
     * ch is the character array passed back, which contains the element content;
     * start is the offset of the first character and length is the number of
     * characters to read (a count, not an end position).
     */
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        String content = new String(ch, start, length);
        System.out.println(content);
        super.characters(ch, start, length);
    }

    @Override
    public void endDocument() throws SAXException {
        // Prints "End parsing document"
        System.out.println("\n…………结束解析文档…………");
        super.endDocument();
    }

    /*
     * uri is the namespace URI;
     * localName is the tag name without a prefix, and is empty when the parser
     * is not namespace-aware (the default for SAXParserFactory);
     * qName is the qualified tag name as written in the document.
     */
    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // Prints "End parsing element <name>"
        System.out.println("结束解析元素 " + qName);
        super.endElement(uri, localName, qName);
    }

    @Override
    public void startDocument() throws SAXException {
        // Prints "Start parsing document"
        System.out.println("…………开始解析文档…………\n");
        super.startDocument();
    }

    /*
     * uri, localName and qName have the same meaning as in endElement();
     * attributes is the collection of attributes on the element.
     */
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        // Prints "Start parsing element <name>"
        System.out.println("开始解析元素 " + qName);
        if (attributes != null) {
            for (int i = 0; i < attributes.getLength(); i++) {
                // getQName() returns the qualified name of the attribute
                System.out.print(attributes.getQName(i) + "=\"" + attributes.getValue(i) + "\"");
            }
        }
        System.out.print(qName + ":");
        super.startElement(uri, localName, qName, attributes);
    }
}
XML document:
<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book id="001">
        <title>Harry Potter</title>
        <author>J K. Rowling</author>
    </book>
    <book id="002">
        <title>Learning XML</title>
        <author>Erik T. Ray</author>
    </book>
</books>
TestDemo test class:
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class TestDemo {
    public static void main(String[] args) throws Exception {
        // 1. Instantiate the SAXParserFactory
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // 2. Create the parser
        SAXParser parser = factory.newSAXParser();
        // 3. Get the document to parse, create the handler, and parse the document
        File f = new File("books.xml");
        SaxHandler dh = new SaxHandler();
        parser.parse(f, dh);
    }
}
Output result:
…………开始解析文档…………
开始解析元素 books
books:
开始解析元素 book
id="001"book:
开始解析元素 title
title:Harry Potter
结束解析元素 title
开始解析元素 author
author:J K. Rowling
结束解析元素 author
结束解析元素 book
开始解析元素 book
id="002"book:
开始解析元素 title
title:Learning XML
结束解析元素 title
开始解析元素 author
author:Erik T. Ray
结束解析元素 author
结束解析元素 book
结束解析元素 books
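…………结束解析文档…………
The Chinese strings in the output read "start parsing document", "start parsing element", "end parsing element", and "end parsing document". Although the output above shows the execution flow correctly, it is very messy.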
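A side note on the three name arguments that startElement() and endElement() receive. What they contain depends on whether the factory is namespace-aware, which it is not by default. The sketch below parses the same one-element document both ways and reports what the handler saw. The class name NamespaceSketch, the prefix b, and the URI urn:example:books are made up for this illustration; setNamespaceAware() is the standard SAXParserFactory method.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceSketch {
    // Parses xml and describes the arguments passed for the root element.
    static String describeRoot(String xml, boolean namespaceAware) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);

        String[] seen = new String[1]; // holder so the anonymous class can write to it
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                if (seen[0] == null) {
                    seen[0] = "uri=" + uri + " localName=" + localName + " qName=" + qName;
                }
            }
        };
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return seen[0];
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document with a made-up namespace URI.
        String xml = "<b:books xmlns:b=\"urn:example:books\"/>";
        System.out.println("namespace-aware: " + describeRoot(xml, true));
        System.out.println("default:         " + describeRoot(xml, false));
    }
}
```

With a namespace-aware factory, uri carries the namespace URI and localName the unprefixed name. With the default factory, the qualified name in the third argument is the one to rely on, which is why the handler in this article prints that argument.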
To show this process more clearly, we can also rewrite SaxHandler so that it reproduces the original XML document.
Rewritten SaxHandler class:
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxHandler extends DefaultHandler {

    // Echo the text content exactly as it was received.
    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        System.out.print(new String(ch, start, length));
        super.characters(ch, start, length);
    }

    @Override
    public void endDocument() throws SAXException {
        System.out.println("\nEnd parsing");
        super.endDocument();
    }

    // Print the closing tag, for example </title>.
    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        System.out.print("</");
        System.out.print(qName);
        System.out.print(">");
        super.endElement(uri, localName, qName);
    }

    // SAX does not report the XML declaration, so it is printed by hand.
    @Override
    public void startDocument() throws SAXException {
        System.out.println("Start parsing");
        String s = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
        System.out.println(s);
        super.startDocument();
    }

    // Print the opening tag together with its attributes.
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        System.out.print("<");
        System.out.print(qName);
        if (attributes != null) {
            for (int i = 0; i < attributes.getLength(); i++) {
                System.out.print(" " + attributes.getQName(i) + "=\"" + attributes.getValue(i) + "\"");
            }
        }
        System.out.print(">");
        super.startElement(uri, localName, qName, attributes);
    }
}
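One limitation of echoing the document this way: SAX hands text and attribute values to the handler already decoded, so an `&amp;` in the input arrives as a plain `&`. Printing it back unchanged would produce XML that is no longer well-formed. The sample books.xml contains no such characters, so the output above is fine, but for other inputs a small escaping helper is needed. The sketch below is one possible version; the class name XmlEscapeSketch and the method escape() are made up for this illustration and are not part of the SAX API.

```java
public class XmlEscapeSketch {
    // Replaces the characters that are special in XML text and attribute values.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints Tom &amp; &quot;Jerry&quot; &lt;1&gt;
        System.out.println(escape("Tom & \"Jerry\" <1>"));
    }
}
```

In the rewritten handler, the idea would be to pass the string built in characters() and each attribute value through such a helper before printing.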