
    jsoup: A Summary of HTML Parsing Usage

    2016-06-24 11:42:42 (original article)
    1. Parsing methods

    (1) Parsing from a string

    String html = "<html><head><title>First parse</title></head><body><p>Parse HTML into a doc.</p></body></html>";

    Document doc = Jsoup.parse(html);
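As a rough sketch of what parsing from a string gives you, the class below parses a small HTML string and reads back its title and body text. The class and method names (ParseFromString, titleOf, bodyTextOf) and the exact markup in the sample string are assumptions made for illustration; only Jsoup.parse, Document.title, Document.body and Element.text are jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class ParseFromString {
    // Parses an HTML string and returns the text of its <title> element.
    static String titleOf(String html) {
        Document doc = Jsoup.parse(html);
        return doc.title();
    }

    // Parses an HTML string and returns the body text with all tags removed.
    static String bodyTextOf(String html) {
        return Jsoup.parse(html).body().text();
    }

    public static void main(String[] args) {
        // Sample markup assumed for this example.
        String html = "<html><head><title>First parse</title></head>"
                + "<body><p>Parse HTML into a doc.</p></body></html>";
        System.out.println(titleOf(html));
        System.out.println(bodyTextOf(html));
    }
}
```

jsoup also accepts incomplete markup: a fragment such as an unclosed p tag is still wrapped in a full document with head and body elements, so body() can be called safely on the result.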


    (2) Fetching and parsing from a URL

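    // GET request: fetch the page and read its title
    Document doc = Jsoup.connect("http://example.com/").get();

    String title = doc.title();

    // POST request with form data, a user agent, a cookie and a 3000 ms timeout
    Document posted = Jsoup.connect("http://example.com").data("query", "Java").userAgent("Mozilla").cookie("auth", "token").timeout(3000).post();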
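The chained calls can also be inspected before anything is sent, which may help when checking how a request is configured. The sketch below is only an illustration: SearchRequest and build are hypothetical names, and the example deliberately stops short of calling post() so that it runs without network access.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;

class SearchRequest {
    // Builds, but does not send, a request configured like the POST example.
    static Connection build(String url, String query) {
        return Jsoup.connect(url)
                .data("query", query)     // form parameter
                .userAgent("Mozilla")     // User-Agent request header
                .cookie("auth", "token")  // cookie sent with the request
                .timeout(3000);           // timeout in milliseconds
    }

    public static void main(String[] args) {
        Connection conn = build("http://example.com", "Java");
        Connection.Request req = conn.request();
        System.out.println(req.url());
        System.out.println(req.timeout());
        System.out.println(req.cookies());
        // conn.post() or conn.get() would send the request here;
        // both throw IOException and need network access.
    }
}
```

Calling get() or post() on the returned Connection performs the request and returns the parsed Document.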


    (3) Parsing from a file

    File input = new File("/tmp/input.html");

    Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
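The third argument is the base URI, which jsoup uses to resolve relative links found in the file. A minimal sketch of that behaviour follows; the temporary file, its contents, and the names ParseFromFile and firstAbsoluteLink are assumptions for the example, not part of the original article.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class ParseFromFile {
    // Parses a local file and resolves its first link against baseUri.
    // Assumes the file contains at least one <a> element.
    static String firstAbsoluteLink(File input, String baseUri) throws IOException {
        Document doc = Jsoup.parse(input, "UTF-8", baseUri);
        return doc.select("a").first().absUrl("href");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file, created here so the example is self-contained.
        File input = File.createTempFile("input", ".html");
        input.deleteOnExit();
        Files.write(input.toPath(),
                "<a href=\"/about\">About</a>".getBytes(StandardCharsets.UTF_8));

        // prints http://example.com/about
        System.out.println(firstAbsoluteLink(input, "http://example.com/"));
    }
}
```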


    2. Traversing elements with DOM methods
    (1) Finding elements

    getElementById(String id)

    getElementsByTag(String tag)

    getElementsByClass(String className)

    getElementsByAttribute(String key)

    siblingElements(), firstElementSibling(), lastElementSibling(), nextElementSibling(), previousElementSibling()

    parent(), children(), child(int index)
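The finder and navigation methods listed above can be tried on a small document. This is a sketch under assumptions: the markup and the class name DomSearch are invented for the example. Note that getElementById returns a single Element (or null), while the getElementsBy... methods return an Elements collection.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class DomSearch {
    // Hypothetical sample markup, not taken from the article.
    static final String HTML = "<div id=\"menu\">"
            + "<a class=\"nav\" href=\"/home\">Home</a>"
            + "<a class=\"nav\" href=\"/docs\">Docs</a>"
            + "<span>plain</span>"
            + "</div>";

    static Document parse() {
        return Jsoup.parse(HTML);
    }

    public static void main(String[] args) {
        Document doc = parse();

        // Finding elements
        Element menu = doc.getElementById("menu");
        System.out.println(doc.getElementsByTag("a").size());          // 2
        System.out.println(doc.getElementsByClass("nav").size());      // 2
        System.out.println(doc.getElementsByAttribute("href").size()); // 2

        // Moving between children, siblings and parents
        Element first = menu.child(0);
        System.out.println(menu.children().size());                    // 3
        System.out.println(first.nextElementSibling().text());         // Docs
        System.out.println(first.lastElementSibling().text());         // plain
        System.out.println(first.siblingElements().size());            // 2
        System.out.println(first.parent().id());                       // menu
    }
}
```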

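    (2) Getting element data

    attr(String key): gets the value of the key attribute

    attributes(): gets all attributes

    id(), className(), classNames()

    text(): gets the text content

    html(): gets the element's inner HTML

    outerHtml(): gets the HTML including the element itself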

    data(): gets the data content (for example, the contents of script and style elements)
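To make the differences between these accessors concrete, here is a small sketch that calls each of them on the same document. The markup and the class name ElementData are assumptions for the example; pretty-printing is switched off so that html() and outerHtml() return the markup without added whitespace.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class ElementData {
    // Hypothetical sample markup, not taken from the article.
    static final String HTML =
            "<div id=\"box\" class=\"a b\"><p>Hello <b>world</b></p></div>"
            + "<script>var x = 1;</script>";

    static Document parse() {
        Document doc = Jsoup.parse(HTML);
        // Keep html()/outerHtml() output free of indentation and line breaks.
        doc.outputSettings().prettyPrint(false);
        return doc;
    }

    public static void main(String[] args) {
        Document doc = parse();
        Element box = doc.getElementById("box");
        Element p = box.child(0);
        Element script = doc.getElementsByTag("script").first();

        System.out.println(box.attr("class"));        // a b
        System.out.println(box.attributes().size());  // 2
        System.out.println(box.id());                 // box
        System.out.println(box.classNames().size());  // 2
        System.out.println(box.text());               // Hello world
        System.out.println(p.html());                 // Hello <b>world</b>
        System.out.println(p.outerHtml());            // <p>Hello <b>world</b></p>
        System.out.println(script.data());            // var x = 1;
    }
}
```

For a script element, text() returns an empty string while data() returns the script source, which is the main reason to reach for data().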