Java, jsoup, and regular expressions
阿神 2017-04-18 09:56:41

How can I use a regular expression or jsoup to extract the following values from the HTML below: 19040172b-1; SQL Server开发; 郑尚; 3-5,7-14(周); 东区综合楼D-101?

 <p id="AE9D7F630640426F8457A661607D2B8E-5-2" style="display: none;" class="kbcontent">
  19040172b-1
  <br>SQL Server开发
  <br>
  <font title="老师">郑尚</font>
  <br>
  <font title="周次(节次)">3-5,7-14(周)</font>
  <br>
  <font title="教室">东区综合楼D-101</font>
  <br>
 </p>

I have tried the following two approaches, and both failed:

1. Pattern pattern = Pattern.compile(">(.*?)<br>");

2. Elements msg = doc.select(":matchesOwn([>.*?<br>])");

All replies (3)
Peter_Zhu
class="kbcontent".*?>(.*?)<.*?>(.*?)<.*?老师.*?>(.*?)<.*?周次\(节次\).*?>(.*?)<.*?教室.*?>(.*?)<

Then take capture groups $1, $2, $3, $4, and $5.
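
For reference, here is a minimal sketch of how the pattern above might be applied in Java. The class name KbContentRegex and the method extract are made up for illustration, and the HTML string is the snippet from the question. It assumes Pattern.DOTALL is needed because the <p> block spans several lines and '.' does not match line breaks by default; groups 1 and 2 carry surrounding whitespace, so each group is trimmed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class KbContentRegex {
    // DOTALL lets '.' match line breaks inside the multi-line <p> block.
    private static final Pattern KB = Pattern.compile(
        "class=\"kbcontent\".*?>(.*?)<.*?>(.*?)<.*?老师.*?>(.*?)<"
        + ".*?周次\\(节次\\).*?>(.*?)<.*?教室.*?>(.*?)<",
        Pattern.DOTALL);

    // Returns the five captured fields, trimmed, or an empty list if nothing matches.
    static List<String> extract(String html) {
        List<String> fields = new ArrayList<>();
        Matcher m = KB.matcher(html);
        if (m.find()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                fields.add(m.group(i).trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String html = " <p id=\"AE9D7F630640426F8457A661607D2B8E-5-2\" style=\"display: none;\" class=\"kbcontent\">\n"
            + "  19040172b-1\n"
            + "  <br>SQL Server开发\n"
            + "  <br>\n"
            + "  <font title=\"老师\">郑尚</font>\n"
            + "  <br>\n"
            + "  <font title=\"周次(节次)\">3-5,7-14(周)</font>\n"
            + "  <br>\n"
            + "  <font title=\"教室\">东区综合楼D-101</font>\n"
            + "  <br>\n"
            + " </p>";
        // Print the extracted fields separated by " | ".
        System.out.println(String.join(" | ", extract(html)));
    }
}
```

The pattern depends on the exact tag order and title attributes shown in the question, so it will likely need adjusting if the page layout changes.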

伊谢尔伦
Document document = Jsoup.parse("<p id=\"AE9D7F630640426F8457A661607D2B8E-5-2\" style=\"display: none;\" class=\"kbcontent\"> 19040172b-1 <br>SQL Server开发 <br> <font title=\"老师\">郑尚</font> <br> <font title=\"周次(节次)\">3-5,7-14(周)</font> <br> <font title=\"教室\">东区综合楼D-101</font> <br> </p>");
System.out.println(document.text());

Output: 19040172b-1 SQL Server开发 郑尚 3-5,7-14(周) 东区综合楼D-101
I'm not sure whether this meets the poster's needs.


Document document = Jsoup.parse("<p id=\"AE9D7F630640426F8457A661607D2B8E-5-2\" style=\"display: none;\" class=\"kbcontent\"> 19040172b-1 <br>SQL Server开发 <br> <font title=\"老师\">郑尚</font> <br> <font title=\"周次(节次)\">3-5,7-14(周)</font> <br> <font title=\"教室\">东区综合楼D-101</font> <br> </p>");
Element p = document.getElementById("AE9D7F630640426F8457A661607D2B8E-5-2");
TextNode n1 = (TextNode) p.childNode(0);
System.out.println(n1.text()); // 19040172b-1

TextNode n2 = (TextNode) p.childNode(2);
System.out.println(n2.text()); // SQL Server开发
// ...

If the poster's format is fixed, parsing the HTML as above works better; no regex is needed.

Peter_Zhu
String html = "<p id=\"AE9D7F630640426F8457A661607D2B8E-5-2\" style=\"display: none;\" class=\"kbcontent\">  19040172b-1  <br>SQL Server Develop  <br>  <font title=\"teacher\">zheng</font>  <br>  <font title=\"week\">3-5,7-14</font>  <br>  <font title=\"classroom\">D-101</font>  <br> </p> ";
// Replace each <br> with a sentinel so the field boundaries survive text() flattening.
html = html.replaceAll("<br>", "#~#");
Document doc = Jsoup.parse(html);
String[] ary = doc.text().split("#~#");

// Print one field per line, without the surrounding whitespace.
for (String field : ary) {
    System.out.println(field.trim());
}

My needs are roughly like this
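
The same split-on-<br> idea can also be sketched with only the Java standard library, as a swapped-in alternative to the jsoup call above rather than a replacement for a real HTML parser. The class name KbContentSplit and the method fields are hypothetical. The sketch assumes no attribute value contains a '>' character and does not decode HTML entities.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class KbContentSplit {
    // Splits the markup on <br> tags and returns the non-empty text of each piece.
    static List<String> fields(String html) {
        List<String> out = new ArrayList<>();
        // Split on <br>, also tolerating <br/> and upper-case <BR>.
        for (String part : html.split("(?i)<br\\s*/?>")) {
            // Drop any remaining tags (assumes no '>' inside attribute values).
            String text = part.replaceAll("<[^>]*>", "").trim();
            if (!text.isEmpty()) {
                out.add(text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<p id=\"AE9D7F630640426F8457A661607D2B8E-5-2\" style=\"display: none;\" class=\"kbcontent\">"
            + " 19040172b-1 <br>SQL Server开发 <br>"
            + " <font title=\"老师\">郑尚</font> <br>"
            + " <font title=\"周次(节次)\">3-5,7-14(周)</font> <br>"
            + " <font title=\"教室\">东区综合楼D-101</font> <br> </p>";
        // Print one extracted field per line.
        for (String field : fields(html)) {
            System.out.println(field);
        }
    }
}
```

For anything less regular than this snippet, the jsoup approaches in the replies above are the safer choice.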
