Web crawler - How can a Java crawler parse JavaScript?

Question

I'm using Java to crawl pages that are generated dynamically by JavaScript. How do I parse the JS? Rhino? PhantomJS? Or something else?

天蓬老师 · Answer

PhantomJS can do this.

迷茫 · Answer

Unless the front-end logic is very complex (for example, a lot of logic for computing tokens), simulating JavaScript execution is not recommended.
If the data is loaded dynamically, it is simpler to request the JSON directly.
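To illustrate requesting the JSON directly, here is a minimal sketch using the JDK's built-in `java.net.http.HttpClient` (Java 11+). The endpoint URL is hypothetical; in practice you would find the real one in the browser's developer tools (Network tab) while the page loads. The `X-Requested-With` header is an assumption that some sites expect for XHR calls, and the class and method names are made up for this example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class JsonEndpointFetcher {

    // Builds a GET request for the JSON endpoint that the page's own script calls.
    public static HttpRequest buildRequest(String endpointUrl) {
        return HttpRequest.newBuilder(URI.create(endpointUrl))
                .timeout(Duration.ofSeconds(10))
                .header("Accept", "application/json")
                // Assumption: some sites check this header to recognise XHR calls.
                .header("X-Requested-With", "XMLHttpRequest")
                .GET()
                .build();
    }

    // Sends the request and returns the raw JSON text of the response body.
    public static String fetchJson(String endpointUrl)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(endpointUrl), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // Hypothetical example URL, shown only to illustrate the expected argument.
            System.out.println("usage: JsonEndpointFetcher https://example.com/api/items?page=1");
            return;
        }
        System.out.println(fetchJson(args[0]));
    }
}
```

The returned string can then be handed to whatever JSON library you already use. If the endpoint needs cookies or a token computed in the page's script, this simple approach may not be enough.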

怪我咯 · Answer

There is a jar for parsing JS scripts, but I can't remember which package it is.

大家讲道理 · Answer

For a crawler, it is not advisable to simulate a browser just to execute JavaScript. Instead, capture the JSON request that populates the page and fetch that data directly.

迷茫 · Answer

If you are using Java, try Selenium's WebDriver. If you are using JS, just use PhantomJS.

大家讲道理 · Answer

See this document:
How to crawl data dynamically generated by JS? http://doc.shenjianshou.cn/de...