Because PhantomJS is a headless browser that can execute JavaScript, it can also render pages and query their DOM, which makes it well suited to web crawling.
For example, suppose we want to batch-crawl the "Today in History" entries from the site http://www.todayonhistory.com/.
Looking at the DOM structure, all we need is the title attribute of each .list li a element, so we can collect them with a CSS selector query:
var d = '';
var c = document.querySelectorAll('.list li a');
var l = c.length;
for (var i = 0; i < l; i++) {
    d = d + c[i].title + '\n';
}
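You can paste this snippet into the browser's developer console on the page itself to confirm it collects the titles you expect before wiring it into PhantomJS.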
After that, we just need to run this code inside PhantomJS:
var page = require('webpage').create();
page.open('http://www.todayonhistory.com/', function (status) { // open the page
    if (status !== 'success') {
        console.log('FAIL to load the address');
    } else {
        console.log(page.evaluate(function () {
            var d = '';
            var c = document.querySelectorAll('.list li a');
            var l = c.length;
            for (var i = 0; i < l; i++) {
                d = d + c[i].title + '\n';
            }
            return d;
        }));
    }
    phantom.exit();
});
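Note that the function passed to page.evaluate runs in the context of the loaded page, not the PhantomJS script: it cannot see variables from the outer scope, and only simple, serializable values (like the string returned here) can be passed back.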
Finally, save the script as catch.js, run it from the command line, and redirect the output to a txt file (you could also write the file directly with PhantomJS's fs module).
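For example, assuming phantomjs is on your PATH (the output file name catch.txt is arbitrary):

phantomjs catch.js > catch.txt

As a sketch of the fs-module alternative mentioned above, the same script can write the file itself instead of relying on shell redirection; the file name 'catch.txt' is again just an example:

var fs = require('fs');
var page = require('webpage').create();
page.open('http://www.todayonhistory.com/', function (status) {
    if (status === 'success') {
        // Collect the titles inside the page context, as before
        var titles = page.evaluate(function () {
            var d = '';
            var c = document.querySelectorAll('.list li a');
            for (var i = 0; i < c.length; i++) {
                d = d + c[i].title + '\n';
            }
            return d;
        });
        // 'w' opens the file for writing, creating or truncating it
        fs.write('catch.txt', titles, 'w');
    } else {
        console.log('FAIL to load the address');
    }
    phantom.exit();
});

The in-script approach is handy when you want the crawler to manage its own output files, for instance when looping over several URLs in one run.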