Introduction to parsing web content for WeChat mini programs

不言
Release: 2018-06-27 14:35:15

This article explains, with examples, how to prepare web page content for a WeChat mini program. A crawler is used to fetch complex web pages, and the problems encountered along the way are collected and resolved here for anyone who needs a reference.

Parsing web content for a WeChat mini program in detail

I am currently writing a crawler that parses web pages for use in a WeChat mini program. Text and images are easy to parse, and the mini program has corresponding text and image tags to display them. More complex elements, such as tables, are harder: both server-side parsing and mini program rendering take a lot of effort, and it is difficult to cover every case. So I settled on a workaround: convert the HTML code for each table into an image.
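As a rough sketch of that workaround, the crawler could first pull each table out of the page and leave an image placeholder behind. The function name `extractTables`, the `table-N.png` naming scheme, and the regular expression are all assumptions made for illustration. A regular expression does not handle nested tables, so a real crawler would more likely use an HTML parser.

```javascript
// Hypothetical helper: collect every <table>...</table> fragment and
// replace it in the page with an <img> placeholder.
// Assumption: tables are not nested (the non-greedy match would cut them short).
function extractTables(pageHtml) {
  const tables = [];
  const rewritten = pageHtml.replace(/<table\b[\s\S]*?<\/table>/gi, (match) => {
    // Illustrative file name; any unique name would do.
    const name = `table-${tables.length}.png`;
    tables.push({ name, html: match });
    return `<img src='${name}'>`;
  });
  return { rewritten, tables };
}
```

Each entry in `tables` holds the HTML fragment to be turned into a screenshot and the file name its placeholder points to.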

Here we use the node-webshot module, a light wrapper around PhantomJS that makes it easy to save web pages as screenshots.

First install Node.js and PhantomJS, then create a new js file and load the node-webshot module:

const webshot = require('webshot');

Define options:

const options = {
  // Browser window
  screenSize: { width: 755, height: 25 },
  // Document area of the page to capture
  shotSize: { height: 'all' },
  // Page type
  siteType: 'html'
};

Here, the width of the browser window should be set to suit the page being captured, while the height can be set to a very small value. The height of the captured document area must be set to 'all'; its width defaults to the window width. This way the screenshot contains the whole table at the smallest possible size.

Next, define the html string:

let html = "target rich text html code, eg: ...";

Note that the HTML code inside the string must have its newlines removed and its double quotes replaced with single quotes.
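When the HTML comes from a crawler rather than being typed by hand, the cleanup described in the note above could be done in code. The following is a minimal sketch; the function name `toSingleLineHtml` is hypothetical, and it assumes that the indentation around line breaks can be dropped along with the breaks themselves, which is usually harmless for table markup but would join words in text that wraps across lines.

```javascript
// Hypothetical helper: flatten an HTML fragment to one line and
// switch double quotes to single quotes.
function toSingleLineHtml(html) {
  return html
    // Remove each line break together with the whitespace around it.
    .replace(/\s*\r?\n\s*/g, '')
    // Replace every double quote with a single quote.
    .replace(/"/g, "'");
}

const sample = '<table class="t">\n  <tr><td>1</td></tr>\n</table>';
console.log(toSingleLineHtml(sample));
// prints: <table class='t'><tr><td>1</td></tr></table>
```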

Finally, screenshot:

webshot(html, 'demo.png', options, (err) => {
  if (err) console.log(`Webshot error: ${err.message}`);
});

In this way, the HTML code is converted into a local image, which can then be uploaded to Qiniu Cloud or a similar service. With the table turned into an image, neither server-side parsing nor mini program rendering presents any difficulty.
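Since an upload can only start once the image file has been written, it may help to wrap the callback-style call in a Promise. The sketch below is one possible way to do that, not part of node-webshot itself: `shootToFile` is a hypothetical name, and the screenshot function is passed in as the first argument so that the wrapper assumes only the `(html, path, options, callback)` call shape used in this article.

```javascript
// Hypothetical wrapper: turn a callback-style screenshot function
// (such as webshot) into one that returns a Promise.
function shootToFile(shoot, html, outPath, options) {
  return new Promise((resolve, reject) => {
    shoot(html, outPath, options, (err) => {
      if (err) {
        reject(err);
      } else {
        // Resolve with the path so the next step knows which file to upload.
        resolve(outPath);
      }
    });
  });
}

// Possible usage, assuming the webshot module is installed:
// const webshot = require('webshot');
// shootToFile(webshot, html, 'demo.png', options)
//   .then((path) => console.log(`Saved ${path}`))
//   .catch((err) => console.log(`Webshot error: ${err.message}`));
```

The upload step would then go in the `.then` handler, so that it runs only after the screenshot has been saved.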

