Code example for crawling images with PHP and saving them locally
This article presents a code example for crawling an image with PHP and saving it locally. It may serve as a useful reference.
This simple example reviews the usage of a few PHP functions:
curl: sending network requests
preg_match: regular expression matching
$url = 'http://desk.zol.com.cn/bizhi/7386_91671_2.html';
$headers = [
    'user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'
];

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Return the data fetched by curl_exec() as a string instead of printing it directly.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Request headers are set with CURLOPT_HTTPHEADER; CURLOPT_HEADER only controls whether response headers are included in the output.
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$output = curl_exec($ch);
curl_close($ch);

// The page is encoded in gb2312; convert it to utf-8.
$str = mb_convert_encoding($output, 'utf-8', 'gb2312');
// Alternatively: $str = iconv('gb2312', 'utf-8//IGNORE', $output);

preg_match('!<img id="bigImg" src="(?<src>http.*\.(?<ext>jpg|png))".*>!', $str, $m);
file_put_contents('./meinv.' . $m['ext'], file_get_contents($m['src']));
The steps to establish a curl connection in PHP are generally: initialization, setting options, performing operations, and releasing the connection.
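$ch = curl_init();                    // initialize
curl_setopt($ch, $option, $value);    // set an option, e.g. CURLOPT_URL; repeat as needed
$out = curl_exec($ch);                // perform the request
curl_close($ch);                      // release the handle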
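The four steps above can be wrapped in a small helper that also checks whether the transfer failed. This is only a minimal sketch and not part of the original example: the function name fetch_page and the 10-second timeout are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper (not from the original article): runs the four curl steps
// and throws a RuntimeException if the transfer fails.
function fetch_page(string $url, array $headers = []): string
{
    $ch = curl_init();                               // 1. initialize
    curl_setopt($ch, CURLOPT_URL, $url);             // 2. set options
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  //    return the body instead of printing it
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);  //    request headers
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);           //    assumed timeout, in seconds
    $body = curl_exec($ch);                          // 3. perform the request
    $error = $body === false ? curl_error($ch) : null;
    curl_close($ch);                                 // 4. release the handle (a no-op since PHP 8.0)
    if ($error !== null) {
        throw new RuntimeException("curl request failed: $error");
    }
    return $body;
}
```

With such a helper, the request in the example would reduce to a single call like fetch_page($url, $headers), wrapped in try/catch where a failed download should not stop the script.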
Commonly used CURLOPT settings (for the full list, see the documentation at http://php.net/manual/zh/function.curl-setopt.php):
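CURLOPT_URL, string // the URL to request; required
CURLOPT_HTTPHEADER, array // request headers, one "Name: value" string per element
CURLOPT_HEADER, bool // when true, the response headers are included in the output
CURLOPT_RETURNTRANSFER, bool // when true, curl_exec() returns the response as a string instead of printing it
CURLOPT_SSL_VERIFYPEER, bool // when false, the HTTPS certificate is not verified; insecure, so use it only for testing https URLs
CURLOPT_POST, bool // when true, sends a POST request together with CURLOPT_POSTFIELDS; GET is the default
CURLOPT_POSTFIELDS, array|string // the POST data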
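As a rough illustration of how several of these options combine, the sketch below collects them in an array and applies them in one call with curl_setopt_array. The helper name build_post_options, the URL http://example.com/login, and the form fields are all made up for the example, and the request itself is left commented out.

```php
<?php
// Hypothetical helper: builds the option array for a form-style POST request.
function build_post_options(string $url, array $fields): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        // A string body is sent as application/x-www-form-urlencoded;
        // passing the array itself would send multipart/form-data instead.
        CURLOPT_POSTFIELDS     => http_build_query($fields),
    ];
}

// Placeholder URL and fields, for illustration only.
$options = build_post_options('http://example.com/login', ['user' => 'demo', 'page' => 1]);

$ch = curl_init();
curl_setopt_array($ch, $options);  // applies every option in the array
// $response = curl_exec($ch);     // uncomment to actually send the request
curl_close($ch);
```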
Printing $output directly shows garbled characters. The page source reveals that the page is encoded in gb2312, so convert it to utf-8 with mb_convert_encoding or iconv before output.
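Rather than hard-coding gb2312, the declared charset could be read from the page itself before converting. The following is a sketch under assumptions: detect_charset and to_utf8 are hypothetical helpers, the sample markup is invented, and a page that declares a charset mbstring does not recognize would still need separate handling.

```php
<?php
// Hypothetical helper: reads the charset declared in a <meta> tag and falls back
// to the given default when no declaration is found.
function detect_charset(string $html, string $default = 'utf-8'): string
{
    if (preg_match('/<meta[^>]+charset=["\']?([\w-]+)/i', $html, $m)) {
        return strtolower($m[1]);
    }
    return $default;
}

// Hypothetical helper: converts the page to utf-8 only when it declares another charset.
function to_utf8(string $html): string
{
    $charset = detect_charset($html);
    return $charset === 'utf-8' ? $html : mb_convert_encoding($html, 'utf-8', $charset);
}

// "\xD6\xD0\xCE\xC4" is the gb2312 byte sequence for the two characters 中文.
$page = "<meta charset=\"gb2312\"><title>\xD6\xD0\xCE\xC4</title>";
echo to_utf8($page), PHP_EOL;
```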
preg_match regular expression matching
Looking at the page source, the image tag we need is the <img> element with id="bigImg".
Regular Expression
<img id="bigImg" src="(?<src>http.*\.(?<ext>jpg|png))".*>
.* matches any run of characters, and (?<name>...) defines a named group, so the captured part can be read conveniently as $match['name'].
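To see the named groups in isolation, the pattern can be tried against a literal string. In this sketch the sample tag and its URL are invented for illustration, and the pattern differs from the one above in one respect: it uses a lazy .*? so the src group stops at the first image extension.

```php
<?php
// Sample markup invented for illustration; the real page's tag may carry other attributes.
$html = '<img id="bigImg" src="http://example.com/pics/wall_01.jpg" width="960" height="600">';

// Same structure as the article's pattern, but with a lazy quantifier inside the src group.
$pattern = '!<img id="bigImg" src="(?<src>http.*?\.(?<ext>jpg|png))".*>!';

if (preg_match($pattern, $html, $match)) {
    echo $match['src'], PHP_EOL; // http://example.com/pics/wall_01.jpg
    echo $match['ext'], PHP_EOL; // jpg
}
```

Each named group is also available under its numeric index, so $match[1] and $match['src'] hold the same string.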
Finally, $match['src'] ($m['src'] in the code above) holds the real URL of the image. Saving it with file_put_contents completes the task.
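The one-line save in the example assumes that both the download and the write succeed. A slightly more defensive version might look like the sketch below. The helper name save_image is an assumption, and reading a remote URL with file_get_contents requires allow_url_fopen to be enabled.

```php
<?php
// Hypothetical helper: reads $src (a URL or a local path) and writes it to "$basename.$ext".
// Returns the number of bytes written, or throws if either step fails.
function save_image(string $src, string $basename, string $ext): int
{
    $data = @file_get_contents($src);   // false when the source cannot be read
    if ($data === false) {
        throw new RuntimeException("could not read $src");
    }
    $bytes = file_put_contents("$basename.$ext", $data);
    if ($bytes === false) {
        throw new RuntimeException("could not write $basename.$ext");
    }
    return $bytes;
}
```

With the matches from the example, the call would be along the lines of save_image($m['src'], './meinv', $m['ext']), made only after checking that preg_match actually returned 1.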