Once a page has been fetched, its content can be filtered with regular expressions to extract the parts you need. Regular expressions are not covered in detail here. Instead, this article lists several commonly used PHP methods for fetching the content of a web page.
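Although regular-expression filtering is outside the scope of this article, a minimal sketch may help show the idea. The HTML string, the patterns, and the variable names below are illustrative assumptions, not taken from any real page. For anything beyond simple extraction, a real HTML parser is usually more robust than regular expressions.

```php
<?php
// Stand-in for content fetched by one of the methods in this article.
$html = '<html><head><title>Example page</title></head>'
      . '<body><a href="/a.html">A</a> <a href="/b.html">B</a></body></html>';

// Capture the text between <title> and </title> (case-insensitive, may span lines).
$title = preg_match('/<title>(.*?)<\/title>/is', $html, $m) ? trim($m[1]) : null;

// Collect the value of every double-quoted href attribute on <a> tags.
preg_match_all('/<a\s[^>]*href="([^"]*)"/i', $html, $matches);
$links = $matches[1];

echo $title, "\n";
print_r($links);
```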
1. file_get_contents
<?php
$url = "http://www.jb51.net";
$contents = file_get_contents($url);
// If Chinese characters come out garbled, convert the encoding first:
// $contents = iconv("gb2312", "utf-8", $contents);
echo $contents;
?>
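The commented-out iconv call above converts unconditionally, which would corrupt a page that is already UTF-8. The following is a minimal sketch of a safer conversion. The helper name to_utf8 and the default source encoding are assumptions for illustration, and it assumes the iconv extension is available.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the bytes are not already valid UTF-8.
function to_utf8(string $contents, string $sourceEncoding = 'GB2312'): string
{
    // preg_match with the /u modifier returns 1 only when the subject is valid UTF-8.
    if (preg_match('//u', $contents) === 1) {
        return $contents;
    }
    $converted = iconv($sourceEncoding, 'UTF-8', $contents);
    // iconv returns false on failure; fall back to the original bytes in that case.
    return $converted === false ? $contents : $converted;
}
```

It could be used as `echo to_utf8(file_get_contents($url));` in place of the plain echo.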
2. curl
<?php
$url = "http://www.jb51.net";
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
// For pages that require HTTP authentication, uncomment the next two lines
// (US_NAME and US_PWD are constants holding the user name and password)
//curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
//curl_setopt($ch, CURLOPT_USERPWD, US_NAME.":".US_PWD);
$contents = curl_exec($ch);
curl_close($ch);
echo $contents;
?>
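The curl listing above echoes whatever curl_exec returns, so a failed request prints nothing and gives no hint why. A minimal sketch of the same request with error handling follows. The function name fetch_with_curl, the overall transfer timeout, and the choice to treat HTTP errors as failures are assumptions added for illustration.

```php
<?php
// Hypothetical helper (not from the original listing): fetch a URL with curl
// and throw an exception instead of silently returning an empty result.
function fetch_with_curl(string $url, int $timeout = 5): string
{
    if (!function_exists('curl_init')) {
        throw new RuntimeException('The curl extension is not enabled');
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout); // seconds allowed for connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout * 2);    // assumed cap on the whole transfer
    curl_setopt($ch, CURLOPT_FAILONERROR, true);        // treat HTTP status >= 400 as a failure

    $contents = curl_exec($ch);
    if ($contents === false) {
        throw new RuntimeException('curl error: ' . curl_error($ch));
    }
    // The handle is released when $ch goes out of scope.
    return $contents;
}
```

A caller would wrap `fetch_with_curl("http://www.jb51.net")` in try/catch and decide what to do when the request fails.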
3. fopen -> fread -> fclose
<?php
$handle = fopen("http://www.jb51.net", "rb");
$contents = "";
// Read the stream in 1024-byte chunks until no more data is returned
do {
    $data = fread($handle, 1024);
    if ($data === false || strlen($data) == 0) {
        break;
    }
    $contents .= $data;
} while (true);
fclose($handle);
echo $contents;
?>
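The fopen listing above assumes the stream opens successfully. If fopen fails it returns false, and the later fread call then fails as well. A minimal sketch that checks the handle and always closes it is shown below. The function name read_stream and its parameters are hypothetical, and the same code works for local paths as well as URLs.

```php
<?php
// Hypothetical helper: read an entire stream in fixed-size chunks.
function read_stream(string $source, int $chunkSize = 1024): string
{
    $handle = @fopen($source, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Could not open $source");
    }
    $contents = '';
    try {
        // feof becomes true once the end of the stream has been reached.
        while (!feof($handle)) {
            $data = fread($handle, $chunkSize);
            if ($data === false) {
                break; // read error: stop and return what was read so far
            }
            $contents .= $data;
        }
    } finally {
        fclose($handle); // runs whether or not the loop completed normally
    }
    return $contents;
}
```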
Notes:
1. file_get_contents and fopen require allow_url_fopen to be enabled. To enable it, edit php.ini and set allow_url_fopen = On. When allow_url_fopen is off, neither fopen nor file_get_contents can open remote files.
2. curl requires that your hosting environment has the curl extension enabled. On Windows, edit php.ini, remove the semicolon in front of extension=php_curl.dll, and copy ssleay32.dll and libeay32.dll to C:\WINDOWS\system32. On Linux, install the curl extension.
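Both requirements in the notes can be checked at runtime before choosing a method. The following is a small sketch under that assumption. The function name available_fetch_methods is hypothetical, while ini_get and extension_loaded are standard PHP functions.

```php
<?php
// Hypothetical helper: list which of the three methods can be used on this server.
function available_fetch_methods(): array
{
    $methods = [];
    // ini_get returns the setting as a string; interpret it as a boolean.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        $methods[] = 'file_get_contents';
        $methods[] = 'fopen';
    }
    if (extension_loaded('curl')) {
        $methods[] = 'curl';
    }
    return $methods;
}

print_r(available_fetch_methods());
```

The output depends on the server configuration, so an empty array means neither setting is enabled.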