Home > Article > Backend Development > PHP uses Snoopy class to implement page crawling
This article mainly introduces how PHP uses the Snoopy class to implement page crawling. Interested friends can refer to it. I hope it will be helpful to everyone.
The examples in this article describe the usage of Snoopy class in php. The specific analysis is as follows:
Here is a demonstration of how to grab web page information through Snoopy in php
/* You need the snoopy.class.php from http://snoopy.sourceforge.net/ */ include("snoopy.class.php"); $snoopy = new Snoopy; // need an proxy?: //$snoopy->proxy_host = "my.proxy.host"; //$snoopy->proxy_port = "8080"; // set browser and referer: $snoopy->agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"; $snoopy->referer = "http://www.jonasjohn.de/"; // set some cookies: $snoopy->cookies["SessionID"] = '238472834723489'; $snoopy->cookies["favoriteColor"] = "blue"; // set an raw-header: $snoopy->rawheaders["Pragma"] = "no-cache"; // set some internal variables: $snoopy->maxredirs = 2; $snoopy->offsiteok = false; $snoopy->expandlinks = false; // set username and password (optional) //$snoopy->user = "joe"; //$snoopy->pass = "bloe"; // fetch the text of the website www.google.com: if($snoopy->fetchtext("http://www.google.com")){ // other methods: fetch, fetchform, fetchlinks, submittext and submitlinks // response code: print "response code: ".$snoopy->response_code."<br/>\n"; // print the headers: print "<b>Headers:</b><br/>"; while(list($key,$val) = each($snoopy->headers)){ print $key.": ".$val."<br/>\n"; } print "<br/>\n"; // print the texts of the website: print "<pre class="brush:php;toolbar:false">".htmlspecialchars($snoopy->results)."\n"; } else { print "Snoopy: error while fetching document: ".$snoopy->error."\n"; }
Summary: The above is the entire content of this article, I hope it will be helpful to everyone's study.
Related recommendations:
PHP’s method of implementing a circular queue based on memcache
PHP reading configuration file class instance
The above is the detailed content of PHP uses Snoopy class to implement page crawling. For more information, please follow other related articles on the PHP Chinese website!