Home  >  Article  >  Backend Development  >  What does php use for data collection?

What does php use for data collection?

(*-*)浩
(*-*)浩Original
2019-09-18 11:55:482578browse

What is collection?

is to use PHP programs to capture information from other websites into our own database and website.

What does php use for data collection?

PHP production and collection technology:

There are three methods from the bottom socket to the high-level file operation function. Implement collection.

1. Use socket technology to collect: (Recommended learning: PHP programming from entry to proficiency)

Socket collection is the lowest level. It just establishes a long connection, and then we have to construct the http protocol string ourselves to send the request.

For example, if you want to get the content of Youku page, use socket to write as follows:

<?php  
//连接,$error错误编号,$errstr错误的字符串,30s是连接超时时间  
$fp=fsockopen("www.youku.com",80,$errno,$errstr,30);  
if(!$fp) die("连接失败".$errstr);  
   
//构造http协议字符串,因为socket编程是最底层的,它还没有使用http协议  
$http="GET /?spm=a2hww.20023042.topNav.5~1~3!2~A HTTP/1.1\r\n";   //  \r\n表示前面的是一个命令  
$http.="Host:www.youku.com\r\n";  //请求的主机  
$http.="Connection:close\r\n\r\n";   // 连接关闭,最后一行要两个\r\n  
   
//发送这个字符串到服务器  
fwrite($fp,$http,strlen($http));  
//接收服务器返回的数据  
$data=&#39;&#39;;  
while (!feof($fp)) {  
$data.=fread($fp,4096);  //fread读取返回的数据,一次读取4096字节  
}  
//关闭连接  
fclose($fp);  
var_dump($data);  
?>

The printed result is as follows, including the returned header information and the source code of the page:

What does php use for data collection?

2. Use curl_a set of functions

curl encapsulates the HTTP protocol into many functions. You can directly pass the corresponding parameters, which reduces the writing time. The difficulty of HTTP protocol strings.

Prerequisite: The curl extension must be enabled in php.ini.

function getHTTPS($url) {
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
  curl_setopt($ch, CURLOPT_HEADER, false);
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_REFERER, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
  $result = curl_exec($ch);
  curl_close($ch);
  return $result;
}
var_dump(getHTTPS($url));

The printed results are as follows, including only the source code of the page:

What does php use for data collection?

3. Use file_get_contents directly (the top level)

Prerequisite: Set the url address that allows opening a network in php.ini.

What does php use for data collection?

//使用file_get_contents()  
$data=file_get_contents("http://www.youku.com");  
var_dump($data);

What does php use for data collection?

The above is the detailed content of What does php use for data collection?. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn