Home >Backend Development >PHP Tutorial >Detailed explanation of the method of recording search engine crawling process in PHP

Detailed explanation of the method of recording search engine crawling process in PHP

php中世界最好的语言
php中世界最好的语言Original
2018-05-18 14:17:541541browse

This time I will bring you a detailed explanation of the php record search index engine crawling process method, what are the precautions for php record search engine crawling process, the following is a practical case, let’s take a look take a look.

The following is the complete code:

//记录搜索引擎爬行记录 $searchbot = get_naps_bot(); 
if ($searchbot) 
{ $tlc_thispage = addslashes($_SERVER['HTTP_USER_AGENT']); 
$url = $_SERVER['HTTP_REFERER']; 
$file = WEB_PATH.'robotslogs.txt'; 
$date = date('Y-m-d H:i:s'); 
$data = fopen($file,'a'); 
fwrite($data,"Time:$date robot:$searchbot URL:$tlc_thispage/r/n"); 
fclose($data);
}

WEB_PATH is the root directory path of define under index.PHP, which means that the robotslogs.txt file is placed in the root directory.

Get the spider crawling record through get_naps_bot(), then process it through addslashes, and store the data in the variable $tlc_thispage.

fopen opens the robotslogs.txt file, writes the data through the function fwrite, and closes it through the function fclose.

Because I felt it was unnecessary, I deleted the code on my website, so there are no examples of the effect.

PS: php code to obtain the crawling records of each search spider

Supports the following search engines: Baidu, Google, Bing, Yahoo, Soso , Sogou, Yodao crawling website records!

Code:

<?php 
/**
* 获取搜索引擎爬行记录
* edit by www.jb51.net
*/
function get_naps_bot() 
{ 
$useragent = strtolower($_SERVER[&#39;HTTP_USER_AGENT&#39;]); 
if (strpos($useragent, &#39;googlebot&#39;) !== false){ 
return &#39;Google&#39;; 
} 
if (strpos($useragent, &#39;baiduspider&#39;) !== false){ 
return &#39;Baidu&#39;; 
} 
if (strpos($useragent, &#39;msnbot&#39;) !== false){ 
return &#39;Bing&#39;; 
} 
if (strpos($useragent, &#39;slurp&#39;) !== false){ 
return &#39;Yahoo&#39;; 
} 
if (strpos($useragent, &#39;sosospider&#39;) !== false){ 
return &#39;Soso&#39;; 
} 
if (strpos($useragent, &#39;sogou spider&#39;) !== false){ 
return &#39;Sogou&#39;; 
} 
if (strpos($useragent, &#39;yodaobot&#39;) !== false){ 
return &#39;Yodao&#39;; 
} 
return false; 
} 
function nowtime(){ 
$date=date("Y-m-d.G:i:s"); 
return $date; 
} 
$searchbot = get_naps_bot(); 
if ($searchbot) { 
$tlc_thispage = addslashes($_SERVER[&#39;HTTP_USER_AGENT&#39;]); 
$url=$_SERVER[&#39;HTTP_REFERER&#39;]; 
$file="www.jb51.net.txt"; 
$time=nowtime(); 
$data=fopen($file,"a"); 
fwrite($data,"Time:$time robot:$searchbot URL:$tlc_thispage\n"); 
fclose($data); 
} 
?>

I believe you have mastered the method after reading the case in this article. For more exciting information, please pay attention to other related articles on the PHP Chinese website!

Recommended reading:

What are the methods for php to read local json files

What are the methods for php to output json objects The value of

The above is the detailed content of Detailed explanation of the method of recording search engine crawling process in PHP. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn