PHP implements search engine crawling code sharing-PHP Tutorial-php.cn

PHP implements search engine crawling code sharing

小云云

Release： 2023-03-20 17:22:01

Original

1677 people have browsed it

This article mainly introduces the implementation code of PHP to record search engine crawling records, and then provides a supplementary introduction to the code of PHP to obtain crawling records of each search spider. Friends who need it can refer to it. I hope it can help you.

The following is the complete code:

//记录搜索引擎爬行记录 $searchbot = get_naps_bot(); 
if ($searchbot) 
{ $tlc_thispage = addslashes($_SERVER['HTTP_USER_AGENT']); 
$url = $_SERVER['HTTP_REFERER']; 
$file = WEB_PATH.'robotslogs.txt'; 
$date = date('Y-m-d H:i:s'); 
$data = fopen($file,'a'); 
fwrite($data,"Time:$date robot:$searchbot URL:$tlc_thispage/r/n"); 
fclose($data);
}

Copy after login

WEB_PATH is the root directory path of define under index.PHP, which means that the robotslogs.txt file is placed in the root directory.

Get the spider crawling record through get_naps_bot(), then process it through addslashes, and store the data in the variable $tlc_thispage.

fopen opens the robotslogs.txt file, writes the data through the function fwrite, and closes it through the function fclose.

Because I felt it was unnecessary, I deleted the code on my website, so there are no examples of the effect.

PS: php code to obtain the crawling records of each search spider

Supports the following search engines: Baidu, Google, Bing, Yahoo, Soso, Sogou, Yodao crawling website records!

Code:

<?php 
/**
* 获取搜索引擎爬行记录
* edit by www.jb51.net
*/
function get_naps_bot() 
{ 
$useragent = strtolower($_SERVER[&#39;HTTP_USER_AGENT&#39;]); 
if (strpos($useragent, &#39;googlebot&#39;) !== false){ 
return &#39;Google&#39;; 
} 
if (strpos($useragent, &#39;baiduspider&#39;) !== false){ 
return &#39;Baidu&#39;; 
} 
if (strpos($useragent, &#39;msnbot&#39;) !== false){ 
return &#39;Bing&#39;; 
} 
if (strpos($useragent, &#39;slurp&#39;) !== false){ 
return &#39;Yahoo&#39;; 
} 
if (strpos($useragent, &#39;sosospider&#39;) !== false){ 
return &#39;Soso&#39;; 
} 
if (strpos($useragent, &#39;sogou spider&#39;) !== false){ 
return &#39;Sogou&#39;; 
} 
if (strpos($useragent, &#39;yodaobot&#39;) !== false){ 
return &#39;Yodao&#39;; 
} 
return false; 
} 
function nowtime(){ 
$date=date("Y-m-d.G:i:s"); 
return $date; 
} 
$searchbot = get_naps_bot(); 
if ($searchbot) { 
$tlc_thispage = addslashes($_SERVER[&#39;HTTP_USER_AGENT&#39;]); 
$url=$_SERVER[&#39;HTTP_REFERER&#39;]; 
$file="www.jb51.net.txt"; 
$time=nowtime(); 
$data=fopen($file,"a"); 
fwrite($data,"Time:$time robot:$searchbot URL:$tlc_thispage\n"); 
fclose($data); 
} 
?>

Copy after login

Related recommendations:

jQuery Jsonp cross-domain simulation search engine instance sharing

php Right Detailed explanation of calls to existing search engines

Example code sharing on how to switch the navigation web search box of search engines using JavaScript

The above is the detailed content of PHP implements search engine crawling code sharing. For more information, please follow other related articles on the PHP Chinese website!