This article walks through PHP code for logging search engine spider visits, followed by a supplementary snippet that records visits from several individual search spiders. Readers who need this functionality can refer to it.
The following is the complete code:
// Log search engine spider visits.
$searchbot = get_naps_bot();
if ($searchbot) {
    $tlc_thispage = addslashes($_SERVER['HTTP_USER_AGENT']);
    $url  = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
    $file = WEB_PATH . 'robotslogs.txt';
    $date = date('Y-m-d H:i:s');
    $data = fopen($file, 'a');
    // Note: the newline escape is "\r\n", not "/r/n"; the User-Agent and
    // referring URL are logged under separate labels.
    fwrite($data, "Time:$date robot:$searchbot UA:$tlc_thispage URL:$url\r\n");
    fclose($data);
}
WEB_PATH is a constant defined in index.php that holds the site root directory path, so the robotslogs.txt file ends up in the site root.
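For reference, a minimal sketch of how such a constant might be defined in index.php. The constant name comes from the article; the exact value is an assumption:

```php
<?php
// index.php — define the site root path before including the logging code.
// __DIR__ is the directory containing index.php; the trailing slash matters
// because the logging snippet concatenates WEB_PATH directly with the file name.
define('WEB_PATH', __DIR__ . '/');
```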
get_naps_bot() identifies the visiting spider; the raw User-Agent string is then escaped with addslashes() and stored in the variable $tlc_thispage.
fopen() opens robotslogs.txt in append mode, fwrite() writes the log line, and fclose() closes the file.
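The same open/write/close cycle can be collapsed into a single call with PHP's built-in file_put_contents(); FILE_APPEND and LOCK_EX are standard PHP constants, while the file name and sample request values below are placeholders standing in for the real request data:

```php
<?php
// Hypothetical sample values standing in for the real request data.
$file      = 'robotslogs.txt';
$date      = date('Y-m-d H:i:s');
$searchbot = 'Google';
$ua        = 'Mozilla/5.0 (compatible; Googlebot/2.1)';

// FILE_APPEND adds to the end of the file instead of truncating it;
// LOCK_EX prevents two concurrent requests from interleaving their lines.
file_put_contents($file, "Time:$date robot:$searchbot UA:$ua\r\n", FILE_APPEND | LOCK_EX);
```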
I later removed this code from my own site because I no longer needed it, so there is no sample output to show.
PS: PHP code to record visits from each individual search spider
It recognizes the following search engines crawling your site: Baidu, Google, Bing, Yahoo, Soso, Sogou, and Yodao.
Code:
<?php
/**
 * Log visits from search engine spiders.
 * edit by www.jb51.net
 */
function get_naps_bot() {
    $useragent = strtolower($_SERVER['HTTP_USER_AGENT']);
    if (strpos($useragent, 'googlebot') !== false) {
        return 'Google';
    }
    if (strpos($useragent, 'baiduspider') !== false) {
        return 'Baidu';
    }
    if (strpos($useragent, 'msnbot') !== false) {
        return 'Bing';
    }
    if (strpos($useragent, 'slurp') !== false) {
        return 'Yahoo';
    }
    if (strpos($useragent, 'sosospider') !== false) {
        return 'Soso';
    }
    if (strpos($useragent, 'sogou spider') !== false) {
        return 'Sogou';
    }
    if (strpos($useragent, 'yodaobot') !== false) {
        return 'Yodao';
    }
    return false;
}

function nowtime() {
    return date("Y-m-d.G:i:s");
}

$searchbot = get_naps_bot();
if ($searchbot) {
    $tlc_thispage = addslashes($_SERVER['HTTP_USER_AGENT']);
    $url  = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
    $file = "www.jb51.net.txt";
    $time = nowtime();
    $data = fopen($file, "a");
    // Log the escaped User-Agent and the referring URL under separate labels.
    fwrite($data, "Time:$time robot:$searchbot UA:$tlc_thispage URL:$url\n");
    fclose($data);
}
?>
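To see the detection logic in action without waiting for a real spider, the User-Agent check can be factored into a standalone function that takes the string as a parameter instead of reading $_SERVER directly. This is a sketch of the same matching rules, not part of the original article; the helper name detect_bot is made up for illustration:

```php
<?php
// Same detection logic as get_naps_bot(), but parameterized so it can be
// exercised with arbitrary User-Agent strings.
function detect_bot($useragent) {
    $useragent = strtolower($useragent);
    // Map of User-Agent substrings to spider names, in the same order
    // as the original chain of if-statements.
    $bots = array(
        'googlebot'    => 'Google',
        'baiduspider'  => 'Baidu',
        'msnbot'       => 'Bing',
        'slurp'        => 'Yahoo',
        'sosospider'   => 'Soso',
        'sogou spider' => 'Sogou',
        'yodaobot'     => 'Yodao',
    );
    foreach ($bots as $needle => $name) {
        if (strpos($useragent, $needle) !== false) {
            return $name;
        }
    }
    return false;
}
```

Regular browser User-Agents fall through every check and return false, so nothing is logged for ordinary visitors.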
The above is the complete code for logging search engine spider visits in PHP.