This article describes how to generate a Baidu sitemap with a PHP class, shared here for reference. The implementation is as follows:
The company website is a Q&A encyclopedia site, and the SEO engineer asked for XML files to be generated from the site's questions. Each XML file holds sitemap data for 5,000 questions. The live site currently has about 700,000 questions, so roughly 140 XML files are generated, plus one index file. The XML files are named with numbers (0.xml, 1.xml, and so on), and the index file lists the path and name of each XML file.
Why 5,000 records per file? The cap is chosen with MySQL in mind: fetching too many rows at once could affect online users or slow the site down. Although each file holds 5,000 records, a single MySQL SELECT does not fetch all 5,000 at once. The current code fetches 1,000 rows per run, which makes the logic a little more involved.
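The batch arithmetic above can be sketched as follows. This is a minimal illustration rather than the article's code: the helper name nextBatchSize is hypothetical, and the 5,000 and 1,000 values are the ones quoted in the text.

```php
<?php
// Hypothetical helper: how many rows to fetch on the next run so that
// the current xml file never exceeds its cap.
function nextBatchSize(int $rowsInFile, int $maxPerFile = 5000, int $batch = 1000): int
{
    $remaining = $maxPerFile - $rowsInFile;
    if ($remaining <= 0) {
        // The current file is full; the next run starts a new file.
        return $batch;
    }
    return min($batch, $remaining);
}

// About 700,000 questions at 5,000 per file.
$files = (int) ceil(700000 / 5000);

echo $files, PHP_EOL;               // 140
echo nextBatchSize(0), PHP_EOL;     // 1000
echo nextBatchSize(4500), PHP_EOL;  // 500
echo nextBatchSize(5000), PHP_EOL;  // 1000 (goes into a new file)
```

With a batch of 1,000 and a cap of 5,000 the cap is always hit exactly, so the min() only matters if either number is changed later.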
First, fetch 1,000 rows (the number is kept configurable so it can be changed later), then build the XML in a loop and write it to disk with file_put_contents. Next, write the generated XML file name, the maximum id fetched, the minimum id fetched, and the number of questions fetched into a txt file that serves as the index. The format is roughly like this:
0,3146886,3145887,1000
Notice that the last field is 1000. The first run selects 1,000 rows, writes them to 0.xml, and records the file name, maximum id, minimum id, and row count in the index txt, so the count is 1,000. On the second run the SELECT condition becomes: where id > (the maximum id already fetched) limit 1000. (MySQL is currently queried in ascending order; for descending order, change the comparison to less than.) That fetches another 1,000 rows, after which the minimum and maximum ids in the index txt are updated and the count rises to 2,000. This repeats until the count reaches 5,000, at which point a new line is started in the index file, like this:
0,3146886,3145887,5000
1,3148886,3147887,1000
Processing the data in small batches like this reduces the load on the server.
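The index bookkeeping described above can be simulated without a database. The sketch below is illustrative only: parseIndexLine and applyBatch are hypothetical names, the ids are made up (consecutive, starting at 1), and range() stands in for the SELECT with id > max id. Unlike the article's listing, which overwrites the minimum id with the first id of every batch, this sketch keeps the first minimum id of each file.

```php
<?php
// Parse one index line of the form "file,maxid,minid,count".
function parseIndexLine(string $line): array
{
    [$file, $maxId, $minId, $count] = array_map('intval', explode(',', trim($line)));
    return ['file' => $file, 'maxid' => $maxId, 'minid' => $minId, 'count' => $count];
}

// Apply one fetched batch of ids to the list of index lines.
function applyBatch(array $lines, array $ids, int $maxPerFile = 5000): array
{
    if ($ids === []) {
        return $lines; // nothing fetched, nothing to record
    }
    $last = $lines === [] ? null : parseIndexLine($lines[count($lines) - 1]);
    if ($last !== null && $last['count'] < $maxPerFile) {
        // The current file still has room: update its index line in place.
        $lines[count($lines) - 1] = implode(',', [
            $last['file'], max($ids), $last['minid'], $last['count'] + count($ids),
        ]);
    } else {
        // First run, or the current file is full: start a new index line.
        $file = $last === null ? 0 : $last['file'] + 1;
        $lines[] = implode(',', [$file, max($ids), min($ids), count($ids)]);
    }
    return $lines;
}

// Simulate six runs of 1,000 rows each.
$lines = [];
$maxId = 0;
for ($run = 0; $run < 6; $run++) {
    // Stand-in for: SELECT ... WHERE id > $maxId ORDER BY id LIMIT 1000
    $ids = range($maxId + 1, $maxId + 1000);
    $lines = applyBatch($lines, $ids);
    $maxId = max($ids);
}
print_r($lines);
```

After six runs the index holds two lines, 0,5000,1,5000 and 1,6000,5001,1000: the first file is full and the second has received its first batch.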
The implementation code is posted below (the style is a bit messy):
<?php
/*
 * SiteMap interface class
 */
class SitemapAction extends Action{
    private static $baseURL = ''; //URL address
    private static $askMobileUrl = 'http://m.xxx.cn/ask/'; //Q&A mobile version address
    private static $askPcUrl = "http://www.xxx.cn/ask/"; //Q&A PC version address
    private static $askZonePcUrl = "http://www.xxx.cn/ask/jingxuan/"; //Q&A selected questions, PC link
    private static $askZoneMobileUrl = "http://m.xxx.cn/ask/jx/"; //Q&A selected questions, mobile link
    //Q&A sitemaps
    public function askSetMap(){
        header('Content-type:text/html;charset=utf-8');
        //Get the question list
        $maxid = 0; //Maximum id recorded in the index file
        $minid = 0; //Minimum id recorded in the index file
        $psize = 1000; //Number of rows fetched from the database per run
        $maxXml = 5000; //Maximum number of records per xml file
        $where = array();
        //Path of the index txt file
        $index = APP_PATH.'setmapxml/Index.txt';
        //Path of the sitemap index xml
        $askXml = "../siteditu/ask/ask.xml";
        $arr = array(); //Fields of the last index line; stays empty on the first run
        if(!file_exists($index)){
            $fp = fopen($index, "w+");
            if ($fp === false || !is_writable($index)){
                die("File: " .$index. " is not writable, please check!");
            }
            fclose($fp);
        }else{
            //Index.txt line format 0: xml file name (starting from 0), 1: maximum id in the file, 2: minimum id in the file, 3: current number of records in the file
            $lines = file($index);
            if($lines){
                $arr = explode(',', trim(end($lines))); //Parse the last line
            }
        }
        //Check whether the current xml file holds fewer than $maxXml records
        //$bs = 1: append a new index line (new xml file); $bs = 0: update the last index line
        if(empty($arr[1])){
            //First run
            $bs = 1;
            $filename = 0;
        }else{
            if((int)$arr[3] < $maxXml){
                $filename = $arr[0];
                $psize = min($psize, $maxXml - (int)$arr[3]); //Never exceed the per-file limit
                $bs = 0;
            }else{
                $filename = $arr[0]+1;
                $bs = 1;
            }
        }
        $maxid = empty($arr[1])?0:$arr[1];
        $minid = empty($arr[2])?0:$arr[2];
        echo "File name: ".$filename.".xml<br/>";
        echo "Maximum id: ".$maxid."<br/>";
        echo "Minimum id: ".$minid."<br/>";
        echo "Maximum number of records per xml file: ".$maxXml."<br/>";
        echo "Number of rows read from the database per run: ".$psize."<br/>";
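        //self::$questionObj is the question model; its definition is not shown in this article
        $list = self::$questionObj->getQuestionSetMap($where,$maxid,$psize);
        if(count($list)<=0){
            echo 1;exit; //No new questions to write
        }
        $record = (isset($arr[3]) ? (int)$arr[3] : 0) + count($list); //The number of records written in the index file
        $indexArr = array('filename'=>$filename,'maxid'=>$maxid,'minid'=>$minid,'maxXml'=>$record);
        //Empty sitemap skeleton, used when the xml file does not exist yet
        //(namespaces are assumed: standard sitemap plus Baidu mobile)
        $start = '<?xml version="1.0" encoding="UTF-8" ?>'.chr(10);
        $start.='<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:mobile="http://www.baidu.com/schemas/sitemap-mobile/1/">'.chr(10);
        $start.="</urlset>";
        $xml = ''; //url entries for this batch
        foreach($list as $k=>$qinfo){
            if($k==0)
                $indexArr['minid']=$qinfo['id'];
            $qinfo['lastmod'] = substr($qinfo['lasttime'],0,10);
            $qinfo['mobielurl'] = self::$askMobileUrl.$qinfo['id'].'.html'; //Mobile version link
            $qinfo['pcurl'] = self::$askPcUrl.$qinfo['id'].'-p1.html'; //PC version link
            $xml.=$this->askMapMobileUrl($qinfo); //Mobile version
            $xml.=$this->askMapPcUrl($qinfo); //PC version
        }
        $maxid = end($list);
        $indexArr['maxid'] = $maxid['id'];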
        //Update index file
        if($bs==0){
            //Update the last line
            $txt = file($index);
            $txt[count($txt)-1] = $indexArr['filename'].','.$indexArr['maxid'].','.$indexArr['minid'].','.$indexArr['maxXml']."\r\n";
            $str = join($txt);
            if (is_writable($index)) {
                if (!$handle = fopen($index, 'w')) {
                    echo "Cannot open file $index";
                    exit;
                }
                if (fwrite($handle, $str) === FALSE) {
                    echo "Cannot write to file $index";
                    exit;
                }
                echo "Successfully written to file $index";
                fclose($handle);
            } else {
                echo "File $index is not writable";
                exit;
            }
        }elseif($bs==1){
            //Add a new line
            $fp = fopen($index,'a');
            $num = count($list);
            $string = $indexArr['filename'].','.$indexArr['maxid'].','.$indexArr['minid'].','.$num."\r\n";
            if(fwrite($fp,$string)===false){
                echo "Failed to append new line...";exit;
            }else{
                echo "Added successfully<br/>";
                //Update sitemap index file
                $xmlData='<?xml version="1.0" encoding="UTF-8" ?>'.chr(10);
                $xmlData.='<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'.chr(10);
                $xmlData.="</sitemapindex>";
                if(!file_exists($askXml))
                    file_put_contents($askXml,$xmlData);
                $fileList = file($askXml);
                $fileCount = count($fileList);
                $setmapxml = "http://www.xxx.cn/ask/setmapxml/{$filename}.xml";//URL of the new xml file
                $txt = $this->setMapIndex($setmapxml);
                //Replace the closing tag with the new entry plus the closing tag
                $fileList[$fileCount-1]=$txt."</sitemapindex>";
                $newContent = '';
                foreach($fileList as $v){
                    $newContent.= $v;
                }
                if(!file_put_contents($askXml,$newContent)) exit('Unable to write data');
                echo 'The document has been written: ' . $askXml;
            }
            fclose($fp);
        }
        $filename = APP_PATH.'setmapxml/'.$filename.'.xml';
        //Write to the xml file: replace the closing tag with the new entries plus the closing tag
        if(!file_exists($filename))
            file_put_contents($filename,$start);
        $xmlList = file($filename);
        $xmlCount = count($xmlList);
        $xmlList[$xmlCount-1]=$xml."</urlset>";
        $newXml = '';
        foreach($xmlList as $v){
            $newXml.= $v;
        }
        if(!file_put_contents($filename, $newXml))
            exit("Writing data error");
        else
            echo "Data written successfully<br/>";
    }
    //Q&A mobile version xml
    private function askMapMobileUrl($data){
        $xml = '';
        if(is_array($data)&&!empty($data)){
            $xml.="<url>".chr(10);
            if($data['id'])
                $xml.='<loc>'.$data['mobielurl'].'</loc>'.chr(10);//Mobile version link
            $xml.='<mobile:mobile type="mobile"/>'.chr(10);//Baidu mobile marker (assumed form)
            if($data['lastmod'])
                $xml.='<lastmod>'.$data['lastmod'].'</lastmod>'.chr(10);
            $xml.='<changefreq>daily</changefreq>'.chr(10);
            $xml.='<priority>0.8</priority>'.chr(10);
            $xml.="</url>".chr(10);
        }
        return $xml;
    }
    //Q&A PC version xml
    private function askMapPcUrl($data){
        $xml = '';
        if(is_array($data)&&!empty($data)){
            $xml.='<url>'.chr(10);
            if($data['id'])
                $xml.='<loc>'.$data['pcurl'].'</loc>'.chr(10);//PC version link
            if($data['lastmod'])
                $xml.='<lastmod>'.$data['lastmod'].'</lastmod>'.chr(10);
            $xml.='<changefreq>daily</changefreq>'.chr(10);
            $xml.='<priority>0.8</priority>'.chr(10);
            $xml.='</url>'.chr(10);
        }
        return $xml;
    }
    //Sitemap index entry for one xml file
    private function setMapIndex($filename){
        $xml = '';
        $xml.="<sitemap>".chr(10);
        $xml.="<loc>{$filename}</loc>".chr(10);
        $xml.="<lastmod>".date("Y-m-d")."</lastmod>".chr(10);
        $xml.="</sitemap>".chr(10);
        return $xml;
    }
}
?>
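The class above builds each entry by plain string concatenation, so a URL containing & or other reserved characters would produce invalid XML. Below is a minimal sketch of an escaped entry, assuming the standard sitemap elements (url, loc, lastmod, changefreq, priority). The function name sitemapUrlEntry, the sample URLs and date, and the exact form of the Baidu mobile marker are assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical helper: build one <url> entry with XML-escaped values.
function sitemapUrlEntry(string $loc, string $lastmod = '', bool $mobile = false): string
{
    $xml = "<url>\n";
    $xml .= '<loc>' . htmlspecialchars($loc, ENT_QUOTES | ENT_XML1, 'UTF-8') . "</loc>\n";
    if ($mobile) {
        // Assumed form of the Baidu mobile marker.
        $xml .= "<mobile:mobile type=\"mobile\"/>\n";
    }
    if ($lastmod !== '') {
        $xml .= '<lastmod>' . htmlspecialchars($lastmod, ENT_QUOTES | ENT_XML1, 'UTF-8') . "</lastmod>\n";
    }
    $xml .= "<changefreq>daily</changefreq>\n";
    $xml .= "<priority>0.8</priority>\n";
    $xml .= "</url>\n";
    return $xml;
}

// Sample values only; the query string is there to show the escaping.
echo sitemapUrlEntry('http://www.xxx.cn/ask/1-p1.html?a=1&b=2', '2015-01-01');
echo sitemapUrlEntry('http://m.xxx.cn/ask/1.html', '', true);
```

In the first entry the ampersand in the query string is emitted as &amp;amp;, which keeps the file parseable by XML readers.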