Get all links by reading the source files of a site

WBOY
Release: 2016-07-25 09:11:10
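This script reads the HTML source of a page on a given site, uses a regular expression to find every href attribute in it, and prints each link as a full URL. Relative links are resolved against either the site root or the directory of the current page.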
<?php
/**********qiushuiwuhen(2002-5-20)***********/
if (empty($url)) $url = "http://www.csdn.net/expert/"; // Set the URL
$site = substr($url, 0, strpos($url, "/", 8)); // Site root: scheme and host
$base = substr($url, 0, strrpos($url, "/") + 1); // Directory of the current file
$fp = fopen($url, "r"); // Open the URL (requires allow_url_fopen)
$contents = "";
while (!feof($fp)) $contents .= fread($fp, 1024); // Read the page in 1 KB chunks
$pattern = "|href=['\"]?([^ '\"]+)['\" ]|U";
preg_match_all($pattern, $contents, $regArr, PREG_SET_ORDER); // Match every href=
for ($i = 0; $i < count($regArr); $i++) {
    if (strpos($regArr[$i][1], "://") === false) // Relative path, i.e. it contains no "://"
        if (substr($regArr[$i][1], 0, 1) == "/") // Path relative to the site root
            echo "link" . ($i + 1) . ":" . $site . $regArr[$i][1] . "\n"; // Root directory
        else
            echo "link" . ($i + 1) . ":" . $base . $regArr[$i][1] . "\n"; // Current directory
    else
        echo "link" . ($i + 1) . ":" . $regArr[$i][1] . "\n"; // Already an absolute URL
}
fclose($fp);
?>
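
The same logic can be split into two small functions so that it can be tried on an HTML string without fetching anything. This is only a sketch: the names extract_hrefs and resolve_link are hypothetical and do not appear in the original script, and the sample HTML and the example.com URL are made up for illustration. It keeps the script's three cases (absolute URL, root-relative path, path relative to the current directory) and, like the script, does not handle "../" segments, query strings, or fragments.

```php
<?php
// Hypothetical helpers for illustration; not part of the original script.

// Return every href value found in $html, using the script's pattern.
function extract_hrefs($html)
{
    $pattern = "|href=['\"]?([^ '\"]+)['\" ]|U";
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);
    // With PREG_SET_ORDER each match is [full match, group 1]; keep group 1.
    return array_column($matches, 1);
}

// Resolve $href against the page URL $url using the script's three cases.
// Assumption: $url contains a "/" after the host, e.g. "http://host/dir/".
function resolve_link($url, $href)
{
    if (strpos($href, "://") !== false) {
        return $href; // Already an absolute URL
    }
    if (substr($href, 0, 1) === "/") {
        $site = substr($url, 0, strpos($url, "/", 8)); // Scheme and host
        return $site . $href; // Relative to the site root
    }
    $base = substr($url, 0, strrpos($url, "/") + 1); // Current directory
    return $base . $href;
}

// Usage with made-up sample markup.
$page = "http://www.csdn.net/expert/";
$html = '<a href="/a.html">A</a> <a href=\'b.html\'>B</a> '
      . '<a href="http://example.com/c">C</a>';
$links = array();
foreach (extract_hrefs($html) as $href) {
    $links[] = resolve_link($page, $href);
}
print_r($links);
```

Returning the links instead of echoing them makes it easier to check each case separately or to pass the list on to other code.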

source:php.cn