
PHP fsockopen/curl: how to get the page source after the target redirects

WBOY
Release: 2016-06-13 13:23:26


PHP code
<?php
$ghurl = isset($_GET['id']) ? $_GET['id'] : 'http://3gabc.com/';

// Fetch a page with cURL
function getContents($url) {
    $header = array("Referer: http://3gabc.com/");
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow HTTP redirects (Location header)
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
    $contents = curl_exec($ch);
    curl_close($ch);

    return $contents;
}

$contents = getContents($ghurl);
echo $contents;
?>



This fails.

PHP code
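<?php
function get_page_content($url) {
    // eregi_replace() was removed in PHP 7; preg_replace() with the i flag does the same job here
    $url = preg_replace('#^http://#i', '', $url);
    $temp = explode('/', $url);
    $host = array_shift($temp);
    $path = '/' . implode('/', $temp);
    $temp = explode(':', $host);
    $host = $temp[0];
    $port = isset($temp[1]) ? (int) $temp[1] : 80;
    // Call-time pass-by-reference (&$errno) is a fatal error since PHP 5.4
    $fp = @fsockopen($host, $port, $errno, $errstr, 30);
    if (!$fp) {
        return false; // connection failed
    }
    // HTTP/1.0 is requested so the server does not reply with chunked encoding
    fputs($fp, "GET " . $path . " HTTP/1.0\r\nHost: " . $host . "\r\nAccept: */*\r\nReferer: http://" . $url . "\r\nUser-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)\r\nConnection: Close\r\n\r\n");
    $Content = '';
    while (($str = fread($fp, 4096)) !== false && $str !== '') {
        $Content .= $str;
    }
    fclose($fp);

    // Redirect: follow the Location header (301 or 302)
    if (preg_match('/^HTTP\/\d\.\d 30[12]/i', $Content)) {
        if (preg_match('/^Location:\s*(\S+)/im', $Content, $murl)) {
            // Location is normally an absolute URL; prepend the host only when it is a path
            $next = preg_match('#^http://#i', $murl[1])
                ? $murl[1]
                : $host . ':' . $port . '/' . ltrim($murl[1], '/');
            return get_page_content($next);
        }
    }

    // Read the body
    if (preg_match('/^HTTP\/\d\.\d 200 OK/i', $Content)) {
        preg_match('/Content-Type:(.*?)\r\n/is', $Content, $murl);
        $contentType = trim($murl[1]);
        $Content = explode("\r\n\r\n", $Content, 2);
        $Content = $Content[1];
    }
    return $Content;
}

echo get_page_content('3gabc.com');
?>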



This fails too.

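I have tried fsockopen, curl, and other approaches in turn, reading the response headers and acting on them, but every attempt failed. Any advice would be appreciated.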

------Solution--------------------


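3gabc.com contains only this one line, a meta refresh tag.
That line is all you can fetch. A meta refresh is executed by the browser, not by the server: it is not an HTTP redirect, so CURLOPT_FOLLOWLOCATION and Location-header checks never see it.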
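
To make the distinction concrete, here is a minimal sketch that classifies a raw HTTP response by redirect type. The function name classify_redirect and the return values 'http', 'meta', and 'none' are assumptions made for illustration; they do not come from the thread.

```php
<?php
// Classify a raw HTTP response (status line + headers + body) by redirect type.
// Hypothetical helper: returns 'http' for a 3xx response with a Location header,
// 'meta' for a body containing a meta refresh tag, and 'none' otherwise.
function classify_redirect($rawResponse) {
    $parts = explode("\r\n\r\n", $rawResponse, 2);
    $head  = $parts[0];
    $body  = isset($parts[1]) ? $parts[1] : '';

    // HTTP-level redirect: cURL follows this when CURLOPT_FOLLOWLOCATION is set
    if (preg_match('#^HTTP/\d\.\d\s+3\d\d#', $head)
        && preg_match('/^Location:/im', $head)) {
        return 'http';
    }

    // HTML-level redirect: only a browser (or your own parsing) acts on it
    if (preg_match('/<meta[^>]+http-equiv\s*=\s*["\']?refresh/i', $body)) {
        return 'meta';
    }

    return 'none';
}
```

A page like 3gabc.com would presumably fall into the 'meta' case: the server answers 200 OK, and the redirect lives only in the HTML body.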

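------Solution--------------------
That is because 3gabc.com contains only this one line.

After you fetch the page, "echo $contents;" sends that line to the browser, so the page naturally redirects to http://www.3Gabc.com.
So do not echo $contents. Instead, use a regular expression, "preg_match("//is",$content, $matches)",
to extract the redirect address, then request that address with curl to get the content you want.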
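
The approach described above can be sketched roughly as follows. The function names extract_meta_refresh_url and fetch_following_meta_refresh, the $maxHops limit, and the regular expression itself are assumptions for illustration; the pattern only covers the common form where http-equiv precedes content, and it does not resolve relative URLs.

```php
<?php
// Hypothetical helper: extract the target URL from a meta refresh tag,
// or return null if none is found.
// Assumes the common form: <meta http-equiv="refresh" content="0;url=...">
function extract_meta_refresh_url($html) {
    $pattern = '/<meta[^>]+http-equiv\s*=\s*["\']?refresh["\']?[^>]*'
             . 'content\s*=\s*["\']?\s*\d+\s*;\s*url\s*=\s*[\'"]?([^\'"\s>]+)/i';
    return preg_match($pattern, $html, $m) ? $m[1] : null;
}

// Hypothetical helper: fetch a URL with cURL, following HTTP redirects
// and up to $maxHops meta refreshes. Returns the final HTML or false on error.
function fetch_following_meta_refresh($url, $maxHops = 3) {
    $html = false;
    for ($hop = 0; $hop <= $maxHops; $hop++) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body as a string
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow Location headers
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        $html = curl_exec($ch);
        if ($html === false) {
            return false; // transfer failed
        }
        $next = extract_meta_refresh_url($html);
        if ($next === null) {
            return $html; // no further meta refresh: this is the final page
        }
        $url = $next;
    }
    return $html; // hop limit reached; return the last page fetched
}

// Usage (performs network requests, so it is left commented out):
// echo fetch_following_meta_refresh('http://3gabc.com/');
```

The hop limit is there so that two pages refreshing to each other cannot loop forever.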
source:php.cn