
How to scrape a website that blocks scraping

WBOY
Release: 2016-06-13 11:17:42

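How do I scrape a website that has anti-scraping protection?
I want to use PHP to scrape data from a website, but I cannot get any data back from it. The URL is:
http://www.alldatasheet.com/view.jsp?Searchword=78HC
Please give it a try; anything that returns the data will do. I have been trying for a long time without success.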


------Solution--------------------
<?php
// Request headers copied from a real browser session.
// The request line ("GET /view.jsp?Searchword=78HC HTTP/1.1") is not a header,
// so it is left out here; cURL builds it from CURLOPT_URL.
// Accept-Encoding is also left out and handled by CURLOPT_ENCODING below,
// otherwise the body would come back compressed and undecoded.
$header = array(
    "Host: www.alldatasheet.com",
    "Connection: keep-alive",
    "Cache-Control: max-age=0",
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.152 Safari/537.22",
    "Accept-Language: en-US,zh-CN;q=0.8,zh;q=0.6",
    "Accept-Charset: UTF-8,*;q=0.5",
    "Cookie: JSESSIONID=BD1418BC3F4CA9084F0C022A98687A09; track_id=117.25.173.111363310326444; seekstr=*78H*..; seekshot=78H..1..75..8..112; __utma=191189370.2036196682.1363308553.1363308553.1363308553.1; __utmb=191189370.3.10.1363308553; __utmc=191189370; __utmz=191189370.1363308553.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); arp_scroll_position=900"
);

// Initialize a cURL handle
$curl = curl_init();

// Set the URL to fetch
curl_setopt($curl, CURLOPT_URL, 'http://www.alldatasheet.com/view.jsp?Searchword=78HC');

// Send the browser-like request headers
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);

// Do not include the response headers in the returned data
curl_setopt($curl, CURLOPT_HEADER, 0);

// Return the result as a string instead of printing it directly
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

// Advertise every encoding cURL supports and decode the response automatically
curl_setopt($curl, CURLOPT_ENCODING, '');

// Run the request
$data = curl_exec($curl);
if ($data === false) {
    // Report the transport error instead of dumping "false"
    echo 'cURL error: ' . curl_error($curl) . "\n";
}

// Close the cURL handle
curl_close($curl);

// Show the fetched data
var_dump($data);
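The header list in the answer above is typed out as raw strings, and the long Cookie line is easy to break when pasting. As a rough sketch only, the same list could be built from name/value pairs. `build_header_lines` and `build_cookie_header` are hypothetical helper names, not part of cURL or of the original answer, and the values shown are just a few of the ones from the post.

```php
<?php
// Hypothetical helper: turn name => value pairs into the "Name: value"
// strings that CURLOPT_HTTPHEADER expects.
function build_header_lines(array $headers): array
{
    $lines = array();
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Hypothetical helper: join cookie name => value pairs into a single
// "Cookie: a=1; b=2" header line.
function build_cookie_header(array $cookies): string
{
    $pairs = array();
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return 'Cookie: ' . implode('; ', $pairs);
}

// A few of the values from the post, for illustration only.
$header = build_header_lines(array(
    'Host'            => 'www.alldatasheet.com',
    'Accept-Language' => 'en-US,zh-CN;q=0.8,zh;q=0.6',
));
$header[] = build_cookie_header(array(
    'JSESSIONID' => 'BD1418BC3F4CA9084F0C022A98687A09',
    '__utmc'     => '191189370',
));
```

The resulting `$header` array can be passed to `curl_setopt($curl, CURLOPT_HTTPHEADER, $header)` exactly like the hand-written one. Keep in mind that cookie values copied from a browser session expire, so they need to be refreshed from time to time.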

------Solution--------------------
Any page that a browser can open should be scrapable as well.
The key is the cookie: send the same cookies a browser would send.
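Since the cookie is the key, one way to avoid copying cookies out of a browser by hand is to let cURL collect them itself. The sketch below is only an illustration under assumptions: `fetch_with_cookie_jar` is a hypothetical helper name, and whether this particular site hands out the cookies it checks on a plain first visit has not been verified here. The cURL options used (`CURLOPT_COOKIEJAR`, `CURLOPT_COOKIEFILE`) are standard.

```php
<?php
// Hypothetical helper: visit $warmupUrl first so the server can set its
// session cookies, then fetch $targetUrl with those cookies attached.
// Returns the body of the second response, or false on failure.
function fetch_with_cookie_jar(string $warmupUrl, string $targetUrl, string $jarPath)
{
    $curl = curl_init();
    curl_setopt_array($curl, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        // Setting COOKIEFILE turns on cURL's cookie engine and loads any
        // cookies saved earlier; COOKIEJAR saves them when the handle is freed.
        CURLOPT_COOKIEFILE     => $jarPath,
        CURLOPT_COOKIEJAR      => $jarPath,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.22 (KHTML, like Gecko) Chrome/25.0.1364.152 Safari/537.22',
        CURLOPT_ENCODING       => '',   // accept and decode compressed responses
        CURLOPT_TIMEOUT        => 30,
    ));

    // First request: only there to receive Set-Cookie headers.
    curl_setopt($curl, CURLOPT_URL, $warmupUrl);
    if (curl_exec($curl) === false) {
        curl_close($curl);
        return false;
    }

    // Second request on the same handle: cookies received above are sent back.
    curl_setopt($curl, CURLOPT_URL, $targetUrl);
    $body = curl_exec($curl);
    curl_close($curl);
    return $body;
}

// Possible usage (commented out so that loading this file makes no request):
// $html = fetch_with_cookie_jar(
//     'http://www.alldatasheet.com/',
//     'http://www.alldatasheet.com/view.jsp?Searchword=78HC',
//     sys_get_temp_dir() . '/alldatasheet_cookies.txt'
// );
```

If the site sets some of its cookies from JavaScript, this approach will not pick them up, and copying the Cookie header from the browser as in the first answer remains the fallback.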
source:php.cn