What is a simple example of a PHP crawler?

WBOY

Released: 2016-06-06 20:19:15

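I'm new to PHP, and I'm especially weak at putting it to practical use.
Could someone post a simple PHP crawler example, something to spark a beginner's enthusiasm for PHP?
For example, using PHP to crawl some website's database?
How would I do this with different PHP approaches and functions, or with regular expressions? Should I use file() or something else?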

Replies:

https://github.com/search?utf8=%E2%9C%93&q=php+crawler

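Could you ask the question properly?
Who told you that you can crawl another site's database? Data != database.
To fetch HTML content you can use file_get_contents(), curl, fopen, fsockopen, etc.

The simplest case, fetching the SegmentFault home page: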

echo file_get_contents('https://segmentfault.com/');
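The one-liner above has no timeout and prints a warning if the request fails. A minimal sketch of a slightly more defensive fetch, still built on file_get_contents(), might look like the following. The function name `fetchHtml`, the 10-second default, and the `ExampleCrawler/0.1` user agent string are assumptions made for illustration; none of them come from the original answer.

```php
<?php
/**
 * Fetch a URL (or a local path) and return its contents, or null on failure.
 * Hypothetical helper: the name and defaults are illustrative only.
 */
function fetchHtml(string $url, int $timeoutSeconds = 10): ?string
{
    // The "http" context options also apply to https:// URLs.
    $context = stream_context_create([
        'http' => [
            'method'     => 'GET',
            'timeout'    => $timeoutSeconds,     // give up on slow servers
            'user_agent' => 'ExampleCrawler/0.1', // assumed value; identify your bot
        ],
    ]);

    // Suppress the PHP warning and report failure through the return value.
    $body = @file_get_contents($url, false, $context);

    return $body === false ? null : $body;
}
```

Calling `fetchHtml('https://segmentfault.com/')` would then return either the page source or null, so the caller can decide what to do when the request fails.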

Extracting HTML content

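You will probably need to extract data from the page. Regular expressions work, but they are not recommended: they are hard to maintain once the site is redesigned, and they break down when the HTML is irregular.

A DOM parser such as phpQuery is a better choice. There is also QueryList, a library built on top of phpQuery by a Chinese developer.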
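To make the maintenance point concrete, here is a small sketch that extracts the same titles twice: once with a regular expression and once with a DOM query. It uses PHP's built-in DOMDocument and DOMXPath classes rather than phpQuery or QueryList, so it illustrates the idea and not those libraries' APIs. The two HTML strings and both function names are made up for the example.

```php
<?php
// Two hypothetical renderings of the same list. The "redesigned" version only
// adds an attribute and an extra class, yet that is enough to break the regex.
$before = '<ul><li class="title"><a href="/a">First post</a></li>'
        . '<li class="title"><a href="/b">Second post</a></li></ul>';
$after  = '<ul><li data-id="1" class="title hot"><a href="/a">First post</a></li>'
        . '<li data-id="2" class="title"><a href="/b">Second post</a></li></ul>';

/** Extract titles with a regex tied to the exact markup. */
function titlesByRegex(string $html): array
{
    preg_match_all('#<li class="title"><a href="[^"]*">(.*?)</a></li>#s', $html, $m);
    return $m[1];
}

/** Extract titles by querying the parsed DOM for li elements with class "title". */
function titlesByDom(string $html): array
{
    // Collect parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $query = '//li[contains(concat(" ", normalize-space(@class), " "), " title ")]/a';

    $titles = [];
    foreach ($xpath->query($query) as $link) {
        $titles[] = trim($link->textContent);
    }
    return $titles;
}
```

With these inputs, `titlesByRegex($before)` finds both titles but `titlesByRegex($after)` finds none, while `titlesByDom()` returns both titles for either version.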

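<?php
// NOTE: the opening lines of this snippet were lost when the page was extracted.
// The include, the file_get_html() call, the URL, and the $news initialisation
// are reconstructed from the description that follows and are assumptions.
include 'simple_html_dom.php';

$html = file_get_html('https://www.php.net/');
$news = array();

// Each news entry on the home page is an <article class="newsentry">.
foreach ($html->find('article.newsentry') as $article) {
    $item = array();
    $item['time']  = trim($article->find('time', 0)->plaintext);
    $item['title'] = trim($article->find('h2.newstitle', 0)->plaintext);
    //$item['content'] = trim($article->find('div.newscontent', 0)->plaintext);
    $news[] = $item;
}

print_r($news);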

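For example, the snippet directly above uses Simple HTML DOM, a DOM parsing library for PHP, to collect the news items from the php.net home page. It lets you work with the DOM much as you would with jQuery and pull the data you need out of the HTML.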
http://simplehtmldom.sourceforge.net/manual.htm
