What should you pay attention to when getting web content in PHP?

Notes on obtaining web page content with PHP
1. Network errors will occur, and any kind is possible: the machine is down, the network cable is unplugged, the domain name is wrong, the connection times out, the page no longer exists, the site redirects, access is blocked, the host is overloaded...
2. The server may restrict access so that only common browsers are allowed, typically by checking the User-Agent header.
3. The server may have anti-hotlinking restrictions, typically based on the Referer header.
4. Some websites do not care whether your HTTP request has an Accept-Encoding header or what it contains; they always send gzip-compressed content.
5. URLs come in all sorts of odd forms: some contain Chinese characters, and some even contain carriage returns and line feeds.
6. Some websites declare a Content-Type in the HTTP header and several more in the page itself. Worse, the declarations may all differ. Worst of all, none of them may match the encoding actually used in the body, which results in garbled characters.
7. The network can be very slow. Multiply that by the time needed to fetch and analyze thousands of pages, and you may as well go and have a good meal while it runs.
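Notes 4, 5 and 6 can be handled with a few small helpers before and after the request. The following is only a minimal sketch: the function names clean_url, maybe_gunzip and to_utf8 are hypothetical, the URL is assumed to be a UTF-8 string, and GB18030 is an assumed last-resort charset for Chinese-language pages. It also treats any body that is already valid UTF-8 as UTF-8, which is a heuristic rather than a guarantee.

```php
<?php
// Hypothetical helpers for notes 4, 5 and 6; the names are illustrative only.

// Note 5: remove CR/LF/tabs and percent-encode spaces and non-ASCII bytes.
// Assumes the URL is a UTF-8 string and the server expects UTF-8 escapes.
// Non-ASCII host names (IDN) are not handled here.
function clean_url(string $url): string
{
    $url = trim(str_replace(["\r", "\n", "\t"], '', $url));
    return preg_replace_callback(
        '/[^\x21-\x7E]/',
        function (array $m): string {
            return rawurlencode($m[0]);
        },
        $url
    );
}

// Note 4: decompress the body if it looks like gzip, whatever was requested.
function maybe_gunzip(string $body): string
{
    // gzip streams start with the magic bytes 0x1F 0x8B.
    if (strncmp($body, "\x1f\x8b", 2) === 0 && function_exists('gzdecode')) {
        $decoded = @gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

// Note 6: convert the body to UTF-8 without trusting any single declaration.
function to_utf8(string $body, ?string $contentType = null): string
{
    // An empty pattern with the /u flag matches only if the subject is valid UTF-8.
    if (preg_match('//u', $body) === 1) {
        return $body;
    }
    $candidates = [];
    // Charset from the HTTP Content-Type header, if one was passed in.
    if ($contentType !== null
        && preg_match('/charset\s*=\s*["\']?([\w-]+)/i', $contentType, $m)) {
        $candidates[] = $m[1];
    }
    // Charset from the first <meta> declaration found in the page.
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([\w-]+)/i', $body, $m)) {
        $candidates[] = $m[1];
    }
    $candidates[] = 'GB18030'; // assumed fallback, not part of the original notes
    foreach ($candidates as $charset) {
        // iconv() returns false for an unknown charset or an illegal byte sequence.
        $converted = @iconv($charset, 'UTF-8', $body);
        if ($converted !== false) {
            return $converted;
        }
    }
    return $body; // give up and return the bytes unchanged
}
```

A typical order would be clean_url() on the link before the request, then maybe_gunzip() and to_utf8() on the response body. Single-byte charsets accept almost any byte sequence, so a wrong declaration of that kind cannot be detected this way.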
Methods for getting web page content in PHP
Method 1. Use file_get_contents
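$url = "http://news.sina.com.cn/c/nd/2016-10-23/doc-ifxwztru6951143.shtml";
$html = file_get_contents($url);
if ($html === false) {
    exit("Failed to fetch ".$url);
}
// If Chinese characters come out garbled, uncomment the line below to convert the page to UTF-8
//$html = iconv("gb2312", "utf-8", $html);
echo "<textarea style='width:800px;height:600px;'>".htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE, "UTF-8")."</textarea>";
Method 2. Use curl to implement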
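$url = "http://news.sina.com.cn/c/nd/2016-10-23/doc-ifxwztru6951143.shtml";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);   // return the body instead of printing it
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);  // give up connecting after 10 seconds
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);   // follow redirects to the final page
$html = curl_exec($ch);
if ($html === false) {
    // curl_exec() returns false on failure; read the reason before closing the handle
    exit("curl error: ".curl_error($ch));
}
curl_close($ch);
echo "<textarea style='width:800px;height:600px;'>".htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE, "UTF-8")."</textarea>";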
The line curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); tells curl to follow redirects, so that a redirected request returns the final page. Without it, the result is the redirect response itself, for example:
<head><title>Object moved</title></head> <body><h1 id="Object-nbsp-Moved">Object Moved</h1>This object may be found <a href="some link." rel="external nofoll
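The redirect handling above can be combined with the earlier notes on errors, browser checks, anti-hotlinking, compression and slow links in one helper. This is a minimal sketch, not a definitive implementation: the function name fetch_page, the User-Agent string, the redirect limit and the timeout values are all assumptions chosen for illustration.

```php
<?php
// Hypothetical helper; name, User-Agent string and limits are assumptions.
function fetch_page(string $url, ?string $referer = null): array
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects to the final page
        CURLOPT_MAXREDIRS      => 5,     // stop on redirect loops
        CURLOPT_CONNECTTIMEOUT => 10,    // seconds allowed for connecting
        CURLOPT_TIMEOUT        => 30,    // seconds allowed for the whole transfer
        CURLOPT_ENCODING       => '',    // accept and decode every encoding curl supports
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
    ]);
    if ($referer !== null) {
        // Some anti-hotlinking checks expect a Referer from the same site.
        curl_setopt($ch, CURLOPT_REFERER, $referer);
    }

    $body   = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    // The handle is released when $ch goes out of scope at the end of the function.
    return [
        // Treat transport failures and HTTP 4xx/5xx responses as errors.
        'ok'           => $body !== false && $status < 400,
        'body'         => $body === false ? '' : $body,
        'error'        => $body === false ? curl_error($ch) : '',
        'status'       => $status,
        'final_url'    => (string) curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
        'content_type' => (string) curl_getinfo($ch, CURLINFO_CONTENT_TYPE),
    ];
}
```

A caller would check $result['ok'] first and log $result['error'] or $result['status'] on failure, rather than echoing the body unconditionally. The 'content_type' value can then be used when deciding how to convert the body to UTF-8.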