Why Does file_get_contents() Corrupt Special Characters in Remote HTML, and How Can I Fix It?

Susan Sarandon
Release: 2024-12-08 07:17:10

file_get_contents() Garbles Special Characters in Remote HTML

When file_get_contents() retrieves a remote HTML document, some special characters may display incorrectly. This happens mainly with UTF-8 content that contains characters such as Ľ, Š, Č, Ť, and Ž. Instead of rendering properly, they appear as sequences of symbols such as Å, ¾, and ¤. file_get_contents() itself returns the raw bytes unchanged; the garbling typically arises when those multi-byte UTF-8 sequences are later interpreted as a single-byte encoding such as ISO-8859-1.
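The effect can be reproduced with a single character. The following is a minimal sketch, assuming the mbstring extension is available; it reinterprets the UTF-8 bytes of "ž" as ISO-8859-1, which is one plausible way the corruption arises, not necessarily what happens on every affected page. The variable names are illustrative only.

```php
<?php
// Illustrative sketch: reproduce the corruption with one character.
// Assumes the mbstring extension is loaded.

// "ž" (U+017E) is stored in UTF-8 as the two bytes 0xC5 0xBE.
$utf8 = "\u{017E}";

// Treat those two bytes as ISO-8859-1 and convert them to UTF-8.
// Each byte becomes its own character, as when the wrong charset is assumed.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n"; // c5be
echo $misread, "\n";       // Å¾
```

The two bytes of one character become two separate characters, which matches the kind of output described above.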

Solution:

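To resolve this issue, convert the retrieved content to HTML entities using the mb_convert_encoding() function. Note that the 'HTML-ENTITIES' target encoding is deprecated as of PHP 8.2, so newer PHP versions emit a deprecation notice for this call. Here's the modified code: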

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="sk" lang="sk">
<head>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Language" content="sk" />
<title>Test</title>

</head>
<body>

<?php

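// Fetch the remote page; file_get_contents() returns false on failure.
$html = file_get_contents('http://example.com');
if ($html === false) {
    die('Could not fetch the remote page.');
}

// Replace every non-ASCII character with its HTML entity,
// treating the fetched bytes as UTF-8.
$convertedHtml = mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8');
echo $convertedHtml;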

?>

</body>
</html>

Converting the UTF-8 characters to their corresponding HTML entities leaves only ASCII in the output, so the special characters in the loaded HTML document render correctly regardless of which charset is assumed for the page.
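Because PHP 8.2 deprecates the 'HTML-ENTITIES' target encoding, the same idea can be sketched with mb_encode_numericentity() instead. This is a hedged alternative rather than the article's original method: the helper name toNumericEntities() is hypothetical, the mbstring extension is assumed, and the output uses numeric character references instead of whatever entity form mb_convert_encoding() would choose.

```php
<?php
// Hypothetical helper (not part of the original article): encode every
// non-ASCII character of a UTF-8 string as a numeric HTML entity.
function toNumericEntities(string $utf8): string
{
    // Map format: [first code point, last code point, offset, mask].
    // 0x80..0x10FFFF covers every code point outside ASCII.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

// "Ž" is U+017D, which is 381 in decimal.
echo toNumericEntities("<p>\u{017D}ilina</p>"), "\n"; // <p>&#381;ilina</p>
```

In the listing above, the call would replace the mb_convert_encoding() line, for example `echo toNumericEntities($html);`. ASCII markup such as tags and attributes passes through untouched.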

source:php.cn