How to solve the problem of garbled characters when reading files in PHP

藏色散人
Release: 2023-03-07 18:16:01

To fix garbled Chinese characters when reading a file in PHP: first read the file, then work out its encoding and convert the content to UTF-8 with iconv($encodType, "utf-8", $content).

Reading files in PHP and fixing garbled Chinese text (UTF-8)

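// Attempt 1: request UTF-8 through an 'encoding' stream context option.
// Note: the second assignment overwrites the first, so only the 'http' array is used,
// and 'encoding' does not appear among PHP's documented context options,
// so the bytes come back exactly as they are stored in the file.
$opts = array(
    'file' => array(
        'encoding' => "utf-8"
    )
);
$opts = array('http' => array('encoding' => 'utf-8'));
$ctxt = stream_context_create($opts);
$content = file_get_contents($filePath, FILE_TEXT, $ctxt);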
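
Before converting anything, it can help to look at the bytes that were actually read, since file_get_contents() returns the stored bytes without transcoding them. The following is a minimal sketch under stated assumptions: the temporary file and its content (the two characters "中文" saved as GBK) are made up for illustration and are not part of the original article.

```php
<?php
// Hypothetical demo file: "中文" encoded as GBK is the four bytes D6 D0 CE C4.
$path = tempnam(sys_get_temp_dir(), 'enc');
file_put_contents($path, "\xD6\xD0\xCE\xC4");

// file_get_contents() hands back the bytes exactly as stored.
$raw = file_get_contents($path);

// Print the bytes as hex so the encoding can be inspected by eye.
$hex = bin2hex($raw);
echo $hex, PHP_EOL;

unlink($path);
```

If the hex dump does not start with a byte order mark and is not valid UTF-8, the text needs an explicit conversion such as the ones that follow.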

The simplest approach is to convert GB2312 to UTF-8:

$str=iconv("gb2312", "utf-8", $str);
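
One thing the one-liner hides is that iconv() returns false when the input contains a byte sequence that is invalid in the source encoding. The sketch below checks the result before using it. The sample bytes are an assumption for illustration, and GBK is used as the source name because it is a superset of GB2312.

```php
<?php
// Hypothetical input: "中文" as GB2312/GBK bytes.
$str = "\xD6\xD0\xCE\xC4";

// iconv() returns false (and raises a notice) on an invalid input sequence,
// so the notice is suppressed here and the return value is checked instead.
$converted = @iconv("GBK", "UTF-8", $str);
if ($converted === false) {
    // Keep the original bytes rather than silently losing the text.
    $converted = $str;
}
echo $converted, PHP_EOL;
```

Falling back to the original bytes is only one possible policy; logging or throwing an exception may suit other applications better.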

If iconv does not work, try mb_convert_encoding:

$content = mb_convert_encoding($content, "UTF-8", "auto");
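
A likely reason "auto" is unreliable here is that it expands to a detection list that depends on the mbstring.language setting, which may not include any Chinese encoding at all. When the source encoding is known, it can be named explicitly instead. This is a small sketch, assuming the mbstring extension is loaded and that the input really is GBK; the sample bytes are made up for illustration.

```php
<?php
// Hypothetical input: "中文" as GBK bytes.
$content = "\xD6\xD0\xCE\xC4";

// Name the source encoding explicitly instead of relying on "auto".
$utf8 = mb_convert_encoding($content, "UTF-8", "GBK");

echo $utf8, PHP_EOL;
```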

The methods above are unreliable. The correct method is the following:

// Byte order marks (BOMs) that identify the Unicode encodings.
define('UTF32_BIG_ENDIAN_BOM', chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
define('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
define('UTF16_BIG_ENDIAN_BOM', chr(0xFE) . chr(0xFF));
define('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
define('UTF8_BOM', chr(0xEF) . chr(0xBB) . chr(0xBF));

$text = file_get_contents($newPath);
$first2 = substr($text, 0, 2);
$first3 = substr($text, 0, 3);
$first4 = substr($text, 0, 4); // a UTF-32 BOM is four bytes long
$encodType = "";
// Test the UTF-32 BOMs before the UTF-16 ones: the UTF-32LE BOM starts with the UTF-16LE BOM.
if ($first3 == UTF8_BOM)
    $encodType = 'UTF-8'; // iconv() needs a real encoding name, not 'UTF-8 BOM'
else if ($first4 == UTF32_BIG_ENDIAN_BOM)
    $encodType = 'UTF-32BE';
else if ($first4 == UTF32_LITTLE_ENDIAN_BOM)
    $encodType = 'UTF-32LE';
else if ($first2 == UTF16_BIG_ENDIAN_BOM)
    $encodType = 'UTF-16BE';
else if ($first2 == UTF16_LITTLE_ENDIAN_BOM)
    $encodType = 'UTF-16LE';

// Convert only when a BOM was found; without one the source encoding is unknown here.
$content = $text;
if ($encodType != '')
    $content = iconv($encodType, "utf-8", $text);

Ultimate edition:

$text = file_get_contents($filePath);
//$encodType = mb_detect_encoding($text);
// Byte order marks (BOMs) that identify the Unicode encodings.
define('UTF32_BIG_ENDIAN_BOM', chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
define('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
define('UTF16_BIG_ENDIAN_BOM', chr(0xFE) . chr(0xFF));
define('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
define('UTF8_BOM', chr(0xEF) . chr(0xBB) . chr(0xBF));
$first2 = substr($text, 0, 2);
$first3 = substr($text, 0, 3);
$first4 = substr($text, 0, 4); // a UTF-32 BOM is four bytes long
$encodType = "";
// Test the UTF-32 BOMs before the UTF-16 ones: the UTF-32LE BOM starts with the UTF-16LE BOM.
if ($first3 == UTF8_BOM)
    $encodType = 'UTF-8 BOM';
else if ($first4 == UTF32_BIG_ENDIAN_BOM)
    $encodType = 'UTF-32BE';
else if ($first4 == UTF32_LITTLE_ENDIAN_BOM)
    $encodType = 'UTF-32LE';
else if ($first2 == UTF16_BIG_ENDIAN_BOM)
    $encodType = 'UTF-16BE';
else if ($first2 == UTF16_LITTLE_ENDIAN_BOM)
    $encodType = 'UTF-16LE';

// The checks below mainly deal with ANSI-encoded files.
// Note: a UTF-8 file saved without a BOM also falls into the first branch and is treated as GBK.
if ($encodType == '') { // no BOM: a txt file created with the defaults, i.e. ANSI-encoded
    $content = iconv("GBK", "UTF-8", $text);
} else if ($encodType == 'UTF-8 BOM') { // already UTF-8, no conversion needed
    $content = $text;
} else { // every other format only needs converting to UTF-8
    $content = iconv($encodType, "UTF-8", $text);
}
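
The same idea can be wrapped in a reusable function. The sketch below is one possible arrangement, not the author's code: the function name toUtf8(), the GBK default, the BOM stripping, and the mb_check_encoding() test for UTF-8 files saved without a BOM are all assumptions added for illustration. It needs the iconv and mbstring extensions.

```php
<?php
/**
 * Sketch: convert raw file bytes to UTF-8 and drop any byte order mark.
 * toUtf8() and its $fallback parameter are hypothetical names.
 */
function toUtf8(string $text, string $fallback = 'GBK'): string
{
    // Longer BOMs come first: the UTF-32LE BOM begins with the UTF-16LE BOM.
    $boms = [
        "\x00\x00\xFE\xFF" => 'UTF-32BE',
        "\xFF\xFE\x00\x00" => 'UTF-32LE',
        "\xEF\xBB\xBF"     => 'UTF-8',
        "\xFE\xFF"         => 'UTF-16BE',
        "\xFF\xFE"         => 'UTF-16LE',
    ];

    $encoding = null;
    foreach ($boms as $bom => $name) {
        if (strncmp($text, $bom, strlen($bom)) === 0) {
            $encoding = $name;
            $text = substr($text, strlen($bom)); // remove the BOM itself
            break;
        }
    }

    if ($encoding === null) {
        // No BOM: keep valid UTF-8 as it is, otherwise assume the fallback encoding.
        $encoding = mb_check_encoding($text, 'UTF-8') ? 'UTF-8' : $fallback;
    }
    if ($encoding === 'UTF-8') {
        return $text;
    }

    $converted = iconv($encoding, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $encoding to UTF-8");
    }
    return $converted;
}

// Usage with in-memory sample bytes; with a real file this would be
// toUtf8(file_get_contents($filePath)).
echo toUtf8("\xD6\xD0\xCE\xC4"), PHP_EOL; // GBK bytes without a BOM
```

Checking for valid UTF-8 before falling back to GBK is a heuristic: some GBK byte sequences also happen to be valid UTF-8, so files known to be GBK are better converted with an explicit encoding name.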

The ultimate edition above handles txt files created on a Chinese-language Windows system in the "ANSI", "UTF-8", and "Unicode" encodings.

source:php.cn