What should I do if converting Big5 to UTF-8 produces garbled text in PHP?
To fix garbled characters when converting Big5 to UTF-8 in PHP: first generate the tab file, making sure that no tab file already exists when you generate it; then convert a specified page as a test; then print out the character table; finally, convert Big5 to UTF-8 with b2u().

Solution for garbled text when converting Big5 to UTF-8 in PHP:
Step 1: Generate the tab file. Make sure the tab file does not already exist, because putContent() opens it in append mode and would add the new table after any existing data.
writebig5UnicodeFile();
Step 2: Convert a specified page as a test
testCode();
Step 3: Print out the character table
printfCode();
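Because the generator appends to the tab file, a leftover file from an earlier run would shift every offset in the table. The guard below is a minimal sketch of one way to enforce the "tab file must not exist" condition before step 1. The helper name resetTabFile() is hypothetical and is not part of the original script.

```php
<?php
// Hypothetical helper (not in the original script): delete a stale tab file
// so that the generator starts from an empty file.
function resetTabFile(string $path): bool
{
    if (!file_exists($path)) {
        return true; // nothing to remove
    }
    return unlink($path);
}

// Intended call order, assuming the functions from this article are loaded:
// resetTabFile('./big5-unicode-new.tab');
// writebig5UnicodeFile(); // step 1
// testCode();             // step 2
// printfCode();           // step 3
```

The call order is left in comments so that the snippet runs on its own without the mapping file.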
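<?php
// Generate the big5-unicode encoding file
// Load the mapping text file into an array: Big5 code => Unicode code point
function loadBig5(){
    $fp = fopen( './big5-unicode.txt', 'r' );
    $big5_unicode_arr = array();
    while($one_line = fgets($fp)) {
        $one_line_arr = explode("\t",$one_line);
        // Skip lines that do not have both columns
        if(count($one_line_arr) < 2) {
            continue;
        }
        $big5 = hexdec(trim($one_line_arr[0]));
        $unicode = trim($one_line_arr[1]);
        // If the Unicode column holds a comma-separated list,
        // keep the first entry and strip a leading '<'
        if(strpos($unicode,',')) {
            $unicode = ltrim(explode(',',$unicode)[0],'<');
        }
        $big5_unicode_arr[$big5] = hexdec($unicode);
    }
    fclose($fp);
    return $big5_unicode_arr;
}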
// Write content to the tab file in append mode
function putContent($content) {
    // The handle is opened once and reused across calls
    static $fp;
    if(!isset($fp)) {
        $fp = fopen( './big5-unicode-new.tab', 'a+' );
    }
    fwrite($fp,$content);
}
// Generate the tab file
function writebig5UnicodeFile() {
    $big5_unicode_arr = loadBig5();
    $big5_unicod_content = array();
    $min = 2000;
    $max = 0;
    $max_unicode = 0;
    foreach($big5_unicode_arr as $big5 => $unicode) {
        // Split the Big5 code into its lead byte and trail byte
        $h = intdiv($big5, 256);
        $l = $big5%256;
        // Each entry takes 3 bytes; lead bytes start at 135 (0x87)
        $index = ($h-135)*256*3+$l*3;
        if($index<$min) {
            $min = $index;
        }
        if($max<$index) {
            $max = $index;
        }
        if($unicode>$max_unicode) {
            $max_unicode = $unicode;
        }
        // Store the code point as 3 big-endian bytes
        $h_1 = intdiv($unicode, 65536);
        $h_2 = intdiv($unicode, 256) % 256;
        $h_3 = $unicode%256;
        $big5_unicod_content[$index] = chr($h_1).chr($h_2).chr($h_3);
    }
    // Fill unmapped slots with three zero bytes so that offsets stay aligned
    for($i=0;$i<=$max;$i=$i+3) {
        if(!isset($big5_unicod_content[$i])) {
            $big5_unicod_content[$i] = chr(0).chr(0).chr(0);
        }
    }
    // Write the entries in offset order
    for($i=0;$i<=$max;$i=$i+3) {
        if(strlen($big5_unicod_content[$i]) == 3) {
            putContent($big5_unicod_content[$i]);
        }else{
            die('error');
        }
    }
}
// Test the conversion result
function testCode() {
    $content = file_get_contents( './temlate_2.html');
    echo b2u($content);
}
// Print the characters stored in the encoding table
function printfCode() {
    $fp = fopen( './big5-unicode-new.tab', 'r' );
    $len = filesize('./big5-unicode-new.tab');
    $outstr = array();
    // fseek( $fp, 21000 - 900 + 42*3);
    for($i=$x=0;$i<$len;$i=$i+3) {
        // Each entry is a code point stored as 3 big-endian bytes
        $uni = fread( $fp, 3 );
        $codenum = ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2]);
        if($codenum == 0) {
            // Unmapped slot
            $outstr[$x++] = ' ';
        }elseif( $codenum < 0x80 ) {
            // 1-byte UTF-8 sequence
            $outstr[$x++] = chr($codenum);
        }elseif($codenum < 0x800) {
            // 2-byte UTF-8 sequence
            $outstr[$x++] = chr( 192 + floor($codenum / 64 ));
            $outstr[$x++] = chr( 128 + $codenum % 64 );
        }elseif($codenum < 0x10000){
            // 3-byte UTF-8 sequence
            $outstr[$x++] = chr( 224 + floor($codenum / 4096 ));
            $codenum = $codenum%4096;
            $outstr[$x++] = chr( 128 + floor($codenum / 64 ));
            $outstr[$x++] = chr( 128 + ($codenum % 64) );
        }else{
            // 4-byte UTF-8 sequence
            $outstr[$x++] = chr( 240 + floor($codenum / 262144 ));
            $codenum = $codenum%262144;
            $outstr[$x++] = chr( 128 + floor($codenum / 4096 ));
            $codenum = $codenum%4096;
            $outstr[$x++] = chr( 128 + floor($codenum / 64 ));
            $outstr[$x++] = chr( 128 + ($codenum % 64) );
        }
    }
    fclose($fp);
    echo join( '', $outstr);
}
// Convert Big5 to UTF-8
function b2u( $instr ) {
    $fp = fopen( './big5-unicode-new.tab', 'r' );
    $len = strlen($instr);
    // Must be an array: join() below does not accept a string
    $outstr = array();
    for( $i = $x = 0 ; $i < $len ; $i++ ) {
        $h = ord($instr[$i]);
        if( $h >= 135 ) {
            // Lead byte of a 2-byte Big5 character; a lead byte at the very
            // end of the input has no trail byte and is treated as unmapped
            $l = ($i+1 < $len) ? ord($instr[$i+1]) : 0;
            fseek( $fp, ($h-135)*256*3+$l*3 );
            $uni = fread( $fp, 3 );
            // A short read means the offset is past the end of the table
            $codenum = (strlen($uni) == 3)
                ? ord($uni[0])*65536 + ord($uni[1])*256 + ord($uni[2])
                : 0;
            if($codenum == 0) {
                // Unmapped character
                $outstr[$x++] = ' ';
            }elseif( $codenum < 0x80 ) {
                $outstr[$x++] = chr($codenum);
            }elseif($codenum < 0x800) {
                $outstr[$x++] = chr( 192 + floor($codenum / 64 ));
                $outstr[$x++] = chr( 128 + $codenum % 64 );
            }elseif($codenum < 0x10000){
                $outstr[$x++] = chr( 224 + floor($codenum / 4096 ));
                $codenum = $codenum%4096;
                $outstr[$x++] = chr( 128 + floor($codenum / 64 ));
                $outstr[$x++] = chr( 128 + ($codenum % 64) );
            }else{
                $outstr[$x++] = chr( 240 + floor($codenum / 262144 ));
                $codenum = $codenum%262144;
                $outstr[$x++] = chr( 128 + floor($codenum / 4096 ));
                $codenum = $codenum%4096;
                $outstr[$x++] = chr( 128 + floor($codenum / 64 ));
                $outstr[$x++] = chr( 128 + ($codenum % 64) );
            }
            // Skip the trail byte that was just consumed
            $i++;
        }
        else {
            // Single-byte character: copy it through unchanged
            $outstr[$x++] = $instr[$i];
        }
    }
    fclose($fp);
    return join( '', $outstr);
}
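printfCode() and b2u() repeat the same two calculations: the byte offset of a Big5 character in the tab file, and the encoding of a Unicode code point as UTF-8 bytes. The sketch below factors both into small helpers using bit operations instead of division. The function names tabOffset() and codepointToUtf8() are assumptions made for illustration and do not appear in the original script.

```php
<?php
// Illustrative helpers; the names are assumptions, not part of the
// original script.

// Byte offset of a Big5 character in the tab file: 3 bytes per entry,
// 256 entries per lead byte, lead bytes starting at 135 (0x87).
function tabOffset(int $lead, int $trail): int
{
    return ($lead - 135) * 256 * 3 + $trail * 3;
}

// Encode one Unicode code point as a UTF-8 byte string.
function codepointToUtf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo tabOffset(0xA4, 0x40), "\n";            // 22464
echo bin2hex(codepointToUtf8(0x4E00)), "\n"; // e4b880
```

With helpers like these, both the lookup loop in b2u() and the dump loop in printfCode() could call codepointToUtf8() rather than carrying their own copies of the branch logic.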