The iconv library converts text between character sets and is one of the basic libraries that PHP programs rely on.
1. Download the libiconv function library http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.9.2.tar.gz;
2. Decompress tar -zxvf libiconv-1.9.2.tar.gz ;
3. Install libiconv
#cd libiconv-1.9.2
#./configure --prefix=/usr/local/iconv
#make
#make install
4. Recompile php and add compilation parameters --with-ic/local/iconv
Under windows
Recently, while writing a scraper, I needed the iconv function to convert captured UTF-8 pages to GB2312. After transcoding the captured data with iconv, part of it went missing for no apparent reason. After some searching online, I found that this is a known problem with iconv: it raises an error when converting the character "—" to GB2312, and the output is cut off at that point.
The fix is simple: append "//IGNORE" to the target encoding, which is the second parameter of the iconv function:
iconv("UTF-8","GB2312//IGNORE",$data)
IGNORE tells iconv to skip characters that cannot be converted. Without it, the character that fails and everything after it in the string is lost.
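A minimal sketch of this behaviour, assuming the iconv extension is loaded. The helper name convert_or_ignore() is hypothetical and not part of PHP. It tries a strict conversion first and retries with //IGNORE only when that fails, so the caller can tell whether characters were dropped. Which characters actually fail depends on the system's iconv implementation.

```php
<?php
// Hypothetical helper: strict conversion first, //IGNORE only as a fallback.
// Returns array(converted string or false, bool lossy).
function convert_or_ignore($data, $from, $to)
{
    // @ suppresses the notice iconv raises on an unconvertible character.
    $strict = @iconv($from, $to, $data);
    if ($strict !== false) {
        return array($strict, false);   // everything converted cleanly
    }
    // Retry, discarding characters the target charset cannot represent.
    $lossy = @iconv($from, $to . '//IGNORE', $data);
    return array($lossy, true);
}

list($out, $lossy) = convert_or_ignore("caf\xC3\xA9", 'UTF-8', 'ISO-8859-1');
echo $lossy ? "some characters were dropped\n" : "converted without loss\n";
```

Reporting the lossy flag separately makes silent data loss visible, which is the problem described above.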
<?php
echo $str = 'Hello, we sell coffee here!';
echo "\n";
echo iconv('GB2312', 'UTF-8', $str); // Convert the string encoding from GB2312 to UTF-8
echo "\n";
echo iconv_substr($str, 1, 1, 'UTF-8'); // Extract by character count rather than byte count
print_r(iconv_get_encoding()); // Get the current iconv encoding settings
echo iconv_strlen($str, 'UTF-8'); // Get the string length in characters for the given encoding
// It can also be used like this
$content = iconv("UTF-8", "gbk//TRANSLIT", $content);
?>
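The string in the example above is plain ASCII, so byte counts and character counts are identical. A small sketch with a multibyte string shows the difference more clearly. The sample word is an assumption chosen for illustration.

```php
<?php
// "café" encoded as UTF-8: the final "é" occupies two bytes (0xC3 0xA9).
$word = "caf\xC3\xA9";

$bytes = strlen($word);                        // counts bytes
$chars = iconv_strlen($word, 'UTF-8');         // counts characters
$last  = iconv_substr($word, 3, 1, 'UTF-8');   // fourth character, kept whole

echo "bytes: $bytes, characters: $chars\n";    // bytes: 5, characters: 4
echo bin2hex($last), "\n";                     // c3a9
```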
iconv is not a core PHP function; it is provided by an extension, which must be installed and enabled before the function can be used.
On Windows 2000 with PHP, edit php.ini and remove the ";" in front of extension=php_iconv.dll. You also need to copy iconv.dll from the original PHP installation package to winnt/system32 (if that is the directory your DLLs are loaded from).
On Linux, build PHP statically and add the --with-iconv option when running configure. Afterwards, phpinfo() should show an iconv section. (Linux7.3+Apache4.06+php4.3.2)
Download: ftp://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.8.tar.gz
Installation:
#cp libiconv-1.8.tar.gz /usr/local/src
#cd /usr/local/src
#tar zxvf libiconv-1.8.tar.gz
#cd libiconv-1.8
#./configure --prefix=/usr/local/libiconv
#make
#make install
Compile php
#./configure --prefix=/usr/local/php4.3.2 --with-ic/local/libiconv/
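After rebuilding PHP, it can help to confirm from a script that the extension is present. The following is a small sketch; the function name iconv_status() is hypothetical, while extension_loaded() and the ICONV_IMPL and ICONV_VERSION constants are provided by PHP and the iconv extension.

```php
<?php
// Report whether the iconv extension is available in this PHP build.
function iconv_status()
{
    if (!extension_loaded('iconv')) {
        return 'iconv extension is not loaded';
    }
    // ICONV_IMPL and ICONV_VERSION are constants defined by the extension.
    return 'iconv loaded: ' . ICONV_IMPL . ' ' . ICONV_VERSION;
}

echo iconv_status(), "\n";
```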
Simple example of use:
<?php
echo iconv("gb2312","ISO-8859-1","we");
?>
Introduction to mb_convert_encoding and iconv functions in PHP
mb_convert_encoding converts a string from one encoding to another. I did not use to understand the concept of encodings in programs, but it has started to make sense.
English text rarely runs into encoding problems; they mostly show up with Chinese data. For example, suppose you write a program in Zend Studio or EditPlus with the file saved as GBK. If the data is then stored in a database whose encoding is utf8, it must be converted first, otherwise it will be garbled in the database.
See the official usage of mb_convert_encoding:
http://cn.php.net/manual/zh/function.mb-convert-encoding.php
GBK to UTF-8:
<?php
header("Content-Type: text/html; charset=utf-8");
echo mb_convert_encoding("You are my friend", "UTF-8", "GBK");
?>
GB2312 to Big5:
<?php
header("Content-Type: text/html; charset=big5");
echo mb_convert_encoding("You are my friend", "big5", "GB2312");
?>
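A string literal carries whatever bytes the source file was saved with, so the result of the examples above depends on the editor's encoding. The sketch below avoids that by spelling out the input bytes. It assumes the mbstring extension is enabled, and the wrapper name gbk_to_utf8() is hypothetical.

```php
<?php
// Hypothetical wrapper around mb_convert_encoding for GBK input.
function gbk_to_utf8($gbkBytes)
{
    return mb_convert_encoding($gbkBytes, 'UTF-8', 'GBK');
}

// The two characters 中文 as raw GBK bytes: D6 D0 CE C4.
$gbk  = "\xD6\xD0\xCE\xC4";
$utf8 = gbk_to_utf8($gbk);

echo bin2hex($utf8), "\n";   // e4b8ade69687
```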
To use the function above, the mbstring extension must be enabled first.
PHP's iconv function also converts string encodings and works much like the function above.
There are some detailed examples below:
iconv — Convert string to requested character encoding
(PHP 4 >= 4.0.5, PHP 5)
mb_convert_encoding — Convert character encoding
(PHP 4 >= 4.0.6, PHP 5)
Usage:
string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] )
The mbstring extension must be enabled first: remove the ";" in front of extension=php_mbstring.dll in php.ini.
mb_convert_encoding accepts several candidate input encodings and picks one automatically based on the content, but it runs much more slowly than iconv.
string iconv (string in_charset, string out_charset, string str)
Note: besides naming the target encoding, the second parameter can take one of two suffixes: //TRANSLIT and //IGNORE. //TRANSLIT replaces characters that cannot be converted directly with one or more approximate characters. //IGNORE drops characters that cannot be converted. With neither suffix, the string is truncated at the first illegal character.
Returns the converted string or FALSE on failure.
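To compare the modes described above on your own system, a sketch like the following can be used. The helper convert_variants() and the sample string are assumptions for illustration. No expected output is shown because the results of //TRANSLIT and //IGNORE depend on the underlying iconv implementation.

```php
<?php
// Run the same conversion in the three modes described above.
// Each entry is the converted string, or false if iconv reported failure.
function convert_variants($str, $from, $to)
{
    return array(
        'strict'   => @iconv($from, $to, $str),
        'translit' => @iconv($from, $to . '//TRANSLIT', $str),
        'ignore'   => @iconv($from, $to . '//IGNORE', $str),
    );
}

// The euro sign (UTF-8: E2 82 AC) has no ISO-8859-1 equivalent.
$results = convert_variants("price: \xE2\x82\xAC5", 'UTF-8', 'ISO-8859-1');
foreach ($results as $mode => $out) {
    // Compare with === so an empty string is not mistaken for failure.
    echo $mode, ': ', ($out === false ? 'FALSE' : $out), "\n";
}
```

Checking the return value with === false matters because a valid result can be an empty string.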
Use:
iconv was found to fail when converting the character "—" to gb2312. Without the IGNORE suffix, everything after this character is lost. Either way, the "—" itself cannot be converted and does not appear in the output. mb_convert_encoding does not have this problem.
In general, use iconv. Use mb_convert_encoding only when the source encoding cannot be determined, or when the output of iconv does not display correctly.
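One way to express this rule of thumb in code is sketched below. The helper to_utf8() is hypothetical; it tries iconv first and falls back to mb_convert_encoding only if iconv reports failure and mbstring is available.

```php
<?php
// Hypothetical helper following the rule of thumb above:
// try iconv first, fall back to mb_convert_encoding if it fails.
function to_utf8($str, $from)
{
    $out = @iconv($from, 'UTF-8', $str);
    if ($out !== false) {
        return $out;
    }
    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($str, 'UTF-8', $from);
    }
    return false;   // neither route worked
}

// 0xE9 is "é" in ISO-8859-1; in UTF-8 it becomes the two bytes C3 A9.
echo bin2hex(to_utf8("\xE9", 'ISO-8859-1')), "\n";   // c3a9
```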
from_encoding names the encoding of the string before conversion. It can be an array or a comma-separated string of encoding names. If it is not specified, the internal encoding is used.
/* Auto detect encoding from JIS, eucjp-win, sjis-win, then convert str to UCS-2LE */
$str = mb_convert_encoding($str, "UCS-2LE", "JIS, eucjp-win, sjis-win");
/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
$str = mb_convert_encoding($str, "EUC-JP", "auto");
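The candidate list can also be passed as an array rather than a comma-separated string. A minimal sketch, assuming mbstring is enabled; the candidate encodings and input bytes are chosen only for illustration.

```php
<?php
// Candidate source encodings for detection, given as an array.
$candidates = array('ASCII', 'UTF-8');

// The bytes E4 B8 AD are the UTF-8 form of one CJK character and are not
// valid ASCII, so only the UTF-8 candidate fits.
$out = mb_convert_encoding("\xE4\xB8\xAD", 'UTF-8', $candidates);

echo bin2hex($out), "\n";   // e4b8ad
```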
Example:
$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "GBK");
Parameters that are easily overlooked when using the iconv function in php
While processing some content today, I used iconv to convert encodings and found that the result was cut short. I guessed it was a character set problem and looked for a way to skip characters that do not exist in the target character set. The manual shows that iconv takes only three parameters, which did not seem to help. Searching online, I found people saying it was possible, but not how. Eventually I noticed that the English description mentions appending a flag, "TRANSLIT", to the target encoding. How to append it was not obvious: it turns out the flag must be preceded by "//", which is an odd design.
Prototype: $txtContent = iconv("utf-8",'GBK',$txtContent);
Special parameters: iconv("UTF-8","GB2312//IGNORE",$data)
Two optional suffixes: TRANSLIT and IGNORE (IGNORE skips anything that cannot be converted).
Description
string iconv ( string in_charset, string out_charset, string str )
Performs a character set conversion on the string str from in_charset to out_charset. Returns the converted string or FALSE on failure.
If you append the string //TRANSLIT to out_charset transliteration is activated. This means that when a character can't be represented in the target charset, it can be approximated through one or several similarly looking characters. If you append the string //IGNORE, characters that cannot be represented in the target charset are silently discarded. Otherwise, str is cut from the first illegal character.