MySQL and PHP: Cyrillic Characters with UTF-8 Encoding
When attempting to store Cyrillic text in a MySQL database, it's crucial to ensure proper character encoding to avoid data corruption. The issue you encountered is likely related to a character encoding mismatch between PHP and MySQL.
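To make the mismatch concrete, here is a minimal sketch that compares the word "Привет" as UTF-8 bytes and as Windows-1251 bytes. Windows-1251 is only an assumed example of a legacy Cyrillic encoding, and the helper name is_valid_utf8 is hypothetical. When one layer sends one byte sequence and another layer expects the other, the text is misread.

```php
<?php
// "Привет" encoded as UTF-8: two bytes per Cyrillic letter.
$utf8 = "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";

// The same word encoded as Windows-1251 (assumed legacy encoding): one byte per letter.
$cp1251 = "\xCF\xF0\xE8\xE2\xE5\xF2";

// Hypothetical helper: with the /u modifier, preg_match() returns false
// when the subject is not valid UTF-8, and 1 when the empty pattern matches.
function is_valid_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

var_dump(is_valid_utf8($utf8));   // bool(true)
var_dump(is_valid_utf8($cp1251)); // bool(false)
```

A check like this can help locate the layer where text stops being valid UTF-8.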
To resolve this issue, verify that every layer, from the PHP source file to the database connection, is configured for UTF-8. Here are the key factors to check:
- PHP File Encoding: Save the PHP script as UTF-8 without a BOM (byte order mark).
- HTML Header: Declare the character set in the document head with <meta charset="utf-8">.
- PHP Output Encoding: Call header('Content-Type: text/html; charset=utf-8') before any output is sent, so the response is declared as UTF-8.
- MySQL Database and Table Encoding: Set the database and table character sets to utf8 with the ALTER DATABASE and ALTER TABLE commands.
- Connection Charset: After connecting, set the charset of the mysqli connection with mysqli_set_charset($conn, 'utf8').
- JSON Encoding: If you use json_encode(), consider passing the JSON_UNESCAPED_UNICODE flag so that Cyrillic characters are written as-is instead of as \uXXXX escape sequences.
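As a small illustration of the JSON point, the sketch below encodes the same array with and without JSON_UNESCAPED_UNICODE. The array contents are made up for the example.

```php
<?php
// Sample data for illustration only.
$data = ['greeting' => 'Привет'];

// Default behaviour: non-ASCII characters become \uXXXX escapes.
$escaped = json_encode($data);

// With the flag: Cyrillic characters are kept as literal UTF-8.
$readable = json_encode($data, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n";  // {"greeting":"\u041f\u0440\u0438\u0432\u0435\u0442"}
echo $readable, "\n"; // {"greeting":"Привет"}
```

Both strings are valid JSON and decode to the same value, so the flag mainly affects readability and payload size.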
Additionally, remember that all components in your application, including HTML, PHP, and MySQL, must use consistent encoding settings. If any step is out of sync, character issues may arise.
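The database-side settings can be sketched in PHP roughly as follows. The helper names (quote_ident, charset_statements, open_utf8_connection), the credentials, and the database and table names are all hypothetical. The sketch also assumes utf8mb4 as the default charset so that it covers emojis too; pass 'utf8' and a utf8 collation to follow the settings described above literally.

```php
<?php
// Hypothetical helper: quote a MySQL identifier with backticks.
function quote_ident(string $name): string
{
    return '`' . str_replace('`', '``', $name) . '`';
}

// Hypothetical helper: build the ALTER statements for one database and one table.
// $charset and $collation must be trusted constants, never user input.
function charset_statements(
    string $db,
    string $table,
    string $charset = 'utf8mb4',
    string $collation = 'utf8mb4_unicode_ci'
): array {
    return [
        sprintf('ALTER DATABASE %s CHARACTER SET %s COLLATE %s',
            quote_ident($db), $charset, $collation),
        // CONVERT TO also rewrites existing text columns, not just the table default.
        sprintf('ALTER TABLE %s CONVERT TO CHARACTER SET %s COLLATE %s',
            quote_ident($table), $charset, $collation),
    ];
}

// Hypothetical helper: open a mysqli connection and set its charset.
function open_utf8_connection(
    string $host, string $user, string $pass, string $db,
    string $charset = 'utf8mb4'
): mysqli {
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
    $conn = new mysqli($host, $user, $pass, $db);
    $conn->set_charset($charset);
    return $conn;
}

// Print the statements so they can be reviewed before running them.
foreach (charset_statements('app_db', 'messages') as $sql) {
    echo $sql, ";\n";
}

// Possible usage in a page script (not executed here):
// header('Content-Type: text/html; charset=utf-8');
// $conn = open_utf8_connection('localhost', 'app_user', 'secret', 'app_db');
// foreach (charset_statements('app_db', 'messages') as $sql) { $conn->query($sql); }
```

Printing the statements first is a deliberate choice: converting a table rewrites its data, so it is worth reviewing the SQL and taking a backup before executing it.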
Note:
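- The encoding name is written with a hyphen (utf-8) in HTML and in PHP headers, but without one (utf8) in MySQL, including the name passed to mysqli_set_charset().
- In MySQL, a collation is distinct from a character set: the character set defines how text is stored, while the collation defines how it is compared and sorted. Set the character set to utf8 and choose a matching collation such as utf8_general_ci or utf8_unicode_ci.
- For emojis, use the utf8mb4 charset instead of utf8 in MySQL. MySQL's utf8 stores at most three bytes per character, and emojis need four.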
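The byte counts behind the emoji note can be checked with a short sketch. strlen() counts bytes, not characters, so it shows how many bytes each character occupies in UTF-8. The sample characters are chosen only for illustration.

```php
<?php
// Code points written as escapes so the result does not depend on file encoding.
$samples = [
    'Latin Z'    => "\u{5A}",    // U+005A
    'Cyrillic Я' => "\u{42F}",   // U+042F
    'Euro sign'  => "\u{20AC}",  // U+20AC
    'Emoji'      => "\u{1F600}", // U+1F600
];

$byteLengths = [];
foreach ($samples as $label => $char) {
    // strlen() returns the number of bytes in the string.
    $byteLengths[$label] = strlen($char);
    echo $label, ': ', $byteLengths[$label], " byte(s)\n";
}
```

Cyrillic letters take two bytes and so fit in either charset, while the four-byte emoji only fits in utf8mb4.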