PHP Decoding and Encoding JSON with Unicode Characters
Unicode Character Decoding Issues
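When decoding JSON that contains Unicode characters, json_decode can fail if the string contains characters that JSON does not permit in unescaped form. Specifically, the double quote (") and the backslash (\) are prohibited from appearing unescaped in JSON strings. Additionally, raw control characters are not allowed. Any other Unicode character is permitted, but json_decode only works with UTF-8 encoded input.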
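As a rough illustration of these rules, the sketch below decodes three inputs and reports the failure reason when decoding fails. The helper name decode_or_explain is hypothetical, and json_last_error_msg assumes PHP 5.5 or later.

```php
<?php
// Hypothetical helper: decode JSON, or return the reason decoding failed.
function decode_or_explain($json)
{
    $data = json_decode($json, true);
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        return 'error: ' . json_last_error_msg();
    }
    return $data;
}

$valid   = "{\"Tag\":\"Od\xC3\xB3metro\"}"; // "ó" as the UTF-8 bytes C3 B3
$latin1  = "{\"Tag\":\"Od\xF3metro\"}";     // "ó" as the single ISO-8859-1 byte F3
$control = "{\"Tag\":\"line1\nline2\"}";    // raw newline inside a JSON string

var_dump(decode_or_explain($valid));   // decodes to an array
var_dump(decode_or_explain($latin1));  // fails: input is not valid UTF-8
var_dump(decode_or_explain($control)); // fails: unescaped control character
```

Checking json_last_error rather than only testing for null matters because the JSON literal null also decodes to null.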
UTF-8 Encoding and Decoding
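If decoding fails because the input is not valid UTF-8 (for example, it is ISO-8859-1), you can pass it through utf8_encode so that the string can be decoded with json_decode. However, utf8_encode assumes ISO-8859-1 input, so applying it to a string that is already UTF-8 mangles the characters. For instance, "Odómetro" would be converted to "OdÃ³metro". Note that utf8_encode is deprecated as of PHP 8.2.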
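The mangling can be reproduced, and avoided, with a small sketch like the one below. The helper name ensure_utf8 is hypothetical, and it assumes that any input which is not valid UTF-8 is ISO-8859-1, which may not hold for your data. It uses iconv to perform the same ISO-8859-1 to UTF-8 conversion that utf8_encode does.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not already
// valid UTF-8. Assumes non-UTF-8 input is ISO-8859-1 (an assumption).
function ensure_utf8($s)
{
    // With the /u modifier, preg_match fails on subjects that are not valid UTF-8.
    if (preg_match('//u', $s) === 1) {
        return $s;
    }
    return iconv('ISO-8859-1', 'UTF-8', $s);
}

$utf8   = "Od\xC3\xB3metro"; // already UTF-8
$latin1 = "Od\xF3metro";     // ISO-8859-1

// Converting unconditionally double-encodes the UTF-8 string.
$mangled = iconv('ISO-8859-1', 'UTF-8', $utf8);
echo bin2hex($mangled), "\n"; // bytes C3 83 C2 B3 in place of C3 B3

// Converting conditionally leaves UTF-8 alone and fixes the ISO-8859-1 input.
echo bin2hex(ensure_utf8($utf8)), "\n";
echo bin2hex(ensure_utf8($latin1)), "\n";
```

Both calls to ensure_utf8 yield the same bytes, so the result can be handed to json_decode without double encoding.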
Re-Encoding and Character Escaping
Upon re-encoding the array with json_encode, the character is escaped to ASCII, which is correct according to the JSON spec: "Tag"=>"Od\u00f3metro". To keep the character unescaped, you can pass the JSON_UNESCAPED_UNICODE option to json_encode. However, this option is only available in PHP 5.4 or later.
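A minimal sketch of the difference, assuming PHP 5.4 or later and a sample array that holds the "Tag" value from the example:

```php
<?php
$array = array('Tag' => "Od\xC3\xB3metro"); // "Odómetro" as UTF-8

$escaped   = json_encode($array);
$unescaped = json_encode($array, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n";   // {"Tag":"Od\u00f3metro"}
echo $unescaped, "\n"; // {"Tag":"Odómetro"}

// Both forms are valid JSON and decode to the same value.
var_dump(json_decode($escaped, true) === json_decode($unescaped, true));
```

The escaped form is pure ASCII and therefore safe in any transport, while the unescaped form is shorter and easier to read.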
Alternative Solution Using Regex
If you're limited to PHP 5.3, you can employ a regex-based solution:
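$json = json_encode($array);
// Replace escaped Unicode characters with their UTF-8 equivalents
$json = preg_replace_callback('/\\\\u([0-9a-fA-F]{4})/', function ($m) {
    return mb_convert_encoding(pack('H*', $m[1]), 'UTF-8', 'UTF-16BE');
}, $json);
The JSON_UNESCAPED_SLASHES flag is omitted here because, like JSON_UNESCAPED_UNICODE, it was only added in PHP 5.4. Forward slashes (/) are therefore escaped as \/, which is still valid JSON. The regex pattern matches escaped Unicode characters (\u####), and the callback converts each four-digit hex code to its UTF-8 equivalent with mb_convert_encoding, which requires the mbstring extension. This simple pattern does not handle surrogate pairs (characters outside the Basic Multilingual Plane), nor data that contains a literal backslash followed by "u".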
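A somewhat more defensive variant of the regex approach can be sketched as follows. The function name json_unescape_unicode is hypothetical and not part of PHP. The sketch assumes the input is the output of json_encode, and it delegates the conversion to json_decode instead of mbstring.

```php
<?php
// Hypothetical helper (not part of PHP): turn \uXXXX escapes in a JSON string
// back into literal UTF-8 characters on runtimes without JSON_UNESCAPED_UNICODE.
function json_unescape_unicode($json)
{
    // Alternative 1 consumes an escaped backslash (\\) so that its second
    // backslash is never mistaken for the start of a \u escape.
    // Alternative 2 captures a run of \uXXXX escapes, which keeps surrogate
    // pairs such as \ud83d\ude00 together.
    $pattern = '/\\\\\\\\|((?:\\\\u[0-9a-fA-F]{4})+)/';

    return preg_replace_callback($pattern, function ($m) {
        if (!isset($m[1]) || $m[1] === '') {
            return $m[0]; // escaped backslash: leave untouched
        }
        // Let json_decode interpret the escapes as a JSON string literal.
        $decoded = json_decode('"' . $m[1] . '"');
        // Keep the escapes if decoding failed, or if the result contains
        // characters that must stay escaped inside a JSON string.
        if (!is_string($decoded) || preg_match('/[\x00-\x1F"\\\\]/', $decoded)) {
            return $m[0];
        }
        return $decoded;
    }, $json);
}

$json = json_encode(array('Tag' => "Od\xC3\xB3metro"));
echo json_unescape_unicode($json), "\n";
```

Runs that would decode to a control character, a double quote, or a backslash are deliberately left escaped, so the result should remain valid JSON.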