MySQL's Treatment of Unicode Characters: Ä, Ö, and Ü
Searching for "Harligt" and for "Härligt" in MySQL can return identical results, which is puzzling at first. This behavior stems from MySQL's default collation settings, which treat certain Unicode characters as equal when comparing strings.
MySQL's non-language-specific Unicode collations, such as utf8_general_ci and utf8_unicode_ci, treat certain characters as equivalent, namely:
Ä = A
Ö = O
Ü = U
Because these collations are also case-insensitive, the same holds for the lowercase forms. As a result, "Harligt" and "Härligt" compare as equal, and both queries return the same rows.
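The equivalence can be checked directly by comparing the two strings under different collations. This is a minimal sketch: the CONVERT(... USING utf8) calls are only there so that the utf8 collations apply regardless of the connection character set, and the column aliases are illustrative.

```sql
-- Compare the same two strings under an accent-folding and a binary collation.
-- COLLATE binds to the right-hand operand and decides how the comparison is done.
SELECT
  CONVERT('Harligt' USING utf8) = CONVERT('Härligt' USING utf8) COLLATE utf8_general_ci AS equal_general_ci, -- 1: ä is compared as a
  CONVERT('Harligt' USING utf8) = CONVERT('Härligt' USING utf8) COLLATE utf8_bin        AS equal_bin;        -- 0: the bytes differ
```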
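There are two ways to resolve this: define the column with a collation that treats these characters as distinct, such as utf8_bin, or specify that collation only for the query in question: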
<code class="sql">select * from topics where name='Harligt' COLLATE utf8_bin;</code>
With utf8_bin, strings are compared by their binary values, so Ä is no longer matched by A. The trade-off is that the search also becomes case-sensitive.
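If the binary comparison should apply to every query against the column, the collation can be set on the column itself instead of per query. The following is a sketch under assumptions: the column type VARCHAR(255) is hypothetical, and MODIFY must restate the column's real definition (type, NULL-ability, default), so adjust it to the actual schema.

```sql
-- Change the collation of topics.name to utf8_bin (VARCHAR(255) is an assumed type).
ALTER TABLE topics
  MODIFY name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin;

-- Plain comparisons on the column now use utf8_bin,
-- so this no longer matches rows named 'Härligt' (or 'harligt').
SELECT * FROM topics WHERE name = 'Harligt';
```

Changing the column affects sorting and any index on name as well, so the per-query COLLATE clause is the less invasive choice when only a few queries need the distinction.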
If a case-insensitive search is desired but without the character conversion, the utf8 collations in older MySQL versions do not provide a suitable general-purpose option. MySQL 8.0 added accent-sensitive, case-insensitive collations for the utf8mb4 character set, such as utf8mb4_0900_as_ci, which cover this case.
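On MySQL 8.0 or later, such a comparison can be tried with the utf8mb4_0900_as_ci collation. This is a minimal sketch that compares literals only. To use the collation in a query against topics.name, the column would have to use the utf8mb4 character set, which is an assumption about the schema.

```sql
-- Accent-sensitive, case-insensitive comparison (requires MySQL 8.0 or later).
SELECT
  CONVERT('härligt' USING utf8mb4) = CONVERT('Härligt' USING utf8mb4) COLLATE utf8mb4_0900_as_ci AS case_only_differs, -- 1: case is ignored
  CONVERT('Harligt' USING utf8mb4) = CONVERT('Härligt' USING utf8mb4) COLLATE utf8mb4_0900_as_ci AS accent_differs;    -- 0: ä and a are distinct
```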