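Chinese-language websites generally use one of two encodings: gbk/gb2312 or utf-8.
Under gbk, each Chinese character occupies 2 bytes. For example, with the script saved as gbk: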
<code>$zhStr = '您好,中国!'; echo strlen($zhStr); // Output: 12</code>
Under utf-8, each Chinese character occupies 3 bytes. For example, with the script saved as utf-8:
<code>$zhStr = '您好,中国!'; echo strlen($zhStr); // Output: 18</code>
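Both byte counts can be reproduced in a single script by re-encoding the same string. This is only a minimal sketch: it assumes the mbstring extension is enabled and that the file itself is saved as utf-8, and the variable name `$gbkStr` is just illustrative.

```php
<?php
// Assumption: this source file is saved as UTF-8 and ext/mbstring is loaded.
$zhStr = '您好,中国!'; // 6 characters, including full-width punctuation

// Re-encode the same text as GBK.
$gbkStr = mb_convert_encoding($zhStr, 'GBK', 'UTF-8');

echo strlen($gbkStr), "\n"; // 12 bytes: 2 per character in GBK
echo strlen($zhStr), "\n";  // 18 bytes: 3 per character in UTF-8
```

Note that strlen() counts bytes, not characters, which is why the same text yields two different numbers.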
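So how do we compute the length of a Chinese string like this? Some might suggest taking the byte length and dividing it by 2 under gbk, or by 3 under utf-8. But real strings are not that well-behaved: 99% of the time they mix Chinese and English characters.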
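To see why dividing the byte length breaks down on mixed text, here is a small sketch. It assumes a utf-8 source file and the mbstring extension; the variable names are illustrative only.

```php
<?php
// Assumption: this source file is saved as UTF-8 and ext/mbstring is loaded.
$str = 'Hello,中国!'; // 5 ASCII letters + 4 full-width characters

$bytes  = strlen($str);             // 5 * 1 + 4 * 3 = 17 bytes
$naive  = intdiv($bytes, 3);        // 5, the "divide by 3" guess
$actual = mb_strlen($str, 'UTF-8'); // 9 characters

echo "bytes=$bytes naive=$naive actual=$actual\n";
```

The ASCII letters take one byte each, so the byte total is no longer a multiple of the per-character width and the guess undercounts.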
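The following snippet comes from WordPress. The idea is to use a regular expression to split the string into individual characters and then count them; that count is the length of the string. The code (which can only handle utf-8 strings) is: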
<code>$zhStr = '您好,中国!';
$str = 'Hello,中国!';

// Count the characters in a UTF-8 string
function utf8_strlen($string = null) {
    // Split the string into individual characters
    preg_match_all("/./us", $string, $match);
    // Return the number of characters
    return count($match[0]);
}

echo utf8_strlen($zhStr); // Output: 6
echo utf8_strlen($str); // Output: 9</code>
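Because the `u` modifier makes the pattern treat its subject as utf-8, the regex approach above cannot count bytes that are not valid utf-8, such as gbk text. The sketch below is one possible way to make that failure explicit; `utf8_strlen_checked` is a hypothetical name, not part of WordPress, and the mbstring extension is assumed.

```php
<?php
// Hypothetical variant of utf8_strlen() that reports invalid input
// instead of silently returning a misleading count.
function utf8_strlen_checked(string $string): ?int {
    // Reject byte sequences that are not valid UTF-8.
    if (!mb_check_encoding($string, 'UTF-8')) {
        return null;
    }
    // preg_match_all() returns the number of full matches, or false on error.
    $count = preg_match_all('/./us', $string, $match);
    return $count === false ? null : $count;
}

// Assumption: this source file is saved as UTF-8.
var_dump(utf8_strlen_checked('Hello,中国!')); // int(9)

// The same kind of text re-encoded as GBK is rejected.
$gbk = mb_convert_encoding('您好', 'GBK', 'UTF-8');
var_dump(utf8_strlen_checked($gbk)); // NULL
```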
Below is a function I wrote to compute the length of a Chinese string regardless of its encoding:
<code>function count_strlen($string = null) {
    // Detect which encoding the string uses
    $fileType = mb_detect_encoding($string, array('UTF-8', 'GBK', 'LATIN1', 'BIG5'));
    // Count the characters according to the detected encoding
    $length = iconv_strlen($string, $fileType);
    return $length;
}

$str = "中文45汶";
$len = count_strlen($str);
echo $len; // Output: 5</code>
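Encoding detection is a heuristic: a short byte sequence can be valid under more than one of the candidate encodings, so the guess may not match what the author of the text intended. When the encoding is already known (for example from the page's charset), it may be safer to pass it in explicitly. The helper below, `strlen_in`, is a hypothetical sketch of that idea and assumes the mbstring extension.

```php
<?php
// Hypothetical helper: the caller states the encoding instead of guessing it.
function strlen_in(string $string, string $encoding = 'UTF-8'): int {
    // Fail loudly if the bytes do not match the stated encoding.
    if (!mb_check_encoding($string, $encoding)) {
        throw new InvalidArgumentException("String is not valid $encoding");
    }
    return mb_strlen($string, $encoding);
}

// Assumption: this source file is saved as UTF-8.
$utf8 = '中文45汶';
$gbk  = mb_convert_encoding($utf8, 'GBK', 'UTF-8');

echo strlen_in($utf8), "\n";       // 5
echo strlen_in($gbk, 'GBK'), "\n"; // 5
```

The same five characters give the same count in both encodings, even though their byte lengths differ.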