
Processing strings containing Chinese characters in JS

一个新手 (original)
2017-10-12 09:36:16

Scenario:

The length property built into the String type in JS returns the number of characters in a string, but the front end often needs to limit a string's display length instead. One Chinese character occupies the display width of two lowercase English letters, so using the length value to judge display length is often wrong when Chinese and English are mixed.

The conventional solution is to traverse the string, counting each Chinese character as length 2 and each non-Chinese character as length 1, and then limit the display length of the string by the newly calculated sum. See the code below:

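var Tools = {
    // Return true if the string contains at least one Chinese character
    hasZh: function (str) {
        for (var i = 0; i < str.length; i++) {
            // A code unit above 255 is treated as a Chinese character
            if (str.charCodeAt(i) > 255) {
                return true;
            }
        }
        // Only report false after every character has been checked
        return false;
    },
    // Recalculate the length: Chinese +2, English +1
    getlen: function (str) {
        var strlen = 0;
        for (var i = 0; i < str.length; i++) {
            // A Chinese character adds 2 to the length, anything else adds 1
            strlen += str.charCodeAt(i) > 255 ? 2 : 1;
        }
        return strlen;
    },
    // Limit the length: keep leading characters while the display length stays within len
    limitlen: function (str, len) {
        var result = "";
        var strlen = 0;
        for (var i = 0; i < str.length; i++) {
            var width = str.charCodeAt(i) > 255 ? 2 : 1;
            // Stop before a character that would push the length past len
            if (strlen + width > len) {
                break;
            }
            strlen += width;
            result += str.charAt(i);
        }
        return result;
    }
};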
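
A short usage sketch of the approach above. The helper names `charWidth`, `displayWidth` and `truncateToWidth` are hypothetical and restate the same rule compactly so that the snippet runs on its own; the sample string is made up. This version stops before any character that would exceed the limit:

```javascript
// Hypothetical helper: width 2 for code units above 255, otherwise 1.
function charWidth(str, i) {
    return str.charCodeAt(i) > 255 ? 2 : 1;
}

// Sum of the per-character widths.
function displayWidth(str) {
    var total = 0;
    for (var i = 0; i < str.length; i++) {
        total += charWidth(str, i);
    }
    return total;
}

// Keep leading characters while the running width stays within maxWidth.
function truncateToWidth(str, maxWidth) {
    var result = "";
    var used = 0;
    for (var i = 0; i < str.length; i++) {
        var w = charWidth(str, i);
        if (used + w > maxWidth) {
            break;
        }
        used += w;
        result += str.charAt(i);
    }
    return result;
}

var title = "中文abc";                    // made-up sample string
console.log(title.length);               // 5
console.log(displayWidth(title));        // 7
console.log(truncateToWidth(title, 5));  // "中文a"
console.log(truncateToWidth(title, 3));  // "中"
```

Checking the width before appending means a two-column character is dropped rather than allowed to overflow the limit by one column.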

This method relies on Chinese and English characters falling in different Unicode ranges. English characters fit in 1 byte, while Chinese characters need 2 bytes, so the Unicode value of a Chinese character is always greater than 2^8-1 = 255.

The above method can be made more rigorous by checking the actual Unicode range, which can be looked up in the Unicode table.

PS: The Unicode range of Chinese characters is 4E00-9FA5 in hexadecimal, which is 19968-40869 in decimal, so the accurate expression for judging Chinese is:
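
To make the threshold concrete, here is a minimal sketch that prints the `charCodeAt` value of a few sample characters (the samples are chosen for illustration only). It also shows the limitation of the `> 255` rule: a non-Chinese symbol such as the euro sign passes the test as well.

```javascript
// Illustrative samples: two English letters, one Chinese character, one symbol.
var samples = ["a", "Z", "中", "€"];

// UTF-16 code unit of each sample character.
var codes = samples.map(function (ch) {
    return ch.charCodeAt(0);
});
console.log(codes); // [97, 90, 20013, 8364]

// Result of the "> 255" rule for each sample.
var treatedAsChinese = samples.map(function (ch) {
    return ch.charCodeAt(0) > 255;
});
console.log(treatedAsChinese); // [false, false, true, true]
```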

str.charCodeAt(i)>=19968 && str.charCodeAt(i)<=40869
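
The range check can be wrapped in a small helper, sketched below. The name `isChineseAt` is hypothetical, and the bounds are the 4E00-9FA5 range quoted above, written in hexadecimal.

```javascript
// Hypothetical helper: true if the code unit at index i lies in 4E00-9FA5.
function isChineseAt(str, i) {
    var code = str.charCodeAt(i);
    return code >= 0x4E00 && code <= 0x9FA5; // 19968 to 40869 in decimal
}

console.log(isChineseAt("中", 0)); // true
console.log(isChineseAt("a", 0));  // false
console.log(isChineseAt("€", 0));  // false (8364 is outside the range)
console.log(isChineseAt("，", 0)); // false (full-width comma, U+FF0C)
```

Note that the full-width comma is rejected by the strict range even though it is typically drawn as wide as a Chinese character, so the stricter test is not automatically the better choice for measuring display length.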

A less rigorous aside: in practice the code does not need to restrict the encoding range. After all, you never know what strange things users (or testers) will type in.
