Java represents text as Unicode, and a String is stored as a sequence of UTF-16 code units. A char is an unsigned 16-bit value in the range 0-65535, so it can hold 65536 distinct values. That covers the Basic Multilingual Plane, which includes the commonly used Chinese characters, but not every Unicode character: a character above U+FFFF is stored as a pair of char values. In practice, we often need to determine whether a character is a Chinese character, or whether a string contains Chinese characters, to meet business needs. The String class provides the length() method, which returns the number of char values in the string. See the example below.
String s1 = "我是中国人";
String s2 = "imchinese";
String s3 = "im中国人";
System.out.println(s1 + ":" + s1.length());
System.out.println(s2 + ":" + s2.length());
System.out.println(s3 + ":" + s3.length());
Output:
我是中国人:5
imchinese:9
im中国人:5
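Note that length() counts char values (UTF-16 code units), not characters as a reader sees them. The two agree for the strings above, but they differ for characters above U+FFFF, such as the rarer Chinese characters in CJK Extension B. The following is a minimal sketch of that difference; the class name LengthDemo and its helper methods are made up for illustration, and the strings are written with Unicode escapes so the example does not depend on the source file encoding.

```java
class LengthDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts as one.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // "im" followed by the three Chinese characters from the example above.
        String bmp = "im\u4E2D\u56FD\u4EBA";
        // U+20000 (CJK Extension B), written as a surrogate pair.
        String supplementary = "\uD840\uDC00";

        System.out.println(codeUnits(bmp) + " " + codePoints(bmp));                     // prints "5 5"
        System.out.println(codeUnits(supplementary) + " " + codePoints(supplementary)); // prints "2 1"
    }
}
```

If the text may contain such characters, counting with codePointCount is the safer choice.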
Determining whether a string is Chinese in Java:
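/**
 * Checks whether every character in the string is a Chinese character,
 * that is, lies in the range [0x4e00, 0x9fa5] (decimal [19968, 40869]).
 *
 * @param string the string to check
 * @return true if all characters are Chinese characters, false otherwise
 */
public static boolean isChinese(String string) {
    for (int i = 0; i < string.length(); i++) {
        int n = string.charAt(i);
        // The upper bound is inclusive so that 0x9fa5 (40869) is accepted.
        if (!(19968 <= n && n <= 40869)) {
            return false;
        }
    }
    return true;
}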
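The method above answers whether the whole string is Chinese. For the other case mentioned earlier, whether a string contains any Chinese characters, one possible sketch uses the JDK's Character.UnicodeScript instead of a hard-coded range. This is an alternative technique, not the method above: the Han script is broader than [0x4e00, 0x9fa5], since it also covers the extension blocks and the Han characters used in Japanese and Korean text. The class and method names are made up for illustration.

```java
class ChineseCheck {
    // True if the code point is classified as Han script by the JDK.
    static boolean isHan(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN;
    }

    // True if at least one code point in the string is a Han character.
    static boolean containsChinese(String s) {
        return s.codePoints().anyMatch(ChineseCheck::isHan);
    }

    // True if the string is non-empty and every code point is a Han character.
    static boolean isAllChinese(String s) {
        return !s.isEmpty() && s.codePoints().allMatch(ChineseCheck::isHan);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the example independent of the source file encoding.
        String mixed = "im\u4E2D\u56FD\u4EBA";
        System.out.println(containsChinese(mixed));
        System.out.println(isAllChinese(mixed));
        System.out.println(containsChinese("imchinese"));
    }
}
```

Iterating over code points rather than char values means a character stored as a surrogate pair is tested once, as a whole.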
Unicode encoding ranges:
Chinese characters: [0x4e00, 0x9fa5] (or decimal [19968, 40869])
Numbers: [0x30, 0x39] (or decimal [48, 57])
Lowercase letters: [0x61, 0x7a] (or decimal [97, 122])
Uppercase letters: [0x41, 0x5a] (or decimal [65, 90])
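The ranges listed above can be turned directly into a small classifier. The sketch below assumes exactly those four ranges; the class name CharClassifier and the returned labels are made up for illustration.

```java
class CharClassifier {
    // Classifies a single char using the four ranges listed above.
    static String classify(char c) {
        if (c >= 0x4e00 && c <= 0x9fa5) return "chinese";
        if (c >= 0x30 && c <= 0x39) return "digit";
        if (c >= 0x61 && c <= 0x7a) return "lowercase";
        if (c >= 0x41 && c <= 0x5a) return "uppercase";
        return "other";
    }

    public static void main(String[] args) {
        // U+4E2D is a common Chinese character; the escape avoids encoding issues.
        char[] samples = {'\u4E2D', '7', 'a', 'Z', '!'};
        for (char c : samples) {
            System.out.println((int) c + " -> " + classify(c));
        }
    }
}
```

Anything outside these ranges, including punctuation and full-width symbols, falls into the "other" bucket.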
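Unicode ranges written as escapes (for example, for use in regular expressions):
Chinese character encoding range: \u4e00-\u9fa5
Double-byte character encoding range: \u0391-\uFFE5 (a broad range running from the Greek capital letter alpha to the full-width yen sign)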
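The escaped range for Chinese characters can be used in a regular expression. The following is a minimal sketch, assuming only the \u4e00-\u9fa5 range given above; the class and method names are made up for illustration. The backslash is doubled in the Java string literal so that the regex engine, rather than the compiler, interprets the escape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChineseRegex {
    // Matches a single char in the range U+4E00..U+9FA5.
    private static final Pattern CHINESE = Pattern.compile("[\\u4e00-\\u9fa5]");

    // True if the string is non-empty and consists only of chars in the range.
    static boolean isAllChinese(String s) {
        return s.matches("[\\u4e00-\\u9fa5]+");
    }

    // True if at least one char in the string is in the range.
    static boolean containsChinese(String s) {
        return CHINESE.matcher(s).find();
    }

    // Counts how many chars in the string are in the range.
    static int countChinese(String s) {
        Matcher m = CHINESE.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "im" followed by three Chinese characters, written as escapes.
        String mixed = "im\u4E2D\u56FD\u4EBA";
        System.out.println(isAllChinese(mixed));
        System.out.println(containsChinese(mixed));
        System.out.println(countChinese(mixed));
    }
}
```

Unlike a loop that returns true for an empty string, the `+` quantifier in isAllChinese rejects empty input.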