In C
1. How many bytes does char a[10] = {"China"}; occupy?
Answer: It takes up 10 bytes.
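Analysis: The initializer sets the first five elements of a and zero-fills the rest:
a[0]='C', a[1]='h', a[2]='i', a[3]='n', a[4]='a', a[5]='\0', a[6]='\0', a[7]='\0', a[8]='\0', a[9]='\0'
So a occupies 10 bytes, the declared size of the array, regardless of the length of the string stored in it.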
2. What about the string literal "China" on its own?
Answer: 6 bytes in total. The characters of "China" occupy 5 bytes, and the terminating '\0' occupies 1 byte.
In Java
1. How many bytes does String s = "China"; occupy?
Answer: 5 bytes once encoded, because each ASCII letter takes one byte in both GBK and UTF-8.
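import java.nio.charset.Charset;

// Print the default charset (the IDE's default encoding)
System.out.println(Charset.defaultCharset());
// Encode and decode with the default charset, then count the encoded bytes
String s = new String("China".getBytes());
byte[] b = s.getBytes();
System.out.println("" + b.length);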
The code above prints 5, the number of bytes that "China" occupies.
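Because getBytes() with no argument uses the platform default charset, the result can differ from machine to machine. A minimal sketch that names the charset explicitly instead; the class name AsciiByteCount and the helper utf8Length are illustrative and not part of the original code:

```java
import java.nio.charset.StandardCharsets;

class AsciiByteCount {
    // Number of bytes the text needs when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // ASCII letters take one byte each in UTF-8, so this prints 5.
        System.out.println(utf8Length("China"));
    }
}
```

Passing the charset makes the byte count independent of the IDE or operating system settings.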
2. How many bytes does String s = "中国"; occupy?
Answer: For Chinese characters, the encoding has to be taken into account.
(1) In GBK (the IDE default), each Chinese character occupies 2 bytes, so "中国" occupies 4 bytes.
(2) In UTF-8, each of these Chinese characters occupies 3 bytes, so "中国" occupies 6 bytes.
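The two byte counts can be checked directly by encoding the two-character string "中国" with each charset. This is a sketch only: it assumes the running JDK provides the GBK charset (a full JDK normally does), and the names ChineseByteCount and encodedLength are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ChineseByteCount {
    // Bytes needed to encode the text in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "中国", written with escapes so the source file encoding does not matter.
        String s = "\u4e2d\u56fd";
        // Assumption: GBK is available in this JDK.
        Charset gbk = Charset.forName("GBK");
        System.out.println(encodedLength(s, gbk));                    // 4
        System.out.println(encodedLength(s, StandardCharsets.UTF_8)); // 6
    }
}
```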
3. Transcoding: when GBK is converted to UTF-8, does the number of bytes increase? What about the other way around?
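import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// The default charset here is UTF-8
System.out.println(Charset.defaultCharset());
String s;
try {
    // Decode the UTF-8 bytes of "中国" as if they were GBK
    s = new String("中国".getBytes(), "GBK");
    byte[] b = s.getBytes();
    System.out.println("" + b.length);
} catch (UnsupportedEncodingException e) {
    e.printStackTrace();
}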
The code above takes the UTF-8 bytes of "中国" and decodes them as GBK; the byte length of s changes from 6 bytes to 9 bytes.
The content of s changes: 中国 -> 涓浗
Because the current encoding is UTF-8, the mis-decoded string contains 3 characters (one of them may not be displayable), which occupy 9 bytes when encoded again. The reverse situation, however, looks like this:
中国 (GBK) -> ?й? (UTF-8)
The byte length of s changes from 4 bytes to 4 bytes. Although the length has not changed, the text has. Here each ? occupies 1 byte.
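For comparison, a proper conversion decodes the bytes with the charset they were actually written in and only then encodes them with the target charset. The sketch below contrasts that with the GBK-as-UTF-8 mistake described above. It assumes GBK is available in the running JDK; the class name Transcode and its helper method are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Transcode {
    // Decode with the charset the bytes were written in, then encode with the target.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available
        String s = "\u4e2d\u56fd";            // "中国"

        // Correct conversion: GBK -> UTF-8 grows from 4 to 6 bytes.
        byte[] gbkBytes = s.getBytes(gbk);
        byte[] utf8Bytes = transcode(gbkBytes, gbk, StandardCharsets.UTF_8);
        System.out.println(gbkBytes.length + " -> " + utf8Bytes.length); // 4 -> 6

        // Converting back restores the original text.
        byte[] back = transcode(utf8Bytes, StandardCharsets.UTF_8, gbk);
        System.out.println(new String(back, gbk).equals(s)); // true

        // The mistake: decoding GBK bytes as UTF-8 replaces invalid sequences
        // with U+FFFD, so the original text cannot be recovered.
        String garbled = new String(gbkBytes, StandardCharsets.UTF_8);
        System.out.println(garbled.length());              // 3
        System.out.println(garbled.getBytes(gbk).length);  // 4
    }
}
```

With the correct conversion, GBK to UTF-8 increases the size of Chinese text and UTF-8 to GBK decreases it, while the content stays the same in both directions.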