Encoding and Decoding UTF-8 Byte Arrays and Java Strings
In Java, working with text often means converting between strings and byte arrays in a specific encoding. This article shows how to perform these conversions, focusing on the widely used UTF-8 encoding.
Encoding Strings to Byte Arrays
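To convert a Java string into a UTF-8-encoded byte array, call the getBytes() method and pass StandardCharsets.UTF_8. Calling getBytes() with no argument uses the platform's default charset, which is not guaranteed to be UTF-8 on every Java version and platform. For instance:
import java.nio.charset.StandardCharsets;
String str = "Hello, world!";
byte[] byteArray = str.getBytes(StandardCharsets.UTF_8);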
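For ASCII-only text, the UTF-8 byte count equals the character count, but characters outside ASCII take two or more bytes each. The following is a minimal sketch of that difference; the class name Utf8EncodeExample and the helper toUtf8 are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

class Utf8EncodeExample {
    // Hypothetical helper: encodes the given text as UTF-8 bytes.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ascii = "Hello";
        String accented = "h\u00e9llo"; // "héllo"; U+00E9 needs two bytes in UTF-8

        // Prints "5 chars -> 5 bytes"
        System.out.println(ascii.length() + " chars -> " + toUtf8(ascii).length + " bytes");
        // Prints "5 chars -> 6 bytes"
        System.out.println(accented.length() + " chars -> " + toUtf8(accented).length + " bytes");
    }
}
```

Because of this, code that sizes buffers or enforces length limits should measure the encoded byte array rather than String.length().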
Decoding Byte Arrays to Strings
To obtain a Java string from a byte array, use the String(byte[], Charset) constructor, specifying the encoding the bytes were written in. Example:
import java.nio.charset.StandardCharsets;
byte[] byteArray = {72, 101, 108, 108, 111}; // the ASCII/UTF-8 bytes for "Hello"
String str = new String(byteArray, StandardCharsets.UTF_8); // "Hello"
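The String constructor does not fail on bytes that are not valid UTF-8; it substitutes the replacement character U+FFFD instead. When silent substitution is not acceptable, a CharsetDecoder configured to report errors can be used. The sketch below assumes that strict behavior is wanted; the class name Utf8StrictDecodeExample and the helper decodeStrict are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8StrictDecodeExample {
    // Hypothetical helper: decodes UTF-8 and throws instead of substituting U+FFFD.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        byte[] invalid = {72, (byte) 0xFF, 111}; // 0xFF never occurs in valid UTF-8

        // Lenient decoding: the bad byte becomes U+FFFD.
        String lenient = new String(invalid, StandardCharsets.UTF_8);
        System.out.println(lenient.equals("H\uFFFDo")); // prints "true"

        // Strict decoding: the bad byte raises an exception.
        try {
            decodeStrict(invalid);
        } catch (CharacterCodingException e) {
            System.out.println("Malformed input rejected");
        }
    }
}
```

The constructor is usually the simpler choice for trusted data, while the decoder suits input from files or the network whose encoding cannot be taken for granted.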
Importance of Encoding Specification
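Correct conversion depends on using the same encoding in both directions: bytes must be decoded with the charset that was used to encode them. UTF-8 can represent every Unicode character, which makes it a safe default. When another encoding is required, the StandardCharsets class provides constants for the charsets every Java platform must support (US_ASCII, ISO_8859_1, UTF_8, UTF_16, UTF_16BE, and UTF_16LE); any other charset can be looked up by name with Charset.forName().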
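To illustrate why the encoding matters, the sketch below encodes a string as UTF-8 and then decodes it once with UTF-8 and once with ISO-8859-1. The class name CharsetMismatchExample and the helper roundTrip are hypothetical names used only for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchExample {
    // Hypothetical helper: encodes with one charset and decodes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café"

        // Matching charsets: the text survives the round trip.
        String same = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(original)); // prints "true"

        // Mismatched charsets: the two UTF-8 bytes of U+00E9 are read as two
        // separate ISO-8859-1 characters, U+00C3 and U+00A9.
        String garbled = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals("caf\u00c3\u00a9")); // prints "true"
    }
}
```

The mismatch raises no exception, so this kind of corruption tends to surface only when the text is displayed or compared.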