Indexing Characters in Golang Strings
To retrieve characters from a string, you use indexing operations. However, you may encounter cases where the indexed value doesn't match the expected character. For instance, in the code below:
package main

import "fmt"

func main() {
	fmt.Print("HELLO"[1])
}
The output is 69, the numeric value of the byte that encodes "E", rather than the letter "E" itself.
Understanding Golang String Encoding
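Go source code is UTF-8, so string literals are stored as UTF-8 bytes. Indexing a string with s[i] returns the byte at offset i, not a character. ASCII characters, including the letter "E," occupy a single byte, which is why "HELLO"[1] yields 69. Other Unicode characters are encoded as two to four bytes, so a single index can land in the middle of a character.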
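To make the byte-versus-character distinction concrete, here is a minimal sketch that compares the byte length of a string with its code point count. The helper name byteAndRuneCount is hypothetical and exists only for this illustration; len and utf8.RuneCountInString are standard.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount is a hypothetical helper for illustration. It returns
// the number of bytes in s and the number of Unicode code points in s.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Hello, 世界"
	b, r := byteAndRuneCount(s)
	fmt.Println(b, r) // 13 9: seven ASCII bytes plus two 3-byte characters

	// Indexing returns one byte. Offset 7 is the first byte of 世.
	fmt.Println(s[7]) // 228 (0xE4)
}
```

The two counts differ whenever the string contains non-ASCII characters, which is exactly when byte indexing stops lining up with character positions.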
Using Runes for Character Indexing
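To index characters correctly, use runes. A rune is an alias for int32 and holds a single Unicode code point. You can convert a byte to a rune with the conversion rune(b), but the result is only the intended character when the byte is ASCII. For general text, decode the string into runes rather than converting individual bytes.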
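As a rough sketch of how runes relate to byte positions, the example below converts a byte to a rune and then uses a for range loop, which steps through a string one rune at a time. The helper runeOffsets is a hypothetical name introduced for this example.

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper for illustration. It returns the
// byte offset at which each rune of s begins.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s { // ranging over a string advances rune by rune
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	b := "HELLO"[1] // type byte, value 69
	r := rune(b)    // type rune (int32), value still 69
	fmt.Printf("%d %c\n", b, r) // 69 E

	// 世 occupies bytes 1 through 3, so the next rune starts at offset 4.
	fmt.Println(runeOffsets("a世b")) // [0 1 4]
}
```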
Converting Bytes to Characters
To convert a byte to its corresponding character, you can use the string() conversion:
fmt.Println(string("Hello"[1])) // ASCII only
This approach works well for ASCII characters.
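The sketch below shows where the byte-based approach stops working. The helper byteAsString is a hypothetical name used only here; it wraps the same string(s[i]) conversion.

```go
package main

import "fmt"

// byteAsString is a hypothetical helper for illustration. It converts the
// single byte at index i of s to a string, treating the byte value as a
// code point.
func byteAsString(s string, i int) string {
	return string(s[i])
}

func main() {
	fmt.Println(byteAsString("Hello", 1)) // e

	// The first byte of 世 is 0xE4. Converted on its own, it becomes
	// U+00E4 (ä), not the original character.
	fmt.Println(byteAsString("世界", 0))        // ä
	fmt.Println(byteAsString("世界", 0) == "世") // false
}
```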
Converting Runes to Characters
For text that may contain non-ASCII characters, convert the string to a []rune slice, index the slice, and convert the resulting rune back to a string:
fmt.Println(string([]rune("Hello, 世界")[1])) // UTF-8
Example with Unicode Characters
Consider the following example:
fmt.Println(string([]rune("Hello, 世界")[8])) // UTF-8
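This prints "界", the ninth character (index 8) of the string and the second character of 世界, the Chinese word for "world." Indexing the original string at byte offset 8 would instead return one byte from the middle of "世".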
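Converting to []rune allocates a new slice and decodes the entire string. If only one character is needed, a loop can stop as soon as it reaches the target. The following is one possible sketch, not the only approach; runeAt is a hypothetical helper name, and it reports out-of-range indexes with a boolean instead of panicking.

```go
package main

import "fmt"

// runeAt is a hypothetical helper for illustration. It returns the n-th
// rune (zero-based) of s and true, or 0 and false if n is out of range.
func runeAt(s string, n int) (rune, bool) {
	if n < 0 {
		return 0, false
	}
	i := 0
	for _, r := range s { // decodes one rune per iteration
		if i == n {
			return r, true
		}
		i++
	}
	return 0, false
}

func main() {
	if r, ok := runeAt("Hello, 世界", 8); ok {
		fmt.Println(string(r)) // 界
	}

	_, ok := runeAt("Hello", 10)
	fmt.Println(ok) // false
}
```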