Exploring Differences: Range over String vs. Rune Slice in Go
When iterating over character sequences in Go, developers may encounter two similar approaches: ranging over a string and ranging over a rune slice. Both yield the same rune values for valid UTF-8 input, but there is a subtle distinction in what the loop index means.
Ranging over String:
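Ranging over a string, in a loop such as for i, s := range str, does not iterate byte by byte. Go decodes the string as UTF-8, so each iteration yields one Unicode code point in s (of type rune) together with i, the byte offset at which that code point starts. A multibyte character is therefore delivered as a single rune, and i jumps ahead by the number of bytes that character occupies. If the string contains invalid UTF-8, the loop yields the replacement character U+FFFD and advances by one byte. Byte-by-byte access is what indexing (str[i]) or ranging over []byte(str) provides.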
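As a minimal sketch of ranging over a string, the following program prints each rune with the index the loop reports. The sample string "héllo" and the helper name byteOffsets are illustrative assumptions, not part of the original article.

```go
package main

import "fmt"

// byteOffsets (hypothetical helper) collects the index that
// a range over the string s reports at each iteration.
func byteOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	const str = "héllo" // 'é' occupies two bytes in UTF-8

	// s is a rune (a whole code point); i is the byte offset where it starts.
	for i, s := range str {
		fmt.Printf("byte offset %d: %c (U+%04X)\n", i, s, s)
	}

	fmt.Println(byteOffsets(str)) // [0 1 3 4 5]
}
```

The index skips from 1 to 3 because 'é' takes up bytes 1 and 2.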
Ranging over Rune Slice:
An alternative approach is to convert the string to a rune slice using []rune(str). A rune is an alias for int32 and holds one Unicode code point, so the conversion decodes the whole string up front and allocates a new slice with one element per code point. Iterating over that slice yields a rune value (s), which can represent a single-byte or multibyte character, along with its position in the slice.
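A comparable sketch for the rune slice, using the same assumed sample string "héllo" and a hypothetical helper named runeIndexes:

```go
package main

import "fmt"

// runeIndexes (hypothetical helper) collects the index that
// a range over []rune(s) reports at each iteration.
func runeIndexes(s string) []int {
	var idx []int
	for i := range []rune(s) {
		idx = append(idx, i)
	}
	return idx
}

func main() {
	const str = "héllo"
	runes := []rune(str) // decodes the string and allocates a new slice

	fmt.Println(len(str), len(runes)) // 6 5

	// i is the position in the slice, so it always advances by one.
	for i, s := range runes {
		fmt.Printf("rune index %d: %c\n", i, s)
	}

	fmt.Println(runeIndexes(str)) // [0 1 2 3 4]
}
```

Here len(str) counts bytes while len(runes) counts code points, which is why the two lengths differ.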
The Difference:
The critical difference lies in the index, not in the values. Both loops can use a range variable i, but it means different things. When ranging over a string, i is the byte offset of the current rune in the original string, so it advances by more than one whenever the preceding character spanned multiple bytes (a UTF-8 code point occupies one to four bytes). When ranging over a rune slice, i is the position of the rune in the slice and always advances by exactly one. For pure ASCII text the two indexes coincide, which is why the loops can look identical at first.
Conclusion:
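Both approaches handle multibyte characters correctly and produce the same runes; the choice depends on which index you need. Ranging over the string directly avoids allocating a slice and gives byte offsets that can be used to slice the original string. Converting to a rune slice is the better fit when you need character positions, the character count, or random access by character index.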