How to Handle Invalid UTF-8 Characters in a String in Go
When marshalling a list of strings with json.Marshal, you may encounter the error message "json: invalid UTF-8 in string". It means that at least one of the strings contains a byte sequence that is not valid UTF-8.
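Before cleaning anything, it can help to find out which strings are affected. The following is a minimal sketch that uses utf8.ValidString from the standard library; the helper name invalidIndexes and the sample data are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// invalidIndexes returns the positions of the strings in items that are not
// valid UTF-8. The name is hypothetical, chosen for this example.
func invalidIndexes(items []string) []int {
	var bad []int
	for i, s := range items {
		if !utf8.ValidString(s) {
			bad = append(bad, i)
		}
	}
	return bad
}

func main() {
	// "\xc5" starts a two-byte sequence, but 'z' is not a continuation byte.
	items := []string{"hello", "a\xc5z", "posição"}
	fmt.Println(invalidIndexes(items)) // [1]
}
```

Knowing the offending indexes makes it easier to decide whether to repair the data at its source or sanitize it just before marshalling.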
Replacement or Removal of Invalid Characters
In Python, you can choose to remove or replace invalid UTF-8 bytes, or to raise an exception when they occur. Go offers similar options in its standard library:
Using strings.ToValidUTF8 (Go 1.13+)
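This function returns a copy of the string in which each run of invalid UTF-8 bytes is replaced by the replacement string you pass as the second argument. Passing "" removes the invalid bytes; passing "\uFFFD" substitutes the Unicode replacement character (U+FFFD) instead.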
fixedString := strings.ToValidUTF8("a\xc5z", "") // fixedString is "az"
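To tie this back to the json.Marshal scenario, here is a small sketch that sanitizes a whole slice before marshalling and shows both choices for the second argument of strings.ToValidUTF8. The helper name sanitizeAll and the sample data are assumptions, not part of any library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// sanitizeAll returns a copy of items in which every run of invalid UTF-8
// bytes is replaced by replacement. Hypothetical helper for this example.
func sanitizeAll(items []string, replacement string) []string {
	out := make([]string, len(items))
	for i, s := range items {
		out[i] = strings.ToValidUTF8(s, replacement)
	}
	return out
}

func main() {
	items := []string{"a\xc5z", "ok"}

	removed := sanitizeAll(items, "")      // drop the invalid bytes
	marked := sanitizeAll(items, "\uFFFD") // mark them with U+FFFD

	fmt.Printf("%q\n", removed) // ["az" "ok"]
	fmt.Printf("%+q\n", marked) // ["a\ufffdz" "ok"]

	b, err := json.Marshal(removed)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(b)) // ["az","ok"]
}
```

Removing bytes keeps the output tidy, while inserting U+FFFD leaves a visible marker where data was lost; which one fits depends on whether downstream readers need to know that something was dropped.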
Mapping and Replacing with utf8.RuneError (Go 1.11+)
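You can also filter a string rune by rune with strings.Map. When Go iterates over a string, each invalid byte is decoded as utf8.RuneError (U+FFFD), so the mapping function can test for that value. Returning a negative number (e.g., -1) tells strings.Map to drop the rune; returning any other rune substitutes it. Note that this test also matches a correctly encoded U+FFFD that is already in the string, so such characters are removed as well.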
// Requires the fmt, strings, and unicode/utf8 packages.
fixUtf := func(r rune) rune {
    if r == utf8.RuneError {
        return -1 // a negative return value drops the rune
    }
    return r
}

var input1 = "a\xc5z"
fmt.Println(strings.Map(fixUtf, input1)) // Output: az

var input2 = "posic�o"
fmt.Println(strings.Map(fixUtf, input2)) // Output: posico
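Because the strings.Map approach compares each rune with utf8.RuneError, it cannot tell an invalid byte apart from a correctly encoded U+FFFD that was already in the text, which is why "posic�o" loses a character above. If that distinction matters and strings.ToValidUTF8 is not available, one possible sketch is to decode manually and check the width that utf8.DecodeRuneInString reports: an invalid byte decodes as RuneError with width 1, whereas a real U+FFFD has width 3. The helper name dropInvalidBytes is an assumption made for this example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// dropInvalidBytes removes only bytes that fail to decode as UTF-8 and keeps
// any correctly encoded U+FFFD already present. Hypothetical helper.
func dropInvalidBytes(s string) string {
	if utf8.ValidString(s) {
		return s // nothing to fix, avoid copying
	}
	var b strings.Builder
	b.Grow(len(s))
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		if r == utf8.RuneError && size == 1 {
			i++ // invalid byte: skip it
			continue
		}
		b.WriteString(s[i : i+size])
		i += size
	}
	return b.String()
}

func main() {
	fmt.Println(dropInvalidBytes("a\xc5z"))                         // az
	fmt.Printf("%+q\n", dropInvalidBytes("\xc5posic\uFFFDo")) // "posic\ufffdo"
}
```

On Go 1.13 and later, strings.ToValidUTF8(s, "") gives the same result with less code, so this sketch is mainly useful for older toolchains or for seeing how the check works.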