Golang writes garbled files

WBOY
Release: 2023-05-10 09:36:36

Writing data to a file is a basic operation in Go, but the resulting file sometimes contains garbled characters, which then cause errors when the file is read back. This article discusses why files written by a Go program can come out garbled and how to fix the problem.

Causes of garbled characters

Garbled characters appear when the encoding that a reader expects for the file does not match the encoding of the bytes the program wrote. Go source code and string literals are UTF-8, and Go writes those bytes to the file unchanged. If the file is expected to be in a different encoding, the result is garbled. The specific situations include the following:

The encoding format of the file does not match the data

A file has no built-in encoding; its content is simply the bytes written to it. The problem arises when the file is expected to be GBK (for example, because the program reading it assumes GBK, or because its existing content is GBK) while the Go program writes UTF-8 data. The reader then decodes the UTF-8 bytes as GBK and shows wrong characters.

Different language settings of the operating system

Differences in the language settings of the operating system may also cause garbled characters. Go does not consult the system locale when writing a file: it writes the bytes it is given, which for ordinary strings are UTF-8. On a system whose default character set is not UTF-8, such as one using a legacy code page like GBK, other tools may decode the file with that default, and the UTF-8 output of the Go program is then displayed incorrectly.

Solution

Option 1: Force the use of UTF-8 encoding format

The simplest solution is to make sure the data is valid UTF-8 before writing it. For this purpose, you can use the "unicode/utf8" and "strings" packages from the Go standard library.

Go strings are normally UTF-8 already, so ordinary text needs no conversion. Use the utf8.ValidString() function to check the string and the strings.ToValidUTF8() function to replace any invalid byte sequences. Next, use os.Create() to open the file and use the Write() or WriteString() function to write the data to the file. The sample code is as follows:

package main

import (
    "os"
    "strings"
    "unicode/utf8"
)

func main() {
    file, err := os.Create("test.txt")
    if err != nil {
        panic(err)
    }
    defer file.Close()

    str := "hello world"

    // Make sure str is valid UTF-8: replace every invalid byte
    // sequence with the Unicode replacement character U+FFFD.
    if !utf8.ValidString(str) {
        str = strings.ToValidUTF8(str, "\uFFFD")
    }

    // Write the UTF-8 encoded data to the file.
    _, err = file.WriteString(str)
    if err != nil {
        panic(err)
    }
}

In the above code, the ValidString() function checks whether the string consists entirely of valid UTF-8. If it does not, the ToValidUTF8() function returns a copy in which each run of invalid bytes is replaced with the given replacement string, so the data written to the file is always well-formed UTF-8.

However, this method only guarantees that the file is written as UTF-8. If the file has to be in GBK, this method cannot solve the problem: the data must be transcoded to GBK before writing, and the standard library has no GBK encoder, so a package such as golang.org/x/text/encoding/simplifiedchinese is needed for that.

Option 2: Use buffered writes in the "bufio" package

Another solution is to use buffered writes from the "bufio" package. A buffer reduces the number of system calls and improves performance when writing files. Buffering does not change the encoding of the data, but flushing the writer correctly ensures that the file is not cut off in the middle of a multi-byte character, which would show up as a garbled character at the end of the file.

To use buffered writing, create a buffered writer with bufio.NewWriter() and use the Write() or WriteString() function to write data to the buffer. When the buffer fills up, its contents are written to the file automatically; whatever remains in the buffer at the end must be written out with an explicit call to Flush().

The following is a sample code:

package main

import (
    "bufio"
    "os"
)

func main() {
    file, err := os.Create("test.txt")
    if err != nil {
        panic(err)
    }
    defer file.Close()

    // Wrap the file in a buffered writer.
    writer := bufio.NewWriter(file)
    str := "hello world"
    _, err = writer.WriteString(str)
    if err != nil {
        panic(err)
    }
    // Flush the remaining buffered data to the file before it is closed.
    err = writer.Flush()
    if err != nil {
        panic(err)
    }
}

In the above code, the NewWriter() function of the bufio package creates a buffered writer, the WriteString() function writes the data to the buffer, and the Flush() function writes the data from the buffer to the file.

The explicit Flush() call matters: the writer flushes on its own only when the buffer is full, so without it the data still in the buffer when the program exits never reaches the file. The file would then be incomplete and could end in the middle of a multi-byte character.
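The effect of Flush() can be observed directly. The sketch below is an illustration under stated assumptions rather than part of the original samples: a bytes.Buffer stands in for the file, and the function name bufferedWrite is made up. It relies on the default buffer of bufio.NewWriter() being much larger than the short test string.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// bufferedWrite (hypothetical helper) writes s through a bufio.Writer
// into an in-memory sink and reports how many bytes had reached the
// sink before and after the call to Flush.
func bufferedWrite(s string) (before, after int, err error) {
	var sink bytes.Buffer // stands in for the file
	w := bufio.NewWriter(&sink)

	if _, err = w.WriteString(s); err != nil {
		return 0, 0, err
	}
	before = sink.Len() // data is still held in the bufio buffer

	if err = w.Flush(); err != nil {
		return before, 0, err
	}
	return before, sink.Len(), nil
}

func main() {
	// "你好" is 6 bytes in UTF-8 and ", world" is 7 bytes.
	before, after, err := bufferedWrite("你好, world")
	if err != nil {
		panic(err)
	}
	fmt.Println(before, after) // 0 13
}
```

Nothing reaches the underlying writer until Flush() runs, which is why skipping it leaves a short string out of the file entirely and a long one truncated.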

Summary

Garbled files written by a Go program are caused by a mismatch between the encoding expected for the file and the UTF-8 data the program writes. To address the problem, make sure the data you write is valid UTF-8, and when using buffered writes from the bufio package, flush the writer so that the file is complete. No matter which method is used, you need to know the encoding format the file is expected to have and process the data accordingly.
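One way to learn something about the encoding of an existing file is to test whether its content is well-formed UTF-8. The following is a rough sketch with made-up helper names (looksLikeUTF8, checkFile), and it assumes the file name test.txt from the samples. The check is only a heuristic: pure ASCII passes because ASCII is a subset of UTF-8, and a failed check says the file is not UTF-8 without identifying the actual encoding.

```go
package main

import (
	"fmt"
	"os"
	"unicode/utf8"
)

// looksLikeUTF8 (hypothetical helper) reports whether data is
// well-formed UTF-8. Pure ASCII also passes.
func looksLikeUTF8(data []byte) bool {
	return utf8.Valid(data)
}

// checkFile (hypothetical helper) reads the whole file at path and
// applies looksLikeUTF8 to its content.
func checkFile(path string) (bool, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return false, err
	}
	return looksLikeUTF8(data), nil
}

func main() {
	// The file name is an assumption taken from the samples.
	ok, err := checkFile("test.txt")
	if err != nil {
		fmt.Println("cannot read file:", err)
		return
	}
	fmt.Println("valid UTF-8:", ok)
}
```

If the check fails for a file you are about to append to, appending UTF-8 text would mix two encodings in one file, so the data should be transcoded first.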


source:php.cn