What is the difference between Java character streams and byte streams

尚

Released: 2019-12-02 14:10:14

The differences between character streams and byte streams in Java:

1. A byte stream operates on bytes; a character stream operates on Unicode code units (16-bit char values in Java).

2. Byte streams are not buffered by default. Character streams that convert to or from bytes, such as InputStreamReader and OutputStreamWriter, keep an internal buffer for the conversion.

3. Byte streams are usually used to process binary data. They can in fact carry any type of data, but they cannot read or write Unicode code units directly; character streams are usually used to process text data and read and write Unicode code units.
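
To make the difference in units concrete, here is a minimal sketch that reads the same data once as a byte stream and once as a character stream. The class and method names (ByteVsCharUnits, countBytes, countChars) are made up for illustration, and UTF-8 is assumed as the encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ByteVsCharUnits {
    // Counts the units returned by a byte stream: one per byte.
    static int countBytes(byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts the units returned by a character stream: one per UTF-16 code unit.
    // Assumption: the data is UTF-8 encoded text.
    static int countChars(byte[] data) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int n = 0;
            while (reader.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+00E9 takes two bytes in UTF-8 but is a single char.
        byte[] data = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(data)); // prints 5
        System.out.println(countChars(data)); // prints 4
    }
}
```

The byte stream sees five units and the character stream sees four, because the reader decodes the two-byte UTF-8 sequence into one code unit.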

Byte stream

The basic unit of byte stream processing in Java is a single byte, and byte streams are usually used to process binary data. The two most basic byte stream classes in Java are InputStream and OutputStream, which represent the basic input byte stream and output byte stream respectively.

InputStream and OutputStream are both abstract classes. In practice we usually use one of their subclasses provided by the Java class library. Let's take the InputStream class as an example to introduce byte streams in Java.

The InputStream class defines a basic method, read, for reading bytes from a byte stream. It is declared as follows:

public abstract int read() throws IOException;

This is an abstract method, so every input byte stream class derived from InputStream must implement it. It reads one byte from the stream and returns it as an int in the range 0 to 255, or returns -1 if the end of the stream has been reached.

Note that this method blocks until a byte is available, the end of the stream is reached, or an exception is thrown. In addition, byte streams are not buffered by default: for a stream such as FileInputStream, each call to read asks the operating system for a single byte, which costs a system call and may involve disk I/O, so reading byte by byte is relatively inefficient.
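
The usual way to consume a stream with this method is a loop that stops at -1. The sketch below uses an in-memory ByteArrayInputStream as a stand-in for a real source, and the names ReadLoopDemo and sumBytes are hypothetical. It also shows why read returns an int: byte values come back as 0 to 255, so -1 can never be confused with data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ReadLoopDemo {
    // Sums the unsigned value of every byte, making one read() call per byte.
    static long sumBytes(InputStream in) {
        try {
            long sum = 0;
            int b;
            // read() returns 0..255 for data and -1 only at end of stream.
            while ((b = in.read()) != -1) {
                sum += b;
            }
            return sum;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The byte 0xFF is -1 as a Java byte, but read() reports it as 255.
        byte[] data = {(byte) 0xFF, 0x01};
        System.out.println(sumBytes(new ByteArrayInputStream(data))); // prints 256
    }
}
```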

You might expect the overload of read in the InputStream class that takes a byte array to read many bytes in one go and so avoid frequent disk I/O. Is that really the case? Let's look at the source code of this method in InputStream itself:

public int read(byte b[]) throws IOException {
    return read(b, 0, b.length);
}

It delegates to another overload of read, so let's follow that one:

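public int read(byte b[], int off, int len) throws IOException {
    // Validate the arguments before touching the stream.
    if (b == null) {
        throw new NullPointerException();
    } else if (off < 0 || len < 0 || len > b.length - off) {
        throw new IndexOutOfBoundsException();
    } else if (len == 0) {
        return 0;
    }

    // Read the first byte; an IOException here propagates to the caller.
    int c = read();
    if (c == -1) {
        return -1;
    }
    b[off] = (byte)c;

    // Read the remaining bytes one at a time with the single-byte read().
    int i = 1;
    try {
        for (; i < len ; i++) {
            c = read();
            if (c == -1) {
                break;
            }
            b[off + i] = (byte)c;
        }
    } catch (IOException ee) {
        // Ignored: the bytes read so far are still returned.
    }
    return i;
}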

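From the code above we can see that this default implementation of read(byte[]) fills the array "in one go" simply by calling read() in a loop, so by itself it does not reduce the number of underlying reads. Concrete subclasses are encouraged to override it with something more efficient, and FileInputStream, for example, does so with a native bulk read; but none of this gives the stream a memory buffer of its own. To use a memory buffer to improve reading efficiency, we should use BufferedInputStream.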
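
As a rough illustration of what BufferedInputStream buys us, the sketch below counts how many read requests reach the underlying stream when we read byte by byte, with and without the buffer. CountingInputStream and underlyingReads are hypothetical helpers written for this example, and an in-memory array stands in for a file.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BufferedReadDemo {
    // Hypothetical helper: counts every read request that reaches the wrapped stream.
    static class CountingInputStream extends FilterInputStream {
        int calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads `size` bytes one at a time and reports how often the source was hit.
    static int underlyingReads(int size, boolean buffered) {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        InputStream in = buffered ? new BufferedInputStream(source) : source;
        try {
            while (in.read() != -1) {
                // Consume the stream one byte at a time.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered: " + underlyingReads(10_000, false));
        System.out.println("buffered:   " + underlyingReads(10_000, true));
    }
}
```

Without the buffer, every byte (plus the final end-of-stream check) is a separate request to the source. With it, the source is asked only a handful of times, because BufferedInputStream refills its internal array in large chunks and serves single-byte reads from memory.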

Character stream

The basic unit of character stream processing in Java is the Unicode code unit (2 bytes in size), and character streams are usually used to process text data. A code unit here is a 16-bit UTF-16 value in the range 0x0000 to 0xFFFF, which is what Java's char type holds. Most values in this range correspond to a single character; characters outside the Basic Multilingual Plane are represented by a pair of code units (a surrogate pair). Java's String type represents text as a sequence of UTF-16 code units.
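
The distinction between code units and characters can be checked directly on a String. This is a small sketch with made-up names (CodeUnitDemo, codeUnits, codePoints); U+1F600 is used only as an example of a character outside the 0x0000 to 0xFFFF range.

```java
class CodeUnitDemo {
    // Number of 16-bit code units (chars) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points, which is closer to "characters".
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "A";
        // U+1F600 lies outside the 16-bit range, so it needs a surrogate pair.
        String supplementary = new String(Character.toChars(0x1F600));

        System.out.println(codeUnits(bmp) + " " + codePoints(bmp));                     // prints 1 1
        System.out.println(codeUnits(supplementary) + " " + codePoints(supplementary)); // prints 2 1
        System.out.println(Character.isHighSurrogate(supplementary.charAt(0)));         // prints true
    }
}
```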

However, unlike text in memory, text stored on disk can use any of a variety of encodings, and the same characters have different binary representations under different encodings. A character stream therefore works like this:

Output character stream: converts the character sequence to be written to the file (actually a sequence of Unicode code units) into a byte sequence using the specified encoding, and then writes it to the file;

Input character stream: decodes the byte sequence being read into the corresponding character sequence (actually a sequence of Unicode code units) using the specified encoding, so that it can be stored in memory.
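
The two directions described above can be sketched with OutputStreamWriter and InputStreamReader over in-memory byte arrays instead of a file. The helper names (EncodingRoundTrip, encode, decode) are assumptions made for this example, not part of the original article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingRoundTrip {
    // Output direction: characters -> bytes in the given encoding.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed (and therefore flushed) before the bytes are taken.
        return bytes.toByteArray();
    }

    // Input direction: bytes in the given encoding -> characters.
    static String decode(byte[] data, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u4e2d"; // a single CJK character, one code unit in memory
        System.out.println(encode(text, StandardCharsets.UTF_8).length);    // prints 3
        System.out.println(encode(text, StandardCharsets.UTF_16BE).length); // prints 2
        System.out.println(decode(encode(text, StandardCharsets.UTF_8),
                StandardCharsets.UTF_8).equals(text));                      // prints true
    }
}
```

The same one-character string becomes three bytes under UTF-8 and two under UTF-16BE, and decoding with the matching encoding restores the original text. Decoding with a different encoding than the one used for writing would generally produce garbled characters.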

We use a demo to deepen our understanding of this process. The sample code is as follows:

import java.io.FileWriter;
import java.io.IOException;

public class FileWriterDemo {
    public static void main(String[] args) {
        // try-with-resources closes the writer even if write() throws, and
        // avoids a NullPointerException in close() when the constructor fails.
        try (FileWriter fileWriter = new FileWriter("demo.txt")) {
            // The characters are encoded to bytes with the default charset.
            fileWriter.write("demo");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
