What is the number of audio samples collected per second

The number of audio samples collected per second is called the "sampling rate" or "sampling frequency", and it is measured in samples per second or Hertz. A lower sample rate means fewer samples per second, and therefore less audio data, because fewer sample points represent the same stretch of audio; a higher sample rate requires more storage space and processing power to handle.


The operating environment of this tutorial: Windows 7 system, Dell G3 computer.

When it comes to audio processing, there are a lot of terms that most people have heard before but don't really understand. I was one of those people until I had to work on audio processing myself. For that reason, I want to go through some of these terms, describe what they are, and show what they mean for the quality of your audio recording or stream. For the remainder of this article, we will assume that we are dealing with only one channel of uncompressed audio.

1. Sampling rate/sampling frequency

The first term we often hear is sampling rate or sampling frequency, both of which refer to the same thing. Some values you may have come across are 8kHz, 44.1kHz and 48kHz. What exactly is the sample rate of an audio file?

The sampling rate refers to the number of audio samples recorded per second. It is measured in samples per second or hertz (abbreviated as Hz or kHz, where 1 kHz is 1000 Hz). An audio sample is simply a number that represents the measured value of the sound wave at a specific point in time. It is very important that these samples are taken at equally spaced points in time. For example, if the sampling rate is 8000 Hz, it is not enough to have 8000 samples somewhere within one second; each sample must be taken exactly 1/8000 of a second after the previous one. In this case, the number 1/8000 is called the sampling interval (in seconds), and the sampling rate is simply the reciprocal of that interval.
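The relationship between sampling rate and sampling interval can be sketched in a few lines of Python. The function names here are illustrative only and do not come from any audio library; exact fractions are used so the 1/8000 interval is not blurred by floating-point rounding.

```python
from fractions import Fraction


def sampling_interval(sample_rate_hz):
    """Return the sampling interval in seconds (the reciprocal of the rate)."""
    return Fraction(1, sample_rate_hz)


def sample_times(sample_rate_hz, count):
    """Return the times, in seconds, at which the first `count` samples are taken."""
    interval = sampling_interval(sample_rate_hz)
    # Sample n is taken exactly n intervals after the start of the recording.
    return [n * interval for n in range(count)]


interval = sampling_interval(8000)
print(interval)         # 1/8000
print(float(interval))  # 0.000125

# One full second at 8000 Hz contains exactly 8000 equally spaced samples.
times = sample_times(8000, 8000)
print(len(times))       # 8000
```

The gap between any two consecutive entries of `times` is the same 1/8000 of a second, which is the equal-spacing requirement described above.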

Sampling rate is similar to a video's frame rate or FPS (frames per second) measurement. A video is simply a series of pictures, often called "frames" here, displayed back to back very quickly, giving the illusion of continuous uninterrupted motion or movement (at least to us humans).

While audio sample rates and video frame rates are similar concepts, the usual minimum values that guarantee usability are very different. For video, at least 24 frames per second are generally needed to depict motion accurately; below this, motion may appear jerky and the illusion of continuous, uninterrupted motion cannot be maintained. The more motion occurs between frames, the more this matters. Additionally, video at 1 or 2 frames per second may miss brief, "momentary" events entirely because they fall between frames.

For audio, the minimum sampling rate needed to represent English speech unambiguously is 8000 Hz. Using a lower sampling rate makes speech hard to understand for a variety of reasons, one of which is that similar utterances become indistinguishable from each other. Lower sampling rates can confuse phonemes (the sounds of a language) that carry significant high-frequency energy; for example, at 5000 Hz it is difficult to distinguish /s/ from /sh/ or /f/.
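One mechanism behind this loss of high-frequency detail is aliasing: a sampled signal can only distinguish frequencies up to half the sampling rate. The sketch below is an illustration under stated assumptions, not something taken from the article: it uses pure cosine tones rather than speech, and it samples directly with no anti-aliasing filter (real recorders usually filter out the too-high frequencies instead, but the information is lost either way). The helper name `sample_cosine` is hypothetical.

```python
import math


def sample_cosine(freq_hz, sample_rate_hz, count):
    """Sample a unit-amplitude cosine tone at equally spaced points in time."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(count)
    ]


def same_samples(a, b):
    """True if two sample lists are equal up to floating-point rounding."""
    return all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a, b))


# At 5000 Hz, only frequencies up to 2500 Hz can be told apart, so a
# 3000 Hz tone produces the same samples as a 2000 Hz tone.
same_at_low_rate = same_samples(
    sample_cosine(2000, 5000, 50), sample_cosine(3000, 5000, 50)
)

# At 8000 Hz the limit is 4000 Hz, so the two tones stay distinct.
same_at_high_rate = same_samples(
    sample_cosine(2000, 8000, 50), sample_cosine(3000, 8000, 50)
)

print(same_at_low_rate)   # True
print(same_at_high_rate)  # False
```

Two different sounds that produce identical samples cannot be separated afterwards, which is one way sounds with significant high-frequency energy end up confused at low sampling rates.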

Now that we have mentioned video frames, another term worth elaborating on is audio frames. Although the sample rate and the frame rate of audio are both measured in Hertz, samples and frames are not the same thing. An audio frame is the group of audio samples taken at one instant in time, one from each of the audio channels.
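The difference between samples and frames can be sketched as follows. This assumes the samples are stored interleaved (left, right, left, right, ...), which is a common layout for uncompressed multi-channel audio but is an assumption here, not something the article states. The function name `to_frames` is illustrative.

```python
def to_frames(interleaved_samples, channel_count):
    """Group interleaved samples into frames, one sample per channel per frame."""
    if channel_count < 1:
        raise ValueError("channel_count must be at least 1")
    if len(interleaved_samples) % channel_count != 0:
        raise ValueError("sample count is not a multiple of channel_count")
    return [
        tuple(interleaved_samples[i:i + channel_count])
        for i in range(0, len(interleaved_samples), channel_count)
    ]


# Six samples from a two-channel (stereo) stream form three frames.
stereo = [10, -10, 20, -20, 30, -30]
print(to_frames(stereo, 2))  # [(10, -10), (20, -20), (30, -30)]

# With a single channel, every frame holds exactly one sample.
mono = [10, 20, 30]
print(to_frames(mono, 1))    # [(10,), (20,), (30,)]
```

For the single-channel audio assumed in the rest of this article, the number of frames and the number of samples are therefore the same.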

The most common sample rate values are the aforementioned 8kHz (most common in telephone communications), 44.1kHz (most common in music CDs), and 48kHz (most common in movie soundtracks). A lower sample rate means fewer samples per second, and therefore less audio data, because fewer sample points represent the same stretch of audio. The choice of sampling rate depends on which acoustic features need to be captured. Some features, such as speech intonation, require a lower sampling rate than others, such as the musical content of a music CD. It's worth noting that higher sample rates require more storage space and processing power to handle, although this is less of an issue now than in the past, when digital storage and processing power were the primary concern.

2. Sampling depth/sampling accuracy/sampling size

In addition to the sampling rate, which is how many audio data points we have, there is also the sampling depth. Measured in bits per sample, sample depth (also called sample precision or sample size) is the second important property of an audio file or audio stream, and represents the level of detail, or "quality", of each sample. As we mentioned above, each audio sample is just a number, and while having many numbers helps represent audio, the range or "quality" of each individual number must also be large enough to represent each sample or data point accurately. What does "quality" mean? For an audio sample, it simply means that the sample can represent a larger range of distinct amplitude values. A sampling depth of 8 bits means we have 2^8 = 256 different amplitudes, while a sampling depth of 16 bits means we have 2^16 = 65,536 different amplitudes, and so on for higher sampling depths. The most common sample depths for phone audio are 16-bit and 32-bit. In a digital recording, the more different amplitudes there are, the closer the digital recording will sound to the original acoustic event.
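The effect of sample depth can be illustrated with a small quantization sketch. It is a simplification under stated assumptions: amplitudes are taken to lie between -1.0 and 1.0, and a symmetric signed integer code is used (so one of the 2^bits available codes goes unused). The function names are illustrative and not from any audio library.

```python
def amplitude_levels(bit_depth):
    """Number of distinct values a sample of this depth can take."""
    return 2 ** bit_depth


def quantize(value, bit_depth):
    """Map an amplitude in [-1.0, 1.0] to the nearest signed integer code."""
    max_code = 2 ** (bit_depth - 1) - 1  # 127 for 8 bits, 32767 for 16 bits
    clipped = max(-1.0, min(1.0, value))
    return round(clipped * max_code)


def dequantize(code, bit_depth):
    """Map an integer code back to an amplitude in [-1.0, 1.0]."""
    max_code = 2 ** (bit_depth - 1) - 1
    return code / max_code


print(amplitude_levels(8))   # 256
print(amplitude_levels(16))  # 65536

# Store the same amplitude at two depths and compare the rounding error.
original = 0.3
error_8 = abs(dequantize(quantize(original, 8), 8) - original)
error_16 = abs(dequantize(quantize(original, 16), 16) - original)
print(error_16 < error_8)    # True
```

With more bits per sample, the stored value lands closer to the measured amplitude, which is what "closer to the original acoustic event" means in practice.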

Again, this is similar to the 8-bit or 16-bit figures we might hear about image quality. For images or videos, each pixel in the image or video frame also has a certain number of bits to represent its color. The higher the bit depth of a pixel, the more accurate the resulting pixel color is, because the pixel has more bits to "describe" the color to be shown on the screen, and the pixel or image overall looks more like what people would see in real life. Technically, a pixel's bit depth indicates how many different colors that pixel can represent. If you allow each of R, G, and B to be represented by 8 bits, then each pixel is represented by 3 x 8 = 24 bits. This means there are 2^24 = 16,777,216 (about 17 million) different colors that can be represented by that pixel.

3. Bit rate

What links the sampling rate and sampling depth is the bit rate, which is simply the product of the two. Since sampling rate is measured in samples per second and sampling depth is measured in bits per sample, the bit rate is given by (samples per second) x (bits per sample) = bits per second, abbreviated as bps or kbps. It's worth noting that because sample depth and bitrate are related, the two terms are often used interchangeably, albeit incorrectly.
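As a worked example of this product, the sketch below computes the bit rate of uncompressed audio and the storage it needs. The `channels` parameter is an addition for completeness (the article assumes a single channel), and the 16-bit depth paired with the 8 kHz and 44.1 kHz rates is chosen here only for illustration.

```python
def bit_rate_bps(sample_rate_hz, bit_depth, channels=1):
    """Bit rate of uncompressed audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def storage_bytes(sample_rate_hz, bit_depth, seconds, channels=1):
    """Bytes needed to store `seconds` of uncompressed audio (8 bits per byte)."""
    return bit_rate_bps(sample_rate_hz, bit_depth, channels) * seconds // 8


# 8000 samples/second x 16 bits/sample = 128000 bits/second (128 kbps)
print(bit_rate_bps(8000, 16))       # 128000

# 44100 samples/second x 16 bits/sample = 705600 bits/second (705.6 kbps)
print(bit_rate_bps(44100, 16))      # 705600

# One minute of single-channel audio at each setting, in bytes
print(storage_bytes(8000, 16, 60))  # 960000
print(storage_bytes(44100, 16, 60)) # 5292000
```

Because the bit rate depends on both factors, two recordings with the same sample depth can have very different bit rates, which is why the two terms should not be used interchangeably.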

The bitrate in audio varies from application to application. Applications that require high audio quality, such as music, typically have a higher bitrate, producing higher quality, or "clearer" audio. Telephone audio, including call center audio, does not require a high bitrate, so the bitrate of a regular phone call is usually much lower than that of a music CD. Whether it's sample rate or bit rate, lower values may sound worse, but again, depending on the application, lower values may save storage space and/or processing power.

Finally, what exactly does compression mean when it comes to audio? Compressed audio formats, such as AAC or MP3, have bitrates that are smaller than the true product of sample rate and sample depth. These formats work by "surgically" removing information from the bitstream: frequencies or amplitudes that the human ear cannot perceive in a given context are not stored, resulting in smaller overall file sizes.
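To make "smaller than the true product" concrete, the sketch below compares an uncompressed bit rate with a compressed one. The 128 kbps figure is a hypothetical encoder setting picked for illustration, not a value from the article, and the function names are illustrative.

```python
def uncompressed_bit_rate_bps(sample_rate_hz, bit_depth):
    """The 'true product' of sample rate and sample depth, in bits per second."""
    return sample_rate_hz * bit_depth


def compression_ratio(uncompressed_bps, compressed_bps):
    """How many times smaller the compressed stream is than the uncompressed one."""
    return uncompressed_bps / compressed_bps


# One channel at 44.1 kHz and 16 bits per sample, uncompressed
uncompressed = uncompressed_bit_rate_bps(44100, 16)

# Hypothetical compressed bit rate chosen for this example (128 kbps)
compressed = 128_000

print(uncompressed)                                       # 705600
print(round(compression_ratio(uncompressed, compressed), 2))  # 5.51
```

Under these assumed numbers, the compressed stream carries roughly one fifth of the bits per second of the uncompressed one.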
