The mini program calls the WeChat recording interface directly and then uploads the recording to the server. Baidu Voice's interface cannot recognize files in this format (silk), so two questions arise:
1. How to transcode? ffmpeg?
2. How to use it in PHP?
This article will solve these two problems.
Third-party tools used:
1. Baidu Voice
2. Silk file format conversion
1. Transcoding silk format files
The tool is here: github.com/kn007/silk-v3-decoder
It should be noted that:
1. First, install ffmpeg; search Baidu for the installation steps for your system. If you want to transcode to MP3, remember to build ffmpeg with libmp3lame enabled, that is, with --enable-libmp3lame.
2. If silk-v3-decoder reports that transcoding failed, either add ffmpeg to the PATH environment variable or modify converter.sh so that it can locate ffmpeg, for example by calling it with its absolute path. The latter is recommended, because a shell script executed through PHP does not pick up the environment variables of your login shell.
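If you would rather not edit converter.sh, one alternative (not from the original setup, just a minimal sketch) is to extend PATH inside the PHP process before running the script, since child processes started by exec() inherit it. The helper name prependToPath and the directory /usr/local/bin are assumptions; check where ffmpeg lives on your server with `which ffmpeg`.

```php
<?php
// Hypothetical helper: put the directory that holds ffmpeg at the front of
// PATH for this PHP process, so scripts started via exec() can find ffmpeg.
function prependToPath(string $dir): string
{
    $current = getenv('PATH') ?: '';
    $parts = $current === '' ? [] : explode(PATH_SEPARATOR, $current);
    if (!in_array($dir, $parts, true)) {
        array_unshift($parts, $dir);
    }
    $newPath = implode(PATH_SEPARATOR, $parts);
    putenv('PATH=' . $newPath);
    return $newPath;
}

// Assumed install location of ffmpeg; adjust to your server.
prependToPath('/usr/local/bin');
```

This only works if putenv() is not disabled in your PHP configuration; if it is, editing converter.sh remains the simpler route.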
2. How to use it in PHP
With the tool above, speech recognition is no longer something to worry about.
The server currently runs ThinkPHP 5, and the mini program uploads the audio files to the backend. TP5 already ships a file upload wrapper, so the upload code itself is not detailed here.
We only need to read the uploaded file, transcode it through the shell command, and then send the transcoded file to the Baidu voice interface to get the speech recognition result.
You need to pay attention to the following points:
1. PHP performs the transcoding by running converter.sh as a shell command.
$real_file is the absolute path of the audio file to be transcoded. Converting to WAV is recommended; conversion to amr did not succeed, for reasons unknown.
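The original code listing is not available, so what follows is only a sketch of what the transcoding step could look like. It assumes converter.sh takes the input file and the output format as its two arguments and writes the result next to the input with the new extension; please check both against the silk-v3-decoder README. The function names and the converter path are hypothetical.

```php
<?php
// Build the shell command that runs converter.sh on one file.
// Assumption: the script is called as `sh converter.sh <input> <format>`.
function buildTranscodeCommand(string $converter, string $realFile, string $format = 'wav'): string
{
    return sprintf(
        'sh %s %s %s 2>&1',
        escapeshellarg($converter),
        escapeshellarg($realFile),
        escapeshellarg($format)
    );
}

// Assumption: the output keeps the input's base name and gets the new extension.
function transcodedPath(string $realFile, string $format = 'wav'): string
{
    return preg_replace('/\.[^.\/]+$/', '', $realFile) . '.' . $format;
}

// Run the conversion and return the path of the transcoded file.
function transcodeSilk(string $converter, string $realFile, string $format = 'wav'): string
{
    exec(buildTranscodeCommand($converter, $realFile, $format), $output, $status);
    $target = transcodedPath($realFile, $format);
    // Check the file as well as the exit code, in case the script reports
    // failure only through its output.
    if ($status !== 0 || !is_file($target)) {
        throw new RuntimeException("Transcoding failed:\n" . implode("\n", $output));
    }
    return $target;
}
```

A call would look like transcodeSilk('/path/to/silk-v3-decoder/converter.sh', $real_file). Passing every argument through escapeshellarg() matters here because the file name comes from an upload.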
2. Format settings when calling the Baidu Voice interface
You need to set the audio format to WAV, with a sampling rate of 16000 or 8000.
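As a rough illustration of these settings, the sketch below builds a JSON request body with the format fixed to wav and the rate limited to 16000 or 8000. The field names (format, rate, channel, cuid, token, speech, len) are written from memory of Baidu's REST interface and should be treated as assumptions to confirm in the official documentation; buildAsrPayload is a hypothetical helper, and the Baidu PHP SDK may be the more convenient route in practice.

```php
<?php
// Build the JSON body for a speech recognition request.
// Field names are assumptions; verify them against Baidu's documentation.
function buildAsrPayload(string $audioBytes, string $token, string $cuid, int $rate = 16000): string
{
    if (!in_array($rate, [8000, 16000], true)) {
        throw new InvalidArgumentException('rate must be 8000 or 16000');
    }
    return json_encode([
        'format'  => 'wav',                      // matches the transcoded file
        'rate'    => $rate,                      // sampling rate in Hz
        'channel' => 1,                          // mono audio
        'cuid'    => $cuid,                      // caller-chosen client id
        'token'   => $token,                     // access token obtained separately
        'speech'  => base64_encode($audioBytes), // audio content, base64 encoded
        'len'     => strlen($audioBytes),        // byte length before encoding
    ]);
}
```

The audio bytes would come from file_get_contents() on the transcoded WAV file; sending the request and obtaining the token are left out on purpose.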
3. Summary
The voice interface of the mini program differs from that of the official account: the official account returns amr, while the mini program returns silk. Silk is a network audio format open sourced by Skype, and it can be transcoded with the silk-v3-decoder tool linked above. The tool first decodes the file to pcm and then converts the pcm to the specified format. In fact, Baidu Voice can already recognize pcm directly, so if you need that, you can modify the converter.sh script accordingly.