"Hi, welcome baby to the live broadcast room, click to follow and not get lost, there are live broadcast benefits every night~"
"618 welfare is here, fans in the live broadcast room, we will be the first to draw free orders at 12 o'clock~"
When you walk into a livestream and a "virtual anchor" greets you, don't be surprised. This year's 618 shopping festival has a new highlight on the livestream track: on the major platforms, more and more virtual humans are hosting livestreams as anchors.
In fact, unless you look closely, it is hard to tell that these anchors are AI virtual anchors. They not only look like real people; their voices, emotions, and movements are also very realistic, and some can even sing and dance.
AI-powered digital humans break the dimensional wall of livestream selling
Beginning with early experiments in livestreams hosted by virtual IPs such as "Yi Zen Little Monk", "I Don't Eat for Free", and "Momojiang", AI digital human livestreams are gradually changing traditional retail and e-commerce. More and more well-known brands have begun using AI virtual humans to promote their products in livestreams, including virtual anchors such as Perfect Diary's "Stella", Nature Hall's "Tang Xiaomei", and Hua Xizi's "Hua Xiaoxi".
Compared with human anchors, whose operating costs are high, virtual anchors are not restricted by time, location, or environment. They can start broadcasting with one click and stay online 24 hours a day, which greatly reduces merchants' livestreaming costs. For e-commerce platforms, virtual anchors are a part of the ecosystem that cannot be ignored: by continuously lowering the barrier to livestreaming, a platform becomes more attractive to small and medium-sized merchants.
High-quality speech synthesis dataset for creating high-quality, "eloquent" anchors
Selling goods through AI digital human livestreams is already a general trend. However, widespread commercialization still faces considerable challenges. For high-end virtual humans in particular, the more realistic the effect, the higher the cost.
In a livestream, the anchor introduces products mainly through sound and picture, and sound is the "first medium" that cannot be ignored. First, the anchor needs a natural, fluent, and expressive voice to give users a comfortable listening experience; a voice that is too mechanical and flat reduces viewers' desire to keep watching. Second is the interactive experience: for example, the anchor sending red envelopes by voice command, or fans talking with the anchor through voice chat, increases user stickiness in the livestream.
Therefore, to achieve better livestream results and user experience, merchants need to keep tuning the virtual anchor's voice interaction capabilities, polishing its livestream skills, and improving how it interacts with users.
Any machine learning capability depends on accumulated algorithms and data. Improving voice interaction in livestreaming scenarios requires a large amount of high-quality data from those scenarios to support model training.
Biaobei Technology has worked in AI data services for many years and has extensive practical experience in data collection and annotation. For the livestream selling scenario, Biaobei Technology has built a high-quality speech synthesis database using professional-grade recording studios and high-quality voice actors, and has completed the database's phonetic character annotation, prosody annotation, phoneme boundary annotation, colloquial label annotation, and so on. The database can be used directly for algorithm optimization to make the synthesized voice more stable and natural.
Speech synthesis database for the livestream selling scenario
Language: Mandarin Chinese; mixed Chinese and English
Collection environment: professional recording studio, signal-to-noise ratio not less than 35 dB
Data duration: 5 hours of Mandarin Chinese, 1 hour of mixed Chinese and English
Recording corpus: anchors' livestream selling scripts
Sampling format: uncompressed PCM WAV
Sampling rate: 48 kHz, 24-bit
Annotation content: phonetic character annotation, prosody annotation, phoneme boundary annotation, and labels for stress, drawn-out sounds, laughter, and other events
Applicable fields: livestream selling
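For readers who receive audio in this format, a quick check of a file against the sampling parameters listed above might look like the following Python sketch. It is only an illustration, not part of Biaobei Technology's tooling: the function names are hypothetical, and it relies solely on the standard-library wave module, which reads uncompressed PCM WAV files.

```python
import io
import wave

# Values taken from the specification list above.
EXPECTED_SAMPLE_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 3  # 24-bit samples


def check_wav_format(source):
    """Return a list of mismatches between a WAV file and the listed spec.

    `source` is a file path or a binary file-like object. A file that the
    wave module cannot parse as PCM WAV is reported as a mismatch as well.
    """
    problems = []
    try:
        with wave.open(source, "rb") as wav:
            if wav.getframerate() != EXPECTED_SAMPLE_RATE_HZ:
                problems.append(f"sample rate is {wav.getframerate()} Hz")
            if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
                problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
    except (wave.Error, EOFError) as exc:
        problems.append(f"not readable as PCM WAV: {exc}")
    return problems


def make_silent_wav(sample_rate, sample_width, n_frames=480):
    """Build a short silent mono WAV in memory (hypothetical demo helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * sample_width * n_frames)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # An empty list means the file matches the listed sampling parameters.
    print(check_wav_format(make_silent_wav(48_000, 3)))
    print(check_wav_format(make_silent_wav(16_000, 2)))
```

The check covers only the container-level parameters (sample rate and bit depth); the signal-to-noise ratio and the annotations would need separate verification.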
We welcome industry partners who are interested in the above data sets to contact us~
If the above data does not meet your current needs, Biaobei Technology also provides data customization services for specific groups of people, scenarios, and languages, to help corporate customers obtain the data they need.