On July 5, 2017, Alibaba Group officially released Tmall Genie X1, the first intelligent voice terminal developed by its A.I. Laboratory (A.I. Labs). It is a consumer-grade AI product for home users, priced at 499 yuan, with AliGenie, Alibaba's first-generation human-computer communication system, built in.
When the user says "Tmall Genie" to the smart speaker, it wakes AliGenie in the cloud to provide services such as playing music, reading stories, telling jokes, checking fortunes, playing games, checking the weather, finding a mobile phone, looking up encyclopedia entries, setting alarms and timers, topping up phone credit, tracking parcels, checking prices, controlling the Tmall Box, and controlling smart appliances. Relying on Alibaba Cloud's machine learning technology and computing power, AliGenie can continue to evolve and grow, becoming an ever more intelligent assistant.
Alibaba released Tmall Genie X1 to explore the new world of human-computer interaction
"This is our exploration of the new world of human-computer interaction. I hope to experience the fun of exploring the unknown world with everyone." Shi Xue, head of Alibaba A.I. Labs, said that language is the most important means of communication between people, and that it should also be the main way people communicate with another kind of intelligence. Cloud integration is driving a trend toward highly intelligent devices, and smart terminals need a more powerful form of human-computer interaction than the mobile phone touch screen.
Lightweight and smart in appearance
The Tmall Genie X1 has a cylindrical design in a black-and-white color scheme, with a diameter of 83 mm. In the center of the top of the X1 there is a mute button. When the user presses it, the X1 immediately stops sound playback and speech recognition to protect user privacy.
A hidden light is built into the bottom of the X1. The device judges the user's position from the sound and lights up in response. The light also gives different prompts depending on the function and scene.
In terms of configuration, the X1 uses the industry's first SmartAudio professional processing chip, which has 25% higher processing efficiency and 32% lower power consumption than the mainstream chips on the market.
The X1 is equipped with a 6-microphone circular array that enables voice recognition within a range of 5 meters in the home. An independent power amplifier chip with professional sound tuning also gives the X1 excellent loudspeaker playback.
To cope with varying sound environments, the X1 also has some self-learning capability: it optimizes itself against ambient noise to adapt to different home environments. After a week or so of use, the X1 will be better adapted to its environment, and Alibaba says its speech recognition accuracy will reach the highest level in the industry.
Considering the complexity of Chinese semantics, Alibaba A.I. Labs collects the ways people phrase requests in various everyday scenarios through crowdsourcing platforms. For weather forecasts alone, it can understand 786 different Chinese phrasings. Through deep machine learning, the Tmall Genie X1 now covers Chinese natural semantic understanding in 20 fields and can understand 80% of human intentions.
Voiceprint recognition can distinguish everyone in the family
Unlike other smart speakers, the Tmall Genie X1 goes beyond voice-controlled music and audio playback: through AliGenie it has access to a wealth of life services. Partners signed so far include Mattel, KEEP, Xixi Paradise Complex, Youku, Gaode Maps (AutoNavi), Taobao Ticket, Alipay, Xiami Music, Tmall Supermarket, Cainiao Guoguo, Ximalaya FM, Taobao, Alibaba Intelligent Alliance, Alibaba Digital Entertainment, Tmall Box, Crayola, Wu Xiaobo Channel, Fliggy, Hema Fresh and others. AliGenie's many partners and third-party skill services will bring a better experience.
The Tmall Genie X1 can tell family members apart using voiceprint recognition. According to Shi Xue, voiceprint recognition is one of the important methods of biometric identification and, combined with the multiple security mechanisms of the service chain, has reached commercial grade. It is also one of Alibaba's core technologies in the field of speech deep learning. It is reported that the Tmall Genie can currently identify up to 6 people. Once it has identified the user, voiceprint recognition enables "a thousand faces for a thousand people" personalization, setting and pushing different content according to each person's preferences.
For example, voiceprint recognition can be applied to shopping. The user first registers their voice to generate a voice password, binds it to the device, and enables the voiceprint function. Then, when the user says something like "buy me a box of milk", the Tmall Genie asks the user to repeat a string of random digits for voiceprint verification. If the speaker is confirmed as the registered user, the Tmall Genie debits the user's bound Alipay account to complete the transaction.
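The shopping flow described above can be sketched roughly as follows. This is only an illustrative outline, not AliGenie's actual implementation: the names `Account`, `make_challenge` and `verify_and_pay` are hypothetical, and the voiceprint match itself is replaced by a simple comparison of speaker labels, since the article does not describe how the real matching works.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical stand-in for the payment account bound to the speaker."""
    balance: float

    def debit(self, amount: float) -> None:
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


def make_challenge(n_digits: int = 6) -> str:
    """Generate the random digit string the user is asked to repeat."""
    return "".join(secrets.choice("0123456789") for _ in range(n_digits))


def verify_and_pay(account: Account, enrolled_speaker: str, challenge: str,
                   heard_speaker: str, heard_digits: str, price: float) -> bool:
    """Debit the account only if both checks pass.

    Assumption: `heard_speaker` is the label a real voiceprint model would
    assign to the reply; here it is just compared with the enrolled label.
    """
    if heard_speaker != enrolled_speaker:
        return False  # voice does not match the registered user
    if heard_digits != challenge:
        return False  # wrong digits, possibly a replayed recording
    account.debit(price)
    return True


if __name__ == "__main__":
    account = Account(balance=50.0)
    code = make_challenge()
    paid = verify_and_pay(account, "alice", code, "alice", code, 12.0)
    print(paid, account.balance)
```

Using a fresh random digit string for each purchase means a recording of an earlier reply would not contain the right digits, which is presumably why the flow asks for one.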
The Tmall Genie X1 began a limited beta on July 5. Users can apply to join the beta on the Tmall website (bot.tmall.com), and the first batch will be officially released on August 8.
AliGenie Developer Platform released at the same time, opening up core hardware and software technology
According to reports, the Tmall Genie X1 has the first-generation human-machine communication system AliGenie built in. AliGenie was developed by Alibaba's team of scientists, applying natural language understanding and processing technology accumulated over many years. On the same day, Alibaba A.I. Labs also released the AliGenie developer platform for developers and hardware vendors.
The AliGenie developer platform will open up technologies such as NLP semantic understanding and TTS speech synthesis to application developers. Developers can create skills to serve more voice users, or connect their devices to cloud services for voice interaction.
In addition to open technology, an open ecosystem will be a defining feature and focus of the platform. Developers can build a variety of "skills" for Tmall Genie users. At present, Tmall Supermarket, Cainiao, KEEP and others have launched voice applications based on the Tmall Genie X1, letting users top up phone credit, purchase goods, follow spoken fitness prompts and more. On-call services such as ordering takeaway and booking cleaning will go online soon.
For content creators, AliGenie also provides a voice official account function. Developers can create and publish applications by uploading voice or text to the back end. Text is converted to voice by the speech synthesis engine, and users can subscribe, schedule playback or play on demand. Developers can also push content actively, or integrate deeply with other applications for combined playback, which opens up a new channel of communication for content creators.
For hardware manufacturers, the AliGenie Developer Platform provides reference designs ranging from single-microphone to multi-microphone arrays, along with reference designs for related kits covering wake-word customization, acoustic structures, core circuit design and chip solutions. It also provides cloud services and the full set of tools and user application SDK components needed for application management. Connected hardware devices can quickly gain human-machine voice interaction capabilities and share all the skills in the application store.
At the press conference, Alibaba A.I. Labs also announced its first open hardware partner: it will work with international toy giant Mattel to explore opportunities for jointly developing Mattel's core IP, such as Fisher-Price, Barbie, and Thomas & Friends.