Demystifying Alibaba A.I. Labs' First Smart Speaker: the Tmall Genie X1


On the afternoon of July 5th, Alibaba A.I. Labs officially released its first smart device, the Tmall Genie X1, in Beijing. According to reports, the product runs AliGenie, the first-generation Chinese human-machine communication system, which uses a Chinese semantic understanding engine developed in-house by Alibaba A.I. Labs. It relies on Alibaba Cloud's machine learning technology to provide smart home control, voice shopping, mobile phone top-up, music playback, and other functions.

Here are the questions and answers about Tmall Genie X1 and AliGenie!

First, about the Tmall Genie X1

Q: What is the Tmall Genie X1?

A: The Tmall Genie X1 is the first intelligent voice terminal launched by Alibaba A.I. Labs. It has AliGenie, the first-generation Chinese human-machine communication system, built in. AliGenie lives in the cloud and understands voice commands in Mandarin Chinese. It currently supports smart home control, voice shopping, mobile phone top-up, outgoing calls, music and audio playback, and other functions, bringing a new experience of human-computer interaction. Relying on Alibaba Cloud's machine learning technology and computing power, AliGenie can keep evolving, learn the user's preferences and habits, and become an intelligent assistant for people.

Q: Can you introduce Alibaba A.I. Labs?

A: Alibaba A.I. Labs was founded in 2016 and is responsible for developing consumer-grade AI products within the Alibaba Group. Its mission is to explore the new world of human-computer interaction and lead people to experience the fun of exploring the unknown.

Q: Why did Alibaba build the Tmall Genie X1?

A: Language is the most important means of communication between people, and it should also be the main way people communicate with another kind of intelligence. We believe that, with the high intelligence brought by cloud integration, smart terminals need a form of human-computer interaction more powerful than the mobile phone touch screen. AliGenie will carry Alibaba's mission in the field of intelligent human-machine communication systems. We will provide developers and hardware vendors with a developer platform that includes voice technology, service portals, and hardware solutions, and that integrates the rich Internet services and commerce-linking capabilities of the Alibaba ecosystem to give consumers a new intelligent experience. The Tmall Genie X1 is just one new tree growing in this ecosystem; we hope to grow a new forest in the future.

Q: What does the name X1 mean?

A: In mathematics, "X" stands for unknowns and variables. The Tmall Genie X1 marks the beginning of consumer AI products in China, a field full of unknowns and variables. It is also the first product launched by Alibaba A.I. Labs, so it is named X1.

Q: What features does the Tmall Genie X1 currently have?

A: Current functions include playing music and audio content, listening to stories, telling jokes, checking horoscopes, playing games, checking the weather, finding your mobile phone, looking things up in an encyclopedia, setting alarms and timers, checking food calories, topping up phone credit, tracking parcels, checking prices, controlling the Tmall Magic Box, shopping at Tmall Supermarket, and controlling smart appliances. As developers join the platform, the number of functions the Tmall Genie X1 supports will grow rapidly. For details, check the official Tmall Genie website or download the Tmall Genie app.

Q: Where does the voice of the Tmall Genie X1 come from?

A: We auditioned 100 professional voice actors and chose the one we were most satisfied with. Combined with speech synthesis technology, this became the voice everyone hears today, and we hope everyone likes it. In addition, a voice pack interface will be opened in the future.

Q: What is the hardware configuration of the Tmall Genie X1?

A: The Tmall Genie X1 uses the first chip developed specifically for the intelligent voice industry, with special optimizations for decoding, noise reduction, sound processing, and multi-channel coordination. Because AliGenie requires a lot of audio processing and sound synthesis, the custom chip incorporates a separate NEON processing unit. NEON technology accelerates audio and speech processing, telephony, and voice synthesis, resulting in better speech recognition and audio processing.

For sound pickup, we adopted a solution widely recognized in the industry as excellent: a six-microphone array. The six high-sensitivity microphones at the top collect sound from different directions, making it easier to pick out useful information from the surrounding noise and enabling better far-field interaction.
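To illustrate why combining several microphones helps pick a voice out of background noise, here is a minimal delay-and-sum sketch in Python. It is not the X1's actual algorithm, which this article does not describe. The function name `delay_and_sum`, the integer sample delays, and the toy pulse signal are all assumptions made for illustration.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   assumed arrival delay (in samples) of the target sound
              at each microphone.
    Hypothetical helper for illustration only.
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, d in zip(channels, delays):
        for i in range(n):
            j = i + d  # read the sample that arrived d samples later on this mic
            if 0 <= j < n:
                out[i] += samples[j]
    return [v / len(channels) for v in out]


# Toy example: one pulse reaches six microphones at slightly different times.
delays = [0, 1, 2, 3, 2, 1]
n = 16
channels = []
for d in delays:
    ch = [0.0] * n
    ch[5 + d] = 1.0  # the pulse arrives at sample 5 + delay on this microphone
    channels.append(ch)

steered = delay_and_sum(channels, delays)     # aligned toward the source
unsteered = delay_and_sum(channels, [0] * 6)  # plain average, no alignment
```

When the delays match the source direction, the six copies of the pulse add up coherently in `steered`, while in `unsteered` they stay smeared across several samples. Sounds arriving from other directions are smeared in the same way, which is what lets an array favor one direction over another.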

Q: Can it be used normally in a noisy environment? How is that achieved?

A: The team behind the Tmall Genie X1 has done a great deal of research on noise reduction and optimized it for home use. The Tmall Genie does not work in a perfectly quiet environment; a home contains all kinds of noise. The developers ran thousands of experiments in kitchens, living rooms, bedrooms, studies, and other rooms, and with materials such as glass, wood, concrete, metal, and stone. The stone and wood materials commonly used in Chinese homes were measured specifically, so that wake-up works in the home environment. The device also has a learning capability: it adapts to the noise of each household, and after about 7 days of optimization it is better suited to that home.

In addition, the Tmall Genie X1 also uses techniques such as echo cancellation and far-field pickup to receive voice commands even while playing music.
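Echo cancellation can be sketched with a normalized least-mean-squares (NLMS) adaptive filter, a common textbook approach. The article does not say which method the X1 uses, so the function `nlms_echo_cancel`, its parameters, and the simulated two-tap echo path below are assumptions for illustration only.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the loudspeaker echo from the mic signal.

    far_end: samples sent to the loudspeaker (the music being played).
    mic:     samples captured by the microphone (echo plus any speech).
    Returns the residual signal after the estimated echo is removed.
    """
    w = [0.0] * taps    # adaptive weights: the current echo-path estimate
    buf = [0.0] * taps  # most recent far-end samples, newest first
    out = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # predicted echo
        e = d - y                                   # residual after cancellation
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out


# Simulated scenario: the microphone hears only an echo of the music.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [0.6 * far_end[i] + (0.3 * far_end[i - 1] if i > 0 else 0.0)
       for i in range(2000)]

residual = nlms_echo_cancel(far_end, mic)
```

In this noise-free toy case the filter learns the echo path and the residual shrinks toward zero. If a spoken command were also present in `mic`, it would remain in the residual, which is why a device can still hear commands while it plays music.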

Q: When is it officially released?

A: The Tmall Genie X1 will start a limited beta on July 5th. Users and developers can apply for beta testing on the website. On August 8th, the first batch will be officially launched for Tmall members.

Q: Does it require a mobile app?

A: On first use, the user installs the Tmall Genie app on a phone to bind an account. The app shows the connection status of the device in real time, the responses to commands, and notifications of newly launched functions, and it proactively recommends content that suits the user's habits.

The Tmall Genie app will be available in major app stores on July 5th. X1 users can also reach the same page through “Mobile Taobao” > “My Device”, without installing a separate app.

Second, about AliGenie and the developer platform

Q: What features does AliGenie currently cover?

A: At present, the following functions are available. As more functions are developed and third-party developers join, the list will continue to grow.

1. Music and audio: massive music and content libraries
2. Home control: voice control of smart home appliances
3. Shopping and top-up: the entire purchase completed by voice, secured by voiceprint
4. Children's education: selected children's audio content that is both educational and entertaining
5. Skills market: aggregates various services and content, continuously expanding functions

Q: Which everyday and business scenarios will AliGenie enter in the future?

A: Industry solutions that have been launched or are being developed include:

1. The children's market
2. Hotels
3. Family scenarios
4. Other business-to-business (To B) scenarios
5. Offline retail
6. Other devices with displays

Q: Which smart appliances can AliGenie connect to?

A: It currently supports more than 100 brands, including products built on smart home solutions from the Ali Smart Alliance, Tuya, Broadlink, and others. More smart appliances are being connected.

Q: What capabilities will the AliGenie Developer Platform open up?

A: The AliGenie developer platform targets four types of developers: content developers, application developers, smart home developers, and hardware manufacturers. Developers can create skills to serve more voice users, or connect their devices to cloud services for voice interaction.

Building on its underlying technology, intelligent algorithm engine, cloud services, and mature software and hardware standards, AliGenie will continue to offer comprehensive and easy-to-use core technical capabilities, giving developers more possibilities. Through the AliGenie developer platform, developers can reach hundreds of millions of consumers and a vast range of everyday and business scenarios in the Alibaba ecosystem.

Q: What core technologies does the AliGenie Developer Platform open up?

A: 1. Deep learning

Our in-house, world-leading deep learning technology serves as the brain of AliGenie, and related work has been published at leading international conferences such as KDD and CVPR. It learns quickly and efficiently from massive data and can be used in a wide range of application scenarios.

2. Natural language processing

Based on our vast amount of natural language data and our own internationally advanced deep learning technology, we have achieved efficient, accurate, and stable natural language understanding.

3. Search/recommendation algorithms

Using the user profiles Alibaba has accumulated, we provide users with the information and content services they need.

4. Knowledge representation and reasoning-based question answering

We have built a vast knowledge base that provides a structured description of everything. This knowledge base not only helps us understand language better but, more importantly, lets us answer a wide range of knowledge questions through reasoning.

We will open the above capabilities to developers and hardware vendors free of charge, so they do not have to build an AI voice system from scratch. This saves substantial R&D investment and lets developers serve users better.

Q: How do I become a developer on the AliGenie platform?

A: You can apply to become a developer through our developer platform. After simple authentication, you can use our deep learning training platform.

1. Register on the official website and fill in your information to apply for an invitation code.

2. An invitation code is issued within one week, giving access to the related tools and platforms.

3. Use the platform to develop your application, then submit it for testing and review. Once approved, it can be launched in the app store.

Q: How do hardware vendors integrate AliGenie into their products?

A: We will introduce a full set of hardware reference design solutions to give our partners sufficient support.

1. The manufacturer applies for cooperation documents and technical reference documents through the official website.

2. We will evaluate the application and discuss with the partners to prepare the relevant hardware design, access plan and business strategy.
3. The two sides carry out joint development and testing. The whole process takes about 1.5 months.

Q: What kind of intelligent hardware can be connected to AliGenie? How do hardware vendors join?

A: At present, tens of millions of smart home devices in the Ali Smart Alliance can already connect to the Tmall Genie X1.

Hardware device vendors can access AliGenie in two ways:

1. Access via SDK

We provide device vendors with SDKs for common platforms (such as embedded Linux and Android). The SDK includes modules for persistent-connection communication, device-user binding, audio playback control, and state management. Implementation details are encapsulated, so integration is easy.

2. Access by protocol

We provide a set of standardized protocols based on WebSocket. Vendors connect directly over the protocol and call AliGenie's capabilities directly.
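The article does not publish the message format of this protocol. Purely as an illustration of what a message carried in a WebSocket text frame might look like, the sketch below builds and validates a JSON envelope. The field names (`type`, `device_id`, `payload`), the message type `device.bind`, and both helper functions are hypothetical and are not taken from AliGenie documentation.

```python
import json

# Fields this sketch assumes every message must carry (hypothetical).
REQUIRED_FIELDS = {"type", "device_id", "payload"}


def build_message(msg_type, device_id, payload):
    """Serialize one hypothetical request as the body of a text frame."""
    return json.dumps(
        {"type": msg_type, "device_id": device_id, "payload": payload},
        sort_keys=True,
    )


def parse_message(text):
    """Parse a frame body and check that the assumed fields are present."""
    msg = json.loads(text)
    missing = REQUIRED_FIELDS - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return msg


frame = build_message("device.bind", "demo-device-001", {"user_token": "placeholder"})
decoded = parse_message(frame)
```

A real integration would send such a body over an established WebSocket connection and follow the vendor's published schema rather than these invented field names.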

Q: How is revenue shared with developers?

A: Developers keep all of the revenue; the platform takes no share during the promotion period. We will also launch the related Ali AI Innovation Developer Program.

Third, about technology

Q: Was this product developed by Alibaba itself? What are the core technologies?

A: Both the Tmall Genie X1 and AliGenie were developed by Alibaba's team of scientists and engineers, drawing on Alibaba's many years of work in speech recognition, natural language processing, human-computer interaction, and other technologies. Alibaba A.I. Labs is applying for patents on core technologies such as voiceprint recognition, voiceprint purchasing, and the NLP Chinese dialogue engine. Not long ago, our NLP team also published a paper at KDD 2017, an authoritative international conference.

Q: What are the unique technical advantages of Alibaba in the field of artificial intelligence?

A: In the 2016 Speaker Recognition Evaluation (NIST SRE 2016) held by the US National Institute of Standards and Technology, Alibaba competed under the team name OpenSesame. The team used feature extraction based on deep neural networks, improved generalization through distance metric learning, and pioneered the use of symmetric support vector machines to improve system performance. Among nearly two hundred teams, Alibaba's final system ranked first in Greater China and second in the US division for speaker recognition performance. We also filed four related patents, and the team was invited to present the system at the NIST SRE 2016 workshop.

At Interspeech 2017, a top international speech conference, two of our papers were also accepted: "The Opensesame NIST 2016 Speaker Recognition Evaluation System" and "The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016".

This voiceprint recognition technology is also used in the X1, which identifies different users by the characteristics of their voices, protecting security and privacy. Once it has learned each person's voice, the X1 can also personalize the experience ("a thousand people, a thousand faces"), applying settings and recommending content according to each person's preferences.

Based on voiceprint recognition, we also launched the voiceprint purchase function, the first commercial voiceprint shopping system, which allows payment by voiceprint. When you start a purchase, a top-up, or a similar transaction, you only need to say the voiceprint password. The recognition system verifies the speaker's identity and completes the transaction only if the speaker is confirmed to be the authorized user; otherwise the request is rejected.
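The accept-or-reject step described above can be sketched as a similarity check between a stored voiceprint and a new attempt. The article does not describe how the X1 represents or compares voiceprints, so the three-number vectors, the cosine-similarity measure, the 0.8 threshold, and the function names below are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify_speaker(enrolled, attempt, threshold=0.8):
    """Accept only if the attempt is close enough to the enrolled voiceprint.

    The threshold is an illustrative value, not a figure from the article.
    """
    return cosine_similarity(enrolled, attempt) >= threshold


# Toy voiceprint vectors; real systems use much longer learned embeddings.
enrolled = [0.9, 0.1, 0.4]
same_user = [0.85, 0.15, 0.38]
other_user = [0.1, 0.9, 0.2]

accepted = verify_speaker(enrolled, same_user)   # transaction may proceed
rejected = verify_speaker(enrolled, other_user)  # request is refused
```

Raising the threshold makes false acceptances less likely at the cost of rejecting the genuine user more often, which is the trade-off any payment-grade verification has to tune.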

Q: Does the product support multi-turn dialogue?

A: Yes. On top of natural language understanding, Alibaba A.I. Labs has added a “decision engine” mechanism that interprets the context of the conversation and decides which module should respond. Work on this human-computer interaction and natural language processing system was also published at KDD 2017, a top international academic conference, and a technology patent application is in progress.
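As a rough illustration of context-aware routing, the sketch below sends each utterance to a module and falls back to the previous turn's module when no keyword matches. The real decision engine is not described in this article; the class `DecisionEngine`, the keyword matching, and the module names are hypothetical.

```python
class DecisionEngine:
    """Route each utterance to a module, using the previous turn as context."""

    def __init__(self, keywords):
        self.keywords = keywords  # module name -> trigger words (illustrative)
        self.last_module = None   # dialogue context carried between turns

    def route(self, utterance):
        words = utterance.lower().split()
        for module, triggers in self.keywords.items():
            if any(t in words for t in triggers):
                self.last_module = module
                return module
        # No trigger word: treat the utterance as a follow-up to the last turn.
        return self.last_module or "fallback"


engine = DecisionEngine({"weather": ["weather", "rain"], "music": ["play", "song"]})
first = engine.route("what is the weather in Beijing today")
follow_up = engine.route("and tomorrow")
```

Here the follow-up "and tomorrow" contains no weather keyword, yet it is still routed to the weather module because the engine remembers the previous turn.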

Q: Does the X1 use its own technology for semantic understanding? How is it done?

A: The difficulty of Chinese voice interaction lies in the semantic understanding of Chinese. The Chinese semantic understanding engine developed by Alibaba A.I. Labs has been specially optimized for common tasks such as timers, reminders, weather, entertainment content, home control, assistant functions, and shopping. For weather forecasts alone, it can understand 786 ways of asking in Chinese.

Through deep machine learning, the Tmall Genie X1 supports natural-language semantic understanding across 20 domains.

In addition, we take into account many practical features of spoken Chinese: the erhua (r-coloring) of northern speech, terse and direct ways of asking, the reduplicated words children often use, and expressions in which southern speakers' usage is easily confused with standard Mandarin. To handle Chinese pronunciation habits, we specifically optimized for swallowed syllables, enunciation differences, dropped words, the Beijing dialect, and the Henan dialect, compensating for and correcting them.

This semantic understanding system also has a memory function and a strong ability to summarize and generalize. It simulates "long-term memory" and "short-term memory", which lets the system adapt more closely to the user across different scenarios and times. In addition, Alibaba A.I. Labs has started research on many other languages.
