Alibaba Intelligent Voice Platform Helps Human-Computer Interaction

Source: Internet
Author: User
Keywords: Tmall Genie, artificial intelligence
Tags: deep learning, artificial intelligence, Tmall Genie

At the Yunqi Conference, Nie Zaiqing, of the team behind the Tmall Genie, introduced how the voice interaction platform works and how it is being improved. As intelligent devices become ever more widespread, human-computer interaction has become an urgent problem. To address it, Alibaba's A.I. Labs has carried out in-depth research on, and comprehensive optimization of, its intelligent voice interaction platform.

Goals for the next phase of human-computer interaction

Human-computer interaction has gone through a character phase, an image phase, and a touch-screen phase. The spread of touch screens has brought convenience, but also inconvenience; the emergence of the "heads-down tribe", people who are constantly looking down at their phones, is the best proof. Human needs never stop growing, and an interface that ties people's eyes to a screen is no longer convenient. A voice operating system lets people obtain attentive service simply by issuing spoken instructions, without using their eyes or hands, so the spread of intelligent voice interaction platforms is inevitable.

To make the intelligent voice interaction platform more attentive, the first problem to solve is understanding the user correctly. The artificial intelligence team recognized this and worked out a detailed plan. They believe that innovating in human-computer interaction is an effective way to solve the problem. An intelligent voice interaction platform requires not only computing power but also knowledge, reasoning ability, the ability to act, perception, and even cognition. Giving the Tmall Genie's intelligent voice interaction platform these abilities is the goal that A.I. Labs will work toward.

Specific problems to be solved

As a representative product in the industry, the Tmall Genie's voice interaction platform offers many attentive functions that meet the public's daily needs. In the course of meeting those needs, the team also ran into some specific problems:

Promoting intelligent voice interaction platforms requires the effort and cooperation of many industries. For the platform to become part of daily life and bring convenience to everyone, it has to be usable everywhere. For example, ordering food delivery requires the cooperation of ordering apps or restaurants, booking airline tickets requires the cooperation of travel apps or airlines, and checking the weather requires the cooperation of the meteorological department. Improving the platform's functions therefore depends on support from many industries, which is one of the problems to be solved.

Security issues with voice interaction. For example, when a Tmall Genie customer needs to complete a payment, the resulting security problem has to be solved. It is not enough to provide a service based only on the content of the speech; the platform must also make sure it is serving the right person. To identify the person requesting the service, the lab added voiceprint recognition to the Tmall Genie's voice interaction system, so that the right service is given to the right person.
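The article does not say how the voiceprint check works. As a rough sketch only, one common approach compares a fixed-length speaker embedding of the incoming utterance with the enrolled owner's embedding, and allows a sensitive action such as payment only when the similarity clears a threshold. The vectors, the `authorize_payment` helper, and the 0.8 threshold below are illustrative assumptions, not details of the Tmall Genie.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no speaker information
    return dot / (norm_a * norm_b)


def authorize_payment(enrolled, utterance, threshold=0.8):
    """Hypothetical gate: allow payment only if the speaker matches the owner.

    `enrolled` and `utterance` stand in for speaker embeddings that a real
    voiceprint model would produce; the threshold is an arbitrary assumption.
    """
    return cosine_similarity(enrolled, utterance) >= threshold


# Toy embeddings, invented for illustration.
owner = [0.9, 0.1, 0.4]
same_speaker = [0.85, 0.15, 0.38]
other_speaker = [0.1, 0.9, 0.2]

print(authorize_payment(owner, same_speaker))
print(authorize_payment(owner, other_speaker))
```

In practice the threshold trades convenience against security: a higher value rejects more impostors but also rejects the owner more often.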

The understanding of natural language. This issue is critical, and Nie Zaiqing described the solution in detail. Intent recognition is the key to the services the Tmall Genie provides: the voice interaction platform must work out the main intent of the user's voice command before it can perform the correct service. Taking the weather as an example, customers phrase their weather queries in many different ways, but the intent behind them is the same: to check the weather.

The platform needs to recognize many different phrasings, extract the meaning the customer intends, carry out the request accurately, and give the customer a correct response. This may require calling a third-party API and applying a dialog strategy. The difficulty in understanding instructions lies in the variety and ambiguity of human speech, both of which arise because everyday speech is very casual. Take the weather again: a customer can say "what is the weather tomorrow", "I want to know the weather tomorrow", "will it be windy tomorrow", and so on, and all of these instructions mean the same thing, namely a weather query. The voice interaction platform must be able to recognize the many expressions of one instruction and also to tell apart instructions that merely sound similar. The way to solve these problems is to add corpus, which is not a simple task.
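As a minimal sketch of the idea that many phrasings map to one intent, the snippet below compares an utterance with a small hand-written set of example phrasings per intent, using word overlap. The intent names, the example sentences, the Jaccard scoring, and the 0.3 cutoff are all assumptions chosen for illustration; the article does not describe the platform's actual model.

```python
import re

# Hypothetical corpus: a few example phrasings for each intent.
CORPUS = {
    "query_weather": [
        "what is the weather tomorrow",
        "i want to know the weather tomorrow",
        "will it be windy tomorrow",
    ],
    "play_music": [
        "play some music",
        "i want to listen to a song",
    ],
}


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two word sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recognize_intent(utterance, corpus=CORPUS, min_score=0.3):
    """Return the intent whose example best overlaps the utterance, or None."""
    words = tokenize(utterance)
    best_intent, best_score = None, 0.0
    for intent, examples in corpus.items():
        for example in examples:
            score = jaccard(words, tokenize(example))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= min_score else None


print(recognize_intent("What is the weather like tomorrow?"))
print(recognize_intent("Is it going to be windy tomorrow"))
print(recognize_intent("order a pizza"))
```

A phrasing that the corpus does not cover falls below the cutoff and yields `None`, which illustrates why coverage depends on how much corpus has been added.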

Customers cannot be controlled, and we can never predict how a customer will phrase an instruction. Asking experts or professionals to solve this problem is costly, yet the problem must be solved. The proposed solution is therefore to have developers provide the data, by adding corpus and templates to their custom skills. Developers who know little about corpora can simply provide data, such as jokes. For developers who have some understanding of corpora, the department provides corpus, and the developers are responsible for labeling it and adding to it. Customer privacy is also a concern: to avoid leaks, the corpus should not be stored by developers, so this is handled through the memory function of the artificial intelligence.

When a suitable corpus for the customer's request is unclear or does not exist, the platform locates or explores corpus by interactively building substitute-word dictionaries and corpus templates. Specifically, it searches the Internet for corpus similar to the templates, filters out meaningless or useless corpus, and replaces ambiguous wording with clear corpus. This requires building a substitute-word dictionary and gradually improving it through use. For example, when a customer's wording matches an entry in the dictionary and the meaning is the same, the replacement can be made and the customer's wording can also be added to the dictionary. This snowball effect makes the dictionary richer and richer, and the voice interaction platform understands customers more and more accurately and quickly.
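The dictionary-and-template idea described above can be sketched roughly as follows. A corpus template contains slots, a dictionary lists the interchangeable words that may fill each slot, and expanding the template yields many utterances from a small amount of hand-written data. The `learn` method imitates the snowball effect by adding a newly confirmed wording to the dictionary. The class, slot, and word names are hypothetical; the article does not give the platform's data structures.

```python
import re
from itertools import product


class SubstituteDictionary:
    """Hypothetical dictionary: slot name -> set of interchangeable words."""

    def __init__(self, entries):
        self.entries = {slot: set(words) for slot, words in entries.items()}

    def learn(self, slot, word):
        """Add a confirmed wording to a slot; return True if it was new."""
        words = self.entries.setdefault(slot, set())
        if word in words:
            return False
        words.add(word)
        return True


def expand(template, dictionary):
    """List every utterance produced by filling the {slot} placeholders.

    Every slot named in the template must exist in the dictionary.
    """
    slots = re.findall(r"\{(\w+)\}", template)
    choices = [sorted(dictionary.entries[slot]) for slot in slots]
    results = []
    for combo in product(*choices):
        text = template
        for slot, word in zip(slots, combo):
            # Fill one placeholder at a time, left to right.
            text = text.replace("{" + slot + "}", word, 1)
        results.append(text)
    return results


dictionary = SubstituteDictionary({
    "ask": ["what is", "tell me"],
    "day": ["today", "tomorrow"],
})
template = "{ask} the weather {day}"

before = expand(template, dictionary)
dictionary.learn("day", "this weekend")  # wording picked up from a customer
after = expand(template, dictionary)

print(len(before), len(after))
```

Each wording learned for one slot multiplies across every template that uses that slot, which is one way to read the snowball effect the article describes.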

How to avoid clumsy voice interaction

Finally, Nie Zaiqing summarized his views on how to avoid clumsy intelligent voice interaction. First, build vertical applications: a development team should not try to solve every problem, because the technology is not yet advanced enough, so it should focus on vertical applications. Second, keep user expectations realistic: this requires explaining professionally what is and is not possible, so that customers do not expect too much and end up badly disappointed. Third, use a knowledge graph and user profiles: knowledge is the prerequisite for the voice interaction platform to work correctly, and a user profile captures the platform's understanding of the user, which is equally essential. If the voice interaction ecosystem can attract a large number of developers, the system will clearly be completed more effectively.

Artificial intelligence is designed to help people, not to replace people. In keeping with this philosophy, Alibaba will continue to work hard to popularize artificial intelligence and bring convenience to people.
