I am currently working on an in-car project, and one of the requirements is hands-free dialogue between the driver and the phone: the whole interaction must be controlled by voice alone, without using the hands.
This is similar to a human-robot dialogue: the robot stays on standby in the background, and when the user speaks to it, it responds accordingly.
However, the user's phone battery is precious, so we cannot keep the microphone recording and monitoring continuously; that is far too resource-intensive. Therefore we use the wake-up function provided by the iFlytek voice SDK.
How is this done, specifically?
Let's look at a flowchart. It uses most of iFlytek's voice technologies (voice wake-up, wake-up + command word recognition, semantic recognition, speech synthesis). No more talk; see the picture.
The flowchart is already quite clear, so here is just a brief walkthrough.
Wake-up starts when the program starts; from then on, saying the wake-up word wakes the machine and makes it listen for a command. But if a message arrives at that moment, reading the message aloud takes priority: wake-up is suspended for the duration of the broadcast and restarted once it finishes. An important reason for this is that wake-up constantly occupies the recording resource, and speech synthesized while it is running comes out intermittent. I hear this behavior can be configured, but in practice the user will not say the wake-up word while a broadcast is playing anyway.
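The suspend-and-resume rule above can be sketched as follows. `WakeUpEngine` and `Tts` here are hypothetical stand-ins for the real SDK objects, not iFlytek's API; only the ordering (stop wake-up, speak, restart wake-up) reflects the flow described.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the "pause wake-up while broadcasting" rule, assuming two
// hypothetical interfaces in place of the real iFlytek engine objects.
public class BroadcastCoordinator {
    interface WakeUpEngine { void start(); void stop(); }
    interface Tts { void speak(String text); }

    private final WakeUpEngine wakeUp;
    private final Tts tts;

    public BroadcastCoordinator(WakeUpEngine wakeUp, Tts tts) {
        this.wakeUp = wakeUp;
        this.tts = tts;
    }

    // Wake-up holds the recording resource, so release it before speaking
    // and re-acquire it once the broadcast is done.
    public void broadcast(String message) {
        wakeUp.stop();        // suspend wake-up so TTS output does not stutter
        tts.speak(message);   // read the incoming message aloud
        wakeUp.start();       // resume listening for the wake-up word
    }

    /** Demo with fake engines that just record the order of operations. */
    public static List<String> demoLog() {
        List<String> log = new ArrayList<>();
        BroadcastCoordinator c = new BroadcastCoordinator(
            new WakeUpEngine() {
                public void start() { log.add("wake:start"); }
                public void stop()  { log.add("wake:stop"); }
            },
            text -> log.add("tts:" + text));
        c.broadcast("You have a new message");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demoLog());
    }
}
```

The demo only verifies the ordering; in the real app the restart would happen in the synthesizer's completion callback rather than synchronously.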
There are two wake-up modes: simple wake-up, and wake-up + command word recognition.
Simple wake-up just fires a callback when the wake-up succeeds.
Wake-up + command word recognition does more than wake the device: if the user says a command together with the wake-up word, the engine recognizes the command as well, so you can simply receive and execute it without having to start semantic recognition first. This is very convenient for the user.
However, command words have a limitation: a grammar must be built in advance, so the set of command words has to be known ahead of time. If the user says something like "How do I get to Shihezi University?" and that command is not in your grammar file, what then? At that point you have to prompt the user and fall back to semantic input.
So my approach is: the user says the wake-up word plus a command, which triggers the wake-up and then command word recognition. If the command word is recognized, execute the command; if command word recognition fails, start semantic recognition.
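That dispatch decision can be sketched minimally as below. The result shape and the confidence threshold are my assumptions for illustration; the real SDK delivers a grammar-match result with a confidence score that you would inspect in its callback instead.

```java
// Minimal sketch of the wake-up result dispatch: command word hit -> execute,
// miss -> fall back to semantic recognition. Names and threshold are
// hypothetical, not the real SDK's.
public class CommandDispatcher {
    static final int MIN_CONFIDENCE = 30; // assumed threshold for a grammar match

    /** Returns what the app should do next for a given recognition result. */
    public static String dispatch(String commandWord, int confidence) {
        if (commandWord != null && confidence >= MIN_CONFIDENCE) {
            return "execute:" + commandWord;  // command word recognized: run it directly
        }
        return "start-semantic-recognition"; // not in the grammar: ask the cloud
    }

    public static void main(String[] args) {
        System.out.println(dispatch("open navigation", 80));
        System.out.println(dispatch(null, 0));
    }
}
```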
One drawback remains: if the user says the wake-up word immediately followed by content meant for semantic recognition, that content is consumed by the command word recognizer, and the user has to repeat it before semantic recognition can hear it.
To mitigate this, after the wake-up word is recognized, if no command word is recognized we use speech synthesis to prompt the user with "What can I help you with?". This tells the user that the machine did not understand what was just said and that it needs to be repeated. Pretty sly of me, no? O(∩_∩)O
The next step is semantic recognition, and there is not much to say about it, except for one point: if the user stops speaking, should we keep recording? Of course not; think of the battery! To save the user's power, I also designed the flow so that if the user says nothing for 20 s, the app drops back to the waiting-for-wake-up state. How do we count 20 s? With a timestamp: record one every time wake-up or command recognition succeeds. Then, before starting semantic recognition, check whether the difference between the current time and that timestamp exceeds 20 s. If it is under 20 s, go ahead and start semantic recognition; if it is over 20 s, restart wake-up and wait for the user to say the wake-up word again.
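The timestamp check can be sketched like this; the class and method names are mine, not the SDK's.

```java
// Sketch of the 20 s idle timeout described above. The timestamp is refreshed
// on every successful wake-up or command recognition; before starting semantic
// recognition we check how long the user has been silent.
public class IdleTimeout {
    static final long TIMEOUT_MILLIS = 20_000; // 20 s without speech

    private long lastActiveMillis;

    /** Call on every successful wake-up or command recognition. */
    public void touch(long nowMillis) {
        lastActiveMillis = nowMillis;
    }

    /** true: start semantic recognition; false: go back to waiting for wake-up. */
    public boolean shouldListen(long nowMillis) {
        return nowMillis - lastActiveMillis < TIMEOUT_MILLIS;
    }

    public static void main(String[] args) {
        IdleTimeout t = new IdleTimeout();
        t.touch(0);
        System.out.println(t.shouldListen(19_000)); // still within the window
        System.out.println(t.shouldListen(21_000)); // timed out: restart wake-up
    }
}
```

In the real app you would pass `System.currentTimeMillis()` for `nowMillis`; taking it as a parameter just keeps the logic testable.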
All right, that is almost everything. Don't ask me why I don't just make the user say the wake-up word before every single command. If you had to say the wake-up word before everything you did, I guess you would go crazy, and even if you didn't, the people around you would think you had. No offense meant, just joking O(∩_∩)O haha
My github Address: https://github.com/dongweiq/study
Follows and stars are welcome O(∩_∩)O. If you run into any problems, contact [email protected], QQ 714094450.
(Original) Implementing human-computer interaction with iFlytek voice