I have been lazy, so I am only now getting around to writing this blog post.
Here I summarize the ideas behind the project, along with the problems I ran into along the way and how I solved them.
1, Final implementation (Raspberry Pi, PHP+HTML, arecord, Baidu Voice, Face++ image recognition)
1.1, Hardware part
Because I added a switch to control voice input, I used the Raspberry Pi's interrupts. So, besides the Raspberry Pi itself, the hardware consists of a switch, a few DuPont wires, and a few small resistors. The final wiring of the switch to the Raspberry Pi is shown below:
![Wiring schematic of the switch and the Raspberry Pi](https://img-blog.csdn.net/20170410222711875?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvY3Njc2hhaGE=/font/5a6l5l2t/fontsize/400/fill/i0jbqkfcma==/dissolve/70/gravity/southeast)
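On the software side, the switch interrupt can be sketched roughly as follows. The project itself handles this in C with wiringPi; the sketch below swaps in Python and the RPi.GPIO library purely for illustration. The pin number (BCM 17), the falling-edge trigger, the internal pull-up, and the 200 ms debounce time are all assumptions, and the real circuit with its external resistors may be wired differently.

```python
import threading

# Set by the interrupt callback, consumed by the main loop.
recording_requested = threading.Event()


def on_switch(channel):
    """Interrupt callback: keep it short and only flag the main loop."""
    recording_requested.set()


def arm_switch(pin=17):
    """Register on_switch for a falling edge on `pin` (hypothetical BCM pin)."""
    # Imported here because RPi.GPIO is only available on a Raspberry Pi.
    import RPi.GPIO as GPIO

    GPIO.setmode(GPIO.BCM)
    # Assumes the switch connects the pin to ground when pressed.
    GPIO.setup(pin, GPIO.IN, pull_up_down=GPIO.PUD_UP)
    GPIO.add_event_detect(pin, GPIO.FALLING, callback=on_switch, bouncetime=200)


def wait_for_press(timeout=None):
    """Block until the switch fires; return True if pressed, False on timeout."""
    pressed = recording_requested.wait(timeout)
    recording_requested.clear()
    return pressed
```

The callback only sets a flag, so slow work such as recording and the network calls stays in the main loop, which would call `arm_switch()` once and then `wait_for_press()` before each voice input.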
Of course, there is also a monitor and a 4-port USB hub with a switch. There is also a driver-free USB camera with a built-in microphone. These can be found on Taobao for only a few dozen yuan each, but be sure to ask the seller whether the camera is driver-free and whether the microphone is driver-free too. The first one I bought had a driver-free camera, but its microphone needed a sound card. The final parts and results are shown below.
In fact, the monitor has a touch screen, but I never got its driver working, so in the end I used the switch for interaction.
1.2, Software part
The software part mainly consists of the interactive interface and the background control.
The interactive interface runs full screen in the Chromium browser's kiosk mode, with a local web server set up to serve the pages. The main technologies used are Apache+PHP+HTML+CSS+JS. I am still unfamiliar with front-end technology, so this was the only way I could implement it, and the HTML+CSS+JS part is poorly written.
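A minimal sketch of how the kiosk-mode browser could be launched against the local web server. The binary name `chromium-browser`, the extra `--noerrdialogs` flag, and the page URL are assumptions, not taken from the project.

```python
def kiosk_command(url, browser="chromium-browser"):
    """Build the argument list that opens `url` full screen in kiosk mode."""
    return [
        browser,
        "--kiosk",         # full screen, no address bar or window frame
        "--noerrdialogs",  # suppress error dialogs that would cover the page
        url,
    ]
```

On the Pi, this could be started once Apache is running with `subprocess.Popen(kiosk_command("http://localhost/index.php"))`, where `index.php` is a hypothetical entry page.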
Background control is handled mainly by a C program that schedules the whole process, with PHP handling the network interfaces and business logic. The C program uses the OpenCV library for real-time video and the wiringPi library for the interrupt, and audio is recorded with arecord. Image recognition uses the Face++ web API, while speech recognition and synthesis use Baidu's API. I actually started with iFlytek's (讯飞) interface, but most of its examples are implemented in JS, whereas Baidu's interface is a RESTful API. Since my front-end skills are still shallow, I chose Baidu's interface in the end.
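The recording and speech-recognition steps can be sketched as below. This is not the project's code, which uses C and PHP; Python is swapped in for illustration. The ALSA device name `plughw:1,0`, the 5-second duration, and the 16 kHz mono format are assumptions. The JSON field names follow Baidu's REST speech API as commonly documented and should be checked against the current documentation before use.

```python
import base64
import json
import subprocess


def arecord_command(wav_path, seconds=5, device="plughw:1,0"):
    """Build an arecord call for 16 kHz, 16-bit, mono WAV audio."""
    return [
        "arecord",
        "-D", device,        # capture device (assumed: first USB sound device)
        "-t", "wav",         # file type
        "-f", "S16_LE",      # signed 16-bit little-endian samples
        "-r", "16000",       # sample rate in Hz
        "-c", "1",           # mono
        "-d", str(seconds),  # stop after this many seconds
        wav_path,
    ]


def record(wav_path, seconds=5):
    """Run arecord and raise CalledProcessError if it fails."""
    subprocess.run(arecord_command(wav_path, seconds), check=True)


def asr_payload(audio_bytes, token, cuid="raspberry-pi"):
    """Build the JSON body for a speech recognition request.

    `token` is an access token obtained beforehand; `cuid` is any
    identifier for this device (hypothetical value here).
    """
    return json.dumps({
        "format": "wav",
        "rate": 16000,
        "channel": 1,
        "cuid": cuid,
        "token": token,
        "len": len(audio_bytes),  # size of the raw audio before base64
        "speech": base64.b64encode(audio_bytes).decode("ascii"),
    })
```

A caller would run `record("input.wav")`, read the file's bytes, and POST the result of `asr_payload(...)` as JSON to the recognition endpoint. A plain HTTP POST like this is what makes a RESTful interface easy to drive from PHP or C without any front-end code.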