Editor's Note: Microsoft TechFest 2012, the biggest annual event at Microsoft Research, has officially opened. This year's TechFest focuses on "natural human-computer interaction" and "big data". Microsoft Research Asia brought nearly 40 of its latest technologies; the project "converting a single language into mixed languages", in particular, is a typical example of interaction with data. Let's take a look at how to extract big intelligence from big data!
(Images are from the Internet)
At TechFest, nearly 40 innovative technologies from Microsoft Research Asia drew the attention of Microsoft's product groups and guests from all walks of life. Dr. Hsiao-Wuen Hon, Managing Director of Microsoft Research Asia, said: "As Microsoft's largest basic research institution overseas, Microsoft Research Asia has always insisted on advancing the entire field of computer science through technological innovation while helping to improve people's computing experience. We hope that more of Microsoft Research Asia's innovations will be transferred into Microsoft products to accelerate these exciting computing experiences."
Among the technologies presented by Microsoft Research Asia, "converting a single language into mixed languages" records a speaker in a single language, synthesizes training corpora for different languages, and uses statistical models to build a multilingual text-to-speech system. "High-fidelity facial animation capturing" makes full use of state-of-the-art motion capture and 3D scanning technology to obtain high-fidelity 3D facial expressions with realistic dynamic wrinkles and fine facial details. "Automatic reconstruction of buildings in urban areas" lets users start a 3D tour of an urban area from only one image. "Language learning games on Windows Phone and Kinect" focuses on delivering a pleasant "edutainment" language learning experience across different Microsoft product platforms. Next, let's take a look at three of these wonderful projects!
Converting a single language into mixed languages
A speech user interface needs text-to-speech (TTS) technology to "speak". Sometimes the synthesized speech must be in another language, and people sometimes even want it expressed in a mix of languages. For example, when someone travels abroad and is unfamiliar with the local language, it is very convenient if the navigation system can issue instructions in mixed-language mode: street names and other local terms are spoken in the local language, while route directions are spoken in the traveler's native language. Building such a mixed-language system normally requires a voice talent who speaks both languages, but such talent is usually hard to find.
This project demonstrates a new way to translate what a user says into another language while retaining the accent, timbre, and intonation of the user's own voice, so that it sounds as if the user said it personally. Rick Rashid, Microsoft's Chief Research Officer, demonstrated the software by having it translate a passage of his speech into Spanish, Italian, and Mandarin. The pronunciation in all three languages sounded remarkably like Rashid himself.
To use this speech translation system, about one hour of training is needed to model your speech, which is then integrated with Microsoft's standard text-to-speech model for the target language. Take Microsoft's standard Spanish model as an example: the standard model has its own "s" sound, and after training, your own "s" sound replaces it. The same procedure is applied to every phoneme in Microsoft's Spanish text-to-speech model. At present, this method works across all 26 languages supported by the Microsoft Speech Platform, which cover most major languages in the world. For more project details and examples, see http://research.microsoft.com/en-us/projects/mixedlangtts/default.aspx
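The phoneme-replacement idea described above can be sketched in a few lines. This is only an illustrative toy, not Microsoft's actual system: all names (`base_spanish_voice`, `user_models`, the model strings) are hypothetical, and real TTS voices use statistical acoustic models rather than dictionary entries.

```python
# Toy sketch of personalizing a target-language TTS voice by swapping in
# the user's trained phoneme models. All names here are hypothetical.

# A base target-language voice maps each phoneme to an acoustic model.
base_spanish_voice = {
    "s": "es_standard_s_model",
    "a": "es_standard_a_model",
    "r": "es_standard_r_model",
}

# Models trained from roughly one hour of the user's own speech.
user_models = {
    "s": "user_s_model",
    "a": "user_a_model",
}

def personalize(base_voice, user_models):
    """Replace each phoneme's standard model with the user's trained model
    when one exists; otherwise keep the standard model as a fallback."""
    return {ph: user_models.get(ph, model) for ph, model in base_voice.items()}

voice = personalize(base_spanish_voice, user_models)
print(voice["s"])  # the user's own "s"
print(voice["r"])  # falls back to the standard Spanish model
```

Applying the same substitution to every phoneme in the target-language inventory is what lets the synthesized Spanish, Italian, or Mandarin still sound like the original speaker.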
High-fidelity facial animation capturing
"High Fidelity facial animation capturing" shows a new way of High Fidelity 3D facial presentation, used to get realistic dynamic wrinkles and fine facial details. This method makes full use of the most advanced motion capture technology and 3D scanning technology to obtain facial presentation. The system captures the facial performance with the spatial resolution of the static facial scanning System and the acquisition speed of the dynamic capturing system.
Existing techniques for capturing faces and facial expressions include marker-based motion capture and high-resolution scanners. In the marker-based approach, small reflective dots are fixed to the performer's face; as the expression changes, the changing relative positions of these dots are recorded on video. This method accurately captures ever-changing expressions, but its spatial resolution is low, so it cannot capture fine detail. A high-resolution scanner, on the other hand, captures every nuance of the face, including small wrinkles and even skin pores, but is generally limited to static poses. Specially configured high-speed cameras can also capture facial expressions, but they are expensive and provide less facial detail.
Drawing on the strengths of these two capture technologies, the research team set out to combine the accuracy of marker-based motion capture with the rich detail of high-resolution scanning. The researchers also wanted to make capture and reconstruction computationally efficient, minimizing the amount of data needed to reconstruct precise facial expressions.
The research team first used a laser scanner to acquire high-fidelity facial scans, then matched the scans with the corresponding frames of the marker-based facial data. A new algorithm brings the facial scans into mutual registration. Finally, the team combined the motion capture information with the facial scan information to reconstruct the actual expression the actor made at the time. The resulting imagery captures both the "big" motions of the face and the delicate details of skin texture and movement.
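The final combination step, coarse marker-driven motion plus fine scan detail, can be sketched as a per-vertex sum. This is an illustrative simplification, not the paper's actual reconstruction algorithm: the function and data names are hypothetical, and the real method performs non-rigid registration of whole scans rather than adding fixed offsets.

```python
# Illustrative sketch: combine the coarse per-frame vertex positions tracked
# by the marker system with wrinkle-scale detail displacements taken from a
# registered high-resolution scan. All names are hypothetical.

def reconstruct_frame(coarse_positions, detail_offsets):
    """Add per-vertex detail displacements (from the registered static scan)
    to the coarse vertex positions driven by the marker motion capture."""
    return [
        (cx + dx, cy + dy, cz + dz)
        for (cx, cy, cz), (dx, dy, dz) in zip(coarse_positions, detail_offsets)
    ]

coarse = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]        # marker-driven mesh vertices
detail = [(0.0, 0.01, 0.0), (0.0, -0.02, 0.005)]   # fine scan-derived offsets
frame = reconstruct_frame(coarse, detail)
print(frame)
```

Keeping the detail layer separate from the coarse motion is also what keeps the data requirements small: the expensive high-resolution detail is captured once per registered scan, while only the low-dimensional marker motion varies frame to frame.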
Language learning games on Windows Phone 7 and Kinect
"Language learning games on Windows Phone 7 and Kinect" is a language learning program that focuses on how to promote a pleasant "entertaining" experience on various Microsoft platforms:
- SpatialEase: a language-learning game for Xbox 360 Kinect that ties language to thought and action. Learners must quickly understand commands in the second language, such as "Move your left hand to the right", and move their bodies accordingly.
- Tip Tap Tones: a Windows Phone game for learning Chinese pronunciation. This efficient phone game retrains the ears and brain to perceive tonal Chinese syllables quickly and accurately.
- Polyword Flashcards: web flashcards with comprehensive skill coverage. Building on our adaptive learning algorithm, which has already been transferred into the Bing Dictionary, we created an HTML5 platform for deeply personalized learning that integrates language learning, gaming, and exploration.
For project details, see http://research.microsoft.com/en-us/projects/languagelearninggames/
For more highlights of Microsoft TechFest 2012, see http://research.microsoft.com/en-us/um/redmond/events/techfest2012/default.aspx
Related reading:
Rick Rashid's Opening Speech at Microsoft TechFest 2012
Natural Human-Computer Interaction and Big Data: The Vision of Future Computing at Microsoft TechFest 2012
Exploring and Creating the Future: Warmly Celebrating the 20th Anniversary of Microsoft Research
Microsoft Research: Twenty Years of Turning Dreams into Reality
___________________________________________________________________________________