Google engineers develop a machine learning algorithm that captions images using techniques similar to language translation

Source: Internet
Author: User

Automatically translating one language into another has long been a hard problem. In recent years, however, Google has transformed the traditional translation process by developing machine translation algorithms, and through Google Translate it has fundamentally changed cross-cultural communication.

Now Google is using the same machine learning technology to turn images into text. The result is an automatically generated caption that accurately describes the content of a picture. The technology will be useful for Internet search engines, automated image publishing, web browsing for the visually impaired, and other areas.

Traditional machine translation is an incremental process: it starts by translating individual words and then improves accuracy by reordering words and phrases. In recent years, however, Google has taken a completely different approach, using its own large-scale search database to translate text.

The essence of Google's approach is to count how often words appear next to or near other words, and to define their relationships in a vector space. In this way, each word is represented by a vector in that space, and each sentence by a combination of vectors. Google then makes an important assumption: the relationships between specific words are the same in every language. For example, the vector equation "king - man + woman = queen" should hold in all languages.
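The "king - man + woman = queen" relationship can be illustrated with a minimal sketch. The three-dimensional vectors below are hand-made, hypothetical values chosen for illustration; real word vectors are learned from large text corpora and have far more dimensions. The function name `analogy` is also an assumption, not part of any Google API.

```python
import math

# Hand-made toy vectors (hypothetical values, not learned from data).
# The axes loosely stand for: royalty, maleness, femaleness.
EMBEDDINGS = {
    "king":  (0.9, 0.9, 0.1),
    "queen": (0.9, 0.1, 0.9),
    "man":   (0.1, 0.9, 0.1),
    "woman": (0.1, 0.1, 0.9),
    "child": (0.1, 0.5, 0.5),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, embeddings):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    target = tuple(
        x - y + z
        for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])
    )
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))


print(analogy("king", "man", "woman", EMBEDDINGS))  # prints: queen
```

The query words themselves are excluded from the candidates, a common convention in analogy lookups, because the result vector often stays closest to one of its own inputs.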

This turns language translation into a problem of vector space mathematics. Google Translate works this way: it first converts a sentence into a vector, then uses that vector to generate a sentence with the same meaning in another language.
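As a rough sketch of translating through a shared vector space, the toy example below maps each English word to the Spanish word whose vector is nearest. It assumes both vocabularies already live in the same space; the vectors, the `EN` and `ES` tables, and the function names are hypothetical. This is a word-by-word simplification for illustration, not how Google Translate is actually implemented.

```python
import math

# Toy vectors in a shared space (hypothetical values, for illustration only).
EN = {
    "cat":   (0.9, 0.1, 0.0),
    "dog":   (0.1, 0.9, 0.0),
    "house": (0.0, 0.1, 0.9),
}
ES = {
    "gato":  (0.8, 0.2, 0.1),
    "perro": (0.2, 0.8, 0.1),
    "casa":  (0.1, 0.0, 0.9),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def translate_word(word, source, target):
    """Encode the word as its source vector, then decode to the nearest target word."""
    vec = source[word]
    return max(target, key=lambda w: cosine(vec, target[w]))


def translate(words, source=EN, target=ES):
    """Translate a list of words one at a time via nearest-neighbor lookup."""
    return [translate_word(w, source, target) for w in words]


print(translate(["dog", "house"]))  # prints: ['perro', 'casa']
```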

Now Oriol Vinyals and his collaborators at Google are using a similar approach to convert images into text. Their technique trains a neural network on a data set of 100,000 images and their captions, so that it learns to classify the content of images.

But in addition to generating a set of words that describe a picture, their algorithm also generates a vector that represents the relationships between those words. This vector can then be fed into Google's existing translation algorithm to produce a caption in English or any other language. In effect, Google's machine learning approach has learned to translate images into words.

To evaluate this approach, they recruited human evaluators from Amazon's Mechanical Turk (a crowdsourcing marketplace) to rate the captions generated automatically by this method, along with captions produced by other automated methods and captions written by humans.

The results show that the new system, which Google calls Neural Image Caption (NIC), performs very well. On a well-known image data set called PASCAL, NIC significantly outperformed other automated captioning methods. According to Vinyals, NIC achieves a BLEU score of 59, compared with 25 for the best existing automated approach and 69 for human-written captions.
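For readers unfamiliar with BLEU, the sketch below shows the general idea: it measures how many word n-grams in a generated caption also appear in a reference caption, with a penalty for captions that are too short. This is a simplified, single-reference, sentence-level version scaled to 0-100. It is not the exact evaluation setup used for the scores above, and the function names and the `max_n=2` default are assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(candidate, reference, max_n=2):
    """Simplified BLEU (0-100) of a candidate caption against one reference."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        total = sum(cand_counts.values())
        # Clip each n-gram count by how often it occurs in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # no overlap at this n-gram order
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: punish candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100 * brevity * math.exp(sum(log_precisions) / max_n)


reference = "a dog runs on the beach"
print(bleu("a dog runs on the beach", reference))  # identical caption: 100.0
print(bleu("a dog on the beach", reference))       # partial overlap: lower score
```

A score near 100 means the generated caption almost matches the reference word for word, which is why even human-written captions score well below 100 against one another.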

This is a good result, and the method produces better results as the training data set grows. "We see very clearly from our experiments that NIC's captioning ability improves as the data set grows," the Google team said.

[Figure: a group of example image captioning results, grouped by score]

Clearly, this is another task at which machines will surpass humans in the near future. The original Google paper is titled "Show and Tell: A Neural Image Caption Generator".

Paper Link: arxiv.org/abs/1411.4555

Editor's note: The recently upgraded version of Google Translate has added a similar feature called "Word Lens". The following is excerpted from Lei Feng Net (leiphone.com).

Original link http://www.leiphone.com/news/201501/4d8lzMhsZBfqy1NG.html

The iOS version of Google Translate has been updated. The new version adds the "Word Lens" feature, which translates text captured by the camera in real time and displays the result in the camera view. It works even without a network connection. Unfortunately, translation is currently limited to English, French, Russian, German, Italian, Portuguese, and Spanish, although more languages will be supported in the future.

In addition, the new version adds a real-time conversation mode, which automatically detects both languages and translates in real time while the two parties speak at a natural pace.

This article was reposted from: Http://www.tuicool.com/articles/FbyUn2B
