Multilingual Semantic Models

In this post I'll discuss a model for learning word embeddings such that they end up in the same space across different languages. This means we can measure the similarity between English and German words, or even compare the meaning of two sentences in different languages. It is a summary and analysis of the paper by Karl Moritz Hermann and Phil Blunsom, titled "Multilingual Models for Compositional Distributional Semantics", published at ACL 2014.

The Task

The goal is to extend the distributional hypothesis to multilingual data and joint-space embeddings. This would give us the ability to compare words and sentences in different languages, and also to make use of labelled training data from languages other than the target language. For example, below is an illustration of English words and their Estonian translations in the same semantic space.

This actually turns out to be a very difficult task, because the distributional hypothesis stops working across different languages. While "fish" is an important feature of "cat", because they often occur together, "kass" never occurs with "fish", because they are in different languages and are therefore used in separate sets of documents.
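
To make the problem concrete, here is a minimal sketch that counts document-level co-occurrences over two tiny monolingual corpora. The sentences are invented for this illustration (they are not from the paper), and the helper names `cooccurrence_counts` and `pair_count` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy monolingual corpora, invented for illustration only.
english_docs = [
    "the cat eats fish",
    "a cat likes fish",
]
estonian_docs = [
    "kass sööb kala",       # "cat eats fish"
    "kass armastab kala",   # "cat loves fish"
]

def cooccurrence_counts(docs):
    """Count how often each unordered word pair appears in the same document."""
    counts = Counter()
    for doc in docs:
        words = sorted(set(doc.split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts

# Pool both languages, as if they were one large collection of documents.
counts = cooccurrence_counts(english_docs + estonian_docs)

def pair_count(w1, w2):
    """Look up the co-occurrence count for a pair, regardless of word order."""
    return counts[tuple(sorted((w1, w2)))]

print("cat / fish :", pair_count("cat", "fish"))
print("kass / kala:", pair_count("kass", "kala"))
print("kass / fish:", pair_count("kass", "fish"))
```

Within each language the pairs co-occur, but the cross-language pair never shares a document, so purely monolingual co-occurrence statistics give no signal linking "kass" to "fish".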

In order to learn these representations in the same space, the authors construct a neural network that learns from parallel sentences (pairs of the same sentence in different languages). The model is then evaluated on the task of topic classification, training on one language and testing on the other.
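
The evaluation setup can be sketched roughly as follows. This is only an illustration of "train on one language, test on the other": the 2-d vectors are hand-picked stand-ins for a learned joint space, the documents and topics are made up, and the nearest-centroid classifier is an assumption chosen for brevity rather than the classifier used in the paper.

```python
import numpy as np

# Hypothetical, hand-picked 2-d vectors standing in for a learned joint space.
EMBEDDINGS = {
    "en": {"goal": [1.0, 0.1], "match": [0.9, 0.2],
           "bank": [0.1, 1.0], "loan": [0.2, 0.9]},
    "de": {"tor": [0.95, 0.15], "spiel": [0.9, 0.1],
           "bank": [0.1, 0.95], "kredit": [0.15, 0.9]},
}

def embed(document, lang):
    """Represent a document as the sum of its word vectors (an assumed composition)."""
    return np.sum([EMBEDDINGS[lang][w] for w in document.split()], axis=0)

def train_centroids(labelled_docs, lang):
    """Average the document vectors of each topic."""
    by_topic = {}
    for doc, topic in labelled_docs:
        by_topic.setdefault(topic, []).append(embed(doc, lang))
    return {topic: np.mean(vecs, axis=0) for topic, vecs in by_topic.items()}

def classify(document, lang, centroids):
    """Return the topic whose centroid has the highest cosine similarity."""
    v = embed(document, lang)

    def cosine(u, w):
        return float(u @ w / (np.linalg.norm(u) * np.linalg.norm(w)))

    return max(centroids, key=lambda topic: cosine(v, centroids[topic]))

# Train on English documents only.
english_train = [("goal match", "sport"), ("bank loan", "finance")]
centroids = train_centroids(english_train, "en")

# Test on German documents, which the classifier has never seen.
print(classify("tor spiel", "de", centroids))
print(classify("bank kredit", "de", centroids))
```

This only works because the German vectors sit close to their English counterparts; with embeddings trained separately per language, the English centroids would be meaningless for German input.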

A bit of a spoiler, but here is a visualisation of some words from the final model, mapped into 2 dimensions.

The words from English, German and French are successfully mapped to clusters based on meaning. The colours indicate gender (blue = male, red = female, green = neutral).

The Multilingual Model

The main idea is as follows: we have sentence a in one language, and we have a function f(a) which maps that sentence into a vector representation (we'll come back to that function). We then have sentence b, which is the same sentence just in a different language, and a function g(b) for mapping it into a vector representation. Our goal is to have f(a) and g(b) be identical, because both of these sentences have the same meaning. So during training, we show the model a series of parallel sentences a and b, and each time we adjust the functions f and g so that they produce more similar vectors.
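
The "adjust f and g so the vectors move closer" step can be sketched as below. Several things here are assumptions made for illustration, not details taken from the text above: f and g are each taken to be a simple sum of word vectors, the quantity being reduced is the squared Euclidean distance between f(a) and g(b), the update is plain gradient descent, and the vocabulary, dimensionality and learning rate are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # arbitrary embedding size for the sketch

# Hypothetical toy vocabularies with randomly initialised word vectors.
emb_a = {w: rng.normal(size=dim) for w in ["the", "cat", "sleeps"]}
emb_b = {w: rng.normal(size=dim) for w in ["die", "katze", "schläft"]}

def compose(sentence, emb):
    """Assumed composition: a sentence vector is the sum of its word vectors."""
    return np.sum([emb[w] for w in sentence], axis=0)

def distance(a, b):
    """Squared Euclidean distance between f(a) and g(b)."""
    diff = compose(a, emb_a) - compose(b, emb_b)
    return float(diff @ diff)

def train_step(a, b, lr=0.05):
    """One gradient-descent step that pulls f(a) and g(b) towards each other."""
    diff = compose(a, emb_a) - compose(b, emb_b)
    # d/dx ||f(a) - g(b)||^2 is +2*diff for each word in a, -2*diff for each word in b.
    for w in a:
        emb_a[w] -= lr * 2 * diff
    for w in b:
        emb_b[w] += lr * 2 * diff

a = ["the", "cat", "sleeps"]
b = ["die", "katze", "schläft"]

before = distance(a, b)
for _ in range(20):
    train_step(a, b)
after = distance(a, b)

print("distance before training:", before)
print("distance after training: ", after)
```

Note that pulling parallel sentences together is only half of a workable objective: on its own, this distance could also be driven to zero trivially (for example by shrinking every vector to zero), so a practical training setup needs something extra to keep unrelated sentences apart.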
