A deep interpretation of Google SyntaxNet: a new TensorFlow natural language processing model


Http://www.leiphone.com/news/201605/WOgdrkYSUwuwqQjD.html

This summer, Leiphone.com (WeChat public account: Leiphone) will hold an unprecedented "Global Artificial Intelligence and Robotics Innovation Conference" (GAIR) in Shenzhen. At the conference, Leiphone.com will release the "Artificial Intelligence & Robotics TOP25 Innovation Enterprise List." We are currently visiting companies in the artificial intelligence and robotics fields to screen candidates for the final list. If your company would like to be considered, please contact: 2020@leiphone.com.

Image source: spaCy

Editor's note: spaCy is a free, open-source natural language processing library, and Matthew Honnibal is the founder and CTO of the company behind it. He studied linguistics as an undergraduate and never imagined he would one day become a programmer. Honnibal received a PhD in computer science from the University of Sydney and worked there as a researcher before leaving academia in 2014 to write spaCy. In this article, Honnibal gives a deep interpretation of Google's new natural language processing model.

Last week, Google open-sourced SyntaxNet, its TensorFlow-based natural language parsing library. Over the past two years, Google researchers have used it to publish a series of neural network parsing models. I have been following this work since it was first described, and have of course long hoped the software would be open-sourced. This article, however, tries to discuss the background of the release: what new material it actually contains, and what its open-sourcing means.
SyntaxNet provides a very important kind of model for natural language processing libraries such as spaCy. If you step back and "zoom out" from natural language processing, you will see that this technology expands what computers can usefully do. Even now, you cannot write software that controls a car, answers email in your tone of voice, analyzes customer feedback, or monitors global news to guard against major business risks. It is true that natural language processing will not drive a driverless car, but consider this: language is the most distinctively human ability, one that every person inevitably masters, and natural language processing technology is advancing so well that we can hardly predict its potential. Google Search is itself a natural language processing application, so the technology is in fact already changing the world. In my opinion, though, natural language processing still has enormous room to grow.

In the larger value chain, SyntaxNet is a fairly low-level technology: it is like an improved drill bit. The drill bit does not give you oil, the oil does not by itself give you energy or plastics, and energy and plastics do not by themselves become products. But if the whole value chain is bottlenecked on extraction efficiency, then substantially improving drill-bit technology matters a great deal, even though it is a low-level technology.

In my opinion, syntactic parsing is a bottleneck technology in natural language processing; if it is optimized and improved over the next four or five years, it will have a huge impact on the field. Now, you might say I only think this because parsing is the problem I work on as the technology moves from academic research to commercial application. All I can say is that this gets the causation backwards: it is because I understand the importance of the problem that I work on it, not the other way around.

Well, I know: even if a technology sits at a bottleneck, that alone does not establish how important any particular piece of work on it is.

How big a step forward is SyntaxNet?

If you have already used the neural network model in Stanford CoreNLP, then you are using an algorithm that is identical in design, though not in detail. The same is true of the spaCy parsing model. Conceptually, SyntaxNet's contribution may seem modest; after all, it is mainly about experimentation, tuning, and refinement. However, if Google did not do this work, it is likely that nobody else would have. SyntaxNet opened a window onto the neural network model; people saw a beautiful view full of ideas, and researchers have been busy exploring it. Of course, there is also a bias in the field: work like SyntaxNet makes researchers look (and feel) smarter. We could have ended up with a very accurate parsing model built on the wrong design assumptions (and the assumptions behind a system's design matter a great deal), which would have slowed the subsequent development of neural network models. The first SyntaxNet papers appeared about six months after the CoreNLP model, using larger networks, better activation functions, and different optimization methods, and SyntaxNet also applied a more principled beam search in place of the earlier approach. Parallel work using LSTM models, rather than the feedforward networks described in the SyntaxNet papers, achieves roughly the same accuracy.

What does SyntaxNet do?

The SyntaxNet parser describes the grammatical structure of a sentence, helping other applications understand it. Natural language introduces many unexpected ambiguities, which people immediately filter out using their knowledge of the world. Here is a favorite example:

They ate the pizza with anchovies.

Image source: spaCy

The correct parse attaches "with" to "pizza": the pizza has anchovies on it.

Image source: spaCy

The incorrect parse attaches "with" to "ate", as though they ate the pizza in the company of anchovies.


Image source: spaCy
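To see roughly how such a parse looks in code, here is a minimal spaCy sketch. It assumes a current spaCy installation with the small English model downloaded; the model name en_core_web_sm is an assumption about your setup, not something from the article:

```python
# A minimal sketch: print each token's dependency label and head word with
# spaCy, to see whether "with" attaches to "pizza" or to "ate".
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("They ate the pizza with anchovies.")

for token in doc:
    print(f"{token.text:10} {token.dep_:10} -> {token.head.text}")
```

Whether the model attaches "with" to "pizza" or to "ate" depends on its trained weights, and that attachment decision is exactly the kind of case parsers are scored on.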

If you want a more visual feel for this technique, have a look at our displaCy demo, or at a concise, rule-based example of how the parse tree is computed. "Word-to-word" relationships can also be used to recognize simple grammatical semantics, and they extend easily to word-vector techniques such as word2vec, which maps words to vectors. Processing text then reduces to vector operations: similarity in the vector space stands in for similarity in meaning. For example, last year we parsed every comment posted on Reddit; running word2vec over the phrases, entities, and words produced by the parser, rather than over naively space-separated tokens, yields a very good concept map.
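As a toy illustration of the "similarity in vector space stands in for similarity in meaning" idea, here is a small sketch. The three-dimensional vectors are invented for the example; real word2vec vectors have hundreds of dimensions and are learned from large corpora:

```python
from math import sqrt

# Made-up toy vectors; real embeddings come from training word2vec on text.
vectors = {
    "pizza":     [0.9, 0.1, 0.0],
    "anchovies": [0.8, 0.2, 0.1],
    "market":    [0.1, 0.9, 0.3],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return dot / norm

print(cosine(vectors["pizza"], vectors["anchovies"]))  # ~0.98: similar "meanings"
print(cosine(vectors["pizza"], vectors["market"]))     # ~0.21: dissimilar
```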

SyntaxNet is a library for training and running syntactic dependency parsing models. The bundled model strikes a good balance between parsing speed and accuracy. Perhaps to look more fashionable, Google gave the model a cool name: Parsey McParseface. I hope they keep up this naming style; I think it helps make the timeline of model development clearer, and natural language processing could use that clarity.

How much progress does SyntaxNet bring?

Despite the "most accurate semantic analysis" tag in the world today, Parsey Mcparseface is only a little ahead of recent semantic analysis studies, and today's semantic analysis model uses more complex neural network architectures, but has more restrictive tuning of parameters. As a result, many of the same technologies will no longer be confined to academic circles. On the other hand, if you are concerned about whether the model can actually do something, the reality may disappoint you, and the technology is not really going to work. Since last year syntaxnet paper published, I have been on the intermittent study of neural network model Spacy, but the effect is not very good, we want to make spacy easy to install, we want it on a single CPU fast operation, we also want it to maintain multithreading, But all these requirements are now difficult to achieve.

On parsing benchmarks, Parsey McParseface exceeds 94% accuracy at around 600 words per second. By comparison, spaCy reaches 92.4% accuracy at around 15,000 words per second. That accuracy may not sound very high, but it is actually pretty good for applications.

The most important consideration for any predictive system is the improvement over a baseline prediction, not the absolute figure. A model that predicts tomorrow's weather will be the same as today's may often be right, but it adds no value. For dependency parsing, about 80% of dependencies are simple and unambiguous, which means that a system that only gets those dependencies right is injecting very little extra information beyond what you could get by looking at each word on its own, rather than at the relationships between words.
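To make that concrete with the numbers above (treating the rough 80% figure as a baseline accuracy, which is a simplification):

```python
# Error reduction relative to a "predict only the easy dependencies" baseline.
# 0.80 is the rough baseline quoted above; 0.94 and 0.924 are the accuracies
# reported for Parsey McParseface and spaCy earlier in the article.
baseline = 0.80
for name, accuracy in [("Parsey McParseface", 0.94), ("spaCy", 0.924)]:
    error_reduction = (accuracy - baseline) / (1.0 - baseline)
    print(f"{name}: resolves {error_reduction:.0%} of the baseline's errors")
```

Measured this way, the gap between 92.4% and 94% is larger than it looks, because it is the genuinely hard decisions that carry the weight.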

In short, I think Parsey McParseface is a very good milestone in the current wave of AI. What matters is how quickly it was reached, and what that says about how far natural language processing technology can go. There are many ideas that cannot be realized today, but at some moment, all of a sudden, they will become feasible.

What's next?

What excites me most is that, with the design behind Parsey McParseface, natural language processing technology has a very clear direction, and you might say, "Well, if it keeps working, that will be great." In 2004, Joakim Nivre, one of the leading figures in dependency parsing, showed that this kind of incremental, transition-based parser can read a sentence word by word, resolving ambiguity as it goes, and that the framework applies to any state representation, any set of actions, and any probability model. For example, if you attach the parser to a speech recognition system, you can let it guide the recognizer's guesses about what the speaker is saying, based on the syntactic context. If you are using a knowledge base, you can extend the state representation to include your target semantics and learn them together with the syntax.
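To make the transition-based idea concrete, here is a minimal sketch of the kind of parser Nivre describes: a state (stack, buffer, arcs), a small set of actions, and a pluggable scoring function. The toy score below is a hypothetical stand-in for a learned probability model such as a neural network, so the tree it builds is not linguistically meaningful; it only shows the shape of the algorithm:

```python
# Greedy transition-based dependency parsing, arc-standard style.
SHIFT, LEFT_ARC, RIGHT_ARC = "shift", "left_arc", "right_arc"

def legal_actions(stack, buffer):
    actions = []
    if buffer:
        actions.append(SHIFT)          # move the next word onto the stack
    if len(stack) >= 2:
        actions.append(LEFT_ARC)       # stack[-2] becomes a child of stack[-1]
        actions.append(RIGHT_ARC)      # stack[-1] becomes a child of stack[-2]
    return actions

def toy_score(action, stack, buffer):
    # Stand-in for a learned model: prefer building arcs over shifting.
    return 1.0 if action in (LEFT_ARC, RIGHT_ARC) else 0.5

def parse(words, score=toy_score):
    stack, buffer, arcs = [], list(range(len(words))), []
    while buffer or len(stack) > 1:
        action = max(legal_actions(stack, buffer),
                     key=lambda a: score(a, stack, buffer))
        if action == SHIFT:
            stack.append(buffer.pop(0))
        elif action == LEFT_ARC:
            child = stack.pop(-2)
            arcs.append((stack[-1], child))   # (head index, child index)
        else:
            child = stack.pop()
            arcs.append((stack[-1], child))
    return arcs

print(parse(["They", "ate", "the", "pizza", "with", "anchovies"]))
```

Because the state and the action set are explicit, you can swap in any scoring model and extend the state with whatever extra information your task needs, which is exactly the flexibility described above.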

Joint models and semi-supervised learning have always been held up as ideals in research on natural language understanding. Nobody has ever doubted their merits, but without a concrete method they remain clichés. Obviously, understanding a sentence requires getting the individual words right, yet solving everything jointly raises many problems for which satisfactory solutions are hard to find. In addition, a natural language understanding system should be able to take advantage of the vast amounts of unannotated text that already exist, and that requires a different kind of model. In my opinion, transition-based neural network models offer an answer to both problems: you can learn any structure you can express as a state and a set of actions, and the more text the model reads, the more it learns, without adding any new parameters to the network.

Obviously, we want to build a bridge between Parsey McParseface and spaCy, so that you can use the more accurate model through spaCy's existing API. However, for any individual use case there are always variables that determine whether the technology really works. In particular, every application sees a different type of text, and accuracy improves substantially when the model is adapted to the domain. For carefully edited text such as financial reports, you want the parsing model to treat a term like "market capitalisation" as an important clue for understanding the document; for Twitter posts, reading "market capitalisation" that way usually makes no sense.

Our goal is to provide a series of pre-trained models to address this problem, adapting the parsing model to different languages and genres. We also have some very exciting ideas for making it as easy as possible for every user to train their own custom models. In natural language processing, the algorithms always race ahead while the data lags behind. We hope to fix that.
