Why should we develop a new Chinese input method

Source: Internet
Author: User
Keywords Microsoft Pinyin Input

Abstract: The Microsoft Engkoo Pinyin Input Method development team's own account of why we want to develop a new Chinese input method. Chinese input methods have a long history, going back well before the advent of the PC (at least since the 1940s...)

The Microsoft Engkoo Pinyin Input Method development team, in its own words: "Why we want to develop a new Chinese input method"

Input methods: history and today

Chinese input methods have a long history. Inventors were working on electronic and mechanical Chinese input long before the PC appeared (at least since the late 1940s). The arrival and spread of the PC then brought a wave of technological innovation in Chinese input. The ultimate goal of a Chinese input method is input efficiency, and competition centers on performance, accuracy, and ease of use. The term "IME" (Input Method Editor) was first used in Windows 95, which shipped with the first edition of Microsoft Pinyin (pinyin-based input is now the most mainstream form of Chinese input). In the 18 years since Windows 95, the input method world has been turbulent, in both technology and competition.

Today's input method market is very competitive. Local and international software companies are all involved, especially companies that provide online services. Their reason for taking part is simple: for most Chinese users, the input method is a "gateway". Over the past decade, with the rise of the Internet and cloud computing, cloud-based input methods have given many companies an opening for their online services business. Everyone wants a share of the battle for the gateway to China's 600 million internet users, and the business stakes are self-evident.

Issues, trends and opportunities

People may wonder: Microsoft already has an input method product (Microsoft Pinyin), so why build a new one called the "Engkoo Pinyin Input Method"? And why do so when the input method market looks as if the territory has already been divided and the outcome already settled?

The answer is actually very simple: we think the core problems of Chinese input are far from solved. With the rise of the Internet age, we have increasingly found that the way Chinese users communicate in language is quietly changing, and this has created needs that traditional Chinese input methods cannot meet. Combined with research breakthroughs in natural language processing, this leads us to believe that the time has come for the core technology of the next generation of Chinese input methods.

First of all, the Internet generation has new needs. For example, we note that Chinese users use English more frequently every year, and that mixed Chinese and English usage is growing rapidly. Today, about 325 million Chinese are learning English. By 2025, the number of English-speaking Chinese is expected to exceed the total number of native English speakers in the rest of the world.

However, although so many people are learning and using English, we find that very few Chinese input methods offer effective and friendly assistance for English input. We think that, for Chinese users, having the input method assist with English input is the best solution: users are already very familiar with the input method, and through it they can draw on a great deal of relevant technology behind the scenes.

On the other hand, the language of the English-speaking world is itself changing rapidly: language-tracking systems estimate that a new English word is created every 98 minutes. Most of these words have no commonly accepted Chinese translation at all. In some specialized fields this is particularly apparent, for example in software technology books. Pick one up and you will see a pile of English terms. English is already a part of everyday Chinese and, in many fields, an essential one. That being the case, shouldn't our Chinese input methods keep up with the trend of Chinese-English blending and provide a better, smoother, fresher, and more accurate mixed-input experience?

There is another very important reason that prompted us to build the Engkoo Pinyin Input Method. When we communicate online today, the content goes well beyond simple text and includes pictures, videos, music, maps, and other "rich media" content. Every day, millions of internet users put such content into their chats, tweets, blog posts, and even documents. It is worth noting that this content is often obtained through search.

So why, when we want to paste and send such content, do we have to leave the current input context, open a browser, enter a URL, type search keywords, and then copy and paste the search results (pictures, maps, and so on) back? This detour completely breaks the flow of input and interrupts our precious attention. Why can't we complete the whole process directly in the input method?

This fluent experience, with no need to leave the current context, is the soul of efficient input. Mainstream input methods already have so-called "cloud candidates", which means that every input box is effectively a search box. So why not extend the searched content from plain text to other forms of rich media? Imagine: once this becomes a reality, all the searchable content on the Internet will be at your fingertips.

Beyond the factors above, we also believe that the core technology of input methods is itself about to enter a new era: more accurate, with more relevant data, and faster. For our part, we are interested in two core technology areas: driving the input method's core engine with new algorithms, and improving the freshness and quality of data with new web mining techniques. With the help of Microsoft's top natural language processing research, we believe our input method has a unique competitive advantage.

Input method and innovation

So why does Microsoft want to build a new input method? First of all, we are interested in any technical challenge, and creating an advanced input method from scratch and releasing it within a year is exactly that kind of challenge. In addition, curiosity is our source of motivation. We are curious whether we can solve the problems we observe in today's input methods, and we are willing to try to solve them with innovative technology, excellent engineering, and new ideas. The input method we envision touches many areas of computer science research: natural language processing, web search and data mining, human-computer interaction, speech processing, machine learning, cloud computing, images and media, and so on.

When we look at the history and current state of input methods, we notice that innovation in this field inevitably faces the "Innovator's Dilemma" (proposed by Clayton M. Christensen of Harvard Business School). The reason is that a successful input method has very complex technology behind it, and the more innovative the software, the more likely it is to succeed. However, the "Innovator's Dilemma" points out that, over time, successful, mature, and complex products acquire a huge user base, and years of version iterations inevitably accumulate technical complexity at every level, so that any new innovation built on top carries great risk and cost. What actually happens is therefore usually what is called "sustaining innovation". We believe that most input method innovations in today's market belong to this category.

To escape the "Innovator's Dilemma", one has to travel light, start from scratch, and focus on the product rather than the market. When this succeeds, the result is what is known as "disruptive innovation". That is the approach of the Engkoo Pinyin Input Method project: grounded in research, free of product-cycle constraints, and focused on innovative solutions and new user pain points.

Results

So far, the results of the Engkoo Pinyin Input Method project are very exciting. Our natural language processing researchers have remodeled Chinese input from the ground up: we treat input as a translation process from pinyin to Chinese characters, similar to translation between English and Chinese. This approach lets us apply more than 10 years of Microsoft's work on statistical machine translation to the problem of Chinese input.
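To make the "input as translation" idea concrete, here is a minimal sketch, not the Engkoo engine itself. It assumes a toy candidate table (`CANDIDATES`) and a toy bigram language model (`BIGRAM`), both hypothetical, and uses a Viterbi search to pick the most probable character sequence for a list of pinyin syllables. A real system would also weight each syllable-to-character mapping and learn all probabilities from data.

```python
import math

# Hypothetical toy data: candidate characters for each pinyin syllable.
CANDIDATES = {
    "zhong": ["中", "种", "重"],
    "wen": ["文", "问", "闻"],
    "shu": ["输", "书", "树"],
    "ru": ["入", "如", "乳"],
}

# Hypothetical bigram probabilities P(next | prev).
# Pairs that are not listed fall back to a small floor value.
BIGRAM = {
    ("<s>", "中"): 0.5,
    ("中", "文"): 0.6,
    ("文", "输"): 0.3,
    ("输", "入"): 0.8,
}
FLOOR = 1e-4


def decode(syllables):
    """Return the most probable character string for the syllables (Viterbi)."""
    # best maps the last character to (log probability, characters so far).
    best = {"<s>": (0.0, [])}
    for syllable in syllables:
        nxt = {}
        for ch in CANDIDATES[syllable]:
            # Keep only the best-scoring way of reaching this character.
            score, path = max(
                (logp + math.log(BIGRAM.get((prev, ch), FLOOR)), chars)
                for prev, (logp, chars) in best.items()
            )
            nxt[ch] = (score, path + [ch])
        best = nxt
    return "".join(max(best.values())[1])


print(decode(["zhong", "wen", "shu", "ru"]))
```

Because the search keeps one best path per final character at each step, its cost grows linearly with the number of syllables rather than exponentially with the number of character combinations.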

In addition, the Engkoo Pinyin Input Method includes Chinese-English mixed input and an English assistance mode. It has built-in machine translation, word alignment, and the "phonetic search" feature unique to Bing Dictionary (formerly the Engkoo Dictionary): typing "fiziks", for example, finds "physics", as if English had its own pinyin. These features trace back to more than 10 years of Chinese and English natural language processing research at our institute.

(Figure: Chinese-English mixed input)
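The "phonetic search" behavior described above can be illustrated with a small sketch. This is not Bing Dictionary's algorithm, which the text does not describe; it is a deliberately crude stand-in in which a hypothetical `phonetic_key` function maps similar-sounding spellings to the same key, and a hypothetical word list is indexed by that key.

```python
import re
from collections import defaultdict


def phonetic_key(word):
    """Reduce a spelling to a rough sound-based key (illustrative rules only)."""
    key = word.lower()
    key = key.replace("ph", "f")               # "ph" sounds like "f"
    key = re.sub(r"c(?=[eiy])", "s", key)      # soft "c" before e, i, y
    key = key.replace("ck", "k").replace("c", "k")
    key = key.replace("x", "ks").replace("z", "s").replace("y", "i")
    return re.sub(r"(.)\1+", r"\1", key)       # collapse doubled letters


# Hypothetical dictionary, indexed by phonetic key.
WORDS = ["physics", "phonics", "fizz", "science"]
INDEX = defaultdict(list)
for w in WORDS:
    INDEX[phonetic_key(w)].append(w)


def phonetic_search(typed):
    """Return dictionary words whose key matches the typed spelling."""
    return INDEX.get(phonetic_key(typed), [])


print(phonetic_search("fiziks"))
```

A production feature would more plausibly work on real pronunciations and rank near matches, but the key-and-index structure shows why a misspelling such as "fiziks" can still reach "physics".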

Finally, we have added innovative support for entering non-text content. We call this "rich candidates" (as opposed to "text candidates"), and it lets everyday input go beyond plain text. Our inspiration comes from search engine technology. Search engines offer "instant answers", which is implicit search, and "vertical search", which corresponds to explicit search. To give an example of each: if we type "hei" in a chat, we probably want to express a good mood, so the input method can automatically offer candidates such as pictures and emoticons for direct insertion into the conversation. In explicit search, the user manually selects which type of content to search for: Chinese-English translation, kaomoji (text emoticons), maps, and so on.
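The two search paths described above could be organized roughly as follows. This is only a sketch under stated assumptions: the `Candidate` type, the `IMPLICIT_TRIGGERS` table, and the `VERTICALS` providers are all hypothetical names, and the providers return placeholder strings where a real input method would call a search service.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    kind: str     # "text", "emoticon", "image", "map", "translation", ...
    payload: str  # what would be inserted into the conversation


# Implicit search: hypothetical table of inputs that hint at a mood.
IMPLICIT_TRIGGERS = {
    "hei": [
        Candidate("emoticon", "(^_^)"),
        Candidate("image", "<smiling-face picture>"),
    ],
}

# Explicit search: hypothetical vertical providers chosen by the user.
VERTICALS = {
    "translation": lambda q: [Candidate("translation", f"<translation of {q}>")],
    "kaomoji": lambda q: [Candidate("emoticon", f"<kaomoji for {q}>")],
    "map": lambda q: [Candidate("map", f"<map of {q}>")],
}


def rich_candidates(typed, vertical=None):
    """Return the text candidate followed by any rich candidates."""
    candidates = [Candidate("text", typed)]
    if vertical is not None:
        # Explicit: the user picked a content type to search.
        candidates += VERTICALS[vertical](typed)
    else:
        # Implicit: offer rich content only when the input triggers it.
        candidates += IMPLICIT_TRIGGERS.get(typed.lower(), [])
    return candidates


print(rich_candidates("hei"))
print(rich_candidates("beijing", vertical="map"))
```

Keeping the plain text candidate first means rich content is always an addition to, never a replacement for, ordinary input.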

Team background and software development philosophy

Our team has been a cross-department partnership from the outset: researchers and product developers work side by side. The Chinese input method product group and researchers at Microsoft came together and ultimately developed the Engkoo Pinyin Input Method. Help from the product group made our software development process quite smooth, and the result is one of the rare products to come directly out of the lab.

As the development director of this project, I know people will find it strange: why should a foreigner lead the development of a Chinese input method? The answer is that, although I am not Chinese, I have a sincere love for the Chinese language and culture. I may not be Chinese, but I have a Chinese heart. Why do I say that? I spent my childhood in Flushing, New York, a neighborhood that has attracted many Asian immigrants and has a strong Asian character. I was influenced by the Chinese language and Chinese culture from an early age.

It gives me great joy to be able to lead the development of the Engkoo Pinyin Input Method. I also believe that, as an outsider to Chinese input methods, I can bring a fresh perspective to our team. In addition, I led the Engkoo (translation and language learning) project, and since the natural language processing behind the dictionary and the input method comes from the same lineage, it was natural for me to become the development director of this project as well.

For me, the Engkoo Pinyin Input Method project required a change of mindset: we had to break out of the rut of existing input methods. We have to be different from other players in the industry, and we must be bold and unconventional in addressing the challenges we face. Another factor is that the team must be made up of top software engineers, researchers, and designers. In fact, we did assemble a group of the most brilliant people, people with the desire and ability to change the world, to create the best input technology.

Our development philosophy is simple: release often, and learn and improve from each release. Our improvements are based primarily on server-side intelligence and analysis of collected data, rather than on traditional focus groups. We call our approach "deployment-driven research", which is like an agile method applied to research.

One common problem with computer science labs is that they are not closely connected to end users. Besides delaying the technology's arrival on the market, the lack of real user feedback can make research slow or biased. Our philosophy of "deployment-driven research" is designed to address this problem. Our products reach the market quickly, and the feedback we get is a great boost to the team. It also helps determine where we invest our time and energy: one of the difficulties of basic research is choosing what to work on, and "deployment-driven research" gives us a beacon.

Looking to the future

The future begins with history. Historically, "disruptive innovation" in input methods has been built on breakthroughs in user experience and input efficiency. From the perspective of research fields such as human-computer interaction, we can see that the "natural user interface" is the theme of the future. In this sense, we can expect future input methods to give users a rich experience across a variety of input scenarios in a more spontaneous and intuitive way. With the perfect input method, a person should feel that input is seamless and effortless, without any interruption or burden on their train of thought, whatever the input scenario and whatever type of content they want to enter.

Another meeting point of industry and research is "big data" and the use of machine learning techniques to build input systems that can handle it. Ultimately, for the user, this means more input from fewer keystrokes. With the unstoppable rise of mobile devices, achieving a more efficient input experience on them is not only a user experience problem but also a core technical problem. From this perspective, we look forward to more mature research on haptics, natural user interfaces, and multimodal fusion that can make full use of large amounts of input context.

Finally, in terms of development trends, we believe that apps (extensions) will play an important role in the future of input methods. In other words, the input method will be seen as a platform rather than as a monolith of complex technologies tangled together. Building an application development platform for input methods will let countless developers combine their efforts into a powerful force that accelerates the future of input methods.
