Reprinted from: http://tech.qq.com/a/20080517/000075_1.htm
Chen Xilin: Thank you, host. Thank you. Good morning. First of all, I am very grateful to the conference for this opportunity. The opportunity is not for me but for people with disabilities. We have been working on sign language for almost 10 years, and I have been involved with this area for 16 or 17 years. I am very happy to see that, over these 10-plus years, society's care for people with disabilities has kept improving. Great progress has been made, and the living environment of people with disabilities has improved considerably. So I am very grateful to the conference for providing such an opportunity.
The title of my talk is "Sign Language Interaction: Communicating Words and Deeds". We usually communicate through spoken language. For deaf people, however, ancient China had a saying, "of ten deaf people, nine are mute": for most deaf people, even if they are not physically mute, their speech has not developed effectively. We hope this work can help people with disabilities.
Why are we talking about information accessibility? Through the early stages of social development, from the Copper Age up to the present information age, human society did not pay much attention to information. Why not? First, there were few sources of information; second, information did not yet play such an important role in our lives. Now look back over the last 10 years. If you still remember, in 1999 one of the things our news media reported was a 72-hour online survival experiment. People were put in a room with a computer and a network connection to see how they would live. At that time, some of them got through it. Now we depend on the Internet. If I do not receive email for a day, I get really worried. Even on vacation I have to find somewhere to check my email. We have developed a dependence on the Internet and email. Early on this was called Stanford syndrome, and people were already talking about it in the 1980s. Most of us have this problem to some degree when we go online. From another angle, it reflects how important information is to us. Now we can get what we need through the Internet without leaving home. Hotels, air tickets, banking, even arrangements a month ahead can all be handled from the office. For people with disabilities, however, this poses great challenges.
Disabled
According to statistics, less than 30% of all people with disabilities are physically disabled, while about 40.55% have impairments that affect how they perceive or communicate information. In today's information society, information is so important, yet it is very hard for these people to get at. We know that clothing, food, shelter, and travel are indispensable to everyone. Take "travel": going out to get something done is not as fast as doing it from home. I do not know how many people still go to the bookstore, or to the library to do research. Countless students have asked me, how did you do research back then without the Internet? I said we went to the library. Now, because of the development of information technology, the role of the library has changed to a large extent. So we have to face the challenges that people with disabilities face. For physical disabilities, for example, access can be achieved through speech, and some research institutions in China have done good work on speech. For visual impairments, screen software and speech synthesis are available. But for people with hearing and speech disabilities, sign language is of great help.
As instructor Yang asked, why should sign language be part of information accessibility standards? What many people do not understand is this: we already have text, so why use sign language? When you watch TV or an English-language movie, I believe most of you read the subtitles. But because of the special structure of the human eye, the visual cells outside the point of focus can only sense rough movement, so while you stare at the subtitles you miss many details. Experiments with deaf signers have found that the speed of sign language is basically the same as that of speech, while reading text is much slower. This is an important reason for using sign language.
I once said that the number of people with disabilities in our country is not small, yet we rarely see them on the street. If you go abroad, especially to developed countries, you see many people with disabilities. That does not mean those countries have more of them. The real reason is that our accessibility facilities are very poor. Take medical care: people who have trouble communicating in words are reluctant to see a doctor, and it would be much better if they could hold a simple conversation with the doctor. Mobile phones are now everywhere, so it is very important to combine speech recognition with sign language so that deaf people can "hear" what a distant party is saying on their mobile phones.
Let me introduce the state of sign language recognition. Although work on it has been going on internationally for more than 10 years, it is still very challenging and can only be used in limited settings, so it needs continued effort from all of us. We hope to bring our recognition technology to market in three to five years.
This is the first data glove patent, from 1983. This is a gesture input device from New South Wales. In fact, dialect variation in sign language is considerable, although standard sign language has also made great progress. In addition, a data glove system from George Washington University breaks signs down into hand shape, orientation, and so on, and can handle more than 100 words. Other foreign institutions are working on this as well. There are also institutions in China, such as researchers in automation. We have been working on sign language for more than 10 years, and the sign language recognition system we have built so far has the largest vocabulary in the world. There are other institutions too, including Tsinghua.
We know that gloves have a serious problem: they are very expensive. So people have always hoped to use vision, as humans do, to see the gestures directly, and there is work in this direction too. This includes mounting a camera near the eyes, but that has limitations for reading sign language, because we normally watch a signer from the opposite side, and the facing view is the best view.
In addition, other researchers have done a lot of work.
In terms of applications, Hitachi of Japan combined recognition and synthesis of Japanese Sign Language in 1996 to build an automatic ticketing system, so applications already exist. In addition, IBM built the SiSi system last year. That system does not currently support Chinese Sign Language. We distributed the funds to more than 1300 deaf schools. The 2004 edition of the Chinese Sign Language dictionary is accompanied by a CD, which is also recommended by our project team.
Chinese Sign Language has its own particularities. Our current sign language basically consists of gesture words, plus 30 fingerspelled letters and tones. This slide shows the overall picture for sign language recognition and synthesis. At one end is the deaf person, and at the other a hearing person. We hope that speech can be understood by deaf people through synthesis and, in the other direction, that what deaf people express in sign language can be converted into speech and text and understood by hearing people. A sign in Chinese Sign Language is expressed through a combination of hand shape, movement, facial expression, and orientation. Facial expressions can increase understanding by 20-30%. We have built two implementations over the years. The first uses data gloves, and the second is vision-based; I will show an example later. Because the position of the hands relative to the body is very important, we also put a position tracker on the hands. The recognition process includes feature extraction and pre-matching. Why do we emphasize speed? Because there are more than 5000 words and the text has to be produced immediately. The amount of data is very large, and the number of possible combinations is on the order of the 10th power. Recognizing quickly under these conditions is a great challenge. In addition, different signs have different lengths. For example, "house" is a gesture like this, while this gesture is "sitting in the sky". Their lengths are very different, and that is another technical problem to be solved.
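The problem of signs having different lengths can be illustrated with dynamic time warping (DTW), a common way to compare sequences of unequal length. The talk does not say which matching method the team used, so the following is only a minimal sketch under that assumption. The function names, the 2-D point features, and the toy templates are all hypothetical.

```python
from math import hypot, inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of 2-D points."""
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # A frame may be matched, or either sequence may be stretched.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(sample, templates):
    """Return the label of the template closest to the sample under DTW."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Hypothetical toy templates: one short sign and one long sign.
TEMPLATES = {
    "short_sign": [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)],
    "long_sign": [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0),
                  (1.0, 2.0), (2.0, 2.0), (2.0, 1.0)],
}

# The same short sign performed more slowly (some frames repeated).
slow_sample = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 0.0)]
print(recognize(slow_sample, TEMPLATES))
```

Because the alignment may stretch either sequence, a slower performance of a sign still matches its template. A vocabulary of more than 5000 words would need pruning before a comparison like this, which is presumably where a pre-matching step comes in.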
I will show you a video. This is a sign language recognition process, and this is the result being fed into speech synthesis software after recognition. Because data gloves are expensive and not easy to use, we can instead take a standard sign as a template and observe someone signing through a camera. A correspondence is formed between the two. If that correspondence satisfies the matching conditions, we take it that the sign has been performed, which completes one recognition. It is easy to say, but feature matching, including the final recognition, is a very complicated process. We have been working on this for almost three years.
In essence, the process matches a three-dimensional action against a two-dimensional trajectory. It is a process of recognition and tracking. Why do we keep tracking the face during recognition? Because signs are made relative to the body. For example, for a word like "I", the hand gesture is exactly the same as in other signs, so relative position becomes the key factor.
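One simple way to picture why the face is tracked: express the hand position relative to the face and in units of face size, so that the feature does not change when the signer stands nearer to or farther from the camera. This is a hedged sketch only, not the team's method. The function name and the `(x, y, width, height)` box format are assumptions.

```python
def relative_hand_position(hand_xy, face_box):
    """Express a hand position in face-centred, face-sized units.

    face_box is (x, y, width, height), with (x, y) the top-left corner
    in image coordinates. This box format is an assumption of the sketch.
    """
    fx, fy, fw, fh = face_box
    cx, cy = fx + fw / 2.0, fy + fh / 2.0
    return ((hand_xy[0] - cx) / fw, (hand_xy[1] - cy) / fh)


# The same pose seen at two distances from the camera (hypothetical pixels).
near = relative_hand_position((240, 260), (200, 100, 80, 80))
far = relative_hand_position((120, 130), (100, 50, 40, 40))
print(near, far)
```

Two signs with the same hand shape then differ only in this relative feature, for example a hand at chest level versus one beside the head.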
In addition, on the synthesis side, we need to convert a large amount of information into synthesized sign language so that deaf people can understand the world. There are two approaches: one is video-based, and the other is animation-based synthesis. We chose animation. The advantage is that characters and actions can be separated. You can also flexibly modify the sign movements and add your own expressions and ideas. Let me show you the software. Here I want to say, "Thank you for your interest in the cause of people with disabilities". I do not know whether anyone here can understand this passage. The signs are all in the standard sign language dictionary. We know that Mandarin has spread widely over the years, and the reason is TV. In the same way, software that uses standard sign language will play a major role in popularizing it. Let's look at the same action again. For example, I can change the character, because many children like animated characters. The speed can also be adjusted, a little faster or slower, to suit the learner.
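The separation of characters and actions can be sketched as a data structure in which a sign's motion is stored once, independently of who performs it, and can be retimed for the learner. This is an illustrative assumption about how such a system might be organised, not a description of the team's software. All class, function, and joint names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Keyframe:
    time: float          # seconds from the start of the sign
    joint_angles: dict   # joint name -> angle in degrees


@dataclass
class Character:
    name: str            # which animated figure performs the sign


def retime(motion, speed):
    """Return a copy of the motion played `speed` times as fast."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return [Keyframe(k.time / speed, dict(k.joint_angles)) for k in motion]


def render_plan(character, motion):
    """Pair a character with a motion; the motion data itself is unchanged."""
    return [(character.name, k.time, k.joint_angles) for k in motion]


# A hypothetical two-keyframe sign.
thank_you = [
    Keyframe(0.0, {"right_elbow": 90.0, "right_wrist": 0.0}),
    Keyframe(1.0, {"right_elbow": 45.0, "right_wrist": 20.0}),
]

slow = retime(thank_you, 0.5)                         # half speed for learners
plan_a = render_plan(Character("teacher"), thank_you)
plan_b = render_plan(Character("cartoon"), slow)      # same sign, other figure
```

Because the motion is plain data, swapping the character or changing the speed does not require recording the sign again, which is the advantage over a video-based approach.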
To sum up, a lot of work has been done on sign language and gesture at home and abroad. From the perspective of sign language recognition, it is technically feasible to a certain extent so far. The main remaining problems are signer-independent, large-vocabulary, continuous and natural recognition. On the synthesis side, there are already some applications in TV broadcasting, teaching, and websites, and they still need to be promoted. Adding synthesized facial expressions would make the signing easier to understand. Although we have been doing this for more than 10 years, the road ahead is still long. Thank you very much for your attention.
Although this work was done by our team, the guidance of our leaders played a very important role in it. Some of our researchers are still working in the team, and some have moved to other organizations, so our sign language work has spread. During data collection, the teachers and students of the deaf schools gave us a great deal of help. Thank you all for being here.
--------------
A: Who is Mr. Chen Xilin?
B: A computer vision expert. It reminds me of Alexander Bell, the inventor of the telephone, who was once a teacher at a school for the deaf. It is said that he wanted to invent a machine that would let people see sound with their eyes.
A: See sound with the eyes?
B: That is what a sign language recognition process is.
A: We already have text. Why do we need sign language?
B: The speed of sign language is basically the same as that of speech, but reading is much slower. That is an important reason for using sign language.
A: So ......
B: He also wrote an insightful article titled "from service industry to industry-Integration of academics and industries".