Using .NET to Develop an MSN Chat Robot: MSN Chat Robot Development Secrets
Foreword:
I am not a developer and certainly not an expert; for me this is closer to play. When it comes to technology I do not have much of an explorer's spirit, and I prefer shortcuts. You cannot get something for nothing from this article by modifying my robot into your own, because I think my code is too ugly and I will not open-source it. However, if you know a little .NET or C#, I believe this article will point you to all the resources you need to develop your own, fully working MSN robot. To chat with my robot, add tbot01@hotmail.com. It is named "Tachikoma", after the characters in the anime Ghost in the Shell. You can also visit http://www.guanqun.com, which hosts a web chat robot similar to this MSN robot. Give it a try, and chat in Chinese as far as possible.
This is not an article for beginners. If you do not know what .NET is, or know nothing at all about databases, I suggest you read up on those first. I also hope the real experts will not mock me. After all, there is nothing wrong with an ordinary computer enthusiast, not a developer, feeling his way through and then telling you how to do something fun.
Why build an MSN chat robot
1 The reasons I can think of
The most important reason is that it is fun. What your MSN robot says will reflect your own personality (if you want it to). Of course, that is just my reason; I built this robot simply because one day I suddenly wanted to. Maybe you want your robot to do something useful for you, such as act as an expert system or a customer service agent.
2 Today's MSN chat robots
There are many MSN robots now. If you have added MSN robots, I expect the most common entries in your list are the one called "" and its many siblings (http://www.9zi.com). There are so many of them probably for load reasons, and every time you go online you are likely to be surrounded by friend requests from that family. There are also some so-called "free SMS" robots. I have worked in the SP (SMS service provider) business myself, so I will be blunt: so as not to get in the way of anyone's income, I will not comment on these robots. One robot worth mentioning is Msgerai (msgerai@hotmail.com). Its developer very much wants it to be as intelligent as a person. That may not be achieved in his lifetime, but I wish him success. After all, having a dream is good, and the robot can already do some work for him (HTTP://WWW.FUNNYOK.NET/NLP). There are other MSN robots too, such as specialized information lookup services and robots that search Google for you. There is a list at http://www.msning.com that is worth a look.
Second, why use .NET
The reason is actually simple. C# and Java are very similar, but for Java I have never found an IDE that is really easy to use and suits my habits. C# is different: VS.NET (http://msdn.microsoft.com/vstudio/) is of course the most pleasant to use, C# Builder (http://www.borland.com/csharpbuilder/) is also good, and even SharpDevelop (http://www.icsharpcode.net/OpenSource/SD/) is quite comfortable. So .NET is a fairly good choice.
Besides, developing with .NET is very convenient. With a little programming background, writing programs in .NET is not difficult. I am a user, not a developer, so I do not need to dig into deep technical or optimization topics, and I am not capable of joining Microsoft Research anyway.
I suggest using the latest version of Visual Studio .NET, which saves a lot of trouble.
In addition, there are plenty of resources for .NET development, some of which are mentioned below.
What kind of chat robot do you want?
1 Planning before development
What I am talking about here is a "chat robot" in the narrow sense: all it can do is chat with you. You need a program that "teaches" it to speak, lets it grasp the general meaning of what is said, and lets it give an answer that is not too far off.
2 What else can I get it to do?
You can also have it do many other things, such as look up an IP address, a mobile phone number, a license plate number or a flight number, or search Google directly on your behalf. None of this is much trouble if you want it.
First, let the robot speak
Whether or not your robot is clever, the most important thing is that it can reply on MSN in a presentable way. So you need an MSN account, a connection to the MSN server, a way to receive the various server messages, and a way to send messages back to the server.
Of course, you can analyze the MSN protocol (http://www.hypothetic.org/docs/msn/index.php) and write the communication layer yourself. But as I mentioned, I like shortcuts, so I looked for a ready-made interface instead and found some MSN development libraries.
Both of these were developed for .NET. I use DotMSN, which implements the MSNP8 protocol. Note that for DotMSN you should not use the version on SourceForge; use the address given above.
Next, download this example:
http://members.home.nl/b.geertsema/dotMSN/...ple/Example.zip
Open it with VS.NET, compile it, and run it.
Look it over until you understand it. After you log in, double-click a person in the list to send that person "Hello world!". You can talk to people directly, without going through the original MSN program.
This part of the code looks like this:
private void ContactJoined(Conversation sender, ContactEventArgs e)
{
    // Someone joined our conversation! Remember that this also occurs when you are
    // only talking to one person. Log this event.
    Log.Text += e.Contact.Name + " joined the conversation.\r\n";
    // Now say something. You can send messages using the Conversation object.
    sender.SendMessage("Hello world!");
}
It means that when the other person joins the conversation, you send him a "Hello world!" message. At this point, if someone on your list double-clicks your name, that person will also receive a "Hello world!".
Let the robot understand Chinese
1 Database
Because we want to build a Chinese chat robot, the size of the corpus directly determines how smart your robot is. Out of habit, I used MySQL to store both the corpus and the Chinese word list for segmentation. MySQL is also very fast. Of course, you can use Access or SQL Server just as easily. A library for calling MySQL from .NET, MySQLDriverCS, can be found here:
http://sourceforge.net/projects/mysqldrivercs/
2 Whole sentence matching
The idea of whole sentence matching is simple. When people who do not know each other start chatting, they usually open with "hello", "hi~~" and the like. These openers are simple and vary little, so it is enough for the robot to answer them directly. For example, if the other side says "Hello", the robot sees "Hello" and answers "hello". If the other side says "88" (chat slang for "bye-bye"), the robot can say "good-bye", or "88". That is whole sentence matching: the robot takes the whole sentence, looks it up in the corpus, and if the corpus has answers for that sentence, picks one and sends it back, so the other side will not think the robot is stupid.
Even if the other person says "you are stupid" and you have the robot answer "I am not stupid", the other side will feel the robot is all right: it even knows when someone is calling it stupid.
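The lookup described above can be sketched in a few lines of C#. This is only a minimal illustration, not my robot's actual code: the class name WholeSentenceMatcher, the in-memory dictionary (standing in for the database table), and the sample entries are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the whole sentence corpus table.
public static class WholeSentenceMatcher
{
    private static readonly Random Rng = new Random();

    // Each known sentence maps to several candidate replies.
    private static readonly Dictionary<string, string[]> Corpus =
        new Dictionary<string, string[]>
        {
            { "hello", new[] { "hello", "hi there" } },
            { "88", new[] { "good-bye", "88" } },
            { "you are stupid", new[] { "I am not stupid" } },
        };

    // Returns a random reply for an exact match, or null when the sentence is unknown.
    public static string Reply(string sentence)
    {
        string key = sentence.Trim().ToLowerInvariant();
        string[] replies;
        if (!Corpus.TryGetValue(key, out replies))
        {
            return null;
        }
        return replies[Rng.Next(replies.Length)];
    }
}
```

A null result is the signal to fall through to the word-level processing described in the following sections.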
3 Chinese word segmentation
A chat robot must of course understand some Chinese, and the basis of Chinese processing is word segmentation. What is word segmentation? "Word segmentation is the process of recombining a continuous sequence of characters into a sequence of words according to certain rules." I copied that definition. Please refer to this article: http://www.hylanda.com/center/knowledge.htm. Its authors appear to have a solid record in Chinese word segmentation. Among domestic segmentation systems, ICTCLAS also does well. Source code for VC is available, and you can download it and take a look.
http://www.nlp.org.cn/project/project.php?proj_id=6
Some people will say: I do not understand this, I have never studied it. In fact, neither do I. But without Chinese word segmentation, the chat robot is stuck at whole sentence matching. We can use the maximum matching method to do a simple segmentation of the sentences the robot receives. For the algorithm, see Mr. Zhan Weidong's lecture notes; it will be clear once you read them.
From the course Fundamentals of Chinese Information Processing:
Download this PPT handout: http://ccl.pku.edu.cn/doubtfire/Course/Chinese%20Information%20Processing/contents/Chapter_07_1.ppt
The segmentation algorithm does not need to be complex; the simpler the better.
In addition, the algorithm needs a Chinese word list. I provide a MySQL version that can be downloaded here, and you can import it into your MySQL. Other databases can be used as well with small changes to the SQL statements.
Chinese word list download: http://www.guanqun.com/down/wordlist.rar
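To make the maximum matching method concrete, here is a rough sketch of forward maximum matching in C#. It assumes the word list has already been loaded from the database into a set; the class name MaxMatchSegmenter and the maxWordLength parameter are made up for this example and are not taken from the handout.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the word list is assumed to be loaded into memory already.
public static class MaxMatchSegmenter
{
    // Forward maximum matching: at each position take the longest word-list
    // entry that starts there, and fall back to a single character.
    public static List<string> Segment(string text, ISet<string> words, int maxWordLength)
    {
        var result = new List<string>();
        int pos = 0;
        while (pos < text.Length)
        {
            int length = Math.Min(maxWordLength, text.Length - pos);
            // Shrink the candidate until it is a known word or one character long.
            while (length > 1 && !words.Contains(text.Substring(pos, length)))
            {
                length--;
            }
            result.Add(text.Substring(pos, length));
            pos += length;
        }
        return result;
    }
}
```

With a word list containing 你, 真 and 好玩 and a maximum word length of 3, the input 你真好玩 is split into 你 / 真 / 好玩. Characters that are not in the word list simply come out as single-character words.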
4 Word matching
Segmentation alone is not enough. If you really wanted the robot to understand what people say, you would certainly need some artificial intelligence algorithms. We are only building a robot for fun and do not need to go that deep. Even with artificial intelligence where it is today, truly clever chat robots are very rare. Let the professional researchers do the research; we are just playing. So we will use one of the easiest methods: let the robot find the keyword of the sentence and the rough part-of-speech pattern of the sentence, then look in the corpus for an answer that fits those rules.
A simple example:
Suppose the other side says:
"You're so funny."
We first use the segmentation algorithm to split this sentence into
"You / are / so / funny"
Then we find the keyword "funny". At the same time, the part-of-speech pattern of the sentence is recorded. When the keyword "funny" is found in the corpus, we check whether the corpus has a sentence similar to this one, and if so, pick an answer at random: "Haha... I love it when you say that." This gives the person chatting a better impression.
So the question is how to find the keyword. My method (rather crude, but usually effective) is to take the longest word in the sentence as the keyword. There is no particular reason, other than that it is faster. If every word in the sentence were treated as a keyword and looked up in the corpus, there would be matching problems. (Unscientific, but usually effective.)
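The longest-word heuristic is small enough to show in full. This is a sketch under my own assumptions: the name KeywordPicker is invented, and the rule that the earliest word wins a tie is a choice made for the example, since the text above does not say how ties are handled.

```csharp
using System.Collections.Generic;

// Hypothetical helper illustrating the "longest word is the keyword" rule.
public static class KeywordPicker
{
    // Returns the longest word; on a tie the earliest one wins.
    // Returns null when there are no words at all.
    public static string PickKeyword(IEnumerable<string> words)
    {
        string best = null;
        foreach (string word in words)
        {
            if (best == null || word.Length > best.Length)
            {
                best = word;
            }
        }
        return best;
    }
}
```

For the segmented sentence 你 / 真 / 好玩 this picks 好玩, the only two-character word.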
Five, make the robot a bit "smarter"
1 Designing the whole sentence matching corpus
The first step is of course to build your whole sentence matching corpus. You must write the corpus yourself; do not be lazy. Find out what people say most often, such as "hello", "thank you" and "sorry", and store several answers for each so that the reply is not the same every time. Then, when answering, first run a SQL query such as
"SELECT * FROM reply WHERE `key` = '" + sentence + "' ORDER BY RAND() LIMIT 1"
If a match is found, simply send it back as the reply. If the whole sentence cannot be matched, move on to word-level processing.
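Because the sentence comes straight from a stranger on MSN, pasting it into the SQL text unescaped is risky: a single quote in the message breaks the query. The sketch below builds the same query with very simplified escaping. The table name reply and the column name key come from the query above; the class ReplyQuery and its methods are invented for the example, and the escaping is an assumption that only covers backslashes and single quotes. If your MySQL driver supports parameterized queries, prefer those.

```csharp
// Hypothetical helper; not part of any MySQL driver.
public static class ReplyQuery
{
    // Simplified escaping for a MySQL string literal:
    // backslashes first, then single quotes.
    public static string Escape(string value)
    {
        return value.Replace("\\", "\\\\").Replace("'", "\\'");
    }

    // Builds the whole sentence lookup; RAND() makes MySQL return one random row.
    public static string Build(string sentence)
    {
        return "SELECT * FROM reply WHERE `key` = '" + Escape(sentence)
            + "' ORDER BY RAND() LIMIT 1";
    }
}
```

ORDER BY RAND() is fine for a small corpus like this one, though it gets slow on very large tables.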
2 Designing the word matching corpus
Because our word segmentation algorithm has not been optimized, and our way of finding keywords is not that good either, the answers you give must not be too specific. To put it plainly, the answers must be "vague". The goal is to make people feel that the robot has understood what they said and that the answer is more or less on track. Do not aim for 100%; as long as more than 40% of the answers fit, the person chatting will basically accept it. Also, the answers in the corpus should ideally steer the other side toward replies that your corpus already covers, preferably sentences that whole sentence matching can handle.
A fun example:
Question: Are you a man or a woman? / Are you a man or a woman / Are you a man or a woman? (Whether or not there is punctuation, we record the part-of-speech pattern of the sentence and also do some processing of the punctuation.)
For a sentence like this, segmentation gives us the keyword "or", and from the parts of speech we can tell that it is a question, one that asks for a choice between two alternatives. (Of course, with such a simple algorithm we cannot know that the sentence is actually asking about gender.)
What should your robot answer? It is actually simple. First, the answer should be on track and, as far as possible, should not feel irrelevant to the question. At the very least, people should feel that your robot knows what is being asked. So my robot answers:
The robot replies: Both... haha
Because the answer is in chat language and is a bit of a joke, the person chatting will feel that the robot is not so stupid.
This is just a simple example. Many specific sentences you will have to analyze yourself. Of course, the larger the corpus, the more the robot understands and the smarter it becomes.
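One simple way to handle the punctuation mentioned in the example is to strip it before looking the sentence up, so that variants with and without a question mark hit the same corpus entry. The following is a sketch of that idea only; SentenceNormalizer is an invented name, and my robot's real processing is not shown here. Note that it throws the punctuation away, so if you want to use a question mark as a hint that the sentence is a question, check for it before normalizing.

```csharp
using System.Text;

// Hypothetical helper for cleaning a sentence before corpus lookup.
public static class SentenceNormalizer
{
    // Drops punctuation and symbol characters (ASCII and full-width alike),
    // then trims surrounding whitespace.
    public static string Normalize(string sentence)
    {
        var sb = new StringBuilder(sentence.Length);
        foreach (char c in sentence)
        {
            if (!char.IsPunctuation(c) && !char.IsSymbol(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString().Trim();
    }
}
```

Both "Are you a man or a woman?" and "Are you a man or a woman" normalize to the same string, and "hi~~" becomes "hi".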
3 What to do when no keyword matches
When the corpus is small, it is quite likely that our segmentation algorithm will fail to match a suitable answer. So we need a separate corpus of answers for the case where no keyword matches. These answers need to sound like those of a fortune teller ("suangua"), because the other person may say anything and our robot will not understand. So try to "muddle through" and, at the same time, steer the other person toward topics your robot can answer. You can try chatting with "small" and you will find that when it cannot answer, it casually picks a line of "Buddhist scripture" to say.
In fact, one of the most important skills is to learn from fortune tellers: everything they say is foggy and hard to pin down, yet people still feel it may be right. We have to let the robot learn this technique to achieve the goal of looking "smart".
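Putting the three stages together, the overall reply logic might look roughly like the sketch below. Everything here is illustrative: the class ReplyPipeline, the three in-memory corpora (standing in for database tables) and the sample replies in the usage note are assumptions, not my robot's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical three-stage reply lookup; all names here are illustrative.
public class ReplyPipeline
{
    private readonly IDictionary<string, string[]> wholeSentence;
    private readonly IDictionary<string, string[]> byKeyword;
    private readonly string[] vagueReplies;
    private readonly Random rng = new Random();

    public ReplyPipeline(IDictionary<string, string[]> wholeSentence,
                         IDictionary<string, string[]> byKeyword,
                         string[] vagueReplies)
    {
        this.wholeSentence = wholeSentence;
        this.byKeyword = byKeyword;
        this.vagueReplies = vagueReplies; // assumed to be non-empty
    }

    // sentence: the raw message; words: the same message after segmentation.
    public string Reply(string sentence, IEnumerable<string> words)
    {
        string[] candidates;
        // Stage 1: whole sentence match.
        if (wholeSentence.TryGetValue(sentence, out candidates))
        {
            return Pick(candidates);
        }
        // Stage 2: use the longest word as the keyword.
        string keyword = null;
        foreach (string word in words)
        {
            if (keyword == null || word.Length > keyword.Length)
            {
                keyword = word;
            }
        }
        if (keyword != null && byKeyword.TryGetValue(keyword, out candidates))
        {
            return Pick(candidates);
        }
        // Stage 3: nothing matched, so muddle through with a vague reply.
        return Pick(vagueReplies);
    }

    private string Pick(string[] replies)
    {
        return replies[rng.Next(replies.Length)];
    }
}
```

The vague replies are where the fortune-teller technique lives: lines such as "Really? Why do you say that?" fit almost any input and also nudge the other person toward something the corpus may cover.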
Last words:
In fact, a robot program like this is quick to write. If you are familiar with the material, you should be able to finish it in about a day. I spent about a day and a half, including the time to prepare some of the corpus. If you really want a slightly "smarter" robot to play with, this article should save you at least 3-5 hours of searching for information. If you cannot be bothered to study it yourself, there are programs available for download elsewhere that can only do whole sentence matching; download one, play with it, and leave it at that.
Originally published on my blog: http://bot.donews.net/bot. If you repost this, please do not remove this line.