Word-lookup scripts in Java and Python using the Youdao dictionary

Source: Internet
Author: User

Let's start with the results, shown in two screenshots.
Java:
[Screenshot: output of the Java version]
Python:
[Screenshot: output of the Python version]
On a whim today I wanted to build something that looks up words, so I went to the Youdao dictionary website to see how it works. It turns out that the word being queried is embedded in the address of the results page, and the page that comes back contains the definition we need. So the only technical knowledge this requires is:

Regular expressions
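
Before turning to the regular expressions, the address construction mentioned above can be sketched in Python 3. The base address and the keyfrom parameter are copied from the scripts later in this article; the helper name build_search_url is hypothetical, and using quote_plus for the encoding is an assumption rather than something the original scripts do.

```python
from urllib.parse import quote_plus

# Address and "keyfrom" parameter as used by the scripts in this article.
BASE_URL = "http://dict.youdao.com/search"


def build_search_url(word):
    """Return the lookup URL for a word or phrase (hypothetical helper)."""
    # quote_plus turns spaces into '+' and percent-encodes other unsafe characters.
    return BASE_URL + "?q=" + quote_plus(word.strip()) + "&keyfrom=dict.index"


print(build_search_url("join in"))
# prints http://dict.youdao.com/search?q=join+in&keyfrom=dict.index
```

Encoding the phrase this way gives the same result as the Java version's replacement of spaces with "+", and also covers other characters that are not safe in a URL.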

All we have to do is extract the word's definition from the page source, so only the regular expressions that do the extraction are covered here.
Looking at the page source, we can see that the definitions sit inside a div tag with the class trans-container.

The first goal is to capture that part, and the regular expression can be written like this:

(?s)<div class="trans-container">.*?<ul>.*?</div>
// (?s) makes '.' match line breaks as well; by default it does not.
// .*? matches any number of characters in non-greedy mode.
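
To see what the (?s) flag and the non-greedy .*? do, here is a small Python 3 sketch that runs the pattern against a made-up page fragment. SAMPLE_PAGE is invented for illustration; the real Youdao markup may differ.

```python
import re

# Made-up page fragment for illustration; the real markup may differ.
SAMPLE_PAGE = """<html><body>
<div class="trans-container">
<ul>
<li>n. example one</li>
<li>v. example two</li>
</ul>
</div>
<div class="other">unrelated</div>
</body></html>"""

# Same pattern as above: (?s) lets '.' cross line breaks, .*? stops as early as possible.
BLOCK_PATTERN = r'(?s)<div class="trans-container">.*?<ul>.*?</div>'

block = re.search(BLOCK_PATTERN, SAMPLE_PAGE)
print(block.group())  # ends at the first </div>, so the "other" div is not included

# Without (?s), '.' cannot cross the line break after the opening tag,
# so the pattern does not match this fragment at all.
no_dotall = re.search(r'<div class="trans-container">.*?<ul>.*?</div>', SAMPLE_PAGE)
print(no_dotall)
```

Because the match is non-greedy, it ends at the first closing div after the list instead of running on to the last one in the page.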

Once we have this part, what we actually need are the definitions inside it, so we can go one step further:

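(?m)<li>(.*?)</li>
// (?m) turns on multiline mode, in which ^ and $ match at the start and end of every line. This pattern has no anchors, so the flag is harmless here but not strictly required; each <li>...</li> pair is found either way.
// The parentheses around .*? form a group, so the definition can be read directly and the surrounding tags are dropped.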
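
The second step can be tried on its own in Python 3. The BLOCK string below is a made-up stand-in for the div captured in the first step, not real Youdao output.

```python
import re

# Made-up stand-in for the <div> captured in the first step.
BLOCK = """<div class="trans-container">
<ul>
<li>n. example one</li>
<li>v. example two</li>
</ul>
</div>"""

# With exactly one group in the pattern, findall returns a list of that group's text.
definitions = re.findall(r"<li>(.*?)</li>", BLOCK)
for definition in definitions:
    print("\t" + definition)

# Adding (?m) gives the same result, since the pattern contains no ^ or $.
with_multiline = re.findall(r"(?m)<li>(.*?)</li>", BLOCK)
```

Each list entry is only the text between the tags, which is why the scripts can print the results without any further cleanup.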

Here is the complete code:

1. Java code

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

import java.io.IOException;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) throws IOException {
        CloseableHttpClient httpClient = HttpClients.createDefault();
        System.out.print("Please enter the word you want to check:");
        Scanner s = new Scanner(System.in);
        String word = s.nextLine();
        // A phrase such as "join in" becomes "join+in" in the address
        word = word.replaceAll(" ", "+");

        // Build the lookup address from the word
        HttpGet getWordMean = new HttpGet("http://dict.youdao.com/search?q=" + word + "&keyfrom=dict.index");
        CloseableHttpResponse response = httpClient.execute(getWordMean);
        // Read the source of the returned page
        String result = EntityUtils.toString(response.getEntity());
        response.close();
        httpClient.close();

        // Note (?s): it lets '.' match line breaks, which it does not do by default
        Pattern searchMeanPattern = Pattern.compile("(?s)<div class=\"trans-container\">.*?<ul>.*?</div>");
        // m1 captures the whole <div> that contains the translations
        Matcher m1 = searchMeanPattern.matcher(result);

        if (m1.find()) {
            // All definitions, still including the page tags
            String means = m1.group();
            // (?m) is multiline mode; it only affects ^ and $, so it is optional here
            Pattern getChinese = Pattern.compile("(?m)<li>(.*?)</li>");
            Matcher m2 = getChinese.matcher(means);

            System.out.println("Interpretation:");
            while (m2.find()) {
                // In Java, (.*?) is group 1, so use group(1)
                System.out.println("\t" + m2.group(1));
            }
        } else {
            System.out.println("no explanation found.");
            System.exit(0);
        }
    }
}

2. Python code

#!/usr/bin/python
# coding: utf-8
# Python 2 script (print statement, urllib.urlopen)
import urllib
import sys
import re

if len(sys.argv) == 1:  # no word given, so print the usage
    print "Usage: ./dict.py the word to find"
    sys.exit()

# The query may be a phrase with spaces in it, such as "join in",
# so the command-line words are joined back together
word = " ".join(sys.argv[1:])
print "Word: " + word

# Lookup address; quote_plus encodes the spaces in a phrase as '+'
searchUrl = "http://dict.youdao.com/search?q=" + urllib.quote_plus(word) + "&keyfrom=dict.index"
response = urllib.urlopen(searchUrl).read()  # source of the returned page

# Extract the part of the page source that holds the definitions
searchSuccess = re.search(r'(?s)<div class="trans-container">.*?<ul>.*?</div>', response)

if searchSuccess:
    # Get the definitions themselves; with only one group,
    # findall returns a list of that group's strings
    means = re.findall(r"(?m)<li>(.*?)</li>", searchSuccess.group())
    print "Interpretation:"
    for mean in means:
        print "\t" + mean  # print each definition
else:
    print "no explanation found."
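
The script above uses Python 2 constructs (the print statement and urllib.urlopen). As a rough sketch only, the same two-step lookup could look like this in Python 3. The function names extract_definitions and look_up are hypothetical, the page is assumed to be UTF-8 encoded, and the markup is assumed to still match the patterns described in this article.

```python
import re
from urllib.parse import quote_plus
from urllib.request import urlopen

# The two patterns discussed earlier in the article.
BLOCK_RE = re.compile(r'(?s)<div class="trans-container">.*?<ul>.*?</div>')
ITEM_RE = re.compile(r"<li>(.*?)</li>")


def extract_definitions(html):
    """Return the list of definitions found in the page source (hypothetical helper)."""
    block = BLOCK_RE.search(html)
    if block is None:
        return []  # no definition block in this page
    return ITEM_RE.findall(block.group())


def look_up(word):
    """Fetch the results page for a word and return its definitions (needs network access)."""
    url = "http://dict.youdao.com/search?q=" + quote_plus(word) + "&keyfrom=dict.index"
    with urlopen(url) as response:
        # Assumption: the page is UTF-8; undecodable bytes are replaced instead of raising.
        html = response.read().decode("utf-8", errors="replace")
    return extract_definitions(html)
```

Calling look_up("join in") would perform the actual request. Keeping the parsing in extract_definitions, apart from the network call, means it can be checked against a saved page without going online.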
