My school required us to grind through an online course called "Packaging the World". The course has more than 200 multiple-choice questions, and they can only be answered on a mobile phone; the web version does not allow answering questions. Watching the videos was never going to happen, so I wrote a few lines of code to search Baidu for the answers.
The idea is as follows:
- Project the phone screen onto the computer;
- Take a screenshot and recognize the text in the image;
- Search Baidu for the recognized text;
- Extract keywords from the returned HTML.
Environment: Python 3.6. Third-party libraries: PyAutoGUI, PIL, pytesseract. Recognition engine: <a href="https://github.com/tesseract-ocr/tesseract">tesseract-ocr</a>.
To recognize Chinese, download the Simplified Chinese language pack <a href="https://github.com/tesseract-ocr/tesseract/wiki/Data-Files">chi_sim</a> for the OCR engine and put it in tesseract-ocr\tessdata. After installing the OCR engine, you also need to configure the path used to call it: find pytesseract.py in Python36\lib\site-packages\pytesseract (this is my Windows path), open it, and add a path:
    # CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
    tesseract_cmd = 'tesseract'
    tesseract_cmd = 'C:/Program Files (x86)/tesseract-ocr/tesseract.exe'
    img_mode = 'RGB'
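Before hard-coding a path, it can help to check where the executable actually lives. The following is a minimal sketch using only the standard library; `find_tesseract` is a hypothetical helper name, and the fallback path is simply the one from this post, which may differ on other machines.

```python
import os
import shutil


def find_tesseract(name="tesseract", fallbacks=()):
    """Return a path to the Tesseract executable, or None if not found.

    `name` is looked up on PATH first; `fallbacks` are explicit file
    paths tried in order (assumed install locations).
    """
    on_path = shutil.which(name)
    if on_path:
        return on_path
    for candidate in fallbacks:
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path = find_tesseract(
        fallbacks=["C:/Program Files (x86)/tesseract-ocr/tesseract.exe"]
    )
    print(path or "tesseract not found")
```

With pytesseract installed, the located path can also be applied at runtime via `pytesseract.pytesseract.tesseract_cmd = path`, which avoids editing the library file.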
Then use AirDroid, Vysor, a phone assistant, or a similar tool to project the phone screen onto the computer, and use the mouse to determine the coordinates. The code is as follows:
    import pyautogui as pag
    x, y = pag.position()
    posStr = "Position:" + str(x).rjust(4) + ',' + str(y).rjust(4)
    print(posStr)
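A screenshot region is described by a box of the form (left, top, right, bottom), so two such mouse readings taken at opposite corners are enough to define it. The sketch below shows one way to combine them; `make_bbox` is a hypothetical helper name, and the sample points are the corner values used in this post.

```python
def make_bbox(start, end):
    """Build a (left, top, right, bottom) box from two corner points.

    The points may be given in either order; the box is normalized so
    that left <= right and top <= bottom.
    """
    (x1, y1), (x2, y2) = start, end
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


# Corner readings given in reverse order still yield a valid box.
print(make_bbox((425, 327), (0, 245)))  # (0, 245, 425, 327)
```

Normalizing the order means it does not matter which corner is read first.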
Read two coordinates this way (the start and end points of the capture region), then plug them into the following code, which calls the OCR engine. The recognized characters come back separated by spaces, so the spaces in the string have to be removed. The code is as follows:
    from urllib.parse import quote
    from PIL import Image
    from PIL import ImageGrab
    import pytesseract
    import webbrowser

    pos = (0, 245, 425, 327)  # capture region: (left, top, right, bottom)
    cut_img = ImageGrab.grab(pos)
    cut_img.save('c:/imgsave/1.jpg')  # save to folder
    print("Screenshot success")

    # call the recognition engine
    text = pytesseract.image_to_string(Image.open('c:/imgsave/1.jpg'), lang='chi_sim')
    text = text.replace(" ", "")  # remove spaces
    print(text)
    url = 'http://www.baidu.com/s?wd=%s' % quote(text)  # call Baidu search; quote() percent-encodes the query
    webbrowser.open(url)
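OCR output can contain line breaks and non-ASCII characters, so the cleaning and URL-building steps can be factored into two small functions. This is a minimal sketch using only the standard library; `clean_ocr_text` and `build_search_url` are hypothetical helper names, and the sample string merely stands in for real OCR output.

```python
from urllib.parse import urlencode


def clean_ocr_text(text):
    """Remove all whitespace (spaces, tabs, newlines) from OCR output."""
    return "".join(text.split())


def build_search_url(query, base="http://www.baidu.com/s"):
    """Return a search URL with the query percent-encoded as UTF-8."""
    return base + "?" + urlencode({"wd": query})


raw = "你 好\n"  # stand-in for OCR output, not a real recognition result
print(build_search_url(clean_ocr_text(raw)))
# http://www.baidu.com/s?wd=%E4%BD%A0%E5%A5%BD
```

The resulting string can be passed to `webbrowser.open` in the same way as the URL built by hand.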
In the end, because the search results all came from question banks, I did not extract keywords from the HTML. To be honest, I was just lazy.
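For reference, the skipped extraction step could look roughly like the following. This is only a sketch under assumptions: it takes HTML that has already been downloaded (fetching is not shown), and it simply counts how often each candidate answer appears in the visible text. `TextExtractor`, `count_options`, and the sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def count_options(html, options):
    """Count how often each candidate answer appears in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    return {option: text.count(option) for option in options}


# Made-up HTML standing in for a downloaded result page.
sample = (
    "<html><script>var a='carton';</script>"
    "<p>Answer: carton</p><p>carton packaging</p></html>"
)
print(count_options(sample, ["carton", "plastic"]))
# {'carton': 2, 'plastic': 0}
```

The option with the highest count is only a hint, since question-bank pages often list every option next to the question.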