Natural Language Processing with Python

Source: Internet
Author: User
Tags: nltk


A year ago, I could not have dreamed of writing a technical summary here. By chance I ended up at a university in southwest China and became a liberal arts major turned engineering student. Apart from watching films every day, I spend my time cramming CS. My advisor works in computational linguistics, so my top priority is to learn natural language processing and prepare for research (serious face).

On to the topic. I found the book "Natural Language Processing with Python" in the library. The authors are Steven Bird, Ewan Klein, and Edward Loper. Here is a Douban link for reference: https://book.douban.com/subject/5336893/

IDE: PyCharm

I chose PyCharm as my IDE, since it is said to be good. Download and install everything as follows:
1. Download Python from the official Python website and install it. Then open a terminal and enter python; the interpreter displays its version information when it starts.
2. Download PyCharm, an IDE for Python development. An activation code for the Professional edition can be found by searching Baidu.
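The version check in step 1 can also be done from inside Python, which is handy later for confirming which interpreter PyCharm is actually using. This is only a small sketch; the helper name python_version_string is made up for illustration.

```python
import sys


def python_version_string() -> str:
    """Return the running interpreter's version as 'major.minor.micro' (hypothetical helper)."""
    v = sys.version_info
    return f"{v.major}.{v.minor}.{v.micro}"


# Show which interpreter is running this script and where it lives.
print("Running Python", python_version_string())
print("Interpreter path:", sys.executable)
```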

Python file encoding declaration

1. Location: it must be placed on the first or second line of the Python file.
2. Format:
   a. With an equal sign:
        #coding=<encoding name>
   b. The most common form, recognized by most editors:
        #!/usr/bin/python
        # -*- coding: <encoding name> -*-
   c. The vim form:
        #!/usr/bin/python
        # vim: set fileencoding=<encoding name> :
3. Purpose: it tells the Python interpreter which encoding to use when reading the source file. Without a declaration, Python 2 assumes ASCII, so if the file contains non-ASCII characters the interpreter reports an error when it reads the file. (Python 3 assumes UTF-8 by default, so there the declaration is only needed for other encodings.)
4. Example: the first line indicates that the script is run with python, and the second line specifies UTF-8 as the file encoding.
        #!/usr/bin/python
        # -*- coding: utf-8 -*-
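As a rough sketch of how such a declaration is picked up, the standard library function tokenize.detect_encoding can be pointed at small sources that use the three formats. The helper name declared_encoding and the sample sources are made up for illustration, and the example assumes Python 3, where a file with no declaration is read as UTF-8.

```python
import io
import tokenize


def declared_encoding(source_bytes: bytes) -> str:
    """Return the encoding Python would use to read this source (hypothetical helper)."""
    # detect_encoding looks at no more than the first two lines, which is
    # why the declaration has to sit on line 1 or line 2.
    encoding, _first_lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Sample sources for the three declaration formats, plus one without any.
equal_sign_form = b"#coding=utf-8\nprint('hi')\n"
emacs_form = b"#!/usr/bin/python\n# -*- coding: gbk -*-\nprint('hi')\n"
vim_form = b"#!/usr/bin/python\n# vim: set fileencoding=latin-1 :\nprint('hi')\n"
no_declaration = b"print('hi')\n"

# Note that detect_encoding normalizes some names, e.g. latin-1 is
# reported as iso-8859-1.
for sample in (equal_sign_form, emacs_form, vim_form, no_declaration):
    print(declared_encoding(sample))

# A GBK-encoded file containing Chinese text decodes correctly once the
# declared encoding is used.
gbk_source = "# -*- coding: gbk -*-\nprint('你好')\n".encode("gbk")
decoded = gbk_source.decode(declared_encoding(gbk_source))
print(decoded)
```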
5. Note: a single Python source file may use only one encoding. Multiple encodings cannot be mixed in one file; otherwise an error is raised.
6. How the Python tokenizer and compiler work together:
   a. Read the file.
   b. Decode it to Unicode according to its declared encoding.
   c. Convert it to a UTF-8 string.
   d. Tokenize the UTF-8 string.
   e. Compile it, creating Unicode objects.
7. UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding for Unicode, which is also known as the "universal code".

In short, to make a Python program support Chinese characters, add such an encoding declaration at the beginning of the Python source file.

My First Python Program: Hello World!

1. File --> New Project --> select the path where the project is saved (to me this feels like setting the working directory in R).
2. Right-click the project you just created --> New --> Python File --> give the file a name (I think of this as a script file, similar to a script in R).
3. Enter the file encoding declaration (not required here, because we only type the English "Hello World!", not Chinese).
4. Type the Hello World program:
        print("Hello World!")
5. You will find that the Run and Debug buttons (green triangles) are gray, because we have not set up a run configuration (the "console") yet.

Setting up the Python run configuration

1. Click the black inverted triangle next to Run to open the Run/Debug Configurations page (or use Run -> Edit Configurations).
2. Click the green plus sign to create a configuration and select Python (because the source is a Python program).
3. On the configuration page, enter a name in the Name field, then click the Script option and select the .py file you just wrote.
4. Click OK to return to the editor. The Run and Debug buttons are now green. Click Run to view the output.

Installing packages in PyCharm (Mac)

1. PyCharm -> Preferences -> Project Interpreter.
2. The + button adds a package, the - button removes a package, and the arrow button updates a package.

NLTK (Natural Language Toolkit)

Enter the following code to import the NLTK package and download the required data sets (the corpora used in the book):
        import nltk
        nltk.download()
Run it and the NLTK Downloader opens. The Collections tab on the downloader shows how the packages are grouped into sets, and you should select the line labeled "book" to obtain all data required for the examples and exercises in this book. I have to say the download speed is amazing. The network at MIT (Minhang Institute of Technology, also rendered as Minhang Men's Vocational and Technical College) is very fast, and there is no Internet fee to pay!!! Time for a film before dinner! Hey.
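If the graphical downloader is inconvenient, the same collection can probably be fetched without the GUI by passing its identifier to nltk.download. The sketch below assumes NLTK is installed and that the machine has network access; the helper names nltk_installed and download_book_data are made up for illustration, and the download is wrapped in a function so that nothing is fetched until you call it.

```python
import importlib.util

# Identifier of the line labeled "book" in the downloader's Collections tab.
BOOK_COLLECTION = "book"


def nltk_installed() -> bool:
    """Return True if the nltk package can be imported (hypothetical helper)."""
    return importlib.util.find_spec("nltk") is not None


def download_book_data(quiet: bool = False) -> bool:
    """Download the book collection without opening the GUI (hypothetical helper).

    Needs network access; returns True when the download succeeds.
    """
    import nltk  # imported here so the script still loads without NLTK

    return nltk.download(BOOK_COLLECTION, quiet=quiet)


if not nltk_installed():
    # Installing from a terminal is an alternative to the PyCharm dialog.
    print("NLTK is not installed; try: python -m pip install nltk")
```

Calling download_book_data() once is enough; after that, the book's examples load their texts with from nltk.book import *.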

 
