This is a free Python crawler course for beginners. It has only 7 sections and gives readers with no prior experience a first understanding of crawlers, so that by following the course you can crawl resources yourself. Read the article, then open your computer and practice hands-on. At an average of 45 minutes per section, you could get through the door of web crawling today if you want ~
Without further ado, let's formally start our first lesson, "Installing the Python environment" ~
Here we go, eyes on the blackboard ~
1. Installing Anaconda
In this course we use Python 3. As for why we chose Python 3, hmm!
As the saying goes, to do good work you must first sharpen your tools: before you learn to write crawlers, you have to set up your programming environment. Enough talk, set it up as follows:
1.1 Downloading Anaconda
When you open the Anaconda website, you see a page like this:
Depending on your operating system, select the matching version of Anaconda (remember to choose the Python 3.6 version). Mac OS users should choose the Mac version; if you want to avoid trouble, select the Graphical Installer.
1.2 Installing Anaconda
Install to the default location:
Check both selection boxes, then install:
1.3 View Anaconda from the Start menu
After Anaconda is installed, you can find it in the Start menu. The included components are as shown:
The ones we will mainly use are:
Anaconda Prompt: the command line that comes with Anaconda
Jupyter Notebook: an easy-to-use IDE for getting started
2. Installing common packages
2.1 Installing the Python package requests
Open a cmd terminal and enter pip install requests. If the installation is unsuccessful, you can try: conda install requests
If "Successfully installed" appears, the installation succeeded. To verify it, first enter python, then enter import requests; if no error occurs, the package is installed and can be used normally. Note: when you're done, exit with quit().
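The same check can also be run as a short script or notebook cell. This is only a minimal sketch: it assumes requests has already been installed by one of the commands above, and it relies on the `__version__` attribute that the requests package exposes.

```python
import requests  # raises ImportError if the package is not installed

# Print the installed version to confirm the package can be imported and used.
print("requests version:", requests.__version__)
```

If the import fails with an ImportError, the package is not installed in the Python environment you are running.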
If the installation reports that conda is not an internal or external command, install it as follows (if you got no error, you do not need this method):
On the Start menu, open Anaconda Prompt:
In Anaconda Prompt, enter conda install requests:
2.2 Installing the Python package lxml
Likewise, enter conda install lxml in the terminal. If "Successfully installed" appears, the installation succeeded; if it did not succeed, try the following method.
Go to http://www.lfd.uci.edu/~gohlke/pythonlibs/ and manually download the third-party package you need to install (note whether your Python version is 32-bit or 64-bit).
In the directory where the downloaded file is located, hold down Shift and right-click, choose to open a PowerShell window here, and on the command line run pip install followed by the full name of the downloaded file to complete the installation.
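To pick the right file for a manual download, it helps to know your Python version and whether the interpreter is 32-bit or 64-bit. A minimal sketch using only the standard library (the variable names are just for illustration):

```python
import struct
import sys

# The size of a C pointer in bytes, times 8, gives the interpreter's bitness.
bits = struct.calcsize("P") * 8

# Major and minor version numbers, e.g. 3 and 6 for Python 3.6.
major, minor = sys.version_info[:2]

print(f"Python {major}.{minor}, {bits}-bit")
```

Run it with the same Python that you will install the package into, since several Python installations can coexist on one machine.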
By now you should know how to install a Python package. The general method is to enter in the terminal: conda install + package name, or pip install + package name. In the special case where a package cannot be installed this way, you can download it and install it manually.
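As a rough illustration of that workflow, the sketch below checks which packages are still missing and prints the install command to try. The helper name missing_packages is hypothetical, and the script only prints suggestions; it does not install anything.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The two packages used in this lesson.
for name in missing_packages(["requests", "lxml"]):
    print(f"{name} is not installed; try: conda install {name} (or: pip install {name})")
```

If nothing is printed, both packages are already available in the current environment.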
3. Jupyter Notebook
3.1 Opening Jupyter Notebook
On the Start menu, open Jupyter Notebook:
Jupyter will open automatically in your web browser:
3.2 Jupyter Notebook Interface
Files: all the projects (code) in your current working environment are stored here by default:
Running: the projects you are currently running are listed here:
3.3 Create a new document and start writing code
Click New > Python 3 at the top right to create an IPython file.
Click "Untitled" at the top to change the name of the document. You can write code in the space below:
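As a first cell to try, the following minimal sketch confirms that the notebook is running Python 3, which this course assumes. Type it into the empty cell and run it (Shift+Enter runs a cell in Jupyter Notebook):

```python
import sys

# Show which interpreter the notebook is using.
print(sys.version)

# This course assumes Python 3.
is_python3 = sys.version_info.major == 3
print("Running Python 3:", is_python3)
```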
3.4 Jupyter Notebook Function Introduction
4. Create the first example: crawl the Baidu homepage
With only four lines of code, we can download the contents of the homepage of Baidu:
1. Import the requests library; 2. Download Baidu homepage content; 3. Change the encoding; 4. Print content
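The four steps above could look roughly like the sketch below. Several details are assumptions made for illustration: the URL https://www.baidu.com, the utf-8 encoding, the 10-second timeout, and the wrapper function download_homepage, which is hypothetical and only there so the steps can be reused.

```python
import requests  # 1. import the requests library


def download_homepage(url="https://www.baidu.com"):
    """Download a page and return its text decoded as utf-8."""
    response = requests.get(url, timeout=10)  # 2. download the homepage content
    response.encoding = "utf-8"  # 3. change the encoding (assumed to be utf-8)
    return response.text


# 4. print the content (uncomment to run; needs a network connection)
# print(download_homepage())
```

In a notebook cell, calling print(download_homepage()) prints the HTML source of the page.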
How crawlers work and what each line of this code means will be explained in detail in the next section's example ~
All right, that's it for this lesson.
Python Crawler Primer 1: Installing the Python Environment