Python (pronounced /ˈpaɪθən/ in British English and /ˈpaɪθɑːn/ in American English) is an object-oriented, interpreted programming language. It is also a powerful general-purpose language with more than 20 years of development history, and it is mature and stable. It ships with a comprehensive standard library that is easy to understand and makes many common tasks simple. Its syntax is simple and clear and, unlike most other programming languages, it uses indentation to delimit blocks of statements.
Python supports many programming paradigms, including imperative, object-oriented, functional, aspect-oriented, and generic programming. Like Scheme, Ruby, Perl, Tcl, and other dynamic languages, Python has garbage collection and manages memory automatically. It is often used as a scripting language for system administration tasks and network programming, but it is also well suited to high-level tasks. The Python virtual machine runs on almost all operating systems. With tools such as py2exe, PyPy, and PyInstaller, you can convert Python source code into a program that can run independently of the Python interpreter.
I have been teaching myself Python for a while. I built a website with Django and wrote crawlers for a few simple sites with Requests and BeautifulSoup. Over the weekend I studied some more and decided to crawl my QQ Space shuoshuo (status posts), save the content to a text file, and generate a word cloud from it.
I haven't logged in to QQ for a long time, and I haven't touched QQ Space for years. It is full of memories from my school days. I read through them smiling, and then laughing out loud... Ha ha ~~
Pictures or it didn't happen.
Back then I was still in my prime, humorous and funny...
This time I use Selenium for the simulated login, BeautifulSoup4 to scrape the data, and wordcloud to generate the word cloud.
BeautifulSoup Installation
pip install beautifulsoup4
See the official BeautifulSoup4 documentation for details.
A parser is also needed. I chose html5lib:
pip install html5lib
The official documentation includes a table of the main parsers with their pros and cons.
Simulated Login with Selenium
Selenium is used to log in to QQ Space. Install it with:
pip install selenium
I'm using the Chrome browser, so I call
webdriver.Chrome()
to get a driver for Chrome.
You also need to download and install the driver that matches your browser; otherwise the script fails with the error
'chromedriver' executable needs to be in PATH
I'm on a Mac, so I searched online for an article on downloading the driver.
On Windows it is similar: download the matching driver, unzip it, and put the extracted .exe file in the Python installation directory, such as D:\python. You also need to add the Python installation directory to the system PATH environment variable.
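Before starting the browser, it can help to check whether the driver is really on PATH. The following is only a small sketch using the standard library; the function name find_chromedriver is my own and not part of Selenium.

```python
import shutil


def find_chromedriver(name="chromedriver"):
    """Return the full path of the driver executable, or None if it is not on PATH."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_chromedriver()
    if path is None:
        print("chromedriver was not found on PATH; download it and add its folder to PATH")
    else:
        print("chromedriver found at", path)
```

If this prints that the driver was not found, Selenium will most likely fail with the PATH error quoted above.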
A Python learning path can be divided into four main stages: basics, advanced topics, frameworks, and hands-on projects.
Basics: stage one, understanding core Python; stage two, object-oriented programming (with emphasis on programming ability); stage three, object-oriented design ideas such as encapsulation and inheritance; stage four, advanced Python topics.
Advanced topics: first, Linux fundamentals; second, Python web tools; third, Python deployment tools; fourth, relational databases; fifth, the basic principles of Python web frameworks.
Frameworks: first, Python web development with web.py; second, Django basics; third, Flask basics; fourth, Tornado basics.
Hands-on projects: a personal blog system, an enterprise OA system, and a network disk system.
The QQ login page is http://i.qq.com. Use webdriver to open it:
driver = webdriver.Chrome()
driver.get("http://i.qq.com")
After the page opens, right-click and inspect the page elements. The account and password login form sits inside the login_frame frame, so first switch to that frame with
driver.switch_to.frame("login_frame")
Then the script clicks the account/password login button, enters the account and password, logs in, and opens the shuoshuo page.
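The login steps described above might look roughly like the sketch below. Only http://i.qq.com and login_frame come from the text. The element ids (switcher_plogin, u, p, login_button), the shuoshuo URL format ending in /311, and the environment variable names QQ_NUMBER and QQ_PASSWORD are all assumptions for illustration, so check them against the real page with F12 before relying on them.

```python
import os
import time

LOGIN_URL = "http://i.qq.com"


def read_credentials(env=None):
    """Return (qq_number, password) read from environment variables.

    QQ_NUMBER and QQ_PASSWORD are hypothetical variable names.
    """
    env = os.environ if env is None else env
    return env["QQ_NUMBER"], env["QQ_PASSWORD"]


def shuoshuo_url(qq_number):
    """Build the shuoshuo page URL (assumed format, not given in the text)."""
    return "http://user.qzone.qq.com/{}/311".format(qq_number)


def login(driver, qq_number, password, wait_seconds=3):
    """Log in through the login_frame frame, then open the shuoshuo page."""
    # Imported here so the helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By

    driver.get(LOGIN_URL)
    driver.switch_to.frame("login_frame")
    # The four element ids below are assumptions about the login form.
    driver.find_element(By.ID, "switcher_plogin").click()
    driver.find_element(By.ID, "u").send_keys(qq_number)
    driver.find_element(By.ID, "p").send_keys(password)
    driver.find_element(By.ID, "login_button").click()
    time.sleep(wait_seconds)  # crude wait for the login to finish
    driver.switch_to.default_content()
    driver.get(shuoshuo_url(qq_number))


def main():
    from selenium import webdriver

    qq_number, password = read_credentials()
    driver = webdriver.Chrome()
    login(driver, qq_number, password)
```

Calling main() starts Chrome and runs the whole login. Reading the password from the environment keeps it out of the script itself.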
At this point the QQ shuoshuo page is open. Note that for some accounts a prompt box appears once the page opens; you need to simulate a click to close it.
Damn, I actually used to have a Yellow Diamond membership, how scary ~~ And my space avatar is so young, so "non-mainstream"...
Also, because the content is loaded dynamically, the script has to scroll down automatically until all the content on the page has loaded, and then simulate a click on the next-page link to load more.
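The automatic scrolling could be sketched like this. It assumes driver is a Selenium WebDriver (anything with an execute_script method works), and the function name scroll_to_bottom, the pause, and the round limit are my own choices. The next-page click is left out because the text does not say how that link is located.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down repeatedly until the page height stops growing.

    Returns the final page height reported by the browser.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the dynamically loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so we are at the bottom
        last_height = new_height
    return last_height
```

The max_rounds limit is there so that a page which keeps growing forever cannot trap the script in an endless loop.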
Scraping the Shuoshuo with BeautifulSoup
Inspecting the page with F12 shows that the shuoshuo posts sit inside the feed_wrap element, as an array of tags; the text of each post is in the tag with class="bd".
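Extracting the posts and saving them could be sketched as follows. I am assuming here that feed_wrap is an element id and that matching on class "bd" alone is enough; the concrete tag names are not given above, so the code does not depend on them. The function names extract_posts and save_posts are made up for this example.

```python
def extract_posts(html, parser="html5lib"):
    """Return the text of every class="bd" element inside feed_wrap."""
    # Imported here so save_posts below works without BeautifulSoup installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, parser)
    container = soup.find(id="feed_wrap")  # assumption: feed_wrap is an id
    if container is None:
        container = soup  # fall back to searching the whole page
    posts = []
    for node in container.find_all(class_="bd"):
        text = node.get_text(strip=True)
        if text:  # skip empty posts
            posts.append(text)
    return posts


def save_posts(posts, path="qq_word"):
    """Write one post per line to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(posts))
```

With a logged-in Selenium driver, the page source can be passed straight in, for example save_posts(extract_posts(driver.page_source)).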
At this point the QQ shuoshuo posts have been scraped and saved in the qq_word file.
Next, generate the word cloud.
Word Cloud
Use the wordcloud package to generate the word cloud: pip install wordcloud
You can also segment the text with jieba here. I didn't, because I feel the posts only carry their mood when read as whole sentences; that is a personal preference. With jieba segmentation you can see which words appear most often in the posts.
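For anyone who does want the high-frequency words, a rough sketch with jieba and collections.Counter could look like this. The function name top_words and the min_len filter are my own choices; jieba.lcut is only used when no other tokenizer is passed in.

```python
from collections import Counter


def top_words(text, tokenize=None, n=10, min_len=2):
    """Return the n most common words as (word, count) pairs."""
    if tokenize is None:
        import jieba  # third-party: pip install jieba

        tokenize = jieba.lcut
    words = [w.strip() for w in tokenize(text)]
    # Drop very short tokens such as single characters and punctuation.
    counts = Counter(w for w in words if len(w) >= min_len)
    return counts.most_common(n)
```

Passing a different tokenize function, for example str.split for space-separated text, makes it easy to try the counting without jieba.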
Next, set some properties of WordCloud. Note that you must set the font_path property here; otherwise Chinese characters will come out garbled.
One more reminder: if you are using a virtual environment, don't run the word cloud script inside it, or you may get this error: RuntimeError: Python is not installed as a framework. The Mac OS X backend will not be able to function correctly if Python is not installed as a framework. See the Python documentation for more information on installing Python as a framework on Mac OS X. Please either reinstall Python as a framework, or try one of the other backends. If you are using (Ana)Conda please install python.app and replace the use of 'python' with 'pythonw'. See 'Working with Matplotlib on OSX' in the Matplotlib FAQ for more information. That is what happened to me; I ran deactivate to leave the virtual environment and then ran the script again.
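A minimal version of the word cloud script, under my own assumptions, might look like this. The input file name qq_word comes from the text; the output file name, the image size, and the other settings are arbitrary choices, and font_path must point to a font on your machine that contains Chinese glyphs. This sketch writes the image to a file rather than displaying it in a window.

```python
def load_text(path="qq_word"):
    """Read the saved shuoshuo text as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def make_word_cloud(text, font_path, out_path="qq_word_cloud.png"):
    """Generate a word cloud image from text and save it to out_path."""
    # Imported here so load_text above works without wordcloud installed.
    from wordcloud import WordCloud

    cloud = WordCloud(
        font_path=font_path,  # needed so Chinese characters are not garbled
        background_color="white",
        width=800,
        height=600,
        max_words=200,
    )
    cloud.generate(text)
    cloud.to_file(out_path)
    return out_path
```

A call would then be something like make_word_cloud(load_text(), "/path/to/a/chinese/font.ttf"), where the font path is a placeholder to replace with a real one.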
With that, the QQ shuoshuo content has been crawled and the word cloud generated.
What can Python do?
Web development and crawlers are well suited to beginners with no prior experience.
Automated operations development and automated testing suit people who already work in operations or testing.
Big data and data analysis require fairly strong domain expertise.
Scientific computing is generally used by researchers.
Machine learning and AI first demand a high level of education and then a strong background in advanced mathematics, so they are very difficult.
Crawling QQ shuoshuo with a Python crawler and generating a word cloud: full of memories!