Shixiseng website recruitment information crawling and visual analysis


Objective: Use Python to crawl posting data from the internship website, analyze the job information, and visualize the results.

Software: Python 3

  1. Introduction to the Website Crawler

Shixiseng internship website: http://www.shixiseng.com/

 

Enter a search keyword in the search box; the site jumps to a results page. Press F12 (Fn + F12 on some laptops) to open the browser's developer tools.

Refresh the page and click the first request listed in the developer tools.

This URL is the one the crawler will request. The query parameter k is the search keyword and p is the page number. Clicking the last page shows there are 109 pages of results in total.
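For example, searching for 数据分析 (data analysis) on page 1 gives a URL of roughly this shape; the exact path is an assumption, while k and p are the parameters visible in the address bar:

    http://www.shixiseng.com/interns?k=数据分析&p=1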

 

 

Then the Request Headers information is used so that the crawler's requests look like those of an ordinary browser.
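A minimal sketch with the requests library; the User-Agent string is just an example copied from a desktop browser:

    import requests

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                      "AppleWebKit/537.36 (KHTML, like Gecko) "
                      "Chrome/64.0.3282.186 Safari/537.36",
    }
    # Fetch the first results page while looking like a normal browser
    html = requests.get("http://www.shixiseng.com/interns?k=数据分析&p=1",
                        headers=headers).text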

Right-click the webpage and view the page source. We want to crawl the job name, the job-detail URL, the salary, the work location, and other fields. The regular expression is as follows:
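A sketch of such a pattern, reusing the html fetched above and assuming a simplified page structure; the class names and tag layout below are illustrative, not the site's actual markup:

    import re

    # Assumed per-job markup:
    # <a class="name" href="/intern/...">job name</a> ...
    # <span class="money">salary</span> ... <span class="city">city</span>
    pattern = re.compile(
        r'class="name[^"]*" href="([^"]+)"[^>]*>([^<]+)</a>'   # detail URL, job name
        r'.*?class="money[^"]*">([^<]+)<'                      # salary
        r'.*?class="city[^"]*">([^<]+)<',                      # city
        re.S)
    jobs = pattern.findall(html)  # list of (url, name, salary, city) tuples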

 

Okay. With the groundwork done, we can build out the code.

 

To flip through the pages and crawl the next one, loop over the parameter p so that the crawler fetches every page in turn, as sketched below.
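A minimal sketch of that loop, reusing the headers and regex pattern from above and assuming all 109 pages are fetched:

    results = []
    for p in range(1, 110):  # pages 1..109
        page_url = "http://www.shixiseng.com/interns?k=数据分析&p={}".format(p)
        html = requests.get(page_url, headers=headers).text
        results.extend(pattern.findall(html))  # accumulate all records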

 

Then, assemble the crawled fields and write them to an Excel file.

Required: import xlwt  # write Excel (.xls) files
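A sketch of the Excel export with xlwt; the sheet name and column order are assumptions:

    import xlwt  # writes legacy .xls workbooks

    book = xlwt.Workbook(encoding="utf-8")
    sheet = book.add_sheet("jobs")
    for col, title in enumerate(["url", "name", "salary", "city"]):
        sheet.write(0, col, title)              # header row
    for row, job in enumerate(results, 1):      # one scraped record per row
        for col, value in enumerate(job):
            sheet.write(row, col, value)
    book.save("shixiseng.xls")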

 

Finally, run the code and collect the results: 1,085 records in total, taking a little over 30 seconds.

 

 

  2. Python Data Analysis

 

First, import the required packages, then read the Excel file.
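A minimal sketch, assuming the file name used in the crawler step:

    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_excel("shixiseng.xls")  # reading .xls needs the xlrd package
    print(df.head())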

 

This gives:

 

The data for two of the columns is temporarily unavailable on the site, so those two columns are dropped.
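Dropping them might look like this; the column names here are placeholders for whatever df.columns actually shows:

    # Replace "url" and "detail" with the real column names
    df = df.drop(columns=["url", "detail"])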

 

 

 

The analysis focuses on salary, working days per week, work location, and internship-duration requirements.

 

 

Let's start with a simple one:

1. Distribution of working-days requirements
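A sketch of the count and bar chart, assuming the working-days requirement sits in a column named days:

    days_counts = df["days"].value_counts()
    print(days_counts / days_counts.sum())  # share of each requirement

    days_counts.plot(kind="bar")
    plt.title("Working days per week")
    plt.show()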

 

2. Internship-duration requirements

 

 

 

3. Distribution of internship locations

 

 

What a mess: there are far too many locations to read.

 

Filter out locations with a frequency of less than 5:
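One way to apply that filter, assuming the location column is named city:

    city_counts = df["city"].value_counts()
    city_counts = city_counts[city_counts >= 5]  # keep locations seen 5+ times
    city_counts.plot(kind="bar")
    plt.title("Internship locations (frequency >= 5)")
    plt.show()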

 

 

4. Internship salary level

 

 

The same problem again: far too many distinct values.

 

 

There are 168 distinct salary values, which is why the chart is so crowded. Filter out values with a frequency of less than 10, as sketched below.
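The same filter with a threshold of 10, assuming the salary column is named salary:

    salary_counts = df["salary"].value_counts()
    salary_counts = salary_counts[salary_counts >= 10]  # keep values seen 10+ times
    salary_counts.plot(kind="bar")
    plt.title("Internship salary (frequency >= 10)")
    plt.show()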

 

 

Summary:

Internship location: Beijing and Shanghai have the most data-analysis internships, followed by Guangzhou and Shenzhen, and then second-tier cities such as Chengdu, Nanjing, and Hangzhou.

Working days: five days/week is the most common requirement, accounting for 44.61%, followed by four days/week and three days/week.

Internship duration: a minimum of three months is the most common requirement, followed by six months and four months.

Internship salary: most salaries are concentrated in a single band, and more than half of the internships pay more than 100 yuan per day.
