Reply content: There are actually many ways to implement this; Python is only one of them. Below I will introduce the approaches I find most convenient:
1. Excel VBA
Note: The following two templates are from http://ExcelPro.blog.sohu.com
(1) Map of China colored by province
The template is shown below; its usage is also described in the figure.
Download address: Excel template - color-filled map of China
(2) Map of China with color fill accurate to the city level
Download address: China
Data extraction is a common requirement in an analyst's daily work: for example, the loan amount of a user, the total interest income of a month or quarter, the loan amount and number of transactions in a specific period, or the number of loans larger than 5,000 yuan. This article describes how to extract data with Python along specific dimensions.
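A minimal sketch of this kind of dimension-based extraction, using a few hypothetical loan records (the field names and values below are assumptions for illustration, not from the article):

```python
# Hypothetical loan records; in practice these would come from a file or database.
loans = [
    {"user": "A", "amount": 3000, "month": "2023-01"},
    {"user": "A", "amount": 8000, "month": "2023-02"},
    {"user": "B", "amount": 12000, "month": "2023-02"},
]

# Total loan amount for one user.
user_total = sum(r["amount"] for r in loans if r["user"] == "A")

# Loan amount and number of transactions in a specific period.
feb = [r for r in loans if r["month"] == "2023-02"]
feb_amount, feb_count = sum(r["amount"] for r in feb), len(feb)

# Number of loans larger than 5,000 yuan.
big_loans = sum(1 for r in loans if r["amount"] > 5000)

print(user_total, feb_amount, feb_count, big_loans)  # 11000 20000 2 2
```

The same slicing can of course be done with pandas on a real data set; the plain-Python version just shows the logic of each dimension.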
When scraping website data, we sometimes find that much of it is loaded dynamically rather than served statically. The following example introduces a simple way to fetch dynamic data. A disclaimer first: I am a beginner still learning Python, and this method is somewhat clumsy, but it is still worth knowing for newcomers.
Python can use the Faker library to generate fake data. First install Faker:

    pip install Faker

(The old version of the package was called fake-factory, but it no longer applies.) Use Factory.create(), or the Faker class directly, to create and initialize a Faker generator. Here's how to use it:

    from faker import Factory
    fake = Factory.create()
    # or
    from faker import Faker
    fake = Faker()

    fake.name()     # 'Lucy Cechtelar'
    fake.address()  # '426 Jordy Lodge
                    #  Cartwrightshire, SC 88120-6700'
commonly used in crawlers, so it is not introduced in detail here.
If you know the form of a website's requests, become skilled with the F12 developer tools and inspect the Network panel. Let's look at a case. Of course, not all web pages obtain their data by sending requests; there are dynamic pages that do not expose a data request. For such sites we generally use Selenium to simulate a browser.
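Once you find the data request in the Network panel, the response is often a JSON payload that can be parsed directly. A small sketch, where the payload below is invented to resemble a typical XHR response (its structure is an assumption, not from any real site):

```python
import json

# Hypothetical XHR response body, like one you might find in the
# F12 Network panel for a dynamically loaded page.
body = '{"code": 0, "data": {"items": [{"title": "hot topic", "rank": 1}]}}'

payload = json.loads(body)
items = payload["data"]["items"]
print(items[0]["title"], items[0]["rank"])  # hot topic 1
```

In practice you would fetch `body` with an HTTP library using the URL and parameters copied from the Network panel; the parsing step is the same.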
Recently I learned Python web crawling, so I wrote a simple program to practice (hehe). The environment I used is Python 3.6 and MySQL 8.0, and the crawl target is the Baidu hot-topic site (http://top.baidu.com/). I only grabbed the real-time hot-topic content; the other columns should be similar. There are two variables in the code, seconds_per_crawl and cra
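A minimal sketch of the storage side of such a crawler. The original program used MySQL 8.0; SQLite is substituted here so the example is self-contained, and the table and column names are assumptions, not the author's schema:

```python
import sqlite3

# In-memory database stands in for the real MySQL instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotspots (pos INTEGER, title TEXT)")

# These rows would come from parsing the crawled page.
rows = [(1, "topic A"), (2, "topic B")]
conn.executemany("INSERT INTO hotspots VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM hotspots").fetchone()[0]
print(count)  # 2
```

With MySQL the code shape is the same, only the driver (e.g. a DB-API connector) and the connection call differ.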
getLinks(newPage) ... getLinks("") run results. The principle of recursive crawling: find the key information and links on a page, then repeat. 3. Collecting with Scrapy: high-rise buildings are stacked up brick by brick from the simplest pieces, and writing a web crawler is likewise many simple repeated operations: find the key information on the page and the outgoing links, then loop. The Scrapy library can significantly reduce the work of finding page links (no need for lots of filters and regular expressions) and also reduce the complexity of the recognition work.
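The recursive-crawl principle described above can be sketched without any network access by running it over an in-memory "site" (a dict mapping each page to its links; the pages below are invented):

```python
# Toy site: page -> outgoing links, standing in for real HTTP fetches.
site = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": [],
    "/c": ["/"],
}

visited = set()

def get_links(page):
    """Visit a page, then recurse into every not-yet-seen link."""
    if page in visited:
        return
    visited.add(page)
    for link in site.get(page, []):
        get_links(link)

get_links("/")
print(sorted(visited))  # ['/', '/a', '/b', '/c']
```

The `visited` set is what keeps the recursion from looping forever on cyclic links (here `/c` links back to `/`); a real crawler replaces the dict lookup with a page fetch and link extraction.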
The NumPy module processes data efficiently and provides array support, and many modules depend on it, such as pandas, SciPy, and matplotlib. Installing NumPy: first go to the site https://www.lfd.uci.edu/~gohlke/pythonlibs/ and find numpy+MKL. My Python version is 3.6.1 on a 64-bit system, so that is the package to download. After downloading the package, go t
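Once installed, a quick way to confirm NumPy works is to exercise the array support mentioned above with some vectorized arithmetic:

```python
import numpy as np

# Vectorized operations over a whole array, no explicit loop needed.
a = np.array([1, 2, 3, 4])
print(a.mean(), (a * 2).sum())  # 2.5 20
```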
Here I am mainly studying, so although the data set has more than 10,000 rows, I only used 500. Even with just 500 rows, my little computer ran for quite a while.
Train
I am not sure why, but adding my day and day_part variables raised an error, so I removed those two variables from the calculation; this still needs further study. Then use the exp function to restore:
Train$registered
Finally, truncate the dates after the 20th and write out a new CSV file to upload.
Train2
Done! The code is on GitHub.
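The snippet above is R (`Train$registered`); the same log-transform / exp-restore idea looks like this in Python, with invented values for illustration:

```python
import math

registered = [5, 50, 500]

# Model on the log scale (log1p handles zero counts safely)...
log_registered = [math.log1p(x) for x in registered]

# ...then use the exp function to restore the original scale.
restored = [math.expm1(y) for y in log_registered]
print([round(x) for x in restored])  # [5, 50, 500]
```

Using `log1p`/`expm1` rather than bare `log`/`exp` avoids errors on zero counts, which is common with count targets like `registered`.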
The approaches found online for sending multipart/form-data with Python are mostly based on urllib2's simulated POST method, along these lines:

    import urllib2
    boundary = '-------------------------7df3069603d6'
    data = []
    data.append('--%s' % boundary)
    data.append('Content-Disposition: form-data; name="app_id"\r\n')
    data.append('xxx

information:

    data = {input1: "*******", input2: "*******", remember: "false"}

Take the VeryCD (eDonkey) download site as an example: http://secure.verycd.com/signin?error_code=emptyInputcontinue=http://www.verycd.com/ The form information is in the Form Data tab:

    data = {username: "****" ... continue: "http:
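A modern alternative to the hand-built urllib2 boundary approach is to let the requests library assemble the multipart/form-data body. The URL and field names below are placeholders, and the request is only prepared, not actually sent:

```python
import requests

# requests builds the boundary and part headers for us.
req = requests.Request(
    "POST",
    "http://example.com/upload",              # placeholder URL
    data={"app_id": "xxx"},                   # form field
    files={"file": ("demo.txt", b"hello")},   # file part
)
prepared = req.prepare()

print(prepared.headers["Content-Type"].startswith("multipart/form-data"))
print(b'name="app_id"' in prepared.body)
```

Preparing without sending is also a handy way to inspect exactly what body your code would transmit.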
Baostock (www.baostock.com) is a free, open-source securities data platform. It provides a large amount of accurate and complete historical securities market data, listed-company financial data, and a real-time market push service. Securities data is obtained through its Python API.
scenario: HTTPCookieProcessor, ProxyHandler, HTTPSHandler, HTTPRedirectHandler. Three ways to download web pages with urllib2. Web parsers are tools to extract valuable data from web pages:
1. regular expressions (complex, fuzzy matching)
2. html.parser
3. Beautiful Soup (third-party plugin, powerful)
4. lxml
Beautiful Soup is a Python third-party library for extracting data from HTML or XML.
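Of the parsers listed above, html.parser ships with the standard library, so a minimal extraction example needs no installation at all (the sample HTML below is invented):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect every href from <a> tags as the parser streams through the HTML."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href")

parser = LinkCollector()
parser.feed('<ul><li><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>')
print(parser.links)  # ['/a', '/b']
```

Beautiful Soup and lxml offer a much more convenient query interface on top of the same idea, at the cost of a third-party dependency.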
judge at this step first, otherwise the cleaning work cannot keep up; for data quality, the crawler colleagues are generally required to store the request URL.
4.2 Calculate the data volume of the crawler data source and of each ETL-cleaned data table
Note: the SQL scripts do not aggregate or filter the data of the three tables.
For the self-check of the ETL cleaning work, start from the fields and write a script to check the cleaned data source. For the SOCOM website, cleaning mainly targeted the region and industry fields; the other fields only had replacement and extra-field processing, so a script check is used, comparing the page_url and website data for verification. It is written this way to make it easier to see the cleaning of a field.
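The volume check from 4.2 can be sketched as a pair of plain row counts, one on the raw crawler table and one on a cleaned table. SQLite stands in for the real warehouse here, and all table names and rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_source (page_url TEXT, region TEXT)")
conn.execute("CREATE TABLE cleaned (page_url TEXT, region TEXT)")
conn.executemany("INSERT INTO raw_source VALUES (?, ?)",
                 [("u1", "bj"), ("u2", ""), ("u3", "sh")])
conn.executemany("INSERT INTO cleaned VALUES (?, ?)",
                 [("u1", "Beijing"), ("u3", "Shanghai")])

# Plain counts only: no aggregation or filtering, as the note above says.
raw_n = conn.execute("SELECT COUNT(*) FROM raw_source").fetchone()[0]
clean_n = conn.execute("SELECT COUNT(*) FROM cleaned").fetchone()[0]
print(raw_n, clean_n)  # 3 2
```

A mismatch between the two counts is the signal to go back and inspect which page_url values were dropped during cleaning.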