Tips on Python data statistics

Last Update: 2016-07-29 Source: Internet

Author: User

I have recently been using Python for data statistics. Here I have collected some tips from that work, found by searching and through day-to-day use, in the hope that they help others working in this area. Some of them are common techniques that we rarely think about, but in specific scenarios these small methods can be a great help.

1. Map keys to multiple values in the dictionary

{'b': [4, 5, 6], 'a': [1, 2, 3]}

When aggregating data, we often want to collect every entry that shares a key into one dictionary slot under that key, and then run various operations on each collection. The following code does this:

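from collections import defaultdict

# A missing key is created automatically with an empty list as its value
d = defaultdict(list)
print(d)
d['a'].append(1)
d['a'].append(2)
d['a'].append(3)
d['b'].append(4)
d['b'].append(5)
d['b'].append(6)
print(d)
print(d.get("a"))
print(d.keys())
# Collect the list stored under every key
print([d.get(i) for i in d])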

Here we use defaultdict from the collections module. The module contains many other useful tools that are worth exploring when time allows.

Output of the code above:

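defaultdict(<class 'list'>, {})
defaultdict(<class 'list'>, {'b': [4, 5, 6], 'a': [1, 2, 3]})
[1, 2, 3]
dict_keys(['b', 'a'])
[[4, 5, 6], [1, 2, 3]]

The key order shown in this article comes from an older Python version. From Python 3.7 on, dictionaries preserve insertion order, so 'a' is printed before 'b'.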

Once the data is loaded this way, it is already grouped by key. We can then iterate over each group to compute the statistics we need.
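As a minimal sketch of that group-then-count workflow, the example below groups a few (key, value) records with defaultdict and then computes a count, total, and mean for each group. The records list and the statistic names are made up for illustration; they are not from the original data.

```python
from collections import defaultdict

# Hypothetical sample records: (category, value) pairs.
records = [("a", 1), ("b", 4), ("a", 2), ("b", 5), ("a", 3), ("b", 6)]

# Group the values by key; a missing key starts as an empty list.
groups = defaultdict(list)
for key, value in records:
    groups[key].append(value)

# Walk each group and compute the statistics we need.
stats = {}
for key, values in groups.items():
    stats[key] = {
        "count": len(values),
        "total": sum(values),
        "mean": sum(values) / len(values),
    }

print(stats["a"])  # {'count': 3, 'total': 6, 'mean': 2.0}
print(stats["b"])  # {'count': 3, 'total': 15, 'mean': 5.0}
```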

2. Quickly swap dictionary keys and values

data = {...}
zip(data.values(), data.keys())

Here data is a dictionary. zip pairs each value with its key, producing (value, key) tuples, so functions such as max and min can then compare the entries by value.
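A small sketch of how this can be used, assuming a made-up prices dictionary (the name and its contents are purely illustrative). Tuples compare by their first element, so each result is chosen by value and still carries its key:

```python
# Hypothetical data: item name -> price.
prices = {"apple": 3.5, "banana": 1.2, "cherry": 7.8}

# zip yields (value, key) tuples, which compare by value first.
cheapest = min(zip(prices.values(), prices.keys()))
priciest = max(zip(prices.values(), prices.keys()))
ranked = sorted(zip(prices.values(), prices.keys()))

print(cheapest)  # (1.2, 'banana')
print(priciest)  # (7.8, 'cherry')
print(ranked)    # [(1.2, 'banana'), (3.5, 'apple'), (7.8, 'cherry')]
```

In Python 3, zip returns an iterator that can be consumed only once, which is why the sketch calls zip again for each operation instead of reusing one result.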

3. Sort a list of dictionaries by a common key

from operator import itemgetter

data = [
  {'name': "bran", "uid": 101},
  {'name': "xisi", "uid": 102},
  {'name': "land", "uid": 103}
]
print(sorted(data, key=itemgetter("name")))
print(sorted(data, key=itemgetter("uid")))

Here data is a list of dictionaries. To sort it by name or by uid, we pass itemgetter as the key function, as in the code above.
Output:

[{'name': 'bran', 'uid': 101}, {'name': 'land', 'uid': 103}, {'name': 'xisi', 'uid': 102}]
[{'name': 'bran', 'uid': 101}, {'name': 'xisi', 'uid': 102}, {'name': 'land', 'uid': 103}]

The results are as expected.
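The same key function can be reused in a few other places. As a hedged sketch that goes slightly beyond the original example, the code below sorts in descending order with reverse=True and picks single records with min and max:

```python
from operator import itemgetter

data = [
    {"name": "bran", "uid": 101},
    {"name": "xisi", "uid": 102},
    {"name": "land", "uid": 103},
]

# reverse=True sorts in descending order.
by_uid_desc = sorted(data, key=itemgetter("uid"), reverse=True)
print([d["uid"] for d in by_uid_desc])  # [103, 102, 101]

# min and max accept the same kind of key function.
first_by_name = min(data, key=itemgetter("name"))
largest_uid = max(data, key=itemgetter("uid"))
print(first_by_name["name"])  # bran
print(largest_uid["name"])    # land
```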

4. Group the dictionaries in a list by a given field

Note: Sort the data before grouping, and choose the sort field according to your actual requirements.
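The reason for sorting first is that itertools.groupby, which this section uses for the final step, only merges items that sit next to each other. A minimal sketch with a made-up list of uid values shows the difference:

```python
from itertools import groupby

# Hypothetical unsorted keys: the two 101 entries are not adjacent.
uids = [101, 103, 101]

# Without sorting, 101 shows up as two separate groups.
unsorted_groups = [(k, len(list(g))) for k, g in groupby(uids)]
print(unsorted_groups)  # [(101, 1), (103, 1), (101, 1)]

# After sorting, equal keys are adjacent and end up in one group.
sorted_groups = [(k, len(list(g))) for k, g in groupby(sorted(uids))]
print(sorted_groups)  # [(101, 2), (103, 1)]
```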

Data to be processed:

rows = [
  {'name': "bran", "uid": 101, "class": 13},
  {'name': "xisi", "uid": 101, "class": 11},
  {'name': "land", "uid": 103, "class": 10}
]

Expected processing result:

{
  101: [{'name': 'xisi', 'class': 11, 'uid': 101}, {'name': 'bran', 'class': 13, 'uid': 101}],
  103: [{'name': 'land', 'class': 10, 'uid': 103}]
}

We group by uid. The repeated uid here is only for demonstration; in practice a uid is usually unique.

This task is more complicated, so let's break it down step by step.

some = [('a', [1, 2, 3]), ('b', [4, 5, 6])]
print(dict(some))

Result:

{'b': [4, 5, 6], 'a': [1, 2, 3]}

Here the goal is to convert a list of (key, value) tuples into a dictionary, which dict does directly. Next, we sort the data to be processed:

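from operator import itemgetter

# Sort by a single field
data_one = sorted(rows, key=itemgetter("class"))
print(data_one)
# Sort by uid first, then by class when the uid values are equal
data_two = sorted(rows, key=lambda x: (x["uid"], x["class"]))
print(data_two)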

Here we show two ways of sorting that rely on the same principle but differ slightly in style. The first, data_one, uses itemgetter directly to sort by a single field, as we did earlier. Sometimes, however, we have a further requirement:

Sort by one field first; when that field has equal values, sort by another field.

In this case we use the second way, data_two, whose key function returns a tuple of several fields.
The sorted results are as follows:

[{'name': 'land', 'class': 10, 'uid': 103}, {'name': 'xisi', 'class': 11, 'uid': 101}, {'name': 'bran', 'class': 13, 'uid': 101}]
[{'name': 'xisi', 'class': 11, 'uid': 101}, {'name': 'bran', 'class': 13, 'uid': 101}, {'name': 'land', 'class': 10, 'uid': 103}]

The two results are ordered differently: the first is sorted by class alone, the second by uid and then by class.
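As a side note, itemgetter can also handle the multi-field case: when given several names it returns a tuple, so it behaves like the lambda above. The following sketch, which is an alternative and not the original author's code, checks that both key functions give the same order:

```python
from operator import itemgetter

rows = [
    {"name": "bran", "uid": 101, "class": 13},
    {"name": "xisi", "uid": 101, "class": 11},
    {"name": "land", "uid": 103, "class": 10},
]

# itemgetter with several names returns a tuple, just like the lambda.
with_itemgetter = sorted(rows, key=itemgetter("uid", "class"))
with_lambda = sorted(rows, key=lambda x: (x["uid"], x["class"]))

print(with_itemgetter == with_lambda)        # True
print([r["name"] for r in with_itemgetter])  # ['xisi', 'bran', 'land']
```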

Next, let's take the last step and combine the two techniques we just covered:

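from itertools import groupby

# groupby yields (key, group iterator) pairs; list() materializes each group
data = dict([(key, list(group)) for key, group in groupby(data_two, key=lambda x: x["uid"])])
print(data)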

We group the sorted data with groupby, build a list of (uid, list of rows) tuples, and convert that list into a dictionary. The data is now grouped by uid.
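Put together, the whole of this section can be sketched as one self-contained script. The second half is an alternative that is not part of the original walkthrough: it reuses defaultdict from tip 1, which needs no prior sort but keeps each group in input order instead of sorted order.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

rows = [
    {"name": "bran", "uid": 101, "class": 13},
    {"name": "xisi", "uid": 101, "class": 11},
    {"name": "land", "uid": 103, "class": 10},
]

# Route 1: sort by (uid, class), then group consecutive rows by uid.
ordered = sorted(rows, key=itemgetter("uid", "class"))
grouped = {uid: list(items) for uid, items in groupby(ordered, key=itemgetter("uid"))}

# Route 2 (alternative): append each row to the list for its uid.
by_uid = defaultdict(list)
for row in rows:
    by_uid[row["uid"]].append(row)

print([r["name"] for r in grouped[101]])  # ['xisi', 'bran']
print([r["name"] for r in by_uid[101]])   # ['bran', 'xisi']
print(sorted(grouped) == sorted(by_uid))  # True
```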

That concludes these tips on Python data statistics.

This article is an English version of an article originally written in Chinese on aliyun.com.
