Tips on Python data statistics:
I have recently been using Python for data statistics. Here I have collected some tips that I looked up and used along the way, hoping they help others working in this area. Some of these tips are common knowledge that we rarely pay attention to, but in specific scenarios these small techniques can be a great help.
1. Map keys to multiple values in the dictionary
Sometimes we want to collect all values that share the same key into one list under that key, and then run further operations on each list. The target is a dictionary like this:
    {'b': [4, 5, 6], 'a': [1, 2, 3]}
The following code builds it:
    from collections import defaultdict

    d = defaultdict(list)
    print(d)
    d['a'].append(1)
    d['a'].append(2)
    d['a'].append(3)
    d['b'].append(4)
    d['b'].append(5)
    d['b'].append(6)
    print(d)
    print(d.get("a"))
    print(d.keys())
    print([d.get(i) for i in d])
Here we use defaultdict from the collections module. The module contains many other useful tools that are worth learning when time allows.
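Output of the code above (older Python versions do not preserve key order, which is why 'b' comes first here; Python 3.7 and later print the keys in insertion order):
    defaultdict(<class 'list'>, {})
    defaultdict(<class 'list'>, {'b': [4, 5, 6], 'a': [1, 2, 3]})
    [1, 2, 3]
    dict_keys(['b', 'a'])
    [[4, 5, 6], [1, 2, 3]]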
Once the data is filled in, it is already grouped by key. We can then iterate over each group and compute whatever statistics we need.
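As a rough illustration of the group-then-count idea above, the sketch below groups a few (key, value) records with defaultdict and computes a count, sum, and mean per group. The records data and the groups and stats names are invented for this example.

```python
from collections import defaultdict

# Hypothetical sample records: (key, value) pairs, invented for illustration.
records = [("a", 1), ("b", 4), ("a", 2), ("b", 5), ("a", 3), ("b", 6)]

# Collect all values that share a key into one list.
groups = defaultdict(list)
for key, value in records:
    groups[key].append(value)

# Traverse each group and compute simple statistics.
stats = {}
for key, values in groups.items():
    stats[key] = {
        "count": len(values),
        "sum": sum(values),
        "mean": sum(values) / len(values),
    }

print(stats["a"])  # {'count': 3, 'sum': 6, 'mean': 2.0}
print(stats["b"])  # {'count': 3, 'sum': 15, 'mean': 5.0}
```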
2. Quickly swap dictionary keys and values
    data = {...}
    zip(data.values(), data.keys())
Here data is a dictionary in our own format. Swapping keys and values with zip produces (value, key) pairs, so functions such as max and min then compare the entries by value.
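To make this concrete, here is a small sketch using an invented prices dictionary (the names and numbers are not from the original data). It shows min, max, and sorted applied to the zipped (value, key) pairs.

```python
# Hypothetical data: item name -> price, invented for illustration.
prices = {"apple": 3.5, "pear": 2.0, "melon": 7.25}

# zip() yields (value, key) tuples; tuples compare by their first element,
# so min/max/sorted work on the values while keeping the keys attached.
cheapest = min(zip(prices.values(), prices.keys()))
priciest = max(zip(prices.values(), prices.keys()))
ranked = sorted(zip(prices.values(), prices.keys()))

print(cheapest)  # (2.0, 'pear')
print(priciest)  # (7.25, 'melon')
print(ranked)    # [(2.0, 'pear'), (3.5, 'apple'), (7.25, 'melon')]
```

Note that in Python 3, zip returns an iterator that can be consumed only once, which is why the sketch creates a fresh zip for each call.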
3. Sort a list of dictionaries by a common key
    from operator import itemgetter

    data = [
        {'name': "bran", "uid": 101},
        {'name': "xisi", "uid": 102},
        {'name': "land", "uid": 103}
    ]
    print(sorted(data, key=itemgetter("name")))
    print(sorted(data, key=itemgetter("uid")))
The input is a list of dictionaries named data. To sort it by name or by uid, we pass itemgetter with the field name as the key argument of sorted, as in the code above.
Output:
    [{'name': 'bran', 'uid': 101}, {'name': 'land', 'uid': 103}, {'name': 'xisi', 'uid': 102}]
    [{'name': 'bran', 'uid': 101}, {'name': 'xisi', 'uid': 102}, {'name': 'land', 'uid': 103}]
This is what we expected.
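The same key function also works for descending order and for min and max. The sketch below reuses the sample data from this tip; the by_uid_desc name is invented for the example.

```python
from operator import itemgetter

# Same sample records as in the text above.
data = [
    {"name": "bran", "uid": 101},
    {"name": "xisi", "uid": 102},
    {"name": "land", "uid": 103},
]

# Descending order by uid.
by_uid_desc = sorted(data, key=itemgetter("uid"), reverse=True)
print([row["uid"] for row in by_uid_desc])  # [103, 102, 101]

# min() and max() accept the same kind of key function.
print(min(data, key=itemgetter("name")))  # {'name': 'bran', 'uid': 101}
print(max(data, key=itemgetter("uid")))   # {'name': 'land', 'uid': 103}
```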
4. Group a list of dictionaries by a field
Note: sort the data before grouping, because itertools.groupby only groups consecutive items that share the same key. Choose the sorting field according to the actual requirements.
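The note above matters because itertools.groupby starts a new group every time the key changes between neighbouring items. A minimal sketch of the pitfall, using invented sample_rows and a hypothetical by_uid helper:

```python
from itertools import groupby

# Hypothetical rows, deliberately NOT sorted by uid.
sample_rows = [
    {"name": "bran", "uid": 101},
    {"name": "land", "uid": 103},
    {"name": "xisi", "uid": 101},
]


def by_uid(row):
    """Key function: group rows by their uid field."""
    return row["uid"]


# Without sorting, uid 101 shows up as two separate groups.
unsorted_keys = [key for key, _ in groupby(sample_rows, key=by_uid)]
print(unsorted_keys)  # [101, 103, 101]

# After sorting by the grouping key, each uid appears exactly once.
sorted_rows = sorted(sample_rows, key=by_uid)
sorted_keys = [key for key, _ in groupby(sorted_rows, key=by_uid)]
print(sorted_keys)  # [101, 103]
```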
Data to be processed:
    rows = [
        {'name': "bran", "uid": 101, "class": 13},
        {'name': "xisi", "uid": 101, "class": 11},
        {'name': "land", "uid": 103, "class": 10}
    ]
Expected processing result:
    {
        101: [
            {'name': 'xisi', 'class': 11, 'uid': 101},
            {'name': 'bran', 'class': 13, 'uid': 101}
        ],
        103: [{'name': 'land', 'class': 10, 'uid': 103}]
    }
We group by uid. This is only for demonstration; in real data a uid is usually not repeated.
This task is more complicated, so let's break it down step by step.
    some = [('a', [1, 2, 3]), ('b', [4, 5, 6])]
    print(dict(some))
Result:
    {'b': [4, 5, 6], 'a': [1, 2, 3]}
This step converts a list of tuples into a dictionary, which is simple and should be easy to follow. Next, we sort the data to be processed:
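    # itemgetter was imported in tip 3; repeated here so this snippet runs on its own
    from operator import itemgetter

    data_one = sorted(rows, key=itemgetter("class"))
    print(data_one)
    data_two = sorted(rows, key=lambda x: (x["uid"], x["class"]))
    print(data_two)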
Here are two ways to sort. The principle is the same, but the style differs slightly. The first, data_one, uses itemgetter directly to sort by a single field, as we did earlier. Sometimes, however, we have another requirement:
Sort by one field first; when values of that field are equal, sort by a second field.
In that case we use the second approach, data_two, whose key function returns a tuple of several fields.
The sorting results are as follows:
    [{'name': 'land', 'class': 10, 'uid': 103}, {'name': 'xisi', 'class': 11, 'uid': 101}, {'name': 'bran', 'class': 13, 'uid': 101}]
    [{'name': 'xisi', 'class': 11, 'uid': 101}, {'name': 'bran', 'class': 13, 'uid': 101}, {'name': 'land', 'class': 10, 'uid': 103}]
The two results differ: data_one is ordered by class only, while data_two is ordered by uid and then by class.
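As an aside, the two-field sort does not strictly need a lambda. When itemgetter is given several field names it returns a tuple of the values, so it can serve as the same kind of key. The sketch below compares the two forms on the rows from this tip; the with_lambda and with_itemgetter names are invented for the example.

```python
from operator import itemgetter

# Same rows as in the text above.
rows = [
    {"name": "bran", "uid": 101, "class": 13},
    {"name": "xisi", "uid": 101, "class": 11},
    {"name": "land", "uid": 103, "class": 10},
]

# With several field names, itemgetter returns a tuple of values,
# which matches the tuple built by the lambda.
with_lambda = sorted(rows, key=lambda x: (x["uid"], x["class"]))
with_itemgetter = sorted(rows, key=itemgetter("uid", "class"))

print(with_lambda == with_itemgetter)        # True
print([r["name"] for r in with_itemgetter])  # ['xisi', 'bran', 'land']
```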
Next, let's take the last step and combine the two methods we just mentioned:
    from itertools import groupby

    data = dict([(g, list(k)) for g, k in groupby(data_two, key=lambda x: x["uid"])])
    print(data)
We group the sorted data, build a list of (key, group) tuples, and convert that list into a dictionary. The data is now grouped by uid.
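Putting the steps together, a self-contained version might look like the sketch below. It uses a dict comprehension in place of dict([...]), which is only a stylistic variation, and it ends with an alternative based on defaultdict from tip 1 that needs no sorting. The ordered, grouped, and by_uid names are invented for the example.

```python
from collections import defaultdict
from itertools import groupby

# Same rows as in the text above.
rows = [
    {"name": "bran", "uid": 101, "class": 13},
    {"name": "xisi", "uid": 101, "class": 11},
    {"name": "land", "uid": 103, "class": 10},
]

# Step 1: sort by the grouping key, then by class within each uid.
ordered = sorted(rows, key=lambda x: (x["uid"], x["class"]))

# Step 2: group consecutive rows with the same uid into a dictionary.
grouped = {
    uid: list(items)
    for uid, items in groupby(ordered, key=lambda x: x["uid"])
}

print(sorted(grouped))                    # [101, 103]
print([r["name"] for r in grouped[101]])  # ['xisi', 'bran']

# Alternative without sorting: append each row to its uid's list.
# Rows keep their original input order inside each group.
by_uid = defaultdict(list)
for row in rows:
    by_uid[row["uid"]].append(row)

print([r["name"] for r in by_uid[101]])   # ['bran', 'xisi']
```

The groupby version is convenient when the data must be sorted anyway; the defaultdict version avoids the sort when ordering within groups does not matter.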
That is all for these tips on Python data statistics.