Application of Python Text processing

Source: Internet
Author: User

Recently the company's operations department asked for statistics on some information stored in a MongoDB database. I generally prefer to export the relevant data from the database server first. (PS: First, MongoDB is NoSQL, so working across related collections is awkward. Second, even though this is a test environment, I still export the data out of habit so that performance is not affected. The drawback is that my own test machine carries more load, but it can cope with the current amount of data.)

The data exported for the CreateDate time range looks like this (it has already been preprocessed; that step is skipped here):

    a:5, b:111, c:5
    a:1, b:222, c:3
    a:2, b:333, c:4

This text is stored one record per line, with the columns separated by ', '. The fields were extracted from the MongoDB database according to the query criteria. Here a is the activity ID, b is the ID of the participating user, and c is the score for that user's participation in the activity.

The goal is to count the number of participants in each activity, together with the scores for each activity, which means checking each key. The statistics themselves are very simple, so this note mainly records how I handled checking the fields:

type() shows that each line is read as a str. I had hoped to convert a str of this form into a dict, but I did not try that. Instead I used split from the re module to cut each line on ', ' into a list, which works just as well. Because the check is based on the activity ID, two rounds of splitting are enough: after the first split, split the field again on ':' and take out the value to compare.
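The two-step split described above can be sketched roughly as follows. This is only an illustration, not the original code (which was not published): the function names split_fields, field_value, and row_to_dict are made up for this example, and the sample row simply follows the a/b/c layout described earlier. The dict conversion at the end is the alternative the text mentions but did not try.

```python
import re


def split_fields(line):
    """First split: cut one exported row on ', ' into a list of 'key:value' strings."""
    return re.split(r",\s*", line.strip())


def field_value(field):
    """Second split: cut a 'key:value' field on ':' and return the value part."""
    return field.split(":", 1)[1].strip()


def row_to_dict(line):
    """Alternative (not tried in the original text): turn the whole row into a dict."""
    pairs = (field.split(":", 1) for field in split_fields(line))
    return {key.strip(): value.strip() for key, value in pairs}


row = "a:5, b:111, c:5"          # illustrative row in the a/b/c layout
fields = split_fields(row)       # list of 'key:value' strings
activity_id = field_value(fields[0])
print(type(row), fields, activity_id)
print(row_to_dict(row))
```

For a fixed delimiter like this, the built-in str.split(', ') would do the same job; re.split only becomes necessary when the spacing after the comma varies.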

The entire statistical implementation process uses text manipulation, string processing, and looping statements. It's easy to implement.
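As a rough idea of what such a loop might look like, here is a minimal sketch. It assumes that scores are integers, that participants are counted as distinct user IDs, and that the per-activity result is a participant count plus a total score; none of these details are confirmed by the text. The function name summarize and the sample rows are hypothetical.

```python
from collections import defaultdict


def summarize(lines):
    """Count distinct participants and sum scores per activity ID.

    Each line is expected to look like 'a:<activity>, b:<user>, c:<score>'.
    """
    participants = defaultdict(set)   # activity ID -> set of user IDs
    totals = defaultdict(int)         # activity ID -> sum of scores

    for line in lines:
        line = line.strip()
        if not line:
            continue                  # skip blank lines
        record = {}
        for field in line.split(","):
            key, _, value = field.partition(":")
            record[key.strip()] = value.strip()
        activity = record["a"]
        participants[activity].add(record["b"])
        totals[activity] += int(record["c"])   # assumption: integer scores

    return {
        activity: {
            "participants": len(users),
            "total_score": totals[activity],
        }
        for activity, users in participants.items()
    }


# Hypothetical sample rows; in practice these would come from the exported file,
# for example: with open(path) as fh: summarize(fh)
sample = [
    "a:5, b:111, c:5",
    "a:1, b:222, c:3",
    "a:5, b:444, c:2",
]
for activity, stats in summarize(sample).items():
    print(activity, stats)
```

Because summarize accepts any iterable of strings, an open file object can be passed in directly, so a large export does not have to be loaded into memory at once.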

I used to do this kind of text processing in the shell. But I came to shell scripting late and am not familiar with many of its tools, so I often cannot think of a suitable approach, and for now I find the shell too coarse for handling this kind of detail. (PS: That may also be because I have not used it much. Perl, in fact, has very powerful text-processing features.) So this time it seemed appropriate to use Python, which I have recently been practicing.

The code is fairly simple, so I am too embarrassed to post it. I hope to keep moving forward bit by bit; let us encourage one another!

 
