Contest introduction: link
This article documents some of my code for the 2015 Ali Tianchi Big Data Contest. Because the contest is still in progress, I am only sharing a naive, rule-based solution. The code is hosted on my GitHub: link. A description of the code follows; if you are interested in the details, please read the comments in the code, as I will not go into them here.
Repo directory layout
- data: stores the data
- preprocess: data preprocessing
- rule: generates the submission file according to rules
- model: trains machine learning models (not shared for now)
Usage instructions
Fork this repo; non-GitHub users can click Download ZIP in the lower right corner.
After unzipping, put tianchi_mobile_recommend_train_user.csv and tianchi_mobile_recommend_train_item.csv into the /data/ directory.
It takes only two steps to get a submission, and its F1 can reach 7.6%:
- Step one: go to the /preprocess/ directory and run data_preprocess.py
- Step two: go to the /rule/ directory and run gen_submission_by_rule.py
After completing the two steps above, a file named tianchi_mobile_recommendation_predict.csv is generated in the /rule/ directory, ready to be submitted.
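For convenience, the two steps can be chained from a single script. The following is only a minimal sketch, not part of the repo: the helper names build_commands and run_pipeline are made up for illustration, and it assumes each script is meant to be launched from inside its own directory, as the steps above describe.

```python
import subprocess
import sys
from pathlib import Path

# Step directories and script names as listed in the usage instructions.
STEPS = [
    ("preprocess", "data_preprocess.py"),
    ("rule", "gen_submission_by_rule.py"),
]


def build_commands(repo_root, python=sys.executable):
    """Return (working_directory, argv) pairs, one per step, in order."""
    root = Path(repo_root)
    return [(root / folder, [python, script]) for folder, script in STEPS]


def run_pipeline(repo_root):
    """Run each step inside its own directory; stop at the first failure."""
    for cwd, argv in build_commands(repo_root):
        subprocess.run(argv, cwd=cwd, check=True)


# Hypothetical usage, from the root of the unzipped repo:
# run_pipeline(".")
```

With check=True, a failure in preprocessing raises an exception, so the rule step never runs on incomplete data.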
Additional Information
Pure Python, without any dependencies.
What each piece of code does is explained in the comments of the corresponding file. The code is somewhat messy and may contain bugs; issues are welcome.
If you want a higher F1, modify gen_submission_by_rule.py and add some rules of your own; F1 can then exceed 9%.
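To give an idea of what a rule can look like, here is a small sketch in pure Python. It is not the rule used in gen_submission_by_rule.py; it is a hypothetical one ("added to cart on the last day but not bought that day"). The column names (user_id, item_id, behavior_type, time), the behavior codes (3 for cart, 4 for buy) and the time format are all assumptions about the training CSV, and the sample rows are made up.

```python
import csv
import io

# Assumed behavior_type codes; check them against the real data.
CART, BUY = "3", "4"


def cart_not_bought(rows, last_day):
    """Return sorted (user_id, item_id) pairs carted on last_day but not bought on it."""
    carted, bought = set(), set()
    for row in rows:
        # Assumed time format: "YYYY-MM-DD HH".
        if not row["time"].startswith(last_day):
            continue
        pair = (row["user_id"], row["item_id"])
        if row["behavior_type"] == CART:
            carted.add(pair)
        elif row["behavior_type"] == BUY:
            bought.add(pair)
    return sorted(carted - bought)


# Tiny made-up sample standing in for the user behavior CSV.
sample = io.StringIO(
    "user_id,item_id,behavior_type,time\n"
    "u1,i1,3,2014-12-18 20\n"
    "u1,i1,4,2014-12-18 21\n"
    "u2,i2,3,2014-12-18 09\n"
    "u3,i3,3,2014-12-17 23\n"
    "u4,i4,1,2014-12-18 10\n"
)
pairs = cart_not_bought(csv.DictReader(sample), last_day="2014-12-18")
print(pairs)  # [('u2', 'i2')]
```

On the real file, the same function could be fed csv.DictReader(open(path, newline="")), provided the header matches the assumed column names.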
Running under Linux is recommended; on my PC (8 cores), the two steps above took less than 20 minutes in total.
Building on the rules with feature engineering and model training is the real purpose of the competition.
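Whether you add rules or train a model, it helps to score a candidate submission offline before uploading it. The sketch below assumes the score is the usual F1 over sets of (user_id, item_id) pairs; the function name f1_score and the sample pairs are made up for illustration, and the exact official scoring should be checked against the contest page.

```python
def f1_score(predicted, actual):
    """Return (precision, recall, f1) over sets of (user_id, item_id) pairs."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    if hits == 0:
        # Also covers empty inputs, avoiding division by zero.
        return 0.0, 0.0, 0.0
    precision = hits / len(predicted)
    recall = hits / len(actual)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: 4 predicted pairs, 2 true purchases, 1 overlap.
predicted = [("u1", "i1"), ("u2", "i2"), ("u3", "i3"), ("u4", "i4")]
actual = [("u2", "i2"), ("u5", "i5")]
precision, recall, f1 = f1_score(predicted, actual)
print(precision, recall, round(f1, 4))  # 0.25 0.5 0.3333
```

One way to use it is to hold out the last day of the training data as the "actual" purchases and generate predictions from the days before it.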
Once the contest enters its second season, please discard this code; it is not suited to processing big data.
2015 Ali Tianchi Big Data Contest-solution