
2015 Ali Tianchi Big Data Competition: Algorithm Design

Last Update:2015-04-10 Source: Internet

Author: User


Project address: https://github.com/Huangtuzhi/AlibabaRecommand

AlibabaRecommand

A project for the Alibaba mobile recommendation algorithm competition.

Competition Introduction

The competition provides one month of user behavior data from mobile devices. The task is to recommend, for each user, the items they are likely to purchase on the following day.

Directory structure

├── license                  # license
└── readme.md                # usage instructions
#
├── create_table.sql         # create the base tables
├── add_table.sql            # tables added later
├── add_index.sql            # build indexes on the tables
├── add_table_31day.sql      # create the tables that store all 31 days of data (same structure as above)
└── add_index_31day.sql      # build indexes on those tables
# data import
├── datatodb.sql             # import the competition's raw CSV data into the base tables
└── featuretodb.sql          # import feature.txt into the corresponding table
# main
├── __init__.py
├── trainmodel.py
├── obtainpredict.py
└── getfeature31day.py
# data
├── feature.txt              # records that meet a given criterion (user_id,item_id,look,store,cart,buy)
├── data_features.txt        # n-dimensional features of the records in feature.txt
├── data_features.npy        # the same features converted to matrix format (NumPy library)
├── data_labels.txt          # labels of the records in feature.txt (1/0 = purchased/not purchased)
├── data_labels.npy
├── feature_pos.txt          # all positive examples in feature.txt
├── feature_p.npy
├── feature_neg.txt          # all negative examples in feature.txt
├── feature_p.npy
├── trainset.npy             # training set
├── testset.npy              # test set
└── 31day_data_features.txt  # n-dimensional features of all 31 days of data
# results
├── predict_all_pairs.txt    # all predicted (user_id, item_id) pairs
└── filter_pairs.txt         # the (user_id, item_id) pairs that remain after filtering with train_item

Principle

The competition provides 31 days of data, and we choose day 30 as the split point. We extract n-dimensional features from the first 30 days of data (each [user_id,item_id] pair yields one row of features) and label each row using the actual data from day 31.
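The feature extraction step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the features are simple per-pair counts of each behavior (matching the look, store, cart, buy columns of feature.txt), that behavior_type is encoded as 1 = look, 2 = store, 3 = cart, 4 = buy, and that each record carries a 1-based day number. The function name extract_features and the second sample pair are hypothetical.

```python
from collections import defaultdict

# Assumed behavior_type encoding (not stated in this article):
# 1 = look, 2 = store, 3 = cart, 4 = buy. Values are column positions.
BEHAVIOR_COLUMNS = {1: 0, 2: 1, 3: 2, 4: 3}


def extract_features(records, split_day):
    """Count each behavior per (user_id, item_id) pair for days <= split_day.

    records: iterable of (user_id, item_id, behavior_type, day) tuples.
    Returns {(user_id, item_id): [look, store, cart, buy]}.
    """
    features = defaultdict(lambda: [0, 0, 0, 0])
    for user_id, item_id, behavior_type, day in records:
        if day <= split_day:
            column = BEHAVIOR_COLUMNS[behavior_type]
            features[(user_id, item_id)][column] += 1
    return dict(features)


# Toy records; the second pair is made up for illustration.
records = [
    (9909811, 266982489, 1, 3),
    (9909811, 266982489, 1, 12),
    (9909811, 266982489, 3, 29),
    (9909811, 266982489, 4, 31),  # day 31 is excluded from the features
    (1234567, 111111111, 2, 30),
]
features = extract_features(records, split_day=30)
print(features[(9909811, 266982489)])  # [2, 0, 1, 0]
```

Calling the same function with split_day=31 would give the feature rows used for the final prediction.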

For example, suppose the [user_id,item_id] pair [9909811,266982489] appears in the first 30 days. If it also appears on day 31 with a behavior_type of purchase, the label for this row is 1; otherwise it is 0.
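The labeling rule can be sketched like this. It is only an illustration: the code 4 for a purchase is an assumption (the article does not give the behavior_type encoding), and label_pairs and the second sample pair are hypothetical.

```python
BUY = 4  # assumed behavior_type code for "purchase"


def label_pairs(candidate_pairs, day31_records, buy_type=BUY):
    """Label each pair 1 if it was purchased on day 31, otherwise 0.

    candidate_pairs: (user_id, item_id) pairs seen in days 1-30.
    day31_records: (user_id, item_id, behavior_type) tuples from day 31.
    """
    purchased = {
        (user_id, item_id)
        for user_id, item_id, behavior_type in day31_records
        if behavior_type == buy_type
    }
    return {pair: int(pair in purchased) for pair in candidate_pairs}


candidate_pairs = [(9909811, 266982489), (1234567, 111111111)]
day31_records = [
    (9909811, 266982489, 4),  # purchased on day 31 -> label 1
    (1234567, 111111111, 1),  # only looked at on day 31 -> label 0
]
labels = label_pairs(candidate_pairs, day31_records)
print(labels[(9909811, 266982489)], labels[(1234567, 111111111)])  # 1 0
```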

This produces a large set of labeled feature rows. We feed them into logistic regression and train a binary classification model.
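To make the training step concrete, here is a small logistic regression written in plain Python with batch gradient descent. It is a sketch under assumptions, not the project's trainmodel.py: the learning rate, epoch count, 0.5 threshold, and the toy rows are all made up, and a real run would more likely use a library implementation on the .npy matrices.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    """Predicted probability that the row x has label 1."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = score(w, b, xi) - yi  # derivative of log loss w.r.t. the logit
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x, threshold=0.5):
    return int(score(w, b, x) >= threshold)


# Toy rows: [look, store, cart, buy] counts over days 1-30.
# Labels: 1 if the pair was purchased on day 31, otherwise 0.
X = [[1, 0, 0, 0], [2, 0, 0, 0], [1, 1, 0, 0],
     [3, 0, 2, 0], [4, 1, 1, 0], [2, 0, 3, 0]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic_regression(X, y)
print("weights:", w, "bias:", b)
print("predictions:", [predict(w, b, x) for x in X])
```

In this toy data only purchased pairs have cart events, so the learned weight for the cart column comes out positive.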

What the model then predicts is this same label: an output of 1 means we expect the user to buy the item. For the final prediction, the model's input is the features extracted from all 31 days of data.

Features from days 1-30  ->  label for day 31
Features from days 1-31  ->  label for day 32

Since the day-31 labels are known, they can be used to evaluate the trained model. The day-32 labels are the final output.
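One way to run that evaluation is to compare the predicted purchase pairs with the pairs actually purchased on day 31. The article does not name a metric, so the use of precision, recall, and F1 here is an assumption, and the evaluate function and the sample pairs are hypothetical.

```python
def evaluate(predicted_pairs, actual_pairs):
    """Return (precision, recall, f1) for predicted vs. actual purchase pairs."""
    predicted, actual = set(predicted_pairs), set(actual_pairs)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Made-up (user_id, item_id) pairs for illustration.
predicted = [(1, 10), (2, 20), (3, 30)]
actual = [(1, 10), (3, 30), (4, 40), (5, 50)]
precision, recall, f1 = evaluate(predicted, actual)
print(round(precision, 4), round(recall, 4), round(f1, 4))  # 0.6667 0.5 0.5714
```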

Description

This is a prediction framework; the feature engineering still needs further improvement.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
