"Tianchi Contest Series": Ideas for forecasting capital inflow and outflow

Source: Internet
Author: User

Competition page: http://tianchi.aliyun.com/competition/information.htm?spm=5176.100067.5678.2.VZW16k&raceId=3

Log in to download the data.


Task: Given the purchase and redemption data from July 2013 to August 2014, forecast the purchase and redemption amounts for each day of September 2014.


Algorithm: This problem can be approached with linear regression or with time series forecasting; either works as long as the features are effective. We used random forest + lm in R and made only 4 submissions in total. Our first submission ranked 26th. After that, other teams "rocketed" up the leaderboard every day, which was alarming, and on the last day we were lucky to finish 47th. We later asked the "rockets" and learned that they had used STL time series forecasting. That method tends to predict high values, and the actual answer happened to be high as well, which is why there were so many rockets.


Preprocessing: The competition provides each user's daily operation data. We need to aggregate it by date into total purchase and redemption amounts, because the submission is also per day.
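
A minimal sketch of that aggregation step, written in Python for illustration rather than the R we used. The field names `report_date`, `total_purchase_amt` and `total_redeem_amt` are assumptions about the layout of the user table, and the sample records are made up.

```python
from collections import defaultdict

def aggregate_by_day(records):
    """Sum purchase and redemption amounts over all users for each date."""
    totals = defaultdict(lambda: [0, 0])  # date -> [purchase, redeem]
    for rec in records:
        day = totals[rec["report_date"]]
        day[0] += rec["total_purchase_amt"]
        day[1] += rec["total_redeem_amt"]
    # One row per date, sorted chronologically (yyyymmdd strings sort correctly)
    return [(d, p, r) for d, (p, r) in sorted(totals.items())]

# Made-up sample: two users on the first day, one user on the second
records = [
    {"user_id": 1, "report_date": "20140301", "total_purchase_amt": 100, "total_redeem_amt": 20},
    {"user_id": 2, "report_date": "20140301", "total_purchase_amt": 50, "total_redeem_amt": 0},
    {"user_id": 1, "report_date": "20140302", "total_purchase_amt": 0, "total_redeem_amt": 70},
]
daily = aggregate_by_day(records)
```

Each row of `daily` is then one training example per day, which matches the per-day format of the submission.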

After aggregation there are about 427 daily rows. Inspecting them shows that the data from 2013 into early 2014 is not very stable, so we keep only the stable part: March to August or April to August of 2014 both work.
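
The window selection can be sketched as follows, assuming the data covers 1 July 2013 through 31 August 2014 (an assumption that is consistent with the roughly 427 daily rows). The cutoff of 1 March 2014 is one of the two options mentioned above.

```python
from datetime import date, timedelta

def date_range(start, end):
    """Return every date from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]

# Assumed full span of the provided data
all_days = date_range(date(2013, 7, 1), date(2014, 8, 31))

# Keep only the stable window (March to August 2014)
stable_days = [d for d in all_days if d >= date(2014, 3, 1)]

print(len(all_days))     # 427
print(len(stable_days))  # 184
```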

In addition, using only the March-to-August data means the training set contains no National Day data from the previous year. That is a fairly serious problem: we need to forecast September 2014, and the trend at the end of September is related to the trend in September 2013. However, the September 2013 data fluctuates sharply, and because the competition does not allow single-point adjustment, we could not simply insert a value for September 30, 2014. What to do? We estimated that the ratio of the 20140930 value to the September 29 value is roughly 11:9 to 11:8. So we manually inserted a 20130930 purchase and redemption record into the training set so that it could be fitted for the 2014 forecast. I don't know whether this counts as single-point adjustment.


Features: The official baseline models the data with LM on 7 day-of-week features. When we analyzed the data, we found no strong correlation with the day of the week, but a strong correlation with workdays and holidays. (In the first season the data was in fact strongly correlated with the stock market; in the second season that effect was much weaker.)
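
For intuition about what a weekday-only baseline can express: a least-squares fit on seven day-of-week indicators and nothing else predicts the mean of each weekday. The sketch below shows that reduced form in Python with made-up numbers; it is not the official baseline code, whose details are not given here.

```python
from collections import defaultdict
from datetime import date, timedelta

def fit_weekday_means(days, values):
    """Fit the weekday-only model: one mean per day of the week."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for d, v in zip(days, values):
        sums[d.weekday()] += v      # weekday(): Monday = 0 ... Sunday = 6
        counts[d.weekday()] += 1
    return {wd: sums[wd] / counts[wd] for wd in sums}

def predict(model, days):
    """Predict each day as the mean of its weekday in the training data."""
    return [model[d.weekday()] for d in days]

# Two made-up training weeks starting on Monday 2014-08-04
train_days = [date(2014, 8, 4) + timedelta(days=i) for i in range(14)]
train_values = [10, 20, 30, 40, 50, 60, 70, 30, 40, 50, 60, 70, 80, 90]

model = fit_weekday_means(train_days, train_values)
print(predict(model, [date(2014, 9, 1), date(2014, 9, 7)]))  # [20.0, 80.0]
```

Such a model cannot distinguish an ordinary Monday from a Monday that falls in a holiday, which is the gap the workday and holiday features are meant to fill.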

So we designed the following features:

--Position within a normal work week (day 1 to 5), within a weekend (day 1 to 2), within a holiday (day 1 to 3); normal workday before/after a holiday
--Day before returning to work, workday, holiday, ten-day period of the month, first day of each month
--Number of days since the last peak/trough
--On the last workday before a break: how many days off follow (2, 3 or 7; three 0/1 features)
--On the first workday after a break: how many days off came before (2 or 3; two 0/1 features)
--Two-day break, three-day break
--Sunday make-up workday
--Stock wave theory: waves 1, 3 and 5
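
A few of the calendar features above can be sketched like this, in Python for illustration. The holiday and make-up workday sets are an illustrative calendar for September and October 2014 and should be checked against the official schedule; the function and feature names are hypothetical, not the ones we used.

```python
from datetime import date, timedelta

# Illustrative calendar (assumption): Mid-Autumn Festival and National Day
HOLIDAYS = {date(2014, 9, 8)} | {date(2014, 10, d) for d in range(1, 8)}
MAKEUP_WORKDAYS = {date(2014, 9, 28), date(2014, 10, 11)}

def is_day_off(d):
    """A day is off if it is a holiday or weekend and not a make-up workday."""
    if d in MAKEUP_WORKDAYS:
        return False
    return d in HOLIDAYS or d.weekday() >= 5

def run_of_days_off(d, step):
    """Count consecutive days off right after (step=1) or before (step=-1) d."""
    n = 0
    cur = d + timedelta(days=step)
    while is_day_off(cur):
        n += 1
        cur += timedelta(days=step)
    return n

def calendar_features(d):
    """Build 0/1 and small integer calendar features for one date."""
    off = is_day_off(d)
    after = 0 if off else run_of_days_off(d, 1)
    before = 0 if off else run_of_days_off(d, -1)
    return {
        "is_workday": int(not off),
        "is_first_of_month": int(d.day == 1),
        "ten_day_period": min((d.day - 1) // 10, 2),  # 0, 1 or 2
        "makeup_workday": int(d in MAKEUP_WORKDAYS),
        "last_workday_before_2_off": int(after == 2),
        "last_workday_before_3_off": int(after == 3),
        "last_workday_before_7_off": int(after == 7),
        "first_workday_after_2_off": int(before == 2),
        "first_workday_after_3_off": int(before == 3),
    }

features_sep30 = calendar_features(date(2014, 9, 30))
```

With this calendar, September 30 is flagged as the last workday before a 7-day break, and September 28 as a make-up workday.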

With 19 of these features, pure LM reaches a score of 203 in part 1.

With all of the features, LM+RF reaches a score of 201 in part 2.


Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
