Context-aware ensemble of multifaceted factorization models for recommendation prediction in social networks
Yunwen Chen, Zuotao Liu, Daqi Ji, Yingwei Xin, Wenguang Wang, Lu Yao, Yi Zou
Abstract
This paper describes the solution of the Shanda Innovations team to Track 1 of KDD-Cup 2012. A novel approach called multifaceted factorization models is proposed to incorporate a great variety of features in social networks. Social relationships and actions between users are integrated as implicit feedback to improve recommendation accuracy. Keywords, tags, profiles, time, and other features are also used to model user interests. In addition, user behaviors are modeled from the durations of recommendation records. A context-aware ensemble framework is then applied to combine multiple predictors and produce the final recommendation results. The proposed approach obtained 0.43959 (public score) / 0.41874 (private score) on the testing dataset, which achieved 2nd place in the KDD-Cup competition.
Introduction
Social networking services (SNS) have gained tremendous popularity in recent years, and voluminous information is generated from social networks every day. It is desirable to build an intelligent recommender system that efficiently identifies what interests users. The task of KDD-Cup 2012 Track 1 is to develop such a system, which aims to capture users' interests and find the items that fit users' tastes and are most likely to be followed. The dataset, provided by Tencent Weibo, one of the largest social networking websites in China, is made up of 2,320,895 users, 6,095 items, 73,209,277 training records, and 34,910,937 testing records, which is relatively large compared with other publicly released datasets. In addition, it provides richer information in multiple domains, including user profiles, item categories, keywords, and the social graph. Timestamps for recommendations are also given for performing session analysis. For each user in the testing dataset, an ordered list of recommendation results is required. Mean average precision (MAP) is used to evaluate the results provided by 658 teams around the world.
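As an illustration of the evaluation metric, the following is a minimal sketch of mean average precision truncated at a cutoff k. The function names, the cutoff parameter k, and the normalization by min(number of relevant items, k) are assumptions made for this example; the official competition definition may differ in such details.

```python
from typing import Hashable, Mapping, Sequence, Set


def average_precision_at_k(
    ranked: Sequence[Hashable], relevant: Set[Hashable], k: int
) -> float:
    """Average precision of one ranked list, truncated at position k.

    Assumption: the sum of precisions at hit positions is normalized by
    min(len(relevant), k); other normalizations exist.
    """
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    seen = set()  # guards against counting a duplicated item twice
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant and item not in seen:
            hits += 1
            precision_sum += hits / rank  # precision at this hit position
        seen.add(item)
    return precision_sum / min(len(relevant), k)


def mean_average_precision(
    rankings: Mapping[Hashable, Sequence[Hashable]],
    ground_truth: Mapping[Hashable, Set[Hashable]],
    k: int,
) -> float:
    """Mean of per-user average precision over all users in ground_truth."""
    users = list(ground_truth)
    if not users:
        return 0.0
    total = sum(
        average_precision_at_k(rankings.get(user, []), ground_truth[user], k)
        for user in users
    )
    return total / len(users)


if __name__ == "__main__":
    # Hypothetical toy data: two users with recommended item lists.
    rankings = {"u1": ["a", "b", "c"], "u2": ["x", "y"]}
    followed = {"u1": {"a", "c"}, "u2": {"y"}}
    print(mean_average_precision(rankings, followed, k=3))
```

In this sketch a user with no recommended list contributes an average precision of zero, which penalizes missing predictions rather than ignoring them.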
Compared with traditional recommender problems, e.g., the Netflix Prize, where the ratings that users give to movies are predicted, the setting of KDD-Cup 2012 appears more complex. Firstly, there are much richer features between users on a social networking website. In the social graph, users can follow each other. Besides, three kinds of actions can be taken between users: "comment" (add comments to someone's tweet), "retweet" (repost a tweet and append some comments), and "at" (notify another user). User profiles contain rich information, such as gender, age, category, keywords, and tags. Models that are capable of integrating various features are therefore required. Secondly, the items to be recommended are specific users, each of which can be a person, a group, or an organization. Compared with the items of traditional recommender systems, e.g., books on Amazon or movies on Netflix, items on social network sites not only have profiles but also have their own behaviors and social relations. As a result, item modeling turns out to be more complicated. Thirdly, the training data in social networks is quite noisy, and the cold-start problem also poses a severe challenge because very limited information is available for a large number of users in the testing dataset. Effective preprocessing is needed to cope with this challenge.
In this paper we present a novel approach called context-aware ensemble of multifaceted factorization models. Various features are extracted from the training data and integrated into the proposed models. A two-stage training framework and a context-aware ensemble method are introduced, which helped us to gain higher accuracy. We also give a brief introduction to the session analysis method and the supplement strategy that we used in the competition to improve the quality of the training data.
The rest of the paper is organized as follows. Section 2 introduces the preliminaries of our methods. Section 3 presents the preprocessing method we used. In Section 4, we propose the multifaceted factorization models, which are adopted in the final solution. The context-aware ensemble and user behavior modeling methods are proposed in Section 5. Experimental results are given in Section 6, and conclusions and future work are given in Section 7.
https://kaggle2.blob.core.windows.net/competitions/kddcup2012/2748/media/Shanda3.pdf