Big data on mobile phones: The challenges of mobile big data

Source: Internet
Author: User

This article was jointly written by Xiaodong, senior director of wireless services at Percentile Information; Xu Yi, COO of TalkingData; and Shangliang of UESTC in Chengdu.

User behavior in mobile reading and mobile music apps shows how powerful mobile big data can be in mobile internet applications. At the same time, mobile big data is not a cure-all: its development still faces many practical problems and challenges.

Sparsity of data

Smartphone apps easily number more than 100,000, yet within any one app the selections of two users overlap very little. If the sparsity of a system is measured as the ratio of existing user-item selections to all possible user-item selections, the sparsity is no more than 4% in the few app datasets we studied. In fact, these are very dense data. Consider an app with tens of millions of users and millions of songs: even if a user listens to 100 songs on average, which is probably a generous estimate, the sparsity is on the order of one in ten thousand or lower.
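The sparsity measure described above can be written down in a few lines. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the figures of 10 million users and 1 million songs are assumed round numbers standing in for "tens of millions" and "millions".

```python
def sparsity(n_users, n_items, n_selections):
    """Ratio of existing user-item selections to all possible selections."""
    return n_selections / (n_users * n_items)


def sparsity_from_pairs(pairs):
    """Same ratio, computed from an iterable of (user, item) selections."""
    pairs = set(pairs)  # count each user-item selection once
    users = {user for user, _ in pairs}
    items = {item for _, item in pairs}
    return len(pairs) / (len(users) * len(items))


# Assumed round numbers for illustration only.
n_users = 10_000_000        # "tens of millions" of users
n_songs = 1_000_000         # "millions" of songs
songs_per_user = 100        # average number of songs a user listens to

music_sparsity = sparsity(n_users, n_songs, n_users * songs_per_user)
```

Note that the number of users cancels out: the ratio is simply 100 songs per user divided by 1,000,000 songs, which gives one in ten thousand.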

This problem is inherent and cannot be completely overcome, but many methods can alleviate it to a considerable extent, such as diffusion methods, random default values, and random selection.
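The article does not say which diffusion method it has in mind. As one possible illustration, here is a sketch of two-step mass diffusion on the user-item bipartite graph, a commonly used variant. The function name and the toy data are assumptions made for this example.

```python
from collections import defaultdict


def mass_diffusion_scores(user_items, target):
    """Score unseen items for `target` by two-step resource diffusion.

    user_items maps each user to the set of items that user selected.
    This is an illustrative variant, not necessarily the authors' method.
    """
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)

    # Step 1: each item selected by the target holds one unit of resource
    # and splits it equally among all users who selected that item.
    user_resource = defaultdict(float)
    for item in user_items[target]:
        share = 1.0 / len(item_users[item])
        for user in item_users[item]:
            user_resource[user] += share

    # Step 2: each user splits the received resource equally among
    # all items that user selected.
    item_score = defaultdict(float)
    for user, resource in user_resource.items():
        share = resource / len(user_items[user])
        for item in user_items[user]:
            item_score[item] += share

    # Only items the target has not selected yet are candidates.
    return {i: s for i, s in item_score.items() if i not in user_items[target]}


toy = {"u1": {"a", "b"}, "u2": {"b", "c"}, "u3": {"c", "d"}}
scores = mass_diffusion_scores(toy, "u1")
```

In the toy data, u1 and u2 share only item b, yet item c still receives a score for u1 through that single overlap. This is how diffusion eases sparsity: it uses indirect paths in the graph rather than requiring large direct overlaps between users.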

Cold start problem

In the music app we discussed earlier, we found that only about 2% of the songs were covered, because a large number of songs were in a cold-start state. A new item has been chosen only a few times or not at all, so it is difficult to find a suitable way to recommend it to users.

An interesting recent study shows that new users tend to choose particularly popular items. This is good news in any case, and it suggests that simply showing new users a hot-song list works well.
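One way to act on this finding is to fall back to a hot list whenever a user has too little history for a personalized method. The sketch below assumes this design; `hot_list`, `recommend`, the `min_history` threshold, and the `personalized` callback are all hypothetical names introduced for illustration.

```python
from collections import Counter


def hot_list(user_items, top_n):
    """Most-selected items, with ties broken by item id for determinism."""
    popularity = Counter(i for items in user_items.values() for i in items)
    ranked = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


def recommend(user, user_items, personalized, top_n=2, min_history=1):
    """Use the hot list for cold-start users, a personalized method otherwise.

    `personalized` is any callable returning a ranked item list for a user;
    it stands in for whatever recommender the system already has.
    """
    history = user_items.get(user, set())
    if len(history) < min_history:
        return hot_list(user_items, top_n)
    return personalized(user)[:top_n]


listens = {"u1": {"s1", "s2"}, "u2": {"s1"}, "u3": {"s1", "s2", "s3"}}
for_new_user = recommend("new_user", listens, personalized=lambda u: [])
```

The threshold is a design choice: raising `min_history` keeps users on the hot list longer, which trades diversity for safer early recommendations.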

Big data processing and incremental computation

Although the data are sparse, most datasets contain millions of users, and new users keep entering the system. The data are not only large but also constantly changing, so processing them quickly and efficiently has become a pressing problem. Under these conditions, the time and space complexity of algorithms, especially time complexity, has gained unprecedented attention. In general, an efficient algorithm either has very low complexity itself, parallelizes well, or both.

As more and more new information is added, it eventually becomes necessary to recompute from the global data every once in a while. A more advanced but also more difficult approach is to design an algorithm whose error does not accumulate, meaning that the difference between its result and the result recomputed from the full data does not grow monotonically.
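To make the comparison between an incremental result and a full recomputation concrete, here is a small sketch based on item co-occurrence counts. This statistic is chosen only because it can be updated exactly, so the difference stays at zero. The class and function names are hypothetical, and real recommendation models are usually harder to update without error.

```python
from collections import defaultdict
from itertools import combinations


class IncrementalCooccurrence:
    """Maintains item-pair co-occurrence counts one selection at a time."""

    def __init__(self):
        self.user_items = defaultdict(set)
        self.counts = defaultdict(int)

    def add(self, user, item):
        seen = self.user_items[user]
        if item in seen:
            return  # a repeated selection changes nothing
        # Local update: only pairs involving the new item are touched.
        for other in seen:
            self.counts[frozenset((item, other))] += 1
        seen.add(item)


def full_recompute(user_items):
    """Recomputes all co-occurrence counts from the global data."""
    counts = defaultdict(int)
    for items in user_items.values():
        for a, b in combinations(sorted(items), 2):
            counts[frozenset((a, b))] += 1
    return counts


def drift(incremental, full):
    """Total absolute difference between the two sets of counts."""
    keys = set(incremental) | set(full)
    return sum(abs(incremental.get(k, 0) - full.get(k, 0)) for k in keys)


model = IncrementalCooccurrence()
events = [("u1", "a"), ("u1", "b"), ("u2", "a"),
          ("u2", "b"), ("u2", "c"), ("u1", "b")]
for user, item in events:
    model.add(user, item)

current_drift = drift(model.counts, full_recompute(model.user_items))
```

For an approximate incremental method, tracking a drift value like this over time is one way to check whether the error stays bounded or keeps rising, and therefore how often a full recomputation is needed.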

Mining and utilization of user behavior patterns

Mining user behavior patterns in depth makes it possible to capture user preferences more accurately and, ideally, to deliver a better user experience. For example, in music apps, new users and long-standing users choose very differently: generally speaking, new users tend to choose popular songs, while long-standing users pay more attention to the diversity of songs.

The spatio-temporal statistical characteristics of user behavior can also be used to improve the design for specific scenarios. For example, when making personalized mobile reading recommendations, if the data show that a user reads on the phone for only about one hour, between 7 and 8 o'clock (probably on the subway or bus on the way to work), it is unwise to send that user an e-book SMS ad at 9 o'clock. Long-term and short-term interests in user selections can also be analyzed from time-stamped data, and recommendation accuracy can be improved by separating the two effects.
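The timing rule in the reading example can be sketched as follows. This is an illustrative simplification: the `min_share` threshold and the function names are assumptions, and a production system would also need to consider time zones, weekdays versus weekends, and location.

```python
from collections import Counter
from datetime import datetime


def active_hours(timestamps, min_share=0.5):
    """Hours of the day that hold at least `min_share` of a user's sessions."""
    hours = Counter(ts.hour for ts in timestamps)
    total = sum(hours.values())
    return {hour for hour, n in hours.items() if n / total >= min_share}


def should_send(timestamps, send_hour, min_share=0.5):
    """Send a reading promotion only during the user's habitual reading hours."""
    return send_hour in active_hours(timestamps, min_share)


# Hypothetical log: the user reads around 7:15 on five days, once at 21:00.
reads = [datetime(2024, 1, day, 7, 15) for day in range(1, 6)]
reads.append(datetime(2024, 1, 3, 21, 0))

send_at_nine = should_send(reads, send_hour=9)
send_at_seven = should_send(reads, send_hour=7)
```

With this log, a message at 9 o'clock is held back while one at 7 o'clock is allowed, which matches the commuter scenario described above.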

(Responsible editor: The good of the Legacy)
