Internet Advertising Overview: Feature Engineering for Click-Through Rate Estimation

Source: Internet
Author: User


Statement:

1) This blog post is compiled from material that many experts have selflessly shared online. Please see the references for details. For version statements, also refer to the original literature.

2) This article is for academic exchange only and is non-commercial. The individual parts are therefore not matched to specific references in detail, and some parts are copied directly from other blogs. If any part accidentally infringes on anyone's interests, please forgive me and contact Swaiiow to have it deleted or modified until the relevant parties are satisfied.

3) My knowledge is limited, so errors in compiling this summary are inevitable. I hope more experienced readers will point them out. Thank you.

4) Reading this article requires a background in machine learning, statistical learning theory, optimization algorithms, and similar basics (if you lack it, that is fine too; skim it and use it as material for bragging to classmates).

5) I have Word and PDF versions. If needed, I can upload them to CSDN for download.


1 Feature Engineering for Internet Advertising

The blog post "Internet Advertising overview of the click-Through System" discusses the click-through rate (CTR) system for Internet advertising. It shows that the logistic regression model is relatively simple and practical. There are many training methods, but they all share the same objective. The trained result has a fairly large effect on performance, but the training method itself is not decisive, because what training produces is a weight for each feature, and small differences in the weights do not cause large changes in the estimated CTR.
Once the training method is fixed, the choice of features is what decides the quality of CTR estimation.


1.1 Feature selection and use

CTR estimation needs two kinds of data: data about the ad and data about the user. Once both are available, the job is to use them to estimate how likely the user is to click the ad, that is, a probability.
There are many user features: age, gender, region, occupation, school, mobile platform, and so on. Ad features are also rich: ad size, ad text, the advertiser's industry, the ad image. There are also feedback features, such as the real-time CTR of each ad and the CTR of an ad crossed with gender. Choosing, from so many features, the ones that characterize a person's interest in an ad is a big problem for data mining engineers.
When selecting a feature, you also need to think about how it will be used. For example, if age is taken as a feature, what will training actually learn? Adding and subtracting ages is meaningless, so each age has to become its own feature. But is that alone enough? How to use features is a major question for ad algorithm engineers.


1.1.1 Selecting features

Which features are appropriate for estimating CTR? This is a question many ad algorithm engineers have to consider.
Discussions of machine learning algorithms are mostly about models and seldom about features. In real applications, most of a data mining engineer's work is coming up with features and validating them.
Coming up with features is both mental and physical labor and requires a lot of domain knowledge. More frustrating, the industry has no systematic method for thinking of features, only methods for validating them. For the Internet advertising industry, here is a brief account of where the common features come from.
Take age first. How do we know it is related to CTR? The intuitive explanation is that young people generally like sports ads, 30-year-old men like ads for cars and houses, and people over 50 like health care ads. The reason for choosing age as a feature is thus a rough division of what people of different ages like, which is quite subjective.
For gender, the intuition is that men generally like sports, car, and travel ads, while women generally like cosmetics and clothing ads. Gender is chosen as a feature for a similar reason: men and women generally like different things.
Region takes more thought. Do people in South China prefer animation and games, and people in North China prefer wine and cigarettes?
On the ad side, can the size of the ad image or its foreground and background colors really affect clicks? This is a kind of speculation too. Whether the image shows a celebrity, an animal, or something else can also be considered.
In short, coming up with features has little to rely on. You can only let your imagination run and learn about many different industries in order to think of more features. Even a feature that seems to have little to do with people should still be properly validated. It is much like a man inventing an excuse for coming home late: with an excuse in hand he thinks about how to explain it better, and without one he has to think one up.
Once you have thought of a feature, it has to be validated and judged.
There are many ways to validate features, such as direct observation of CTR, the chi-square test, and single-feature AUC. Direct observation of CTR is very effective. For example, if the delivery logs show that cosmetics ads have a much higher CTR among women than among men, then gender has predictive power in the cosmetics industry. If sporting goods ads also have a higher CTR among men than among women, gender is predictive in the sports industry too, and gender can be regarded as a feature usable across many industries.
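As a rough illustration of two of the validation methods named above (direct observation of CTR and the chi-square test), here is a minimal Python sketch. The click and impression counts are made up, the function names are hypothetical, and the 5% significance threshold is an assumption rather than something the post prescribes.

```python
def ctr(clicks, impressions):
    """Click-through rate; 0.0 when there are no impressions."""
    return clicks / impressions if impressions else 0.0


def chi_square_2x2(clicks_a, imps_a, clicks_b, imps_b):
    """Pearson chi-square statistic for a 2x2 table: group x (clicked, not clicked)."""
    observed = [
        [clicks_a, imps_a - clicks_a],
        [clicks_b, imps_b - clicks_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    total = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical counts for one cosmetics ad (made-up numbers, not real data).
female = {"clicks": 300, "impressions": 10000}
male = {"clicks": 100, "impressions": 10000}

ctr_female = ctr(female["clicks"], female["impressions"])
ctr_male = ctr(male["clicks"], male["impressions"])
stat = chi_square_2x2(female["clicks"], female["impressions"],
                      male["clicks"], male["impressions"])

# 3.841 is the 5% critical value of the chi-square distribution with 1 degree of freedom.
gender_is_predictive = stat > 3.841
```

With these invented counts the CTR among women is three times the CTR among men and the statistic is far above the critical value, which is the kind of evidence the paragraph above describes for keeping gender as a feature.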
Age is evaluated in the same way: first check whether the CTR of a single ad differs across ages, then check whether the distribution of CTR over ages differs between ads. If there are differences, age can be used as a feature.
In practice, gender and mobile platform turn out to be the more effective features. Region and age have some effect, but not as obvious as the first two. That may be related to how they are used and needs further investigation.
Practice also shows that the ad's feedback CTR is very effective. This feature applies when an ad is already running and has been partly delivered: the CTR of the part already delivered can be taken as the CTR of the ad, and also as an indication of the ad's quality. It is very effective for estimating the CTR of a piece of traffic.


1.1.2 Processing and using features

Once a feature has been chosen, how to use it is another problem.
Start with the requirement. What CTR estimation has to do is the task shown in the figure below: compute the CTR for a user/ad pair.

Features have been selected above. For now, take three: the ad's feedback CTR, the user's age, and the user's gender.
First, discretization
The feedback CTR is a floating-point number and can be used directly as a feature. Assume feature number 1 is the feedback CTR. Age cannot be handled this way, because age is not a quantity of that kind: comparing the numbers 20 and 30 for a 20-year-old and a 30-year-old is meaningless, and so is adding or subtracting them, yet both the optimization and the actual CTR calculation operate on the magnitudes of these numbers. Take w·x: once w is fixed, whether a feature of x has the value 20 or 30 makes a large difference to w·x, and the difference remains fairly large even after the logistic function is applied. But a 20-year-old and a 30-year-old often do not differ that much in their interest in the same ad. The solution is to make each age its own feature. Suppose there are only 10 ages, 20 to 29. Each age becomes a feature, numbered 2 to 11 (number 1 is the ad's feedback CTR). If the person is 20, feature 2 has the value 1 and features 3 to 11 have the value 0. The age category thus has 10 features, and these 10 features are mutually exclusive. Features of this kind are called discretized features.
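The numbering scheme just described (feature 1 for the feedback CTR, features 2 to 11 for ages 20 to 29) can be sketched in Python as follows. The function names and the example CTR value are hypothetical; only the numbering follows the text.

```python
def encode_age(age, min_age=20, max_age=29, first_index=2):
    """Return {feature_number: value} for the age block (features 2..11 by default)."""
    if not min_age <= age <= max_age:
        raise ValueError("age outside the assumed 20-29 range")
    # Start with every age feature at 0, then switch on the one matching this age.
    features = {first_index + (a - min_age): 0 for a in range(min_age, max_age + 1)}
    features[first_index + (age - min_age)] = 1
    return features


def build_features(feedback_ctr, age):
    """Feature 1 is the ad's feedback CTR; features 2..11 are the one-hot age block."""
    features = {1: feedback_ctr}
    features.update(encode_age(age))
    return features


# A 20-year-old seeing an ad whose feedback CTR is 1.2% (made-up value).
x = build_features(0.012, 20)
```

Exactly one of the ten age features is 1 for any user, which is what makes the block mutually exclusive.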
Second, crossing
That seems to solve the problem, but is it enough?
Take a 20-year-old. Feature 2 is always 1 for this person: it is 1 for a basketball ad and 1 for a cosmetics ad. The weight learned for feature 2 then means the likelihood that a 20-year-old clicks any ad at all, which is unreasonable.
What we want is for the 20-year-old to have one value for sports ads and another value for health care ads. That looks reasonable. If this is not clear enough, apply the same reasoning to gender. If gender is discretized as above, it gets numbers 12 and 13, where 12 is male and 13 is female. For a male/sports ad pair, feature 12 is 1, and for a male/cosmetics ad pair, feature 12 is also 1. That is unreasonable too.
How can it be made reasonable? Continue with gender. Instead of 1, let feature 12 take the CTR of the ad among male users. For a male/sports ad pair, feature 12 is the CTR of male users on that sports ad. Feature 12 becomes a floating-point number, and adding and subtracting this number is meaningful.
This is called feature crossing. Here the feature value comes from crossing gender with the ad. There are many other crosses. The ones most used in industry today are ad with user (feature number 1), ad with gender, ad with age, ad with mobile platform, and ad with region. Going further, the advertiser can also be crossed with each feature (each ad is a delivery plan submitted by an advertiser, and one advertiser may submit several plans).
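A minimal Python sketch of the ad/gender cross described above: the value of feature 12 (male) or 13 (female) is the historical CTR of that ad within that gender. The log format, ad names, and counts are invented for illustration.

```python
from collections import defaultdict


def cross_ctr_table(log):
    """log: iterable of (ad_id, gender, clicked) -> {(ad_id, gender): CTR}."""
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for ad_id, gender, clicked in log:
        impressions[(ad_id, gender)] += 1
        clicks[(ad_id, gender)] += int(clicked)
    return {key: clicks[key] / impressions[key] for key in impressions}


def gender_cross_features(ad_id, gender, table):
    """Feature 12 (male) or 13 (female) carries the ad's CTR within that gender."""
    value = table.get((ad_id, gender), 0.0)  # 0.0 for unseen pairs is an assumption
    return {
        12: value if gender == "male" else 0.0,
        13: value if gender == "female" else 0.0,
    }


# Hypothetical impression log (made-up counts).
log = (
    [("sports_ad", "male", True)] * 3 + [("sports_ad", "male", False)] * 97
    + [("cosmetics_ad", "male", True)] * 1 + [("cosmetics_ad", "male", False)] * 199
    + [("cosmetics_ad", "female", True)] * 4 + [("cosmetics_ad", "female", False)] * 96
)
table = cross_ctr_table(log)

features_sports = gender_cross_features("sports_ad", "male", table)
features_cosmetics = gender_cross_features("cosmetics_ad", "male", table)
```

The same male user now gets a different value for feature 12 depending on the ad, which is the point of crossing.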
Third, turning continuous features into discrete features
Is crossing the features enough? The answer is still: not necessarily.
Take feature number 1, the ad's CTR. Suppose the CTR of Internet ads follows a long-tailed distribution, the log-normal distribution, and consider its probability density. (This is an assumption and does not represent real data, although real data does appear to have this shape. Yahoo's smoothing paper seems to say the CTR fits a beta distribution.)
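For reference, the standard density of the log-normal distribution is given here. The symbols \mu and \sigma (mean and standard deviation of \ln x) are introduced for this formula and do not appear in the original post.

```latex
f(x;\mu,\sigma) = \frac{1}{x\,\sigma\sqrt{2\pi}}\,
  \exp\!\left(-\frac{(\ln x-\mu)^{2}}{2\sigma^{2}}\right), \qquad x > 0
```

The density rises quickly to a peak at small x and then decays slowly, which is the long right tail the assumption refers to.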

Under such a distribution, most ads have a CTR in a small range; the higher the CTR, the fewer the ads, and the less traffic those ads cover. In other words, when CTRs are around 0.2%, if ad A's CTR is 0.2% and ad B's CTR is 0.25%, ad B is 0.05 percentage points higher than ad A, and that is enough to say ad B is much better than ad A. But when CTRs are around 1%, if ad A's CTR is 1% and ad B's CTR is 1.05%, we cannot say ad B is much better than ad A, because few ads fall in this 0.05-point interval and the two ads can be considered about the same. That is, CTRs in different intervals should be treated as factors with different weights, because the ad CTR that makes up feature number 1 is not in a purely positive correlation with the probability that the user clicks the ad: in some ranges a larger value may make the feature more important, while once the value has grown to a certain extent the relationship may change.
For this problem, scientists at Baidu have proposed discretizing continuous features. They argue that a continuous feature is not equally important across its value ranges, so it should have different weights in different ranges. The way to achieve this is to split the feature's range into intervals and make each interval a new feature.
The implementation uses equal-frequency discretization. 1) For feature number 1 above, first sort the values that the feature takes in the historical impression records. Suppose there are 10,000 impression records, each with a different floating-point value for this feature. Sort all records by this value from low to high. The feature values of the lowest 1,000 records form one interval, those of the records ranked 1,001 to 2,000 form the next, and so on, for 10 intervals in total. 2) Renumber the features. For the 1,000 records ranked 1 to 1,000, the original feature 1 becomes new feature 1 with value 1; for the records ranked 1,001 to 2,000, the original feature 1 becomes new feature 2 with value 1; and so on, giving 10 new feature numbers, 1 to 10. For an impression record ranked 1 to 1,000, only new feature 1 is 1 and features 2 to 10 are 0, and likewise for the other records. The ad's own CTR thus occupies 10 feature numbers: it has been discretized into 10 features.
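The two steps above can be sketched in Python like this. It is a minimal sketch that assumes, as the text does, that all historical values are distinct; the function names and the synthetic values are hypothetical.

```python
from bisect import bisect_left


def equal_frequency_edges(values, n_bins=10):
    """Inclusive upper edge of each of the first n_bins - 1 bins.

    Each bin holds len(values) // n_bins records when all values are distinct.
    """
    ordered = sorted(values)
    size = len(ordered) // n_bins
    # The record with rank k * size (1-based) is the last one in bin k.
    return [ordered[k * size - 1] for k in range(1, n_bins)]


def discretize(value, edges):
    """One-hot dict over the new feature numbers 1..len(edges) + 1."""
    bin_no = bisect_left(edges, value) + 1
    return {i: int(i == bin_no) for i in range(1, len(edges) + 2)}


# 10,000 synthetic, distinct feature values standing in for historical records.
values = [i / 1_000_000 for i in range(1, 10001)]
edges = equal_frequency_edges(values, n_bins=10)

lowest = discretize(values[0], edges)       # rank 1     -> new feature 1
rank_1000 = discretize(values[999], edges)  # rank 1000  -> new feature 1
rank_1001 = discretize(values[1000], edges) # rank 1001  -> new feature 2
highest = discretize(values[9999], edges)   # rank 10000 -> new feature 10
```

At serving time only the stored edges are needed: a new CTR value is mapped to its interval with one binary search.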
Equal-frequency discretization has to be applied to each original feature. The original features numbered 1 to 13 are discretized into many more numbers: if each is discretized into 10, there are 130 features in the end, and the trained w is a 130-dimensional vector holding the weights of the 130 features.
Practical experience shows that discretized features can fit nonlinear relationships in the data and give better results than the original continuous features. In online serving no multiplication is needed, which also speeds up the CTR calculation.
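To see why no multiplication is needed online: when every feature is 0 or 1, w·x reduces to the sum of the weights of the features that are switched on. A minimal Python sketch under that assumption follows; the weights, bias, and feature numbers are invented.

```python
import math


def predict_ctr(active_features, weights, bias=0.0):
    """Logistic regression score using only additions over the active (value 1) features."""
    score = bias + sum(weights.get(i, 0.0) for i in active_features)
    return 1.0 / (1.0 + math.exp(-score))


def predict_ctr_dense(x, weights, bias=0.0):
    """Reference version computing the full dot product w.x for comparison."""
    score = bias + sum(weights.get(i, 0.0) * v for i, v in x.items())
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical trained weights for three of the 130 discretized features.
weights = {3: 0.4, 17: -0.2, 25: 0.1}
bias = -4.0

active = [3, 17, 25]  # the features that are 1 for this impression
x = {i: int(i in active) for i in range(1, 131)}  # same impression as a dense 0/1 vector

p_sparse = predict_ctr(active, weights, bias)
p_dense = predict_ctr_dense(x, weights, bias)
```

Both functions give the same estimate; the sparse one touches only the active feature numbers.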


1.1.3 Filtering and correcting features

As mentioned above, many features are feedback features, such as the ad's feedback CTR and the ad/gender cross feature, and they can be computed from historical impression logs. But some ads have very few impressions. An ad may have been shown only a few times to male users, for instance, and the ad/gender cross CTR computed from so few impressions is inaccurate and needs to be corrected. For the correction method, see the blog post "Bayesian smoothing of ad clicks."
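One common form of such a correction places a beta prior on the CTR and uses the posterior mean. The referenced post is not reproduced here, so treat the following Python sketch as an assumption about the general idea rather than that post's exact method; the prior parameters alpha and beta are invented and would normally be fitted from historical data.

```python
def smoothed_ctr(clicks, impressions, alpha=2.0, beta=198.0):
    """Posterior-mean CTR under a Beta(alpha, beta) prior.

    alpha / (alpha + beta) is the prior CTR (1% here); alpha + beta controls how
    many impressions are needed before the observed data dominates the prior.
    """
    return (clicks + alpha) / (impressions + alpha + beta)


# An ad shown only 5 times to male users, with 1 click (made-up counts).
raw_small = 1 / 5
smooth_small = smoothed_ctr(1, 5)

# An ad with plenty of data is barely changed by smoothing.
raw_large = 200 / 10000
smooth_large = smoothed_ctr(200, 10000)
```

With 5 impressions the estimate is pulled from 20% toward the 1% prior, while with 10,000 impressions it stays close to the observed 2%.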
Using the corrected CTR as the feature gave a fairly large improvement online.
If many features are used, for example a cross feature of school and ad, discretization can produce tens of thousands of features, and this many features causes problems such as over-fitting. One solution is to evaluate features offline, for example by how well they discriminate CTR. Another is regularization, especially L1 regularization. In a weight vector trained with L1 regularization, features with little predictive power for CTR get a weight of 0 and do not affect the estimate. This is feature filtering. For discussion and implementation of L1, see the blog posts "from generalized linear model to logistic regression", "OWL-QN algorithm", and "Online learning algorithm Ftrl".
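To illustrate how L1 regularization drives the weight of an uninformative feature to exactly 0, here is a small Python sketch using proximal gradient descent (soft-thresholding). This is a swapped-in, simpler technique, not OWL-QN or FTRL from the posts mentioned above, and the toy data, learning rate, and regularization strength are all invented.

```python
import math


def soft_threshold(w, t):
    """Proximal operator of t * |w|: shrink toward 0, exactly 0 when |w| <= t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def train_l1_logistic(X, y, lam=0.05, lr=0.5, epochs=500):
    """L1-regularized logistic regression by proximal gradient; bias is not penalized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-(b + sum(wj * xj for wj, xj in zip(w, xi)))))
            err = p - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        # Gradient step on the log loss, then soft-threshold each weight.
        w = [soft_threshold(wj - lr * gj / n, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Toy data: feature 0 changes the click rate (75% vs 25%); feature 1 is unrelated to clicks.
X, y = [], []
for x0, labels in ((1, [1, 1, 1, 0]), (0, [1, 0, 0, 0])):
    for x1 in (0, 1):
        for label in labels:
            X.append([x0, x1])
            y.append(label)

w, b = train_l1_logistic(X, y)
```

On this toy data the weight of the predictive feature stays positive while the weight of the unrelated feature is exactly zero, so it can be dropped without changing the estimate.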


Thanks

The researchers at LinkedIn, Baidu, and other companies who selflessly made their material public.
The many bloggers whose blog material was used.


References

[1] Ad Click Prediction: a View from the Trenches. H. Brendan McMahan, Gary Holt, et al. Google.
[2] http://www.cnblogs.com/vivounicorn/archive/2012/06/25/2561071.html, @Leo Zhang's blog
[3] Computational advertising: the LinkedIn. Deepak Agarwal, LinkedIn Corporation. CIKM.
