Traffic allocation algorithms in contract advertising

Source: Internet
Author: User
Tags: ad server

Brief introduction

Contract advertising is a contract-based advertising model, and one of its main forms in online advertising is guaranteed delivery (GD). GD advertising puts quantity ahead of quality: the publisher must ensure that each advertiser receives the amount of traffic from its target audience that the contract specifies. Under GD, the publisher's traffic is segmented by attributes, and the publisher allocates traffic to the different advertisers according to their contracts. The job of the ad server is to choose the right advertiser for each impression that satisfies several contracts, so that each advertiser gets the best possible result and traffic is distributed efficiently.

In the example figure, supply is the publisher side, which provides traffic that can be segmented by gender, age, and geography; demand is the advertiser side, where different advertisers need different traffic segments (or rather the users behind the traffic). For example, the first advertiser needs 200K of male traffic. The traffic allocation problem is to assign the traffic segments to the advertisers so that the needs of all advertisers are met as far as possible. For example, how should six traffic segments be allocated to three advertisers so that all advertising contracts are satisfied? We can allocate the CA traffic to the second advertiser first, after which it is easier to allocate traffic to the other two advertisers.

Much work has already been done on this problem, and several good methods exist. To understand the strengths and weaknesses of the different approaches, two basic concepts are needed first. The first is the traffic arrival order (input order): in real ad serving we do not know in what order traffic will arrive, or even what traffic will arrive. The arrival order clearly has a strong influence on the final result of an algorithm, so to compare algorithms objectively they must be evaluated under the same arrival order. In general, there are four models of traffic arrival order:

First, traffic arrives in worst-case (adversarial) order. This model assumes an adversary who knows your allocation algorithm and can manipulate the order in which traffic arrives, producing the sequence on which your algorithm performs worst.

Second, traffic arrives in random order.

Third, traffic arrives as independent and identically distributed (i.i.d.) draws from an unknown distribution.

Fourth, traffic arrives as i.i.d. draws from a known distribution.
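The difference between the random-order and i.i.d. models can be made concrete with a small sketch. The function names, segment labels, and probabilities below are illustrative assumptions, not part of the original article. In the random-order model a fixed set of impressions is permuted; in the i.i.d. model each arrival is drawn independently from one distribution, which the algorithm may (known i.i.d.) or may not (unknown i.i.d.) be given.

```python
import random

def random_order(impressions, rng):
    """Random-order model: a fixed set of impressions arrives in a random permutation."""
    shuffled = list(impressions)
    rng.shuffle(shuffled)
    return shuffled

def iid_arrivals(type_probs, n, rng):
    """i.i.d. model: each of the n arrivals is drawn independently from type_probs."""
    types = list(type_probs)
    weights = [type_probs[t] for t in types]
    return rng.choices(types, weights=weights, k=n)

rng = random.Random(0)
fixed_set = ["male", "male", "female", "CA"]        # hypothetical traffic segments
print(random_order(fixed_set, rng))                  # same multiset, shuffled
print(iid_arrivals({"male": 0.5, "female": 0.3, "CA": 0.2}, 4, rng))
```

In the random-order case the multiset of impressions is fixed in advance, whereas an i.i.d. sequence may contain any mix of segments.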

The second concept is the competitive ratio, an indicator of how effective an algorithm is. The competitive ratio is defined as the ratio of the objective value obtained by the algorithm to the objective value obtained by the optimal allocation algorithm. If traffic arrives i.i.d., the ratio of the expected objective values is used. With the concepts of arrival order and competitive ratio in hand, the merits and drawbacks of different algorithms can be analyzed. Below we examine some existing algorithms, from simple to complex.

Greedy allocation algorithm

The first algorithm to be examined is the greedy allocation algorithm, which works as follows:

For each request, serve the matching ad with the highest bid.

Now consider how well the greedy allocation algorithm performs. It can be proved that when traffic arrives in worst-case order, the competitive ratio of the greedy algorithm is exactly 0.5. The proof is not given here; instead, an example shows that the ratio cannot be better than 0.5.

In the example, each black point represents one unit of traffic, and traffic arrives in order from top to bottom. Two advertisers each need 1 unit of traffic, and their bids are 1 and M respectively. When M is greater than 1, the greedy allocation gives the publisher a final revenue of M, whereas the optimal allocation gives a revenue of 1 + M, so for this arrival order the competitive ratio of the greedy algorithm is M / (1 + M). As M approaches 1, the competitive ratio of the greedy algorithm approaches 0.5; when M is much greater than 1, the ratio is close to 1 and the allocation efficiency is close to that of the optimal allocation. The greedy allocation algorithm is therefore a good choice when advertisers' bids differ widely.
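The example can be checked numerically. The sketch below assumes a particular eligibility structure that is consistent with the revenues stated in the text but is not spelled out there: the first impression is eligible for both advertisers, and the second only for the advertiser bidding M. The names `low` and `high` and both helper functions are hypothetical.

```python
from itertools import product

def greedy_revenue(impressions, bids, demand):
    """Give each impression to the eligible advertiser with the highest bid and unmet demand."""
    remaining = dict(demand)
    revenue = 0.0
    for eligible in impressions:
        candidates = [a for a in eligible if remaining[a] > 0]
        if candidates:
            winner = max(candidates, key=lambda a: bids[a])
            remaining[winner] -= 1
            revenue += bids[winner]
    return revenue

def optimal_revenue(impressions, bids, demand):
    """Brute-force offline optimum over all assignments (None = impression left unsold)."""
    best = 0.0
    options = [list(eligible) + [None] for eligible in impressions]
    for assignment in product(*options):
        used = {a: 0 for a in demand}
        for a in assignment:
            if a is not None:
                used[a] += 1
        if all(used[a] <= demand[a] for a in demand):
            best = max(best, sum(bids[a] for a in assignment if a is not None))
    return best

def worst_case_ratio(m):
    bids = {"low": 1.0, "high": m}
    demand = {"low": 1, "high": 1}
    impressions = [("low", "high"), ("high",)]  # assumed eligibility, in arrival order
    return greedy_revenue(impressions, bids, demand) / optimal_revenue(impressions, bids, demand)

for m in (1.01, 2.0, 100.0):
    print(m, round(worst_case_ratio(m), 4))
```

Under this assumption greedy spends the only impression the low bidder could use on the high bidder, so the ratio is M / (1 + M): close to 0.5 for M near 1 and close to 1 for large M.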

The analysis above covers the competitive ratio of the greedy algorithm when traffic arrives in worst-case order. In the real world, traffic generally does not arrive in worst-case order, so the competitive ratio of the greedy algorithm under i.i.d. arrival matches real applications more closely. Goel et al., in the paper "Online budgeted matching in random input models with applications to AdWords", prove that when the input order is random the greedy algorithm has a competitive ratio of 1 - 1/e for the AdWords problem. Note that the AdWords problem and the GD traffic allocation problem are not equivalent, and many results do not carry over. For the traffic allocation problem with random input order, the competitive ratio of the greedy algorithm is therefore still an open question; interested readers can follow Goel's approach and see whether the same conclusion can be reached. As the discussion of bid scaling below shows, the AdWords problem is quite different from display ad allocation, and traffic allocation for display ads is the harder of the two.

Another algorithm as simple as the greedy algorithm is the random algorithm: for each ad request, it picks at random one of the contract ads that meet the requirements and serves it. As an exercise, consider what competitive ratio the random algorithm achieves under the different arrival models, and in which cases it performs better than the greedy algorithm.
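One case where the random algorithm beats greedy can be worked out exactly on a two-impression adversarial instance: two advertisers bidding 1 and M, each needing 1 unit, with the first impression eligible for both and the second only for the advertiser bidding M. That eligibility structure, the uniform choice among eligible advertisers, and the exclusion of advertisers whose demand is already met are all assumptions of this sketch.

```python
def expected_random_revenue(impressions, bids, demand):
    """Exact expected revenue when each impression goes to a uniformly random
    eligible advertiser that still has unmet demand."""
    def expect(index, remaining):
        if index == len(impressions):
            return 0.0
        candidates = [a for a in impressions[index] if remaining[a] > 0]
        if not candidates:
            return expect(index + 1, remaining)
        total = 0.0
        for a in candidates:
            after = dict(remaining)
            after[a] -= 1
            total += bids[a] + expect(index + 1, after)
        return total / len(candidates)
    return expect(0, dict(demand))

m = 2.0
bids = {"low": 1.0, "high": m}
demand = {"low": 1, "high": 1}
impressions = [("low", "high"), ("high",)]   # assumed eligibility, in arrival order
random_value = expected_random_revenue(impressions, bids, demand)
optimal_value = 1.0 + m                       # both contracts filled
print(random_value / optimal_value)
```

On this instance the random algorithm earns M + 0.5 in expectation, so its ratio is (M + 0.5) / (1 + M), which is above the greedy ratio M / (1 + M) for every M. This says nothing about other instances or arrival models.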

Bid scaling algorithm

The following is a more general algorithm, which can be called the bid scaling algorithm. As the name suggests, for every ad request the bid of each eligible ad is adjusted by a scaling factor (multiplied by a factor for the AdWords problem, reduced by a factor for display ads), and the ad with the highest adjusted value is served. To understand the bid scaling algorithm, the problem first needs to be stated mathematically:

In the formulation above, the left side is the primal problem and the right side is the dual problem. The quantities involved are the amount of traffic v assigned to advertiser u (either 1 or 0), the bid of advertiser u for traffic v, and the total traffic required by advertiser u. The difficulty is that the traffic v is not known in advance, so the problem has to be solved online. By the KKT conditions, the optimal solution of the problem must satisfy the following conditions:
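The formulation and the conditions referred to here did not survive in this text. A standard formulation consistent with the definitions above can be sketched as follows. All symbol names are assumptions: x_{uv} is the amount of traffic v assigned to advertiser u, b_{uv} is the bid of advertiser u for traffic v, B_u is the total traffic required by advertiser u, and α_u and z_v are the dual variables. The 0/1 requirement on x_{uv} is relaxed to x_{uv} ≥ 0 so that duality applies.

```latex
% Primal (LP relaxation of the 0/1 assignment)
\begin{aligned}
\max_{x}\quad & \sum_{u}\sum_{v} b_{uv}\, x_{uv} \\
\text{s.t.}\quad & \sum_{v} x_{uv} \le B_u && \forall u \\
& \sum_{u} x_{uv} \le 1 && \forall v \\
& x_{uv} \ge 0 && \forall u, v
\end{aligned}

% Dual
\begin{aligned}
\min_{\alpha, z}\quad & \sum_{u} B_u\, \alpha_u + \sum_{v} z_v \\
\text{s.t.}\quad & \alpha_u + z_v \ge b_{uv} && \forall u, v \\
& \alpha_u \ge 0,\; z_v \ge 0
\end{aligned}

% Complementary slackness (KKT)
\begin{aligned}
x_{uv} > 0 \;&\Rightarrow\; z_v = b_{uv} - \alpha_u \\
\alpha_u > 0 \;&\Rightarrow\; \sum_{v} x_{uv} = B_u \\
z_v > 0 \;&\Rightarrow\; \sum_{u} x_{uv} = 1
\end{aligned}
```

Because dual feasibility requires z_v ≥ b_{u'v} - α_{u'} for every advertiser u', the first condition says that traffic v can only go to an advertiser that maximizes b_{uv} - α_u, which is the subtractive form of bid scaling used for display ads.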

It follows from the KKT conditions above that the optimal solution is obtained when traffic v is allocated to the advertiser with the largest scaled bid. The bid scaling algorithm is therefore as follows:

For each ad request, compute the scaled bid of every advertiser that meets the requirements, and assign the traffic to the advertiser with the highest value. (For the AdWords problem, the multiplicative form of the scaled bid is used instead.)
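The selection step can be sketched as below. The function name, the `mode` switch, and the rule of serving nothing when no scaled bid is positive are assumptions made for illustration; how the factors themselves are chosen is a separate question that this sketch does not address.

```python
def bid_scaling_choice(bids, factors, mode="display"):
    """Pick the eligible advertiser with the highest scaled bid.

    bids:    {advertiser: bid for this request}, eligible advertisers only
    factors: {advertiser: scaling factor}
    mode:    "display" subtracts the factor, "adwords" multiplies by it
    """
    if not bids:
        return None

    def scaled(u):
        if mode == "display":
            return bids[u] - factors[u]
        return bids[u] * factors[u]

    best = max(bids, key=scaled)
    # Assumption: leave the request unserved if no scaled bid is positive.
    return best if scaled(best) > 0 else None

bids = {"a": 3.0, "b": 2.5}
print(bid_scaling_choice(bids, {"a": 0.0, "b": 0.0}))   # zero factors: plain highest bid
print(bid_scaling_choice(bids, {"a": 1.5, "b": 0.2}))   # scaling changes the winner
```

With all factors at zero the rule reduces to picking the highest bid; a larger factor on an advertiser makes that advertiser less likely to win.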

The core of the bid scaling algorithm is how to choose the scaling factor. The greedy algorithm can also be regarded as a bid scaling algorithm: it uses one value of the factor while the advertiser's budget is not exhausted and a different value once the budget is exhausted.

J. Feldman et al., in the paper "Online Ad Assignment with Free Disposal", present a way of choosing the factor that, in the worst case and when advertiser budgets are large, achieves a competitive ratio of 1 - 1/e. The factor is chosen as follows:

SHALE algorithm

None of the algorithms examined so far takes the distribution of arriving traffic into account. In a real environment we have some ability to forecast traffic, and the SHALE algorithm uses traffic forecasts together with an offline allocation plan to guide the online allocation. The SHALE algorithm also treats the traffic allocation problem somewhat differently from the bid scaling algorithm (although in essence it can still be classified as a bid scaling algorithm): its objective no longer considers the advertiser's bid and instead focuses on the penalty for unfulfilled contracts. The mathematical model is as follows:

In the model above, traffic is divided into units according to its attributes, and the traffic of each unit is forecast offline. The inputs are the forecast traffic of traffic unit i and the contracted amount of traffic for advertiser j. The sum of the traffic that meets the requirements of advertiser j is used to measure how uniformly traffic is distributed to advertiser j. The solution gives the fraction of traffic unit i that is allocated to advertiser j.
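The model itself did not survive in this text. The sketch below follows the formulation published for SHALE, written from memory, so it should be read as an assumption and checked against the original paper. The symbols s_i (forecast traffic of unit i), d_j (contracted traffic of advertiser j), S_j (total eligible traffic for j), θ_j (target fraction), and x_{ij} (fraction of unit i given to j) correspond to the quantities described above. The weights V_j, penalties p_j, under-delivery u_j, and eligibility sets Γ are not named in this text.

```latex
\begin{aligned}
\min_{x, u}\quad & \frac{1}{2} \sum_{j} \sum_{i \in \Gamma(j)} s_i \frac{V_j}{\theta_j}
                   \left( x_{ij} - \theta_j \right)^2 + \sum_{j} p_j\, u_j \\
\text{s.t.}\quad & \sum_{i \in \Gamma(j)} s_i\, x_{ij} + u_j \ge d_j && \forall j \\
& \sum_{j \in \Gamma(i)} x_{ij} \le 1 && \forall i \\
& x_{ij} \ge 0,\; u_j \ge 0
\end{aligned}
\qquad \text{where } S_j = \sum_{i \in \Gamma(j)} s_i, \quad \theta_j = \frac{d_j}{S_j}
```

The quadratic term pulls each x_{ij} toward the uniform target θ_j, and the linear term charges a penalty for every unit of contract j that is left undelivered.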

The SHALE algorithm therefore consists of two parts. The offline planning part uses the forecast traffic of each traffic unit to compute the allocation fractions, and the online serving part uses those fractions to guide traffic allocation. Note that the bid scaling algorithm needs to store only one number per advertiser to allocate online, whereas a naive SHALE implementation has to store an allocation fraction for every pair of traffic unit and advertiser, which greatly increases the amount of data kept per advertiser and the memory consumed. To solve this problem, the SHALE algorithm borrows the idea of bid scaling: using the KKT conditions, it stores only one value per advertiser and recovers the allocation fractions online, so the allocation process does not consume much memory.
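How one stored value per advertiser can be enough is easiest to see in a sketch. It assumes a quadratic objective of the form (1/2) Σ s_i (V_j / θ_j)(x_ij - θ_j)^2 with demand duals α_j and a supply dual β_i per impression; stationarity then gives x_ij = max(0, θ_j (1 + (α_j - β_i) / V_j)), with β_i ≥ 0 chosen so that the fractions for one impression sum to at most 1. The function name and the bisection search are illustrative choices, not taken from the article.

```python
def allocation_fractions(alphas, thetas, weights, tol=1e-9):
    """Recover the serving fractions for one impression from per-contract duals.

    alphas, thetas, weights: one entry per contract eligible for this impression
    (stored dual alpha_j, target fraction theta_j, weight V_j > 0).
    """
    def fractions(beta):
        return [max(0.0, theta * (1.0 + (alpha - beta) / weight))
                for alpha, theta, weight in zip(alphas, thetas, weights)]

    x = fractions(0.0)
    if sum(x) <= 1.0:
        return x                      # supply constraint is slack, beta = 0
    # At beta = alpha_j + V_j the j-th fraction is zero, so this bound is feasible.
    lo, hi = 0.0, max(a + w for a, w in zip(alphas, weights))
    while hi - lo > tol:              # bisection: the sum is non-increasing in beta
        mid = (lo + hi) / 2.0
        if sum(fractions(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return fractions(hi)

print(allocation_fractions([0.0, 0.0], [0.5, 0.25], [1.0, 1.0]))  # fits, returned as is
print(allocation_fractions([0.0, 0.0], [0.8, 0.6], [1.0, 1.0]))   # scaled down to sum to 1
```

At serving time the ad server would pick contract j with probability x_ij, so only α_j, θ_j, and V_j need to be kept per contract rather than a fraction for every traffic unit.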

If the forecasting algorithm predicts the traffic of each traffic unit accurately, the solution of the SHALE algorithm is optimal. The main question is therefore whether SHALE is robust when the forecast contains errors. Here the arrival order can be treated as an i.i.d. sequence from a known distribution. In practice, the traffic of each unit can be modeled as Gaussian: the expected traffic can be predicted, but for units whose traffic fluctuates strongly (large variance) the prediction error may be large. The SHALE algorithm takes prediction error into account; in general, shortening the interval between offline planning runs and adding a feedback correction mechanism (such as a PID controller) greatly reduces the impact of prediction error. Unfortunately, no theoretical analysis gives the competitive ratio that SHALE achieves in this setting. J. Feldman et al., in the paper "Online Stochastic Matching: Beating 1-1/e", present a bipartite matching algorithm guided by an offline plan and give a competitive ratio of 0.67, but it is almost impossible to extend their analysis to the SHALE algorithm, so no theoretical result on its competitive ratio exists so far.

Pattern Generation algorithm

The model of the SHALE algorithm has a flaw: it does not consider frequency capping, so even when traffic is forecast accurately the SHALE allocation is not optimal. Some online modifications can improve SHALE when frequency requirements apply. The following method improves on SHALE by using a pattern generation algorithm. The pattern generation algorithm generates ad patterns and serves each user according to a pattern, which ensures that advertisers' frequency requirements are met. A pattern here is a sequence of ads to serve, such as:

Given two ad delivery patterns, if a user is assigned pattern 1, the user is shown ad A on the first visit, B on the second, C on the third, A on the fourth, and B on the fifth. The pattern generation algorithm assumes that the number of visits each user makes to a site within a frequency capping period can be predicted, and it controls ad frequency through the patterns it builds. The mathematical model of the pattern generation algorithm is similar to the SHALE model, as follows:

The model takes both the amount of traffic and the number of unique users into account. Its symbols denote, in turn: the library of all patterns; the type of user visits in traffic unit i (a user's visit type is determined by the user's visits over a period of time); the number of users in traffic unit i who are served with pattern n; the number of times ad j appears in pattern n; whether the number of times ad j appears in pattern n is within the required frequency range; and the number of unique users required by advertiser j. A further term describes the quality of pattern n, such as ad diversity and frequency control. The model can also be modified to fit business requirements.

The question now is how to solve this model. As with SHALE, forecast traffic is used for offline planning. Unlike SHALE, however, the pattern generation algorithm needs forecasts not only of the traffic in each traffic unit but also of the number of users of each type in each unit and of the type each user belongs to. In addition, its objective function is much more complex. SHALE's offline plan is only a quadratic programming problem and is easy to solve, whereas the pattern generation model involves integer programming: a pattern library has to be constructed and the whole objective optimized over it. Constructing every possible pattern is clearly infeasible (exponential complexity). Fortunately, mature algorithms exist for this type of optimization problem; interested readers can look up column generation, which is not described in detail here. In short, column generation can produce all the useful patterns, which are then assigned to each type of user.

During online serving, when a user first visits the site on a given day, the user's type and traffic unit are used to look up the share of users to be served with each pattern n. A pattern n is selected according to these shares as the user's serving pattern and recorded; when the user visits again, ads continue to be served according to pattern n.
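The online step can be sketched as follows. The class name, the layout of the `shares` table, and the behaviour once a pattern is used up (returning `None`) are assumptions; the text does not say what is served after the last position of a pattern. Pattern 1 is the sequence A, B, C, A, B from the example above.

```python
import random

class PatternServer:
    """Assigns each user a pattern on the first visit and replays it on later visits."""

    def __init__(self, patterns, shares, rng=None):
        self.patterns = patterns      # {pattern_id: [ad, ad, ...]}
        self.shares = shares          # {(user_type, traffic_unit): {pattern_id: share}}
        self.rng = rng or random.Random()
        self.assigned = {}            # user_id -> [pattern_id, visits served so far]

    def serve(self, user_id, user_type, traffic_unit):
        if user_id not in self.assigned:
            # First visit: sample a pattern according to the offline plan's shares.
            dist = self.shares[(user_type, traffic_unit)]
            ids = list(dist)
            chosen = self.rng.choices(ids, weights=[dist[p] for p in ids], k=1)[0]
            self.assigned[user_id] = [chosen, 0]
        state = self.assigned[user_id]
        pattern = self.patterns[state[0]]
        if state[1] >= len(pattern):
            return None               # pattern exhausted (behaviour assumed)
        ad = pattern[state[1]]
        state[1] += 1
        return ad

server = PatternServer(
    patterns={1: ["A", "B", "C", "A", "B"]},
    shares={("heavy", "unit_1"): {1: 1.0}},   # hypothetical user type and traffic unit
    rng=random.Random(0),
)
served = [server.serve("user_42", "heavy", "unit_1") for _ in range(6)]
print(served)
```

The `assigned` dictionary is the per-user memory that the text later identifies as a cost of this approach.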

The pattern generation algorithm does not solve every problem in ad delivery. One problem is that it has to remember a serving pattern for each user, which consumes a great deal of memory. Another is that it assumes each user belongs to only one traffic unit, whereas a real user may belong to several. The theoretical analysis of the algorithm is also very complex, so its competitive ratio is still unknown.

Conclusion

Research on ad allocation algorithms has been active, and new work is published every year. This article summarizes allocation algorithms applied to guaranteed delivery for display ads. Some algorithms are not covered because their theory is too complex (for example, using reinforcement learning to learn the factors of the bid scaling algorithm and so obtain a near-optimal allocation). There is also a lot of important work on allocation algorithms for the AdWords problem; some of it can be extended to display ads, but some cannot be extended easily, even though the two problems look very similar. I will expand this document once I have had time to work through these questions.
