Marketing is the process of discovering or uncovering the needs of prospective consumers and businesses, optimizing and customizing one's own goods and services accordingly, and then promoting, distributing and selling those products so as to maximize benefit. For example, a bank can target a user who is on the verge of choosing installment payment with an interest-free coupon or a price reduction, raising its overall installment profit, and can choose the best time and place for ad delivery to improve that user's conversion.
Against the background of big data and "a thousand faces for a thousand people" personalization, marketing has been upgraded to "precision marketing": each user's needs are analyzed and served in a finer, more personalized way, producing a win for all sides in which users are satisfied and both advertisers and the platform benefit. A marketing algorithm generally has three steps: 1) select the target audience, 2) recall and rank, 3) determine the final strategy by maximizing ROI or profit uplift under budget constraints.
Marketing algorithms differ from traditional recommendation and recall, which can be evaluated with CTR, accuracy, AUC and similar metrics. Evaluating marketing is more complex, for the following four reasons:
1) The goal is to improve overall profit and customer satisfaction, and may even include long-term benefit, so it is not a simple binary classification or regression problem.
2) Marketing usually involves multiple decisions and a more complex policy process than recommendation, and the final effect depends on optimizing the strategy and the process as a whole.
3) Marketing generally operates under a budget constraint, so the input-output ratio must be considered.
4) The root cause: unlike in classification, it is almost impossible to know whether the current decision is optimal for a given individual, because that individual's response under any decision other than the one actually taken cannot be observed. For example, when issuing interest-free coupons, we can only issue one type of coupon to an individual at a given time, and we cannot know beforehand whether that decision is optimal. Repeated experiments on the same individual are not possible. Therefore, even data from a randomized experiment is unlabeled in a probabilistic sense: even on the training set, the quantity we are trying to predict (such as the response rate under the optimal decision) is unknown.
For cost reasons, running a large number of random experiments is not feasible. Yet when running an A/B test, we still need to evaluate the marketing model with a clear, easy-to-explain business metric comparable to AUC. Is there a more accurate way to evaluate the performance of marketing algorithms?
In this context, we implemented a multiple-treatment (multi-decision) evaluation algorithm based on the idea in the paper "Uplift Modeling with Multiple Treatments and General Response Types". The algorithm is simple, theoretically complete, and as intuitive and comparable across models as AUC. It is also model-free: it does not depend on the algorithm or strategy being evaluated.
Below we give an easy-to-understand business interpretation, add the necessary conditions for using it, and fill in the theoretical derivation.
Simple illustration of multi-treatment
We start with the simplest possible marketing plan to illustrate the principle of the algorithm.
Every user can receive one of two strategies: original price or reduced price. Before the algorithm experiment, a first round of small-traffic random experiments is run in which each strategy is delivered to 50% of users. We then propose one of the simplest threshold strategies, "reduce the price for low-response users and keep the original price for high-response users", and need to evaluate the profit margin this strategy would achieve online.
In the figure, the vertical axis is the response rate and the horizontal axis is the random axis. The random experiment and the strategy overlap in two regions (figure 3). Because the overlap was actually observed in the historical data, the profit margin measured on the overlapping part can be used to represent the profit margin of the whole strategy.
The whole idea is very simple: use the characteristics and statistics of an unbiased partial sample to represent the whole, which is little more than junior high school mathematics. As long as the sample is large enough, the historical backtest will come very close to the actual online effect.
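The two-strategy example can be sketched as a small simulation. This is only an illustration under stated assumptions: the profit model, the margins (10.0 and 6.0), the 0.3 lift from a price reduction and the 0.5 threshold are all hypothetical and are not taken from any real experiment. It backtests the threshold strategy on the samples where the 50/50 random assignment happens to agree with the strategy, and compares the result with running the strategy on everyone.

```python
import random

def profit(score, treatment, rng):
    """Toy profit model (hypothetical). Treatment 0 = original price, 1 = reduced price."""
    if treatment == 0:
        p_buy, margin = score, 10.0
    else:
        p_buy, margin = min(1.0, score + 0.3), 6.0
    return margin if rng.random() < p_buy else 0.0

def threshold_policy(score, threshold=0.5):
    """Reduce the price for low-response users, keep the original price otherwise."""
    return 1 if score < threshold else 0

def backtest(n, seed=0):
    """Average profit over the samples where the random experiment matches the policy."""
    rng = random.Random(seed)
    total, matched = 0.0, 0
    for _ in range(n):
        score = rng.random()          # hypothetical response score in [0, 1)
        assigned = rng.randrange(2)   # 50/50 random experiment
        y = profit(score, assigned, rng)
        if assigned == threshold_policy(score):
            total += y
            matched += 1
    return total / matched, matched

def online(n, seed=1):
    """Average profit when the policy is actually applied to every user."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        score = rng.random()
        total += profit(score, threshold_policy(score), rng)
    return total / n
```

Under this toy model the expected profit per user of the threshold strategy works out to 5.4, so both `backtest(n)[0]` and `online(n)` should approach that value as `n` grows, even though the backtest only uses about half of the experimental samples.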
Although the idea is simple, it is important to pay attention to some conditions that are easily overlooked:
1) The number of random samples, that is, the area of each block, must be large enough that abnormal or unusual samples do not distort the overall evaluation.
2) The random experiment must be truly random, and the delivery ratio of each treatment must be consistent across users, but the treatments do not need to be delivered in equal proportions. The reason is explained in the theoretical derivation.
3) The treatments must be discrete, far fewer in number than the samples, and independent of one another (although it is hard to imagine a case in which treatments are not independent).
4) Recent historical data must be used for the result to be reasonably accurate, and the environment must be reasonably stationary; otherwise backtesting on historical data is meaningless.
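The first condition can be checked mechanically before trusting a backtest. The sketch below counts, for each treatment, how many experimental samples fall into the block where the random assignment matches the strategy, and flags the treatments whose block is too small. The function names and the `min_count` threshold are hypothetical; a suitable minimum depends on the variance of the response.

```python
from collections import Counter

def matched_counts(samples, policy):
    """Count, per treatment, the samples whose random assignment matches the policy.

    samples: iterable of (x, t, y) with features x, assigned treatment t, response y.
    policy:  function mapping x to the treatment the strategy would choose.
    """
    counts = Counter()
    for x, t, _y in samples:
        if policy(x) == t:
            counts[t] += 1
    return counts

def sparse_treatments(samples, policy, treatments, min_count=1000):
    """Return the treatments whose matched block has fewer than min_count samples."""
    counts = matched_counts(samples, policy)
    return [t for t in treatments if counts[t] < min_count]
```

If `sparse_treatments` returns a non-empty list, the evaluation for those treatments rests on too few observations, and collecting more experimental traffic is probably safer than reading the backtest as is.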
We now extend this to the more complex case of multiple treatments. For a multi-segment decision on the response rate, it is not hard to show that the decision boundaries must be straight and parallel to each other in the graph, and orthogonal to the random axis (otherwise the strategy would be able to guess the random assignment), as the figure shows.
If the strategy must consider not only the response rate but also other attributes such as occupation, add an axis for occupation. It must also be orthogonal to the random axis, so that together the axes form a high-dimensional linear space. The hit regions become small blocks in that space, and aggregating those blocks still represents the profit margin of the overall strategy. Note that occupation does not need to be orthogonal to the response rate.
Theoretical derivation
An uplift model splits the entire feature space into multiple subspaces, each of which corresponds to one strategy. In a randomized trial, we can obtain the probability that a sample randomly falls into the subspace matching its treatment (a hit), together with its response. The true response over the whole feature space can therefore be estimated from the responses in the hit subspaces.
In a random experiment, let K be the number of possible treatments and let \(p_t\) be the probability that the assigned treatment equals t. In any meaningful setting we can guarantee \(p_t > 0\) for \(t = 0, \ldots, K\).
The paper gives the following lemma:
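The statement of the lemma did not survive in this text. What follows is a reconstruction from the definitions used in this section, so the notation (the indicator \(\mathbb{1}\{\cdot\}\) and the symbol \(h\) for the uplift model) is an assumption and should be checked against the paper. Given an uplift model \(h\) that maps features to a treatment, define the random variable

\[ Z = \sum_{t=0}^{K} \frac{1}{p_t}\, Y\, \mathbb{1}\{h(X) = t\}\, \mathbb{1}\{T = t\} \]

Then

\[ E[Z] = E[Y \mid T = h(X)] \]

A short justification, assuming the treatment \(T\) is assigned at random independently of \(X\): for each \(t\),

\[ E\big[Y\, \mathbb{1}\{h(X) = t\}\, \mathbb{1}\{T = t\}\big] = p_t\, P(h(X) = t)\, E[Y \mid h(X) = t,\ T = t] \]

so dividing by \(p_t\) and summing over \(t\) gives

\[ E[Z] = \sum_{t=0}^{K} P(h(X) = t)\, E[Y \mid h(X) = t,\ T = t] = E[Y \mid T = h(X)] \]

The factor \(1/p_t\) is what allows the treatments to be delivered in unequal proportions, as long as every \(p_t\) is known and positive.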
For a set of randomized experimental data \(S_N = \{(x^{(i)}, t^{(i)}, y^{(i)}),\ i = 1, \ldots, N\}\), computing \(z^{(i)}\) is easy. If the treatment predicted for the \(i\)-th sample exactly matches the treatment it actually received, then \(z^{(i)} = y^{(i)} / p_t\), that is, the actual response is scaled by the inverse of the probability of that treatment; otherwise \(z^{(i)} = 0\). Since the sample mean is an unbiased estimate of the expected value, we have the following conclusion: \(\bar{z} = \frac{1}{N} \sum_{i=1}^{N} z^{(i)}\) is an unbiased estimate of \(E[Y \mid T = h(X)]\), the expected response when every sample receives the treatment \(h(X)\) chosen by the uplift model \(h\).
Further, a confidence interval for the mean of \(z\) can be computed to tell us how much confidence to place in the estimate of \(E[Y \mid T = h(X)]\). For details, refer to an article on significance testing (for example, this article).
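The estimator and its confidence interval can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names, the three-treatment toy response model in `demo`, and the use of the normal critical value 1.96 for an approximate 95% interval are all assumptions made for this example.

```python
import math
import random

def ipw_scores(samples, policy, probs):
    """z_i = y_i / p_t when the policy's treatment equals the assigned one, else 0.

    samples: list of (x, t, y); policy: function x -> treatment;
    probs:   probs[t] is the assignment probability of treatment t.
    """
    return [y / probs[t] if policy(x) == t else 0.0 for x, t, y in samples]

def mean_with_ci(z, z_crit=1.96):
    """Sample mean of z with a normal-approximation confidence interval."""
    n = len(z)
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / (n - 1)  # unbiased sample variance
    half = z_crit * math.sqrt(var / n)
    return mean, (mean - half, mean + half)

def demo(n=200_000, seed=0):
    """Toy randomized experiment with three treatments delivered at unequal rates."""
    rng = random.Random(seed)
    probs = [0.5, 0.3, 0.2]
    policy = lambda x: 0 if x < 0.3 else (1 if x < 0.7 else 2)
    samples = []
    for _ in range(n):
        x = rng.random()                                # hypothetical feature
        t = rng.choices(range(3), weights=probs)[0]     # random assignment
        y = 1.0 + t * x + rng.gauss(0.0, 0.1)           # hypothetical response
        samples.append((x, t, y))
    return mean_with_ci(ipw_scores(samples, policy, probs))
```

For this toy response model, the expected response under the policy can be computed by hand as 1.71, so the mean returned by `demo()` should land close to that value and the interval should narrow as `n` grows. Note that the treatments are delivered with probabilities 0.5, 0.3 and 0.2, which illustrates that the delivery proportions need not be equal.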
How to calculate and model long-term earnings?
Although the idea is simple and easy to implement, multi-treatment evaluation only addresses the evaluation of short-term decisions. Users and the environment change over time: after a user has received several treatments (such as price reductions, rewards, red envelopes or price increases), their mindset changes, and maximizing short-term income does not guarantee long-term benefit.
Suppose a strategy has been run several times and we can observe the long-run sequence of treatments and responses for a single user or group. How should this temporal information be modeled with reinforcement learning? How can long-run income be evaluated accurately and efficiently? These are all very interesting questions.
If you have any questions, you are welcome to join the discussion.
A marketing evaluation algorithm based on multiple treatments