The shallow unspoken rules of cheating traffic in mobile advertising


Traffic cheating is pervasive in the Internet advertising industry and has become an open secret.

Hegel's philosophical proposition "what exists is reasonable" is often abused. Its original meaning is closer to "what is reasonable conforms to a certain law", and since Hegel's dialectics holds that nothing is unchanging, the proposition can also be read as "whatever exists deserves to perish". I do not know when false traffic will perish; when it does, this article will lose its meaning. But since this article exists now, it is reasonable (rational).

Today's topic is cheating in mobile traffic. The forms and techniques of cheating (and anti-cheating) on mobile differ from those on PC. On PC the user ID is usually the browser cookie, while on mobile the identifier is usually the IDFA (Apple), the Google advertising ID (overseas Android), or the IMEI (domestic Android). Mobile apps offer more signals (opportunities) for judging authenticity, while PC browsers are more restricted.

This article gives a superficial introduction to some of the unspoken rules behind false traffic. Much of the analysis is very simple, hence the name "shallow unspoken rules".

1. Terminology of false traffic

The cheating traffic discussed in this article goes by many names, each with a different emphasis.

    1. Fraud traffic (cheating traffic): plain and easy to understand; traffic that is deceptive.

    2. Non-human traffic: refers to bot traffic and machine simulation. Some hijacked traffic falls into a gray area, so the term is not entirely accurate.

    3. Incentivized traffic (incent traffic): this traffic comes from real people, but they are usually drawn in by some inducement (misleading page design, lotteries, red envelopes, game point cards, etc.), so incentivized traffic usually converts poorly.

    4. Invalid traffic: a term used to avoid the overly sensitive word "cheating" (fraud), so it is less likely to offend. Invalid traffic can be intentional or unintentional.

    5. Abnormal traffic: similar to invalid traffic, but emphasizes that the traffic is anomalous.

These terms overlap heavily (there is no need to separate them strictly); which one is used depends mostly on the scenario and the role. For example, some developers care most about non-human traffic (bot traffic); some performance-monitoring companies care more about metered traffic, so they prefer "invalid traffic"; and earlier on, all of this was called fraud traffic, so "cheating traffic" is also common.

2. Mobile advertising business model diagram:

Where there is money, there is cheating; as the saying goes, whoever often walks by the river will get wet shoes. Follow the money and the relationships of interest become clear. The paymaster sits upstream, and every intermediary downstream wants to grow its revenue, so each link optimizes toward the same goal: maximizing revenue within what the paymaster will tolerate. This is a single-objective optimization problem under constraints.

Common motives for traffic cheating:

1. Media: fabricate false traffic to increase revenue

2. Advertising agencies/sales: run false traffic to fulfill guaranteed contracts and increase revenue

3. Trading platforms: review false supply loosely to increase revenue

4. Users: chase incentives (red envelopes, point cards, etc.) and generate traffic with little or no effect

5. Advertisers: maliciously drain competitors' budgets

3. The current share of false traffic in mobile advertising

Because false traffic is so complex and sensitive, everyone is very cautious when reporting numbers. Even so, the figures in published reports vary widely, and the reliability of each cannot be verified here; treat them as reference only.


1. ANA (Association of National Advertisers): "On trading platforms with a bad reputation, cheating traffic reaches 25-50%; on those with a good reputation it is usually below 10%."

2. Appflyer: In 2016, Applift reported that 34% of mobile traffic was at risk of fraud (22% suspicious, 12% high risk). The share of false traffic is larger on Android than on iOS, and the higher the system version, the lower the false share.

3. Miaozhen Systems: In 2016, vertical websites and ad-network (alliance) media had the highest shares of abnormal traffic. Abnormal impressions on vertical media rose to 24.93%, and abnormal clicks were most pronounced on ad-network media, at up to 71.07%.

4. AdMaster: The overall share of invalid traffic in 2016 was 30.2%. The second half of the year deteriorated slightly, with invalid traffic up 3.7%.

4. Classification of mobile false traffic

There are many ways to classify false traffic, and every scheme has gray areas. Below I try to classify it by the basic principle of the cheating, mainly for mobile scenarios. For a more comprehensive and systematic classification, see Liu Peng's "Internet advertising cheating terrorize".

Traffic can also be classified into four quadrants along two axes: the device and the human.



5. Mobile anti-false traffic model

Before discussing how to deal with mobile false traffic, we should first look at the main black-hat techniques of mobile cheating, so as to know the enemy.

Mobile cheating involves many black-hat techniques, including:

    1. Emulators: BlueStacks, Andywin, Genymotion

    2. Spoofers: constantly modify the device's IP, IMEI, MAC address, etc.

    3. Proxies: gateways that modify the ISP, IP, UA, device type, etc.

    4. Apple: there is no emulator, so cheating is done mainly through hardware and software simulation

    5. Incentivized traffic (incent traffic): traffic from real people, but with a poor conversion rate

...

Preventing mobile false traffic is indeed a complex problem. It is not that advanced anti-cheating technology is lacking, nor that the problem is not serious enough. There are three main reasons:

    • Accurate anti-cheating is costly

    • The interests of the various players are distributed unevenly

    • Cheaters enjoy high returns at low risk, and in most cases they are not punished.

For example, Umeng+ recently sued an app volume-brushing company in court, on the grounds that it harmed the correctness and fairness of Umeng's statistics. The court has not yet ruled, and I cannot judge the merits of the suit. Consider an analogy: a wall-painting company paints every roadside billboard with one company's name, and then a brand-influence ranking company sues the painting company for seriously harming the fairness of its brand ranking. This logic never feels quite right to me. I also really dislike app-brushing companies, but the angle from which to criticize and punish them deserves more discussion in law and regulation.

Setting ethics and regulation aside and talking only about technology, I think false traffic, especially on mobile, can be handled with the following model.

Hardware: a mobile phone exposes more hardware information, so hardware information can be used to block false traffic that does not come from a phone (i.e., bots, servers, etc.). However, although mobile systems provide standard functions for reading hardware information such as the IMEI and MAC address, these functions are easily tampered with by common software tools. In addition, this hardware identification cannot be validated on the server side. Therefore, the first step in fighting false traffic is often to identify the source of the traffic: a real phone, or an emulator, server-side simulation, or some other tool.

Rule policy: rules are often the simplest and most effective defense mechanism. For example, set the likelihood of false traffic to high the first time a new traffic source is seen, or firmly block a source that exceeds x visits per day. The rules are many, and they are constantly added and modified, until in the end the matching order of the rules becomes an art in itself. Novice cheaters often get caught by these rules.
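The ordered-rule idea above can be sketched as follows. This is only a minimal illustration, not any platform's actual rule set: the `Request` fields, the rule names, the verdict strings, and the threshold `MAX_VISITS_PER_DAY` (standing in for the "x" in the text) are all assumptions. Rules are tried in order and the first match wins, which is why the matching order matters.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Request:
    source_id: str       # hypothetical identifier of the traffic source
    first_seen: bool     # True if this source has never been seen before
    visits_today: int    # number of visits from this source so far today


@dataclass
class Rule:
    name: str
    matches: Callable[[Request], bool]
    verdict: str         # "block" or "suspect"


MAX_VISITS_PER_DAY = 200  # assumed value for the "x" in the text

# Order matters: the first matching rule decides the verdict.
RULES: List[Rule] = [
    Rule("too_many_visits", lambda r: r.visits_today > MAX_VISITS_PER_DAY, "block"),
    Rule("new_source", lambda r: r.first_seen, "suspect"),
]


def evaluate(request: Request, rules: List[Rule] = RULES) -> str:
    """Return the verdict of the first matching rule, or "pass" if none match."""
    for rule in rules:
        if rule.matches(request):
            return rule.verdict
    return "pass"
```

With this ordering, a brand-new source with 500 visits is blocked rather than merely marked suspect, because the volume rule is listed first; swapping the two rules would change that outcome.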

Machine learning: train a classifier on training data, learning weights for a set of features, and then use it to classify traffic. Teams that work on false-traffic identification tend to go deeper and deeper in this direction: more features, more data, fresher data, and experiments with more models. This work is very "bitter": be strict and revenue may suffer; be loose and advertisers complain that ROI is falling. Striking this balance tends to leave you blamed by both sides.

Other measures: not all cheating has to be countered by rigid technical means; in fact there are many other methods. For example, increasing the penalties on media raises the cost of cheating and so reduces the cheating rate. Another interesting anti-cheating method is the honey ad (sometimes called a bluff ad). These ads have known characteristics (for example, a very low expected click-through rate), so by checking whether the observed CTR matches the expectation you can judge whether the traffic is machine traffic (a machine cannot tell that these ads are not meant to be clicked).





6. Schools of technology for identifying false traffic

This section focuses on identifying false traffic with machine learning; much of it can be found in the relevant papers.

6.1 Classification methods

Most algorithm engineers approach false traffic as a classification problem: build a classifier, find a variety of features, and collect training data for known false traffic (for example, traffic with an anomalous conversion rate). This method depends heavily on the false-traffic samples: different samples easily train different models, and overfitting is easy. New patterns of false traffic are not easy to catch in time.
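As a rough sketch of the classification approach, here is a tiny logistic regression trained by batch gradient descent in plain Python. Everything specific here is an assumption for illustration: the two features (a click rate and a conversion rate, both scaled to 0..1), the six made-up training samples, and the hyperparameters. A real system would use far more features and data.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fraud_probability(x: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Probability that a traffic sample is false, under the trained model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples: List[List[float]], labels: List[int],
          learning_rate: float = 0.5, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit logistic regression weights with batch gradient descent on log loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = fraud_probability(x, weights, bias) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Made-up samples: [click rate, conversion rate]; label 1 = false traffic.
SAMPLES = [
    [0.10, 0.90], [0.20, 0.80], [0.15, 0.70],   # normal: few clicks, good conversion
    [0.90, 0.10], [0.80, 0.00], [0.95, 0.05],   # false: many clicks, no conversion
]
LABELS = [0, 0, 0, 1, 1, 1]

WEIGHTS, BIAS = train(SAMPLES, LABELS)
```

Because the model only knows the patterns present in `SAMPLES`, it illustrates the weakness noted above: a new kind of false traffic that looks unlike the labeled examples will not be scored as false.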

Logistic regression and Bayesian methods are common; see the following paper:

"Measuring and Fingerprinting Click-Spam in Ad Networks", Vacha Dave et al.

6.2 Anomaly detection (anomaly-based detection)

Many academic papers discuss identifying abnormal traffic with clustering. On mobile, one can track the historical behavior of a user identity (Internet behavior, ad-request behavior, browsing behavior, and especially usage across media) to judge whether the traffic follows the trajectory of normal phone use.

      1. Analyze anomalies against historical information and the industry average

      2. Find change points over time

      3. This technique is widely used in financial and trading anti-fraud, and the methods are very diverse

      4. Common methods include clustering, classification, and content analysis
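The first two points can be sketched with a simple z-score test: compare a new value with the mean and standard deviation of its history, and slide that test along a series to find sudden changes. The threshold of 3 standard deviations, the window size, and the function names are assumptions chosen for illustration; production systems use more robust statistics.

```python
import statistics
from typing import List, Sequence


def z_score(history: Sequence[float], value: float) -> float:
    """How many standard deviations `value` lies from the mean of `history`.

    `history` needs at least two points, as required by statistics.stdev.
    """
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return 0.0 if value == mean else float("inf")
    return (value - mean) / stdev


def is_anomalous(history: Sequence[float], value: float, threshold: float = 3.0) -> bool:
    """Flag a value that deviates from its history by more than `threshold` sigmas."""
    return abs(z_score(history, value)) > threshold


def find_change_points(series: Sequence[float], window: int = 5,
                       threshold: float = 3.0) -> List[int]:
    """Return the indices whose value is anomalous relative to the previous `window` values."""
    return [
        i for i in range(window, len(series))
        if is_anomalous(series[i - window:i], series[i], threshold)
    ]
```

For example, with daily click counts of `[100, 102, 98, 101, 99, 300, 100]`, only the spike at index 5 is reported; the return to 100 afterwards is not flagged because the spike itself has widened the window's spread.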

"Using Co-Visitation Networks for Classifying Non-Intentional Traffic", Ori Stitelman et al., Dstillery, 2013.

6.3 Automated checking of app ad cheating

There are many apps on the market; which of them are sources of false traffic? Is there an automated way to check? Microsoft has a paper on this work: it runs the app automatically and analyzes the app's ads for problems such as too many ads, ads that are too small, and overlapping ads.

2014, "DECAF: Detecting and Characterizing Ad Fraud in Mobile Apps"
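The three layout problems named above (too many ads, ads that are too small, overlapping ads) can be checked with simple rectangle geometry once the ad positions on a screen are known. The sketch below is not the paper's implementation; the `AdRect` type, the violation names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Sequence


@dataclass(frozen=True)
class AdRect:
    x: int       # left edge in pixels
    y: int       # top edge in pixels
    width: int
    height: int


def overlaps(a: AdRect, b: AdRect) -> bool:
    """True if the two rectangles share any area (touching edges do not count)."""
    return (a.x < b.x + b.width and b.x < a.x + a.width and
            a.y < b.y + b.height and b.y < a.y + a.height)


def layout_violations(ads: Sequence[AdRect], max_ads: int = 3,
                      min_width: int = 50, min_height: int = 50) -> List[str]:
    """List the layout problems found among the ads shown on one screen."""
    violations = []
    if len(ads) > max_ads:
        violations.append("too_many_ads")
    if any(ad.width < min_width or ad.height < min_height for ad in ads):
        violations.append("ad_too_small")
    if any(overlaps(a, b) for a, b in combinations(ads, 2)):
        violations.append("overlapping_ads")
    return violations
```

Two 320x50 banners placed at y=0 and y=40 share a 10-pixel strip, so that screen would be reported with `overlapping_ads`.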

6.4 Audit

Auditing is a traditional anti-fraud method that is still effective, and it directly helps in investigating volume-brushing problems.

      1. Clicks occur on some medium (publisher).

      2. The ad platform/advertiser sends audit requests to the medium to confirm the validity of the earlier clicks (point in time, basic information), and then compares the records.
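The comparison step can be sketched as a reconciliation of two click logs. The data shape (a mapping from click ID to Unix timestamp), the function name, and the allowed clock skew are assumptions for illustration; real audits compare more fields than the timestamp.

```python
from typing import Dict, List


def audit_clicks(platform_clicks: Dict[str, int],
                 publisher_clicks: Dict[str, int],
                 max_skew_seconds: int = 5) -> List[str]:
    """Return the IDs of platform clicks that the publisher cannot confirm.

    A click is unconfirmed if the publisher has no record of it, or if the
    two recorded timestamps differ by more than `max_skew_seconds`.
    """
    unconfirmed = []
    for click_id, timestamp in platform_clicks.items():
        confirmed = publisher_clicks.get(click_id)
        if confirmed is None or abs(confirmed - timestamp) > max_skew_seconds:
            unconfirmed.append(click_id)
    return sorted(unconfirmed)
```

A high share of unconfirmed clicks for one medium is a reason to investigate that medium further, not proof of cheating by itself.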

6.5 Pseudo-ad verification (honey ads)

      1. The ad platform serves a small percentage of pseudo-ads, such as purely informational notices, which by rights give the user no reason to click.

      2. If the click-through rate of these pseudo-ads is still high, similar to that of other ads, the traffic is problematic.
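The check in step 2 amounts to comparing two click-through rates for the same traffic source. In the sketch below, the ratio threshold, the minimum number of honey-ad impressions, and the function names are assumptions; the idea is only that a honey-ad CTR close to the normal-ad CTR suggests the clicks are not made by people reading the ads.

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate, defined as 0.0 when there are no impressions."""
    return clicks / impressions if impressions else 0.0


def honey_ad_suspicious(honey_clicks: int, honey_impressions: int,
                        normal_clicks: int, normal_impressions: int,
                        ratio_threshold: float = 0.5,
                        min_honey_impressions: int = 100) -> bool:
    """Flag a source whose honey-ad CTR is not clearly lower than its normal-ad CTR."""
    if honey_impressions < min_honey_impressions:
        return False  # too few honey-ad impressions to judge
    honey_ctr = ctr(honey_clicks, honey_impressions)
    normal_ctr = ctr(normal_clicks, normal_impressions)
    if normal_ctr == 0.0:
        return honey_clicks > 0  # clicks only on ads nobody should click
    return honey_ctr >= ratio_threshold * normal_ctr
```

For a source with a 5% CTR on normal ads, a 4.8% CTR on honey ads is flagged, while a 0.1% CTR is not.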

6.6 Identification of the authenticity of the device ID

On mobile devices, verifying the device ID greatly helps identify false traffic. Two things must be established: first, that the ID is a valid ID; second, that the ad request really comes from the device that owns the ID.

There are several kinds of mobile device ID. Domestic Android mainly uses the MD5/SHA-256 hash of the IMEI; the IMEI itself usually carries some basic information from the manufacturer.

Determining whether an ID comes from a real device requires hardware techniques or some analysis of historical data. For example, if the requests for one IMEI come from unstable IP locations (Zhengzhou in the morning, Hangzhou and Nanning in the afternoon, or other unfamiliar places), those ad requests usually contain false traffic. ID verification can therefore draw on auxiliary signals, including access frequency, IP range, browsing behavior, search behavior, app usage, and access time, and can also use the data chain to judge the authenticity of a request.
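Both checks can be sketched briefly. The first applies the Luhn check digit that a raw 15-digit IMEI carries; it only works when the unhashed IMEI is available, and a passing check shows the number is well formed, not that it belongs to the requesting device. The second flags IDs seen in too many cities on one day, as in the Zhengzhou/Hangzhou/Nanning example. The tuple layout `(device_id, day, city)`, the function names, and the limit of two cities per day are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def imei_is_valid(imei: str) -> bool:
    """Check that `imei` is 15 ASCII digits whose last digit is a valid Luhn check digit."""
    if len(imei) != 15 or not (imei.isascii() and imei.isdigit()):
        return False
    total = 0
    # Walk from the check digit leftwards, doubling every second digit.
    for position, char in enumerate(reversed(imei)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def ids_with_location_hopping(requests: Iterable[Tuple[str, str, str]],
                              max_cities_per_day: int = 2) -> Set[str]:
    """Return device IDs seen in more than `max_cities_per_day` cities on one day.

    Each request is a (device_id, day, city) tuple; the ID may be hashed.
    """
    cities: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for device_id, day, city in requests:
        cities[(device_id, day)].add(city)
    return {device_id for (device_id, _day), seen in cities.items()
            if len(seen) > max_cities_per_day}
```

In practice the city would be derived from the request IP, and a fixed per-day limit would produce false positives for travelers, so this signal is better combined with the other signals listed above.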

6.7 Some anti-fraud papers

Here I recommend a paper from my former employer, "Click Fraud Detection: Adversarial Pattern Recognition over 5 Years at Microsoft". It describes some of Microsoft's thinking before 2014: the rule bitmap, fighting fraud with models, and how to define a metric (value per click). I had the privilege of working with some of the authors, and I truly felt how hard and thankless anti-fraud work is.

I have collected some papers; interested readers can download them from my homepage:

Internet home-Advertising technology Download-Ouyang

7. Anti-cheating technology companies

1. Integral Ad Science

An anti-cheating company founded in 2009 that covers brand safety and more. It cooperates extensively with Nielsen; see http://integralads.com for details.

2. Solve Media

Specializes in CAPTCHA ("Completely Automated Public Turing test to tell Computers and Humans Apart"), that is, verifying that an action is performed by a human rather than by a machine.

3. DoubleVerify

Mainly measures video viewability. It works with both Facebook and YouTube and is accredited by the MRC.

4. Forensiq

A technology company specializing in false traffic; it offers solutions before, during, and after ad delivery.



8. Final words

Fighting abnormal traffic is work in which the toil always exceeds the credit. Handle it poorly and advertisers complain and the platform's credibility falls; handle it too aggressively and advertiser spend may drop markedly and the trading platform's turnover will shrink. Anti-fraud algorithm engineers usually have to keep their rules and algorithms secret. Often a new rule goes live without any announcement, and its effect can only be observed quietly as the tug-of-war with abnormal traffic begins again, wave after wave.

Finally, applause and cheers for everyone who has fought fraud over the long haul!


Author Introduction:

Ouyang is an architect/director in Xiaomi's MIUI commercial products division and an Internet veteran of more than 16 years, responsible for the advertising platform architecture and the data platform. He was previously responsible for Microsoft's Mobile Contextual Ads advertising platform and took part in developing the IndexServe core module of the Bing search engine. In his spare time he shares Internet technology experience on his personal public account "Connected Home"; subscribe to the "internet resident" public account to communicate with the author directly.
