Adobe Analytics and Webtrekk are both giants in the data analysis field: one leads the U.S. market and the other leads Europe, and both provide world-class digital analytics solutions. I am fortunate to have had the opportunity to study and apply both in depth. I also feel that the digital analytics field has never lacked concepts; what it lacks is application scenarios and value extraction. This article analyzes and compares the two solutions in depth. Because it grew too long, I split it into two parts; this is the second. For the first, see Adobe Analytics and Webtrekk digital analysis solution analysis and comparison (I).
III. A rigorous and scientific data spirit
(1) Data mining algorithms
Data mining and website analysis are two different branches of data analysis. Data mining focuses on discovering unknown knowledge in massive data through models, while website analysis usually extracts value through segmentation, trends, and conversion. In our previous work we always wanted to combine the two and explore methods for mining website data. Now both solutions have begun to embed data mining algorithms and offer data mining insights directly within the analysis system.
Adobe Analytics
Adobe Analytics's data mining application is embodied in the Anomaly Detection report. By modeling the selected metrics over the chosen time range, the report computes an expected upper and lower bound for the data's fluctuation; a point is flagged as anomalous when the actual value falls outside this range.
Anomaly Detection is essentially a time series algorithm, built on three methods:

Holt-Winters Multiplicative (Triple Exponential Smoothing)
Holt-Winters Additive (Triple Exponential Smoothing)
Holt's Trend Corrected (Double Exponential Smoothing)
Together these three algorithms form the Holt-Winters seasonal exponential smoothing model. The basic idea is to decompose a time series into a long-term level (Lt), a trend increment (bt), and a seasonal component (Ft), each estimated by exponential smoothing. Handling trend and seasonality together while filtering out random fluctuation yields a prediction model, so the method is particularly well suited to forecasting time series that contain both trend and seasonal changes.
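To make the mechanics concrete, here is a minimal sketch of additive Holt-Winters smoothing plus a simple 3-sigma residual rule for flagging anomalies. This illustrates the general technique only, not Adobe's actual implementation; the function names, smoothing weights, and threshold are all assumptions.

```python
def holt_winters_additive(y, m, alpha=0.3, beta=0.1, gamma=0.2, horizon=1):
    """Additive Holt-Winters (triple exponential smoothing).

    y: observed series, m: season length.
    Returns (one-step-ahead fitted values from t=m, forecasts of length `horizon`).
    """
    # Initialise: level at the end of the first season, trend from the
    # gap between the first two seasons, seasonals as detrended deviations.
    season_mean = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    level = season_mean + trend * (m - 1) / 2
    season = [y[i] - (season_mean + trend * (i - (m - 1) / 2)) for i in range(m)]

    fitted = []
    for t in range(m, len(y)):
        s = season[t % m]
        fitted.append(level + trend + s)            # forecast before seeing y[t]
        last_level = level
        level = alpha * (y[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * s

    forecasts = [level + h * trend + season[(len(y) + h - 1) % m]
                 for h in range(1, horizon + 1)]
    return fitted, forecasts


def flag_anomalies(y, m, k=3.0):
    """Flag points whose one-step-ahead forecast error exceeds k residual
    standard deviations -- the spirit of the Anomaly Detection report."""
    fitted, _ = holt_winters_additive(y, m)
    resid = [y[m + i] - f for i, f in enumerate(fitted)]
    mu = sum(resid) / len(resid)
    sd = (sum((r - mu) ** 2 for r in resid) / len(resid)) ** 0.5
    if sd < 1e-9:                                   # essentially noise-free fit
        return []
    return [m + i for i, r in enumerate(resid) if abs(r - mu) > k * sd]
```

On a clean series with trend and seasonality the forecasts track exactly, while an injected spike is the only point flagged; the multiplicative variant differs only in dividing out, rather than subtracting, the seasonal factor.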
However, this application currently has two problems:
Anomaly Detection only reports data up to yesterday. The point of anomaly monitoring is not to tell the user what happened yesterday, but to tell them, as it happens, which anomalies occurred and how.
Detection results can only be viewed inside SiteCatalyst. If alerts could also be pushed out by email, text message, or similar triggers, the feature would be far more effective.
For more on this function, see "Adobe Analytics anomaly detection: an application example of statistics in clickstream data".
Webtrekk
The core data mining application in Webtrekk is association analysis. The model can be applied to pages, on-site and off-site search terms, products, and advertising channels. Unlike Adobe Analytics's ready-made anomaly detection report, Webtrekk's association reports require some simple configuration before they can be viewed. When we use a general data mining tool, we select a data source, preprocess the data, and configure the algorithm itself: minimum support, minimum confidence, and the maximum number of items. Webtrekk's association analysis asks for a similar configuration:
Algorithm: supports two types, cross-selling and up-selling; for up-selling you must additionally confirm the time range of the follow-up dataset.
Dataset: uses raw data, with a maximum window of one day.
Analysis rules: support associations between pages, channels, products, advertisements, and on-site/off-site search terms, i.e. page association reports, site search term reports, advertising channel reports, and product reports.
Advanced configuration: supports a minimum frequency, and can be combined with the segmentation features of website analysis; for example, to view the association effect of a single page, simply filter on that page.
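As an illustration of the underlying idea (not Webtrekk's implementation), a minimal pairwise association-rule miner over transaction baskets might look like the sketch below; the function name, thresholds, and data layout are assumptions.

```python
from collections import Counter
from itertools import combinations


def association_rules(baskets, min_support=0.2, min_confidence=0.5):
    """Mine pairwise rules A -> B from transaction baskets.

    support(A, B)      = fraction of baskets containing both A and B
    confidence(A -> B) = support(A, B) / support(A)
    Returns (antecedent, consequent, support, confidence) tuples,
    sorted by descending confidence.
    """
    n = len(baskets)
    item_count = Counter()
    pair_count = Counter()
    for basket in baskets:
        items = set(basket)                      # ignore duplicates in a basket
        item_count.update(items)
        pair_count.update(combinations(sorted(items), 2))

    rules = []
    for (a, b), c in pair_count.items():
        support = c / n
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):            # try the rule in both directions
            confidence = c / item_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, round(support, 3), round(confidence, 3)))
    return sorted(rules, key=lambda r: -r[3])
```

The same skeleton applies to pages, channels, or search terms: a "basket" is simply the set of items one visitor touched in a session, and the rules answer "after A, which B?".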
Webtrekk's association model has a wide range of applications; it can answer questions such as:
After a user searches for keyword A on the site, which keyword do they usually search for next?
After a user reads page A, which page do they usually view next?
After a user buys product A, which products do they buy together with it, and which will they buy next time?
After a user enters the website from channel A, which channel do they usually arrive from next time?
In fact, among all data mining algorithms, rule extraction is the most important for business applications, because extracted rules can directly guide business practice; that makes it the most practical family. (Rule extraction algorithms include association, regression, and decision trees: analysis-oriented algorithms that produce actionable rules toward a target, for example "a user who buys A usually buys B next time".)
Although Webtrekk's algorithm is good, the dataset window is too short. In general we choose an appropriate sample size: if the sample is too large, data is wasted; if it is too small, the results explain nothing. One day of data is on the small side, and chance factors may show up in the results; it would be better if the window could be extended to a week or even a month. Of course, more data means more processing and a longer wait, so a trade-off is required.
(2) More scientific data insights
Experienced data analysts do not start analyzing and mining the moment they get the data; they look at the data first.
What does it mean to look at the data?
Looking at the data means evaluating the current overall sample to determine how it needs to be preprocessed. (A complete data analysis process includes requirement handling, data processing, special analysis, deployment and optimization, and project summary; for details, see "How to establish an implementation-type data analysis (mining) process?".)
How do you look at the data?
To look at the data, you judge whether it is stable and whether outliers exist from the overall distribution, the trend, the extremes, the mean, the standard deviation, and similar values. Comparing the two solutions, Adobe Analytics only provides aggregate figures at the bottom of a report, whereas Webtrekk does the following better:
It offers more views of the overall data, including average, maximum, minimum, and totals, summarized on the page. With these indicators you can see the distribution of the selected report at a glance, and combined with the trend charts at the top, you can judge the distribution effectively.
It highlights daily data, via highlighting or in-cell bar charts. When we summarize data in Excel, this kind of conditional formatting is a basic way to mark data that deserves attention. If such a table is turned into a dashboard and sent straight to the boss, the boss can more easily spot the highlighted figures in a pile of numbers; and when working with the data ourselves, abnormal values become much easier to find.
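The report-footer summary and highlighting described above can be sketched as follows. This is a generic illustration, not either vendor's code; the function name and the 2-sigma highlight threshold are assumptions.

```python
import statistics


def summarize(values, highlight_k=2.0):
    """Report-footer style summary plus highlighted rows.

    A row is highlighted when its value lies more than `highlight_k`
    standard deviations from the mean -- the same idea as conditional
    formatting in an Excel summary.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    summary = {
        "total": sum(values),
        "mean": round(mean, 2),
        "min": min(values),
        "max": max(values),
        "stdev": round(sd, 2),
    }
    highlighted = [i for i, v in enumerate(values)
                   if sd > 0 and abs(v - mean) > highlight_k * sd]
    return summary, highlighted
```

Fed a week of daily visit counts, the summary dictionary plays the role of the figures at the bottom of the report, while the highlighted indices mark the days a reader's eye should jump to first.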