Michael Berry is quite dismissive of the grandiose rhetoric around big data. As the analytics director of TripAdvisor, a travel site, he argues that more data does not necessarily benefit the business, particularly where big data and predictive analytics are concerned.
"Many predictive analytics applications don't really need all the data," Berry said in his keynote speech at Predictive Analytics World. The important question for data scientists, then, is not how to analyze all the data, but which data can be used to derive truly valuable results. So what should they do? "There is no straightforward answer to this question," Berry said.
However, it is possible to determine how much data is enough by testing the validity of the predictive model each time more data is added. For example, when Berry wants to know the typical price a travel agent charges for a hotel or for a particular customer, he computes the average over two samples, then three, and so on. By about 10,000 samples, the mean has stabilized. With 20,000 samples the mean will certainly still change a little, but the extra data is not necessary.
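The stopping rule described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prices are synthetic, and the function names, the tolerance and the window size are hypothetical choices, not anything Berry described.

```python
import random
from itertools import accumulate


def running_means(values):
    """Return the mean of the first 1, 2, 3, ... values."""
    return [total / n for n, total in enumerate(accumulate(values), start=1)]


def first_stable_size(values, tolerance=0.5, window=500):
    """Smallest sample size n whose running mean stays within `tolerance`
    of itself over the next `window` samples.

    Returns None if the mean never settles within the data available.
    Both `tolerance` and `window` are illustrative assumptions.
    """
    means = running_means(values)
    for n in range(1, len(means) - window + 1):
        anchor = means[n - 1]  # mean of the first n values
        if all(abs(m - anchor) <= tolerance for m in means[n:n + window]):
            return n
    return None


if __name__ == "__main__":
    # Synthetic hotel prices, invented for this example only.
    rng = random.Random(42)
    prices = [rng.gauss(150.0, 40.0) for _ in range(20_000)]

    n = first_stable_size(prices)
    means = running_means(prices)
    print("mean settles after", n, "samples")
    print("mean at 10,000:", round(means[9_999], 2))
    print("mean at 20,000:", round(means[19_999], 2))
```

Comparing the last two printed values shows how little the estimate moves when the sample is doubled, which is the point of the example.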
"That's the key. If you have enough data, the sheer number of increases will not have a big impact on the results. "Berry said.
If more big data doesn't make an essential difference, what does? "Many things," Berry said. Clean data, a reasonable and comprehensive sample, and people with a talent for visualizing and mining big data will all lead to different results.
These are the key points in predictive analytics: identifying which variables make the model more robust, for instance, or which data sources can be combined to reveal new patterns.
"Wind chill factor, for example." "Berry said. Combined with the actual temperature and wind speed, we can effectively analyze the human body's feelings for the outside environment.
Myths about Big Data
Berry is not the only one to complain about the current state of big data and predictive analytics. Karl Rexer, the founder of Rexer Analytics, a consulting firm, thinks data scientists are somewhat confused. In the firm's 2013 survey of data-mining practitioners, respondents reported that data sizes were growing larger and larger. However, when asked how much data they were actually using in their analyses, their answers were much the same as in the 2007 survey.
This does not prove that so-called big data is a farce. "The overall sample size has not increased for traditional predictive analytics modeling or data mining projects," Rexer said.
Naming models with acronyms
Translating analytics terminology into language the business side can understand is a huge challenge. Paychex, a provider of payroll, human resources and outsourcing services, breaks down that barrier by naming its models according to suggestions from the business side.
"When we build a model, we hold a naming contest," Tom Kern, a modeling analyst at Paychex, said at Predictive Analytics World. Kern's team sends business users an e-mail that briefly describes the model and offers some vocabulary to work with. Drawing on their day-to-day work, the users come up with acronyms, such as SAM, for the sales expectation model, and TIM, for the territory identification and mapping model.
If a business user's suggestion is eventually adopted, that user receives a gift card. As a result, the team can think about what a predictive model should do in terms of the expectations of people such as sales staff.
Procter and Gamble's change of strategy
Procter and Gamble, one of the world's largest consumer goods companies, has announced a new low-priced detergent aimed at mid-market customers. How should this decision be evaluated?
Shel Smith, the founder of market analysis firm Twenty-ten Inc., said: "If you launch a similar product, you don't just get new customers; you're actually encouraging existing customers to switch away from the higher-priced products."
Given the current economic climate, this fear is not unreasonable. However, Smith is confident in Procter and Gamble's strategy. He believes it is based on predictive models, massive data and precision marketing, and that it can win new customers without hurting the sales of existing brands.
"There must be a lot of things we don't know about, but there's nothing mysterious about getting new customers," Smith said.
"For more information on business intelligence, business intelligence solutions and business intelligence software downloads, visit Finebi Business Intelligence official website www.finebi.com"
Big data and predictive analytics: Is more data better?