A few weeks ago, when William Lansing, chief executive of FICO, attended a big data conference at Stanford University, he found that the industry's heated discussion of big data had grown from three "V"s to four: volume (the amount of data), variety (the types of data), velocity (processing speed), and value (the value of the data). The fourth "V" shows that the industry is starting to focus on data insight and on how to extract value from big data. "Extracting value from data is exactly what FICO has focused on for more than 50 years. Our core business is to analyze all kinds of data and make intelligent decisions," Lansing said. He added that FICO's market capitalization has climbed steadily over the past year and has now reached $1.5 billion.
In 1956, a group of mathematicians from Stanford University founded FICO with the vision of using data analysis to predict risk variables and so help banks control the size of their credit lending. Today, FICO's analytic technology protects two-thirds of the world's credit card business, and in the United States alone it helps institutions make loan approval decisions worth up to 10 billion U.S. dollars, which shows how far data analysis and prediction technology can reach. "For us, big data is a big opportunity," Lansing said. He recently visited China and shared his views on big data with our correspondent.
Tiering data by value
Big data is certainly a hot topic today, but much of the discussion revolves around the infrastructure level, such as Hadoop, and focuses on data storage, processing and real-time management. Even in Silicon Valley, many start-ups concentrate on big data infrastructure. Cloudera, for example, provides commercial software support for open source technology: just as Red Hat supports Linux, Cloudera supports Hadoop in the big data field, Lansing noted. Many of these efforts are about developing technology that makes big data easier to read and store. By contrast, discussion of big data analysis is far from sufficient. People talk about how to store and capture big data, but they rarely mention what customers can do with it.
In fact, whatever its type, an enterprise cares most about value. This means that enterprises need to find the most relevant variables in big data, build models on those variables, and make better decisions based on the models. This is FICO's specialty, and it raises a series of questions: If we had unlimited data processing capacity, could we make better decisions based on all the data? How much money and effort should we spend on the additional data? Is the additional data important to the decision? What is the return on investment of processing all the data? Does it affect the speed or even the accuracy of decisions?
At the heart of these questions is the fact that not all data is of equal value. This has at least three implications. First, some data is always more important than the rest, and we should focus on it first and base our analysis and forecasts on it. Second, the importance of data follows an order: some data is used with priority, while other data can serve as a further basis for analysis and prediction. Third, like any data source, big data is inevitably mixed with false clues, noise and interference, which is the problem of data cleaning. So we have to use this data very intelligently.
"Data processing has a scope. We focus on a range of data, make decisions based on that range, and find the data that is more useful," Lansing says. "So today we need to focus more on the data that is useful, and on more structured data."
For example, banking institutions need to understand their customers through all the data available. This big data may include unstructured data, such as text, images, or even Facebook data. If a bank customer is often drunk, can we predict how often he gets drunk and use that to determine his credit score? Such data may well have some extra value, but it is not the most useful data, and we need to consider whether it is worth spending so much effort on these trivial extras. Today, banking institutions in both the United States and China are more concerned with practical data, and they use data that experience has shown to be predictive. This data is not necessarily the unstructured part of big data, but banks consider it highly predictive and suitable for analysis.
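The idea of concentrating on the variables that are most predictive can be illustrated with a minimal sketch. It ranks candidate variables by the absolute value of their Pearson correlation with a loan outcome. The variable names and the toy figures are hypothetical, and correlation ranking is only one simple screening technique chosen for illustration; the article does not describe how FICO actually selects variables.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)

def rank_variables(columns, outcome):
    """Return variable names sorted by |correlation| with the outcome, strongest first."""
    scores = {name: abs(pearson(values, outcome)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: 1 = defaulted, 0 = repaid.
defaulted = [0, 0, 1, 1, 0, 1]
candidates = {
    "missed_payments": [0, 1, 3, 4, 0, 2],
    "social_posts_per_week": [5, 2, 4, 3, 6, 5],
    "account_age_years": [3, 3, 3, 3, 3, 3],
}
ranking = rank_variables(candidates, defaulted)
print(ranking)
```

In this toy data the payment history ranks first and the constant column ranks last, which mirrors the point that some data deserves attention before the rest. A real screening step would also need to handle noise, sample size and correlated variables.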
Changes in analysis methods
Another noteworthy "V" in the big data era is velocity (processing speed). When processing big data, traditional data mining techniques are not easy to apply, and the way data is analyzed is changing. Lansing also believes that speed and efficiency are equally important in the big data era, and that customers are moving from "batch decisions" to "instant decisions".
In fact, there are broadly two models of data analysis. The first is hypothesis-based: it focuses on high-value data, on data from relevant areas, and on data that can improve efficiency.
FICO continues to promote innovation in feature analysis of data streams, especially in fraud prevention. Its anti-fraud models rely on transaction profiles, which summarize the characteristics of the data as transactions are processed, so that variables associated with fraud can be computed without relying on the generated data.
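As a rough illustration of summarizing transaction characteristics on the fly, the sketch below keeps a small running profile per card and scores each new transaction against it, without storing the raw history. The class name, the decay factor, the threshold and the sample amounts are all assumptions made for this example; it is not FICO's actual model.

```python
class TransactionProfile:
    """Running summary of one card's spending, updated per transaction."""

    def __init__(self, decay=0.9):
        self.decay = decay      # weight kept by the old average (assumed value)
        self.avg_amount = None  # exponentially weighted average amount

    def deviation(self, amount):
        """Ratio of this amount to the running average amount."""
        if self.avg_amount is None or self.avg_amount == 0:
            return 1.0  # no usable history yet: treat as typical
        return amount / self.avg_amount

    def update(self, amount):
        """Fold the new amount into the profile; the raw transaction is not stored."""
        if self.avg_amount is None:
            self.avg_amount = float(amount)
        else:
            self.avg_amount = self.decay * self.avg_amount + (1 - self.decay) * amount

def score_stream(amounts, threshold=5.0):
    """Return the positions of transactions far above the card's usual amount."""
    profile = TransactionProfile()
    flagged = []
    for index, amount in enumerate(amounts):
        # Score first, then update, so a transaction is judged against prior behavior.
        if profile.deviation(amount) > threshold:
            flagged.append(index)
        profile.update(amount)
    return flagged

# Hypothetical stream of amounts for a single card.
flagged = score_stream([20, 25, 22, 18, 400, 21])
print(flagged)
```

Only one number per card is kept here, which is what makes this style of analysis suited to instant decisions. A production profile would track many more characteristics, such as merchant type, location and time of day.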
The second model is not based on hypotheses. Big data brings the need to reduce reliance on existing, static data in analysis, so the analytic model must be able to adjust itself to the dynamic data in the stream. To handle the growing volume of dynamic data, we need to focus on developing self-learning techniques, including adaptive analytics and self-correcting analytics. Lansing believes that these key technologies will fill the gaps in traditional approaches and may even replace traditional models in some areas.
Some emerging banks are reportedly already using FICO's self-learning analytics for microfinance. These loans are not based on the traditional loan approval model. Instead, small loans are issued at random to subprime borrowers or to specific groups of people. The repayment results are automatically fed into the system so the model can learn from them; then more loans are issued and the self-learning process repeats.
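The issue, observe and learn cycle described above can be sketched as an online model that adjusts itself after every observed repayment. The example uses a one-feature logistic model trained by stochastic gradient descent, which is a generic self-learning technique swapped in for illustration. The feature, the learning rate and the outcome data are hypothetical, and nothing here should be read as FICO's actual algorithm.

```python
from math import exp

class OnlineRepaymentModel:
    """Logistic model updated one loan outcome at a time (stochastic gradient descent)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Estimated probability that the borrower repays."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + exp(-z))

    def learn(self, features, repaid):
        """Adjust the weights after observing one outcome (repaid is 1 or 0)."""
        error = repaid - self.predict(features)
        self.bias += self.learning_rate * error
        self.weights = [w + self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]

# Hypothetical outcomes of small loans: the single feature is the
# fraction of the past year with steady income; the label is 1 if repaid.
observed = [([1.0], 1), ([0.1], 0), ([0.9], 1), ([0.2], 0)] * 50

model = OnlineRepaymentModel(n_features=1)
for features, repaid in observed:
    model.learn(features, repaid)  # each result is fed back before the next loan

print(model.predict([1.0]) > model.predict([0.1]))
```

Because the model is updated after each outcome rather than retrained in batches, it can follow changes in the incoming data, which is the property the self-learning approach relies on.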
This approach is currently used only with a very narrow group of customers, and many large banks will in fact take a long time to accept it. But it is undeniable that this kind of microfinance can gradually supplement the existing credit model and serve customers better.
Individuality and commonality
Is data analysis and prediction technology extensible, that is, can it be extended from banking to other fields? In fact, FICO's understanding of customer behavior is not limited to banking but also covers insurance and retail, because from a data analysis perspective customer behavior has much in common across industries.
"For example, fraud in the insurance industry is very similar to credit card fraud in banking," says Lansing. "In marketing solutions, the behavior of many retail users is very similar to that of banking users. Similarly, our experience in customer management in the financial industry can also be applied to the retail industry."
FICO is good at analyzing complex and difficult problems, and its reputation rests on its strong performance in the financial industry. So in addition to its core credit scoring business, FICO has launched an application software business that provides global financial institutions with products and services such as Basel compliance consulting, account management systems, anti-fraud systems, collection and asset preservation systems, and credit scoring models and technology. It also provides non-financial institutions with marketing solutions based on data analysis, and more broadly uses analytics tools to help customers solve any analytic problem.
For example, big data makes it possible to propose marketing solutions based on retail analysis. Today the retail market is subdivided into many regional markets and is becoming steadily more fine-grained. An enterprise may dedicate itself to a specific market and send e-mails, text messages or even postcards to a specific group of people to run personalized marketing campaigns. The main features of the retail market are therefore personalization and campaign management.
For example, if a customer goes to a supermarket every six weeks to buy a particular brand of detergent, then the customer's next visit is the best time for the supermarket to offer him detergent promotions and services.
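The detergent example amounts to estimating a customer's purchase cycle and timing the offer accordingly. A minimal sketch, assuming the retailer has a list of past purchase dates for one customer and one product, is shown below. The function name and the dates are hypothetical, and averaging the gaps is only the simplest possible estimate.

```python
from datetime import date, timedelta

def predict_next_purchase(purchase_dates):
    """Estimate the next purchase date from the mean gap between past purchases."""
    if len(purchase_dates) < 2:
        return None  # at least two purchases are needed to measure a gap
    ordered = sorted(purchase_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    mean_gap = round(sum(gaps) / len(gaps))
    return ordered[-1] + timedelta(days=mean_gap)

# Hypothetical purchase history: detergent bought every six weeks (42 days).
history = [date(2013, 1, 7), date(2013, 2, 18), date(2013, 4, 1)]
next_visit = predict_next_purchase(history)
print(next_visit)  # 2013-05-13
```

A campaign system could use the predicted date to schedule an e-mail or coupon a few days ahead of the expected visit.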
Lansing believes that using data analysis to market to a segment of one is the future trend in marketing, and that the banking industry also values one-to-one marketing capability. To this end, FICO acquired Entiera in May this year to better support one-to-one marketing solutions for its banking and retail customers. Entiera's customer dialogue management solution reportedly helps enterprises generate, monitor and analyze marketing campaigns in a SaaS environment and can store unstructured data, making marketing solutions and decisions easier to operate.
(Responsible editor: The good of the Legacy)