Percent started out as a recommendation service provider and has since transformed into a provider of big data solutions.
Let us first look at the relationship between big data and portrait-based applications. Big data is a hot topic, and most people are familiar with its four Vs. Big data can be seen as a natural extension of information technology: it means that data is ubiquitous. Consider how the status of data has changed. In the traditional IT era, IT systems were built to serve the business; data accumulated as a by-product of that service, and some analysis was then done on top of it. The DT (data technology) era is different. Data is a virtual reflection of the real world: the data itself builds a virtual world, and IT systems built on that virtual world become more intelligent. Many companies are gradually adopting a DT strategy, and more and more managers are starting to think about it.
Big data is now everywhere. First, the informatization of society keeps advancing. Second, with the development of wearable devices, people generate more and more data and are increasingly connected to the network. At the same time, people no longer communicate only face to face, so we need to learn anew how to understand people, and building user portraits is becoming more important. Machines are also becoming very intelligent, so we have to teach machines to understand humans as well. Applications can then be built on top of portraits, such as personalized recommendation, precision advertising, and financial credit assessment.
User portraits, labels, 360-degree user views: most people probably have only an approximate understanding of these concepts.
User Portraits: An Intuitive Understanding
User portraits exist in everyday life. Given the description shown above, everyone thinks of Zhuge Liang; given the picture, of Hitler; given the ID card, of Obama. These are all portraits from everyday life, and each describes a person, but they differ in the way and angle of description.
However, these descriptions also have things in common, mainly in the following respects. The first is the goal: to describe people and to understand people. This is the main goal of a user portrait. The second is the means of description, of which there are 2 kinds. Informal means include speech and text. Formal means include the ID card: put it on a card reader and the corresponding information can be read out. The third is the organization of the information, which can be structured or unstructured; the ID card, for example, is structured data. The fourth is the user portrait standard, which is very important and which we will come back to. Why? When we describe users, we need some consensus. For example, if I describe someone with a slang word such as "二" (literally "two"), listeners may understand it differently, because the two sides have not reached a consensus on what the word means. So there must be a shared knowledge system; otherwise user portraits cannot be built at all. The last is verification: after we finish a portrait, we must verify it. If I say that a person is particularly unreliable, that amounts to applying a label, and you will ask me why and on what basis. When we produce portraits of users, we must give the evidence and the reasoning process and explain how the result was obtained. Otherwise the portrait has no credibility.
So, after all this, what is a user portrait? A user portrait is a mathematical model of a real-world user. This has two implications. First, it describes a "user" rather than a "person", which indicates that it is closely tied to the business: it is abstracted from the business, so it originates in reality but stands above reality. Second, a user portrait is a model, obtained by analyzing as much of the user's data as possible: it comes from the data, but it is an abstraction that stands above the data. Everything that follows about user portraits builds on this. Take the "moonlight clan" (people who spend their entire salary every month): this label has to be mined and analyzed, because the raw data does not contain it. These are the two layers of meaning.
As just said, a user portrait is a data model of a real-world user. But how do we describe such a model? The core is a standard knowledge system for describing user portraits. In addition, there has to be a digital, symbolic, formalized way to express that knowledge system, and machines must be able to understand it. If only people can understand it, it cannot be put to use.
One such approach is ontology, which has existed since the 1960s and 1970s. Those who work on semantic analysis may have heard of it; in the 1990s, ontologies and semantics were very popular. Ontology is fairly complex. It helps machines understand a knowledge system. Because it is complex, I will describe it only briefly: it is similar to a language such as UML and includes entities, relations, reasoning, and so on. With this methodology, a machine can be made to understand knowledge and even be taught how to reason.
I have a much simpler approach. Look at this diagram: how do we express knowledge in the real world? When we first learn a language, what do we use? The Xinhua Dictionary. How is the dictionary organized? First comes the word, which serves as the symbol, followed by a longer explanatory text, which gives the concept. So symbol and concept correspond to each other. Take "dog" as an instance from everyday life: the word "dog" is a symbol, and it corresponds to a concept in our minds, "an animal with four legs that guards the house and barks". The entities are the various breeds of dogs in real life. I hope you will remember this diagram.
In terms of the diagram above, the model discussed earlier corresponds to the concept, and the label corresponds to the symbol. We emphasize two points. First, a label is closely related to the business. Second, in this diagram a label is a symbol used to express the model. For example, suppose I want to sell my product to white-collar workers. "White-collar" is a symbol that stands for a user group such as "high income, works in an office". This gives a clearer definition of what a label is.
We have defined the user portrait and the label. What is the relationship between the two? It is a relationship between whole and part: the user portrait is the whole, the label is a part, and the relationship is embodied in the label system. It has two aspects. The first is breaking the whole into parts, that is, how the whole is reflected in its parts: every person should have a pair of eyes and a nose. The second is assembling the parts into the whole, that is, how the parts make up the whole: only with the pair of eyes and the nose in place can we call the result a person.
The label system also deserves emphasis. When we build user portraits for real enterprise customers, we build the label system together with their business units and product departments, because labels are closely tied to the business. The label system collects all the business requirements; once it is drawn up, a standard definition is given for each label, and then the labels are developed and put online.
The final point is user portrait verification. When we deliver solutions to customers, they often ask: how do you verify the resulting user portraits? In our view, since a user portrait is a model of a real-life user, verifying the model has two aspects. One is accuracy: whether the labels are applied correctly, which is what we usually call the accuracy rate. The other is completeness: whether the labels cover everyone they should. The two cannot both be fully satisfied at the same time. A real business cannot pursue perfection, because you cannot build a label system that is 100% accurate and 100% complete.
So when we talk about verification, we mostly talk about accuracy, and there are two cases. In one, there is a factual standard, such as biological sex, and the accuracy of the model can be verified against a standard data set. In the other, there is no factual standard, such as user loyalty; here we can only verify the process, and the actual effect has to be verified through online A/B testing in the business.
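The two verification aspects, how accurate a label is and how many users it covers, can be made concrete with a small calculation against a standard data set. The following is only a minimal sketch: the function name, the user IDs, and the data are hypothetical and are not taken from the talk.

```python
def label_precision_and_coverage(predicted, truth):
    """Compare predicted label values with a ground-truth data set.

    predicted: dict user_id -> label value (users without a label are absent)
    truth:     dict user_id -> true value (the "standard data set")
    Returns (precision, coverage).
    """
    # Users in the standard data set that actually received a label.
    labeled = [u for u in truth if predicted.get(u) is not None]
    correct = sum(1 for u in labeled if predicted[u] == truth[u])
    # Accuracy of the labels that were applied.
    precision = correct / len(labeled) if labeled else 0.0
    # Share of users in the standard data set that were labeled at all.
    coverage = len(labeled) / len(truth) if truth else 0.0
    return precision, coverage


# Hypothetical example: a sex label checked against known values.
truth = {"u1": "F", "u2": "M", "u3": "F", "u4": "M"}
predicted = {"u1": "F", "u2": "F", "u3": "F"}  # u4 has no label
precision, coverage = label_precision_and_coverage(predicted, truth)
print(f"precision={precision:.2f} coverage={coverage:.2f}")
```

Labeling fewer users, only those the model is sure about, tends to raise precision and lower coverage, which is one way to see why the two cannot both be maximized.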
The above introduced the theory of user portraits: a user portrait is a mathematical model of a real user; a label is a symbol; a label only makes sense when connected to the business; and the user portrait and its labels stand in a whole-part relationship. Next comes user portrait practice.
The diagram above shows the logical architecture for producing and applying user portraits, which consists of 5 layers. The data collection layer gathers all kinds of user data. Within a single company, data sources are spread all over: the CRM system, for example, and data scattered across departments. One difficulty in building a DMP (data management platform) is collecting all of this data, which sometimes requires a push from the boss. The data management layer cleans, links, integrates, analyzes, and models the data to build user portraits. The data interface layer and the application layer, built on the user portraits, provide various analysis, service, and marketing applications serving finance, manufacturing, aviation, and other industries.
Building user portraits is important but faces many technical challenges. The following discussion focuses on 3 aspects: linking a user's information across channels, linking product information across channels, and user data mining and modeling.
First, linking a user's information across channels. Users and enterprises have many touchpoints, such as mobile phone numbers, email addresses, and cookies, and we want to link the multiple touchpoints that belong to the same user. This requires a God's-eye view. We can treat each user ID as a node in a graph. If two of a user's touchpoints appear together in one event, for example logging in with an email address in a browser that carries a cookie, we connect the email address and the cookie with an edge, and in this way we build a graph.
IDs that are connected in the graph can be regarded as the same user, which achieves user linking. How trustworthy a connection must be depends on the density of the business: the higher the density, the higher the required reliability. Recommendation, for example, is a low-density business, where an identification error has relatively little impact. But for an e-commerce SMS notification service, if the
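The ID-linking idea described above, IDs as nodes, co-occurrences as edges, and connected IDs treated as one user, can be sketched with a union-find structure. This is an illustrative sketch only, not the speaker's implementation: the class and function names, the ID strings, and the per-edge confidence scores are all hypothetical. The `min_confidence` threshold stands in for the reliability requirement of a given business.

```python
from collections import defaultdict


class IdGraph:
    """Union-find over user IDs (hypothetical helper, for illustration)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen IDs as their own root.
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def link(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def users(self):
        # Each connected component is treated as one user.
        groups = defaultdict(set)
        for node in list(self.parent):
            groups[self.find(node)].add(node)
        return list(groups.values())


def resolve_users(edges, min_confidence):
    """edges: iterable of (id_a, id_b, confidence in [0, 1])."""
    graph = IdGraph()
    for a, b, confidence in edges:
        graph.find(a)
        graph.find(b)
        # Only trust an edge if it meets the business's requirement.
        if confidence >= min_confidence:
            graph.link(a, b)
    return graph.users()


# Hypothetical co-occurrence events with assumed confidence scores.
edges = [
    ("email:a", "cookie:1", 0.9),   # email login seen with cookie 1
    ("cookie:1", "phone:x", 0.6),   # weaker signal
    ("cookie:2", "email:b", 0.95),
]
loose = resolve_users(edges, min_confidence=0.5)   # e.g. recommendation
strict = resolve_users(edges, min_confidence=0.8)  # e.g. notifications
print(len(loose), len(strict))
```

With the lower threshold the weak edge is accepted and three IDs merge into one user; with the stricter threshold the phone number stays separate.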
That covers linking users. Next is linking products across channels. Our e-commerce customers, for example, each have a different first-party label system, so linking label systems means establishing one standard category label system. This is usually a category tree, and any product can be assigned to a leaf node of the tree. In our practical experience, manual mapping is costly and hard to carry out at scale, so we actually use a machine learning model plus a small number of manual rules.
The concrete model implementation is shown in the diagram above. In automatic classification, the hard part is not the model itself but obtaining training data, feature engineering, and handling the dependencies between nodes at different levels of the category tree. These steps are not expanded here. At present, our classification accuracy for products from e-commerce channels is above 95%.
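To illustrate the general shape of "a machine learning model plus a few manual rules", here is a toy sketch that assigns product titles to leaf nodes of a category tree. It is not the model from the talk: it swaps in a small multinomial Naive Bayes classifier with add-one smoothing, classifies directly into leaves without modeling dependencies between tree levels, and uses invented category paths, training titles, and rules.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (product title, leaf node of the category tree).
TRAINING = [
    ("apple iphone smartphone 128gb", "Electronics/Phones"),
    ("android smartphone dual sim", "Electronics/Phones"),
    ("gaming laptop 16gb ram", "Electronics/Computers"),
    ("ultrabook laptop ssd", "Electronics/Computers"),
    ("men running shoes", "Apparel/Shoes"),
    ("leather shoes women", "Apparel/Shoes"),
]

# A few manual rules that take priority over the model (assumed examples).
RULES = {"phone case": "Electronics/Accessories"}


def train(samples):
    """Count tokens per leaf category."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for title, leaf in samples:
        tokens = title.lower().split()
        word_counts[leaf].update(tokens)
        class_counts[leaf] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def classify(title, model):
    """Apply manual rules first, then fall back to Naive Bayes."""
    title = title.lower()
    for phrase, leaf in RULES.items():
        if phrase in title:
            return leaf
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_leaf, best_score = None, -math.inf
    for leaf, n in class_counts.items():
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[leaf].values()) + len(vocab)
        score = math.log(n / total)
        for token in title.split():
            score += math.log((word_counts[leaf][token] + 1) / denom)
        if score > best_score:
            best_leaf, best_score = leaf, score
    return best_leaf


model = train(TRAINING)
print(classify("smartphone 64gb", model))
print(classify("iPhone phone case", model))
```

Checking the rules before the model keeps the manual overrides easy to audit; a production system would need far more training data and real feature engineering.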
For user portrait modeling, we divide label modeling into 4 layers. The first layer is fact labels, for example which categories of product a user has bought. The second layer is labels predicted by machine learning models, for example current demand and latent demand. The third layer is marketing model labels, for example user value, activity, and loyalty. The fourth layer is business labels, for example the high-luxury group or the home-owner group; these are produced by combining lower-layer labels and are usually defined by business staff. That concludes the theory and practice of user portraits; next come applications based on user portraits.
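As a rough illustration of how a fourth-layer business label can be composed from lower-layer labels, consider the sketch below. All label names, values, and thresholds are hypothetical; in practice the combination rule would be defined by business staff.

```python
# Lower-layer labels for one user (hypothetical names and values).
user_labels = {
    "purchased_categories": {"luxury_handbag", "watch"},  # layer 1: fact
    "predicted_income_level": "high",                     # layer 2: model prediction
    "user_value": 0.92,                                   # layer 3: marketing model
}

# Layer 4: each business label is a rule over lower-layer labels.
BUSINESS_LABELS = {
    "high_luxury_group": lambda labels: (
        "luxury_handbag" in labels["purchased_categories"]
        and labels["predicted_income_level"] == "high"
        and labels["user_value"] >= 0.8
    ),
}


def business_labels(labels):
    """Return the names of all business labels whose rule matches."""
    return sorted(name for name, rule in BUSINESS_LABELS.items() if rule(labels))


print(business_labels(user_labels))
```

Keeping the rules in one table separate from the lower-layer data means a business definition can change without touching the fact or model layers.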
The construction and use of user portraits (1)