Multi-View Learning
A bit of bragging up front: the topic of this chapter is multi-view learning. When my boss first told me the term, this is how I made sense of it: when we look at photos of a pretty girl, we should not settle for one flattering angle, because a single angle cannot tell us whether she really looks good all round. We have to look at photos taken from many angles so that a trick shot does not fool us, and so that everyone gets a world with a little more sincerity and a little less deception. In that sense, multi-view learning means appreciating (learning) something from every angle, with no blind spots, and then making the decision that is closest to the truth.
The story goes that one day a person and an ant were talking while looking at a grain of rice. The person said: "This grain is so plump, it must be delicious." The ant said: "Nonsense, this grain is clearly rectangular. Why do you call it plump?" They quarreled and finally asked God what the grain really looked like. God said: "Neither of you is wrong. People see a three-dimensional world, so they see three-dimensional things; ants see only two dimensions, so the ant sees only a flat shape."
The little story shows that multi-view learning means learning from the data from several angles and then combining what is learned to make predictions with better accuracy.
Semi-supervised learning
Semi-supervised learning problems are common in the real world. A few examples are listed below.
In text classification, take spam filtering as an example. All messages can serve as unlabeled data, but obtaining labeled data requires users to mark which messages are spam and which are not. With a traditional supervised learning approach, users would have to mark thousands of messages as training samples before the trained learner filters well, and few users are willing to spend that much time labeling. When only a small number of messages are marked and a large number are not, training the spam filter with a semi-supervised method is a good choice.
In image processing, take computer-aided medical image analysis. A large number of medical images can be obtained from hospitals as unlabeled data, but asking medical experts to mark the lesions in all of them is usually unrealistic; typically the lesions are marked in only a small number of images. Semi-supervised learning is therefore needed to reduce the demand for labeled data.
In natural language processing, take syntactic parsing. To train a good parser you need to build a set of sentence/syntax-tree pairs, which is very time-consuming: constructing thousands of syntax trees may cost a linguist several years, while sentences that can serve as unlabeled data are everywhere. Semi-supervised learning, which takes unlabeled data into account, can ease the linguists' burden.
These examples show that, with the rapid development of information technology, we face a situation in which large amounts of data exist but labeling them takes a great deal of manpower and resources, and traditional supervised learning struggles to predict well when labeled data are scarce. Semi-supervised learning was proposed to solve exactly this kind of problem, and it matters both in theory and in practice.
1.1 Multi-view semi-supervised learning
1.1.1 Multi-view data
In some practical problems, the same thing can be described in many different ways or from many different angles. These descriptions form the multiple views (multi-view) of that thing. In this article, the subscript in $x_i$ denotes the $i$-th data point and the superscript in $x^{(t)}$ denotes the $t$-th view of the data, so a multi-view data point can be written as $x_i = \{x_i^{(1)}, x_i^{(2)}, \ldots, x_i^{(n)}\}$, where $n$ is the number of views.
Multi-view data are widely available in the real world. In web page classification, a page can be classified by the information it contains itself, or by the information in the hyperlinks that point to it. The page data can therefore be represented in two views: the feature set describing the information in the page itself forms the first view, and the feature set describing the information in the hyperlinks forms the second. In the recognition of TV clips, a clip can be identified from the information in the video or from the information in the audio, so TV data can be represented by a video view and an audio view. In natural language understanding, the same semantic object can be expressed in different languages, and these descriptions form different views of that object.
In the examples above, the views represent different feature sets of the data. Views can also represent different sources of data: when the same object is captured with different acquisition devices, the results form different views of the data. Views can also represent different relationships between data: in the classification of academic papers, papers are linked both by citation relationships and by co-authorship relationships, and each relationship can be represented by its own view.
Some of the literature refers to multi-modal (multimodal) learning problems, but the meaning of "modal" differs between papers. In the narrow sense, modalities are the different human senses, such as vision, hearing and smell: image or text information corresponding to vision, together with sound information corresponding to hearing, constitutes multi-modal data. In the broad sense, multi-modal data are data about one thing collected by different methods. For example, in face recognition both 2D images and 3D shape models of a face may be collected, which gives two modalities of face data; in fingerprint identification, different impressions of the same fingerprint collected by various sensors form multi-modal fingerprint data. Comparing the two concepts, multi-view data subsume multi-modal data, and multiple views can represent a wider range of practical problems.
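As a rough illustration of this notation, a data point can be stored as a collection of per-view feature vectors. The sketch below is only one possible layout: the class name `MultiViewPoint`, the view names and the feature values are all made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MultiViewPoint:
    """One data point x_i, stored as {view name: feature vector x_i^(t)}."""
    views: Dict[str, List[float]]

    def num_views(self) -> int:
        """Return n, the number of views of this data point."""
        return len(self.views)

    def view(self, name: str) -> List[float]:
        """Return the feature vector of a single view."""
        return self.views[name]


# Hypothetical web page with two views; the feature values are made up.
page = MultiViewPoint(views={
    "page_text": [1.0, 0.0, 2.0],   # features from the page's own content
    "hyperlink_text": [0.0, 1.0],   # features from hyperlinks pointing to it
})

print(page.num_views())             # number of views n
print(page.view("hyperlink_text"))  # feature vector of one view
```

Note that the views do not need to have the same number of features, which is one reason to keep them apart instead of forcing them into one fixed-length vector.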
1.1.2 Representation of multi-view data
The representation of data is one of the most important and difficult problems in machine learning, because learning performance often depends on how the data are represented. For an object in the real world, we usually extract its features and represent the object by a feature vector, i.e. $x_i = \{x_1, x_2, \ldots, x_N\}$, where $N$ is the number of features. We want the extracted features to reflect the nature of the object so that they can be used to learn the target concept. However, for a given learning problem the minimal set of required features is unknown; without prior information we can only extract as many features as possible and hand them to the learner, in the hope that it predicts better. In addition, advances in data collection technology let people describe things by more complex and diverse means, which also gives the data more features.
Some of the features describing an object have different properties, so it is not appropriate to learn them with the same learner. In the TV clip recognition problem, for example, the features fall into a video part and an audio part, which are better suited to image recognition methods and sound recognition methods respectively. With a single view (that is, with all features combined into one feature vector representing the clip), no general learning method can be chosen that suits both images and sound. In this case a multi-view representation is more appropriate: represent the data as several feature sets, then learn on each feature set with a different method.
Even when all the features can be learned with the same learner, multi-view learning may still have advantages over single-view learning. In the web page classification problem above, both the information in the page itself and the information in the hyperlinks pointing to it consist of words, so the page view and the hyperlink view can both be represented as text vectors and learned with the same learner. However, if the two views are merged into one, the resulting feature vector loses its original meaning and the dimension of the feature space may grow, which makes learning needlessly harder. A multi-view representation also lets the learner exploit the strengths of each view and use unlabeled data for collaborative learning, improving learning performance. This is explained in detail next.
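The contrast between one concatenated view and one learner per view can be sketched as follows. This is a toy illustration under stated assumptions, not a recommended pipeline: the two learners (`fit_nearest_centroid`, `fit_mean_threshold`), the helper `fit_per_view` and the clip data are all hypothetical stand-ins for real image and sound recognition methods.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Classifier = Callable[[Vector], int]


def concatenate_views(point: Dict[str, Vector]) -> Vector:
    """Single-view representation: merge every view into one long vector."""
    merged: Vector = []
    for name in sorted(point):  # fixed order so positions stay comparable
        merged.extend(point[name])
    return merged


def fit_nearest_centroid(X: Sequence[Vector], y: Sequence[int]) -> Classifier:
    """Toy learner A: predict the label whose class centroid is closest."""
    centroids: Dict[int, Vector] = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x: Vector) -> int:
        def sq_dist(c: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(centroids, key=lambda lab: sq_dist(centroids[lab]))

    return predict


def fit_mean_threshold(X: Sequence[Vector], y: Sequence[int]) -> Classifier:
    """Toy learner B: threshold the mean feature value (binary labels 0/1)."""
    def mean(v: Sequence[float]) -> float:
        return sum(v) / len(v)

    pos = mean([mean(x) for x, lab in zip(X, y) if lab == 1])
    neg = mean([mean(x) for x, lab in zip(X, y) if lab == 0])
    cut = (pos + neg) / 2
    # Predict 1 on whichever side of the cut the positive class lies.
    return lambda x: int((mean(x) > cut) == (pos > neg))


def fit_per_view(data, labels, learners) -> Dict[str, Classifier]:
    """Multi-view representation: train a separate learner on each view."""
    return {
        name: fit([point[name] for point in data], labels)
        for name, fit in learners.items()
    }


# Hypothetical TV clips: a 3-feature video view and a 2-feature audio view.
clips = [
    {"video": [0.9, 0.8, 0.7], "audio": [0.1, 0.2]},
    {"video": [0.8, 0.9, 0.9], "audio": [0.2, 0.1]},
    {"video": [0.1, 0.2, 0.1], "audio": [0.9, 0.8]},
    {"video": [0.2, 0.1, 0.3], "audio": [0.8, 0.9]},
]
labels = [1, 1, 0, 0]

models = fit_per_view(
    clips, labels,
    {"video": fit_nearest_centroid, "audio": fit_mean_threshold},
)
new_clip = {"video": [0.85, 0.8, 0.75], "audio": [0.15, 0.1]}
predictions = {name: model(new_clip[name]) for name, model in models.items()}
print(len(concatenate_views(new_clip)), predictions)
```

The concatenated vector has five positions whose origin (video or audio) is no longer visible to the learner, whereas the per-view dictionary keeps each feature set with the method chosen for it.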
1.1.3 Multi-view semi-supervised learning
In multi-view semi-supervised learning, the data on the one hand have multiple views, that is, $X = \{x^{(t)}\}$, and on the other hand consist of a labeled data set $L$ and an unlabeled data set $U$. The learning algorithm must consider how to exploit both the information in the multiple views and the information in the unlabeled data to assist traditional supervised learning. The representative algorithm in this field is the co-training algorithm (Co-Training) proposed by A. Blum and T. Mitchell. The algorithm assumes the data have two views. First, a classifier is trained on each of the two views using the labeled data. Then, during co-training, each classifier selects from the unlabeled data a number of examples it predicts with high confidence, labels them, and adds the newly labeled examples to the other classifier's labeled data set, so that the other classifier can be updated with them. The process iterates until a stopping condition is reached.
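The loop described above can be sketched in a few lines. This is a minimal illustration of the idea, not the original authors' implementation: the nearest-centroid classifier, the margin used as "confidence", the parameters `per_round` and `max_rounds`, and the toy data are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Point = Tuple[Vector, Vector]   # (features in view 1, features in view 2)
Model = Dict[int, Vector]       # class label -> centroid


def fit_centroids(X: List[Vector], y: List[int]) -> Model:
    """Toy classifier: one centroid per class."""
    model: Model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model


def predict(model: Model, x: Vector) -> Tuple[int, float]:
    """Return (label, confidence); confidence is the gap between the
    squared distances to the two nearest centroids."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)), label)
        for label, c in model.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(labeled, unlabeled, per_round: int = 1, max_rounds: int = 10):
    """Two-view co-training: each view labels its most confident
    unlabeled points for the OTHER view's training set."""
    train = [list(labeled), list(labeled)]  # one labeled set per view
    pool = list(unlabeled)

    def fit_views() -> List[Model]:
        return [
            fit_centroids([p[v] for p, _ in train[v]],
                          [lab for _, lab in train[v]])
            for v in (0, 1)
        ]

    models = fit_views()
    for _ in range(max_rounds):
        if not pool:  # stopping condition: nothing left to label
            break
        for v in (0, 1):
            scored = sorted(
                ((predict(models[v], p[v]), i) for i, p in enumerate(pool)),
                key=lambda item: item[0][1], reverse=True,
            )
            chosen = scored[:per_round]
            for (label, _), i in chosen:
                train[1 - v].append((pool[i], label))
            picked = {i for _, i in chosen}
            pool = [p for i, p in enumerate(pool) if i not in picked]
        models = fit_views()  # update both classifiers with the new labels
    return models


# Toy data: view 1 separates the classes well, view 2 starts out misleading.
labeled = [(([0.0], [0.0]), 0), (([1.0], [0.5]), 1)]
unlabeled = [([0.05], [0.3]), ([0.9], [0.7]), ([0.1], [0.35]), ([0.7], [0.6])]

baseline_view2 = fit_centroids([p[1] for p, _ in labeled],
                               [lab for _, lab in labeled])
models = co_train(labeled, unlabeled)

print(predict(baseline_view2, [0.3])[0])  # view-2 label from labeled data only
print(predict(models[1], [0.3])[0])       # view-2 label after co-training
```

In this toy run the view-2 classifier trained on the two labeled points alone assigns `[0.3]` to class 1, while after co-training it assigns it to class 0, because view 1 supplied confident class-0 labels for nearby points. Real uses would add a confidence threshold so that low-confidence guesses are not passed to the other view.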
The main idea of the co-training algorithm is shown in the diagram. $C_1$ and $C_2$ denote two classes of data, drawn in two different colors, and $X^{(1)}$ and $X^{(2)}$ denote two different views of the data. In view $X^{(1)}$ the two classes can be separated well by a classifier, while in view $X^{(2)}$ the two classes are mixed together and it is hard to train a good classifier. In this case, the classifier trained on view $X^{(1)}$ can pass the unlabeled data it is confident about, together with its classification results for those data, to the classifier on view $X^{(2)}$. The classifier trained on view $X^{(2)}$ can then use the information from view $X^{(1)}$ to resolve its own uncertainty, which improves its performance, and vice versa. Multi-view learning thus exploits the difference in how easily the data can be learned in different views, so that the views interact, complement each other and learn collaboratively. Since the co-training algorithm was proposed, multi-view semi-supervised learning has attracted the attention of researchers, a number of related works have emerged, and considerable research progress has been made.
Summary of this second article:
This is a popular-science style post, something to read when you have nothing else to do, so that you come away with a general understanding of multi-view learning. This is roughly the direction I am working in.