Reference documents:
Excerpt from the literature:
The more general and powerful setting is the self-taught learning setting, which does not assume that your unlabeled data x_u has to be drawn from the same distribution as your labeled data x_l. The more restrictive setting, where the unlabeled data comes from exactly the same distribution as the labeled data, is sometimes called the semi-supervised learning setting.
An example from the cited source illustrates this:
Suppose you want to decide whether an image shows a car or a bicycle. In practice, you might collect unlabeled data in one of two ways. (1) Download a batch of images from the Internet, regardless of whether they contain a car or a bicycle, and use them as a dataset without any labels. (2) Carefully select from the web only images of cars or bicycles, and use them as a dataset, again without any labels.
The former set of images does not follow the distribution of our prediction target; this is called self-taught learning. (Note: the target distribution here is that of images showing either a car or a bicycle.)
The latter set of images is consistent with the distribution of our prediction target; this is called semi-supervised learning.
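The difference between the two unlabeled pools can be made concrete with a small toy sketch. It is only an illustration under stated assumptions: each "image" is represented by its hidden class name, the class list `WEB_CLASSES` and the helpers `draw_unlabeled` and `off_target_fraction` are hypothetical names invented here, and in a real project the hidden classes of unlabeled images are of course unknown.

```python
import random

# Classes of the labeled prediction task (from the example above).
TARGET_CLASSES = ["car", "bicycle"]
# Hypothetical classes that random web images might contain (an assumption).
WEB_CLASSES = ["car", "bicycle", "cat", "tree", "building"]


def draw_unlabeled(source_classes, n, seed=0):
    """Draw n unlabeled 'images', each represented only by its hidden class."""
    rng = random.Random(seed)
    return [rng.choice(source_classes) for _ in range(n)]


def off_target_fraction(hidden_classes, target_classes):
    """Fraction of samples whose hidden class lies outside the target task."""
    target = set(target_classes)
    off_target = sum(1 for c in hidden_classes if c not in target)
    return off_target / len(hidden_classes)


# (1) Random web images: may or may not show a car or a bicycle.
web_pool = draw_unlabeled(WEB_CLASSES, 1000)
# (2) Curated images: only cars or bicycles, with the labels discarded.
curated_pool = draw_unlabeled(TARGET_CLASSES, 1000)

web_off = off_target_fraction(web_pool, TARGET_CLASSES)
curated_off = off_target_fraction(curated_pool, TARGET_CLASSES)

print("web pool off-target fraction:", web_off)
print("curated pool off-target fraction:", curated_off)
```

In this sketch the curated pool contains no off-target images, which corresponds to the semi-supervised setting, while the web pool contains many, which corresponds to the self-taught setting. Note that sharing the same set of classes is only a rough proxy: two pools with the same classes but different class proportions would still not come from exactly the same distribution.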
Self-taught learning setting && semi-supervised learning setting