DeepFM, proposed for learning cross (high-order) features, is an end-to-end model that, unlike Wide & Deep, does not require manually constructed features on the wide side.
Network structure:
Structure of the sparse features: categorical features are one-hot encoded; continuous features are either represented by their raw numerical value, or segmented into buckets and then one-hot encoded (a small sketch follows).
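A minimal sketch of that encoding, under my own reading (the helper names and boundary values are illustration only, not from the paper):

```python
import numpy as np

def one_hot(index, size):
    # Encode a categorical index as a one-hot vector.
    v = np.zeros(size)
    v[index] = 1.0
    return v

def bucketize(value, boundaries):
    # Map a continuous value to a bucket index, then one-hot the bucket.
    idx = np.searchsorted(boundaries, value)
    return one_hot(idx, len(boundaries) + 1)

gender = one_hot(1, 2)                    # categorical field, e.g. {male, female}
age = bucketize(34.0, [18, 25, 35, 50])   # continuous field, segmented into buckets
```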
The FM component and the NN component each output a prediction; the two results are summed and passed through a sigmoid: \hat{y} = \mathrm{sigmoid}(y_{FM} + y_{DNN}).
FM part:
The paper points out that even with sparse data, FM can still learn second-order feature interactions effectively. The final FM prediction is:

y_{FM} = \langle w, x \rangle + \sum_{j_1=1}^{d} \sum_{j_2=j_1+1}^{d} \langle V_{j_1}, V_{j_2} \rangle \, x_{j_1} \cdot x_{j_2}
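As a concrete illustration of the second-order term, here is a minimal PyTorch sketch (PyTorch and the function name are my own choices, not the paper's). It uses the standard reformulation sum_{i<j} <e_i, e_j> = 0.5 * (||Σ_i e_i||² − Σ_i ||e_i||²), where e_i = V_i · x_i is the embedded value of feature i:

```python
import torch

def fm_second_order(emb):
    # emb: (batch, num_fields, k) -- one embedding vector per active field
    sum_then_square = emb.sum(dim=1).pow(2)   # (batch, k)
    square_then_sum = emb.pow(2).sum(dim=1)   # (batch, k)
    return 0.5 * (sum_then_square - square_then_sum).sum(dim=1)  # (batch,)

y_second_order = fm_second_order(torch.randn(4, 3, 8))  # 4 samples, 3 fields, k=8
```

This avoids the explicit O(num_fields²) loop over feature pairs while computing the same pairwise-interaction sum.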
Deep part:
The paper points out that this network structure has two special properties:
1) Although the lengths of the input field vectors differ, their embeddings all have the same fixed size k.
2) The FM latent vectors V act as the weight matrix that projects the raw features onto the embedding vectors, and they are learned as part of the network.
(My understanding: the weights learned in the network's embedding layer are exactly the FM latent vectors, which are then reused in the FM component to compute y_FM; see the sketch below.)
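A small sketch of that understanding (illustrative sizes and names, not from the paper): one shared embedding table whose rows are the FM latent vectors V_i, with the same lookups feeding both components.

```python
import torch
import torch.nn as nn

num_features, k = 1000, 8                   # vocabulary size and embedding dim
embedding = nn.Embedding(num_features, k)   # weight matrix == FM latent vectors V

feature_ids = torch.tensor([[3, 42, 917]])  # one sample, three active fields
emb = embedding(feature_ids)                # (1, 3, k): V_i for each active feature
# `emb` feeds the FM pairwise-interaction term AND, flattened, the DNN input.
dnn_input = emb.flatten(start_dim=1)        # (1, 3 * k)
```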
In DeepFM, the FM part and the deep part share the same embeddings, which brings two benefits (a minimal end-to-end sketch follows this list):
1) Low-order and high-order feature interactions can be learned simultaneously from the raw features.
2) No manual feature engineering is needed, unlike W&D.
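Putting the pieces together, a minimal end-to-end sketch under my reading of the paper (layer widths, field counts, and class names are made-up illustration values; it also simplifies to one active categorical feature per field, i.e. all x values are 1):

```python
import torch
import torch.nn as nn

class DeepFMSketch(nn.Module):
    def __init__(self, num_features, num_fields, k=8, hidden=32):
        super().__init__()
        self.first_order = nn.Embedding(num_features, 1)   # w for <w, x>
        self.embedding = nn.Embedding(num_features, k)     # shared latent vectors V
        self.dnn = nn.Sequential(
            nn.Linear(num_fields * k, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, feature_ids):
        # feature_ids: (batch, num_fields) -- one active feature id per field
        emb = self.embedding(feature_ids)                  # (batch, fields, k)
        # FM part: first-order term + pairwise second-order term (sum-square trick)
        first = self.first_order(feature_ids).sum(dim=(1, 2))
        second = 0.5 * (emb.sum(1).pow(2) - emb.pow(2).sum(1)).sum(1)
        y_fm = first + second
        # Deep part: the same embeddings, flattened, through an MLP
        y_dnn = self.dnn(emb.flatten(start_dim=1)).squeeze(1)
        # Final prediction: sigmoid over the sum of both components
        return torch.sigmoid(y_fm + y_dnn)

model = DeepFMSketch(num_features=1000, num_fields=3)
print(model(torch.tensor([[3, 42, 917]])))  # a probability in (0, 1)
```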
The paper then compares several models, including this one and W&D; I have not read the original paper closely yet, so I do not remember the details.
Paper notes - DeepFM: A Factorization-Machine based Neural Network for CTR Prediction