DSSM & Multi-View DSSM TensorFlow implementation



An implementation of "Learning Deep Structured Semantic Models for Web Search using Clickthrough Data" and its follow-up papers.



Also a demo implementation of "A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems".

1. Data



DSSM's input data consists of query pairs: a query together with the documents shown for it. Clicked documents serve as positive samples and unclicked ones as negative samples. The click order can also be used to assign different weights to samples; see the paper for details.



I am not permitted to release my query data, so please prepare a dataset of your own.
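The exact format is up to you; as a hypothetical illustration (not the author's actual format), one training example can pair a query with its clicked document and NEG unclicked ones:

# Hypothetical example layout (not the author's actual format):
NEG = 4
example = {
    'query': 'deep learning',                  # the issued query
    'doc_positive': 'Deep Learning tutorial',  # shown and clicked
    'doc_negatives': ['doc_a', 'doc_b', 'doc_c', 'doc_d'],  # shown, not clicked
}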

2. Word hashing



The original paper uses letter 3-grams. For Chinese I use uni-grams instead, since a single Chinese character already carries meaning (some papers even go down to strokes). Each gram is then one-hot encoded, which greatly reduces the input dimension compared with a word-level vocabulary.
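As a minimal sketch (my own illustration, not the repo's preprocessing), uni-gram one-hot encoding can be implemented with a character vocabulary; `char2id` here is an assumed, pre-built character-to-index map:

import numpy as np

# Minimal uni-gram "word hashing" sketch; char2id is an assumed
# pre-built character-to-index vocabulary, dim is its size.
def unigram_hash(text, char2id, dim):
    vec = np.zeros(dim, dtype=np.float32)
    for ch in text:
        idx = char2id.get(ch)
        if idx is not None:
            vec[idx] += 1.0  # bag-of-characters count
    return vec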



3. Structure

Structure diagram:



The model maps queries and documents into a common low-dimensional vector space and scores each query-document pair by the cosine similarity of their vectors.
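Concretely, if y_Q and y_D are the output vectors of the query and document towers, the paper scores each pair with

cos(Q, D) = y_Q · y_D / (||y_Q|| * ||y_D||)

and turns the scores for the positive document D+ and NEG sampled negatives into a posterior

P(D+ | Q) = exp(gamma * cos(Q, D+)) / sum over D' of exp(gamma * cos(Q, D'))

where gamma is a smoothing factor (set to 20 in the code below). Training minimizes -log P(D+ | Q) summed over the queries in a batch.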

3.1 Input



TensorBoard is used for visualization, so each block of the graph is wrapped in a name_scope:



import time
import numpy as np
import tensorflow as tf

# TRIGRAM_D, L1_N, NEG and query_BS are constants defined elsewhere in the repo.
with tf.name_scope('input'):
    # Sparse one-hot input for the query and the positive / negative docs.
    query_batch = tf.sparse_placeholder(tf.float32, shape=[None, TRIGRAM_D], name='QueryBatch')
    doc_positive_batch = tf.sparse_placeholder(tf.float32, shape=[None, TRIGRAM_D], name='DocPositiveBatch')
    doc_negative_batch = tf.sparse_placeholder(tf.float32, shape=[None, TRIGRAM_D], name='DocNegativeBatch')
    # Switches batch normalization between training and inference mode.
    on_train = tf.placeholder(tf.bool)
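At run time, a sparse_placeholder is fed with a tf.SparseTensorValue; a minimal illustration (the column indices here are made up):

# Illustrative only: feed one query whose one-hot vector has two
# non-zero columns (17 and 4096 are made-up indices).
query_in = tf.SparseTensorValue(
    indices=[[0, 17], [0, 4096]],
    values=[1.0, 1.0],
    dense_shape=[1, TRIGRAM_D])
# sess.run(..., feed_dict={query_batch: query_in, ...})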
3.2 Fully connected layers


I use three fully connected layers. Apart from the number of neurons per layer, every layer has the same structure, so a single reusable function can build them:

l_n = W_n x + b_n






def add_layer(inputs, in_size, out_size, activation_function=None):
    # Xavier/Glorot-style uniform initialization.
    wlimit = np.sqrt(6.0 / (in_size + out_size))
    weights = tf.Variable(tf.random_uniform([in_size, out_size], -wlimit, wlimit))
    biases = tf.Variable(tf.random_uniform([out_size], -wlimit, wlimit))
    wx_plus_b = tf.matmul(inputs, weights) + biases
    if activation_function is None:
        outputs = wx_plus_b
    else:
        outputs = activation_function(wx_plus_b)
    return outputs
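Note that the input placeholders are sparse, so the first layer cannot use a dense tf.matmul directly. A sparse-aware variant for the first layer might look like this (my own sketch, using tf.sparse_tensor_dense_matmul):

def add_sparse_layer(sparse_inputs, in_size, out_size, activation_function=None):
    # Same Xavier-style initialization as add_layer, but the input is a
    # SparseTensor, so use tf.sparse_tensor_dense_matmul instead of tf.matmul.
    wlimit = np.sqrt(6.0 / (in_size + out_size))
    weights = tf.Variable(tf.random_uniform([in_size, out_size], -wlimit, wlimit))
    biases = tf.Variable(tf.random_uniform([out_size], -wlimit, wlimit))
    wx_plus_b = tf.sparse_tensor_dense_matmul(sparse_inputs, weights) + biases
    return wx_plus_b if activation_function is None else activation_function(wx_plus_b)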


The weights and biases are initialized as in the paper, i.e. Xavier/Glorot uniform initialization with limit sqrt(6 / (fan_in + fan_out)):



    wlimit = np.sqrt(6.0 / (in_size + out_size))
    weights = tf.Variable(tf.random_uniform([in_size, out_size], -wlimit, wlimit))
    biases = tf.Variable(tf.random_uniform([out_size], -wlimit, wlimit))
Batch Normalization
def batch_normalization(x, phase_train, out_size):
    """
    Batch normalization for a fully connected layer.
    Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow
    Args:
        x:           Tensor, [batch, out_size] activations
        phase_train: boolean tf.Variable, true indicates training phase
        out_size:    integer, width of the layer
    Return:
        normed:      batch-normalized activations
    """
    with tf.variable_scope('bn'):
        beta = tf.Variable(tf.constant(0.0, shape=[out_size]),
                           name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=[out_size]),
                            name='gamma', trainable=True)
        batch_mean, batch_var = tf.nn.moments(x, [0], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.5)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        mean, var = tf.cond(phase_train,
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, 1e-3)
    return normed
Single Layer

with tf.name_scope('FC1'):
    # The activation is applied after BN, so no activation here.
    query_l1 = add_layer(query_batch, TRIGRAM_D, L1_N, activation_function=None)
    doc_positive_l1 = add_layer(doc_positive_batch, TRIGRAM_D, L1_N, activation_function=None)
    doc_negative_l1 = add_layer(doc_negative_batch, TRIGRAM_D, L1_N, activation_function=None)

with tf.name_scope('BN1'):
    query_l1 = batch_normalization(query_l1, on_train, L1_N)
    # Normalize positive and negative docs together with shared statistics.
    doc_l1 = batch_normalization(tf.concat([doc_positive_l1, doc_negative_l1], axis=0), on_train, L1_N)
    doc_positive_l1 = tf.slice(doc_l1, [0, 0], [query_BS, -1])
    doc_negative_l1 = tf.slice(doc_l1, [query_BS, 0], [-1, -1])
    query_l1_out = tf.nn.relu(query_l1)
    doc_positive_l1_out = tf.nn.relu(doc_positive_l1)
    doc_negative_l1_out = tf.nn.relu(doc_negative_l1)
...... (the remaining two layers follow the same pattern)


Merge negative samples



with tf.name_scope('merge_negative_doc'):
    # Merge negative samples; tile can optionally replicate the positives.
    doc_y = tf.tile(doc_positive_y, [1, 1])
    for i in range(NEG):
        for j in range(query_BS):
            # tf.slice(input_, begin, size): take the i-th negative of query j.
            doc_y = tf.concat([doc_y, tf.slice(doc_negative_y, [j * NEG + i, 0], [1, -1])], 0)
3.3 Computing the cosine similarity

with tf.name_scope('cosine_similarity'):
    # Cosine similarity
    # query_norm = sqrt(sum(each x^2)), tiled to match the merged docs.
    query_norm = tf.tile(tf.sqrt(tf.reduce_sum(tf.square(query_y), 1, True)), [NEG + 1, 1])
    # doc_norm = sqrt(sum(each x^2))
    doc_norm = tf.sqrt(tf.reduce_sum(tf.square(doc_y), 1, True))

    prod = tf.reduce_sum(tf.multiply(tf.tile(query_y, [NEG + 1, 1]), doc_y), 1, True)
    norm_prod = tf.multiply(query_norm, doc_norm)

    # cos_sim_raw = query * doc / (||query|| * ||doc||)
    cos_sim_raw = tf.truediv(prod, norm_prod)
    # Reshape to [query_BS, NEG + 1]; column 0 is the positive doc.
    # The factor 20 is the smoothing factor gamma from the paper.
    cos_sim = tf.transpose(tf.reshape(tf.transpose(cos_sim_raw), [NEG + 1, query_BS])) * 20
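To see why this reshape works, here is a small NumPy check (my own illustration). After merging, doc_y holds the query_BS positives first, then NEG blocks of negatives with one row per query, so reshaping the flat similarity vector to [NEG + 1, query_BS] and transposing puts each query's positive in column 0:

import numpy as np

query_BS, NEG = 2, 3
# Row labels in doc_y order: 'p{j}' = positive of query j,
# 'n{j}_{i}' = i-th negative of query j (appended i-major, as above).
rows = ['p0', 'p1'] + ['n%d_%d' % (j, i) for i in range(NEG) for j in range(query_BS)]
print(np.array(rows).reshape(NEG + 1, query_BS).T)
# [['p0' 'n0_0' 'n0_1' 'n0_2']
#  ['p1' 'n1_0' 'n1_1' 'n1_2']]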
3.4 Defining the loss function

with tf.name_scope('loss'):
    # Train loss: convert similarities into a softmax probability matrix.
    prob = tf.nn.softmax(cos_sim)
    # Keep only the first column, i.e. the probability of the positive doc.
    hit_prob = tf.slice(prob, [0, 0], [-1, 1])
    loss = -tf.reduce_sum(tf.log(hit_prob))
    tf.summary.scalar('loss', loss)
3.5 Selecting an optimization method

with tf.name_scope('training'):
    # Optimizer
    train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize(loss)


3.6 Start training



# Create a Saver object to save variables or the whole model.
saver = tf.train.Saver()
# with tf.Session(config=config) as sess:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train', sess.graph)
    start = time.time()
    for step in range(FLAGS.max_steps):
        batch_id = step % FLAGS.epoch_steps
        sess.run(train_step, feed_dict=feed_dict(True, True, batch_id % FLAGS.pack_size, 0.5))
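The feed_dict() helper above comes from the repo and is not shown here; a hypothetical sketch of its role (the get_*_batch loaders and argument names are my assumptions):

# Hypothetical sketch: pack one batch into the sparse placeholders.
# get_query_batch / get_doc_*_batch are assumed loaders that return
# tf.SparseTensorValue objects for the given batch index; drop_prob
# would feed dropout layers not shown in the excerpts above.
def feed_dict(on_training, train, batch_id, drop_prob):
    return {
        query_batch: get_query_batch(batch_id, train),
        doc_positive_batch: get_doc_positive_batch(batch_id, train),
        doc_negative_batch: get_doc_negative_batch(batch_id, train),
        on_train: on_training,
    }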


Full code on GitHub: https://github.com/InsaneLife/dssm



The Multi-View DSSM implementation follows the same pattern; see multi_view_dssm_v3 in the same GitHub repo.



CSDN Original: http://blog.csdn.net/shine19930820/article/details/79042567

