An implementation demo of "Learning Deep Structured Semantic Models for Web Search using Clickthrough Data" (and its follow-up papers) and "A Multi-View Deep Learning Approach to Cross Domain User Modeling in Recommendation Systems".
1. Data
For DSSM, the input data are query-document pairs: a query together with the documents shown for it. Clicked and non-clicked documents serve as positive and negative samples respectively, and different weights can be assigned according to click order; see the paper for details.
I am not permitted to release my query data, so please prepare your own.
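As an illustration only, here is a minimal sketch of turning click logs into (query, positive doc, negative docs) training triples. The log format, field names, and the NEG count below are assumptions for this sketch, not part of the original code:

# Hypothetical example: build (query, positive doc, negative docs) samples
# from a tab-separated click log of the form "query \t doc \t clicked".
import csv
from collections import defaultdict

NEG = 4  # negatives per query, assumed

def build_samples(log_path):
    clicked, not_clicked = defaultdict(list), defaultdict(list)
    with open(log_path, encoding='utf-8') as f:
        for query, doc, label in csv.reader(f, delimiter='\t'):
            (clicked if label == '1' else not_clicked)[query].append(doc)
    samples = []
    for query, pos_docs in clicked.items():
        negs = not_clicked[query][:NEG]
        if len(negs) == NEG:
            for pos in pos_docs:
                samples.append((query, pos, negs))
    return samples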
2. Word hashing
The original paper uses letter 3-grams. For Chinese I use uni-grams (single characters) instead, since individual Chinese characters already carry meaning on their own (there are also papers that work at the stroke level). Each gram is represented by a one-hot encoding, which greatly reduces the input dimension.
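As a rough sketch of this idea (not the repository's code), one can build a character vocabulary and encode each sentence as a sparse bag-of-characters vector; the function names and vocabulary construction below are assumptions:

# Illustration: character-level (uni-gram) one-hot / bag-of-characters encoding.
import numpy as np
from scipy.sparse import csr_matrix

def build_vocab(sentences):
    # Map every distinct character (uni-gram) to an index.
    chars = sorted({c for s in sentences for c in s})
    return {c: i for i, c in enumerate(chars)}

def encode(sentences, vocab):
    # Bag of characters: one-hot per character, summed over the sentence.
    rows, cols, vals = [], [], []
    for r, s in enumerate(sentences):
        for c in s:
            if c in vocab:
                rows.append(r)
                cols.append(vocab[c])
                vals.append(1.0)
    return csr_matrix((vals, (rows, cols)), shape=(len(sentences), len(vocab)))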
3. Structure
Structure diagram:
The network maps inputs to low-dimensional vectors and computes the cosine similarity between query and document vectors.
3.1 Input
TensorBoard visualization is used here, so name scopes are defined:
import time
import numpy as np
import tensorflow as tf

with tf.name_scope('input'):
    query_batch = tf.sparse_placeholder(tf.float32, shape=[None, TRIGRAM_D], name='QueryBatch')
    doc_positive_batch = tf.sparse_placeholder(tf.float32, shape=[None, TRIGRAM_D], name='DocBatch')
    doc_negative_batch = tf.sparse_placeholder(tf.float32, shape=[None, TRIGRAM_D], name='DocBatch')
    on_train = tf.placeholder(tf.bool)
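Since the inputs are sparse placeholders, each batch has to be fed as a tf.SparseTensorValue. A minimal sketch, reusing the numpy and tensorflow imports above and assuming the batch is available as a SciPy COO matrix (the variable names are illustrative):

# Illustration only: convert a SciPy sparse batch into the form expected by
# tf.sparse_placeholder. 'query_coo' is an assumed scipy.sparse.coo_matrix.
def to_sparse_tensor_value(coo):
    indices = np.stack([coo.row, coo.col], axis=1)
    return tf.SparseTensorValue(indices, coo.data.astype(np.float32), coo.shape)

# feed_dict_example = {query_batch: to_sparse_tensor_value(query_coo)}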
3.2 Fully connected layers
I use three fully connected layers. Each layer is identical apart from the number of neurons, so a single reusable function can be written.
l_n = W_n x + b_n
def add_layer(inputs, in_size, out_size, activation_function=None):
    wlimit = np.sqrt(6.0 / (in_size + out_size))
    Weights = tf.Variable(tf.random_uniform([in_size, out_size], -wlimit, wlimit))
    biases = tf.Variable(tf.random_uniform([out_size], -wlimit, wlimit))
    wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = wx_plus_b
    else:
        outputs = activation_function(wx_plus_b)
    return outputs
The weights and biases are initialized as specified in the paper:

wlimit = np.sqrt(6.0 / (in_size + out_size))
Weights = tf.Variable(tf.random_uniform([in_size, out_size], -wlimit, wlimit))
biases = tf.Variable(tf.random_uniform([out_size], -wlimit, wlimit))
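This is the Xavier/Glorot uniform initialization. As a side note I am adding (not part of the original code), newer TensorFlow 1.x versions expose the same scheme for the weights directly:

# Equivalent weight initialization via the built-in Glorot uniform initializer
# (TensorFlow >= 1.4); note the snippet above also applies the same limit to
# the biases, which the built-in initializer would compute differently.
Weights = tf.get_variable('weights', shape=[in_size, out_size],
                          initializer=tf.glorot_uniform_initializer())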
Batch Normalization
def batch_normalization(x, phase_train, out_size):
    """
    Batch normalization on fully connected layers.
    Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow
    Args:
        x: Tensor, 2D [batch, features] input
        phase_train: boolean tf.Variable, true indicates training phase
        out_size: integer, depth of the input
    Return:
        normed: batch-normalized output
    """
    with tf.variable_scope('bn'):
        beta = tf.Variable(tf.constant(0.0, shape=[out_size]),
                           name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=[out_size]),
                            name='gamma', trainable=True)
        batch_mean, batch_var = tf.nn.moments(x, [0], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.5)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        mean, var = tf.cond(phase_train,
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, 1e-3)
    return normed
Single Layer
with tf.name_scope('FC1'):
    # The activation function is applied after BN, so it is None here.
    query_l1 = add_layer(query_batch, TRIGRAM_D, L1_N, activation_function=None)
    doc_positive_l1 = add_layer(doc_positive_batch, TRIGRAM_D, L1_N, activation_function=None)
    doc_negative_l1 = add_layer(doc_negative_batch, TRIGRAM_D, L1_N, activation_function=None)

with tf.name_scope('BN1'):
    query_l1 = batch_normalization(query_l1, on_train, L1_N)
    doc_l1 = batch_normalization(tf.concat([doc_positive_l1, doc_negative_l1], axis=0), on_train, L1_N)
    doc_positive_l1 = tf.slice(doc_l1, [0, 0], [query_BS, -1])
    doc_negative_l1 = tf.slice(doc_l1, [query_BS, 0], [-1, -1])

    query_l1_out = tf.nn.relu(query_l1)
    doc_positive_l1_out = tf.nn.relu(doc_positive_l1)
    doc_negative_l1_out = tf.nn.relu(doc_negative_l1)
······
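The remaining layers follow the same pattern. As a sketch only (the layer width L2_N and variable names are assumptions consistent with the pattern above, not the repository's exact code), the second layer would look like:

# Sketch of the second layer, following the FC1/BN1 pattern above.
# L2_N (the layer width) is an assumed hyperparameter.
with tf.name_scope('FC2'):
    query_l2 = add_layer(query_l1_out, L1_N, L2_N, activation_function=None)
    doc_positive_l2 = add_layer(doc_positive_l1_out, L1_N, L2_N, activation_function=None)
    doc_negative_l2 = add_layer(doc_negative_l1_out, L1_N, L2_N, activation_function=None)

with tf.name_scope('BN2'):
    query_l2 = batch_normalization(query_l2, on_train, L2_N)
    doc_l2 = batch_normalization(tf.concat([doc_positive_l2, doc_negative_l2], axis=0), on_train, L2_N)
    doc_positive_l2 = tf.slice(doc_l2, [0, 0], [query_BS, -1])
    doc_negative_l2 = tf.slice(doc_l2, [query_BS, 0], [-1, -1])

    query_l2_out = tf.nn.relu(query_l2)
    doc_positive_l2_out = tf.nn.relu(doc_positive_l2)
    doc_negative_l2_out = tf.nn.relu(doc_negative_l2)

# The final layer's outputs are the embeddings query_y, doc_positive_y and
# doc_negative_y used below.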
Merge Negative samples
with tf.name_scope('Merge_Negative_Doc'):
    # Merge negative samples; tile can optionally expand the negative samples.
    doc_y = tf.tile(doc_positive_y, [1, 1])
    for i in range(NEG):
        for j in range(query_BS):
            # slice(input_, begin, size) slicing API
            doc_y = tf.concat([doc_y, tf.slice(doc_negative_y, [j * NEG + i, 0], [1, -1])], 0)
    # After the loop, doc_y stacks the query_BS positive docs first, then, for
    # each of the NEG negative slots, one negative doc per query (in query
    # order), matching tf.tile(query_y, [NEG + 1, 1]) below.
3.3 Computing the cosine similarity
with tf.name_scope('Cosine_Similarity'):
    # Cosine similarity
    # query_norm = sqrt(sum(each x^2))
    query_norm = tf.tile(tf.sqrt(tf.reduce_sum(tf.square(query_y), 1, True)), [NEG + 1, 1])
    # doc_norm = sqrt(sum(each x^2))
    doc_norm = tf.sqrt(tf.reduce_sum(tf.square(doc_y), 1, True))

    prod = tf.reduce_sum(tf.multiply(tf.tile(query_y, [NEG + 1, 1]), doc_y), 1, True)
    norm_prod = tf.multiply(query_norm, doc_norm)

    # cos_sim_raw = query * doc / (||query|| * ||doc||)
    cos_sim_raw = tf.truediv(prod, norm_prod)
    # Reshape to [query_BS, NEG + 1] (one row per query: positive first, then
    # negatives) and scale by the smoothing factor gamma = 20.
    cos_sim = tf.transpose(tf.reshape(tf.transpose(cos_sim_raw), [NEG + 1, query_BS])) * 20
3.4 Defining the loss function
with tf.name_scope('Loss'):
    # Train loss
    # Convert to a softmax probability matrix.
    prob = tf.nn.softmax(cos_sim)
    # Take only the first column, i.e. the probability of the positive sample.
    hit_prob = tf.slice(prob, [0, 0], [-1, 1])
    loss = -tf.reduce_sum(tf.log(hit_prob))
    tf.summary.scalar('loss', loss)
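This matches the loss in the DSSM paper: the scaled cosine scores define a posterior over the positive document and its negatives, and the log-likelihood of the clicked documents is maximized (with smoothing factor $\gamma$, fixed to 20 above):

$$P(D^+ \mid Q) = \frac{\exp\big(\gamma \cos(Q, D^+)\big)}{\sum_{D' \in \mathbf{D}} \exp\big(\gamma \cos(Q, D')\big)}, \qquad
L = -\log \prod_{(Q, D^+)} P(D^+ \mid Q)$$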
3.5 Selecting an optimization method
with tf.name_scope('Training'):
    # Optimizer
    train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize(loss)
3.6 Start training
# Create a Saver object to optionally save variables or the model.
saver = tf.train.Saver()
# with tf.Session(config=config) as sess:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train', sess.graph)
    start = time.time()
    for step in range(FLAGS.max_steps):
        batch_id = step % FLAGS.epoch_steps
        sess.run(train_step, feed_dict=feed_dict(True, True, batch_id % FLAGS.pack_size, 0.5))
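The saver created above is not used in the shortened loop shown here. A hedged sketch of how summaries and checkpoints could be written periodically inside that loop (the interval and the 'model/dssm.ckpt' path are assumptions, and the merged summaries are fetched via tf.summary.merge_all()):

# Illustration only: periodic summary logging and checkpointing inside the
# training loop above. The 100-step interval and checkpoint path are assumed.
merged = tf.summary.merge_all()

for step in range(FLAGS.max_steps):
    batch_id = step % FLAGS.epoch_steps
    _, summary = sess.run([train_step, merged],
                          feed_dict=feed_dict(True, True, batch_id % FLAGS.pack_size, 0.5))
    if step % 100 == 0:
        train_writer.add_summary(summary, step)
        saver.save(sess, 'model/dssm.ckpt', global_step=step)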
Full code on GitHub: https://github.com/InsaneLife/dssm
The Multi-View DSSM implementation is analogous; see multi_view_dssm_v3 in the same GitHub repository.
CSDN Original: http://blog.csdn.net/shine19930820/article/details/79042567