Learning notes TF009: logistic regression

Source: Internet
Author: User

The logistic function, also known as the sigmoid function, has the shape of a cumulative probability distribution function. Given a specific input, it outputs the probability of "success", that is, the probability that the answer to a question is "Yes". The function accepts a single input. Multi-dimensional data, such as the features of a training-set sample, can be combined into a single value using the linear regression model expression.
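As a minimal sketch of the idea above, the following plain-Python snippet combines several features into one value with a linear expression and passes it through the sigmoid. The weights, bias and feature values are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def combine_inputs(features, weights, bias):
    """Linear regression expression: weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias

# Hypothetical values, not taken from the notes.
weights = [0.5, -0.25]
bias = 0.1
features = [2.0, 4.0]

z = combine_inputs(features, weights, bias)  # 0.5*2.0 - 0.25*4.0 + 0.1
p_yes = sigmoid(z)                           # probability of "Yes"
print(z, p_yes)
```

An input of 0 gives a probability of exactly 0.5; larger positive inputs move the output toward 1 and larger negative inputs toward 0.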

The loss function could use squared error. In the training set, "Yes" means a probability of 100%, or an output value of 1. The loss then measures how far the probability the model assigns to that sample falls short of 1, squared. For "No" the probability value is 0, and the loss is the probability the model assigns to the sample, squared. With squared error, the penalty stays in the same order of magnitude as the error. Cross entropy instead outputs a much larger value (penalty) when the output is far from the expected one. When the predicted probability for a "Yes" sample approaches 0, the penalty grows toward infinity, so after training the model is very unlikely to make such a badly wrong prediction. TensorFlow provides tf.nn.sigmoid_cross_entropy_with_logits, which computes the cross entropy for a sigmoid output in a single optimized step.
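To make the difference between the two penalties concrete, here is a small plain-Python sketch. The function names are illustrative. The last function uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)), which is how the loss can be computed directly from the pre-sigmoid value; treat it as an illustration of the idea rather than TensorFlow's actual implementation.

```python
import math

def squared_error(p, y):
    """Squared difference between predicted probability p and label y."""
    return (y - p) ** 2

def cross_entropy(p, y):
    """Binary cross entropy for predicted probability p and label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def cross_entropy_from_logit(z, y):
    """Same loss computed from the pre-sigmoid value z in a stable form."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

y = 1.0  # a "Yes" sample
for p in (0.9, 0.5, 0.1, 0.001):
    # Squared error never exceeds 1; cross entropy keeps growing as p -> 0.
    print(p, squared_error(p, y), cross_entropy(p, y))
```

For a "Yes" sample, the squared error is bounded by 1 no matter how wrong the prediction is, while the cross entropy grows without bound as the predicted probability approaches 0.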

In information theory, when the probability of occurrence of each symbol in a string is known, Shannon entropy estimates the average minimum number of bits needed to encode each symbol. If the encoding assumes other, wrong probabilities, the symbols need more bits on average. Cross entropy computes the average minimum number of bits needed to encode the same string under such a sub-optimal encoding scheme. The loss function compares two probability distributions: the expected output, whose actual probabilities are 100% and 0, and the model's computed output, the probability produced by the sigmoid function, which plays the role of the assumed probabilities. Cross entropy is at its minimum when the assumed probabilities equal the actual ones, and at that point it equals the entropy. The closer the cross entropy is to the entropy, the better the assumed probabilities approximate the actual ones. In other words, the closer the model output is to the expected output, the smaller the cross entropy.
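The relationship between entropy and cross entropy can be checked with a short plain-Python sketch. The three-symbol distribution below is a made-up example, and both quantities are measured in bits.

```python
import math

def entropy(p):
    """Shannon entropy in bits: average code length with the true probabilities."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Average code length in bits when symbols follow p but the code assumes q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical symbol probabilities, chosen so the logarithms are whole numbers.
p = [0.5, 0.25, 0.25]        # actual probabilities
q_good = [0.5, 0.25, 0.25]   # assumed probabilities that match the actual ones
q_bad = [0.25, 0.25, 0.5]    # assumed probabilities that are wrong

h = entropy(p)
h_good = cross_entropy(p, q_good)
h_bad = cross_entropy(p, q_bad)
print(h, h_good, h_bad)
```

With matching probabilities the cross entropy equals the entropy; with the wrong assumption it is strictly larger, which is the property the loss function relies on.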

The input step reads data from a CSV file, parses it, and creates batches of multiple rows as tensors to make inference more efficient. The tf.decode_csv() op converts a string (a line of text) into a tuple of tensor columns with the specified defaults, which also set the data type of each column. The reader then loads batch_size rows into each tensor.

Categorical data needs extra work, because the inference model requires string features to be converted into numeric features. Each categorical feature is expanded into N Boolean features, one dimension for each possible value, and the dimension corresponding to the actual value is set to 1. This lets the model weight each possible value independently. An attribute with only two possible values can be represented by a single variable. All features are packed into a matrix, which is then transposed so that each row is a sample and each column is a feature. The inputs function calls read_csv to read and convert the data. tf.equal checks whether an attribute value equals a constant, tf.to_float converts the resulting Boolean to a number, and tf.stack packs all the values into a single tensor.
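The categorical conversion described above can be sketched without TensorFlow. The two passenger rows below are hypothetical stand-ins for rows of the CSV file, and one_hot and to_features are illustrative helper names.

```python
def one_hot(value, categories):
    """Expand one categorical value into one 0/1 feature per possible value."""
    return [1.0 if value == c else 0.0 for c in categories]

def to_features(row):
    """Build the feature row: three class flags, a gender flag, and the age."""
    class_flags = one_hot(row["pclass"], [1, 2, 3])
    gender = 1.0 if row["sex"] == "female" else 0.0  # two values fit one variable
    return class_flags + [gender, row["age"]]

# Hypothetical rows, for illustration only.
rows = [
    {"pclass": 1, "sex": "female", "age": 29.0},
    {"pclass": 3, "sex": "male", "age": 22.0},
]

features = [to_features(r) for r in rows]  # one row per sample, one column per feature
print(features)
```

Each passenger class gets its own column and therefore its own weight, while gender needs only one column because it has two possible values in this data.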

After training, the model is evaluated by measuring accuracy: the ratio of correctly predicted samples to all samples. A sample output greater than 0.5 is converted to a positive answer. tf.equal compares whether the prediction equals the actual value. tf.reduce_mean then counts all correctly predicted samples and divides by the total number of samples in the batch, which gives the percentage of correct predictions.
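The accuracy calculation can be illustrated in plain Python. The probabilities and labels below are made up for the example.

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction equals the label."""
    predicted = [1.0 if p > threshold else 0.0 for p in probabilities]
    correct = [1.0 if a == b else 0.0 for a, b in zip(predicted, labels)]
    return sum(correct) / len(correct)  # mean of the 0/1 "correct" flags

# Hypothetical model outputs and true answers.
probs = [0.9, 0.4, 0.6, 0.2]
labels = [1.0, 0.0, 0.0, 0.0]

acc = accuracy(probs, labels)
print(acc)
```

Taking the mean of the 0/1 flags is the same as counting the correct predictions and dividing by the batch size.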

 

from __future__ import print_function  # lets the print() calls also run on Python 2

import os
import time

import tensorflow as tf  # uses the TensorFlow 1.x graph and queue-based input API

# Parameter (variable) initialization
W = tf.Variable(tf.zeros([5, 1]), name="weights")  # weights, one per feature
b = tf.Variable(0., name="bias")  # constant of the linear function, the model bias


def combine_inputs(X):
    # Merge the input values into a single value per sample
    print("function: combine_inputs")
    return tf.matmul(X, W) + b


def inference(X):
    # Compute and return the output of the inference model for data X
    print("function: inference")
    return tf.sigmoid(combine_inputs(X))  # apply the logistic function


def loss(X, Y):
    # Compute the loss for training data X and expected output Y
    print("function: loss")
    return tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(
            logits=combine_inputs(X), labels=Y))  # mean over the batch


def read_csv(batch_size, file_name, record_defaults):
    # Read the CSV file, parse it, and create batches of multiple rows as tensors
    filename_queue = tf.train.string_input_producer(
        [os.path.join(os.getcwd(), file_name)])
    reader = tf.TextLineReader(skip_header_lines=1)
    key, value = reader.read(filename_queue)
    # Convert each text line into a tuple of tensor columns with the given
    # defaults, which also set the data type of each column
    decoded = tf.decode_csv(value, record_defaults=record_defaults)
    # Load batch_size rows into each tensor
    return tf.train.shuffle_batch(
        decoded,
        batch_size=batch_size,
        capacity=batch_size * 50,
        min_after_dequeue=batch_size)


def inputs():
    # Read or generate training data X and expected output Y
    print("function: inputs")
    # Data source: https://www.kaggle.com/c/titanic/data
    # The model uses a passenger's ticket class, gender and age to infer survival
    passenger_id, survived, pclass, name, sex, age, sibsp, parch, ticket, fare, cabin, embarked = \
        read_csv(100, "train.csv",
                 [[0.0], [0.0], [0], [""], [""], [0.0], [0.0], [0.0], [""], [0.0], [""], [""]])
    # Convert the categorical data
    is_first_class = tf.to_float(tf.equal(pclass, [1]))   # first-class ticket
    is_second_class = tf.to_float(tf.equal(pclass, [2]))  # second-class ticket
    is_third_class = tf.to_float(tf.equal(pclass, [3]))   # third-class ticket
    gender = tf.to_float(tf.equal(sex, ["female"]))       # male is 0, female is 1
    # Pack all features into a matrix and transpose it so that
    # each row is a sample and each column is a feature
    features = tf.transpose(
        tf.stack([is_first_class, is_second_class, is_third_class, gender, age]))
    survived = tf.reshape(survived, [100, 1])
    return features, survived


def train(total_loss):
    # Train, that is, adjust the model parameters based on the total loss
    print("function: train")
    learning_rate = 0.01
    return tf.train.GradientDescentOptimizer(learning_rate).minimize(total_loss)


def evaluate(sess, X, Y):
    # Evaluate the trained model
    print("function: evaluate")
    # A sample output greater than 0.5 is converted to a positive answer
    predicted = tf.cast(inference(X) > 0.5, tf.float32)
    # Count the correct predictions and divide by the batch size
    # to get the fraction of correct predictions
    print(sess.run(tf.reduce_mean(tf.cast(tf.equal(predicted, Y), tf.float32))))


# Launch the data flow graph in a session
with tf.Session() as sess:
    print("Session: start")
    tf.global_variables_initializer().run()
    X, Y = inputs()
    total_loss = loss(X, Y)
    train_op = train(total_loss)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    training_steps = 1000  # number of training iterations
    for step in range(training_steps):  # the actual training loop
        sess.run([train_op])
        if step % 10 == 0:
            print(str(step) + " loss: ", sess.run([total_loss]))
    print(str(training_steps) + " final loss: ", sess.run([total_loss]))
    evaluate(sess, X, Y)  # model evaluation
    time.sleep(5)
    coord.request_stop()
    coord.join(threads)
    sess.close()  # redundant inside the with block, but harmless
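What the training loop in the listing above does on each step can be sketched in plain Python for a model with a single feature. This is a hand-written gradient descent under simplifying assumptions, not the optimizer's actual implementation; the toy data and the helper names mean_loss and train_step are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(w, b, xs, ys):
    """Mean binary cross entropy over the batch."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(xs)

def train_step(w, b, xs, ys, learning_rate):
    """One gradient descent step; the gradient of the loss is mean((p - y) * x)."""
    n = len(xs)
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b

# Toy data: negative inputs are "No" (0), positive inputs are "Yes" (1).
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0.0, 0.0, 1.0, 1.0]

w, b = 0.0, 0.0  # start from zeros, as the listing does
initial_loss = mean_loss(w, b, xs, ys)
for step in range(1000):
    w, b = train_step(w, b, xs, ys, learning_rate=0.01)
final_loss = mean_loss(w, b, xs, ys)
print(initial_loss, final_loss, w, b)
```

Starting from zero weights, every prediction is 0.5 and the loss equals ln 2; each step moves the weight in the direction that lowers the mean cross entropy.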

 


References:
TensorFlow for Machine Intelligence
