Coursera Deep Learning, Sequence Models, Week 1: Dinosaurus Island, Character-Level Language Model

Source: Internet
Author: User
Character level language model - Dinosaurus land

Welcome to Dinosaurus Island! Millions of years ago, dinosaurs existed, and in this assignment they are back. You are in charge of a special task. Leading biology researchers are creating new breeds of dinosaurs and bringing them to life on earth, and your job is to give names to these dinosaurs. If a dinosaur does not like its name, it might go berserk, so choose wisely!

Luckily you have learned some deep learning, and you'll use it to save the day. Your assistant has collected a list of all the dinosaur names they could find and compiled them into this dataset. (Feel free to take a look by clicking the previous link.) To create new dinosaur names, you'll build a character-level language model to generate new names. Your algorithm will learn the different name patterns and randomly generate new names. Hopefully this algorithm will keep you and your team safe from the dinosaurs' wrath!

By completing this assignment you'll learn:
- How to store text data for processing using an RNN
- How to synthesize data, by sampling predictions at each time step and passing them to the next RNN-cell unit
- How to build a character-level text generation recurrent neural network
- Why clipping the gradients is important

We'll begin by loading in some functions that we have provided for you in rnn_utils. Specifically, you have access to functions such as rnn_forward and rnn_backward, which are equivalent to those you've implemented in the previous assignment.

import numpy as np
from utils import *
import random

1 - Problem Statement

1.1 - Dataset and preprocessing

Run the following cell to read the dataset of dinosaur names, create a list of unique characters (such as a-z), and compute the dataset and vocabulary size.

data = open('dinos.txt', 'r').read()
data = data.lower()
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
print('There are %d total characters and %d unique characters in your data.' % (data_size, vocab_size))

There are 19909 total characters and 27 unique characters in your data.

The characters are a-z (26 characters) plus the "\n" (or newline character), which in this assignment plays a role similar to the <EOS> (or "End of sentence") token we had discussed in lecture, only here it indicates the end of the dinosaur name rather than the end of a sentence. In the cell below, we create a Python dictionary (i.e., a hash table) to map each character to an index from 0-26. We also create a second Python dictionary that maps each index back to the corresponding character. This will help you figure out what index corresponds to what character in the probability distribution output of the softmax layer. Below, char_to_ix and ix_to_char are the Python dictionaries.

char_to_ix = {ch: i for i, ch in enumerate(sorted(chars))}
ix_to_char = {i: ch for i, ch in enumerate(sorted(chars))}
print(ix_to_char)

{0: '\n', 1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e', 6: 'f', 7: 'g',
 8: 'h', 9: 'i', 10: 'j', 11: 'k', 12: 'l', 13: 'm', 14: 'n', 15: 'o',
 16: 'p', 17: 'q', 18: 'r', 19: 's', 20: 't',
 21: 'u', 22: 'v', 23: 'w', 24: 'x', 25: 'y', 26: 'z'}

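To make the role of the two dictionaries concrete, here is a minimal sketch of encoding a name as indices, turning one index into a one-hot column vector, and decoding back. The name "trex" and the helper variable names are hypothetical, and the vocabulary is rebuilt by hand (newline plus a-z) instead of being read from the dataset.

```python
import numpy as np

# Assumption: a 27-symbol vocabulary (newline + a-z), rebuilt here so the
# example does not depend on the dataset file.
chars = ['\n'] + [chr(c) for c in range(ord('a'), ord('z') + 1)]
char_to_ix = {ch: i for i, ch in enumerate(sorted(chars))}
ix_to_char = {i: ch for i, ch in enumerate(sorted(chars))}

name = "trex"  # hypothetical example name

# Encode: one index per character, with the newline marking the end of the name.
indices = [char_to_ix[ch] for ch in name + "\n"]

# One-hot column vector for the first character, shape (vocab_size, 1).
one_hot = np.zeros((len(chars), 1))
one_hot[indices[0]] = 1

# Decode: map the indices back to characters.
decoded = "".join(ix_to_char[i] for i in indices)

print(indices)        # [20, 18, 5, 24, 0]
print(repr(decoded))  # 'trex\n'
```

The index 0 at the end is the newline, which acts as the end-of-name marker described above.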
1.2 - Overview of the model

Your model will have the following structure:
- Initialize parameters
- Run the optimization loop
    - Forward propagation to compute the loss function
    - Backward propagation to compute the gradients with respect to the loss function
    - Clip the gradients to avoid exploding gradients
    - Using the gradients, update your parameters with the gradient descent update rule
- Return the learned parameters
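The loop structure listed above can be sketched on a toy problem. This is not the assignment's RNN: the function toy_loss_and_grads is a hypothetical stand-in for forward and backward propagation (a simple quadratic loss), and the names train, max_value, and learning_rate are assumptions chosen for illustration. Only the order of the steps mirrors the outline.

```python
import numpy as np

def toy_loss_and_grads(params):
    # Hypothetical stand-in for forward + backward propagation:
    # loss = 50 * ||w||^2, so the gradient is 100 * w.
    w = params["w"]
    loss = 50.0 * float(np.sum(w ** 2))
    grads = {"dw": 100.0 * w}
    return loss, grads

def train(num_iterations=200, learning_rate=0.01, max_value=5.0, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize parameters.
    params = {"w": rng.standard_normal((3, 1))}
    losses = []
    # Run the optimization loop.
    for _ in range(num_iterations):
        # Forward and backward propagation (toy version).
        loss, grads = toy_loss_and_grads(params)
        # Clip every gradient element to [-max_value, max_value], in place.
        for g in grads.values():
            np.clip(g, -max_value, max_value, out=g)
        # Gradient descent update rule.
        params["w"] -= learning_rate * grads["dw"]
        losses.append(loss)
    # Return the learned parameters (plus the loss history for inspection).
    return params, losses

params, losses = train()
print(losses[0], losses[-1])
```

In the real model, the stand-in function would be replaced by the RNN forward and backward passes, but the clip step sits in the same place: after the gradients are computed and before the parameters are updated.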

Figure 1: Recurrent Neural Network, similar to what you had built in the previous notebook "Building a RNN - Step by Step".

At each time-step, the RNN tries to predict the next character given the previous characters. The dataset $X = (x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, ..., x^{\langle T_x \rangle})$ is a list of characters in the training set, while $Y = (y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, ..., y^{\langle T_x \rangle})$ is such that at every time-step $t$, we have $y^{\langle t \rangle} = x^{\langle t+1 \rangle}$.
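The shift between inputs and targets can be illustrated with a short sketch. The example name "rex" is hypothetical, and two conventions used here are assumptions rather than something stated above: None stands for an all-zero input at the first time-step, and the final target is the newline index so the model learns where a name ends.

```python
# Assumption: a 27-symbol vocabulary (newline + a-z), rebuilt by hand.
chars = ['\n'] + [chr(c) for c in range(ord('a'), ord('z') + 1)]
char_to_ix = {ch: i for i, ch in enumerate(sorted(chars))}

example = "rex"  # hypothetical training name

# Assumption: None is a placeholder for the zero vector fed at the first step.
X = [None] + [char_to_ix[ch] for ch in example]

# Y is X shifted left by one step, ending with the end-of-name marker.
Y = X[1:] + [char_to_ix["\n"]]

print(X)  # [None, 18, 5, 24]
print(Y)  # [18, 5, 24, 0]
```

Each target Y[t] is the input at the next step, X[t+1], which is the relation stated in the formula above.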

2 - Building blocks of the model

In this part, you'll build the important blocks of the overall model:

- Gradient clipping: to avoid exploding gradients
- Sampling: a technique used to generate characters

You'll then apply these two functions to build the model.
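As a rough illustration of the sampling building block, the sketch below draws one character index from a probability distribution over the 27-symbol vocabulary. The distribution here comes from random scores passed through a softmax, which is only a stand-in for a real model output; the function name sample_index is hypothetical.

```python
import numpy as np

def sample_index(probs, rng):
    """Draw one vocabulary index according to the distribution probs."""
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)

# Stand-in for the softmax output of the network at one time-step.
logits = rng.standard_normal(27)
probs = np.exp(logits - logits.max())  # subtract the max for numerical stability
probs /= probs.sum()

idx = sample_index(probs, rng)
print(idx)
```

During generation, the sampled index would be converted to a one-hot vector and fed back in as the next input, and sampling would stop once the newline index is drawn.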

2.1 - Clipping the gradients in the optimization loop

In this section you'll implement the clip function that you will call inside of your optimization loop. Recall that your overall loop structure usually consists of a forward pass, a cost computation, a backward pass, and a parameter update. Before updating the parameters, you'll perform gradient clipping when needed to make sure that your gradients are not "exploding," meaning taking on overly large values.

In the exercise below, you'll implement a function clip that takes in a dictionary of gradients and returns a clipped version of the gradients if needed. There are different ways to clip gradients; we'll use a simple element-wise clipping procedure, in which every element of the gradient vector is clipped to lie within some range [-N, N]. More generally, you'll provide a maxValue (say 10). In this example, if any component of the gradient vector is greater than 10, it would be set to 10; and if any component of the gradient vector is less than -10, it would be set to -10. If it is between -10 and 10, it is left alone.

Figure 2: Visualization of gradient descent with and without gradient clipping, in a case where the network is running into slight "exploding gradient" problems.

Exercise: Implement the function below to return the clipped gradients of your dictionary gradients. Your function takes in a maximum threshold and returns the clipped versions of your gradients. You can check out this hint for examples of how to clip in NumPy. You'll need to use the argument out = ....
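For reference, element-wise clipping with the out argument might look like the sketch below. It is one possible approach, not the graded solution: the function name clip_sketch and the small example arrays are hypothetical, and only two of the gradient keys are shown.

```python
import numpy as np

def clip_sketch(gradients, max_value):
    """Clip every array in the dictionary to [-max_value, max_value], in place."""
    for grad in gradients.values():
        # out=grad writes the result back into the same array.
        np.clip(grad, -max_value, max_value, out=grad)
    return gradients

# Hypothetical gradients with a few values outside [-10, 10].
grads = {
    "dWax": np.array([[12.0, -3.0], [-15.0, 0.5]]),
    "db": np.array([[10.5], [-10.0]]),
}
clipped = clip_sketch(grads, 10)

print(clipped["dWax"])
print(clipped["db"])
```

Because the clipping happens in place, the arrays in the original dictionary are modified; values already inside the range, such as -3.0 and 0.5, are left unchanged.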

### GRADED FUNCTION: clip

def clip(gradients, maxValue):
    Clips the gradients' values between minimum and maximum.

    gradients -- a dictionary containing the gradients "dWaa", "dWax", "dWya", "db", "dby"
    maxValue -- everything above this number is set to this number, and everything less than -maxValue is set to -maxValue

    gradients -- a dictionary with th