Character-Level Language Model - Dinosaurus Island
Welcome to Dinosaurus Island! 65 million years ago, dinosaurs existed, and in this assignment they are back. You are in charge of a special task. Leading biology researchers are creating new breeds of dinosaurs and bringing them to life on earth, and your job is to give names to these dinosaurs. If a dinosaur does not like its name, it might go berserk, so choose wisely!
Luckily you have learned some deep learning, and you will use it to save the day. Your assistant has collected a list of all the dinosaur names they could find and compiled them into this dataset. (Feel free to take a look by clicking the previous link.) To create new dinosaur names, you will build a character-level language model to generate new names. Your algorithm will learn the different name patterns and randomly generate new names. Hopefully this algorithm will keep you and your team safe from the dinosaurs' wrath!
By completing this assignment you will learn:
- How to store text data for processing using an RNN
- How to synthesize data, by sampling predictions at each time step and passing them to the next RNN cell unit
- How to build a character-level text generation recurrent neural network
- Why clipping the gradients is important
We will begin by loading in some functions that we have provided for you in rnn_utils. Specifically, you have access to functions such as rnn_forward and rnn_backward, which are equivalent to those you implemented in the previous assignment.
import numpy as np
from utils import *
import random
1 - Problem Statement
1.1 - Dataset and Preprocessing
Run the following cell to read the dataset of dinosaur names, create a list of unique characters (such as a-z), and compute the dataset and vocabulary size.
data = open('dinos.txt', 'r').read()
data = data.lower()
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
print('There are %d total characters and %d unique characters in your data.' % (data_size, vocab_size))

There are 19909 total characters and 27 unique characters in your data.
The characters are a-z (26 characters) plus the "\n" (newline) character, which in this assignment plays a role similar to the <EOS> (or "End of sentence") token discussed in lecture, only here it indicates the end of the dinosaur name rather than the end of a sentence. In the cell below, we create a Python dictionary (i.e., a hash table) to map each character to an index from 0-26. We also create a second Python dictionary that maps each index back to the corresponding character. This will help you figure out which index corresponds to which character in the probability distribution output of the softmax layer. Below, char_to_ix and ix_to_char are the Python dictionaries.
char_to_ix = {ch: i for i, ch in enumerate(sorted(chars))}
ix_to_char = {i: ch for i, ch in enumerate(sorted(chars))}
print(ix_to_char)
{0: '\n', 1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e', 6: 'f', 7: 'g',
 8: 'h', 9: 'i', 10: 'j', 11: 'k', 12: 'l', 13: 'm', 14: 'n', 15: 'o',
 16: 'p', 17: 'q', 18: 'r', 19: 's', 20: 't',
 21: 'u', 22: 'v', 23: 'w', 24: 'x', 25: 'y', 26: 'z'}
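For instance, you can use the two dictionaries to convert back and forth between characters and indices (a small illustrative check based on the mapping printed above):

print(char_to_ix['a'])   # 1
print(ix_to_char[0])     # '\n'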
1.2 - Overview of the Model
Your model will have the following structure:
- Initialize parameters
- Run the optimization loop:
  - Forward propagation to compute the loss function
  - Backward propagation to compute the gradients with respect to the loss function
  - Clip the gradients to avoid exploding gradients
  - Using the gradients, update your parameters with the gradient descent update rule
- Return the learned parameters

One pass through the optimization loop is sketched below.
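This sketch assumes the signatures of rnn_forward, rnn_backward, and update_parameters from the rnn_utils helpers mentioned earlier (their exact interfaces are assumptions, not given in this section); clip is the gradient clipping function you will implement in Section 2.1.

# Sketch of one optimization step (helper signatures are assumed).
def optimize_step(X, Y, a_prev, parameters, learning_rate=0.01):
    loss, cache = rnn_forward(X, Y, a_prev, parameters)       # forward propagation
    gradients, a = rnn_backward(X, Y, parameters, cache)      # backward propagation
    gradients = clip(gradients, maxValue=5)                   # avoid exploding gradients
    parameters = update_parameters(parameters, gradients, learning_rate)  # gradient descent update
    return loss, parameters, a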
At each time step, the RNN tries to predict the next character given the previous characters. The dataset $X = (x^{\langle 1 \rangle}, x^{\langle 2 \rangle}, ..., x^{\langle T_x \rangle})$ is a list of characters in the training set, while $Y = (y^{\langle 1 \rangle}, y^{\langle 2 \rangle}, ..., y^{\langle T_x \rangle})$ is such that at every time step $t$, we have $y^{\langle t \rangle} = x^{\langle t+1 \rangle}$.
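As a small illustration of this shift, here is how one training pair could be built from a single name (the name itself is just a hypothetical dataset entry):

# Build (X, Y) for one example name; Y is X shifted one step to the left,
# with "\n" appended to mark the end of the name.
name = "mosasaurus"                 # hypothetical entry from dinos.txt
X = [char_to_ix[ch] for ch in name]
Y = X[1:] + [char_to_ix["\n"]]
print(Y[0] == X[1])                 # True: y<1> equals x<2>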
2 - Building Blocks of the Model
In this part, you will build two important blocks of the overall model:
- Gradient clipping: to avoid exploding gradients
- Sampling: a technique used to generate characters

You will then apply these two functions to build the model.
2.1 - Clipping the Gradients in the Optimization Loop
In this section you will implement the clip function that you will call inside of your optimization loop. Recall that your overall loop structure usually consists of a forward pass, a cost computation, a backward pass, and a parameter update. Before updating the parameters, you will perform gradient clipping when needed to make sure that your gradients are not "exploding," meaning taking on overly large values.
In the exercise below, you will implement a function clip that takes in a dictionary of gradients and returns a clipped version of the gradients if needed. There are different ways to clip gradients; we will use a simple element-wise clipping procedure, in which every element of the gradient vector is clipped to lie within some range [-N, N]. More generally, you will provide a maxValue (say 10). In this example, if any component of the gradient vector is greater than 10, it is set to 10; if any component of the gradient vector is less than -10, it is set to -10. If it is between -10 and 10, it is left alone.
Exercise: Implement the function below to return the clipped gradients of your dictionary gradients. Your function takes in a maximum threshold and returns the clipped versions of the gradients. You can check out this hint for examples of how to clip in NumPy. You will need to use the argument out = ....
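As a quick illustration of the hint (this is standard NumPy behavior, shown on a toy array rather than the graded code): np.clip with the out= argument clips an array in place.

import numpy as np

a = np.array([-12.0, 3.0, 15.0])
np.clip(a, -10, 10, out=a)   # entries outside [-10, 10] are set to the boundary
print(a)                     # [-10.   3.  10.]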
### GRADED FUNCTION: clip

def clip(gradients, maxValue):
    '''
    Clips the gradients' values between minimum and maximum.

    Arguments:
    gradients -- a dictionary containing the gradients "dWaa", "dWax", "dWya", "db", "dby"
    maxValue -- everything above this number is set to this number, and everything less than -maxValue is set to -maxValue

    Returns:
    gradients -- a dictionary with the clipped gradients.
    '''
    # Clip each gradient in place to lie within [-maxValue, maxValue]
    for gradient in gradients.values():
        np.clip(gradient, -maxValue, maxValue, out=gradient)

    return gradients
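A quick sanity check of the function might look like this (the array shapes and random values below are arbitrary choices for illustration, not the graded test case):

np.random.seed(3)
gradients = {"dWax": np.random.randn(5, 3) * 10,
             "dWaa": np.random.randn(5, 5) * 10,
             "dWya": np.random.randn(2, 5) * 10,
             "db":   np.random.randn(5, 1) * 10,
             "dby":  np.random.randn(2, 1) * 10}
gradients = clip(gradients, 10)
print("largest entry after clipping:", max(g.max() for g in gradients.values()))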