As is typical of problems that call for an RNN, we will take a given sequence of 10 words and predict the next word. For this exercise, we will use the Alice dataset to generate words, as follows (the code file is available as RNN_text_generation.ipynb on GitHub):
- Import the relevant packages and dataset:
    from keras.models import Sequential
    from keras.layers import Dense, Activation
    from keras.layers.recurrent import SimpleRNN
    from keras.layers import LSTM
    import numpy as np

    # Read the file, skipping blank lines and lowercasing every line
    fin = open('alice.txt', encoding='utf-8-sig')
    lines = []
    for line in fin:
        line = line.strip().lower()
        if len(line) == 0:
            continue
        lines.append(line)
    fin.close()

    # Join all non-empty lines into one long string
    text = " ".join(lines)
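To make the task stated at the start concrete, the following is a minimal sketch of how a list of words can be cut into 10-word contexts paired with the word that follows each one. The function name `make_windows`, its `window_size` parameter, and the short toy sentence are assumptions made for illustration; they are not taken from the notebook.

```python
# Illustrative sketch only: make_windows and window_size are hypothetical names.
def make_windows(words, window_size=10):
    """Pair every run of `window_size` consecutive words with the word that follows it."""
    pairs = []
    for i in range(len(words) - window_size):
        context = words[i:i + window_size]   # the 10 input words
        target = words[i + window_size]      # the word to be predicted
        pairs.append((context, target))
    return pairs

# A short toy sentence stands in for the full text.
sample = "alice was beginning to get very tired of sitting by her sister on the bank".split()
pairs = make_windows(sample, window_size=10)

for context, target in pairs:
    print(" ".join(context), "->", target)
```

Each printed line shows one training example: the 10 words on the left are the input sequence and the word on the right is the label the network would be asked to predict.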
A sample of the input text looks as follows:
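One simple way to inspect such a sample is to print the first few characters of `text` along with its size. The sketch below assumes a short stand-in string in place of the real `text`, and the helper name `preview` is hypothetical.

```python
# Stand-in for the `text` string built in the previous step (hypothetical content).
text = "chapter i. down the rabbit-hole alice was beginning to get very tired"

def preview(text, n_chars=40):
    """Return the first n_chars characters of the text."""
    return text[:n_chars]

print(preview(text))
print(len(text), "characters,", len(text.split()), "words")
```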
- Normalize the text to remove punctuation ...