- We start by loading all the necessary libraries, as follows:
import numpy as np
import tensorflow as tf
- Next, we load the text data, reading the articles and their titles into two lists of lines:
article_filename = 'Data/summary/"Data/sumdata/train/train.article.txt'
title_filename = 'Data/summary/"Data/sumdata/train/train.title.txt'
with open(article_filename) as article_file:
    articles = article_file.readlines()
with open(title_filename) as title_file:
    titles = title_file.readlines()
- To make our data readable for our model, we need to define a function that creates two lookup tables: one mapping each vocabulary word to an integer, and one mapping each integer back to its word:
def create_lookup_tables(text):
    vocab = set(text.split())
    vocab_to_int = {'<S>': 0, '<E>': 1, '<UNK>': 2, '<PAD>': 3}
    for i, v in enumerate(vocab, len(vocab_to_int)):
        ...
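The body of the function above is elided, so the following is only a minimal sketch of how such a pair of lookup tables might be built, not the original implementation. The name `build_lookup_tables_sketch`, the sorting of the vocabulary, and the sample sentence are all assumptions introduced for illustration.

```python
def build_lookup_tables_sketch(text):
    """Hypothetical helper: map each word to an integer ID and back."""
    # The special tokens take the first four IDs, as in the snippet above.
    vocab_to_int = {'<S>': 0, '<E>': 1, '<UNK>': 2, '<PAD>': 3}
    # Assumption: sort the vocabulary so IDs are reproducible across runs
    # (iterating over a plain set gives no guaranteed order).
    vocab = sorted(set(text.split()) - set(vocab_to_int))
    # Number the remaining words starting after the special tokens.
    for i, word in enumerate(vocab, len(vocab_to_int)):
        vocab_to_int[word] = i
    # Invert the mapping to decode integer IDs back into words.
    int_to_vocab = {i: word for word, i in vocab_to_int.items()}
    return vocab_to_int, int_to_vocab


vocab_to_int, int_to_vocab = build_lookup_tables_sketch("the cat sat on the mat")
# Words missing from the vocabulary fall back to the <UNK> ID.
encoded = [vocab_to_int.get(w, vocab_to_int['<UNK>']) for w in "the dog sat".split()]
decoded = [int_to_vocab[i] for i in encoded]
```

Here `encoded` holds the integer IDs for a short sentence and `decoded` maps them back to tokens, with the unseen word `dog` replaced by `<UNK>`. Reserving the low IDs for the special tokens keeps them stable no matter how large the vocabulary grows.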