February 2020
In this section, we will use the IMDb dataset, which contains movie reviews and the sentiment associated with each review. We can import the dataset from the keras library. The reviews are preprocessed and encoded as sequences of word indexes. Words are indexed by their overall frequency in the dataset; for example, the word index 8 refers to the 8th most frequent word in the data.
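To make the frequency-based indexing concrete, here is a minimal sketch in base R. The three-sentence corpus and the names `reviews`, `word_index`, and `encoded` are hypothetical, chosen only for illustration; this is not the actual preprocessing used to build the IMDb dataset.

```r
# Toy corpus (hypothetical data, not the IMDb reviews)
reviews <- c("the movie was great",
             "the plot was thin",
             "the acting was great")

# Split each review into words and count how often each word occurs overall
tokens <- strsplit(reviews, " ")
freq <- sort(table(unlist(tokens)), decreasing = TRUE)

# Assign rank 1 to the most frequent word, rank 2 to the next, and so on
word_index <- setNames(seq_along(freq), names(freq))

# Encode each review as a sequence of word indexes
encoded <- lapply(tokens, function(words) unname(word_index[words]))

print(word_index)
print(encoded)
```

Frequent words such as "the" and "was" receive the smallest indexes, which is what makes it possible to restrict a model to the most common words by keeping only small index values.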
Now, let's import the keras library and the imdb dataset:
library(keras)
imdb <- dataset_imdb(num_words = 1000)
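The `num_words` argument limits the vocabulary to the most frequent words. The sketch below imitates that capping in base R on a made-up sequence, so it runs without downloading anything. It assumes that indexes at or above `num_words` are replaced by an out-of-vocabulary marker with the value 2 (the documented default of `oov_char` in `dataset_imdb()`); the vector `encoded_review` is hypothetical.

```r
# Hypothetical encoded review; these values are not taken from the dataset
encoded_review <- c(1L, 14L, 22L, 1622L, 973L, 5345L)

num_words <- 1000L  # keep only indexes below this value
oov_char  <- 2L     # assumed out-of-vocabulary marker (Keras default)

# Replace every index outside the kept vocabulary with the marker
capped <- ifelse(encoded_review < num_words, encoded_review, oov_char)

print(capped)
```

A smaller `num_words` gives a more compact vocabulary at the cost of mapping more rare words to the same marker.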
Let's divide the data into training and testing sets:
train_x <- imdb$train$x
train_y <- imdb$train$y
test_x <- imdb$test$x
test_y <- imdb$test$y
Now, we can have a look at the number of reviews in the train and test data:
# number of samples in train ...