When developing the author classification model, the integer sequences for the training and test texts need to be of equal length. We can achieve this by padding and truncating the sequences, as follows:
# Padding and truncation
trainx <- pad_sequences(trainx, maxlen = 300)
testx <- pad_sequences(testx, maxlen = 300)
dim(trainx)
[1] 2500 300
Here, we set the maximum length of every sequence, maxlen, to 300. Any article whose sequence is longer than 300 integers is truncated, and any article whose sequence is shorter than 300 integers is padded with zeros. Note that for padding and truncation, a default setting of "pre" has been used and is not specifically ...
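To make the effect of the "pre" setting concrete, the following is a minimal base R sketch that mimics it on two toy sequences. The helper pad_pre and the toy data are hypothetical and used only for illustration; this is not the keras implementation, just an approximation of the behavior described above, with a small maxlen of 4 so the result is easy to read.

```r
# Hypothetical helper (not part of keras): mimics "pre" padding and
# "pre" truncation so that every sequence ends up with length maxlen.
pad_pre <- function(seqs, maxlen, value = 0L) {
  padded <- lapply(seqs, function(s) {
    s <- tail(s, maxlen)                     # "pre" truncation: drop from the front, keep the last maxlen values
    c(rep(value, maxlen - length(s)), s)     # "pre" padding: add zeros at the front
  })
  do.call(rbind, padded)                     # one row per sequence
}

# Toy sequences: one shorter and one longer than maxlen
seqs <- list(c(5L, 8L), c(1L, 2L, 3L, 4L, 5L, 6L))
padded <- pad_pre(seqs, maxlen = 4)
print(padded)
```

In this sketch, the short sequence gains two leading zeros and the long sequence loses its first two integers. In keras, the corresponding choices are made through the padding and truncating arguments of pad_sequences, both of which can be set to "pre" or "post".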