Let's take a look at the preceding process sequentially:
- The first step runs the PubMed search using the pubmed.mineR library.
- The search keyword is converted into a PubMed search_query term using the EUtilsSummary() function.
- Once the search_query term has been created, it is passed to EUtilsGet(), which retrieves the actual search results from PubMed. The results are stored in an object.
- The abstract text is then extracted from that object and stored in a vector.
- With the vector of text data in place, pre-processing begins. Using the tm library, you create a corpus of the abstracts by wrapping the vector in VectorSource() and passing it to Corpus(), as in the following line:
AbstractCorpus <- Corpus(VectorSource(abstracts)) ...
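The steps above can be condensed into a minimal sketch. It is an illustration under stated assumptions, not the exact code from the recipe: fetch_abstracts() is a hypothetical helper, its max_records argument is made up for this example, and the two placeholder abstracts only exist so that the corpus step can run without network access. EUtilsSummary(), EUtilsGet(), and AbstractText() are called with the RISmed:: prefix, since RISmed is the package that exports them.

```r
library(tm)

# Hypothetical helper (not part of the original recipe): runs the search
# and returns the abstract text as a character vector.
# It needs the RISmed package and network access, so it is not called by default.
fetch_abstracts <- function(keyword, max_records = 50) {
  # Turn the keyword into a PubMed query summary
  search_query <- RISmed::EUtilsSummary(keyword, type = "esearch",
                                        db = "pubmed", retmax = max_records)
  # Download the records that match the query
  records <- RISmed::EUtilsGet(search_query)
  # Pull out the abstract text, one element per record
  RISmed::AbstractText(records)
}

# Placeholder abstracts (made up for illustration) so the rest runs offline
abstracts <- c(
  "Placeholder abstract one about text mining.",
  "Placeholder abstract two about PubMed records."
)
# abstracts <- fetch_abstracts("text mining")  # uncomment to query PubMed

# Build the corpus from the character vector, as in the recipe
AbstractCorpus <- Corpus(VectorSource(abstracts))

# One document per abstract
length(AbstractCorpus)
```

Keeping the network calls inside a function separates retrieval from pre-processing, so the tm steps can be tried on any character vector before running them on real search results.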