This section walks through the steps for downloading the chatbot data.
- Access the dataset from the following GitHub repository: https://github.com/asherif844/ApacheSparkDeepLearningCookbook/tree/master/CH07/data
- Once you arrive at the repository, right-click on the TherapyBotSession.csv file.
- Download TherapyBotSession.csv and save it to the same local directory as the Jupyter notebook that will run the SparkSession.
- In the Jupyter notebook, run the following script to build a SparkSession called spark and to load the dataset into a Spark dataframe called df:
spark = SparkSession.builder ...
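As a minimal sketch of the last two steps, the following may help, assuming PySpark is installed and that TherapyBotSession.csv has a header row. The helper names (csv_path, load_chatbot_data), the local master, and the application name are illustrative assumptions, not necessarily the exact script used here.

```python
from pathlib import Path

# The file name comes from the download step; the CSV is assumed to sit
# in the same directory as the notebook.
CSV_NAME = "TherapyBotSession.csv"


def csv_path(directory="."):
    """Return the expected location of the downloaded CSV, failing early if it is missing."""
    path = Path(directory) / CSV_NAME
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; download it from the repository first"
        )
    return path


def load_chatbot_data(directory="."):
    """Build a SparkSession and read the CSV into a Spark dataframe."""
    # Imported lazily so the path check above works even without PySpark.
    from pyspark.sql import SparkSession

    path = csv_path(directory)
    spark = (
        SparkSession.builder
        .master("local")            # assumption: a single local machine
        .appName("TherapyBotData")  # hypothetical application name
        .getOrCreate()
    )
    # Assumptions: the first row holds column names, and Spark may infer types.
    df = spark.read.csv(str(path), header=True, inferSchema=True)
    return spark, df
```

In a notebook, `spark, df = load_chatbot_data()` would then provide the two names used in this section, and `df.show()` displays the first rows as a quick check that the file loaded.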