This section explains how a Python function is converted into a user-defined function (udf) within Spark to apply a sentiment analysis score to each row of the chat column in the dataframe.
- TextBlob is a sentiment analysis library in Python. Its sentiment.polarity property returns a sentiment score from -1 (very negative) to +1 (very positive), with 0 being neutral. TextBlob can also measure subjectivity from 0 (very objective) to 1 (very subjective), although we will not be measuring subjectivity in this chapter.
- There are a couple of steps to applying a Python function to a Spark dataframe:
- TextBlob is imported and a function called sentiment_score is applied to the chat column to generate the ...
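
The steps above can be sketched roughly as follows. This is a minimal illustration, not the chapter's exact code: the helper `add_sentiment_column`, the output column name `sentiment_score`, the use of `DoubleType`, and the handling of missing values are all assumptions made for the example. The third-party imports are placed inside the functions so that the definitions load even where TextBlob or Spark is not installed.

```python
def sentiment_score(chat):
    """Return the TextBlob polarity of `chat`, a float from -1.0 to +1.0.

    Returning None for missing text is an assumption of this sketch;
    Spark passes null cells to a Python udf as None.
    """
    if chat is None:
        return None
    # Imported here so the name resolves wherever the function runs.
    from textblob import TextBlob
    return float(TextBlob(chat).sentiment.polarity)


def add_sentiment_column(df, text_col="chat", out_col="sentiment_score"):
    """Hypothetical helper: wrap sentiment_score as a udf and apply it.

    `df` is expected to be a Spark dataframe that has a column `text_col`.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import DoubleType

    # Converting the plain Python function into a Spark udf; the second
    # argument declares the type of the values the udf returns.
    sentiment_score_udf = udf(sentiment_score, DoubleType())

    # withColumn returns a new dataframe with the score column appended.
    return df.withColumn(out_col, sentiment_score_udf(df[text_col]))
```

Given a Spark dataframe `df` with a `chat` column, calling `add_sentiment_column(df)` would return a new dataframe with one extra column holding the polarity of each chat message. Because the udf runs row by row in Python, it is typically slower than Spark's built-in column functions, which is worth keeping in mind on large datasets.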