The most important step in sentiment analysis, as in most machine learning problems, is preprocessing the data. The following table shows 10 tweets randomly sampled from the dataset:
id | text |
44 | @JonathanRKnight Awww I soo wish I was there to see... |
143873 | Shaking stomach flipping........god i hate thi... |
466449 | why do they refuse to put nice things in our v... |
1035127 | @KrisAllenmusic visit here |
680337 | Rafa out of Wimbledon Love Drunk by BLG out S... |
31250 | It's official, printers hate me Going to sul... |
1078430 | @_Enigma__ Good to hear |
1436972 | Dear Photoshop CS2. i love you. and i miss you! |
401990 | my boyfriend got in a car accident today ! ... |