August 2018
Intermediate to advanced
378 pages
9h 9m
English
We have seen that the bag-of-words approach used in traditional NLP ignores sentence structure. Consider a sentiment analysis task on the four movie reviews in the following table:
| Id | Sentence | Rating (1 = recommended, 0 = not recommended) |
|----|----------|-----------------------------------------------|
| 1 | this movie is very good | 1 |
| 2 | this movie is not good | 0 |
| 3 | this movie is not very good | 0 |
| 4 | this movie is not bad | 1 |
If we represent this as a bag of words with term frequency, we will get the following output:
| Id | bad | good | is | movie | not | this | very |
|----|-----|------|----|-------|-----|------|------|
| 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 |
| 2 | 0 | 1 | 1 | 1 | 1 | 1 | 0 |
| 3 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| 4 | 1 | 0 | 1 | 1 | 1 | 1 | 0 |
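The term-frequency table above can be reproduced with a few lines of plain Python. This is only a minimal sketch: it assumes simple whitespace tokenization and an alphabetically sorted vocabulary, and the names `reviews`, `vocabulary`, `term_frequencies`, and `matrix` are illustrative, not taken from the text.

```python
from collections import Counter

# The four movie reviews from the first table.
reviews = [
    "this movie is very good",
    "this movie is not good",
    "this movie is not very good",
    "this movie is not bad",
]

# Vocabulary: every distinct word, sorted alphabetically
# so the columns match the table (bad, good, is, ...).
vocabulary = sorted({word for review in reviews for word in review.split()})


def term_frequencies(sentence, vocabulary):
    """Count how often each vocabulary word occurs in the sentence."""
    counts = Counter(sentence.split())
    return [counts[word] for word in vocabulary]


# One row of counts per review.
matrix = [term_frequencies(review, vocabulary) for review in reviews]

print(vocabulary)
for review_id, row in enumerate(matrix, start=1):
    print(review_id, row)

# Word order is discarded: a scrambled version of review 2
# produces exactly the same vector as review 2 itself.
scrambled = term_frequencies("good not is movie this", vocabulary)
print(scrambled == matrix[1])
```

Because each row only records counts, the scrambled sentence in the last lines is indistinguishable from review 2, which is the loss of sentence structure described earlier.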
In this simple ...