REAL-TIME TEXT ANALYTICS
for each feature in the sentence is then
calculated by summing the feature-opinion scores for that sentence. (Each
feature-opinion score is the product of
the sentiment polarity of the opinion
word and the multiplicative inverse of the
distance between the feature and the opinion word. Opinion words farther
from the feature are assumed to be less
strongly associated with it than
nearer words.)
For example, “The phone is useful and
a great work of art.”
Let the feature here be “phone” and
the opinion words be “useful” and “great.”
Semantic orientation of useful = 1
Semantic orientation of great = 1
Distance between the words useful
and phone = 2
Distance between the words great
and phone = 5
score(f) = 1/2 + 1/5 = 0.7
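The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `feature_score`, the whitespace tokenization and the two-word toy lexicon are all assumptions made for the example, and distance is taken as the difference in token positions.

```python
def feature_score(tokens, feature, lexicon):
    """Sum polarity / distance over every opinion word in the sentence."""
    f_idx = tokens.index(feature)  # position of the feature word
    score = 0.0
    for i, tok in enumerate(tokens):
        polarity = lexicon.get(tok, 0)  # 0 means "not an opinion word"
        if polarity and i != f_idx:
            # Weight the polarity by the inverse of the token distance.
            score += polarity / abs(i - f_idx)
    return score


# Toy lexicon (assumption): semantic orientation of each opinion word.
lexicon = {"useful": 1, "great": 1}
tokens = "the phone is useful and a great work of art".split()

print(round(feature_score(tokens, "phone", lexicon), 2))  # 0.7
```

A negative opinion word would simply carry a polarity of -1 in the lexicon, so it would subtract from the feature's score with the same distance weighting.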
Aggregating opinions for tweets: The
sentiment score for a tweet is the summation of the scores for all opinion words
present in the tweet.
For example, “The phone is useful
and a great work of art.”
The opinion words in the sentence are
“useful” and “great.”
Semantic orientation of useful = 1
Semantic orientation of great = 1
score(t) = 1 + 1 = 2
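As a rough sketch of the tweet-level aggregation, the snippet below sums the polarity of every opinion word without any distance weighting. The function name `tweet_score`, the regular-expression tokenizer and the toy lexicon are assumptions chosen for illustration.

```python
import re


def tweet_score(text, lexicon):
    """Sum the polarity of every opinion word found in the tweet."""
    # Lowercase and keep runs of letters, apostrophes and hyphens as tokens.
    tokens = re.findall(r"[a-z'-]+", text.lower())
    return sum(lexicon.get(tok, 0) for tok in tokens)


# Toy lexicon (assumption): semantic orientation of each opinion word.
lexicon = {"useful": 1, "great": 1}

print(tweet_score("The phone is useful and a great work of art.", lexicon))  # 2
```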
ANALYTICS-MAGAZINE.ORG
Negation rule: This identifies a negation word (which can appear one or two places
before the opinion word) and reverses
the opinion expressed in the sentence.
For example, “The phone is not good.”
Here “phone” gets a negative orientation.
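One way to sketch the negation rule is to look at the one or two tokens before each opinion word and flip its polarity if any of them is a negation word. The negation list, the function name `opinion_polarities` and the `window` parameter are assumptions for this example; the article does not list the negation words it uses.

```python
# Hypothetical negation list (assumption, not taken from the article).
NEGATIONS = {"not", "no", "never"}


def opinion_polarities(tokens, lexicon, window=2):
    """Return (opinion word, polarity) pairs after applying negation."""
    result = []
    for i, tok in enumerate(tokens):
        polarity = lexicon.get(tok, 0)
        if polarity == 0:
            continue  # not an opinion word
        # Inspect up to `window` tokens immediately before the opinion word.
        preceding = tokens[max(0, i - window):i]
        if any(word in NEGATIONS for word in preceding):
            polarity = -polarity
        result.append((tok, polarity))
    return result


tokens = "the phone is not good".split()
print(opinion_polarities(tokens, {"good": 1}))  # [('good', -1)]
```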
Context-dependent rules: For features
for which no opinion words are found, context-dependent
constructs are used to identify
the orientation score.
For example, “The phone is good but
battery-life is short.”
The only opinion word in the sentence
is “good” (“short” is a context-dependent
word).
“Phone” gets a positive orientation because of “good.”
“Battery-life” gets a negative orientation
because the word “but” appears
between “good” and “battery-life.”
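The “but” construct can be sketched as follows: the sentence is split into clauses at “but,” and a clause with no opinion word of its own takes the opposite orientation of the clause before it. This is only one possible reading of the rule; the function name `clause_orientations`, the clause-splitting strategy and the toy inputs are assumptions for illustration.

```python
def clause_orientations(tokens, features, lexicon):
    """Assign an orientation to each feature, flipping across 'but'."""
    orientations = {}
    previous = 0  # orientation of the clause before the current one
    clause_feats, clause_score = [], 0
    for tok in tokens + ["but"]:  # trailing sentinel closes the last clause
        if tok == "but":
            if clause_score == 0:
                # No opinion word in this clause: take the opposite
                # orientation of the clause on the other side of "but".
                clause_score = -previous
            for feat in clause_feats:
                orientations[feat] = clause_score
            previous = clause_score
            clause_feats, clause_score = [], 0
        elif tok in features:
            clause_feats.append(tok)
        else:
            clause_score += lexicon.get(tok, 0)
    return orientations


tokens = "the phone is good but battery-life is short".split()
features = {"phone", "battery-life"}
print(clause_orientations(tokens, features, {"good": 1}))
# {'phone': 1, 'battery-life': -1}
```

Note that “short” is deliberately absent from the lexicon here, since its orientation depends on the feature it describes.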
Topic Evolution. The next step after
topic modeling is to understand how topics and trends develop, evolve and go viral
over time.
The algorithm maintains a fixed number of topic streams and their statistics.
Each tweet is processed as it comes in
and is assigned to the “closest” topic
stream (the topic stream most similar to
it). If no topic stream is close enough,
then a new stream is created and a stale
stream is killed to maintain a fixed number
WWW.INFORMS.ORG