Deriving Business Value from Big Data using Sentiment analysis
A ComTechAdvisory Whitepaper
DATAGENIC’S NEWS ANALYTICS SYSTEM
DataGenic has been deeply engaged in the data management and aggregation business in commodity markets
for many years and has a broad array of blue chip clients in the industry. Recently, it has been developing a news
aggregation service as part of its GenicIQ product, designed to provide sentiment analysis that can be
used as an input into trading decision-making and risk management. The objectives were to:
/ Automatically process unstructured textual data in near real-time to deliver both insight and value,
/ Utilise Twitter and a multitude of news resources available online as inputs,
/ Provide sentiment analysis that had value to traders and risk managers in the commodity space.
The problem for DataGenic was very much as described
above. It involved designing the delivery of a message/information system in such a way that it facilitated a greater understanding of the market. Using Twitter as one source of
raw information, DataGenic’s data scientists had to work out
how to process over 4.5 million tweets in a 34-day period (equating to some 150,000 to 400,000 tweets per day) and over 1.7
million news articles over 73 days of activity. The final product
needed the scalability to handle much more data than that in
live use, and at a greater velocity, in order to produce sentiment scores, volumes and indicators that could be exposed in
a readily consumable form both via DataIQ and through an
API.
Figure 1 – DataGenic’s Oil Price Sensitivity Analysis Results
By defining a process that involved stripping out superfluous
data, applying NLP and machine learning, and developing
scoring mechanisms, DataGenic has been
able to produce remarkable results for crude oil price
sentiment in its test case (Figure 1). The product is now live
and its output is available to subscribers. The most striking aspect
of DataGenic’s efforts at mining big data for its intrinsic business value in the form of sentiment analysis is how closely
the Twitter sentiment score appears to predict crude oil prices.
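To make the process described above more concrete, the following is a minimal sketch of the three stages (stripping superfluous data, scoring text, and aggregating scores and volumes per day). It is not DataGenic’s implementation, which is not described in detail here. The word lists, the function names `clean`, `score` and `daily_sentiment`, and the simple lexicon-based score are all assumptions made for illustration; a production system would most likely replace the lexicon with a trained NLP or machine learning model.

```python
import re
from collections import defaultdict
from datetime import date

# Hypothetical toy lexicon (an assumption for illustration only).
POSITIVE = {"rally", "surge", "bullish", "gain", "rise"}
NEGATIVE = {"glut", "slump", "bearish", "drop", "fall"}

URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
TOKEN = re.compile(r"[a-z']+")


def clean(text):
    """Strip URLs and @mentions, then lowercase and tokenise the text."""
    return TOKEN.findall(URL_OR_MENTION.sub(" ", text.lower()))


def score(tokens):
    """Return a sentiment score in [-1, 1]: (pos - neg) / (pos + neg)."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def daily_sentiment(messages):
    """Aggregate (date, text) pairs into {date: (mean_score, volume)}."""
    buckets = defaultdict(list)
    for day, text in messages:
        buckets[day].append(score(clean(text)))
    return {d: (sum(s) / len(s), len(s)) for d, s in buckets.items()}


if __name__ == "__main__":
    # Made-up sample messages, not real data.
    sample = [
        (date(2020, 1, 1), "Crude prices surge on supply fears http://example.com/x"),
        (date(2020, 1, 1), "@trader oil glut, prices drop"),
        (date(2020, 1, 2), "Brent rally continues"),
    ]
    for day, (mean_score, volume) in sorted(daily_sentiment(sample).items()):
        print(day, round(mean_score, 2), volume)
```

The daily mean score and message volume produced by a pipeline of this shape are the kind of series that could then be compared against a price series, as in Figure 1.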