
Deriving Business Value from Big Data using Sentiment Analysis
A ComTechAdvisory Whitepaper

DATAGENIC’S NEWS ANALYTICS SYSTEM

DataGenic has been deeply engaged in the data management and aggregation business in commodity markets for many years and has a broad array of blue chip clients in the industry. Recently, it has been developing a news aggregation service as part of its GenicIQ product, designed to provide sentiment analysis that can be used as an input into trading decision-making and risk management. The objective was to:

- Automatically process unstructured textual data in near real-time to deliver both insight and value,
- Utilise Twitter and a multitude of news resources available online as inputs,
- Provide sentiment analysis that had value to traders and risk managers in the commodity space.

The problem for DataGenic was very much as described above. It involved designing a message and information delivery system that facilitated a greater understanding of the market. Using Twitter as one source of raw information, DataGenic data scientists had to figure out how to process over 4.5 million tweets in a 34-day period (equating to some 150,000 to 400,000 tweets per day) and over 1.7 million news articles over 73 days of activity. The final product needed the scalability to handle much more data than that in live use, and at a greater velocity, in order to produce sentiment scores, volumes and indicators that could be exposed in a readily consumable form both via DataIQ and through an API.

By defining a process that involved stripping unnecessary and superfluous data, utilising NLP and machine learning, and developing scoring mechanisms, DataGenic has been able to produce remarkable results for crude oil price sentiment in its test case (Figure 1). The product is now live and its output is available to subscribers.

Figure 1 – DataGenic’s Oil Price Sensitivity Analysis Results
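To make the three stages of that process concrete (strip superfluous data, analyse the text, score it), here is a minimal Python sketch. It is not DataGenic's method: the whitepaper does not disclose its NLP or machine learning models, so a tiny hand-made word list stands in for them, and the names `POSITIVE`, `NEGATIVE`, `clean`, `score` and `daily_sentiment` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon used as a stand-in for a trained model.
POSITIVE = {"rally", "surge", "gain", "bullish", "rise"}
NEGATIVE = {"glut", "slump", "drop", "bearish", "fall"}

URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
TOKEN = re.compile(r"[a-z]+")


def clean(text):
    """Strip URLs and @mentions, lowercase, and keep alphabetic tokens only."""
    return TOKEN.findall(URL_OR_MENTION.sub(" ", text.lower()))


def score(tokens):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def daily_sentiment(messages):
    """Aggregate (day, text) pairs into {day: (mean score, message volume)}."""
    buckets = defaultdict(list)
    for day, text in messages:
        buckets[day].append(score(clean(text)))
    return {day: (sum(s) / len(s), len(s)) for day, s in buckets.items()}
```

Calling `daily_sentiment` on a list of `(day, text)` pairs yields one mean sentiment score and one message count per day, which roughly corresponds to the "sentiment scores" and "volume" outputs mentioned above. A production system would replace the word list with a trained classifier and would need streaming ingestion to cope with the volumes described.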
The remarkable aspect of DataGenic’s efforts at mining big data for its intrinsic business value in the form of sentiment analysis is how closely the Twitter sentiment score appears to predict crude oil prices.