Analytics Magazine, March/April 2014 | Page 23

most successful businesses do both, just like the Oakland A’s. Whether knee-deep in a big data implementation or just starting to explore the options, companies should consider some tips, pitfalls and best practices for getting the maximum value from their data. A good way to start is to make data analytics decisions with eyes wide open about what is truly required for setup, which tools are most effective for the organization, and how to maximize always-limited resources.

NOTE TO SELF: DATA IS NOT PERFECT

Anyone who has ever worked with data understands that no data set is ever “clean.” The situation becomes even more complicated when organizations are pulling data from multiple production applications.

A few examples highlight the enormous, unavoidable challenges associated with data inconsistencies. Consider an international company looking to identify fraud in offices worldwide. The company may start with a database of countries with the highest risk of corruption, and then evaluate transactions for those countries. Across production applications, a country may be recorded in several different ways depending on the system, the purpose for which the information was captured, and the individual who entered the data. For example, South Korea may be entered as the standard two-letter abbreviation “KR” in one system, and as text in others, in forms such as “South Korea,” “Korea, South” or “Republic of Korea.”
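To make the South Korea example concrete, the following is a minimal Python sketch of one way such variants might be reconciled before transactions are evaluated. It is an illustration under stated assumptions, not a method described in the article: the alias table, the record layout, and the helper names `normalize_country` and `group_by_country` are all hypothetical, and a real alias table would have to be built from the values actually found in each production system.

```python
from collections import defaultdict

# Hypothetical alias table (an assumption for illustration): each known
# spelling, lowercased, maps to the two-letter country code.
COUNTRY_ALIASES = {
    "kr": "KR",
    "south korea": "KR",
    "korea, south": "KR",
    "republic of korea": "KR",
}


def normalize_country(raw):
    """Return the two-letter code for a raw country value, or None if unknown."""
    # Lowercase and collapse whitespace so casing and spacing differences
    # between systems do not matter.
    key = " ".join(raw.lower().split())
    return COUNTRY_ALIASES.get(key)


def group_by_country(records):
    """Group records by normalized code; collect unrecognized ones separately."""
    grouped = defaultdict(list)
    unmatched = []
    for record in records:
        code = normalize_country(record["country"])
        if code is None:
            unmatched.append(record)  # left for manual review, not guessed
        else:
            grouped[code].append(record)
    return dict(grouped), unmatched


# Hypothetical records, as if pulled from several production applications.
records = [
    {"id": 1, "country": "KR"},
    {"id": 2, "country": "South Korea"},
    {"id": 3, "country": "Korea, South"},
    {"id": 4, "country": " republic of  korea "},
    {"id": 5, "country": "S. Korea"},
]

grouped, unmatched = group_by_country(records)
print(sorted(r["id"] for r in grouped["KR"]))
print([r["id"] for r in unmatched])
```

In this sketch, a spelling that is not in the table (here “S. Korea”) is set aside rather than matched by guesswork, which reflects the article's point that no data set is ever fully clean.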