The Technology Headlines DEMAND FORECASTING & AI | Page 33

EXPERT ANALYSIS | AUGUST 2019

“Data lakes are a good low-cost solution for storing data in a variety of formats”

transformation to fit it into the data warehouse (which can cause a long wait in some organizations). If data scientists are analyzing a new business question, they may not want the structure and transformation imposed by pulling data from the data warehouse. If the exploration proves useful and there are requests for similar analyses, that data can then be standardized and added to a curated zone in the data lake or to the data warehouse. But it would be a waste of effort to integrate new data into the data warehouse immediately, before it has shown repeated value.

Not every analyst has or needs the tools and skills to extract data from the data lake. You don’t want every analyst to perform their own transformation and cleansing. Automating data transformation and loading into a data warehouse reduces duplicate transformation efforts and undesirable variations in calculations.

An analytical database summarizes data and delivers it to end users with predefined relationships and calculations, making it easy to see and compare metrics through a drag-and-drop reporting interface. It is used to provide quick, consistent answers to common or important questions. The predefined relationships and calculations may meet 70% of the organization’s analytical needs. Some analysts or report developers may need direct access to the data warehouse to write queries that use different relationships and calculations. If they often write the same queries, that data should be considered for addition to the analytical database.

Catalog and Limit Access to Sensitive Data

Special consideration should be given to sensitive data, such as PII (personally identifiable information). It may not be advisable to keep a copy of sensitive data in every layer of your data platform.
One strategy is to load sensitive data only into an area of the data warehouse with restricted access, deleting any staged copies from the data lake and declining to add it to analytical databases. The data warehouse may offer features to encrypt sensitive columns so that only users with special permissions can view them, while other users can still see the less sensitive columns in the table. Sensitive data should be catalogued and tagged for easy identification, audit, and deletion as needed. In this era of data leaks and GDPR, keeping an inventory of sensitive data is important for compliance and risk management.

Modernizing your analytics architecture is more than just adding technologies to your stack. It gives you the opportunity to reassess how your organization uses data and to meet business needs with timely data in usable formats.
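As a rough illustration of the cataloguing and tagging advice in the sensitive-data section, the sketch below keeps a small in-memory inventory of columns with sensitivity tags. It is a minimal sketch, not any particular product's feature: the `ColumnEntry` type, the `CATALOG` contents, the `"pii"` tag, and both helper functions are hypothetical names chosen for illustration. A real platform would more likely rely on its data catalog and the warehouse's own column-level security.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnEntry:
    """One catalogued column; `tags` marks sensitivity (hypothetical scheme)."""
    table: str
    column: str
    tags: frozenset = frozenset()


# Hypothetical inventory; the tables and columns are made up for illustration.
CATALOG = [
    ColumnEntry("customers", "customer_id"),
    ColumnEntry("customers", "email", frozenset({"pii"})),
    ColumnEntry("customers", "region"),
    ColumnEntry("orders", "shipping_address", frozenset({"pii"})),
]


def sensitive_columns(catalog, tag="pii"):
    """List (table, column) pairs carrying `tag`, e.g. for audit or deletion."""
    return [(e.table, e.column) for e in catalog if tag in e.tags]


def visible_columns(catalog, table, user_permissions):
    """Columns of `table` a user may see.

    Untagged columns are always visible; a tagged column is visible only
    when the user holds every tag on it.
    """
    return [
        e.column
        for e in catalog
        if e.table == table and e.tags.issubset(user_permissions)
    ]


if __name__ == "__main__":
    print(sensitive_columns(CATALOG))
    print(visible_columns(CATALOG, "customers", set()))
    print(visible_columns(CATALOG, "customers", {"pii"}))
```

Keeping the tags in one inventory means the same list can drive both access decisions and audit or deletion requests, rather than each layer of the platform tracking sensitive fields separately.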