The Technology Headlines DEMAND FORECASTING & AI | Page 33
EXPERT ANALYSIS
“Data lakes are a good low-cost solution for storing data in a variety of formats”
transformation to fit it into the data warehouse (which can
cause a long wait in some organizations). If data scientists
are analyzing a new business question, they may not want
the structure and transformation imposed by pulling data
from the data warehouse. If the exploration proves useful
and there are requests for similar analyses, that data can
then be standardized and added to a curated zone in the
data lake or to the data warehouse. But it would be a waste
of effort to immediately integrate any new data to the data
warehouse before repeated value was found.
Not every analyst has or needs the tools and skills to
extract data from the data lake. You don’t want every
analyst to perform their own transformation and cleansing.
Automating data transformation and load to a data
warehouse reduces duplicate transformation efforts and
undesirable variations in calculations.
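To illustrate why a single automated transformation step reduces variation, here is a minimal sketch in Python. The field names (`id`, `amount`, `discount`), the `net_revenue` rule, and both function names are hypothetical and are not taken from any particular platform. The point is only that every row passes through one shared definition instead of each analyst writing their own.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CleanOrder:
    order_id: str
    net_revenue: float  # gross amount minus discount, rounded to cents


def transform_order(raw: Dict) -> CleanOrder:
    """Single shared definition of net revenue (a hypothetical business rule)."""
    gross = float(raw["amount"])
    discount = float(raw.get("discount") or 0.0)  # treat a missing discount as zero
    return CleanOrder(
        order_id=str(raw["id"]).strip(),
        net_revenue=round(gross - discount, 2),
    )


def load_to_warehouse(raw_rows: List[Dict]) -> List[CleanOrder]:
    """Stand-in for the automated load: every row uses the same transform."""
    return [transform_order(row) for row in raw_rows]


# Hypothetical raw records as they might land in the data lake.
raw_rows = [
    {"id": " A-1 ", "amount": "100.00", "discount": "10"},
    {"id": "A-2", "amount": 50, "discount": None},
]
warehouse_rows = load_to_warehouse(raw_rows)
```

Because the cleansing and the calculation live in one place, a change to the rule is made once and every downstream report picks it up.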
An analytical database summarizes data and delivers it to
end users with predefined relationships and calculations,
making it easy to see and compare metrics through a drag-
and-drop reporting interface. It is used to provide quick,
consistent answers to common or important questions. The
predefined relationships and calculations may meet 70% of
the organization’s analytical needs. Some analysts or report
developers may need direct access to the data warehouse
to write queries that use different relationships and
calculations. If they often write the same queries, that data
should be considered for addition to the analytical database.
AUGUST 2019
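One way to spot queries that analysts write again and again is to count near-identical statements in a query log. The sketch below assumes such a log is available as a list of SQL strings. The log contents, the `min_runs` threshold, and the function names are illustrative assumptions, and the normalization is deliberately crude (whitespace, case, and a trailing semicolon only).

```python
import re
from collections import Counter


def normalize(sql: str) -> str:
    """Collapse whitespace and case so trivially different copies match."""
    return re.sub(r"\s+", " ", sql.strip().lower()).rstrip(";")


def promotion_candidates(query_log, min_runs=3):
    """Return normalized queries seen at least min_runs times, most frequent first."""
    counts = Counter(normalize(q) for q in query_log)
    return [q for q, n in counts.most_common() if n >= min_runs]


# Hypothetical log of ad hoc queries run against the data warehouse.
query_log = [
    "SELECT region, SUM(sales) FROM orders GROUP BY region;",
    "select region,  sum(sales)\nfrom orders group by region",
    "  SELECT region, SUM(sales) FROM orders GROUP BY region  ",
    "SELECT COUNT(*) FROM returns",
]
candidates = promotion_candidates(query_log)
```

Queries that show up in `candidates` are the ones worth reviewing for addition to the analytical database. A real review would also weigh who runs them and how important the answers are.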
Catalog and Limit Access to Sensitive Data
Special consideration should be given to sensitive data, such
as PII (personally identifiable information). It may not be
advisable to keep a copy of sensitive data in every layer of
your data platform. One strategy is to load data only to an
area of the data warehouse with restricted access, deleting
any staged copies from the data lake and declining to add
it to analytical databases. The data warehouse may offer
features to encrypt sensitive columns so only users with
special permissions can view those columns, while other users
can see the other, less sensitive columns in the table. Sensitive
data should be catalogued and tagged for easy identification,
audit, and deletion as needed. In this era of data leaks and
GDPR, keeping an inventory of sensitive data is important
for compliance and risk management.
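As a rough illustration of cataloguing and restricting sensitive columns, the sketch below keeps a small in-memory catalog of tagged columns and hides them from users without special permission. It uses simple masking in application code rather than the column encryption a data warehouse may provide, and the table names, column names, and tags are all hypothetical.

```python
# Hypothetical catalog: (table, column) -> set of sensitivity tags.
SENSITIVE_CATALOG = {
    ("customers", "email"): {"pii"},
    ("customers", "ssn"): {"pii", "restricted"},
}


def sensitive_columns(table):
    """List the catalogued sensitive columns of a table, for audit or deletion."""
    return sorted(col for (t, col) in SENSITIVE_CATALOG if t == table)


def mask_row(table, row, can_view_sensitive):
    """Return a copy of the row, masking catalogued columns for ordinary users."""
    if can_view_sensitive:
        return dict(row)
    return {
        col: ("***" if (table, col) in SENSITIVE_CATALOG else value)
        for col, value in row.items()
    }


# Hypothetical record; the values are made up for the example.
row = {"name": "Pat", "email": "pat@example.com", "ssn": "000-00-0000", "region": "West"}
masked = mask_row("customers", row, can_view_sensitive=False)
full = mask_row("customers", row, can_view_sensitive=True)
```

Keeping the tags in one catalog means the same inventory can drive access checks, audits, and deletion requests.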
Modernizing your analytics architecture is more than
just adding technologies to your stack. It gives you the
opportunity to reassess how your organization uses data
and to meet business needs with timely data in usable
formats.