The current state of the art in DL is AlphaGo Zero from Google’s DeepMind.
The combination of DeepMind’s algorithms and Google’s vast amounts of raw
data has been making great progress toward solving difficult problems, such
as image and speech recognition. We also cannot forget that earlier versions
of AlphaGo handily beat the world’s top Go players, and that AlphaGo Zero,
trained purely through self-play, went on to beat one of those versions 100
games to 0!
Elements of AI
Now that we’ve provided a breakdown of the larger topic, let’s take a look at
some of the parts that need to be considered in tackling AI.
Data Ingestion
The large volumes of data we referred to earlier first need to be captured
before we can do anything with them. This is where data ingestion comes into
play. Think of data sources like social media streams, corporate transaction
systems and sensor data (aka the Internet of Things). This data, whether in the
form of files, transactions or streams, is often pulled into and stored in a
repository. With virtually unlimited storage capacity and relatively low costs,
the public cloud provides an attractive destination.
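To make the idea concrete, here is a minimal Python sketch of ingestion, not a production design. It assumes a local temporary directory standing in for a cloud storage bucket, and the `ingest` helper, the `sensor_stream` generator and the sample records are all hypothetical. Each source, whether a batch of records or a stream, is appended in raw form to its own file in the repository.

```python
import json
import tempfile
from pathlib import Path


def ingest(source_name, records, repository):
    """Append raw records from one source to the repository, one JSON object per line.

    `records` can be any iterable, so a batch (list) and a stream (generator)
    are handled the same way. Returns the number of records written.
    """
    target = Path(repository) / f"{source_name}.jsonl"
    count = 0
    with target.open("a", encoding="utf-8") as out:
        for record in records:
            out.write(json.dumps(record) + "\n")
            count += 1
    return count


def sensor_stream():
    """Hypothetical stand-in for a live feed of sensor readings."""
    for i in range(3):
        yield {"sensor_id": "pump-1", "reading": 20.0 + i}


# Assumption: a temp directory plays the role of the cloud repository.
repository = tempfile.mkdtemp()

counts = {
    "social": ingest("social", [{"user": "a", "text": "hello"}], repository),
    "transactions": ingest(
        "transactions",
        [{"order": 1, "total": 9.99}, {"order": 2, "total": 5.0}],
        repository,
    ),
    "sensors": ingest("sensors", sensor_stream(), repository),
}
```

Note that nothing is cleaned or reshaped at this stage. The data is simply captured as it arrives, so later steps can start from the original records.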
Data Munging
We use the term data munging to encompass a few concepts that together are
often said to account for 80-90% of the overall effort involved in AI. These
include:
• ETL (extract, transform, load), to get the data into a common format
• Cleansing, to remove or repair incomplete or corrupt data
• Deduplication, to remove duplicate data that might be pulled in from
different sources
• Enrichment, to add in third-party data that may provide a more complete
data set to analyze
Much of this process can be automated, but there is no magic way to avoid the
still laborious job of getting all your data ready for the data scientists to start
analyzing.
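The four munging steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sources (`CRM_ROWS`, `WEB_ROWS`), their field names and the `REGION_BY_DOMAIN` third-party lookup are all made up, and real pipelines typically rely on dedicated ETL tooling rather than hand-written functions.

```python
from datetime import datetime

# Hypothetical raw records from two sources that use different formats.
CRM_ROWS = [
    {"Email": "Ann@Example.com ", "signup": "2018-01-05", "spend": "120.50"},
    {"Email": "bob@example.com", "signup": "2018-02-11", "spend": ""},  # incomplete
]
WEB_ROWS = [
    {"email": "ann@example.com", "signup_date": "01/05/2018", "spend_usd": 120.5},
    {"email": "cy@example.com", "signup_date": "03/02/2018", "spend_usd": 75.0},
]
# Hypothetical third-party data used for enrichment.
REGION_BY_DOMAIN = {"example.com": "US"}


def transform_crm(row):
    """Map a CRM row onto the common format."""
    return {
        "email": row["Email"].strip().lower(),
        "signup": datetime.strptime(row["signup"], "%Y-%m-%d").date(),
        "spend": float(row["spend"]) if row["spend"] else None,
    }


def transform_web(row):
    """Map a web row onto the common format."""
    return {
        "email": row["email"].strip().lower(),
        "signup": datetime.strptime(row["signup_date"], "%m/%d/%Y").date(),
        "spend": float(row["spend_usd"]),
    }


def munge(crm_rows, web_rows, region_by_domain):
    # ETL: bring both sources into one common format.
    records = [transform_crm(r) for r in crm_rows]
    records += [transform_web(r) for r in web_rows]

    # Cleansing: drop records that are missing a required value.
    records = [r for r in records if r["spend"] is not None]

    # Deduplication: keep the first record seen for each email address.
    unique = {}
    for r in records:
        unique.setdefault(r["email"], r)

    # Enrichment: add a region from the third-party lookup.
    enriched = []
    for r in unique.values():
        domain = r["email"].split("@")[1]
        enriched.append({**r, "region": region_by_domain.get(domain, "unknown")})
    return enriched


CLEAN = munge(CRM_ROWS, WEB_ROWS, REGION_BY_DOMAIN)
```

With the sample data, the incomplete record for bob is dropped and the two records for ann collapse into one, leaving two enriched records. Even in this toy case, most of the code deals with format differences rather than analysis, which is the point of the effort estimate above.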
Data Analytics
Once your data has been ingested and munged into a usable state, you can
begin to apply the computational techniques of Machine Learning and Deep
Learning. This is not an exact science and generally involves quite a bit of trial
and error. It is also important to factor in a healthy dose of applicable domain
knowledge. For example, if you’re looking at marketing data, you’d better be
working with someone who understands the type of marketing you’re doing.
Or if you’re looking to improve predictive maintenance for industrial
machinery, you’d better include someone who knows the ins and outs of how
those machines tick.
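The trial-and-error loop can be illustrated with a deliberately tiny Python sketch. It is a stand-in for real model selection, not an ML or DL algorithm from any particular library: the vibration readings, labels and candidate thresholds are all hypothetical, and we assume the candidate range was suggested by someone who knows the machines. Each candidate is tried on training data, and the winner is then checked against data it has not seen.

```python
# Hypothetical labeled readings: (vibration in mm/s, failed within 30 days?)
TRAIN = [(2.1, False), (2.8, False), (3.5, False),
         (4.9, True), (5.6, True), (6.3, True)]
HOLDOUT = [(2.5, False), (3.9, False), (5.1, True), (6.0, True)]

# Assumption: a maintenance engineer suggested this range of alarm thresholds.
CANDIDATES = [3.0, 4.0, 5.0, 6.0]


def accuracy(threshold, rows):
    """Fraction of rows where 'vibration >= threshold' matches the failure label."""
    correct = sum((vibration >= threshold) == failed for vibration, failed in rows)
    return correct / len(rows)


def pick_threshold(candidates, train_rows):
    """Trial and error: try every candidate and keep the most accurate one."""
    return max(candidates, key=lambda t: accuracy(t, train_rows))


best = pick_threshold(CANDIDATES, TRAIN)
holdout_score = accuracy(best, HOLDOUT)
```

Real projects swap the threshold rule for actual models and the handful of readings for large data sets, but the shape of the loop stays the same: propose, measure, compare and repeat, with domain knowledge narrowing what is worth trying.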
WINTER 2018 | THE DOPPLER | 57