In our internal data science meetings,
we love to think about, tinker with and
invent our next-generation algorithms
for when a customer is “at scale.” At
scale, a customer has run enough ad
campaigns, and those campaigns have
generated enough data, that we can
finally apply some of our cutting-edge
predictive analytics, machine learning and
optimization algorithms. That is, we
actually have some big data to work with.
EARLY ON: SMALL DATA
Long before a customer is at scale,
they are essentially in a start-up phase. In
this phase, terabytes and petabytes are replaced by mere megabytes. A/B/n testing
is replaced by just … A. Predictive analytics is replaced by anecdotal evidence. And
sample sizes are so small that the concept
of statistical significance is, well, insignificant. From a data science perspective, we
refer to this as small data.
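
To make the point about statistical significance concrete, here is a minimal Python sketch that compares the uncertainty around a conversion rate for a small sample and a large one. The click and conversion counts are hypothetical, and the Wilson score interval is just one common way to put a range around a proportion; it is an illustration, not a description of our platform.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if trials == 0:
        return (0.0, 1.0)  # no data: the rate could be anything
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)
    ) / denom
    return (max(0.0, center - half), min(1.0, center + half))

# Hypothetical start-up phase: 2 conversions from 40 clicks (5%).
small = wilson_interval(2, 40)
# Hypothetical at-scale customer: same 5% rate, 1,000 times the data.
large = wilson_interval(2_000, 40_000)

for label, (low, high) in (("small data", small), ("big data", large)):
    print(f"{label}: {low:.1%} to {high:.1%}")
```

With these made-up numbers, the small-data interval runs from roughly 1% to roughly 17%, while the at-scale interval stays within a fraction of a percentage point of 5%. Both customers observed the same rate, but only one of them can act on it with confidence.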
But despite the lack of data during this
start-up phase, customers still expect our
platform to optimize their ad campaigns.
So how do we approach this situation?
We will address this and other similar
situations in the “Big Data Dreams, Small
Data Reality” column.
A few other obvious examples of small data
situations include planning for new
businesses, new products or services, and
new business processes.
ANALYTICS
A start-up almost certainly lacks
the historical data that an established
company has collected about its operations, finances or sales and marketing
strategies. Yet a new business still needs
to plan its future: which products or
services to launch, which customers to
target, how to set pricing policies, how
to promote the brand, how to lay out
the website, how much to staff up, and
so on. All of these decisions could be
aided by data, if only you had some. In
the absence of data, one of the most
important parts of planning for
data-driven decisions is how you structure
your decision model. Did you include the
right objectives, constraints and other
assumptions?
Even though you have no data, you
still have to populate your model with
something: for example, industry benchmark
data, data from public company
SEC filings, probability distributions
(if you want to use something more
sophisticated, such as Monte Carlo
simulation) and, yes, even gut-feel values.
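
As a rough illustration of populating a model with probability distributions instead of historical data, the Python sketch below runs a small Monte Carlo simulation of monthly profit for a hypothetical new product. Every number in it is an assumption made up for the example: the visitor range stands in for a gut-feel estimate, the conversion distribution stands in for an industry benchmark near 2%, and the price and fixed costs are placeholders.

```python
import random
import statistics

def simulate_monthly_profit(n_trials=10_000, seed=42):
    """Monte Carlo sketch: draw uncertain inputs, compute profit per trial."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    price = 30.0               # assumed price per order
    fixed_costs = 2_000.0      # assumed monthly fixed costs
    profits = []
    for _ in range(n_trials):
        # Gut-feel visitor estimate: low 2,000, high 10,000, most likely 5,000.
        visitors = rng.triangular(2_000, 10_000, 5_000)
        # Benchmark-style conversion rate: Beta(2, 98) has a mean of 2%.
        conversion = rng.betavariate(2, 98)
        profits.append(visitors * conversion * price - fixed_costs)
    return profits

profits = simulate_monthly_profit()
mean_profit = statistics.mean(profits)
prob_loss = sum(p < 0 for p in profits) / len(profits)

print(f"mean monthly profit: {mean_profit:,.0f}")
print(f"chance of a losing month: {prob_loss:.0%}")
```

The useful output is less the average than the spread: the share of simulated months that lose money shows how much the plan depends on assumptions you have not yet been able to test.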
As you start gathering data, you can
transition from those external data
sources to your own internal data. But
when do you make this transition? How
much data is enough data?
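
One possible way to think about the transition, sketched below in Python, is to treat it as gradual rather than as a single switch. The sketch treats an external benchmark rate as a Beta prior and updates it with your own observations, so the weight on internal data grows as the data accumulates. The function name, the benchmark rate and the `prior_strength` value (how many observations the benchmark is assumed to be worth) are all hypothetical choices for illustration.

```python
def blended_rate(benchmark_rate, prior_strength, successes, trials):
    """Blend a benchmark rate with observed data via a Beta-binomial update.

    prior_strength is the number of pseudo-observations the benchmark
    is assumed to be worth (an assumption you must choose yourself).
    """
    alpha = benchmark_rate * prior_strength + successes
    beta = (1 - benchmark_rate) * prior_strength + (trials - successes)
    estimate = alpha / (alpha + beta)                # posterior mean rate
    own_weight = trials / (prior_strength + trials)  # share driven by own data
    return estimate, own_weight

# Hypothetical case: benchmark says 2%, your own campaigns convert at 5%.
for trials in (0, 50, 500, 5_000):
    successes = round(0.05 * trials)
    estimate, own_weight = blended_rate(0.02, 500, successes, trials)
    print(f"{trials:>5} trials: estimate {estimate:.2%}, "
          f"own-data weight {own_weight:.0%}")
```

Under these assumptions the estimate starts at the benchmark, sits halfway between benchmark and observed rate once your own sample matches the prior strength, and is dominated by internal data after that. The question of how much data is enough then becomes a question of how much you trust the benchmark.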
In contrast to new businesses,
well-established companies, such as
the Fortune 500, have databases upon
JANUARY/FEBRUARY 2014 | 23