EDITOR’S
Q&A
HOW DO YOU HELP
ORGANISATIONS
MAKE SENSE OF
DATA INSIGHTS IN A
SUSTAINABLE AND
REPEATABLE WAY?
Businesses are facing an influx of different data types, which means companies need to make important decisions about how to use the resulting insights in a sustainable and repeatable way.
Data is only valuable if you can translate it into actionable insights, and with more data available than ever, companies have a lot to manage.
Yossi Naar, Chief Visionary Officer and
Co-founder, Cybereason, said: “In security
data analysis, hunting and AI-driven
automated detection, the quality of your
results depends heavily on the quality of
your data. However, we often find ourselves
having to process more data than we or
our systems can handle. Depending on the
organisation, there are essentially three
approaches to managing and processing the data: sampling – analysing a statistically significant subset of the data; filtering – removing data that we deem unimportant or repetitive; and scaling up – finding tools and technologies that will allow us to process all data in an effective way.
“Sampling can be an effective way to
learn about the statistical nature of the data,
but it’s not very useful if you’re looking for a
needle in a haystack or if you require access
to specific data. Filtering is a good strategy
when you have high certainty that your
filtering methods are reliable, do not change
the statistics of the data collected and can
be guaranteed to retain all important data.”
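The three approaches Naar describes – sampling, filtering and scaling up – can be sketched with toy data. The following is a minimal illustration only; the event stream, sample size and filtering rule are all invented for the example and do not represent any vendor's actual pipeline:

```python
import random

# Invented stand-in event stream: 10,000 events across 50 processes.
events = [{"proc": f"p{i % 50}", "size": i} for i in range(10_000)]

# 1. Sampling: analyse a statistically significant subset.
random.seed(0)
sample = random.sample(events, k=1_000)  # a 10% simple random sample

# 2. Filtering: drop data deemed unimportant or repetitive
#    (here, an arbitrary rule: keep only the first event per process).
seen = set()
filtered = []
for e in events:
    if e["proc"] not in seen:
        seen.add(e["proc"])
        filtered.append(e)

# 3. Scaling up: retain everything and process it in chunks, so the
#    full dataset stays available for analysis.
def chunks(data, n):
    for i in range(0, len(data), n):
        yield data[i:i + n]

processed = sum(len(c) for c in chunks(events, 512))
```

Note that only the third approach leaves the complete dataset intact, which is why it is the only one that does not trade away access to specific records.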
He said that ‘smart’ filtering sounds
better than it really is.
Naar added: “To help illustrate my point,
let’s delve into the filtering approach to data
collection – a favourite approach for several
organisations and security vendors. Say that
we limit the collection of network data to
100 connections for every process. On the
surface, this sounds reasonable. The average
number of connections per process is much
lower on most endpoints, so you can expect
to filter very little of the data.
“However, in reality, data patterns in computerised environments follow an aggressive power-law distribution, not a linear or even normal distribution.
“As a consequence of this behaviour, any type of cap-based filtering will remove the vast majority of the data. Even if you try to factor malicious behaviour into the filtering algorithm – what some vendors call ‘smart filtering’ – there are still several issues.”
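Naar's point can be checked with a quick simulation. The sketch below assumes connection counts per process follow a Pareto (power-law) distribution – an assumption made for illustration, not real telemetry – and applies the 100-connection cap from his example:

```python
import random

random.seed(42)
CAP = 100  # the per-process connection cap from the example

# Heavy-tailed connection counts: Pareto with a low alpha gives the
# aggressive power-law behaviour described in the text.
conns = [int(random.paretovariate(1.1)) for _ in range(100_000)]

total = sum(conns)                       # all connection records
kept = sum(min(c, CAP) for c in conns)   # records surviving the cap

capped_processes = sum(c > CAP for c in conns) / len(conns)
dropped_fraction = 1 - kept / total
```

Although well under 2% of processes exceed the cap – so the per-process average does indeed look safely below 100 – the heavy tail means those few processes account for a disproportionate share of all connection records, and the cap silently discards far more data than the average suggests.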
He added: “As such, there is nothing
about filtering that is smart. It’s not designed
to reduce ‘noise’. It’s merely a strategy to
overcome technological limitations of the
server-side systems and save on the cost
of the solution. However, this comes at a
significant cost to the integrity of the data,
the quality of detection and the security
value provided by the system. When you apply arbitrary, ‘smart’ or statistical filtering, you will inevitably introduce blind spots into your system. And hackers will exploit them – either deliberately, by understanding how you made your decisions, or by accident – because you can never have 100% certainty that a particular piece of data can be completely ignored.”