DATA CENTRES
Reliability Testing
Testing, Testing…
By: Giacomo Losio, Head of Technology, ProLabs
Giacomo Losio discusses the conviction of a successful optical infrastructure
Introduction
Reliability stems from a company’s
culture, from its values, the way
products are conceived and designed
– testing is the responsibility of
everyone. Many companies can offer
a working transceiver, but only a few
offer reliable ones. In the world of
optical infrastructure equipment, the
importance of reliability testing is
sometimes understated. As a technology
developer, few things cause as much
delight as taking a new product all the
way through a laborious design process
and celebrating when it works exactly
as you’d hoped. However, the inevitable
and all-important question soon comes
to mind: ‘but how long for?’
It goes without saying that in the
optical transceiver space reliability
and expected lifespan are absolute
necessities. Data centre failures can
be expensive in a number of different
ways: the recurring damage to adjoining
infrastructure components, the cost of
delays caused by system downtime and
the long-term damage to the reputation
and trustworthiness of the provider in
question. For these reasons, the quality
and reliability of data centre products
have always been at the forefront of
consumers’ minds.
Although for a long time OEMs
have demanded large sums of money
for their products, end-users have seen
this expense as a way to insure against
a catastrophic data centre failure,
choosing to side with the big brands
regardless of cost. Historically, the
lower-end data centre infrastructure
market has been so saturated with
underperforming providers that
procurers have felt as though they were
taking huge risks by purchasing parts
from lesser known brands. In effect,
customers have tended to weigh up the
costs of a data centre failure and decide
that although OEMs are much more
expensive, they are not as expensive as a
product malfunction.
How Are Products Tested?
Reliability is the probability that a
product will perform its intended
function in a satisfactory manner, for a
specified period of time, when operating
under specified conditions. To be
reliable, a product need not last forever;
rather, it must be ‘predictable’: if
it purports to last 20 years, it needs to
work faultlessly for at least 20 years.
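To make the definition concrete, here is a minimal sketch of reliability as a probability over time. It assumes a constant failure rate (the exponential model), which is a simplification chosen for illustration and not something the article specifies. The function names and the 99% target are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def reliability(t_hours: float, mttf_hours: float) -> float:
    """Probability that a unit is still working after t_hours.

    Assumes a constant failure rate (exponential model), so
    R(t) = exp(-t / MTTF). Real transceivers may not follow this.
    """
    if mttf_hours <= 0:
        raise ValueError("mttf_hours must be positive")
    return math.exp(-t_hours / mttf_hours)


def required_mttf(t_hours: float, target_reliability: float) -> float:
    """Mean time to failure needed to reach target_reliability at t_hours."""
    if not 0.0 < target_reliability < 1.0:
        raise ValueError("target_reliability must be between 0 and 1")
    return -t_hours / math.log(target_reliability)


if __name__ == "__main__":
    lifetime = 20 * HOURS_PER_YEAR  # the 20-year claim from the text
    mttf = required_mttf(lifetime, 0.99)  # hypothetical 99% survival target
    print(f"MTTF needed for 99% survival at 20 years: {mttf:,.0f} hours")
    print(f"Check: R(20 years) = {reliability(lifetime, mttf):.4f}")
```

Under this assumption, a product that ‘purports to last 20 years’ needs a mean time to failure far longer than 20 years, because the claim has to hold for nearly every unit rather than for the average one.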
Reliability testing is based on
benchmarks set by standards agencies
such as Telcordia, IEC and even
military standards. Every device/subassembly
used in the transceiver has to
be qualified independently, and internal
interconnects have to be verified with
particular attention, since this is
where mechanical stress often occurs
and the device can fail. Tests are
clearly defined and readily repeatable,
with some running for as long
as 2000-5000 hours (3-7 months or
so). Products must be tested in specific
environments (QT tests), and in tests
called ALT (accelerated life tests) and
HALT (highly accelerated life tests). The
provider is gauging whether or not the
product can be released in the first place,
and then pre-determining percentage
failures in order to focus on continuous
design improvement.
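As an illustration of how an accelerated life test can stand in for years of field use, the sketch below applies the Arrhenius temperature model, a commonly used acceleration model for thermally driven failures. The article does not say which model is used, so treat the model choice as an assumption. The activation energy (0.7 eV) and the temperatures (40 °C in use, 85 °C under stress) are hypothetical values.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev: float, use_temp_c: float, stress_temp_c: float) -> float:
    """Acceleration factor between stress and use temperature (Arrhenius model).

    AF = exp((Ea / k) * (1 / T_use - 1 / T_stress)), temperatures in kelvin.
    """
    t_use = use_temp_c + 273.15
    t_stress = stress_temp_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use - 1.0 / t_stress))


def equivalent_field_hours(test_hours: float, acceleration: float) -> float:
    """Field hours represented by test_hours at the given acceleration factor."""
    return test_hours * acceleration


if __name__ == "__main__":
    # Hypothetical inputs: 0.7 eV activation energy, 40 C use, 85 C stress.
    af = arrhenius_acceleration(0.7, 40.0, 85.0)
    print(f"Acceleration factor: {af:.1f}")
    for test_hours in (2000, 5000):  # test durations quoted in the text
        years = equivalent_field_hours(test_hours, af) / 8760
        print(f"{test_hours} test hours ~ {years:.1f} years in the field")
```

The result is highly sensitive to the assumed activation energy, which is one reason the failure percentages derived from such tests are treated as estimates to be refined through continuous design improvement.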
16 NETCOMMS europe Volume V Issue 6 2015
www.netcommseurope.com