and near limitless computational
capacity. And, having organised nearly
all the world’s information, Google
attracts even more talent, and thus the
cycle continues.
If you, as an affiliate, regardless of
size, trigger any of its anomaly detection
algorithms, it will hurt your business.
Hold on, back up —
what do you mean it’s untrue?
According to an archive.org snapshot from 1 April, 2012, this page used to exist: https://web.archive.org/web/20120401005940/http://www.google.com/about/company/history
Nowadays, that page redirects to google.com/about (Google’s blog). Yet the April 2012 page details a history rather different from the one you read earlier. To quote:
Our history in depth
1995-1997
● 1995: Larry Page and Sergey Brin meet at Stanford. (Larry, 22, a U Michigan grad, is considering the school; Sergey, 21, is assigned to show him around.) According to some accounts, they disagree about almost everything during this first meeting.
● 1996: Larry and Sergey, now Stanford computer science grad students, begin collaborating on a search engine called BackRub.
● BackRub operates on Stanford servers for more than a year — eventually taking up too much bandwidth to suit the university.
● 1997: Larry and Sergey decide that the BackRub search engine needs a new name. After some brainstorming, they go with Google — a play on the word “googol,” a mathematical term for the number represented by the numeral 1 followed by 100 zeros. The use of the term reflects their mission to organize a seemingly infinite amount of information on the web.
So when, according to the 2017
statement, did Page and Brin meet?
Check that first quotation again. In truth,
Larry Page and Sergey Brin met at Stanford
in 1995, and their collaboration began
in 1996.
More specifically, their crawler began
exploring the web in March 1996.
BackRub was that crawler, as you can see from this archive.org page: bit.ly/backrub1997. The figures on that page date from 29 August, 1996.
By then, BackRub had managed:
● Total indexable HTML URLs: 75.2306 million
● Total content downloaded: 207.022 gigabytes
● Total indexable HTML pages downloaded: 30.6255 million
● Total indexable HTML pages which have not been attempted yet: 30.6822 million
● Total robots.txt excluded: 0.224249 million
● Total socket or connection errors: 1.31841 million
Note the message: “Sergey Brin has also been very involved and deserves many thanks.” And look again at the date. The last time this summary was updated was 29 August, 1996. The snapshot was recorded by archive.org on 10 December, 1997.
However slight that difference may be, such a discrepancy is as vital in the practice of SEO as it is in any court of law. Although the difference in this trivial example might appear to be simple nitpicking, a similar misjudgment arising from an error of even such a narrow margin could cost a large corporate igaming operator billions.
Case closed: fake news
By no means am I implying malicious intent; it’s only a tiny white lie. However, it is consistent with the public relations mastery through which we learned to trust Google, even with our most private information. Our trust in it is almost childlike, whether with our personal communications, our real-time location data, its news aggregation or even its entire legitimacy. We trust its claim to have fixed AdWords click fraud. We used to trust its compliance with competition law where shopping results were concerned, and we now allow it into our homes, as Dave Snyder
quite accurately predicted in 2011 on the
iGaming Affiliate Demon SEO panel
in Dublin (bit.ly/DavePredicts2011).
Can you imagine what would happen
if your private search history were to
become public information?
Because I suspect no malice on Google’s part, I think it more likely that the company heeded the message of Seth Godin’s July 2007 talk to Googlers (youtube.com/watch?v=AZnYRaQfjK4 or tinyurl.com/godin2007). He explained how ideas spread, and it makes sense to simplify the story of Google’s birthday. (At the time of the talk, his most recent book was All Marketers Are Liars: The Power of Telling Authentic Stories in a Low-Trust World.) My point was simply to illustrate the rigour required where search engines are concerned. So we start with the same methodology that first powered Google’s inverted index: the data structure that maps every term to the documents containing it, and the foundation of Google’s distributed performance. Let me explain the importance of Google’s innovations before we address their implications.
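As a minimal sketch of that methodology, the following Python snippet builds an inverted index over a toy corpus; the documents and the whitespace tokeniser are illustrative assumptions, far simpler than anything Google actually runs.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it.

    `docs` is {doc_id: text}; lowercase whitespace tokenisation is a
    deliberate simplification (no stemming, stop words or ranking).
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

# Hypothetical three-document corpus, for illustration only.
docs = {
    1: "Larry Page and Sergey Brin meet at Stanford",
    2: "BackRub operates on Stanford servers",
    3: "BackRub becomes Google",
}

index = build_inverted_index(docs)
print(index["backrub"])                      # {2, 3}
print(index["stanford"] & index["backrub"])  # {2}: an AND query is a set intersection
```

Because a query touches only the postings lists for its terms, such an index can be sharded across many machines, each serving its own slice of the term space, which is what makes the distributed performance possible.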
Deep learning the Google parts
To illustrate the trajectory we need to
observe Google’s history leading up to
2011 and the work of senior fellows Jeff
Dean and Geoffrey Hinton. Hinton is
currently emeritus professor of computer science at the University of Toronto and an engineering fellow at Google. He co-invented the Boltzmann machine in 1985 and is recognised as one of the pioneers of neural networks. Dean, having worked at Google since mid-1999, designed
and implemented many of the
innovations there, including: MapReduce, a programming model whereby distributed computation is carried out over multiple machines in parallel; and Bigtable, a distributed storage system for structured data designed to run on cheap commodity hardware connected via a network.
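For flavour, here is the canonical word-count example expressed in the MapReduce style. This single-process Python sketch only mimics the map, shuffle and reduce phases that Google’s framework distributes across thousands of machines; the corpus and function names are made up for illustration.

```python
from collections import defaultdict

def map_phase(record):
    # Map: turn each input record into (key, value) pairs.
    for word in record.lower().split():
        yield word, 1

def shuffle(pairs):
    # Shuffle: group all values by key. In the real framework this
    # grouping happens over the network between machines.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reduce: collapse each key's values into a single result.
    return key, sum(values)

records = ["the quick brown fox", "the lazy dog"]
pairs = (pair for record in records for pair in map_phase(record))
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts["the"])  # 2
```

The appeal of the model is that map and reduce are pure functions, so the framework is free to run them on as many machines as the data demands.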
In 2011, he and a small team of engineers
invented DistBelief. The following quote is taken from the original paper, Large Scale Distributed Deep Networks, which he co-authored with his colleagues, and which used tens of thousands of CPU cores to apply a parallelised training methodology to an object recognition task with 16 million images and 21,000 categories: “In this paper, we
consider the problem of training a deep
network with billions of parameters using
tens of thousands of CPU cores. We have
developed a software framework called
DistBelief that can utilize computing
clusters with thousands of machines to
train large models.”
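The core scheme, which the paper calls Downpour SGD, is asynchronous data parallelism: many workers compute gradients on their own shard of the data and push updates to a shared parameter server. Below is a minimal single-process sketch of that idea; the toy linear model, learning rate and two sequential “workers” are illustrative assumptions, not the paper’s actual configuration.

```python
import random

random.seed(0)

def gradient(w, x, y):
    # Gradient of the squared loss (w*x - y)**2 for a one-weight model.
    return 2 * (w * x - y) * x

# The "parameter server" holds the authoritative weight.
params = {"w": 0.0}

def worker(shard, steps, lr=0.001):
    # Downpour-style loop: fetch the current parameters, compute a
    # local gradient on this worker's shard, push the update back.
    for _ in range(steps):
        w = params["w"]              # fetch (stale reads are tolerated)
        x, y = random.choice(shard)
        params["w"] = w - lr * gradient(w, x, y)  # asynchronous push

# Data generated by y = 3x, split across two hypothetical workers.
data = [(x, 3 * x) for x in range(1, 11)]
for shard in (data[:5], data[5:]):  # run sequentially here; DistBelief
    worker(shard, steps=500)        # runs workers on separate machines
print(round(params["w"], 2))        # converges towards 3.0
```

The tolerance for stale parameters is the point: because no worker waits for any other, the scheme scales to the thousands of machines the quote describes, at the cost of slightly noisy updates.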
He then led a team that generalised
DistBelief into a library built on a
Python interface. Despite Python being
a relatively slow language, it is popular