Global Security and Intelligence Studies Volume 5, Number 1, Spring / Summer 2020 | Page 62
presence than non-political channels,
as political channels generate more divisive
discussions. Political channels
included liberal, conservative, and
neutral channels (e.g., Bernie Sanders’,
President Trump’s, and the Washington
Post’s Twitch channels, respectively).
To search for bots, we assumed that bot
users post more comments and post at
a higher rate than average users. We
deemed streams that returned data indicating
bot or bot-like user posting
as “anomalous.” The code used for this
project is located at
https://github.com/SferrellaA/twitch-analysis.
To prepare the dataset for analysis,
the commentScraper.py script
used the Twitch-Chat Downloader library
(https://pypi.org/project/tcd/) to
download the comments from the last
10 streams of Twitch channels listed in
config.ini. The comments were downloaded
as .srt (SubRip Subtitle) files,
which were then converted into .csv
(Comma-Separated Values) files with
the commentRefactor.py script.
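The conversion step can be illustrated with a minimal sketch. It is not the project's commentRefactor.py. It assumes standard SubRip cues (an index line, a timing line, then text) and further assumes that each cue's text has the form "username: comment", which may not match the downloader's actual output. The function names srt_to_rows and srt_to_csv and the CSV column names are hypothetical.

```python
import csv
import io
import re

# Matches the start of an SRT timing line such as "00:01:02,345 --> 00:01:04,345".
TIMING = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3}) --> ")


def srt_to_rows(srt_text):
    """Yield (offset_ms, username, comment) for each well-formed cue."""
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip cues missing an index, timing, or text line
        match = TIMING.match(lines[1])
        if match is None:
            continue  # skip cues whose timing line cannot be parsed
        hours, minutes, seconds, millis = map(int, match.groups())
        offset_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
        # Assumption: the cue text is laid out as "username: comment".
        username, _, comment = " ".join(lines[2:]).partition(": ")
        yield offset_ms, username, comment


def srt_to_csv(srt_text):
    """Return the cues as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["offset_ms", "username", "comment"])
    writer.writerows(srt_to_rows(srt_text))
    return buffer.getvalue()
```

Storing the cue start time as a millisecond offset keeps the later comment-speed calculations to simple integer arithmetic.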
To analyze the downloaded chat
logs, we ran the videoStats.py script.
For each individual stream,
the script performed the following steps:
1. A data structure was created that
associates a commenter’s username
with the number of comments they
wrote. That is, providing a given
number, such as three, would return
a list of all users who wrote three
comments.
2. A data structure was generated
that associates a commenter’s username
with their average and range of
comment speed (in milliseconds).
That is, providing a username
would return that user’s average and
range of comment speed.
a. Average comment speed was
defined as the average number
of milliseconds between the user’s
comments.
b. Range of comment speed was
defined as the difference between
the longest and shortest
time between the user’s comments.
To calculate this, only
users with at least three comments
were considered. Comment
speed range was not used
in this study, but could be used
in future iterations.
3. The mean and median comment
counts of each stream were then
calculated. Due to the nature of
Twitch’s platform, most streams
have right-skewed count distributions.
That is, most users write very
few comments, and a few users
write so many comments that they
bring the stream’s comment count
mean above the median.
a. Mean comment count was defined
as the average number of
comments posted by users. Users
who only watched a stream
but did not comment were not
considered in the results.
b. Median comment count was defined
as the middlemost count
of comments posted by users.
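Steps 1 through 3 above can be sketched as follows. This is an illustration under stated assumptions, not the videoStats.py script itself: it assumes each comment is available as an (offset_ms, username) pair, that comment speed means the gap between a user's consecutive comments, and that the average is computed for any user with at least two comments. All function names here are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean, median


def comment_counts(rows):
    """Step 1: map each username to the number of comments they wrote."""
    return Counter(username for _, username in rows)


def users_by_count(counts):
    """Invert the counts so a number looks up the users who wrote that many."""
    by_count = defaultdict(list)
    for username, count in counts.items():
        by_count[count].append(username)
    return dict(by_count)


def comment_speeds(rows):
    """Step 2: map each username to (average gap, gap range) in milliseconds."""
    offsets = defaultdict(list)
    for offset_ms, username in rows:
        offsets[username].append(offset_ms)
    speeds = {}
    for username, times in offsets.items():
        times.sort()
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        if not gaps:
            continue  # a single comment has no gap to measure
        # The range is only computed for users with at least three comments.
        gap_range = max(gaps) - min(gaps) if len(times) >= 3 else None
        speeds[username] = (mean(gaps), gap_range)
    return speeds


def count_summary(counts):
    """Step 3: mean and median comment count over users who commented."""
    values = list(counts.values())
    return mean(values), median(values)
```

Because only users who appear in the chat log are counted, viewers who never commented are excluded automatically. A stream in which one user writes ten comments and two users write one each yields a mean of four and a median of one, the right-skewed pattern described in step 3.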
4. The mean and median comment