Intelligent Issue 23 | Page 34


Web data collection:


When analysing data collection, it’s important to understand what is being collected and how it is being processed. Ron Kol, CTO at Bright Data, speaks to Intelligent SME.tech about public web scraping and debunks five myths surrounding it.

WAITING FOR DATA is boring.


Businesses of all kinds, especially consumer-facing brands that have to move quickly within their market and react to customer changes, are tired of waiting around for it. This is why public web data collection, perhaps better known as public web scraping, has become a ‘go-to’ strategic move. It helps organisations remain competitive in a volatile market, where one action triggers another, and another, and so on. Real-time data can answer many of the questions that top-level executives have about their future profits. However, this vast field carries more than a few myths around how it works, why it’s important and whom it benefits.
Before we continue, it is important to keep in mind that web scraping is an essential real-time resource that contributes to the success of an organisation. Some people still think the industry has ambiguous borders, but given how rapidly it is developing and expanding, it is important to dispel some of the myths that have recently become attached to it.
Myth #1: Harvesting, collecting, scraping, it’s all illegal... wrong!
Let’s put this one to bed: public web scraping is not illegal, full stop. Collecting data from a website is within the bounds established by the law as long as that data is freely available and not behind a paywall or log-in type portal. In fact, a recent judgement from a US Federal Court in the hiQ/LinkedIn case likens instances of public online scraping to window shopping.
Additionally, start-ups, SMEs and large corporations all take part in public online data collection to monitor the strategic choices and market trends of their rivals, as well as to conduct fresh market research on their own data. The main goal is to find new avenues for innovation and growth, while also making sure that an organisation doesn’t pass up any opportunities that could lead to further success.
As with all processes, it is vital that businesses follow compliance regulations. If their public web scraping is outsourced, they must always work with their data collection provider to ensure that all operations are legal and ethical. To avoid any doubt, businesses should work with providers to understand what can and can’t be collected, from both a legal and an ethical standpoint.
Because there are no regulations in this area, anyone collecting data has a moral obligation to ensure that what they collect is ethical and promotes the greater good. If it does not, they need to re-evaluate their plans; it would be immoral and also illegal to do otherwise.
Myth #2: Web scraping hurts businesses and makes it more difficult for them to compete.
Another myth busted! This is totally untrue; in fact, quite the opposite. Public web data collection, or web scraping, gives anyone accessing the Internet the transparency they need. It allows all players in the market to compete openly by providing accurate market research information. For example, if Company A wishes to set its own pricing strategy in