Web scraping has increased rapidly over the years, and in 2013 about 23% of all internet traffic was scraping-related. It is extremely easy to perform, as a simple online search will turn up step-by-step guides on how to do it. The growth may also reflect the fact that more people are connected to the internet now than ever before.
Should we combat web scraping?
There are some cases in which web scraping can be very disruptive to the general public. Because of this, the Computer Fraud and Abuse Act (CFAA) has been used to combat web scraping. The CFAA imposes liability on “whoever…intentionally accesses a computer without authorization or exceeds authorized access, and thereby obtains…information from any protected computer”. This has helped to resolve many cases, but one underlying question remains: does web scraping actually violate the CFAA?
The CFAA was originally intended to be an anti-hacking statute, not a tool for combating web scraping. Many think that web scraping doesn’t violate the CFAA because it only accesses information on websites that is already publicly available. However, things are never as simple as they seem, and there are two sides to this argument. There are many cases where the CFAA has been invoked against web scraping.
On one side of the web scraping debate are the people who are simply accessing public information and are not actually harming anyone by doing so. For example, companies with limited resources use web scraping to gather large amounts of data quickly, so that they can identify what the market needs according to consumer demand. This arguably helps the general public if those companies then produce whatever is in demand, so here there would probably not be many complaints about violating the CFAA.
However, on the flip side of this argument are the people who use web scraping to con people out of money, or just to disrupt day-to-day life. These scrapers take data and use it against the individuals it describes. An example of this is a website called Jerk.com, which supposedly scraped personal information from Facebook and used it to create profiles labelling people as a “Jerk” or “not a Jerk”. The people who were affected were then allegedly told that they could have the labels removed by paying $30 to the website. This affected around 73 million people. It is fair to say that these victims (including children) were not amused by this kind of web scraping, so it is easy to understand why many argue that web scraping violates the CFAA.
Has the CFAA been successful in combating web scraping?
Yes and no. The CFAA was not originally intended to combat web scraping; as mentioned earlier, it was designed to prevent hacking. As a result, claiming a CFAA violation has succeeded in some scraping cases brought before the courts, whereas in others the courts have been less willing to accept that the CFAA was violated.
Again, there are usually two types of cases brought forward. One of these is brought by websites that have been scraped, have contacted the website that is scraping them, and have clearly revoked its authorization to use any information gathered. If the scraping website continues to gather information despite being politely asked (or more likely warned) to stop, it can be taken to court for violating the CFAA. These types of cases have had the most success in court. This is most likely because the ‘victim’ website has contacted the other website and tried to solve the problem, so if that website then continues to scrape, it is clearly doing so without authorization.
So, from this, how do we know when publicly available data actually is publicly available? Should we really upload personal information to websites such as Facebook? This is still an ongoing debate with many sides. However, it is good to be aware of everything that can happen when using the internet. Being socially aware is the key to protecting yourself against online fraud, and if something does eventually affect you, remember that when properly used, the CFAA remains a feasible tool to combat web scraping.