This article was first published on LexisNexis®PSL, August 2017. 

TMT analysis: LinkedIn has recently attempted to prevent third parties from ‘scraping’ its publicly available member profile data. Toby Headdon, senior associate at Bristows, explains the case, which was heard in the US district court in San Francisco, and examines how a UK court would potentially deal with a similar scenario.

What was this case about?

LinkedIn sought to prevent hiQ (a provider of information to businesses about their workforces, based upon statistical analysis of publicly available data) from accessing, scraping and using information on LinkedIn users’ profiles. As its business is wholly dependent on this information, hiQ sought a preliminary injunction to stop LinkedIn from blocking its access. hiQ also sought a declaration that it would not, by accessing this information, violate the federal Computer Fraud and Abuse Act (CFAA), as LinkedIn had alleged.

The US district court had to consider whether hiQ had established that:

  • it would be likely to succeed on the merits
  • without the preliminary injunction it would be likely to suffer irreparable harm
  • the balance of hardships tipped in its favour and
  • the injunction was in the public interest

Ultimately the district court sided with hiQ and granted the preliminary injunction it had sought.

Notably, the district court expressed serious doubt that hiQ would violate the CFAA, which was intended to address hacking rather than to police traffic to publicly available websites on the internet. The court accepted that hiQ had raised serious questions as to whether LinkedIn was unfairly leveraging its power in the networking market to secure an advantage in the data analytics market (where there was evidence that LinkedIn was seeking to compete with hiQ), and that, given its dependence on LinkedIn’s user profiles, hiQ would be likely to suffer irreparable harm.

Ultimately, the court was of the view that the public interest favoured hiQ’s position. hiQ had argued that, if LinkedIn were allowed to block scraping, it would be tantamount to allowing a private party to decide ‘who gets to participate in the marketplace of ideas located in the “modern public square” of the internet’. Conversely, LinkedIn claimed to be safeguarding its users’ privacy, although the court felt that its users’ privacy interest in their public data was at best uncertain.

How would this case likely be treated in the UK courts?

The requirements for the award of a preliminary injunction in the UK are quite similar to those in the US. The court essentially asks whether there is a serious question to be tried (ie is there an arguable case), whether damages would be an adequate remedy (for either hiQ or LinkedIn) and where the balance of convenience lies.

Factors likely to weigh in a UK court’s analysis would include:

  • the significant possibility that, absent a preliminary injunction being granted, hiQ would be driven out of business
  • the fact that LinkedIn could ultimately be compensated for any losses by an award of damages
  • LinkedIn’s power in the professional networking market and possible breaches of competition law (eg a refusal to supply)
  • uncertainty surrounding the enforceability of LinkedIn’s user agreement (which includes specific restrictions on scraping-related activities) against a web scraping program or the persons deploying that program
  • the fact that LinkedIn’s user profiles are already publicly available
  • the fact that under the terms of its user agreement, LinkedIn does not own intellectual property rights in those profiles
  • the fact that hiQ’s access to those profiles would be very unlikely to amount to hacking under the Computer Misuse Act 1990 (the UK’s equivalent of the CFAA)

Provided that hiQ could satisfy the court that it has an arguable case, the UK court would probably also grant a preliminary injunction.

What steps can a website take technically and legally, in the UK, to prevent a third party from scraping data?

It is ultimately very difficult to stop a determined scraper. From a legal perspective, a website owner could:

  • if it owns any intellectual property rights in the website (eg copyright or database rights), seek to rely on those rights to obtain an injunction against the scraper, and/or
  • include a suitable restriction on scraping in its terms of use (preferably with those terms being ‘click-wrap’ to ensure they are contractually enforceable against those who deploy scraping programs)

On a more technical level, there is a whole range of steps that a website owner could take depending upon the type of scraping in issue. Some of these measures, however, may have unwelcome collateral effects, such as conflicting or interfering with the use and functionality of the website in other ways. Some of the main types of action available to a website owner are to:

  • deploy the ‘robots exclusion protocol’ (a robots.txt file) to instruct visiting webcrawler programs not to scrape the relevant website (although a rogue program may simply choose to ignore those instructions)
  • deploy challenge responses such as ‘CAPTCHA’, requiring the user to enter a code or select a series of nominated images, to filter out non-human users, and/or
  • require registration and login to access the website
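
By way of illustration only, the robots exclusion protocol works through a plain-text file (robots.txt) published by the website, which a well-behaved crawler reads before requesting any page. The following Python sketch uses the standard library's urllib.robotparser module to show how a compliant crawler might interpret such a file. The file content, the user-agent name ‘example-bot’, the helper function may_fetch and the example.com addresses are all hypothetical and are not taken from LinkedIn's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only:
# every crawler is asked to stay out of the /profiles/ area.
ROBOTS_TXT = """\
User-agent: *
Disallow: /profiles/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/profiles/jane-doe",
        "https://www.example.com/about",
    ):
        verdict = "allowed" if may_fetch(ROBOTS_TXT, "example-bot", url) else "disallowed"
        print(url, "is", verdict)
```

The check is carried out by the crawler itself, not by the website. This is why the protocol operates as a request rather than a barrier: a rogue program can simply omit the check and fetch the pages regardless.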