Text and data mining
Computer-generated works


The UK Intellectual Property Office (UKIPO) has published the outcome of its consultation on the relationship between artificial intelligence (AI) and IP rights. On the copyright aspects of the consultation, the headline is that the government intends to extend the current text and data mining exception, giving AI developers greater access to data to train AI tools than ever before. For further details on the questions asked in the consultation, see "UK consultation on copyright and its role in development and deployment of AI technology".

Text and data mining

An AI tool must be trained, and this process requires vast amounts of data to be copied and analysed so that patterns, trends and other useful information can be identified and used to support the tool's decision making. This copying and analysis of data is known as "text and data mining" (TDM). Currently, access to much of that data is prevented by the copyright or database right subsisting in it, and an AI developer will need to obtain a licence (at cost) to use the data for TDM, otherwise the copyright and/or database right will be infringed.

If a work is not available for TDM on generic open licence conditions, it may be necessary to negotiate an individual licence with the rights holder. This is not always straightforward. It can be difficult, for example, to identify who the rights holder is, and they may be reluctant to negotiate or grant a licence at all. This has raised concerns that copyright and database right could hinder AI development projects by severely restricting the data available.

There is an existing TDM exception in section 29A of the Copyright, Designs and Patents Act 1988, but this is unhelpful to AI developers in most cases. It permits TDM of copyright works for non-commercial purposes only, and does not extend to works protected by database right. It therefore cannot be relied upon where the AI tool will be commercialised (which is most cases).

One of the UK government's stated aims is to make the United Kingdom a global leader in AI. Recognising the issues outlined above, the UKIPO's consultation considered whether the TDM licensing environment should be improved, or whether the TDM exception in section 29A should be extended, in both cases to make it easier to data mine materials. The UKIPO proposed five possible options, ranging from maintaining the status quo through to a very broad extension of the TDM exception to permit TDM in most cases.

There were 60 respondents to this aspect of the consultation. Rights holders favoured no change to the existing position, or alternatively improved licensing solutions. In their view, no problems existed around licensing for TDM, some saying that they had never refused a request for a licence. They expressed the preference to continue to control the licensing of their works themselves without government intervention. Generally, rights holders were against an extension to the TDM exception, arguing that this would interfere with their rights to exploit their works and could lead to a decline in the creation and curation of datasets for mining, because those datasets could no longer be licensed.

Users of copyright and database material favoured a wider exception rather than improvements to the licensing regime. They highlighted the costs of licensing and difficulties in obtaining licences, especially when many rights holders were involved. Users felt that a broader TDM exception would be the best way to reduce the cost of TDM in the United Kingdom and highlighted more attractive TDM conditions in other countries where broad exceptions already exist.

In its published consultation outcome, the government has announced that it has decided to extend the TDM exception. It has chosen the most radical of the five options on the table, which means that the TDM exception will be extended to permit TDM of works protected by both copyright and database right for both commercial and non-commercial purposes without the need for a licence. Rights holders will not be able to opt their works out of this regime, and it will not be possible to contract out of it either. However, the extended exception will, as now, require AI developers to have lawful access to the works they want to mine. Rights holders will be able to choose the platform on which they make their works available, including charging for access via subscription or single charge.
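The difference between the existing and the extended exception can be sketched as a simple decision rule. The Python below is only an illustration of the conditions as described in this article, not a statement of the legislation, which has not yet been drafted. The class `MiningRequest` and both function names are hypothetical and were introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiningRequest:
    """Hypothetical description of a proposed TDM activity."""
    database_right: bool      # material is protected by database right
    commercial_purpose: bool  # the mining serves a commercial purpose
    lawful_access: bool       # the miner has lawful access to the material


def existing_exception_applies(req: MiningRequest) -> bool:
    # As described in the article: copyright works only,
    # non-commercial purposes only, and lawful access is required.
    return (
        req.lawful_access
        and not req.commercial_purpose
        and not req.database_right
    )


def extended_exception_applies(req: MiningRequest) -> bool:
    # As announced: copyright and database right, commercial and
    # non-commercial purposes; lawful access remains a condition.
    return req.lawful_access


# A commercial AI developer with a paid subscription to a database
commercial_dev = MiningRequest(
    database_right=True, commercial_purpose=True, lawful_access=True
)
print(existing_exception_applies(commercial_dev))  # False
print(extended_exception_applies(commercial_dev))  # True
```

Under this simplified reading, lawful access is the only condition that survives the extension, which is why access charges (subscription or single charge) become the rights holder's main commercial lever.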

In its publication, the government says that it chose this option as being the one "most supportive of AI and wider innovation". It highlights its ambition to make the United Kingdom a global centre for AI innovation and says that the extended exception "will ensure the UK's copyright laws are among the most innovation-friendly in the world . . . with rights holders having safeguards to protect their content".

In deciding to extend the TDM exception, it seems that the government was driven by its desire to "make the most of the greater flexibilities following Brexit" and establish the United Kingdom as the jurisdiction of choice for AI developers and investors in AI alike. The extended exception will create a highly favourable AI environment in the United Kingdom. There will be no need to obtain permission for TDM from potentially multiple rights holders and no licence fee to pay, other than where required to access the works. This will likely speed up the TDM process and development of AI. The new exception will also bring benefits to those engaged in TDM in other industries, including research, journalism, marketing, business analytics and cultural heritage.

The risk is that rights holders will simply refuse to publish their data. This would be detrimental to future progress on AI development and undermine the very reason for extending the TDM exception in the first place. The extended exception may also adversely affect those who have built business models around data licensing.

The government intends to introduce legislation to extend the TDM exception "in due course".

Computer-generated works

A second issue addressed in the consultation was whether creative works generated solely by AI without any human intervention should be protected by copyright. Currently, an AI-generated work which is an original literary, dramatic, musical or artistic work is given a special form of copyright protection as a "computer-generated work" (CGW), lasting 50 years from the date on which it was made. The United Kingdom is one of only a handful of countries that give copyright protection to creative works generated solely by AI.

The arguments for giving copyright protection to CGW include incentivising businesses to develop and deploy AI technology, which is more likely where the "person by whom the arrangements necessary for the creation of the work are undertaken" is given exclusive rights to exploit the creative output and recoup their investment costs. However, others argue that copyright is intended only to reward human creative endeavour, the originality requirement being evidence of that, and that the current protection for CGW risks devaluing human creativity, and human creators being "crowded out".

As part of the consultation, the UKIPO asked whether CGW should continue to be protected by copyright and, if so, how. Options included maintaining the status quo, removing copyright protection for CGW altogether or adopting a middle ground of a reduced term of copyright protection of CGW (five years was suggested).

There were 61 responses to this aspect of the consultation and the majority favoured no change to the law. They argued that there is no evidence that existing copyright protection for CGW causes harm, and that the longer-term implications of removing protection are unclear. Changing the law, they said, would remove legal certainty, while leaving the law unchanged offered stability. Some conceded that as AI and its uses develop, the issue might need to be looked at again, and that alternative approaches might become suitable in future.

The UKIPO agreed with the majority and decided to make no change to the law on copyright protection of CGW. The UKIPO said there was:

"no evidence at present that protection for CGWs is harmful, and the use of AI [to generate creative content] is still in its early stages. As such, a proper evaluation of the options is not possible . . . . [and] the future impacts of this provision are uncertain. It is unclear whether removing [copyright protection for CGW] would either promote or discourage innovation and the use of AI for the public good . . . . we will keep the law under review and could amend, replace or remove protection in future if the evidence supports it."

The UKIPO dismissed concerns that maintaining copyright protection for CGW could encourage false attribution of works to human authors in order to obtain the longer term of copyright protection available, or for reputational reasons. The UKIPO said that the existing law on passing off, fraud and privacy, and the need to establish authorship in cases of alleged copyright infringement, was adequate to deal with this and that it saw no need for intervention on the issue.


The UKIPO's decisions on both copyright protection for CGW and the TDM exception make the United Kingdom an attractive location for AI businesses and investors. In terms of next steps, AI developers need make no immediate changes to their existing processes and practices. They should, however, keep a watchful eye on the implementation of the extended TDM exception, following which they will have greater freedom around data use.

Rights holders should similarly monitor the implementation of the new TDM exception. In the meantime, they may want to review which, if any, of their works they want to withdraw from public view, and work with the platforms on which their other works appear in readiness for introducing a subscription wall and payment terms.

For further information on this topic please contact Cerys Wyn Davies or Gill Dennis at Pinsent Masons by telephone (+44 20 7418 8250) or by email. The Pinsent Masons website can be accessed at www.pinsentmasons.com.