In the intense debate over the growth of artificial intelligence (AI), the system most talked about recently is ChatGPT. Until recently, the technology used in ChatGPT was regarded as a curiosity, whereas now it has suddenly and spectacularly become part of daily life, and of commerce as well. The debate has moved from the realm of science, or even science fiction, to specific matters with major implications for the IT and technology sector and beyond. In this article, we examine the main issues that need to be considered when making practical use of ChatGPT and similar systems in business activity, and how to account for the risk posed by use of ChatGPT by employees and contractors.
ChatGPT is probably the most recognizable AI tool in recent times, alongside art generators such as DALL·E 2 or Midjourney. It is a chatbot, i.e. software that conducts a conversation simulating human interaction. The application uses a Generative Pre-trained Transformer (GPT-3), a language model that uses deep learning to produce human-like text. It is therefore primarily intended to produce responses to the user’s questions.
To achieve this functionality, the entire model is trained, using that deep learning process, by analysis of huge amounts of data, including data from publicly available databases. This data is combined with information obtained through interaction with human users during earlier development of the GPT model, specifically with the previous versions, GPT-1 and GPT-2.
Millions of people around the world, including the authors of this article, have found that ChatGPT is capable of responding to questions posed by users, and performing specific tasks requested, from smoothly written and high-quality translations through providing solutions to various problems, to generation of complex text or writing computer programs.
Because the quality of the generated responses is in fact surprisingly high, the current, publicly available version of ChatGPT has proved useful for creating computer programs, producing marketing texts and social media posts, automated text editing, or, for example, creating a presentation or drawing up documentation. The system can then modify the generated content according to parameters selected at will (such as rewriting using a different style of expression). Importantly, the content is produced instantly.
ChatGPT and the technology it uses have huge potential to become a commonplace tool used in commerce to perform (or at least be used as a major aid in performing) tasks of which only humans were capable in the past (and which were considered an area of human creativity). Meanwhile, the next version has now been announced, which uses an enhanced version of the model, GPT-4, trained using a much larger amount of data and many more parameters.
This will mean that there are many more practical uses for ChatGPT, and an increasing number of people will use it at work using the free version, or indirectly through GPT-3-based commercial tools (using the API supplied).
For this reason, it is natural to ask whether ChatGPT and ChatGPT-based tools can be used, in addition to personal purposes, in business, whether this is safe and lawful, and what kinds of risk this might involve.
The legal issues central to addressing these concerns are examined below.
Firstly – data
For ChatGPT to generate content of any kind, first it has to be asked the appropriate question. From this perspective, there are two situations in which the tool is used, depending on the type of input data:
- in the first situation, specific commands intended to generate entirely new content do not include data that is protected or subject to special laws, for example a command telling the system to produce a brief article on a general subject.
- in the second, specific data, content, or text input is required, to modify, expand, or correct the data and content, or to produce responses, while the input data might include data that is protected or subject to special laws, for example personal data.
The two ways in which ChatGPT can be used, according to the range of provided information, involve different risks for users.
When a user inputs data of any kind into ChatGPT, they provide that data to be used by the tool provider – the US company OpenAI L.L.C.
It is true that the OpenAI service terms state that data are not processed entirely without restriction, and the provider enables a user to withdraw consent to processing (at least certain users can do so). However, any data input into ChatGPT are provided for use and processing by the provider, which has its seat in the USA, and the extent of the processing is not stated specifically. In addition, under the service terms, the ChatGPT provider does not make any specific undertakings towards the user of non-disclosure of the input data.
Due to the above, when inputting into ChatGPT data that constitutes a business secret or is otherwise confidential, a user discloses that data to a third party for use for an indeterminate purpose, explicitly giving them permission to use it in that way. Moreover, it is clear from information released publicly that the provider of ChatGPT does have the technical capability of accessing processed information for the purpose of machine learning, whether by automated means or manually, i.e. directly by its employees or contractors.
It follows that if a user inputs into ChatGPT data that includes details of business strategy, there is a risk that the document will not only be obtained by the provider’s personnel, but also be provided to other users, or used to produce a comparable strategy for the competition. Similarly, if a business undertaking inputs confidential information relating to a client or business counterparty, it could be viewed by an unlimited number of persons on the side of the provider and affiliates of the provider.
This means that using ChatGPT for business purposes to analyze confidential information could amount to unauthorized disclosure of information protected by law (a business secret, or information that is privileged in other ways under specific laws, such as medical or banking secrets), and thereby breach contractual obligations or the law.
For this reason, proper oversight is needed of use of ChatGPT in business, and this includes drafting the relevant internal regulations and guidelines on whether and to what extent the tool can be used by staff or contractors.
Laws on personal data processing are a different issue.
Notwithstanding the above, data input into ChatGPT and processed as described above could include personal data.
Clearly, where it involves personal data processing, use of this system by Polish and European users is directly governed by the GDPR, and therefore this must comply with the rules laid down in law. In addition, where ChatGPT is used in this way, this could result in personal data being transferred to third countries (as at the day on which this article was written, according to the authors’ information, the ChatGPT infrastructure within which data processing takes place is located in the US).
If doubts arise, the legal risk connected with using ChatGPT under data protection laws can be eliminated by deleting data of this kind from any content before it is input into the tool.
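By way of illustration only, the deletion of personal data described above could be partly automated before any text is submitted to the tool. The sketch below is a minimal example under stated assumptions: the function name `redact_personal_data` and both patterns are hypothetical and chosen for this article, and they catch only e-mail addresses and phone-like numbers. Names, addresses, identification numbers, and other personal data are not detected, so a manual review of the text remains necessary.

```python
import re

# Illustrative patterns only (assumptions made for this sketch).
# They are NOT an exhaustive detector of personal data.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def redact_personal_data(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


prompt = "Please summarize: contact Jan at jan.kowalski@example.com or +48 600 700 800."
safe_prompt = redact_personal_data(prompt)
print(safe_prompt)
```

Note that the first name in the sample prompt is left untouched, which shows why a filter of this kind can only support, and not replace, the organization's own rules on what may be input into the tool.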
Secondly – intellectual property issues
One of the principal topics in the intense debate over AI and the increasing range of AI applications is of course the question of intellectual property rights, and this applies equally to ChatGPT. Three main legal issues arise in this regard:
- possible infringement of third-party intellectual property rights when the system is “learning” and when it is launched;
- the status of the generated content and how to protect it;
- who holds the right to use the generated content.
Machine learning and copyright
The issue of machine learning is a highly emotive one, and has already led to specific legal measures being taken, in particular lawsuits against AI system providers. Training the system concerned requires systematic analysis and processing of huge amounts of data, including data that constitutes works protected under copyright law in various jurisdictions. The debate centers around the right of operators of systems that “learn” to make use of publicly available databases. The main objection raised against operators of systems of this kind concerns web scraping, i.e. obtaining and processing huge amounts of available data in an automated manner, allegedly unlawfully, which is then used to produce content. The parties that raise claims say that this conduct infringes the rights of the original authors or other rightholders. Importantly, holders of rights to databases on which AI systems “learn” are able to file similar claims.
Legislators in various countries around the world and in the EU have recognized this problem, and in the EU, elements of a regulatory framework addressing the issue were introduced in the DSM Directive (Directive (EU) 2019/790 on copyright and related rights in the Digital Single Market). Under the DSM Directive, national legislatures are required to pass laws enabling third parties to reproduce databases or works in the meaning of copyright law for the purpose of machine learning. This applies to both academic and commercial use, while a rightholder can refuse to give consent with respect to commercial use. The respective laws have been passed in some EU countries. In the case of Poland, the legislative process and work on amendments is ongoing (as of the moment this article went to press).
Clearly, this issue is very interesting from a legal point of view. In the authors’ view, however, given the practical aspects of using ChatGPT and similar tools, it is crucial primarily for the authors and providers of systems of this kind. It is these parties that will be the first to face a risk from the claims being pursued.
Whether or not it is a copyrighted work
The other issue, which has many more practical implications for users of ChatGPT and similar systems, is the status of the generated content. The starting point for considering this issue is the question of whether this content constitutes copyrighted works. The debate among experts has become very important, and resolving the issue has now become a matter of urgency. The concerns stem from the fact that in most legal systems (the Polish legal system being one of them), one of the essential criteria for an element to be considered a work in the meaning of copyright law is that it has to be the product of human creation. It is a fundamental premise of copyright law that it protects the product of human creativity.
This leads to the view that even if particular content generated by ChatGPT has features that are identical to content created by a human author, it does not constitute a work in the meaning of copyright law, as the legal requirement of human creative output is not met. This approach means that copyright protection does not apply, and therefore it is permitted for example to freely copy, adapt, and make commercial use of content of this kind.
The opposite viewpoint is also taken, that content generated by ChatGPT and similar systems can be considered copyrighted works under current copyright law, because ultimately the creator is human. There are various proposals in this regard, while in this approach the author is identified for instance as the operator of a particular system (or the party that constructed and “trained” it), or also the end-user, as the end-user defines the criteria for the generated content, and thus plays the fundamental creative role in creation of the content.
This issue is currently unresolved. Intervention of the legislature in specific jurisdictions is needed to address these concerns. Of course, the issue could also be settled in court rulings. The first cases have now been filed with courts in various countries around the world.
In the authors’ view, in the context described, the institution of computer-generated works, which has existed for years in certain jurisdictions such as the United Kingdom, is noteworthy. Under the UK’s Copyright, Designs and Patents Act 1988, a work is computer-generated if it is generated by computer in circumstances such that there is no human author of the work. These works are protected by copyright, and the author is taken to be the person by whom the arrangements necessary for the creation of the work are undertaken.
A further legal concern regarding the status of content generated by ChatGPT and similar systems is that, if this content is considered to constitute a work, a debate arises as to the relationship between this content and the works used in the “training” of the system in question (such as ChatGPT). The extreme view is that content generated using AI must be considered a derivative work (or the equivalent of this institution in other jurisdictions) in relation to the works on the basis of which the system did the “learning”. According to this concept, use of content generated by the system in question would require, for example, permission of the original authors. In practice this would mean fulfilling the necessary formalities and making the appropriate payments to them. This would also have serious consequences for ChatGPT end-users, because using content generated by AI in their business activity could infringe the rights of authors of the original works, and trigger direct liability, notwithstanding the liability of the system provider.
In our view, this is a flawed standpoint, reached due to a lack of awareness of the “technical” nature of operations of ChatGPT and similar tools. Nevertheless, it is an excellent illustration that fundamental issues regarding use of this revolutionary technology remain unresolved, and of the high level of legal uncertainty surrounding its use. More importantly, the current copyright framework is based on rules formulated at a time when artificial intelligence did not exist, and was talked about only in the realm of science fiction. Clearly, the main copyright institutions are not suited to this new reality. At the moment, it is essential to reflect on elements of law as fundamental as the characteristics of a work. For this reason, ChatGPT users need to follow the ongoing debate closely, as use of generated content could prove to be an infringement of third-party rights at a certain point.
If it is a copyrighted work, then whose?
To pursue this further, if content generated by ChatGPT may be a work in the meaning of copyright law (provided that it meets the other criteria specified in law), the question is who holds the rights to the work.
The conditions for using ChatGPT do not provide a definitive answer. To the extent permitted by law, under the OpenAI service terms, OpenAI transfers all rights to content generated by the provided tools, including by ChatGPT, to the user: “OpenAI hereby assigns to you all its right, title and interest in and to Output.” At the same time, the provider stipulates that “OpenAI may use Content as necessary to provide and maintain the Services, comply with applicable law, and enforce our policies.” Furthermore, the user is responsible for ensuring that the generated content does not breach the law or the OpenAI service terms. Importantly, those rules also state explicitly that generated content is not necessarily unique in nature, and multiple users may obtain the same or very similar content. In addition, the service terms do not explicitly place any restrictions on the purposes for which the generated content is used (such as commercial use).
Interestingly, when ChatGPT is asked about this issue, it responds that all rights to generated content are vested in the supplier (OpenAI), and that this content may not be used for commercial purposes. However, as this article goes to press, the data used in ChatGPT is current only up until 2021 (and in certain exceptional cases 2022), and therefore the starting point has to be the currently applicable conditions of use.
This does not alter the fact that the standpoint of the provider of ChatGPT is that first and foremost it holds the rights to generated content, which it then transfers as far as possible to the end-user. Of course, this issue continues to raise serious concerns under generally applicable law, as mentioned above, and this means that the provider’s standpoint should be approached with great care.
In our view, however, the service terms formulated in this way for OpenAI services significantly reduce the risk arising from an end-user using content for commercial purposes, because the provider gives the user a broad license to use the content, and this is an entirely understandable approach from a business point of view. The provider has stated that its prime goal is to commercialize the developed technology, and not to benefit financially from use of the generated works. Clearly, to attain this goal, end-users have to be allowed to benefit from using content generated by ChatGPT and other tools as far as possible.
However, even favorable rules laid down in an agreement with the ChatGPT provider will not provide sufficient protection if the generated works are found to breach third-party rights (for example the rights of authors of works used to “train” that system). For this reason, caution is always advised when using that content, above all if it is published unamended, especially as tools for detecting content generated by ChatGPT are becoming increasingly common.
Thirdly – AI can make mistakes too
Finally, ChatGPT-generated content will not always be correct and true. OpenAI states this specifically in its communications and service terms, and thereby also states that its liability is excluded as far as possible.
Put briefly, content generated by ChatGPT can contain flaws and harm the person asking the question. An end-user uses the system at their own risk and is themselves accountable, and it will not be possible to raise any claims against the provider.
How ChatGPT can be used safely at the moment
Evidently, there are more questions than answers as concerns the current debate over legal aspects of AI systems. Moreover, the most fundamental issues need to be addressed, above all whether content generated by tools of this kind is protected by law, and on what grounds. On the other hand, the benefits of technological systems of this kind are so great that many persons and institutions have begun using them in their business activity or will consider doing so, especially as providers allow them to be commercialized as part of dedicated tools that use the API supplied.
Therefore, despite the described legal concerns, it is now possible to determine some general rules on using ChatGPT and similar tools:
- content that may include personal data, data constituting a business secret or which is confidential, privileged communication, and any data that may not be disclosed to third parties, should not be input into tools of this type. It needs to be borne in mind that any content input into ChatGPT could be disclosed or used by the provider and affiliates;
- when using ChatGPT, the user transfers data to a third country, as they transfer data to the provider, which has its seat in the United States;
- major legal concerns as to the nature of content generated by ChatGPT need to be taken into account. This content could be found to infringe rights, such as copyright or third-party rights, in the future;
- ChatGPT-generated content is not necessarily always correct, and there may be no grounds for a claim against the provider in this respect.
Also, even if we or our organization decide not to use tools similar to ChatGPT, they are used today by many firms on the market. As this technology develops, it will be increasingly common. For this reason, solutions should be implemented now to protect the interests of individuals or firms.
- Above all, it is recommended that the appropriate clauses be inserted into contracts with employees, contractors (in particular those working on a B2B basis) or external vendors. Those clauses should govern the issue of when ChatGPT and similar tools may be used, and the consequences of using the generated content in performance of an agreement (for example if use of this content is found to be an infringement of third-party rights). As a minimum, the appropriate notification obligation regarding use of tools of this kind should be introduced. Introducing such clauses as soon as possible (for example by revising agreement templates accordingly) will help to reduce costs and mitigate the risk of grave problems occurring in the future.
- Secondly, introduction of general policies on using AI tools is recommended – especially in organizations in which work of a creative nature is a core element of the activities, for example firms in the creative or new technologies sector, or software manufacturers. These policies can be used to address the entire range of issues relating to use of AI by personnel or suppliers. This could prove to be vital to ensure security and confidentiality of data processed within the organization. This is currently one of the essential factors determining a firm’s success. Introducing policies of this kind will certainly also raise personnel and contractor awareness and stress the risk connected with using ChatGPT and similar tools, and the respective limitations.
Clearly, the measures described above will render the organization more secure for the future, when it is likely that AI will be as commonplace as the Internet today.