Background
Data ownership considerations
PI considerations
Data quality considerations
Transparency considerations
Comment
"Artificial intelligence" (AI) is a broad term that encompasses a variety of computing techniques with many different views and approaches. AI developments may be of particular interest in the healthcare industry (for further details, see "Thinking of using AI in a medical context? Key issues to consider"). This article outlines some of the data-related considerations that Canadian companies should be aware of before embarking on such projects.(1)
Using an AI system to solve a business problem will likely involve machine learning and therefore require a training data set. That data set is used to "train" a machine learning model, which the AI system then employs to generate results or solutions.
Machine learning models are created by feeding an algorithm numerous instances of input data, each paired with a known corresponding output. The algorithm uses those input-output pairs to infer relationships between the input data and the desired type of output (a simple illustration of this training step appears after the list below). There are various types of algorithms, including:
- deep learning;
- neural networks; and
- classifiers.
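To make the training step described above concrete, the following is a minimal sketch in Python using the scikit-learn library. The synthetic feature data and the choice of a logistic regression classifier are illustrative assumptions only, not a description of any particular AI tool.

```python
# Minimal sketch of supervised learning: paired inputs and known outputs
# are used to fit a model that can then label new, unseen inputs.
# All data here is synthetic and purely hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical training set: each row is an input instance (eg, features
# extracted from a medical image), paired with a known output label
# (1 = finding present, 0 = no finding).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))             # 200 instances, 5 features each
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # known output for each instance

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)   # the "training" step
print("accuracy on held-out data:", model.score(X_test, y_test))
```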
Data ownership considerations
In industrial settings, AI tools may be built on top of data sets that are licensed in, or on existing systems that generate or collect data for other purposes. Often, data is sourced from, collected by or generated by some kind of specialised system. For example, data may be generated by process control or automation software or hardware, or by sensors installed for such controls. Sometimes, third-party technology is involved in this data generation or collection, which raises the possibility that third parties may have rights in the data.
Contractual issues
Consider the example of an AI tool designed to detect lung cancer: a supplier of imaging devices may use its own image-processing software to clean up images acquired using its machines and may be concerned about how that processed data is used. It is therefore not uncommon for vendors of specialised equipment to assert some measure of control over data collected using the equipment.
Another possibility is that the imaging devices are operated using specialised software that is itself subject to restrictions – if data is generated by that software, this raises the possibility that "use" of the data is akin to "use" of the software and is therefore subject to the same restrictions.
On the other hand, for something simpler – such as an off-the-shelf digital thermometer that simply outputs a stream of temperature values – the vendor may not care what is done with those temperature values.
In contracts – particularly software contracts – it is important to understand:
- what they say about who owns the data; and
- what restrictions they place on the ways in which the data can be used.
These provisions can vary depending on the parties involved, and it is important for parties to consider whether they suit their needs.
Common terms are:
- limitations on external use;
- limitations on use for specific purposes;
- prohibitions on commercial use; and
- prohibitions on creating competing or derivative products.
It is therefore important to consider where and how the AI tool will be offered and used. Parties should also be aware that the uses and purposes of the AI tool may differ from those originally envisioned for the underlying data, which can be a source of problems.
In relation to the above example of an AI tool designed to detect lung cancer, a prohibition on creating derivative products of images or a limitation restricting use to internal purposes may not pose an issue for clinical use of the images by physicians and may not trouble a hospital contracting with an imaging supplier. However, those limitations could be much more problematic if the data is to be used to create a competing AI product.
Rights in products derived from data
There are also rights that apply not to the data itself, but to products derived from the data or to associated intellectual property (IP) rights.
Often, contracts relating to data refer to "derivative works" or "improvements" and either prohibit the development of such works or define their ownership. In the context of the lung cancer detection tool example, a contract for clinical imaging systems or software may provide that derivative products, improvements or intellectual property produced using the data are to be owned by the vendor.
When the contract is negotiated, certain derivative products may be envisioned. However, if the images are later used to build a software tool that uses AI, the AI model itself could be considered a derivative work and may be subject to those limitations or ownership provisions. In the case of AI, the algorithm and the tool that are ultimately produced are derived from the data and, in a sense, may be seen as more tightly coupled to the data than in more traditional examples of software tools.
PI considerations
Organisations should also consider whether the data they are using constitutes personal information (PI). Under current Canadian privacy laws, organisations can typically only use PI for the purposes for which the individual consented, unless otherwise permitted. However, an organisation may want to use PI collected for one purpose for a secondary purpose.
In healthcare, key questions to consider include:
- Can PI be used for a new purpose when consent was originally obtained for a different purpose?
- Can the information simply be de-identified and used without express consent?
The answers to these questions depend on various factors, including:
- which law applies – many provincial privacy laws govern personal health information held by a health information custodian or PI held by public bodies (eg, hospitals);
- what the privacy consent originally agreed to – did it consider de-identification or anonymisation of PI (ie, is the information already de-identified or is the PI in the form originally collected)? If information has been de-identified for one purpose (eg, for a safety review purpose), it may be reasonable to use it for AI purposes even if no specific consent for such purposes was given, provided that the consent did not affirmatively state that de-identified information would not be used for other purposes not already specified (the technical step of de-identification itself is illustrated in the sketch after this list);
- how the de-identified information will be used – for example, whether it will be used to improve the quality of care for a hospital's patients or to provide to law enforcement; and
- whether the organisation is a service provider or an organisation with control of the PI – service providers are typically only allowed to collect, use and disclose PI as set out in the contract between them and the organisation. Even if the organisation is permitted from a privacy perspective to de-identify PI and use it for AI purposes, whether the service provider also has this right must be considered from a contractual perspective.
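As a purely technical illustration of what "de-identifying" a record can involve, the following is a minimal Python sketch that pseudonymises direct identifiers before records are reused for model training. The field names, the list of identifiers and the salted-hash approach are hypothetical assumptions; whether a given technique actually satisfies the applicable legal standard for de-identification or anonymisation is a separate question.

```python
# Minimal, hypothetical sketch of one de-identification step: replacing
# direct identifiers with salted one-way hashes before records are reused.
# Illustrative only; not, by itself, legally sufficient de-identification.
import hashlib

DIRECT_IDENTIFIERS = {"name", "health_card_number", "email"}  # hypothetical fields

def deidentify(record: dict, salt: str) -> dict:
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            # replace the identifier with a pseudonym (salted SHA-256 hash)
            out[field] = hashlib.sha256((salt + str(value)).encode()).hexdigest()
        else:
            out[field] = value
    return out

print(deidentify({"name": "Jane Doe", "age_band": "60-69", "finding": 1}, salt="example-salt"))
```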
Current legal landscape
In summer 2022, the Digital Charter Implementation Act, 2022(2) was tabled, re-introducing the Consumer Privacy Protection Act along with the new AI and Data Act.
The proposed Consumer Privacy Protection Act specifically states an organisation is allowed to de-identify PI without individual consent.(3) It says: "An organization may use an individual's personal information without their knowledge or consent to de-identify the information." (Emphasis added.)
Policy reasons to permit de-identification without consent include the following:
- AI requires large volumes of data. This means that there is a huge administrative burden to seek consent, particularly where updated contact information must be obtained or where the patient is inaccessible (eg, the patient is elderly or has died).
- If consent is obtained, the data may be skewed because individuals who consent tend to have different characteristics than those who do not consent (with regard to, for example, their disease severity, age, education or race).
Overall, the government is leaning towards making it easier to de-identify PI so that it can be used for AI purposes.
Data quality considerations
One of the goals of using AI in healthcare is to ensure that medical treatment is as objective and accurate as possible. This is difficult to achieve if the data used to train the AI model is biased.
The proposed AI and Data Act requires organisations that are responsible for AI systems to determine whether a system is a "high-impact system" and, if so, to establish measures to identify, assess and mitigate the risks of harm and of "biased output" (ie, situations in which individuals could be adversely affected, without justification, from a human rights perspective).
Organisations can minimise bias by:
- being aware of the potential for bias;
- ensuring that the data sets used are as complete as possible and that assumptions are as accurate as possible;
- increasing transparency; and
- increasing diversity among those who help develop AI tools.
The consequences of relying on a biased model will depend on the degree of risk associated with the model reproducing that bias or making other errors. Organisations should consider what the AI model is being used for and the potential impact if it makes an error.
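By way of illustration of the "completeness" point above, the following is a minimal Python sketch of one simple check: measuring how well different subgroups are represented in a training data set before a model is built. The column names, sample data and 10% threshold are hypothetical assumptions; a real bias assessment involves far more than representation counts.

```python
# Minimal, hypothetical sketch of one bias check: how well are subgroups
# represented in the training data? Column names and threshold are illustrative.
import pandas as pd

df = pd.DataFrame({
    "age_band": ["18-39", "40-59", "60+", "60+", "40-59", "60+"],
    "sex":      ["F", "M", "F", "M", "F", "M"],
    "label":    [0, 1, 1, 0, 1, 1],
})

for column in ["age_band", "sex"]:
    shares = df[column].value_counts(normalize=True)   # share of each subgroup
    print(f"{column} representation:")
    print(shares)
    underrepresented = shares[shares < 0.10]            # flag groups below 10%
    if not underrepresented.empty:
        print("warning: underrepresented groups:", list(underrepresented.index))
```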
Transparency considerations
Opacity with respect to AI can be considered in two ways:
- lack of awareness that AI is being used to make a decision; and
- lack of awareness about how the decision arose (ie, what data and algorithms were used to make the decision).
Currently, there is no law in Canada requiring transparency when an AI system is being used, but there are proposed laws to change that: both the proposed Consumer Privacy Protection Act and the AI and Data Act contain transparency provisions.
As part of the principle of openness and transparency, the proposed Consumer Privacy Protection Act imposes disclosure obligations regarding an organisation's use of any "automated decision system" to make predictions, recommendations or decisions about an individual that could have a "significant impact" on them, similar to how privacy legislation already requires organisations to disclose information about how they handle an individual's personal information (eg, privacy policies). The proposed Act also allows individuals to make requests to organisations for an explanation of predictions, recommendations or decisions made about them using an "automated decision system" that could have a significant impact on them, similar to how current privacy law allows individuals to make requests to organisations regarding how their personal information has been handled.
Under the AI and Data Act, persons who manage the operation of, or make available for use, a "high-impact system" (a term that is not defined) must publish certain information on a publicly available website, including:
- how the system is used;
- the types of content the system generates; and
- the decisions, recommendations or predictions it makes.
Under the proposed AI and Data Act, liability can be quite high: penalties of up to C$10 million or 3% of a company's global revenue may be imposed. Contravention of three types of provisions in the proposed Act may attract an even more severe penalty (ie, up to C$25 million or 5% of global revenue).
Comment
Canadian organisations should:
- take a broad view of how data might ultimately be used – data itself is an asset, and securing the organisation's freedom to use it as it sees fit in the future is worthwhile;
- aim to secure broad rights to data and avoid significant restrictions. This tends to be more easily done at the outset of a relationship as opposed to later, when new applications have been created and value crystallised;
- keep track of the data they use and be aware of which data is subject to the rights of others and which is not. If data is restricted, keeping it separate and not mixing it with less encumbered data will help to prevent issues in the future;
- ensure there is a broad right to de-identify information (eg, that contracts do not specify that the data will be de-identified "for safety purposes only"); and
- ensure that machines are explicitly trained to do their intended job. In most cases, AI algorithms are a black box – the basis for the resulting model may not be easily understood by a human operator, which can raise issues such as poor generalisation, overfitting or bias (a simple overfitting check is sketched below).
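As a purely illustrative aside on the overfitting point above, the following minimal Python sketch (synthetic data, arbitrary model choice) shows the kind of check a development team might run: comparing a model's score on its own training data with its score on held-out data. A large gap suggests the model has memorised the training set rather than learned something that generalises.

```python
# Minimal, hypothetical overfitting check: compare training accuracy with
# accuracy on held-out data. The data is synthetic and deliberately noisy.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
# Noisy labels: only loosely related to the first feature.
y = ((X[:, 0] + rng.normal(scale=2.0, size=300)) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# An unconstrained decision tree can memorise the training data, noise and all.
model = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)
print("training accuracy:", model.score(X_train, y_train))   # typically near 1.0
print("held-out accuracy:", model.score(X_test, y_test))     # typically much lower
# A large gap between the two scores is a warning sign of overfitting.
```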
For further information on this topic please contact Patrick M Roszell, Alice Tseng or Graham Hood at Smart & Biggar by telephone (+1 416 593 5514) or email ([email protected], [email protected] or [email protected]). The Smart & Biggar website can be accessed at www.smartbiggar.ca.
Neil Padgett, lawyer and patent agent, participated in the roundtable discussion on which this article is based.
Endnotes
(1) This article is part of a series based on the roundtable discussion "In-house counsel primer: Managing IP and compliance risks in artificial intelligence and a digital world". For the first article in the series, see "Thinking of using AI in a medical context? Key issues to consider".
(3) This Act does not apply in respect of personal information that has been anonymised.