Designing AI systems
Technical standards
First steps
Comment
This article is the second in a series on developing medical artificial intelligence (AI) systems and focuses on designing AI systems and the technical standards involved. For part one of the series, see "Developing medical AI systems – overview of MDR and AI legislation implications".
Designing AI systems

AI systems should be designed to remove or reduce, as far as possible, the risks associated with possible negative interactions between the AI software and the IT environment within which it operates and interacts.(1)
The chapter entitled "Requirements regarding Design and Manufacture" in the EU Medical Devices Regulation (MDR) states that software should be designed to ensure repeatability, reliability and performance in line with its intended use and in accordance with the state of the art. If the software is intended to be used in combination with mobile computing platforms, it should be designed with the specific features of those platforms in mind (eg, the size and contrast ratio of the screen), along with the external factors relating to its use (eg, varying levels of light or noise).(2)
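By way of illustration only, a basic repeatability check of the kind this requirement contemplates might be sketched as follows in Python; the model object, its predict method and the test inputs are hypothetical placeholders rather than anything prescribed by the MDR.

# Minimal, illustrative repeatability check for a medical AI model.
# "model", "predict" and the test inputs are hypothetical placeholders;
# the actual acceptance criteria would be defined in the testing protocol.

def check_repeatability(model, test_inputs, runs=5):
    """Return True if the model gives identical outputs on repeated runs."""
    for sample in test_inputs:
        outputs = [model.predict(sample) for _ in range(runs)]
        if any(output != outputs[0] for output in outputs):
            return False
    return True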
Human oversight
The AI Act specifically requires proper human oversight.(3) One of the main prerequisites is the ability to stop a high-risk AI system or safely abort an operation where necessary. Specific oversight and control measures should also reflect the self-learning or autonomous nature of AI systems.(4)
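As a minimal sketch only (the class and method names below are assumptions and not prescribed by the AI Act), a human-oversight gate might look as follows: the system proposes an action, a human approves or rejects it, and the operator can stop the system at any time.

# Illustrative human-oversight gate: the AI proposes, a human decides,
# and the operator can halt the system. All names are hypothetical.

class HumanOversightGate:
    def __init__(self):
        self.stopped = False  # set to True when the operator stops the system

    def stop(self):
        """Allow a human operator to halt the high-risk AI system."""
        self.stopped = True

    def review(self, proposal, approved_by_human):
        """Pass on the AI's proposal only if the system is running and a human approves."""
        if self.stopped:
            raise RuntimeError("Operation aborted: system stopped by human operator")
        return proposal if approved_by_human(proposal) else None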
Transparency
A key concern is that people may not be aware that they are interacting with an AI system.(5) Therefore, the AI Act requires this information to be disclosed. In the medical environment, it is typically physicians and other medical staff who use AI systems for diagnostic purposes, such as in tomography. However, patients are often actively involved with the AI themselves, which may trigger this disclosure requirement.
Additionally, each decision or prediction that the AI system produces, helps to produce or enhances should be understandable to the user. Otherwise, the user may not be able to rely on the outcomes or to identify the source of AI errors so that the system can be improved further.
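A minimal, purely illustrative sketch of how an output could be packaged together with the context a user needs is set out below; the field names and the idea of reporting "main drivers" are assumptions, not requirements taken from the AI Act.

# Illustrative only: each prediction is returned together with human-readable
# context so the user can understand and question it. Field names are assumptions.

from dataclasses import dataclass, field

@dataclass
class ExplainedPrediction:
    label: str                    # eg "suspicious lesion detected"
    confidence: float             # model score between 0 and 1
    main_drivers: dict = field(default_factory=dict)  # feature -> contribution
    model_version: str = "unknown"

    def summary(self) -> str:
        drivers = ", ".join(self.main_drivers) or "not available"
        return (f"{self.label} (confidence {self.confidence:.0%}, "
                f"model {self.model_version}); main drivers: {drivers}")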
Accuracy
AI systems are often said to be only as good as the data used to train them. Therefore, one crucial question to consider when verifying the accuracy of an AI system is whether the developer has put in place adequate measures to ensure that the data used to develop the system, especially the training data, is up to date, of high quality and representative of the medical environment in which the finished system will be used. For example, in a study on smokers and lung cancer, only 4% of those included in the database were smokers from an ethnic minority.(6) The results produced by a system trained on such data may therefore be detrimental to the diagnostic process for patients from ethnic minorities, which reflects a common concern relating to AI systems: bias.
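One simple, illustrative way to surface such imbalances before development proceeds is to report each subgroup's share of the training data; the records, group labels and threshold below are invented for the example.

# Illustrative check of how well subgroups are represented in training data.
# The records, the "ethnicity" key and the 5% threshold are invented examples.

from collections import Counter

def subgroup_shares(records, group_key="ethnicity"):
    """Return each subgroup's share of the dataset as a fraction."""
    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

shares = subgroup_shares([{"ethnicity": "A"}, {"ethnicity": "B"}, {"ethnicity": "A"}])
underrepresented = [group for group, share in shares.items() if share < 0.05]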
Conversely, where the diagnostic process is challenging for physicians, the use of AI systems may be more justified and may pose fewer risks. For example, physicians have struggled to diagnose acute kidney injury (AKI), which can cause patients to deteriorate very quickly and can be fatal; an estimated 11% of deaths associated with AKI in hospitals have been attributed to a failure to diagnose the condition properly. A machine learning tool (ie, an AI system) that can predict AKI was therefore developed.(7) The system can identify more than 90% of acute AKI cases and does so up to 48 hours earlier than traditional methods.(8)
Human oversight, transparency and accuracy are not the only factors to be taken into account. Developers also need to ensure that the AI system is technically safe, non-discriminatory and accountable, and that privacy, cybersecurity and other parameters are factored in. These parameters will be reflected in the technical documentation(9) and will form part of the testing protocol.(10)
Technical standards

Ensuring that all of these requirements are met is challenging for developers of AI systems, especially in healthcare. Alongside checklists, technical standards are important and practical tools that may help with this process. There are a number of standards relating to AI systems, both general and more specific, which describe best practice for comparing the performance of machine learning models and specify classification, performance metrics and reporting requirements.(11)
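By way of illustration, the kind of classification metrics such standards address (eg, sensitivity and specificity) can be derived from confusion-matrix counts as follows; the figures in the example call are invented.

# Illustrative computation of classification performance metrics of the kind
# addressed by machine learning performance standards. Figures are invented.

def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"sensitivity": sensitivity, "specificity": specificity, "accuracy": accuracy}

print(classification_metrics(tp=90, fp=5, tn=95, fn=10))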
Along with the AI Act proposal, the European Commission has requested that the European Committee for Standardisation (CEN) and the European Committee for Electrotechnical Standardisation (CENELEC) develop technical standards for AI.(12) These standards may be available by the end of 2024 or the beginning of 2025.
First steps

In principle, a step-by-step upgrade from a wellness or lifestyle application, via a medical device, to an app that may be covered by health insurance(13) is possible. However, such a project should be compatible from the start with the requirements of the MDR (in particular with regard to the technical documentation and the quality management system) and should take into account any additional requirements relating to health insurance coverage. Additionally, the developer should set up a quality management system certified under International Organization for Standardization (ISO) standard ISO 13485 early on and assess whether the AI Act may be applicable once it becomes enforceable. Further requirements are set by data protection law.
Comment

Experience shows that meeting requirements retrospectively, rather than proactively from the start of a project, is likely to cause considerable difficulty and additional work. In particular, the technical documentation is difficult to reconstruct in retrospect. Another point that speaks against a tiered model is the long lead time involved in engaging a notified body. As an AI medical device is likely to be classified in class IIa or higher, which inevitably requires the involvement of a notified body, contact should be made early on. Even a class I classification, however, would require the involvement of a notified body to a certain extent.
Against this background, it is most appropriate to design an AI system or application directly as a medical device and to observe the MDR requirements from the outset. If a project was previously designed as, for example, a wellness or lifestyle application, it would first be necessary to ensure that its intended purpose does not already bring it within the medical device category. Secondly, the requirements of the MDR and the AI Act (if its applicability is possible or even likely) should be observed throughout. Finally, a notified body should be involved in due time in order to avoid delays.
For further information on this topic please contact Jowita Prokop or Malte Scheel at Eversheds Sutherland LLP by telephone (+49 89 54565 0) or email ([email protected] or [email protected]). The Eversheds Sutherland LLP website can be accessed at www.eversheds-sutherland.com.
Endnotes
(1) Annex I No. 14.2 of the MDR.
(2) Annex I No. 17 of the MDR.
(6) National Lung Screening Trial 2002.
(7) By the US Department of Veterans Affairs and DeepMind Health.
(8) US Department of Veterans Affairs, press release: VA, DeepMind develop machine learning system to predict life-threatening disease before it appears, 31 July 2019.
(9) Annex II No. 6.1 of the MDR – pre-clinical and clinical data.
(11) For example:
- ISO/IEC DIS 22989 (Artificial intelligence concepts and terminology);
- ISO/IEC CD 42001 (Management System); and
- ISO/IEC DTS 4213 (Assessment of Machine Learning Performance).
(12) CENELEC, ETUC's position on the draft standardisation request in support of safe and trustworthy AI, 1 June 2022.
(13) For example, the DiGA (Digitale Gesundheitsanwendungen) apps in Germany.