Supervisory Authorities investigate OpenAI’s ChatGPT

May 28th, 2024 Posted in Data Protection, News & Resources

Following OpenAI's recent release of the latest version of ChatGPT, ChatGPT 4.0, the European Data Protection Board ("EDPB") ChatGPT Taskforce (the "taskforce") published an interim report (the "report") on 23 May 2024, setting out its efforts and preliminary findings on the interplay between ChatGPT and compliance with the General Data Protection Regulation ("GDPR"). The key points to note from the report are set out below.

Background

We have seen the emergence of large language models ("LLMs") in the GPT category, which use vast amounts of personal data. This development prompted several Supervisory Authorities ("SAs") to initiate data protection investigations into OpenAI's ChatGPT. Against this backdrop, the taskforce was established to coordinate these various investigations, since OpenAI had no establishment in the EU until February 2024.

Ongoing Investigations

The investigations examining OpenAI’s compliance with the GDPR in respect of different versions of ChatGPT are ongoing. To support the coordinated approach outlined above, a common set of questions was developed to assist with initial exchanges with OpenAI; these questions are included as an annex to the report.

Preliminary Views

Lawfulness

The usual GDPR rules regarding lawfulness of processing apply to LLMs, meaning that personal data must be processed lawfully, fairly and in a transparent manner. As such, a lawful basis for processing any personal data under Article 6(1) of the GDPR, and, where applicable, a condition under Article 9(2), must be present.

Data Processing Stages: The report highlights, when assessing lawfulness, that it is useful to categorise the different stages of processing personal data into data collection (including web scraping or reuse of data sets), pre-processing, training, and use of prompts and outputs.

Web Scraping: OpenAI relies on legitimate interests as its lawful basis for web scraping. The report emphasises the importance of carrying out a legitimate interest assessment to balance OpenAI's interests against the rights of individuals and to ensure individuals' fundamental rights are fully considered and protected.

Special Categories of Data: Any processing of special category data (e.g. health or race information) must be in line with an exception under Article 9(2) of the GDPR.

To mitigate any undue impact on individuals and to safeguard their personal data, the taskforce recommends that measures are implemented to filter and anonymise personal data. In particular, such measures should be taken before the training stage where data has been collected by web scraping.

Fairness

According to the GDPR, personal data should not be processed in a way that is unjustifiably detrimental to the data subject; this is the GDPR's fairness principle. The report outlines that OpenAI should not circumvent its compliance responsibilities by passing them on to data subjects, for instance through unfair provisions in its terms and conditions. As a result, OpenAI remains responsible for ensuring its compliance with the GDPR and for implementing safeguards to handle personal data inputs in a responsible manner.

Transparency

Web Scraping: Article 14 of the GDPR applies to data collected by web scraping, which means data subjects should be informed when their data is collected indirectly. Given the impracticality of informing each individual due to the nature of web scraping, the exception under Article 14(5)(b) of the GDPR may apply, provided all of that provision's requirements are met.

Direct collection: In accordance with Article 13 of the GDPR, OpenAI must ensure users are fully informed about the use of their personal data when personal data is collected directly through interaction with ChatGPT.

Data Accuracy

Although biases and hallucinations are likely to arise through the use of ChatGPT, users are still likely to treat its outputs as accurate. The taskforce stresses the importance of maintaining data accuracy and highlights that controls should be implemented to promptly correct inaccuracies, in order to ensure the reliability of the data used by ChatGPT.

Data Subject Rights

The report sets out the challenges in exercising data subject rights, such as access, rectification and erasure, due to the nature of the processing activity. OpenAI does have certain measures in place to facilitate data subject rights, but the report points to the need for OpenAI to continue improving these measures. It will be interesting to see how the taskforce concludes on this point once it finalises its ongoing investigations, as those in the AI industry know it is currently extremely difficult, due to the way such models are trained, to fully reconcile LLMs with the GDPR's requirements in relation to data subject rights.

Next Steps

As alluded to above, the taskforce aims to foster cooperation among SAs, coordinate external communication, and identify common issues that require a unified approach.

This initial report highlights the importance of keeping abreast of regulatory and legislative developments in relation to AI systems. It is likely the above preliminary views will evolve as more information becomes available and as OpenAI responds to any findings. We will keep you updated with any findings relating to this interim report.

In addition, whilst the findings outlined in this report relate to investigations into ChatGPT, providers of similar tools (and AI systems generally) should take note and ensure they adopt measures to comply with the GDPR's principles. We will no doubt see similar reports in the future in relation to other existing AI systems, particularly as SAs become more cognisant of how such systems work.

We can help your organisation with data protection

Need help understanding the new EU AI Act? If you would like any help in understanding how the AI Act will impact your business or you would like to discuss how we can support your organisation from an AI governance perspective, please get in touch using the form below.



Written by Ray Orife

Ray specialises in data protection and information rights law. He is a qualified solicitor and worked in private practice and in-house in commercial law roles before focusing on data protection. Before joining Evalian™ he was in-house counsel and Data Protection Officer for a high street financial services organisation and their associated businesses. His qualifications include a First Class Honours Degree in Law, LPC (Distinction), Practitioner Certificate in Data Protection (PC.dp) and IAPP CIPP/E.