Study on copyright and new technologies – copyright data management and artificial intelligence
Publication date: 4 April 2022 | Report language: EN
In October 2020, DG CNECT commissioned a consortium of Technopolis Group, Philippe Rixhon Associates, UCLouvain, Crowell&Moring and IMC University of Applied Sciences Krems with the execution of a 9-month “Study on copyright and new technologies: copyright data management and artificial intelligence”.
The study addresses two topics related to the copyright system and new technologies, dealt with in two dedicated parts of the study.
- The first part of the study systematically takes stock of the current situation with respect to rights metadata in different creative industries. It attempts to identify and describe the economic impact of the current situation related to rights metadata. It also compiles information on the most important ongoing initiatives to address some of the identified problems. Finally, the study indicates broad avenues which could contribute to improving functioning of the copyright data ecosystem.
- The second part of the study focuses on a) uses of copyright-protected content as input to feed AI technologies and b) the copyright implications of the production of cultural outputs by or with the assistance of AI. Furthermore, the study discusses possible policy scenarios which might be needed to react to these developments.
All information and views set out in this publication are those of the authors and do not necessarily reflect the official opinion of the Commission.
Summary Part One
The first part of this project focuses on rights metadata, or “rights management information”, as defined in the European Directive 2001/29/EC on the harmonisation of copyright in the information society:
“Rights Management Information means any information provided by rightsholders which identifies the work or other subject matter [ ], the author or any other rightsholder, or information about the terms and conditions of use of the work or other subject-matter, and any numbers or codes that represent such information. [This] shall apply when any of these items of information is associated with a copy of, or appears in connection with the communication to the public of, a work or other subject matter [ ]”.
It analyses the extent to which different challenges related to rights metadata can be empirically substantiated, including the availability of rights metadata attached to content, the interoperability between different systems for exchanging metadata, and the authority (i.e., trustworthiness) of sources. In doing so, it provides supporting information in the context of the stocktaking document on developing the Copyright Infrastructure issued by the Council of the European Union under the Finnish Presidency in December 2019 and the Action Plan on Intellectual Property adopted by the European Commission in November 2020.
The empirical basis for this part of the study consists of interviews and surveys among industry stakeholders as well as analysis of existing studies and further secondary research by the study team. We conducted more than 80 interviews covering three creative industries (film & television, music, and publishing, i.e., books, press, journals, and images) in different Member States of the European Union, Canada, the United Kingdom, and the United States. The interviewees were stakeholders of the digital value network in these creative sectors. The surveys were rolled out in spring 2021 and targeted the main trade associations of four creative industries (music, publishing, film, and TV broadcasting) as well as their members. The surveys consisted of two modules: one general part for industry experts without detailed insights into metadata challenges (mostly open questions on the respondents’ perspective on the topic) and one detailed part targeted at metadata experts. Owing to the high complexity of the topic and its varying relevance across industries, the number of responses to the industry surveys differed widely (from 7 for broadcasting to 124 for publishing, with further item non-response for specific questions). Because of the low number of responses, quantitative estimates on the study topics were difficult to obtain, which led the study team to concentrate mainly on reporting qualitative indications.
For the literature review, we thoroughly analysed more than 20 core documents, including recent research and working papers, studies, position articles, and communications from the European Commission. In the selection process, we chose recent documents that cover different industries, technologies, and perspectives.
The study team additionally compiled an extensive list of EU and industry initiatives that currently play a role, or propose to play a role, in the content rights infrastructure, whether on concrete data access and exchange (interoperability), global standards and identifiers of parties or content, or the overall governance of the copyright infrastructure. Any serious future action in the area of the copyright infrastructure must take these into account, either because (like the identifier and metadata standards and many of the data access/exchange systems) they already play an essential and established role which must be integrated with others where necessary, or because (like frameworks, working groups and reports) they provide guidance, tools or potential tools for solving aspects of the interoperability challenge. The majority of these initiatives come from specific content or cultural sectors and are thus not cross-sectoral by design.
Key findings of the report suggest that rights data management is – for many reasons – challenging in all creative industries. Based on interviews, surveys and literature review, it can be summarised that the analysed creative sectors are facing data-related challenges in the following four areas:
- Costs, in the area of rights management
- Efficiency issues, in the domain of licensing
- Challenges concerning payments processes, in the field of rights remuneration
- Risks of misappropriation and other rights infringements, in the sphere of rights enforcement
To be able to describe the challenges in a nuanced way, the study team took a sectoral approach in analysing metadata challenges, holding a differentiated perspective on the film, music, and publishing industry. The reason for this is that there are large differences between industries, but also within industries, with respect to the identified importance of metadata challenges and how high they are currently ranked on the industry or policy agenda. The same applies to the current status quo of metadata initiatives in the different industries.
As an example, in the music industry, existing evidence (e.g. information from CMOs) and expert interviews conducted for this study among various stakeholder groups suggest that – while the work of many initiatives (for example CISAC and DDEX) has improved data exchange processes – imperfect or disputed rights metadata remains a challenge. This implies, for example, that at least in some cases the music industry “spends an inordinate amount of time correcting errors and resolving disputes which hold up payments to music creators”. This is less the case for newer recordings, since awareness of the importance of rights metadata has improved, but it still seems to be an issue for older works. However, stakeholders from the recorded music sector also signalled that rights metadata issues are nowadays less problematic for them than, for example, the inaccuracy or lack of usage metadata provided by online platform services.
In the publishing sector, the results of interviews and surveys suggest that in areas such as the digital news publishing sector, challenges are more prevalent than in book publishing. In the former, issues such as a lack of granular attribution of copyright ownership for photographs used on news websites can lead to challenges regarding the remuneration of rightholders. In the latter, rightholder identification (e.g., the author of a book) is more straightforward and rights management processes therefore easier. The granularity and the degree to which copyright-protected content is embedded in creative works as well as rightholder structures of a creative work (through iterative contributions or co-authorships) make a substantial difference.
In the audiovisual sector, problems of compatibility of descriptive as well as rights management data were identified – despite recent developments to increase compatibility through a harmonisation of the registration process between the standard content identifiers developed within the industry, EIDR and ISAN. Also, a transparent exchange of usage metadata seems currently not always to be a given. Niche players in the film and television industry (independent film producers), but not the major studios we spoke to, raised concerns in this regard. In film production, stakeholders mention interoperability issues in rights management systems. Moreover, in most cases, there is no obligation to use the EIDR or ISAN standard identifiers and not all players are using them to identify their works. Such standard identifiers do not cover the rights, which does not facilitate licensing.
The study concludes that different avenues for future action could help improve the current situation. Raising copyright awareness in general (e.g., on the side of creators and rightholders, and on the side of users and consumers) would help clarify the importance of the copyright system for creatives in all industries. More specifically, initiatives to help raise awareness and skills related to rights metadata seem to be important. The study results suggest that individuals in the creative content industries (especially creators themselves) have a relatively weak understanding of what metadata are and how to handle them. This lack of expertise and attention means that technological developments such as Artificial Intelligence and Distributed Ledger technologies (for example via the European Blockchain Services Infrastructure) are not used to their full extent. Finally, the study authors are of the opinion that, in the long run, a cross-sector rights data network could bridge gaps between standard content identifiers such as ISRC, ISWC, ISBN or ISAN and digital manifestations of the content they denote. This would also increase interoperability between different media or content sectors. The ultimate objective of this endeavour would be to break the silos between different creative industries and improve the efficiency of rights data management and licensing across sectors. It could help to release even more of the digital potential of Europe’s creative sectors.
Summary Part Two
The second part of the study analyses the implications for copyright and the related rights of the increasing use of AI technologies within the cultural and creative industries. Building on a review of several experiences with AI tools and concrete use cases, on a traditional study of the legal sources, on interviews with legal experts and stakeholders and lastly on a Delphi survey among legal experts and industry stakeholders, this part of the study focuses on the challenges raised by those tools and use cases for the EU copyright and related rights framework.
Over the last few years, AI solutions have been deployed across different industries and in a wide range of applications. The cultural and creative sector is no exception: some AI tools assist or complete the highly human process of creation; more often, they are used for improving the production of successful cultural artefacts or the consumer experience, e.g. through well-targeted recommendations. The reliance on AI technologies for or during the creative process might yet challenge copyright and/or related rights. The study distinguishes upstream or input issues, i.e. those related to the use of protected content as inputs for an AI application, from the downstream or output issues, i.e. those related to the musical, graphical, audiovisual or other cultural content that results from the use of the AI application. On the input side, AI applications may be trained with large datasets of creative content enjoying protection under copyright and related right, which prompts the question whether the rightholders’ authorisation is needed for such use. On the output side, the AI applications can generate cultural content without any significant human contribution, which raises the question whether such outcome is protected under copyright or a related right. Other issues come to mind: Is there a need for additional incentives (in the form of copyright-like rights) to use AI tools for generating cultural outputs? Should the related investments in AI solutions be protected by an exclusive right or just promoted through funding? Are there authorship or ownership issues?
The study is structured as follows. It first shows how some AI applications are used in practice (part 3.2). The illustrations help to understand how this developing practice may impact the various stakeholders (creators, artists, producers, distributors, etc.) in the cultural and creative industry. This assessment is done in four creative sectors, namely visual arts, music, audiovisual & film, and video games. This in turn makes it possible to identify possible issues with and challenges to copyright and related rights (part 3.3). Finally, some policy options are examined (part 3.4) to address these challenges.
Concerning the deployment of AI solutions in the cultural sector, the research demonstrates, on the basis of concrete examples, that the overall reliance on these tools is increasing, even if the degree of adoption of AI solutions varies significantly from sector to sector, for instance with a clear use of AI tools for upgrading video games, as well as for generating photos and faces for advertisements or “elevator music”. Furthermore, AI solutions are not only deployed for repetitive or mechanical duties but also for tasks leading to outputs which appear original and imaginative, and which were therefore traditionally considered to be within the sole realm of humans. The study focuses on the creations that are produced autonomously by the AI solution, with no or little human intervention.
Most of the AI applications appear to be marketed online “as a service” (one might even refer to the nascent market for “Creation as a Service” or “CaaS”). This business model confers factual control of the use of the AI solution and the production of AI output to the AI developers, who can consequently protect their revenues (and terminate the service in case of non-payment). The features of this business model should be considered when assessing the need for protecting the AI output under copyright or related rights.
On the input side, several challenges are examined, including the limits of the exclusivity conferred by copyright and the related rights.
Firstly, the scope of the reproduction right is still in the process of being defined by the European courts, especially when purely technical or intermediate copies are made such as within the process for training an AI algorithm through the analysis of protected elements. The recent teleological interpretation of the reproduction right and of the extraction right under the database right by the Court of Justice of the EU (CJEU) opens new avenues for the interpretation and application of those exclusive rights. It remains to be seen how the case law will interpret the notion of reproduction and whether it applies to intermediate and technical copies made during the process of feeding an AI tool. The scope of the reproduction right might indeed be resized so as to permit some uses that do not lead to an output in which the protected elements contained in the input are visible or audible.
Secondly, it appears from the consultation of the stakeholders that the expected application of the TDM exceptions, in particular of the TDM exception for purposes other than scientific research, raises some questions, especially concerning the way opt-out decisions should be communicated. In any case, the transposition of the TDM exceptions into national laws should be carefully monitored to avoid diverging interpretations. Clarification as to the means and modes for expressing the opt-out under Article 4 DSM Directive might result from the development of good practices once the TDM exceptions come into force.
Thirdly, the moral rights attached to copyright (and to the performers’ right) have not been harmonised at EU level. This could lead to diverging situations where some member states allow authors and performers to exercise their moral rights (in particular the right of integrity) to oppose the use of their works or performances as AI inputs. One way to address this is to clarify that the moral rights cannot block the application of harmonised exceptions (such as for TDM). A more ambitious approach would be to (partly) harmonise the moral rights.
On the output side, the AI-generated output is not protected under copyright in the absence of human creative choices. The research, interviews, and surveys conducted within the study indicate firstly that no incentive for the use of AI tools in the creative process in the form of additional exclusive rights appears necessary. The already broad deployment of AI tools in the creative context confirms this. Also, the feedback received seems to indicate that an additional right in favour of machine-generated outputs might have negative impacts on the traditional creative sectors. The study concludes that a new related right for AI-generated outputs is not desirable.
Secondly, even if advanced AI applications are increasingly capable of approximating the style of human-made works or performances, the scope of copyright should not be extended to offer protection to the style of an author or of a performer, unless some significant and recognizable features of a protected work or performance are reproduced in the AI output. The protection of a creator’s style would indeed amount to a significant extension of copyright scope and would disproportionately limit the freedom of expression and freedom of art downstream in the artistic process. Under national law, other remedies (e.g. image rights, personality rights or unfair practices) may be available, and some harmonisation at EU level of the claim of parasitism as an unfair commercial practice could be considered.
Thirdly, the absence of copyright protection for an output autonomously generated by AI could leave artistic performances without protection under the law of certain member states, which may require that a ‘work’ protected under copyright be performed for the performance to be protected under the related rights. The human performance of an AI creation would not meet that condition, and the human performer would consequently be left without protection. To avoid this, the study suggests considering a harmonised definition of ‘performance’ as the subject matter of the performers’ right. This definition would not require the performed subject matter to be protected under copyright (while making sure that the scope of the performers’ protection is not stretched to cover activities with little cultural interest).
Fourthly, even if autonomously generated outputs are excluded from copyright protection, they might nevertheless enjoy protection under the related rights of phonogram producers, film producers, or press publishers (if the outputs respectively qualify as sound or audiovisual fixation or as press publication), even though the production of an AI output has little to do with the traditional activities of producers or publishers and does not necessarily require a similar investment. To prevent some of those related rights from being used to circumvent the copyright policy trade-off, the study proposes to impose a condition of investment so that only the fixations for which a certain investment (possibly with a de minimis threshold) was made could trigger the application of the related rights of phonogram and film producers or publishers.
Fifthly, in the (still rare) cases of outputs autonomously generated by AI, the false attribution of authorship to a human might, in practice, make it possible to circumvent the absence of copyright protection for this type of output. It might indeed suffice to claim authorship (by mentioning a person as the author) for the person to enjoy (unlawfully) the presumption of authorship and consequently, in fact, copyright protection over an AI-generated creation, knowing that this presumption is difficult or even impossible to rebut. However, the status quo concerning the authorship presumption should reasonably be maintained. A restriction or abolition of this presumption would indeed be excessive and could have negative effects for the human creators on whom the burden of establishing authorship would lie.
Lastly, the study reflects on the evidence that knowing whether an art piece was created by a human or by an AI system might affect the perception and the experience of the public. The study nevertheless points to the conclusion that no information obligation concerning the use of an AI solution for the development of the work should be added within the copyright framework. Such a legal obligation would indeed raise issues regarding its scope and its impact on the creators’ artistic freedom and their personality rights. The study does not enquire about the adequacy of imposing an information obligation in other bodies of law, such as consumer law.