Today we celebrate World Intellectual Property Day. This is an opportunity to recap what has kept the news media sector busy in the field of intellectual property.
A fascinating new topic landed on our desk. Over the winter, we got very excited about the potential of ChatGPT to answer our most pressing queries. We played with the tool, we phrased and rephrased questions, we asked it to deliver poetry for our members. We were amazed.
And slightly troubled.
We understood that generative AI is a predictive and statistical system that does not have inherent knowledge but anticipates the most probable answer to questions based on a huge pool of data. Put in other words, the output is a mathematical result. And the result, based on text and image analysis, can be mind-blowing.
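To make the idea of "the most probable answer" concrete, here is a deliberately tiny sketch. It is not how ChatGPT works internally: it swaps in a simple word-pair count for what is, in real systems, a far larger statistical model. The function names and the sample sentence are our own illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count, for every word, which words follow it in the text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def most_probable_next(counts, word):
    """Return the word seen most often after `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, invented for this illustration.
corpus = "the press reports the news and the press informs the public"
model = train_bigram_counts(corpus)
print(most_probable_next(model, "the"))
```

The toy model has no knowledge of what "the press" is. It only reports which word most often followed "the" in the text it was given, which is why the quality of that text matters so much.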
But generative AI is only as good as the quality of the content that feeds into it. It did not come as a surprise that journalistic content was being used to train the system and generate answers for the public. Further investigation showed that not only open content but also articles behind paywalls were used to feed the system.
This practice raises a series of challenges. To name just a few:
From a commercial perspective
Tech companies and media houses are currently negotiating licenses and commercial partnerships for the reuse of journalistic content and its online distribution. Negotiations follow the implementation of the European Copyright Directive (Article 15) which is meant to compensate for the shift of advertising revenues, recognise press publishers’ investments in journalism and make sure that citizens continue to access quality online news.
Now, the integration of generative AI into search engines and web browsers means that user experiences will radically change, from clicking on links and snippets to reading ready-made summaries. In other words, AI is bringing a zero-click search experience, which reduces traffic to news publishers’ websites.
Whatever the way forward, we must not lose sight of the objective of EU copyright law: what matters for the financial viability of news production is the ability to retain control over the content and monetise it.
From a legal and policy perspective
In order to sustain the viable production of journalistic content in Europe, we need to think hard about how we can collectively achieve a suitable regulatory framework that works in the interest of citizens.
One could argue that the text and data mining (TDM) provisions of the EU Copyright Directive (Articles 3 and 4) offer a suitable framework. Yet the TDM exception is limited both in terms of scope and application. Generative AI arguably goes much further than text and data mining, as it produces a completely new and creative output. Also, an instrument like ChatGPT was not envisaged at the time of negotiations. Finally, the exception for scientific purposes would not apply to private entities that use AI for commercial purposes.
The Copyright Directive, however, is relevant insofar as it provides press publishers with the ability to opt out from TDM usage. So far press publishers have been reluctant to use this opt-out function, as experience shows that it leads to a loss of visibility on search engines, undermining their “search engine optimisation”. Maybe this opt-out function should be made available by AI providers on a more granular level, to allow press publishers to protect their data from crawling, while remaining visible and accessible to readers online.
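As an illustration of what a more granular opt-out could look like in practice, here is a minimal sketch using the robots.txt convention and Python's standard `urllib.robotparser` module. This is our assumption of one possible mechanism, not something the Directive prescribes, and whether a robots.txt rule counts as a valid opt-out under Article 4 is a legal question this sketch does not settle. The crawler names `ExampleAIBot` and `ExampleSearchBot` and the URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI training crawler,
# keep the site open to every other crawler (including search).
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def crawler_permissions(robots_txt, url, agents):
    """Return, for each crawler name, whether it may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


permissions = crawler_permissions(
    ROBOTS_TXT,
    "https://news.example/articles/1",  # placeholder address
    ["ExampleAIBot", "ExampleSearchBot"],
)
print(permissions)
# → {'ExampleAIBot': False, 'ExampleSearchBot': True}
```

The point of the sketch is the distinction between crawlers: the article stays reachable for search, while the AI crawler is asked to stay out. It only works if AI providers identify their crawlers separately and honour the rule, which is exactly what is being asked of them here.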
Google’s CEO Sundar Pichai said in a New York Times interview that “AI is too important an area not to regulate. It’s also too important an area not to regulate well.” We fully agree.
The Artificial Intelligence Act, currently under discussion in Brussels, offers a window of opportunity.
The AI Act proposes different degrees of obligations on AI systems, depending on their purposes. Members of the European Parliament introduced an amendment to oblige providers of foundation models to comply with certain transparency requirements and disclose the use of copyrighted content, such as news content. The file is currently under discussion in the European Parliament.
The Council of Member States has already proposed to apply certain requirements of “high-risk AI” to general purpose AI systems like ChatGPT via a future implementing act and after proper consultation.
From an ethical perspective
The dilemma is not simple. AI usage is not just a commercial or legal issue, but also an ethical one for press publishers. If the vast majority of citizens soon use generative AI, we hope the tool will function on the basis of authoritative and reliable sources. If press publishers were to opt out, would the algorithm then learn from inaccurate facts, uncertain datasets and potentially disinformation? The repercussions on the social and political debate would be serious, to say the least.
To address this problem, it is first crucial that tech companies disclose the different types of data that they are using, and explain what their criteria are for a credible and relevant source. Transparency is what will help commercial and non-commercial players distinguish between reliable and unreliable AI, and choose how best to use the tool in the interest of social cohesion and democracy.
Conclusions
There are more questions than answers at this stage. News Media Europe has proactively reached out to the tech community, with the firm belief that our two worlds need to communicate on this issue. AI is not just innovation, it is a civilisational change. General purpose AI will impact all aspects of our lives. It already impacts the way readers consume online news and, ultimately, it will impact political and democratic debate. Such a development cannot happen behind closed doors. We need to put the debate out there and we call on policy-makers in Brussels to take the lead on this matter.