Background
The Higher Regional Court of Hamburg had to decide whether copyright law permits using a photograph to create a dataset for training generative AI. The dispute did not concern the later AI training itself, but the download of a watermarked preview image made available on a stock agency website. The defendant association used the image file to automatically check whether the image content matched an existing image description. According to the court’s findings, only metadata such as the URL and the description were later included in the published dataset, not the image file itself.
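The workflow described above can be pictured with a minimal sketch. It is not the defendant's actual code: the names `DatasetEntry`, `build_entry`, `fetch` and `matches` are hypothetical, and the matching step is passed in as a function because the article does not say how the comparison was implemented. The sketch only shows the point the court relied on: the image is copied temporarily for the check, while the resulting record holds metadata only.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DatasetEntry:
    """Published record: metadata only, no image bytes (hypothetical shape)."""
    url: str
    description: str


def build_entry(
    url: str,
    description: str,
    fetch: Callable[[str], bytes],
    matches: Callable[[bytes, str], bool],
) -> Optional[DatasetEntry]:
    """Download the image, check it against its description, keep only metadata."""
    # Temporary copy of the image: this download is the reproduction at issue.
    image_bytes = fetch(url)
    # Automated image-text comparison (placeholder for whatever model is used).
    if not matches(image_bytes, description):
        return None
    # The image bytes are not stored in the entry; only URL and description are.
    return DatasetEntry(url=url, description=description)
```

With a stub downloader and a stub matcher, `build_entry` returns either `None` or an entry whose only fields are `url` and `description`.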
The court’s decision
The Higher Regional Court of Hamburg dismissed the photographer’s appeal. In the court’s view, the download constituted a reproduction, but one that could be justified by copyright exceptions. The automated comparison between image and description was considered an analysis aimed at obtaining information about the relationship between image and text, and it therefore qualified as text and data mining (TDM) within the meaning of Section 44b(1) German Copyright Act.
The court also held that the requirements of Section 60d German Copyright Act, the exception for text and data mining for scientific research purposes, were met. The image-text comparison served to obtain information about correlations between image content and image description. The later use of the dataset for training AI could also qualify as scientific research where the dataset was created in a methodical, systematic and verifiable way and made available transparently and free of charge.
Rights reservation and machine readability
The court’s reasoning on the TDM opt-out is particularly relevant in practice. The court made clear that a reservation declared by the holder of a non-exclusive right of use, such as a stock agency, may in principle also have to be observed where the work is accessed through that holder’s website. If a photographer uses stock photo providers to market their images, a reservation declared there can generally be attributed to the photographer.
In the specific case, however, that reservation did not help. According to the court, it had not been sufficiently shown that the rights reservation existed in machine-readable form at the relevant time of use. The court stressed that the law does not prescribe one rigid technical format, but it does require that the reservation be machine-interpretable in automated processes.
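What "machine-interpretable in automated processes" can mean is easiest to see from the crawler's side. The sketch below is an illustration under stated assumptions, not a statement of what the court accepted: the article does not say which formats would suffice. It checks two signals that exist in practice, a robots.txt rule (parsed with Python's standard `urllib.robotparser`) and a `tdm-reservation: 1` HTTP response header as proposed in the W3C community group's TDM Reservation Protocol. The function names and the bot name `ExampleDatasetBot` are hypothetical.

```python
from typing import Mapping
from urllib.robotparser import RobotFileParser


def crawler_is_excluded(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows this agent for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def header_reserves_tdm(headers: Mapping[str, str]) -> bool:
    """Return True if a 'tdm-reservation: 1' header is present.

    Assumption: the header name and value follow the TDM Reservation Protocol
    proposal; HTTP header names are compared case-insensitively.
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("tdm-reservation") == "1"


# Hypothetical robots.txt that excludes one named bot and allows all others.
ROBOTS_TXT = """\
User-agent: ExampleDatasetBot
Disallow: /

User-agent: *
Allow: /
"""
```

A reservation that appears only as a sentence in the website's terms of use would be caught by neither function, which is the practical gap the decision highlights. For evidentiary purposes, a rights holder would also need to be able to show that such a signal was in place at the time of the download.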
Why this matters
The decision is relevant for rights holders, platforms, AI projects and companies using data-driven development processes. It shows that preparatory steps before the actual AI training must be assessed independently under copyright law. At the same time, the judgment underlines how important an effective and technically robust TDM reservation is where text and data mining is to be excluded.
For companies and organisations creating datasets or using them for AI development, the ruling is another indication that the legal assessment depends heavily on the purpose, the dataset design, the technical implementation and the documentation. For rights holders, the pressure increases to implement opt-outs not only as written terms, but also in a technically reliable, machine-readable form.
To the point
- Downloading a photo for automated image-text matching may be permissible as text and data mining under Section 44b German Copyright Act.
- A purely textual reservation is not automatically sufficient; machine readability at the time of use is decisive.
- A reservation declared by a stock agency may be attributed to the rights holder.
- Creating and later using a dataset for AI may qualify as scientific research under Section 60d German Copyright Act.
- For companies and rights holders, effective opt-out processes, technical implementation and documented workflows are especially important.
Source: Landesrecht Hamburg