Historic paintings are rich cultural heritage artifacts. Their meaning can only be unveiled by describing the context of their creation, deciphering their symbolic visual content, contextualizing the events and characters depicted, and analysing the pigments used. This is an enormous endeavor given the millions of cultural objects created throughout history. The Saint George on a Bike project has zeroed in on offering unprecedented insight into our vast cultural heritage by building AI systems that describe and classify art pieces automatically.
Training AI caption generation models to automatically unlock European cultural heritage collections
Automatic image captioning is a process that allows already trained models running on commodity computers to generate textual descriptions from an image. To date, no Artificial Intelligence system has been built and trained to help in the description of cultural heritage images, while factoring in the time period and scene composition rules for sacred iconography from the 12th to the 18th centuries.
The Saint George on a Bike project, led by the Barcelona Supercomputing Center in collaboration with Europeana Foundation, aims to provide metadata enrichment capabilities for the cultural heritage domain by leveraging High-Performance Computing (HPC) resources to train AI models. The basic idea is to use natural language processing and deep learning algorithms to train a caption generation model on tens of thousands of images and textual descriptions. As a result, descriptions that reflect an understanding of culture, symbols, and historical context can be generated automatically for hundreds of thousands of images from various European cultural heritage repositories.
The results of the project will be useful both to cultural heritage professionals and the public at large. Rich annotations will enable good indexation, which translates into better access to collections and a better experience navigating through collection catalogs. These results can be leveraged for educational, creative, or tourism projects.
Saint George on a Bike in 2:41 minutes
To show how AI can help the cultural heritage sector, Saint George on a Bike launched an engaging and inspiring video that explains the potential of AI to recognize the context of artwork and generate accurate annotations automatically. The video is aimed not only at professionals working in the GLAM (galleries, libraries, archives, and museums) sector; it also offers the general public an easy-to-follow explanation of the project's mission.
Among other advantages of applying AI in the cultural heritage sector, the video highlights the possibility of making image content on websites more accessible to visually impaired people, the possibility of extracting relations between thousands of concepts and connecting them into large knowledge graphs for further analysis and inference, and the opportunity to curate virtual exhibitions with related paintings from around the globe.
Challenges along the track
The project met several challenges along the way, most of which had to do with the quality and quantity of the data and metadata available. To train a model for caption generation, one needs access to an aligned image/description dataset. Such descriptions generally don't exist and, for those paintings that do have descriptions, these are normally not about the visual content. There is an assumption that one can see the content and thus there's no need to describe it. This assumption fails in the case of minorities such as the visually impaired, or when consumption by machines is sought. A further difficulty has been to automatically describe paintings depicting scenes that are not present in everyday images, as well as entities that belong to the supernatural or symbolic. Photographs and artworks differ in the objects and actions they contain, and even in how objects are arranged in present-day images compared to past images. These differences have been a challenge for the project because models (and databases) already trained on photographs cannot be used as they are.
Examples of metadata enrichment
The image gallery below demonstrates one of the approaches implemented in the Saint George on a Bike project. The researchers collected and annotated data and then trained an object detector with a strong focus on the cultural heritage domain. The object detection model can detect 69 categories of objects that are most common in paintings from the 12th to the 18th centuries. The list of categories includes objects such as angels, crucifixion, specific attributes of popes and monks, the armament of knights, attributes of saints, fantastic beasts, etc. Detected objects can be used in several ways for the enrichment of metadata:
- As a tag for image collections of GLAM institutions
- Searching by keywords
- Positional information about objects for researchers
- Initial step to define the relationships between the objects
- Initial step to generate natural language captions
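As a rough illustration of the uses listed above, the following minimal sketch shows how detector output could be turned into tags, a keyword index, and a simple positional relation. It is not the project's code: the `Detection` structure, the confidence threshold, the image identifiers, the bounding boxes, and the label strings are all hypothetical, chosen only to show the idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object (hypothetical structure, not the project's format)."""
    label: str    # object category, e.g. "angel"
    score: float  # detector confidence in [0, 1]
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels


def tags_for(detections, min_score=0.5):
    """Return the sorted, de-duplicated labels that pass the confidence threshold."""
    return sorted({d.label for d in detections if d.score >= min_score})


def build_keyword_index(collection, min_score=0.5):
    """Map each tag to the sorted list of image ids in which it was detected."""
    index = defaultdict(set)
    for image_id, detections in collection.items():
        for tag in tags_for(detections, min_score):
            index[tag].add(image_id)
    return {tag: sorted(ids) for tag, ids in index.items()}


def left_of(a, b):
    """True if the horizontal centre of a's box lies left of the centre of b's box."""
    return (a.box[0] + a.box[2]) / 2 < (b.box[0] + b.box[2]) / 2


# Made-up detections for two made-up images.
collection = {
    "painting_001": [
        Detection("knight", 0.91, (40, 60, 220, 400)),
        Detection("dragon", 0.84, (260, 250, 520, 430)),
        Detection("angel", 0.32, (300, 10, 380, 90)),  # below threshold, dropped
    ],
    "painting_002": [
        Detection("angel", 0.88, (100, 20, 240, 200)),
        Detection("crucifixion", 0.79, (260, 40, 420, 460)),
    ],
}

index = build_keyword_index(collection)
knight, dragon = collection["painting_001"][:2]

print(tags_for(collection["painting_001"]))  # tags for one image
print(index["angel"])                        # keyword search: images tagged "angel"
print(left_of(knight, dragon))               # simple positional relation
```

Tags and the keyword index cover the first two uses in the list. A pairwise relation such as `left_of` is one simple way positional information might feed into relationship extraction and, later, caption generation.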
Cardinal Juan José Bonel y Orbe
In a nutshell
“The Saint George on a Bike project will allow quick access to enriched cultural information, which can serve equally well for cultural and social ends, education, tourism, and possibly for historians or anthropologists. Indirectly the citizens can benefit from better public services, when these are based on the insight that the richer metadata we produce offers – such as web accessibility for the visually impaired or narratives that can expose social injustice or integration and gender issues through cultural heritage corpora and help create a more tolerant European identity”.
Maria-Cristina Marinescu (Project coordinator)
For more information on the project, visit our Local Time Machine project site by clicking the link below.