To make online collections truly searchable, museums and cultural heritage institutions need to reflect the voices of the people they serve and collaborate with them in describing collections, says Jessica BrodeFrank, a doctoral student at the School of Advanced Study and digital collections access manager at Chicago’s Adler Planetarium.

Have you ever attempted to search the internet for a specific thing, but found that no matter what you typed in the search bar you couldn’t find what you were looking for? The same problem arises when searching museums’ digital collections, and it affects both the public and internal staff.

This happened to me in the early days of my career as a digital museum specialist. I was tasked with working with social media teams, exhibition teams, and public program staff to scour millions of collections images to find applicable images for specific uses. Whether it was a query for Pride Month or Black History Month, or a query for an archival image of the museum covered in snow for a blizzard closing update, I often went to our collections search bar and was confronted with zero results. This meant I had to visually scan through hundreds or thousands of images to find one that worked.

This was the problem I could see. Museums have a wealth of objects, truly a plethora of data, but an extremely limited vocabulary for describing these collections. This type of disconnect between search terms and searchable data happens most frequently when the language used by the searcher differs from the language used by the institution in the database – the metadata. Even as museums begin to incorporate artificial intelligence (AI) and machine learning into the process of creating descriptive language and keywords for collections materials, many of these differences in vocabulary and word choice are being trained right into these programs, and they continue to affect the ability to discover.

Traditionally, museums have focused on cataloguing collections in order to record what an object is. This means, for the most part, that collections records focus on information such as who made an object, when it was made, where it was made, and what it was made of. This is all important information to record for future generations, but it misses what the object is about. Many users of online collections are searching for objects based on characteristics such as what is depicted on an object, or what the object means or does. If this information is missing from the records, it can be difficult or even impossible to find what you’re looking for, as I often encountered in my work.

This led me to become a research student working on my PhD at the University of London’s School of Advanced Study, specialising in how to increase ease of search and make databases more accessible and inclusive through the use of crowdsourcing. As I continued to come up against access issues as part of my job, it became clear that if I, as a person with knowledge of cataloguing practices and museum databases, could not find what I knew was there, then the public would also be coming up against issues of discovering collections. The difference is that the public would not have the same assurance that what they were looking for was in fact there, just obscured by language choice.

As social movements called for museums to become more representational and diverse places, I saw this extend to the need for museums to become more transparent in how they describe collections and how these descriptions privilege specific narratives and ways of thinking. This led me to crowdsourcing, an engaging and collaborative process that brings the public into the cataloguing work of the museum’s collections. In March 2021 I launched the project Tag Along with Adler (TAwA) as part of my doctoral research. This project is hosted on the Zooniverse platform and allows volunteers to engage with AI-generated tags, while also creating their own.

Building on crowdsourcing projects from the 2000s, Tag Along with Adler looks at how inviting volunteers into the traditionally professionally curated process of describing collections can not only create engaging experiences for guests, but also produce a more representative and diverse set of search terms for collections. To date, more than 4,000 volunteers have participated in the project, generating over 200,000 individual metadata tags for 900 collections images.

In November 2021, I also launched a video game prototype of the experience, which allowed volunteers to sort tags created by Adler cataloguers, AI tagging models, and other Zooniverse participants, before adding their own language as well. These workflows have already helped confirm that inviting the public into the description process yields language different from that of professional cataloguers. They also show ways in which the voices being included are more diverse, and how the process itself helps expose volunteers to ideas of cultural heritage and algorithmic bias.

It is imperative to the relevance of museums and cultural heritage institutions that they reflect the voices of the varied publics they represent, and that the collections and narratives they hold are discoverable. Crowdsourcing of metadata allows the public, and internal staff, to discover and utilise collections better, while also building a relationship with the institution itself. It is this trust and this access that make museums relevant in the modern age, and institutions should prioritise this in their work.

Jessica BrodeFrank is the digital collections access manager at Chicago’s Adler Planetarium and a PhD student on the School of Advanced Study’s Digital Humanities Programme. Her research focuses on crowdsourcing as a means of digital engagement, and the enrichment of metadata with diverse voices as a way towards more inclusive search and representation.

Cover image: Zooniverse platform for Tag Along