Thinking About [Art] Collections As Data

By Chloe Downe Wells on October 29, 2019

If you could Google search the collective holdings of the world’s art museums, or course materials on the entire history of art, what could you learn about objects and how they relate to one another? How would this create new connections or elide certain facts? Would it change our current narratives in the history of art? As a Praxis Fellow, I’m interested in how the digital humanities can help create new insights about art objects and their history. Before pursuing my Ph.D. in Art and Architectural History, I worked with art object metadata in different formats and for varying purposes. All of this cataloguing, sorting, and researching brings up larger questions about how to store and utilize archival and object data: what elements are important to record, how should they be categorized, and how can this information be searched and/or pulled to show relationships? Is there a way it can all be “linked” together? Fortunately, I am not the only one with such Utopian [art historical] data dreams.

Currently in Praxis, we are designing workshops based on our research interests to consider and implement our own DH pedagogy. In order to begin to think about the problems and possibilities for managing metadata about art objects – including how this speaks to other forms of data – I am designing a workshop about the concept of “Collections as Data” and an introduction to Linked Open Data (LOD). An initiative led by the Institute of Museum and Library Services and the Always Already Computational: Collections as Data project team, “Collections as Data” aims to make digital collections of cultural heritage institutions (more) computational. In other words, it aims to make collections available as data that a wider audience can access, analyze, and engage with in different ways. Because this concept and the frameworks for using such data are still quite nebulous, we will discuss some of the elements at play in creating and working with object metadata and think about possible solutions, including the use of LOD.

To initiate some ideas about the need for and possibilities of collections as data, or what it might look like, we’ll start by thinking about the way art collections are catalogued and the way that information is stored and accessed. What types of information are important to record? How should one name those categories? How does one enter the descriptions and encode them? In an activity, the group will work in pairs to describe objects by recording details and categorizing them in entry fields. Comparing results, we’ll discuss how object data is classified and recorded.

[Image: cat/log]

Certain issues will arise when descriptions inevitably don’t all match up, especially the difference between what we as humans may think is important to mark as historical record and what a computer can read and allow you to search for. For example, how do we search for works made during a certain year when the dates of creation can be listed as circa dates or by a range of years? How does one differentiate between a date given by the artist and a more historically accurate description of the date?
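
To make the date problem concrete, here is a minimal Python sketch of one possible approach: turning a free-text date into an earliest/latest year pair that a computer can search. The function names (`parse_date_range`, `made_in`) and the five-year margin applied to circa dates are my own assumptions for illustration, not a cataloguing standard.

```python
import re
from typing import Optional, Tuple


def parse_date_range(text: str, circa_margin: int = 5) -> Optional[Tuple[int, int]]:
    """Turn a free-text creation date into an (earliest, latest) year pair.

    circa_margin is an assumed tolerance in years for "c." / "ca." / "circa"
    dates; a real collection would choose its own rule.
    """
    cleaned = text.strip().lower()
    is_circa = bool(re.match(r"^(c\.|ca\.|circa)\s*", cleaned))
    # Collect every four-digit year mentioned in the entry.
    years = [int(y) for y in re.findall(r"\b\d{4}\b", cleaned)]
    if not years:
        return None  # e.g. "undated": nothing a year search can match
    earliest, latest = min(years), max(years)
    if is_circa:
        earliest -= circa_margin
        latest += circa_margin
    return earliest, latest


def made_in(text: str, year: int) -> bool:
    """Return True if the recorded date could include the given year."""
    span = parse_date_range(text)
    return span is not None and span[0] <= year <= span[1]
```

With this sketch, a search for works made in 1903 would match an entry dated “ca. 1900” and one dated “1890-1912”, but not one dated simply “1905”. It does nothing to distinguish an artist’s own dating from a historian’s, which would need a separate field.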

Moving on to relationships between the objects, we’ll think about how these fields and entries can be searched and associated with one another. Curators have a privileged knowledge of the collections they maintain and must rely on this familiarity (in addition to existing metadata) in order to associate certain types of objects and themes and create exhibitions. If we can search more easily or more specifically within a collection, what new associations can be made, especially across the boundaries set between collection categories, such as photography and painting? What if we could add a subject field in addition to a title? Or a theme or artistic movement field where we could add different tags? In a metadata world that is often dictated by the creator/maker/artist, could this influence a move away from privileging monographic exhibitions? Finally, how can we make this type of object association more functional within a museum as well as more accessible to scholars and the public?

Different institutions use various types of databases and data entry standards to house object data, so we’ll look at some existing schemas for cataloguing object data, like VRA Core and Dublin Core, which aim at interoperable datasets. Comparing these schemas to the information gleaned from the first activity, we’ll talk about the pros and cons of these more standardized forms of cataloguing, especially regarding the ways objects can relate to one another. Going back to the initial activity, the group will try to streamline their data across the different entries and think about how they might meaningfully connect different descriptions. This will get us thinking about better ways to categorize and enter metadata, but also point towards the possibility of associating a whole collection database with other/outside forms of data. Historical archives and art collections are not often digitally linked, even when they refer to the same entities. For example, a record for a work of art rarely gives access to a letter or article the artist wrote about a particular subject in relation to that work.
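
As a rough illustration of what “streamlining” might involve, the sketch below maps field names that different pairs might invent in the first activity onto Dublin Core element names. The local field names and the particular mapping choices (for instance, sending “medium” to `format`) are hypothetical; only the element names on the right-hand side come from Dublin Core.

```python
# Hypothetical crosswalk: local field name -> Dublin Core element name.
DC_CROSSWALK = {
    "object_title": "title",
    "maker": "creator",
    "artist": "creator",
    "date_made": "date",
    "year": "date",
    "medium": "format",
    "accession_number": "identifier",
}


def to_dublin_core(record):
    """Split a local record into mapped Dublin Core fields and leftovers.

    Values are kept in lists because Dublin Core elements are repeatable.
    """
    mapped = {}
    unmapped = {}
    for field, value in record.items():
        element = DC_CROSSWALK.get(field)
        if element is None:
            unmapped[field] = value  # no agreed home for this field yet
        else:
            mapped.setdefault(element, []).append(value)
    return mapped, unmapped


# Two invented descriptions of the same object, using different field names.
pair_one = {"object_title": "Portrait of a Sitter", "maker": "Studio A", "mood": "solemn"}
pair_two = {"object_title": "Portrait of a Sitter", "artist": "Studio A", "year": "ca. 1900"}
```

Running `to_dublin_core` on both records puts “maker” and “artist” under the same `creator` element, while a field like “mood” falls into the leftovers. Those leftovers are a useful prompt for discussing what a standardized schema gains and what it loses.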

One way some organizations have begun to associate different datasets is through Linked Open Data, a way of storing bits of information as coded descriptions that can be easily accessed and linked across different resources. Each entity, such as a person, is given a URI (uniform resource identifier), a unique string that can be – though is not always – a URL (uniform resource locator, for those of us who didn’t know what that stood for before Praxis) that resolves on the web. By using these standard ways of referring to unique entities, LOD can also describe relationships between entities in the form of “triples,” statements made up of a subject, a predicate, and an object. After a further introduction to LOD, we’ll look at an example or two of current linked data projects. The SNAC project uses LOD to link several descriptors from different resources to one individual, including images of that individual. But what about adding the object information for these images? Surely this would enhance the available knowledge!
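
For readers who like to see the shape of the data, here is a small Python sketch of triples as plain subject/predicate/object statements. Everything under `https://example.org/` is a made-up placeholder, and the vocabulary terms (`vocab/creator`, `vocab/author`) are invented for illustration; real projects would draw on published vocabularies and dedicated RDF tools.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


EX = "https://example.org/"  # placeholder namespace, not a real dataset

graph = [
    # A person, identified by a URI, has a name (a plain text value).
    Triple(EX + "person/1", EX + "vocab/name", "Example Photographer"),
    # A museum object and an archival letter both point at the same person URI.
    Triple(EX + "object/42", EX + "vocab/creator", EX + "person/1"),
    Triple(EX + "letter/7", EX + "vocab/author", EX + "person/1"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every part that was specified."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Everything that points at person/1, whichever collection it came from.
linked_to_person = match(graph, obj=EX + "person/1")
```

Because the object record and the letter use the same URI for the person, a single query finds both, which is the kind of cross-collection association the workshop is meant to explore.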

Taking the local Holsinger Portrait Project as an example, we’ll discuss what sort of resource one could create by linking different types of data. As of now, there are several web pages that refer to these photographs and their subjects. Linked Open Data is one method by which we could think about linking the biographical information with the object entries from the Small Special Collections Library, where they are housed. Furthermore, we could imagine linking historical or city data records on the sitters and places identified in these photographs. This could potentially allow us to map different types of data together in a way that could inform the history of Charlottesville as well as the records of the objects themselves, which also come in multiple layers (not least of which are the glass negatives, which are not digitally linked to their corresponding prints). This is just one example of how linking a historical archive with an object archive could be extremely useful, and could expand to and illuminate other types of public information. Looking at specific photographs from the Holsinger collection and the available information, we can discuss both the possibilities and limitations of Linked Open Data for art and archival collections.
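
A very small sketch of this idea, assuming each source already records a shared URI for the sitter: the code below groups records from separate sources under that identifier. The records, the `sitter_uri` field, and the source names are all invented for illustration and do not reflect how the Holsinger materials are actually catalogued.

```python
from collections import defaultdict

EX = "https://example.org/"  # placeholder namespace for hypothetical sitters

# Invented records standing in for three separately maintained sources.
biographies = [{"sitter_uri": EX + "person/1", "name": "Sitter A"}]
prints = [
    {"sitter_uri": EX + "person/1", "item": "print-001"},
    {"sitter_uri": EX + "person/2", "item": "print-002"},
]
negatives = [{"sitter_uri": EX + "person/1", "item": "negative-001"}]


def link_by_uri(*sources):
    """Group records from several named sources under their shared sitter URI."""
    linked = defaultdict(lambda: defaultdict(list))
    for source_name, records in sources:
        for record in records:
            linked[record["sitter_uri"]][source_name].append(record)
    # Convert to plain dicts so missing keys raise errors instead of silently appearing.
    return {uri: dict(by_source) for uri, by_source in linked.items()}


linked = link_by_uri(
    ("biographies", biographies),
    ("prints", prints),
    ("negatives", negatives),
)
```

Here the first sitter ends up with a biography, a print, and a negative gathered in one place, while the second has only a print. The gaps are as informative as the links, since they show where identification or cataloguing work remains to be done.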

I hope some of these questions prove useful for thinking about how to implement Linked Open Data and about the potential of Collections as Data, both for the field of art history and for its more public digital presentations and their use.