In June, Dr Lise Jaillant organised the workshop ‘Privacy, open data and the humanities’ at the School of Advanced Study, thanks to funding from DH@Lboro – the digital humanities research group she initiated at Loughborough University. Below, she provides a short summary of the workshop, and explains the next steps to achieve the right balance between open access and privacy.

Earlier this year, the Cambridge Analytica scandal revealed that data from millions of Facebook profiles had been used to target American voters in the run-up to the 2016 presidential election. The scandal led to a privacy backlash and to calls for tighter regulation of internet giants.

Facebook launched a redemption campaign, with posters on the streets of London and Melbourne. ‘Data misuse is not our friend,’ the posters declared, along with a list of actions to preserve privacy and give users more control over their data. Social and computer scientists fear that legitimate researchers could be collateral damage. It might be ‘more difficult to get Facebook – and its users – to agree to hand over the data for research alone,’ writes Annabel Latham, a senior lecturer in computer science, in an article in The Conversation.

Humanities researchers have been largely left out of the conversation about data and privacy. This marginalisation is surprising. After all, the transition from print to digital has led to profound changes in the way we encounter archival documents. For example, the archive of the poetry publisher Carcanet in Manchester contains hundreds of thousands of emails, but it is currently closed due to data protection and technical issues. The introduction of the General Data Protection Regulation (GDPR) in May 2018 has led to greater uncertainty and to difficulties in accessing born-digital documents.

The workshop ‘Privacy, open data and the humanities’ brought together archivists and scholars to discuss the right balance between privacy and openness. The first session focused on archival repositories and private data. Cathy Williams of The National Archives (TNA) talked about discoverability and the need to unlock archives. TNA’s approach to collecting, enhancing, preserving and making collections information available is outlined in the 2017 brochure ‘Archives Unlocked’ (available as a PDF). The second speaker, Emma Canny, presented her work at the Parliamentary Archives. Accountability and openness are central to information sharing within Parliament, Canny said. Adrienne Muir and Charles Oppenheim of Robert Gordon University concluded the session with an overview of archival research and data protection. Pushing for more transparency, they argued that archival repositories should explicitly state the reasons why certain archives are closed.

The issue of access cannot be separated from discoverability, which was the focus of the second session. Gareth Cole of Loughborough University explained that we increasingly expect data used in research to be FAIR (findable, accessible, interoperable, reusable). However, achieving ‘fairness’ is more complicated than it seems. The next speaker, Thais Sardá, also based at Loughborough, presented her work on the representation of Deep Web technologies and users in British newspapers. The workshop concluded with a keynote speech by Olivier Thereaux, the head of technology at the Open Data Institute. We need to avoid a ‘data wasteland’, where people withdraw permissions for data to be used, even for the public good, Thereaux said.

What are the next steps to achieve the right balance between privacy and openness?

  • First, humanities researchers need to find allies in the data science community. The creation of a Data science and digital humanities group at the Alan Turing Institute is encouraging.
  • Second, academic researchers need to push policymakers to facilitate access to data – such as data locked in dark archives – without infringing on privacy. There is no reason why researchers (including humanities researchers) could not access anonymised data necessary for large-scale analysis.
  • Finally, we should communicate our research to a larger audience. Too often, the humanities are associated with ‘traditional’ research methods such as close reading or work in paper-based archives. It is our job to explain that humanities researchers and archivists actively engage with the post-digital-revolution world.

To summarise, we need to reach out and create alliances with data scientists, policymakers and the general public. As Bill Gates, the Microsoft co-founder, recently wrote on his blog, data can be used to take a humanist approach, to humanise the work that is ahead of us.

Dr Lise Jaillant has a background in literary studies and digital humanities. Her expertise is in issues of open access and privacy, with a focus on archives of digital information. In 2017–18, she was awarded a British Academy Rising Star Engagement Award for her project: ‘After the digital revolution: bringing together archivists and scholars to preserve born-digital records and produce new knowledge’. She has recently started a major Arts and Humanities Research Council Leadership Fellowship. This two-year project focuses on the poetry publisher Carcanet and its born-digital archive, which is currently closed to researchers.