Category Archives: Humans of Data

Humans of Data is an art intervention into the international research data community, by artist and data researcher Laura Molloy. All material, including photographs, text quotes and the name ‘Humans of Data’, is made available under Creative Commons Licence CC-BY-NC. Supporters of the Humans of Data project include CODATA, the Digital Curation Centre and the Research Data Alliance.

Humans of Data 26

What motivates me to keep going is teaching people who are going to keep this going after us. Managing data will find its place in the world.  I don’t mean analytics, I mean taking care of the data so that people can run legitimate analytics.

It annoys me just now – a lot of these data science and data analytics programs, they’re all about statistics, visualisation, analysis, but very little about actually curating the data underneath. Not to say that data curators don’t need to know a little bit about analysis but people who do data science in the business environment, they often don’t know much about curation.  People working for businesses, they complain that they spend 80% of their time cleaning data, and that without it the data wouldn’t be usable. But I feel like saying, ‘If you hired data curators you wouldn’t have to deal with that problem!’

Humans of Data 25

“I’m an archivist who does digital preservation in a library and I’m very aware of the opportunities and challenges that happen in that context.  When we talk about inclusion, we need to remember professional and technical inclusion, too.  We don’t leverage our cumulative power enough.  Archives, libraries, digital preservation, digital curation, data science: we need to think what we all bring to the table and how we can put the pieces together.  If we don’t do that, we end up bumping into each other and missing opportunities.

I recently marked 30 years of working with data.  I’ve been a curator, preserver, creator and user.  I believe strongly in the continuum of data to information to knowledge to wisdom – we often stop at data and that’s short-sighted.  Data is the raw material that fuels what we understand and share, and we don’t make nearly enough of its potential.

I really like the kinds of stories that people are able to tell with various types of data.  When people think about what data can be, they often stop at structured, quantitative data, but there is a broad mix of the various content that we can consider to be data.  We have an opportunity to innovate if we come together to develop a shared understanding of data services and practice, and collaborate with shared objectives.”

 

Humans of Data 24

“My story with data is funny. A year and a half ago I didn’t know the term ‘big data’ existed. I couldn’t sleep one night in Cairo and I was reading online, and I found an article about big data. I had no idea what it was. So it was like, ‘This is interesting. I should be learning about this.’

So I was self-learning from scratch, so I think the passion started at first sight. I’m so glad I didn’t sleep that night – because here I am studying data because of not sleeping!

I’m passionate about what we can do with data. It’s something very precious. It’s there and no one is using it so let’s use it. Because I have data, I can do things other people can’t. I’m still learning because data is complicated. But when you have them, data gives you power that other people don’t have.”

Humans of Data 23

“I’m excited that people are now starting to think about data sharing. For the last few years it’s been me, as the institutional data manager, going to people and saying, ‘You should make your data available!’  Now people are getting in touch and saying they want to do it, because they’re recognising they can get more stuff published that they can get recognition for.

It’s also good that we’re getting more than just the raw or aggregated data – we’re also getting the survey tools, the Stata code and the files for the processing scripts for how the data is analysed.  It’s exploding out into all the different stages of research.  If you’re thinking about reproducibility of research, you still only see tiny snapshots of that.  I’d like to do more about that: my frustration is that we don’t have software to document all stages of the research process.

A lot of those research outputs are useful but also ephemeral.  If you wanted to reapply a questionnaire, you’d have to do an update of it 2 or 3 years down the line.  Research approaches change, the language changes and so on.  But you could actually go back and do a comparison about how interviewing has changed over a specific time period – as long as we start managing those research outputs too, alongside the data and publications.”

Humans of Data 22

“In my previous life as an academic, I always liked interdisciplinary work: to come at things from a slightly sideways perspective. But in this area, I get to encounter more than most people do – collections, ideas, researchers, people, stories … I get to discover everything from every different area of knowledge, from lots of different perspectives.  The data itself is obviously really interesting but it’s what goes into the creation of that data, and what people then do with that data – that’s what’s really fascinating to me.

When people ask me, ‘What do you do?’, I’m still not sure how best to describe it.  Whenever someone asks, I give a different answer, but it doesn’t actually capture what the day-to-day work is about, which is the exchange of social and cultural knowledge.  I think that’s the most appealing thing to me.  There’s always something new to find out about, and this central thing that we call ‘data’ is a conduit into discovery of all kinds of stories and narratives.  It’s a window into lots of different worlds.”

Humans of Data 21

I’m not a data scientist but I know how to read and fiddle with code. This is what drives me – I want to understand and know something practically, not just by reading about it but by getting first-hand experience in collecting data, doing things with it, manipulation. I enjoy this and find it valuable. I do theory about data practice, so I’m interested in asking what data does to knowledge practices, but I’m looking at it as a philosopher rather than anything else. I’m interested in how data can be used to tell stories, but want to take this one step further. How do we use data to make arguments? I’m interested in how we can move to a critical way of looking at argumentation – how we can use data as evidence, to convince, to tell stories. I’m asking what is ‘good enough’ knowledge, what is ‘responsible’ knowledge, what is ‘valuable’ knowledge? What are the ethical considerations about data when we use it to make decisions?

Humans of Data 20

“Still, I’m inspired by the fact that the field is cross-disciplinary.  To be able to talk about digital preservation in a holistic way you need data producers and data consumers including people from information sciences, library scientists and researchers.  With every domain we need to understand a whole new idea of how data is produced and consumed and the use cases for the value of data.  It never gets boring.  There will always be work.  And if I have a question about a file format or metadata problem I can ask colleagues in New Zealand or the States or Scotland or the Netherlands and they know what I’m talking about.  I love that.  To me it’s like a cool kids’ domain!”

Humans of Data 19

“Digital preservation is a perfect field because it unites two things I’m passionate about: humanities and IT.  I can work on a framework to keep the data for future generations.  It’s always been important to do that whether the data is analogue or not.  Data presents evidence, evidence that’s subject to storytelling and interpretation.  It opens up unlimited possibilities.  If you want to understand how a community ticked at a certain time, literature gives you a representation of the time, of what moved people.  Data that we create today can do the same thing.

Data can be literature, poetry, art or factual experimentation.  It’s not just an output of research; it’s an output of creativity and of our life today.  Sometimes we forget that.  
 
But we should spend more time talking about what works and what doesn’t work.  We need to not always invent new models, but apply a model and see what happens – to use models and tools to curate and treat our data, and then it’s very important to look at these tools critically.  And to improve them. There’s a lot of great output that has come out of projects but does anyone use it?  There’s a gap in implementation.  And funding’s becoming scarcer, so we need to find more effective ways to make tools sustainable and useable for the user communities.  It’s frustrating.”

 

Humans of Data 18

“I work in a university library but was trained as an engineer.  When I was doing my PhD, my advisor claimed engineering was a liberal art, which I didn’t understand then but I get it now: statistics and computation are all methods.  You need to think about people, products and processes, and the workflows that connect them.  So I brought that to library world and the research data management world, and it’s definitely an interesting space for people, products, processes and workflows.

I’ve always felt very welcome in this community. When I came I didn’t have the Library and Information Sciences degree or the background training but even in the early stages of my interaction, the community was very open, welcoming and accepting.  I try to return that to anyone who is new.

I hope we continue those positive trends in diversity and inclusion. There seems to be more awareness now about that but I think we’ve all been to that panel where you think, ‘Hmm, this isn’t right – everyone there looks the same.’  It’s frustrating when those more formal channels of conferences, things like panels, sometimes aren’t reflective of who’s in the audience.  So here, in research data, it’s a healthy community in many ways but we can always look at what can be done better.”

 

Humans of Data 17

“Brené Brown, the social scientist, said that stories are data with a soul.  I think about that a lot in the work I do.  I’m passionate about it.  When I meet the most engaging researchers, they’re good storytellers. Data are ways to connect with stories – data are the underlying content that researchers are sharing through their stories. I’m keen on preserving those stories, sharing those stories, now and in the future.

Particularly now, we’re in an unfortunate situation in the United States where things we had taken for granted – trust and integrity of information – are being questioned.  And we’re seeing such an emerging problem with tribalism, where people in their bubbles only talk to each other.

Data are a way we can span between different communities, different tribes, different people. We do that already in the research space, I think, but I hope that by continuing our work in data, we can help to deal with this tribalism issue.”