What is Humanities Data?
In this section, we'll lay the groundwork for what we mean when we talk about "humanities data."
What is data?
What do you see in your mind's eye when someone says "data"?
Data is one of those ubiquitous words that we use and see everywhere, but when it comes time to define it, we hesitate. It is used by so many people in so many contexts that it can be difficult to pin down. If asked to share the first thing that comes to mind, you might say "information," "numbers," or "facts."
Often, data is defined by the verbs associated with it, as in this definition from Wikipedia:
"Data is measured, collected and reported, and analyzed, whereupon it can be visualized using graphs, images or other analysis tools. Data as a general concept refers to the fact that some existing information or knowledge is represented or coded in some form suitable for better usage or processing." - Wikipedia
Data is understood as something we do something with. It is an object in motion, under manipulation. It is used to say something, to prove something, or to disprove something else. Data is supposed to hold the truth. "What does the data say?" is a common refrain, as if data on its own can hold the answers. There is a lot more to say about the role and rhetoric of data in our society, but that does not bring us closer to a definition. If we're looking for a simple way to understand data, try this:
"Data is a value assigned to a thing." - School of Data.
A value assigned to a thing. Value is a word we encounter a lot when working with technology and math. It is often a number, but it does not have to be. If you're filling out a form that asks for your name, email address, and phone number, the values are pieces of information that you contribute. The things are the categories, the labels that remain constant no matter how many people fill out the form.
Data can be expressed in many different ways, something we'll talk about in another section. But for now, if you need a visualization, think of a simple table.
Thing 1 | Thing 2 |
---|---|
Value 1 | Value 2 |
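If it helps to see the same idea in code, here is a minimal sketch of the form example in Python. The field labels and the two respondents are made up for illustration; the point is only that the "things" stay fixed while the "values" change from person to person.

```python
# The "things": field labels that stay the same for every respondent.
FIELDS = ("name", "email", "phone")

# The "values": what each (hypothetical) person typed into the form.
responses = [
    ("Ada Example", "ada@example.org", "555-0100"),
    ("Ben Sample", "ben@example.org", "555-0101"),
]

# Pair each value with its thing to get one record per respondent.
records = [dict(zip(FIELDS, row)) for row in responses]

for record in records:
    print(record)
```

Each record is one row of the table above, and each key is one column heading.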
What are the humanities?
The humanities might be a little easier to define than data. Wikipedia says that the humanities are "academic disciplines that study human culture."
How does your own university define the humanities? At my school, the humanities course designation is defined this way:
"Courses in a variety of disciplines focus on aspects of human experience and on methods of addressing the basic questions of meaning in humanistic study. Courses in history, philosophy, religion, or other departments or interdepartmental programs may fulfill this requirement." - W&L Registrar
Now might be a good time to do some research about the history of universities and how we ended up with the academic disciplines we have. Take a look at the first few pages of Chapter 1 in Digital_Humanities. Did anything surprise you? What questions might you have about your own discipline, if you have one?
But the registrar's definition offers something else: humanities courses focus on "methods of addressing the basic questions of meaning in humanistic study." You might also see this phrased as "humanistic inquiry." Does "humanistic" just mean that it is coming from a humanities discipline? Or is there more to it? To borrow a computing term, using a word in its own definition is a bit recursive.
Let's turn to some other potential definitions:
"The spectrum of humanistic thought, like that of scientific investigation, encompasses the gamut of beliefs regarding the nature of knowledge, the world, and the human ability to establish understanding with various degrees of certainty." - D_H
"Humanistic inquiry acknowledges the situated, partial, and constitutive character of knowledge production, the recognition that knowledge is constructed, taken, not simply given as a natural representation of pre-existing fact." - Johanna Drucker
Both of these definitions give us an idea of what the humanities might be after: knowledge. The nature, meaning, and construction of knowledge. And right away, we should notice that knowledge is not a certain, natural thing. We might say that each of the humanities disciplines takes its own approach to finding meaning and constructing knowledge about the human experience. How does Philosophy do this? English? History?
What is humanities data?
Miriam Posner, a well-known DH scholar at UCLA and someone you'll see repeatedly referenced in this coursebook, calls humanities data a "necessary contradiction." She describes the humanities scholar's resistance to seeing their sources, materials, or texts as data. You might have a professor who thinks this way. Many humanities scholars engage with the objects of their study in a way that does not require spreadsheets, databases, or powerful computers. They might read print books, visit archives to read printed documents, or view artwork in person. That being said, many humanities scholars do use technology to help them in their work, and have for a long time. Scholars were using computers in the 1950s to create concordances and indices. Today, scholars are reading e-books, annotating on their iPads, organizing and tagging digital images, or searching scholarly databases. In doing so, they're relying on tools and processes built by librarians, who have been organizing information for a long time.
As technology expands into every corner of our lives, humanities scholars find themselves wanting and needing to address their questions, aka their lines of humanistic inquiry, in new ways. In some cases, humanities scholars have led the way in building new tools and methods for analyzing data. Examples.
But what humanities scholars have found is that their "data" does not always make good data. We'll learn later on about tidy, well-structured data sets, but humanities sources often do not fit the bill. Humanities-based research objects could be a single book, a set of unique, hand-written manuscripts stored in four different libraries, the art on the walls of a whole city, ancient graffiti etched into crumbling plaster, or audiovisual material so fragile that it degrades with every viewing. Humanities scholars might have questions that start with "how many" or "what percentage," but they see years of work ahead in order to answer those questions.
It's true that some humanities data projects could go on for years and years. But before we get going, we have to figure out where we're going and how we're going to get there. This crucial step is called data modeling, and it takes up a large portion of the work of a humanities data project. Before we start typing into our Google Doc, we need to create a model of exactly what we're collecting, how it is going to be formatted, and, most difficult of all, what information we don't care about. For some, this is an excruciating process. Every detail is interesting, worth a whole day of rabbit trails and research. For others, it brings immense satisfaction to organize their material into neat rows and columns. Regardless, it's necessary to bring your goals, data, and analysis methods in sync with one another.
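To make data modeling a little more concrete, here is a minimal sketch in Python. It assumes a hypothetical survey of manuscripts; the class name, the field names, and the sample notes are all invented for illustration and are not drawn from any real project. The model answers the three questions above: what we collect (the fields), how each value is formatted (the types and cleanup rules), and what we leave out (everything else).

```python
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class ManuscriptRecord:
    """One row in a hypothetical manuscript survey."""
    shelfmark: str               # identifier used by the holding library
    library: str                 # where the manuscript is stored
    year_written: Optional[int]  # None when the date is unknown
    language: str                # stored in lowercase for consistency


# The names of the fields we decided to collect.
MODELED = {f.name for f in fields(ManuscriptRecord)}


def to_record(notes: dict) -> ManuscriptRecord:
    """Turn free-form research notes into one consistently formatted record."""
    year = notes.get("year_written")
    return ManuscriptRecord(
        shelfmark=notes["shelfmark"].strip(),
        library=notes["library"].strip(),
        year_written=int(year) if year else None,
        language=notes.get("language", "unknown").strip().lower(),
    )


def left_out(notes: dict) -> list:
    """List the details in the notes that the model deliberately ignores."""
    return sorted(set(notes) - MODELED)


# Made-up notes for a single manuscript.
notes = {
    "shelfmark": " MS 12 ",
    "library": "Example Library",
    "year_written": "1450",
    "language": "Latin",
    "ink_color": "brown",
}

record = to_record(notes)
print(asdict(record))
print("Not recorded:", left_out(notes))
```

Writing the model down this way makes the hard choice visible: the ink color is interesting, but it is not part of this dataset unless we decide to add a field for it.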
Fortunately, you don't have to do it alone. Humanities scholars have a reputation for working in isolation. Thinking, reading, and writing are solo activities. But data-driven humanities projects often require a team of people with a range of skill sets. Humanities scholars partner with librarians and technologists, among others, in order to build giant databases or interactive applications. And importantly, they collaborate to extend and share their data. As just one example, the Pelagios Network lists dozens of partners in their expansive goal to "link and explore the history of places."
Why humanities data?
Why is this coursebook about data in the humanities and not just data in general? What about the social sciences? Or journalism? Aren't some of these methods used in the sciences as well? What if I'm not going to be a professor? Why should I care?
Good questions! First, there are a lot of existing resources for folks in other fields looking to learn about working with data. Examples. There are not as many resources available for folks, especially students, looking to learn about working with humanities data. This coursebook aims to fill this gap, and to do so from a multidisciplinary perspective.
But the better answer is that there are valuable lessons and transferable skills to be learned from working with humanities data. Skills like data modeling, cleaning, visualization, and analysis can be used in all sorts of other ways. Data and databases are present in virtually every industry.
Beyond technical skills, working with humanities data shows you how to apply that humanistic inquiry to technology. It helps you see how the complexities of our world may have been sliced or squished to fit into a database. Understanding the principles of design will help you recognize a misleading data visualization.
What about digital humanities (DH)?
As you work through this coursebook, you will find references to "Digital Humanities" or DH. A lot of energy has been put into coming up with definitions for the Digital Humanities, but the short version is: it's the intersection of the humanities and technology. It's an umbrella term created to help humanists understand what their disciplines look like in a technological world. It's a community of practice that values process, openness, and experimentation. Humanities data projects certainly fall within Digital Humanities, but not every DH project is a humanities data project. At its worst, it's a gatekeeping term, used to weaken the confidence of those who aren't sure their work is DH enough. Some people believe that in the future, DH will just be the humanities, but until then, we have this label to organize around.
Activities
Activity 1.1
Let's take a look at two humanities data projects to get a sense of what we're talking about. Spend a few minutes exploring each project, then come back and see if you can answer the following questions. It's okay if you're not quite sure what each question means, but give it your best shot.
Projects:
- Project 1: Photogrammar
- Project 2: Robots Reading Vogue
Questions:
- What is the goal of this project? Are there guiding research questions?
- Who are the authors? What are their affiliations and roles? Are students involved?
- How was this project funded?
- What is the source of the data?
- How has the data been processed or modified for this project?
- What do the visualizations show? Are they interactive?
- What tools or technologies were used to build this project?
- What was interesting about this project? What was confusing?
Resources
- Data Carpentry
- The Digital Humanities Literacy Guidebook
- Digital Humanities Now
- The Historian's Macroscope: Big Digital History
- The Programming Historian
- ProPublica Data Institute
- School of Data Online Courses
Readings
- A Companion to Digital Humanities
- Defining Data for Humanists: Text, Artifact, Information or Evidence?
- Digital_Humanities, chapter 1
- Humanities Approaches to Graphical Display
- Humanities Data: A Necessary Contradiction
- Technology Is Taking Over English Departments: The false promise of the digital humanities
- What is data?