on small data

Maybe it will help to start small.

In the entirety of his poem, De Rerum Natura (On the Nature of Things), as Lucretius advocates for Epicurus’ atomistic view of the world (and thus, now famously through Stephen Greenblatt’s recent book, helps to precipitate the Renaissance), he uses the Latin word datum only a single time–and then only as extrapolated by later readers from a section of missing text.

It strikes me as a bit odd that the term, which derives from dare, “to give,” and refers to that which is given, present, extant, occurs only once in a treatise devoted largely to ideas about relationships among elements at the atomic scale. Throughout the text, Lucretius seeks exactly this–the unseen datum–something elemental or essential about which one could make assumptions and which could explain the underpinnings of the universe. No small task.

Today, however, our post-theory scholarly landscape reminds us of the continual subversion of any such underlying or essential truth, which leads almost inevitably to some significantly larger questions:

What is given?

What are the essential constituent parts of our inquiry process?

At this weekend’s THATCamp, a group of us interrogated the very idea of data in a session aptly titled “What is the Opposite of Big Data?” With guidance from Suzanne Fischer and Sarah Werner, we made a few tentative steps into the shallow lake of big data looking for some substantive ideational rafts among the flotsam of small data sets.

Two questions came out of our session that might foster some useful research or inquiry:

1) Do humanistic projects need to meet expectations about accuracy and statistical significance? And how can scholarship that engages a sample size of n≈1 demonstrate its validity and relevance in an increasingly scientific discourse?

2) Are big data approaches that advocate a distant reading of large swaths of text or cultural artifacts in some way a response to continual challenges to the relevance of the humanities in society broadly? And if so, how can small-data approaches be leveraged to demonstrate the relevance of humanities-based scholarship and liberal arts education generally?

One model around which we coalesced proposes to turn the traditional big data approach (one scholar, lots of texts) on its head (one text, lots of scholars) to underscore the utility of what are, frankly, some really cool DH tools to explore the materiality of texts and artifacts. By having a range of scholars converge on a small data set (be it a single book, poem, musical refrain, painting, or material object), we can turn attention to the affective response inherent in scholarship and create a network of nuanced meanings in the context of a very narrow slice of data. In such a model, the experience of reading itself becomes the data.

Suzanne succinctly summarized the discussion with what began to approach a mission statement–that as a group, “we do value the small and limited, and we do value modest claims. We believe that big data should try to make more modest claims and to think how humanistic inquiry can be ported over to those claims.”

I was left with something not quite a manifesto: a call for renewed attention to the experience of reading and to the materiality of texts, in an effort to reground and make relevant our teaching and scholarship.

Although I think we’d all admit to being only ankle-deep in this discussion, there is also much more profound terrain to explore. In particular, trending toward small data can continue to decentralize, destabilize, and complicate humanistic study in ways that can (perhaps even more so when paired with contextualizing swaths of big data) open doors to innovative methods of inquiry that foreground the significance of humanities scholarship.
