How do you know your work is any good? And I mean, really know? (This is the second post in my Heuristics and Empirics series. You can find the first post here.)
Let’s start as simply as possible, with a list. Why a list? Well, because lists are the basic building blocks of just about anything metadata-ish. You can put them together to define a metadata model, nest them to make a hierarchy, relate them using taxonomy and ontology, or just create something browseable so people can choose what they want or where they want to go next. Lists abound on websites, intranets, tagging interfaces, menus, folders, and models. And they're basic, which means we all get what I mean when I say "list."
So, this list. How can we determine – empirically, quantitatively, and with sufficient statistical significance – if this list is actually any good?
Before we can measure goodness, we have to decide what “good” means to us. Are we satisfied if our list is free of typographical errors, or do we need more than that? We need to do some design thinking (not empirical!) and ask why: Why does the quality of this list matter to my business? Asking why is a tenet of effective problem solving and can lead to creative solutions, but for now we need something way simpler. We need a quantitative measurement of goodness (quality), which means we need some kind of goodness scale. The better we understand our motivation for quality, the better we understand which qualities we want in our list, and the better our test protocol will be.
“Why?” almost always boils down to trying to increase value: money (making or saving), time (saving), or risk (reducing). Therefore, our list is good if it leads to more money, more efficient use of time, or less risk. Experience tells us that lists capable of achieving these goals have many of the CRANIUM characteristics:
Luckily for us, CRANIUM features are testable characteristics. What’s more, if we’re smart, we can tie these characteristics to a dollar amount, a time saving, or a risk-avoidance ratio, which gives us an ROI figure and a decent justification for performing the test.
Here are my recommendations for CRANIUM-testing a list of values. None of these approaches requires the tester to speak with participants, or for the participants to answer subjective questions.
These tests can be applied to hierarchical and polyhierarchical lists as well (e.g., taxonomies), although the added complexity makes the results harder to interpret. This is because problems at the higher (broader) levels of the hierarchy will skew test results for the lower (narrower) levels. For example, participants looking for microphones might get confused by an ambiguous top-level choice between “Electronics” and “Computers” and so fail to find “Microphones” under Electronics. Recognizing that the problem lies at the top level (with Electronics and Computers) and not elsewhere is not intuitive.
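One way to untangle this, offered as a rough sketch rather than a prescribed method, is to score each level only against the participants who actually reached it. The function name `per_level_results`, the data layout (one sequence of choices per participant), and the participant data itself are all hypothetical and made up for illustration; only the Electronics, Computers, and Microphones labels come from the example above.

```python
def per_level_results(target_path, participant_paths):
    """Return one (label, reached, correct) tuple per level of target_path.

    reached: participants whose earlier choices all matched the target,
             so they actually faced the choice at this level.
    correct: those among them who picked the right value at this level.
    """
    target = tuple(target_path)
    results = []
    for depth, label in enumerate(target):
        # Everyone faces the top-level choice; deeper levels are only seen
        # by participants who got every broader level right.
        reached = [p for p in participant_paths
                   if tuple(p[:depth]) == target[:depth]]
        # A participant who gave up at this level counts as "not correct".
        correct = [p for p in reached
                   if len(p) > depth and p[depth] == label]
        results.append((label, len(reached), len(correct)))
    return results


# Hypothetical data: the choices six participants made, top level first.
target = ["Electronics", "Microphones"]
paths = [
    ("Electronics", "Microphones"),
    ("Computers", "Keyboards"),
    ("Computers",),                  # gave up after the first choice
    ("Electronics", "Microphones"),
    ("Electronics", "Headphones"),
    ("Computers", "Keyboards"),
]

for label, reached, correct in per_level_results(target, paths):
    rate = correct / reached if reached else float("nan")
    print(f"{label}: {correct} of {reached} correct ({rate:.0%})")
```

In this made-up data only two of six participants find Microphones, which on its face looks like a Microphones problem. Broken out by level, half the participants go wrong at the top-level choice, while two of the three who get past it succeed. With samples this small the numbers prove nothing on their own, but the breakdown at least points the next test at the right level.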
Lists and hierarchies, as you might imagine, appear everywhere in information management environments. These structures are used for product categories and product specifications at e-commerce websites, inside SharePoint document management for tagging, as options for search queries and refiners for search results, and, of course, for navigation everywhere.
Given how often they appear, a certain amount of empirical reassurance is always good for the CRANIUM.