Data, data, everywhere...

why we should reject extreme views about data source validity in constructing knowledge

Posted Apr 16, 01:03 pm in epistemology, research


Some academics are not secure about knowledge without having reams of “data” about it. But what exactly is data? To many people, it just means numbers that somehow tell us something about something else. Often, data is a proxy for information we can’t really get directly. If you ask someone how much they would pay for something, you’re trying to gain access to a hidden piece of information that you can’t access directly. You would like to be able to reach into someone’s head and grab the number, but you can’t. Instead, you have to ask. The problem with asking, however, is that the person may not really know how much they would pay. What they tell you and what they might actually do could be two very different things. Many researchers have issues with this, but I want to explain here why I think it’s misguided to categorically dismiss certain forms of data.

Okay, truthfully, I wouldn’t place a tremendous amount of stock in survey data. Often if you ask people about things that intimately relate to them, they have little insight or even self-awareness. This creates a problem in data accuracy. I once read a paper in which respondents were asked, among other things, how neurotic they considered themselves. What person is going to be able to answer this accurately? People have a vested interest in believing that they are not neurotic and are likely motivated to have an overly positive view of themselves; moreover, they probably don’t have any awareness of how neurotic they are. You could get this information much more reliably by asking the people who have to spend extended amounts of time with them. But beyond that, it gets even more complicated once you realize that people have different scales. A 7 to one person is a 9 to another. Over a group of people, these numbers end up muddled because each respondent is answering on a private, subjective scale. But often, it’s the best we have, and we have to make do with it. And generally, if you take the time to control for variables, you can reduce the uncertainty.
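To make the "different scales" problem concrete, here is a minimal sketch of one common way to adjust for it: re-expressing each respondent's ratings relative to that respondent's own average and spread. The respondent names and ratings are made up for illustration, and the approach assumes each person rated several items. It also throws away any real difference in overall level between people, so it is a trade-off rather than a cure.

```python
from statistics import mean, pstdev


def standardize_within_respondent(ratings):
    """Convert one respondent's raw ratings to z-scores on their own scale."""
    mu = mean(ratings)
    sigma = pstdev(ratings)
    if sigma == 0:
        # The respondent gave the same rating everywhere; there is no spread to scale by.
        return [0.0 for _ in ratings]
    return [(r - mu) / sigma for r in ratings]


# Hypothetical ratings of the same three items by two respondents.
# Respondent B uses the top of the scale; respondent A uses the middle.
responses = {
    "respondent_a": [5, 6, 7],
    "respondent_b": [7, 8, 9],
}

standardized = {
    name: standardize_within_respondent(ratings)
    for name, ratings in responses.items()
}

for name, scores in standardized.items():
    print(name, [round(s, 2) for s in scores])
```

After the adjustment the two respondents look identical, which is the point: the raw gap between a "7" and a "9" came from how each person uses the scale, not from how they ranked the items.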

Nevertheless, there are those who argue that “factual” data is the only way to go. This is stuff like scanner data (quantitative information about sales, for example, with no room for “interpretation”). This information, it is argued, consists of “facts.” They are not debatable, touchy-feely constructs dealing with emotions and nebulous latent thought patterns. Yet, in my opinion, numbers only tell part of the story. You can get a what from that sort of data, but it’s much harder to get a why. You can speculate, but speculation only gets you so far. Some might argue that getting a why from interviewing doesn’t work either. I can see why they might say that (perceived lack of generalizability), but nevertheless, one can still gain some valuable insights, especially over large sample sizes.

I know of a few researchers in marketing who do, almost exclusively, ethnographic studies. This means that they interview a usually small number of people and perform something akin to psychoanalysis to make sense of their comments, from which they extract broader-level understandings about people and culture. This is a method that many people apparently view with skepticism and scorn. “How do we know that this stuff is true, and that it’s generalizable?” a colleague once asked me. This, to me, invites the question of how we “know” anything.

Knowledge is a strange thing. What does it mean to “know” something, anyway? The best explanation of knowledge that I’ve ever heard was featured in Michael Shermer’s excellent book Why People Believe Weird Things. In it, he offers a thought experiment.

There are a number of people who are vehement Holocaust deniers. They claim that this genocide never happened, and people who believe it did are ignoring mountains of evidence showing that it didn’t happen. These people then present certain pieces of evidence that they have collected that indicate that they are correct. A piece of evidence Shermer received from one denier was a photograph of a gas chamber that showed that the gas lines weren’t hooked up properly. Therefore they could not have been used. This was at one of the concentration camps with the most supposed casualties. The denier had numerous other similar forms of ‘proof’ as well.

So what are we to make of this?

Shermer’s response is rather remarkable for its simple but profound intuition: he states that knowledge is not produced from isolated pieces of evidence. It is based on a convergence of evidence from many different areas. It’s not just the millions of bodies found, it’s not just the thousands of witnesses, it’s not just the trove of photographs, it’s not just the millions of pages of documentation. It’s all of these things together that all point towards the same thing, which is that the Holocaust happened.

Isolated photographs don’t prove anything; what we count as knowledge rests on our assessment of what is most likely true given all the information available. One photograph of a disconnected gas chamber does not disconfirm all the aforementioned evidence, and to think that it does suggests the presence of motivated reasoning more than anything else. Even a hundred such photographs don’t show much, because the disparity in evidence between the “it happened” group and the denier group is so great that it’s virtually impossible for an unbiased party looking at all the evidence to arrive at the conclusion that it didn’t happen.

Likewise, I think it’s very important for researchers to have a somewhat contemplative view of data. Despite the presence of “definitive” journal articles about certain subjects, one sample of data and its interpretation is not definitive; more important is the convergence of many different types of data onto one most likely truth. More data from more diverse sources makes it less likely that the bias in any one piece will mislead you in a certain direction. However, as anyone who has been collecting data for any amount of time will tell you, it isn’t easy to get data, much less many different forms of data that relate to the same constructs. But if you ask me, from an epistemological point of view, it’s the only way to establish knowledge; the more data you have that all point at the same thing, the more you can be certain about something. It’s a rather simple concept that forms the basis of the scientific method.
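One way to put rough numbers on this idea of convergence is Bayesian updating in odds form, where each piece of evidence multiplies the odds of a claim by a likelihood ratio. This is only a sketch of the intuition, not anything Shermer proposes: the function name, the prior of 0.5, and the likelihood ratios below are all made up, and the arithmetic assumes the sources are independent of one another, which real data sources rarely are.

```python
def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability with a series of independent likelihood ratios.

    A ratio above 1 means the evidence favors the claim; below 1, it counts against it.
    """
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1 + odds)


# Hypothetical: each source is only modest on its own (3-to-1 in favor).
one_source = posterior_probability(0.5, [3])
three_sources = posterior_probability(0.5, [3, 3, 3])

# Hypothetical: four supporting sources and one piece that points the other way.
with_outlier = posterior_probability(0.5, [3, 3, 3, 3, 1 / 3])

print(round(one_source, 3))
print(round(three_sources, 3))
print(round(with_outlier, 3))
```

Under these assumed numbers, no single source is decisive, but several modest ones agreeing push the probability far higher than any one of them could, and a lone contrary piece only cancels one supporting piece rather than overturning the rest.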

Even if some people find certain forms of data less reliable than others, all data collected in a responsible fashion can be useful in constructing truth, especially when combined with other relevant data. Every piece can play a role, just as each brick plays a role in building a house. True, not all data is perfect; in fact, very little of it is. But it is not productive to dismiss entire categories of data simply because they came from a survey or are based on ethnographic information. It is a grievous research error, in my opinion, to make the perfect the enemy of the good; it undermines the role of information diversity in arriving at truth.



