Wednesday, October 14, 2009

Which of these things is most like the others? The Reference Class Problem

The central requirement of the rule of law is equality before the law, which means that like cases ought to be treated alike.  One of the fundamental problems in valuing cases is determining which cases are alike, and under what measures, such that it's fair to say they are worth the same amount (or fair to say that they are in fact different).

The question of whether a group of cases is sufficiently alike that its members ought to be similarly valued (or have similar outcomes along some other measure, such as causation) is referred to as the "reference class problem."  I'm just going to focus on the question of value in this post.  How do we decide what measurable criteria we are going to use to determine that some cases fall into the same categories? Should we worry that there are unmeasurable or subjective characteristics of particular cases that juries would consider but that cannot be accounted for in a statistical or qualitative social science model of valuation?

Edward Cheng (Brooklyn) purports to solve this problem in a recent essay in the Columbia Law Review.  Click the link here for a summary of his ideas. I linked to his piece on SSRN previously (A Practical Solution to the Reference Class Problem).  Unfortunately, even if we can make some progress under Cheng's theory, we must still balance the fit of the model to the data, and that means deciding which variables are "noise" and which are relevant, and accounting for what Donald Rumsfeld would call "unknown unknowns" - variables that we are not able to ascertain but that do end up being relevant. Cheng himself admits in the paper that his approach doesn't solve the extrapolation problem, which is our main concern in mass torts.  On the other hand, I think Cheng has it right that we need to consider how rigorous social science methodology can help us solve these types of problems, or at least move us closer to a solution that will satisfy the equality principle.  He should be commended for moving us a step forward in that direction.  I will have more on this in a forthcoming article.


(hat tip: Emily Wall, Columbia Law Review)

Many thanks to Alexi for raising a number of good questions about the implications of my article. I’ll try to address two of them here.

i) The question of value.
One undoubted limitation of the use of model selection methods (at least in the regression context) as a means for resolving reference class type problems is that you need to have a measure of outcome. Thus, the ideas in the paper work well when what we want to predict is the market value of a house or the pre-exposure risk of cancer. Where they do not work straightforwardly is in areas such as determining commonality in class action cases, because there you really don’t have an obvious target for prediction.
One possibility for analyzing commonality through this lens is to use cluster analysis and the “cluster selection” tools that accompany it. (Thanks go to Richard Nagareda for spurring this idea.) Cluster analysis is about figuring out how to sensibly construct groups, and I think it may be a fruitful avenue. More details to come as my work progresses.
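
To make the cluster idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the claim features (years of exposure, medical costs), the toy numbers, the use of k-means as the clustering method, and the use of the average silhouette score as the "cluster selection" tool. It is not the method of the article, only one simple way of letting the data suggest how many groups of claims there are when no outcome is available to predict.

```python
import math

# Hypothetical claims: (years of exposure, medical costs in $1000s).
CLAIMS = [
    (1, 10), (2, 12), (1, 11),
    (10, 50), (11, 52), (10, 51),
    (20, 100), (21, 98), (20, 101),
]

def init_centers(points, k):
    """Deterministic start: repeatedly pick the point farthest from the chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers

def kmeans(points, k, iters=50):
    """Plain k-means; returns one cluster label per point."""
    centers = init_centers(points, k)
    labels = []
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def silhouette(points, labels):
    """Average silhouette: near 1 means tight, well-separated groups."""
    if len(set(labels)) < 2:
        return -1.0  # a single group cannot be scored
    scores = []
    for i, p in enumerate(points):
        own = [
            math.dist(p, q)
            for j, q in enumerate(points)
            if labels[j] == labels[i] and j != i
        ]
        if not own:
            scores.append(0.0)  # usual convention for a one-member cluster
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels)
            if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# "Cluster selection": try several group counts and keep the best-scoring one.
results = {k: silhouette(CLAIMS, kmeans(CLAIMS, k)) for k in range(2, 5)}
best_k = max(results, key=results.get)
best_labels = kmeans(CLAIMS, best_k)
print(best_k, results)
```

On this toy data the three-group split scores highest. Real claims data would raise questions the sketch ignores, such as how to scale the features and which features to include in the first place.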

ii) Relevancy.
The other big issue that Alexi raises is about the “relevance” of the predictors. How do we know that we’ve gotten all of the important predictors, or put differently, how do we know when our model is “right”?
As a response, I have to admit that I am in many ways advocating for a far more practical and data-driven perspective than what we conventionally see in social science studies of law. I think we need to view model selection methods as an attempt to make the best predictions given the available data. Take property valuation for example – I’d argue that we’re not really interested in the true model of property valuation; all we want is a reasonably accurate prediction of what the house would have sold for on the market. Might we get greater accuracy ultimately if we understood the underlying phenomenon better? Possibly. But until we do, I think the model selection methods are powerful ways of making do with what we have. And arguably, that’s what the legal system does anyway. We aren’t in the business of ultimate truths. We’re in the business of resolving cases based on the evidence at hand.
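
The property valuation example can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the procedure from the article: the sales data, the attribute names (`beds`, `zone`, `color`), and the choice of leave-one-out cross-validation as the selection criterion are all hypothetical. Each candidate reference class is a set of attributes; a house is predicted by the average price of the other houses that match it on those attributes, and the candidate with the lowest prediction error wins.

```python
from collections import defaultdict
from itertools import combinations

ATTRS = ("beds", "zone", "color")

# Hypothetical sales: (attributes, sale price in $1000s).
SALES = [
    ({"beds": 2, "zone": "A", "color": "red"}, 200),
    ({"beds": 2, "zone": "A", "color": "blue"}, 204),
    ({"beds": 2, "zone": "A", "color": "green"}, 196),
    ({"beds": 2, "zone": "B", "color": "red"}, 250),
    ({"beds": 2, "zone": "B", "color": "blue"}, 255),
    ({"beds": 2, "zone": "B", "color": "green"}, 245),
    ({"beds": 3, "zone": "A", "color": "red"}, 300),
    ({"beds": 3, "zone": "A", "color": "blue"}, 303),
    ({"beds": 3, "zone": "A", "color": "green"}, 297),
    ({"beds": 3, "zone": "B", "color": "red"}, 350),
    ({"beds": 3, "zone": "B", "color": "blue"}, 352),
    ({"beds": 3, "zone": "B", "color": "green"}, 348),
]

def loo_error(sales, attrs):
    """Mean squared leave-one-out error for the reference class defined by attrs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for features, price in sales:
        key = tuple(features[a] for a in attrs)
        sums[key] += price
        counts[key] += 1
    total = sum(price for _, price in sales)
    n = len(sales)
    squared_error = 0.0
    for features, price in sales:
        key = tuple(features[a] for a in attrs)
        if counts[key] > 1:
            # Average of the other sales in the same reference class.
            prediction = (sums[key] - price) / (counts[key] - 1)
        else:
            # No comparable sale: fall back to the average of all other sales.
            prediction = (total - price) / (n - 1)
        squared_error += (price - prediction) ** 2
    return squared_error / n

def best_reference_class(sales, candidates):
    """Return (error, attrs) for the candidate with the lowest error."""
    scored = [(loo_error(sales, attrs), attrs) for attrs in candidates]
    return min(scored, key=lambda pair: pair[0])

# Every subset of the attributes is a candidate reference class.
candidates = [c for r in range(len(ATTRS) + 1) for c in combinations(ATTRS, r)]
best_error, best_attrs = best_reference_class(SALES, candidates)
print(best_attrs, best_error)
```

On this toy data the class defined by bedrooms and zone predicts best. Adding paint color makes every class so narrow that no comparable sales remain, while dropping zone lumps together houses that sell at different prices. The sketch makes no claim that bedrooms and zone are the "true" drivers of price, only that they give the best predictions from the data at hand.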

Posted by: Ed Cheng | Oct 15, 2009 12:06:24 PM

