April 18, 2009
Taking Liberties with the Numbers
This month's issue of the California Lawyer perpetuates the confusion in the media about DNA database trawls. In an article entitled "Guilt by the Numbers: How Fuzzy is the Math that Makes DNA Evidence Look So Compelling to Jurors?," award-winning journalist Edward Humes discusses the unusual case of People v. Puckett, No. A121368 (Cal. Ct. App., 1st Dist., May 1, 2008). John Puckett, now an elderly man, is appealing his recent conviction for the 1972 murder of Diane Sylvester, a San Francisco nurse. The conviction rests on a cold hit in California’s convicted-offender database at a small number of STR loci (genetic locations). Humes writes that in Puckett, "the prosecution's expert estimated that the chances of a coincidental match between the defendant's DNA and the biological evidence found at the crime scene were 1 in 1.1 million." Id. at 22. Then he adds that "there's another way to run the numbers" which shows that "the odds of a coincidental match in Puckett's case are a whopping 1 in 3." Id. "Both calculations," he maintains, "are accurate. The problem is that they answer different questions." Id. The explanation, he believes, lies in "a classic statistical puzzle known as the 'birthday problem.'" Id.
Surely the probability of "a coincidental match" cannot have such fantastically different "accurate" values. Moreover, the birthday problem has almost nothing to do with these numbers. The fuzziness is in the words of the article, not in the math. Only if we define "a coincidental match" can we begin to see what its probability would be and how unlike the birthday problem it is.
Definition 1. The probability of a coincidental match is the chance that Mr. Puckett is innocent and the match to him is just a coincidence.
The average reader might think that a coincidental match means that Mr. Puckett is innocent and the match to him is just a coincidence. If this is what it means, however, its probability is neither 1 in 1.1 million nor 1 in 3. The former figure is the probability that Puckett's DNA would match if he were the only one whose DNA had been checked and if he were unrelated to the killer. The latter figure is the probability that at least one profile in the California database -- not necessarily Puckett's -- would match if no one in the database were the killer. Notice that both probabilities are conditional -- they depend on assumptions about who the real killer is or is not. They cannot readily be inverted or transposed into the probability of who the real killer is. Under Definition 1, therefore, neither number is an "accurate" statement of the probability of a coincidental match. Neither one expresses the chance that the match to Mr. Puckett is just a coincidence.
A technical note: This description of the probabilities of 1 in 1.1 million and 1 in 3 assumes, for simplicity, that it was the killer's DNA that was found near the victim and later typed and that there was no possibility of error in the DNA typing, no ambiguity in the test results, and no selectivity in presenting them. Statisticians will immediately recognize that Bayes' rule could be used to arrive at the posterior probability of Puckett's innocence.
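To make the point about Bayes' rule concrete, here is a minimal sketch of the odds form of the rule applied to a test of a single suspect. The prior probabilities are hypothetical and chosen only for illustration; nothing in the case record supplies them. The function name is likewise an assumption of this sketch. It shows only that the posterior probability depends on the prior and is not simply 1 in 1.1 million.

```python
from fractions import Fraction


def posterior_prob_source(prior_prob, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1 + posterior_odds)


# Random-match probability quoted in the article.
p = Fraction(1, 1_100_000)

# Likelihood ratio for a single-suspect test, under the simplifying
# assumptions in the technical note (no typing error, unrelated suspect):
# P(match | source) / P(match | not source) = 1 / p.
lr = 1 / p

# HYPOTHETICAL prior probabilities that the suspect is the source.
hypothetical_priors = (Fraction(1, 1_000_000), Fraction(1, 1_000))

for prior in hypothetical_priors:
    post = posterior_prob_source(prior, lr)
    print(f"prior {float(prior):.6f} -> posterior {float(post):.6f}")
```

With the smaller hypothetical prior the posterior probability is only a little above one-half; with the larger one it is above 0.999. The same DNA evidence supports very different conclusions depending on the other evidence in the case.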
Definition 2. The probability of a coincidental match means the chance that Mr. Puckett's DNA would match (and no other DNA in the database would) if he were not the killer and if he were unrelated to the killer.
This definition refers to the probability of the DNA evidence given the hypothesis of coincidence. Again, neither 1 in 1.1 million nor 1 in 3 expresses this value, but 1 in 1.1 million is a far closer estimate than is 1 in 3. The reason is that the DNA evidence includes not merely the datum that Puckett's DNA matches, but the additional information that no one else's does. If Puckett were the only one tested (a database of size 1) and if he were innocent, then the chance that he would match would be 1 in 1.1 million. Now we test an unrelated second person. The chance that this individual would match if he were innocent also is 1 in 1.1 million, and the chance that he would match if he were the killer is 1. The chance that Puckett matches and the other man does not is therefore either (1/1,100,000) x (1 - 1/1,100,000) (if both men are innocent) or (1/1,100,000) x 0 (if Puckett is innocent and the other man is the killer, since the killer is certain to match). In other words, the probability that Puckett, and only Puckett, matches just by coincidence (that is, matches even though he is innocent) in a search of a database of size 2 is, at most, 1 in 1.1 million. Searching the database and finding that only Puckett matches is better evidence than testing only Puckett. (This reasoning is developed more fully, for a database of any size, in, e.g., David H. Kaye, Rounding Up the Usual Suspects: A Legal and Logical Analysis of DNA Database Trawls, 87 N. Car. L. Rev. 425 (2009).)
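The database-of-size-2 reasoning extends to any database size. The sketch below assumes that everyone in the database is innocent and unrelated and that profiles match independently; the function name and the database sizes are illustrative only.

```python
def prob_only_first_matches(p, n):
    """Chance that one named innocent person matches while the other
    n - 1 innocent, unrelated people in the database do not.

    Assumes each innocent profile matches independently with probability p.
    """
    return p * (1 - p) ** (n - 1)


# Random-match probability quoted in the article.
p = 1 / 1_100_000

# Illustrative database sizes (not the size of the California database).
for n in (1, 2, 1_000):
    print(n, prob_only_first_matches(p, n))
```

For n = 1 the result is exactly p, and for every larger n it is smaller, so the probability under Definition 2 never exceeds 1 in 1.1 million.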
Definition 3. The probability of a coincidental match means the chance that one or more DNA profiles in the database would match if no one in the database is the killer.
This definition refers to the probability of one or more hits in the database given that the database is innocent. This probability is approximately 1 in 3. What it has to do with the probability that the DNA in the bedroom was Mr. Puckett's is obscure. It is not even the expected rate at which searches of innocent databases would lead to prosecutions. After all, the 1 in 3 figure includes people who were not even born in 1972, when Puckett allegedly killed Diane Sylvester. If the probability that applies under Definition 3 were to be admitted, it should be adjusted so that it is not so misleadingly large. See id.; David H. Kaye, People v. Nelson: A Tale of Two Statistics, 7 L., Probability, & Risk 247 (2008).
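A figure of roughly 1 in 3 under Definition 3 comes from a calculation of the following kind. The database size used here is hypothetical, chosen only to show the arithmetic; this post does not state the actual size. The sketch also assumes that innocent, unrelated profiles match independently.

```python
def prob_at_least_one_match(p, n):
    """Chance that one or more of n innocent, unrelated profiles match a
    specified crime-scene profile, each independently with probability p."""
    return 1 - (1 - p) ** n


# Random-match probability quoted in the article.
p = 1 / 1_100_000

# HYPOTHETICAL database size, for illustration only.
n_hypothetical = 400_000

print(prob_at_least_one_match(p, n_hypothetical))  # exact form
print(n_hypothetical * p)                          # simple n * p upper bound
```

The simple product n x p always overstates the exact value somewhat. Either way, the result counts a hit to anyone in the database, whatever that person's age or whereabouts in 1972.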
The Birthday Problem
Also contrary to the claim in the California Lawyer, the birthday problem is not involved in Puckett. The birthday problem, in its simplest form, asks for the smallest number of people in a room such that the probability that at least two of them will have birthdays on the same day of the same month exceeds one-half. The answer (23) is surprisingly small because no particular birthday is specified. In the Puckett search, however, a particular DNA profile -- the one from the crime scene -- is specified. Finding that this particular profile matches at least one in the database is much less likely than finding at least one match between all pairs of profiles in the database. The latter event is the kind that is at issue in the birthday problem. See David H. Kaye, DNA Database Woes: What Is the FBI Afraid Of? (under review). It is not involved in a cold hit to a crime-scene profile.
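The difference between the two kinds of match can be checked directly. This sketch assumes 365 equally likely, independent birthdays; the function names are illustrative. It compares the classic birthday problem (any two people share a birthday) with the specified-birthday version (someone has one particular birthday), which is the analogue of searching a database for one crime-scene profile.

```python
def prob_shared_birthday(n, days=365):
    """Chance that at least two of n people share a birthday."""
    prob_all_distinct = 1.0
    for k in range(n):
        prob_all_distinct *= (days - k) / days
    return 1 - prob_all_distinct


def prob_specific_birthday(n, days=365):
    """Chance that at least one of n people has one specified birthday."""
    return 1 - ((days - 1) / days) ** n


def smallest_n(prob_fn, threshold=0.5):
    """Smallest group size at which prob_fn first exceeds the threshold."""
    n = 1
    while prob_fn(n) <= threshold:
        n += 1
    return n


print(smallest_n(prob_shared_birthday))    # 23
print(smallest_n(prob_specific_birthday))  # 253
```

Once a particular birthday is specified, it takes 253 people rather than 23 to push the probability past one-half.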
There are other errors in the California Lawyer article, but I hope I have said enough to caution readers to be wary. The media portrait of the database-trawl issue bears but a faint resemblance to the peer-reviewed statistical literature on the subject.
Guilt by the Numbers: How fuzzy is the math that makes DNA evidence look so compelling to jurors?, California Lawyer, Apr. 2009, at 21-24.
This blog --
The Birthday Problem in Las Vegas, Aug. 11, 2008
DNA Database Woes and the Birthday Problem, July 20, 2008
Rounding Up the Usual Suspects III: People v. Nelson, June 22, 2008
The Transposition Fallacy in the Los Angeles Times, June 8, 2008
The Transposition Fallacy in Brown v. Farwell, May 3, 2008
Rounding Up the Usual Suspects II, May 5, 2008
Rounding Up the Usual Suspects, April 5, 2008
Recent law review articles
David H. Kaye, People v. Nelson: A Tale of Two Statistics, 7 L., Probability, & Risk 247 (2008)
David H. Kaye, Rounding Up the Usual Suspects: A Legal and Logical Analysis of DNA Database Trawls, 87 N. Car. L. Rev. 425 (2009)
April 16, 2009
Two Cases on Multiple Chemical Sensitivity
A diagnosis that is presented in courts with some regularity is "multiple chemical sensitivity." Wikipedia provides the following links and remarks about its dubious scientific status:
"Because of the lack of scientific evidence based on well-controlled clinical trials that supports a cause-and-effect relationship between exposure to very low levels of chemicals and the myriad symptoms reported by clinical ecologists, MCS is not recognized as an established organic disease by the American Academy of Allergy, Asthma, and Immunology, the American Medical Association (AMA), the California Medical Association, the American College of Physicians, and the International Society of Regulatory Toxicology and Pharmacology."
Case law therefore generally rejects expert medical testimony of MCS. A recent Kansas court of appeals case, Kuxhausen v. Tillman Partners, 197 P.3d 859 (Kan. Ct. App. 2008), is illustrative. "When Stacy Kuxhausen reported for work at an accounting firm on a Monday morning in Manhattan, Kansas, she smelled paint and began to feel ill within minutes of entering the building. She said that her eyes burned, that she started to get a sore throat, and that she had to take deep breaths to get enough air. She later learned that epoxy-based paints had been applied in the basement of the building on the preceding Friday and Saturday. Kuxhausen came back to the building twice more over the next few days but stayed for only a few hours each time. . . ." She sued the building owners for about $2.5 million.
She found a member of the American Academy of Allergy, Asthma, and Immunology (not board certified), Dr. Henry Kanarek, who has diagnosed more than 100 patients with MCS and who concluded that Ms. Kuxhausen was suffering from this condition. The trial judge barred the diagnosis on the ground that MCS is not an ailment that is generally recognized in the medical community.
The court of appeals affirmed. There is nothing odd about that, but the court had to distinguish Kuhn v. Sandoz Pharmaceuticals Corp., 14 P.3d 1170 (Kan. 2000). Although Kansas follows Frye v. United States, 293 F. 1013 (D.C. Cir. 1923), in requiring general acceptance of scientific propositions that are the basis of expert testimony, Kuhn tosses this test out the window when the expert gives "pure opinion" based on personal experience. Thus, had Dr. Kanarek simply testified that in his 13 years of practice he had encountered more than 100 cases of MCS, the trial judge might have had to admit the diagnosis. But "[w]hen asked his basis for multiple-chemical sensitivity as a valid diagnosis," Dr. Kanarek cited "information that has appeared in various articles written in the publications that I've read as well as lectures or discussions." Because he "has relied upon articles and lectures by others as support for the validity of the diagnosis," the court of appeals concluded that his testimony was inadmissible, Kuhn notwithstanding.
There is something ironic about a legal doctrine that excludes scientific evidence when the expert cites the scientific literature yet admits it when the expert relies on much more limited (and much less reliable) personal experience of a single physician. Kuhn makes little sense.
The Kansas Court of Appeals had to get around bad law to reach a reasonable result. Moving from Kansas to Oregon, the Oregon Court of Appeals misapplied good law (in the form of a state test for scientific evidence that anticipated the approach in Daubert v. Merrell Dow Pharmaceuticals) to deem it an abuse of discretion for a trial court to exclude such theories as the claim that dental fillings cause chemical sensitivity. In Kennedy v. Eden Advanced Pest Technologies, 193 P.3d 1030 (Or. Ct. App. 2008), the court of appeals thought that theories and publications within the subculture of clinical ecology were just as valid and established as mainstream medicine. It was unfazed by a toxicologist's testimony that within "the recognized medical community" there was zero acceptance of the field "because it hasn't been substantiated [as] a scientific method." Id. at 1037.
Plaintiff relied on a diagnosis of "Dr. William Rea . . . who founded the Environmental Health Center in Dallas," id. at 1035, and whose methods, the court conceded, had been rejected by "virtually all courts that have considered the issue." Id. at 1041. But a physician who testified for the defendant dismissed the diagnosis as resting on "novel tests * * * published in obscure journals for which we don't know anything about peer review or other aspects of the testing procedure." Id. at 1037. He explained that Dr. Rea was "the mouthpiece, so to speak, for the clinical ecology movement. But the—the difficulty with—with this concept is that it's never had any scientific underpinnings. [T]he condition [cannot] be defined in such a way that anybody can properly diagnose it. [W]e continue to see a number of physicians who . . . use diagnostic tests that are not validated. They continue to make the diagnosis of multiple chemical sensitiv[ity], or MCS, or chemical sensitivity or sometimes it's been renamed to idiopathic environmental intolerance. None of these are legitimate diagnosable medical conditions for which criteria exist." He insisted that Dr. Rea is "practicing something that is not mainstream medicine, for sure. That, I can tell you."
In essence, the Oregon court substituted a simple credentials test for the requirement of a scientific foundation for scientific testimony, observing that "Rea is a medical doctor who has practiced for a long period of time, belongs to relevant professional organizations, and has examined over 30,000 patients." Id. at 1039. Apparently thinking that the possession of an M.D. makes a physician a scientist, the court stated that "there exists a legitimate debate within the scientific community between two groups of scientists." Id. at 1040. It concluded that "the most that can be said is that there is a controversy in the medical community about whether chemical sensitivity or MCS is a valid diagnosis." Id. at 1039.
The question, of course, is not just whether there is a controversy among individuals with advanced degrees. It is the nature, quality, and extent of the data that might confirm or refute the beliefs of these individuals. The learned professions are not immune to quackery. Some physicians entertain unvalidated -- and sometimes implausible -- theories. These believers may organize themselves into professional societies, issue certificates to their members, and publish their own peer-reviewed journals (that are ignored by the larger medical community). Courts dealing with medical testimony therefore may have to probe more deeply than the Oregon court did into the substance of the dispute if they are to reach sound decisions about the admissibility of scientific evidence.
Thanks to David Bernstein for calling the Oregon Court of Appeals opinion to my attention. Further discussion of the admissibility of medical testimony in light of modern tests for scientific evidence can be found in The New Wigmore, A Treatise on Evidence: Expert Evidence (2004).