
May 5, 2008

Rounding Up the Usual Suspects II

Not long ago, I mentioned the DNA-database-trawl issue that has led to several confused court opinions. The evidentiary issue is whether a complete search through a database of DNA profiles that produces one and only one match is less probative than a simple match to a known suspect. Some researchers in the U.K. tendentiously call the former use of the database “speculative searching” (Kaye 2006, 18).

Now, an article in the May 3 Los Angeles Times claims to have uncovered a national scandal of sorts. The reporters describe a recent “cold hit” case that they say

is emblematic of a national problem, The Times has found. [¶] Prosecutors and crime labs across the country routinely use numbers that exaggerate the significance of DNA matches in "cold hit" cases, in which a suspect is identified through a database search. [¶] Jurors are often told that the odds of a coincidental match are hundreds of thousands of times more remote than they actually are, according to a review of scientific literature and interviews with leading authorities in the field.

The article maintains that

[I]n cold hit cases, the investigation starts with a DNA match found by searching thousands, or even millions, of genetic profiles in an offender database. Each individual comparison increases the chance of a match to an innocent person. [¶] Nevertheless, police labs and prosecutors almost always calculate the odds as if the suspect had been selected randomly from the general population in a single try. [¶] The problem will only grow as the nation's criminal DNA databases expand. They already contain 6 million profiles.

This description portrays one approach to the issue as if it were the consensus in the scientific literature. It is not. There is disagreement about the need to adjust a random-match probability. Furthermore, if one counts the number of peer-reviewed articles on the subject, the dominant view is that adjustment is not necessary.

I won't present a full-blown analysis here, but I will offer a thought on the statement that “[e]ach individual comparison increases the chance of a match to an innocent person.” It is true that if one searches a database of a million innocent people, all of whom are unrelated to the source of the crime-scene DNA, there are more opportunities for a match to an innocent person than if one searches a database of half a million innocent people, or a database of only one innocent person (i.e., the suspect). So sooner or later, searches of innocent databases will produce a false positive. Indeed, they already have.
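To make the "more opportunities" point concrete, here is a minimal sketch of the chance that a database made up entirely of innocent, unrelated people yields at least one coincidental match. The function name and the random-match probability of one in a million are illustrative assumptions, not figures from the article, and the calculation assumes each comparison is independent.

```python
def prob_at_least_one_match(db_size: int, rmp: float) -> float:
    """Chance that a database of innocent, unrelated people contains
    at least one coincidental match.

    Assumes (for illustration only) that every profile matches
    independently with the same random-match probability `rmp`.
    """
    return 1.0 - (1.0 - rmp) ** db_size


if __name__ == "__main__":
    rmp = 1e-6  # hypothetical random-match probability
    for db_size in (1, 500_000, 1_000_000):
        chance = prob_at_least_one_match(db_size, rmp)
        print(f"database of {db_size:>9,}: {chance:.6f}")
```

Under these assumptions the chance climbs steadily with the size of the database, which is the sense in which the newspaper's statement is true.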

But is the probability that an innocent database will contain a matching type the right question to ask? The probative value of a match depends on how much it shifts the odds in favor of the prosecution's claim that the matcher is the source of the crime-scene DNA. The enhancement in the odds grows progressively larger as the size of the database increases. The reason is simple. More and more people are definitively excluded as possible sources of the crime-scene DNA. This raises the probability that someone else in the population — including the matcher — is the source. In the limiting case of a database that includes every person on earth, the evidence of a single match in the database becomes conclusive (ignoring scenarios involving fraud or laboratory error).

It can be shown (and has been) that, due to this “exclusion effect,” the single match in the database raises the odds even more (at least slightly) than does testing a single person at random and finding that he matches. (E.g., Donnelly and Friedman 1999; Kaye 2008). Therefore, if there is any prejudice in the existing practice of reporting the random-match probability in the “cold hit” case, it is not because a cold hit in a large database is less probative than a cold hit in a small one!
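A toy calculation can illustrate the exclusion effect. This is only a sketch under strong simplifying assumptions, not the analysis in the cited papers: exactly one person in a population is the source, everyone is equally likely to be the source before testing, the true source always matches, each non-source matches independently with the random-match probability, and there is no laboratory error or fraud. With those assumptions, Bayes' rule gives the probability that the lone matcher is the source as 1 / (1 + (population - database size) × random-match probability). The function name and the numbers are hypothetical.

```python
def posterior_source_probability(population: int, db_size: int, rmp: float) -> float:
    """Probability that the single matching person is the source.

    Toy model (assumptions, not from the post): uniform prior over the
    population, the true source always matches, each non-source matches
    independently with probability `rmp`, and no lab error.

    Everyone else in the database has been tested and excluded, so the
    only remaining alternatives are the people outside the database.
    """
    untested_alternatives = population - db_size
    return 1.0 / (1.0 + untested_alternatives * rmp)


if __name__ == "__main__":
    population = 1_000_000  # hypothetical pool of possible sources
    rmp = 1e-6              # hypothetical random-match probability
    for db_size in (1, 100_000, 500_000, 1_000_000):
        post = posterior_source_probability(population, db_size, rmp)
        print(f"database of {db_size:>9,}: posterior = {post:.6f}")
```

In this sketch a database of one corresponds to testing a single person, and the posterior rises as the database grows, reaching certainty when the database covers the whole population. The size of the gain depends entirely on the assumed numbers.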

In sum, searching large databases gives more information than searching small ones, and searching small ones is better than limiting a search to a single individual. The DNA evidence has more, not less, probative value in a database-search case than in a single-suspect case.


Donnelly, Peter, and Richard D. Friedman. 1999. “DNA Database Searches and the Legal Consumption of Scientific Evidence.” Michigan Law Review 97: 931–984.

Kaye, D.H. 2008. “Rounding Up the Usual Suspects: A Legal and Logical Analysis of DNA Trawling Cases.” (submitted for publication).

Kaye, Jane. 2006. “Police Collection and Access to DNA Samples.” Genomics, Society and Policy 2: 16–27.

