Saturday, November 20, 2010
Federal Rule of Evidence 901(b)(5) provides that the identification of a voice, "whether heard firsthand or through mechanical or electronic transmissions or recording," may be established by opinion testimony that is "based upon hearing the voice at any time under circumstances connecting it with the alleged speaker." So, can an officer authenticate a defendant's voice on wiretapped telephone conversations based upon the officer listening to an approximately fifteen-second voice exemplar at least fifty to sixty times? According to the recent opinion of the Seventh Circuit in United States v. Cruz-Rea, 2010 WL 4628670 (7th Cir. 2010), the answer is "yes," despite the absence of any empirical evidence on the reliability of voice identifications.
In Cruz-Rea, Rosalio Cruz-Rea appealed his convictions for conspiracy to possess with the intent to distribute more than five kilograms of cocaine and possession with intent to distribute 500 grams or more of cocaine. In large part, these convictions were based upon
twenty-four wiretapped telephone conversations, including a conversation in which Cruz-Rea offered to sell cocaine that was "good for the frying pan" and a conversation in which Cruz-Rea discussed his plan to ship cocaine to Indianapolis via a car hauler carrying a Ford Explorer.
Officer Marytza Toy was the only witness who testified that she recognized Cruz-Rea as the speaker in each of the twenty-four recorded conversations. As noted, Officer Toy's familiarity with Cruz-Rea's voice came from listening to an approximately fifteen-second voice exemplar at least fifty to sixty times.
In finding that Officer Toy provided proper authentication under Federal Rule of Evidence 901(b)(5), the Seventh Circuit noted that
"We have consistently interpreted this rule to require that the witness have only 'minimal familiarity' with the voice....Once the court admits voice identity testimony, opposing counsel may cast doubt upon the witness' opinion through cross-examination, additional testimony, or other evidence....It is ultimately the trier of fact's responsibility to determine the accuracy and reliability of the identification testimony, and when reaching its determination, the trier of fact may consider circumstantial evidence that tends to corroborate or contradict the identification."
The court thus could not
"say as a matter of law that the low bar of minimal familiarity was not met in this case. Officer Toy testified that she became familiar with Cruz-Rea's voice by listening to an approximately fifteen second voice exemplar at least fifty to sixty times. Officer Toy then identified Cruz-Rea's voice on twenty-four wiretapped telephone conversations....Two different witnesses testified to having these exact conversations with Cruz-Rea on the telephone. Although neither of the two witnesses offered any voice identification testimony in court, their corroborating testimony tends to establish the accuracy of Officer Toy's voice identification. Given the length of the voice exemplar and the number of times that Officer Toy listened to the exemplar, the district court did not abuse its discretion in determining that the government had laid sufficient foundation for Officer Toy's voice identification testimony under Rule 901(b)(5)....The accuracy and reliability of the testimony was a question for the jury to weigh, and the court properly admitted the corroborating testimony to aid the jury in this role....We stress, however, that we arrive at this conclusion without the benefit of empirical evidence on the reliability of voice identifications, and as previously cautioned by this court in Jones, we can imagine a case in which the foundation for the voice identification testimony was so flimsy as to be deemed insufficient...."
So my question is: Where is the empirical evidence? There is a great deal of research out there about the inaccuracy of eyewitness visual identifications, but I'm not aware of any studies about the effectiveness (or ineffectiveness) of voice identifications. My inclination is to believe that such studies would show that a standard of minimal familiarity is insufficient, but I would love to see the actual results.