December 8, 2009
Google SLOJ Details Emerge on Law Librarian Blog Talk Radio
Rich Leiter and company did an interview with Google engineer Anurag Acharya on the Law Librarian Blog Talk Radio last Friday. The hour-and-a-half broadcast was a fascinating look into Google Scholar Legal Opinions and Journals (SLOJ). Acharya is the driving force behind Scholar, which he described as a sabbatical project rather than something Google had expressly planned out. His major interest, like that of any technical employee at Google, is search. He wanted to see how search could be tweaked to raise the level of scholarly documents in search results. That basic idea spawned Google Scholar.
We did learn some new information about legal opinions in Scholar. One is that the case law database is licensed from a major legal information vendor, whom Acharya could not name. The suspicion is West, if for no other reason than that there have been reports of West headnote numbers appearing in some of the texts. I’ve seen that myself in some of the Illinois cases I’ve found, though that is not the norm. As we say in court, suspicion isn’t proof, so the text could potentially come from Lexis or any of the other players. That was an interesting fact given the discussion of authentication.
The source of information is always important to librarians and legal practitioners. Google is an untested player in providing case information. The courts prefer books for citation, and court rules generally enforce that preference. Citation to electronic sources seems more a matter of convenience when books do not contain the cited material. So how does Google SLOJ fit into this? With text coming from a major vendor, one would expect that the same editorial standards that apply to Lexis and Westlaw would take care of the reliability issue for Google SLOJ. That issue was not explored in the context of editorial standards, or at least as to who is editing the text of a case. If it’s Google, I’m not so sure about reliability. Anyone who has used uncorrected text in Google Books should have an idea of how bad this potentially can be. If it’s the unseen hand of someone in Eagan or Dayton, well, I’d share the courts’ accepting view of the accuracy of the text. Acharya did note that the contract called for updates and additions from the unnamed vendor, so the database is dynamic. There is an assumption, then, from these circumstances that the text is likely reliable, though I would prefer a more definitive statement on it. My experience with the cases I've read on Google SLOJ reinforces this assumption, as I've found no errors to which I can point as an example of a shoddy product.
The rest of the conversation contrasted the respective points of view of legal professional and engineer on just what this product is supposed to do. We in the library business are interested in slicing and dicing this information. Google seems to be interested in merely presenting it. Acharya said that he and two other people were responsible for Google SLOJ. He said it was included in Scholar because case law is no different than scholarly articles, albeit with a bit more significance to society. We as librarians may disagree, but that view of the presentation is telling. The ranking algorithm had to be tweaked to account for court hierarchy, and, implicitly, for precedent. Other than that, the cases are information as defined in the broadest terms. The audience is everyone. We can make what we will of it depending on our needs.
That is one reason why Google limits the analytical tools we can use compared to Lexis and Westlaw. Everyone has different levels of experience with electronic texts. Ask any librarian who deals with a range of public patrons. Google designed this for people who know how to use Google at the very least, and who should be able to mine cases successfully with that level of experience. No one will see something like a citator beyond the “How cited” tab. The panel clearly hungered for a more definitive free tool that matches Shepard's or KeyCite. Don’t expect anything like that soon, if at all. Acharya pointed out some technical difficulties in doing some of the things Lexis and Westlaw do for a case opinion database. Google is not going there. He also alluded at one point to agreements he has in place that prevent him from doing certain things, such as creating an API to embed case information in third-party sites. This suggests that whoever is vending the text sees the raw case law as a commodity. The real value to a vendor is the analytical tools they provide. The contract essentially seems to be that someone provides the text at a reasonable price provided Google does not compete on features. I wonder if the DOJ has an opinion on that.
The other major question the panel explored is the addition of other legal materials such as statutes or regulations. The problem here is that these materials are updated regularly. Acharya focused on the differences between a static and a dynamic document, especially for hyperlinking purposes. Which version of the statute should be online to complement a case published 20 years ago? Google is probably not going to even try to address this, unless there is an easy answer to the problem. Keep in mind a lot of this is automated. If someone has to do significant editing in-house, it's not likely to be cost-effective for Google to provide these features. That should be a clue as to what Google will and will not provide as future enhancements. He suggested the alternative, that the information is likely going to be in the standard Google index if it exists. Use the search engine to find it. If there is enough accurate information about the other material, it should be locatable. Google has no plans to compete with Lexis or Westlaw, whether by design or contract. It is what it is.
The entire podcast is available for download here. It's worth a listen. I've hardly touched on some of the issues discussed, such as the relationship between Google and publishers. As you can imagine, it's a good one. That's one reason Hein is prominently placed in the side results. One other note: the URL to a case is not permanent, at least not yet. There are two other places with more information. One is Rick Klau's weblog. Rick is a product manager at Google and has more detail about Google SLOJ. Another is Greg Lambert's post on 3 Geeks and a Law Blog. Greg was one of the participants who asked questions on the podcast. He gives his own views of Google versus the commercial legal databases. [MG]
As a part-time practicing attorney, I have had a lot of experience with Google Scholar. It's essentially a great, free tool for looking up cases if you have the citations. I don't trust the search feature that much. It's very random, and I'm not a fan of the parameters used. But, if I have a treatise I'm going through to research a specific issue, and I find cases cited in the treatise, I'll pull up Google Scholar, type in the citation, and pull the case and read it on my computer screen. For what it is (basically a free look-up tool), I think it's a huge benefit for obtaining free, limited research.
Posted by: Online LSAT Course Instructor | Apr 18, 2011 7:02:27 AM
I noticed links to Lexis' journal articles are now showing up in the search results. Could they be the mystery case law providers? Also, I wonder how this will impact subscriber costs?
Posted by: C. Loveday | Dec 11, 2009 11:56:40 PM