
June 16, 2011

Converting and Correcting Bulk-Distributed State Code Text into Well-Formed HTML: Hershowitz on His California State Code Project

Ari Hershowitz discusses the process he went through to turn California's statutory code text into structured HTML with hyperlinked internal references after downloading the code titles the State makes available via FTP. He notes that some of the text errors he found were probably produced by the State's own conversion of the text from print to electronic format. "With almost all legal research now being done electronically, I think it's reasonable to expect official government electronic sources that can be relied upon." Quoting from Cleaning Up California Law: Errors in online sections.

That, however, would require that "primary legal materials, and the methods used to access them, should be authenticated so people can trust in the integrity of these materials." Moreover, the work Hershowitz did to convert the distributed text into well-formed HTML would have been far less strenuous had technical standards for document structure, identifiers, and metadata been implemented. See LAW.GOV's Principles and Declarations.
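
To make the conversion task a little more concrete, here is a minimal Python sketch of one step it involves: wrapping a section of plain statutory text in HTML and hyperlinking its internal references. This is not Hershowitz's code. The reference pattern ("Section" followed by a number), the `sec-` anchor scheme, and the function names are all assumptions made for illustration, and real code text would need many more patterns and hand-checked exceptions.

```python
import html
import re

# Assumed pattern: references written as "Section 1234" or "Section 1234.5".
# Real statutory text uses many more forms (ranges, subdivisions, other codes).
SECTION_REF = re.compile(r"\bSection (\d+(?:\.\d+)?)\b")


def link_sections(text: str) -> str:
    """Escape plain text, then wrap each section reference in an anchor tag."""
    escaped = html.escape(text)
    return SECTION_REF.sub(
        lambda m: f'<a href="#sec-{m.group(1)}">{m.group(0)}</a>',
        escaped,
    )


def section_to_html(number: str, text: str) -> str:
    """Render one section as a paragraph whose id other sections can link to."""
    return f'<p id="sec-{html.escape(number)}">{link_sections(text)}</p>'


if __name__ == "__main__":
    print(section_to_html("10", "As provided in Section 1234.5, fees apply."))
```

Without a shared standard for identifiers, every pattern like the one above has to be guessed from the text itself, which is where much of the effort goes.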

Hershowitz has written about the conversion process he used in the following very interesting Tabulaw blog posts:

Hat tip to Free Government Information blog. [JH]

June 16, 2011 in Electronic Resource, Gov Docs, Information Technology
