Tuesday, January 29, 2013
Different Computer Systems Give Different Word Counts
If your court or your Legal Writing class caps the number of words allowed in a document, this study will be of interest. Don Cruse’s “The Supreme Court of Texas Blog” has noted that different word processors count words differently. Each program has to make choices about how to count words and phrases. For example:
- Phrasal adjectives: Is “summary-judgment motion” two words or three?
- Legal citations: Is “S.W.3d” one word or two?
- Numerals: Does a pinpoint cite to a span of pages (e.g., “123-25”) count as one word or two?
- Record citations: Is a record citation like “4.RR.124-25” one word or two or three or four?
- Statutory citations: How many words is a cite to “§123.23(A)(1)(i)(a)”? Is it just one long word, or is it five very short words?
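The choices listed above can be made concrete with a small sketch. The two counting rules below are illustrative assumptions, not the actual algorithm of any word processor named in this post: one rule breaks words only at whitespace, the other treats every punctuation mark as a word break. The function names are hypothetical.

```python
import re


def count_whitespace(text: str) -> int:
    """Count tokens separated by whitespace only (hypothetical rule A)."""
    return len(text.split())


def count_alnum_runs(text: str) -> int:
    """Count maximal runs of ASCII letters and digits, so every
    punctuation mark acts as a word break (hypothetical rule B)."""
    return len(re.findall(r"[A-Za-z0-9]+", text))


# The examples from the list above.
samples = [
    "summary-judgment motion",
    "S.W.3d",
    "123-25",
    "4.RR.124-25",
    "§123.23(A)(1)(i)(a)",
]

for s in samples:
    print(f"{s!r}: whitespace={count_whitespace(s)}, "
          f"alnum_runs={count_alnum_runs(s)}")
```

Under these two assumed rules, the same record citation counts as one word or as four, which is the kind of disagreement the experiment below turns up.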
To illustrate, Cruse conducted an experiment:
I lifted roughly a page and a half from a recent appellate brief. I put this text into its own clean word-processing file and made a few tweaks to the typography.
Survey says…
Here are the word counts from four word processors I had at my fingertips:
| Word processor | OS | Word Count |
| --- | --- | --- |
| Microsoft Word 2011 | Mac OS X (10.8) | 363 |
| LibreOffice 3 | Linux (Ubuntu) | 364 |
| WordPerfect X5 | Windows (XP) | 380 |
| Pages | Mac OS X (10.8) | 405 |
What led to the huge gap between the lowest count (Word) and the highest count (Pages)? It turns out that Pages uses an algorithm that treats an abbreviation like “4.RR.125-26” as being four words. Yes, four. Pages sees imaginary word breaks in places that I do not.
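To put the size of that gap in perspective, here is a quick arithmetic sketch using only the four counts reported above. The variable names are illustrative, and the percentages describe this one page-and-a-half sample; whether the same ratio would hold across a full brief is not something the experiment establishes.

```python
# Word counts reported in Cruse's experiment.
counts = {
    "Microsoft Word 2011": 363,
    "LibreOffice 3": 364,
    "WordPerfect X5": 380,
    "Pages": 405,
}

# Use the lowest count as the baseline for comparison.
lowest = min(counts.values())

for name, count in counts.items():
    extra = count - lowest
    percent = 100 * extra / lowest
    print(f"{name}: {count} words (+{extra}, {percent:.1f}% over the lowest)")
```

On this sample, Pages reports 42 more words than Word, roughly 11.6% more for identical text.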
(ljs)
https://lawprofessors.typepad.com/legal_skills/2013/01/different-computer-systems-give-different-word-counts.html