Class of 2007: A More Extreme Bi-Modal Distribution

NALP just published its 2007 edition of Jobs & JD's. One topic of interest to students, lawyers, law firms, and legal educators is the change in salary distribution from 2006 to 2007.  The now famous 2006 bi-modal distribution was vivid evidence that the U.S. legal profession is undergoing significant structural change.  As shown in the graph below (from this NALP webpage entitled "Another Picture Worth a 1,000 Words"), the underlying stressors are even more pronounced for the class of 2007. 


The sample is based on 23,337 law school graduates from the class of 2007 who reported salary information.  Note, however, that 197 ABA-Accredited law schools graduated 43,518 students in 2007.  Although we know the types of jobs taken by 40,416 grads, only 57.7% of this group provided salary information.  If I had to wager on the direction of underreporting, I would predict it was under-inclusive of graduates with lower salaries and those who did not pass the bar.  Why?  Aside from the human psychology that it is easier to share flattering rather than embarrassing information, the roughly 7,500 jobs under the second mode are fairly close to figures I have seen from ALM and NALP data, which are provided by large law firms rather than individual students. See, e.g., charts in this NLJ article.
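The coverage figures above are easy to check. Here is a minimal sketch in Python that uses only the counts quoted in this post; the variable names are mine, not NALP's.

```python
# Counts quoted in the post for the class of 2007.
graduates = 43_518        # graduates of ABA-accredited law schools
known_job_type = 40_416   # graduates whose job type is known
reported_salary = 23_337  # graduates who reported salary information

# Share of graduates with a known job type who also reported a salary.
share_of_known_jobs = reported_salary / known_job_type

# Share of the entire graduating class represented in the salary graph.
share_of_all_grads = reported_salary / graduates

print(f"Reported salary, as a share of known jobs: {share_of_known_jobs:.1%}")
print(f"Reported salary, as a share of all grads:  {share_of_all_grads:.1%}")
```

The first ratio reproduces the 57.7% figure. The second shows that the graph covers only about 53.6% of the full graduating class, which is why the direction of any reporting bias matters.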

This bias, however, is not necessarily good news.  In the above graph, 32.5% of the law graduates took jobs with starting salaries in the $100K+ range; but the true percentage for the class of 2007 is probably lower.   Some facts and then one normative observation.  The facts first:

  • 91.9% of 2007 graduates were employed 9 months after graduation, which compares favorably to 2006 (90.7%), 2005 (89.6%), 2004 (88.9%), and 2003 (89.0%).  I would like to believe these numbers are trustworthy. 
  • 76.9% were in jobs that required bar passage.  [It would be useful to disaggregate the jobs in the remaining 23.1% of law school graduates.  Who are these students?  How many entered law school with no intention of practicing law? ]
  • The median salary in the above distribution is $65,750; the mean is $86,396.  But these measures of central tendency are not reliable guides to future earning power.
  • 38% of all starting full-time salaries were less than $55,000 per year, including 18% of all jobs in private practice, 27.5% in business, and 70.0% in government (excluding judicial clerkships).
  • 79.6% of law firm jobs in NYC, 80.3% in Washington DC, and 74.9% in Boston were in firms with 100+ lawyers.  Even in Indianapolis, 50.4% were in 100+ lawyer firms.  Wow! Those are big numbers.
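To see why the mean and median can mislead in a two-humped distribution, consider a toy example. The salary counts below are hypothetical, chosen only to mimic the shape of the graph; they are not NALP data, and the helper `share_near` is my own illustration.

```python
from statistics import mean, median

# HYPOTHETICAL salary counts (salary -> number of graduates), invented
# purely to produce a bi-modal shape. These are NOT NALP figures.
counts = {
    40_000: 15, 45_000: 15, 50_000: 15,   # first cluster
    65_000: 4, 70_000: 4, 85_000: 5,      # sparse middle
    100_000: 5, 125_000: 5, 145_000: 5,
    160_000: 27,                          # second spike
}
salaries = [salary for salary, n in counts.items() for _ in range(n)]


def share_near(center, width=10_000):
    """Fraction of salaries within `width` dollars of `center`."""
    return sum(abs(s - center) <= width for s in salaries) / len(salaries)


avg = mean(salaries)
mid = median(salaries)

print(f"mean   = {avg:,.0f}, share within $10K: {share_near(avg):.0%}")
print(f"median = {mid:,.0f}, share within $10K: {share_near(mid):.0%}")
print(f"share within $10K of $45K:  {share_near(45_000):.0%}")
print(f"share within $10K of $160K: {share_near(160_000):.0%}")
```

In this made-up sample, roughly a tenth of graduates earn within $10,000 of either the mean or the median, while nearly three quarters sit near one of the two clusters. The "typical" salary suggested by either summary statistic is one that few people actually receive.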

See also NALP Press Release, July 24, 2008. On the normative front, I have a simple thesis:  the bi-modal distribution is bad for students, bad for law firms, bad for clients, and bad for law schools.  [When I showed the 2007 distribution to one law school dean, she shielded her eyes!]:

  • Students.  It is bad for students because at $160,000 per year, many corporate clients will ask that you not be assigned to their matters.  And if your initial work experience is document review, a $160K job can quickly become a dead-end because your skill set is not growing with your billing rate (avg. 1st yr billing rate in a $160K+ firm is $225 to $255/hr).   So the atmosphere among associates at $160K+ firms is probably becoming more competitive.  It would be better in the long run to start at $95K, learn your craft, and become a great lawyer who commands top dollar.  And young lawyers should think long run.
  • Clients.  This is bad for clients because the short term solution of requesting only midlevels and partners will eventually constrict the supply of incoming legal talent.  When clients and law firms try to externalize the cost of mentoring and training--here I mean observation, contact, and feedback from partners and clients--associates are more likely to leave.
  • Law Firms.  Actually the bi-modal distribution is only bad for firms trying to keep pace with the Am Law 200 salary pay scale.  In contrast, boutiques and organizations like Axiom will find general counsel more interested in their value proposition. For Am Law 200 firms, the difficulty is getting partners to commit themselves to the future of the firm by spending more time and money investing in associates.  This will reduce attrition and protect the brand.   But the $160K+ cost structure provides partners with strong incentives to bill hours rather than investing in the long term future of the firm.
  • Law Schools.  The economics of the bi-modal distribution take the pressure off elite law schools--indeed, they can raise tuition! Thus, for many law professors, the best outcome is lateraling into a Top 15 law school.  But more/better law review articles--a precondition of a lateral offer--are not going to solve the difficult institutional problems of lower ranked schools.  Now more than ever, all law faculty members need to understand the structural shifts taking place in our profession.  When faculty at Harvard and Yale ignore these changes, it does not mean that these changes are not important.  It just means that Harvard, Yale, et al. are not affected.

I don't have any solutions to these issues, though I did write up some useful insights in my prior post, "Part II: How law firms misapply the 'Cravath System.'" Our situation reflects difficult collective action and coordination/signaling problems.  For example, how a firm gracefully bows out of the salary wars is an immensely difficult problem.  I do think, however, that permitting nonlawyer investment would provide law firms with the financial wherewithal (and psychological courage) to experiment with more innovation.  And that would be good.  Larry Ribstein's scholarship is now more timely than ever.  See, e.g., here and here.

When I was an interim associate at Sidley & Austin the summer after the 2000 salary wars, a partner told us that "we are all going to hell" based on the jump in salaries from $95K to $125K.  I now worry that he may have been right.


As with your previous post, it's misleading to present this data with a continuous line. For one thing, the data is inherently discontinuous. For another, were one to ignore the discontinuity, one would need to explain the underlying economics of why there is, e.g., a continuous range of salaries on either side of the $165K mark instead of a singularity. This data would be more accurately presented with discontinuous points, or as a bar graph.

Posted by: A.J. Sutter | Aug 2, 2008 9:21:09 AM

The objection to a continuous curve seems to me to be nitpicking (and it's NALP's graph). The point of the graph is the bi-modality, which we'd see even if this were a bar chart. So as to the point Bill is making, it's not misleading. It might be misleading if it were being used to prove something about a cluster of salaries around the spikes, but it seems beyond question there are two spikes.

For example, I'd guess that salaries are largely set in $5,000 increments. It looks like there'd be a bar at $145K and one at $155K and then one at $160K, but it would tell us the same thing about the some 17% of law grads who start at $160,000.

Posted by: Jeff Lipshaw | Aug 2, 2008 12:33:47 PM

One other thought. This reminds me of the creative use of scale in graphic presentation, and that doesn't matter as between histograms, line graphs, or anything else. For example, when you wanted to demonstrate to the board how your performance over the last eight quarters versus the competition wasn't really that bad, you used scale on the Y axis in astronomical proportions, so as to minimize the appearance of any difference. On the other hand, when you wanted to laud your .1 percent improvement in safety performance, the Y-axis was denominated in tiny fractions, so as to make it appear there had been a quantum jump.

So the real misleading presentation here would be to make the Y axis measurement go up to 100% which would flatten out the two modes compared to the rest of the graph.

Posted by: Jeff Lipshaw | Aug 2, 2008 12:42:50 PM

I disagree that it's nitpicking, especially for an audience enamored of law and economics.

First of all, it messes up the normalization of the graph. Consider the interval between [$160K, $170K], where the value of ordinate is 10% or more throughout. Now divide that space on the X-axis into 10 equal subintervals. According to this graph, you would get in excess of 100% of salaries in that interval alone. Using a larger number of subintervals just makes things worse. So the graph is nonsense if you interpret it at face value.

Second, suppose we ignore the normalization problem and consider the ordinate as representing an absolute number rather than a percentage. Then the graph presents some interesting economic phenomena to be explained. For example, why are so many people earning just a tad less than $165K, and why are even more earning just a tadlet less, etc., when there is a peak at $165K? And what about the people earning more, and then just a little bit more than that? What sort of market forces are at work here that support such a continuous range of salaries around a peak? You might protest: oh come on, don't take it so literally. But then how am I supposed to know which values of the abscissa represent actual values of salaries, and which are just artwork? Continuous lines can be legit as guides for the eye, but in that case, the actual data points should be shown. Bottom line, this graph is impossible to interpret in any consistent way.

The root of these problems is pretending discontinuous data are continuous. The same pretense and similar (or even worse) problems routinely arise and are routinely ignored in economics. Often they're hand-waved away with notions like "aggregation of individual demand curves" etc. But the hand-waving doesn't actually solve the problems, as was acknowledged even by Gerard Debreu. See, e.g., Nicholas Georgescu-Roegen, _The Entropy Law and the Economic Process_ (Harvard UP 1971); Alan Kirman, "Demand Theory and General Equilibrium: From Explanation to Introspection, a Journey down the Wrong Road," in P. Mirowski & W. Hands, eds., _Agreement on Demand: Consumer Theory in the Twentieth Century_ (Duke UP 2006); and M.F.M. Osborne, _The Stock Market and Finance From a Physicist's Viewpoint_ (Crossgar Press 1996). (See also Edward Tufte's books on data display.)

If you're going to invoke economic explanations and throw around fancy terms like "bimodal distribution," then you should have the requisite rigor. Apropos of board meetings, I'd argue that relying on such a graph wouldn't be discharging the duty of care (though they might be saved by the fact that "ordinarily prudent people" are so innumerate). But law profs are supposed to be more critical about such things than execs. The fact that NALP is promulgating these graphs is no excuse; no one's forcing anyone to use this graph. I do agree, though, that Bill and NALP have lots of company in giving these matters short shrift.

Posted by: A.J. Sutter | Aug 3, 2008 8:10:44 AM

There should be another response of mine to Jeff preceding this one -- I hope it will show up eventually. But just to be more explicit about Jeff's point that "[t]he point of the graph is the bi-modality, which we'd see even if this were a bar chart[; so] as to the point Bill is making, it's not misleading": Since the graph isn't normalized, it can't possibly represent what it purports to represent. Sure it shows a line with two humps. But of what is this a "bi-modality" if the graph is meaningless? You can claim that it shows that a bunch of people got paid about so much, and a bunch got paid another figure, but to do that you have to ignore at least the caption on at least one of the axes. And if you actually tried to reconstruct the data from the graph, you'd find it hard to do. So one needs to willfully (or, I suppose, negligently) misread this graph to give it the interpretation that's claimed for it.

Posted by: A.J. Sutter | Aug 3, 2008 9:48:21 AM

Correction to Aug 3 11:10:44 post - "Now divide that space on the X-axis into 10 equal subintervals" confuses points and (sub)intervals. I meant to say pick 10 points within the [$160K, $170K] interval. In fact, these don't have to be equally spaced. If you add up the ordinates for these 10 points, you'll get > 100%. Similarly, if you use a larger number of points within the interval (rather than a larger number of subintervals), the problem gets worse.

Posted by: A.J. Sutter | Aug 3, 2008 6:00:18 PM


Get a life Poindexter. Anyone with half a brain can see the purpose of the graph.

Posted by: S.O. Swift | Sep 5, 2008 2:28:22 PM
