Friday, July 10, 2015
With Watered Down Analysis, Eleventh Circuit Holds Florida Can Evaluate Teachers Based on Their Students' Scores In Someone Else's Course
Earlier this week, in Cook v. Bennett, 2015 WL 4086148 (11th Cir. 2015), the Eleventh Circuit held that Florida's teacher evaluation system was constitutional. A Florida statute required schools to evaluate teachers based on their students' standardized test scores, relying on what is called a value added method of statistical analysis. As demonstrated here, this method of analysis suffers from any number of problems. But putting aside the general problems, plaintiffs pointed out that not all subjects are tested every year and, thus, it is impossible to reliably evaluate the teachers of untested subjects with this system. Districts nonetheless evaluated these teachers on the test scores that students received in other subjects--math and English--or on school-wide average scores. Either way, they were evaluated based on scores in subjects they did not teach.
The Eleventh Circuit held that this was permissible under the due process clause. Applying rational basis review, the court found that although the state's value added method "was not designed to evaluate" the teachers in question,
the policies are rationally related to the purpose behind the Student Success Act itself, which is to “increas[e] student academic performance by improving the quality of instructional, administrative, and supervisory services in the public schools of the state.” Fla. Stat. § 1012.34(1)(a). The plaintiffs have failed to carry their burden to refute this justification for the law. While the FCAT VAM may not be the best method—or may even be a poor one—for achieving this goal, it is still rational to think that the challenged evaluation procedures would advance the government's stated purpose.
As the plaintiffs conceded at oral argument, Florida officials could have reasonably believed that (1) a teacher can improve student performance through his or her presence in a school and (2) the FCAT VAM can measure those school-wide performance improvements, even if the model was not designed to do so. For example, Type B teachers may have a positive impact on their students that bleeds over into the students' work in other classes, including those measured by the FCAT. Type C teachers may have a positive impact on the learning environment of the school overall. The FCAT VAM can capture such impacts either by measuring the growth of a Type B teacher's students or by measuring the growth of a school overall. It is also reasonable to think that tying teacher evaluation scores and teacher compensation to FCAT VAM scores can incentivize teachers to pursue more school-wide improvements, which would in turn improve student academic performance. Thus, we agree with the district court that the policies pass rational basis review. Without a doubt, the evaluation scheme has led to some unfair results for Type B and C teachers, but “[t]he Constitution presumes that, absent some reason to infer antipathy, even improvident decisions will eventually be rectified by the democratic process and that judicial intervention is generally unwarranted no matter how unwisely we may think a political branch has acted.” Vance v. Bradley, 440 U.S. 93, 97, 99 S.Ct. 939, 59 L.Ed.2d 171 (1979).
While rational basis review is supposed to be deferential, courts have adopted various subanalyses that can be more demanding, particularly in regard to test-based evaluations. For instance, some courts have required that tests be valid. The test at issue here, at best, could be valid in regard to teachers whose students took the tests. The notion that a test is valid in regard to teachers who did not teach the material on the test is mind-boggling.
But even on basic rational basis review, the court's analysis is problematic. It sidesteps the foregoing obvious point about validity by looking solely to the question of whether relying on this test is a rational means to achieve the state's goals. The court asserts that these evaluations are a rational way to improve student achievement because the evaluations will motivate teachers to improve their teaching to help either their students or the school's program overall. But again, how can a test that is not valid help teachers improve? That is hard to imagine.
The most an invalid test can do is scare teachers. Scaring them can theoretically make them improve, but it can also cause them to make mistakes. What if a teacher is actually improving, but the school's overall scores are going down? How is that teacher to know whether he or she needs to stay the course, try harder at doing the same thing, or do something entirely different? The lower scores, taken on their face, would suggest the teacher is doing something wrong. In fact, those lower scores would be held against the teacher and become a basis for taking negative action against the teacher. In other words, the effect of the test is arbitrary. This arbitrariness is not a factor a court can legitimately ignore. Arbitrariness is a key factor in assessing the rationality of policy. In fact, early foundational due process precedent states that avoiding arbitrary results is the very purpose of due process.