eSchool News

Should student test scores be used to evaluate teachers?

Debate rages over using the 'value-added' model in measuring teacher effectiveness

Teachers who lead students to achievement gains in one year or in one class tend to do so in other years and other classes, the report said.

The so-called value-added model is an “imperfect, but still informative” measure of teacher effectiveness, especially when it is combined with other measures, according to the preliminary results of a large-scale study funded by the Bill and Melinda Gates Foundation. The study’s early findings have ratcheted up the debate over whether student test scores should be used in evaluating teachers—and if so, how.

The report, entitled “Learning About Teaching: Initial Findings from the Measures of Effective Teaching Project,” reportedly gives the strongest evidence to date of the validity of the value-added model as a tool to measure teacher effectiveness.

The $45-million Measures of Effective Teaching (MET) Project began in the fall of 2009 with the goal of building “fair and reliable systems for teacher observation and feedback.”

Teacher quality is important, but “teacher evaluation is a perfunctory exercise,” the report said. Principals tend to go through the “motions” of evaluation, and “all teachers receive the same ‘satisfactory’ rating.” The study aims to fix this “neglect” and help devise a system that gives teachers the feedback they need to grow, the report said.

What is the value-added model?

Value-added is a controversial statistical method that relies on test-score data to determine a teacher’s effectiveness. Each student’s performance on past standardized tests is used to predict how he or she will perform in the future. Any difference between the student’s projected result and how the student actually scores is the estimated “value” that the teacher has added or subtracted during the year.

The value-added model is thought to bring objectivity to teacher evaluations, because it compares students to themselves over time and largely controls for influences outside teachers’ control, such as poverty and parental involvement.
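The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the model used in the MET study: it assumes a single prior score per student and a simple straight-line prediction, whereas real value-added models adjust for many more factors. The function names, teacher labels, and scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def value_added(records):
    """Estimate each teacher's value-added.

    records: list of (teacher, prior_score, current_score) tuples.
    Returns a dict mapping teacher to the average gap between each
    student's actual score and the score predicted from the prior one.
    """
    priors = [r[1] for r in records]
    currents = [r[2] for r in records]
    a, b = fit_line(priors, currents)  # prediction rule shared by all students

    gaps = defaultdict(list)
    for teacher, prior, current in records:
        predicted = a + b * prior
        gaps[teacher].append(current - predicted)  # actual minus projected
    return {teacher: mean(g) for teacher, g in gaps.items()}


# Hypothetical data: (teacher, last year's score, this year's score)
records = [
    ("Teacher A", 60, 70),
    ("Teacher A", 80, 88),
    ("Teacher B", 60, 62),
    ("Teacher B", 80, 80),
]

scores = value_added(records)
for teacher, score in scores.items():
    print(f"{teacher}: {score:+.1f}")
```

With these made-up numbers, Teacher A's students score about four points above their projections and Teacher B's about four points below. The sketch also shows why the estimate is sensitive: anything that moves a student's score in a given year, whatever its cause, lands in the teacher's number.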

Value-added has been a buzzword among educators since the Obama administration’s “Race to the Top” grant program began promising money to school systems that adopt certain requirements, such as evaluating teachers’ performance using factors like student achievement.

Critics of the value-added model fear school leaders might make serious decisions about individual teachers based on these projections alone.

“This is a problem with value-added,” said Raegen T. Miller, associate director for education research at the Center for American Progress. “So far, value-added has been on its own. People are very scared that administrators would start making serious decisions about individual teachers just based on that information—and nobody thinks that should be done. It doesn’t take away people’s fear of it. We can write all we want that we should use multiple measures; now, we actually [should] start having multiple measures.”

That’s the good news that comes from the preliminary findings of the MET Project. Based on these findings, researchers recommend that school leaders use multiple measures, in addition to value-added, to evaluate teachers’ effectiveness.

Comments:

  1. the_hill1962

    January 12, 2011 at 12:51 pm

    The statement, “Indeed, we do find that a teacher’s value-added [result] fluctuates from year to year and from class to class, as succeeding cohorts of students move through his or her classrooms” hopefully will come under fire.
    In the article “Bad teachers”, I pointed out the fact that some good teachers may just be given a bad schedule.
    Some teachers only have one subject to prepare for. Some teachers have to prepare for 8 subjects. Certainly, a much better lesson can be developed for the students when only one lesson has to be written. Further, many schools now give teachers two preparation periods while some only have one preparation period.
    This article listed “Reasons for instability from year to year could include factors such as significant differences in class size from year to year, an influenza outbreak, a group of disruptive students, construction noise during testing, and so on”. The “and so on” is what concerns me. There are so many reasons.
    Most often, a teacher has no control over what his/her schedule is going to be. Sure, this is true for most professions. Probably the worst case is that of an Emergency Room doctor. I wonder if they evaluate E.R. doctors with a similar value-added model concept?

  3. Judith Naylor

    January 12, 2011 at 3:46 pm

    It really depends on what you mean by evaluate. If you mean punish or penalize, the tool is not effective. If you mean teach or support, the use of data can be an effective tool. The assumption seems to be that student success or failure is based on a given set of tests and that the teacher just needs to teach the ‘stuff’. This assumption is flawed because student success is based on many facets both in and out of school. The teacher alone cannot change the home that the students are coming from. The teacher alone cannot change administrative decisions which negatively impact what goes on in the classroom. Using statistics is essential to improving education. But the statement that value-added testing “compares students to themselves over time and largely controls for influences outside teachers’ control, such as poverty and parental involvement” is ludicrous. A student who does poorly in a first test and comes from a non-supportive home and school is not going to show improvement on a subsequent test no matter how much work a teacher puts in with that child. Don’t use the data to punish. Use the data to examine all circumstances and work as a community to focus on the problems the student is facing.

  5. Jutti

    January 17, 2011 at 5:01 pm

    One major problem with value-added is that it doesn’t take into account that some years students have outside factors such as illness, family problems, etc. that affect their performance on the state tests that are given over a week’s time. What about the child who has had health problems and missed 40 days of school that year? What about the child whose parents are undergoing a messy divorce? What about the child whose older brother was murdered and the whole family is in a tailspin? What about the child who comes to school on one of the testing days totally exhausted because her apartment building was evacuated by police due to a hostage situation?

    Value-added does not take into consideration all the things that are beyond a teacher’s (and often a family’s) control. This can change from year to year. A child who had a stable home life last year may not have one this year. My last year’s class had some of the worst attendance I’ve ever seen. There were only 27 school days in the ENTIRE YEAR that I had 100% of my students present. Kind of hard to teach children who aren’t present. My district has no program to deal with attendance problems.

    How about spending the money the Los Angeles Times spent on value-added to get students to school and to hold parents accountable for their children? Bill and Melinda Gates, why don’t you put your money where it will do some good? Family support services, health care, etc.

  7. gmonohon

    January 17, 2011 at 5:47 pm

    Previous comments address many variables within a classroom that affect test scores. There is another aspect that must be considered. With schools now focusing on pupil progress as Professional Learning Communities (PLC), students are no longer the “property” of one classroom teacher, but are taught by a number of different individuals in various capacities. Their progress is discussed collaboratively by teams of teachers at grade level or departmental meetings, by auxiliary personnel on student study teams, with parents at parent-teacher conferences; instructional decisions made at these sessions affect the student’s learning. So does the additional instruction given in RTI by other personnel, extra help from a Special Ed teacher, after-school tutoring, and SES by independent private company instructors. Then there is the help that a concerned parent may give the child at home.

    After considering all of the persons who are involved in “teaching” the child, to whom can we attribute success? The classroom teacher may be ineffective with this child, and test score progress being made should be attributed to other “teachers.” Who can say for sure? Test score data will only tell us how the child did on the test, but not necessarily what the child knows, or which teacher is responsible for the knowledge having been absorbed.

    There are many other measures of a teacher’s effectiveness, obtained through regular observations by trained evaluators, that are much more valid in measuring teacher quality. Test scores should be used only as one imperfect indicator of a teacher’s effectiveness.

  9. oekosjoe

    January 17, 2011 at 9:57 pm

    For years the Gates Foundation financed the Standard & Poor’s ranking of school districts by gain scores. They figured, logically, that districts that had the highest gains between grades 7 and 10 might actually be doing the best.

    Like other S&P metrics for private sector clients, this one didn’t work. In one particular district, which, for nearly a decade, had the highest gains in the state, the high school routinely held back its lowest 25% in achievement, supposedly because “they weren’t prepared.” In fact, the retention was to make the Principal and Superintendent the highest gain score earners in the state. When, after ten years of torturing 100 kids a year with an extra year of school, for no clear educational purpose, I confronted them, calling them cheats, liars and frauds, they coincidentally retired – the Principal, the Superintendent, and the Director of Guidance.

    Two years later I joined the school’s Council, the official review committee responsible for the school’s School Improvement Plan. At my first meeting, the new Guidance Director presented her portion of the plan, and I asked about that retention policy. “I’m so glad you asked,” she answered brightly, and outlined a tutoring and counseling alternative that had reduced that 25% – or 100 kids – to fewer than 50 in a single year. When we went to the School Committee with their plan, the Chair, remembering my outré scene two years earlier, asked if I had a comment: “This administration has reduced retention by 12% and may well reduce it further. They’ve saved you – and the students – 50 extra years of schooling for 50 kids who will now graduate in four, not five years. That represents $500,000 or more in actual cash savings, as well as many fewer dropouts. How much will you give them for further innovations and more savings?”

    Gain scores without clear correlations to age, to continuous progress, and to timely interventions are a fraud. Gates no longer funds S&P. Perhaps they will finally realize the futility of their metrics.

  11. cskrzypchak

    January 18, 2011 at 9:08 am

    The question I always raise (being an exploratory teacher) is how do you use test scores to evaluate PE, Music, Computer, Art, Foods, Industrial Tech, etc. teachers? While you can create “standardized tests” for these subjects, the reality is that what you want students to learn is to create; to do, not recall facts.

  13. msrobins

    January 24, 2011 at 7:35 pm

    The real questions are, “Should test scores be used to evaluate parents? What sanctions should be taken against parents when their children underperform on tests?”

  15. monty

    January 25, 2011 at 8:49 am

    If anyone new still comes to read this, you should read Jesse Rothstein’s critique of the Gates study. The great body of evidence makes clear that value-added is far closer to being valueless addition. In addition to numerous technical flaws (including that it actually cannot adequately adjust for variations in students) that make it unreliable, it is based on the same old tests and its use in judging teachers will only intensify narrowing curriculum and teaching to the test. Rothstein’s review of Gates is at Rothstein, Jesse. 2011, January. Review of “Learning About Teaching.” National Education Policy Center. http://nepc.colorado.edu/thinktank/review-learning-about-teaching

  17. Dpierce

    January 25, 2011 at 5:18 pm

    Thanks, monty. You’re right on about the Jesse Rothstein analysis. This story was published before Rothstein published his analysis, but we’re working on a follow-up story that includes his criticisms.

  19. msbsinvegas

    January 28, 2011 at 4:54 pm

    As an educator, if I am to be judged on test scores, then I should be able to select my students. There are so many factors that affect “student success” that so-called “value-added testing” cannot begin to take into account what does and does not affect student success. In the classroom of today, weeks are spent covering the same basic topic in an effort to ensure that every student “gets it” – well enough for the teacher to pass the test! What of those students who “get it” the first day or two – then spend weeks suffering through repeated lessons on the same material?
