Getting to this point in the semester, we’re all stressing out about huge exams that are worth 20, 25, or even 30% of our entire grade. The larger the percentage of our grade an exam counts for, the more we stress out. That’s a natural response to something that everyone in college, and the people who will hire us afterward, focuses on: high GPAs and test scores. The question I constantly ask myself is this: is high-stakes testing pushing us to our very best potential, or is it causing more stress and making us focus on testing rather than actually learning the subject and applying it to the rest of our lives? That learning is what education is supposed to be all about, but has it lost its way?
Looking at this from the perspective of science, we have to have a hypothesis that we can test. The null hypothesis is that testing does not affect us, something we know not to be true. For better or worse, testing will always affect us in one way or another. So, we’re left with two alternative hypotheses:
Either testing makes us worse off than before, which we can test with things like stress levels, performance on the tests, and differences in how countries handle testing. We’ll call this alternative hypothesis #1.
or
Testing makes us better off than before, which we can test with nationwide test-score performance, achievement gains since implementation, or how current scores compare to the old ones from before we changed our testing policies. We’ll call this alternative hypothesis #2. (A rough sketch of how such a comparison could be tested is shown right after this list.)
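To make the framing concrete, here is a minimal sketch of how a before-and-after score comparison could be tested. The score lists are completely made up for illustration, and the two-sample t-test (SciPy’s `ttest_ind`) is just one standard way to check the null hypothesis; Jacob’s actual analysis is much more involved than this.

```python
# A minimal sketch of the hypothesis framing, using made-up score data.
# "before" = scores under the old policy, "after" = scores under high-stakes testing.
# Null hypothesis: the testing policy has no effect (mean difference is zero).
# Alternative #1: students do worse after; alternative #2: students do better after.
from scipy import stats

before = [48, 52, 55, 60, 47, 53, 58, 50, 49, 54]   # hypothetical test scores
after  = [51, 57, 59, 62, 50, 55, 61, 53, 52, 58]   # hypothetical test scores

# A two-sided t-test asks: can we reject the null hypothesis at all?
t_stat, p_value = stats.ttest_ind(after, before)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# The sign of the mean difference tells us which alternative the data lean toward.
diff = sum(after) / len(after) - sum(before) / len(before)
direction = "alternative #2 (better off)" if diff > 0 else "alternative #1 (worse off)"
print(f"mean difference = {diff:.1f}, data lean toward {direction}")
```

If the p-value is small enough, we reject the null hypothesis, and the direction of the difference tells us which alternative the data favor.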
Background: Before we get into the study, which was done by Brian Jacob in 2002, I first want to go through the previous studies that Jacob reviewed to see where they agreed or conflicted with his own. The first thing that should be mentioned is that the majority of exams measured in studies like this are high school graduation exams. This makes sense, considering that these are high-stakes for the students (doing well might help them get into college) and also draw from a larger pool of students than the college level does. Interestingly, Jacob found studies reporting a positive association (Bishop 1998, Frederiksen 1994, Neill 1998, Winfield 1990), but also found a study showing no increase in achievement (Jacob 2001). So, basically, he found conflicting studies, which means either of our alternative hypotheses could be correct. There were about 8-12 other studies that Jacob looked at before conducting his own research, and these too proved to be a mixed bag. Just like Andrew has often said to us in class, science and studies just don’t seem to agree with each other and make things easy. The question is, which studies are wrong and which are right?
Before we get into the study, I wanted to share this video about the testing standards Jacob studied, and why many view them as a failure.
The Procedure: In 1996, high-stakes assessment was implemented in Chicago. The policy focused on holding students accountable for learning, mostly by ending the practice of “social promotion,” under which students would advance to the next grade no matter what. If students failed to perform to Chicago’s new standards in math and reading, they were held back a grade until the standards were met and were placed into six weeks of mandatory summer school. Focusing on first-time test takers in this low-income, mostly African American district, Jacob looked to see whether their scores increased over time under the new policy. The independent variable in this scenario is the new testing policy, while the dependent variables are the reading, science, math, and social studies scores that the kids receive on these new tests.
The Results: This was quickly met with problems, however. Third graders who took the test failed to meet the criteria almost half of the time. On top of that, sixth and eighth graders failed 33% of the time. Keep in mind that Jacob is working from an observational study at this point. Unless you could convince a school board to let you give some kids higher-stakes tests than others, it would be incredibly difficult to run an experimental study. So Jacob is doing the best he can: looking at a low-income, low-performing area and seeing how high-stakes assessment is doing.
Obviously, looking at the data above, it’s not doing well. When 33% of your upper-level students and 50% of your lower-level students fail to meet the standards, there are two possibilities: either the standards are too high and cannot be met, or these kids haven’t been taught well for years prior. Either way, it shows our educational system is incredibly flawed. However, something drastic happened with regard to math and reading scores. Beginning in 1993, math test scores began to increase steadily each year. This may not be due strictly to the implementation of the new policy, but rather to teachers changing how they teach in order to prepare students for the testing. Either way, both math and reading scores do seem to have increased steadily. By the year 2000, math scores were about 0.3 standard deviations higher than their 1993 counterparts.
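To get a feel for what “0.3 standard deviations higher” means, here is a rough sketch of the calculation using invented score lists (these numbers are not Jacob’s data): the gain is just the difference in average scores expressed in units of the baseline spread.

```python
# A rough illustration of a gain measured in standard deviations,
# using hypothetical 1993 and 2000 math scores (not Jacob's actual data).
from statistics import mean, stdev

scores_1993 = [45, 50, 52, 48, 55, 47, 51, 49, 53, 46]   # hypothetical baseline scores
scores_2000 = [46, 51, 53, 49, 56, 48, 52, 50, 54, 47]   # hypothetical later scores

# Effect size: difference in means divided by the spread of the baseline year.
effect_size = (mean(scores_2000) - mean(scores_1993)) / stdev(scores_1993)
print(f"gain of about {effect_size:.2f} standard deviations")
```

In plain terms, a 0.3 standard-deviation gain means the average student in 2000 scored noticeably, but not dramatically, above the average student in 1993.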
More interestingly, low-achieving students (the ones used for the data above) may have done better in math and reading, but they fell farther behind in science and social studies. This raises the question: did scores really increase because of high-stakes assessment, or did schools shift their focus to math and reading because it looked better for the school district (and thus would get them more money)?
So, in conclusion, the answer isn’t as cut-and-dried as we may have hoped. We’ve seen this a lot in class, with studies like the prayer study that found no effect on mortality rates but did seem to get patients out of the hospital sooner (although that was later debunked). Although test scores increased greatly in math and reading, they went down in science and social studies. Likewise, students had to take summer school 33-50% of the time in order to advance to the next grade because standards had not been met. The question is, did the scores increase because students worked harder under tougher standards, or did the administration just shift focus toward math and reading to get better test scores, and more money, for their school?
If we had to choose between alternative hypothesis #1 and alternative hypothesis #2, and didn’t choose to stick the result in a file drawer, we’d have to go with #2, since math and reading scores increased significantly. You can argue that the decrease in science and social studies scores is temporary and will reverse over time as focus shifts back to those subjects once math and reading have improved. Likewise, the rise in students not meeting standards may just reflect an adjustment period after the standards changed. Therefore, I would take this conclusion with a grain of salt. However, even if the increase in test scores may be due to a bunch of different reasons, the policy did increase test scores, which is exactly what it was meant to do.