Nov 24 2006
The “Good v. Bad Test Taker” Effect
Back in September, this diatribe was burning up the edusphere:
I Know The Material - I Just Don’t Do Well on Exams
Leprechauns, unicorns, Bigfoot, the Loch Ness Monster, hobbits, orcs - and students who know the material but don’t do well on exams. Mythical creatures.
I’ve met students who claim to know the material but not do well on exams, but when you press them, it turns out they don’t know the material after all. If you can’t answer questions about the material or apply the knowledge in an unfamiliar context, you don’t know it. You might have vague impressions of specific ideas, but if you can’t describe them in detail and relate them to other ideas, you don’t know the material.
And there is yet more oblivious hand wringing in the Washington Post about the discrepancy between student grades and test scores (hat tip to Ken for the link):
Brittanie’s mother, Kay Morton, was befuddled when she opened the mail and saw the results of her daughter’s standardized math exam.
“It’s hard to understand a situation where you can have an Honor Roll student who doesn’t pass the test. She’s been an Honor Roll student since the sixth grade,” she said. “I can’t say I really hold her teacher accountable. . . . I just accepted the fact that Brittanie may not be a child that tests well.”
Again, we have the "bad test taker" rearing its head. The question: Is it a myth, as our author above suggests, or not?
The “good” v. “bad” test-taker distinction is one of those untested, unresearched bits of education “knowledge” that has been around, well, since I was in high school (and folks, that’s been a while now: I took two years of Latin in high school and we did “duck and cover” exercises in grade school complete with a fallout shelter, if that tells you anything). But is it a myth?
With no research of any kind, in all these years, to say one way or another, I’d say no, it’s unlikely to be a myth. Think about it: Are some people better at certain types of tasks — like taking standardized tests — than others? Of course. However, I think the actual "good v. bad test taker" effect is exaggerated, and I have speculative, anecdotal, and empirical reasons to support my stance.
The exaggerated effect: test questions
From the examinee’s perspective, there are three types of questions on any objective exam:
- Questions the examinee knows immediately
- Questions the examinee is not quite sure about, but believes he could answer if he spent more time on them
- Questions the examinee has no clue about whatsoever
Provided the examinee does not violate the Cardinal Rule of test-taking (always go with your first impression, or never second-guess yourself), which we will come back to, the first type of question presents no problem whatsoever. The difference, then, between the “good” and “bad” test-taker would lie in how he answers the other two types of questions.
So place yourself in the chair taking the test. How do you answer that second type of question, those questions you think you know, though you really need to think about them some more? You use the process of elimination to eliminate distractors that cannot be correct. You then analyze the remaining distractors, looking for cues that will tell you which is correct — that is, if you are a “good” test-taker.
In other words, “good” test-takers approach these questions with a more intelligent and logical thought process and better organized minds than do “bad” test-takers.
Now, assuming that this “good” v. “bad” distinction exists, “bad” test-takers will answer fewer of the second type of question correctly, and as a result will have somewhat lower scores. But the question — assuming, again, that this distinction exists, which we do not know — is how much of a difference in scores does this create. And I would argue that it would create a far smaller difference in scores than most suggest.
For one thing, we have that third type of question, the “I have no clue at all” question. Statistically, a certain number of these are going to be correct, because the examinee will fill in a bubble on the scantron and move on. This alone inflates scores, for both “good” and “bad” test-takers, and if bad test takers have twice as many questions of type three to answer, their scores will be inflated more than those of the good test takers. The third type of question, assuming (as we must for the sake of argumentation) that both groups of students have the same knowledge of the material, will not introduce any difference between the groups’ scores.
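The guessing arithmetic can be sketched in a few lines. This is only an illustration: it assumes four answer choices per question and no penalty for wrong answers, neither of which is stated above, and the function name and question counts are hypothetical.

```python
from fractions import Fraction

def expected_guess_points(num_clueless: int, num_choices: int = 4) -> Fraction:
    """Expected number of correct answers from blind guessing.

    Assumes every question has `num_choices` equally likely options and
    no penalty for a wrong answer (both are illustrative assumptions).
    """
    return Fraction(num_clueless, num_choices)

# Two hypothetical examinees: one is clueless on 10 questions, the other on 20.
few = expected_guess_points(10)
many = expected_guess_points(20)

print(float(few))   # expected lucky points from 10 blind guesses
print(float(many))  # expected lucky points from 20 blind guesses
```

Under these assumptions, twice as many blind guesses yields twice as many lucky points on average. When both groups know the same material, they face the same number of such questions, so the term is identical for both.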
However, good test-takers violate that Cardinal Rule I mentioned above more often than bad test takers, and as a result, lose points. This could be because they manage their time better than bad test-takers and have more time to look over their exams, or it could be because they (possibly) tend to be more anal. This is going to counteract part of that advantage they have.
I am skeptical of the claim that this distinction introduces any statistically significant difference in scores, because I don’t see any evidence of it in the data, as I will discuss below.
But since it would be so difficult to do any research on this — since you’d have to have two groups with pretty close to exactly the same knowledge of the material to test for the “good v. bad test-taker” effect — this is pure speculation. I do, however, have other arguments.
The exaggerated effect: self-identification and reality
Admittedly, what follows is anecdotal evidence, and since “anecdotal evidence” is an oxymoron, I offer these only as anecdotes. I often have students come to my office after exams and say that they are bad test takers, yet when I look, they have done perfectly well on the exam. I also get the reverse, students who did poorly yet claim they are good test takers. When I probe the students, I usually find that they did not study or do much work in class because they believed that being good test takers would carry their grade (test taking strategies are important, but they’re no substitute for knowing the material).
In both cases, if the “good v. bad test taker” effect were significant, students who said they were bad test takers would have done less well, and vice versa.
And examples like these are frequent. More often than not, when students come to my office to see how they did on the test, they volunteer their "test taker" status as soon as they walk in the room. Also more often than not, the student’s self-perception as a test taker contradicts reality, that is, how they did on the test. This raises an interesting question: How are students so out of touch with their test taking abilities? But since that is another topic, I’ll let it drop.
The mismatch between self-perception and reality, however, is so common that if it doesn’t call into question the existence of the "good v. bad test taker," it certainly questions the strength of the "good v. bad test taker" effect. After all, if the effect were significant, I would expect a larger number of students who identified themselves as bad test takers to have done poorly, and vice versa. Yet, that is the exception, and not the rule.
So far, I’ve given speculative and anecdotal arguments. At the moment, I can only say I suspect that the effect is exaggerated. However, I also have hard data to support my position.
The exaggerated effect: the data
Even though I suspect (suspect and not know because this is one of those educrat chestnuts that has not a gram of research to support it) that the “good v. bad test taker” effect exists, I not only suspect but know that the difference between the two is exaggerated. And I know this from analyzing grades (as opposed to the “good v. bad test taker” chestnut, which is supported by no evidence at all, though I suspect it does exist).
Semester after semester, I statistically analyzed the grades in our two semester course sequence. All exams and assignments are graded by program so all assignments and exams are graded in exactly the same way for each student. The grades are wholly based on objective criteria; there is not even one point that can be adjusted with any subjective criterion, because there is no subjective element in the grading system: No participation, no attendance, no improvement. All tests are created with a great deal of care, and before scores are recorded, the tests are statistically analyzed question by question, to catch not only poorly written questions, but also questions that do not discriminate. Grades are based solely on assignments and exams. The grading system is points-based (no curve), there are 1000 possible points, and we use a strict 90-80-70-60 grading scale. So if Jane gets 899 points, she gets a B+, even if one more point would have gotten her an A-, and there is nothing her instructor can do to affect her grade. Although there is some variation from semester to semester, we usually have a total of 2700 students in the two courses each semester.
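The strict, no-rounding scale can be written as a small function. This is a sketch only: the plus/minus cutoffs are not given above, so it returns just the base letter, and the function name is made up for illustration.

```python
def letter_grade(points: int, possible: int = 1000) -> str:
    """Map a point total to a base letter on a strict 90-80-70-60 scale.

    Plus/minus bands are left out because their cutoffs are not stated;
    there is no rounding and no curve.
    """
    # Integer comparison avoids floating-point surprises at the boundaries.
    for cutoff_percent, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if points * 100 >= cutoff_percent * possible:
            return letter
    return "F"

print(letter_grade(899))  # one point short of the A range
print(letter_grade(900))  # exactly on the A cutoff
```

The point of the integer comparison is that 899 can never drift up to the next band, which is the behavior the Jane example describes.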
We’ll say our sample size each semester is 2700 (n=2700). One of the many analyses I did semester after semester was run correlations and regressions between assignment scores and exam scores. Because assignments are not done in a testing environment — that is, students typically have a couple of weeks between the date the assignment is posted and the date it is due — the “good v. bad test taker” effect would show up in the correlation. Yet, it did not (the lowest r in all those semesters between assignment scores and exam scores was 0.8-something, and it was usually 0.9 or higher).
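For readers who want to see what that r measures, here is a minimal sketch of a Pearson correlation between assignment and exam scores. The five score pairs are invented for illustration and are not the course data; the function and variable names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up scores for five hypothetical students (not real course data).
assignment_scores = [62, 71, 78, 85, 93]
exam_scores = [58, 74, 75, 88, 90]

r = pearson_r(assignment_scores, exam_scores)
print(round(r, 2))
```

An r near 1 means students who do well on untimed assignments also do well on timed exams, which leaves little room for a separate test-taking factor to reorder them.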
This is the strongest evidence that the "good v. bad test taker" effect is exaggerated. However, it isn’t the only evidence.
I also analyzed the test scores for both classes every semester. With a sample of 2700 each semester, every test was normally distributed, with a mean in the mid-70s, exactly where it should have been. Looking at the test scores, then, if there was a "good v. bad test taker" effect there, it wasn’t significant enough to affect the distribution.
If, as I suspect, some students are better test takers than others, then why does the effect not show up in the distribution? There are three things that could be going on.
- There are so few of both types that neither significantly affects the data.
- There is a significant number of both types, but the "good v. bad test taker" effect is insignificant.
- There is a significant number of both types and the effect is significant, but they occur in roughly equal distribution and with roughly equal effect, so they balance each other out.
Which of these is at work would be impossible to say, except for the evidence I presented first, namely, that the correlations between assignment scores and test scores are very high. This supports either of the first two hypotheses, and contradicts the third (if there were a significant number of both types and the effect were significant, the correlation between assignment and test scores would be significantly lower than the 0.8-something or higher that I observed). Of the first two, I favor the second, that there are significant numbers of good and bad test takers, but that the effect is insignificant.
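The reasoning against the third hypothesis can be illustrated with a toy simulation. Everything in it is an assumption made up for the sketch: the normal distributions, the means and spreads, and the idea of modeling the test-taking effect as zero-mean noise added only to exam scores. A zero-mean effect leaves the average untouched (good and bad test takers balance out) but should still pull the correlation down.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def simulate_r(effect_sd, n=2700, seed=0):
    """Assignment/exam correlation in a toy model (all parameters invented).

    knowledge  ~ Normal(75, 10), shared by both scores
    assignment = knowledge + Normal(0, 3) noise
    exam       = knowledge + Normal(0, 3) noise + Normal(0, effect_sd)
    The last term stands in for a balanced good/bad test-taker effect.
    """
    rng = random.Random(seed)
    assignments, exams = [], []
    for _ in range(n):
        knowledge = rng.gauss(75, 10)
        assignments.append(knowledge + rng.gauss(0, 3))
        exams.append(knowledge + rng.gauss(0, 3) + rng.gauss(0, effect_sd))
    return pearson_r(assignments, exams)

# No effect, a small effect, and a large balanced effect.
for effect_sd in (0, 2, 10):
    print(effect_sd, round(simulate_r(effect_sd), 2))
```

In this toy model a small effect barely moves the correlation, while a large balanced one drags it well below the no-effect case, which is why a consistently high r argues against the third hypothesis but cannot separate the first two.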
In other words, I suspect that if I had a magic wand and could wave it over my bad test taker student’s head and make him a good test taker before he took the test, his score would not significantly improve. Certainly, this would not include what I will call stupid test takers, students who dawdle over questions and pay no attention to the time and end up only answering half the questions. That’s not bad test taking strategies: That’s sheer idiocy. And I’m sorry if I sound cold or unsympathetic, but I don’t believe we should be giving students handicaps because they’re idiots, or otherwise rewarding stupidity.
The reason I favor the second hypothesis over the first is pure observation. I actively proctor exams. I don’t sit and read or play on the computer. I walk up and down the room, looking closely at what students are doing. And I always see some students with good test taking strategies, some students with not very good test taking strategies, and yes, the idiots. But the data do not support the hypothesis that the "good v. bad test taker" effect is significant.
I suspect that more than anything, the "good v. bad test taker" effect is an excuse, for students, parents, and bad teachers. But since I have collected and analyzed the data, from significant sample sizes semester after semester after semester, the burden of proof rests on the shoulders of anyone who claims that the “good v. bad test taker” effect is significant — particularly since no research exists to support it, one way or another.
Exaggerated or not: the effect is valid
Finally, and this will get me jeers from the educrats, even if the "good v. bad test taker" effect can be shown to be significant, I strongly believe that it should not in any way be controlled, and bad test takers should receive no handicap or advantage. The two most common reasons given for the "good v. bad test taker" effect are stress and test taking strategies. Given that stress is a part of daily life, and yes, that a good part of the educational mission is to prepare students for the real world, any difference in test scores resulting from a student’s being under stress is a valid one, and should stand on that student’s record. It runs counter to our mission as educators to coddle or protect students from reality.
The other reason, test taking strategies, is likewise valid. Developing good test taking strategies is a matter of basic intelligence or common sense. Recall that I said above:
So place yourself in the chair taking the test. How do you answer that second type of question, those questions you think you know, though you really need to think about them some more? You use the process of elimination to eliminate distractors that cannot be correct. You then analyze the remaining distractors, looking for cues that will tell you which is correct — that is, if you are a “good” test-taker.
In other words, “good” test-takers approach these questions with a more intelligent and logical thought process and better organized minds than do “bad” test-takers.
Good test takers are more organized, more logical thinkers. They are (here come the PC police) more intelligent, and students who have poor test taking strategies are less intelligent because their thinking is comparatively disorganized and illogical. What difference in test scores results, results from intelligence — and how well students have learned to think clearly and logically. Again, it contradicts the educational mission to handicap the less intelligent, and doing so does not create a level playing field. It penalizes the intelligent in order to coddle the less intelligent. It creates an artificially steep playing field, and only to make the educrats feel better about themselves.
Even if the "good v. bad test taker" effect is significant, it should not be controlled in any way. Doing so subverts the mission of education. Certainly, the educrats, who are always going on about “critical thinking” and “higher level thinking skills” should agree that more organized, more logical thinkers should not be hobbled to favor less organized, less logical thinkers.
And if you’re a teacher and you object, then let me ask you this: If you have bad test takers, and if you believe that the effect is significant, and if you believe those bad test takers should be handicapped, then do you cover test taking strategies in class? No? So why not put your money where your mouth is?
If it significantly improves your students’ scores, good for you. I suspect it won’t make much difference, but by all means, try it.
Other education articles:
- Not Enough Homework
- Dancing Queen!
- Projects and Activities
- Academic Groupthink
- Words Matter
- But It’s Peer Reviewed!
- THIS Is What’s Wrong With Education
- High Tech High, (co-written with Ken DeRosa, on Edspresso)
- And In The Spirit Of Postmodernism And Idiocy
- Math Aptitude And Sex, Er Gender
- Navigating The Group Work Maze
- Diversity Destroys Education
- Reality Check For Fuzzy Math Fans
- “Qualitative Research” Is Neither
- Test Tips.
13 Responses to “The “Good v. Bad Test Taker” Effect”

Or just a matter of training. I used to work for one of the major test-preparation companies, and was able to help kids greatly improve their SAT and GRE scores with just a few techniques. Especially in the mathematical questions, if they approached the solutions the way they’d been taught in school, they might get the right answer, but it would take too long.
Also, on the hardest questions, the thing to do was to ignore the Cardinal Rule, and actually remove the first impulse from the possible answers, and randomly choose from the remaining answers. This was especially true in verbal sections, where false cognates tended to mislead even students with good vocabularies.
Teachers, parents, and educrats alike also need to realize that good test-taking skills can and should be learned. One of the goals of the educational system ought to be to produce students who can employ the kinds of effective logical reasoning skills that we associate with good testers. The way some people talk, being a bad test-taker is a terminal handicap that forces other people to adjust, rather than something that can be overcome with training and practice. We need to start telling the students that if you’re a bad test-taker, the onus is on YOU to develop the skills you need to get past that. (These are logical/critical thinking skills that go way beyond mere testing.)
I haven’t thought this out quite in the detail you have, but I’m reminded of some golfer who said, “Golf is a game of luck. The more I practice, the luckier I get.”
I also think that organized thinking is a product of discipline and practice, and that while intelligence helps, and extreme intelligence helps even more, for most of us there’s no substitute.
Tor,
Could you share some of those techniques you mention?
You know I linked.
I think you’re defining intelligence more narrowly than I was. Perhaps knowledge would have worked better.
I’ve done the equivalent, teaching TOEFL prep classes, but the problem with the claim that it improves scores is that there’s no way to know. You can, of course, compare the scores after to the last scores, but that tells you nothing, since in the intervening time, the students (we hope) have learned more. Once again, you’re up against the difficulty of testing for the “test taker” effect, which would be very hard to do, because you have to have two groups with nearly the same knowledge (how do you do that?), and do test taking skills with the experimental group but not the control and see how the scores compare. If they’re in school — and who else but students would you test — then you’d have to find some way to control for any learning over the intervening time, and that’s damn near impossible.
There’s a reason no research has been done on this, you know.
We did those practice tests, but the first problem is you don’t have the same amount of time in the class that you do on the exam so that makes comparing the practice exams to the TOEFL problematic. Second, the quality of the practice exams is quite low compared to the real thing (lots of poorly written questions). Third, there’s a certain amount of variance with the same students over different tests. So there’s really no way to know if test prep is accomplishing anything.
Interestingly, the form the test takes has an effect as well. Multiple choice allows people with only a fuzzy idea of the concepts to be tipped off by familiar words and concepts in the choices presented as answers. More traditional tests where questions are asked and no possible answers are given require the student to know the material without any need for it to be right in front of their faces.
Even so, I find that the middle ground is rare anyway in my own educational endeavors. When I see a question I either know the answer or I don’t, although I can sometimes reason out an answer to a question I am utterly clueless about. Sometimes throwing out a little BS works for some reason.
www.ravingconservative.com
[…] Other education articles here. […]
I was a Chem major in school. Chemistry is a subject where arithmetic and mathematics (calculus and linear algebra) are joined with concepts (molecular kinetics, bonding behaviour, conservation of matter/energy…). So, I had to learn to deal with many types of tests: essay questions, explicit mathematical questions, multiple choice and diagrams. I found that having good class room attendance, class room participation, homework methods and study methods = good test taking. You see, if you are honestly making a real effort to learn, you will. If you’re smoking dope and drinking beer in your dorm you won’t. However, “bad test takers” don’t despair… The world needs ditch diggers too.
[…] In an educational world that is becoming more and more dominated by tests, what do we do with the student who is simply not a good test taker? Who needs to take that responsibility? Right Wing Prof looks at some of the answers. […]
In my undergrad physics courses, there were precious few problems that could be solved in a small amount of time, as required for an in-class test. Most physics profs felt that you should never see a problem on a test that you’d seen in homework. Generally speaking, each question was structured the same way: parts a and b: something that makes sense if you did your homework, as it’s analogous to a worked homework problem. Parts c and d: something easy to solve if you make the correct leap to understand how this problem is analogous to a worked problem, but you never did this on any homework. A good test taker could make the leap. A bad test taker couldn’t; they didn’t make the leap, and they could never understand what part d meant, because they had to already be solving the right problem and then part d just fell out for them. For example, you solved the hydrogen atom in homework. The test asked for the helium atom.
in my undergrad math courses, most profs had a different notion. “math proofs are about correctness, not the length of time to find the correct answer. you should never see a problem on a test that you’ve not seen before.”
therefore, a good test taker in math simply did all the proofs, carefully, and until each step was really truly understood without glossing over anything. a bad test taker was actually bad at the homework.
there are bad test takers out there. they can be taught to be good test takers, if anyone would care to mentor them. they don’t know what they don’t know. but too often, profs think “not my problem; it’s theirs.” yeah, well, that’s great, terrific, and your prerogative. TEACHING is different. Better to light a candle than curse the darkness.
—Good test takers are more organized, more logical thinkers. They are (here come the PC police) more intelligent, and students who have poor test taking strategies are less intelligent because their thinking is comparatively disorganized and illogical.
this is why that magic wand of teaching can actually affect the outcomes. your comment implies there’s no value to teaching anyone who isn’t the better student. why have the poorer performer in your class at all? why admit them? i mean, why TEACH TO PEOPLE who aren’t already getting everything you say?
Look, I went to MIT. I was dumber than lots of students. I was smarter than some, too. I was terribly disorganized during certain terms–living was hard, and therefore, school was hard. But a decent teacher taught me how to BECOME more logical.
The bad physics test taker can be taught how to anticipate the test questions. “well, we DID the hydrogen atom; so that won’t be on the test. What’s the next simplest system we’ll have to write down Hermite Polynomials for?” “helium” “okay, so to prepare for the midterm, let’s work out the helium atom, rather than re-writing our answer for the hydrogen atom.”
Similarly, the bad test taker can be taught how to tell when they are glossing over their answers. This “I should only teach the already organized and intelligent” is really defaulting on your part of the contract.
[…] I believe that good test-taking strategies can affect scores, but I don’t believe that the effect is significant (as I said, rather extensively, here). However, I do not know that this is the case. […]