Or, I should say, what passes for a critique, given that I’m busy with this hard disk problem and the article doesn’t give a lot of information. As I said before, the results aren’t surprising, but the study seems to have a few problems.

The research team, led by psychology graduate student Jason Chan, designed three experiments to determine whether testing can enhance the long-term recall of studied material. In the first, 84 students, split into three groups, read an essay on the toucan bird. The first group took a short 22-question test on the information they’d just read but received no feedback on their performance. The second one reviewed 22 statements culled from the longer passage. The third group went home without being tested. The next day, all the students took a 44-question test–with the same 22 questions on the first day’s test and 22 additional ones. The researchers found that the first group did significantly better on the questions that only appeared on day two (performing at least 9 percent better than the other groups).

The sample size (n=28 per group, assuming the 84 students were split evenly) is pretty small, certainly too small to be anything but preliminary. This needs to be replicated on a larger scale. What mystifies me is why there are three groups here. And how did the scores of the second and third groups compare with each other?

This article doesn’t say.
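To put a rough number on "pretty small," here is a back-of-the-envelope sketch of how big an effect a two-group comparison with 28 students per group could reliably detect. Everything in it is my assumption, not something the article reports: an even 28/28/28 split, a plain two-sided comparison of two independent groups, the conventional alpha of 0.05 and power of 0.80, and a normal approximation in place of the exact t-test calculation. The function name `min_detectable_d` is made up for illustration, and I have no idea what analysis the researchers actually ran.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_d(n_per_group, alpha=0.05, power=0.80):
    """Approximate smallest standardized difference (Cohen's d) that a
    two-sided comparison of two equal-sized independent groups can detect.

    Uses the normal approximation d = (z_{1-alpha/2} + z_{power}) * sqrt(2/n).
    alpha and power are conventional defaults, assumed here for illustration.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


# Compare the study's per-group size with some larger hypothetical samples.
for n in (28, 100, 250):
    print(f"n per group = {n:3d}: minimum detectable d is about "
          f"{min_detectable_d(n):.2f}")
```

Under those assumptions, 28 per group only has decent odds of catching a difference of about three quarters of a standard deviation, which is a large effect by the usual conventions. Anything subtler could easily be missed, or turn up by luck, which is why I'd want a bigger replication.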

In a second experiment, 72 students studied two articles on different topics. Immediately after the students read the articles, they took a 12-question test on one of the pieces. The next day, the students took a 48-question test with the same 12 questions they saw the day before, 12 more from the same article and 24 from the other article. The students did significantly better on the second set of 12 questions of related material than they did on the 24 questions about the second article. In the third study, Chan manipulated the recall methods of students–asking them, during the first round of testing, either to think of all the information related to test questions or to home in solely on the answer. On the second test, the students who thought more broadly the day before performed much better on related questions.

I can’t say much about the third study, since the article doesn’t give many details. As for the second, a sample size of 72 is certainly better than 28, though again, this needs to be replicated with a much larger sample. The “control” here is the 24 questions from the other article, which is a bit unconventional, but I see nothing objectionable about it.

Michael Anderson, a psychologist at the University of Oregon, begs to differ. He notes that over 80 published articles in the field claim that testing actually harms retention, a phenomenon called “retrieval-induced forgetting.”

Well, not really. Retrieval-induced forgetting, IIRC, doesn’t necessarily have anything to do with testing as such; it has to do with memory recall. The researcher has a good response to this:

Chan acknowledges that the literature is against him, but he argues that his study approximated a college course because it used “textlike narratives” rather than word lists for subjects to memorize, which he says most of the research arguing against him employed. “One thing we know about retrieval-induced forgetting is that it is a short-lived phenomenon that typically does not last for a day,” Chan points out that his study mimics students cramming for examinations by allowing participants a 24-hour delay between tests. Previous studies only allowed 20 minutes.

“One last implication for education is that it is a good idea to give short-answer exams as opposed to multiple-choice exams,” Chan says. “It is unlikely for students to try to recall related information during a multiple-choice exam because they tend to answer a question by first looking at the choices, instead of trying to recall information from what they know.” Sorry, Scantron fans, looks like blue books make for better learning.

Of course, there are two responses to this. First, in a class of 250, who is going to grade all those exams in a timely fashion? I’m partial to short-answer and essay exams myself, but they aren’t always feasible, and life is just like that. Second, while it is certainly true that students answer multiple-choice questions by reading first the question and then the distractors, the problem is that this is a basic test-taking skill, and it gets in the way of studying memory.

An interesting study, though as I said, it needs to be replicated on a much larger scale in order to be taken seriously.