TOEIC® washback effects on teachers:
by Tim Newfields
This paper explores how the use of the TOEIC® as a streaming tool and curricular component at one faculty in a Japanese university has impacted EFL teachers there.
Both quantitative and qualitative data reveal mixed reactions to the use of this test as a placement tool and part of the curriculum. After comparing the results of this study with related test washback
research, suggestions for additional TOEIC research are offered.
|"Since a lot of money and prestige is at stake in hi-stakes tests such as the TOEIC, objective measures such as recorded behaviors or test scores should supplement self-reports."|
[The TOEIC] is still based on the structuralist, behaviorist model of language learning and testing that informed discrete-point testing. If ETS has accepted this model is no longer suitable as a basis for the TOEFL, why has TOEIC not been treated similarly? (p. 3)
Cunningham (2002) also correlated the TOEIC scores of fifty Japanese university freshmen with an in-house direct test of listening, reading, and writing, and found that TOEIC reading scores correlated negatively (-0.3908, p=0.609) with the direct test she employed. On the other hand, TOEIC listening scores did yield a +0.8193 (p=0.181) correlation. Cunningham adds:
It would appear that students are much closer in ability when it comes to language competence than the TOEIC test scores would demonstrate. It also suggests that the TOEIC was not an accurate method for determining group levels for these learners. (p. 46)
These cautionary remarks are worth reflecting on. Over forty universities and junior colleges in Japan currently use the TOEIC as a placement tool (Tonegawa, 2005). Nall (2004) laments that many of the claims ETS makes about the TOEIC are being taken at face value and that the test is not receiving enough critical examination. This paper examines how the TOEIC may be impacting teachers in one micro-environment: one faculty at a university in Tokyo.
|"78% of the respondents (N=14) . . . [indicated] general support for the TOEIC as a streaming tool."|
|"Anyone who understands basic statistics should question whether the TOEIC is an appropriate tool for screening incoming Japanese university freshmen."|
|"students with relatively high TOEIC scores tend to be pro-active in attempting to raise their scores further, yet those with low scores tend to perceive themselves as 'bad English learners' and easily get stuck in a rut of ennui."|
The TOEIC programme generally recommends that learners whose native language is that of Western European origin do not take the TOEIC test until they have received at least 60 hours of English training and/or practice. Native speakers of languages from other origins should probably wait at least 100 hours. (p. 12)
In light of these comments, it is worth reflecting on why TOEIC re-tests occur in much shorter time frames. Templer (2004) cautions that market-driven drives to produce "quick results" may downgrade the effectiveness of some programs and place substantial burdens on both students and teachers.
The TOEIC was not designed to be a summative test measuring learned content. As Childs pointed out a decade ago, it is not an appropriate way to gauge what individuals learn over a period of time. To evaluate what students may have learned, a more appropriate method would be to adopt a criterion-referenced test.
Unfortunately, many teachers are not very clear about how norm-referenced and criterion-referenced tests differ. Issues of practicality and face validity are likely to weigh more heavily in the minds of non-experts than concerns about construct validity when making test planning decisions.
What impact is the TOEIC having on teachers at other institutions? It would be especially interesting to expand this pilot study and compare schools with hi-stakes, hi-intensity TOEIC programs (such as Yamaguchi and Hiroshima Universities) with those that have relatively low-stakes, low-intensity programs (such as Toyo and Tokai Universities). In hi-stakes settings, students are required to obtain a specific TOEIC score to graduate and/or teachers are evaluated at least in part on the basis of score gains.
What impact is the TOEIC having on students at other institutions? Washback effects on students deserve further investigation. One hypothesis to explore is that the TOEIC provides a positive incentive for students with higher scores, but may lead weaker students to develop negative attitudes.
How do students who perform well on the TOEIC differ from those who don't? What meta-learning strategies do more competent students use that differ from those of less successful ones?
4. Correlation and Validation Studies
How do TOEIC scores correlate with other English proficiency test scores? It may be useful to replicate some previous correlation studies in a Toyo University context to make sure that test results reported for different populations also apply here. Since many TOEIC research studies were conducted with small populations and/or have design errors, it would also be worth validating some previous findings.
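As a rough illustration of what such a replication would involve, the sketch below computes a Pearson correlation coefficient between two sets of scores. The function name `pearson_r` and the score lists are hypothetical and invented purely for illustration; they are not data from this study or from any study cited here. A real replication would also need to report the sample size and a significance test, since a coefficient from a very small sample says little on its own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation of two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores, invented for illustration only (assumption, not real data).
toeic_listening = [250, 310, 280, 400, 360]
in_house_listening = [52, 61, 55, 80, 70]

r = pearson_r(toeic_listening, in_house_listening)
print(f"r = {r:.3f}, n = {len(toeic_listening)}")
```

With only five invented examinees, the coefficient printed here has no evidential value; the point is simply that any replication should state both the coefficient and the size of the sample it came from.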
|"Used in conjunction with other measures, the TOEIC may give us valuable insights into the language proficiency of an examinee. However, as a sole yardstick of language proficiency, it is subject to marked distortions."|
Many thanks to Kondoh Hiroko and Katou Osamu for help in constructing the Phase One Survey.