CALPER Language Assessment

Center for Advanced Language Proficiency Education and Research at The Pennsylvania State University

Corpus

Overview

A corpus is a large collection of written and/or spoken texts gathered from specified sources. It enables linguistics researchers to move past intuition about language form, function, and complexity and to understand how people actually use language in specific contexts and for specific purposes. Similarly, language educators can use corpus linguistic methodologies to broaden their presentation of the language to students, focusing not on constructed textbook rules and examples but on engaging learners directly with real language use.

Corpus-informed or corpus-based language pedagogies are currently benefiting from more user-friendly, advanced analysis software that makes corpus methodologies accessible to non-specialists. Learner corpora can also be developed by assembling learners’ written and/or spoken language production over time. Such corpora are searchable by lexical, morphological, or even discourse features and allow the precise location and quantification of particular items in learner language production. Corpus-based assessments employ corpus methodologies to trace the language development of groups of learners as well as individuals. In this way, language teachers can use corpora of expert-speaker data as a point of comparison and can also study changes in learner performance over time.

Background

Based on the very principles of their construction and use, corpora should be viewed as assessment tools. Because they are typically composed of relatively lengthy texts, language learner corpora may be best suited to evaluating advanced learners. To date, much of the corpus-based assessment occurring in language education has taken place in European Union countries, where advanced proficiency levels are understandably at a premium. More studies have recently begun appearing in the North American context, and the increasing emphasis on advanced language proficiency in the United States means that greater attention will likely be paid to the development of corpus-based language pedagogies in coming years.

In addition to allowing teachers and researchers to track learner development over time, corpus-based assessment can be used to create more general comparisons between learner and expert-speaker production. Corpus studies of advanced learners may be particularly useful because they can highlight subtle ways in which even highly proficient students fail to match their expert-speaking counterparts. Such findings can be used to design teaching and consciousness-raising activities on especially challenging language points.

Application

CALPER created a corpus tool to help assess learners’ written language development. With this tool, called the Graphic Online Language Diagnostic (GOLD), teachers can create an account and analyze the language of their students. CALPER is currently preparing a new User Manual and will make it available shortly. We also created several slideshows introducing language instructors to this approach. A full PPT presentation can be viewed here. A second one on “Tracking Learning” is also available.

Belz (2004, 2006) reports how, over the course of a semester, language instructors can compile individual learner corpora based on traditional writing and speaking assessments as well as telecollaborative interactions with expert-speaking counterparts. By comparing students’ production of specific language features to that of expert speakers, teachers can then gauge the general level of learner language use. The short-term longitudinal nature of such projects also provides a view of individual language development in response to pedagogical and interactional input. Belz provides examples with advanced learners of German.
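The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the cited studies or in GOLD: the function names, the sample sentences, and the choice of three German particles as the target feature are all assumptions made for the example. It counts how often the target words occur per 1,000 tokens in a learner's texts at two points in the semester and in a small expert-speaker sample.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def rate_per_thousand(texts, targets):
    """Occurrences of the target words per 1,000 tokens across a list of texts."""
    tokens = [tok for text in texts for tok in tokenize(text)]
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[t] for t in targets)
    return 1000 * hits / len(tokens)

# Hypothetical target feature: three German particles, chosen only for illustration.
# Plain word matching cannot tell particle "ja" from "ja" meaning "yes".
targets = {"ja", "doch", "mal"}

# Invented sample data standing in for a learner corpus and an expert corpus.
learner_by_week = {
    1: ["Ich gehe heute ins Kino.", "Das ist gut."],
    8: ["Komm doch mal mit ins Kino.", "Das ist ja gut."],
}
expert_texts = [
    "Das ist ja toll.",
    "Schau doch mal hier.",
    "Wir gehen heute ins Kino.",
]

expert_rate = rate_per_thousand(expert_texts, targets)
for week, texts in sorted(learner_by_week.items()):
    learner_rate = rate_per_thousand(texts, targets)
    print(f"Week {week}: learner {learner_rate:.1f} vs. expert {expert_rate:.1f} per 1,000 tokens")
```

Normalizing by text length matters because learner and expert samples rarely contain the same number of words. With real data, far larger samples would be needed before drawing any conclusion from such rates.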

Personal concordancing and academic writing: Conian (2004) explains how advanced learners can develop their academic writing in the target language by creating a personal concordance to analyze and improve their compositions. The use of discourse features such as personal pronouns, hedging, and passive constructions differs among languages, academic disciplines, and even writers within disciplines. Not only does a personal corpus of articles allow writers to compare their work to the discourse features commonly found within their chosen field, it also allows them to identify the features that typify their own personal writing styles.
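A personal concordance of the kind described above can be approximated with a simple keyword-in-context (KWIC) listing. The following is a minimal sketch under stated assumptions: the `concordance` function, its `width` parameter, and the sample draft are hypothetical and are not taken from the cited work or from any particular concordancing program. It shows each occurrence of a search word together with the text on either side, so a writer can inspect, for instance, how often and where a hedging verb such as "may" appears.

```python
import re

def concordance(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of keyword."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Invented sample standing in for a writer's own draft.
draft = (
    "We argue that the results may indicate a trend. "
    "The data may also suggest that we need a larger sample."
)

for line in concordance(draft, "may"):
    print(line)
```

Running the same search over a collection of published articles from the writer's field would give a point of comparison for the writer's own usage.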

 


Suggested Readings:

Belz, J. A. (2004). Learner corpus analysis and the development of foreign language proficiency. System, 32, 577-591.

Belz, J. A. (2006). At the intersection of telecollaboration, learner corpus analysis, and L2 pragmatics: Considerations for language program direction. In J. Belz & S. Thorne (Eds.), Internet-mediated intercultural foreign language education (pp. 207-246). Boston, MA: Thomson Heinle.

Cobb, T. (2003). Analyzing late interlanguage with learner corpora: Québec replications of three European studies. The Canadian Modern Language Review, 59, 393-422.

Granger, S., Hung, J., & Petch-Tyson, S. (Eds.) (2002). Computer-learner corpora, second language acquisition, and foreign language teaching. Philadelphia, PA: John Benjamins.

Sinclair, J. (Ed.) (2004). How to use corpora in language teaching. Philadelphia, PA: John Benjamins.


