- Mengxi Lin and Alexander L. Francis
Purdue University 2014
The Relationship between Fluency, Intelligibility, and Acceptability of Non-native Spoken English
- Veronika Maliborska, Xun Yan, Matthew Allen, Aylin Baris Atilgan, and Lee Jung Huang
Pronunciation and Fluency Practice for ITAs Using Free Online Resources
- April Burke
The Impact of the ISTEP+ on a School Corporation with a Large ELL Population
- Haiying Cao
Language Profiles of the Cutoff Borderline Cases for OEPT Rating
- Lixia Cheng
Examining Task Variability on a Computer-based Oral English Proficiency Test
- Nancy Kauper
Development of an Assessment of Interactive Conversation Skills
- Soohwan Park
Temporal Measures of Fluency: Automatic and Manual Extraction of Temporal Variables
- Sunny Park
Toward a Syntactic Analysis of Oral English Proficiency
- Rui Yang
An Investigation of the Function of a Rating Scale
- Lixia Cheng
Self-assessment for Teaching and Learning Classroom Presentation Skills
- Nancy Kauper
Social Language Skills: Assessing Face-to-Face Conversational Interaction
- Lixia Cheng
MwALT, October 2007
Self-assessment of Speaking Skills by Purdue OEPT Test Takers
- April Ginther, Jeanne Lee, Rui Yang
MwALT, September 2006
Integration of Test Scores and Instruction in a Local Testing System
- Jeanne (Yu-Chen) Lee
MwALT, September 2006
English Intonation as a Descriptor for Evaluating Oral English Proficiency
- Rui Yang
MwALT, September 2006
Factor Structure of the Oral English Proficiency Test
- Christopher Grant Blake, PhD
Purdue University, August 2006
The Potential of Text-Based Internet Chats for Improving ESL Oral Fluency
- Christopher Blake
AILA, July 2005
Revision of an ITA Curriculum: A Case Study
- Edie Cassell
AILA, July 2005
Toward an Ecological Approach to International Teaching Assistant Preparation
- Jennifer Haan
AILA, July 2005
History, Policies, and Community Values in ITA Training
- Nancy Kauper
AILA, July 2005
A Lexical-based Instructional Component for Developing Oral Fluency
- Rui Yang
AILA, July 2005
Conjunctions in International Teaching Assistants' Oral Discourse
- Slobodanka Dimova
MwALT, October 2004
Test Prep, Washback, Washing Forward, What Else?
- Shigetake Ushigusa
MwALT, October 2004
The Relationship between Silent Pauses and Idiom Use
- Slobodanka Dimova
ECOLT, March 2003
OEPT Rater-Training Program
Non-native accented speech is typically less intelligible and less fluent than native speech, but it is unclear how these factors interact to influence perceived speech quality. To investigate this question, the speech of 20 non-native speakers of English varying in proficiency and native language was evaluated. Subjective measures of speech quality (listening effort, acceptability and intelligibility) were compared to objective measures of word recognition by native listeners, and to acoustic measures of fluency and of segmental and suprasegmental properties related to intelligibility. Results showed that subjective quality measures were highly related to one another and to word recognition, and were most strongly predicted by measures of fluency. Segmental and suprasegmental measures did not predict word recognition or subjective speech quality. There was also an interaction between the effects of proficiency and speaker's native language on word recognition, but this did not extend to subjective measures. Finally, listeners who first heard high-proficiency speakers gave overall lower subjective quality ratings but there was no interaction between proficiency and presentation order. Multivariate analyses suggest that factors related to speaking rate, including pause duration, have the greatest effect on measures of acceptability, intelligibility and listening effort.
Helping ESL international teaching assistants (ITAs) develop their speaking skills is often a challenge both for the students and for ITA programs. Despite various kinds of support (e.g., classes, individual instruction, tutoring), improvement in pronunciation and fluency occurs slowly and often delays ESL students' certification as ITAs. A possible solution is to develop input-focused programs, which have been shown to have positive effects on the rate of students' fluency improvement (Gorsuch, 2011). This presentation describes an ITA course where students are assessed through presentation and teaching simulations (currently taught in an ITA program for graduate students) and a set of input-focused activities for both in-class and out-of-class practice using free video materials available on the Internet. The presenters will demonstrate activities based on videos from TED talks, Khan Academy, and YouTube, showing how the same sources can be used for different purposes, such as developing fluency through shadowing techniques and developing pronunciation through listening, transcribing, and reading aloud. These online-video activities have many advantages: free access, authentic language use and communicative contexts, an extensive range of subjects, and suitability for students of various language proficiency and educational levels (i.e., undergraduate or graduate). Handouts of sample lesson plans detailing objectives, materials, activities, and assessments will also be provided. These activities, while shown to be beneficial in structured ESL courses, can also be tailored to students' individual needs. The online videos and activities we present can be flexibly incorporated into instructor-mediated or self-guided learning for prospective ITAs or ESL students who may not have access to structured ITA courses or would like to develop their oral English proficiency.
April Burke (with Luciana C. de Oliveira, Purdue University, W. Lafayette, IN)
The Impact of the ISTEP+ on a School Corporation with a Large ELL Population
English language learners (ELLs) consistently receive lower scores than non-ELLs on the Indiana Statewide Testing for Educational Progress Plus (ISTEP+). For example, only 52% of ELLs passed the Language Arts section of the 2007-2008 ISTEP+, compared with 78% of non-ELLs. The performance of a school's ELL population can determine whether or not the school is deemed "failing" and placed on "improvement status." Although many schools face these possible consequences, few studies have focused on the ramifications of using the ISTEP+ in schools with large ELL populations. In response to this deficit, this study investigates the impact of the ISTEP+ on a school corporation in which ELLs make up over 26% of the student population. Through interviews with three administrators, who were formerly teachers in the corporation, this study presents their perspectives on the impact of the ISTEP+ on the corporation's programming, funding, classroom instruction, staff, and ELL students. The study also includes a quantitative component, which begins with a discussion of the corporation's overall subgroup performance on the ISTEP+, followed by an analysis of individual free and reduced lunch ELL and non-ELL ISTEP+ test scores.
The OEPT is a test used at Purdue University to certify the English proficiency of international graduate teaching assistants. An examinee who receives a score of 5 is certified for oral English proficiency, while a score of 4 requires at least one semester of oral English coursework. Disagreement at the cutoff (the borderline cases between scores of 4 and 5) has remained a consistent challenge in rater training, and raters' decisions at this boundary can appear quite random.
To address this issue, in addition to improving the level descriptors, thick-description language profiles were created for five examinees on whom raters disagreed at the cutoff during the rescaling project, in an attempt to better understand the nature of cutoff borderline cases. Two items selected from each of the five tests were transcribed, coded, and analyzed in detail. A list of common features derived from the analysis placed four examinees into the lower borderline cases and one into Level 4, one category below the borderline cases. These examinees showed mastery of pronunciation, idiomaticity, and grammar, whereas their performance with respect to syntax, content, and organization was insufficient to handle the tasks on the OEPT.
The current popularity of performance testing and task-based language assessment has generated great interest in researching task difficulty and variability in language performance assessment. The present study determines whether the same examinees' responses have significantly different fluency measures across two tasks (Newspaper Headline vs. Compare and Contrast) on a computer-based semi-direct oral English proficiency test. Transcriptions of task responses by two low-intermediate proficiency groups of 25 Chinese ESL learners each are analyzed in terms of temporal measures and lexical variables. Results of statistical tests suggest that task does not have a main effect on temporal measures of fluency such as speech rate and mean syllables per run, and the interaction of task with examinee proficiency is not significant in determining these temporal variables either. t-tests comparing the means of lexical variables (number of tokens, number of word types, type-token ratio, and lexical density) across the two task types indicate no significant difference between the two tasks in type-token ratio or lexical density. However, the total number of tokens and total number of types do display significant differences between the two groups. An in-depth discourse analysis is needed, though, before any claims can be made about the possible transfer of lexical items from the text prompt of one of the tasks to examinees' responses.
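The lexical variables analyzed above (tokens, word types, type-token ratio, and lexical density) can be computed directly from a transcript. The sketch below is illustrative only: the transcript and the small function-word list are invented, and a real analysis would use a full closed-class word inventory to separate content words from function words.

```python
import re

# Hypothetical examinee response (invented for illustration).
transcript = ("the first headline says that the economy is growing "
              "and the second headline says that prices are rising")

# A toy function-word list; a real study would use a comprehensive one.
FUNCTION_WORDS = {"the", "a", "an", "and", "or", "but", "that",
                  "is", "are", "was", "were", "to", "of", "in"}

def lexical_measures(text):
    tokens = re.findall(r"[a-z']+", text.lower())
    types = set(tokens)
    # Lexical density = proportion of content (non-function) words.
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return {
        "tokens": len(tokens),
        "types": len(types),
        "type_token_ratio": len(types) / len(tokens),
        "lexical_density": len(content) / len(tokens),
    }

m = lexical_measures(transcript)
```

Comparing these values across the two task types (here, across examinee groups) is then a matter of running t-tests over the per-response measures.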
Purdue University's Oral English Proficiency Program (OEPP) trains and certifies international graduate students for English proficiency. We have traditionally assessed oral presentation skills in formal classroom contexts, but found that assessment of face-to-face interaction and conversation skills was lacking. Developing an assessment of conversation skills has allowed us to focus on students' abilities to interact and communicate informally one-on-one.
In 2007-2008, OEPP students were assessed using a working model of the Interactive Conversation (IC) assessment, which included evaluation criteria in three general areas: interactive understanding, active listening and conversation management. Prior to assessment, students were provided with evaluation criteria, guidelines and general strategies for successful conversations, and opportunities to practice conversing with classmates and instructors.
Digital video recordings of those IC assessments were made. A set of recordings showing a wide range of variability was selected and the conversations transcribed. Transcriptions were analyzed to see whether and how students met the evaluation criteria, what specific strategies and language they used to meet the criteria, and whether there were gaps or superfluities in the criteria.
Information gained from this analysis was then used to revise the evaluation criteria, to provide students and instructors with examples of successful conversation strategies and language, and to create new scales and rubrics in order to better reflect the range of variability in student assessment performances.
Excerpts from transcripts will be shown and discussed, along with comparisons of the working and present models of the IC assessment criteria, scale and rubrics, and evaluation guidelines for raters.
Among the several components of foreign language learners' oral proficiency, fluency can be defined as "speed and smoothness of oral delivery" (Lennon 1990). Fluency can be measured by calculating temporal variables such as speech rate, articulation rate, mean silent pause time, and mean syllables per run (Riggenbach 1991, Kormos and Denes 2004). A study of the relationship between temporal measures of fluency and holistic scores on the Oral English Proficiency Test (OEPT) showed the potential of such fluency measures (Ginther, Dimova and Yang 2009). The software PRAAT can extract basic information such as total response time, number of pauses, and number of syllables using scripts (de Jong and Wempe 2007). This study investigates the possibility of automatically measuring temporal variables in speech samples using the acoustic analysis functions of PRAAT. The data set consists of 300 speech samples from OEPT administrations: 150 speakers responding to two items each, across three language groups with different proficiency levels (Chinese, Hindi, and native English). Two methods are used to analyze the speech samples: extracting temporal variables manually and with PRAAT. The study extracts basic temporal information such as speech time, pausing time, number of pauses, and number of syllables using these two methods, and calculates temporal measures of fluency to compare the results. The results show that automatic measurement of temporal variables with PRAAT performs well compared with manual extraction, especially for temporal measures based on syllable information.
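Once the basic counts (syllables, total time, pause durations) are extracted, whether manually or by PRAAT script, the temporal measures follow from simple arithmetic. The sketch below uses invented values for a hypothetical response; the definitions follow the common ones (speech rate over total time, articulation rate over phonation time), and the pause threshold feeding `pause_times` is an assumption of the analyst.

```python
def fluency_measures(n_syllables, total_time, pause_times):
    """Temporal fluency measures from counts extracted from a recording.

    pause_times: durations (s) of silent pauses above some threshold.
    """
    pausing_time = sum(pause_times)
    phonation_time = total_time - pausing_time
    n_runs = len(pause_times) + 1  # stretches of speech between pauses
    return {
        "speech_rate": n_syllables / total_time,            # syll/s, pauses included
        "articulation_rate": n_syllables / phonation_time,  # syll/s, pauses excluded
        "mean_syllables_per_run": n_syllables / n_runs,
        "phonation_time_ratio": phonation_time / total_time,
    }

# Invented example: 120 syllables in a 60 s response with four pauses.
m = fluency_measures(n_syllables=120, total_time=60.0,
                     pause_times=[0.8, 1.2, 0.5, 1.5])
```

Comparing manual and automatic extraction then reduces to computing these measures twice, once from each set of counts, and correlating the results.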
Test developers struggle in their attempts to design items that are able to elicit level differences in a meaningful way. Particularly for oral proficiency, it is challenging to identify syntactic qualities that characterize the levels. This study investigated the differences in syntactic qualities between levels of oral English proficiency on a semi-direct test. Formal syntactic theory reveals structural differences between types of embedded clauses, such that those functioning as complements (e.g., object clauses) are structurally more complex and restricted than those acting as adjuncts (e.g., adverbial clauses) within the main clause, predicting that embedded complement clauses would cause more difficulty for learners than embedded adjunct clauses. This study aimed to test the following research questions: (a) Are the variables of clause type and oral proficiency related in spoken L2 English? (b) Does the proportion of complement clauses to all well-formed clauses predict and correlate with oral proficiency level? Ninety-six transcripts of oral test data from Chinese learners of English were coded for the following clause types: main clause, embedded adjunct clause, embedded finite complement clause and embedded nonfinite complement clause. Four levels of oral proficiency, including a native speaker control group, were examined. Results showed that there is indeed a significant effect of clause type on proficiency score (χ² = 19.66, p = 0.02). Findings also reveal that the number of complement clauses used in a test response is a significant factor between learners and native speakers. Implications for speaking test development, particularly through incorporating formal syntactic theory, are discussed.
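The reported effect rests on a chi-square test of independence over a clause-type-by-proficiency-level contingency table. As a hedged sketch of that computation, the counts below are invented for illustration (the reported χ² = 19.66 came from the actual coded transcripts); the function computes the statistic from any table of observed counts.

```python
def chi_square_statistic(table):
    """Chi-square statistic of independence for a contingency table.

    table: list of rows of observed counts (rows: proficiency levels,
    columns: clause types in this illustration).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (obs - expected) ** 2 / expected
    return stat

# Hypothetical counts: columns are main, embedded adjunct,
# embedded finite complement, embedded nonfinite complement clauses.
observed = [
    [40, 10, 5, 5],
    [35, 15, 10, 10],
    [30, 20, 15, 15],
]
stat = chi_square_statistic(observed)
```

The statistic is then compared against the chi-square distribution with (rows − 1) × (columns − 1) degrees of freedom to obtain the p-value.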
Scaling is one of the critical issues in test development. The purpose of this study was to investigate the function of a four-point ordinal scale used in an oral English proficiency test administered to certify the oral English proficiency of prospective international teaching assistants at a large North American university. Data in this study included ratings of 434 examinees by 10 raters from two operational administrations one year apart. A Many-Facet Rasch analysis was conducted to examine how the raters interacted with the rating scale in assigning examinees' scores, and how well the scale distinguished the oral proficiency of the examinees. The results show a gap between adjacent rating categories slightly larger than the recommended maximum (5 logits). The findings invite scale revision that would allow raters to better distinguish the oral proficiency levels of the examinees.
It is essential to clarify to ITA trainees what makes good presentations, since they are likely from a different classroom culture than their American undergraduate students. This paper describes how a self-assessment instrument incorporating both a rubric and transcription tasks was developed, and has helped clarify what the instructor values.
Conversation skills and knowledge of local norms of social interaction are essential for ESL speakers who wish to communicate successfully in a variety of informal contexts. The presenter will show methods and materials for assessing conversation skills with the goal of enhancing ESL students' competence in informal, face-to-face social interaction.
Self-assessments of foreign language skills have been found to have a high level of agreement with evaluations based on a variety of external criteria, such as teacher ratings or test scores (LeBlanc & Painchaud 1981, 1985; Rea 1981). Other studies, however, suggest that gender differences, cultural background, self-esteem, proficiency level, and even the wording of assessment items all complicate research into the relationship between self-assessments and actual test performance. The present study examined the accuracy of prospective International Teaching Assistants' (ITAs) self-evaluations of test performance and English proficiency levels. A total of 200 questionnaires completed by potential ITAs from 34 countries were collected. A four-point Likert scale was employed to code the subjects' self-ratings of proficiency level (poor, fair, good, or excellent), and a five-point scale was used for their self-estimation of whether they would pass the test (strongly disagree, disagree, no opinion, agree, and strongly agree). Spearman correlation tests were performed to examine the relationship between the questionnaire and the Oral English Proficiency Test (OEPT), and chi-square tests were performed to investigate the effects of gender and native country on self-assessments. The study found that the test takers' OEPT scores had a moderate correlation with their self-estimation of test performance. The subjects tended to associate the ability to speak fluently with the ability to speak without many grammatical errors. There is no clear evidence that gender is related to the subjects' self-estimation of test performance, although the subjects' native country might have an effect.
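Because both the Likert self-ratings and the OEPT scores are ordinal, the relationship above is assessed with Spearman rank correlation: each variable is converted to ranks (with ties sharing the mean rank) and Pearson correlation is computed on the ranks. The sketch below implements this from scratch; the paired self-rating/score data are invented for illustration.

```python
def ranks(values):
    """1-based average ranks, with tied values sharing the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho = Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented pairs: four-point self-rating vs. OEPT score.
self_ratings = [2, 3, 3, 4, 1, 2, 4, 3]
oept_scores = [4, 5, 4, 5, 3, 4, 5, 5]
rho = spearman(self_ratings, oept_scores)
```

A rho near 0.4-0.6 would count as the "moderate" correlation the abstract reports.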
The Oral English Proficiency Program at Purdue University developed a local, computer-based, semi-direct test of oral English proficiency that became operational in 2001. Last year, the test was placed into a Web-based network system that links registration, administration, and rating. The information gathered from all input into the system is saved as a database that can be used for score reporting, research and instructional purposes.
A critical gap for score users — especially for test takers and researchers — is the one that exists between a test score and its meaning. Local testing systems in which a test is linked to instruction and test data are accessible for research can be used to bridge this gap by allowing data to be leveraged for uses beyond the provision of a score.
In the rating component of the system, raters have access to the prompt, the scale and sample benchmark performances. In addition, raters not only assign scores but also provide a justification for the scores they assign. These justifications serve as extra interpretative information to the score users (examinees, departments and researchers).
The raters are also instructors in the program. For those examinees who are placed into an OEPP class, instruction begins with a review of each item response by the student with the instructor/rater. Access to the raw performance data, the review of the item responses, scores, and rater comments allows the students to understand the reasons for and meaning of their scores. These negotiated interpretations of actual performance serve as the foundation for each student's instructional plan, midterm evaluation, and final evaluation. For many of the students, it is the first time that they have had the opportunity to examine their actual performance in relation to their assigned score.
All data generated by the test and subsequent evaluations are accessible to researchers. Studies generated by these data will be reviewed.
This presentation will demonstrate the components of the system and how they are accessed by its users. The local system offers many advantages that are not possible when large-scale assessments and tests focus primarily on placement rather than on extended, system-integrated opportunities for test score use.
Intonation is often listed as a descriptor in oral English proficiency scales, which demonstrates that it is a part of a performance that influences people's perception of proficiency. However, there are no explanations or guidelines for how to evaluate intonation in oral English performance. This study proposes an acoustic approach to reliably measuring English intonation, analyzing the pitch patterns of native speakers (NS) of English and English as a second language (ESL) speakers who speak Mandarin as their first language (L1). Speech samples from five NS of English from the Midwest, USA, and five Chinese speakers performing two tasks, reading aloud and leaving a telephone message, are analyzed. Intonation is measured instrumentally with PRAAT, a speech analysis computer program, at sentence nonfinal and final positions, where sentence units are reliably determined by syntax.
The preliminary findings indicate that native English speakers may use different intonation patterns for different discourse situations: the five English NS in this study prefer level or rising contours in sentence-final positions when leaving voicemails, whereas only falling contours are used to mark sentence endings when reading aloud. The Chinese ESL speakers, on the other hand, do not make use of intonation for different discourse functions: there is a prominent use of level and falling contours for both reading aloud and leaving a telephone message. It is argued that Chinese learners of English tend to impose the prosodic patterns of their L1 on their second language (L2), which results in the typical Chinese accent, characterized by a monosyllabic rhythm with constant high level and falling tones. The results of this study provide potential guidelines for how intonation may be systematically used as a descriptor for oral English proficiency.
The purpose of this preliminary study is to examine the factor structure of the Oral English Proficiency Test (OEPT). The OEPT is used at Purdue University to assess the oral English proficiency of prospective international teaching assistants (ITAs). Only ITAs who pass the test are allowed to conduct direct classroom teaching at the University. A Confirmatory Factor Analysis (CFA) was used to examine the structure of the test, since a one-factor model was hypothesized to underlie the test structure. The sample for the factor structure study includes 641 subjects from 40 different countries and regions. For the analysis of factor invariance across test-takers from different native language backgrounds, 180 Chinese subjects and 110 Korean subjects were chosen from the overall sample of 641 subjects.
The result of the CFA confirms the hypothesized one-factor model (χ² = 87.45, GFI = 0.973, CFI = 0.996, RMSEA = 0.0523). However, the errors of three items were found to covary. Further analysis using the Modification Indices provided in the LISREL program suggests an effect of test prompts. Compared to the other item prompts, the prompts of the three items with correlated errors are less text-based; thus, these prompts provide minimal information that examinees could take advantage of in forming their responses. The one-factor model with covariance among the three items was invariant across the Chinese and Korean samples chosen for the study. The covariance among the three items in the one-factor test structure implies that prompt revision is necessary.
Text-based Internet chats have become a popular component of second language classrooms, making it possible for students to communicate with native speakers and second language learners across the globe. While a number of studies have reported on the positive effects that chat discourse can have on the learning environment, few studies have examined whether participation in chat discourse can help learners improve their proficiency in a second language. To the best of our knowledge, no studies to date have examined whether second language learners can improve their oral fluency through participating in a text-based chat learning environment.
This dissertation addresses the above question by examining the oral fluency development of 34 ESL learners who participated in the same six-week course but in separate instructional environments: a text-based Internet chat environment, a traditional face-to-face environment and a control environment that involved independent learning with no student interaction. A fluency pretest was administered prior to the study and a posttest was administered at the end. Speech samples collected from these tests were analyzed for fluency at five temporal variable levels: speaking rate (SR), phonation time ratio (PTR), articulation rate (AR), mean length of run (MLR), and average length of pauses (ALP). Improvement in fluency was measured in terms of the pretest to posttest gain scores on each of these measures.
The study found that the gain scores of participants in the text-based Internet chat environment were significantly higher on the PTR and MLR measures than the gain scores of participants in the face-to-face and control environments. Gain scores on the three other measures were not significant. The author discusses these findings in relationship to Levelt's (1989) model of language production and argues that text-based Internet chat environments can be a useful way of building oral fluency by facilitating the automatization of lexical and grammatical knowledge at the formulator level.
ITA preparation courses have traditionally focused on preparing ITAs for the tasks of teaching in an undergraduate classroom setting. Skills such as using visual aids, answering questions, organizing lessons and understanding American classroom culture are areas typically covered in an ITA classroom. Because it is generally assumed that prospective ITAs can make only limited progress in language proficiency over the course of a semester, compensatory strategies such as paraphrasing student questions and using discourse markers are typically emphasized over language specific features such as pronunciation and grammar.
An analysis of the Purdue ITA curriculum revealed that only 8 percent of the activities focused on language proficiency areas while the majority related to teaching and compensatory strategies. This paper discusses the curriculum revision process and explains how the changes reflect the current language needs of the ITA population that it serves. The author also describes how moving the curriculum onto a Web-based platform with links to online resources has created an environment that facilitates student autonomy and lifelong learning after the course has been completed.
Most large research universities in the United States rely in large part on international graduate students, whose native language is not English, as teaching and research assistants.
As universities must ensure that International Teaching Assistants' (ITAs') oral English skills are sufficient for teaching in classrooms and otherwise interacting with Americans, many require ITAs to pass a spoken English test and, if they do not pass, to complete an ITA training program.
In order to have realistic expectations for ITAs and to better inform the development of ITA screening and training programs, it is necessary to identify ITAs' needs, and this requires collecting information about their actual patterns of language use, both within and outside their working environment. For this study, an ecological approach was adopted to gain insight into ITAs' participation in the larger social sphere. This approach was selected as a guide for the research because it considers: (1) all domains of language use; (2) the potential of linguistic diversity; (3) the limitations of natural and human resources; (4) the importance of support within the community; and (5) the need for a long-term view, into the past and the future.
With these considerations as a guide, a questionnaire was developed and administered to 200 ITAs. The ITAs responded about their demographic make-up, their participation in extracurricular and community activities, the amount of time they spend engaged in activities for which they use their first language vs. English, and their use of their first language vs. English to communicate (orally and in writing) in their work and social environments. Analysis of the responses indicates both their use of and need for first language maintenance alongside English, and their participation in activities spanning home-country traditions to mainstream American events.
Within any language ecosystem, the historical values of institutions within the community influence language instruction and use. In ITA training programs, the values set forth by the policies of the institution and the attitudes of those involved in the community often affect the type of program that is put into place, the curriculum that is offered, the language that is used, and the type of learning that takes place. By analyzing the history and policies of the institution, one can better understand the values that the community brings to the instruction and integration of ITAs, and can compare these values and policies with the needs and motivations of the students themselves.
Using archival research from 15 years of university senate documents, community and university publications, interviews, and other university and departmental public documents, this study examines the history of policies and attitudes toward ITA training at a large Midwestern university. The researcher investigates 1) what policies are in place for the instruction of ITAs, 2) how these policies have evolved over time, 3) what factors have influenced the implementation of these policies, and 4) how these policies reflect the values and attitudes of the community. The analysis reveals a shift from punitive policies that view ITAs simply in terms of their language deficiency to more positive policies that focus on the globalization and internationalization of the campus.
Studies on lexical coverage of spoken language (Adolphs & Schmitt, 2003) and on the role of short-term memory in lexical acquisition (Ellis & Sinclair, 1996) have shown that a focus on vocabulary development and lexical sequence acquisition for L2 learners is essential for developing fluency, especially in oral-based language programs.
According to Ellis and Sinclair, a focus on lexical development should include familiarizing the learner with word families and lexical sequences, followed by repetitive oral practice to establish the material in short-term memory, use of the lexis in meaningful contexts, and frequent periodic recycling of content and practices to promote long-term learning.
This paper will describe initial attempts to apply information from these and other studies on vocabulary development (Schmitt & Meara, 1997), lexical sequencing, and corpus linguistics (Simpson & Mendis, 2003) in the design of a lexically-focused instructional component for developing the oral fluency of intermediate to advanced ESL learners who are graduate student teaching assistants-in-training. The curriculum includes the use of lexical notebooks (Schmitt & Schmitt, 1995), and subsequent recycling activities designed to give students multiple opportunities in the instructional setting to take in and use targeted chunks of language.
A key question for any lexically-based program is how to choose content. For the instructional component discussed here, some content, including words, idioms, and collocations, was chosen from word frequency lists and the MICASE corpus. Because learners have differing academic, professional, and personal situations and needs, each student also added individualized material to his or her lexical notebook beyond the instructor-chosen material: items that are difficult for the learner to pronounce because of unfamiliar phonological structure, items that represent unfamiliar concepts, critical vocabulary associated with the student's academic, professional, and personal domains, and unfamiliar colloquial expressions.
Despite the high scores international students often receive on the TOEFL or GRE, these same students can be challenged by the oral communication requirements associated with employment as teaching assistants (TAs) in North American universities. Previous analyses of international teaching assistant (ITA) discourse have shown that listeners' interpretation of discourse is influenced not only by pronunciation but also by discourse-level patterns of language use, e.g., the use of discourse markers. Researchers agree that discourse markers play an active role in discourse coherence; however, they argue that the use and misuse of different types of discourse markers have differential impact upon the perceived comprehensibility of the discourse. This paper examines four types of conjunctions in order to investigate whether and to what extent the use of conjunctions can be argued to influence scores on a semi-direct test used for screening the oral proficiency of prospective ITAs. Responses by 40 Chinese examinees across 4 levels of proficiency and 10 native English speakers to the "Compare and Contrast" item on the test (OEPT) were examined and coded for the use and "misuse" of conjunctions. The total number of words produced by individual speakers ranged from 89 to 306. The Chinese examinees were found to use conjunctions in some unexpected ways: the overextended use of the simple coordinating AND to signal relations that should be cued by SO or BUT, the inconsistent use of temporal conjunctions, and the unexpected collocation of conjunctions are characteristic of the nonnative responses. Native speakers' use of AND was often accompanied by parallel grammatical structures, a combination of coordination and syntactic features absent from the nonnative speakers' responses. The characteristics of the ITAs' use of conjunctions can be argued to contribute to the distribution of scores.
Traditionally, the term test preparation triggers the idea of drilling, teaching testing strategies and tricks, and focusing on test items. In the area of large-scale standardized testing, test preparation has been investigated only in terms of its effects on test scores and test validity. In education, test preparation is viewed in terms of washback, the influence of tests on teaching. Even though test preparation and washback are believed to have negative influences on teaching and testing, the term washing forward (Pearson, 1988) was coined to account for their benefits. In this paper, test preparation is viewed not as a score-boosting method but rather as an instructional tool that enhances learning.
In August 2002, Purdue University developed a set of electronic preparatory materials for Purdue's Oral English Test (POET), known as the POET Tutorial. The purpose of the POET Tutorial is to direct prospective examinees' attention to the components of the construct being measured, rather than to the particular items used in the test. Even though POET's test format and method are designed to elicit adequate criterion performances, the Tutorial expands, justifies, and completes the connection between the test methods and the construct.
The present study was designed to analyze (1) the relationship between the use of the POET Tutorial and examinees' knowledge of the testing context, the testing method, and the consequences of the test, and (2) examinees' opinions about the usefulness of the set of preparatory materials included in the POET Tutorial. Two questionnaires were administered to 600 examinees during the academic years 2002-04. Most examinees (92%) found the Tutorial helpful. Examinees who used the Tutorial were significantly more likely to report that they were familiar with the testing context, consequences, and methods.
Idioms are considered to be stored in and retrieved from memory rather than composed anew at the time of use, on the grounds that they are "lexicalized," "institutionalized," and "non-compositional." The literature on phraseological units, formulaic language, and fixed expressions including idioms (FEIs) treats idioms as a subset of prefabs.
This study examines the hypothesis that there is a correlation between the amount of silent pausing and the number of idioms used in spontaneous speech by ESL speakers, based on the presence of pauses and the use of idioms in examinee responses to Purdue's Oral English Test (POET). The idioms examined are the 97 idioms identified as the most frequently used in corpus research on spoken American English. Silent pauses are defined as silences longer than 0.2 seconds. In this pilot, 24 examinees' responses to the POET "Compare and Contrast" item were analyzed to address the following research question: What is the correlation between total silent pause time as a percent of total delivery time and the total number of idioms as a percent of the total number of words in the delivery?
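The two measures and the correlation named in the research question can be sketched as follows; this is a minimal illustration with invented numbers, not data or code from the study, and the function names are hypothetical.

```python
# Sketch of the two measures (pause time as % of delivery time, idioms as
# % of words) and their Pearson correlation. All figures are invented.

def pause_percent(pause_durations, total_time):
    """Total silent pause time as a percent of total delivery time.
    Only silences longer than 0.2 s count as silent pauses."""
    return 100 * sum(d for d in pause_durations if d > 0.2) / total_time

def idiom_percent(idiom_count, word_count):
    """Idiom tokens as a percent of total words in the delivery."""
    return 100 * idiom_count / word_count

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical examinees: (pause durations in s, total time in s, idioms, words)
examinees = [
    ([0.5, 0.3, 1.1, 0.15], 60.0, 2, 150),
    ([0.25, 0.9], 55.0, 4, 210),
    ([1.4, 0.6, 0.8, 0.3], 62.0, 1, 120),
]
pause_pcts = [pause_percent(p, t) for p, t, _, _ in examinees]
idiom_pcts = [idiom_percent(i, w) for _, _, i, w in examinees]
print(pearson_r(pause_pcts, idiom_pcts))
```

A negative coefficient would be consistent with the hypothesized trade-off between pausing and idiom use; the actual study's analysis may of course differ.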
Performance-based tests of oral language proficiency are recognized as important for measuring certain aspects of language ability. However, the establishment of valid and reliable rating procedures poses challenges for practitioners and researchers. The purpose of this paper is to analyze the rater-training procedure for Purdue's Oral English Test (POET), an oral English screening method for international teaching assistants at Purdue University. Seven raters-in-training completed POET's 10-hour self-administered rater-training program. During the program, the raters-in-training were introduced to the five-level holistic scale. Besides the traditional band descriptors, the scale contained performance descriptors based on experienced raters' perceptions. The score bands were introduced one at a time (starting from the highest), and each band was closely compared to the characteristics of the adjacent higher or lower band. The raters-in-training were encouraged to focus on different aspects of the speaking performances: organization and coherence, articulation, pronunciation, grammar and syntax, delivery, and vocabulary range. During practice rating, the raters-in-training were asked to assign and justify their scores using perceptive descriptions of examinees' performances. Percentages of agreement between the raters-in-training's scores and the original scores were calculated. The raters-in-training's perceptive descriptors were analyzed in terms of frequency of variable perception, frequency of use of performance characteristics presented in the scale descriptors, and frequency and type of perceptive descriptions not given in the scale. Results suggest that the raters-in-training reached high percentages of score agreement even though they used different characteristics in their performance descriptions.
The rater-training procedure analyzed in this paper not only informs raters-in-training about the rating scale; it also gives trainers an opportunity to follow, from the very beginning, raters' conceptualization of the construct, their interpretation of the scale, and their perception of performance variables. Well-trained raters are an essential component of performance-based testing because they are one of the main factors that influence examinees' scores. If raters are trained to use the scale effectively, score variance due to rating error may be minimized.
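The percent-agreement calculation described above can be sketched in a few lines; the scores below are invented for illustration, and "exact" versus "adjacent" agreement is an assumed distinction commonly made in rater-agreement work, not a detail confirmed by the abstract.

```python
# Minimal sketch of percent-agreement between trainee scores and original
# scores on a five-level holistic scale. All scores are invented.

def exact_agreement(trainee_scores, original_scores):
    """Percent of ratings identical to the original score."""
    matches = sum(t == o for t, o in zip(trainee_scores, original_scores))
    return 100 * matches / len(original_scores)

def adjacent_agreement(trainee_scores, original_scores):
    """Percent of ratings within one band of the original score."""
    matches = sum(abs(t - o) <= 1 for t, o in zip(trainee_scores, original_scores))
    return 100 * matches / len(original_scores)

original = [5, 4, 3, 3, 2, 4, 5, 1]
trainee  = [5, 4, 3, 2, 2, 5, 5, 1]
print(exact_agreement(trainee, original))     # -> 75.0
print(adjacent_agreement(trainee, original))  # -> 100.0
```

Reporting both metrics helps separate raters who miss bands entirely from those who drift by a single band, which is a much milder form of disagreement on a five-level scale.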