Criterion-Referenced Language Testing

Cambridge University Press, May 20, 2002 - Education - 320 pages
Over the past decade, criterion-referenced testing (CRT) has emerged as an important issue in language assessment. Most language testing books have hitherto focused almost exclusively on norm-referenced testing, whereby test takers' scores are interpreted with reference to the performance of other test takers, and have ignored CRT, an approach that examines the level of knowledge of a specific domain of target behaviours. This book is designed to comprehensively address the wide variety of CRT and decision-making needs that more and more language-teaching professionals face in their daily work. Criterion-Referenced Language Testing is the first volume to create a nexus between the theoretical constructs and practical applications of this new area of language testing.
 


Contents

Criterion-referenced tests are different (p. 9)
The place of CRTs in language testing theory and research (p. 14)
What is language proficiency? (p. 16)
What problems do CRT developers face? (p. 25)
2 (p. 28)
A closer look at objectives and criterion-referenced testing (p. 36)
Performance objectives (p. 39)
Experiential objectives (p. 46)

Format confoundings (p. 62)
Self-assessment (p. 64)
reading listening grammar knowledge and phonemic (p. 69)
Constructed-response items (p. 71)
Personal-response items (p. 78)
Criterion-referenced item format analysis (p. 86)
Improving the specifications (p. 95)
Item quality and content analyses (p. 98)
4 (p. 101)
Description of CRT score distributions (p. 102)
Table 4.1 Calculating the mean for a set of CRT (p. 107)
Understanding numerical descriptions (p. 111)
Difference index (p. 120)
all of those who failed the test answered item 3 (p. 126)
Criterion-referenced item selection (p. 127)
Item response theory and CRT (p. 128)
The one-parameter model (p. 131)
The two-parameter model (p. 132)
The three-parameter model (p. 133)
2 (p. 138)
George Helga Dan Pat (p. 140)
+ (p. 147)
5 (p. 149)
consistency reliability and (p. 150)
NRT reliability (p. 151)
A note on correlation (p. 152)
Reliability (p. 153)
Figure 5.1 Plot of listening by reading (p. 156)
Table 5.3 Correlation matrix for the (p. 159)
Equivalent forms reliability (p. 163)
Threshold-loss agreement methods (p. 169)
Masters (p. 171)
Table 5.10 Calculating estimated variance components for persons (p. 179)
z2 (p. 181)
Thus phi is the ratio of the persons variance o (p. 185)
Useful relationships among reliability and dependability (p. 198)
Local independence (p. 206)
Model to data fit (p. 207)
6 (p. 212)
Content validity (p. 213)
Expert judgments approach to content validity (p. 220)
Construct validity (p. 225)
PreB (p. 226)
Differential-groups construct validity studies (p. 230)
Expanded views of validity (p. 240)
Cronbach's perspectives on questions about validity (p. 246)
Making decisions with criterion-referenced tests (p. 248)
Are traditional cutpoints and grading on a curve justified? (p. 249)
Are cutpoints necessarily arbitrary? (p. 251)
What is standards setting? (p. 253)
acceptable (p. 261)
What is the relationship between standards and (p. 264)
How are validity and criterion-referenced decision making (p. 265)
7 (p. 269)
Team development of CRTs (p. 270)
Marshaling adequate resources (p. 275)
Counterbalancing criterion-referenced forms (p. 277)
Who should get feedback? (p. 279)
Interpreting gain scores (p. 288)
Difficulties in reporting criterion-referenced results (p. 291)
References (p. 292)
Index (p. 310)
1 (p. 1)
Some useful definitions (p. 2)
Criterion-referenced tests (p. 3)
Differences and similarities between NRTs and CRTs in (p. 6)
Fitting assessment types to curriculum (p. 47)
The role of feedback (p. 49)
3 (p. 56)
Some useful definitions (p. 57)
Table 3.1 Linguistic and format confoundings (p. 61)


Popular passages

Page 33 - And the Gileadites took the passages of Jordan before the Ephraimites: and it was so, that when those Ephraimites which were escaped said, Let me go over; that the men of Gilead said unto him, Art thou an Ephraimite? If he said, Nay; Then said they unto him, Say now Shibboleth: and he said Sibboleth: for he could not frame to pronounce it right.
Page 35 - A criterion-referenced test is one that is deliberately constructed to yield measurements that are directly interpretable in terms of specified performance standards. Performance standards are generally specified by defining a class or domain of tasks that should be performed by the individual.
Page 272 - Messick (1989a:13) sums it up this way: "validity is an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment".
Page 123 - They are signs, and they do no more than denote the objects to which they are attached. What we call a symbol is a term, a name, or even a picture that may be familiar in daily life, yet that possesses specific connotations in addition to its conventional and obvious meaning. It implies something vague, unknown, or hidden from us. Many Cretan monuments, for instance, are marked with the design of the double adze. This is an object that we know, but we do not know its symbolic implications. For another...
Page 59 - ... information lies in the standard used as a reference. The standard against which a student's performance is compared in order to obtain the first kind of information is the criterion behavior which defines subject matter competence.
Page 53 - ... an approach requiring an integrated, facile performance on the part of the examinee. It is conceivable that knowledge could exist without facility. If we limit ourselves to testing only one point at a time, more time is ordinarily allowed for reflection than would occur in a normal communication situation, no matter how rapidly the discrete items are presented. For this reason I recommend tests in which there is less attention paid to specific structure points or lexicon than to the total communicative...
Page 27 - Statistical Tables for Biological, Agricultural and Medical Research, by RA Fisher and F. Yates (Oliver and Boyd, London, 1938).
Page 35 - Criterion-referenced measures indicate the content of the behavioral repertory, and the correspondence between what an individual does and the underlying continuum of achievement. Measures which assess student achievement in terms of a criterion standard thus provide information as to the degree of competence attained by a particular student which is independent of reference to the performance of others.
Page 35 - Along such a continuum of attainment, a student's score on a criterion-referenced measure provides explicit information as to what the individual can or cannot do. Criterion-referenced measures indicate the content of the behavioral repertory, and the correspondence between what an individual does and the underlying continuum of achievement.
Page 13 - Hudson, T. D. (1989b). Measurement approaches in the development of functional ability level language tests: norm-referenced, criterion-referenced, and item response theory decisions. Unpublished PhD dissertation. University of California at Los Angeles. Hudson, T., Detmer, E., & Brown, J. D. (1992).
