International Education Journal

Examining the Validity of Different Assessment Modes in Measuring Competence in Performing Human Services


Hungi Njora
School of Education, Flinders University of South Australia

I Gusti Ngurah Darmawan
School of Education, Flinders University of South Australia

John P. Keeves
School of Education, Flinders University of South Australia
john.keeves@flinders.edu.au

Abstract

This article addresses an important problem that faces educators in assessing students' competence levels in learned tasks.

Data from 165 students from Massachusetts and Minnesota in the United States are used to examine the validity of five assessment modes (multiple choice test, scenario, portfolio, self-assessment and supervisor rating) in measuring competence in the performance of 12 human service skills. The data are examined using two analytical theories, item response theory (IRT) and generalizability theory (GT). In addition, a prior but largely unprofitable examination using classical test theory (CTT) was undertaken.

Under the IRT approach with Rasch scaling procedures, the results show that the scores obtained using the five assessment modes can be measured on a single underlying scale, but there is better fit of the model to the data if five scales (corresponding to the five assessment modes) are employed. In addition, under Rasch scaling procedures, the results show that, in general, the correlations between the scores of the assessment modes vary from small to very strong (0.11 to 0.80). However, based on the GT approach and hierarchical linear modelling (HLM) analytical procedures, the results show that the correlations between scores from the five assessment modes are consistently strong to very strong (0.53 to 0.95). It is argued that the correlations obtained with the GT approach provide a better picture of the relationships between the assessment modes when compared to the correlations obtained under the IRT approach because the former are computed taking into consideration the operational design of the study.

Results from both the IRT and GT approaches show that the mean values of scores from supervisors are considerably higher than the mean values of scores from the other four assessment modes, which indicates that supervisors tend to be more generous in rating the skills of their students.

Keywords: item response theory, generalizability theory, classical test theory, self assessment, portfolio assessment, supervisor scaling, scenario assessment, competences, measurement

Hungi, N., Darmawan, I.G.N. and Keeves, J.P. (2004) Examining the Validity of Different Assessment Modes in Measuring Competence in Performing Human Services. International Education Journal, 5 (2), 154-175.
http://iej.cjb.net

All text and graphics © 1999-2004 Shannon Research Press