The Effect of the Number of Answer Choices on the Psychometric Properties of Stress Measurement in an Instrument Applied to Children
DOI:
https://doi.org/10.35670/1667-4545.v12.n1.4694
Keywords:
response format, IRT, psychometric properties, stress perception in children
Abstract
The main objective of this study was to use Item Response Theory (IRT) models to measure the effect of the number of response options on the psychometric properties of a test measuring stress in children. We applied the 30-item Child Stress Perception Inventory (CSPI) to 583 children; the items were presented with different numbers of response alternatives (3, 5, or 7). We studied whether the resulting forms measure the same trait and whether the response categories of the same items are equivalent across forms. As evidence of validity, we present measures that examine the internal structure of the instrument and its relationship with other variables. The results indicate that the three forms measure the same trait, but that the categories are not equivalent. The form with 7 response alternatives shows the best fit; however, validity in relation to other variables is best for the form with 5 response alternatives, which also performs best in terms of reliability and information.
License
Copyright (c) 2012 Fabiola González-Betanzos, Iwin Leenen, Jennifer Lira-Mandujano, Zaira Vega-Valero
This work is licensed under a Creative Commons Attribution 4.0 International License.
Revista Evaluar applies the Creative Commons Attribution License (CCAL). Under this license, authors retain copyright of their articles but allow anyone to download and distribute the articles published in Evaluar without requiring permission from the author or publisher. The only condition is that the authors and the original source of publication (i.e., Evaluar) are cited in every case. Submitting articles to Evaluar and reading them is completely free of charge.