Performances of MIMIC and Logistic Regression Procedures in Detecting DIF


In this study, the differential item functioning (DIF) detection performance of the multiple indicators, multiple causes (MIMIC) and logistic regression (LR) methods for dichotomous data was investigated. The two methods were compared by calculating Type I error rates and power under each simulation condition. The manipulated conditions were: sample size (2,000 and 4,000 respondents), ability distribution of the focal group [N(0, 1) and N(-0.5, 1)], and the percentage of items with DIF (10% and 20%). The ability distribution of respondents in the reference group [N(0, 1)], the ratio of focal group to reference group (1:1), test length (30 items), and the between-group difference in difficulty parameters for the DIF items (0.6) were held constant. When the two methods were compared on Type I error rates, the change in sample size had a greater effect on the MIMIC method, whereas the change in the percentage of DIF items had a greater effect on LR. When the two methods were compared on power, the most influential variable for both was sample size.
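The LR procedure compared above models the probability of a correct response as a function of ability, group membership, and their interaction, and flags an item for DIF when adding the group terms significantly improves model fit (a likelihood-ratio test with 2 degrees of freedom). The sketch below illustrates one replication under the study's conditions (focal group N(-0.5, 1), 1:1 group ratio, uniform DIF of 0.6 in difficulty). For simplicity it matches on the simulated true ability rather than the total test score used operationally, and the pure-Python gradient-ascent fitter and all names are illustrative assumptions, not the authors' code.

```python
import math
import random

random.seed(42)

def fit_logistic(X, y, step=0.5, iters=600):
    """Fit a logistic regression by gradient ascent on the mean
    log-likelihood; returns (weights, log-likelihood)."""
    k, n = len(X[0]), len(X)
    w = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(w, xi))))
            for j in range(k):
                grad[j] += (yi - p) * xi[j]
        w = [wj + step * gj / n for wj, gj in zip(w, grad)]
    ll = 0.0
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(w, xi))))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return w, ll

# Simulate one dichotomous item with uniform DIF of 0.6 logits:
# reference group ability ~ N(0, 1), focal group ~ N(-0.5, 1).
data = []
for group, mean in ((0, 0.0), (1, -0.5)):            # 1:1 group ratio
    difficulty = 0.0 + (0.6 if group == 1 else 0.0)  # item harder for focal
    for _ in range(1000):
        theta = random.gauss(mean, 1.0)
        p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
        data.append((theta, group, 1 if random.random() < p else 0))

y = [resp for _, _, resp in data]
# Compact model: intercept + ability (the matching variable).
X0 = [[1.0, th] for th, _, _ in data]
# Augmented model: + group + ability-by-group (uniform + non-uniform DIF).
X1 = [[1.0, th, float(g), th * g] for th, g, _ in data]

_, ll0 = fit_logistic(X0, y)
_, ll1 = fit_logistic(X1, y)
G2 = 2.0 * (ll1 - ll0)  # likelihood-ratio statistic, df = 2
print(f"G2 = {G2:.2f}; flag DIF if G2 > 5.99 (alpha = .05, df = 2)")
```

Repeating this over many replications and counting how often DIF-free items are flagged yields the Type I error rate, while the flag rate on items simulated with DIF yields power, which is how the two methods are compared across conditions in the study.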
