Are medical school preclinical tests biased for sex and race? A differential item functioning analysis

Abstract Background A common practice in assessment development, fundamental for fairness and consequently the validity of test score interpretations and uses, is to ascertain whether test items function equally across test-taker groups. Accordingly, we conducted differential item functioning (DIF)...

Bibliographic Details
Main Authors: Esther Dasari Dale, Mohammed A. A. Abulela, Hao Jia, Claudio Violato
Format: Article
Language: English
Published: BMC 2025-01-01
Series: BMC Medical Education
Subjects: Differential item functioning; Test validity; Race and sex bias; Psychometric analysis
Online Access: https://doi.org/10.1186/s12909-024-06540-6
collection DOAJ
description Abstract
Background: A common practice in assessment development, fundamental for fairness and consequently for the validity of test score interpretations and uses, is to ascertain whether test items function equally across test-taker groups. Accordingly, we conducted differential item functioning (DIF) analysis, a psychometric procedure for detecting potential item bias, for three preclinical medical school foundational courses based on students’ sex and race.
Methods: The sample included 520, 519, and 344 medical students for anatomy, histology, and physiology, respectively, with data collected from 2018 to 2020. To conduct the DIF analysis, we used the Wald test based on the two-parameter logistic model as implemented in the IRTPRO software.
Results: In the three assessments, as many as one-fifth of the items functioned statistically differently for sex, race, or both: 10 of 49 items (20%), 6 of 40 items (15%), and 5 of 45 items (11%) showed statistically significant DIF for the anatomy, histology, and physiology courses, respectively. Measurement specialists and subject matter experts independently reviewed the items to identify construct-irrelevant factors as potential sources of DIF, as shown in Appendix A. Most of the identified items were poorly written or had unclear images.
Conclusions: The validity of score-based inferences, particularly for group comparisons, requires test items to function equally across test-taker groups. In the present study, we found DIF in some items for sex and race in three content areas. The present approach should be applied in other medical schools to assess the generalizability of these findings. Item-level DIF analysis should also be routinely conducted as part of psychometric analyses for basic science courses and other assessments.
Clinical trial number: Not applicable.
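The Wald test named in the Methods can be illustrated with a minimal sketch. This is not the authors' IRTPRO procedure: it assumes the two-parameter logistic (2PL) item parameters have already been estimated separately for a reference and a focal group, placed on a common scale, and treated as independent, so that their covariance matrices add. The function names and the example parameter values are hypothetical.

```python
import math


def p_correct_2pl(theta, a, b):
    """2PL probability of a correct response for ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def wald_dif_2pl(est_ref, cov_ref, est_focal, cov_focal):
    """Wald chi-square (df = 2) for equal (a, b) across two groups.

    est_*: (a, b) estimates; cov_*: 2x2 covariance matrices.
    Assumes independent group estimates on a common scale.
    """
    # Difference between focal and reference estimates.
    d0 = est_focal[0] - est_ref[0]
    d1 = est_focal[1] - est_ref[1]
    # Covariance of the difference is the sum of the two covariances.
    s00 = cov_ref[0][0] + cov_focal[0][0]
    s01 = cov_ref[0][1] + cov_focal[0][1]
    s11 = cov_ref[1][1] + cov_focal[1][1]
    det = s00 * s11 - s01 * s01
    # Quadratic form d' S^{-1} d, written out for the 2x2 case.
    chi2 = (d0 * d0 * s11 - 2.0 * d0 * d1 * s01 + d1 * d1 * s00) / det
    # Chi-square survival function with 2 degrees of freedom.
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


# Hypothetical item: same discrimination, difficulty shifted by 0.5.
chi2, p_value = wald_dif_2pl(
    est_ref=(1.2, 0.0), cov_ref=((0.02, 0.0), (0.0, 0.01)),
    est_focal=(1.2, 0.5), cov_focal=((0.02, 0.0), (0.0, 0.01)),
)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

In this made-up example the item is equally discriminating in both groups but harder for the focal group, and the test flags it at the conventional 0.05 level. A flagged item is only a candidate for bias; as the abstract notes, expert review is still needed to identify construct-irrelevant sources.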
id doaj-art-8afac380f79c49bea6d58165703843de
institution Kabale University
issn 1472-6920
spelling doaj-art-8afac380f79c49bea6d58165703843de
2025-02-02T12:29:32Z
BMC Medical Education, volume 25, issue 1, pages 1-8 (2025-01-01)
Author affiliation (all four authors): University of Minnesota Medical School
topic Differential item functioning
Test validity
Race and sex bias
Psychometric analysis