A Comparison of ChatGPT-4 Vision’s Diagnostic Accuracy for Inpatient Skin Conditions in White and Skin of Color Patients

Joseph McGrath
Evelyn Fagan
Graham Grisedale
Jesse Hirner
Erin X Wei

Keywords

Artificial Intelligence, ChatGPT, ChatGPT-4V, clinical images, inpatient dermatology, skin of color

Abstract

This study assessed the ability of ChatGPT-4 Vision (ChatGPT-4V) to diagnose inpatient skin conditions in white and skin of color (SOC) individuals based solely on images. For each of the fifteen most common inpatient dermatologic conditions in the United States, five images of white individuals and five images of SOC individuals were selected from VisualDx, dermatology textbooks, and dermatology journals, giving 75 images per group. ChatGPT-4V was graded on whether it correctly diagnosed the dermatologic condition and on whether it listed the correct diagnosis among its top three differential diagnoses. ChatGPT-4V correctly diagnosed a greater percentage of images depicting white individuals (43/75, 57.3%) than images depicting SOC individuals (32/75, 42.7%), though this difference was not statistically significant (P=0.103). Accuracy increased when the top three differential diagnoses were counted, but it did not exceed 75% for either white or SOC images. Accuracy also varied by condition. These findings suggest that ChatGPT-4V cannot be used as a reliable diagnostic tool for inpatient skin conditions in white or SOC individuals.
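The reported proportions and the non-significant difference can be checked from the counts alone. The sketch below is illustrative only: the abstract does not state which statistical test was used, so the choice of a chi-square test with Yates' continuity correction is an assumption, as are the function and variable names. Under that assumption the P value comes out close to the reported 0.103.

```python
import math

def yates_chi_square_2x2(a, b, c, d):
    """Return (chi2, p) for the 2x2 table [[a, b], [c, d]].

    Uses Yates' continuity correction. This is an assumed test choice;
    the abstract does not name the test behind P=0.103.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity-corrected numerator, clamped so it never goes negative.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts taken from the abstract: correct diagnoses out of 75 images per group.
white_correct, white_total = 43, 75
soc_correct, soc_total = 32, 75

white_acc = white_correct / white_total
soc_acc = soc_correct / soc_total

chi2, p = yates_chi_square_2x2(
    white_correct, white_total - white_correct,
    soc_correct, soc_total - soc_correct,
)

print(f"White accuracy: {white_acc:.1%}")
print(f"SOC accuracy:   {soc_acc:.1%}")
print(f"Chi-square (Yates): {chi2:.3f}, P = {p:.3f}")
```

A different test, such as Fisher's exact test or an uncorrected chi-square, would give a somewhat different P value from the same counts.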
