Abstract
While artificial intelligence (AI) algorithms hold great potential for improving health and reducing health disparities, biased AI algorithms can negatively impact the health of under-resourced communities or racial/ethnic minority populations. Our study highlights the major role of socioeconomic status (SES) in AI algorithm bias and in the (in)completeness of electronic health records (EHRs) data, which are commonly used for algorithm development. Understanding the extent to which SES affects algorithmic bias, and the pathways through which it does so, such as differential (in)completeness of EHRs, will be important for assessing and mitigating algorithmic bias. Despite its importance, the role of SES is currently under-recognized and under-studied in the AI fairness literature, largely because objective and scalable individual-level SES measures are frequently unavailable in commonly used data sources such as EHRs. We addressed this challenge by applying a validated individual-level socioeconomic measure, the HOUSES index. This tool allows AI researchers to assess algorithmic bias due to SES. Although our study used a cohort with a relatively small sample size, the results highlight a novel conceptual strategy for quantifying AI bias by SES.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
This work was supported by the National Institutes of Health (NIH) through an R01 grant (R01 HL126667) and two R21 grants (R21AG65639-01A1 and R21AI142702).
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was approved by the Mayo Clinic Institutional Review Board (IRB).
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Footnotes
Sources of Support/Funding and their role: This work was supported by the National Institutes of Health (NIH) through an R01 grant (R01 HL126667) and two R21 grants (R21AG65639-01A1 and R21AI142702).
Financial Disclosure: Young J. Juhn is Principal Investigator (PI) of the Respiratory Syncytial Virus incidence study supported by GlaxoSmithKline; that study has no relationship to the presented work.
Conflicts of interest: The authors declare no conflict of interest pertaining to the presented work.
Data Sharing Statement: The datasets generated and/or analyzed during the current study are not publicly available as they include protected health information. Access to data could be discussed per the institutional policy after approval of the IRB at Mayo Clinic.
Abbreviations
- AE: Asthma exacerbation
- AI: Artificial Intelligence
- EHRs: Electronic health records
- FN: False negatives
- FP: False positives
- GBM: Gradient Boosting Machine
- HOUSES: HOUsing-based SocioEconomic Status measure
- ML: Machine Learning
- NB: Naïve Bayes
- NAEPP: National Asthma Education and Prevention Program
- PAC: Predetermined Asthma Criteria
- SDH: Social Determinants of Health
- SES: Socioeconomic status
- TN: True negatives
- TP: True positives