medRxiv

Natural language processing for scalable feature engineering and ultra-high-dimensional confounding adjustment in healthcare database studies

Richard Wyss, Jie Yang, Sebastian Schneeweiss, Joseph M. Plasek, Li Zhou, Thomas Deramus, Janick G. Weberpals, Kerry Ngan, Theodore N. Tsacogianis, Kueiyu Joshua Lin
doi: https://doi.org/10.1101/2025.01.30.25321403
Richard Wyss
1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
PhD
  • For correspondence: rwyss{at}bwh.harvard.edu
Jie Yang
1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
PhD
Sebastian Schneeweiss
1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
MD, PhD
Joseph M. Plasek
2Division of General Internal Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
PhD
Li Zhou
2Division of General Internal Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
PhD
Thomas Deramus
1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
PhD
Janick G. Weberpals
1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
PhD
Kerry Ngan
1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
MS
Theodore N. Tsacogianis
1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
MS
Kueiyu Joshua Lin
1Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
3Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
MD, PhD

Article Information

doi: https://doi.org/10.1101/2025.01.30.25321403
History: January 31, 2025.
Copyright: The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Author Information

  1. Richard Wyss, PhD (1,*)
  2. Jie Yang, PhD (1)
  3. Sebastian Schneeweiss, MD, PhD (1)
  4. Joseph M. Plasek, PhD (2)
  5. Li Zhou, PhD (2)
  6. Thomas Deramus, PhD (1)
  7. Janick G. Weberpals, PhD (1)
  8. Kerry Ngan, MS (1)
  9. Theodore N. Tsacogianis, MS (1)
  10. Kueiyu Joshua Lin, MD, PhD (1,3)

Affiliations

  1. Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
  2. Division of General Internal Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
  3. Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA

* Address for correspondence: Richard Wyss, PhD, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, 1620 Tremont St., Suite 3030, Boston, MA 02120. Phone: (617) 278-0930; Fax: (617) 232-8602. Email: rwyss{at}bwh.harvard.edu
Posted January 31, 2025.
Subject Area

  • Health Informatics