User profiles for Javier Alvarez-Valle

Javier Alvarez Valle

Microsoft Research
Cited by 427

Making the most of text semantics to improve biomedical vision–language processing

…, T Naumann, A Nori, J Alvarez-Valle… - European conference on …, 2022 - Springer
Multi-modal data abounds in biomedicine, such as radiology images and reports. Interpreting
this data at scale is essential for improving clinical care and accelerating clinical research. …

Learning to exploit temporal structure for biomedical vision-language processing

…, MP Lungren, A Nori, J Alvarez-Valle… - Proceedings of the …, 2023 - openaccess.thecvf.com
Self-supervised learning in vision-language processing (VLP) exploits semantic alignment
between imaging and text modalities. Prior work in biomedical VLP has mostly relied on the …

Active label cleaning for improved dataset quality under resource constraints

…, MP Lungren, A Nori, B Glocker, J Alvarez-Valle… - Nature …, 2022 - nature.com
Imperfections in data annotation, known as label noise, are detrimental to the training of
machine learning models and have a confounding effect on the assessment of model …

Evaluation of deep learning to augment image-guided radiotherapy for head and neck and prostate cancers

…, K O'Hara, C Bishop, J Alvarez-Valle… - JAMA network …, 2020 - jamanetwork.com
Importance: Personalized radiotherapy planning depends on high-quality delineation of target
tumors and surrounding organs at risk (OARs). This process puts additional time burdens …

Multimodal healthcare AI: identifying and designing clinically relevant vision-language applications for radiology

…, O Oktay, M Lungren, J Alvarez-Valle… - Proceedings of the CHI …, 2024 - dl.acm.org
Recent advances in AI combine large language models (LLMs) with vision encoders that
bring forward unprecedented technical capabilities to leverage for a wide range of healthcare …

Exploring the Boundaries of GPT-4 in Radiology

…, AV Nori, MP Lungren, O Oktay, J Alvarez-Valle - arXiv preprint arXiv …, 2023 - arxiv.org
The recent success of general-domain large language models (LLMs) has significantly
changed the natural language processing paradigm towards a unified foundation model across …

Compositional zero-shot domain transfer with text-to-text models

…, A Nori, H Poon, J Alvarez-Valle… - Transactions of the …, 2023 - direct.mit.edu
Label scarcity is a bottleneck for improving task performance in specialized domains. We
propose a novel compositional transfer learning framework (DoT5) for zero-shot domain …

RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision

…, N Codella, SL Hyland, J Alvarez-Valle… - arXiv preprint arXiv …, 2024 - arxiv.org
Language-supervised pre-training has proven to be a valuable method for extracting
semantically meaningful features from images, serving as a foundational element in multimodal …

Enabling large-scale screening of Barrett's esophagus using weakly supervised deep learning in histopathology

…, A Thieme, A Nori, M Gehrung, J Alvarez-Valle - Nature …, 2024 - nature.com
Timely detection of Barrett’s esophagus, the pre-malignant condition of esophageal
adenocarcinoma, can improve patient survival rates. The Cytosponge-TFF3 test, a non-endoscopic …

MAIRA-1: A specialised large multimodal model for radiology report generation

…, MT Wetscherek, O Oktay, J Alvarez-Valle - arXiv preprint arXiv …, 2023 - arxiv.org
We present a radiology-specific multimodal model for the task of generating radiological
reports from chest X-rays (CXRs). Our work builds on the idea that large language model(s) …