User profiles for Javier Alvarez-Valle
Javier Alvarez Valle, Microsoft Research. Verified email at microsoft.com. Cited by 427
Making the most of text semantics to improve biomedical vision–language processing
Multi-modal data abounds in biomedicine, such as radiology images and reports. Interpreting
this data at scale is essential for improving clinical care and accelerating clinical research. …
Learning to exploit temporal structure for biomedical vision-language processing
Self-supervised learning in vision-language processing (VLP) exploits semantic alignment
between imaging and text modalities. Prior work in biomedical VLP has mostly relied on the …
Active label cleaning for improved dataset quality under resource constraints
Imperfections in data annotation, known as label noise, are detrimental to the training of
machine learning models and have a confounding effect on the assessment of model …
Evaluation of deep learning to augment image-guided radiotherapy for head and neck and prostate cancers
…, K O'Hara, C Bishop, J Alvarez-Valle… - JAMA network …, 2020 - jamanetwork.com
Importance Personalized radiotherapy planning depends on high-quality delineation of target
tumors and surrounding organs at risk (OARs). This process puts additional time burdens …
Multimodal healthcare AI: identifying and designing clinically relevant vision-language applications for radiology
Recent advances in AI combine large language models (LLMs) with vision encoders that
bring forward unprecedented technical capabilities to leverage for a wide range of healthcare …
Exploring the Boundaries of GPT-4 in Radiology
The recent success of general-domain large language models (LLMs) has significantly
changed the natural language processing paradigm towards a unified foundation model across …
Compositional zero-shot domain transfer with text-to-text models
Label scarcity is a bottleneck for improving task performance in specialized domains. We
propose a novel compositional transfer learning framework (DoT5) for zero-shot domain …
RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision
Language-supervised pre-training has proven to be a valuable method for extracting
semantically meaningful features from images, serving as a foundational element in multimodal …
Enabling large-scale screening of Barrett's esophagus using weakly supervised deep learning in histopathology
Timely detection of Barrett’s esophagus, the pre-malignant condition of esophageal
adenocarcinoma, can improve patient survival rates. The Cytosponge-TFF3 test, a non-endoscopic …
MAIRA-1: A specialised large multimodal model for radiology report generation
We present a radiology-specific multimodal model for the task of generating radiological
reports from chest X-rays (CXRs). Our work builds on the idea that large language model(s) …