User profiles for Jordi Armengol-Estapé

Jordi Armengol-Estapé

University of Edinburgh
Verified email at ed.ac.uk
Cited by 668

MarIA: Spanish language models

A Gutiérrez-Fandiño, J Armengol-Estapé… - arXiv preprint arXiv …, 2021 - arxiv.org
This work presents MarIA, a family of Spanish language models and associated resources
made available to the industry and the research community. Currently, MarIA includes …

Pretrained biomedical language models for clinical NLP in Spanish

…, A Gutiérrez-Fandiño, J Armengol-Estapé… - Proceedings of the …, 2022 - aclanthology.org
This work presents the first large-scale biomedical Spanish language models trained from
scratch, using large biomedical corpora consisting of a total of 1.1B tokens and an EHR …

Overview of Automatic Clinical Coding: Annotations, Guidelines, and Solutions for non-English Clinical Cases at CodiEsp Track of CLEF eHealth 2020

…, A Gonzalez-Agirre, J Armengol-Estapé… - CLEF (Working …, 2020 - ceur-ws.org
Clinical coding requires the analysis and transformation of medical narratives into a structured
or coded format using internationally recognized classification systems like ICD-10. These …

Biomedical and clinical language models for Spanish: On the benefits of domain-specific pretraining in a mid-resource scenario

CP Carrino, J Armengol-Estapé… - arXiv preprint arXiv …, 2021 - arxiv.org
This work presents biomedical and clinical language models for Spanish by experimenting
with different pretraining choices, such as masking at word and subword level, varying the …

Medical word embeddings for Spanish: Development and evaluation

…, M Krallinger, J Armengol-Estapé - Proceedings of the …, 2019 - aclanthology.org
Word embeddings are representations of words in a dense vector space. Although they are
not recent phenomena in Natural Language Processing (NLP), they have gained momentum …

Are multilingual models the best choice for moderately under-resourced languages? A comprehensive assessment for Catalan

J Armengol-Estapé, CP Carrino… - arXiv preprint arXiv …, 2021 - arxiv.org
Multilingual language models have been a crucial breakthrough as they considerably reduce
the need of data for under-resourced languages. Nevertheless, the superiority of language-…

Spanish language models

A Gutiérrez-Fandiño, J Armengol-Estapé… - arXiv preprint arXiv …, 2021 - academia.edu
This paper presents the Spanish RoBERTa-base and RoBERTa-large models, as well as
the corresponding performance evaluations. Both models were pre-trained using the largest …

On the multilingual capabilities of very large-scale English language models

J Armengol-Estapé, OG Bonet, M Melero - arXiv preprint arXiv:2108.13349, 2021 - arxiv.org
Generative Pre-trained Transformers (GPTs) have recently been scaled to unprecedented
sizes in the history of machine learning. These models, solely trained on the language …

Bind the gap: Compiling real software to hardware FFT accelerators

J Woodruff, J Armengol-Estapé, S Ainsworth… - Proceedings of the 43rd …, 2022 - dl.acm.org
Specialized hardware accelerators continue to be a source of performance improvement.
However, such specialization comes at a programming price. The fundamental issue is that of …

Predicting the evolution of COVID-19 mortality risk: A Recurrent Neural Network approach

…, A Gutiérrez-Fandiño, J Armengol-Estapé… - Computer Methods and …, 2023 - Elsevier
Background: In December 2020, the COVID-19 disease was confirmed in 1,665,775 patients
and caused 45,784 deaths in Spain. At that time, health decision support systems were …