User profiles for Asier Gutiérrez-Fandiño
Asier Gutiérrez-Fandiño, Walmart Global Tech. Verified email at walmart.com. Cited by 375
MarIA: Spanish language models
A Gutiérrez-Fandiño, J Armengol-Estapé… - arXiv preprint arXiv …, 2021 - arxiv.org
This work presents MarIA, a family of Spanish language models and associated resources
made available to the industry and the research community. Currently, MarIA includes …
Pretrained biomedical language models for clinical NLP in Spanish
This work presents the first large-scale biomedical Spanish language models trained from
scratch, using large biomedical corpora consisting of a total of 1.1 B tokens and an EHR …
Biomedical and clinical language models for Spanish: On the benefits of domain-specific pretraining in a mid-resource scenario
…, J Armengol-Estapé, A Gutiérrez-Fandiño… - arXiv preprint arXiv …, 2021 - arxiv.org
This work presents biomedical and clinical language models for Spanish by experimenting
with different pretraining choices, such as masking at word and subword level, varying the …
[PDF] Spanish language models
A Gutiérrez-Fandiño, J Armengol-Estapé… - arXiv preprint arXiv …, 2021 - academia.edu
This paper presents the Spanish RoBERTa-base and RoBERTa-large models, as well as
the corresponding performance evaluations. Both models were pre-trained using the largest …
Anticipating the Debate: Predicting Controversy in News with Transformer-based NLP
BC Figueras, A Gutiérrez-Fandiño… - … del Lenguaje Natural, 2023 - journal.sepln.org
Controversy is a social phenomenon that emerges when a topic generates large disagreement
among people. In the public sphere, controversy is very often related to news. Whereas …
[HTML] Predicting the evolution of COVID-19 mortality risk: A Recurrent Neural Network approach
…, A Gonzalez-Agirre, A Gutiérrez-Fandiño… - Computer Methods and …, 2023 - Elsevier
Background: In December 2020, the COVID-19 disease was confirmed in 1,665,775 patients
and caused 45,784 deaths in Spain. At that time, health decision support systems were …
Spanish biomedical crawled corpus: A large, diverse dataset for Spanish biomedical language models
…, OG Bonet, A Gutiérrez-Fandiño… - arXiv preprint arXiv …, 2021 - arxiv.org
We introduce CoWeSe (the Corpus Web Salud Español), the largest Spanish biomedical
corpus to date, consisting of 4.5GB (about 750M tokens) of clean plain text. CoWeSe is the …
Spanish legalese language model and corpora
A Gutiérrez-Fandiño, J Armengol-Estapé… - arXiv preprint arXiv …, 2021 - arxiv.org
There are many Language Models for the English language according to its worldwide
relevance. However, for the Spanish language, even if it is a widely spoken language, there are …
FinEAS: Financial embedding analysis of sentiment
A Gutiérrez-Fandiño, P Kolm… - arXiv preprint arXiv …, 2021 - arxiv.org
We introduce a new language representation model in finance called Financial Embedding
Analysis of Sentiment (FinEAS). In financial markets, news and investor sentiment are …
esCorpius: A Massive Spanish Crawling Corpus
A Gutiérrez-Fandiño, D Pérez-Fernández… - arXiv preprint arXiv …, 2022 - arxiv.org
In recent years, transformer-based models have led to significant advances in language
modelling for natural language processing. However, they require a vast amount of data to …