World Science against COVID-19: Gender and Geographical Distribution of Research.

In just a year and a half, an enormous volume of scientific research has been generated throughout the world to study a virus/disease that turned into a pandemic. All the articles on COVID-19 or SARS-CoV-2 included in the SCI-EXPANDED database (Web of Science), comprising more than a third of a million authorships, were analyzed. Gender could be identified in 92% of the authorships. Women represent 40% of all authors and a similar proportion of first authors, but just 30% of last/senior authors. The pattern of collaboration shows an interesting finding: when a woman signs as the first or last/senior author, the article byline approximates gender parity. According to the corresponding address, the USA accounts for 22.8% of all world articles, followed by China (14.4%), Italy (7.8%), the UK (5.8%), India (4.2%), Spain (3.8%), Germany (3.6%), France (2.9%), Turkey (2.5%), and Canada (2.4%). Despite their short lives, the papers received an average of 11 citations. The high impact of papers from China is striking (25.1 citations, compared with 12.4 for the UK and 11.3 for the USA), presumably because the disease emerged in China and the first, highly cited publications came from there.

In December 2019, the Chinese city of Wuhan became the center of an outbreak of pneumonia of unknown origin. One month later, Chinese scientists isolated a novel coronavirus, the severe acute respiratory syndrome coronavirus 2, or SARS-CoV-2, responsible for this viral pneumonia, which was later designated coronavirus disease 2019 (COVID-19) by the World Health Organization 1.
Since then, in just a year and a half, an enormous volume of scientific research has been generated throughout the world to study this new virus/disease, which turned into a pandemic. I analyzed all the articles on COVID-19 or SARS-CoV-2 included in the Science Citation Index Expanded (SCI-EXPANDED) database of Web of Science, comprising more than a third of a million authorships. The analysis took a double perspective. First, I wanted to determine the gender composition of the authors and thereby the participation of women in this gigantic scientific endeavor. Second, I wanted to know the participation of the different regions of the world by analyzing the corresponding addresses.

Method
Sample data. All the Articles on COVID-19 or SARS-CoV-2 (TOPIC) included in the SCI-EXPANDED database (Web of Science, Clarivate Analytics) were selected on 10-13 May 2021. It is recognized that this database includes the world's leading journals of science and technology after a rigorous selection process. Our collection consisted of
Gender identification of authors. I examined the authorships to determine their gender.
The SCI-EXPANDED database (like most scientific databases) does not provide information about the authors' gender. However, in 2008 the Web of Science began to include the authors' full names, although a small proportion of records still display only the authors' initials. All the authors' first names were matched against two gender databases: GenderChecker (acquired from http://genderchecker.com/) and Gender API (acquired from https://genderapi.io/).
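The matching step described above can be illustrated with a minimal sketch. It does not reproduce the GenderChecker or Gender API databases: the lookup table `NAME_GENDER`, the function `infer_gender`, and the "Surname, Given names" input format are all assumptions made for illustration. Names given only as initials, unisex names, and unmatched names are returned as unidentified.

```python
from typing import Optional

# Hypothetical miniature lookup table standing in for a real name-gender
# database; "U" marks a name treated as unisex (illustrative entries only).
NAME_GENDER = {
    "maria": "F",
    "john": "M",
    "alex": "U",
}

def infer_gender(author: str) -> Optional[str]:
    """Return 'F' or 'M', or None when the gender cannot be identified.

    Assumes the author string has the form 'Surname, Given names'.
    """
    _, _, given = author.partition(",")
    tokens = given.replace(".", " ").split()
    if not tokens or len(tokens[0]) == 1:
        return None  # missing given name or initials only
    gender = NAME_GENDER.get(tokens[0].lower())
    # Unisex ("U") and unmatched names (None) both count as unidentified.
    return gender if gender in ("F", "M") else None

authors = ["Garcia, Maria", "Smith, J. R.", "Brown, Alex", "Doe, Zyx", "Smith, John A."]
labels = [infer_gender(a) for a in authors]
print(labels)
```

Authorships labelled `None` in such a scheme would be excluded from the gender percentages, in line with the exclusions described in the Results.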
Procedure. Each variable of interest (author name and surnames, title of article, year of publication, journal, corresponding address, etc.) was extracted using the BibExcel program 2 and merged in a master Excel database to perform the bibliometric analyses.
Statistical analyses were carried out with the SPSS v.22 software.

Results and Discussion
Rate of women authors. Of the total 340,868 authorships, and after excluding those with only initials, unisex names, or first names that did not match the gender databases, gender could be identified in 314,319 (92.2%): 188,465 men and 125,854 women. Therefore, women represent 40% of all the known-gender authorships a.
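The percentages above follow directly from the reported counts. The short check below recomputes them; the variable names are illustrative, and only the three counts come from the text.

```python
# Counts reported in the text.
total_authorships = 340_868
men = 188_465
women = 125_854

# Known-gender authorships are the sum of both genders.
known = men + women
identified_rate = known / total_authorships  # share with identified gender
women_share = women / known                  # share of women among known-gender

print(f"{known:,} known-gender authorships ({identified_rate:.1%})")
print(f"women: {women_share:.1%}, men: {1 - women_share:.1%}")
```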
This percentage of female authorships on COVID-19 (or SARS-CoV-2) is still quite far from gender parity [χ² (df = 1) = 6297.99, p < .0001, Cramér's V = 0.10 b], although it is somewhat larger than the overall presence of women in worldwide science, about a third of researchers 3. González-Alvarez analyzed The Lancet journals during

a The percentages of female or male authorships will always refer to the known-gender totals.
b Cramér's V determines the effect size. The standard interpretation for one degree of freedom (df) is: 0.10 = small, 0.30 = medium, 0.50 = large effect.
medRxiv preprint doi: https://doi.org/10.1101/2021.09.29.21264261; this version posted September 29, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Our data show that women are clearly underrepresented as last/senior authors compared to the overall rate: only 30.2% of the last authors of articles on COVID-19, signed by three or more authors, were women (Table 1, Figure 1). In biomedical sciences, this position is usually reserved for the senior or leading scientist on a research project and normally corresponds to a scientist with a consolidated and longer career 7. This relative female underrepresentation as last/senior authors has also been observed in other gender studies on scientific and biomedical publications 3,4,6, suggesting that, in addition to other variables, age, or more exactly seniority, might play some role in the

and 45.8% men/54.2% women when a woman signed as the last/senior author, which was slightly female-biased). This fact does not necessarily mean that there is a cause-and-effect relationship between the presence of women in one of these two key positions and near gender parity in the article byline, but these two facts are correlated. It gives the impression that leading female researchers tend to co-publish with women more than leading male researchers do; alternatively, they may be working on subtopics that are relatively more appealing to women.
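The byline comparison can be sketched as follows. This is only an illustration of one way to compute such percentages, not the procedure actually used in the study: the function `byline_composition`, the toy bylines, and the choice to count the key author within the byline total are assumptions. Only articles with at least three authors are considered, and authorships of unknown gender are ignored.

```python
from collections import Counter

def byline_composition(articles, position):
    """Percentages of men and women in bylines, grouped by the gender of
    the author at `position` (0 = first author, -1 = last/senior author).

    `articles` is a list of bylines; each byline lists 'M', 'F', or None
    (unknown gender) in author order. Returns {gender: (pct_men, pct_women)}.
    """
    counts = {"M": Counter(), "F": Counter()}
    for byline in articles:
        if len(byline) < 3:
            continue  # keep only articles with at least three authors
        key = byline[position]
        if key not in counts:
            continue  # key author of unknown gender
        counts[key].update(g for g in byline if g in ("M", "F"))
    result = {}
    for key, c in counts.items():
        total = c["M"] + c["F"]
        if total:
            result[key] = (100 * c["M"] / total, 100 * c["F"] / total)
    return result

# Hypothetical toy bylines, not data from the study.
toy = [
    ["F", "M", "F"],
    ["M", "M", "F", "M"],
    ["F", "F", "M", "F"],
    ["M", "F"],          # skipped: fewer than three authors
    ["M", None, "M"],
]
by_last = byline_composition(toy, -1)
print(by_last)
```

Calling the same function with `position=0` groups the bylines by the gender of the first author instead.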

Geographical distribution. Although research on COVID-19 is logically transnational in most groups, I considered the corresponding address of each article. Table 2 shows data from the 20 countries with the highest number of articles. In Table S1

c Each gender received more citations than the overall mean because the most cited articles tended to be signed by more authors.
d The effect size interpretations for partial eta squared (η²p) values are: .01 = small, .06 = medium, and .14 = large.

Conclusions
After analyzing all the scientific articles on COVID-19/SARS-CoV-2 included in the SCI-EXPANDED database, we can draw the following conclusions:
• Women represent 40% of authorships, still far from gender parity.
• Compared to the overall rate, women are relatively underrepresented as last or senior authors (30.2%). This fact, also found in other studies, suggests that age, or more specifically, seniority, could play some role in the gender composition of biomedical researchers.
• The pattern of collaboration shows an interesting finding, also observed in another study 6: when a woman signs as the first or last/senior author, the article byline approximates gender parity.

Ethical approval
This work did not require ethical approval.

Acknowledgements:
This work was completed with resources provided by the University Jaume I of Castellon (Spain).

Conflict of interests:

Figure 2. Percentages of Men and Women as authors, depending on which gender occupied the first or last/senior positions in the article byline. Values were calculated for articles with at least three coauthors.
(*) computed from articles with at least three authors.
Table 3. The 30 most cited articles on COVID-19 or SARS-CoV-2 (SCI-EXPANDED, Web of Science).