ABSTRACT
Objectives Horizon-scanning for innovative technologies that might be applied to medical products and require new assessment approaches/regulations will help to prepare regulators, allowing earlier access to the product for patients and an improved benefit/risk ratio. In this study, we focused on the field of AI-based medical image analysis as a retrospective example of medical devices, where many products have recently been developed and applied. We proposed and validated horizon-scanning using citation network analysis and text mining for bibliographic information analysis.
Methods and analysis Research papers for citation network analysis which contain “convolutional*” OR “machine-learning” OR “deep-learning” were obtained from Science Citation Index Expanded (SCI-expanded) in the Web of Science (WoS). The citation network among those papers was converted into an unweighted network with papers as nodes and citation relationships as links. The network was then divided into clusters using the topological clustering method and the characteristics of each cluster were confirmed by extracting a summary of frequently cited academic papers, and the characteristic keywords, in the cluster.
Results We classified 119,553 publications obtained from SCI and grouped them into 36 clusters. Hence, it was possible to understand the academic landscape of AI applications. The key articles on AI-based medical image analysis were included in one or two clusters, suggesting that clusters specific to the technology were appropriately formed. Based on the average publication year of the constituent papers of each cluster, we tracked recent research trends. It was also suggested that significant research progress would be detected as a quick increase in constituent papers and the number of citations of hub papers in the cluster.
Conclusion We validated that citation network analysis applies to the horizon-scanning of innovative medical devices and demonstrated that AI-based electrocardiograms and electroencephalograms can lead to the development of innovative products.
Strengths and limitations of this study
Citation network analysis can provide an academic landscape in the investigated research field, based on the citation relationship of research papers and objective information, such as characteristic keywords and publication year.
It might be possible to detect significant research progress and the emergence of new research areas by repeating the analysis every several months.
It is important to confirm the opinions of experts in this area when evaluating the results of the analysis.
Information on patents and clinical trials for this analysis is currently unavailable.
INTRODUCTION
The application of innovative technologies to the development of medical products is expected to yield new treatments or diagnostic tools for diseases that currently lack them. Conversely, there may be cases where applying conventional development and evaluation concepts and/or regulatory frameworks to innovative technologies is inappropriate. Therefore, the early identification, through horizon-scanning, of innovative technologies with potential application to medical products would encourage regulatory authorities to establish new approaches to assessing their quality, efficacy, and safety, to advise developers, and to revise their regulations if necessary. This will contribute to timely patient access and improve the benefit/risk ratio of products 1.
The International Coalition of Medicines Regulatory Authorities (ICMRA), consisting of regulatory authorities from 30 countries and regions, has recognised the need to respond quickly to innovative technologies, and its member authorities share a recognition of the importance of ‘horizon-scanning’ to identify such technologies 2. The ICMRA Innovation concept note 3 describes horizon-scanning as a broad-reaching information-gathering and monitoring activity to anticipate emerging products and technologies and potentially disruptive research avenues. Traditionally, horizon-scanning has been predominantly conducted in Europe for policy-making, scientific research funding, and health care budgeting purposes, by surveying a variety of sources (such as the Internet, government, international organisations and companies, databases, and journals) using, for example, the Delphi method 4,5. Recently, the European Commission (EC) has published reports including “Weak signals in Science and Technologies 2019 Report”, based on Tools for Innovation Monitoring (TIM) 6, which uses text mining and keywords in the scientific literature. The Japanese National Institute of Science and Technology Policy (NISTEP) also uses the Delphi method and a digital tool to analyse academic papers with the top 1 % of citations to contribute to science and innovation policy planning.
Hines et al. reported that, in the medical and health care field, the majority of horizon-scanning methods used were manual or semiautomated, with relatively few automated aspects; this may be resolved in the not-too-distant future via the rapidly evolving fields of machine learning and artificial intelligence 5.
It is nearly impossible to grasp the whole picture of the extremely large and fragmented body of research and technological development results, and the limitations of existing methods are now being pointed out in many fields.
To address this challenge, a computer-based approach can be used to complement the expert-based approach as it fits the scale of the information (Börner et al. 2003; Boyack et al. 2005). In particular, the citation-based approach assumes that citing and cited papers address similar research topics. Analysing this citation network allows us to understand the structure of the research areas formed by a volume of papers too large to read individually. These methods have been widely used as powerful tools to visualise and understand the structure of a research field and to identify new trends and research directions; they have been proven effective through various studies (Chen 1999 7; Chen, Cribbin et al., 2003 8; Small, 1999 9).
For example, Kajikawa et al. (2007) 10 used citation network analysis to effectively and efficiently track emerging research areas in the field of sustainability science. Similar approaches have been applied to a wide range of fields, including energy research (Kajikawa et al., 2008 11), regenerative medicine (Shibata et al., 2011 12), robotics, and gerontology (Ittipanuvat et al., 2014 13). Sakata et al. (2012) 14 proposed a meta-structure of academic knowledge on patent and innovation research to effectively assist policy discussions for intellectual property system reform. Using a citation-based approach, that study analysed the academic landscape of patent and innovation research to understand the current structure and trends of research and to detect major sub-research fields and core papers within it. Other studies have shown that network analysis and machine learning methods are useful for understanding and predicting the development of technologies such as solar cells 15 and nanocarbon 16. Citation network analysis and text mining are useful tools for R&D strategists and policymakers in many fields to understand the broad scope of scientific and technological research and to decide on worthwhile investments in promising technologies.
This paper proposes and discusses a methodology for horizon-scanning to identify innovative technologies that may be applied to medical products by utilising citation network analysis methods and text mining. In this paper, we focus on AI-based medical image analysis as a retrospective example of AI-based medical devices that have been developed in recent years, applied in many fields, and selected for consideration in ICMRA 1. By analysing research papers on the development of AI-based medical technologies, we explored the network-like characteristics of this field and proposed a prediction procedure for innovative technologies related to the medical field.
METHODS
Extraction of paper data for analysis
To track the development history of AI-based medical image analysis and to select keywords for the extraction of the papers for citation network analysis, we selected 13 key articles 17–27 (presented in Table 1), including several papers cited in the review article 28 on the application of deep learning in medical image analysis and a study 29 that led to the clinical development of IDx-DR, a retinal imaging software approved as a medical device by the US FDA in 2018.
In addition to the query setting “convolutional” OR “deep learning” used in the review article on medical image analysis (Litjens et al., 2017) 28, we used “machine-learning” to include a wide range of conventional studies. As a result, we obtained 140,794 papers that contain “convolutional*” OR “machine-learning” OR “deep-learning” from the Web of Science Core Collection (WoS, Thomson Reuters), published between 1 January 1900 and 31 December 2020.
For the time-series analysis, a separate data set was created for each year from 2012 to 2019, each covering papers published between 1 January 1900 and 31 December of that year, and the cluster that contained the key articles was identified for each year.
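The construction of these yearly data sets can be illustrated with a minimal sketch. It assumes that each record carries only an identifier and a publication year; the `Paper` type, the function name `cumulative_snapshots`, and the toy records are hypothetical and are not taken from the study.

```python
from typing import Dict, List, NamedTuple


class Paper(NamedTuple):
    paper_id: str  # hypothetical identifier, e.g. a database accession number
    year: int      # publication year


def cumulative_snapshots(
    papers: List[Paper],
    first_cutoff: int = 2012,
    last_cutoff: int = 2019,
    start_year: int = 1900,
) -> Dict[int, List[Paper]]:
    """Return one cumulative data set per cutoff year.

    The data set for cutoff year Y holds every paper published from
    start_year up to and including Y, so later data sets contain earlier ones.
    """
    snapshots: Dict[int, List[Paper]] = {}
    for cutoff in range(first_cutoff, last_cutoff + 1):
        snapshots[cutoff] = [p for p in papers if start_year <= p.year <= cutoff]
    return snapshots


# Toy records for illustration only (not data from the study).
toy_papers = [
    Paper("p1", 1998),
    Paper("p2", 2012),
    Paper("p3", 2015),
    Paper("p4", 2019),
    Paper("p5", 2020),
]
snapshots = cumulative_snapshots(toy_papers)
print({cutoff: len(subset) for cutoff, subset in snapshots.items()})
```

Each data set would then be clustered independently, so that the position of a key article can be compared from one year to the next.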
Citation network analysis
In this study, the citation network was converted into an unweighted network with papers as nodes and citation relationships as links. Papers not connected to the largest connected component, including papers with no citation relationship with other papers, were considered digressional and were ignored in this study (Step 2 in Fig 1). The core paper with the highest number of citations is located at the centre of the citation relations. The network was then divided into several clusters using the topological clustering method. Topological clustering is a clustering method based on the graph structure of a network; here, we used modularity maximisation. A cluster is a module in a citation network, i.e., a group of papers whose citation relations are densely aggregated, obtained by partitioning the network with a modularity (Q value) maximisation method (Louvain method) 16,30. The modularity maximisation method favours partitions in which intra-cluster links are dense and inter-cluster links are sparse, and it determines the partitioning pattern by searching, with a greedy algorithm, for the partition that maximises the modularity. Q is an evaluation function of the degree of coupling within a cluster and between clusters, as follows:

\[ Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j), \]

where $A_{ij}$ represents the weight of the edge between $i$ and $j$, $k_i = \sum_j A_{ij}$ is the sum of the weights of the edges attached to vertex $i$, $c_i$ is the community to which vertex $i$ is assigned, the δ-function $\delta(u, v)$ is 1 if $u = v$ and 0 otherwise, and $m = \frac{1}{2} \sum_{i,j} A_{ij}$.
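As an illustration of what the modularity function measures, the following minimal sketch evaluates Q for a given partition of a small unweighted graph. It uses the equivalent per-community form of the standard modularity, Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of links inside community c and d_c is the sum of the degrees of its members. It assumes that citation links are treated as undirected and that there are no duplicate links or self-citations. It only scores a partition and does not perform the Louvain optimisation itself; the function name and the toy graph are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple


def modularity(edges: List[Tuple[Hashable, Hashable]],
               community: Dict[Hashable, Hashable]) -> float:
    """Modularity Q of a partition of an unweighted, undirected graph.

    Computed per community as L_c / m - (d_c / (2m))**2, which is
    algebraically the same as (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(c_i, c_j).
    """
    m = len(edges)                   # total number of links
    degree = defaultdict(int)        # k_i for every node
    intra_links = defaultdict(int)   # L_c: links with both ends in community c
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            intra_links[community[u]] += 1

    degree_sum = defaultdict(int)    # d_c: total degree of community c
    for node, k in degree.items():
        degree_sum[community[node]] += k

    return sum(intra_links[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph: two triangles joined by a single bridging link (2-3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}

q_two = modularity(toy_edges, two_clusters)
q_one = modularity(toy_edges, one_cluster)
print(round(q_two, 3), round(q_one, 3))
```

Splitting the toy graph into its two triangles gives a clearly higher Q than keeping all nodes in one cluster, which is the behaviour that modularity maximisation exploits when it searches for densely connected groups of papers.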
The clusters are assigned labels corresponding to the size of the number of papers included. The characteristics of each cluster were confirmed by extracting a summary of frequently cited academic papers in the cluster and the characteristic keywords in the cluster.
In addition, we computed the term frequency-inverse cluster frequency (TF-ICF) to extract the characteristic keywords of each cluster. The term frequency TF gives a measure of the importance of a term in a particular sentence. The inverse cluster frequency ICF provides a measure of the general importance of a term. The TFICF of a given term i in a given cluster j is given by: where N is the total number of sentences. Each cluster was labelled based on the resulting keywords and sentences.
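As an illustration, the weighting can be sketched by analogy with TF-IDF. The form assumed here is score(i, j) = tf(i, j) × log(N / n_i), where tf(i, j) is the relative frequency of term i in cluster j and n_i is the number of units that contain term i. This is an assumption; the exact definition used in the study may differ. For simplicity, the sketch treats each cluster's pooled text as one unit, so that N is the number of clusters. The function name and the toy tokens are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_icf(cluster_tokens: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Score terms per cluster with an assumed TF-IDF-style TF-ICF weighting.

    cluster_tokens maps a cluster label to the tokens pooled from its papers.
    """
    n_units = len(cluster_tokens)  # N: here, the number of clusters (assumption)
    # n_i: number of clusters in which each term occurs at least once.
    cluster_freq = Counter()
    for tokens in cluster_tokens.values():
        cluster_freq.update(set(tokens))

    scores: Dict[str, Dict[str, float]] = {}
    for label, tokens in cluster_tokens.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[label] = {
            term: (count / total) * math.log(n_units / cluster_freq[term])
            for term, count in counts.items()
        }
    return scores


# Toy tokens for illustration only (not keywords extracted in the study).
toy_tokens = {
    "imaging": ["image", "segmentation", "image", "learning"],
    "eeg": ["eeg", "seizure", "eeg", "learning"],
    "drug": ["protein", "learning", "binding", "protein"],
}
scores = tf_icf(toy_tokens)
top_terms = {label: max(terms, key=terms.get) for label, terms in scores.items()}
print(top_terms)
```

A term that occurs in every cluster, such as "learning" in the toy data, receives a score of zero, whereas terms concentrated in one cluster rise to the top and can serve as characteristic keywords.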
To confirm the trends in the research field, we extracted the mean or median year of publication of papers in each cluster, as well as information on journals, authors, and affiliated institutions.
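The size-based labelling of clusters and the publication-year statistics can be sketched as follows. The sketch assumes a mapping from paper to raw cluster identifier and a mapping from paper to publication year. The function name `summarise_clusters` and the toy data are hypothetical, and the threshold of 2017 in the final line simply mirrors the ‘younger’ cluster criterion used in the Results.

```python
from collections import defaultdict
from statistics import mean, median
from typing import Dict, List


def summarise_clusters(assignment: Dict[str, str],
                       pub_year: Dict[str, int]) -> List[dict]:
    """Label clusters by size (1 = largest) and report mean/median year."""
    members = defaultdict(list)
    for paper, cluster in assignment.items():
        members[cluster].append(paper)

    # Largest cluster first; ties are broken by raw identifier for determinism.
    ordered = sorted(members, key=lambda c: (-len(members[c]), str(c)))

    summary = []
    for label, cluster in enumerate(ordered, start=1):
        years = [pub_year[p] for p in members[cluster]]
        summary.append({
            "label": label,
            "raw_id": cluster,
            "size": len(years),
            "mean_year": mean(years),
            "median_year": median(years),
        })
    return summary


# Toy data for illustration only.
toy_assignment = {"p1": "x", "p2": "x", "p3": "x", "p4": "y", "p5": "y"}
toy_years = {"p1": 2010, "p2": 2014, "p3": 2015, "p4": 2018, "p5": 2019}

summary = summarise_clusters(toy_assignment, toy_years)
young = [row["label"] for row in summary if row["mean_year"] > 2017]
print(summary)
print(young)
```

Reporting both the mean and the median is a deliberate choice in this sketch, since a few very old, highly cited papers can pull the mean year of a cluster downwards.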
After clustering, the network was visualised so that the relations among the clusters could be inferred intuitively. We used a large graph layout (LGL) based on a force-directed layout algorithm (Adai et al., 2004) 31,32. This layout displays the largest connected component of the network and generates coordinates for the nodes in two dimensions. We visualised the citation network by expressing intra-cluster links with the same colour (Step 4 in Fig 1). However, the position of the clusters and the distance between clusters do not indicate similarity of content. An overview of the procedure is shown in Fig 1.
For the extracted dataset, the citation network was converted into an unweighted network with papers as nodes and citation relationships as links (Step 2). The network was then divided into several clusters using the topological clustering method (Step 3). In addition, a large graph layout (LGL), based on a force-directed layout algorithm, displayed the largest connected component of the network and generated coordinates for the nodes in two dimensions, visualising the citation network by expressing intra-cluster links with the same colour (Step 4).
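The extraction of the largest connected component in Step 2 can be sketched with a breadth-first search. The sketch assumes that citation links are given as (citing, cited) pairs and that direction is ignored when determining connectivity; the function name and the toy links are hypothetical.

```python
from collections import defaultdict, deque
from typing import Hashable, List, Set, Tuple


def largest_connected_component(
    citations: List[Tuple[Hashable, Hashable]]
) -> Set[Hashable]:
    """Return the papers in the largest connected component.

    Papers that neither cite nor are cited never appear in `citations`
    and are therefore excluded automatically.
    """
    neighbours = defaultdict(set)
    for citing, cited in citations:
        neighbours[citing].add(cited)
        neighbours[cited].add(citing)

    seen: Set[Hashable] = set()
    best: Set[Hashable] = set()
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search collects every paper reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy data: six papers, of which "f" has no citation links at all.
all_papers = ["a", "b", "c", "d", "e", "f"]
toy_citations = [("a", "b"), ("b", "c"), ("d", "e")]

core = largest_connected_component(toy_citations)
share = len(core) / len(all_papers)
print(sorted(core), share)
```

The ratio of the component size to the number of retrieved papers gives the share of papers that enter the clustering step.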
RESULTS
Results of citation network analysis
We analysed the citation network of 140,794 papers and found that 119,553 (85 %) were linked by direct citation into the largest connected component of the network. This component was divided into 36 clusters (excluding the grey linkage not involved in cluster formation shown in Fig 1 and 2). The contents of the top 10 clusters, which contain approximately 75 % of the papers in a citation network, were estimated from the characteristic keywords appearing in each cluster and the titles and abstracts of the papers with the highest number of citations. The cluster numbers (number of papers) and their contents are listed below.
Cluster 1 (11,711): basic studies on deep learning and convolutional neural networks (CNNs), including geographic information system (GIS) image analysis using remote sensing.
Cluster 2 (7,597): drug discovery technologies related to proteins, peptides, etc., using machine learning.
Cluster 3 (5,323): applied research in medical image analysis.
Cluster 4 (4,340): feature classification using ensemble methods to increase accuracy by combination.
Cluster 5 (3,665): natural language processing of clinical records.
Cluster 6 (2,691): application of deep learning to fault diagnosis, for example, motor condition monitoring for machines running on electric motors.
Cluster 7 (2,497): machine learning (ML) and data mining (DM) methods for cyber analysis.
Cluster 8 (2,311): application to traffic flow information analysis for the implementation of intelligent transport systems.
Cluster 9 (2,281): single-image super-resolution (SR) to reconstruct high-quality data.
Cluster 10 (2,232): classification of individuals based on the analysis of text information from social media, such as emotions and behaviour.
Table 1 presents the clusters in which the key articles were included. Three papers (A, B, C) based on image recognition were found in cluster 1 and five on image diagnosis in cluster 3, including the review article “Deep Learning” 23, which is often cited in medical field papers. This indicates that a cluster related to medical imaging was appropriately formed as cluster 3.
Tracking the time series of key articles
We analysed the papers published up to each year and identified the cluster containing the key papers in Table 1, together with the number of citations within the cluster, to assess the past position of research on medical imaging. As shown in Fig 2, all the papers were included in the same cluster until 2015 and the rank of that cluster rose by one until 2014. In 2015, the number of papers in this field increased rapidly and the rank of the cluster rose from 13th in 2014 to 6th, suggesting a sharp increase in scientific attention. In 2016, a key paper on the imaging diagnosis of diabetic retinopathy (J in Table 1) was in cluster 7, which comprised papers on medical image analysis, and the other seven key articles were in cluster 3. Subsequently, in 2017, cluster 1 contained all of the key articles but, from 2018 onwards, a new separate cluster containing papers on image analysis using deep learning was formed; the number of citations of the key articles also increased.
Thus, most of the key articles were in one or two clusters, suggesting that the clusters related to the targeted AI-based medical image analysis were properly formed. The research status of the clusters can also be confirmed by the cluster numbers, which reflect the number of papers comprising the cluster and the number of citations of the key articles.
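The year-by-year tracking described above can be sketched as a simple lookup. The sketch assumes that, for each year, a mapping from paper to size-ranked cluster number is available from the clustering step. The function name and the article identifiers "K1" and "K2" are hypothetical; the toy values only mimic the pattern reported for 2015 and 2016.

```python
from typing import Dict, Hashable, List


def track_key_articles(
    yearly_clusters: Dict[int, Dict[str, Hashable]],
    key_articles: List[str],
) -> Dict[int, Dict[Hashable, List[str]]]:
    """For each year, group the key articles by the cluster that contains them.

    Key articles absent from a given year's network are simply skipped.
    """
    history: Dict[int, Dict[Hashable, List[str]]] = {}
    for year in sorted(yearly_clusters):
        assignment = yearly_clusters[year]
        located: Dict[Hashable, List[str]] = {}
        for article in key_articles:
            if article in assignment:
                located.setdefault(assignment[article], []).append(article)
        history[year] = located
    return history


# Toy assignments for illustration only.
toy_yearly = {
    2015: {"K1": 6, "K2": 6, "J": 6},
    2016: {"K1": 3, "K2": 3, "J": 7},
}
history = track_key_articles(toy_yearly, ["K1", "K2", "J"])
for year, located in history.items():
    print(year, located)
```

A year in which the key articles fall into a single low-numbered cluster indicates a large, coherent research area, whereas a split across clusters may indicate that a more specialised area has separated out.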
Recent research trends in AI-based medical products
To detect the latest research trends in AI-based medical products, we focused on ‘younger’ clusters with an average publication year later than 2017, as research progress could be observed over three years for AI-based medical image analysis (Fig 3). We re-analysed clusters 3, 15, 12, 5, 13, and 2 (listed in order of average publication year), which were considered to be closely related to AI-based medical technologies. Table 2 lists the sub-clusters formed by this re-analysis, together with the most cited article (hub paper) 23,33–64 in each sub-cluster, suggesting recent research trends in this field as follows.
Cluster 3: applied research in medical image analysis.
Cluster 15: electrocardiogram, electroencephalogram, and other electrical biosignals of human activity.
Cluster 12: human activity recognition.
Cluster 5: natural language processing of clinical records.
Cluster 13: neuroimaging analysis.
Cluster 2: drug discovery with machine learning related to proteins, peptides, etc.
Among these AI-based medical technologies, EEG analysis was identified for applications in epileptic seizure prediction, emotional analysis, and brain-computer interfaces; for the last of these, the FDA issued draft guidance on non-clinical and clinical trials in 2019.
Electrocardiograms (ECGs) and electroencephalograms (EEGs) in cluster 15 are most likely to be applied to new medical devices; therefore, we followed the cluster containing a key paper on the application of deep learning to EEG analysis by Cecotti & Graser (2010) 65, which was one of the triggers for the development of this field. During 2015 – 2016, the article was included in the same cluster as other neuroimaging techniques, such as MRI, MEG, and fNIRS. In 2017, the key article was found in cluster 20, separate from the other neuroimaging techniques, suggesting that a new cluster specific to the application of deep learning to EEG had formed. In 2018, the article was included in cluster 1, which covered applications of deep learning in various fields, but in 2019 and 2020 it was again included in a specific, re-formed cluster (numbered 14 and 15, respectively), and the number of citations of the article increased. This suggests that research in this field has developed rapidly since 2017.
DISCUSSION
The development of medical products based on innovative technologies may sometimes not be amenable to current development and evaluation approaches or regulatory frameworks. Horizon-scanning for such technologies will contribute to earlier access to the product for patients and a better benefit/risk ratio for the product by encouraging regulatory authorities to develop new guidance/regulations. Experts who have a deep understanding of innovative technologies would be able to predict the development of medical products based on the technology, but experts on all evolving innovative technologies may not be available to regulatory authorities. Moreover, it might sometimes be inappropriate to narrow the scope of consideration based solely on experts’ opinions, as information from experts is subjective and the outcome depends upon the choice of the expert 66. Therefore, it is more efficient and appropriate to use a method based on objective information as a primary screening tool for horizon-scanning to identify candidate technologies that may require new guidance or regulation. In this study, we have shown that citation network analysis and text mining are suitable for this purpose. We used these methods to classify a large number of papers in the field by research topic and identified the topics of the clusters based on the characteristic keywords of the clusters and the titles of the most cited papers. We also objectively evaluated the attention and novelty of the topic based on the number of papers and the median year of publication.
In this study, we examined the possibility of using this analysis method for horizon-scanning, targeting AI-based medical image analysis as a typical example that requires new regulatory frameworks and evaluation approaches 67. IDx-DR, an image analysis software for the automatic diagnosis of diabetic retinopathy, received US FDA certification in 2018. AI is characterised by self-learning: the algorithm learned from data during the development of a medical product is a black box, and performance changes as the product continues learning during clinical use. This has become an interesting dilemma for regulators 67.
We assessed the feasibility of using citation network analysis and text mining to identify the history of trends in AI-based medical image analysis research and development, which is as follows. Research leading to convolutional neural networks (CNNs), currently the leading technology in deep learning, arose in the 1970s, and Werbos’s multi-layer networks (1975) 68 renewed interest in neural networks. LeNet (1998) 17, a CNN-based handwritten number recognition system, was then developed and was succeeded by a CNN called AlexNet (2012) 19, which was a key trigger for renewed interest in neural networks.
Later, the U-net 22 architecture was proposed, which includes an upsampling section that uses “up” convolution to increase the image size. Furthermore, the combination of CNNs and recurrent neural networks (RNNs), represented by long short-term memory (LSTM), has been applied to analyses involving time-series data 28,44. We evaluated 13 key articles, including these milestones in the development of AI-based medical image analysis, to determine how key articles could be captured by citation network analysis, and found that eight were concentrated in one or two clusters (Table 1). The characteristic keywords of the clusters, together with the titles and abstracts of the articles with the highest number of citations, confirmed that the clusters were related to AI-based medical image analysis and that it was possible to identify actual research trends. In addition, we analysed the papers reported each year and found that the number of constituent papers of the cluster containing the key articles increased dramatically after 2014, with the rank rising from 13th to 6th, suggesting that the technology related to diagnostic imaging has progressed dramatically. This might have led to a major clinical trial of IDx-DR in 2017. Since then, there has been a further increase in research activity in this field, as can be seen from the rank of the cluster and the number of citations of the key articles. Five of the 13 selected articles were not included in the analysis: three papers were not included in the Web of Science Core Collection and the other two (MD Abràmoff et al. (2013) 20, AA van der Heijden et al. (2018) 29), on clinical evaluation, were not retrieved by the set query, because neither the abstract nor the title mentioned the underlying technology and the methods were mainly described by product names or as computer detection in both papers.
Next, we explored recent trends in the development of new medical products using AI by re-analysing ‘young’ clusters, whose constituent papers have a recent average publication year, to identify more specific topics by sub-clustering (Table 2). We focused on EEG and ECG, which have the potential to lead to the development of new medical devices, and followed the cluster containing the key article on this topic. As shown in Fig 3, the increase in constituent papers and the number of citations of the key article suggested that this topic made significant progress during 2017 – 2018, which might be related to the US FDA issuing draft guidance on brain-computer interfaces in 2019 69.
This study also showed that analysis every several months might allow us to identify candidate topics for further investigation through a rapid rise in the rank of a cluster, i.e., a sharp increase in constituent papers (2014 – 2015 in Fig 2 and 2017 – 2018 in Fig 3), or through the emergence of a new cluster spun out of the original one (2017 – 2018 in Fig 2 and 2016 – 2017 in Fig 3), either of which may be a signal of significant research progress.
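The first of these signals, a rapid rise in cluster rank, can be sketched as a simple year-over-year comparison. The ranks below follow the values reported in the Results for the cluster containing most of the key articles (13th in 2014, 6th in 2015, cluster 3 in 2016, and cluster 1 in 2017). The function name and the threshold `min_rise` of five rank positions are hypothetical choices made for illustration and are not criteria defined in this study.

```python
from typing import Dict, List, Tuple


def rank_jumps(rank_by_year: Dict[int, int],
               min_rise: int = 5) -> List[Tuple[int, int, int]]:
    """Flag consecutive observations where the cluster rank improves sharply.

    Rank 1 is the largest cluster, so an improvement is a decrease in rank.
    Returns (previous_year, current_year, rise) for each flagged step.
    """
    years = sorted(rank_by_year)
    signals = []
    for prev, curr in zip(years, years[1:]):
        rise = rank_by_year[prev] - rank_by_year[curr]
        if rise >= min_rise:
            signals.append((prev, curr, rise))
    return signals


# Ranks of the cluster containing most key articles, as reported in the Results.
ranks = {2014: 13, 2015: 6, 2016: 3, 2017: 1}
signals = rank_jumps(ranks)
print(signals)
```

With this threshold only the 2014 – 2015 step is flagged. In practice the threshold would need to be tuned to the field, and any flagged topic would still require expert review.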
However, this analysis method has the following limitations. Papers in major journals are included in WoS relatively quickly after publication, but there might be a delay of approximately six months for almost all journals, and some research areas may not be reflected in WoS quickly enough, which may delay the identification of research trends. Although the data are not shown in this paper, we also analysed the papers obtained from PubMed; however, only approximately 30 % of the papers formed a citation network and only 5 of the 13 key articles were included. One possible reason for not being able to extract appropriate research papers from PubMed is that many of the papers did not use terminology related to AI-based technologies. This suggests that the choice of literature database according to the target technology is also critical. Furthermore, research results in the field of machine learning, which covers basic technologies in the field of AI, are sometimes published in venues such as arXiv.org, where researchers can directly exchange papers with each other via the Internet; therefore, the latest results cannot be covered by databases of academic papers, such as WoS or PubMed.
Another possible bias, as mentioned by Takano et al. (2017) 70, is that researchers mainly check and cite papers written in their native language or journals they contribute to, or that they tend to search and cite papers using the same terminology and not others — even when the technological meaning is the same.
It is important to hear the opinions of experts in the field regarding the candidate topics to be investigated, which will help to overcome the limitations described above.
It is expected that this citation network analysis will be established as a primary horizon-scanning method by continuing the study in other fields and organising the analysis conditions and points to be noted according to the characteristics of the target technology.
Data Availability
The data that support the findings of this study are available from the corresponding author, Mayumi Shikano, upon reasonable request.
Funding
This research was supported by AMED (Japan Agency for Medical Research and Development) under Grant Number JP20mk0101155.
Author Contributions
MS developed the research design and interpreted the results. TT surveyed the literature, analysed the data, and interpreted the results. HS and HY designed the methodology and software and interpreted the results. MH designed the data editing. TT and MS drafted the manuscript. All authors have read and approved the final manuscript.
Competing interests
All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.
Acknowledgment
We would like to thank Dr. Hidefumi Kobatake for his advice on the history of AI technology development. We also thank Dr. Rika Wakao, Dr. Masafumi Shimokawa, and Ms. Ai Fukaya for their help.