A machine-learning method based on ocular surface features for COVID-19 screening

Feng Li; Xiangyang Xue; Qiang Sun; Haicheng Tang; Wenxuan Wang; Mengwei Gu; Yanwei Fu

doi:10.1101/2020.09.03.20184226

Abstract

Background The standard practices for screening patients with COVID-19 are CT imaging and RT-PCR (real-time polymerase chain reaction) testing for viral nucleic acid, which are expensive and require professional equipment and waiting time.

Methods The literature has shown that COVID-19 is often accompanied by ocular manifestations. We constructed a machine-learning model based on ocular surface features and proposed a new screening method for COVID-19. A retrospective study of 446 subjects and a prospective study of 128 subjects were conducted with this method. Performance was measured by the area under the receiver-operating-characteristic curve (AUC), sensitivity, specificity and accuracy.

Results In the retrospective study, the model detected COVID-19 patients with an AUC of 0.999 (95% CI, 0.997-1.000), a sensitivity of 0.982 (95% CI, 0.954-1.000), and a specificity of 0.978 (95% CI, 0.961-0.995). In the prospective study, the model achieved an AUC of 0.980 (95% CI, 0.970-0.990), a sensitivity of 0.770 (95% CI, 0.694-0.846), and a specificity of 0.973 (95% CI, 0.957-0.989).

Conclusion This deep learning method based on eye-region images demonstrates high accuracy in distinguishing COVID-19 patients. We hope this study will inspire and encourage more research on this topic.

Introduction

The coronavirus disease 2019 (COVID-19) pandemic has caused more than one million deaths¹ since its first outbreak. Fever is one of the typical symptoms of COVID-19, with a reported prevalence ranging from 43.8%² to 98%^3,4. Accordingly, body temperature measurement has been the initial and most widely used screening method for COVID-19 patients⁵. Meanwhile, the emergence and rising number of asymptomatic COVID-19 patients⁶ threaten the effectiveness of temperature-based screening. Hence, other COVID-19 screening methods are urgently needed.

In recent years, artificial intelligence (AI) has achieved remarkable performance in computer vision and medical analysis, including chest computed tomography (CT) based AI systems for COVID-19 diagnosis^7,8 and the detection of ocular diseases such as diabetic retinopathy and glaucoma⁹. Notably, SARS-CoV-2 has been detected in the ocular secretions of COVID-19 patients¹⁰, and extrapulmonary manifestations¹¹, including ocular symptoms (conjunctival congestion, sagging eyelids, etc.^12,13), have been reported for COVID-19 patients. These findings inspired us to develop an AI model that screens for COVID-19 based only on ocular surface images captured by common CCD cameras.

In this study, we constructed a machine-learning model based on ocular surface features, and conducted a retro-prospective study to screen asymptomatic/mild COVID-19 patients from healthy volunteers, pulmonary disease patients without COVID-19 (e.g., pulmonary fungal infection, bronchopneumonia, chronic obstructive pulmonary disease, and lung cancer), and ocular disease patients. The results suggest that asymptomatic/mild COVID-19 patients have ocular features that distinguish them from the other groups. This convenient method could provide an alternative for screening asymptomatic/mild COVID-19 patients and help guide effective prevention and control measures against COVID-19.

Methods

Study design and oversight

We conducted a retro-prospective training, validation and testing study of a deep learning model that uses ocular surface photographs to screen for COVID-19 (Figure 1). This study included 2,108 photographs from 569 participants (retrospective 446 and prospective 123), comprising healthy volunteers, COVID-19 patients, and patients with pulmonary or ocular diseases. All 1,561 photographs in the open-label retrospective study were used for model training and validation, and the 547 photographs in the single-blind (to the AI group) prospective study were used for testing. The study was conducted in accordance with the principles of the Declaration of Helsinki. It was approved by the Ethics Committee of Shanghai Public Health Clinical Center (approval No.: YJ-2020-S078-02), and informed consent was obtained.

Figure 1. Study design and workflow of this study.

Study subjects

The participants were enrolled from Shanghai Public Health Clinical Center (SPHCC), Fudan University. The retrospective study included 143 healthy volunteers, 104 COVID-19 patients, 131 pulmonary disease patients, and 68 ocular disease patients, enrolled from April 1 to June 30, 2020. The prospective study comprised 35 healthy volunteers, 29 COVID-19 patients, 31 pulmonary disease patients and 33 ocular disease patients, enrolled from July 1 to August 31, 2020 (Table 1). Of the 133 COVID-19 patients, 47 (retrospective 24 and prospective 23) were of the asymptomatic/mild type. Most of the COVID-19 patients were from East Asia (87.50% and 93.10% in the retrospective and prospective studies, respectively). The demographic and clinical characteristics of the COVID-19 patients are shown in Table 2.

Table 1. Summary of Training, Validation, and Testing Datasets.

Table 2. The demographic and clinical characteristics of COVID-19 patients in this study.

Definition of participants

The COVID-19 patients were confirmed by RT-PCR detection of viral nucleic acids. According to the eighth version of the guideline published by the National Health Commission of China, asymptomatic COVID-19 carriers are individuals with a positive viral nucleic acid test, no clinical symptoms and no signs of pneumonia on chest imaging. The patients with pulmonary diseases other than COVID-19 had been diagnosed with bronchopneumonia, chronic obstructive pulmonary disease, pulmonary fungal infection, lung cancer, etc. The patients with ocular diseases had been diagnosed with trachoma, pinkeye, conjunctivitis, glaucoma, cataract, keratitis, etc. The healthy volunteers were recruited from individuals who had undergone a physical examination that showed no obviously abnormal results. All subjects were tested for COVID-19, and no participants other than the COVID-19 patients tested positive for viral nucleic acids during the following days. No deaths were observed in this study.

Image acquisition and exclusion

For each participant, 3-5 photographs of the ocular surface were taken with common CCD and CMOS cameras, assisted by doctors or healthcare workers. The same plain shooting mode and parameters were used, and camera filters were avoided. The photos were captured in good lighting conditions, and not against a dark or red background. The image resolution was at least 1900×500 at 96 dpi. A total of 609 photographs (retrospective 447 and prospective 162) were taken from healthy volunteers, 506 photographs (retrospective 367 and prospective 139) from COVID-19 patients, 589 photographs (retrospective 473 and prospective 116) from pulmonary disease patients and 404 photographs (retrospective 274 and prospective 130) from ocular disease patients. A further 27 low-quality photographs from 8 participants (retrospective 6 and prospective 2) were excluded.

Developing the deep-learning classification

The schematic of our proposed deep learning system (DLS) is illustrated in Figure 2. It consists of two components, the Image Preprocessing¹⁴ and the Classification Network^15,16. Specifically, the Image Preprocessing receives raw ocular images and prepares them for model training or inference. The Classification Network studies the characteristics of the ocular surface from these inputs and learns texture and semantic embeddings. Finally, a classifier uses the extracted features to assign each image to one of four categories (healthy volunteers, COVID-19 patients, pulmonary disease patients and ocular disease patients).

Figure 2. Illustration of the modeling framework.

The Classification Network is trained on the training data and evaluated on the testing and validation data. The testing and validation data are not used to train the network. In particular, we apply five-fold cross-validation¹⁷ in the retrospective study, where all the retrospective samples are randomly partitioned into five equal-sized subsets. Of the five subsets, a single subset is retained as the data for testing the model, and the remaining four subsets are used as training and validation data. The cross-validation process is then repeated five times, with each of the five subsets used exactly once as the testing data. In the prospective study, all the training and validation data are used for model training, and the performance of the model is measured on the test dataset.
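The partitioning described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the fixed seed and the round-robin assignment are assumptions, and the split is made at the subject level on the assumption that all photographs of one person must stay in the same subset (the Results state that there is no identity overlap among the subsets).

```python
import random


def subject_level_folds(photo_subjects, n_folds=5, seed=0):
    """Return a fold index for every photo.

    photo_subjects[i] is the (hypothetical) subject ID of photo i. All photos
    of one subject receive the same fold index, so no identity is shared
    between folds.
    """
    subjects = sorted(set(photo_subjects))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    # Deal the shuffled subjects round-robin so that fold sizes differ by at
    # most one subject.
    fold_of_subject = {s: i % n_folds for i, s in enumerate(subjects)}
    return [fold_of_subject[s] for s in photo_subjects]


def cross_validation_splits(photo_subjects, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) once per fold."""
    folds = subject_level_folds(photo_subjects, n_folds, seed)
    for k in range(n_folds):
        test = [i for i, f in enumerate(folds) if f == k]
        train = [i for i, f in enumerate(folds) if f != k]
        yield train, test


# Toy usage: 10 subjects with 3 photos each.
toy_subjects = ["s%d" % (i // 3) for i in range(30)]
for train_idx, test_idx in cross_validation_splits(toy_subjects):
    held_out = {toy_subjects[i] for i in test_idx}
    print(len(train_idx), len(test_idx), sorted(held_out))
```

Splitting by subject rather than by photograph matters here because each participant contributes 3-5 images; a photo-level split would let near-duplicate images of the same eye appear on both sides of a fold.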

Because a subject may provide more than one image in a realistic test, subjects should be classified based on the prediction results of multiple images¹⁸. We therefore conduct subject-level risk screening on top of the image-level predictions using a fixed priority: COVID-19 is the most urgent category, followed by pulmonary disease, then ocular disease, and finally the healthy category. In other words, if any one of the images belonging to a person is predicted as COVID-19, we assume that the person is most likely a COVID-19 patient.
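A minimal sketch of this priority rule is given below. The label strings and the function name are illustrative assumptions; only the ordering (COVID-19, then pulmonary, then ocular, then healthy) comes from the text.

```python
# Priority order taken from the text: most urgent category first.
PRIORITY = ["COVID-19", "pulmonary", "ocular", "healthy"]


def subject_prediction(image_predictions, priority=PRIORITY):
    """Collapse the image-level labels of one subject into a single label.

    The subject receives the highest-priority label that appears among the
    predictions for their images.
    """
    if not image_predictions:
        raise ValueError("at least one image prediction is required")
    predicted = set(image_predictions)
    for label in priority:
        if label in predicted:
            return label
    raise ValueError("no known label among: %r" % sorted(predicted))


# One COVID-19 prediction among several images is enough to flag the subject.
print(subject_prediction(["healthy", "COVID-19", "healthy"]))
print(subject_prediction(["healthy", "ocular", "healthy"]))
```

Such a rule raises subject-level sensitivity for the highest-priority class, and can lower specificity, because a single false-positive image is enough to flag a subject.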

Statistical analysis

To measure the performance characteristics, we used the One-vs.-Rest and COVID-19-vs.-One strategies and calculated the area under the receiver-operating-characteristic curve (AUC), sensitivity, specificity, and accuracy from the results of our classification model. Bootstrapping was used to estimate 95% confidence intervals of the performance metrics, with the photo as the resampling unit. The One-vs.-Rest setting splits the disease classification task into one binary classification problem per class; the COVID-19-vs.-One setting poses a binary classification problem between COVID-19 and one other class.
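The two settings and the bootstrap interval can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis code: the helper names, the percentile form of the bootstrap, the 1,000 replicates and the fixed seed are all assumptions, since the text only specifies that photos are the resampling unit.

```python
import random


def binarize(classes, positive="COVID-19", other=None):
    """Build binary labels for one evaluation setting.

    other=None keeps every photo (One-vs.-Rest); naming a class keeps only
    photos of `positive` and `other` (COVID-19-vs.-One).
    Returns (kept_indices, labels) with label 1 for the positive class.
    """
    keep = [i for i, c in enumerate(classes)
            if other is None or c in (positive, other)]
    return keep, [1 if classes[i] == positive else 0 for i in keep]


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, metric=auc, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling photos with replacement."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample without both classes has no defined AUC
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy usage with made-up COVID-19 scores for eight photos.
classes = ["COVID-19", "healthy", "pulmonary", "COVID-19",
           "ocular", "healthy", "COVID-19", "pulmonary"]
covid_scores = [0.9, 0.2, 0.4, 0.7, 0.1, 0.6, 0.5, 0.3]
keep, y = binarize(classes, other="healthy")  # COVID-19 vs. healthy only
s = [covid_scores[i] for i in keep]
print(auc(y, s), bootstrap_ci(y, s))
```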

Results

Retrospective classification performance of COVID-19

In the retrospective study, we first conducted 5-fold cross-validation on the training and validation datasets. All data were randomly partitioned into five equal-sized subsets, with no identity overlap among the subsets. Of the five subsets, one subset was used as the testing data, and the remaining four subsets were used for training the model. The cross-validation process was repeated five times, with each subset used exactly once as the testing data. Bootstrapping was used to estimate the confidence interval for each model, and the five results were then averaged to produce a single estimate. In the One-vs.-Rest setting, the average AUC of COVID-19-vs.-Rest is 0.999 (95% CI, 0.997-1.000), and the sensitivity and specificity are 0.982 and 0.978, respectively. In the COVID-19-vs.-One setting, the AUC is 0.999 (95% CI, 0.998-1.000), 0.997 (95% CI, 0.993-1.000) and 1.000 (95% CI, 1.000-1.000) with respect to healthy volunteers, pulmonary disease patients and ocular disease patients, respectively (Table 3 and Figure 3).

Table 3. The 5-fold average classification performance of the DL-based model on the training and validation datasets in the One-vs.-Rest and COVID-19-vs.-One settings.

Figure 3. The ROC curves for the DL-based Classification Network in the retrospective study.

Each "Model" represents a different validation subset in the cross-validation.

Prospective classification performance of COVID-19 patients

In the prospective study, we tested the model on the test set in both the One-vs.-Rest and COVID-19-vs.-One settings. In particular, we investigated whether the model can distinguish asymptomatic/mild COVID-19 patients from the other groups of subjects. In the One-vs.-Rest setting, the average AUC of COVID-19-vs.-Rest is 0.980 (95% CI, 0.970-0.990), and the sensitivity and specificity are 0.770 and 0.973, respectively. In the COVID-19-vs.-One setting, the AUC is 0.963 (95% CI, 0.944-0.982), 1.000 (95% CI, 1.000-1.000) and 0.998 (95% CI, 0.996-1.000) with respect to healthy volunteers, pulmonary disease patients and ocular disease patients, respectively. For asymptomatic/mild COVID-19 patients, the corresponding AUC values are 0.960 (95% CI, 0.939-0.981), 1.000 (95% CI, 1.000-1.000) and 0.999 (95% CI, 0.997-1.000) (Table 4 and Figure 4). The results for COVID-19 (asymptomatic/mild)-vs.-One are close to those for COVID-19-vs.-One. The 0.003 decrease in the AUC against healthy volunteers suggests that asymptomatic/mild COVID-19 patients are slightly harder to distinguish from healthy volunteers.

Table 4. Classification Performance of the DL-based model on the testing datasets with one vs. rest and one vs. one setting.

Figure 4. The ROC curves for the model distinguishing (asymptomatic/mild) COVID-19 patients from healthy volunteers, pulmonary disease patients and ocular disease patients in the prospective study.

Although the overall classification system performed well, misclassifications also occurred. In the prospective study, of the 139 photos of COVID-19 patients, 107 were classified correctly, 30 were misclassified as healthy volunteers, and 2 were misclassified as pulmonary disease patients. Of the 162 photos of healthy volunteers, 150 were classified correctly, 2 were misclassified as ocular disease and 10 were misclassified as COVID-19; 8 of those 10 photos belong to patients who had just turned from COVID-19 positive to negative. In practice, one person can provide multiple photos as input to the model, which strengthens its robustness, an idea widely used in ensemble learning. With a priority voting strategy, we can boost the sensitivity of the model for the most important disease. As a result, at the subject level only 1 COVID-19 patient was misclassified as a healthy volunteer, and 28 were classified correctly in the prospective study. The confusion matrix of the subject-level classification results in the retrospective and prospective studies is shown in Table 5.
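As a quick check of how the photo-level and subject-level figures relate, the counts quoted above can be turned into sensitivities directly. The variable names below are illustrative; the counts are the ones reported in the text for the prospective study.

```python
# Photo-level outcomes for the 139 prospective photos of COVID-19 patients.
covid_photos = {"COVID-19": 107, "healthy": 30, "pulmonary": 2}
photo_sensitivity = covid_photos["COVID-19"] / sum(covid_photos.values())

# Subject-level outcomes after priority voting over each patient's photos.
covid_subjects_correct = 28
covid_subjects_missed = 1
subject_sensitivity = covid_subjects_correct / (
    covid_subjects_correct + covid_subjects_missed
)

print(round(photo_sensitivity, 3))    # 107 / 139
print(round(subject_sensitivity, 3))  # 28 / 29
```

The photo-level value matches the reported prospective sensitivity of 0.770, and the subject-level value of about 0.966 shows the gain from aggregating several photos per person.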

Ablation study on full eyes, left eye and right eye

Since the eyes are symmetrical, we conducted an ablation study to investigate whether the left or the right eye contains more information for diagnosis. Table 6 shows that the model with the right eye as input achieves the best accuracy in the classification of healthy volunteers, COVID-19 patients, pulmonary disease patients and ocular disease patients. The model with full eyes is better than the model with the left eye on COVID-19, pulmonary disease and ocular disease patients.

Table 5. The confusion matrix of the subject-level classification results in the retrospective and prospective studies.

Table 6. The ablation study on using full eyes, left eye and right eye in the retrospective training and validation dataset.

The continuous observations of patients who turn into COVID-19 negative

Finally, we continuously observed 4 patients as they turned from COVID-19 positive to negative. As shown in Table 7, after turning COVID-19 negative the four patients were less likely to be classified as COVID-19 (8/15 = 53.3%) than when they were COVID-19 positive (12/16 = 75.0%), but they were still much more likely to be misclassified as COVID-19 patients than normal healthy people, who were misclassified as COVID-19 patients in only 2/147 = 1.4% of cases.

Table 7. The continuous observations of patients who turn into COVID-19 negative.

Discussion

Detecting abnormal body temperature has been the initial and most widely used screening method for COVID-19 patients. Meanwhile, the emergence and rising number of asymptomatic COVID-19 patients threaten the effectiveness of temperature-based screening. On the one hand, although RT-PCR, which detects the nucleic acid sequence of the pathogen and amplifies the signal through an amplification reaction, has high sensitivity and good specificity, its long turnaround time may markedly increase the risk of infection, which makes it less than ideal for epidemic control. On the other hand, conventional CT, whether read by radiologists or analyzed by deep learning algorithms, can show bilateral ground-glass appearance and pulmonary turbidity in patients with COVID-19; for some patients, however, the CT findings are not obvious or are even negative. In addition, there are many other bottlenecks and difficulties for COVID-19 diagnosis globally, such as restrictions on pharyngeal swab sampling, the shortage of RT-PCR kits, difficulties in the transportation and preservation of samples, political factors in some countries, and so on.

Our model can successfully distinguish COVID-19 patients from healthy persons, patients with pulmonary diseases other than COVID-19 (e.g., pulmonary fungal infection, bronchopneumonia, chronic obstructive pulmonary disease, and lung cancer), and ocular disease patients. The model has two major components: an image preprocessing method that detects, crops and aligns the eye area in the input image, and a deep learning based classification network that extracts discriminative features and recognizes COVID-19 patients from the processed eye-region data. The ocular surface features extracted by the DL model from the training data show that the eye features of different diseases differ clearly and follow certain regular patterns. For healthy persons, the DL model extracts ocular surface features around the iris, sclera and eyelid. To sum up, after eliminating the influence of eye pose and iris position, we draw the following conclusions: (1) the features extracted by the DL model for healthy people are mainly focused on the iris and are evenly distributed; (2) the model's attention for non-COVID-19 pneumonia patients is mainly concentrated near the inner corner of the eye; (3) the ocular features for patients with COVID-19 are mainly focused on the inner and outer corners of the eye and on the upper and lower eyelids; (4) the model's features for patients with ocular diseases are mainly focused on the iris and sclera, with large and irregular coverage. The convenience and easy accessibility of prediction based on ocular photos could provide an alternative setting for COVID-19 screening.

Meanwhile, this study has some limitations. First, the sample size is small and most COVID-19 patients were from East Asia (China). A larger multicenter study covering patients of more diverse ethnic backgrounds would be necessary to test the performance of the deep learning system based on ocular surface features. Second, the continuous observation study covered only four COVID-19 patients at a limited number of time points; a longitudinal observation study including more patients and time points would provide more evidence to evaluate the screening performance of the model. Lastly, the pathological significance of the features extracted from COVID-19 patients should be carefully interpreted and re-verified by ophthalmologists.

Data Availability

All participants provided written informed consent at the time of recruitment. Please contact the first authors regarding data availability.

Declarations

Authors’ contributions

Mengwei Gu (MG) and Feng Li (FL) conceived the presented idea, discussed it with the other authors, and drafted the manuscript. Yanwei Fu (YF) and Xiangyang Xue (XX) contributed to the design of the model, conducted experiments on all datasets and wrote the paper. Qiang Sun (QS) and Haicheng Tang (HT) provided major support for this work. Haicheng Tang (HT) collected all data from the Shanghai Public Health Clinical Center and provided medical guidance. Qiang Sun (QS) helped to improve the model and also participated in writing the paper. Xiangyang Xue revised the manuscript and supervised the project. All authors provided critical feedback and helped shape the research, analysis, and manuscript.

Conflict of interest statements

All other authors declare no competing interests.

Ethics committee approval

All participants provided written informed consent at the time of recruitment. This study was approved by the Ethics Committee of Shanghai Public Health Clinical Center, Fudan University.

Footnotes

The whole paper is reorganized. We add more clinical experimental analysis, and more discussion as well. More interestingly, we have some clinical experiments over the asymptomatic/mild covid-19 patients. And our method can identify these patients.

References

1. Rodriguez Mega, E. COVID has killed more than one million people. How many more will die? Nature (2020).
2. Guan, W.J., et al. Clinical Characteristics of Coronavirus Disease 2019 in China. N Engl J Med 382, 1708–1720 (2020).
3. Huang, C., et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 395, 497–506 (2020).
4. Wang, D., et al. Clinical Characteristics of 138 Hospitalized Patients With 2019 Novel Coronavirus-Infected Pneumonia in Wuhan, China. JAMA 323, 1061–1069 (2020).
5. Wynants, L., et al. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal. BMJ 369, m1328 (2020).
6. Long, Q.X., et al. Clinical and immunological assessment of asymptomatic SARS-CoV-2 infections. Nat Med 26, 1200–1204 (2020).
7. Mei, X., et al. Artificial intelligence-enabled rapid diagnosis of patients with COVID-19. Nat Med 26, 1224–1228 (2020).
8. Zhang, K., et al. Clinically Applicable AI System for Accurate Diagnosis, Quantitative Measurements, and Prognosis of COVID-19 Pneumonia Using Computed Tomography. Cell 181, 1423–1433 e1411 (2020).
9. Ting, D.S.W., et al. Development and Validation of a Deep Learning System for Diabetic Retinopathy and Related Eye Diseases Using Retinal Images From Multiethnic Populations With Diabetes. JAMA 318, 2211–2223 (2017).
10. Colavita, F., et al. SARS-CoV-2 Isolation From Ocular Secretions of a Patient With COVID-19 in Italy With Prolonged Viral RNA Detection. Ann Intern Med 173, 242–243 (2020).
11. Gupta, A., et al. Extrapulmonary manifestations of COVID-19. Nat Med 26, 1017–1032 (2020).
12. Kirschenbaum, D., et al. Inflammatory olfactory neuropathy in two patients with COVID-19. Lancet 396, 166 (2020).
13. Zhou, Y., et al. Ocular Findings and Proportion with Conjunctival SARS-COV-2 in COVID-19 Patients. Ophthalmology 127, 982–983 (2020).
14. T Wang, J.C., A Hunter, D Greig. Learnable Stroke Models for Example-based Portrait Painting. BMVC (2013).
15. A Krizhevsky, I.S., GE Hinton. Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25 (NIPS 2012) (2012).
16. Simonyan, K., and Andrew Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. ICLR 2015: International Conference on Learning Representations (2015).
17. Huang, P.W. & Lee, C.H. Automatic classification for pathological prostate images based on fractal analysis. IEEE Trans Med Imaging 28, 1037–1050 (2009).
18. Zia, M.S., Majid Hussain, and M. Arfan Jaffar. A novel spontaneous facial expression recognition using dynamically weighted majority voting based ensemble classifier. Multimedia Tools and Applications (2018).