Estimating pulse wave velocity from the radial pressure wave using machine learning algorithms
==============================================================================================

* Weiwei Jin
* Phil Chowienczyk
* Jordi Alastruey

## Abstract

The European gold standard measurement of vascular ageing, a risk factor for cardiovascular disease, is the carotid-femoral pulse wave velocity (cfPWV), which requires an experienced operator to measure pulse waves at multiple sites. In this work, two machine learning pipelines are proposed to estimate cfPWV from a peripheral pulse wave measured at a single site: the radial pressure wave measured by applanation tonometry. The study populations were the Twins UK cohort, containing 3,082 subjects aged from 18 to 110 years, and a database containing 4,374 virtual subjects aged from 25 to 75 years. The first pipeline uses Gaussian process regression to estimate cfPWV from features extracted from the radial pressure wave using pulse wave analysis. The mean difference and the upper and lower limits of agreement (LOA) of the estimation on the 924 hold-out test subjects from the Twins UK cohort were 0.2 m/s, and 3.75 m/s & -3.34 m/s, respectively. Each estimate was also accompanied by a 95% confidence interval, and these intervals covered 98% of the measured data. The second pipeline uses a recurrent neural network (RNN) to estimate cfPWV from the entire radial pressure wave. The mean difference and the upper and lower LOA of the estimation on the 924 hold-out test subjects from the Twins UK cohort were 0.05 m/s, and 3.21 m/s & -3.11 m/s, respectively. A further test of the noise sensitivity of the RNN estimation on the database of virtual subjects showed that the percentage error increased by less than 2% when 20% noise was added to the waveform. These results show the possibility of replacing cfPWV with a peripheral pulse wave, such as the radial pressure wave, for vascular ageing assessment.
The code for the proposed machine learning pipelines is available from the following online repository ([https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal](https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal)).

## Introduction

Vascular ageing is the result of age-induced damage to vascular structure and function, and it leads to an increased risk of chronic diseases such as cardiovascular disease (CVD) and type 2 diabetes [1, 2]. Reducing the risk factors related to vascular ageing (*e.g.* blood pressure, glycemia, and lipids) at an early stage could prevent further progression of the disease [3]. Further studies have also shown that vascular ageing is associated with lifestyle [4] and exercise [5]. Thus, detecting vascular ageing at an early stage can lead to early intervention and prevention of the relevant diseases.

Studies have shown that arterial stiffening, which results from a loss of arterial compliance, acts as a proxy for vascular ageing [6, 7]. It has been suggested that arterial stiffness can be evaluated through the measurement of pulse wave velocity (PWV) [8, 9], for which the European standard assessment is the carotid-femoral PWV (cfPWV) [10]. Despite its wide use, cfPWV requires measurements at two arterial sites, manual handling of the probes, and an estimate of the distance between the carotid and femoral arteries, which makes the measurement operator-dependent. A single-site, automated measurement could overcome these limitations of the current clinical assessment of vascular ageing.

Machine learning methods have been applied to a range of medical problems, including the detection of CVD. The majority of machine learning research involving medical signals is based on either electrocardiogram (ECG) [11, 12] or photoplethysmogram (PPG) [13] data. Those studies mainly focused on critical CVD that could lead to mortality, such as heart failure [14, 15].
However, the development of CVD is a long process, and early detection and intervention can stop disease progression and avoid high medical costs and mortality [16]. Using machine learning methods to detect earlier signs of CVD would therefore be beneficial for cardiovascular health. Although few studies have assessed CVD risk with machine learning methods, interest in the subject has grown recently. For instance, a recent study proposed an algorithm to estimate the size of an abdominal aortic aneurysm from pressure waves measured at the carotid, brachial and femoral arteries using deep learning methods [17]. In vascular ageing research, Tavallali *et al*. used an artificial neural network to estimate cfPWV with an RMSE of 1.1244 m/s. However, their approach required a central pressure wave (the carotid pressure wave) and also included other medical record information, such as chronological age [18].

This study aims to estimate cfPWV (hereafter referred to as PWV) from only the pulse wave measured at a single peripheral site (the radial artery in this study) using machine learning algorithms. The work is divided into the following three case studies.

**Case Study 1** proposes a machine learning pipeline that uses Gaussian process regression to estimate PWV from key features (the timing and magnitude of the fiducial points, and the heart rate) extracted from the radial pressure wave, using data from the Twins UK cohort.

**Case Study 2** presents a second machine learning pipeline that uses a recurrent neural network (RNN) with long short-term memory (LSTM) to estimate PWV from the entire radial pressure waveform, also using data from the Twins UK cohort.

**Case Study 3** assesses the ability of the RNN to estimate PWV from radial pressure waveforms with added random noise, using a database of virtual subjects, since the input to the RNN can be an entire pulse waveform without any noise reduction.
Both machine learning pipelines established in this article are available from the following online repository ([https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal](https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal)).

## Case Study 1: Estimate PWV from Radial Pressure Wave Features

### Methods

#### Study population

The study population in case study 1 consisted of 3,082 unselected twins (99% female) from the Twins UK cohort. The mean and standard deviation of the biological characteristics of these subjects can be found in Table 1. The study was approved by the St Thomas’ Hospital Research Ethics Committees, and all subjects provided written informed consent. Most of the measurement data from the Twins UK cohort are available to external researchers via an application. More information about this cohort can be found on its official website ([https://twinsuk.ac.uk](https://twinsuk.ac.uk)) and in relevant publications [19, 20]. The data used in this case study were the radial pressure waves measured by applanation tonometry and the cfPWV measured by SphygmoCor CvMS. The data were acquired by an experienced operator over the period 2006 to 2017.

[Table 1.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20239962/T1) Biological characteristics of the subjects from the Twins UK cohort (N = 3,082). SD: standard deviation; BMI: body mass index; DBP: diastolic blood pressure; SBP: systolic blood pressure; MAP: mean arterial pressure; PWV: pulse wave velocity.

#### Wave feature extraction

The features of the radial pressure wave were the timings and magnitudes of the fiducial points identified on the waveform, together with the heart rate, and were extracted using the pulse wave analyser developed by Charlton *et al*. [21]. In total, 14 fiducial points were identified on each waveform, giving 29 features per radial pressure wave (the timing and magnitude of each fiducial point, plus the heart rate).
More detailed descriptions of the fiducial points can be found in previous studies by Charlton *et al*. [21, 22].

#### Preprocessing for Gaussian process regression

Before performing the Gaussian process regression, LASSO regression was used to identify the key features among all features extracted from the waveform. Principal component analysis (PCA) was then performed to exclude outliers from the analysed dataset, as outliers can affect the accuracy of machine learning algorithms [23]. The [linear model](https://scikit-learn.org/stable/modules/linear_model.html#lasso) module from the scikit-learn package was used to perform the LASSO regression in Python. The hyperparameter in the model was found by cross-validation using the GridSearchCV class. PCA was then performed on the key features identified by the LASSO regression using the PCA class from the scikit-learn package. Finally, outliers were identified based on the distance of the data points from the origin, and were excluded from the machine learning training and testing.

#### Gaussian process regression

Gaussian process regression was used to estimate the PWV from the key features of the radial pressure wave identified by LASSO regression. The advantages of Gaussian process regression are that i) it provides an uncertainty for each estimate, which most machine learning regression methods cannot; and ii) the hyperparameters of the model can be identified by maximising the log likelihood, which is less time-consuming than cross-validation. The [GaussianProcessRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html) class and kernel functions from the scikit-learn package were used to perform Gaussian process regression in Python.
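
The preprocessing and regression steps described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' script: the data are synthetic, and the standardisation before LASSO, the `alpha` grid, the use of two principal components, the distance threshold for outliers, and the added `WhiteKernel` noise term are all choices made here for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RationalQuadratic, WhiteKernel
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for 29 waveform features; only the first five carry signal.
n_subjects, n_features = 300, 29
X = rng.normal(size=(n_subjects, n_features))
true_coef = np.zeros(n_features)
true_coef[:5] = [1.5, -1.0, 0.8, 0.6, -0.5]
y = 9.0 + X @ true_coef + rng.normal(scale=0.5, size=n_subjects)  # toy "PWV" in m/s

# Assumption: features are standardised before LASSO (not stated in the text).
X_std = StandardScaler().fit_transform(X)

# Step 1: LASSO, with the regularisation strength chosen by cross-validation.
search = GridSearchCV(Lasso(max_iter=10_000), {"alpha": np.logspace(-3, 0, 10)}, cv=5)
search.fit(X_std, y)
selected = np.flatnonzero(search.best_estimator_.coef_)  # indices of key features
X_key = X_std[:, selected]

# Step 2: PCA on the key features; flag subjects far from the origin as outliers.
scores = PCA(n_components=2).fit_transform(X_key)
distance = np.linalg.norm(scores, axis=1)
threshold = distance.mean() + 4.0 * distance.std()  # hypothetical cut-off
keep = distance <= threshold

# Step 3: Gaussian process regression on a 7:3 train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X_key[keep], y[keep], test_size=0.3, random_state=0
)
kernel = RationalQuadratic() + WhiteKernel()  # WhiteKernel is an assumption
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gpr.fit(X_train, y_train)  # kernel hyperparameters set by maximising the log marginal likelihood

# Predictive mean and standard deviation give an approximate 95% interval.
y_pred, y_std = gpr.predict(X_test, return_std=True)
lower, upper = y_pred - 1.96 * y_std, y_pred + 1.96 * y_std
coverage = np.mean((y_test >= lower) & (y_test <= upper))

print(f"{selected.size} features kept, {np.count_nonzero(~keep)} outliers removed")
print(f"fraction of test targets inside the 95% interval: {coverage:.2f}")
```

For brevity the sketch fits the scaler, LASSO and PCA on all subjects before splitting; an analysis concerned with information leakage would fit them on the training portion only.
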
Three kernel functions (the radial basis function (RBF), the Matérn kernel with *ν* = 5/2, and the rational quadratic kernel) and their sum combinations were tested (results shown in S1 Fig). The rational quadratic kernel was chosen for this study based on the accuracy of the estimation.

#### Other machine learning methods

To confirm the accuracy of the PWV estimation by Gaussian process regression, three other machine learning methods were also used to estimate the PWV: support vector regression (SVR) and two tree-based methods (*i.e.* random forest regression and gradient boosting regression). All machine learning algorithms were run using the scikit-learn package. The hyperparameters of the SVR were tuned by cross-validation using the optunity package. The hyperparameters of the tree-based methods were tuned by cross-validation with random search using the scikit-learn package. In addition, for all methods apart from the tree-based ones, the features from the radial pressure wave were normalised using the [StandardScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html) class in the scikit-learn package. The ratio of training to testing/development data for all machine learning analyses was 7:3.

#### Error evaluation

The root mean square error (RMSE) was calculated to evaluate each machine learning approach. It is defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\widehat{\mathrm{PWV}}_i - \mathrm{PWV}_i\right)^2},$$

where *n* is the size of the test dataset, and $\widehat{\mathrm{PWV}}_i$ and $\mathrm{PWV}_i$ are the *i*th estimated and measured PWV, respectively. A percentage error, *ϵ*, was then calculated from the RMSE:

$$\epsilon = \frac{\mathrm{RMSE}}{\overline{\mathrm{PWV}}} \times 100\%,$$

where $\overline{\mathrm{PWV}}$ is the mean value of the PWV of the study population.

### Results

The number of features from the radial pressure wave was reduced from 29 to 17 by the LASSO regression. The fiducial points containing those key features are shown in Fig 1a.
Then, PCA was performed on the subjects using only those key features (Fig 1b). The results showed that 3 of the 3,082 subjects were outliers, and these were therefore excluded from the machine learning analysis.

![Fig 1.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20239962/F1.medium.gif)

[Fig 1.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20239962/F1) Data pre-processing for pulse wave velocity estimation from the features extracted from the radial pressure wave. (a) The fiducial points containing key features identified by the LASSO regression. (b) Outliers identified in the database using principal component analysis (PCA). Red, blue and green dots represent subject groups with pulse wave velocity (PWV) less than 7 m/s, 7-9 m/s, and greater than 9 m/s, respectively.

The Gaussian process regression was performed on the study population without the outliers (3,079 data samples). The model was trained on 2,155 data samples, and the estimation results and errors on the hold-out test set of 924 samples are shown in Fig 2a&c and Table 2, respectively. Fig 2a shows a linear relationship between the estimated and measured PWV, with a slope of 1.00 and an offset of 0.24 m/s. The coefficient of determination, r², is 0.42, and the p-value is less than 0.0001. The Bland-Altman plot shows a mean difference of 0.2 m/s, and upper and lower limits of agreement (LOA) of 3.75 m/s & -3.34 m/s (Fig 2c). Both plots suggest that the accuracy of the PWV estimation deteriorated as the value of PWV increased. Table 2 shows that PWV could be estimated from the radial pulse waveform with an RMSE of 1.82 m/s and a percentage error, *ϵ*, of 19.4% over the whole test set. In addition, Gaussian process regression provides a 95% confidence interval alongside each estimated PWV (S2 Fig); 98% of the measured PWV values fell within this interval.

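
The agreement statistics reported above can be computed from paired measured and estimated values as in the sketch below. It is an illustration only: the function name `agreement_metrics` is hypothetical, the limits of agreement follow the conventional Bland-Altman definition (mean difference ± 1.96 standard deviations of the differences), and the percentage error is normalised by the mean of the measured values passed in, which is assumed here to stand in for the study-population mean.

```python
import numpy as np


def agreement_metrics(pwv_measured, pwv_estimated):
    """Return RMSE, percentage error and Bland-Altman statistics for paired PWV values."""
    measured = np.asarray(pwv_measured, dtype=float)
    estimated = np.asarray(pwv_estimated, dtype=float)
    diff = estimated - measured                      # estimated minus measured
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    pct_error = 100.0 * rmse / float(measured.mean())
    mean_diff = float(diff.mean())                   # bias
    sd_diff = float(diff.std(ddof=1))                # sample standard deviation
    return {
        "rmse": rmse,
        "pct_error": pct_error,
        "mean_diff": mean_diff,
        "loa_upper": mean_diff + 1.96 * sd_diff,
        "loa_lower": mean_diff - 1.96 * sd_diff,
    }


# Toy example with three subjects (values in m/s).
metrics = agreement_metrics([8.0, 10.0, 12.0], [9.0, 10.0, 11.0])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```
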
[Table 2.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20239962/T2) The root mean square error (RMSE) and percentage error (*ϵ*) of the estimated pulse wave velocity (PWV) using different machine learning methods.

![Fig 2.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20239962/F2.medium.gif)

[Fig 2.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20239962/F2) Estimation of pulse wave velocity (PWV) on a hold-out test set containing 924 subjects using Gaussian process regression and a recurrent neural network (RNN) with long short-term memory (LSTM). (a) and (b) show the estimated PWV plotted against the measured PWV, with a linear regression line in red, the coefficient of determination, r², and the p-value. (c) and (d) show the Bland-Altman plots comparing the estimated and measured PWV. (e) and (f) show the Pearson correlation coefficients (r) between the biological characteristics and the difference shown in panels (c) and (d), respectively. BMI: body mass index; DBP: diastolic blood pressure; SBP: systolic blood pressure; MAP: mean arterial pressure.

To confirm the accuracy of the estimation made by Gaussian process regression, three other machine learning methods were applied to the same training and hold-out test sets to estimate the PWV; their error evaluations are given in Table 2. The results show that the other three machine learning methods estimate PWV with smaller errors than Gaussian process regression, with gradient boosting regression obtaining the lowest RMSE (1.63 m/s) and *ϵ* (17.4%). Still, the improvement was limited (less than 0.2 m/s for RMSE, and less than 2% for *ϵ*). Moreover, these alternative methods do not provide an uncertainty for the estimation (*i.e.* a 95% confidence interval), and they take longer to train (≥ 30 minutes, compared with ≤ 1 minute for Gaussian process regression).
In addition, the plots of measured against estimated PWV and the Bland-Altman plots for these three algorithms can be found in S3 Fig.

Furthermore, Pearson’s correlation coefficient, r, was used to investigate whether the accuracy of the estimations using Gaussian process regression could be related to the biological characteristics. The characteristics studied were height, weight, body mass index (BMI), chronological age, diastolic blood pressure (DBP), systolic blood pressure (SBP), and mean arterial pressure (MAP). Fig 2e shows that the difference between the estimated and measured PWV correlates most strongly with age (r = 0.286).

## Case Study 2: Estimate PWV from the Whole Radial Pressure Wave

### Methods

The study population in case study 2 is identical to that in case study 1, and the same error evaluation metrics were used to assess the PWV estimation. The machine learning approach in case study 2, an RNN, is described in the following subsection.

#### Recurrent neural network

The schematic of the RNN structure used in this case study is shown in Fig 3. The input was a time-variant radial pressure waveform. As the duration of the cardiac cycle varied between subjects, the duration of the radial pressure wave also differed from subject to subject. To overcome this difference in input length, the shorter waves were extended to the duration of the longest wave by appending dummy values (the maximum floating-point number in this case) at the end. A masking layer was then applied to exclude the dummy values from consideration when estimating the PWV. Afterwards, a bidirectional RNN with LSTM was used to process the time-variant radial pressure waveform, as it has proven effective for forecasting time series data [24–26]. Finally, a dense layer with a linear activation function was used to estimate the PWV from the output of the bidirectional RNN with LSTM.
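
The padding step described above can be sketched in NumPy as follows. The function name `pad_waves`, the `float32` data type and the toy pressure values are assumptions for illustration. In Keras, the boolean mask computed here would normally be produced by a `Masking` layer configured with the same dummy value, so that downstream recurrent layers skip the padded time steps.

```python
import numpy as np

# Dummy value used for padding: the largest representable float32 (dtype is an assumption).
PAD_VALUE = np.finfo(np.float32).max


def pad_waves(waves, pad_value=PAD_VALUE):
    """Pad variable-length waves to a common length.

    Returns an array of shape (n_waves, max_len, 1) and a boolean mask of
    shape (n_waves, max_len) that is True for real samples and False for padding.
    """
    max_len = max(len(wave) for wave in waves)
    padded = np.full((len(waves), max_len, 1), pad_value, dtype=np.float32)
    for i, wave in enumerate(waves):
        padded[i, : len(wave), 0] = wave          # real samples first, padding at the end
    mask = np.any(padded != pad_value, axis=-1)   # a time step is real if it differs from the dummy
    return padded, mask


# Two toy "waves" of different duration (pressure in mmHg).
waves = [np.array([80.0, 120.0, 100.0]), np.array([75.0, 110.0])]
padded, mask = pad_waves(waves)
print(padded.shape)
print(mask)
```
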
Before carrying out the main simulation, hyperparameter tuning was undertaken and the following parameters were chosen: number of LSTM units = 16; batch size = 64; number of epochs = 1500; optimizer = Adam. The RNN was constructed using the open-source neural-network library [TensorFlow Core v.2.2.0](https://www.tensorflow.org/api_docs/python/tf), including its high-level application programming interface [Keras](https://www.tensorflow.org/api_docs/python/tf/keras). The ratio of training to testing/development data for the RNN was also 7:3.

![Fig 3.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20239962/F3.medium.gif)

[Fig 3.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20239962/F3) A schematic illustration of the recurrent neural network structure used to estimate pulse wave velocity from an entire radial pressure wave. $P_t$ is the time-variant radial pressure wave data at the discrete time point *t*; cfPWV is the carotid-femoral pulse wave velocity.

### Results

The RNN with LSTM was trained and tested on the same training and test sets as those used in case study 1. Fig 2b&d show the performance of the RNN in estimating the PWV from the entire radial pressure wave. In comparison with the estimation using Gaussian process regression, the estimation using the RNN has a smaller offset in the regression line and a larger coefficient of determination (r²). The Bland-Altman plots show that both the mean difference and the upper and lower LOA are smaller in magnitude than for the estimation by Gaussian process regression (Fig 2c&d). The RMSE and percentage error, *ϵ*, of the PWV estimation using the RNN are shown in Table 2, and are similar to those of the other machine learning methods in the same table.
Furthermore, Pearson’s correlation coefficients, r, between the biological characteristics and the difference between measured and estimated PWV were calculated for the RNN (Fig 2f). They were similar to those obtained with Gaussian process regression, with age showing the strongest correlation (r = 0.297).

## Case Study 3: Estimate PWV with Noisy Radial Pressure Wave

### Methods

The structure of the RNN used in case study 3 was the same as in case study 2. The ratio of training to testing/development data in this case study was also set to 7:3. The same error evaluation metrics as in the previous two case studies were used to assess the PWV estimation. The study population and the noise generation used in this case study are described in the following two subsections.

#### Study population

To systematically investigate the effects of high-frequency noise on the radial pressure wave, a database containing 4,374 virtual subjects, representing a sample of “healthy” adults aged between 25 and 75 years in ten-year increments, was used as the study population. The database can be found in the following repository: [https://github.com/peterhcharlton/pwdb/wiki/Using-the-Pulse-Wave-Database](https://github.com/peterhcharlton/pwdb/wiki/Using-the-Pulse-Wave-Database). The data used in this case study were the radial pressure waves and cfPWV. Further details of this database can be found in a previous study [22]. The rationale behind choosing a database of virtual subjects was to eliminate the possible effects of measurement errors.

#### Noise generation

High-frequency Gaussian white noise of different intensities was generated and added to the radial pressure waves from the database of virtual subjects to test the noise sensitivity of the PWV estimation by the RNN.
The intensity of the noise was defined using the signal-to-noise ratio (SNR), similar to the approach in [27]. The SNR was calculated as

$$\mathrm{SNR} = \frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}},$$

where $P_{\mathrm{signal}}$ and $P_{\mathrm{noise}}$ are the power (averaged amplitude) of the signal and the noise, respectively, so that an SNR of 5 corresponds to adding 20% noise. An example of an original signal and the same signal with SNRs of 20, 10, and 5 is shown in Fig 4. In this case study, six different SNRs were considered: 20, 16, 12, 10, 8 and 5.

![Fig 4.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20239962/F4.medium.gif)

[Fig 4.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20239962/F4) An example of an original signal, and the same signal with a signal-to-noise ratio (SNR) of 20, 10, and 5, respectively.

### Results

The radial pressure waves from the database of virtual subjects, with different levels of random Gaussian white noise added, were used to test the noise sensitivity of the PWV estimation from the entire radial pressure wave using the RNN. The plots of measured against estimated PWV and the Bland-Altman plots for the baseline radial pressure wave and for waveforms with SNRs of 20, 10 and 5 are shown in Fig 5. The coefficient of determination, r², was ≥ 0.98 for all cases considered. The mean difference did not increase, but the upper and lower LOA widened from 0.14 m/s & -0.24 m/s to 0.5 m/s & -0.56 m/s when 20% noise was added to the original radial pressure wave (SNR = 5). The error evaluation is shown in Table 3. The RMSE increased from 0.10 m/s to 0.24 m/s, and the percentage error, *ϵ*, increased from 1.2% to 2.8%, when 20% noise was added to the original radial pressure wave. In addition, the errors of the PWV estimation using the baseline waveforms (*i.e.* without any noise) from the database of virtual subjects were more than 10 times smaller than the errors of the PWV estimation using the data from the Twins UK cohort.

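
The noise generation described in the Noise generation subsection can be sketched as below. This is a hedged illustration rather than the authors' procedure: the function name `add_white_noise` and the sinusoidal toy wave are hypothetical, the SNR is treated as a plain (non-decibel) ratio, and the signal power is taken as the mean squared amplitude of the mean-removed wave, whereas the text only describes power as an averaged amplitude. A different power definition would change the noise scale.

```python
import numpy as np


def add_white_noise(wave, snr, rng):
    """Add zero-mean Gaussian white noise so that P_signal / P_noise is close to snr."""
    wave = np.asarray(wave, dtype=float)
    pulsatile = wave - wave.mean()            # assumption: power of the pulsatile part only
    p_signal = np.mean(pulsatile ** 2)
    p_noise = p_signal / snr                  # SNR as a plain ratio, not in decibels
    noise = rng.normal(loc=0.0, scale=np.sqrt(p_noise), size=wave.shape)
    return wave + noise


rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 2000, endpoint=False)
wave = 90.0 + 20.0 * np.sin(2.0 * np.pi * t)  # toy pressure wave in mmHg

# The six SNR levels considered in the case study.
noisy = {snr: add_white_noise(wave, snr, rng) for snr in (20, 16, 12, 10, 8, 5)}

for snr, signal in noisy.items():
    realised = np.mean((wave - wave.mean()) ** 2) / np.mean((signal - wave) ** 2)
    print(f"target SNR {snr:>2}: realised SNR {realised:.1f}")
```

Because the noise is a random draw, the realised ratio only approximates the target; it gets closer as the number of samples in the wave grows.
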
[Table 3.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20239962/T3) The root mean square error (RMSE) and percentage error (*ϵ*) of the pulse wave velocity (PWV) estimated from radial pressure waves with different noise intensities using a recurrent neural network (RNN).

![Fig 5.](http://medrxiv.org/https://www.medrxiv.org/content/medrxiv/early/2020/11/30/2020.11.29.20239962/F5.medium.gif)

[Fig 5.](http://medrxiv.org/content/early/2020/11/30/2020.11.29.20239962/F5) Comparison of the measured and estimated pulse wave velocity (PWV) and Bland-Altman plots using radial pressure waves with different noise intensities on a hold-out test set containing 1,312 virtual subjects. SNR: signal-to-noise ratio.

## Discussion

In this study, two machine learning pipelines were proposed to estimate PWV from the radial pressure wave: i) Gaussian process regression on features extracted from the waveform, and ii) an RNN applied to the entire waveform. The results show that PWV can be estimated with both pipelines, with the second pipeline giving a higher accuracy and a lower bias in the estimated PWV. However, the improvement in accuracy from the second pipeline was limited, which indicates that the features extracted from the radial pressure wave using the pulse wave analyser developed by Charlton *et al*. [22] were sufficient to represent the entire radial pressure wave. Some of the key features identified by LASSO regression and used in the PWV estimation by Gaussian process regression can also be used to calculate pulse wave indices that are closely related to vascular ageing [28–30]. For instance, the reflection index can be calculated from dia; the augmentation index and augmentation pressure can be calculated from p1in and p2pk; and the modified ageing index is related to a, b, and c from the second derivative of the waveform.
In addition, Gaussian process regression was able to provide a 95% confidence interval for each estimate, covering at least 98% of the measured PWV values, and required less time to train (less than a minute using the data from the Twins UK cohort). On the other hand, in order to use the pulse wave analyser to extract features, the wave needs to be preprocessed to eliminate high- and low-frequency noise, which can result in a loss of information. Using the RNN to analyse the entire pulse wave signal does not require the waves to be preprocessed, which avoids the loss of information due to pulse wave signal processing. The results in Table 3 suggest that the PWV estimation using the RNN could provide accurate results even with noisy pressure waves.

Compared with other non-invasive devices (*e.g.* Pulse Pen [31]) and measurement methods (*e.g.* the oscillometric method [32]) that require two arterial measurement sites, the mean differences between the estimated and measured PWV were similar or smaller (≤ 0.214 m/s for Pulse Pen and 0.4 m/s for the oscillometric method, vs ≤ 0.2 m/s in this study), whereas the upper and lower LOA were larger in this study (≤ 1.346 m/s & ≥ -0.918 m/s for Pulse Pen and ≤ 2.9 m/s & ≥ -2.0 m/s for the oscillometric method, vs ≤ 3.75 m/s & ≥ -3.34 m/s in this study). On the other hand, compared with a non-invasive device that only requires a single-site measurement (*e.g.* Arteriograph [33]), the mean difference was the same for the estimation using Gaussian process regression (0.2 m/s), and the upper and lower LOA were smaller in this study (≤ 4.5 m/s & ≥ -4.01 m/s for Arteriograph vs ≤ 3.75 m/s & ≥ -3.34 m/s in this study). Furthermore, the root mean square error (RMSE) of the estimation was larger than in the machine learning study performed by Tavallali *et al*. [18] (RMSE = 1.1244 m/s). However, this could be due to the fact that the average PWV in Tavallali *et al*.’s study was smaller than in this study, and that neither additional patient information (*e.g.* chronological age) nor information from central arteries (*e.g.* the carotid artery) was used in this study.

The errors in the PWV estimation using the machine learning pipelines proposed in this study may have the following causes. Firstly, they could come from inaccurate PWV measurements. Previous studies [34, 35] have pointed out that the accuracy of the PWV measurement can be strongly affected by the measured distance between the carotid and femoral arteries, which is measured on the patient’s body surface with a tape when using the SphygmoCor CvMS device. The RMSE and percentage error of the RNN estimation on the noise-free data from the database of virtual subjects were smaller than on the Twins UK cohort (0.10 m/s vs 1.59 m/s and 1.2% vs 16.9%), which also suggests that the large error in the estimation using the Twins UK cohort could originate from measurement errors. However, further investigation of the accuracy of the PWV measurement would be needed to test this hypothesis. Secondly, the errors of the PWV estimation increased with increasing PWV values, which could be due to the low number of high-PWV samples in the dataset. It is known that the accuracy of machine learning algorithms decreases when the sample size decreases [36]. This issue could potentially be solved by obtaining more data to build a population with a light-tailed distribution and more subjects for developing the machine learning algorithm. Lastly, the errors in the PWV estimation could also result from confounding biological characteristics of the patients, as the radial pressure wave was the only input used in the estimation. The Pearson’s correlation coefficient, r, between those biological characteristics and the difference between the estimated and measured PWV suggested that adding age as a predictor could potentially improve the estimation.
However, chronological age does not necessarily correspond to biological age [37], so adding age as a predictor could also bias the estimation results. Nevertheless, the Pearson’s correlation coefficients in both machine learning approaches were smaller than 0.3. According to the guideline [38], a correlation is negligible if r ≤ 0.3. Thus, the analysis suggests that the errors in the estimations do not depend strongly on the biological characteristics.

This study is also subject to a few limitations and requires future work. Firstly, the majority of participants in the Twins UK cohort are female, which means the model trained in this study is less likely to fit well when given unseen data from a wider population. However, this should not affect the accuracy of the estimation within the analysis performed in this study, or its conclusions. Secondly, the pulse wave data in this study contain only a single cardiac cycle. Further study will be needed to investigate the effectiveness of the RNN in estimating cardiovascular indices from a pulse wave containing multiple cardiac cycles. Lastly, the peripheral pulse wave used in this study was the radial pressure wave measured using applanation tonometry. Further studies using other peripheral pulse waves, such as the PPG signal measured at the digital artery using a fingertip probe or smartphone camera, or the PPG signal measured around the wrist using an Apple Watch or Fitbit, would be needed to further test the pipelines proposed in this study.

The clinical significance of this study lies in assessing the risk factors for CVD from more accessible measurements. Firstly, the radial pressure wave used to estimate PWV is a peripheral pulse wave, which can be easily measured with non-invasive devices. Secondly, the machine learning pipelines proposed in this study can also take other peripheral pulse waves, such as PPGs, or even single-lead ECGs with more than one cardiac cycle, as input to estimate CVD risks.
Thirdly, the machine learning pipelines proposed in this study can be easily extended to take multiple peripheral pulse waves as input to further improve the accuracy of the estimation of CVD risks.

## Conclusion

In this work, three case studies were carried out to investigate the possibility of estimating PWV (a well-established biomarker) from the radial pressure wave (a peripheral pulse wave) using machine learning methods. The results show that PWV can be estimated either from features extracted from the pulse wave or from the entire waveform, with a mean difference of up to 0.2 m/s and upper and lower LOA of up to 3.75 m/s and -3.34 m/s on the clinical database (Twins UK cohort). Furthermore, the results on the database of virtual subjects suggest that the estimation of PWV from the entire radial pressure wave using the RNN can still be achieved when up to 20% noise is added to the wave signal. The outcome of this study can help deliver vascular ageing assessment to a wider population and enable repeated measurements that can improve the accuracy of the assessment. Further application of the machine learning pipelines proposed in this study can help with remote patient monitoring and connected health. The scripts for the machine learning pipelines proposed in this study are available in the following online repository: [https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal](https://github.com/WeiweiJin/Estimate-Cardiovascular-Risk-from-Pulse-Wave-Signal).

## Data Availability

Most of the measurement data from the Twins UK cohort are available to external researchers via an application. The database of virtual subjects can be found in an online repository.
[https://twinsuk.ac.uk](https://twinsuk.ac.uk)

[https://github.com/peterhcharlton/pwdb/wiki/Using-the-Pulse-Wave-Database](https://github.com/peterhcharlton/pwdb/wiki/Using-the-Pulse-Wave-Database)

## Supporting information

**S1 Fig. Estimation of pulse wave velocity (PWV) using Gaussian process regression with different kernel functions and their sum combinations**. RBF: radial basis function; Matérn: Matérn kernel; RQ: rational quadratic kernel.

**S2 Fig. Estimation of pulse wave velocity (PWV) with a 95% confidence interval using Gaussian process regression on a hold-out test set containing 924 subjects**. (a) and (b) show the measured and estimated PWV plotted on top of each other; (c) and (d) show the first ten samples in (a) and (b), respectively.

**S3 Fig. Comparison of measured and estimated pulse wave velocity (PWV) and Bland-Altman plots using support vector regression, random forest regression and gradient boosting regression on a hold-out test set containing 924 subjects**.

### Acknowledgments

The authors would like to thank Dr James R. Bland for discussions, especially during the methodology development.

## Footnotes

* * weiwei.jin{at}kcl.ac.uk, weiweijin722{at}gmail.com
* Received November 29, 2020.
* Revision received November 29, 2020.
* Accepted November 30, 2020.
* © 2020, Posted by Cold Spring Harbor Laboratory

This pre-print is available under a Creative Commons License (Attribution-NonCommercial-NoDerivs 4.0 International), CC BY-NC-ND 4.0, as described at [http://creativecommons.org/licenses/by-nc-nd/4.0/](http://creativecommons.org/licenses/by-nc-nd/4.0/)

## References

1. Laina A, Stellos K, Stamatelopoulos K. Vascular ageing: Underlying mechanisms and clinical implications. Experimental Gerontology. 2018;109:16–30. doi:10.1016/j.exger.2017.06.007.
2. North BJ, Sinclair DA. The intersection between aging and cardiovascular disease. Circulation Research. 2012;110(8):1097–1108. doi:10.1161/CIRCRESAHA.111.246876.
3. Nilsson PM, Boutouyrie P, Laurent S. Vascular aging: A tale of EVA and ADAM in cardiovascular risk assessment and prevention. Hypertension. 2009;54(1):3–10. doi:10.1161/HYPERTENSIONAHA.109.129114.
4. Gomez-Sanchez M, Gomez-Sanchez L, Patino-Alonso MC, Cunha PG, Recio-Rodriguez JI, Alonso-Dominguez R, et al. Vascular aging and its relationship with lifestyles and other risk factors in the general Spanish population: Early Vascular Ageing Study. Journal of Hypertension. 2020;38(6):1110–1122. doi:10.1097/HJH.0000000000002373.
5. Niebauer J, Müller EE, Schönfelder M, Schwarzl C, Mayr B, Stöggl J, et al. Acute effects of winter sports and indoor cycling on arterial stiffness. Journal of Sports Science and Medicine. 2020;19(3):460–468.
6. Nilsson PM, Lurbe E, Laurent S. The early life origins of vascular ageing and cardiovascular risk: The EVA syndrome. Journal of Hypertension. 2008;26(6):1049–1057. doi:10.1097/HJH.0b013e3282f82c3e.
7. Laurent S, Boutouyrie P, Cunha PG, Lacolley P, Nilsson PM. Concept of extremes in vascular aging: From early vascular aging to supernormal vascular aging. Hypertension. 2019;74(2):218–228. doi:10.1161/HYPERTENSIONAHA.119.12655.
8. Vlachopoulos C, Aznaouridis K, Stefanadis C. Prediction of cardiovascular events and all-cause mortality with arterial stiffness. A systematic review and meta-analysis. Journal of the American College of Cardiology. 2010;55(13):1318–1327. doi:10.1016/j.jacc.2009.10.061.
9. Van Bortel LM, Laurent S, Boutouyrie P, Chowienczyk P, Cruickshank JK, De Backer T, et al. Expert consensus document on the measurement of aortic stiffness in daily practice using carotid-femoral pulse wave velocity. Journal of Hypertension. 2012;30(3):445–448. doi:10.1097/HJH.0b013e32834fa8b0.
10. Laurent S, Cockcroft J, Van Bortel L, Boutouyrie P, Giannattasio C, Hayoz D, et al. Expert consensus document on arterial stiffness: Methodological issues and clinical applications. European Heart Journal. 2006;27(21):2588–2605. doi:10.1093/eurheartj/ehl254.
11. Alqudah AM, Albadarneh A, Abu-Qasmieh I, Alquran H. Developing of robust and high accurate ECG beat classification by combining Gaussian mixtures and wavelets features. Australasian Physical and Engineering Sciences in Medicine. 2019;42(1):149–157. doi:10.1007/s13246-019-00722-z.
12. Attia ZI, Kapa S, Lopez-Jimenez F, McKie PM, Ladewig DJ, Satam G, et al. Screening for cardiac contractile dysfunction using an artificial intelligence–enabled electrocardiogram. Nature Medicine. 2019;25(1):70–74. doi:10.1038/s41591-018-0240-2.
13. Biswas D, Everson L, Liu M, Panwar M, Verhoef BE, Patki S, et al. CorNET: Deep learning framework for PPG-based heart rate estimation and biometric identification in ambulant environment. IEEE Transactions on Biomedical Circuits and Systems. 2019;13(2):282–291. doi:10.1109/TBCAS.2019.2892297.
14. Awan SE, Bennamoun M, Sohel F, Sanfilippo FM, Dwivedi G. Machine learning-based prediction of heart failure readmission or death: implications of choosing the right model and the right metrics. ESC Heart Failure. 2019;6(2):428–435. doi:10.1002/ehf2.12419.
15. Cikes M, Sanchez-Martinez S, Claggett B, Duchateau N, Piella G, Butakoff C, et al. Machine learning-based phenogrouping in heart failure to identify responders to cardiac resynchronization therapy. European Journal of Heart Failure. 2019;21(1):74–85. doi:10.1002/ejhf.1333.
16. Karunathilake SP, Ganegoda GU. Secondary prevention of cardiovascular diseases and application of technology for early diagnosis. BioMed Research International. 2018;2018. doi:10.1155/2018/5767864.
17. Chakshu NK, Sazonov I, Nithiarasu P. Towards enabling a cardiovascular digital twin for human systemic circulation using inverse analysis. Biomechanics and Modeling in Mechanobiology. 2020. doi:10.1007/s10237-020-01393-6.
18. Tavallali P, Razavi M, Pahlevan NM. Artificial intelligence estimation of carotid-femoral pulse wave velocity using carotid waveform. Scientific Reports. 2018;8(1):1–12. doi:10.1038/s41598-018-19457-0.
19. Moayyeri A, Hammond CJ, Hart DJ, Spector TD. The UK adult twin registry (TwinsUK resource). Twin Research and Human Genetics. 2013;16(1):144–149. doi:10.1017/thg.2012.89.
20. Moayyeri A, Hammond CJ, Valdes AM, Spector TD. Cohort profile: TwinsUK and healthy ageing twin study. International Journal of Epidemiology. 2013;42(1):76–85. doi:10.1093/ije/dyr207.
21. Charlton PH, Celka P, Farukh B, Chowienczyk P, Alastruey J. Assessing mental stress from the photoplethysmogram: A numerical study. Physiological Measurement. 2018;39(5). doi:10.1088/1361-6579/aabe6a.
22. Charlton PH, Mariscal Harana J, Vennin S, Li Y, Chowienczyk P, Alastruey J. Modeling arterial pulse waves in healthy aging: a database for in silico evaluation of hemodynamics and pulse wave indexes. American Journal of Physiology - Heart and Circulatory Physiology. 2019;317(5):H1062–H1085. doi:10.1152/ajpheart.00218.2019.
23. Perez H, Tah JHM. Improving the accuracy of convolutional neural networks by identifying and removing outlier images in datasets using t-SNE. Mathematics. 2020;8(5). doi:10.3390/MATH8050662.
24. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation. 1997;9:1735–1780. doi:10.1162/neco.1997.9.8.1735.
25. Lipton ZC, Berkowitz J, Elkan C. A critical review of recurrent neural networks for sequence learning. arXiv. 2015; p. 1–38.
26. Schmidhuber J. Deep learning in neural networks: An overview. Neural Networks. 2015;61:85–117. doi:10.1016/j.neunet.2014.09.003.
27. Gaddum NR, Alastruey J, Beerbaum P, Chowienczyk P, Schaeffter T. A technical assessment of pulse wave velocity algorithms applied to non-invasive arterial waveforms. Annals of Biomedical Engineering. 2013;41(12):2617–2629. doi:10.1007/s10439-013-0854-y.
28. Mikael LdR, de Paiva AMG, Gomes MM, Sousa ALL, Jardim PCBV, Vitorino PVdO, et al. Vascular ageing and arterial stiffness. Arquivos Brasileiros de Cardiologia. 2017;109(3):253–258. doi:10.5935/abc.20170091.
29. Mitchell GF, Parise H, Benjamin EJ, Larson MG, Keyes MJ, Vita JA, et al. Changes in arterial stiffness and wave reflection with advancing age in healthy men and women: The Framingham Heart Study. Hypertension. 2004;43(6):1239–1245. doi:10.1161/01.HYP.0000128420.01881.aa.
30. Wang KL, Cheng HM, Sung SH, Chuang SY, Li CH, Spurgon HA, et al. Wave reflection and arterial stiffness in the prediction of 15-year all-cause and cardiovascular mortalities: A community-based study. Hypertension. 2010;55(3):799–805. doi:10.1038/jid.2014.371.
31. Salvi P, Lio G, Labat C, Ricci E, Pannier B, Benetos A. Validation of a new non-invasive portable tonometer for determining arterial pressure wave and pulse wave velocity: The PulsePen device. Journal of Hypertension. 2004;22(12):2285–2293. doi:10.1097/00004872-200412000-00010.
32. Hametner B, Wassertheurer S, Kropf J, Mayer C, Eber B, Weber T. Oscillometric estimation of aortic pulse wave velocity: Comparison with intra-aortic catheter measurements. Blood Pressure Monitoring. 2013;18(3):173–176. doi:10.1097/MBP.0b013e3283614168.
33. Jekell A, Kahan T. The usefulness of a single arm cuff oscillometric method (Arteriograph) to assess changes in central aortic blood pressure and arterial stiffness by antihypertensive treatment: results from the Doxazosin-Ramipril Study. Blood Pressure. 2018;27(2):88–98. doi:10.1080/08037051.2017.1394791.
34. Segers P, Kips J, Trachet B, Swillens A, Vermeersch S, Mahieu D, et al. Limitations and pitfalls of non-invasive measurement of arterial pressure wave reflections and pulse wave velocity. Artery Research. 2009;3(2):79–88. doi:10.1016/j.artres.2009.02.006.
35. Weir-McCall JR, Khan F, Cassidy DB, Thakur A, Summersgill J, Matthew SZ, et al. Effects of inaccuracies in arterial path length measurement on differences in MRI and tonometry measured pulse wave velocity. BMC Cardiovascular Disorders. 2017;17(1):1–9. doi:10.1186/s12872-017-0546-x.
36. Cui Z, Gong G. The effect of machine learning regression algorithms and sample size on individualized behavioral prediction with functional connectivity features. NeuroImage. 2018;178(May):622–637. doi:10.1016/j.neuroimage.2018.06.001.
37. Shiels PG, McGuinness D, Eriksson M, Kooman JP, Stenvinkel P. The role of epigenetics in renal ageing. Nature Reviews Nephrology. 2017;13(8):471–482. doi:10.1038/nrneph.2017.78.
38. Mukaka MM. Statistics Corner: A guide to appropriate use of correlation coefficient in medical research. Malawi Medical Journal. 2012;24(September):69–71. doi:10.1016/j.cmpb.2016.01.020.

[1]: /embed/graphic-2.gif [2]: /embed/graphic-3.gif [3]: /embed/inline-graphic-1.gif [4]: /embed/graphic-8.gif