Abstract
Object This work aims to introduce a novel method to mitigate the global phase deviation inherent in photoplethysmography imaging (PPGI) due to hemodynamics.
Method We model the facial vascular network captured by a consumer camera as a two-dimensional manifold, where the complex dynamics of the vascular tree leads to intricate phase variations across skin sites. Utilizing PPGI, we sample the vector field on the facial manifold encoding these intricate phase variations over different skin sites resulting from blood volume modulations. We propose leveraging the graph connection Laplacian (GCL) technique to quantify the global phase deviation, with the hypothesis that correcting this deviation can improve the quality of the PPGI signal and that the phase deviation encodes valuable anatomical and physiological information.
Result The proposed algorithm yields a higher-quality global PPGI signal by correcting the global phase deviation estimated by GCL, emphasizing waveform features such as the dicrotic notch. The perfusion map, with the global phase deviation estimated by GCL as intensity, reflects skin perfusion dynamics influenced by varying travel distances and anatomical structures.
Conclusion This algorithm enhances the quality of the global PPGI signal, facilitating the analysis of morphological parameters and showing promise for advancing PPGI applications in scientific research and clinical practice.
1. Introduction
The landscape of cardiovascular health monitoring has undergone a notable transformation with the introduction of smartphone cameras. This technology allows for extraction and analysis of photoplethysmography (PPG) signals. Camera-based PPG imaging (PPGI) has evolved from the original contact-based methods which traces its origin back to the early 20th century [1]. Initially developed to measure blood volume changes in the microvascular bed, PPG has evolved over time to incorporate imaging modalities, providing a non-invasive means of assessing physiological parameters [2]. The advent of consumer electronics and advanced imaging technologies (like CCD earlier [3] and now CMOS sensors [4]) has propelled PPGI into the spotlight, particularly in the last two decades, starting with pioneering works of Blazek et al. group [3], [5]. The PPGI typically involves capturing variations in light absorption or reflection at the skin surface, primarily driven by pulsatile blood flow changes. This technique has found diverse applications, ranging from pulse oximetry [6], [7], [8], [9] and cardiovascular monitoring to more recent innovations in facial PPG for health assessments [10], [11]. The ability of PPGI to provide contactless measurements has garnered interest also in another fields such as such as vital sign monitoring [12], [13], [14], [15], [16], wound assessment [17], [18], driver state estimation [19], [20], [21] or pain evaluation [22]. The continuous refinement of PPGI methodologies and the exploration of novel applications underscore its significance in modern healthcare and wellness monitoring.
Researchers and practitioners are increasingly turning to additional techniques to broaden the comparative analysis and push the boundaries of PPGI capabilities. Starting from traditional spatial averaging [12], which primarily aims to increase the signal-to-noise ratio, different methods are used, differing from each other in their complexity or approach to signal extraction. At this point, we can mention the R-G algorithm [23] combining the red and green channels in the case of PPGI via an RGB camera, or the POS (Plane-Orthogonal-to-Skin) algorithm [24], which works with all three layers, or others such as, also in this case, the historically sorted G [12], which works only with the green layer or greyscale data, the aforementioned G-R [23], PCA [25], ICA [26], CHROM [27], PBV [28], 2SR [29], nonlinear-type time-frequency analysis like synchrosqueezing transform [30], the aforementioned POS [24] or Face2PPG [31]. Another approach is deep or machine learning methods [32], [33], which bring additional potential for feature extraction, classification and understanding of complex links within PPG signals. Another approach is using a lock in amplification method with the reference PPG signal either from external device or by averaging of skin pixels from whole recorded area [34], [35].
There is unavoidable global phase (or temporal) deviation across a body of tissue caused by the properties of the vascular tree, which is a system of interconnecting and branching elastic vessels where the pulse waves (i.e. direct and backward) propagate in different directions and velocities from the heart towards the periphery and back. This is further complicated by the interaction of this vascular tree with the autonomic system causing both localized and global changes in vascular tone. The main novelty of this paper is the use of synchronization based on the graph connection Laplacian (GCL) [36], [37] to quantify and modify the global phase deviation. Already as mentioned above in a physiological context, phase deviations in PPGI signals at different skin sites are intricately linked to the dynamic blood flow pattern in the vascular network. Our hypothesis is that this hemodynamic-induced phase variation embodies valuable anatomical information and potentially other relevant details of interest. We model the vascular network underlying the facial pixels, which in this case are recorded using a consumer camera, as a two-dimensional manifold that encodes the phase of localized blood flow. The goal is to use PPGI over each pixel to reconstruct this phase information and thus obtain an integral and improved global PPGI curve that could be used to analyze its morphological parameters and at the same time improve the spatial resolution of PPGI in mapping skin perfusion dynamics.
2. Materials and Methods
In this section, we provide a detailed description of our algorithm (the flow chart is shown in Fig. 1), which aims to extract tissue perfusion information from video recording acquired by consumer electronics (a smartphone camera) and improve the properties of the detected PPG curve in terms of its morphology and the detection of spatial dynamics across the recorded region.
2.1. Experimental setup
The experimental setup is based on a study conducted on volunteers undergoing a lower body negative pressure (LBNP) procedure [38], with their faces being recorded using a consumer-quality camera, namely the iPhone 11 (Apple, Cupertino, CA, USA). During recording, automatic settings such as autofocus were turned off while exposure was locked to prevent changes in the image that could reduce the quality of the perfusion information. Video was recorded in H264 compression format with a frame rate of 30 Hz. The subject’s face in supine position was illuminated with white diffuse LED light. As part of the comprehensive LBNP protocol, we selected only a 30 s segment from the baseline portion of the recording. The study protocol was approved by the Institutional Review Board of Yale University (No. 2000031899). Informed consent was obtained from all subjects prior to the study. All methods were performed in accordance with the study protocol and with the Declaration of Helsinki. To verify the functionality and assess the performance of the algorithm, we selected 5 subjects (2 females and 3 males) aged 36.2±11.39.
2.2. Photoplethysmography signal extraction from video
The first step was the extraction of PPG signals from the video recording. In this respect, the camera sensor can be thought of as a PPG sensor matrix, where in theory each pixel of the image sequence can carry spatial information about the blood supply of a given location. The SNR of consumer cameras is usually not sufficient to extract the PPG signal from only one pixel, and different techniques have been used to increase the SNR. In most cases, the first step is spatial averaging of pixels either in the form of spatial down sampling [9] or through a moving kernel with the desired overlap [39]. In our case, we average the pixel values in a box of size 30×30 px2 with 10 px overlap which represents cumulating pixel information from the areas of approx. 1 cm2 at given camera distance and its field of view. We call an overlapping region of size 30×30 pixels a channel and suppose we obtain L channels. This step will not only increase the SNR of the extracted PPGI signals, but also reduce the number of signal vectors to work with while still maintain a reasonable spatial resolution. The result of this process (see Fig. 2) is a raw PPG signal that we decompose into individual color layers, including red (R), green (G), and blue (B), and store on disk for further analysis. Next, we work only with the green (G) channel, which contains the most information about perfusion. This fact is related to the technical realization of the color camera (Bayer mask), the wavelength of the green light and its penetration into the tissue, as well as the absorption and scattering properties of the skin. The main luminophore in this case is blood hemoglobin [39], [40], [41]. Denote , where N = 900 is the number of sampling points over the 30 s segment with the frame rate of 30 Hz, to be the raw PPGI signal collected from the ith pixel.
2.3. Signal pre-processing and filtering
Further signal processing proceeds by segmenting the skin using a Gaussian Mixture Model (GMM) to ensure that we continue to work only with living skin pixels where we can expect PPG signals to be present (see the dark blue region in Figure 1(a)). The segmentation is semi-automatic and allows the user to control the detected clusters and add them as needed.
The next step involves the application of a discrete wavelet transform (DWT) for each PPGI channel to select a band with the range 0.4 – 4 Hz that we can assume cardiac activity. For their extraction, we use the Daubechies 4 wavelet (db4), with the number of DWT levels/octaves set to 5. Denote the resulting signals .
After the DWT, we select a segment with minimal motion artifact in the following way. First, find all channels associated with the subject’s forehead region (see the red box/ROI in Figure 1(a)) and obtain average of the associated signals, which is denoted as . From , we manually extract a high quality ten-second interval that is less impacted by motion artifacts. Then, restrict on the selected ten-second interval, and denote the resulting signals .
The process continues by extracting the first and second harmonics of each PPGI signal based on the continuous wavelet transform (CWT) using the time-frequency ridge method, followed by the inverse CWT and interpolating from the original 30 Hz to 900 Hz in order to retrieve subtle phase shifts across the signals, which is denoted as xi ∈ ℝ9,000. Since the PPGI signals are generally noisy and we use relatively small averaging regions to keep the spatial information, the second harmonic component in the CWT spectrum is usually overlaid with noise. To extract it, we use the frequency ridge by simply considering doubling the extracted ridge of the first harmonic component, and then extract the second harmonic component. Denote the first and second harmonics of the ith channel as h1,i ∈ ℝ9,000 and h2,i ∈ ℝ9,000 respectively.
Next, the signal is trimmed to the length of only 3 consecutive heartbeats, taking into account both the phase alignment dependence of the data length (influenced by heart rate variability) and the exact selection starting from the peak or valley of the PPG curve. A Hilbert transform is then performed.
2.4. Synchronization of the extracted PPG signals
Due to the inevitable global phase deviation caused by hemodynamics, the main novelty of this paper is applying the graph connection Laplacian [36], [37] to adjust the global phase deviation. In a physiological context, the phase variations in PPGI across different pixels are intricately linked to the dynamic progression of blood flow within the vascular network. Our hypothesis posits that this phase deviation induced by hemodynamics encapsulates valuable anatomical and physiological information and potentially other pertinent details of interest. We model the vascular network underlying the facial pixels as a 2-dim manifold and the geometric structure encodes the phase of blood flow changes. The goal is to use the PPGI sensors over each pixel to recreate this phase information and hence the vascular network structure. We first estimate the pairwise phase deviation between any two PPGI channels xi and xj using their 1st harmonics h1,i and h1,j. Compute: where H converts the real-form signal into its analytic companion, defined as Hilbert(h1,i), with Hilbert the Hilbert transform, and ⟨, ⟩is the inner product of two complex signals. The entry ci,j estimates the global phase shift of xi and xj. See [38] for more discussion about the nonlinear relationship between harmonic phase and PPG morphology and Discussion for more technical details. Since we have LPPGI channels, we obtain a Hermitian function C of size L×L, called the graph connection Laplacian (GCL), so that the (i, j)th entry of Cis ci,j if the ith channel and jth channel are within 5 cm distance, and 0 otherwise. We find the eigenvector of C, denoted as w1, w2 …, with the associated eigenvalues ranked from large to small. Then, the global phase deviation of the ith PPGI channel is then corrected by: where Re means taking the real part and the superscript * means taking the complex conjugation. The philosophy underlying this correction is that w1(i) recovers the blood flow phase underlying the ith channel. See discussion section for more technical details of this GCL algorithm.
2.5. Construction of the final non-contact PPG signal
We construct several non-contact PPG signals. The first one is by a simple averaging of all unaligned PPGI channels; that is, . The second one is averaging all aligned PPGI channels; that is .
2.6. Construction of perfusion maps
We propose the following approach to visualize perfusion patterns in the face. We generate images encoding the energy of the first harmonic, denoted as IE,1, which is defined as the energy of the 1st harmonic of each channel. We call these maps energy maps.
The next set of mappings is generated from the top eigenvector of GCL, w1. The phase encoded in , reflects the phase shift of the lth channel over the face. Generate a colored image on the face region with the color in each pixel reflecting the entry of ϑ 1. Denote the resulting images as Iphase. We call these maps phase maps.
For each image, to avoid the impact of oversaturation, we replace all values larger than 95% percentile by the 95% percentile before plotting. For phase maps, we also trim colormap based on the range of median phase ±1rad of entire skin area in order to highlight subtle difference in phase shifts across the PPGI channels.
3. Statistical analysis
To quantify the performance of phase synchronization for PPGI signal , we consider the following metric. For the raw unaligned signals, we apply the Hilbert transform and obtain the complex form . We then convert the amplitude and phase into the phasor form at the center point: where c denotes the center point that we extract the phasor and . For the aligned signals, which are the output of (2), we apply the Hilbert transform to obtain the complex form . We then convert the amplitude and phase into the phasor form at the center point: where c denotes the center point that we extract the phasor and . Denote the median of as and denote the median of as . To check if and have the same distribution, we apply the Kolmogorov-Smirnov test with p<0.05 viewed as statistically significant. Also, apply F-test to check if and have the same variance with p<0.05 viewed as statistically significant. We also performed Wilcoxon signed-rank test to investigate if the phase shift on the cheeks and forehead, where the phase deviation of the ith channel is quantified by the angle of , are different with p<0.05 viewed as statistically significant. Bonferroni correction is applied to correct the multiple tests.
4. Results
The first step to verify the quality of the phase alignment of the PPGI curves was to plot the phasors separately for 1st harmonic component. The distribution of phasors of all PPGIs channels evaluated by (6) of all subjects is shown in Fig. 3, where we plot as a vector, with the magnitude and angle φi(c). It can also be noticed in Fig. 3 that the unaligned phasors (blue arrows) tend to be directed and arranged in two or more clusters for some subjects, while aligned phasors are in general well clustered in one direction. The Kolmogorov-Smirnov test showed that the two distributions are significantly different with p < 10−4, and the F-test showed that the two distributions have significantly different variance with p < 10−4. On the other hand, we shall emphasize the observation that dominant direction of aligned PPGI signals differs from subject to subject. This is the consequence of the nonuniqueness of GCL approach inherited from the freedom of eigendecomposition.
After aligning the PPGI signals and verifying the quality of their alignment, we obtain various final non-contact PPG signals, including x(ua−mean), x(a−mean)shown in Fig. 4. One of the interesting effects of phase alignment is the appearance of the dicrotic notch, which is most pronounced for Subject 2 in Fig. 4. Another observation is that the average of phase aligned PPGIs contributes to maximizing the amplitude of the PPGI signal.
Next, we demonstrate the perfusion maps derived from PPGI. See Fig. 5 and Fig. 6 for our 5 different subjects, who were selected based on specific characteristics such as skin tone, facial hair, facial mask or extensive makeup. We will pay specific attention to the difference between cheeks and forehead, the two dominant areas on the face that are supplied by different arteries.
For subject 1 we can observe different captured amplitudes in the cheeks and glabella region in the energy maps, which might be interpreted as ‘‘cheeks are more perfused’’. Another interesting observation for subject 1 is its phase map associated with w1, where we can see clustering at specific locations in the image. There is a slightly different phase shift in the right side of the face. The forehead area also exhibits different phase shift compared to other parts of the face which could be related to different arterial supply.
Subject 2 is specific to his facial hair (chin and beard), which may be a source of motion artifacts that may deceptively influence the detected PPGI signal amplitude. Again, the cheeks have more dominant energy compared to forehead also in this case. In the phase map, while it is less obvious, the phase in the forehead is different from that in the cheek.
Subject 3 is specific in that he is wearing a face shroud and also has goggles on his eyes. For this case, it is not possible to obtain information over cheek, so a comparison between cheeks and forehead is not possible.
For subject 4 (Fig. 6), we notice a distinct vertical line across the middle of the forehead, which may be a wrinkle or an ongoing supratrochlear vein. This area is even more pronounced in the phase map. Again, we can see a phase difference between the forehead and cheeks in the phase map.
We can also see an interesting result for subject 5 (Fig. 6), where the glowing eye surroundings can be tracked in the energy maps. In this case, this is the result of different makeup on the face and around the eyes. In this case, this is a consequence of the different makeup on the face and around the eyes. In this case, the phase difference between the forehead and cheeks is less clear in the phase map.
Wilcoxon signed-rank test showed that the phase deviation is significantly different in forehead and cheeks for subject 1, 2 and 4 (with p < 10−4). We did not test subject 2 because we cannot access the cheeks due to the facial mask. We also omitted subject 5 due the extensive makeup.
5. Discussion and Conclusion
Our aim is to advance the signal processing techniques for PPGI further in two directions, both of which can be applied in different areas. The first attempt is applying the GCL-based synchronization algorithm to construct a better-quality global PPG, which contains details that are more challenging to obtain using traditional approaches, like dicrotic notch and its precisely localized position. The novelty is viewing GCL as a coupler for all PPGI signals so that we can determine the phase discrepancy arises from the pulse wave travel through the vascular tree, and hence synchronize all PPGI signals. The results show that this unique method proposed by us can enhance the features of the PPG waveform (see Fig. 4) and boost its amplitude. The second attempt is introducing novel images called perfusion images, which encode hemodynamics extracted from GCL. The novelty is using the phase information hidden in the eigenvectors of GCL, which directly encodes the global phase deviation caused by the blood flow in the complicated vascular network at different locations of the imaged area.
Physiologically, the supraorbital and supratrochlear arteries dominate the blood supply to the forehead, while the facial artery predominantly supplies blood to the cheeks. The supraorbital and supratrochlear arteries are supplied by the internal carotid while the facial artery originates from the external carotid artery. Due to the difference in the travel distance and anatomical structure, it is reasonable to hypothesize that the PPG phases are different in these two regions. Our findings support this hypothesis.
We shall elaborate some technical details. In general, PPG signal is not sinusoidal and can be represented by multiple harmonics [42]; that is, we can model a clean PPG signal as , where A is a positive smooth function describing the amplitude modulation, s is a 1-periodic function describing the morphology of the nonsinusoidal oscillation, φ(t) is a strictly monotonically increasing function describing the phase of the PPG so that its derivative describes the instantaneous heart rate, L> 1 is the number of harmonics, and αl > 0 and βl ∈ [0,2π) come from the Fourier series of s. In this case, recovering the phase φ(t) from the Hilbert transform of x(t) is inaccurate due to the existence of harmonics. This situation is worsened by the existence of noise. Instead, we could better recover the phase from the fundamental component (the first harmonic) h1(t) ≔ A(t)α1cos (2πφ(t) + β1), thanks to the Nuttall theorem [43] and the assumption that A(t) in general oscillates slowly compared with the cos (2πφ(t) + β1). This is the main reason we design the algorithm with the first harmonic in (1). On the other hand, note that since the PPGI signals x1 and xj are usually noisy, ci,j is a noisy estimate of the global phase shift of xi and xj. The capability of recovering phase from the usually noisy PPGI signal with the GCL-based synchronization algorithm is supported by the robustness of GCL to noise [37], [44].
In addition to the robustness of GCL, we shall also discuss the geometric meaning of GCL. Consider a simplest and ideal situation that PPGI is clean and faithfully representative of the blood flow changes, and there is no geometric constraint among pixels. In this case, C can be constructed without setting 0; that is, the (i, j)-th entry of Cis simply ci,j for any i, j. In this case, C is a rank one matrix as C= θθ∗, where θ ∈ ℂL so that θ(i) encodes the phase of the blood flow underlying the ith channel. In this simplest case, the top eigenvector of Cgives the full information of phase in each channel. However, in practice the PPGI is noisy, and there are geometric constraints imposed by the vascular network. We thus only trust the phase information when two channels are close by, which leads to the construction of C in our algorithm. Geometrically, the top eigenvector of C gives us an estimate of the most synchronized phase information over the vascular network on the face. In this paper, we focus only on the top eigenvector. The higher eigenvectors of C encode more geometric information associated with the underlying geometric structure associated with the blood vessel distribution. We will explore this topic in our future work.
It is noteworthy to mention that the phase of PPG signals (in our case, it is captured by the first eigenvector of GCL) has been studied in different contents. In [12], the authors applied Discrete Fourier Transform to possibly reveal carotid artery or differentiating normal skin with comparison with port wine stains. Another approach can be found in [45], where the lock-in amplification is used for phase estimation. It is worth noting that the phase shift can be calculated across different frequency bands across PPGI signals (in our case, we focus on the frequency band associated with the first harmonic), e.g. low-frequency oscillation phase distribution which shows as promising tool to study autonomic nerve system responses to external stimuli [10]. Studying and mapping the phase shift across imaged area may reveal the locations that the propagating pulse wave reached at the same time, or the positions of the subcutaneous tissue perforators. It also encodes the locations where motion or ballistocardiographic (BCG) artifacts may be present, or we may have encountered a counter-phase of the original PPGI signal [46], [47], [48]. To the best of our knowledge, perfusion maps generated using the phases of the first eigenvector of GCL have not yet been published and may provide important robust phase estimation tool which is immune to inevitable noise with theoretical support and can be used in different PPGI scenarios.
Examples of applications of our proposed signal processing technique could be used in different fields, e.g. the body’s response to external stimuli, such as temperature (cold [10] or heat [49]) or mental stress [50]. While contact PPG has been widely applied in practice, there are situations that PPGI is needed. For example, it is not possible to place a contact sensor on the patient’s skin when the patient’s skin is damaged (injury, burns), or when monitoring the vital signs of premature newborns. Other applications are the extraction of physiological parameters such as HR or blood pressure [47], [48]. Given that the dicrotic notch and its determined position play an important role in studying the condition of the arterial tree, particularly with regard to its stiffness (as indicated by augmentation or stiffness index, etc.), our proposed algorithm shows promise in effectively capturing these features. As such, it holds potential for application in this direction. In addition, as some studies have already pointed, the PPGI has still untapped potential. Namely the challenge is to focus on truly spatial mapping of physiological parameters, especially in the context of ANS regulation or allergic reactions [10], [11], [51], [52]. There are also approaches in which the respiratory waveform is extracted from the PPGI recording or respiratory activity is detected using a method such as optical flow [11], [53] where the proposed algorithm could help too. Another option is to deploy this method in extracting information about SpO2, and thus in evaluation of the oxygenation level of arterial or venous or mixed blood [8], [9].
Finally, we want to summarize what our paper offers and how to use the tools. To obtain a higher quality global PPGI signal when integrating all PPGI signals from the sensed area, we use GCL for their phase alignment. This results in higher amplitude and highlighted details such as dicrotic notch. If we want to work with spatial information and analyze the distribution of perfusion parameters across the imaged area, a good step here is to analyze higher eigenvectors of C. If we are interested in the phase relations between the PPGI waveforms, we can again use the GCL, which conveys information about the global phase of a given PPGI waveform at a given location in the image.
Here we would like to point out the limitations of this work. Although this is a small sample size, we would like to stress that this is mainly a methodological work using examples of subjects with a wide variety of characteristics. Even if the GCL is immune to noise, the quality of the PPGI signals may not always be sufficient. Instead of simple spatial averaging, other raw signal processing methods presented in the Materials and Methods should also be considered, which may offer higher SNR or robustness to motion artifacts in particular. Moreover, capturing the subtle differences between the phase shifts given by the pulse wave propagation velocity and the geometrical dimensions of the analyzed region is challenging, especially in terms of the demands on the sampling rate (interpolation may no longer reveal and reconstruct all details) and the size of the processed data. We may need a ground truth of the perfusion dynamics on the face in some form to further validate the proposed algorithm. In conclusion, despite the shortcomings, we believe that this novel approach will open the door for new applications of PPGI and its use not only in science but also in clinical practice.
Data Availability
All data produced in the present study are available upon reasonable request to the authors
Conflicts of Interests/Financial Disclosures
None
Author contributions
SB: idea, literature review, data analysis and write-up. HTW: idea, literature review, data analysis and write-up. KHS: idea, literature review, data collection, and write-up. AAA: idea, literature review, data collection, and write-up.
Acknowledgments
The work of Stefan Borik was supported by the National Scholarship Programme of the Slovak Republic (Application No. 47518). Heartfelt thanks also go to Prof. Vladimir Blazek from RWTH Aachen, a native of the former Czechoslovakia, who is considered as the pioneer of photoplethysmography imaging and who contributed greatly to the field and the scientific and personal growth of Stefan Borik.
Reference
- [1].↵
- [2].↵
- [3].↵
- [4].↵
- [5].↵
- [6].↵
- [7].↵
- [8].↵
- [9].↵
- [10].↵
- [11].↵
- [12].↵
- [13].↵
- [14].↵
- [15].↵
- [16].↵
- [17].↵
- [18].↵
- [19].↵
- [20].↵
- [21].↵
- [22].↵
- [23].↵
- [24].↵
- [25].↵
- [26].↵
- [27].↵
- [28].↵
- [29].↵
- [30].↵
- [31].↵
- [32].↵
- [33].↵
- [34].↵
- [35].↵
- [36].↵
- [37].↵
- [38].↵
- [39].↵
- [40].↵
- [41].↵
- [42].↵
- [43].↵
- [44].↵
- [45].↵
- [46].↵
- [47].↵
- [48].↵
- [49].↵
- [50].↵
- [51].↵
- [52].↵
- [53].↵