## Abstract

Alzheimer disease(AD) affects profoundly the quality of human life. Quantifying the severity degree of Alzheimer dementia for an individual person is critical for the early diagnose and prescription for delaying the further dementia progression. However, the quantitative diagnose for human subjects of the mild cognitively impairment(MCI) or AD with the different degree of dementia severity is a difficult task due to both the very broad distribution of dementia severities and the lack of good quantitative determinant to assess it. Here, based on cortical thickness profiles of human brains measured by magnetic resonance experiment, a new algebraic approach is presented for the personalized quantification of the severity degree of AD, ranging from 0 for the basin of cognitively normal state to 1 for AD state. Now one can unfold the broad distribution of dementia severities and corresponding cortex regions of human brains with MCI or AD by the different severity degree.

## Introduction

Alzheimer’s disease (AD) is one of the most well-known neurodegenerative diseases, and it profoundly affects the quality of human life. Various causes of AD are known, and clinical treatments for preventing or delaying the progression of AD are currently practiced. The precise diagnosis of AD requires not only the systematic identification of cohorts that classify the different stages of brain progression toward AD but also the estimation of the severity degree of AD for a given individual (Galvin et al., 2012, Solomon and Soininen, 2015, Raj et al., 2012). The symptoms of AD appear in various forms in the human body, behavior, and cognition, yet the direct anatomical evidences appear in the structural change within the brain cortex (Hojjati et al., 2019, Kim et al., 2020, Qiu et al., 2020, Tetreault et al., 2020). Among these evidences is the degradation of the cortical thickness of the human brain, which is one of the imprints of AD. Such physical change can be monitored through the neuro-image analysis, for example the magnetic resonance image (MRI) analysis of the brain (Hartikainen et al., 2012, Im et al., 2008, Kim et al., 2014, Lebedev et al., 2013, Paternico et al., 2016, Querbes et al., 2009).

The anatomical degradation of the cortical thickness becomes more pronounced as the degree of dementia becomes greater (Im et al., 2008, Querbes et al., 2009). Therefore, there is a rationale for taking advantage of the degree of cortical degradation for determining the dementia cohort and estimating the severity of dementia progression. Given the big data information of cortical thickness for a single subject, however, we are faced with several obstacles to overcome in accomplishing abovementioned tasks. First, we noted that the person-to-person fluctuations in cortical thickness of an individual may overwhelm the degradation in cortical thickness. In clinical cases, we frequently observed that the average cortical thickness of some cognitively normal people is thinner than that of people with AD, which appears to contrast conventional view. The second obstacle is the ambiguity in what we should do if two different cohorts have a difference in the cortical thickness in brain regions that have little to do with AD. In principle, we should construct some determinants for distinguishing a cognitively normal person from a person with AD based on cortical thickness, but in practice, we are confronted with the differences in cortical thickness in regions of the cortex that are unrelated to the pathogenesis of AD. We noted that the abundant existence of such unrelated regions is an intrinsic source that increases the uncertainty of the dementia determinants and hinders the appropriate construction of good classifier and predictor for AD.

In this study, instead of dealing with all 327,684 vertices point on the whole cortex of a human brain, we demonstrated that the consideration of a few hundred essential vertices were enough for distinguishing CN, MCI, AD cohorts each other. With cortical thickness data at these essential vertices, we defined the statistical score matrix and the covariance correlation matrix between human subjects as a set of classifier and predictor for AD. Unlike the conventional view that the degradation of the cortical thickness of human brain was responsible for AD, the singular valued decomposition analysis of the statistical score matrix developed in this study implicated that the simultaneous consideration of both thinner and thicker cortical regions together compared to those of CN are important and necessary for the precise diagnose of the severity of AD.

## Materials and methods

### Preparation of cortical thickness data from MRI of 1522 human brain images from ADNI

Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by principal investigator Michael W. Weiner, MD. The primary goal of the ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early Alzheimer’s disease. For up-to-date information, see www.adni-info.org.

We selected 274 individuals (human subjects) who were identified as CN, 265 individuals with MCI, 125 individuals with AD from the ADNI-2 study of ADNI, and 97 with MCI from the ADNI-GO study of ADNI. A human brain image-data set of 1522 MR images from a total of 761 subjects was constructed, for each of which both 1.2-mm sagittal Magnetization Prepared Rapid Gradient Echo (MPRAGE) and MPRAGE_SENSE2 images were taken separately.

### Partition 1516 MR images of human brains into four groups and determine the essential region-of-interest vertices for each group

We performed the FreeSurfer analysis to obtain the cortical thickness data at 327,684 vertices on the cortex of a human brain (Dale et al., 1999, Fischl et al., 1999). The cortical thickness at each vertex ranges from 0 mm to 5 mm. After eliminating those vertices at which cortical thickness information was missing for any one of the 1522 MR images of human brains in the ADNI data set, 276,825 common vertices whose cortical thickness values are available for all 1522 MR images were selected for our study. The average cortical thickness over 276,825 vertices for each brain images was evaluated, and we divided 1516 values of average thickness into four groups (A-D) of different windows of average thickness except 6 values of that run out-of-bounds. Demographic characteristics of the average cortical thickness of the four groups are listed in Table 1.

In order to assign subjects from each CN, MCI, and AD cohort into one of the four groups (A-D) of average cortical thickness, we employed the *Z* score criteria in selecting the region-of-interest (ROI) vertices and the essential ROI vertices on the cortex at which the distribution of cortical thickness of the CN cohort is distinguished from that of the AD cohort within each group of average cortical thickness. A similar procedure is repeated for distinguishing the CN cohort from the MCI cohort and also the MCI cohort from the AD cohort:
Here, ⟨*t*_{p,h∈k}⟩ is the average cortical thickness at a vertex point *p* averaged over the subject *h* who belongs to the *k* (one of CN, MCI, AD) cohort, and σ_{p,h∈k} is its standard deviation, and *n*_{p,h∈k} is the number of MR images belonging to the *k* cohort. The positive (negative) value of , for example, indicates that the distribution curve of the average cortical thickness of the CN cohort is right (or left)-shifted compared to that of AD cohort. And the bigger the absolute value of the *Z* score, the better distinguished the distribution curves of average cortical thickness of the cohorts. In this study, we identified ROI vertices satisfying the absolute value of the *Z* score larger than 1.5, and essential ROI vertices satisfying much higher cut-off Z scores (Table S1).

### Construction of a statistical score matrix for classifying subjects into one of CN, MCI, AD cohorts

Within each group of average cortical thickness, we constructed the statistical score matrix for determining a subject’s cohort as either CN, MCI, or AD (Yu et al., 2011). First of all, *t*_{p,h∈k} was transformed into the probability distribution matrix , which is a probability that the cortical thickness *t*_{p,h∈k} at a vertex point *p* of the subject in *k* cohort is between (*m* − 1)*Δ* and *mΔ*:
Here, Δ = 0.2 mm, and cortical thickness index *m* runs from 1 to 30; this covers the cortical thickness from 0 mm to 6 mm. δ(x) is a Dirac delta function, and Θ(x) is a step function whereΘ(x < 0) = 0; Θ(x ≥ 0) = 1. Then, we defined the statistical score matrix from in the following way:
Since and the second term are constants. The value of the statistical score matrix varies depending on the cohort *k*; the smaller is, the larger is.

With this statistical score matrix , we employed a strategy for determining to which one of *k* cohorts a given subject would belong. First, we evaluated the averaged cortical thickness of a given subject over 276,825 vertices. Second, we assigned this subject to one of four groups (A-D) of average cortical thickness. Third, based on the preselected essential ROI vertices *p* for the assigned group, we determined the cortical thickness index m’(p) at which the cortical thickness at an essential ROI vertex *p* is between (*m* − 1)*Δ* and *mΔ*. Then, for each *k* cohort, the total score was calculated by summing up over the preselected essential ROI vertices *p* for the assigned group of the average cortical thickness, . Lastly, to which *k* cohort a given subject would belong was decided by a cohort which gives the minimum score out of S’_{(CN)}, S’_{(MCI)}, S’_{(AD)}.

We, however, noted that the accuracy of both and may become unsatisfactory if the number of people in the AD cohort was less than that of the CN cohort and the MCI cohort (Table 1). In order to overcome the unsatisfactory nature of both and , we employed the method of Kernel Density Estimation (KDE); namely, a Dirac delta function δ(*t* − *t*_{p,h}) in the definition of the probability distribution matrix , is replaced by a kernel function *f*(t − *t*):
Here, the relative ratio among coefficients *a*_{l} is *a*_{0}: *a*_{±1}: *a*_{±2}: *a*_{±3}: *a*_{±4} = 56: 43: 21: 7: 1. The kernel function *f*(t − *t*_{p,h}) satisfies and the standard deviation σ_{f} ≈ 1.435. Upon subjecting KDE, , becomes
In this study, we constructed the statistical score matrix on which KDE was employed and used it for determining to which *k* cohort a given subject would belong.

### Construction of a covariance correlation matrix and a predictor for the severity degree of AD

Within each group of the average cortical thickness, the severity degree of AD for a given subject is estimated by the following strategy. First of all, we transformed the cortical thickness matrix *t*_{p,h} at essential ROI vertices *p* for a subject *h* into the normalized matrix *t*′_{p,h} such that
Here, the product of *t*′_{p,j} by its transpose *t*′_{p,i}^{T} results in the square matrix *t*′′_{i,j} = *t*′_{p,i}^{T}·*t*′_{p,j}, and then its normalized matrix (called by a covariance correlation matrix) *C*_{ij} is defined by *C*_{ij} = *t*′′_{ij}/ max{*t*′′_{ij}}, where max{*t*′′_{ij}} is the maximum value of elements in the square matrix *t*′′_{ij}. The larger the value of *C*_{ij}, the higher the covariance correlation between a subject *i* and a subject *j* in their profile of the cortical thickness at essential ROI vertices. Based on this covariance correlation matrix, we defined the severity degree AD for a given subject *i by*
Where . The severity degree of AD ranged from 0 for the basin of CN state to 1 for the basin of AD state. Rank-ordering this degree in ascending order illustrates that a subject *i* with the larger (or smaller) value of the severity degree is more prone to AD (CN) state.

## Results

### Identification of essential ROI vertices at which the distributions of cortical thickness of CN, MCI, AD subjects are distinguishable

Although the averaged cortical thickness of subjects with AD is generally known to be thinner than that of CN or MCI subjects, the distribution curves of averaged cortical thickness for the cohorts are not well distinguishable except near both ends of the distribution curves as demonstrated in Figure 1A. This illustrates that a subject can be CN even though the averaged cortical thickness is thinner than that of a subject with AD, and vice versa. Also, we found that many subjects identified as CN, MCI, or AD have a similar averaged cortical thickness. This is due to the fact that the average cortical thickness for a subject was calculated over all 276,825 vertices, and the cortical thicknesses at most vertices are similar for all subjects, which prohibits us from successfully clustering 1552 human brain images into the image of CN, MCI, AD cohorts. Therefore, instead of resorting on the cortical thickness of all 276,825 vertices, we need to select the ROI vertices at which the cortical thickness values are distinguishable from each other among CN, MCI, and AD. For a fair selection of such ROI vertices, we divided the range of the averaged cortical thickness of subjects into four (A-D) different regions (for the detailed method, see in the section 2.2). Figure 1B illustrates how we identified ROI and essential ROI vertices. The x-axis represents the degree of separation between the distribution curves of cortical thickness for CN subjects and AD subjects at a vertex *p* normalized by the dispersion of the cortical thickness of CN subjects, which is quantified by the value of its *Z* score. The y-axis represents values normalized by the dispersion of the cortical thickness of individuals with AD. Therefore, the x-values (y-values) at a vertex point *p* represent the degree by which the distribution of cortical thickness at this point *p* of CN (AD) subjects is distinguished from averaged cortical thickness of subjects with AD (CN). It means the larger the value of is, the two distribution curves are more distinguished each other (for the illustration, see Figure S1). The ROI cut-off line is defined by points satisfying , and the distribution of cortical thickness of CN subjects and individuals with AD is clearly distinguished at those points satisfying (outside of the ROI cut-off line).

Figure 1C shows ROI vertices for each of the cortical thickness groups (A-D) by colored points on the white cortex, at which the thickness of the cortex is either thicker or thinner particularly for one cohort compared with that of two other cohorts. These ROI vertices are widely distributed on the cortex, and their locations are not fixed but vary depending on the groups A to D. We uncovered, however, that the medial temporal lobe, known to be very important for memory formation, is always indicated by a red or dark red color irrespective of the groups A to D (Figure S2). This implies that the cortical thickness values of the medial temporal lobe for subjects with AD are characteristically thinner than those of CN subjects or individuals with MCI (red color), and this decrease occurs in the following descending order: CN-MCI-AD (dark red color). The medial temporal lobe is the region where the cortical thickness gradually decreases as dementia progresses and therefore is the critical region necessary for determining the dementia cohort and the severity degree of AD. We also noted that the cortical thickness at the orange-colored region for subjects with AD is thicker than that for CN subjects or those with MCI. This has nothing to do with the damage in the cortex but contributes to the increase in the accuracy for predicting the dementia cohort since it could provide better distinguishability of subjects with AD from CN subjects and those with MCI.

### Character of statistical score matrix and classification of subject’s cohort

The section 2.3 described the detailed procedure of constructing the statistical score matrix for determining a subject’s cohort within each group of the average cortical thickness (Figure 2A). In order to judge how well the statistical score matrix distinguishes CN, MCI, and AD cohorts from each other before we predict the cohort of a new subject, we performed the singular value decomposition (SVD) analysis on the combined statistical score matrix S^{(All)} which consists of matrix elements of S^{(CN)}, S^{(MCI)}, and S^{(AD)}. We used the SVD character of a matrix that a given matrix can be reconstructed as a linear combination of the products of two singular vectors by one singular matrix. Since the reconstructed matrix from the few highest modes of singular values contains the predominant character of an original given matrix, one expects that the differences among the cohorts should appear in singular vectors of different cohorts. Otherwise, the statistical score matrix S^{(All)} is not reliable nor does it contain the characteristic ingredient of different cohorts. Figure 2B, 2C and S3 show the highest six singular vectors corresponding to the six largest singular values from SVD analysis of the statistical score matrices for each group of the average cortical thickness A to D. Here, it demonstrates that elements in the singular vectors *v*_{1} to *v*_{3} for CN, MCI, and AD follow qualitatively a similar trend, meaning that these compose the fundamental default modes, whereas those in *v*_{4} to *v*_{5} follow a different trend and are distinguished each other.

Out of 547 CN, 722 MCI, and 247 AD human brain images predetermined clinically and provided from the ADNI data set, we performed the self-recognition test and the stratified 3-fold cross validation test using the 1516 human brain images for each group on the average cortical thickness A to D (Table 2). For the first (second; third) iteration of stratified 3-fold cross validation test, 1006 (1011; 1015) human brain images were used as the training set for learning the statistical score matrix and 510 (505; 501) human brain images were used as an independent validation set. The new method presented in this study recognized and predicted the subjects with AD in the cohort with more than 91% (self-recognition test) and 82% (stratified 3-fold cross validation test) accuracy, respectively.

### Estimating the severity of AD of subjects by covariance correlation matrix

Developing a quantitative measure to tell how serious the degree of dementia progression for a given subject is very important for diagnosing and clinically treating patients with MCI and AD with the different degree of Alzheimer dementia. In this study, we already identified essential ROI vertices and constructed the statistical score matrices as a classifier ensuring the correct prediction of subjects with AD at more than 80% accuracy that they belong to the AD cohort. Thus, we extracted the cortical thickness profile (or vector) at essential ROI vertices for all brain images, and constructed the covariance correlation matrix between them. Then we compared the profile for a given subject’s image with that of patients with AD, to estimate the severity degree of AD for a given subject towards patients with AD (for the detailed method, see the section 2.4 and Figure 3A). The personalized quantitative severity degree of dementia (see the equation (8)) is plotted at the right-bottom graph of Fig. 3A for each subject of CN, MCI, AD cohort of the group D in the ascending order. The values of severity degree of dementia were distributed around the averaged value of 0 (ranging from about −0.5 to +0.5) for subjects with CN, 0.5 (ranging from about −0.2 to +1.2) for subjects with MCI, and 1.0 (ranging from about +0.2 to +1.5) for subjects with AD, respectively. The distribution of the severity degree for subjects with MCI was laid across both ranges of those for CN and AD, which points out that this is the intrinsic source of the low success ratio in determining the dementia cohort of subjects with MCI. One could sort out quantitatively the broad spectrum of the severity of dementia for MCI subjects in that whether they are prone to CN or how much they are progressed toward AD. Given a new person for diagnosing the dementia state, one of the cohort CN, MCI, AD was assigned by the equation (3) and the personalized quantitative severity degree of dementia was estimated by the equation (8). Then, with these two-qualitative and quantitative-determinants, one may infer that a new person with the estimated severity degree below 0.0 is most likely to be CN, with that between 0.0 and 0.5 might be CN or MCI prone to CN, with that between 0.5 and 1.0 might be MCI prone to AD or AD, and with that above 1.0 is most likely to be AD state.

We constructed the covariance correlation matrices for all groups A, B, C, D of the average cortical thickness and observed the common pattern in the matrices that subjects with AD (CN), possessing a strong correlation among themselves are clustered at the top-right (bottom-left) corner, represented by the cluster of red colors (Figure S4). Also, we presented the reordered covariance correlation matrices by the severity degree of AD and the results to which one of the CN, MCI, AD cohorts each human brain would belong, based on both the clinical test and our stratified 3-fold cross validation test (Figure 3B). After comparing the result from our independent validation test with that of the clinical test, we noted that those subjects which were predicted to belong to the MCI cohort by the clinical test and yet estimated to have the higher (lower) severity degree of AD by our estimation, were predicted to belong to the AD (CN) cohort from the our validation test.

## Discussion

Based on the cortical thickness data of 1516 human brain images from the ADNI data set, we presented a new algebraic approach for both (1) the identification of the cohort (CN, MCI, AD) a given subject would belong to and (2) the quantitative estimation of the severity degree of AD for a given new person (Figure 4). A total of 1516 human brain MR images were partitioned into four groups by the average cortical thickness of each subject. Out of 327,684 vertices on the cortex, a few hundred essential ROI vertices for each group were identified, which were enough to distinguish the cortical thickness distribution of the CN, MCI, and AD cohorts from each other. Statistical score matrices using the cortical thickness on the essential ROI vertices were constructed as a classifier for determining the cohort of a given subject. Out of 547 CN, 722 MCI, and 247 AD subjects predetermined clinically, the success ratio for recognizing their cohort was 92% with CN, 69% with MCI, and 91% with AD subjects. On the other hand, the stratified 3-fold cross-validation test gave the correct prediction rate of 84% with CN, 59% with MCI, and 82% in subjects with AD; this is in agreement with the results of clinical determination. Using the quantitative severity degree of AD for subjects, we could explain the reason why the inevitable uncertainty in the determination of the MCI cohort arouse by the very broad distribution of the severity degree of dementia which MCI subjects possess intrinsically. We suggested that the severity degree of AD presented in this study would be a realistic measure for the quantitative personalized diagnosis of a given subject instead of tri-partitioning the classification of a subject’s cohort only by CN, MCI or AD. It is the continuous degree for a given subject along the scale from 0 for the basin of CN state to 1 for the basin of AD state. One could sort out quantitatively the broad spectrum of the severity degree of dementia for MCI or AD subjects with the different severity degree of dementia in that whether they are prone to CN or how much they are progressed toward AD. This study not only provided a straightforward algebraic approach to analyzing the cortical thicknesses of human brains but also suggested quantitative measures by which one could estimate both the cohort and the severity degree of AD for a given new subject based on the neuro-images from the structural MRI. The MRI data of a larger number of human brains could also be implemented into this study in a systematic and robust manner, which would facilitate the better diagnose of AD with the different degree of dementia.

## Data Availability

Data used in preparation of this article were obtained from the Alzheimers Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

## Competing interests

All authors report no competing interests.

## Supplementary materials

## Acknowledgments

This work was supported by the National Research Foundation, Korea [grant number NRF 2017R1E1A1A03070854]. We acknowledged DGIST supercomputing big data center for the allocation of supercomputing resources. We appreciated Mookyung Cheon and Wookyung Yu for the fruitful discussions, and also Keonho Lee and Jangjae Lee of Chosun university for the preparation of MRI image data in the initial stage of this work.

Data collection and sharing for this project was funded by the AD Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

## Footnotes

↵* Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.