Evaluation of applicability of the online version of HADS-D for depression phenotype screening in the general population

One of the most promising areas of research into the biological underpinnings of depression is genetic studies. However, the absence of generally accepted phenotyping methods leads to the difficulties in generalizing their results due to the heterogeneity of the samples. Thus, the development of a reliable and convenient phenotyping method that allows large sample sizes to be included in studies remains a top priority for the further development of genetic studies of depression. The aim of this study was to evaluate the applicability of the online version of the depression subscale of Hospital Anxiety and Depression Scale (HADS-D) for depression phenotype screening in the general population. Using online HADS-D we performed screening of depressive symptoms and compared results with known population patterns of depression. We conducted an online survey of 2610 Russian-speaking respondents over the age of 18. The overall HADS-D score was higher in women (p=0.003), in individuals under 30 y.o compared to participants over 42 y.o.(p=0.004) and in individuals reporting cardiovascular diseases (CVD) symptoms (p<0.0001). Linear regression showed that the presence of CVD leads to higher HADS-D scores (p<0.001), male gender (p=0.002) and older age (p<0.001) led to lower scores. Logistic regression showed that CVD increases the risk of having depression symptoms by HADS-D (p=0.033, OR=1.29), older age (p=0.015, OR=0.87) and male sex (as a trend, p=0.052, OR=0.80) decrease this risk. These results are consistent with the known data on the association of sex, age, and the presence of CVD with the prevalence of depression. The online version of HADS-D, given the ease of its usage, can be regarded as an effective tool for phenotyping depression in the general population.

Depression is one of the most common mental disorders. According to the World Health Organization, depression is one of the leading causes of disability and contributes significantly to the global burden of disease [1]. Given the extreme social significance of depression, the study of its biological underpinnings is considered to be one of the priority areas in medical research.
To date, a large amount of data has been accumulated on the significant role of genetic factors in depression risk and development. That makes it possible to regard genetic studies as a promising direction for studying its biological mechanisms and identifying biomarkers of the risk [2]. Today, the most informative and promising field of research are genome-wide association studies (GWAS), which require the largest possible sample size and qualitative phenotypic criteria while maintaining a reasonable ratio of research cost and the value of its potential result [3].
A number of GWAS studies on depression have been conducted recently, but generalizing their results is still difficult, mainly due to the extreme heterogeneity of the samples in these studies. This variability is caused both by the clinical heterogeneity of depression and by the use of different approaches to depression phenotyping, which is a consequence of the absence of generally accepted clinical phenotypes of this condition [2].
In some works, depression is determined on the basis of criteria specified in modern diagnostic classifications (ICD (International Classification of Diseases) and DSM (Diagnostic and Statistical Manual of Mental Disorders) of various revisions), which requires a personal contact with a psychiatrist during a clinical interview. This approach is difficult when analyzing large samples due to the high costs and high expending of resources [4]. Other works use a variety of diagnostic scales and questionnaires for self-completion (including online versions), medical records and registers [5]. Finally, in some works "minimal phenotyping" is used, based on self-reports of respondents about the presence or absence of a diagnosis of depression or the fact of taking antidepressants. This approach allows researchers to analyze very large samples (up to hundreds of thousands of participants), but the quality of phenotyping is significantly deteriorating [6].
Thus, the development of reliable and convenient methods for depression phenotyping in the general population remains a top priority for the further development of genetic studies of depression. This will allow unifying the approach and increasing the homogeneity of samples. An important condition for such a method should be the ability to analyze large samples and relative simplicity while maintaining clinical validity.
One of the possible options is using the screening psychometric scales. Their use in research practice has a number of advantages. First, the screening scales are self-questioning, which can simplify data collection and increase sample sizes. Second, the screening scales are based on a continuous quantitative approach, which provides additional opportunities for analyzing not only categorical, but also quantitative elements of the depression phenotype, increasing the quality of phenotyping. Nevertheless, remains the question of their effectiveness and reliability in identifying depression as a clinical phenomenon. This can be assessed by comparing the results of the screening with patterns of depression prevalence in the general population. These include the high prevalence and severity of depression among women [7]; the heterogeneity of the prevalence of depression in different age groups [8]; and pronounced association of depression with cardiovascular diseases (CVD) [9].
One of the tools for depressive symptoms screening in the context of the population patterns mentioned above can be the Hospital Anxiety and Depression Scale (HADS) [10]. The data accumulated over the years of using HADS have shown that this scale is a fairly reliable tool for screening anxiety and depression in general practice patients [11]. HADS consists of two subscales -anxiety and depression. This allows depressive symptoms to be assessed separately using only the corresponding subscale -HADS-D (HADS -Depression). In addition, HADS does not include questions related to autonomic and general somatic symptoms, which can "overlap" with other nosologies and lead to false positive results. In addition, the simple structure of HADS makes it possible to use it in an online questionnaire format, which can simplify the data collection procedure, potentially preserving the accuracy of this tool in its ability to detect depressive disorders in the general population.
There are some publications about the possibilities of using online-HADS-D for assessing depressive symptoms, but only in specific cohorts: in patients with cystic fibrosis [12] and in patients with tinnitus [13]. The potential of the online version of HADS as a screening tool for the detection and phenotyping of depression in the general population has not yet been explored.
Thus, the aim of this study was to investigate the capabilities and applicability of the online version of HADS-D as a potential tool for phenotyping depression in the general population.

METHODS
The clients of Genotek Ltd., provider of genetic testing services in Russian Federation, participated in this study. Data were collected during 2019-2020 by an online survey on the Internet portal of Genotek Ltd. The study included respondents over 18 years old, who voluntarily gave their personal data. Respondents who did not complete the questionnaire were excluded from the study. Participants' personal information was completely hidden during the study. All research was approved by the ethics committee of Genotek Ltd. The participants have provided written informed consent. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted October 20, 2020. .
The questionnaire was in Russian and included three blocks. The first block contained questions about sex and age. The second block consisted of questions from the depression subscale of HADS (HADS-D). There were 7 given questions, each of which was offered 4 options of answer, assessed from 0 to 3 points depending on the severity of the symptom. The third block consisted of questions about whether the respondent had diagnoses of cardiovascular diseases (CVD) or certain symptoms associated with CVD. The respondents were asked to answer "yes" or "no" to the questions about the presence of the following conditions: "high blood pressure (hypertension)", "cardiac rhythm disturbances", "heart pain", "shortness of breath", "stroke", "heart attack", "lower extremity vessel thrombosis", "pulmonary embolism", "varicose veins". The presence of at least one affirmative answer to the questions from this block was the reason for classifying the respondent as a person with possible CVD.
To identify the differences between the respondents depending on age, groups by age were formed based on the analysis of the age distribution in the studied cohort using quartiles (Q1 = 30, Q2 (Me) = 35, Q3 = 42). Accordingly, four age groups of patients were identified: up to 30 y.o., 31 to 35 y.o., 36 to 42 y.o. and 43 y.o. and older.
We analyzed the differences between groups for two characteristics associated with the HADS-D score for each participant: the mean score (quantitative assessment) and the category of the participant determined by the individual score: "no depression" (0-7 points), "subclinical depression" (8-10 points) and "clinical depression" (11 or more points) [10].
To analyze statistical data the R programming language was used [14]. For the analysis of quantitative variables, the nonparametric Wilcoxon-Mann-Whitney and the Kruskal-Wallis tests were used. The Fisher's exact test was used for the analysis of nominal variables. If there were multiple comparisons, Bonferroni and Holm corrections were applied. Multiple linear regression was used to assess the influence of factors (sex, CVD and age) on the HADS-D score. The effects of factors on the risk of developing depressive symptoms (as determined by HADS-D), was assessed using binary logistic regression. The absence of depression by HADS-D (0-7 points) and the presence of depressive symptoms at the subclinical and clinical levels (8 or more points) were chosen as binary outcomes. It is worth noting that a meta-analysis of studies using HADS-D proved that 8 points are the optimal threshold value for detecting the presence of depressive symptoms in people without psychiatric diagnoses [11].

Sample characteristics
The final sample size was 2610 people, of which 51.7% were women (1349). The average age of the respondents was (M (SD)) 36.79 (9.67) years. The presence of CVDs was reported by 30.4% (794) of the respondents.
The overall results for HADS-D are shown in Table 1. Respondents with clinical depression were significantly younger compared to participants without depressive symptoms (M (SD): 34.5 (10.5) vs. 37.0 (9.7), p = 0.038). There were no age differences between participants with clinical and subclinical depression (36.0 (9.3)) (p = 0.342). is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted October 20, 2020. Gender differences In women the mean HADS-D score was significantly higher (p=0.003; Table 2). Among women, the proportion of respondents with clinical depression (in comparison with the proportion of nodepression group) was higher compared to men (p=0.046), however, after applying the correction for multiple comparisons, this difference ceased to reach the level of statistical significance (p=0.139). Also, men were significantly older than women (p <0.0001). There were no significant differences in the incidence of CVD (p=0.202) between men and women. Thus, women, despite a higher mean score, did not differ from men in the incidence of clinical and subclinical depression as assessed by HADS-D. Age (М(SD)), years 37,6 (9,7)* 36,1 (9,5) % of respondents with CVD (n) 31,6% (399) 29,3% (395)
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted October 20, 2020.

Age differences
It was found that in the group of young respondents (up to 30 y.o.), the average HADS-D score was significantly higher compared to older respondents (43 and older) (p = 0.004; Table 3, Fig.  1), as well as compared with participants aged 36 to 42, although this difference is only a trend (p = 0.082).
In the group of young respondents up to 30 y.o., the proportion of participants with clinically significant depression (when compared with the proportion of participants without depressive symptoms) was significantly higher than in the group of respondents aged 36 to 42 (p = 0.00134). In addition, in the group of older respondents (43 and older), the proportion of respondents with CVD was significantly higher compared to other age groups (p <0.01) is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted October 20, 2020.

Fig. 1. Box-plots and violin-plots of differences between age groups
The gray areas show the density of distribution of HADS-D scores in age groups.

Differences between respondents with and without CVD
. CC-BY 4.0 International license It is made available under a perpetuity.
is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted October 20, 2020. .
The mean HADS-D score among respondents with CVD was higher (p<0.0001; Table 4). However, in the group of respondents with CVD, the proportions of participants with subclinical and clinical depression did not differ significantly from those in the group without CVD. Groups with and without CVD did not differ in the proportion of women (p=0.202), however, participants with CVD were significantly older (p <0.0001).  is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted October 20, 2020. . Logistic regression (Fig. 2) showed that the presence of CVD is a risk factor for the presence of symptoms of depression (p=0.033, OR=1.29). On the contrary, older age reduces the risk (p=0.015, OR=0.87) as a male sex too (trend, p=0.052, OR=0.80).

DISCUSSION
In our study the online version of HADS-D revealed the same population patterns found with other screening methods for depression. Thus, using this phenotyping method, we demonstrated that a greater severity of depressive symptoms is associated with the female sex, younger age and the presence of CVD.
According to the results we obtained using the online HADS-D questionnaire as a screening method for depression, the proportion of respondents with depressive symptoms was 14.4% (of which 11.5% were at subclinical level, and 2.9%at clinical). Our data is somewhat lower than the results obtained by Shalnova et al. in the study of the prevalence of anxiety and depression in the general Russian population using HADS, conducted in 2012-2013 during the epidemiological study of CVD. It should be noted that HADS was used in a "paper" format and was a part of extensive medical examination with the use of a variety of laboratory and instrumental methods. In this study, the proportion of respondents with subclinical depression (HADS-D ≥ 8) was 25.6%; and with clinical depression (HADS-D≥11) -8.8% [15]. However, in other studies using HADS, conducted on European populations, the proportion of respondents with 8 or more points was lowerfrom 11% to 23.5%; however, the proportion of patients with 11 or more points was comparable, ranging from 5% to 9.6% [16]- [19]. The mean HADS-D score in our study was 4.4 points, which is also comparable to the data on the mean HADS-D score in the European population, which ranged from 3.0 to 4.7 points [16], [17], [ 20] - [22]. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted October 20, 2020. . In our study the mean HADS-D score was higher in women. This result is confirmed by numerous studies (also using self-questionnaires) indicating a greater severity of depression in women compared to men. In a study by Shalnova et al. there have been demonstrated similar differences in HADS-D scores between genders [15]. Gender differences in the severity of depression (assessed using Beck Depression Inventory (BDI) and General Health Questionnaire (GHQ) scales) were also demonstrated in the European population [23]. However, according to the results of our study, the proportion of respondents with depressive symptoms did not differ significantly between sex groups. Although data from epidemiological studies based on the formal diagnostic criteria of the DSM or ICD show a higher incidence of depression in women [24], some HADS studies indicate that there is no such difference. For example, a study by Hinz et al., conducted in a German population, demonstrated no significant differences between men and women in terms of both the mean score and the proportion of respondents with depressive disorders [17]. One of the largest HADS studies in the general population, conducted in Norway, also found no significant difference between sex groups in the proportion of respondents with depressive disorders [21]. In addition, some studies have found even higher rates of depression when assessed by HADS-D in men [18], [25]. Several explanations have been proposed for the observed "reverse" sex pattern of depression as measured by HADS-D. Some researchers argue that depression in women is more likely to manifest with somatic symptoms, therefore, the absence of questions about them in HADS-D leads to a decrease in the detection of depressive disorders in women [26]. Other authors demonstrate that this difference may be due to the fact that HADS-D questions are able to identify only one of the possible clinical subtypes of the disease -"anhedonic" depression [21]. At the same time, studies of anhedonia have shown the absence of sex differences, which, in turn, may cause changes in the sex structure of depression when assessed by HADS-D [25]. In addition, much of the epidemiological data on the higher incidence of depression in women is based primarily on treatment demand. Men are known to seek help less often and, thus, more cases of depression in men remain undiagnosed. It has been shown that treatment demand in men is associated with a greater severity of health problems than in women [27]. In our study design participants did not seek medical care for depression and that could also reduce the observed difference between men and women in depression. Moreover, this fact may be an advantage of our study, making it possible to identify a more objective picture of the distribution of depressive symptoms in the population outside of their connection with the use of medical care.
We have demonstrated that participant's age also was associated with the severity and incidence of depressive symptoms as assessed by HADS-D. Depression has been considered more common in middle-aged and seniors and that was demonstrated in population studies conducted in the late 20th and early 21st centuries [24]. An increase in the incidence of depressive disorders in older age groups has also been demonstrated in various studies using HADS-D [17], [21]. However, our results indicate a greater severity of depressive symptoms in the young age group (up to 30 y.o.), which correlates with the results of newer studies. For example, in a 2012 study by Patten et al., conducted on Canadian adults, there was a 2% decrease in the annual prevalence of depression with each year of life [28]. Similar results were demonstrated in a study by Almas et al. conducted on a Swedish population. Authors found that depressed patients (assessed by Major Depression Inventory) were younger than healthy participants. [29]. These results can be explained by a significant increase in the incidence of depression in the young age group over the past 15 years. According to the results of the 2017 National Survey on Drug Use and Health study conducted in the United States, the annual prevalence of depression in the 18-25 age group was significantly higher compared to older groups [30].
Our results show that CVD, self-reported by respondents, was also a factor associated with the severity of depressive symptoms when assessed by online HADS-D. Although the proportion of patients with depressive symptoms did not differ significantly among respondents with CVD compared with healthy respondents, logistic regression analysis showed that the presence of CVD increases the risk of developing depressive symptoms (OR = 1.29). These results are consistent with the known data on a higher level of depression in patients with CVD in various populations, including Russian one [15], [24].

Strengths
To our knowledge, this is the first study with the evaluation of the online version of HADS-D for phenotyping depression in the general population. A large sample of respondents was studied, and positive results were obtained, which makes it possible to use this tool as one of the methods of phenotyping depression in the general population. New data were obtained on the prevalence of depressive symptoms in the Russian population, as well as on the association of sex, age, and the presence of cardiovascular diseases on the severity of depressive symptoms.

Limitations
This study had several limitations. We did not use diagnostic tools to assess prevalence of depression as a clinical diagnosis in our sample and were unable to compare these rates with the results of screening with HADS-D. In addition, HADS-D allows identifying only depressive symptoms that are relevant at the time of testing and does not allow assessing the possible presence of depression in the past. Also, we did not conduct medical examinations of respondents and did not have access to their medical records, and therefore we determined the presence of CVD solely on the basis of self-report.

CONCLUSIONS
Online HADS-D demonstrated results similar to those obtained with the "paper" version, which assumes personal contact between respondents and researchers. In addition, the population patterns of depression prevalence identified with the online version of HADS-D are consistent with the results of large-scale epidemiological studies of depression. Thus, the online HADS-D, as a simpler and less expensive method with satisfactory applicability and may be efficient enough for depression phenotype screening in the general population. However, further studies on larger samples are needed to more accurately determine the capabilities and limitations of this tool in depression phenotyping for genetic studies.