PHQ-9 depression screening tool

The objectives of this study were to use individual participant data meta-analysis to evaluate the diagnostic accuracy of the PHQ-9 screening tool among studies using semistructured, fully structured (MINI excluded), and MINI diagnostic interviews as reference standards, separately, with priority given to semistructured interview results; among participants not diagnosed as having or receiving treatment for a mental health problem; and among participant subgroups based on age, sex, country human development index, and recruitment setting.

After de-duplication, unique citations were uploaded into DistillerSR (Evidence Partners, Ottawa, Canada) for storing and tracking of search results. Datasets from articles in any language were eligible for inclusion if they included diagnostic classification for current major depressive disorder or major depressive episode on the basis of a validated semistructured or fully structured interview conducted within two weeks of PHQ-9 administration, among participants aged 18 years or over who were not recruited from youth or psychiatric settings or because they were identified as having symptoms of depression.

We required the diagnostic interviews and PHQ-9 to be administered within two weeks of each other because the Diagnostic and Statistical Manual of Mental Disorders (DSM) and International Classification of Diseases (ICD) diagnostic criteria for major depression specify that symptoms must have been present in the previous two weeks. We excluded patients from psychiatric settings and those already identified as having symptoms of depression because screening is done to identify previously unrecognized cases.

Datasets in which not all participants were eligible were included if primary data allowed selection of eligible participants. For defining major depression, we considered major depressive disorder or major depressive episode based on the DSM or ICD criteria.

If more than one was reported, we prioritized major depressive episode over major depressive disorder (as screening would attempt to detect depressive episodes, and further interview would determine whether the episode was related to major depressive disorder or bipolar disorder) and DSM over ICD criteria. Across all studies, there were 23 discordant diagnoses depending on classification prioritization. Two investigators independently reviewed titles and abstracts for eligibility.

If either deemed a study potentially eligible, two investigators did full text review independently, with disagreements resolved by consensus, consulting a third investigator when necessary.

We consulted translators for languages other than those in which team members were fluent. We invited authors of eligible datasets to contribute de-identified primary data.

Two investigators independently extracted country, recruitment setting (non-medical, primary care, inpatient specialty, outpatient specialty), and diagnostic interview from published reports, with disagreements resolved by consensus.

In two primary studies, multiple recruitment settings were included, so recruitment setting was coded at the participant level. When datasets included statistical weights to reflect sampling procedures, we used the weights provided. For studies in which sampling procedures merited weighting but the original study did not weight, we constructed weights by using inverse selection probabilities.

Weighting occurred, for instance, when a diagnostic interview was administered to all participants with positive screens and a random subset of participants with negative screens. We converted individual participant data to a standard format and synthesized them into a single dataset with study level data.
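As a concrete illustration of this weighting scheme, the sketch below constructs inverse selection probability weights for a hypothetical dataset in which every screen positive but only a quarter of screen negatives received the diagnostic interview; the column names (screen_positive, interviewed) are illustrative and not taken from the study datasets.

```r
# Illustrative sketch (hypothetical column names): constructing inverse
# selection probability weights when all screen positives but only a random
# subset of screen negatives received the diagnostic interview.
dat <- data.frame(
  screen_positive = c(rep(1, 40), rep(0, 160)),
  interviewed     = c(rep(1, 40), rep(1, 40), rep(0, 120))
)

# Selection probability within each screen stratum = proportion interviewed
sel_prob <- ave(dat$interviewed, dat$screen_positive, FUN = mean)

# Weight = 1 / selection probability, defined only for interviewed participants
dat$weight <- ifelse(dat$interviewed == 1, 1 / sel_prob, NA)

# The 40 interviewed screen negatives each receive weight 4, so together they
# stand in for all 160 screen negatives; screen positives keep weight 1.
table(dat$screen_positive, dat$weight, useNA = "ifany")
```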

We compared published participant characteristics and diagnostic accuracy results with results from raw datasets and resolved any discrepancies in consultation with the original investigators. Two investigators assessed risk of bias of included studies independently, on the basis of the primary publications, using the Quality Assessment of Diagnostic Accuracy Studies-2 tool (supplementary methods B).

We did three main sets of analyses. For each reference standard category, we estimated sensitivity and specificity across PHQ-9 cut-off scores for all participants from primary studies, as has been done in existing conventional meta-analyses, and, separately, among only participants who could be confirmed as not diagnosed as having or receiving treatment for a mental health problem at the time of assessment.

We did this because existing conventional meta-analyses have all been based on primary studies that generally do not exclude patients already diagnosed as having or receiving treatment for a mental health problem. As screening is done to identify previously unrecognized cases, those patients would not be screened in practice, and their inclusion in diagnostic accuracy studies could bias results.

Among studies that used the MINI, we combined inpatient and outpatient specialty care settings, as only one study included inpatient participants.

In each subgroup analysis, we excluded primary studies with no major depression cases, as this did not allow application of the bivariate random effects model. This resulted in a maximum of 15 participants excluded from any subgroup analysis.

For each meta-analysis, bivariate random effects models were fitted for each cut-off score separately via Gauss-Hermite adaptive quadrature.

For each analysis, this model provided estimates of pooled sensitivity and specificity. To compare results across reference standards and other subgroups, we constructed empirical receiver operating characteristic curves for each group based on the pooled sensitivity and specificity estimates and calculated areas under the curve.
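The following is a minimal sketch of how such a bivariate random effects model can be fitted in R using the Chu and Cole logistic parameterisation and lme4's glmer. It uses simulated counts and the default Laplace approximation rather than adaptive Gauss-Hermite quadrature (glmer only supports additional quadrature points for a single scalar random effect), so it illustrates the model structure rather than reproducing the authors' exact estimation procedure.

```r
library(lme4)

# Simulated 2x2 counts per study at one cut-off (illustrative values only)
set.seed(1)
k <- 10
studies <- data.frame(
  study = paste0("s", 1:k),
  tp = rbinom(k, 50, 0.85),   # true positives among 50 cases per study
  tn = rbinom(k, 200, 0.80)   # true negatives among 200 non-cases per study
)
studies$fn <- 50 - studies$tp
studies$fp <- 200 - studies$tn

# Long format: one row per study for sensitivity, one for specificity
long <- rbind(
  data.frame(study = studies$study, sens = 1, spec = 0,
             correct = studies$tp, total = studies$tp + studies$fn),
  data.frame(study = studies$study, sens = 0, spec = 1,
             correct = studies$tn, total = studies$tn + studies$fp)
)

# Bivariate random effects model: separate fixed effects and correlated
# random study effects for logit(sensitivity) and logit(specificity)
fit <- glmer(
  cbind(correct, total - correct) ~ 0 + sens + spec + (0 + sens + spec | study),
  family = binomial(link = "logit"), data = long
)

# Pooled sensitivity and specificity on the probability scale
plogis(fixef(fit))
```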

We estimated differences in sensitivity and specificity between subgroups at each cut-off score by constructing confidence intervals for differences via the cluster bootstrap approach,33 34 resampling at the study and participant levels. For each comparison, we ran iterations of the bootstrap. We removed iterations that did not produce difference estimates for cut-off scores before determining confidence intervals and noted the number of iterations removed.
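A schematic version of such a two level cluster bootstrap is shown below for the difference in sensitivity between two hypothetical subgroups at a single cut-off; the data, subgroup labels, helper function, and number of iterations are invented for illustration and do not reflect the study's code.

```r
# Schematic two-level cluster bootstrap for a subgroup difference in
# sensitivity at one cut-off (illustrative data and helper function).
set.seed(2)
ipd <- data.frame(
  study    = rep(paste0("s", 1:8), each = 60),
  subgroup = rep(c("A", "B"), times = 240),
  case     = rbinom(480, 1, 0.2)
)
ipd$screen_pos <- ifelse(ipd$case == 1, rbinom(480, 1, 0.85), rbinom(480, 1, 0.15))

sens_diff <- function(d) {
  cases <- d[d$case == 1, ]
  mean(cases$screen_pos[cases$subgroup == "A"]) -
    mean(cases$screen_pos[cases$subgroup == "B"])
}

boot_diffs <- replicate(1000, {
  # Level 1: resample studies with replacement
  boot_studies <- sample(unique(ipd$study), replace = TRUE)
  # Level 2: resample participants with replacement within each sampled study
  boot_dat <- do.call(rbind, lapply(boot_studies, function(s) {
    d <- ipd[ipd$study == s, ]
    d[sample(nrow(d), replace = TRUE), ]
  }))
  sens_diff(boot_dat)
})

# Percentile 95% confidence interval; iterations that return no estimate
# (eg no cases in one subgroup) are dropped, mirroring the handling above.
quantile(boot_diffs, c(0.025, 0.975), na.rm = TRUE)
```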

Similarly to our main subgroup analyses, we again determined which significant interactions replicated across all three reference standard categories.

For subgrouping variables that were significantly associated with sensitivity or specificity coefficients for all three reference standard categories for all or most cut-off scores in the main one stage meta-regression, we did additional one stage meta-regression to produce accuracy estimates for the subgroups of interest, and we compared these results with those seen in the original two stage bivariate random effects meta-analytic models.
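As an illustration of what a one stage meta-regression of this kind can look like, the sketch below reuses the logit(sensitivity)/logit(specificity) parameterisation from the earlier example, but at the participant level, and interacts a centred age covariate with both terms; the data are simulated and the model is a simplified stand-in for the authors' specification.

```r
library(lme4)

# Illustrative one-stage meta-regression with participant-level data:
# outcome is "correctly classified", with an age covariate interacted with
# the sensitivity and specificity terms (simulated data).
set.seed(3)
n <- 2000
ipd <- data.frame(
  study = sample(paste0("s", 1:10), n, replace = TRUE),
  age   = round(rnorm(n, 45, 15)),
  case  = rbinom(n, 1, 0.2)
)
ipd$screen_pos <- ifelse(ipd$case == 1, rbinom(n, 1, 0.85), rbinom(n, 1, 0.15))

# "Correct" classification: screen positive for cases, screen negative otherwise
ipd$correct <- ifelse(ipd$case == 1, ipd$screen_pos, 1 - ipd$screen_pos)
ipd$sens  <- ipd$case
ipd$spec  <- 1 - ipd$case
ipd$age_c <- scale(ipd$age)[, 1]  # centred and scaled age

fit_mr <- glmer(
  correct ~ 0 + sens + spec + sens:age_c + spec:age_c + (0 + sens + spec | study),
  family = binomial(link = "logit"), data = ipd
)

# The spec:age_c coefficient estimates how specificity (logit scale) changes
# with age; sens:age_c does the same for sensitivity.
summary(fit_mr)$coefficients
```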

To investigate heterogeneity, we generated forest plots of sensitivities and specificities for cut-off score 10 for each study, first for all studies in each reference standard category and then separately across participant subgroups within each reference standard category.
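A forest plot of study level sensitivities at cut-off 10 could be produced along the following lines with the metafor package; the counts are placeholders, and the pooling model here (a univariate random effects model on logit proportions) is a display convenience rather than the bivariate model used for the main estimates.

```r
library(metafor)

# Illustrative forest plot of study-level sensitivities at cut-off 10:
# logit-transform the proportions, pool with a random effects model, and
# back-transform for display (counts are made up for the sketch).
tp <- c(34, 40, 27, 45, 38, 22)   # true positives per study
np <- c(40, 48, 30, 52, 45, 26)   # total cases per study

es  <- escalc(measure = "PLO", xi = tp, ni = np)    # logit proportions
res <- rma(yi, vi, data = es, method = "REML")      # random effects pooling
forest(res, transf = transf.ilogit, refline = NA,
       slab = paste0("Study ", seq_along(tp)),
       xlab = "Sensitivity at cut-off 10")
```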

The other studies had eligible datasets but did not publish eligible diagnostic accuracy results (supplementary table A). For all analyses, we used R (version 3). The only substantive deviations from our initial protocol were that we stratified accuracy results by reference standard category and did not do sensitivity analyses that combined accuracy results from individual participant data meta-analysis with published results from studies that did not contribute individual participant data.

No patients were involved in setting the research question or the outcome measures, nor were they involved in developing plans for design or implementation of the study. No patients were asked to advise on interpretation or writing up of results. There are no plans to disseminate the results of the research to study participants or the relevant patient community. Reasons for exclusion for the articles excluded at full text level are given in supplementary table A.

Characteristics of included studies and eligible studies that did not provide datasets are shown in supplementary table B. Of 58 included studies, 29 used semistructured reference standards, 14 used fully structured reference standards, and 15 used the MINI (table 1).

Some variables were coded at study level, and others were coded at participant level. Thus, the number of studies does not always add up to the total number in the reference category. Table 3 and table 4 show comparisons of sensitivity and specificity estimates by reference standard category. A cut-off score of 10 maximized combined sensitivity and specificity among studies using semistructured interviews (sensitivity 0.…).

Based on cut-off score 10, sensitivity and specificity were 0.… Receiver operating characteristic curves and area under the curve values are shown in supplementary figure B.

Figure: comparison of sensitivity and specificity estimates among semistructured versus fully structured reference standards. Figure: comparison of sensitivity and specificity estimates among semistructured versus MINI reference standards.

Heterogeneity analyses suggested moderate heterogeneity across studies, which improved in some instances when we considered subgroups.

Figure 1 shows nomograms of positive and negative predictive values for cut-off score 10 for each reference standard category. When examined with meta-regression analysis, consistent with our main results, we found that PHQ-9 sensitivity estimates for semistructured interviews were significantly higher than for fully structured interviews or the MINI (supplementary table D). Sensitivity and specificity estimates were not statistically significantly different for any reference standard category when we restricted analyses to participants not currently diagnosed as having or receiving treatment for a mental health problem compared with all participants.
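The nomograms present graphically what can also be computed directly from sensitivity, specificity, and prevalence via Bayes' theorem; the small helper below does that calculation with placeholder inputs rather than the study's estimates.

```r
# Predictive values from sensitivity, specificity, and prevalence
# (Bayes' theorem); the nomograms in figure 1 display the same relationship
# graphically. The numbers below are placeholders, not the study's estimates.
predictive_values <- function(sens, spec, prev) {
  ppv <- (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
  npv <- (spec * (1 - prev)) / (spec * (1 - prev) + (1 - sens) * prev)
  c(PPV = ppv, NPV = npv)
}

# Example: hypothetical sensitivity 0.85, specificity 0.85, prevalence 10%
predictive_values(sens = 0.85, spec = 0.85, prev = 0.10)
```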

See supplementary table E for results and supplementary figure D for receiver operating characteristic curves and area under the curve values.

For each reference standard category, we compared sensitivity and specificity estimates based on bivariate models across PHQ-9 cut-off scores among subgroups based on age, sex, and country human development index. No comparisons that were significantly different in one reference standard category were statistically significant in either of the other two reference standard categories. Subgroup analyses are shown in supplementary table E. In the meta-regression analyses, on the other hand, older age (measured continuously) was associated with higher specificity for all reference standards (supplementary table D).

Supplementary table F shows Quality Assessment of Diagnostic Accuracy Studies-2 ratings for each included primary study, and comparisons of PHQ-9 accuracy across individual items for each reference standard category are shown in supplementary table E.

For the item on blinding of the reference standard to PHQ-9 results, specificity was significantly greater for studies and participants with high or unclear risk versus low risk of bias for semistructured interviews but significantly greater for low risk versus high or unclear risk of bias for fully structured interviews and the MINI.

For the item on recruiting a consecutive or random sample of participants, specificity was significantly greater for low risk versus high or unclear risk of bias for fully structured interviews and the MINI. We found no other statistically significant differences, and no significant differences were replicated across all reference standards. We compared the accuracy of scores on the PHQ-9 for screening to detect major depression, separately, with semistructured diagnostic interviews, fully structured diagnostic interviews (MINI excluded), and the MINI.

It is not a screening tool for depression, but it is used to monitor the severity of depression and response to treatment. However, it can be used to make a tentative diagnosis of depression in at-risk populations, eg those with coronary heart disease or after a stroke. Validity has been assessed against an independent structured mental health professional (MHP) interview. The copyright for the PHQ-9 was formerly held by Pfizer, which provided the educational grant for Drs Spitzer, Williams, and Kroenke, who originally designed it.


The PHQ-9 asks how often, over the last two weeks, the patient has been bothered by each of the following problems, with response options of not at all (0), several days (1), more than half the days (2), and nearly every day (3):

1. Little interest or pleasure in doing things?
2. Feeling down, depressed, or hopeless?
3. Trouble falling or staying asleep, or sleeping too much?
4. Feeling tired or having little energy?
5. Poor appetite or overeating?
6. Feeling bad about yourself, or that you are a failure or have let yourself or your family down?
7. Trouble concentrating on things, such as reading the newspaper or watching television?
8. Moving or speaking so slowly that other people could have noticed? Or the opposite: being so fidgety or restless that you have been moving around a lot more than usual?
9. Thoughts that you would be better off dead, or thoughts of hurting yourself in some way?

The PHQ-9 score is obtained by adding the scores for each question (total 0-27 points). Interpretation: total scores of 5, 10, 15, and 20 represent cut points for mild, moderate, moderately severe, and severe depression, respectively.

Note: Question 9 is a single screening question on suicide risk. A patient who answers yes to question 9 needs further assessment for suicide risk by an individual who is competent to assess this risk.
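A small helper along these lines totals the nine item scores (each 0-3) and applies the severity cut points given above; the function name and structure are illustrative, not part of the instrument itself.

```r
# Total a set of nine PHQ-9 item responses (each scored 0-3) and apply the
# severity cut points described above (5, 10, 15, 20). Helper name and
# structure are illustrative.
phq9_score <- function(items) {
  stopifnot(length(items) == 9, all(items %in% 0:3))
  total <- sum(items)
  severity <- cut(total,
                  breaks = c(-Inf, 4, 9, 14, 19, Inf),
                  labels = c("none-minimal", "mild", "moderate",
                             "moderately severe", "severe"))
  list(total = total, severity = as.character(severity))
}

# Example: responses to the nine items in order
phq9_score(c(1, 2, 1, 0, 2, 1, 1, 0, 0))
```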


