Chronic conditions are leading causes of illness, disability, and death among Medicare beneficiaries and account for a disproportionate share of healthcare expenditures. About 14% of Medicare beneficiaries have heart failure (accounting for 43% of Medicare spending), while approximately 18% of Medicare beneficiaries have diabetes mellitus (DM) (accounting for 32% of Medicare spending).1 The Medicare Modernization Act of 2003 created the voluntary Medicare Health Support (MHS) program to coordinate care for high-risk Medicare beneficiaries with chronic illness under traditional fee-for-service Medicare.2 In its initial phase, the MHS program consisted of 3-year pilot programs at 8 sites across the United States, each operated by a different MHS organization. The MHS program provides care management services to Medicare beneficiaries with congestive heart failure (CHF) or complex DM. Each site represented a randomized controlled trial and covered 20,000 to 30,000 participants. Enrollment began in the summer of 2005.
Phase 1 of the MHS program recommends screening for depression. Patients with DM and CHF have a 2-fold higher prevalence of depression than other medical control subjects.3 Based on the severity and complexity of their condition, the MHS target population is likely to have prevalence rates of major depression ranging from 15% to 25%. Medically ill patients with comorbid depression have lower adherence to recommended treatments and to self-care regimens, such as diet changes and increased exercise.4 Medically ill patients with comorbid depression are more functionally impaired, manifest increased complication and mortality rates, and have 50% to 100% higher medical costs than those without depression, even after controlling for demographics and severity of medical illness.5 Improving the quality of depression management for patients with comorbid depression and DM has been shown to improve patients’ depression outcomes and to decrease medical costs during a 2-year period relative to “usual” care.6,7
This article describes efforts to optimize the identification of program participants with possible depression by the Green Ribbon Health, LLC (GRH) MHS organization. We used various methods to identify participants who should be further evaluated for the presence of depression by care managers (CMs) using the Patient Health Questionnaire 2 (PHQ-2) and, if needed, should be provided with additional resources for depression management. Although a growing consensus exists about the value of identifying depression as a necessary first step to effective intervention, there is little evidence about practical methods for large-scale depression screening, particularly in a real-world population of medically ill older adults. We discuss training and tactics to enhance telephonic depression screening. We also report discrepancies in depression prevalence among different methods of administering the PHQ-2 (ie, telephonic, by mail, or in person). Finally, we discuss the potential benefits of combining administrative and screener data into a multipronged hierarchical depression screening strategy.
This study examines depression screening data from 14,902 Medicare beneficiaries participating in the GRH MHS program who completed an initial Medicare Domain Assessment tool (hereafter, mDAT1). The MHS program is designed to serve fee-for-service Medicare beneficiaries with CHF and/or complex DM who live in MHS catchment areas.2 Beneficiaries were ineligible for the MHS program if they belonged to a Medicare Advantage plan, had end-stage renal disease, or were enrolled in hospice care. Green Ribbon Health, LLC serves MHS beneficiaries in 9 counties in Florida. The program commenced on November 1, 2005.
The core of the GRH multidisciplinary care coordination model is telephonic interface by a personal nurse, who educates and supports participants in managing their health and in following their providers’ prescribed plan of care. If a personal nurse believes that a participant’s condition warrants a more intensive face-to-face intervention, the participant is contacted by a member of the field CM team. This team is composed of social workers and registered nurses who coordinate or provide appropriate services.
All participants receive a health assessment at least every 6 to 12 months using the mDAT developed for GRH. Depression screening is a core requirement of the program by GRH and is administered at set intervals as part of the mDAT. The mDAT includes the PHQ-2, which is a 2-item depression screener.8 All GRH participants with possible depression are eligible to receive health support and education regarding depression (eg, prevalence, risks, and treatment options). Participants screening positive for depression on the PHQ-2 are further evaluated using the PHQ-9.9 They may then be referred to their primary care provider or to a specialist for assessment, diagnosis, and treatment recommendations. The CM team monitors depressive symptoms and response to treatment and provides information to participants to help them manage their symptoms and connect with appropriate resources. Care managers also promote self-care and help participants communicate directly with their physicians. For participants whose depression is not improving, CMs interface with the provider for review and adjustment of the treatment plan.
Data and Measures
Data on GRH MHS participants come from 3 main sources. These include (1) mDATs administered by GRH, (2) self-reported use of prescription drugs as assessed by interviews with GRH CMs (typically along with mDATs), and (3) Medicare eligibility and claims data provided by the Centers for Medicare & Medicaid Services (CMS).
Demographics. Limited demographic information was available from the CMS, including participants’ age, sex, and race/ethnicity. Other sociodemographic measures, particularly education, literacy, and occupation, were available for fewer than 5% of participants.
Clinical Status and Case Mix. The CMS data included the conditions for which participants qualified for the MHS program (ie, DM, CHF, or both), participants’ Hierarchical Condition Category (HCC) risk score (ie, the CMS version of the HCC risk-adjustment method), and high-, medium-, or low-risk designations based on the HCC score. In addition, GRH developed a risk-tiering algorithm that uses the HCC score and the mDAT to classify participants into categories of intervention need. These measures were later used to adjust for differences in CM caseload composition when assessing variation in positive depression screens.
Depression Status. Possible depression status came from 3 sources. These included (1) the PHQ-2 screens via the mDAT (a score of ≥3 of 6 is the optimum cutoff point on the PHQ-2)8; (2) the International Classification of Diseases, Ninth Revision (ICD-9) depression diagnosis codes from Medicare claims data with service dates between November 2004 and March 2007 (codes 296.2, 296.3, 298.0, 300.4, 309.1, and 311 for current depression and codes 296.2-296.9, 298, 300.4, 309.0, 309.1, 309.28, and 311 for prior episodes of depression); and (3) self-reports of lifetime antidepressant medication use (medications with US Food and Drug Administration indications for depression) from the GRH medication assessment interviews collected by CMs between November 2005 and May 2007.
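The 3 indicators described above can be illustrated per participant with a minimal Python sketch. Only the PHQ-2 cutoff (≥3) and the ICD-9 code list for current depression are taken from the text; the record layout, the field names, and the prefix matching of claim codes are illustrative assumptions and do not describe the GRH data systems.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Cutoff and ICD-9 codes for current depression as listed in the text.
PHQ2_CUTOFF = 3
CURRENT_DEPRESSION_CODES = ("296.2", "296.3", "298.0", "300.4", "309.1", "311")


@dataclass
class Participant:  # hypothetical record layout
    phq2_score: Optional[int] = None  # 0-6, None if no screen was completed
    icd9_codes: List[str] = field(default_factory=list)
    reports_antidepressant: bool = False


def has_depression_claim(codes):
    # Assumption: a claim code matches if it starts with a listed code,
    # so "296.21" is counted under "296.2".
    return any(code.startswith(CURRENT_DEPRESSION_CODES) for code in codes)


def depression_indicators(p):
    """Return which of the 3 sources flag possible depression."""
    return {
        "phq2": p.phq2_score is not None and p.phq2_score >= PHQ2_CUTOFF,
        "icd9": has_depression_claim(p.icd9_codes),
        "antidepressant": p.reports_antidepressant,
    }


def possible_depression(p):
    # A participant counts as possibly depressed if any source is positive.
    return any(depression_indicators(p).values())
```

Keeping the 3 flags separate, rather than collapsing them immediately into one yes/no value, makes it possible to examine overlap among the sources later.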
To assess the rate of possible depression, we examined the proportion of participants who screened positive on the PHQ-2 mDAT1, the proportion with a diagnosis of depression during the past year, and the proportion reporting antidepressant use. In addition, because CMs varied in their experience with depression, we examined whether this affected their administration of the PHQ-2. To assess possible variation, we evaluated whether positive PHQ-2 screening rates varied by CM after adjusting for differences in caseload composition. We anticipated the use of focus groups among CMs in the event of significant variation in positive depression screens. This would allow us to determine the source of such variation and to provide training to minimize future discrepancies.
Almost all (98.7%) mDAT1s were administered by telephone; 13.8% of completed follow-up assessments (mDAT2s) were administered by mail. Green Ribbon Health, LLC conducted a “mail blitz” campaign to raise mDAT response rates and to follow up on participants who were difficult to reach by telephone. Overall, 85.1% of participants (n = 12,677) who completed the mDAT1 completed the mDAT2. We examined differences in PHQ-2–positive rates between telephone- and mail-based mDATs using bivariate analysis and multivariate logistic regression analysis to account for possible differences in caseloads.
Finally, we examined the efficiency of alternative depression screening strategies among the sample. Particularly, we illustrate the degree of independence and overlap between the following 3 sources of information on possible depression: PHQ-2 screens, self-reported use of antidepressants, and ICD-9 depression diagnoses.
Descriptive characteristics for our sample are given in Table 1. The mDAT1 column includes participants who completed the mDAT1 by telephone (190 participants who completed the mDAT1 face-to-face and 5 participants who completed the mDAT1 by mail are excluded). Participants were principally of white race/ethnicity, and slightly more than half were male. More than three fourths had DM only, while one fifth had DM and CHF. About 14% had an ICD-9 depression diagnosis, 7.1% reported current antidepressant use, and 5.1% screened positive for depression on the PHQ-2.
The mDAT2 data include participants who completed the mDAT2 by telephone (83.8%) or by mail (13.8%) (305 participants who completed the mDAT2 in person are excluded). Compared with telephone respondents, mail mDAT2 respondents were slightly older and were slightly more likely to be male and of white race/ethnicity (P <.05 for all). Primary clinical and risk characteristics did not differ statistically for responders by telephone versus by mail. However, mail respondents were more than twice as likely to screen positive for depression (P <.001), despite their being less likely to have had an ICD-9 depression diagnosis during the prior year (P <.001). We used multivariate logistic regression analysis to examine the effect of the mDAT2 mode on PHQ-2 screening, controlling for the demographic and clinical characteristics in Table 1. The predicted PHQ-2–positive rates were 6.5% by telephone and 14.1% by mail (P <.001) (ie, adjusting for covariates increases the mDAT mode effect slightly).
Overall, 77 different CMs administered mDAT1s. For 46 CMs who administered 50 or more by telephone, the Figure shows the fractions of each CM’s caseload that screened positive via the PHQ-2, had an ICD-9 depression diagnosis during the prior year, and was classified as high risk based on the HCC risk-adjustment score. Among these measures, the PHQ-2–positive rate was much more variable, and this variation cannot be accounted for by caseload. Indeed, for 10 of 40 CMs who screened 200 or more participants, the PHQ-2–positive rate was 2% or lower; 2 of these CMs who each screened approximately 300 participants did not have a single positive PHQ-2 (detailed results are available from the author). In contrast, only 2 of the 45 CMs had fewer than 10% of their caseload with a recent ICD-9 depression diagnosis (and the PHQ-2–positive rate of those 2 CMs was 5%-6% at mDAT1).
Given the lower-than-expected depression screening results via telephone and the significant variation across CMs in PHQ-2–positive rates, GRH conducted a focus group with CMs. Green Ribbon Health, LLC invited CMs with low and high PHQ-2–positive rates in their mDAT1 panels to join the focus group. The goal of the focus group was to learn about the perceptions of CMs regarding depression screening, including possible reasons for CM variations, and to identify strategies to improve screening. These discussions confirmed that CMs often found it challenging to administer the PHQ-2 questions. The mDAT1s were conducted early in the program when a focus on enrollment may have limited the development of a trusting therapeutic relationship, such that the participant was uncomfortable in disclosing sensitive health issues. Care managers also described substantial “competing demands,” such as MHS program enrollment and performance of clinical assessments. In response to these issues, GRH retrained all CMs to improve competency, effectiveness, and comfort in addressing depression with medically ill older adults. This training included education on using the PHQ tool, talking about depression with clients, and dealing with high-impact situations (eg, bereavement, grief, severe depression, and suicide).
We applied each of 3 available indicators of possible depression (positive PHQ-2, self-reported use of antidepressants, and ICD-9 depression diagnosis) in each of 6 possible sequences. For each step of each sequence, we list the cumulative and marginal proportions of participants identified by the given indicators. Based on telephone responses to the mDAT1 and mDAT2, ICD-9 codes identified the most marginal participants, while PHQ-2 screens identified the fewest marginal participants. However, this pattern changed based on mail administration of the mDAT2, with antidepressant use identifying the most marginal participants, followed closely by mailed PHQ-2s; ICD-9 codes identified the fewest marginal participants.
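The sequencing analysis described above can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis code: the function names and the 5-participant toy data are hypothetical, and "marginal" is taken to mean participants newly identified by an indicator who were not identified by any earlier indicator in the sequence.

```python
from itertools import permutations

INDICATORS = ("phq2", "antidepressant", "icd9")


def sequence_yields(flags, order):
    """For one ordering of indicators, return a list of
    (indicator, marginal proportion, cumulative proportion) tuples."""
    n = len(flags)
    identified = set()
    rows = []
    for name in order:
        # Participants flagged by this indicator and not already identified.
        newly = {i for i, f in enumerate(flags) if f[name]} - identified
        identified |= newly
        rows.append((name, len(newly) / n, len(identified) / n))
    return rows


def all_sequences(flags):
    # 3 indicators give 3! = 6 possible sequences.
    return {order: sequence_yields(flags, order)
            for order in permutations(INDICATORS)}


# Hypothetical toy data: one dict of indicator flags per participant.
toy = [
    {"phq2": True, "antidepressant": False, "icd9": False},
    {"phq2": True, "antidepressant": True, "icd9": False},
    {"phq2": False, "antidepressant": True, "icd9": True},
    {"phq2": False, "antidepressant": False, "icd9": True},
    {"phq2": False, "antidepressant": False, "icd9": False},
]

for order, rows in all_sequences(toy).items():
    print(" -> ".join(order), rows)
```

The final cumulative proportion is the same for every ordering, because it is the proportion flagged by any indicator; only the marginal contribution of each step depends on the sequence.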
Although the mailed PHQ-2 revealed expected depression prevalence rates, the telephonic PHQ-2 revealed surprisingly low rates of depression. Moreover, when the PHQ-2 was administered telephonically, there was substantial variation in screener-positive rates based on the person administering the screen. Several instruments have been developed and tested as depression screens in older adults.10-12 Among the most commonly used tools are the Geriatric Depression Scale13 and the Center for Epidemiologic Studies Depression Scale.14 The PHQ-2 has become one of the most widely used screening instruments in recent years because of its brevity and ease of use. Initially validated among large samples of mixed-aged primary care patients,8 this 2-item screen has been validated among Veterans Affairs populations of largely older medically ill adults who were screened in person15,16 and, more recently, among a large nationally representative sample of community-living older adults.17 In the latter study by Li and colleagues,17 8205 older adults were screened using the PHQ-2 and were evaluated using a structured diagnostic interview for Diagnostic and Statistical Manual of Mental Disorders (Fourth Edition) major depression, with 26% of participants screening positive for depression. The study found that the criterion validity of the PHQ-2 for major depression was good (100% sensitivity, 77% specificity, 0.88 area under the curve, and 14% positive predictive value).
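The validity figures quoted from Li and colleagues are linked through Bayes' rule: positive predictive value = (sensitivity × prevalence) / (sensitivity × prevalence + (1 − specificity) × (1 − prevalence)). The short Python sketch below back-calculates the prevalence of major depression implied by the rounded published values and the corresponding screener-positive rate. The function names are illustrative, and the implied prevalence is a derived approximation rather than a figure reported in the text.

```python
def implied_prevalence(sensitivity, specificity, ppv):
    """Solve the PPV formula for prevalence."""
    fp_rate = 1.0 - specificity
    return ppv * fp_rate / (sensitivity * (1.0 - ppv) + ppv * fp_rate)


def screen_positive_rate(sensitivity, specificity, prevalence):
    """True positives plus false positives, as a share of everyone screened."""
    return sensitivity * prevalence + (1.0 - specificity) * (1.0 - prevalence)


# Rounded values quoted in the text: 100% sensitivity, 77% specificity, 14% PPV.
prev = implied_prevalence(1.00, 0.77, 0.14)
rate = screen_positive_rate(1.00, 0.77, prev)
print(f"implied prevalence: {prev:.3f}, screener-positive rate: {rate:.3f}")
```

With these inputs, the implied prevalence is roughly 3.6% and the screener-positive rate roughly 26%, which agrees with the 26% quoted above. At such a low prevalence, most positive screens are false positives, which is why a positive PHQ-2 is followed by further evaluation.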
The GRH experience with administering the PHQ-2 by telephone found a substantially lower screener-positive rate than the study by Li and colleagues,17 which also administered the PHQ-2 telephonically. A possible reason for the difference is that trained research interviewers focusing specifically on depression administered the questions in the study by Li et al, whereas in our study the GRH questions were administered by CMs who may have had little or no experience with depression assessments delivered early in a therapeutic relationship. Moreover, administration of depression screens is a small part of the overall scope of work among CMs.
It is also possible that our MHS population was more medically ill than the participants in the study by Li and colleagues,17 and concerns have been raised about the performance of depression screeners among medically ill populations18 and among old-old adults.19 The fact that mail administration of the PHQ-2 screen yielded a consistently higher screener-positive rate than telephonic administration suggests that the problem may not be primarily the particular screening tool but rather the method of administration.
After conducting a focus group, GRH provided retraining for CMs to improve the effectiveness of telephone-based depression screening and their comfort in addressing depression. This training was generally well received, and CMs described it as helpful. Perhaps as a result of this retraining, the telephonic mDAT2 screener-positive rates were slightly higher than the telephonic mDAT1 rates (6.6% vs 5.1%), but remained lower than expected.
In part because of persistently low telephone screening rates, a mailing of PHQ-2 screens was added during the collection of the mDAT2. This mail-based approach was successfully used in a study20 of more than 10,000 older primary care patients at Group Health Cooperative. Consistent with this earlier work, we also obtained substantially higher rates of positive depression screens (doubling from 6.6% to 13.6%), especially after adjusting for differences among participants who received telephone versus mail screens for the mDAT2.
Based on the initial experience with screening, the screening strategy for depression was expanded to include information about depression status from ICD-9 claims diagnoses and from self-reported use of antidepressants. Approximately 10% of program participants had a claims diagnosis of depression during the year before program entry, a finding that is consistent with data from the current Medicare beneficiary survey and with other Medicare claims data.21 More than 15% of program participants reported the use of antidepressants. The addition of data about depression diagnoses and antidepressant use substantially improved our screening yield, increasing the proportions of participants with possible depression from 6.6% (for telephonic screening) and 13.6% (for mail screening) to 28.5% when all 3 methods were used.
We explored several screening strategies for this chronically ill Medicare population, and we recommend the following screening components based on our experience to date: (1) mail screen using the PHQ-2 (perhaps augmented by telephone screen in patients who do not return mail screens), (2) evaluation of self-reported antidepressant use during the prior year, and (3) assessment for participants with ICD-9 depression claims who have not been identified by 1 of the 2 prior methods. Using this approach, the number of participants who need a telephone screen would be small, and a group of staff who are well trained and experienced in telephone-based depression screening could be designated to perform such screens, which should decrease the variability of results.
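One way to read the recommended hierarchy is as a simple routing rule, sketched below in Python. The function name, the argument names, and the return labels are hypothetical. Placing the telephone screen last, only for participants with no returned mail screen and no other indicator, is our interpretation of how the number of telephone screens would be kept small; the text does not specify an exact order of operations.

```python
from typing import Optional

PHQ2_CUTOFF = 3  # PHQ-2 cutoff stated in the text


def screening_route(mail_phq2: Optional[int],
                    reports_antidepressant: bool,
                    has_icd9_depression_claim: bool) -> str:
    """Return the first step of the hierarchy that applies.

    mail_phq2 is None when no mail screen was returned.
    """
    # Step 1: returned mail PHQ-2 at or above the cutoff.
    if mail_phq2 is not None and mail_phq2 >= PHQ2_CUTOFF:
        return "possible depression: mail PHQ-2"
    # Step 2: self-reported antidepressant use.
    if reports_antidepressant:
        return "possible depression: antidepressant use"
    # Step 3: ICD-9 depression claim not caught by steps 1 or 2.
    if has_icd9_depression_claim:
        return "possible depression: ICD-9 claim"
    # Assumption: telephone follow-up only if no mail screen came back
    # and no other indicator is present.
    if mail_phq2 is None:
        return "telephone PHQ-2 needed"
    return "screen negative"
```

For example, `screening_route(None, False, False)` routes a participant with no returned mail screen and no other indicator to the designated telephone screening staff.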
This work has limitations. First, we do not report on the entire sample eligible for the GRH MHS program but only on the 14,902 participants for whom depression screening information (via the mDAT) was available. Second, we do not know the exact denominator of the mail screening sample but focus on the results of returned screens. However, when adjusted for sociodemographic and clinical differences between individuals screened by telephone versus by mail, the projected screener-positive rates for patients screened by mail increased. Third, the validity of the screening methods used may be limited, especially without a gold standard diagnosis for depression for comparison (eg, patients reporting the use of an antidepressant may be taking the medication for another indication, such as chronic pain).
It is important to note that screening and other case finding strategies are only the beginning of effective care management of depression. Depression screening and systematic feedback to providers alone have been shown to increase diagnosis rates but do not improve depression outcomes.22 On the other hand, screening combined with methods to improve exposure to effective treatment improves depression outcomes.23 This is why the US Preventive Services Task Force23 recommends routine screening for depression only if adequate mechanisms are in place to provide depression treatment. Effective programs for depression go beyond case finding and focus on the identification of patients who are not treated or are not effectively treated, as well as patients who are in treatment and are persistently depressed (eg, repeat screener positive). Future research will attempt to evaluate the effectiveness of depression treatment in the context of the MHS program.
1. Centers for Medicare & Medicaid Services. Medicare Health Support Overview. February 2008. Accessed July 16, 2008.
2. McCall N, Bernard S. Evaluation of Phase I of Medicare Health Support (Formerly Voluntary Chronic Care Improvement) Pilot Program Under Traditional Fee-for-Service Medicare: Report to Congress. Washington, DC: RTI International; June 2007.
3. Katon WJ. Clinical and health services relationships between major depression, depressive symptoms, and general medical illness. Biol Psychiatry. 2003;54(3):216-226.
4. Lin EH, Katon W, Von Korff M, et al. Relationship of depression and diabetes self-care, medication adherence, and preventive care. Diabetes Care. 2004;27(9):2154-2160.
5. Unützer J. Diagnosis and treatment of older adults with depression in primary care. Biol Psychiatry. 2002;52(3):285-292.
6. Katon W, Unützer J, Fan MY, et al. Cost-effectiveness and net benefit of enhanced treatment of depression for older adults with diabetes and depression. Diabetes Care. 2006;29(2):265-270.
7. Simon GE, Katon WJ, Lin EH, et al. Cost-effectiveness of systematic depression treatment among people with diabetes mellitus. Arch Gen Psychiatry. 2007;64(1):65-72.
8. Kroenke K, Spitzer RL, Williams JB. The Patient Health Questionnaire-2: validity of a two-item depression screener. Med Care. 2003;41(11):1284-1292.
9. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606-613.
10. Williams JW Jr, Pignone M, Ramirez G, Perez Stellato C. Identifying depression in primary care: a literature synthesis of case-finding instruments. Gen Hosp Psychiatry. 2002;24(4):225-237.
11. Watson LC, Pignone MP. Screening accuracy for late-life depression in primary care: a systematic review. J Fam Pract. 2003;52(12):956-964.
12. Blank K, Gruman C, Robison JT. Case-finding for depression in elderly people: balancing ease of administration with validity in varied treatment settings. J Gerontol A Biol Sci Med Sci. 2004;59(4):378-384.
13. Yesavage JA. Geriatric Depression Scale. Psychopharmacol Bull. 1988;24(4):709-711.
14. Radloff LS. The CES-D Scale: a self-report depression scale for research in the general population. Appl Psychological Measurement. 1977;1(3):385-401.
15. Whooley MA, Avins AL, Miranda J, Browner WS. Case-finding instruments for depression: two questions are as good as many. J Gen Intern Med. 1997;12(7):439-445.
16. Corson K, Gerrity MS, Dobscha SK. Screening for depression and suicidality in a VA primary care setting: 2 items are better than 1 item. Am J Manag Care. 2004;10(11, pt 2):839-845.
17. Li C, Friedman B, Conwell Y, Fiscella K. Validity of the Patient Health Questionnaire 2 (PHQ-2) in identifying major depression in older people. J Am Geriatr Soc. 2007;55(4):596-602.
18. Nease DE Jr, Klinkman MS, Aikens JE. Depression case finding in primary care: a method for the mandates. Int J Psychiatry Med. 2006;36(2):141-151.
19. Watson LC, Lewis CL, Kistler CE, Amick HR, Boustani M. Can we trust depression screening instruments in healthy ‘old-old’ adults? Int J Geriatr Psychiatry. 2004;19(3):278-285.
20. Katon WJ, Lin E, Russo J, Unützer J. Increased medical costs of a population-based sample of depressed elderly patients. Arch Gen Psychiatry. 2003;60(9):897-903.
21. Stuart B, Zuckerman I, Doshi J, Shea D, Shaffer T, Zhao L. Medication Use by Aged and Disabled Medicare Beneficiaries Across the Spectrum of Morbidity: A Chartbook. Baltimore, MD: Peter Lamy Center on Drug Therapy and Aging, University of Maryland School of Pharmacy; May 9, 2007.
22. Katon W, Gonzales J. A review of randomized trials of psychiatric consultation-liaison studies in primary care. Psychosomatics. 1994;35(3):268-278.
23. Pignone MP, Gaynes BN, Rushton JL, et al. Screening for depression in adults: a summary of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med. 2002;136(10):765-776.
We thank the many coworkers who supported our efforts, including the personal nurses at Green Ribbon Health, LLC and our data manager, Mark Bingener.
Author Affiliations: Green Ribbon Health, LLC, Tampa, FL (JKT, DMH); Mental Health Services, Epidemiology, and Economics, Division of Services and Intervention Research, National Institute of Mental Health, Bethesda, MD (MS); Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle (WJK, JU); and Department of Psychiatry, Columbia University, New York, NY (HAP).
Funding Source: This study was supported by grant 1 R01 MH75159 from the National Institute of Mental Health.
Author Disclosures: Dr Pincus reports receiving payment for speaking engagements (none of which involve specific products) for Bimark Center for Medical Education, Cardinal Health, Inc, Comprehensive NeuroScience, Inc, and HealthPartners. Dr Pincus consults for Bristol-Myers Squibb, CO, Cisco Systems, Inc, UPMC Health Plan, Inc, Magellan Health Care, and the Urban Institute. The other authors (JKT, MS, WJK, DMH, JU) report no relationship or financial interest with any entity that would pose a conflict of interest with the subject matter of this article.
Authorship Information: Concept and design (JKT, MS, WJK, HAP, DMH, JU); acquisition of data (JKT); analysis and interpretation of data (JKT, MS, WJK, HAP, DMH, JU); drafting of the manuscript (JKT, MS, WJK, DMH, JU); critical revision of the manuscript for important intellectual content (JKT, MS, WJK, HAP, DMH, JU); statistical analysis (JKT, MS); obtaining funding (MS, HAP, JU); administrative, technical, or logistic support (DMH, JU); and supervision (JU).
Address correspondence to: Jennifer K. Taylor, PhD, Green Ribbon Health, LLC, 5201 W Kennedy Blvd, Ste 205, Tampa, FL 33609. E-mail: