A busy primary care clinic sees 30 patients a day. Screening every one of them for depression is the right thing to do -- the USPSTF recommends it for all adults. But administering a 9-item questionnaire to 30 people who may or may not have depression adds up fast.
What if you could screen everyone with two questions in 30 seconds, then give the full questionnaire only to those who need it?
That's exactly what the PHQ-2 was designed to do. It's the first two questions of the PHQ-9, extracted into a standalone ultra-brief screener. Together, the PHQ-2 and PHQ-9 form a two-step screening approach that's been adopted by health systems worldwide.
The relationship between PHQ-2 and PHQ-9
The PHQ-2 isn't a separate instrument. It's the first two items of the PHQ-9, pulled out for use as a rapid screen. Both tools come from the same research team (Kroenke, Spitzer, and Williams) and the same larger Patient Health Questionnaire instrument.
| Feature | [PHQ-2](/surveys/phq2) | [PHQ-9](/surveys/phq9) |
|---|---|---|
| **Items** | 2 | 9 |
| **Score range** | 0-6 | 0-27 |
| **Clinical cutoff** | 3 | 10 |
| **Sensitivity** | 83% (Kroenke et al., 2003) | 88% (Kroenke et al., 2001) |
| **Specificity** | 92% (Kroenke et al., 2003) | 88% (Kroenke et al., 2001) |
| **Time to complete** | ~30 seconds | ~2-3 minutes |
| **Purpose** | First-step screen / gate | Severity assessment / monitoring |
| **Severity grading** | No (positive/negative only) | Yes (5 levels) |
| **Suicidal ideation item** | No | Yes (item 9) |
| **Diagnostic algorithm** | No | Yes |
| **Functional assessment** | No | Yes |
| **Cost** | Free | Free |
What the PHQ-2 asks
The PHQ-2 consists of exactly two questions, covering the two cardinal symptoms of major depression:
1. "Little interest or pleasure in doing things" (anhedonia)
2. "Feeling down, depressed, or hopeless" (depressed mood)
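Both questions ask how often the symptom has bothered the patient over the last two weeks, with the same response options as the PHQ-9: "Not at all" (0), "Several days" (1), "More than half the days" (2), "Nearly every day" (3). The total score ranges from 0 to 6.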
These aren't random selections. The DSM requires that at least one of these two symptoms be present for a diagnosis of major depressive disorder. If neither is present, major depression is essentially ruled out. That's what makes the PHQ-2 work as a gate: it targets the two symptoms without which the diagnosis can't be made.
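The scoring rule is simple enough to sketch in a few lines. The snippet below is only an illustration: the names `RESPONSE_POINTS` and `phq2_score` are made up for this example and don't come from any official PHQ software.

```python
# Response options and their point values, as listed above.
RESPONSE_POINTS = {
    "Not at all": 0,
    "Several days": 1,
    "More than half the days": 2,
    "Nearly every day": 3,
}


def phq2_score(interest_answer: str, mood_answer: str) -> int:
    """Sum the two item scores; the result is always between 0 and 6."""
    return RESPONSE_POINTS[interest_answer] + RESPONSE_POINTS[mood_answer]


print(phq2_score("Several days", "More than half the days"))  # 3
```

An unrecognized answer raises a `KeyError`, which is usually preferable to silently scoring it as zero.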
How the two-step model works
The stepped screening approach follows a simple logic:
Step 1: Administer the PHQ-2 to everyone
Every patient completes two questions. This takes about 30 seconds and can be done on a check-in form, a tablet in the waiting room, or verbally by a medical assistant.
If the score is 0-2 (negative screen): Depression is unlikely. No further depression screening is needed at this visit. The patient continues with their appointment as planned.
If the score is 3-6 (positive screen): Depression is possible. Proceed to Step 2.
Step 2: Administer the PHQ-9 to positive screens
Patients who scored 3 or higher on the PHQ-2 now complete the full PHQ-9. Since the PHQ-2 items are the first two items of the PHQ-9, some systems simply have the patient complete the remaining seven questions. Others administer the full PHQ-9 from scratch.
The PHQ-9 provides:
- A severity score (0-27) with five clinical bands
- Treatment guidance tied to severity
- A suicidal ideation screen (item 9)
- A functional impairment assessment
- A baseline for monitoring treatment response
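As a rough sketch of what the second step produces, the function below totals nine item scores, maps the total to a severity band, and flags item 9. The band boundaries are the commonly published PHQ-9 bands (this article names only the moderate and severe ranges explicitly), and treating any non-zero item 9 answer as a flag is an assumption made for illustration. `summarize_phq9` is a hypothetical helper name.

```python
# Commonly published PHQ-9 severity bands: (lower bound, upper bound, label).
SEVERITY_BANDS = [
    (0, 4, "minimal"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 27, "severe"),
]


def summarize_phq9(items: list[int]) -> dict:
    """Return total, severity band, and an item-9 flag for nine 0-3 answers."""
    if len(items) != 9 or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError("PHQ-9 needs nine item scores, each 0-3")
    total = sum(items)
    band = next(label for lo, hi, label in SEVERITY_BANDS if lo <= total <= hi)
    return {
        "total": total,
        "severity": band,
        "meets_cutoff": total >= 10,     # clinical cutoff of 10
        "item9_endorsed": items[8] > 0,  # assumption: any non-zero answer is flagged
    }


print(summarize_phq9([2, 2, 1, 1, 2, 1, 1, 1, 0]))
```

The example answers total 11, which falls in the moderate band and meets the cutoff without endorsing item 9.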
The math behind stepped screening
In a general primary care population, roughly 10-15% of patients will screen positive on the PHQ-2. This means the full PHQ-9 is administered to only a fraction of patients, saving significant time.
For a clinic seeing 30 patients per day:
- Without stepped screening: 30 patients complete the PHQ-9 (about 90 minutes of total patient time at roughly 3 minutes each)
- With stepped screening: 30 patients complete the PHQ-2 (about 15 minutes), and roughly 3-5 positive screens complete the PHQ-9 (about 10-15 minutes). Total: about 25-30 minutes of patient time.
That's a 60-70% reduction in screening burden while maintaining strong diagnostic accuracy.
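The arithmetic can be checked with a short calculation. This is a back-of-the-envelope sketch, not a workload model: it assumes 30 seconds per PHQ-2, and the per-PHQ-9 time and number of positive screens are parameters you can vary. `screening_minutes` is a name invented for this example.

```python
def screening_minutes(patients: int, positives: int,
                      phq9_minutes: float, phq2_minutes: float = 0.5) -> tuple[float, float]:
    """Return (minutes if everyone takes the PHQ-9, minutes with stepped screening)."""
    universal = patients * phq9_minutes
    stepped = patients * phq2_minutes + positives * phq9_minutes
    return universal, stepped


# 30 patients a day, 3 to 5 positive screens, PHQ-9 at 2 or 3 minutes each.
for phq9_minutes in (2.0, 3.0):
    for positives in (3, 5):
        universal, stepped = screening_minutes(30, positives, phq9_minutes)
        saved = 1 - stepped / universal
        print(f"{positives} positives, {phq9_minutes:.0f}-min PHQ-9: "
              f"{stepped:.0f} vs {universal:.0f} min ({saved:.0%} less)")
```

With a 3-minute PHQ-9 and five positive screens, stepped screening takes 30 minutes instead of 90. The exact percentage shifts with the assumed PHQ-9 duration and positive rate, so treat the 60-70% figure as approximate.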
Psychometric analysis: What you gain and lose
PHQ-2 accuracy
At a cutoff of 3 or greater, the PHQ-2 achieves (Kroenke et al., 2003, Medical Care):
- Sensitivity: 83% -- It correctly identifies 83% of people who have major depression
- Specificity: 92% -- It correctly clears 92% of people who don't have depression
The PHQ-2 has also been validated in older adults, where Li et al. (2007) found sensitivity approaching 100% and specificity of 77% in non-institutionalized adults over age 65.
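To make the 83% / 92% figures concrete, here is a sketch that converts them into expected patient counts. The 10% depression prevalence is an assumption chosen for illustration, not a figure from the validation study, and `first_step_counts` is a hypothetical helper.

```python
def first_step_counts(cohort: int, prevalence: float,
                      sensitivity: float = 0.83, specificity: float = 0.92) -> dict:
    """Expected PHQ-2 outcomes (cutoff >= 3) for a cohort of the given size."""
    depressed = round(cohort * prevalence)
    not_depressed = cohort - depressed
    true_pos = round(depressed * sensitivity)
    true_neg = round(not_depressed * specificity)
    return {
        "true_positives": true_pos,
        "false_negatives": depressed - true_pos,
        "false_positives": not_depressed - true_neg,
        "true_negatives": true_neg,
    }


counts = first_step_counts(cohort=1000, prevalence=0.10)  # prevalence is assumed
flagged = counts["true_positives"] + counts["false_positives"]
print(counts)
print(f"flagged for PHQ-9: {flagged} of 1000")
print(f"flagged patients who are depressed: {counts['true_positives'] / flagged:.0%}")
```

Under that assumed prevalence, 83 of 100 depressed patients are flagged and 17 are missed, while 72 non-depressed patients are also flagged. Roughly half of the positive screens are true cases, which is why a positive PHQ-2 calls for follow-up rather than a diagnosis.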
PHQ-9 accuracy
At a cutoff of 10, the PHQ-9 achieves (Kroenke et al., 2001, Journal of General Internal Medicine):
- Sensitivity: 88% -- It correctly identifies 88% of people with major depression
- Specificity: 88% -- It correctly clears 88% of people without depression
Combined stepped approach accuracy
A meta-analysis by Levis et al. (2020) confirmed that the two-step PHQ-2/PHQ-9 approach maintains excellent combined diagnostic accuracy. The sequential logic works like this:
1. PHQ-2 catches most true positives (83% sensitivity) and filters out most true negatives (92% specificity)
2. PHQ-9 further refines the positive screens with its own 88%/88% sensitivity/specificity
The combined false negative rate of the stepped approach is higher than using the PHQ-9 alone, because depressed patients who score below 3 on the PHQ-2 never receive the full screen. This is the trade-off: about 17% of true depression cases are missed at the first step, including some that the PHQ-9 alone would have caught.
What the missed cases look like
The 17% of depressed patients missed by the PHQ-2 tend to share certain characteristics:
- Somatic presentation: Their depression manifests primarily through physical symptoms (fatigue, sleep disruption, appetite changes) rather than mood changes or anhedonia. They may not feel "depressed" per se, but they're exhausted, sleeping poorly, and losing weight.
- Cases near the threshold: Patients with PHQ-9 scores in the 10-14 range (moderate) are more likely to score below 3 on the PHQ-2 than those with more severe depression.
- Depression without anhedonia: While rare, some presentations of depression are dominated by guilt, concentration problems, or psychomotor changes without prominent anhedonia or depressed mood.
For most clinical purposes, this false negative rate is acceptable. But it's important to know it exists.
When the PHQ-2 alone is enough
The PHQ-2 was designed as a gate, not a standalone diagnostic tool. Its original validation study (Kroenke et al., 2003) is explicit: positive screens should be followed by the PHQ-9 or a clinical interview. However, there are situations where the PHQ-2 alone provides sufficient information:
Population-level screening
In large-scale epidemiological studies or public health screenings, the PHQ-2 provides a reasonable estimate of depression prevalence. It's not precise at the individual level, but across thousands of respondents, it provides useful population data.
Quick clinical triage
In emergency departments, urgent care, or other settings where time is extremely limited, a PHQ-2 score of 0 provides reasonable reassurance that major depression is not the acute issue. If the score is positive, it can flag the need for follow-up without requiring immediate full assessment.
Repeated brief check-ins
Some clinicians use the PHQ-2 as a between-visit check-in, reserving the PHQ-9 for formal assessment sessions. A text message or patient portal question with the two items can signal when a patient needs to come in sooner.
When the patient declines further screening
If a patient screens positive on the PHQ-2 but declines the PHQ-9, the PHQ-2 result still provides actionable information. A score of 3+ warrants a clinical conversation about depression, even without the full severity assessment.
When you need the full PHQ-9
Initial assessment of depression
When a patient presents with symptoms suggestive of depression, skip the PHQ-2 and go straight to the PHQ-9. The two-step approach is designed for universal screening of unselected populations, not for evaluating patients who are already reporting depression.
Treatment monitoring
The PHQ-2 cannot track treatment response meaningfully. Its 0-6 range doesn't provide enough resolution to detect the 5-point change that signals clinically meaningful improvement on the PHQ-9. If your patient starts an antidepressant, you need the PHQ-9 every 2-4 weeks to monitor response.
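The resolution problem is easy to see by enumeration. The sketch below applies the PHQ-9's 5-point threshold to every possible pair of PHQ-2 scores; carrying that threshold over to the PHQ-2 is done here only to illustrate the point, and `shows_meaningful_drop` is a name invented for this example.

```python
PHQ9_MEANINGFUL_CHANGE = 5  # point drop cited above for the PHQ-9


def shows_meaningful_drop(baseline: int, follow_up: int,
                          threshold: int = PHQ9_MEANINGFUL_CHANGE) -> bool:
    """True when the score fell by at least `threshold` points."""
    return baseline - follow_up >= threshold


# Every PHQ-2 (baseline, follow-up) pair that could register a 5-point drop.
phq2_pairs = [(b, f) for b in range(7) for f in range(7) if shows_meaningful_drop(b, f)]
print(phq2_pairs)  # [(5, 0), (6, 0), (6, 1)]

print(shows_meaningful_drop(18, 11))  # True: a 7-point PHQ-9 drop
```

Only three score pairs on the PHQ-2 can show a drop that large, and all of them require starting at 5 or 6 and ending at 0 or 1.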
Severity assessment
A PHQ-2 score of 6 tells you depression is very likely. It doesn't tell you whether it's moderate (PHQ-9 score 10-14) or severe (PHQ-9 score 20-27). This distinction matters for treatment planning -- moderate depression might warrant watchful waiting or therapy alone, while severe depression typically requires pharmacotherapy with or without psychotherapy.
Suicidal ideation screening
The PHQ-2 does not include item 9. If suicidal ideation screening is a clinical priority -- and in most settings it should be -- the PHQ-9 is necessary. This is the single most important clinical limitation of the PHQ-2.
Documentation and referral
Insurance requirements, quality metrics, and referral processes typically require the PHQ-9. A PHQ-2 score alone may not satisfy documentation needs for initiating antidepressant therapy or referring to mental health services.
Practical implementation: How clinics do it
Workflow option 1: Paper-based sequential
1. Check-in staff hands every patient a PHQ-2 form
2. Medical assistant scores it before the clinician enters the room
3. If score >= 3, the PHQ-9 is administered (either the remaining 7 items or the full 9)
4. Results are documented in the chart for the clinician
Workflow option 2: Digital pre-screening
1. Patient completes PHQ-2 via patient portal or check-in tablet
2. System automatically presents PHQ-9 if score >= 3
3. Results populate the EHR before the visit
4. Clinician reviews flagged scores during chart review
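The branching in a digital workflow might look something like the sketch below. It is not based on any particular EHR or portal API: `run_stepped_screen` and the `ask_items_3_to_9` callback are hypothetical, and the sketch takes the "remaining seven items" route rather than restarting the PHQ-9.

```python
from typing import Callable

PHQ2_CUTOFF = 3  # standard positive-screen threshold


def run_stepped_screen(
    phq2_items: tuple[int, int],
    ask_items_3_to_9: Callable[[], list[int]],
) -> dict:
    """Score the PHQ-2 and, only on a positive screen, collect items 3-9."""
    phq2_total = sum(phq2_items)
    result = {
        "phq2_total": phq2_total,
        "positive_screen": phq2_total >= PHQ2_CUTOFF,
        "phq9_total": None,  # stays None when the PHQ-9 is not administered
    }
    if result["positive_screen"]:
        remaining = ask_items_3_to_9()
        if len(remaining) != 7:
            raise ValueError("expected seven answers for items 3-9")
        # Items 1 and 2 are counted once: the PHQ-2 total plus items 3-9.
        result["phq9_total"] = phq2_total + sum(remaining)
    return result


# Stand-in for the tablet or portal form that collects the remaining items.
print(run_stepped_screen((2, 2), lambda: [2, 2, 2, 1, 1, 1, 1]))
print(run_stepped_screen((1, 0), lambda: [0] * 7))
```

Passing the follow-up questions as a callback means a negative screen never triggers them, which mirrors how the patient-facing form would behave.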
Workflow option 3: Verbal administration
1. Medical assistant asks the two PHQ-2 questions during vitals
2. If positive, clinician administers the full PHQ-9 during the visit
3. This approach works well when literacy or language barriers make paper forms challenging
Which cutoff to use for the PHQ-2
The standard cutoff of 3 or greater is recommended for most settings. This optimizes the balance between sensitivity (catching true cases) and specificity (avoiding false positives).
Some clinicians use a lower cutoff of 2 or greater in high-risk populations (e.g., postpartum patients, patients with chronic pain, or those with a history of depression). This increases sensitivity at the cost of more false positives, meaning more patients complete the full PHQ-9 unnecessarily. Whether this trade-off is worthwhile depends on the population and available resources.
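A quick way to see the cost of a lower cutoff is to count how many patients each threshold sends on to the PHQ-9. The scores below are made up for illustration and don't represent any real clinic; `positive_screens` is a hypothetical helper.

```python
def positive_screens(phq2_scores: list[int], cutoff: int) -> int:
    """Count patients whose PHQ-2 total meets or exceeds the cutoff."""
    return sum(score >= cutoff for score in phq2_scores)


# Made-up PHQ-2 totals for ten patients.
scores = [0, 0, 1, 2, 2, 3, 4, 0, 1, 6]

print(positive_screens(scores, cutoff=3))  # 3
print(positive_screens(scores, cutoff=2))  # 5
```

In this toy sample, lowering the cutoff from 3 to 2 sends two more patients to the full PHQ-9. Whether those extra screens are worth it depends on how many of them turn out to be true cases in your population.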
For individuals: Which should you take?
If you're wondering whether you might be experiencing depression, here's the practical guide:
Start with the PHQ-2 if you want a very quick initial check. Two questions, 30 seconds. If both answers are "Not at all," depression is unlikely to be a major concern right now.
Go straight to the PHQ-9 if you're already concerned about depression, if you've been treated for depression before, or if you want a more detailed picture. The 2-3 minutes it takes are well worth the additional information.
If your PHQ-2 score is 3 or higher, take the PHQ-9. The two-item version tells you something might be going on, but the nine-item version tells you how significant it is and gives you a number to share with a provider.
For ongoing self-monitoring, the PHQ-9 is more useful. Its wider score range shows trends more clearly. A PHQ-2 score that goes from 4 to 3 doesn't tell you much; a PHQ-9 score that goes from 18 to 11 tells you a lot.
The PHQ-2 as part of a broader screening battery
Many health systems pair the PHQ-2 with other ultra-brief screeners to create an efficient multi-condition screening approach:
- PHQ-2 (depression gate) + GAD-2 (anxiety gate): Four questions covering the two most common mental health conditions. Positive screens on either trigger the corresponding full instrument.
- PHQ-2 + AUDIT-C (alcohol): Captures depression and problematic alcohol use, which commonly co-occur.
- PHQ-2 + PC-PTSD-5 (trauma): Appropriate for populations with high trauma exposure, such as veterans or emergency department patients.
This stacked ultra-brief screening approach can cover multiple conditions in under two minutes, with full assessment reserved for flagged concerns.
Common questions
If the PHQ-2 items are the first two items of the PHQ-9, do I count them twice?
No. If a patient scores positive on the PHQ-2 and then completes the PHQ-9, the total PHQ-9 score includes items 1 and 2. You don't add the PHQ-2 score on top of the PHQ-9 score. Some systems have the patient complete only items 3-9 after a positive PHQ-2, then combine all nine items for the total. Others restart with the full PHQ-9. Either approach yields the same total, provided the patient answers items 1 and 2 the same way both times.
Can I use the PHQ-2 score as part of the PHQ-9 total?
Yes. The PHQ-2 items are identical to PHQ-9 items 1 and 2. If a patient scored 4 on the PHQ-2 (e.g., 2 on each item), and then answers items 3-9 for a subtotal of 10, the PHQ-9 total is 14.
What if my PHQ-2 is negative but I still feel depressed?
This happens in about 17% of cases where depression is present. If you're concerned despite a low PHQ-2 score, take the full PHQ-9. The additional seven items capture symptoms (sleep, appetite, fatigue, concentration, psychomotor changes, self-worth, suicidal thoughts) that might be driving your depression without prominent anhedonia or depressed mood.
How often should I screen with the PHQ-2 vs PHQ-9?
For routine universal screening (e.g., annual physical, new patient visit), the PHQ-2 is efficient and appropriate. For patients with known depression, in active treatment, or at high risk, use the PHQ-9 directly. During active treatment, repeat it every 2-4 weeks.
Is the PHQ-2 validated for adolescents?
The PHQ-9 is validated for ages 12 and above, and the PHQ-2 items are a subset of the PHQ-9. While less extensively studied independently in adolescents, the PHQ-2 is used in adolescent screening protocols, with positive screens followed by the full PHQ-A (adolescent version of the PHQ-9).
Does the PHQ-2 work across cultures?
It has been studied in multiple languages and populations. However, the same caveats that apply to the PHQ-9 apply here: in cultures where depression presents primarily through somatic symptoms rather than mood changes, the PHQ-2's exclusive focus on anhedonia and depressed mood may reduce sensitivity.
The bottom line
The PHQ-2 and PHQ-9 aren't competing tools -- they're partners in a stepped screening strategy. The PHQ-2 is the fast gate: two questions, 30 seconds, high specificity. It efficiently identifies who needs further evaluation. The PHQ-9 is the full assessment: severity grading, treatment guidance, suicidal ideation screening, and a monitoring baseline.
Use the PHQ-2 to screen everyone. Use the PHQ-9 to evaluate those who screen positive. Use the PHQ-9 directly when depression is already on the table. This two-step approach reduces screening burden by 60-70% while maintaining strong diagnostic accuracy -- a practical solution for the real-world challenge of screening at scale.