August 2, 2023 | 12 min read

The Case for Passive Digital Phenotyping in Mental Health

A review of why traditional mental health assessment and self-report apps fall short, and how passive digital phenotyping — using behavioural data from smartphones and wearables — offers a scalable, objective alternative supported by evidence from 46 studies.

Vidusha Tewari, Sugam Budhraja

Introduction

Depression and anxiety are among the most prevalent health conditions worldwide. Anxiety disorders currently affect approximately 278 million people globally and rank as the sixth most disabling class of disorders (Kessler & Greenberg, 2002; Baxter et al., 2014). Depression affects approximately 280 million people worldwide (World Health Organization, 2021). Together, these conditions carry an enormous cost, not only in individual suffering but also in lost productivity, strained relationships, and healthcare expenditure (Wade, 2012; Kinderman et al., 2015).

Nearly 50% of individuals who die by suicide have a prior history of depression, and individuals with depression are 25 times more likely to die by suicide than the general population (Centre for Suicide Prevention, 2014). The COVID-19 pandemic has compounded the problem, with evidence showing that survivors are at increased risk of developing neurological and psychiatric conditions including anxiety and mood disorders (Taquet et al., 2021).

Effective assessment is the foundation of effective treatment. Yet current methods face fundamental limitations in both clinical and consumer health settings. This review examines those limitations — in traditional clinical assessment, in self-report mental health apps, and in active data entry approaches — and makes the case for passive digital phenotyping as a scalable, objective, and evidence-supported alternative.

Limitations of Traditional Clinical Assessment

Current psychiatric assessment relies heavily on the clinician’s subjective judgement, guided by the criteria of the Diagnostic and Statistical Manual of Mental Disorders, 5th edition (DSM-5; American Psychiatric Association, 2013) and the International Statistical Classification of Diseases, 10th revision (ICD-10; World Health Organization, 2010). While these frameworks provide structured diagnostic criteria, the assessment process itself remains time-consuming, resource-intensive, and often inaccessible.

Poor diagnosis or misdiagnosis of mental health disorders — which occurs at non-trivial rates in primary care settings (Vermani et al., 2011) — can lead to inadequate treatment plans that cost individuals time, money, and continued suffering. Many individuals face barriers to accessing clinical assessment altogether, whether due to cost, availability of specialists, geographic constraints, or stigma.

There is a clear need for objective, scalable methods to complement clinical assessment — methods that do not depend on a face-to-face clinical encounter to provide meaningful insight into an individual’s mental state.

The Rise of Mental Health Apps — And Their Limitations

The gap between the demand for mental health services and their availability has driven rapid growth in mental health and wellness applications. Clinicians and researchers have shown interest in these tools, viewing them as a way to help individuals manage their mental health with greater ease and accessibility (Chan & Honey, 2022; Van Ameringen et al., 2017).

Mental health apps offer several apparent advantages. They are accessible whenever the user has their device. They can track health data such as sleep, mood, and physical activity in near real-time. And they have the potential to identify behavioural patterns and triggers that inform self-management (Schueller et al., 2021).

However, a growing body of evidence reveals fundamental problems with how most of these apps collect data.

Users avoid recording negative states

In a semi-structured interview study, Schueller et al. (2021) found that while participants were interested in understanding their negative moods, many became disheartened by frequently seeing the results. This led some to avoid recording negative moods altogether, reducing the accuracy of the collected data and undermining the app’s ability to provide a true reflection of their mental state.

Depression itself undermines engagement

The symptoms of depression — including low motivation, fatigue, and diminished interest — directly interfere with a user’s ability to engage with an app that requires active input. Torous et al. (2018) identified this as a core challenge: users who most need monitoring are least likely to provide it. Additional barriers include device fatigue, technical difficulties, and setup friction (Torous et al., 2018; Stiles-Shields et al., 2017).

Self-report data is inherently biased

Even when users do engage consistently, the accuracy of self-reported data is questionable. Perez-Pozuelo et al. (2021) note that self-report measures of sleep and physical activity often generate biased data that provides only a partial picture of actual behaviour. Subjective perception of sleep quality, for example, frequently diverges from objectively measured sleep architecture.

The implication is clear: mental health applications that depend on active data entry will systematically undercount the users who need help the most, and will collect distorted data from those who do engage.

Digital Phenotyping: A Passive, Objective Alternative

Digital phenotyping, a term introduced by Onnela & Rauch (2016) and Insel (2017), refers to the moment-by-moment quantification of the individual-level human phenotype using data from smartphones and other personal digital devices. In practice, this means using passively collected behavioural data (step counts, sleep patterns, device usage, location patterns, and communication metadata) to infer aspects of physical and mental health without requiring any active input from the user.

The approach addresses the core limitations of self-report methods. There is no data entry to avoid, no motivation threshold to overcome, and no subjective bias in the recorded signals. The ubiquitous nature of smartphones means that data collection can occur continuously and at scale, covering populations that would never engage with a traditional screening questionnaire or clinical assessment.

Passive data collection minimises both the time constraints and accessibility barriers that limit current approaches, as the user does not need to actively engage in any activity or interact with a clinician to generate meaningful behavioural data.

Evidence for Passive Behavioural Markers of Depression

The scientific basis for inferring mental states from passively collected behavioural data is substantial and growing. Three bodies of evidence are particularly relevant.

Systematic review: 46 studies of mobile and wearable behavioural features

Rohani et al. (2018) conducted a systematic review of 46 studies examining correlations between objective behavioural features collected from mobile and wearable devices and depressive mood symptoms in patients with unipolar (major depressive disorder) and bipolar depression. The review included studies using the PHQ-9, HDRS, CES-D, BDI-21, DASS-21, and other validated mood scales, and covered features spanning physical activity, location, social interaction, device usage, sleep, voice, environment, and biometric signals.

Key findings for non-clinical participants:

  • Device usage (screen time, app usage, lock/unlock frequency) had the highest percentage of statistically significant correlations with depressed mood.
  • Sleep and voice features had a high percentage of statistically significant correlations.
  • Physical activity showed consistent negative associations — increased activity correlated with lower depression scores, while depression was associated with lower daytime activity but higher nighttime activity.
  • Staying at home and screen time showed strong positive correlations with depression scores across studies.
  • Social features (call duration and frequency) had the lowest percentage of statistically significant correlations.

For clinical participants, the patterns shifted:

  • Screen active duration had the highest positive correlations with depression scores.
  • Sleep duration was positively correlated with depression in clinical populations — suggesting that longer sleep duration (hypersomnia) may be a marker in individuals already experiencing depression, in contrast to the negative association seen in non-clinical populations.
  • Social interaction patterns revealed a directional asymmetry: depressed individuals were more likely to have longer incoming calls and shorter outgoing calls, suggesting withdrawal from social initiation.
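
Several of the features above can be derived from simple device event logs. The sketch below computes two illustrative features, total screen-on time and the incoming/outgoing call-duration asymmetry; the log formats and values are invented for demonstration and do not come from any cited study.

```python
from datetime import datetime

# Hypothetical event logs; field names and values are illustrative
# assumptions, not from any study's actual data schema.
screen_events = [  # (timestamp, state)
    ("2023-08-01 09:00", "on"), ("2023-08-01 09:45", "off"),
    ("2023-08-01 22:10", "on"), ("2023-08-02 01:20", "off"),
]
calls = [  # (direction, duration in seconds)
    ("incoming", 540), ("incoming", 420),
    ("outgoing", 60), ("outgoing", 45),
]

def screen_active_minutes(events):
    """Total screen-on time, pairing each 'on' with the next 'off'."""
    fmt = "%Y-%m-%d %H:%M"
    total, on_time = 0.0, None
    for ts, state in events:
        t = datetime.strptime(ts, fmt)
        if state == "on":
            on_time = t
        elif on_time is not None:
            total += (t - on_time).total_seconds() / 60
            on_time = None
    return total

def call_asymmetry(calls):
    """Mean incoming minus mean outgoing call duration (seconds).
    A positive value mirrors the review's finding of longer incoming
    and shorter outgoing calls among depressed individuals."""
    inc = [d for direction, d in calls if direction == "incoming"]
    out = [d for direction, d in calls if direction == "outgoing"]
    return sum(inc) / len(inc) - sum(out) / len(out)
```

Note that the screen-time feature deliberately spans midnight, since nighttime device use is itself one of the signals the review highlights.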

Ambulatory mood assessment and mood dynamics

Burchert et al. (2021) examined whether short, frequent mood ratings collected via a smartphone app (Moodpath) could match the PHQ-9 in screening for depression. In a study of 113 participants assessed three times daily over 14 days, the researchers found strong correlations between PHQ-9 scores and momentary app-assessed depression scores.

A lower 14-day mood average was strongly associated with higher PHQ-9 scores (r = −0.66, p < 0.001) and higher Moodpath depression scores (r = −0.82, p < 0.001). Beyond average mood, higher mood instability and mood variability — indicators of emotional dysregulation — were also associated with depression symptoms. This suggests that the dynamics of mood over time, not just the level, carry diagnostic value.
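
These mood-dynamics measures can be made concrete with a small sketch. Assuming a hypothetical series of momentary mood ratings (the values and the 0-10 scale are invented), mood variability is commonly operationalised as the within-person standard deviation, and mood instability as the mean squared successive difference (MSSD), which weights rapid moment-to-moment swings more heavily:

```python
import statistics

# Hypothetical momentary mood ratings (3/day over several days) on an
# illustrative 0-10 scale; values are made up for demonstration.
ratings = [6, 5, 6, 4, 5, 3, 4, 4, 5, 3, 2, 4, 5, 4, 3]

# Mood level: the plain average over the assessment window.
mood_average = statistics.mean(ratings)

# Mood variability: within-person (sample) standard deviation.
mood_variability = statistics.stdev(ratings)

# Mood instability: mean squared successive difference (MSSD).
mssd = statistics.mean(
    (b - a) ** 2 for a, b in zip(ratings, ratings[1:])
)
```

Two series can share the same average and standard deviation yet differ in MSSD, which is why instability carries information beyond level and variability.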

Multimodal passive sensing at scale

Nickels et al. (2021) conducted a study with 600 participants (480 treatment-seeking individuals with depression, 120 non-depressed controls) using an Android smartphone app that collected data from over 20 passive sensors, including accelerometer, ambient audio, gyroscope, screen state, step count, physical activity level, and Bluetooth. Participants also completed daily active surveys about sleep, functioning, and activity, along with weekly PHQ-9 assessments over 14 weeks.

Univariate correlations of 34 behavioural features showed promising associations with PHQ-9 scores. Passive features such as physical activity minutes, social app usage, and environmental noise levels provided clinicians with potential objective signals of patient behaviour and circumstances that self-report alone cannot capture.
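
A univariate screen of this kind reduces to computing one correlation coefficient per feature across participants. A minimal sketch with invented per-participant data (the feature values and PHQ-9 scores below are illustrative only, not from the study):

```python
import math

# Hypothetical per-participant values: a single passive feature
# (e.g. daily physical-activity minutes) and a PHQ-9 score.
activity_minutes = [55, 30, 70, 20, 45, 10, 60, 25]
phq9_scores = [6, 12, 4, 15, 9, 18, 5, 14]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# By construction this toy data yields a strong negative correlation,
# matching the reviewed finding that more activity tracks lower scores.
r = pearson_r(activity_minutes, phq9_scores)
```

In practice a study screening 34 features would also correct the resulting p-values for multiple comparisons before reporting significance.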

Implications

Taken together, these findings demonstrate that passively collected behavioural data from smartphones and wearables can reliably detect meaningful variation in depressive symptom severity. The evidence spans non-clinical and clinical populations, multiple validated mood instruments, and diverse behavioural features.

The implications extend across several use cases:

  • Early detection: Continuous passive monitoring can identify behavioural changes — such as decreased physical activity, increased screen time, disrupted sleep, or reduced social initiation — that precede self-reported symptom worsening.
  • Treatment monitoring: Passive data can complement clinical assessments by providing objective, continuous measures of behavioural change during treatment, rather than relying on periodic self-report.
  • Population-level screening: Because passive collection does not require user engagement, it can provide meaningful signals for entire user populations, including individuals who would never complete a PHQ-9 or engage with a mood tracker.
  • Reducing assessment burden: For individuals already managing mental health conditions, eliminating the requirement for active data input removes a barrier that disproportionately affects those with the highest symptom burden.

Conclusion

Traditional clinical assessment is subjective, resource-intensive, and inaccessible for large segments of the population. Self-report mental health apps, while well-intentioned, suffer from systematic data quality issues driven by user avoidance, motivational barriers inherent to the very conditions they aim to measure, and the biases of subjective recall.

Digital phenotyping offers a fundamentally different approach: continuous, objective, and passive collection of behavioural signals that correlate with mental health outcomes across dozens of studies. The smartphone — already carried by billions — becomes not a questionnaire delivery device, but a behavioural sensor capable of detecting meaningful health signals without requiring a single tap from the user.

As the evidence base matures and the technology improves, passive digital phenotyping is positioned to complement clinical assessment, enhance mental health apps, and enable scalable early detection at a level that active data collection methods cannot achieve.

References

  1. American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.).
  2. Baxter, A. J., Vos, T., Scott, K. M., Ferrari, A. J., & Whiteford, H. A. (2014). The global burden of anxiety disorders in 2010. Psychological Medicine, 44(11), 2363–2374. https://doi.org/10.1017/S0033291713003243
  3. Burchert, S., Kerber, A., Zimmermann, J., & Knaevelsrud, C. (2021). Screening accuracy of a 14-day smartphone ambulatory assessment of depression symptoms and mood dynamics in a general population sample: Comparison with the PHQ-9 depression screening. PLOS ONE, 16(1), e0244955. https://doi.org/10.1371/journal.pone.0244955
  4. Centre for Suicide Prevention. (2014). Depression and suicide prevention. https://www.suicideinfo.ca/resource/depression-suicide-prevention/
  5. Chan, A. H. Y., & Honey, M. L. (2022). User perceptions of mobile digital apps for mental health: Acceptability and usability — an integrative review. Journal of Psychiatric and Mental Health Nursing, 29(1), 147–168. https://doi.org/10.1111/jpm.12744
  6. Insel, T. R. (2017). Digital phenotyping: technology for a new science of behavior. JAMA, 318(13), 1215–1216. https://doi.org/10.1001/jama.2017.11295
  7. Kessler, R. C., & Greenberg, P. E. (2002). The economic burden of anxiety and stress disorders. Neuropsychopharmacology: The Fifth Generation of Progress, 67, 981–992.
  8. Kinderman, P., Tai, S., Pontin, E., Schwannauer, M., Jarman, I., & Lisboa, P. (2015). Causal and mediating factors for anxiety, depression and well-being. British Journal of Psychiatry, 206(6), 456–460. https://doi.org/10.1192/bjp.bp.114.147553
  9. Nickels, S., Edwards, M. D., Poole, S. F., Winter, D., Gronsbell, J., Rozenkrants, B., Miller, D. P., Fleck, M., McLean, A., Peterson, B., & Chen, Y. (2021). Toward a mobile platform for real-world digital measurement of depression: User-centered design, data quality, and behavioral and clinical modeling. JMIR Mental Health, 8(8), e27589. https://doi.org/10.2196/27589
  10. Onnela, J. P., & Rauch, S. L. (2016). Harnessing smartphone-based digital phenotyping to enhance behavioral and mental health. Neuropsychopharmacology, 41(7), 1691–1696. https://doi.org/10.1038/npp.2016.7
  11. Perez-Pozuelo, I., Spathis, D., Clifton, E. A., & Mascolo, C. (2021). Wearables, smartphones, and artificial intelligence for digital phenotyping and health. In Digital Health (pp. 33–54). Elsevier. https://doi.org/10.1016/B978-0-12-820077-3.00003-1
  12. Rohani, D. A., Faurholt-Jepsen, M., Kessing, L. V., & Bardram, J. E. (2018). Correlations between objective behavioral features collected from mobile and wearable devices and depressive mood symptoms in patients with affective disorders: systematic review. JMIR mHealth and uHealth, 6(8), e165. https://doi.org/10.2196/mhealth.9691
  13. Schueller, S. M., Neary, M., Lai, J., & Epstein, D. A. (2021). Understanding people’s use of and perspectives on mood-tracking apps: interview study. JMIR Mental Health, 8(8), e29368. https://doi.org/10.2196/29368
  14. Stiles-Shields, C., Montague, E., Lattie, E. G., Kwasny, M. J., & Mohr, D. C. (2017). What might get in the way: barriers to the use of apps for depression. Digital Health, 3. https://doi.org/10.1177/2055207617713827
  15. Taquet, M., Geddes, J. R., Husain, M., Luciano, S., & Harrison, P. J. (2021). 6-month neurological and psychiatric outcomes in 236,379 survivors of COVID-19: a retrospective cohort study using electronic health records. The Lancet Psychiatry, 8(5), 416–427. https://doi.org/10.1016/S2215-0366(21)00084-5
  16. Torous, J., Nicholas, J., Larsen, M. E., Firth, J., & Christensen, H. (2018). Clinical review of user engagement with mental health smartphone apps: evidence, theory and improvements. Evidence-Based Mental Health, 21(3), 116–119. https://doi.org/10.1136/eb-2018-102891
  17. Van Ameringen, M., Turna, J., Khalesi, Z., Pullia, K., & Patterson, B. (2017). There is an app for that! The current state of mobile applications (apps) for DSM-5 obsessive-compulsive disorder, posttraumatic stress disorder, anxiety and mood disorders. Depression and Anxiety, 34(6), 526–539. https://doi.org/10.1002/da.22657
  18. Vermani, M., Marcus, M., & Katzman, M. A. (2011). Rates of detection of mood and anxiety disorders in primary care: a descriptive, cross-sectional study. The Primary Care Companion to CNS Disorders, 13(2). https://doi.org/10.4088/PCC.10m01013
  19. Wade, A. G. (2012). The economic burden of anxiety and depression. Medicographia, 34(3), 300–306.
  20. World Health Organization. (2010). International Statistical Classification of Diseases and Related Health Problems (10th Revision). https://icd.who.int/browse10/2010/en
  21. World Health Organization. (2021). Depression. https://www.who.int/news-room/fact-sheets/detail/depression