medRxiv

Sociodemographic Characteristics of Missing Data in Digital Phenotyping

Mathew V Kiang, Jarvis T Chen, Nancy Krieger, Caroline O Buckee, Monica J Alexander, Justin T Baker, Randy L Buckner, Garth Coombs III, Janet W Rich-Edwards, Kenzie W Carlson, Jukka-Pekka Onnela
doi: https://doi.org/10.1101/2020.12.29.20249002
Mathew V Kiang
(a) Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, California USA
Jarvis T Chen
(b) Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, Massachusetts USA
Nancy Krieger
(b) Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, Massachusetts USA
Caroline O Buckee
(c) Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts USA
Monica J Alexander
(d) Department of Sociology, University of Toronto, Toronto, Ontario CAN
(e) Department of Statistical Sciences, University of Toronto, Toronto, Ontario CAN
Justin T Baker
(f) Department of Psychiatry, Harvard Medical School, Boston, Massachusetts USA
(g) Institute for Technology in Psychiatry, McLean Hospital, Belmont, Massachusetts USA
Randy L Buckner
(h) Department of Psychology, Harvard University, Cambridge, Massachusetts USA
(i) Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts USA
(j) Department of Radiology, Massachusetts General Hospital, Boston, Massachusetts USA
Garth Coombs III
(h) Department of Psychology, Harvard University, Cambridge, Massachusetts USA
Janet W Rich-Edwards
(c) Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts USA
(k) Division of Women’s Health, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical, Boston, Massachusetts USA
Kenzie W Carlson
(l) Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts USA
Jukka-Pekka Onnela
(l) Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts USA
For correspondence: onnela@hsph.harvard.edu

Abstract

The ubiquity of smartphones, with their increasingly sophisticated array of sensors, presents an unprecedented opportunity for researchers to collect diverse, temporally dense data about human behavior while minimizing participant burden. Researchers increasingly make use of smartphone applications for “digital phenotyping,” the collection of phone sensor and log data to study the lived experiences of subjects in their natural environments. While digital phenotyping has shown promise in fields such as psychiatry and neuroscience, there are fundamental gaps in our knowledge about data collection and non-collection (i.e., missing data) in smartphone-based digital phenotyping. Here, we show that digital phenotyping presents a viable method of data collection, over long time periods, across diverse study participants with a range of sociodemographic characteristics. We examined accelerometer and GPS sensor data of 211 participants, amounting to 29,500 person-days of observation, using Bayesian hierarchical negative binomial regression. We found that iOS users had higher rates of accelerometer non-collection but lower rates of GPS non-collection than Android users. For GPS data, rates of non-collection did not differ by race/ethnicity, education, age, or gender. For accelerometer data, Black participants had higher rates of non-collection while Asian participants had slightly lower rates of non-collection. For both sensors, non-collection increased by 0.5% to 0.9% per week. These results demonstrate the feasibility of using smartphone-based digital phenotyping across diverse populations and for extended periods of time. As smartphones become increasingly embedded in everyday life, the insights of this study will help guide the design, planning, and analysis of digital phenotyping studies.
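
To give a sense of scale for the reported 0.5% to 0.9% weekly increase in non-collection, the following is a minimal sketch, not the authors' analysis. It assumes the weekly increase acts as a multiplicative rate ratio (1.005 to 1.009 per week), as is typical when exponentiating coefficients from a negative binomial regression; the abstract does not state this explicitly. The function name and the study durations are hypothetical and chosen only for illustration.

```python
def cumulative_increase(weekly_pct: float, weeks: int) -> float:
    """Percent increase in the non-collection rate after `weeks` weeks.

    Assumption (not stated in the abstract): the weekly increase compounds
    multiplicatively, i.e. the rate is multiplied by (1 + weekly_pct / 100)
    every week.
    """
    rate_ratio = 1.0 + weekly_pct / 100.0
    return (rate_ratio ** weeks - 1.0) * 100.0


if __name__ == "__main__":
    # Lower and upper bounds reported in the abstract.
    for weekly_pct in (0.5, 0.9):
        # Hypothetical study durations, in weeks.
        for weeks in (4, 12, 26, 52):
            total = cumulative_increase(weekly_pct, weeks)
            print(f"{weekly_pct}% per week over {weeks:2d} weeks: +{total:.1f}%")
```

Under this assumption, a one-year (52-week) study would see the non-collection rate rise by roughly 30% to 59% relative to its starting value, which is why the time trend matters when planning long follow-up periods.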

Competing Interest Statement

JPO is a co-founder of a recently founded company on digital phenotyping. JTB has received consulting fees from Verily Life Sciences and Mindstrong, Inc. for unrelated work. All other authors have no conflicts of interest to disclose.

Funding Statement

JPO, MVK, and KWC received support from the National Institutes of Health (DP2MH103909). GC III received support from the National Institutes of Health (T90DA022759) and The Sackler Scholar Programme in Psychobiology. JPO and JWR-E received support from Harvard Catalyst (3UL1TR001102). MVK received support from the National Institute on Drug Abuse (K99DA051534). JTB received support from the National Institute of Mental Health (U01MH116925). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of this manuscript.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

Each study received institutional review board (IRB) approval from its respective institution for data collection (Table S2); a separate IRB protocol approved by Harvard University governed the secondary analysis of the collected Beiwe data.

All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

While this research used only metadata (e.g., timestamps of GPS pings rather than coordinates), dates of participant activity can be considered personally identifiable information; therefore, the data cannot be shared publicly. Data are available upon request, contingent upon appropriate IRB approvals or exemptions from participating institutions. While not the raw data, these data will provide sufficient information to reproduce our results (e.g., using timestamps that have been shifted and/or had noise added, and re-randomized user identifiers). Replication code can be found at https://github.com/mkiang/beiwe_missing_data or https://github.com/onnela-lab/beiwe_missing_data (Supplementary Information Text S3). The Beiwe platform is open source and publicly available (Supplementary Information Text S1).

Copyright 
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted January 04, 2021.
Sociodemographic Characteristics of Missing Data in Digital Phenotyping
Mathew V Kiang, Jarvis T Chen, Nancy Krieger, Caroline O Buckee, Monica J Alexander, Justin T Baker, Randy L Buckner, Garth Coombs III, Janet W Rich-Edwards, Kenzie W Carlson, Jukka-Pekka Onnela
medRxiv 2020.12.29.20249002; doi: https://doi.org/10.1101/2020.12.29.20249002
Subject Area

  • Public and Global Health