PT - JOURNAL ARTICLE
AU - Mathew V Kiang
AU - Jarvis T Chen
AU - Nancy Krieger
AU - Caroline O Buckee
AU - Monica J Alexander
AU - Justin T Baker
AU - Randy L Buckner
AU - Garth Coombs III
AU - Janet W Rich-Edwards
AU - Kenzie W Carlson
AU - Jukka-Pekka Onnela
TI - Sociodemographic Characteristics of Missing Data in Digital Phenotyping
AID - 10.1101/2020.12.29.20249002
DP - 2021 Jan 01
TA - medRxiv
PG - 2020.12.29.20249002
4099 - http://medrxiv.org/content/early/2021/01/04/2020.12.29.20249002.short
4100 - http://medrxiv.org/content/early/2021/01/04/2020.12.29.20249002.full
AB - The ubiquity of smartphones, with their increasingly sophisticated array of sensors, presents an unprecedented opportunity for researchers to collect diverse, temporally-dense data about human behavior while minimizing participant burden. Researchers increasingly make use of smartphone applications for “digital phenotyping,” the collection of phone sensor and log data to study the lived experiences of subjects in their natural environments. While digital phenotyping has shown promise in fields such as psychiatry and neuroscience, there are fundamental gaps in our knowledge about data collection and non-collection (i.e., missing data) in smartphone-based digital phenotyping. Here, we show that digital phenotyping presents a viable method of data collection, over long time periods, across diverse study participants with a range of sociodemographic characteristics. We examined accelerometer and GPS sensor data of 211 participants, amounting to 29,500 person-days of observation, using Bayesian hierarchical negative binomial regression. We found that iOS users had higher rates of accelerometer non-collection but lower GPS non-collection than Android users. For GPS data, rates of non-collection did not differ by race/ethnicity, education, age, or gender. For accelerometer data, Black participants had higher rates of non-collection while Asian participants had slightly lower non-collection.
For both sensors, non-collection increased by 0.5% to 0.9% per week. These results demonstrate the feasibility of using smartphone-based digital phenotyping across diverse populations, for extended periods of time, and within diverse cohorts. As smartphones become increasingly embedded in everyday life, the insights of this study will help guide the design, planning, and analysis of digital phenotyping studies.

Competing Interest Statement
JPO is a co-founder of a recently founded company on digital phenotyping. JTB has received consulting fees from Verily Life Sciences and Mindstrong, Inc. for unrelated work. All other authors have no conflicts of interest to disclose.

Funding Statement
JPO, MVK, and KWC received support from the National Institutes of Health (DP2MH103909). GC III received support from the National Institutes of Health (T90DA022759) and The Sackler Scholar Programme in Psychobiology. JPO and JWR-E received support from Harvard Catalyst (3UL1TR001102). MVK received support from the National Institute on Drug Abuse (K99DA051534). JTB received support from the National Institute of Mental Health (U01MH116925).
The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of this manuscript.

Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
Each study received institutional review board (IRB) approval from their respective institutions for data collection (Table S2); another IRB approved by Harvard University governed the secondary analysis of the collected Beiwe data.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes

While this research used only metadata (e.g., timestamps of GPS pings rather than coordinates), dates of participant activity can be considered personally identifiable information; therefore, the data cannot be shared publicly. Data are available upon request, contingent upon appropriate IRB approvals or exemptions from participating institutions. While not the raw data, these data will provide sufficient information to reproduce our results (e.g., by shifting and/or adding noise to timestamps and re-randomizing user identifiers).
Replication code can be found at https://github.com/mkiang/beiwe_missing_data or https://github.com/onnela-lab/beiwe_missing_data (Supplementary Information Text S3). The Beiwe platform is open source and publicly available (Supplementary Information Text S1).