RT Journal Article
SR Electronic
T1 Associations between forensic loci and neighboring gene expression levels may compromise medical privacy
JF medRxiv
FD Cold Spring Harbor Laboratory Press
SP 2021.07.20.21260897
DO 10.1101/2021.07.20.21260897
A1 Bañuelos, Mayra M.
A1 Zavaleta, Jhony A.
A1 Roldan, Alennie
A1 Reyes, Rochelle-Jan
A1 Guardado, Miguel
A1 Rojas, Berenice Chavez
A1 Nyein, Thet
A1 Vega, Ana Rodriguez
A1 Santos, Maribel
A1 Sanchez, Emilia Huerta
A1 Rohlfs, Rori
YR 2021
UL http://medrxiv.org/content/early/2021/07/23/2021.07.20.21260897.abstract
AB A set of 20 short tandem repeats (STRs) is used by the United States criminal justice system to identify suspects, and to maintain a database of genetic profiles for individuals who have been previously convicted or arrested. Some of these STRs were identified in the 1990s, with a preference for markers in putative gene deserts to avoid forensic profiles revealing protected medical information. We revisit that assumption, investigating whether forensic genetic profiles reveal information about gene expression variation, or potential medical information. We find six significant correlations (FDR = 0.23) between the forensic STRs and the expression levels of neighboring genes in lymphoblastoid cell lines. We explore possible mechanisms for these associations, with evidence compatible with forensic STRs causing expression variation, or being in LD with a causal locus in three cases, and weaker or potentially spurious associations in the other three cases.
Together, these results suggest that forensic genetic loci may reveal expression level and, perhaps, medical information. Competing Interest Statement: The authors have declared no competing interest. Funding Statement: San Francisco State University students Mayra Banuelos, Jhony Zavaleta, Alennie Roldan, and Miguel Guardado were supported by the SFSU MBRS-RISE Fellowships (R25-GM059298), SFSU MARC Fellowships (T34-GM008574), Bridges Fellowships (R25-GM048972), and Genentech Foundation Fellowships. Funds through NIH-R35GM128946-01 supported Emilia Huerta Sanchez; funds through the Brown University Predoctoral Training Program in Biological Data Science (NIH T32 GM128596) supported Mayra Banuelos. The joint sixth authors were supported through the Big Data Summer Program funded by NIH 5R25MD011714-03. Author Declarations: I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes. The details of the IRB/oversight body that provided approval or exemption for the research described are given below: No IRB approval or exemption was needed since the only data considered in the study are published results. All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived. Yes. I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov.
I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes. I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable. Yes. All data analyzed in the study is already published. The specific sources are referenced and links to specific tables are included in the "Online Resources" section of the manuscript. http://gymreklab.com/2018/03/05/snpstr_imputation.html https://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/ https://github.com/HipSTR-Tool/HipSTR-references/blob/master/human/hg19.hipstr_reference.bed.gz https://strbase.nist.gov/str_fact.htm http://hgdownload.cse.ucsc.edu/goldenpath/hg19/encodeDCC/wgEncodeRegDnaseClustered/wgEncodeRegDnaseClusteredV3.bed.gz