ABSTRACT
A set of 20 short tandem repeats (STRs) is used by the United States criminal justice system to identify suspects and to maintain a database of genetic profiles for individuals who have previously been convicted or arrested. Some of these STRs were identified in the 1990s, with a preference for markers in putative gene deserts so that forensic profiles would not reveal protected medical information. We revisit that assumption, investigating whether forensic genetic profiles reveal information about gene expression variation or potential medical information. We find six significant correlations (FDR = 0.23) between the forensic STRs and the expression levels of neighboring genes in lymphoblastoid cell lines. We explore possible mechanisms for these associations: in three cases the evidence is compatible with the forensic STR causing expression variation or being in linkage disequilibrium (LD) with a causal locus, and in the other three cases the associations are weaker or potentially spurious. Together, these results suggest that forensic genetic loci may reveal expression levels and, perhaps, medical information.
Competing Interest Statement
The authors have declared no competing interest.
Funding Statement
San Francisco State University students Mayra Banuelos, Jhony Zavaleta, Alennie Roldan, and Miguel Guardado were supported by the SFSU MBRS-RISE Fellowships (R25-GM059298), SFSU MARC Fellowships (T34-GM008574), Bridges Fellowships (R25-GM048972), and Genentech Foundation Fellowships. Funds through NIH-R35GM128946-01 supported Emilia Huerta Sanchez, and funds through the Brown University Predoctoral Training Program in Biological Data Science (NIH T32 GM128596) supported Mayra Banuelos. The joint sixth authors were supported through the Big Data Summer Program, funded by NIH 5R25MD011714-03.
Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
No IRB approval or exemption was needed since the only data considered in the study are published results.
All necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.
Yes
Data Availability
All data analyzed in the study are already published. The sources are referenced, and links to the specific tables are included in the "Online Resources" section of the manuscript.
http://gymreklab.com/2018/03/05/snpstr_imputation.html
https://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/
https://github.com/HipSTR-Tool/HipSTR-references/blob/master/human/hg19.hipstr_reference.bed.gz