Abstract
Studies examining relationships between genotypic and phenotypic variation have historically been carried out on people of European ancestry. Efforts are underway to address this limitation, but until they succeed, the legacy of a Euro-centric bias will continue to hinder research, including the use of polygenic scores, which are individual-level metrics of genetic risk. Debate is ongoing over whether polygenic scores based on genome-wide association studies (GWAS) conducted in European ancestry samples generalize to non-European ancestry samples. We analyzed the first decade of polygenic scoring studies (2008-2017, inclusive) and found that 67% of studies included exclusively European ancestry participants and another 19% included only East Asian ancestry participants. Only 3.8% of studies were carried out on samples of African, Hispanic, or Indigenous peoples. We find that effect sizes for European ancestry-derived polygenic scores are only 36% as large in African ancestry samples as in European ancestry samples (t=−10.056, df=22, p=5.5×10⁻¹⁰). Analyzing global populations, we show that relationships between height polygenic scores and height are highly dependent on methodological choices in polygenic score construction, highlighting the need for caution in interpreting population-level differences in distributions of polygenic scores as currently calculated. These findings bolster the rationale for large-scale GWAS in diverse human populations and highlight the need for better handling of linkage disequilibrium and variant frequencies when applying scores to non-European samples.