Finding the needle in a high-dimensional haystack: Canonical correlation analysis for neuroscientists

Neuroimage. 2020 Aug 1:216:116745. doi: 10.1016/j.neuroimage.2020.116745. Epub 2020 Apr 8.

Abstract

The 21st century marks the emergence of "big data" with a rapid increase in the availability of datasets with multiple measurements. In neuroscience, brain-imaging datasets are increasingly accompanied by dozens or hundreds of phenotypic subject descriptors at the behavioral, neural, and genomic levels. The complexity of such "big data" repositories offers new opportunities and poses new challenges for systems neuroscience. Canonical correlation analysis (CCA) is a prototypical family of methods for identifying the links between variable sets from different modalities. Importantly, CCA is well suited to describing relationships across multiple sets of data, such as in recently available big biomedical datasets. Our primer discusses the rationale, promises, and pitfalls of CCA.
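
To make the idea of linking two variable sets concrete, the following is a minimal sketch of classical two-set CCA, not the authors' implementation. It assumes NumPy is available and runs on simulated data; the function name `cca`, the small ridge term `reg`, and the toy "brain" and "behavior" matrices are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def cca(X, Y, reg=1e-8):
    """Classical CCA for X (n x p) and Y (n x q).

    Returns the canonical correlations (descending) and the weight
    matrices A (p x k) and B (q x k), with k = min(p, q).
    `reg` is a tiny ridge term added for numerical stability (an assumption
    of this sketch, not part of the textbook definition).
    """
    # Center each variable set
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Within-set and between-set covariance matrices
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Whiten each set with its Cholesky factor: Sxx = Lx Lx^T, Syy = Ly Ly^T
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # M = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to weights on the original variables
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return s, A, B


# Toy data: one shared latent factor drives both variable sets (hypothetical)
rng = np.random.default_rng(0)
n = 500
latent = rng.standard_normal(n)
brain = np.outer(latent, [1.0, 0.5, 0.0, 0.0, 0.0]) + 0.5 * rng.standard_normal((n, 5))
behavior = np.outer(latent, [1.0, 1.0, 0.0, 0.0]) + 0.5 * rng.standard_normal((n, 4))

corrs, A, B = cca(brain, behavior)

# Canonical variates for the first mode: one projection per variable set
u1 = brain @ A[:, 0]
v1 = behavior @ B[:, 0]
print("canonical correlations:", np.round(corrs, 3))
print("correlation of first variate pair:", round(float(np.corrcoef(u1, v1)[0, 1]), 3))
```

In this simulated setting a single mode should dominate, because only one latent factor is shared between the two sets. With real data, where the number of variables can approach or exceed the number of subjects, the covariance estimates become unstable, and regularized or sparse CCA variants are generally preferred.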

Keywords: Big data; Data science; Deep phenotyping; Machine learning; Modality fusion.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Big Data*
  • Humans
  • Machine Learning*
  • Models, Statistical*
  • Neuroimaging / methods*
  • Neurosciences / methods*