High-throughput neuroimaging-genetics computational infrastructure

Ivo D Dinov; Petros Petrosyan; Zhizhong Liu; Paul Eggert; Sam Hobel; Paul Vespa; Seok Woo Moon; John D Van Horn; Joseph Franco; Arthur W Toga

doi:10.3389/fninf.2014.00041

Front Neuroinform. 2014 Apr 23;8:41. doi: 10.3389/fninf.2014.00041. eCollection 2014.

Affiliations

¹ Laboratory of Neuro Imaging, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA; Biomedical Informatics Research Network, Information Sciences Institute, University of Southern California, Los Angeles, CA, USA; Statistics Online Computational Resource, University of Michigan, UMSN, Ann Arbor, MI, USA.
² Laboratory of Neuro Imaging, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA.
³ Laboratory of Neuro Imaging, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA; Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA.
⁴ Brain Injury Research Center, Department of Neurosurgery, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA.
⁵ Department of Neuropsychiatry, Konkuk University School of Medicine, Seoul, Korea.
⁶ Laboratory of Neuro Imaging, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA; Biomedical Informatics Research Network, Information Sciences Institute, University of Southern California, Los Angeles, CA, USA.

Abstract

Many contemporary neuroscientific investigations face significant challenges in terms of data management, computational processing, data mining, and results interpretation. These four pillars define the core infrastructure necessary to plan, organize, orchestrate, validate, and disseminate novel scientific methods, computational resources, and translational healthcare findings. Data management includes protocols for data acquisition, archival, query, transfer, retrieval, and aggregation. Computational processing involves the software, hardware, and networking infrastructure required to handle large amounts of heterogeneous neuroimaging, genetics, clinical, and phenotypic data and meta-data. Data mining refers to the process of automatically extracting data features, characteristics, and associations that are not readily visible through human exploration of the raw dataset. Results interpretation includes scientific visualization, community validation of findings, and reproducible findings. In this manuscript we describe the novel high-throughput neuroimaging-genetics computational infrastructure available at the Institute for Neuroimaging and Informatics (INI) and the Laboratory of Neuro Imaging (LONI) at the University of Southern California (USC). INI and LONI include ultra-high-field and standard-field MRI brain scanners along with an imaging-genetics database for storing the complete provenance of the raw and derived data and meta-data. In addition, the institute provides a large number of software tools for image and shape analysis, mathematical modeling, genomic sequence processing, and scientific visualization. A unique feature of this architecture is the Pipeline environment, which integrates data management, processing, transfer, and visualization.
Through its client-server architecture, the Pipeline environment provides a graphical user interface for designing, executing, monitoring, validating, and disseminating complex protocols that utilize diverse suites of software tools and web-services. These pipeline workflows are represented as portable XML objects, which transfer the execution instructions and user specifications from the client user machine to remote pipeline servers for distributed computing. Using Alzheimer's and Parkinson's data, we provide several examples of translational applications of this infrastructure.

Keywords: Alzheimer's disease; aging; big data; computation solutions; genetics; neuroimaging; pipeline; visualization.
