
Machine Learning Reveals Protein Signatures in CSF and Plasma Fluids of Clinical Value for ALS

 

Michael S. Bereman1-3*, Joshua Beri2, Jeffrey R. Enders3, and Tara Nash3

 

1Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695

2Department of Chemistry, North Carolina State University, Raleigh, NC 27695

3Center for Human Health and the Environment, North Carolina State University, Raleigh, NC 27695

*Author for Correspondence

Michael S. Bereman, Ph.D.

Department of Biological Sciences

Center for Human Health and the Environment

North Carolina State University

Raleigh, NC

Phone: 919.515.8520

Email: michaelbereman@ncsu.edu

 

Abstract

We used shotgun proteomics to identify biomarkers of diagnostic and prognostic value in individuals diagnosed with amyotrophic lateral sclerosis (ALS). Matched cerebrospinal fluid (CSF) and plasma samples were subjected to abundant protein depletion and analyzed by nano-flow liquid chromatography coupled to high-resolution tandem mass spectrometry. Label-free quantitation was used to identify differentially abundant proteins between individuals with ALS (n=33) and healthy controls (n=30) in both fluids. In CSF, 118 proteins (p-value < 0.05) and 27 proteins (q-value < 0.05) were significantly altered between ALS and controls. In plasma, 20 proteins (p-value < 0.05) and 0 proteins (q-value < 0.05) were significantly altered between ALS and controls. Proteins involved in complement activation, acute phase response, and retinoid signaling pathways were significantly enriched in the CSF from ALS patients. Subsequently, various machine learning methods were evaluated for disease classification using a repeated Monte Carlo cross-validation approach. A linear discriminant analysis model achieved a median area under the receiver operating characteristic curve of 0.94 with an interquartile range of 0.88-1.0. A three-protein prognostic model (p = 5e-4) explained 49% of the variation in the ALS Functional Rating Scale (ALS-FRS) scores. Finally, we validated the specificity of two promising proteins from our discovery data set, chitinase-3 like 1 protein and alpha-1-antichymotrypsin, using targeted proteomics in a separate set of CSF samples derived from individuals diagnosed with ALS (n=15) and other neurodegenerative diseases (n=15). These results demonstrate the potential of a panel of targeted proteins for objective measurements of clinical value in ALS.
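To make the evaluation scheme concrete, the following is a minimal sketch of repeated Monte Carlo cross-validation around a two-class linear discriminant analysis classifier, summarized by the median and interquartile range of the area under the ROC curve. It is not the study's pipeline. The abstract does not state the number of repeats, the train/test split, or the features used, so `n_repeats=200`, `test_fraction=0.3`, the three simulated features, and all function names are assumptions for illustration only. The data are synthetic and merely mirror the group sizes (33 ALS, 30 controls).

```python
import random
import statistics


def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (M[i][n] - tail) / M[i][i]
    return w


def fit_lda(X, y):
    """Two-class LDA direction: w = S^-1 (mu1 - mu0), S = pooled covariance."""
    d = len(X[0])
    groups = {c: [x for x, lab in zip(X, y) if lab == c] for c in (0, 1)}
    means = {c: [sum(col) / len(rows) for col in zip(*rows)]
             for c, rows in groups.items()}
    S = [[0.0] * d for _ in range(d)]
    for c, rows in groups.items():
        for x in rows:
            diff = [xi - mi for xi, mi in zip(x, means[c])]
            for i in range(d):
                for j in range(d):
                    S[i][j] += diff[i] * diff[j] / (len(X) - 2)
    for i in range(d):
        S[i][i] += 1e-6  # small ridge term to keep S invertible
    return solve(S, [m1 - m0 for m1, m0 in zip(means[1], means[0])])


def monte_carlo_cv(X, y, n_repeats=200, test_fraction=0.3, seed=0):
    """Repeated random stratified train/test splits; returns median AUC and IQR."""
    rng = random.Random(seed)
    by_class = {c: [i for i, lab in enumerate(y) if lab == c] for c in (0, 1)}
    aucs = []
    for _ in range(n_repeats):
        test = []
        for idx in by_class.values():
            test += rng.sample(idx, max(1, round(test_fraction * len(idx))))
        held_out = set(test)
        train = [i for i in range(len(y)) if i not in held_out]
        w = fit_lda([X[i] for i in train], [y[i] for i in train])
        scores = [sum(wj * xj for wj, xj in zip(w, X[i])) for i in test]
        aucs.append(auc(scores, [y[i] for i in test]))
    q1, _, q3 = statistics.quantiles(aucs, n=4)
    return statistics.median(aucs), (q1, q3)


# Synthetic stand-in data (NOT study data): 30 controls, 33 cases, 3 features.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(30)]
X += [[data_rng.gauss(1.0, 1.0) for _ in range(3)] for _ in range(33)]
y = [0] * 30 + [1] * 33

median_auc, (q1, q3) = monte_carlo_cv(X, y)
print(f"median AUC = {median_auc:.2f}, IQR = {q1:.2f}-{q3:.2f}")
```

Because each repeat draws a fresh random split rather than a fixed set of folds, the resulting AUC values form a distribution. This is why the performance is reported as a median with an interquartile range rather than as a single number.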