Classification of Breast Cancer versus Normal Samples from Mass Spectrometry Profiles Using Linear Discriminant Analysis of Important Features Selected by Random Forest
We present our approach to classifying the processed proteomic data made available to participants in the classification competition. Although classifying the spectra was the goal of the competition, we feel that proteomic applications in cancer biomarker studies make additional demands. One such requirement is the identification of a set of features that collectively differentiate the two groups of samples. Ideally, this feature set should also be small. To that end, we propose a linear discriminant classifier based on nine m/z intensity values. The construction and performance of this classifier are discussed.
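The two-stage idea named in the title (rank features with a random forest, then fit a linear discriminant on a small subset) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the competition spectra, and the helper `select_top_features`, the number of trees, the impurity-based importance measure, and the train/test split are all hypothetical choices made only for illustration. Only the subset size of nine comes from the text.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def select_top_features(X, y, n_features=9, n_trees=500, seed=0):
    """Return column indices of the n_features highest-ranked features.

    Hypothetical helper: ranks features by scikit-learn's impurity-based
    random forest importance, which is an assumption, not the paper's
    stated importance measure.
    """
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X, y)
    order = np.argsort(forest.feature_importances_)[::-1]  # most important first
    return np.sort(order[:n_features])


# Synthetic stand-in for the spectra: rows are samples, columns are m/z bins.
rng = np.random.default_rng(0)
n_samples, n_mz = 200, 100
y = np.repeat([0, 1], n_samples // 2)      # 0 = normal, 1 = cancer (assumed coding)
X = rng.normal(size=(n_samples, n_mz))
X[np.ix_(y == 1, np.arange(9))] += 2.0     # make the first nine bins informative

# Hold out a test set so feature selection never sees the evaluation samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

selected = select_top_features(X_train, y_train, n_features=9)
lda = LinearDiscriminantAnalysis().fit(X_train[:, selected], y_train)
accuracy = lda.score(X_test[:, selected], y_test)

print("selected m/z columns:", selected)
print("held-out accuracy:", accuracy)
```

Selecting features on the training portion only, as in this sketch, avoids the optimistic bias that arises when features are chosen using the same samples later used to estimate accuracy.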