Defense: Feature Extraction and Fusion for Supervised and Semi-supervised Classification: Application to fMRI and LTM Data


Dissertation Defense

Feature Extraction and Fusion for Supervised and Semi-supervised
Classification: Application to fMRI and LTM Data

Wei Du

2:00pm Thursday, 24 April 2014, ITE 325B

Extracting powerful features from high-dimensional, noisy data promises to significantly improve the effectiveness of subsequent analysis, especially classification. Since no single feature selection and extraction method or classifier works best on all problems, developing effective and efficient feature selection and extraction methods and classifiers for specific applications has become one of the most active areas in machine learning. The aim of this dissertation is to develop novel data-driven methods for extracting and selecting the most distinguishing features for classification using functional magnetic resonance imaging (fMRI) and laser tread mapping (LTM) tire data.

fMRI data have the potential to characterize and classify various brain disorders, including schizophrenia. However, the high dimensionality and unknown nature of fMRI data present numerous challenges to accurate analysis and interpretation. Independent component analysis (ICA), as a data-driven method, has proven very useful for fMRI analysis in extracting spatial components as multivariate features used in classification and, more recently, for the analysis of fMRI data in its native complex-valued form. In this dissertation, we first present a novel framework to extract powerful features from components estimated by ICA, allowing us to remove redundancy and retain the most discriminative activation patterns from multivariate ICA features. We apply the proposed three-phase feature extraction framework to two real-valued fMRI data sets and achieve high classification rates in discriminating healthy controls from patients with schizophrenia. Second, due to the iterative nature of ICA algorithms, independent components (ICs) are typically not estimated consistently across different ICA runs, and hence it is not clear which result to use for further analysis. We present a statistical framework that uses an objective criterion to select the best of multiple ICA runs, so that the multivariate ICA features from the best run can be used for further analysis and inference. Using the proposed framework, we study the performance of a novel complex ICA algorithm for fMRI analysis, complex entropy rate bound minimization (CERBM), which takes into account all three types of diversity present in complex-valued fMRI data: non-Gaussianity, sample dependence, and noncircularity. We show that CERBM leads to significantly improved ICs that provide higher classification accuracy, and is thus a promising ICA algorithm for the analysis of complex-valued fMRI data.
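As a rough illustration of what selecting the best of multiple ICA runs can look like, the sketch below scores each run by how well its components agree with those of the other runs and picks the most consistent one. The abstract does not state the objective criterion used in the dissertation, so the correlation-based score here is purely an assumption, and the function names (`abs_corr`, `run_similarity`, `select_most_consistent_run`) are hypothetical.

```python
from math import sqrt


def abs_corr(u, v):
    """Absolute Pearson correlation of two equal-length sequences.

    The absolute value ignores the sign ambiguity of ICA components.
    Assumes neither sequence is constant (otherwise this divides by zero).
    """
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return abs(cov) / (norm_u * norm_v)


def run_similarity(run_a, run_b):
    """Mean, over components of run_a, of the best |corr| with any component of run_b.

    Taking the maximum over run_b ignores the ordering ambiguity of ICA components.
    """
    return sum(max(abs_corr(c, d) for d in run_b) for c in run_a) / len(run_a)


def select_most_consistent_run(runs):
    """Return (index, scores): the run agreeing best, on average, with all other runs.

    This consistency score is an assumed stand-in criterion, not the one
    proposed in the dissertation.
    """
    scores = []
    for i, run in enumerate(runs):
        others = [r for j, r in enumerate(runs) if j != i]
        scores.append(sum(run_similarity(run, o) for o in others) / len(others))
    best = max(range(len(runs)), key=scores.__getitem__)
    return best, scores
```

Each run is given as a list of components, and each component as a list of numbers; at least two runs are needed. Components from the selected run would then be passed on to feature extraction and classification.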

Classification using LTM data is another problem we address. We first study the use of highly multivariate solutions such as ICA and then note the advantages of using lower-level features for classification. In this case, an important problem is selecting the set of features that gives the best classification performance. Additionally, a large amount of unlabeled tire data is easy to collect, but only a small portion of it can be easily labeled by experts. In this dissertation, we propose a novel mutual information (MI) based approach to achieve feature splits for co-training, a practical and powerful data-driven method in semi-supervised learning. Inspired by the idea of dependent component analysis, the proposed MI-based approach produces feature splits that are maximally independent between or within subsets, and thus selects and fuses features more effectively than other feature split methods. Experimental results on both simulated data and LTM tire data indicate that co-training with MI-based feature splits yields significantly higher accuracy than supervised classification.
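To make the idea of an MI-based feature split concrete, here is a minimal sketch that splits a small set of discrete features into two views so that the total pairwise mutual information between the views is as small as possible. It is not the dissertation's algorithm: the plug-in MI estimator, the exhaustive search over splits, and the names `mutual_information` and `best_feature_split` are all assumptions made for illustration, and only the between-subset case is covered.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of the mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log(c * n / (count_x[x] * count_y[y]))
    return mi


def best_feature_split(columns):
    """Split feature indices into two views with minimal summed cross-view MI.

    `columns` is a list of feature columns (one sequence per feature).
    Exhaustive search is only feasible for a handful of features; it is used
    here for clarity, not as a recommendation.
    """
    d = len(columns)
    if d < 2:
        raise ValueError("need at least two features to form two views")
    indices = range(d)
    pair_mi = {
        (i, j): mutual_information(columns[i], columns[j])
        for i, j in combinations(indices, 2)
    }
    best = None
    for view_a in combinations(indices, d // 2):
        if 0 not in view_a:
            continue  # keep feature 0 in view_a so mirrored splits are not revisited
        view_b = tuple(i for i in indices if i not in view_a)
        cost = sum(pair_mi[min(i, j), max(i, j)] for i in view_a for j in view_b)
        if best is None or cost < best[0]:
            best = (cost, view_a, view_b)
    return best[1], best[2]
```

In a co-training setup, one classifier would be trained on each of the two returned views, and each classifier's confident predictions on unlabeled samples would be added to the training set of the other.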

Committee: Profs. Tulay Adali (Chair), Joel Morris, Janet Rutledge, Charles E. Laberge, Vince D. Calhoun (University of New Mexico and the Mind Research Network), and Dr. Matthew Anderson (Northrop Grumman Corp.)

