-
Clinica: an open source software platform for reproducible clinical neuroscience studies
Authors:
Alexandre Routier,
Ninon Burgos,
Mauricio Díaz,
Michael Bacci,
Simona Bottani,
Omar El-Rifai,
Sabrina Fontanella,
Pietro Gori,
Jérémy Guillon,
Alexis Guyot,
Ravi Hassanaly,
Thomas Jacquemont,
Pascal Lu,
Arnaud Marcoux,
Tristan Moreau,
Jorge Samper-González,
Marc Teichmann,
Elina Thibeau-Sutre,
Ghislain Vaillant,
Junhao Wen,
Adam Wild,
Marie-Odile Habert,
Stanley Durrleman,
Olivier Colliot
Abstract:
We present Clinica (www.clinica.run), an open-source software platform designed to make clinical neuroscience studies easier and more reproducible. Clinica aims for researchers to i) spend less time on data management and processing, ii) perform reproducible evaluations of their methods, and iii) easily share data and results within their institution and with external collaborators. The core of Clinica is a set of automatic pipelines for processing and analysis of multimodal neuroimaging data (currently, T1-weighted MRI, diffusion MRI and PET data), as well as tools for statistics, machine learning and deep learning. It relies on the brain imaging data structure (BIDS) for the organization of raw neuroimaging datasets and on established tools written by the community to build its pipelines. It also provides converters of public neuroimaging datasets to BIDS (currently ADNI, AIBL, OASIS and NIFD). Processed data include image-valued scalar fields (e.g. tissue probability maps), meshes, surface-based scalar fields (e.g. cortical thickness maps) or scalar outputs (e.g. regional averages). These data follow the ClinicA Processed Structure (CAPS) format which shares the same philosophy as BIDS. Consistent organization of raw and processed neuroimaging files facilitates the execution of single pipelines and of sequences of pipelines, as well as the integration of processed data into statistics or machine learning frameworks. The target audience of Clinica is neuroscientists or clinicians conducting clinical neuroscience studies involving multimodal imaging, and researchers developing advanced machine learning algorithms applied to neuroimaging data.
Submitted 21 July, 2021;
originally announced July 2021.
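
The BIDS organization mentioned in the abstract can be illustrated with a minimal sketch that builds the relative path of a T1-weighted image. The layout (`sub-<label>/[ses-<label>/]anat/..._T1w.nii.gz`) follows the BIDS specification; the function name `bids_t1w_path` and the example labels are hypothetical and are not part of Clinica's API.

```python
from pathlib import PurePosixPath


def bids_t1w_path(subject, session=None):
    """Return the relative BIDS path of a T1w image (hypothetical helper).

    BIDS labels must be alphanumeric; the session level is optional.
    """
    for label in (subject, session):
        if label is not None and not label.isalnum():
            raise ValueError(f"BIDS labels must be alphanumeric, got {label!r}")

    folder = PurePosixPath(f"sub-{subject}")
    stem = f"sub-{subject}"
    if session is not None:
        folder = folder / f"ses-{session}"
        stem += f"_ses-{session}"
    # Anatomical images live in the 'anat' folder with the 'T1w' suffix.
    return folder / "anat" / f"{stem}_T1w.nii.gz"


# Example labels are illustrative only.
print(bids_t1w_path("01", "M00"))
```

Because every file name repeats the subject and session labels, a pipeline can locate its inputs from the path alone, which is the property that makes chaining pipelines straightforward.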
-
Accuracy of MRI Classification Algorithms in a Tertiary Memory Center Clinical Routine Cohort
Authors:
Alexandre Morin,
Jorge Samper-González,
Anne Bertrand,
Sebastian Stroer,
Didier Dormont,
Aline Mendes,
Pierrick Coupé,
Jamila Ahdidan,
Marcel Lévy,
Dalila Samri,
Harald Hampel,
Bruno Dubois,
Marc Teichmann,
Stéphane Epelbaum,
Olivier Colliot
Abstract:
BACKGROUND: Automated volumetry software (AVS) has recently become widely available to neuroradiologists. MRI volumetry with AVS may support the diagnosis of dementias by identifying regional atrophy. Moreover, automatic classifiers using machine learning techniques have recently emerged as promising approaches to assist diagnosis. However, the performance of both AVS and automatic classifiers has been evaluated mostly in the artificial setting of research datasets. OBJECTIVE: Our aim was to evaluate the performance of two AVS and an automatic classifier in the clinical routine condition of a memory clinic. METHODS: We studied 239 patients with cognitive troubles from a single memory center cohort. Using clinical routine T1-weighted MRI, we evaluated the classification performance of: 1) univariate volumetry using two AVS (volBrain and Neuroreader$^{TM}$); 2) a Support Vector Machine (SVM) automatic classifier, using either the AVS volumes (SVM-AVS) or whole gray matter (SVM-WGM); 3) reading by two neuroradiologists. The performance measure was the balanced diagnostic accuracy. The reference standard was consensus diagnosis by three neurologists using clinical, biological (cerebrospinal fluid) and imaging data and following international criteria. RESULTS: Univariate AVS volumetry provided only moderate accuracies (46% to 71% with hippocampal volume). The accuracy improved when using the SVM-AVS classifier (52% to 85%), becoming close to that of SVM-WGM (52% to 90%). The accuracy of visual classification by neuroradiologists fell between those of SVM-AVS and SVM-WGM. CONCLUSION: In the routine practice of a memory clinic, the use of volumetric measures provided by AVS yields only moderate accuracy. Automatic classifiers can improve accuracy and could be a useful tool to assist diagnosis.
Submitted 19 March, 2020;
originally announced March 2020.
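
The performance measure named in the abstract, balanced accuracy, can be sketched as the mean of the per-class recalls. This is the standard definition and not necessarily the authors' exact implementation; the function name and the toy labels below are hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels: a classifier that always answers "CN" on an
# imbalanced sample reaches 80% plain accuracy but only 50% balanced accuracy.
y_true = ["AD"] * 2 + ["CN"] * 8
y_pred = ["CN"] * 10
print(balanced_accuracy(y_true, y_pred))  # prints 0.5
```

The toy case shows why the measure suits diagnostic groups of unequal size: a trivial majority-class predictor cannot score above chance.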
-
Convolutional Neural Networks for Classification of Alzheimer's Disease: Overview and Reproducible Evaluation
Authors:
Junhao Wen,
Elina Thibeau-Sutre,
Mauricio Diaz-Melo,
Jorge Samper-Gonzalez,
Alexandre Routier,
Simona Bottani,
Didier Dormont,
Stanley Durrleman,
Ninon Burgos,
Olivier Colliot
Abstract:
Over 30 papers have proposed using convolutional neural networks (CNNs) for AD classification from anatomical MRI. However, classification performance is difficult to compare across studies because of variations in components such as participant selection, image preprocessing and validation procedure. Moreover, these studies are difficult to reproduce because their frameworks are not publicly accessible and because implementation details are lacking. Lastly, some of these papers may report biased performance due to inadequate or unclear validation or model selection procedures. In the present work, we aim to address these limitations through three main contributions. First, we performed a systematic literature review and found that more than half of the surveyed papers may have suffered from data leakage. Our second contribution is the extension of our open-source framework for classification of AD using CNNs and T1-weighted MRI. Finally, we used this framework to rigorously compare different CNN architectures. The data were split into training/validation/test sets at the very beginning, and only the training/validation sets were used for model selection. To avoid any overfitting, the test sets were left untouched until the end of the peer-review process. Overall, the different 3D approaches (3D-subject, 3D-ROI, 3D-patch) achieved similar performance, while that of the 2D slice approach was lower. Of note, the different CNN approaches did not perform better than an SVM with voxel-based features. The different approaches generalized well to similar populations but not to datasets with different inclusion criteria or demographic characteristics.
Submitted 31 May, 2020; v1 submitted 16 April, 2019;
originally announced April 2019.
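
The up-front training/validation/test split described in the abstract can be sketched as follows. The abstract does not enumerate the causes of data leakage, so this sketch assumes one common cause: images of the same participant ending up in different sets. It therefore splits at the subject level. The function name, the `(subject, session)` sample format and the fractions are hypothetical.

```python
import random


def split_by_subject(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Assign every (subject, session) sample to train/val/test by subject.

    All sessions of a subject land in the same set, so no participant
    contributes data to both model selection and final testing.
    """
    subjects = sorted({subject for subject, _ in samples})
    if len(subjects) < 3:
        raise ValueError("need at least three subjects for three splits")
    random.Random(seed).shuffle(subjects)  # fixed seed: reproducible split

    n_test = max(1, round(len(subjects) * test_fraction))
    n_val = max(1, round(len(subjects) * val_fraction))

    assignment = {}
    for position, subject in enumerate(subjects):
        if position < n_test:
            assignment[subject] = "test"
        elif position < n_test + n_val:
            assignment[subject] = "val"
        else:
            assignment[subject] = "train"

    splits = {"train": [], "val": [], "test": []}
    for sample in samples:
        splits[assignment[sample[0]]].append(sample)
    return splits


# Made-up identifiers: 10 subjects with 3 sessions each.
samples = [(f"sub-{i:02d}", f"ses-{j}") for i in range(10) for j in range(3)]
splits = split_by_subject(samples)
print({name: len(items) for name, items in splits.items()})
```

Performing the split once, before any preprocessing choice or hyperparameter search, keeps the test set independent of every decision made during model selection.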
-
Reproducible evaluation of diffusion MRI features for automatic classification of patients with Alzheimer's disease
Authors:
Junhao Wen,
Jorge Samper-Gonzalez,
Simona Bottani,
Alexandre Routier,
Ninon Burgos,
Thomas Jacquemont,
Sabrina Fontanella,
Stanley Durrleman,
Stephane Epelbaum,
Anne Bertrand,
Olivier Colliot
Abstract:
Diffusion MRI is the modality of choice to study alterations of white matter. In past years, various works have used diffusion MRI for automatic classification of AD. However, classification performance obtained with different approaches is difficult to compare, and these studies are also difficult to reproduce. In the present paper, we first extend a previously proposed framework to diffusion MRI data for AD classification. Specifically, we add conversion of diffusion MRI ADNI data into the BIDS standard and pipelines for diffusion MRI preprocessing and feature extraction. We then apply the framework to compare different components. First, feature selection (FS) has a positive impact on classification results: the highest balanced accuracy (BA) improved from 0.76 to 0.82 for the CN vs AD task. Secondly, voxel-wise features generally give better performance than regional features. Fractional anisotropy (FA) and mean diffusivity (MD) provided comparable results for voxel-wise features. Moreover, we observe that the poor performance obtained in tasks involving MCI was potentially caused by the small data samples rather than by the data imbalance. Furthermore, no substantial difference in classification performance was found across different degrees of smoothing or registration methods. In addition, we demonstrate that using non-nested validation of FS leads to unreliable and over-optimistic results: a relative increase in BA of 0.05 up to 0.40. Lastly, with proper feature rescaling (FR) and FS, the performance of diffusion MRI features is comparable to that of T1w MRI. All the code of the framework and the experiments is publicly available: general-purpose tools have been integrated into the Clinica software package (www.clinica.run) and the paper-specific code is available at: https://github.com/aramis-lab/AD-ML.
Submitted 11 June, 2020; v1 submitted 28 December, 2018;
originally announced December 2018.
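
The difference between nested and non-nested validation of feature selection can be sketched as below. This is an illustration under stated assumptions, not the paper's pipeline: the ranking criterion (absolute difference of class means), the nearest-centroid classifier, the pure-noise data and all function names are hypothetical stand-ins, and plain accuracy is used because the toy classes are balanced.

```python
import random


def select_features(X, y, k):
    """Return the indices of the k features with the largest class-mean gap."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def centroid_predict(X_train, y_train, x, features):
    """Nearest-centroid prediction restricted to the selected features."""
    def centroid(label):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        return [sum(r[j] for r in rows) / len(rows) for j in features]

    def squared_distance(center):
        return sum((x[j] - c) ** 2 for j, c in zip(features, center))

    return 1 if squared_distance(centroid(1)) < squared_distance(centroid(0)) else 0


def cross_validate(X, y, k, n_folds=5, nested=True):
    """Accuracy over folds; nested=True reselects features per training fold."""
    leaky_features = select_features(X, y, k)  # sees ALL samples, test included
    correct = 0
    for fold in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != fold]
        test = [i for i in range(len(X)) if i % n_folds == fold]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        features = select_features(X_tr, y_tr, k) if nested else leaky_features
        correct += sum(centroid_predict(X_tr, y_tr, X[i], features) == y[i] for i in test)
    return correct / len(X)


# Pure-noise data: no feature carries real information about the labels.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(200)] for _ in range(40)]
y = [i % 2 for i in range(40)]
print("nested:    ", cross_validate(X, y, k=10, nested=True))
print("non-nested:", cross_validate(X, y, k=10, nested=False))
```

In the non-nested variant the features are chosen using the test samples' labels, so the reported score can exceed chance even on noise; the nested variant repeats the selection inside each training fold and avoids that bias.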
-
Reproducible evaluation of classification methods in Alzheimer's disease: framework and application to MRI and PET data
Authors:
Jorge Samper-González,
Ninon Burgos,
Simona Bottani,
Sabrina Fontanella,
Pascal Lu,
Arnaud Marcoux,
Alexandre Routier,
Jérémy Guillon,
Michael Bacci,
Junhao Wen,
Anne Bertrand,
Hugo Bertin,
Marie-Odile Habert,
Stanley Durrleman,
Theodoros Evgeniou,
Olivier Colliot
Abstract:
A large number of papers have introduced novel machine learning and feature extraction methods for automatic classification of AD. However, they are difficult to reproduce because key components of the validation are often not readily available. These components include selected participants and input data, image preprocessing and cross-validation procedures. The performance of the different approaches is also difficult to compare objectively. In particular, it is often difficult to assess which part of the method provides a real improvement, if any. We propose a framework for reproducible and objective classification experiments in AD using three publicly available datasets (ADNI, AIBL and OASIS). The framework comprises: i) automatic conversion of the three datasets into BIDS format, ii) a modular set of preprocessing pipelines, feature extraction and classification methods, together with an evaluation framework, that provide a baseline for benchmarking the different components. We demonstrate the use of the framework for a large-scale evaluation on 1960 participants using T1 MRI and FDG PET data. In this evaluation, we assess the influence of different modalities, preprocessing, feature types, classifiers, training set sizes and datasets. Performances were in line with the state-of-the-art. FDG PET outperformed T1 MRI for all classification tasks. No difference in performance was found for the use of different atlases, image smoothing, partial volume correction of FDG PET images, or feature type. Linear SVM and L2-logistic regression resulted in similar performance and both outperformed random forests. The classification performance increased along with the number of subjects used for training. Classifiers trained on ADNI generalized well to AIBL and OASIS. All the code of the framework and the experiments is publicly available at: https://gitlab.icm-institute.org/aramislab/AD-ML.
Submitted 20 August, 2018;
originally announced August 2018.
-
Yet Another ADNI Machine Learning Paper? Paving The Way Towards Fully-reproducible Research on Classification of Alzheimer's Disease
Authors:
Jorge Samper-González,
Ninon Burgos,
Sabrina Fontanella,
Hugo Bertin,
Marie-Odile Habert,
Stanley Durrleman,
Theodoros Evgeniou,
Olivier Colliot
Abstract:
In recent years, the number of papers on Alzheimer's disease classification has increased dramatically, generating interesting methodological ideas on the use of machine learning and feature extraction methods. However, practical impact is much more limited and, in the end, one cannot tell which of these approaches are the most efficient. While over 90% of these works make use of ADNI, an objective comparison between approaches is impossible due to variations in the subjects included, image pre-processing, performance metrics and cross-validation procedures. In this paper, we propose a framework for reproducible classification experiments using multimodal MRI and PET data from ADNI. The core components are: 1) code to automatically convert the full ADNI database into BIDS format; 2) a modular architecture based on Nipype in order to easily plug in different classification and feature extraction tools; 3) feature extraction pipelines for MRI and PET data; 4) baseline classification approaches for unimodal and multimodal features. This provides a flexible framework for benchmarking different feature extraction and classification tools in a reproducible manner. We demonstrate its use on all (1519) baseline T1 MR images and all (1102) baseline FDG PET images from ADNI 1, GO and 2 with SPM-based feature extraction pipelines and three different classification techniques (linear SVM, anatomically regularized SVM and multiple kernel learning SVM). The highest accuracies achieved were: 91% for AD vs CN, 83% for MCIc vs CN, 75% for MCIc vs MCInc, 94% for AD-A$β$+ vs CN-A$β$- and 72% for MCIc-A$β$+ vs MCInc-A$β$+. The code is publicly available at https://gitlab.icm-institute.org/aramislab/AD-ML (depends on the Clinica software platform, publicly available at http://www.clinica.run).
Submitted 21 September, 2017;
originally announced September 2017.