To the Editor: Alzheimer’s disease (AD) is an irreversible, chronic neurodegenerative disease. AD initially affects short-term memory, thinking, and behavior; it then severely disrupts the normal lives of patients and their families and may eventually lead to death. Mild cognitive impairment (MCI) is considered an early stage of AD. Some studies have shown that nearly 20% of patients with MCI are at risk of developing AD within the next four years.[1] Although there is currently no effective way to halt the progression of MCI, appropriate interventions can slow it. Timely and accurate intervention is therefore essential to effectively slow disease progression.
Owing to its non-invasive nature and its ability to capture anatomical information, structural magnetic resonance imaging (sMRI) has become the most commonly used imaging modality for diagnosing AD. The application of traditional machine-learning algorithms to sMRI images for AD classification is relatively mature. However, complex preprocessing of sMRI images is usually required to extract features suitable for classification. In contrast, deep learning algorithms have been more widely employed in recent years because they can automatically learn high-level features without requiring medical experts to hand-craft them.
Hence, this paper reviews sMRI-assisted AD classification based on the various machine-learning algorithms developed in recent years. We also present prospective approaches to address the shortcomings of existing research and provide a reference for future studies in this field.
Deep learning methods used for AD classification. sMRI-based AD classification methods can be broadly grouped into three categories according to the scale at which they operate: region of interest (ROI)-based, voxel-based, and patch-based. ROI-based methods segment specific brain regions for feature extraction but may miss crucial areas outside the chosen regions. Voxel-based methods measure brain atrophy through features such as cortical thickness and volume, which requires preprocessing, although some methods apply deep learning algorithms to sMRI images directly and bypass extensive preprocessing. Patch-based methods select disease-related patches for feature analysis, offering greater flexibility and potentially higher accuracy, although their effectiveness depends heavily on the choice of patches.
CNN-based AD classification research. As one of the most effective deep learning algorithms, the convolutional neural network (CNN) exploits image features and spatial context, encompassing neighborhood information to generate hierarchical features relevant to specific tasks and datasets. Various CNN-based architectures have performed well in sMRI-based diagnosis of AD.
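To make this concrete, the following is a minimal sketch of a 3D CNN classifier for sMRI volumes. The architecture, channel counts, and 96³ input size are illustrative assumptions, not the design of any study cited here.

```python
# A minimal sketch of a 3D CNN classifier for sMRI volumes (e.g., AD vs. NC).
# All architectural choices below are illustrative assumptions.
import torch
import torch.nn as nn

class Simple3DCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1),   # single-channel sMRI volume
            nn.BatchNorm3d(8),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(2),                             # halve each spatial dimension
            nn.Conv3d(8, 16, kernel_size=3, padding=1),
            nn.BatchNorm3d(16),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),                     # global pooling -> (B, 16, 1, 1, 1)
        )
        self.classifier = nn.Linear(16, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.features(x).flatten(1)
        return self.classifier(h)

# Example: one dummy 96x96x96 volume with batch and channel dimensions.
model = Simple3DCNN()
logits = model(torch.randn(1, 1, 96, 96, 96))
print(logits.shape)  # torch.Size([1, 2])
```

In practice, deeper backbones, larger inputs, and regularization are needed, but the pattern of stacked 3D convolutions followed by global pooling and a linear head underlies many of the architectures discussed below.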
One routine approach to AD diagnosis is to segment brain sMRI images into multiple brain regions according to a brain atlas and extract region-level features (such as volume and shape). When the dataset is small, ROI-based methods can effectively alleviate overfitting and improve classification performance. Patients with AD typically exhibit memory loss, cognitive decline, language difficulties, and impaired spatial cognition. Researchers therefore often focus on crucial ROIs such as the hippocampus, the amygdala, and the parietal, temporal, and frontal lobes. As AD progresses, the brain structure exhibits slightly accelerated aging, which is also associated with MCI. Poloni and Ferrari[2] built an efficient age-estimation framework using only the hippocampal regions and explored how the brain-age prediction error of age-matched normal cognition (NC) subjects relates to that of AD and MCI subjects.
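As an illustration of atlas-based ROI feature extraction, the sketch below counts voxels per labeled region in a segmentation volume and converts the counts to volumes. The file name and label IDs (17/53 for the left/right hippocampus, following the FreeSurfer labeling scheme) are assumptions for the example.

```python
# A hedged sketch of atlas-based ROI volume extraction with nibabel.
import numpy as np
import nibabel as nib

def roi_volumes(seg_path: str, label_ids: dict) -> dict:
    seg = nib.load(seg_path)
    labels = seg.get_fdata().astype(int)
    voxel_mm3 = float(np.prod(seg.header.get_zooms()[:3]))  # voxel size in mm^3
    # Volume of each region = voxel count x voxel volume.
    return {name: int((labels == lid).sum()) * voxel_mm3
            for name, lid in label_ids.items()}

# Hypothetical usage with a subject's atlas-space segmentation:
# vols = roi_volumes("subject01_aseg.nii.gz",
#                    {"hippocampus_L": 17, "hippocampus_R": 53})
```

Features such as these volumes (optionally normalized by intracranial volume) are what ROI-based classifiers typically consume.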
Voxel-based methods analyze the entire brain directly at the voxel level, without predefined ROIs. CNN-based voxel-scale AD classification simplifies classifier construction using two-dimensional (2D) or three-dimensional (3D) sMRI without complex preprocessing. A 3D-CNN can detect brain structural changes related to AD and help identify previously unnoticed areas, but at increased computational cost. Zhao et al[3] proposed a disease progression prediction framework that combines a 3D multi-information generative adversarial network (MI-GAN) with a multi-class classification network. MI-GAN generates high-quality 3D brain sMRI images of individuals at future time points from baseline 3D brain sMRI and factors such as age, gender, education level, and APOE genotype. The multi-class classification network then classifies the generated sMRI images to estimate the clinical condition of the subjects.[3]
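The sketch below illustrates only the conditioning idea behind such a framework: a generator receives the baseline volume together with a covariate vector (e.g., age, gender, education, APOE status) and outputs an estimate of a follow-up volume. The tiny architecture is an assumption for illustration, not the published MI-GAN.

```python
# A minimal sketch of conditioning a 3D generator on baseline sMRI plus
# covariates; the architecture is illustrative, not the model of Zhao et al.
import torch
import torch.nn as nn

class CondGenerator3D(nn.Module):
    def __init__(self, n_covariates: int = 4):
        super().__init__()
        # Project covariates to a scalar, then broadcast as an extra channel.
        self.cov_proj = nn.Linear(n_covariates, 1)
        self.net = nn.Sequential(
            nn.Conv3d(2, 8, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(8, 1, 3, padding=1), nn.Tanh(),  # follow-up volume estimate
        )

    def forward(self, baseline: torch.Tensor, cov: torch.Tensor) -> torch.Tensor:
        b, _, d, h, w = baseline.shape
        cov_map = self.cov_proj(cov).view(b, 1, 1, 1, 1).expand(b, 1, d, h, w)
        return self.net(torch.cat([baseline, cov_map], dim=1))

gen = CondGenerator3D()
future = gen(torch.randn(1, 1, 32, 32, 32), torch.randn(1, 4))
print(future.shape)  # torch.Size([1, 1, 32, 32, 32])
```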
Voxel-scale research can also incorporate an attention mechanism that focuses on key brain regions without discarding information and combines different levels of features into higher-level representations. For example, Liu et al[4] constructed a deep CNN based on multiple attention mechanisms. It employs cyclic convolutional enhancement of sMRI images to highlight the feature information of the original images, which significantly improves prediction accuracy and robustness. Moreover, recalibrating image features with a multi-attention mechanism enables adaptive learning, allowing the identification of brain regions that are particularly relevant to disease diagnosis.
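A generic example of such feature recalibration is the squeeze-and-excitation pattern sketched below, in which learned per-channel attention weights rescale 3D feature maps. It is analogous in spirit to, but not the same as, the specific multi-attention design of Liu et al.[4]

```python
# A hedged sketch of channel recalibration (squeeze-and-excitation style)
# for 3D feature maps; a generic block, not the published architecture.
import torch
import torch.nn as nn

class ChannelRecalibration3D(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Squeeze: global average over the 3D volume -> one descriptor per channel.
        w = x.mean(dim=(2, 3, 4))
        # Excite: learn per-channel weights, then rescale the feature maps.
        w = self.fc(w).view(*w.shape, 1, 1, 1)
        return x * w

feat = torch.randn(2, 16, 8, 8, 8)
print(ChannelRecalibration3D(16)(feat).shape)  # torch.Size([2, 16, 8, 8, 8])
```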
Notably, patients in the early stages of AD generally show minimal structural changes in local brain regions. The patch-based method divides the brain image into smaller blocks, or ‘patches’, and analyzes them separately. This approach captures more detailed changes within local areas and may provide richer local structural information than analyzing an entire ROI or individual voxels. In previous studies, anatomical landmark detectors and statistical methods were used to locate patches. Ashtari-Majlan et al[5] used multivariate statistical testing (Hotelling’s T² test) to compare MRI images of AD patients and normal individuals and thereby identify key anatomical landmarks; patches extracted around these landmarks are then fed into a multi-stream CNN for classification. This method mitigates the problem of insufficient training data. Furthermore, attention mechanisms have been added to patch-level studies. Liu et al[6] proposed a task-driven hierarchical attention network (THAN) consisting of two subnetworks: the Information Sub-Network (IS), which generates an information map highlighting disease-related regions and their importance to the final classification, and the Hierarchical Attention Sub-Network (HAS), which extracts discriminative visual and semantic features from the information map to aid diagnosis. The generated information map can highlight potential MCI and AD biomarkers, such as the hippocampus and ventricles. By combining visual and semantic attention modules, the model’s performance is significantly improved.
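The landmark-to-patch step can be illustrated as below: given voxel coordinates of detected landmarks (however they were obtained), fixed-size cubes are cropped around them. The patch size and example coordinates are illustrative assumptions.

```python
# A minimal sketch of landmark-centered patch extraction from a 3D volume.
import numpy as np

def extract_patches(volume: np.ndarray, landmarks: list, size: int = 24) -> list:
    half = size // 2
    patches = []
    for (x, y, z) in landmarks:
        # Clamp the corner so the cube stays inside the volume.
        x0 = int(np.clip(x - half, 0, volume.shape[0] - size))
        y0 = int(np.clip(y - half, 0, volume.shape[1] - size))
        z0 = int(np.clip(z - half, 0, volume.shape[2] - size))
        patches.append(volume[x0:x0 + size, y0:y0 + size, z0:z0 + size])
    return patches

vol = np.random.rand(96, 96, 96)
print(len(extract_patches(vol, [(48, 40, 30), (60, 55, 44)])))  # 2 patches
```

Each patch would then feed one stream of a multi-stream classifier, as in the approach described above.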
Transformer-based AD classification research. Currently, there are two ways to incorporate the attention mechanism into a CNN in the medical imaging field: adding attention blocks to the original CNN, or replacing the CNN’s convolutional blocks with attention blocks. In either case, the overall network structure of the CNN is maintained. Because transformer-based models have recently achieved strong performance in natural language processing (NLP), researchers have applied the transformer to computer vision, giving rise to the Vision Transformer (ViT).
Owing to its multi-head self-attention mechanism, ViT can effectively attend to multiple regions of an image and extract features by computing the correlations between different regions. This enables ViT to capture the dependency relationships between different brain regions in the image. Kushol et al[7] improved ViT to learn both spatial- and frequency-domain features of sMRI images. However, the transformer limits the length of the input sequence, so their method selects only some 2D coronal slices as input, which may miss critical information. Jang and Hwang[8] combined a 3D-CNN, a 2D-CNN, and a transformer. The inductive bias of the CNN suffices to extract local abnormal details, enabling the identification of brain atrophy in 3D MRI images of AD patients, while the transformer effectively captures possible dependencies between distant brain regions, for instance, whether cortical atrophy in AD patients correlates with decreased hippocampal volume.
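The core mechanism can be demonstrated in a few lines: patch embeddings attend to one another through multi-head self-attention, producing pairwise patch-to-patch attention weights. The embedding dimension and head count below are illustrative assumptions.

```python
# A hedged sketch of the multi-head self-attention at the core of ViT.
import torch
import torch.nn as nn

num_patches, dim, heads = 64, 128, 8
tokens = torch.randn(1, num_patches, dim)        # one sequence of patch embeddings

attn = nn.MultiheadAttention(embed_dim=dim, num_heads=heads, batch_first=True)
out, weights = attn(tokens, tokens, tokens)      # self-attention: Q = K = V

print(out.shape)      # torch.Size([1, 64, 128])
print(weights.shape)  # torch.Size([1, 64, 64]): patch-to-patch attention map
```

In a full ViT, blocks of this attention alternate with feed-forward layers, and a classification token or pooled output feeds the diagnostic head.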
Traditional machine learning methods used for AD classification. The classification features used in traditional machine learning methods must be extracted from the dataset separately and fed into classifiers such as support vector machines (SVM), random forests (RF), K-nearest neighbors (KNN), decision trees (DT), and their variants. Brain morphology analysis at the voxel scale usually faces the challenge of high-dimensional features. The potential overfitting caused by high dimensionality is commonly mitigated by manual feature selection and dimensionality reduction algorithms, followed by further refinement of the classifier to increase classification accuracy.
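A minimal sketch of this classical pipeline, assuming a precomputed high-dimensional feature matrix (here synthetic data standing in for real morphometric features), chains standardization, PCA-based dimensionality reduction, and an SVM:

```python
# A hedged sketch of the traditional pipeline: high-dimensional features ->
# dimensionality reduction -> SVM, with synthetic stand-in data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5000))  # 100 subjects, 5000 voxel-level features
y = np.tile([0, 1], 50)               # balanced NC (0) vs. AD (1) labels

clf = make_pipeline(StandardScaler(), PCA(n_components=20), SVC(kernel="rbf"))
print(cross_val_score(clf, X, y, cv=5).mean())
```

On real features, the cross-validated accuracy reflects how well the reduced representation separates the diagnostic groups; on the random data above it stays near chance, as expected.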
In recent years, combining CNNs with traditional machine learning methods has also become a research direction, owing to the outstanding contribution of deep learning algorithms to medical image analysis. For example, Sharma et al[9] proposed a deep residual network that extracts AD diagnostic features from sagittal sMRI slices, which are then classified by a fuzzy-hyperplane least-squares twin support vector machine (FLS-TWSVM).
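The hybrid pattern can be sketched as follows: a CNN backbone serves as a fixed feature extractor on 2D slices, and a classical SVM performs the classification. For simplicity, the sketch uses an untrained torchvision ResNet-18 and an ordinary scikit-learn SVC in place of the residual network and FLS-TWSVM of Sharma et al,[9] so it illustrates only the division of labor, not their method.

```python
# A hedged sketch of a CNN-feature + SVM hybrid.
import numpy as np
import torch
from torchvision.models import resnet18
from sklearn.svm import SVC

backbone = resnet18(weights=None)      # untrained, so the sketch runs offline
backbone.fc = torch.nn.Identity()      # drop the head -> 512-d feature vectors
backbone.eval()

slices = torch.randn(20, 3, 224, 224)  # 20 dummy sagittal slices
with torch.no_grad():
    feats = backbone(slices).numpy()   # (20, 512) feature matrix

labels = np.tile([0, 1], 10)           # balanced dummy labels
svm = SVC(kernel="rbf").fit(feats, labels)
print(svm.score(feats, labels))
```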
To conclude, studies show that research on AD classification has gradually shifted in recent years from traditional machine learning algorithms to deep learning models or a combination of the two. Deep learning algorithms can optimize feature extraction by hierarchically learning discriminative feature representations and naturally combining features from different scales. In contrast, manually extracted features are independent of the classifier, and this potential heterogeneity between features and classifier may lead to poor diagnostic performance.
Nevertheless, the application of deep learning models to sMRI images is not always problem-free, and one of the main obstacles is data shortage. This shortage stems from two factors: (1) the need for labeled data and the high cost of annotation by experienced physicians, and (2) ethical and patient-privacy concerns, which limit the cross-institutional use of medical images. CNNs trained on inadequate data exhibit severe overfitting. Data augmentation methods are typically used to overcome sample imbalance, and network architectures are adapted via transfer learning.
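For instance, a minimal augmentation routine for 3D sMRI volumes might apply random sagittal flips and additive Gaussian noise, as sketched below; the parameters are assumptions, and dedicated libraries such as MONAI or TorchIO provide far richer medical-image transforms.

```python
# A minimal sketch of simple 3D volume augmentation to mitigate data shortage.
import torch

def augment(volume: torch.Tensor, noise_std: float = 0.02) -> torch.Tensor:
    # volume: (C, D, H, W). Random left-right flip along the last axis.
    if torch.rand(1).item() < 0.5:
        volume = torch.flip(volume, dims=[-1])
    # Additive Gaussian noise.
    return volume + noise_std * torch.randn_like(volume)

vol = torch.randn(1, 64, 64, 64)
print(augment(vol).shape)  # torch.Size([1, 64, 64, 64])
```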
In addition to the above issues, future research on AD classification based on sMRI should consider the following three suggestions. (1) Most current studies are single-scale. However, studies have shown that subtle local structural features, overall global features, frequency-domain features, spatial-domain features, low-level morphological features, and high-level semantic features are complementary. Combining multi-scale and multi-dimensional features can therefore capture more information from the subjects (a minimal fusion sketch follows this list). (2) Existing patch-based methods are partly data-driven or empirically predefined, and the features extracted in this manner are usually independent of the subsequent classifier-learning process. Owing to the potential heterogeneity between features and classifiers, predefined features may lead to suboptimal learning performance in brain disease diagnosis. An effective combination of feature learning and classifier training therefore has the potential to further improve the results. (3) Both commonality and specificity exist among individuals. Previous methods have restricted AD-sensitive regions to the same locations in all subjects, ignoring individual differences in disease progression. This may be one reason why further increasing classification accuracy has been challenging.
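Following suggestion (1), the sketch below fuses features extracted at two scales, a coarse whole-brain encoder and a fine patch encoder, by simple concatenation before a shared classification head. Both encoders and all dimensions are illustrative assumptions.

```python
# A hedged sketch of multi-scale feature fusion by concatenation.
import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.global_enc = nn.Sequential(   # coarse whole-brain features
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.local_enc = nn.Sequential(    # fine patch-level features
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1), nn.Flatten())
        self.head = nn.Linear(16, num_classes)  # fused 8 + 8 features

    def forward(self, whole: torch.Tensor, patch: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.global_enc(whole), self.local_enc(patch)], dim=1)
        return self.head(fused)

model = MultiScaleFusion()
out = model(torch.randn(1, 1, 64, 64, 64), torch.randn(1, 1, 24, 24, 24))
print(out.shape)  # torch.Size([1, 2])
```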
Various CNN models have been applied in AD research, each demonstrating its own strengths. These models have been refined by modifying the convolutional kernel structure and introducing multi-scale convolutions, residual connections, and attention mechanisms. However, the diversity of methods also means that there is no single established direction for improving network structures. The application of transformers in this field is still at an exploratory stage. Although the research results are less abundant than those of CNN-based methods, transformers have developed rapidly in the vision field and show great potential for development in AD research.
Furthermore, it is essential to note that most current computer-aided diagnosis (CAD) systems for AD classification are designed on the basis of prior medical imaging and medical knowledge to improve model performance. Less emphasis has been placed on using deep learning algorithms to discover additional features in sMRI data that could benefit AD classification. Therefore, differentiating between diseases and identifying additional features that contribute to AD classification are important directions for future research. Moreover, computer-aided diagnosis of AD can enhance diagnostic accuracy, expedite diagnosis, and reduce healthcare costs, which is crucial for improving patients’ quality of life and promoting the sustainable development of the healthcare system. This review provides diagnostic evidence for clinical medicine.
Conflicts of interest: None.
References
1. Roberts R, Knopman DS. Classification and epidemiology of MCI. Clin Geriatr Med 2013;29:753–772. doi: 10.1016/j.cger.2013.07.003.
2. Poloni KM, Ferrari RJ. A deep ensemble hippocampal CNN model for brain age estimation applied to Alzheimer’s diagnosis. Expert Syst Appl 2022;195:116622. doi: 10.1016/j.eswa.2022.116622.
3. Zhao Y, Ma B, Jiang P, Zeng D, Wang X, Li S. Prediction of Alzheimer’s disease progression with multi-information generative adversarial network. IEEE J Biomed Health Inform 2021;25:711–719. doi: 10.1109/JBHI.2020.3006925.
4. Liu F, Wang H, Chen Y, Quan Y, Tao L. Convolutional neural network based on feature enhancement and attention mechanism for Alzheimer’s disease prediction using MRI images. Int Conf Graph Image Process (ICGIP) 2022;12083:281–295. doi: 10.1117/12.2623580.
5. Ashtari-Majlan M, Seifi A, Dehshibi MM. A multi-stream convolutional neural network for classification of progressive MCI in Alzheimer’s disease using structural MRI images. IEEE J Biomed Health Inform 2022;26:3918–3926. doi: 10.1109/JBHI.2021.3091912.
6. Liu F, Wang X, Liu D, Zhang C. THAN: Task-driven hierarchical attention network for the diagnosis of mild cognitive impairment and Alzheimer’s disease. Quant Imaging Med Surg 2021;11:3338–3354. doi: 10.21037/qims-20-1034.
7. Kushol R, Masoumzadeh A, Huo D, Kalra S, Yang YH. AddFormer: Alzheimer’s disease detection from structural MRI using fusion transformer. In: 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI), Kolkata, India; 2022:1–5. doi: 10.1109/ISBI52829.2022.9761421.
8. Jang J, Hwang D. M3T: Three-dimensional medical image classifier using multi-plane and multi-slice transformer. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA; 2022:20718–20729. doi: 10.1109/CVPR52688.2022.02006.
9. Sharma R, Goel T, Tanveer M, Murugan R. FDN-ADNet: Fuzzy LS-TWSVM based deep learning network for prognosis of the Alzheimer’s disease using the sagittal plane of MRI scans. Appl Soft Comput 2022;115:108099. doi: 10.1016/j.asoc.2021.108099.