Search | arXiv e-print repository

arXiv:2406.19796 [pdf, other]

Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting

Authors: Wei Li, Jingyang Zhang, Pheng-Ann Heng, Lixu Gu

Abstract: Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources. Task-Incremental Learning (TIL) offers a privacy-preserving training paradigm using tasks arriving sequentially, instead of gathering them due to strict data sharing policies. However, the task evolution can span a wide scope that involves shifts in both image appearanc… ▽ More Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources. Task-Incremental Learning (TIL) offers a privacy-preserving training paradigm using tasks arriving sequentially, instead of gathering them due to strict data sharing policies. However, the task evolution can span a wide scope that involves shifts in both image appearance and segmentation semantics with intricate correlation, causing concurrent appearance and semantic forgetting. To solve this issue, we propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs to mimic past task data, which focuses on two aspects: modeling image-mask correspondence and promoting scalability for diverse tasks. Specifically, we introduce a novel Bayesian Joint Diffusion (BJD) model for high-quality synthesis of image-mask pairs with their correspondence explicitly preserved by conditional denoising. Furthermore, we develop a Task-Oriented Adapter (TOA) that recalibrates prompt embeddings to modulate the diffusion model, making the data synthesis compatible with different tasks. Experiments on incremental tasks (cardiac, fundus and prostate segmentation) show its clear advantage for alleviating concurrent appearance and semantic forgetting. Code is available at https://github.com/jingyzhang/CGR. △ Less

Submitted 28 June, 2024; originally announced June 2024.

Comments: Accepted by MICCAI24

arXiv:2404.14837 [pdf, other]

Ultrasound SAM Adapter: Adapting SAM for Breast Lesion Segmentation in Ultrasound Images

Authors: Zhengzheng Tu, Le Gu, Xixi Wang, Bo Jiang

Abstract: Segment Anything Model (SAM) has recently achieved amazing results in the field of natural image segmentation. However, it is not effective for medical image segmentation, owing to the large domain gap between natural and medical images. In this paper, we mainly focus on ultrasound image segmentation. As we know that it is very difficult to train a foundation model for ultrasound image data due to… ▽ More Segment Anything Model (SAM) has recently achieved amazing results in the field of natural image segmentation. However, it is not effective for medical image segmentation, owing to the large domain gap between natural and medical images. In this paper, we mainly focus on ultrasound image segmentation. As we know that it is very difficult to train a foundation model for ultrasound image data due to the lack of large-scale annotated ultrasound image data. To address these issues, in this paper, we develop a novel Breast Ultrasound SAM Adapter, termed Breast Ultrasound Segment Anything Model (BUSSAM), which migrates the SAM to the field of breast ultrasound image segmentation by using the adapter technique. To be specific, we first design a novel CNN image encoder, which is fully trained on the BUS dataset. Our CNN image encoder is more lightweight, and focuses more on features of local receptive field, which provides the complementary information to the ViT branch in SAM. Then, we design a novel Cross-Branch Adapter to allow the CNN image encoder to fully interact with the ViT image encoder in SAM module. Finally, we add both of the Position Adapter and the Feature Adapter to the ViT branch to fine-tune the original SAM. The experimental results on AMUBUS and BUSI datasets demonstrate that our proposed model outperforms other medical image segmentation models significantly. Our code will be available at: https://github.com/bscs12/BUSSAM. △ Less

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2403.09157 [pdf, ps, other]

VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation

Authors: Mingya Zhang, Yue Yu, Limei Gu, Tingsheng Lin, Xianping Tao

Abstract: In the field of medical image segmentation, models based on both CNN and Transformer have been thoroughly investigated. However, CNNs have limited modeling capabilities for long-range dependencies, making it challenging to exploit the semantic information within images fully. On the other hand, the quadratic computational complexity poses a challenge for Transformers. Recently, State Space Models… ▽ More In the field of medical image segmentation, models based on both CNN and Transformer have been thoroughly investigated. However, CNNs have limited modeling capabilities for long-range dependencies, making it challenging to exploit the semantic information within images fully. On the other hand, the quadratic computational complexity poses a challenge for Transformers. Recently, State Space Models (SSMs), such as Mamba, have been recognized as a promising method. They not only demonstrate superior performance in modeling long-range interactions, but also preserve a linear computational complexity. Inspired by the Mamba architecture, We proposed Vison Mamba-UNetV2, the Visual State Space (VSS) Block is introduced to capture extensive contextual information, the Semantics and Detail Infusion (SDI) is introduced to augment the infusion of low-level and high-level features. We conduct comprehensive experiments on the ISIC17, ISIC18, CVC-300, CVC-ClinicDB, Kvasir, CVC-ColonDB and ETIS-LaribPolypDB public datasets. The results indicate that VM-UNetV2 exhibits competitive performance in medical image segmentation tasks. Our code is available at https://github.com/nobodyplayer1/VM-UNetV2. △ Less

Submitted 14 March, 2024; originally announced March 2024.

Comments: 12 pages, 4 figures

arXiv:2312.09529 [pdf, other]

Can Physician Judgment Enhance Model Trustworthiness? A Case Study on Predicting Pathological Lymph Nodes in Rectal Cancer

Authors: Kazuma Kobayashi, Yasuyuki Takamizawa, Mototaka Miyake, Sono Ito, Lin Gu, Tatsuya Nakatsuka, Yu Akagi, Tatsuya Harada, Yukihide Kanemitsu, Ryuji Hamamoto

Abstract: Explainability is key to enhancing artificial intelligence's trustworthiness in medicine. However, several issues remain concerning the actual benefit of explainable models for clinical decision-making. Firstly, there is a lack of consensus on an evaluation framework for quantitatively assessing the practical benefits that effective explainability should provide to practitioners. Secondly, physici… ▽ More Explainability is key to enhancing artificial intelligence's trustworthiness in medicine. However, several issues remain concerning the actual benefit of explainable models for clinical decision-making. Firstly, there is a lack of consensus on an evaluation framework for quantitatively assessing the practical benefits that effective explainability should provide to practitioners. Secondly, physician-centered evaluations of explainability are limited. Thirdly, the utility of built-in attention mechanisms in transformer-based models as an explainability technique is unclear. We hypothesize that superior attention maps should align with the information that physicians focus on, potentially reducing prediction uncertainty and increasing model reliability. We employed a multimodal transformer to predict lymph node metastasis in rectal cancer using clinical data and magnetic resonance imaging, exploring how well attention maps, visualized through a state-of-the-art technique, can achieve agreement with physician understanding. We estimated the model's uncertainty using meta-level information like prediction probability variance and quantified agreement. Our assessment of whether this agreement reduces uncertainty found no significant effect. In conclusion, this case study did not confirm the anticipated benefit of attention maps in enhancing model reliability. Superficial explanations could do more harm than good by misleading physicians into relying on uncertain predictions, suggesting that the current state of attention mechanisms in explainability should not be overestimated. Identifying explainability mechanisms truly beneficial for clinical decision-making remains essential. △ Less

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2309.03906 [pdf, other]

A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal Multi-Organ Segmentation

Authors: Ziyan Huang, Zhongying Deng, Jin Ye, Haoyu Wang, Yanzhou Su, Tianbin Li, Hui Sun, Junlong Cheng, Jianpin Chen, Junjun He, Yun Gu, Shaoting Zhang, Lixu Gu, Yu Qiao

Abstract: Although deep learning have revolutionized abdominal multi-organ segmentation, models often struggle with generalization due to training on small, specific datasets. With the recent emergence of large-scale datasets, some important questions arise: \textbf{Can models trained on these datasets generalize well on different ones? If yes/no, how to further improve their generalizability?} To address t… ▽ More Although deep learning have revolutionized abdominal multi-organ segmentation, models often struggle with generalization due to training on small, specific datasets. With the recent emergence of large-scale datasets, some important questions arise: \textbf{Can models trained on these datasets generalize well on different ones? If yes/no, how to further improve their generalizability?} To address these questions, we introduce A-Eval, a benchmark for the cross-dataset Evaluation ('Eval') of Abdominal ('A') multi-organ segmentation. We employ training sets from four large-scale public datasets: FLARE22, AMOS, WORD, and TotalSegmentator, each providing extensive labels for abdominal multi-organ segmentation. For evaluation, we incorporate the validation sets from these datasets along with the training set from the BTCV dataset, forming a robust benchmark comprising five distinct datasets. We evaluate the generalizability of various models using the A-Eval benchmark, with a focus on diverse data usage scenarios: training on individual datasets independently, utilizing unlabeled data via pseudo-labeling, mixing different modalities, and joint training across all available datasets. Additionally, we explore the impact of model sizes on cross-dataset generalizability. Through these analyses, we underline the importance of effective data usage in enhancing models' generalization capabilities, offering valuable insights for assembling large-scale datasets and improving training strategies. The code and pre-trained models are available at \href{https://github.com/uni-medical/A-Eval}{https://github.com/uni-medical/A-Eval}. △ Less

Submitted 7 September, 2023; originally announced September 2023.

arXiv:2309.03331 [pdf, other]

Expert Uncertainty and Severity Aware Chest X-Ray Classification by Multi-Relationship Graph Learning

Authors: Mengliang Zhang, Xinyue Hu, Lin Gu, Liangchen Liu, Kazuma Kobayashi, Tatsuya Harada, Ronald M. Summers, Yingying Zhu

Abstract: Patients undergoing chest X-rays (CXR) often endure multiple lung diseases. When evaluating a patient's condition, due to the complex pathologies, subtle texture changes of different lung lesions in images, and patient condition differences, radiologists may make uncertain even when they have experienced long-term clinical training and professional guidance, which makes much noise in extracting di… ▽ More Patients undergoing chest X-rays (CXR) often endure multiple lung diseases. When evaluating a patient's condition, due to the complex pathologies, subtle texture changes of different lung lesions in images, and patient condition differences, radiologists may make uncertain even when they have experienced long-term clinical training and professional guidance, which makes much noise in extracting disease labels based on CXR reports. In this paper, we re-extract disease labels from CXR reports to make them more realistic by considering disease severity and uncertainty in classification. Our contributions are as follows: 1. We re-extracted the disease labels with severity and uncertainty by a rule-based approach with keywords discussed with clinical experts. 2. To further improve the explainability of chest X-ray diagnosis, we designed a multi-relationship graph learning method with an expert uncertainty-aware loss function. 3. Our multi-relationship graph learning method can also interpret the disease classification results. Our experimental results show that models considering disease severity and uncertainty outperform previous state-of-the-art methods. △ Less

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2308.01771 [pdf]

Deep Learning-based Prediction of Stress and Strain Maps in Arterial Walls for Improved Cardiovascular Risk Assessment

Authors: Yasin Shokrollahi1, Pengfei Dong1, Xianqi Li, Linxia Gu

Abstract: This study investigated the potential of end-to-end deep learning tools as a more effective substitute for FEM in predicting stress-strain fields within 2D cross sections of arterial wall. We first proposed a U-Net based fully convolutional neural network (CNN) to predict the von Mises stress and strain distribution based on the spatial arrangement of calcification within arterial wall cross-secti… ▽ More This study investigated the potential of end-to-end deep learning tools as a more effective substitute for FEM in predicting stress-strain fields within 2D cross sections of arterial wall. We first proposed a U-Net based fully convolutional neural network (CNN) to predict the von Mises stress and strain distribution based on the spatial arrangement of calcification within arterial wall cross-sections. Further, we developed a conditional generative adversarial network (cGAN) to enhance, particularly from the perceptual perspective, the prediction accuracy of stress and strain field maps for arterial walls with various calcification quantities and spatial configurations. On top of U-Net and cGAN, we also proposed their ensemble approaches, respectively, to further improve the prediction accuracy of field maps. Our dataset, consisting of input and output images, was generated by implementing boundary conditions and extracting stress-strain field maps. The trained U-Net models can accurately predict von Mises stress and strain fields, with structural similarity index scores (SSIM) of 0.854 and 0.830 and mean squared errors of 0.017 and 0.018 for stress and strain, respectively, on a reserved test set. Meanwhile, the cGAN models in a combination of ensemble and transfer learning techniques demonstrate high accuracy in predicting von Mises stress and strain fields, as evidenced by SSIM scores of 0.890 for stress and 0.803 for strain. Additionally, mean squared errors of 0.008 for stress and 0.017 for strain further support the model's performance on a designated test set. Overall, this study developed a surrogate model for finite element analysis, which can accurately and efficiently predict stress-strain fields of arterial walls regardless of complex geometries and boundary conditions. △ Less

Submitted 3 August, 2023; originally announced August 2023.

arXiv:2306.10198 [pdf]

doi 10.1109/TPEL.2023.3290590

Output Voltage Response Improvement and Ripple Reduction Control for Input-parallel Output-parallel High-Power DC Supply

Authors: Jianhui Meng, Xiaolong Wu, Tairan Ye, Jingsen Yu, Likang Gu, Zili Zhang, Yang Li

Abstract: A three-phase isolated AC-DC-DC power supply is widely used in the industrial field due to its attractive features such as high-power density, modularity for easy expansion and electrical isolation. In high-power application scenarios, it can be realized by multiple AC-DC-DC modules with Input-Parallel Output-Parallel (IPOP) mode. However, it has the problems of slow output voltage response and la… ▽ More A three-phase isolated AC-DC-DC power supply is widely used in the industrial field due to its attractive features such as high-power density, modularity for easy expansion and electrical isolation. In high-power application scenarios, it can be realized by multiple AC-DC-DC modules with Input-Parallel Output-Parallel (IPOP) mode. However, it has the problems of slow output voltage response and large ripple in some special applications, such as electrophoresis and electroplating. This paper investigates an improved Adaptive Linear Active Disturbance Rejection Control (A-LADRC) with flexible adjustment capability of the bandwidth parameter value for the high-power DC supply to improve the output voltage response speed. To reduce the DC supply ripple, a control strategy is designed for a single module to adaptively adjust the duty cycle compensation according to the output feedback value. When multiple modules are connected in parallel, a Hierarchical Delay Current Sharing Control (HDCSC) strategy for centralized controllers is proposed to make the peaks and valleys of different modules offset each other. Finally, the proposed method is verified by designing a 42V/12000A high-power DC supply, and the results demonstrate that the proposed method is effective in improving the system output voltage response speed and reducing the voltage ripple, which has significant practical engineering application value. △ Less

Submitted 16 June, 2023; originally announced June 2023.

Comments: Accepted by IEEE Transactions on Power Electronics

Journal ref: IEEE Transactions on Power Electronics 38 (2023) 11102-11112

arXiv:2205.10354 [pdf]

Prediction of stent under-expansion in calcified coronary arteries using machine-learning on intravascular optical coherence tomography

Authors: Yazan Gharaibeh, Juhwan Lee, Vladislav N. Zimin, Chaitanya Kolluru, Luis A. P. Dallan, Gabriel T. R. Pereira, Armando Vergara-Martel, Justin N. Kim, Ammar Hoori, Pengfei Dong, Peshala T. Gamage, Linxia Gu, Hiram G. Bezerra, Sadeer Al-Kindi, David L. Wilson

Abstract: BACKGROUND Careful evaluation of the risk of stent under-expansions before the intervention will aid treatment planning, including the application of a pre-stent plaque modification strategy. OBJECTIVES It remains challenging to achieve a proper stent expansion in the presence of severely calcified coronary lesions. Building on our work in deep learning segmentation, we created an automated mach… ▽ More BACKGROUND Careful evaluation of the risk of stent under-expansions before the intervention will aid treatment planning, including the application of a pre-stent plaque modification strategy. OBJECTIVES It remains challenging to achieve a proper stent expansion in the presence of severely calcified coronary lesions. Building on our work in deep learning segmentation, we created an automated machine learning approach that uses lesion attributes to predict stent under-expansion from pre-stent images, suggesting the need for plaque modification. METHODS Pre- and post-stent intravascular optical coherence tomography image data were obtained from 110 coronary lesions. Lumen and calcifications in pre-stent images were segmented using deep learning, and numerous features per lesion were extracted. We analyzed stent expansion along the lesion, enabling frame, segmental, and whole-lesion analyses. We trained regression models to predict the poststent lumen area and then to compute the stent expansion index (SEI). Stents with an SEI < or >/= 80% were classified as "under-expanded" and "well-expanded," respectively. RESULTS Best performance (root-mean-square-error = 0.04+/-0.02 mm2, r = 0.94+/-0.04, p < 0.0001) was achieved when we used features from both the lumen and calcification to train a Gaussian regression model for a segmental analysis over a segment length of 31 frames. Under-expansion classification results (AUC=0.85+/-0.02) were significantly improved over other approaches. CONCLUSIONS We used calcifications and lumen features to identify lesions at risk of stent under-expansion. Results suggest that the use of pre-stent images can inform physicians of the need to apply plaque modification approaches. △ Less

Submitted 16 May, 2022; originally announced May 2022.

Comments: 25 pages, 7 figures, 1 table, 6 supplemental figures, 3 supplemental tables

arXiv:2201.02314 [pdf, other]

RestoreDet: Degradation Equivariant Representation for Object Detection in Low Resolution Images

Authors: Ziteng Cui, Yingying Zhu, Lin Gu, Guo-Jun Qi, Xiaoxiao Li, Peng Gao, Zenghui Zhang, Tatsuya Harada

Abstract: Image restoration algorithms such as super resolution (SR) are indispensable pre-processing modules for object detection in degraded images. However, most of these algorithms assume the degradation is fixed and known a priori. When the real degradation is unknown or differs from assumption, both the pre-processing module and the consequent high-level task such as object detection would fail. Here,… ▽ More Image restoration algorithms such as super resolution (SR) are indispensable pre-processing modules for object detection in degraded images. However, most of these algorithms assume the degradation is fixed and known a priori. When the real degradation is unknown or differs from assumption, both the pre-processing module and the consequent high-level task such as object detection would fail. Here, we propose a novel framework, RestoreDet, to detect objects in degraded low resolution images. RestoreDet utilizes the downsampling degradation as a kind of transformation for self-supervised signals to explore the equivariant representation against various resolutions and other degradation conditions. Specifically, we learn this intrinsic visual structure by encoding and decoding the degradation transformation from a pair of original and randomly degraded images. The framework could further take the advantage of advanced SR architectures with an arbitrary resolution restoring decoder to reconstruct the original correspondence from the degraded input image. Both the representation learning and object detection are optimized jointly in an end-to-end training fashion. RestoreDet is a generic framework that could be implemented on any mainstream object detection architectures. The extensive experiment shows that our framework based on CenterNet has achieved superior performance compared with existing methods when facing variant degradation situations. Our code would be released soon. △ Less

Submitted 6 January, 2022; originally announced January 2022.

Comments: 11 pages, 3figures

arXiv:2112.01034 [pdf, other]

Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data

Authors: Yifei Huang, Xiaoxiao Li, Lijin Yang, Lin Gu, Yingying Zhu, Hirofumi Seo, Qiuming Meng, Tatsuya Harada, Yoichi Sato

Abstract: The human gaze is a cost-efficient physiological data that reveals human underlying attentional patterns. The selective attention mechanism helps the cognition system focus on task-relevant visual clues by ignoring the presence of distractors. Thanks to this ability, human beings can efficiently learn from a very limited number of training samples. Inspired by this mechanism, we aim to leverage ga… ▽ More The human gaze is a cost-efficient physiological data that reveals human underlying attentional patterns. The selective attention mechanism helps the cognition system focus on task-relevant visual clues by ignoring the presence of distractors. Thanks to this ability, human beings can efficiently learn from a very limited number of training samples. Inspired by this mechanism, we aim to leverage gaze for medical image analysis tasks with small training data. Our proposed framework includes a backbone encoder and a Selective Attention Network (SAN) that simulates the underlying attention. The SAN implicitly encodes information such as suspicious regions that is relevant to the medical diagnose tasks by estimating the actual human gaze. Then we design a novel Auxiliary Attention Block (AAB) to allow information from SAN to be utilized by the backbone encoder to focus on selective areas. Specifically, this block uses a modified version of a multi-head attention layer to simulate the human visual search procedure. Note that the SAN and AAB can be plugged into different backbones, and the framework can be used for multiple medical image analysis tasks when equipped with task-specific heads. Our method is demonstrated to achieve superior performance on both 3D tumor segmentation and 2D chest X-ray classification tasks. We also show that the estimated gaze probability map of the SAN is consistent with an actual gaze fixation map obtained by board-certified doctors. △ Less

Submitted 2 December, 2021; originally announced December 2021.

Comments: BMVC 2021

arXiv:2109.12634 [pdf, other]

A Novel Hybrid Convolutional Neural Network for Accurate Organ Segmentation in 3D Head and Neck CT Images

Authors: Zijie Chen, Cheng Li, Junjun He, Jin Ye, Diping Song, Shanshan Wang, Lixu Gu, Yu Qiao

Abstract: Radiation therapy (RT) is widely employed in the clinic for the treatment of head and neck (HaN) cancers. An essential step of RT planning is the accurate segmentation of various organs-at-risks (OARs) in HaN CT images. Nevertheless, segmenting OARs manually is time-consuming, tedious, and error-prone considering that typical HaN CT images contain tens to hundreds of slices. Automated segmentation… ▽ More Radiation therapy (RT) is widely employed in the clinic for the treatment of head and neck (HaN) cancers. An essential step of RT planning is the accurate segmentation of various organs-at-risks (OARs) in HaN CT images. Nevertheless, segmenting OARs manually is time-consuming, tedious, and error-prone considering that typical HaN CT images contain tens to hundreds of slices. Automated segmentation algorithms are urgently required. Recently, convolutional neural networks (CNNs) have been extensively investigated on this task. Particularly, 3D CNNs are frequently adopted to process 3D HaN CT images. There are two issues with naïve 3D CNNs. First, the depth resolution of 3D CT images is usually several times lower than the in-plane resolution. Direct employment of 3D CNNs without distinguishing this difference can lead to the extraction of distorted image features and influence the final segmentation performance. Second, a severe class imbalance problem exists, and large organs can be orders of times larger than small organs. It is difficult to simultaneously achieve accurate segmentation for all the organs. To address these issues, we propose a novel hybrid CNN that fuses 2D and 3D convolutions to combat the different spatial resolutions and extract effective edge and semantic features from 3D HaN CT images. To accommodate large and small organs, our final model, named OrganNet2.5D, consists of only two instead of the classic four downsampling operations, and hybrid dilated convolutions are introduced to maintain the respective field. Experiments on the MICCAI 2015 challenge dataset demonstrate that OrganNet2.5D achieves promising performance compared to state-of-the-art methods. △ Less

Submitted 26 September, 2021; originally announced September 2021.

Comments: 10 pages, 2 figures

arXiv:2109.12629 [pdf, ps, other]

Group Shift Pointwise Convolution for Volumetric Medical Image Segmentation

Authors: Junjun He, Jin Ye, Cheng Li, Diping Song, Wanli Chen, Shanshan Wang, Lixu Gu, Yu Qiao

Abstract: Recent studies have witnessed the effectiveness of 3D convolutions on segmenting volumetric medical images. Compared with the 2D counterparts, 3D convolutions can capture the spatial context in three dimensions. Nevertheless, models employing 3D convolutions introduce more trainable parameters and are more computationally complex, which may lead easily to model overfitting especially for medical a… ▽ More Recent studies have witnessed the effectiveness of 3D convolutions on segmenting volumetric medical images. Compared with the 2D counterparts, 3D convolutions can capture the spatial context in three dimensions. Nevertheless, models employing 3D convolutions introduce more trainable parameters and are more computationally complex, which may lead easily to model overfitting especially for medical applications with limited available training data. This paper aims to improve the effectiveness and efficiency of 3D convolutions by introducing a novel Group Shift Pointwise Convolution (GSP-Conv). GSP-Conv simplifies 3D convolutions into pointwise ones with 1x1x1 kernels, which dramatically reduces the number of model parameters and FLOPs (e.g. 27x fewer than 3D convolutions with 3x3x3 kernels). Naïve pointwise convolutions with limited receptive fields cannot make full use of the spatial image context. To address this problem, we propose a parameter-free operation, Group Shift (GS), which shifts the feature maps along with different spatial directions in an elegant way. With GS, pointwise convolutions can access features from different spatial locations, and the limited receptive fields of pointwise convolutions can be compensated. We evaluate the proposed methods on two datasets, PROMISE12 and BraTS18. Results show that our method, with substantially decreased model complexity, achieves comparable or even better performance than models employing 3D convolutions. △ Less

Submitted 26 September, 2021; originally announced September 2021.

Comments: 10 pages, 2 figures

arXiv:2109.05208 [pdf, other]

Autonomous Underwater Vehicle-Manipulator Systems Path Planning with RRTAUVMS Algorithm

Authors: Xiaoxu Cao, Linyi Gu, JunChen Mu, Qian Zhang, Qi Song, Chunxiao Liu, Cong Qiu

Abstract: Autonomous Underwater Vehicle-Manipulator systems (AUVMS) is a new tool for ocean exploration, the AUVMS path planning problem is addressed in this paper. AUVMS is a high dimension system with a large difference in inertia distribution, also it works in a complex environment with obstacles. By integrating the rapidly-exploring random tree(RRT) algorithm with the AUVMS kinematics model, the propose… ▽ More Autonomous Underwater Vehicle-Manipulator systems (AUVMS) is a new tool for ocean exploration, the AUVMS path planning problem is addressed in this paper. AUVMS is a high dimension system with a large difference in inertia distribution, also it works in a complex environment with obstacles. By integrating the rapidly-exploring random tree(RRT) algorithm with the AUVMS kinematics model, the proposed RRTAUVMS algorithm could randomly sample in the configuration space(C-Space), and also grow the tree directly towards the workspace goal in the task space. The RRTAUVMS can also deal with the redundant mapping of workspace planning goal and configuration space goal. Compared with the traditional RRT algorithm, the efficiency of the AUVMS path planning can be significantly improved. △ Less

Submitted 11 September, 2021; originally announced September 2021.

arXiv:2107.00296 [pdf, other]

doi 10.1109/JBHI.2021.3110593

Explainable Diabetic Retinopathy Detection and Retinal Image Generation

Authors: Yuhao Niu, Lin Gu, Yitian Zhao, Feng Lu

Abstract: Though deep learning has shown successful performance in classifying the label and severity stage of certain diseases, most of them give few explanations on how to make predictions. Inspired by Koch's Postulates, the foundation in evidence-based medicine (EBM) to identify the pathogen, we propose to exploit the interpretability of deep learning application in medical diagnosis. By determining and… ▽ More Though deep learning has shown successful performance in classifying the label and severity stage of certain diseases, most of them give few explanations on how to make predictions. Inspired by Koch's Postulates, the foundation in evidence-based medicine (EBM) to identify the pathogen, we propose to exploit the interpretability of deep learning application in medical diagnosis. By determining and isolating the neuron activation patterns on which diabetic retinopathy (DR) detector relies to make decisions, we demonstrate the direct relation between the isolated neuron activation and lesions for a pathological explanation. To be specific, we first define novel pathological descriptors using activated neurons of the DR detector to encode both spatial and appearance information of lesions. Then, to visualize the symptom encoded in the descriptor, we propose Patho-GAN, a new network to synthesize medically plausible retinal images. By manipulating these descriptors, we could even arbitrarily control the position, quantity, and categories of generated lesions. We also show that our synthesized images carry the symptoms directly related to diabetic retinopathy diagnosis. Our generated images are both qualitatively and quantitatively superior to the ones by previous methods. Besides, compared to existing methods that take hours to generate an image, our second level speed endows the potential to be an effective solution for data augmentation. △ Less

Submitted 1 July, 2021; originally announced July 2021.

Comments: Code is available at https://github.com/zzdyyy/Patho-GAN

arXiv:2105.02674 [pdf, other]

SS-CADA: A Semi-Supervised Cross-Anatomy Domain Adaptation for Coronary Artery Segmentation

Authors: Jingyang Zhang, Ran Gu, Guotai Wang, Hongzhi Xie, Lixu Gu

Abstract: The segmentation of coronary arteries by convolutional neural network is promising yet requires a large amount of labor-intensive manual annotations. Transferring knowledge from retinal vessels in widely-available public labeled fundus images (FIs) has a potential to reduce the annotation requirement for coronary artery segmentation in X-ray angiograms (XAs) due to their common tubular structures.… ▽ More The segmentation of coronary arteries by convolutional neural network is promising yet requires a large amount of labor-intensive manual annotations. Transferring knowledge from retinal vessels in widely-available public labeled fundus images (FIs) has a potential to reduce the annotation requirement for coronary artery segmentation in X-ray angiograms (XAs) due to their common tubular structures. However, it is challenged by the cross-anatomy domain shift due to the intrinsically different vesselness characteristics in different anatomical regions under even different imaging protocols. To solve this problem, we propose a Semi-Supervised Cross-Anatomy Domain Adaptation (SS-CADきゃどA) which requires only limited annotations for coronary arteries in XAs. With the supervision from a small number of labeled XAs and publicly available labeled FIs, we propose a vesselness-specific batch normalization (VSBN) to individually normalize feature maps for them considering their different cross-anatomic vesselness characteristics. In addition, to further facilitate the annotation efficiency, we employ a self-ensembling mean-teacher (SEMT) to exploit abundant unlabeled XAs by imposing a prediction consistency constraint. Extensive experiments show that our SS-CADA is able to solve the challenging cross-anatomy domain shift, achieving accurate segmentation for coronary arteries given only a small number of labeled XAs. △ Less

Submitted 6 May, 2021; originally announced May 2021.

arXiv:2103.16903 [pdf, ps, other]

UAV Deployment Optimization for Secure Precise Wireless Transmission

Authors: Tong Shen, Guiyang Xia, Jingjing Ye, Lichuan Gu, Xiaobo Zhou, Feng Shu

Abstract: This paper develops an unmanned aerial vehicle (UAV) deployment scheme in the context of the directional modulation-based secure precise wireless transmissions (SPWTs), where the optimal UAV position for the SPWT is derived to maximum the secrecy rate (SR) without injecting any artificial noise (AN) signaling. To be specific, the proposed scheme reveals that the optimal position of UAV for maximiz… ▽ More This paper develops an unmanned aerial vehicle (UAV) deployment scheme in the context of the directional modulation-based secure precise wireless transmissions (SPWTs), where the optimal UAV position for the SPWT is derived to maximum the secrecy rate (SR) without injecting any artificial noise (AN) signaling. To be specific, the proposed scheme reveals that the optimal position of UAV for maximizing the SR performance has to be placed at the null space of Eves channel, which impels the received energy of the confidential message at the unintended receiver deteriorating to zero whilst benefits the one at the intended receiver achieving its maximum value. Finally, simulation results verify the optimality of our proposed scheme in terms of the achievable SR performance. △ Less

Submitted 28 February, 2023; v1 submitted 31 March, 2021; originally announced March 2021.

arXiv:1910.09858 [pdf]

doi 10.1016/j.neucom.2019.10.054

Fixed Pattern Noise Reduction for Infrared Images Based on Cascade Residual Attention CNN

Authors: Juntao Guan, Rui Lai, Ai Xiong, Zesheng Liu, Lin Gu

Abstract: Existing fixed pattern noise reduction (FPNR) methods are easily affected by the motion state of the scene and working condition of the image sensor, which leads to over smooth effects, ghosting artifacts as well as slow convergence rate. To address these issues, we design an innovative cascade convolution neural network (CNN) model with residual skip connections to realize single frame blind FPNR… ▽ More Existing fixed pattern noise reduction (FPNR) methods are easily affected by the motion state of the scene and working condition of the image sensor, which leads to over smooth effects, ghosting artifacts as well as slow convergence rate. To address these issues, we design an innovative cascade convolution neural network (CNN) model with residual skip connections to realize single frame blind FPNR operation without any parameter tuning. Moreover, a coarse-fine convolution (CF-Conv) unit is introduced to extract complementary features in various scales and fuse them to pick more spatial information. Inspired by the success of the visual attention mechanism, we further propose a particular spatial-channel noise attention unit (SCNAU) to separate the scene details from fixed pattern noise more thoroughly and recover the real scene more accurately. Experimental results on test data demonstrate that the proposed cascade CNN-FPNR method outperforms the existing FPNR methods in both of visual effect and quantitative assessment. △ Less

Submitted 22 October, 2019; originally announced October 2019.

arXiv:1910.01262 [pdf, other]

Quantum tensor singular value decomposition with applications to recommendation systems

Authors: Xiaoqiang Wang, Lejia Gu, Joseph Heung-wing Joseph Lee, Guofeng Zhang

Abstract: In this paper, we present a quantum singular value decomposition algorithm for third-order tensors inspired by the classical algorithm of tensor singular value decomposition (t-svd) and then extend it to order-$p$ tensors. It can be proved that the quantum version of the t-svd for a third-order tensor $\mathcal{A} \in \mathbb{R}^{N\times N \times N}$ achieves the complexity of… ▽ More In this paper, we present a quantum singular value decomposition algorithm for third-order tensors inspired by the classical algorithm of tensor singular value decomposition (t-svd) and then extend it to order-$p$ tensors. It can be proved that the quantum version of the t-svd for a third-order tensor $\mathcal{A} \in \mathbb{R}^{N\times N \times N}$ achieves the complexity of $\mathcal{O}(N{\rm polylog}(N))$, an exponential speedup compared with its classical counterpart. As an application, we propose a quantum algorithm for recommendation systems which incorporates the contextual situation of users to the personalized recommendation. We provide recommendations varying with contexts by measuring the output quantum state corresponding to an approximation of this user's preferences. This algorithm runs in expected time $\mathcal{O}(N{\rm polylog}(N){\rm poly}(k)),$ if every frontal slice of the preference tensor has a good rank-$k$ approximation. At last, we provide a quantum algorithm for tensor completion based on a different truncation method which is tested to have a good performance in dynamic video completion. △ Less

Submitted 3 February, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

Comments: 29 pages, 5 figure. Comments are welcome!

arXiv:1907.10456 [pdf, other]

doi 10.1016/j.patcog.2020.107332

Understanding Adversarial Attacks on Deep Learning Based Medical Image Analysis Systems

Authors: Xingjun Ma, Yuhao Niu, Lin Gu, Yisen Wang, Yitian Zhao, James Bailey, Feng Lu

Abstract: Deep neural networks (DNNs) have become popular for medical image analysis tasks like cancer diagnosis and lesion detection. However, a recent study demonstrates that medical deep learning systems can be compromised by carefully-engineered adversarial examples/attacks with small imperceptible perturbations. This raises safety concerns about the deployment of these systems in clinical settings. In… ▽ More Deep neural networks (DNNs) have become popular for medical image analysis tasks like cancer diagnosis and lesion detection. However, a recent study demonstrates that medical deep learning systems can be compromised by carefully-engineered adversarial examples/attacks with small imperceptible perturbations. This raises safety concerns about the deployment of these systems in clinical settings. In this paper, we provide a deeper understanding of adversarial examples in the context of medical images. We find that medical DNN models can be more vulnerable to adversarial attacks compared to models for natural images, according to two different viewpoints. Surprisingly, we also find that medical adversarial attacks can be easily detected, i.e., simple detectors can achieve over 98% detection AUえーゆーC against state-of-the-art attacks, due to fundamental feature differences compared to normal examples. We believe these findings may be a useful basis to approach the design of more explainable and secure medical deep learning systems. △ Less

Submitted 13 March, 2020; v1 submitted 24 July, 2019; originally announced July 2019.

Comments: 15 pages, 11 figures, to appear in Pattern Recognition

ACM Class: I.4.9; I.5.4

Showing 1–20 of 20 results for author: Gu, L