Search | arXiv e-print repository

Deep Multimodal Collaborative Learning for Polyp Re-Identification

Authors: Suncheng Xiang, Jincheng Li, Zhengjie Zhang, Shilun Cai, Jiale Guan, Dahong Qian

Abstract: Colonoscopic Polyp Re-Identification aims to match the same polyp from a large gallery with images from different views taken using different cameras and plays an important role in the prevention and treatment of colorectal cancer in computer-aided diagnosis. However, traditional methods for object ReID directly adopting CNN models trained on the ImageNet dataset usually produce unsatisfactory retrieval performance on colonoscopic datasets due to the large domain gap. Worse, these solutions typically learn unimodal representations on the basis of visual samples, which fails to explore complementary information from different modalities. To address this challenge, we propose a novel Deep Multimodal Collaborative Learning framework named DMCL for polyp re-identification, which can effectively encourage modality collaboration and reinforce generalization capability in medical scenarios. On this basis, a dynamic multimodal feature fusion strategy is introduced to leverage the optimized multimodal representations for multimodal fusion via end-to-end training. Experiments on the standard benchmarks show the benefits of the multimodal setting over state-of-the-art unimodal ReID models, especially when combined with the specialized multimodal fusion strategy.

Submitted 12 August, 2024; originally announced August 2024.

Comments: Work in progress. arXiv admin note: text overlap with arXiv:2307.10625

arXiv:2407.19861 [pdf, other]

Phase transitions in rolling of irregular cylinders and spheres

Authors: Daoyuan Qian, Yeonsu Jung, L. Mahadevan

Abstract: When placed on an inclined plane, a perfect 2D disk or 3D sphere simply rolls down in a straight line under gravity. But how is the rolling affected if these shapes are irregular or random? Treating the terminal rolling speed as an order parameter, we show that phase transitions arise as a function of the dimension of the state space and inertia. We calculate the scaling exponents and the macroscopic lag time associated with the presence of first and second order transitions, and describe the regimes of co-existence of stable states and the accompanying hysteresis. Experiments with rolling cylinders corroborate our theoretical results on the scaling of the lag time. Experiments with spheres reveal closed orbits and their period-doubling in the overdamped and inertial limits respectively, providing visible manifestations of the hairy ball theorem and the doubly-connected nature of SO(3), the space of 3-dimensional rotations. Going beyond simple curiosity, our study might be relevant in a number of natural and artificial systems that involve the rolling of irregular objects, in systems ranging from nanoscale cellular transport to robotics.

Submitted 29 July, 2024; originally announced July 2024.

arXiv:2407.19753 [pdf, other]

PredIN: Towards Open-Set Gesture Recognition via Prediction Inconsistency

Authors: Chen Liu, Can Han, Chengfeng Zhou, Crystal Cai, Dahong Qian

Abstract: Gesture recognition based on surface electromyography (sEMG) has achieved significant progress in human-machine interaction (HMI). However, accurately recognizing predefined gestures within a closed set is still inadequate in practice; a robust open-set system needs to effectively reject unknown gestures while correctly classifying known ones. To handle this challenge, we first report prediction inconsistency discovered for unknown classes due to ensemble diversity, which can significantly facilitate the detection of unknown classes. Based on this insight, we propose an ensemble learning approach, PredIN, to explicitly magnify the prediction inconsistency by enhancing ensemble diversity. Specifically, PredIN maximizes the class feature distribution inconsistency among ensemble members to enhance diversity. Meanwhile, it optimizes inter-class separability within an individual ensemble member to maintain individual performance. Comprehensive experiments on various benchmark datasets demonstrate that PredIN outperforms state-of-the-art methods by a clear margin. Our proposed method simultaneously achieves accurate closed-set classification for predefined gestures and effective rejection for unknown gestures, exhibiting its efficacy and superiority in open-set gesture recognition based on sEMG.

Submitted 29 July, 2024; originally announced July 2024.

Comments: Under review

arXiv:2407.13246 [pdf, other]

STS MICCAI 2023 Challenge: Grand challenge on 2D and 3D semi-supervised tooth segmentation

Authors: Yaqi Wang, Yifan Zhang, Xiaodiao Chen, Shuai Wang, Dahong Qian, Fan Ye, Feng Xu, Hongyuan Zhang, Qianni Zhang, Chengyu Wu, Yunxiang Li, Weiwei Cui, Shan Luo, Chengkai Wang, Tianhao Li, Yi Liu, Xiang Feng, Huiyu Zhou, Dongyun Liu, Qixuan Wang, Zhouhao Lin, Wei Song, Yuanlin Li, Bing Wang, Chunshi Wang, et al. (2 additional authors not shown)

Abstract: Computer-aided design (CAD) tools are increasingly popular in modern dental practice, particularly for treatment planning or comprehensive prognosis evaluation. In particular, the 2D panoramic X-ray image efficiently detects invisible caries, impacted teeth and supernumerary teeth in children, while the 3D dental cone beam computed tomography (CBCT) is widely used in orthodontics and endodontics due to its low radiation dose. However, there is no open-access 2D public dataset for children's teeth and no open 3D dental CBCT dataset, which limits the development of automatic algorithms for segmenting teeth and analyzing diseases. The Semi-supervised Teeth Segmentation (STS) Challenge, a pioneering event in tooth segmentation, was held as a part of the MICCAI 2023 ToothFairy Workshop on the Alibaba Tianchi platform. This challenge aims to investigate effective semi-supervised tooth segmentation algorithms to advance the field of dentistry. In this challenge, we provide two modalities including the 2D panoramic X-ray images and the 3D CBCT tooth volumes. In Task 1, the goal was to segment tooth regions in panoramic X-ray images of both adult and pediatric teeth. Task 2 involved segmenting tooth sections using CBCT volumes. Limited labelled images together with mostly unlabelled ones were provided in this challenge, prompting the use of semi-supervised algorithms for training. In the preliminary round, the challenge received registration and result submission by 434 teams, with 64 advancing to the final round. This paper summarizes the diverse methods employed by the top-ranking teams in the STS MICCAI 2023 Challenge.

Submitted 18 July, 2024; originally announced July 2024.

arXiv:2407.03177 [pdf, other]

EDPNet: An Efficient Dual Prototype Network for Motor Imagery EEG Decoding

Authors: Can Han, Chen Liu, Crystal Cai, Jun Wang, Dahong Qian

Abstract: Motor imagery electroencephalograph (MI-EEG) decoding plays a crucial role in developing motor imagery brain-computer interfaces (MI-BCIs). However, decoding intentions from MI remains challenging due to the inherent complexity of EEG signals relative to the small-sample size. In this paper, we propose an Efficient Dual Prototype Network (EDPNet) to enable accurate and fast MI decoding. EDPNet employs a lightweight adaptive spatial-spectral fusion module, which promotes more efficient information fusion between multiple EEG electrodes. Subsequently, a parameter-free multi-scale variance pooling module extracts more comprehensive temporal features. Furthermore, we introduce dual prototypical learning to optimize the feature space distribution and training process, thereby improving the model's generalization ability on small-sample MI datasets. Our experimental results show that the EDPNet outperforms state-of-the-art models with superior classification accuracy and kappa values (84.11% and 0.7881 for dataset BCI competition IV 2a, 86.65% and 0.7330 for dataset BCI competition IV 2b). Additionally, we use the BCI competition III IVa dataset with fewer training data to further validate the generalization ability of the proposed EDPNet. We also achieve superior performance with 82.03% classification accuracy. Benefiting from the lightweight parameters and superior decoding accuracy, our EDPNet shows great potential for MI-BCI applications. The code is publicly available at https://github.com/hancan16/EDPNet.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2406.14109 [pdf, other]

Protect Measurement-Induced Phase Transition from Noise

Authors: Dongheng Qian, Jing Wang

Abstract: Measurement-induced phase transition (MIPT) is a novel non-equilibrium phase transition characterized by entanglement entropy. The scrambling dynamics induced by random unitary gates can protect information from low-rate measurements. However, common decoherence noises, such as dephasing, are detrimental to the volume law phase, posing a significant challenge for observing MIPT in current noisy intermediate-scale quantum devices. Here, we demonstrate that incorporating quantum-enhanced operations can effectively protect MIPT from environmental noise. The conditional entanglement entropy is associated with a statistical mechanics model wherein noise and quantum-enhanced operations act as two competing external random fields. Then we show that an average apparatus-environment exchange symmetry ensures the conditional entanglement entropy is a valid probe of entanglement. Furthermore, we provide numerical evidence on a (2+1)-d quantum circuit under dephasing noise, demonstrating that MIPT can indeed be observed with the aid of quantum-enhanced operations. This result not only serves as a concrete example of the power of quantum enhancement in combating noise but also holds experimental relevance, as the protocol is straightforward to implement in practice.

Submitted 15 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

Comments: 7+10 pages, 4+7 figures

arXiv:2406.09687 [pdf]

Interplay between topology and correlations in the second moiré band of twisted bilayer MoTe2

Authors: Fan Xu, Xumin Chang, Jiayong Xiao, Yixin Zhang, Feng Liu, Zheng Sun, Ning Mao, Nikolai Peshcherenko, Jiayi Li, Kenji Watanabe, Takashi Taniguchi, Bingbing Tong, Li Lu, Jinfeng Jia, Dong Qian, Zhiwen Shi, Yang Zhang, Xiaoxue Liu, Shengwei Jiang, Tingxin Li

Abstract: Topological flat bands formed in two-dimensional lattice systems offer a unique opportunity to study the fractional phases of matter in the absence of an external magnetic field. Celebrated examples include fractional quantum anomalous Hall (FQAH) effects and fractional topological insulators. Recently, FQAH effects have been experimentally realized in both the twisted bilayer MoTe2 (tMoTe2) system and the rhombohedral stacked multilayer graphene/hBN moiré systems. To date, experimental studies mainly focus on the first moiré flat band, except a very recent work that reported the evidence of the integer and fractional quantum spin Hall effects in higher moiré bands of a 2.1° tMoTe2 device. Here, we present a systematic transport study of approximately 3° tMoTe2 devices, especially for the second moiré band. At ν = -2 and -4, time-reversal-symmetric single and double quantum spin Hall states formed, consistent with the previous observation in the 2.1° tMoTe2 device. On the other hand, we observed ferromagnetism in the second moiré band, and a Chern insulator state driven by out-of-plane magnetic fields at ν = -3. At ν = -2.5, nonmonotonic temperature dependence of resistivity and large out-of-plane negative magnetoresistance have been observed, which likely arise from the frustrated liquid-like ground state and weak effective Ruderman-Kittel-Kasuya-Yosida interactions. Applying an out-of-plane electric field can induce quantum phase transitions at both integer and fractional filling factors. Our studies pave the way for realizing tunable topological states and other unexpected magnetic phases beyond the first moiré flat band based on the twisted MoTe2 platform.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.07925 [pdf, other]

FDLoRA: Personalized Federated Learning of Large Language Model via Dual LoRA Tuning

Authors: Jiaxing QI, Zhongzhi Luan, Shaohan Huang, Carol Fung, Hailong Yang, Depei Qian

Abstract: Large language models (LLMs) have emerged as important components across various fields, yet their training requires substantial computation resources and abundant labeled data. This poses a challenge for robustly training LLMs for individual users (clients). To tackle this challenge, the intuitive idea is to introduce federated learning (FL), which can collaboratively train models on distributed private data. However, existing methods suffer from the challenges of data heterogeneity, system heterogeneity, and model size, resulting in suboptimal performance and high costs. In this work, we propose a variant of the personalized federated learning (PFL) framework, namely FDLoRA, which allows the client to be a single device or a cluster and adopts low-rank adaptation (LoRA) tuning. FDLoRA sets dual LoRA modules on each client to capture personalized and global knowledge, respectively, and only the global LoRA module uploads parameters to the central server to aggregate cross-client knowledge. Finally, an adaptive fusion approach is employed to combine the parameters of the dual LoRAs. This enables FDLoRA to make effective use of private data distributed across different clients, thereby improving performance on the client without incurring high communication and computing costs. We conducted extensive experiments in two practice scenarios. The results demonstrate that FDLoRA outperforms six baselines in terms of performance, stability, robustness, computation cost, and communication cost.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2405.18944 [pdf, other]

Predicting Many Properties of Crystals by a Single Deep Learning Model

Authors: Haosheng Xu, Dongheng Qian, Jing Wang

Abstract: The use of machine learning methods for predicting the properties of crystalline materials encounters significant challenges, primarily related to input encoding, output versatility, and interpretability. Here, we introduce CrystalBERT, an adaptable transformer-based framework with a novel structure that integrates space group, elemental, and unit cell information. The method's adaptability lies not only in its ability to seamlessly combine diverse features but also in its capability to accurately predict a wide range of physically important properties, including topological properties, superconducting transition temperatures, dielectric constants, and more. CrystalBERT also provides insightful physical interpretations regarding the features that most significantly influence the target properties. Our findings indicate that space group and elemental information are more important for predicting topological and superconducting properties, in contrast to some properties that primarily depend on the unit cell information. This underscores the intricate nature of topological and superconducting properties. By incorporating all these features, we achieve a high accuracy of 91% in topological classification, surpassing prior studies and identifying previously misclassified topological materials, further demonstrating the effectiveness of our model.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 7 pages, 4 figures. The codes are available upon reasonable request

arXiv:2405.11845 [pdf, other]

Speed of Random Walks in Dirichlet Environment on a Galton-Watson Tree

Authors: Dongjian Qian, Yang Xiao

Abstract: This paper deals with a transient random walk in Dirichlet environment, or equivalently a linearly edge reinforced random walk, on a Galton-Watson tree. We compute the stationary distribution of the environment seen from the particle of an edge reinforced random walk. We obtain a formula for the speed and give a necessary and sufficient condition for the walk to have a positive speed under some moment conditions on the offspring distribution of the tree.

Submitted 20 May, 2024; originally announced May 2024.

MSC Class: 60K37; 60K35; 60J80; 60F15

arXiv:2404.19584 [pdf, other]

Broadband microwave-rate dark pulse microcombs in dissipation-engineered LiNbO$_3$ microresonators

Authors: Xiaomin Lv, Binbin Nie, Chen Yang, Rui Ma, Ze Wang, Yanwu Liu, Xing Jin, Kaixuan Zhu, Zhenyu Chen, Du Qian, Guanyu Zhang, Guowei Lv, Qihuang Gong, Fang Bo, Qi-Fan Yang

Abstract: Kerr microcombs generated in optical microresonators provide broadband light sources bridging optical and microwave signals. Their translation to thin-film lithium niobate unlocks second-order nonlinear optical interfaces such as electro-optic modulation and frequency doubling for completing comb functionalities. However, the strong Raman response of LiNbO$_3$ has complicated the formation of Kerr microcombs. Until now, dark pulse microcombs, requiring a double balance between Kerr nonlinearity and normal group velocity dispersion as well as gain and loss, have remained elusive in LiNbO$_3$ microresonators. Here, by incorporating dissipation engineering, we demonstrate dark pulse microcombs with 25 GHz repetition frequency and 200 nm span in a high-$Q$ LiNbO$_3$ microresonator. Resonances near the Raman-active wavelengths are strongly damped by controlling phase-matching conditions of a specially designed pulley coupler. The coherence and tunability of the dark pulse microcombs are also investigated. Our work provides a solution to realize high-power microcombs operating at microwave rates on LiNbO$_3$ chips, promising new opportunities for the monolithic integration of applications spanning communication to microwave photonics.

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2404.17837 [pdf, other]

Hybrid 3D Human Pose Estimation with Monocular Video and Sparse IMUs

Authors: Yiming Bao, Xu Zhao, Dahong Qian

Abstract: Temporal 3D human pose estimation from monocular videos is a challenging task in human-centered computer vision due to the depth ambiguity of 2D-to-3D lifting. To improve accuracy and address occlusion issues, inertial sensors have been introduced to provide a complementary source of information. However, it remains challenging to integrate heterogeneous sensor data for producing physically rational 3D human poses. In this paper, we propose a novel framework, Real-time Optimization and Fusion (RTOF), to address this issue. We first incorporate sparse inertial orientations into a parametric human skeleton to refine 3D poses in kinematics. The poses are then optimized by energy functions built on both visual and inertial observations to reduce the temporal jitters. Our framework outputs smooth and biomechanically plausible human motion. Comprehensive experiments with ablation studies demonstrate its rationality and efficiency. On the Total Capture dataset, the pose estimation error is significantly decreased compared to the baseline method.

Submitted 27 April, 2024; originally announced April 2024.

Comments: 10 pages, 5 figures, Under Review

arXiv:2404.10296 [pdf, other]

Engineering software 2.0 by interpolating neural networks: unifying training, solving, and calibration

Authors: Chanwook Park, Sourav Saha, Jiachen Guo, Xiaoyu Xie, Satyajit Mojumder, Miguel A. Bessa, Dong Qian, Wei Chen, Gregory J. Wagner, Jian Cao, Wing Kam Liu

Abstract: The evolution of artificial intelligence (AI) and neural network theories has revolutionized the way software is programmed, shifting from a hard-coded series of codes to a vast neural network. However, this transition in engineering software has faced challenges such as data scarcity, multi-modality of data, low model accuracy, and slow inference. Here, we propose a new network based on interpolation theories and tensor decomposition, the interpolating neural network (INN). Instead of interpolating training data, a common notion in computer science, INN interpolates interpolation points in the physical space whose coordinates and values are trainable. It can also extrapolate if the interpolation points reside outside of the range of training data and the interpolation functions have a larger support domain. INN features orders of magnitude fewer trainable parameters, faster training, a smaller memory footprint, and higher model accuracy compared to feed-forward neural networks (FFNN) or physics-informed neural networks (PINN). INN is poised to usher in Engineering Software 2.0, a unified neural network that spans various domains of space, time, parameters, and initial/boundary conditions. This has previously been computationally prohibitive due to the exponentially growing number of trainable parameters, easily exceeding the parameter size of ChatGPT, which is over 1 trillion. INN addresses this challenge by leveraging tensor decomposition and tensor product, with adaptable network architecture.

Submitted 22 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: 9 pages, 3 figures

arXiv:2404.03226 [pdf, other]

INSPIRIT: Optimizing Heterogeneous Task Scheduling through Adaptive Priority in Task-based Runtime Systems

Authors: Yiqing Wang, Xiaoyan Liu, Hailong Yang, Xinyu Yang, Pengbo Wang, Yi Liu, Zhongzhi Luan, Depei Qian

Abstract: As modern HPC computing platforms become increasingly heterogeneous, it is challenging for programmers to fully leverage the computation power of massive parallelism offered by such heterogeneity. Consequently, task-based runtime systems have been proposed as an intermediate layer to hide the complex heterogeneity from the application programmers. The core functionality of these systems is to realize efficient task-to-resource mapping in the form of Directed Acyclic Graph (DAG) scheduling. However, existing scheduling schemes face several drawbacks in determining task priorities due to the heavy reliance on domain knowledge or failure to efficiently exploit the interaction of application and hardware characteristics. In this paper, we propose INSPIRIT, an efficient and lightweight scheduling framework with adaptive priority designed for task-based runtime systems. INSPIRIT introduces two novel task attributes, "inspiring ability" and "inspiring efficiency", for dictating scheduling, eliminating the need for application domain knowledge. In addition, INSPIRIT jointly considers runtime information such as ready tasks in worker queues to guide task scheduling. This approach exposes more performance opportunities in heterogeneous hardware at runtime while effectively reducing the overhead for adjusting task priorities. Our evaluation results demonstrate that INSPIRIT achieves superior performance compared to cutting-edge scheduling schemes on both synthesized and real-world task DAGs.

Submitted 4 April, 2024; originally announced April 2024.

Comments: 11 pages

arXiv:2403.19098 [pdf, other]

GraphAD: Interaction Scene Graph for End-to-end Autonomous Driving

Authors: Yunpeng Zhang, Deheng Qian, Ding Li, Yifeng Pan, Yong Chen, Zhenbao Liang, Zhiyao Zhang, Shurui Zhang, Hongxu Li, Maolei Fu, Yun Ye, Zhujin Liang, Yi Shan, Dalong Du

Abstract: Modeling complicated interactions among the ego-vehicle, road agents, and map elements has been a crucial part of safety-critical autonomous driving. Previous works on end-to-end autonomous driving rely on the attention mechanism for handling heterogeneous interactions, which fails to capture the geometric priors and is also computationally intensive. In this paper, we propose the Interaction Scene Graph (ISG) as a unified method to model the interactions among the ego-vehicle, road agents, and map elements. With the representation of the ISG, the driving agents aggregate essential information from the most influential elements, including the road agents with potential collisions and the map elements to follow. Since a mass of unnecessary interactions are omitted, the more efficient scene-graph-based framework is able to focus on indispensable connections and leads to better performance. We evaluate the proposed method for end-to-end autonomous driving on the nuScenes dataset. Our method significantly outperforms strong baselines in the full-stack driving tasks, including perception, prediction, and planning. Code will be released at https://github.com/zhangyp15/GraphAD.

Submitted 6 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

Comments: project page: https://github.com/zhangyp15/GraphAD

arXiv:2403.11518 [pdf, other]

doi 10.1088/1674-1056/ad0d9d

Optical manipulation of the topological phase in ZrTe5 revealed by time- and angle-resolved photoemission

Authors: Chaozhi Huang, Chengyang Xu, Fengfeng Zhu, Shaofeng Duan, Jianzhe Liu, Lingxiao Gu, Shichong Wang, Haoran Liu, Dong Qian, Weidong Luo, Wentao Zhang

Abstract: High-resolution time- and angle-resolved photoemission measurements were conducted on the topological insulator ZrTe5. With strong femtosecond photoexcitation, a possible ultrafast phase transition from a weak to a strong topological insulating phase was experimentally realized by recovering the energy gap inversion in a time scale that was shorter than 0.15 ps. This photoinduced transient strong topological phase can last longer than 2 ps at the highest excitation fluence studied, and it cannot be attributed to the photoinduced heating of electrons or modification of the conduction band filling. Additionally, the measured unoccupied electronic states are consistent with the first-principles calculation based on experimental crystal lattice constants, which favor a strong topological insulating phase. These findings provide new insights into the longstanding controversy about the strong and weak topological properties in ZrTe5, and they suggest that many-body effects including electron-electron interactions must be taken into account to understand the equilibrium weak topological insulating phase in ZrTe5.

Submitted 18 March, 2024; originally announced March 2024.

Journal ref: Chinese Physics B 33, 017901 (2024)

arXiv:2403.09283 [pdf]

doi 10.1093/nsr/nwae127

Observation of quantum oscillations near the Mott-Ioffe-Regel limit in CaAs3

Authors: Yuxiang Wang, Minhao Zhao, Jinglei Zhang, Wenbin Wu, Shichao Li, Yong Zhang, Wenxiang Jiang, Nesta Benno Joseph, Liangcai Xu, Yicheng Mou, Yunkun Yang, Pengliang Leng, Yong Zhang, Li Pi, Alexey Suslov, Mykhaylo Ozerov, Jan Wyzula, Milan Orlita, Fengfeng Zhu, Yi Zhang, Xufeng Kou, Zengwei Zhu, Awadhesh Narayan, Dong Qian, Jinsheng Wen, et al. (3 additional authors not shown)

Abstract: The Mott-Ioffe-Regel limit sets the lower bound of carrier mean free path for coherent quasiparticle transport. Metallicity beyond this limit is of great interest because it is often closely related to quantum criticality and unconventional superconductivity. Progress along this direction mainly focuses on the strange-metal behaviors originating from the evolution of quasiparticle scattering rate such as linear-in-temperature resistivity, while the quasiparticle coherence phenomena in this regime are much less explored due to the short mean free path at the diffusive bound. Here we report the observation of quantum oscillations from Landau quantization near the Mott-Ioffe-Regel limit in CaAs3. Despite the insulator-like temperature dependence of resistivity, CaAs3 presents giant magnetoresistance and prominent Shubnikov-de Haas oscillations from Fermi surfaces, indicating highly coherent band transport. In contrast, the quantum oscillation is absent in the magnetic torque. The quasiparticle effective mass increases systematically with magnetic fields, manifesting a much larger value than the expectation given by magneto-infrared spectroscopy. It suggests a strong many-body renormalization effect near Fermi surface. We find that these unconventional behaviors may be explained by the interplay between the mobility edge and the van Hove singularity, which results in the formation of coherent cyclotron orbits emerging at the diffusive bound. Our results call for further study on the electron correlation effect of the van Hove singularity.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 18 pages, 5 figures

Journal ref: National Science Review, nwae127 (2024)

arXiv:2402.15678 [pdf, other]

Minions: Accelerating Large Language Model Inference with Adaptive and Collective Speculative Decoding

Authors: Siqi Wang, Hailong Yang, Xuezhu Wang, Tongxuan Liu, Pengbo Wang, Xuning Liang, Kejie Ma, Tianyu Feng, Xin You, Yongjun Bao, Yi Liu, Zhongzhi Luan, Depei Qian

Abstract: Large language models (LLM) have recently attracted surging interest due to their outstanding capabilities across various domains. However, enabling efficient LLM inference is challenging due to its autoregressive decoding that generates tokens only one at a time. Although research works apply pruning or quantization to speed up LLM inference, they typically require fine-tuning the LLM, incurring significant time and economic costs. Meanwhile, speculative decoding has been proposed to use small speculative models (SSMs) to accelerate the inference of LLM. However, the low acceptance rate of SSM and the high verification cost of LLM prohibit further performance improvement of inference. In this paper, we propose Minions, an LLM inference system that accelerates LLM inference with a collective and adaptive speculative generation. Specifically, Minions proposes a majority-voted mechanism to leverage multiple SSMs to jointly speculate the outputs of LLM, which improves the inference performance without introducing prohibitive computation costs for LLM. To better trade off the number of tokens speculated from SSM and the verification cost of LLM, Minions proposes an adaptive mechanism to dynamically determine the optimal speculation length of SSM, which can achieve better inference performance across different models, datasets, and hyper-parameters. In addition, Minions decouples the SSM decoding and LLM verification efficiently and adopts a pipelined execution mechanism to further improve the inference performance of LLM. By comparing with the state-of-the-art LLM inference systems, we demonstrate that Minions can achieve higher inference throughput and lower inference time.

Submitted 23 February, 2024; originally announced February 2024.

arXiv:2401.09895 [pdf]

Skeleton-Guided Instance Separation for Fine-Grained Segmentation in Microscopy

Authors: Jun Wang, Chengfeng Zhou, Zhaoyan Ming, Lina Wei, Xudong Jiang, Dahong Qian

Abstract: One of the fundamental challenges in microscopy (MS) image analysis is instance segmentation (IS), particularly when segmenting cluster regions where multiple objects of varying sizes and shapes may be connected or even overlapped in arbitrary orientations. Existing IS methods usually fail in handling such scenarios, as they rely on coarse instance representations such as keypoints and horizontal bounding boxes (h-bboxes). In this paper, we propose a novel one-stage framework named A2B-IS to address this challenge and enhance the accuracy of IS in MS images. Our approach represents each instance with a pixel-level mask map and a rotated bounding box (r-bbox). Unlike two-stage methods that use box proposals for segmentations, our method decouples mask and box predictions, enabling simultaneous processing to streamline the model pipeline. Additionally, we introduce a Gaussian skeleton map to aid the IS task in two key ways: (1) It guides anchor placement, reducing computational costs while improving the model's capacity to learn RoI-aware features by filtering out noise from background regions. (2) It ensures accurate isolation of densely packed instances by rectifying erroneous box predictions near instance boundaries. To further enhance the performance, we integrate two modules into the framework: (1) An Atrous Attention Block (A2B) designed to extract high-resolution feature maps with fine-grained multiscale information, and (2) A Semi-Supervised Learning (SSL) strategy that leverages both labeled and unlabeled images for model training. Our method has been thoroughly validated on two large-scale MS datasets, demonstrating its superiority over most state-of-the-art approaches.

Submitted 19 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

arXiv:2312.07623 [pdf]

Supervised Contrastive Learning for Fine-grained Chromosome Recognition

Authors: Ruijia Chang, Suncheng Xiang, Chengyu Zhou, Kui Su, Dahong Qian, Jun Wang

Abstract: Chromosome recognition is an essential task in karyotyping, which plays a vital role in birth defect diagnosis and biomedical research. However, existing classification methods face significant challenges due to the inter-class similarity and intra-class variation of chromosomes. To address this issue, we propose a supervised contrastive learning strategy that is tailored to train model-agnostic deep networks for reliable chromosome classification. This method enables extracting fine-grained chromosomal embeddings in latent space. These embeddings effectively expand inter-class boundaries and reduce intra-class variations, enhancing their distinctiveness in predicting chromosome types. On top of two large-scale chromosome datasets, we comprehensively validate the power of our contrastive learning strategy in boosting cutting-edge deep networks such as Transformers and ResNets. Extensive results demonstrate that it can significantly improve models' generalization performance, with an accuracy improvement up to +4.5%. Codes and pretrained models will be released upon acceptance of this work.

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.02535 [pdf, other]

Towards Open-set Gesture Recognition via Feature Activation Enhancement and Orthogonal Prototype Learning

Authors: Chen Liu, Can Han, Chengfeng Zhou, Crystal Cai, Suncheng Xiang, Hualiang Ni, Dahong Qian

Abstract: Gesture recognition is a foundational task in human-machine interaction (HMI). While there has been significant progress in gesture recognition based on surface electromyography (sEMG), accurate recognition of predefined gestures only within a closed set is still inadequate in practice. It is essential to effectively discern and reject unknown gestures of disinterest in a robust system. Numerous methods based on prototype learning (PL) have been proposed to tackle this open set recognition (OSR) problem. However, they do not fully explore the inherent distinctions between known and unknown classes. In this paper, we propose a more effective PL method leveraging two novel and inherent distinctions, feature activation level and projection inconsistency. Specifically, the Feature Activation Enhancement Mechanism (FAEM) widens the gap in feature activation values between known and unknown classes. Furthermore, we introduce Orthogonal Prototype Learning (OPL) to construct multiple perspectives. OPL acts to project a sample from orthogonal directions to maximize the distinction between its two projections, where unknown samples will be projected near the clusters of different known classes while known samples still maintain intra-class similarity. Our proposed method simultaneously achieves accurate closed-set classification for predefined gestures and effective rejection for unknown gestures. Extensive experiments demonstrate its efficacy and superiority in open-set gesture recognition based on sEMG.

Submitted 5 December, 2023; originally announced December 2023.

arXiv:2311.14069 [pdf]

Massive topological edge channels in three-dimensional topological materials induced by extreme surface anisotropy

Authors: Fengfeng Zhu, Chenqiang Hua, Xiao Wang, Lin Miao, Yixi Su, Makoto Hashimoto, Donghui Lu, Zhi-Xun Shen, Jin-Feng Jia, Yunhao Lu, Dandan Guan, Dong Qian

Abstract: A two-dimensional quantum spin Hall insulator exhibits one-dimensional gapless spin-filtered edge channels allowing for dissipationless transport of charge and spin. However, the sophisticated fabrication requirement of two-dimensional materials and the low capacity of one-dimensional channels hinder the broadening applications. We introduce a method to manipulate a three-dimensional topological material to host a large number of one-dimensional topological edge channels utilizing surface anisotropy. Taking ZrTe5 as a model system, we realize a highly anisotropic surface due to the synergistic effect of the lattice geometry and Coulomb interaction, and achieve massive one-dimensional topological edge channels -- confirmed by electronic characterization using angle-resolved photoemission spectroscopy, in combination with first-principles calculations. Our work provides a new avenue to engineer the topological properties of three-dimensional materials through nanoscale tuning of surface morphology and opens up a promising prospect for the development of low-power-consumption electronic nano devices based on one-dimensional topological edge channels.

Submitted 23 November, 2023; originally announced November 2023.

arXiv:2309.09200 [pdf, ps, other]

Scaling limit of the random walk on a Galton-Watson tree with regular varying offspring distribution

Authors: Dongjian Qian, Yang Xiao

Abstract: We consider a random walk on a Galton-Watson tree whose offspring distribution has a regular varying tail of order $\kappa \in (1,2)$. We prove the convergence of the renormalised height function of the walk towards the continuous-time height process of a spectrally positive strictly stable Lévy process, jointly with the convergence of the renormalised trace of the walk towards the continuum tree coded by the latter continuous-time height process.

Submitted 26 March, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

Comments: arXiv admin note: text overlap with arXiv:1608.07061 by other authors

MSC Class: 60J80; 60G50; 60F17

arXiv:2309.01315 [pdf, other]

doi 10.1103/PhysRevB.109.024301

Steering-induced phase transition in measurement-only quantum circuits

Authors: Dongheng Qian, Jing Wang

Abstract: Competing measurements alone can give rise to distinct phases characterized by entanglement entropy -- such as the volume law phase, symmetry-breaking (SB) phase, and symmetry-protected topological (SPT) phase -- that can only be discerned through quantum trajectories, making them challenging to observe experimentally. In another burgeoning area of research, recent studies have demonstrated that steering can give rise to additional phases within quantum circuits. In this work, we show that new phases can appear in measurement-only quantum circuit with steering. Unlike conventional steering methods that rely solely on local information, the steering scheme we introduce requires the circuit's structure as an additional input. These steering induced phases are termed as "informative" phases. They are distinguished by the intrinsic dimension of the bitstrings measured in each circuit run, making them substantially easier to detect in experimental setups. We explicitly show this phase transition by numerical simulation in three circuit models that are previously well-studied: projective transverse field Ising model, lattice gauge-Higgs model and XZZX model. When the informative phase coincides with the SB phase, our steering mechanism effectively serves as a "pre-selection" routine, making the SB phase more experimentally accessible. Additionally, an intermediate phase may manifest, where a discrepancy arises between the quantum information captured by entanglement entropy and the classical information conveyed by bitstrings. Our findings demonstrate that steering not only adds theoretical richness but also offers practical advantages in the study of measurement-only quantum circuits.

Submitted 7 December, 2023; v1 submitted 3 September, 2023; originally announced September 2023.

Journal ref: Phys. Rev. B 109, 024301 (2024)

arXiv:2309.01189 [pdf, other]

LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection

Authors: Jiaxing Qi, Shaohan Huang, Zhongzhi Luan, Carol Fung, Hailong Yang, Depei Qian

Abstract: The increasing volume of log data produced by software-intensive systems makes it impractical to analyze them manually. Many deep learning-based methods have been proposed for log-based anomaly detection. These methods face several challenges such as high-dimensional and noisy log data, class imbalance, generalization, and model interpretability. Recently, ChatGPT has shown promising results in various domains. However, there is still a lack of study on the application of ChatGPT for log-based anomaly detection. In this work, we propose LogGPT, a log-based anomaly detection framework based on ChatGPT. By leveraging ChatGPT's language interpretation capabilities, LogGPT aims to explore the transferability of knowledge from large-scale corpora to log-based anomaly detection. We conduct experiments to evaluate the performance of LogGPT and compare it with three deep learning-based methods on BGL and Spirit datasets. LogGPT shows promising results and has good interpretability. This study provides preliminary insights into prompt-based models, such as ChatGPT, for the log-based anomaly detection task.

Submitted 3 September, 2023; originally announced September 2023.

arXiv:2308.00929 [pdf, other]

Towards Discriminative Representation with Meta-learning for Colonoscopic Polyp Re-Identification

Authors: Suncheng Xiang, Qingzhong Chen, Shilun Cai, Chengfeng Zhou, Crystal Cai, Sijia Du, Zhengjie Zhang, Yunshi Zhong, Dahong Qian

Abstract: Colonoscopic Polyp Re-Identification aims to match the same polyp from a large gallery with images from different views taken using different cameras and plays an important role in the prevention and treatment of colorectal cancer in computer-aided diagnosis. However, traditional methods for object ReID directly adopting CNN models trained on the ImageNet dataset usually produce unsatisfactory retrieval performance on colonoscopic datasets due to the large domain gap. Additionally, these methods neglect to explore the potential of self-discrepancy among intra-class relations in the colonoscopic polyp dataset, which remains an open research problem in the medical community. To solve this dilemma, we propose a simple but effective training method named Colo-ReID, which can help our model learn more general and discriminative knowledge based on the meta-learning strategy in scenarios with fewer samples. Based on this, a dynamic Meta-Learning Regulation mechanism called MLR is introduced to further boost the performance of polyp re-identification. To the best of our knowledge, this is the first attempt to leverage the meta-learning paradigm instead of traditional machine learning algorithm to effectively train deep models in the task of colonoscopic polyp re-identification. Empirical results show that our method significantly outperforms current state-of-the-art methods by a clear margin.

Submitted 28 November, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

arXiv:2307.10625 [pdf, other]

Learning Discriminative Visual-Text Representation for Polyp Re-Identification

Authors: Suncheng Xiang, Cang Liu, Sijia Du, Dahong Qian

Abstract: Colonoscopic Polyp Re-Identification aims to match a specific polyp in a large gallery with different cameras and views, which plays a key role in the prevention and treatment of colorectal cancer in computer-aided diagnosis. However, traditional methods mainly focus on visual representation learning while neglecting to explore the potential of semantic features during training, which may easily lead to poor generalization capability when adapting the pretrained model to new scenarios. To relieve this dilemma, we propose a simple but effective training method named VT-ReID, which can remarkably enrich the representation of polyp videos with the interchange of high-level semantic information. Moreover, we elaborately design a novel clustering mechanism to introduce prior knowledge from textual data, which leverages contrastive learning to promote better separation from abundant unlabeled text data. To the best of our knowledge, this is the first attempt to employ the visual-text feature with clustering mechanism for colonoscopic polyp re-identification. Empirical results show that our method significantly outperforms current state-of-the-art methods by a clear margin.

Submitted 20 July, 2023; originally announced July 2023.

arXiv:2307.06567 [pdf]

Versatile Method of Engineering the Band Alignment and the Electron Wavefunction Hybridization of Hybrid Quantum Devices

Authors: Guoan Li, Xiaofan Shi, Ting Lin, Guang Yang, Marco Rossi, Ghada Badawy, Zhiyuan Zhang, Jiayu Shi, Degui Qian, Fang Lu, Lin Gu, An-Qi Wang, Bingbing Tong, Peiling Li, Zhaozheng Lyu, Guangtong Liu, Fanming Qu, Ziwei Dou, Dong Pan, Jianhua Zhao, Qinghua Zhang, Erik P. A. M. Bakkers, Michał P. Nowak, Paweł Wójcik, Li Lu, et al. (1 additional author not shown)

Abstract: With the development of quantum technology, hybrid devices that combine superconductors (S) and semiconductors (Sm) have attracted great attention due to the possibility of engineering structures that benefit from the integration of the properties of both materials. However, until now, none of the experiments have reported good control of band alignment at the interface, which determines the strength of S-Sm coupling and the proximitized superconducting gap. Here, we fabricate hybrid devices in a generic way with argon milling to modify the interface while maintaining its high quality. First, after the milling the atomically connected S-Sm interfaces appear, resulting in a large induced gap, as well as the ballistic transport revealed by the multiple Andreev reflections and quantized above-gap conductance plateaus. Second, by comparing transport measurement with Schrödinger-Poisson (SP) calculations, we demonstrate that argon milling is capable of varying the band bending strength in the semiconducting wire as the electrons tend to accumulate on the etched surface for longer milling time. Finally, we perform nonlocal measurements on advanced devices to demonstrate the coexistence and tunability of crossed Andreev reflection (CAR) and elastic co-tunneling (ECT) -- key ingredients for building the prototype setup for realization of Kitaev chain and quantum entanglement probing. Such a versatile method, compatible with the standard fabrication process and accompanied by the well-controlled modification of the interface, will definitely boost the creation of more sophisticated hybrid devices for exploring physics in solid-state systems.

Submitted 24 July, 2024; v1 submitted 13 July, 2023; originally announced July 2023.

Comments: 41 pages, 16 figures

arXiv:2306.00311 [pdf, other]

doi 10.1103/PhysRevLett.130.226501

Ultrafast Switching from the Charge Density Wave Phase to a Metastable Metallic State in 1T-TiSe$_2$

Authors: Shaofeng Duan, Wei Xia, Chaozhi Huang, Shichong Wang, Lingxiao Gu, Haoran Liu, Dao Xiang, Dong Qian, Yanfeng Guo, Wentao Zhang

Abstract: The ultrafast electronic structures of the charge density wave material 1T-TiSe$_2$ were investigated by high-resolution time- and angle-resolved photoemission spectroscopy. We found that the quasiparticle populations drove ultrafast electronic phase transitions in 1T-TiSe$_2$ within 100 fs after photoexcitation, and a metastable metallic state, which was significantly different from the equilibrium normal phase, was evidenced far below the charge density wave transition temperature. Detailed time- and pump-fluence-dependent experiments revealed that the photoinduced metastable metallic state was a result of the halted motion of the atoms through the coherent electron-phonon coupling process, and the lifetime of this state was prolonged to picoseconds with the highest pump fluence used in this study. Ultrafast electronic dynamics were well captured by the time-dependent Ginzburg-Landau model. Our work demonstrates a mechanism for realizing novel electronic states by photoinducing coherent motion of atoms in the lattice.

Submitted 31 May, 2023; originally announced June 2023.

Comments: 13 Pages, 10 figures

Journal ref: Phys. Rev. Lett. 130, 226501 (2023)

arXiv:2305.00194 [pdf, other]

Searching from Area to Point: A Hierarchical Framework for Semantic-Geometric Combined Feature Matching

Authors: Yesheng Zhang, Xu Zhao, Dahong Qian

Abstract: Feature matching is a crucial technique in computer vision. A unified perspective for this task is to treat it as a searching problem, aiming at an efficient search strategy to narrow the search space to point matches between images. One of the key aspects of search strategy is the search space, which in current approaches is not carefully defined, resulting in limited matching accuracy. This paper, thus, pays attention to the search space and proposes to set the initial search space for point matching as the matched image areas containing prominent semantic, named semantic area matches. This search space favors point matching by salient features and alleviates the accuracy limitation in recent Transformer-based matching methods. To achieve this search space, we introduce a hierarchical feature matching framework: Area to Point Matching (A2PM), to first find semantic area matches between images and later perform point matching on area matches. We further propose Semantic and Geometry Area Matching (SGAM) method to realize this framework, which utilizes semantic prior and geometry consistency to establish accurate area matches between images. By integrating SGAM with off-the-shelf state-of-the-art matchers, our method, adopting the A2PM framework, achieves encouraging precision improvements in massive point matching and pose estimation experiments.

Submitted 1 May, 2024; v1 submitted 29 April, 2023; originally announced May 2023.

Comments: v3

arXiv:2304.09498 [pdf, other]

Learning Robust Visual-Semantic Embedding for Generalizable Person Re-identification

Authors: Suncheng Xiang, Jingsheng Gao, Mengyuan Guan, Jiacheng Ruan, Chengfeng Zhou, Ting Liu, Dahong Qian, Yuzhuo Fu

Abstract: Generalizable person re-identification (Re-ID) is a very hot research topic in machine learning and computer vision, which plays a significant role in realistic scenarios due to its various applications in public security and video surveillance. However, previous methods mainly focus on visual representation learning while neglecting to explore the potential of semantic features during training, which easily leads to poor generalization capability when adapted to the new domain. In this paper, we propose a Multi-Modal Equivalent Transformer called MMET for more robust visual-semantic embedding learning on visual, textual and visual-textual tasks respectively. To further enhance the robust feature learning in the context of transformer, a dynamic masking mechanism called Masked Multimodal Modeling strategy (MMM) is introduced to mask both the image patches and the text tokens, which can jointly work on multimodal or unimodal data and significantly boost the performance of generalizable person Re-ID. Extensive experiments on benchmark datasets demonstrate the competitive performance of our method over previous approaches. We hope this method could advance the research towards visual-semantic representation learning. Our source code is also publicly available at https://github.com/JeremyXSC/MMET.

Submitted 19 April, 2023; originally announced April 2023.

arXiv:2304.03782 [pdf, other]

doi 10.1007/s11390-022-1632-9

AutoQNN: An End-to-End Framework for Automatically Quantizing Neural Networks

Authors: Cheng Gong, Ye Lu, Surong Dai, Deng Qian, Chenkun Du, Tao Li

Abstract: Exploring the expected quantizing scheme with a suitable mixed-precision policy is the key point to compress deep neural networks (DNNs) with high efficiency and accuracy. This exploration implies heavy workloads for domain experts, and an automatic compression method is needed. However, the huge search space of the automatic method demands a large computing budget, which makes the automatic process challenging to apply in real scenarios. In this paper, we propose an end-to-end framework named AutoQNN, for automatically quantizing different layers utilizing different schemes and bitwidths without any human labor. AutoQNN can seek desirable quantizing schemes and mixed-precision policies for mainstream DNN models efficiently by involving three techniques: quantizing scheme search (QSS), quantizing precision learning (QPL), and quantized architecture generation (QAG). QSS introduces five quantizing schemes and defines three new schemes as a candidate set for scheme search, and then uses the differentiable neural architecture search (DNAS) algorithm to seek the layer- or model-desired scheme from the set. QPL is the first method to learn mixed-precision policies by reparameterizing the bitwidths of quantizing schemes, to the best of our knowledge. QPL optimizes both classification loss and precision loss of DNNs efficiently and obtains the relatively optimal mixed-precision model within limited model size and memory footprint. QAG is designed to convert arbitrary architectures into corresponding quantized ones without manual intervention, to facilitate end-to-end neural network quantization. We have implemented AutoQNN and integrated it into Keras. Extensive experiments demonstrate that AutoQNN can consistently outperform state-of-the-art quantization.

Submitted 7 April, 2023; originally announced April 2023.

Comments: 22 pages, 9 figures, 7 tables, Journal of Computer Science and Technology

arXiv:2303.15671 [pdf, other]

Colo-SCRL: Self-Supervised Contrastive Representation Learning for Colonoscopic Video Retrieval

Authors: Qingzhong Chen, Shilun Cai, Crystal Cai, Zefang Yu, Dahong Qian, Suncheng Xiang

Abstract: Colonoscopic video retrieval, which is a critical part of polyp treatment, has great clinical significance for the prevention and treatment of colorectal cancer. However, retrieval models trained on action recognition datasets usually produce unsatisfactory retrieval results on colonoscopic datasets due to the large domain gap between them. To seek a solution to this problem, we construct a large-scale colonoscopic dataset named Colo-Pair for medical practice. Based on this dataset, a simple yet effective training method called Colo-SCRL is proposed for more robust representation learning. It aims to refine general knowledge from colonoscopies through masked autoencoder-based reconstruction and momentum contrast to improve retrieval performance. To the best of our knowledge, this is the first attempt to employ the contrastive learning paradigm for medical video retrieval. Empirical results show that our method significantly outperforms current state-of-the-art methods in the colonoscopic video retrieval task.

Submitted 27 March, 2023; originally announced March 2023.

Comments: Accepted by ICME 2023

arXiv:2301.04799 [pdf, ps, other]

Adaptive Context Selection for Polyp Segmentation

Authors: Ruifei Zhang, Guanbin Li, Zhen Li, Shuguang Cui, Dahong Qian, Yizhou Yu

Abstract: Accurate polyp segmentation is of great significance for the diagnosis and treatment of colorectal cancer. However, it has always been very challenging due to the diverse shapes and sizes of polyps. In recent years, state-of-the-art methods have achieved significant breakthroughs in this task with the help of deep convolutional neural networks. However, few algorithms explicitly consider the impact of the size and shape of the polyp and the complex spatial context on the segmentation performance, which results in the algorithms still being powerless for complex samples. In fact, segmentation of polyps of different sizes relies on different local and global contextual information for regional contrast reasoning. To tackle these issues, we propose an adaptive context selection based encoder-decoder framework which is composed of a Local Context Attention (LCA) module, a Global Context Module (GCM) and an Adaptive Selection Module (ASM). Specifically, LCA modules deliver local context features from encoder layers to decoder layers, enhancing the attention to the hard region which is determined by the prediction map of the previous layer. GCM aims to further explore the global context features and send them to the decoder layers. ASM is used for adaptive selection and aggregation of context features through channel-wise attention. Our proposed approach is evaluated on the EndoScene and Kvasir-SEG datasets, and shows outstanding performance compared with other state-of-the-art methods. The code is available at https://github.com/ReaFly/ACSNet.

Submitted 11 January, 2023; originally announced January 2023.

Comments: Accepted by MICCAI2020

arXiv:2211.00933 [pdf, other]

Deep Multimodal Fusion for Generalizable Person Re-identification

Authors: Suncheng Xiang, Hao Chen, Wei Ran, Zefang Yu, Ting Liu, Dahong Qian, Yuzhuo Fu

Abstract: Person re-identification plays a significant role in realistic scenarios due to its various applications in public security and video surveillance. Recently, leveraging the supervised or semi-unsupervised learning paradigms, which benefit from large-scale datasets and strong computing performance, has achieved competitive performance on a specific target domain. However, when Re-ID models are directly deployed in a new domain without target samples, they always suffer from considerable performance degradation and poor domain generalization. To address this challenge, we propose a Deep Multimodal Fusion network to elaborate rich semantic knowledge for assisting in representation learning during the pre-training. Importantly, a multimodal fusion strategy is introduced to translate the features of different modalities into the common space, which can significantly boost the generalization capability of the Re-ID model. As for the fine-tuning stage, a realistic dataset is adopted to fine-tune the pre-trained model for better distribution alignment with real-world data. Comprehensive experiments on benchmarks demonstrate that our method can significantly outperform previous domain generalization or meta-learning methods with a clear margin. Our source code will also be publicly available at https://github.com/JeremyXSC/DMF.

Submitted 29 December, 2022; v1 submitted 2 November, 2022; originally announced November 2022.

arXiv:2210.06685 [pdf]

doi 10.1038/s41467-023-43361-5

A robust and tunable Luttinger liquid in correlated edge of transition-metal second-order topological insulator Ta$_2$Pd$_3$Te$_5$

Authors: Anqi Wang, Yupeng Li, Guang Yang, Dayu Yan, Yuan Huang, Zhaopeng Guo, Jiacheng Gao, Jierui Huang, Qiaochu Zeng, Degui Qian, Hao Wang, Xingchen Guo, Fanqi Meng, Qinghua Zhang, Lin Gu, Xingjiang Zhou, Guangtong Liu, Fanming Qu, Tian Qian, Youguo Shi, Zhijun Wang, Li Lu, Jie Shen

Abstract: The interplay between topology and interaction always plays an important role in condensed matter physics and induces many exotic quantum phases, while rare transition metal layered material (TMLM) has been proved to possess both. Here we report that the TMLM Ta$_2$Pd$_3$Te$_5$ has the two-dimensional second-order topology (also a quadrupole topological insulator) with correlated edge states - Luttinger liquid. It is ascribed to the unconventional nature of the mismatch between charge- and atomic- centers induced by a remarkable double-band inversion. This one-dimensional protected edge state preserves the Luttinger liquid behavior with robustness and universality in scale from micro- to macro- size, leading to a significant anisotropic electrical transport through two-dimensional sides of bulk materials. Moreover, the bulk gap can be modulated by the thickness, resulting in an extensive-range phase diagram for Luttinger liquid. These provide an attractive model to study the interaction and quantum phases in correlated topological systems.

Submitted 1 May, 2024; v1 submitted 12 October, 2022; originally announced October 2022.

Comments: 41 pages, 6 Main Figures + 14 Supplementary Figures

arXiv:2209.02642 [pdf]

Multiscale reduced-order modeling of fused filament fabricated composites

Authors: Satyajit Mojumder, Anton van Beek, Zahabul Islam, Dong Qian, Wing Kam Liu

Abstract: Defects such as voids are observed at multiple length scales of an additively manufactured composite material. Modeling such defects and their multiscale interaction is crucial for the materials performance prediction. In this work, we study as-built defects in fused filament fabricated Polycarbonate/Short Carbon Fiber (PC/SCF) composite samples. The microscale and mesoscale voids along with the mesoscale layer orientations have been studied using a mechanistic reduced-order model. Our result indicates that the microscale intrabead voids interact with the mesoscale interbead voids and significantly degrade the mechanical response of the printed composites compared to the microscale microstructure without voids. The mesoscale layer orientations also influence the stress-strain response and show better performance when the load is applied to the bead direction. The efficient reduced-order modeling approach used in this work provides a way to evaluate multiscale design aspects of additively manufactured composite materials.

Submitted 6 September, 2022; originally announced September 2022.

Comments: 13 pages, 4 Figures

arXiv:2209.02478 [pdf, other]

Mimose: An Input-Aware Checkpointing Planner for Efficient Training on GPU

Authors: Jianjin Liao, Mingzhen Li, Qingxiao Sun, Jiwei Hao, Fengwei Yu, Shengdong Chen, Ye Tao, Zicheng Zhang, Hailong Yang, Zhongzhi Luan, Depei Qian

Abstract: Larger deep learning models usually lead to higher model quality with an ever-increasing GPU memory footprint. Although tensor checkpointing techniques have been proposed to enable training under a restricted GPU memory budget, the input tensor dynamics have been unexploited for optimizing performance while reducing GPU memory footprint. Specifically, due to the diverse datasets and subsequent data augmentation, the input tensor size per mini-batch is dynamic during the training process, leading to a changing GPU memory footprint. However, to leverage such input tensor dynamics in checkpointing, there are two challenges to be solved. First, the checkpointing plan needs to be determined during runtime due to the dynamics of input tensors. Second, the checkpointing plan needs to be applied on the fly without significantly deteriorating the performance. In this paper, we propose Mimose, an input-aware tensor checkpointing planner respecting the memory budget while enabling efficient model training on GPU. Mimose builds a lightweight but accurate prediction model of GPU memory usage online, without pre-analyzing the model. It generates a tensor checkpointing plan based on per-layer memory prediction and applies it to the training process on the fly. It also adopts a caching strategy to avoid having to regenerate the plan for repeated input sizes. Our experiments show that Mimose achieves superior training throughput compared to state-of-the-art memory planners under the same GPU memory budgets.

Submitted 6 September, 2022; originally announced September 2022.

arXiv:2208.14228 [pdf, other]

EasyScale: Accuracy-consistent Elastic Training for Deep Learning

Authors: Mingzhen Li, Wencong Xiao, Biao Sun, Hanyu Zhao, Hailong Yang, Shiru Ren, Zhongzhi Luan, Xianyan Jia, Yi Liu, Yong Li, Wei Lin, Depei Qian

Abstract: Distributed synchronized GPU training is commonly used for deep learning. The resource constraint of using a fixed number of GPUs makes large-scale training jobs suffer from long queuing time for resource allocation, and lowers the cluster utilization. Adapting to resource elasticity can alleviate this but often introduces inconsistent model accuracy, due to the lack of capability to decouple the model training procedure from resource allocation. We propose EasyScale, an elastic training system that achieves consistent model accuracy under resource elasticity for both homogeneous and heterogeneous GPUs. EasyScale preserves the data-parallel training behaviors strictly, traces the consistency-relevant factors carefully, and utilizes the deep learning characteristics for EasyScaleThread abstraction and fast context-switching. To utilize heterogeneous clusters, EasyScale dynamically assigns workers based on the intra-/inter-job schedulers, minimizing load imbalance and maximizing aggregated job throughput. Deployed in an online serving cluster, EasyScale powers the training jobs to utilize idle GPUs opportunistically, improving overall cluster utilization by 62.1%.

Submitted 6 November, 2023; v1 submitted 30 August, 2022; originally announced August 2022.

Comments: To appear at SC'23. Link: https://sc23.supercomputing.org/presentation/?id=pap262&sess=sess168

arXiv:2208.11960 [pdf, other]

FusePose: IMU-Vision Sensor Fusion in Kinematic Space for Parametric Human Pose Estimation

Authors: Yiming Bao, Xu Zhao, Dahong Qian

Abstract: There exist challenging problems in the 3D human pose estimation task, such as poor performance caused by occlusion and self-occlusion. Recently, IMU-vision sensor fusion is regarded as valuable for solving these problems. However, previous research on the fusion of IMU and vision data, which is heterogeneous, fails to adequately utilize either IMU raw data or reliable high-level vision features. To facilitate a more efficient sensor fusion, in this work we propose a framework called FusePose under a parametric human kinematic model. Specifically, we aggregate different information of IMU or vision data and introduce three distinctive sensor fusion approaches: NaiveFuse, KineFuse and AdaDeepFuse. NaiveFuse serves as a basic approach that only fuses simplified IMU data and estimated 3D pose in Euclidean space. In kinematic space, by contrast, KineFuse is able to integrate the calibrated and aligned IMU raw data with converted 3D pose parameters. AdaDeepFuse further develops this kinematical fusion process to an adaptive and end-to-end trainable manner. Comprehensive experiments with ablation studies demonstrate the rationality and superiority of the proposed framework. The performance of 3D human pose estimation is improved compared to the baseline result. On the Total Capture dataset, KineFuse surpasses the previous state-of-the-art which uses IMU only for testing by 8.6%. AdaDeepFuse surpasses the state-of-the-art which uses IMU for both training and testing by 8.5%. Moreover, we validate the generalization capability of our framework through experiments on the Human3.6M dataset.

Submitted 25 August, 2022; originally announced August 2022.

Comments: 11 pages, 8 figures

arXiv:2208.11483 [pdf, other]

SubFace: Learning with Softmax Approximation for Face Recognition

Authors: Hongwei Xu, Suncheng Xiang, Dahong Qian

Abstract: Softmax-based loss functions and their variants (e.g., cosface, sphereface, and arcface) significantly improve the face recognition performance in wild unconstrained scenes. A common practice of these algorithms is to perform optimizations on the multiplication between the embedding features and the linear transformation matrix. However, in most cases, the dimension of embedding features is given based on traditional design experience, and improving performance using the feature itself at a fixed size is less studied. To address this challenge, this paper presents a softmax approximation method called SubFace, which employs the subspace feature to promote the performance of face recognition. Specifically, we dynamically select the non-overlapping subspace features in each batch during training, and then use the subspace features to approximate the full feature in the softmax-based loss, so the discriminability of the deep model can be significantly enhanced for face recognition. Comprehensive experiments conducted on benchmark datasets demonstrate that our method can significantly improve the performance of the vanilla CNN baseline, which strongly proves the effectiveness of the subspace strategy with the margin-based loss.

Submitted 24 August, 2022; originally announced August 2022.

arXiv:2207.09166 [pdf, ps, other]

Regular subspaces of symmetric stable processes

Authors: Dongjian Qian, Jiangang Ying, Yushu Zheng

Abstract: Roughly speaking, regular subspaces are regular Dirichlet forms that inherit the original forms with smaller domains. In this paper, regular subspaces of 1-dim symmetric $αあるふぁ$-stable processes are considered. The main result is that it admits proper regular subspaces if and only if $αあるふぁ\in [1,2]$. Moreover, for $αあるふぁ\in(1,2)$, the characterization of the regular subspaces is given. General 1-dim symmetric Lévy processes will also be investigated. It will be shown that whether it has proper regular subspaces is closely related to whether its sample paths have finite variation.

Submitted 8 March, 2023; v1 submitted 19 July, 2022; originally announced July 2022.

Comments: 20 pages

arXiv:2207.04141 [pdf, ps, other]

doi 10.1103/PhysRevE.107.044605

Field driven cluster formation in two-dimensional colloidal binary mixtures

Authors: Dingwen Qian, Monica Olvera de la Cruz

Abstract: We study size- and charge-asymmetric oppositely charged colloids driven by an external electric field. The large particles are connected by harmonic springs, forming a hexagonal-lattice network while the small particles are free of bonds and exhibit fluid-like motion. We show that this model exhibits a cluster formation pattern when the external driving force exceeds a critical value. The clustering is accompanied by stable wavepackets in vibrational motions of the large particles.

Submitted 6 April, 2023; v1 submitted 8 July, 2022; originally announced July 2022.

arXiv:2207.03366 [pdf, other]

A simple normalization technique using window statistics to improve the out-of-distribution generalization on medical images

Authors: Chengfeng Zhou, Songchang Chen, Chenming Xu, Jun Wang, Feng Liu, Chun Zhang, Juan Ye, Hefeng Huang, Dahong Qian

Abstract: Since data scarcity and data heterogeneity are prevalent in medical images, well-trained Convolutional Neural Networks (CNNs) using previous normalization methods may perform poorly when deployed to a new site. However, a reliable model for real-world clinical applications should be able to generalize well both on in-distribution (IND) and out-of-distribution (OOD) data (e.g., the new site data). In this study, we present a novel normalization technique called window normalization (WIN) to improve the model generalization on heterogeneous medical images, which is a simple yet effective alternative to existing normalization methods. Specifically, WIN perturbs the normalizing statistics with the local statistics computed on the window of features. This feature-level augmentation technique regularizes the models well and improves their OOD generalization significantly. Taking advantage of it, we propose a novel self-distillation method called WIN-WIN for classification tasks. WIN-WIN is easily implemented with two forward passes and a consistency constraint, which can be a simple extension for existing methods. Extensive experimental results on various tasks (6 tasks) and datasets (24 datasets) demonstrate the generality and effectiveness of our methods.

Submitted 13 July, 2022; v1 submitted 7 July, 2022; originally announced July 2022.

arXiv:2206.06781 [pdf, other]

doi 10.1103/PhysRevLett.128.246401

Anomalous contribution to the nematic electronic states from the structural transition in FeSe revealed by time- and angle-resolved photoemission spectroscopy

Authors: Yuanyuan Yang, Qisi Wang, Shaofeng Duan, Hongliang Wo, Chaozhi Huang, Shichong Wang, Lingxiao Gu, Dao Xiang, Dong Qian, Jun Zhao, Wentao Zhang

Abstract: High-resolution time- and angle-resolved photoemission measurements were made on FeSe superconductors. With ultrafast photoexcitation, two critical excitation fluences that correspond to two ultrafast electronic phase transitions were found only in the $d_{yz}$-orbit-derived band near the Brillouin-zone center within our time and energy resolution. Upon comparison to the detailed temperature dependent measurements, we conclude that there are two equilibrium electronic phase transitions (at approximately 90 and 120 K) above the superconducting transition temperature, and an anomalous contribution on the scale of 10 meV to the nematic states from the structural transition is experimentally determined. Our observations strongly suggest that the electronic phase transition at 120 K must be taken into account in the energy band development of FeSe, and, furthermore, the contribution of the structural transition plays an important role in the nematic phase of iron-based high-temperature superconductors.

Submitted 14 June, 2022; originally announced June 2022.

Comments: 9 pages, 5 figures

Journal ref: Phys. Rev. Lett. 128, 246401 (2022)

arXiv:2205.04307 [pdf, other]

de Haas-van Alphen effect and the first-principles study of the possible topological stannide Cu$_3$Sn

Authors: Chengxu Liu, Bin Li, Yongheng Ge, Wen-He Jiao, Chuanying Xi, Yi Liu, Chunqiang Xu, Qi Lu, Yunlong Li, Hang-Qiang Qiu, Qin-Qing Zhu, Zhi Ren, Ziming Zhu, Dong Qian, Xianglin Ke, Xiaofeng Xu

Abstract: The quest for quantum materials with diverse symmetry-protected topological states has been the focus of recent research interest, primarily due to their fascinating physical properties and the potential technological utility. In this work, we report on the magnetotransport, de Haas-van Alphen (dHvA) oscillations, and the first-principles calculations of the stannide Cu$_3$Sn that is isostructural with the recently reported topological semimetal Ag$_3$Sn. The magnetoresistance was found to vary quasi-linearly in field. Clear dHvA oscillations were observed under a field as low as 1 Tesla at 2 K, with three major oscillation frequencies $F_αあるふぁ$=8.74 T, $F_βべーた$=150.19 T and $F_γがんま$=229.66 T and extremely small effective masses. The analysis of dHvA quantum oscillations revealed a possible nonzero Berry phase, suggestive of the nontrivial band topology. The corroborating evidence for the nontrivial electronic topology also comes from the first-principles calculations which yield a nonzero $\mathbb{Z}_2$ topological index. These results collectively suggest that Cu$_3$Sn, in analogy to its homologue Ag$_3$Sn, may be another intermetallic stannide hosting topological Dirac fermions.

Submitted 9 May, 2022; originally announced May 2022.

Comments: 4 figures, 1 table

arXiv:2204.08258 [pdf, other]

doi 10.1088/0256-307X/39/5/057302

Unusual band splitting and superconducting gap evolution with sulfur substitution in FeSe

Authors: Yuanyuan Yang, Qisi Wang, Shaofeng Duan, Hongliang Wo, Chaozhi Huang, Shichong Wang, Lingxiao Gu, Dong Qian, Jun Zhao, Wentao Zhang

Abstract: High-resolution angle-resolved photoemission measurements were taken on FeSe$_{1-x}$S$_x$ (x=0, 0.04, and 0.08) superconductors. With an ultrahigh energy resolution of 0.4 meV, two unusual hole bands near the Brillouin-zone center, which were possibly a result of additional symmetry breaking, were identified in all the sulfur-substituted samples. In addition, in both of the hole bands highly anisotropic superconducting gaps with resolution limited nodes were evidenced. We find that the larger superconducting gap on the outer hole band is reduced linearly to the nematic transition temperature while the gap on the inner hole is nearly S-substitution independent. Our observations strongly suggest that the superconducting gap increases with enhanced nematicity although the superconducting transition temperature is not only governed by the pairing strength, demonstrating strong constraints on theories in the FeSe family.

Submitted 21 April, 2022; v1 submitted 18 April, 2022; originally announced April 2022.

Comments: 6 pages, 4 figures

Journal ref: Chin. Phys. Lett. 39, 057302 (2022)

arXiv:2203.09776 [pdf]

doi 10.1038/s41467-023-37000-2

Transient dynamics of the phase transition in VO2 revealed by mega electron-volt ultrafast electron diffraction

Authors: Chenhang Xu, Cheng Jin, Zijing Chen, Qi Lu, Yun Cheng, Bo Zhang, Fengfeng Qi, Jiajun Chen, Xunqing Yin, Guohua Wang, Dao Xiang, Dong Qian

Abstract: Vanadium dioxide (VO2) exhibits an insulator-to-metal transition accompanied by a structural transition near room temperature. This transition can be triggered by an ultrafast laser pulse. Exotic transient states, such as a metallic state without structural transition, were also proposed. These unique characteristics give VO2 great potential in thermal switchable devices and photonic applications. Although great efforts have been made, the atomic pathway during the photoinduced phase transition is still not clear. Here, we synthesized freestanding quasi-single-crystal VO2 films and examined their photoinduced structural phase transition with mega-electron-volt ultrafast electron diffraction. Leveraging the high signal-to-noise ratio and high temporal resolution, we observe that the disappearance of vanadium dimers and zigzag chains does not coincide with the transformation of crystal symmetry. After photoexcitation, the initial structure is strongly modified within 200 femtoseconds, resulting in a transient monoclinic structure without vanadium dimers and zigzag chains. Then, it continues to evolve to the final tetragonal structure in approximately 5 picoseconds. In addition, only one laser fluence threshold instead of two thresholds suggested in polycrystalline samples was observed in our quasi-single-crystal samples. Our findings provide new essential information for a comprehensive understanding of the photoinduced ultrafast phase transition in VO2.

Submitted 23 November, 2023; v1 submitted 18 March, 2022; originally announced March 2022.

Journal ref: Nature Communications 14:1265 (2023)

arXiv:2202.09668

doi 10.1038/s41467-022-28309-5

Light-induced dimension crossover in 1T-TiSe$_2$ dictated by excitonic correlations

Authors: Yun Cheng, Alfred Zong, Jun Li, Wei Xia, Shaofeng Duan, Wenxuan Zhao, Yidian Li, Fengfeng Qi, Jun Wu, Lingrong Zhao, Pengfei Zhu, Xiao Zou, Tao Jiang, Yanfeng Guo, Lexian Yang, Dong Qian, Wentao Zhang, Anshul Kogar, Michael W. Zuerch, Dao Xiang, Jie Zhang

Abstract: In low-dimensional systems with strong electronic correlations, the application of an ultrashort laser pulse often yields novel phases that are otherwise inaccessible. The central challenge in understanding such phenomena is to determine how dimensionality and many-body correlations together govern the pathway of a non-adiabatic transition. To this end, we examine a layered compound, 1T-TiSe$_2$, whose three-dimensional charge-density-wave (3D CDW) state also features exciton condensation due to strong electron-hole interactions. We find that photoexcitation suppresses the equilibrium 3D CDW while creating a nonequilibrium 2D CDW. Remarkably, the dimension reduction does not occur unless bound electron-hole pairs are broken. This relation suggests that excitonic correlations maintain the out-of-plane CDW coherence, settling a long-standing debate over their role in the CDW transition. Our findings demonstrate how optical manipulation of electronic interaction enables one to control the dimensionality of a broken-symmetry order, paving the way for realizing other emergent states in strongly correlated systems.

Submitted 19 February, 2022; originally announced February 2022.

Journal ref: Nature Communications 13, 963 (2022)

arXiv:2202.08110

Disorder-induced linear magnetoresistance in Sr-doped Bi2Se3 thin films

Authors: Jiayuan Hu, Wenxiang Jiang, Guohua Wang, Yunlong Li, Jiangtao Wang, Jinlong Jiao, Qi Lu, Chenhang Xu, Wentao Zhang, Jie Ma, Dong Qian

Abstract: Sr-doped Bi2Se3 thin films are known as a potential candidate for a topological superconductor. The magnetoresistance (MR) of SrxBi2Se3 films with various doping concentrations x was found to be dominated by weak antilocalization (WAL) at low magnetic fields, whereas the classical MR, which originally dominated the MR, was almost completely suppressed. In contrast, the MR of all samples was observed to be dominated by linear magnetoresistance (LMR) at high magnetic fields. The LMR, which depends linearly on carrier mobility, can be successfully explained by the Parish-Littlewood model. This indicates that the LMR originates from mobility fluctuations induced by Sr dopant atoms in doped Bi2Se3 films.

Submitted 18 February, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

Comments: The article needs to be revised

Showing 1–50 of 161 results for author: Qian, D