Search | arXiv e-print repository

Showing 1–50 of 174 results for author: Qin, C

Searching in archive cs. Search in all archives.
  1. arXiv:2408.05341  [pdf, other]

    cs.CV cs.AI

    CAR: Contrast-Agnostic Deformable Medical Image Registration with Contrast-Invariant Latent Regularization

    Authors: Yinsong Wang, Siyi Du, Shaoming Zheng, Xinzhe Luo, Chen Qin

    Abstract: Multi-contrast image registration is a challenging task due to the complex intensity relationships between different imaging contrasts. Conventional image registration methods are typically based on iterative optimizations for each input image pair, which is time-consuming and sensitive to contrast variations. While learning-based approaches are much faster during the inference stage, due to gener…

    Submitted 3 August, 2024; originally announced August 2024.

    Comments: 12 pages, 3 figures, 3 tables, accepted by WBIR 2024

  2. arXiv:2408.03194  [pdf, other]

    eess.IV cs.CV

    SGSR: Structure-Guided Multi-Contrast MRI Super-Resolution via Spatio-Frequency Co-Query Attention

    Authors: Shaoming Zheng, Yinsong Wang, Siyi Du, Chen Qin

    Abstract: Magnetic Resonance Imaging (MRI) is a leading diagnostic modality for a wide range of exams, where multiple contrast images are often acquired for characterizing different tissues. However, acquiring high-resolution MRI typically extends scan time, which can introduce motion artifacts. Super-resolution of MRI therefore emerges as a promising approach to mitigate these challenges. Earlier studies h…

    Submitted 6 August, 2024; originally announced August 2024.

    Comments: The 15th International Workshop on Machine Learning in Medical Imaging (MLMI 2024)

  3. arXiv:2407.17944  [pdf, other]

    cs.RO eess.SY

    Time-Optimal Planning for Long-Range Quadrotor Flights: An Automatic Optimal Synthesis Approach

    Authors: Chao Qin, Jingxiang Chen, Yifan Lin, Abhishek Goudar, Angela P. Schoellig, Hugh H. -T. Liu

    Abstract: Time-critical tasks such as drone racing typically cover large operation areas. However, it is difficult and computationally intensive for current time-optimal motion planners to accommodate long flight distances since a large yet unknown number of knot points is required to represent the trajectory. We present a polynomial-based automatic optimal synthesis (AOS) approach that can address this cha…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: 19 pages, 19 figures

  4. arXiv:2407.07582  [pdf, other]

    cs.CV

    TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data

    Authors: Siyi Du, Shaoming Zheng, Yinsong Wang, Wenjia Bai, Declan P. O'Regan, Chen Qin

    Abstract: Images and structured tables are essential parts of real-world databases. Though tabular-image representation learning is promising to create new insights, it remains a challenging task, as tabular data is typically heterogeneous and incomplete, presenting significant modality disparities with images. Earlier works have mainly focused on simple modality fusion strategies in complete data scenarios…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 28 pages (including 9 pages of supplementary materials), accepted by ECCV 2024

  5. arXiv:2407.00082  [pdf, other]

    cs.IR cs.AI cs.LG

    Adapting Job Recommendations to User Preference Drift with Behavioral-Semantic Fusion Learning

    Authors: Xiao Han, Chen Zhu, Xiao Hu, Chuan Qin, Xiangyu Zhao, Hengshu Zhu

    Abstract: Job recommender systems are crucial for aligning job opportunities with job-seekers in online job-seeking. However, users tend to adjust their job preferences to secure employment opportunities continually, which limits the performance of job recommendations. The inherent frequency of preference drift poses a challenge to promptly and precisely capture user preferences. To address this issue, we p…

    Submitted 24 June, 2024; originally announced July 2024.

    Comments: Accepted by KDD 24 Research Track

  6. arXiv:2406.19973  [pdf, other]

    cs.CV cs.LG

    STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical

    Authors: Guohao Sun, Can Qin, Huazhu Fu, Linwei Wang, Zhiqiang Tao

    Abstract: Large Vision-Language Models (LVLMs) have shown significant potential in assisting medical diagnosis by leveraging extensive biomedical datasets. However, the advancement of medical image understanding and reasoning critically depends on building high-quality visual instruction data, which is costly and labor-intensive to obtain, particularly in the medical domain. To mitigate this data-starving i…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 10 pages, 6 figures

  7. arXiv:2406.19043  [pdf]

    eess.IV cs.AI cs.CV cs.DB

    CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

    Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Yajing Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, Jing Qin, Xiahai Zhuang, Claudia Prieto , et al. (7 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 2 tables

  8. arXiv:2406.12465  [pdf, other]

    cs.CY cs.AI cs.IR

    RIGL: A Unified Reciprocal Approach for Tracing the Independent and Group Learning Processes

    Authors: Xiaoshan Yu, Chuan Qin, Dazhong Shen, Shangshang Yang, Haiping Ma, Hengshu Zhu, Xingyi Zhang

    Abstract: In the realm of education, both independent learning and group learning are esteemed as the most classic paradigms. The former allows learners to self-direct their studies, while the latter is typically characterized by teacher-directed scenarios. Recent studies in the field of intelligent education have leveraged deep temporal models to trace the learning process, capturing the dynamics of studen…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024. 12 pages

  9. arXiv:2406.11920  [pdf, other]

    cs.LG cs.AI

    Job-SDF: A Multi-Granularity Dataset for Job Skill Demand Forecasting and Benchmarking

    Authors: Xi Chen, Chuan Qin, Chuyu Fang, Chao Wang, Chen Zhu, Fuzhen Zhuang, Hengshu Zhu, Hui Xiong

    Abstract: In a rapidly evolving job market, skill demand forecasting is crucial as it enables policymakers and businesses to anticipate and adapt to changes, ensuring that workforce skills align with market needs, thereby enhancing productivity and competitiveness. Additionally, by identifying emerging skill requirements, it directs individuals towards relevant training and education opportunities, promotin…

    Submitted 19 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  10. arXiv:2406.10916  [pdf, other]

    cs.RO cs.DC

    M-SET: Multi-Drone Swarm Intelligence Experimentation with Collision Avoidance Realism

    Authors: Chuhao Qin, Alexander Robins, Callum Lillywhite-Roake, Adam Pearce, Hritik Mehta, Scott James, Tsz Ho Wong, Evangelos Pournaras

    Abstract: Distributed sensing by cooperative drone swarms is crucial for several Smart City applications, such as traffic monitoring and disaster response. Using an indoor lab with inexpensive drones, a testbed supports complex and ambitious studies on these systems while maintaining low cost, rigor, and external validity. This paper introduces the Multi-drone Sensing Experimentation Testbed (M-SET), a nove…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 7 pages, 7 figures. This work has been submitted to the IEEE conference

  11. arXiv:2406.00777  [pdf, other]

    cs.CV cs.AI

    Diffusion Features to Bridge Domain Gap for Semantic Segmentation

    Authors: Yuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu

    Abstract: Pre-trained diffusion models have demonstrated remarkable proficiency in synthesizing images across a wide range of scenarios with customizable prompts, indicating their effective capacity to capture universal features. Motivated by this, our study delves into the utilization of the implicit knowledge embedded within diffusion models to address challenges in cross-domain semantic segmentation. Thi…

    Submitted 2 June, 2024; originally announced June 2024.

  12. arXiv:2405.14161  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

    Authors: Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Chengwei Qin, Pin-Yu Chen, Eng Siong Chng, Chao Zhang

    Abstract: We propose an unsupervised adaptation framework, Self-TAught Recognizer (STAR), which leverages unlabeled data to enhance the robustness of automatic speech recognition (ASR) systems in diverse target domains, such as noise and accents. STAR is developed for prevalent speech foundation models based on Transformer-related architecture with auto-regressive decoding (e.g., Whisper, Canary). Specifica…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 23 pages, Preprint

  13. arXiv:2405.11862  [pdf, other]

    cs.CV

    SEMv3: A Fast and Robust Approach to Table Separation Line Detection

    Authors: Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du

    Abstract: Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image. The "split-and-merge" paradigm is a pivotal approach to parse table structure, where the table separation line detection is crucial. However, challenges such as wireless and deformed tables make it demanding. In this paper, we adhere to the "split-and-merge" paradigm and propose SEMv3 (SEM: Spl…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 9 pages, 6 figures, 5 tables. Accepted by IJCAI2024 main track

  14. arXiv:2405.10992  [pdf, other]

    cs.LG cs.AI

    Overcoming Catastrophic Forgetting by Exemplar Selection in Task-oriented Dialogue System

    Authors: Chen Chen, Ruizhe Li, Yuchen Hu, Yuanyuan Chen, Chengwei Qin, Qiang Zhang

    Abstract: Intelligent task-oriented dialogue systems (ToDs) are expected to continuously acquire new knowledge, also known as Continual Learning (CL), which is crucial to fit ever-changing user needs. However, catastrophic forgetting dramatically degrades the model performance in face of a long streamed curriculum. In this paper, we aim to overcome the forgetting problem in ToDs and propose a method (HESIT)…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: ACL 2024

  15. arXiv:2405.10037  [pdf, other]

    cs.CV

    Bilateral Event Mining and Complementary for Event Stream Super-Resolution

    Authors: Zhilin Huang, Quanmin Liang, Yijie Yu, Chujun Qin, Xiawu Zheng, Kai Huang, Zikun Zhou, Wenming Yang

    Abstract: Event Stream Super-Resolution (ESR) aims to address the challenge of insufficient spatial resolution in event streams, which holds great significance for the application of event cameras in complex scenarios. Previous works for ESR often process positive and negative events in a mixed paradigm. This paradigm limits their ability to effectively model the unique characteristics of each event and mut…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR2024

  16. arXiv:2405.10025  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Listen Again and Choose the Right Answer: A New Paradigm for Automatic Speech Recognition with Large Language Models

    Authors: Yuchen Hu, Chen Chen, Chengwei Qin, Qiushi Zhu, Eng Siong Chng, Ruizhe Li

    Abstract: Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR), which aims to predict the ground-truth transcription from the decoded N-best hypotheses. Thanks to the strong language generation ability of LLMs and rich information in the N-best list, GER shows great effectiveness in enhancing ASR results. However, it still suf…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 14 pages, Accepted by ACL 2024

  17. arXiv:2404.16612  [pdf, other]

    cs.CV

    MuseumMaker: Continual Style Customization without Catastrophic Forgetting

    Authors: Chenxi Liu, Gan Sun, Wenqi Liang, Jiahua Dong, Can Qin, Yang Cong

    Abstract: Pre-trained large text-to-image (T2I) models with an appropriate text prompt has attracted growing interests in customized images generation field. However, catastrophic forgetting issue make it hard to continually synthesize new user-provided styles while retaining the satisfying results amongst learned styles. In this paper, we propose MuseumMaker, a method that enables the synthesis of images b…

    Submitted 29 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  18. arXiv:2404.13534  [pdf, other]

    cs.CV

    Motion-aware Latent Diffusion Models for Video Frame Interpolation

    Authors: Zhilin Huang, Yijie Yu, Ling Yang, Chujun Qin, Bing Zheng, Xiawu Zheng, Zikun Zhou, Yaowei Wang, Wenming Yang

    Abstract: With the advancement of AIGC, video frame interpolation (VFI) has become a crucial component in existing video generation frameworks, attracting widespread research interest. For the VFI task, the motion estimation between neighboring frames plays a crucial role in avoiding motion ambiguity. However, existing VFI methods always struggle to accurately predict the motion information between consecut…

    Submitted 2 August, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: 17 pages, 4 figures

  19. arXiv:2404.13067  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards Efficient Resume Understanding: A Multi-Granularity Multi-Modal Pre-Training Approach

    Authors: Feihu Jiang, Chuan Qin, Jingshuai Zhang, Kaichun Yao, Xi Chen, Dazhong Shen, Chen Zhu, Hengshu Zhu, Hui Xiong

    Abstract: In the contemporary era of widespread online recruitment, resume understanding has been widely acknowledged as a fundamental and crucial task, which aims to extract structured information from resume documents automatically. Compared to the traditional rule-based approaches, the utilization of recently proposed pre-trained document understanding models can greatly enhance the effectiveness of resu…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: ICME 2024 Accepted

  20. arXiv:2404.12728  [pdf, other]

    cs.CL

    Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?

    Authors: Chengwei Qin, Wenhan Xia, Tan Wang, Fangkai Jiao, Yuchen Hu, Bosheng Ding, Ruirui Chen, Shafiq Joty

    Abstract: Analogical reasoning is a unique ability of humans to address unfamiliar challenges by transferring strategies from relevant past experiences. One key finding in psychology is that compared with irrelevant past experiences, recalling relevant ones can help humans better handle new tasks. Coincidentally, the NLP community has also recently found that self-generating relevant examples in the context…

    Submitted 23 June, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  21. arXiv:2404.10494  [pdf, other]

    cs.HC cs.LG

    BDAN: Mitigating Temporal Difference Across Electrodes in Cross-Subject Motor Imagery Classification via Generative Bridging Domain

    Authors: Zhige Chen, Rui Yang, Mengjie Huang, Chengxuan Qin, Zidong Wang

    Abstract: Because of "the non-repeatability of the experiment settings and conditions" and "the variability of brain patterns among subjects", the data distributions across sessions and electrodes are different in cross-subject motor imagery (MI) studies, eventually reducing the performance of the classification model. Systematically summarised based on the existing studies, a novel temporal-electrode data…

    Submitted 16 April, 2024; originally announced April 2024.

  22. arXiv:2404.08695  [pdf, other]

    cs.CL cs.AI cs.IR

    Enhancing Question Answering for Enterprise Knowledge Bases using Large Language Models

    Authors: Feihu Jiang, Chuan Qin, Kaichun Yao, Chuyu Fang, Fuzhen Zhuang, Hengshu Zhu, Hui Xiong

    Abstract: Efficient knowledge management plays a pivotal role in augmenting both the operational efficiency and the innovative capacity of businesses and organizations. By indexing knowledge through vectorization, a variety of knowledge retrieval methods have emerged, significantly enhancing the efficacy of knowledge management systems. Recently, the rapid advancements in generative natural language process…

    Submitted 20 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: DASFAA 2024 Accepted

  23. arXiv:2404.02507  [pdf, other]

    cs.CL

    Lifelong Event Detection with Embedding Space Separation and Compaction

    Authors: Chengwei Qin, Ruirui Chen, Ruochen Zhao, Wenhan Xia, Shafiq Joty

    Abstract: To mitigate forgetting, existing lifelong event detection methods typically maintain a memory module and replay the stored memory data during the learning of a new task. However, the simple combination of memory data and new-task samples can still result in substantial forgetting of previously acquired knowledge, which may occur due to the potential overlap between the feature distribution of new…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 main conference

  24. arXiv:2404.02106  [pdf, other]

    cs.CV cs.CE

    Neural Ordinary Differential Equation based Sequential Image Registration for Dynamic Characterization

    Authors: Yifan Wu, Mengjin Dong, Rohit Jena, Chen Qin, James C. Gee

    Abstract: Deformable image registration (DIR) is crucial in medical image analysis, enabling the exploration of biological dynamics such as organ motions and longitudinal changes in imaging. Leveraging Neural Ordinary Differential Equations (ODE) for registration, this extension work discusses how this framework can aid in the characterization of sequential biological processes. Utilizing the Neural ODE's a…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Journal extension of NODEO: A Neural Ordinary Differential Equation Based Optimization Framework for Deformable Image Registration, CVPR 2022

  25. arXiv:2404.00699  [pdf, other]

    cs.CL

    How Much are LLMs Contaminated? A Comprehensive Survey and the LLMSanitize Library

    Authors: Mathieu Ravaut, Bosheng Ding, Fangkai Jiao, Hailin Chen, Xingxuan Li, Ruochen Zhao, Chengwei Qin, Caiming Xiong, Shafiq Joty

    Abstract: With the rise of Large Language Models (LLMs) in recent years, new opportunities are emerging, but also new challenges, and contamination is quickly becoming critical. Business applications and fundraising in AI have reached a scale at which a few percentage points gained on popular question-answering benchmarks could translate into dozens of millions of dollars, placing high pressure on model int…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: 10 pages, 1 figure, 3 tables

  26. arXiv:2403.17416  [pdf, other]

    cs.IR

    AFDGCF: Adaptive Feature De-correlation Graph Collaborative Filtering for Recommendations

    Authors: Wei Wu, Chao Wang, Dazhong Shen, Chuan Qin, Liyi Chen, Hui Xiong

    Abstract: Collaborative filtering methods based on graph neural networks (GNNs) have witnessed significant success in recommender systems (RS), capitalizing on their ability to capture collaborative signals within intricate user-item relationships via message-passing mechanisms. However, these GNN-based RS inadvertently introduce excess linear correlation between user and item embeddings, contradicting the…

    Submitted 15 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted by SIGIR2024

  27. arXiv:2403.11299  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant

    Authors: Guohao Sun, Can Qin, Jiamian Wang, Zeyuan Chen, Ran Xu, Zhiqiang Tao

    Abstract: Recent advances in vision-language models have shown notable generalization in broad tasks through visual instruction tuning. However, bridging the gap between the pre-trained vision encoder and the large language models (LLMs) becomes the whole network's bottleneck. To improve cross-modality alignment, existing works usually consider more visual instruction data covering a broader range of vision…

    Submitted 15 July, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: ECCV 2024

  28. arXiv:2403.02990  [pdf, other]

    cs.CL cs.AI

    Data Augmentation using Large Language Models: Data Perspectives, Learning Paradigms and Challenges

    Authors: Bosheng Ding, Chengwei Qin, Ruochen Zhao, Tianze Luo, Xinze Li, Guizhen Chen, Wenhan Xia, Junjie Hu, Anh Tuan Luu, Shafiq Joty

    Abstract: In the rapidly evolving field of large language models (LLMs), data augmentation (DA) has emerged as a pivotal technique for enhancing model performance by diversifying training examples without the need for additional data collection. This survey explores the transformative impact of LLMs on DA, particularly addressing the unique challenges and opportunities they present in the context of natural…

    Submitted 2 July, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  29. arXiv:2402.11461  [pdf, other]

    cs.AI

    FGeo-HyperGNet: Geometric Problem Solving Integrating Formal Symbolic System and Hypergraph Neural Network

    Authors: Xiaokai Zhang, Na Zhu, Cheng Qin, Yang Li, Zhenbing Zeng, Tuo Leng

    Abstract: Geometric problem solving has always been a long-standing challenge in the fields of automated reasoning and artificial intelligence. We built a neural-symbolic system to automatically perform human-like geometric deductive reasoning. The symbolic part is a formal system built on FormalGeo, which can automatically perform geometric relational reasoning and algebraic calculations and organize the s…

    Submitted 22 April, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: 13 pages

  30. arXiv:2402.10592  [pdf, other]

    cs.LG econ.EM stat.ML

    Optimizing Adaptive Experiments: A Unified Approach to Regret Minimization and Best-Arm Identification

    Authors: Chao Qin, Daniel Russo

    Abstract: Practitioners conducting adaptive experiments often encounter two competing priorities: maximizing total welfare (or 'reward') through effective treatment assignment and swiftly concluding experiments to implement population-wide treatments. Current literature addresses these priorities separately, with regret minimization studies focusing on the former and best-arm identification research on the…

    Submitted 30 July, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  31. arXiv:2402.08692  [pdf, other]

    eess.IV cs.CV cs.LG

    Inference Stage Denoising for Undersampled MRI Reconstruction

    Authors: Yuyang Xue, Chen Qin, Sotirios A. Tsaftaris

    Abstract: Reconstruction of magnetic resonance imaging (MRI) data has been positively affected by deep learning. A key challenge remains: to improve generalisation to distribution shifts between the training and testing data. Most approaches aim to address this via inductive design or data augmentation. However, they can be affected by misleading data, e.g. random noise, and cases where the inference stage…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: This paper is accepted by ISBI 2024

  32. arXiv:2402.00658  [pdf, other]

    cs.AI cs.CL

    Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing

    Authors: Fangkai Jiao, Chengwei Qin, Zhengyuan Liu, Nancy F. Chen, Shafiq Joty

    Abstract: Large Language Models (LLMs) have demonstrated significant potential in handling complex reasoning tasks through step-by-step rationale generation. However, recent studies have raised concerns regarding the hallucination and flaws in their reasoning process. Substantial efforts are being made to improve the reliability and faithfulness of the generated rationales. Some approaches model reasoning a…

    Submitted 15 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 17 pages, 9 figures

  33. arXiv:2401.10749  [pdf, other]

    cs.CY cs.LG

    ReliCD: A Reliable Cognitive Diagnosis Framework with Confidence Awareness

    Authors: Yunfei Zhang, Chuan Qin, Dazhong Shen, Haiping Ma, Le Zhang, Xingyi Zhang, Hengshu Zhu

    Abstract: During the past few decades, cognitive diagnostics modeling has attracted increasing attention in computational education communities, which is capable of quantifying the learning status and knowledge mastery levels of students. Indeed, the recent advances in neural networks have greatly enhanced the performance of traditional cognitive diagnosis models through learning the deep representations of…

    Submitted 29 December, 2023; originally announced January 2024.

  34. arXiv:2401.04151  [pdf, other]

    cs.LG cs.CL

    Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning

    Authors: Wenhan Xia, Chengwei Qin, Elad Hazan

    Abstract: Fine-tuning is the primary methodology for tailoring pre-trained large language models to specific tasks. As the model's scale and the diversity of tasks expand, parameter-efficient fine-tuning methods are of paramount importance. One of the most widely used family of methods is low-rank adaptation (LoRA) and its variants. LoRA encodes weight update as the product of two low-rank matrices. Despite…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: Work in progress

  35. arXiv:2312.17055  [pdf, other]

    cs.CL

    Improving In-context Learning via Bidirectional Alignment

    Authors: Chengwei Qin, Wenhan Xia, Fangkai Jiao, Chen Chen, Yuchen Hu, Bosheng Ding, Shafiq Joty

    Abstract: Large language models (LLMs) have shown impressive few-shot generalization on many tasks via in-context learning (ICL). Despite their success in showing such emergent abilities, the scale and complexity of larger models also lead to unprecedentedly high computational demands and deployment challenges. In reaction, researchers explore transferring the powerful capabilities of larger models to more…

    Submitted 24 June, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  36. arXiv:2312.10032  [pdf, other]

    cs.CV

    Osprey: Pixel Understanding with Visual Instruction Tuning

    Authors: Yuqian Yuan, Wentong Li, Jian Liu, Dongqi Tang, Xinjie Luo, Chi Qin, Lei Zhang, Jianke Zhu

    Abstract: Multimodal large language models (MLLMs) have recently achieved impressive general-purpose vision-language capabilities through visual instruction tuning. However, current MLLMs primarily focus on image-level or box-level understanding, falling short in achieving fine-grained vision-language alignment at pixel level. Besides, the lack of mask-based instruction data limits their advancements. In th…

    Submitted 14 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: CVPR2024, Code and Demo link: https://github.com/CircleRadon/Osprey

  37. arXiv:2312.06117  [pdf, other]

    cs.CV

    M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking

    Authors: Jiaming Liu, Yue Wu, Maoguo Gong, Qiguang Miao, Wenping Ma, Can Qin

    Abstract: 3D Single Object Tracking (SOT) stands a forefront task of computer vision, proving essential for applications like autonomous driving. Sparse and occluded data in scene point clouds introduce variations in the appearance of tracked objects, adding complexity to the task. In this research, we unveil M3SOT, a novel 3D SOT framework, which synergizes multiple input frames (template sets), multiple r…

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: 12 pages, 10 figures, 10 tables, AAAI 2024

    Journal ref: AAAI 2024

  38. arXiv:2312.05959  [pdf, ps, other]

    cs.LG eess.SP

    VAE-IF: Deep feature extraction with averaging for fully unsupervised artifact detection in routinely acquired ICU time-series

    Authors: Hollan Haule, Ian Piper, Patricia Jones, Chen Qin, Tsz-Yan Milly Lo, Javier Escudero

    Abstract: Artifacts are a common problem in physiological time series collected from intensive care units (ICU) and other settings. They affect the quality and reliability of clinical research and patient care. Manual annotation of artifacts is costly and time-consuming, rendering it impractical. Automated methods are desired. Here, we propose a novel fully unsupervised approach to detect artifacts in clini…

    Submitted 2 August, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

  39. arXiv:2311.16989  [pdf, other]

    cs.CL

    ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up?

    Authors: Hailin Chen, Fangkai Jiao, Xingxuan Li, Chengwei Qin, Mathieu Ravaut, Ruochen Zhao, Caiming Xiong, Shafiq Joty

    Abstract: Upon its release in late 2022, ChatGPT has brought a seismic shift in the entire landscape of AI, both in research and commerce. Through instruction-tuning a large language model (LLM) with supervised fine-tuning and reinforcement learning from human feedback, it showed that a model could answer human questions and follow instructions on a broad panel of tasks. Following this success, interests in…

    Submitted 15 January, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: version v4, included latest top-performing open-sourced LLMs

  40. arXiv:2311.09852  [pdf, other]

    cs.RO cs.LG cs.MA

    Short vs. Long-term Coordination of Drones: When Distributed Optimization Meets Deep Reinforcement Learning

    Authors: Chuhao Qin, Evangelos Pournaras

    Abstract: Swarms of autonomous interactive drones, with the support of recharging technology, can provide compelling sensing capabilities in Smart Cities, such as traffic monitoring and disaster response. This paper aims to deliver a novel coordination solution for the cost-effective navigation, sensing, and recharging of drones. Existing approaches, such as deep reinforcement learning (DRL), offer long-ter…

    Submitted 12 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: This work has been submitted to the IEEE Transactions on Systems, Man and Cybernetics: Systems for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  41. arXiv:2310.19319  [pdf, other]

    stat.ML cs.LG

    Dual-Directed Algorithm Design for Efficient Pure Exploration

    Authors: Chao Qin, Wei You

    Abstract: We consider pure-exploration problems in the context of stochastic sequential adaptive experiments with a finite set of alternative options. The goal of the decision-maker is to accurately answer a query question regarding the alternatives with high confidence with minimal measurement efforts. A typical query question is to identify the alternative with the best performance, leading to ranking and…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: An earlier version of this paper appeared as an extended abstract in the Proceedings of the 36th Annual Conference on Learning Theory, COLT'23, with the title "Information-Directed Selection for Top-Two Algorithms."

  42. arXiv:2310.18021  [pdf, other]

    cs.AI

    FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving

    Authors: Xiaokai Zhang, Na Zhu, Yiming He, Jia Zou, Qike Huang, Xiaoxiao Jin, Yanjun Guo, Chenyang Mao, Yang Li, Zhe Zhu, Dengfeng Yue, Fangzhen Zhu, Yifan Wang, Yiwen Huang, Runan Wang, Cheng Qin, Zhenbing Zeng, Shaorong Xie, Xiangfeng Luo, Tuo Leng

    Abstract: This is the first paper in a series of work we have accomplished over the past three years. In this paper, we have constructed a consistent formal plane geometry system. This will serve as a crucial bridge between IMO-level plane geometry challenges and readable AI automated reasoning. Within this formal framework, we have been able to seamlessly integrate modern AI models with our formal system.…

    Submitted 14 February, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: 44 pages

  43. arXiv:2310.09886  [pdf, other]

    cs.CL cs.AI

    Lifelong Sequence Generation with Dynamic Module Expansion and Adaptation

    Authors: Chengwei Qin, Chen Chen, Shafiq Joty

    Abstract: Lifelong sequence generation (LSG), a problem in continual learning, aims to continually train a model on a sequence of generation tasks to learn constantly emerging new generation patterns while avoiding the forgetting of previous knowledge. Existing LSG methods mainly focus on maintaining old knowledge while paying little attention to knowledge transfer across tasks. In contrast, humans can bett…

    Submitted 22 November, 2023; v1 submitted 15 October, 2023; originally announced October 2023.

  44. arXiv:2310.09881  [pdf, other]

    cs.CL cs.AI

    In-Context Learning with Iterative Demonstration Selection

    Authors: Chengwei Qin, Aston Zhang, Chen Chen, Anirudh Dagar, Wenming Ye

    Abstract: Spurred by advancements in scale, large language models (LLMs) have demonstrated strong few-shot learning ability via in-context learning (ICL). However, the performance of ICL has been shown to be highly sensitive to the selection of few-shot demonstrations. Selecting the most suitable examples as context remains an ongoing challenge and an open problem. Existing literature has highlighted the im…

    Submitted 23 June, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

  45. arXiv:2310.08275  [pdf, other]

    cs.CR cs.SE

    Harnessing the Power of LLM to Support Binary Taint Analysis

    Authors: Puzhuo Liu, Chengnian Sun, Yaowen Zheng, Xuan Feng, Chuan Qin, Yuncheng Wang, Zhi Li, Limin Sun

    Abstract: This paper proposes LATTE, the first static binary taint analysis that is powered by a large language model (LLM). LATTE is superior to the state of the art (e.g., Emtaint, Arbiter, Karonte) in three aspects. First, LATTE is fully automated while prior static binary taint analyzers need to rely on human expertise to manually customize taint propagation rules and vulnerability inspection rules. Second…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 12 pages, 5 figures

  46. arXiv:2309.10836  [pdf, other]

    cs.CV

    CMRxRecon: An open cardiac MRI dataset for the competition of accelerated image reconstruction

    Authors: Chengyan Wang, Jun Lyu, Shuo Wang, Chen Qin, Kunyuan Guo, Xinyu Zhang, Xiaotong Yu, Yan Li, Fanwen Wang, Jianhua Jin, Zhang Shi, Ziqiang Xu, Yapeng Tian, Sha Hua, Zhensen Chen, Meng Liu, Mengting Sun, Xutong Kuang, Kang Wang, Haoran Wang, Hao Li, Yinghua Chu, Guang Yang, Wenjia Bai, Xiahai Zhuang, et al. (3 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (CMR) has emerged as a valuable diagnostic tool for cardiac diseases. However, a limitation of CMR is its slow imaging speed, which causes patient discomfort and introduces artifacts in the images. There has been growing interest in deep learning-based CMR imaging algorithms that can reconstruct high-quality images from highly under-sampled k-space data. However,…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: 14 pages, 8 figures

  47. arXiv:2309.08915  [pdf, other]

    cs.IT

    On non-expandable cross-bifix-free codes

    Authors: Chunyan Qin, Bocong Chen, Gaojun Luo

    Abstract: A cross-bifix-free code of length $n$ over $\mathbb{Z}_q$ is defined as a non-empty subset of $\mathbb{Z}_q^n$ satisfying that the prefix set of each codeword is disjoint from the suffix set of every codeword. Cross-bifix-free codes have found important applications in digital communication systems. One of the main research problems on cross-bifix-free codes is to construct cross-bifix-free codes…

    Submitted 16 September, 2023; originally announced September 2023.

    Comments: This paper has been submitted to IEEE T-IT for possible publication

    MSC Class: 94B25

  48. arXiv:2309.06837  [pdf, other]

    cs.RO

    Time-Optimal Gate-Traversing Planner for Autonomous Drone Racing

    Authors: Chao Qin, Maxime S. J. Michet, Jingxiang Chen, Hugh H. -T. Liu

    Abstract: In drone racing, the time-minimum trajectory is affected by the drone's capabilities, the layout of the race track, and the configurations of the gates (e.g., their shapes and sizes). However, previous studies neglect the configuration of the gates, simply rendering drone racing a waypoint-passing task. This formulation often leads to a conservative choice of paths through the gates, as the spatia…

    Submitted 4 May, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

  49. arXiv:2309.04702  [pdf, other]

    cs.CV

    A Spatial-Temporal Deformable Attention based Framework for Breast Lesion Detection in Videos

    Authors: Chao Qin, Jiale Cao, Huazhu Fu, Rao Muhammad Anwer, Fahad Shahbaz Khan

    Abstract: Detecting breast lesions in videos is crucial for computer-aided diagnosis. Existing video-based breast lesion detection approaches typically perform temporal feature aggregation of deep backbone features based on the self-attention operation. We argue that such a strategy struggles to effectively perform deep feature aggregation and ignores the useful local information. To tackle these issues, we…

    Submitted 9 September, 2023; originally announced September 2023.

    Comments: Accepted by MICCAI 2023

  50. arXiv:2308.06701  [pdf, other]

    cs.CV cs.AI cs.LG

    Camouflaged Image Synthesis Is All You Need to Boost Camouflaged Detection

    Authors: Haichao Zhang, Can Qin, Yu Yin, Yun Fu

    Abstract: Camouflaged objects that blend into natural scenes pose significant challenges for deep-learning models to detect and synthesize. While camouflaged object detection is a crucial task in computer vision with diverse real-world applications, this research topic has been constrained by limited data availability. We propose a framework for synthesizing camouflage data to enhance the detection of camou…

    Submitted 13 August, 2023; originally announced August 2023.