Search | arXiv e-print repository

Showing 1–50 of 911 results for author: Jin, X

  1. arXiv:2408.04275  [pdf, other]

    cs.DC

    Addressing Model and Data Heterogeneity in Multimodal Large Language Model Training

    Authors: Zili Zhang, Yinmin Zhong, Ranchen Ming, Hanpeng Hu, Jianjian Sun, Zheng Ge, Yibo Zhu, Xin Jin

    Abstract: Multimodal large language models (LLMs) have demonstrated significant potential in a wide range of AI applications. Yet, training multimodal LLMs suffers from low efficiency and scalability, due to the inherent model heterogeneity and data heterogeneity across different modalities. We present MMScale, an efficient and adaptive framework to reform the training of multimodal large language models…

    Submitted 8 August, 2024; originally announced August 2024.

  2. arXiv:2408.03032  [pdf, other]

    math.NA

    Flexible Quaternion Generalized Minimal Residual Method for Ill-Posed Quaternion Inverse Problems

    Authors: Xuan Liu, Zhigang Jia, Xiaoqing Jin

    Abstract: The main goal of this paper is to propose a new quaternion total variation regularization model for solving linear ill-posed quaternion inverse problems, which arise from three-dimensional signal filtering or color image processing. The quaternion total variation term in the model is represented by collaborative total variation regularization and approximated by a quaternion iteratively reweighted…

    Submitted 6 August, 2024; originally announced August 2024.

    Comments: 26 pages, 2 figures, 4 tables

    ACM Class: F.2.1; G.1.3; G.1.6; I.4.4

  3. arXiv:2408.01415  [pdf, other]

    cs.AI cs.LG

    Conditional LoRA Parameter Generation

    Authors: Xiaolong Jin, Kai Wang, Dongwen Tang, Wangbo Zhao, Yukun Zhou, Junshu Tang, Yang You

    Abstract: Generative models have achieved remarkable success in image, video, and text domains. Inspired by this, researchers have explored utilizing generative models to generate neural network parameters. However, these efforts have been limited by the parameter size and the practicality of generating high-performance parameters. In this paper, we propose COND P-DIFF, a novel approach that demonstrates th…

    Submitted 2 August, 2024; originally announced August 2024.

  4. arXiv:2407.21316  [pdf, other]

    cs.CR cs.LG

    Diff-Cleanse: Identifying and Mitigating Backdoor Attacks in Diffusion Models

    Authors: Jiang Hao, Xiao Jin, Hu Xiaoguang, Chen Tianyou

    Abstract: Diffusion models (DM) represent one of the most advanced generative models today, yet recent studies suggest that DMs are vulnerable to backdoor attacks. Backdoor attacks establish hidden associations between particular input patterns and model behaviors, compromising model integrity by triggering undesirable actions with manipulated input data. This vulnerability poses substantial risks, includin…

    Submitted 30 July, 2024; originally announced July 2024.

  5. arXiv:2407.20947  [pdf, other]

    cs.NE

    An Asynchronous Multi-core Accelerator for SNN inference

    Authors: Zhuo Chen, De Ma, Xiaofei Jin, Qinghui Xing, Ouwen Jin, Xin Du, Shuibing He, Gang Pan

    Abstract: Spiking Neural Networks (SNNs) are extensively utilized in brain-inspired computing and neuroscience research. To enhance the speed and energy efficiency of SNNs, several many-core accelerators have been developed. However, maintaining the accuracy of SNNs often necessitates frequent explicit synchronization among all cores, which presents a challenge to overall efficiency. In this paper, we propo…

    Submitted 30 July, 2024; originally announced July 2024.

  6. arXiv:2407.20761  [pdf, other]

    cs.AI

    OmniBal: Towards Fast Instruct-tuning for Vision-Language Models via Omniverse Computation Balance

    Authors: Yongqiang Yao, Jingru Tan, Jiahao Hu, Feizhao Zhang, Xin Jin, Bo Li, Ruihao Gong, Pengfei Liu

    Abstract: Recently, vision-language instruct-tuning models have made significant progress due to their more comprehensive understanding of the world. In this work, we discovered that large-scale 3D parallel training on those models leads to an imbalanced computation load across different devices. The vision and language parts are inherently heterogeneous: their data distribution and model architecture diffe…

    Submitted 30 July, 2024; originally announced July 2024.

  7. arXiv:2407.20018  [pdf, other]

    cs.DC

    Efficient Training of Large Language Models on Distributed Infrastructures: A Survey

    Authors: Jiangfei Duan, Shuo Zhang, Zerui Wang, Lijuan Jiang, Wenwen Qu, Qinghao Hu, Guoteng Wang, Qizhen Weng, Hang Yan, Xingcheng Zhang, Xipeng Qiu, Dahua Lin, Yonggang Wen, Xin Jin, Tianwei Zhang, Peng Sun

    Abstract: Large Language Models (LLMs) like GPT and LLaMA are revolutionizing the AI industry with their sophisticated capabilities. Training these models requires vast GPU clusters and significant computing time, posing major challenges in terms of scalability, efficiency, and reliability. This survey explores recent advancements in training systems for LLMs, including innovations in training infrastructur…

    Submitted 29 July, 2024; originally announced July 2024.

  8. arXiv:2407.18999  [pdf, other]

    cs.CV cs.LG

    Graph-based Unsupervised Disentangled Representation Learning via Multimodal Large Language Models

    Authors: Baao Xie, Qiuyu Chen, Yunnan Wang, Zequn Zhang, Xin Jin, Wenjun Zeng

    Abstract: Disentangled representation learning (DRL) aims to identify and decompose underlying factors behind observations, thus facilitating data perception and generation. However, current DRL approaches often rely on the unrealistic assumption that semantic factors are statistically independent. In reality, these factors may exhibit correlations, which off-the-shelf solutions have yet to properly address…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: 9 pages, 7 figures

  9. arXiv:2407.18957  [pdf, other]

    q-fin.TR cs.AI cs.MA

    When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments

    Authors: Chong Zhang, Xinyi Liu, Mingyu Jin, Zhongmou Zhang, Lingyao Li, Zhenting Wang, Wenyue Hua, Dong Shu, Suiyuan Zhu, Xiaobo Jin, Sujian Li, Mengnan Du, Yongfeng Zhang

    Abstract: Can AI Agents simulate real-world trading environments to investigate the impact of external factors on stock trading activities (e.g., macroeconomics, policy changes, company fundamentals, and global events)? These factors, which frequently influence trading behaviors, are critical elements in the quest for maximizing investors' profits. Our work attempts to solve this problem through large langu…

    Submitted 1 August, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: 33 pages, 10 figures

  10. arXiv:2407.18556  [pdf, other]

    cs.LG cs.AI

    Look Globally and Reason: Two-stage Path Reasoning over Sparse Knowledge Graphs

    Authors: Saiping Guan, Jiyao Wei, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng

    Abstract: Sparse Knowledge Graphs (KGs), frequently encountered in real-world applications, contain fewer facts in the form of (head entity, relation, tail entity) compared to more populated KGs. The sparse KG completion task, which reasons answers for given queries in the form of (head entity, relation, ?) for sparse KGs, is particularly challenging due to the necessity of reasoning missing facts based on…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: Accepted to CIKM 2024

  11. arXiv:2407.17570  [pdf, other]

    astro-ph.GA astro-ph.CO

    A SPectroscopic survey of biased halos In the Reionization Era (ASPIRE): Broad-line AGN at $z=4-5$ revealed by JWST/NIRCam WFSS

    Authors: Xiaojing Lin, Feige Wang, Xiaohui Fan, Zheng Cai, Jaclyn B. Champagne, Fengwu Sun, Marta Volonteri, Jinyi Yang, Joseph F. Hennawi, Eduardo Bañados, Aaron Barth, Anna-Christina Eilers, Emanuele Paolo Farina, Weizhe Liu, Xiangyu Jin, Hyunsung D. Jun, Alessandro Lupi, Koki Kakiichi, Chiara Mazzucchelli, Masafusa Onoue, Zhiwei Pan, Elia Pizzati, Sofía Rojas-Ruiz, Jan-Torge Schindler, Benny Trakhtenbrot, et al. (11 additional authors not shown)

    Abstract: Low-luminosity AGNs with low-mass black holes (BHs) in the early universe are fundamental to understanding the BH growth and their co-evolution with the host galaxies. Utilizing JWST NIRCam Wide Field Slitless Spectroscopy (WFSS), we perform a systematic search for broad-line ${\rm Hα}$ emitters (BHAEs) at $z\approx 4-5$ in 25 fields of the ASPIRE (A SPectroscopic survey of biased halos In the Rei…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 19 pages, 13 figures, 4 tables. Accepted by the ApJ

  12. arXiv:2407.17130  [pdf, other]

    math.NA

    Multiscale modeling for a class of high-contrast heterogeneous sign-changing problems

    Authors: Changqing Ye, Xingguang Jin, Patrick Ciarlet Jr., Eric T. Chung

    Abstract: The mathematical formulation of sign-changing problems involves a linear second-order partial differential equation in the divergence form, where the coefficient can assume positive and negative values in different subdomains. These problems find their physical background in negative-index metamaterials, either as inclusions embedded into common materials as the matrix or vice versa. In this paper…

    Submitted 24 July, 2024; originally announced July 2024.

  13. arXiv:2407.15173  [pdf, other]

    cs.CV

    Rethinking Domain Adaptation and Generalization in the Era of CLIP

    Authors: Ruoyu Feng, Tao Yu, Xin Jin, Xiaoyuan Yu, Lei Xiao, Zhibo Chen

    Abstract: In recent studies on domain adaptation, significant emphasis has been placed on the advancement of learning shared knowledge from a source domain to a target domain. Recently, the large vision-language pre-trained model, i.e., CLIP has shown strong ability on zero-shot recognition, and parameter efficient tuning can further improve its performance on specific tasks. This work demonstrates that a s…

    Submitted 21 July, 2024; originally announced July 2024.

  14. arXiv:2407.15142  [pdf, other]

    cond-mat.mtrl-sci cond-mat.str-el

    A Feasible Way to Find Above-Room-Temperature Ferromagnetic Spintronic Materials: from Flat Band Engineering

    Authors: Yuanji Xu, Xintao Jin, Jiacheng Xiang, Huiyuan Zhang, Fuyang Tian

    Abstract: Finding and designing ferromagnets that operate above room temperature is crucial in advancing high-performance spintronic devices. The pioneering van der Waals (vdW) ferromagnet Fe$_3$GaTe$_2$ has extended the way for spintronic applications by achieving a record-high Curie temperature among its analogues. However, the physical mechanism of increasing Curie temperature still needs to be explored.…

    Submitted 21 July, 2024; originally announced July 2024.

  15. arXiv:2407.14266  [pdf, other]

    cs.IR cs.LG

    L^2CL: Embarrassingly Simple Layer-to-Layer Contrastive Learning for Graph Collaborative Filtering

    Authors: Xinzhou Jin, Jintang Li, Liang Chen, Chenyun Yu, Yuanzhen Xie, Tao Xie, Chengxiang Zhuo, Zang Li, Zibin Zheng

    Abstract: Graph neural networks (GNNs) have recently emerged as an effective approach to model neighborhood signals in collaborative filtering. Towards this research line, graph contrastive learning (GCL) demonstrates robust capabilities to address the supervision label shortage issue through generating massive self-supervised signals. Despite its effectiveness, GCL for recommendation suffers seriously from…

    Submitted 19 July, 2024; originally announced July 2024.

  16. arXiv:2407.12371  [pdf, other]

    cs.CV cs.AI

    HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects

    Authors: Xintao Lv, Liang Xu, Yichao Yan, Xin Jin, Congsheng Xu, Shuwen Wu, Yifan Liu, Lincheng Li, Mengxiao Bi, Wenjun Zeng, Xiaokang Yang

    Abstract: Generating human-object interactions (HOIs) is critical with the tremendous advances of digital avatars. Existing datasets are typically limited to humans interacting with a single object while neglecting the ubiquitous manipulation of multiple objects. Thus, we propose HIMO, a large-scale MoCap dataset of full-body human interacting with multiple objects, containing 3.3K 4D HOI sequences and 4.08…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Project page: https://lvxintao.github.io/himo, accepted by ECCV 2024

  17. arXiv:2407.11700  [pdf, other]

    cs.CV eess.IV

    Rate-Distortion-Cognition Controllable Versatile Neural Image Compression

    Authors: Jinming Liu, Ruoyu Feng, Yunpeng Qi, Qiuyu Chen, Zhibo Chen, Wenjun Zeng, Xin Jin

    Abstract: Recently, the field of Image Coding for Machines (ICM) has garnered heightened interest and significant advances thanks to the rapid progress of learning-based techniques for image compression and analysis. Previous studies often require training separate codecs to support various bitrate levels, machine tasks, and networks, thus lacking both flexibility and practicality. To address these challeng…

    Submitted 17 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: ECCV2024

  18. arXiv:2407.08936  [pdf, ps, other]

    cs.LO

    HHLPar: Automated Theorem Prover for Parallel Hybrid Communicating Sequential Processes

    Authors: Xiangyu Jin, Bohua Zhan, Shuling Wang, Naijun Zhan

    Abstract: We present a tool called HHLPar for verifying hybrid systems modelled in Hybrid Communicating Sequential Processes (HCSP). HHLPar is built upon a Hybrid Hoare Logic for HCSP, which is able to reason about continuous-time properties of differential equations, as well as communication and parallel composition of parallel HCSP processes with the help of parameterised trace assertions and their synchr…

    Submitted 11 July, 2024; originally announced July 2024.

  19. arXiv:2407.07805  [pdf, other]

    cs.CV

    SUMix: Mixup with Semantic and Uncertain Information

    Authors: Huafeng Qin, Xin Jin, Hongyu Zhu, Hongchao Liao, Mounîm A. El-Yacoubi, Xinbo Gao

    Abstract: Mixup data augmentation approaches have been applied for various tasks of deep learning to improve the generalization ability of deep neural networks. Some existing approaches CutMix, SaliencyMix, etc. randomly replace a patch in one image with patches from another to generate the mixed image. Similarly, the corresponding labels are linearly combined by a fixed ratio $λ$ by l. The objects in two i…

    Submitted 17 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024 [Camera Ready] (19 pages, 7 figures) with the source code at https://github.com/JinXins/SUMix

  20. arXiv:2407.07771  [pdf, other]

    cs.CL cs.CV cs.MM

    Multi-task Prompt Words Learning for Social Media Content Generation

    Authors: Haochen Xue, Chong Zhang, Chengzhi Liu, Fangyu Wu, Xiaobo Jin

    Abstract: The rapid development of the Internet has profoundly changed human life. Humans are increasingly expressing themselves and interacting with others on social media platforms. However, although artificial intelligence technology has been widely used in many aspects of life, its application in social media content creation is still blank. To solve this problem, we propose a new prompt word generation…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 8 pages, 5 figures

    Journal ref: International Joint Conference on Neural Networks 2024

  21. arXiv:2407.06127  [pdf, other]

    cs.CV

    Better Sampling, towards Better End-to-end Small Object Detection

    Authors: Zile Huang, Chong Zhang, Mingyu Jin, Fangyu Wu, Chengzhi Liu, Xiaobo Jin

    Abstract: While deep learning-based general object detection has made significant strides in recent years, the effectiveness and efficiency of small object detection remain unsatisfactory. This is primarily attributed not only to the limited characteristics of such small targets but also to the high density and mutual overlap among these targets. The existing transformer-based small object detectors do not…

    Submitted 17 May, 2024; originally announced July 2024.

    Comments: 14 pages, 5 figures

  22. arXiv:2407.04957  [pdf, other]

    cond-mat.str-el

    Mechanism of magnetic phase transition in correlated magnetic metal: insight into itinerant ferromagnet Fe$_{3-δ}$GeTe$_2$

    Authors: Yuanji Xu, Yuechao Wang, Xintao Jin, Haifeng Liu, Yu Liu, Haifeng Song, Fuyang Tian

    Abstract: Developing a comprehensive magnetic theory of correlated itinerant magnets is a challenging task due to the difficulty in reconciling both local moments and itinerant electrons. In this work, we investigate the microscopic process of magnetic phase transition in ferromagnet metal Fe$_{3-δ}$GeTe$_2$. A new paradigm is proposed to describe the magnetic phase transition in correlated metallic ferroma…

    Submitted 6 July, 2024; originally announced July 2024.

  23. arXiv:2407.04697  [pdf, other]

    cs.CV cs.MM

    VCoME: Verbal Video Composition with Multimodal Editing Effects

    Authors: Weibo Gong, Xiaojie Jin, Xin Li, Dongliang He, Xinglong Wu

    Abstract: Verbal videos, featuring voice-overs or text overlays, provide valuable content but present significant challenges in composition, especially when incorporating editing effects to enhance clarity and visual appeal. In this paper, we introduce the novel task of verbal video composition with editing effects. This task aims to generate coherent and visually appealing verbal videos by integrating mult…

    Submitted 5 July, 2024; originally announced July 2024.

  24. arXiv:2407.04364  [pdf, other]

    math.NA

    Robust Multiscale Methods for Helmholtz equations in high contrast heterogeneous media

    Authors: Xingguang Jin, Changqing Ye, Eric T. Chung

    Abstract: In this paper, we provide the constraint energy minimization generalized multiscale finite element method (CEM-GMsFEM) to solve Helmholtz equations in heterogeneous medium. This novel multiscale method is specifically designed to overcome problems related to pollution effect, high-contrast coefficients, and the loss of hermiticity of operators. We establish the inf-sup stability and give an a prio…

    Submitted 8 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  25. arXiv:2407.03757  [pdf, other]

    cs.CV

    DiffRetouch: Using Diffusion to Retouch on the Shoulder of Experts

    Authors: Zheng-Peng Duan, Jiawei zhang, Zheng Lin, Xin Jin, Dongqing Zou, Chunle Guo, Chongyi Li

    Abstract: Image retouching aims to enhance the visual quality of photos. Considering the different aesthetic preferences of users, the target of retouching is subjective. However, current retouching methods mostly adopt deterministic models, which not only neglects the style diversity in the expert-retouched results and tends to learn an average style during training, but also lacks sample diversity during…

    Submitted 4 July, 2024; originally announced July 2024.

  26. arXiv:2407.02767  [pdf, other]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Comparison of Short-Range Order in GeSn Grown by Molecular Beam Epitaxy and Chemical Vapor Deposition

    Authors: Shang Liu, Yunfan Liang, Haochen Zhao, Nirosh M. Eldose, Jin-Hee Bae, Omar Concepcion, Xiaochen Jin, Shunda Chen, Ilias Bikmukhametov, Austin Akey, Cory T. Cline, Alejandra Cuervo Covian, Xiaoxin Wang, Tianshu Li, Yuping Zeng, Dan Buca, Shui-Qing Yu, Gregory J. Salamo, Shengbai Zhang, Jifeng Liu

    Abstract: Atomic short-range order (SRO) in direct-bandgap GeSn for infrared photonics has recently attracted attention due to its notable impact on band structures. However, the SRO in GeSn thin films grown by different methods has hardly been compared. This paper compares SRO in GeSn thin films of similar compositions grown by molecular beam epitaxy (MBE) and chemical vapor deposition (CVD) using atom pr…

    Submitted 2 July, 2024; originally announced July 2024.

  27. arXiv:2407.02345  [pdf, other]

    cs.CL

    MORPHEUS: Modeling Role from Personalized Dialogue History by Exploring and Utilizing Latent Space

    Authors: Yihong Tang, Bo Wang, Dongming Zhao, Xiaojia Jin, Jijun Zhang, Ruifang He, Yuexian Hou

    Abstract: Personalized Dialogue Generation (PDG) aims to create coherent responses according to roles or personas. Traditional PDG relies on external role data, which can be scarce and raise privacy concerns. Approaches address these issues by extracting role information from dialogue history, which often fail to generically model roles in continuous space. To overcome these limitations, we introduce a nove…

    Submitted 2 July, 2024; originally announced July 2024.

  28. arXiv:2407.02077  [pdf, other]

    cs.CV

    Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

    Authors: Bohan Li, Jiajun Deng, Wenyao Zhang, Zhujin Liang, Dalong Du, Xin Jin, Wenjun Zeng

    Abstract: Camera-based 3D semantic scene completion (SSC) is pivotal for predicting complicated 3D layouts with limited 2D image observations. The existing mainstream solutions generally leverage temporal information by roughly stacking history frames to supplement the current frame; such straightforward temporal modeling inevitably diminishes valid clues and increases learning difficulty. To address this p…

    Submitted 16 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  29. arXiv:2407.01162  [pdf, ps, other]

    physics.space-ph astro-ph.IM

    Actuation system of the inertial sensor for high-precision space missions using torsion pendulum

    Authors: Fangchao Yang, Yan Zhu, Xiaofei Jin, Yujie Zhao, Shixun Pei, Wei Hong

    Abstract: Precision space inertial sensors are imperative to Earth geodesy missions, gravitational wave observations and several fundamental physics experiments in space. In these missions, the residual acceleration noise of the test mass (TM) caused by the forces from inertial sensor components and environment is supposed to be kept below a certain level. As a number of forces contributing to residual accel…

    Submitted 10 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: 9 pages, 14 figures

  30. arXiv:2407.00603  [pdf, other]

    cs.CV

    Hierarchical Memory for Long Video QA

    Authors: Yiqin Wang, Haoji Zhang, Yansong Tang, Yong Liu, Jiashi Feng, Jifeng Dai, Xiaojie Jin

    Abstract: This paper describes our champion solution to the LOVEU Challenge @ CVPR'24, Track 1 (Long Video VQA). Processing long sequences of visual tokens is computationally expensive and memory-intensive, making long video question-answering a challenging task. The key is to compress visual tokens effectively, reducing memory footprint and decoding latency, while preserving the essential information for a…

    Submitted 30 June, 2024; originally announced July 2024.

  31. arXiv:2406.18485  [pdf, other]

    cs.DC

    LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism

    Authors: Diandian Gu, Peng Sun, Qinghao Hu, Ting Huang, Xun Chen, Yingtong Xiong, Guoteng Wang, Qiaoling Chen, Shangchun Zhao, Jiarui Fang, Yonggang Wen, Tianwei Zhang, Xin Jin, Xuanzhe Liu

    Abstract: Efficiently training LLMs with long sequences is important yet challenged by the massive computation and memory requirements. Sequence parallelism has been proposed to tackle these problems, but existing methods suffer from scalability or efficiency issues. We propose LoongTrain, a novel system to efficiently train LLMs with long sequences at scale. The core of LoongTrain is the 2D-Attention mecha…

    Submitted 26 June, 2024; originally announced June 2024.

  32. arXiv:2406.17562  [pdf]

    cond-mat.mtrl-sci physics.app-ph

    Low Excess Noise, High Quantum Efficiency Avalanche Photodiodes for Beyond 2 μm Wavelength Detection

    Authors: Hyemin Jung, Seunghyun Lee, Xiao Jin, Yifan Liu, Theodore J. Ronningen, Christoph H. Grein, John P. R. David, Sanjay Krishna

    Abstract: The increasing concentration of greenhouse gases, notably CH4 and CO2, has fueled global temperature increases, intensifying concerns regarding the prevailing climate crisis. Effectively monitoring these gases demands a detector spanning the extended short-wavelength infrared (~2.4 μm) range, covering wavelengths of CH4 (1.65 μm) and CO2 (2.05 μm). The state-of-the-art HgCdTe avalanche photodetect…

    Submitted 25 June, 2024; originally announced June 2024.

  33. arXiv:2406.14191  [pdf, other]

    cs.CL cs.AI cs.LG

    Temporal Knowledge Graph Question Answering: A Survey

    Authors: Miao Su, Zixuan Li, Zhuo Chen, Long Bai, Xiaolong Jin, Jiafeng Guo

    Abstract: Knowledge Base Question Answering (KBQA) has been a long-standing field to answer questions based on knowledge bases. Recently, the evolving dynamics of knowledge have attracted a growing interest in Temporal Knowledge Graph Question Answering (TKGQA), an emerging task to answer temporal questions. However, this field grapples with ambiguities in defining temporal questions and lacks a systematic…

    Submitted 5 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: 8 pages, 3 figures

  34. arXiv:2406.14026  [pdf, other]

    cs.LG cs.CL stat.ML

    Demystifying Forgetting in Language Model Fine-Tuning with Statistical Analysis of Example Associations

    Authors: Xisen Jin, Xiang Ren

    Abstract: Language models (LMs) are known to suffer from forgetting of previously learned examples when fine-tuned, breaking stability of deployed LM systems. Despite efforts on mitigating forgetting, few have investigated whether, and how forgotten upstream examples are associated with newly learned tasks. Insights on such associations enable efficient and targeted mitigation of forgetting. In this paper,…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 5 pages

  35. arXiv:2406.11648  [pdf, other]

    math.CO

    The number of quasi-trees of bouquets with exactly one non-orientable loop

    Authors: Qingying Deng, Xian'an Jin, Qi Yan

    Abstract: Recently, Merino extended the classical relation between the $2n$-th Fibonacci number and the number of spanning trees of the $n$-fan graph to ribbon graphs, and established a relation between the $n$-associated Mersenne number and the number of quasi-trees of the $n$-wheel ribbon graph. Moreover, Merino posed a problem of finding the Lucas numbers as the number of spanning quasi-trees of a family…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 22 pages, 3 figures

  36. arXiv:2406.08855  [pdf, other]

    cs.RO

    Trajectory Planning for Autonomous Driving in Unstructured Scenarios Based on Graph Neural Network and Numerical Optimization

    Authors: Sumin Zhang, Kuo Li, Rui He, Zhiwei Meng, Yupeng Chang, Xiaosong Jin, Ri Bai

    Abstract: In unstructured environments, obstacles are diverse and lack lane markings, making trajectory planning for intelligent vehicles a challenging task. Traditional trajectory planning methods typically involve multiple stages, including path planning, speed planning, and trajectory optimization. These methods require the manual design of numerous parameters for each stage, resulting in significant wor…

    Submitted 13 June, 2024; originally announced June 2024.

  37. arXiv:2406.08155  [pdf, other]

    cs.LG cs.AI cs.CL

    Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark

    Authors: Pingzhi Li, Xiaolong Jin, Yu Cheng, Tianlong Chen

    Abstract: Large Language Models (LLMs) have become foundational in the realm of natural language processing, demonstrating performance improvements as model sizes increase. The Mixture-of-Experts (MoE) approach offers a promising way to scale LLMs more efficiently by using fewer computational FLOPs through sparse activation. However, it suffers from significant memory overheads, necessitating model compress…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Our code for reproducing all our experiments is provided at https://github.com/UNITES-Lab/moe-quantization

  38. arXiv:2406.08085  [pdf, other]

    cs.CV

    Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams

    Authors: Haoji Zhang, Yiqin Wang, Yansong Tang, Yong Liu, Jiashi Feng, Jifeng Dai, Xiaojie Jin

    Abstract: Benefiting from the advancements in large language models and cross-modal alignment, existing multi-modal video understanding methods have achieved prominent performance in offline scenario. However, online video streams, as one of the most common media forms in the real world, have seldom received attention. Compared to offline videos, the 'dynamic' nature of online video streams poses challenges…

    Submitted 30 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  39. arXiv:2406.07006  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results

    Authors: Xin Jin, Chunle Guo, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun, Xiangyu Kong, Xiaoxia Xing, Jinlong Wu, Yuanyang Xue, Hyunhee Park, Sejun Song, Changho Kim, Jingfan Tan, et al. (17 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Few-shot RAW Image Denoising Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  40. arXiv:2406.06858  [pdf, other]

    cs.LG cs.DC

    FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion

    Authors: Li-Wen Chang, Wenlei Bao, Qi Hou, Chengquan Jiang, Ningxin Zheng, Yinmin Zhong, Xuanrun Zhang, Zuquan Song, Ziheng Jiang, Haibin Lin, Xin Jin, Xin Liu

    Abstract: Large deep learning models have demonstrated strong ability to solve many tasks across a wide range of applications. Those large models typically require training and inference to be distributed. Tensor parallelism is a common technique partitioning computation of an operation or layer across devices to overcome the memory capacity limitation of a single processor, and/or to accelerate computation…

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  41. arXiv:2406.06216  [pdf, other]

    cs.CV

    Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis

    Authors: Xin Jin, Pengyi Jiao, Zheng-Peng Duan, Xingchao Yang, Chun-Le Guo, Bo Ren, Chongyi Li

    Abstract: Volumetric rendering based methods, like NeRF, excel in HDR view synthesis from RAW images, especially for nighttime scenes. Yet they suffer from long training times and cannot perform real-time rendering due to dense sampling requirements. The advent of 3D Gaussian Splatting (3DGS) enables real-time rendering and faster training. However, implementing RAW image-based view synthesis directly usi…

    Submitted 10 June, 2024; originally announced June 2024.

  42. arXiv:2406.05392  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

    Authors: Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Yichen Wang, Shenghao Wu, Zongxing Xie, Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang

    Abstract: Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, an… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

  43. arXiv:2406.04591  [pdf, ps, other]

    math.DG math.AP

    Stability of the generalized Lagrangian mean curvature flow in cotangent bundle

    Authors: Xishen Jin, Jiawei Liu

    Abstract: In this paper, we consider the stability of the generalized Lagrangian mean curvature flow of graph case in the cotangent bundle, which is first defined by Smoczyk-Tsui-Wang. By new estimates of derivatives along the flow, we weaken the initial condition and remove the positive curvature condition in Smoczyk-Tsui-Wang's work. More precisely, we prove that if the graph induced by a closed $1$-form… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: All comments are welcome! arXiv admin note: text overlap with arXiv:1604.02936 by other authors

  44. arXiv:2406.02929  [pdf, other]

    cs.CV cs.LG

    Exploring Data Efficiency in Zero-Shot Learning with Diffusion Models

    Authors: Zihan Ye, Shreyank N. Gowda, Xiaobo Jin, Xiaowei Huang, Haotian Xu, Yaochu Jin, Kaizhu Huang

    Abstract: Zero-Shot Learning (ZSL) aims to enable classifiers to identify unseen classes by enhancing data efficiency at the class level. This is achieved by generating image features from pre-defined semantics of unseen classes. However, most current approaches heavily depend on the number of samples from seen classes, i.e. they do not consider instance-level effectiveness. In this paper, we demonstrate th… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  45. arXiv:2406.01245  [pdf, other]

    eess.IV

    Sparse Focus Network for Multi-Source Remote Sensing Data Classification

    Authors: Xuepeng Jin, Junyan Lin, Feng Gao, Lin Qi, Yang Zhou

    Abstract: Multi-source remote sensing data classification has emerged as a prominent research topic with the advancement of various sensors. Existing multi-source data classification methods are susceptible to irrelevant information interference during multi-source feature extraction and fusion. To solve this issue, we propose a sparse focus network for multi-source data classification. Sparse attention is… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE IGARSS 2024

  46. arXiv:2406.01235  [pdf, other]

    eess.IV

    Boosting Spatial-Spectral Masked Auto-Encoder Through Mining Redundant Spectra for HSI-SAR/LiDAR Classification

    Authors: Junyan Lin, Xuepeng Jin, Feng Gao, Junyu Dong, Hui Yu

    Abstract: Although recent masked image modeling (MIM)-based HSI-LiDAR/SAR classification methods have gradually recognized the importance of the spectral information, they have not adequately addressed the redundancy among different spectra, resulting in information leakage during the pretraining stage. This issue directly impairs the representation ability of the model. To tackle the problem, we propose a… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by IGARSS 2024

  47. arXiv:2406.00943  [pdf, other]

    cs.LG cs.AI

    State Space Models on Temporal Graphs: A First-Principles Study

    Authors: Jintang Li, Ruofan Wu, Xinzhou Jin, Boqun Ma, Liang Chen, Zibin Zheng

    Abstract: Over the past few years, research on deep graph learning has shifted from static graphs to temporal graphs in response to real-world complex systems that exhibit dynamic behaviors. In practice, temporal graphs are formalized as an ordered sequence of static graph snapshots observed at discrete time points. Sequence models such as RNNs or Transformers have long been the predominant backbone network… ▽ More

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: Preprint; Code will be made available at https://github.com/EdisonLeeeee/GraphSSM

  48. arXiv:2406.00275  [pdf, other]

    cs.CV cs.LG

    StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization

    Authors: Songhua Liu, Xin Jin, Xingyi Yang, Jingwen Ye, Xinchao Wang

    Abstract: Single domain generalization (single DG) aims at learning a robust model generalizable to unseen domains from only one training domain, making it a highly ambitious and challenging task. State-of-the-art approaches have mostly relied on data augmentations, such as adversarial perturbation and style enhancement, to synthesize new data and thus increase robustness. Nevertheless, they have largely ov… ▽ More

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024; Work in 2022 spring

  49. arXiv:2405.19946  [pdf, other]

    cs.AI

    Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf

    Authors: Xuanfa Jin, Ziyan Wang, Yali Du, Meng Fang, Haifeng Zhang, Jun Wang

    Abstract: Communication is a fundamental aspect of human society, facilitating the exchange of information and beliefs among people. Despite the advancements in large language models (LLMs), recent agents built with these often neglect the control over discussion tactics, which are essential in communication scenarios and games. As a variant of the famous communication game Werewolf, One Night Ultimate Were… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 27 pages, 5 figures

  50. arXiv:2405.19850  [pdf, other]

    cs.AI

    Deciphering Human Mobility: Inferring Semantics of Trajectories with Large Language Models

    Authors: Yuxiao Luo, Zhongcai Cao, Xin Jin, Kang Liu, Ling Yin

    Abstract: Understanding human mobility patterns is essential for various applications, from urban planning to public safety. The individual trajectory such as mobile phone location data, while rich in spatio-temporal information, often lacks semantic detail, limiting its utility for in-depth mobility analysis. Existing methods can infer basic routine activity sequences from this data, lacking depth in under… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.