Search | arXiv e-print repository

Showing 1–50 of 1,946 results for author: Liu, L

Searching in archive cs.
  1. arXiv:2409.08091  [pdf, other]

    cs.CV

    EZIGen: Enhancing zero-shot subject-driven image generation with precise subject encoding and decoupled guidance

    Authors: Zicheng Duan, Yuxuan Ding, Chenhui Gou, Ziqin Zhou, Ethan Smith, Lingqiao Liu

    Abstract: Zero-shot subject-driven image generation aims to produce images that incorporate a subject from a given example image. The challenge lies in preserving the subject's identity while aligning with the text prompt, which often requires modifying certain aspects of the subject's appearance. Despite advancements in diffusion model based methods, existing approaches still struggle to balance identity p…

    Submitted 12 September, 2024; originally announced September 2024.

  2. arXiv:2409.07907  [pdf, other]

    cs.CV

    From COCO to COCO-FP: A Deep Dive into Background False Positives for COCO Detectors

    Authors: Longfei Liu, Wen Guo, Shihua Huang, Cheng Li, Xi Shen

    Abstract: Reducing false positives is essential for enhancing object detector performance, as reflected in the mean Average Precision (mAP) metric. Although object detectors have achieved notable improvements and high mAP scores on the COCO dataset, analysis reveals limited progress in addressing false positives caused by non-target visual clutter-background objects not included in the annotated categories.…

    Submitted 12 September, 2024; originally announced September 2024.

  3. arXiv:2409.07092  [pdf, other]

    eess.IV cs.AI cs.CV

    CWT-Net: Super-resolution of Histopathology Images Using a Cross-scale Wavelet-based Transformer

    Authors: Feiyang Jia, Zhineng Chen, Ziying Song, Lin Liu, Caiyan Jia

    Abstract: Super-resolution (SR) aims to enhance the quality of low-resolution images and has been widely applied in medical imaging. We found that the design principles of most existing methods are influenced by SR tasks based on real-world images and do not take into account the significance of the multi-level structure in pathological images, even if they can achieve respectable objective metric evaluatio…

    Submitted 11 September, 2024; originally announced September 2024.

  4. arXiv:2409.06368  [pdf, other]

    cs.GR

    Fiber-level Woven Fabric Capture from a Single Photo

    Authors: Zixuan Li, Pengfei Shen, Hanxiao Sun, Zibo Zhang, Yu Guo, Ligang Liu, Ling-Qi Yan, Steve Marschner, Milos Hasan, Beibei Wang

    Abstract: Accurately rendering the appearance of fabrics is challenging, due to their complex 3D microstructures and specialized optical properties. If we model the geometry and optics of fabrics down to the fiber level, we can achieve unprecedented rendering realism, but this raises the difficulty of authoring or capturing the fiber-level assets. Existing approaches can obtain fiber-level geometry with spe…

    Submitted 10 September, 2024; originally announced September 2024.

    Comments: due to the limitation "The abstract field cannot be longer than 1,920 characters", the abstract appearing here is slightly shorter than that in the PDF file

  5. arXiv:2409.06285  [pdf, other]

    cs.CV

    Context Enhancement with Reconstruction as Sequence for Unified Unsupervised Anomaly Detection

    Authors: Hui-Yue Yang, Hui Chen, Lihao Liu, Zijia Lin, Kai Chen, Liejun Wang, Jungong Han, Guiguang Ding

    Abstract: Unsupervised anomaly detection (AD) aims to train robust detection models using only normal samples, while generalizing well to unseen anomalies. Recent research focuses on a unified unsupervised AD setting in which only one model is trained for all classes, i.e., the n-class-one-model paradigm. Feature-reconstruction-based methods achieve state-of-the-art performance in this scenario. However, exis…

    Submitted 10 September, 2024; originally announced September 2024.

  6. arXiv:2409.05370  [pdf, other]

    cs.CV cs.AI

    KARGEN: Knowledge-enhanced Automated Radiology Report Generation Using Large Language Models

    Authors: Yingshu Li, Zhanyu Wang, Yunyi Liu, Lei Wang, Lingqiao Liu, Luping Zhou

    Abstract: Harnessing the robust capabilities of Large Language Models (LLMs) for narrative generation, logical reasoning, and common-sense knowledge integration, this study delves into utilizing LLMs to enhance automated radiology report generation (R2Gen). Despite the wealth of knowledge within LLMs, efficiently triggering relevant knowledge within these large models for specific tasks like R2Gen poses a c…

    Submitted 9 September, 2024; originally announced September 2024.

  7. arXiv:2409.04704  [pdf, other]

    cs.LG cs.AI

    A Multi-scenario Attention-based Generative Model for Personalized Blood Pressure Time Series Forecasting

    Authors: Cheng Wan, Chenjie Xie, Longfei Liu, Dan Wu, Ye Li

    Abstract: Continuous blood pressure (BP) monitoring is essential for timely diagnosis and intervention in critical care settings. However, BP varies significantly across individuals; this inter-patient variability motivates the development of personalized models tailored to each patient's physiology. In this work, we propose a personalized BP forecasting model mainly using electrocardiogram (ECG) and photop…

    Submitted 7 September, 2024; originally announced September 2024.

    Comments: 5 pages, 2 figures

  8. arXiv:2409.04421  [pdf, other]

    cs.CL cs.AI cs.LG

    RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs

    Authors: Jiaxing Wu, Lin Ning, Luyang Liu, Harrison Lee, Neo Wu, Chao Wang, Sushant Prakash, Shawn O'Banion, Bradley Green, Jun Xie

    Abstract: LLM-powered personalization agent systems employ Large Language Models (LLMs) to predict users' behavior from their past activities. However, their effectiveness often hinges on the ability to effectively leverage extensive, long user historical data due to its inherent noise and length of such data. Existing pretrained LLMs may generate summaries that are concise but lack the necessary context fo…

    Submitted 6 September, 2024; originally announced September 2024.

  9. Achieving the Safety and Security of the End-to-End AV Pipeline

    Authors: Noah T. Curran, Minkyoung Cho, Ryan Feng, Liangkai Liu, Brian Jay Tang, Pedram MohajerAnsari, Alkim Domeke, Mert D. Pesé, Kang G. Shin

    Abstract: In the current landscape of autonomous vehicle (AV) safety and security research, there are multiple isolated problems being tackled by the community at large. Due to the lack of common evaluation criteria, several important research questions are at odds with one another. For instance, while much research has been conducted on physical attacks deceiving AV perception systems, there is often inade…

    Submitted 5 September, 2024; originally announced September 2024.

    Comments: Accepted to 1st Cyber Security in Cars Workshop (CSCS) at CCS

  10. arXiv:2409.02708  [pdf, other]

    cs.LG stat.ME

    Few-shot Multi-Task Learning of Linear Invariant Features with Meta Subspace Pursuit

    Authors: Chaozhi Zhang, Lin Liu, Xiaoqun Zhang

    Abstract: Data scarcity poses a serious threat to modern machine learning and artificial intelligence, as their practical success typically relies on the availability of big datasets. One effective strategy to mitigate the issue of insufficient data is to first harness information from other data sources possessing certain similarities in the study design stage, and then employ the multi-task or meta learni…

    Submitted 4 September, 2024; originally announced September 2024.

  11. arXiv:2409.02494  [pdf, other]

    cs.CV

    Plane2Depth: Hierarchical Adaptive Plane Guidance for Monocular Depth Estimation

    Authors: Li Liu, Ruijie Zhu, Jiacheng Deng, Ziyang Song, Wenfei Yang, Tianzhu Zhang

    Abstract: Monocular depth estimation aims to infer a dense depth map from a single image, which is a fundamental and prevalent task in computer vision. Many previous works have shown impressive depth estimation results through carefully designed network structures, but they usually ignore the planar information and therefore perform poorly in low-texture areas of indoor scenes. In this paper, we propose Pla…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: 14 pages, 12 figures, 8 tables

  12. arXiv:2409.01695  [pdf, other]

    cs.SD cs.AI eess.AS

    USTC-KXDIGIT System Description for ASVspoof5 Challenge

    Authors: Yihao Chen, Haochen Wu, Nan Jiang, Xiang Xia, Qing Gu, Yunqi Hao, Pengfei Cai, Yu Guan, Jialong Wang, Weilin Xie, Lei Fang, Sian Fang, Yan Song, Wu Guo, Lin Liu, Minqiang Xu

    Abstract: This paper describes the USTC-KXDIGIT system submitted to the ASVspoof5 Challenge for Track 1 (speech deepfake detection) and Track 2 (spoofing-robust automatic speaker verification, SASV). Track 1 showcases a diverse range of technical qualities from potential processing algorithms and includes both open and closed conditions. For these conditions, our system consists of a cascade of a frontend f…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: ASVspoof5 workshop paper

  13. arXiv:2409.01672  [pdf, other]

    cs.CV cs.AI cs.LG

    Enhancing Fine-Grained Visual Recognition in the Low-Data Regime Through Feature Magnitude Regularization

    Authors: Avraham Chapman, Haiming Xu, Lingqiao Liu

    Abstract: Training a fine-grained image recognition model with limited data presents a significant challenge, as the subtle differences between categories may not be easily discernible amidst distracting noise patterns. One commonly employed strategy is to leverage pretrained neural networks, which can generate effective feature representations for constructing an image classification model with a restricte…

    Submitted 7 September, 2024; v1 submitted 3 September, 2024; originally announced September 2024.

  14. arXiv:2409.01586  [pdf, other]

    cs.CL cs.AI

    Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation

    Authors: Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu

    Abstract: Harmful fine-tuning issue \citep{qi2023fine} poses serious safety concerns for Large language models' fine-tuning-as-a-service. While existing defenses \citep{huang2024vaccine,rosati2024representation} have been proposed to mitigate the issue, their performances are still far away from satisfactory, and the root cause of the problem has not been fully recovered. For the first time in the literatur…

    Submitted 4 September, 2024; v1 submitted 2 September, 2024; originally announced September 2024.

  15. arXiv:2409.01327  [pdf, other]

    cs.CV

    SPDiffusion: Semantic Protection Diffusion for Multi-concept Text-to-image Generation

    Authors: Yang Zhang, Rui Zhang, Xuecheng Nie, Haochen Li, Jikun Chen, Yifan Hao, Xin Zhang, Luoqi Liu, Ling Li

    Abstract: Recent text-to-image models have achieved remarkable success in generating high-quality images. However, when tasked with multi-concept generation which creates images containing multiple characters or objects, existing methods often suffer from attribute confusion, resulting in severe text-image inconsistency. We found that attribute confusion occurs when a certain region of the latent features a…

    Submitted 2 September, 2024; originally announced September 2024.

  16. arXiv:2409.01027  [pdf]

    cs.HC

    Mindscape: Research of high-information density street environments based on electroencephalogram recording and virtual reality head-mounted simulation

    Authors: Yijiang Liu, Xiangyu Guan, Hui Wang, Lun Liu

    Abstract: This study aims to investigate, through neuroscientific methods, the effects of particular architectural elements on pedestrian spatial cognition and experience in the analysis and design of walking street spaces. More precisely, this paper will describe the impact of the density variation of storefront signs on the brainwaves of passersby in East Asian city walking streets, providing strategies a…

    Submitted 2 September, 2024; originally announced September 2024.

    Comments: 10 pages, 10 figures, This paper has been accepted at the eCAADe 2024 Conference

    ACM Class: J.6

  17. arXiv:2409.00843  [pdf, other]

    econ.GN cs.CE cs.CY q-fin.CP stat.ML

    Global Public Sentiment on Decentralized Finance: A Spatiotemporal Analysis of Geo-tagged Tweets from 150 Countries

    Authors: Yuqi Chen, Yifan Li, Kyrie Zhixuan Zhou, Xiaokang Fu, Lingbo Liu, Shuming Bao, Daniel Sui, Luyao Zhang

    Abstract: In the digital era, blockchain technology, cryptocurrencies, and non-fungible tokens (NFTs) have transformed financial and decentralized systems. However, existing research often neglects the spatiotemporal variations in public sentiment toward these technologies, limiting macro-level insights into their global impact. This study leverages Twitter data to explore public attention and sentiment acr…

    Submitted 1 September, 2024; originally announced September 2024.

  18. arXiv:2409.00750  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer

    Authors: Yuancheng Wang, Haoyue Zhan, Liwei Liu, Ruihong Zeng, Haotian Guo, Jiachen Zheng, Qiang Zhang, Shunsi Zhang, Zhizheng Wu

    Abstract: Nowadays, large-scale text-to-speech (TTS) systems are primarily divided into two types: autoregressive and non-autoregressive. The autoregressive systems have certain deficiencies in robustness and cannot control speech duration. In contrast, non-autoregressive systems require explicit prediction of phone-level duration, which may compromise their naturalness. We introduce the Masked Generative C…

    Submitted 1 September, 2024; originally announced September 2024.

  19. arXiv:2409.00700  [pdf, other]

    cs.SD cs.AI cs.CV eess.AS

    Seeing Your Speech Style: A Novel Zero-Shot Identity-Disentanglement Face-based Voice Conversion

    Authors: Yan Rong, Li Liu

    Abstract: Face-based Voice Conversion (FVC) is a novel task that leverages facial images to generate the target speaker's voice style. Previous work has two shortcomings: (1) suffering from obtaining facial embeddings that are well-aligned with the speaker's voice identity information, and (2) inadequacy in decoupling content and speaker identity information from the audio input. To address these issues, we…

    Submitted 1 September, 2024; originally announced September 2024.

  20. arXiv:2409.00369   

    cs.CL

    An Empirical Study on Information Extraction using Large Language Models

    Authors: Ridong Han, Chaohao Yang, Tao Peng, Prayag Tiwari, Xiang Wan, Lu Liu, Benyou Wang

    Abstract: Human-like large language models (LLMs), especially the most powerful and popular ones in OpenAI's GPT family, have proven to be very helpful for many natural language processing (NLP) related tasks. Therefore, various attempts have been made to apply LLMs to information extraction (IE), which is a fundamental NLP task that involves extracting information from unstructured plain text. To demonstra…

    Submitted 9 September, 2024; v1 submitted 31 August, 2024; originally announced September 2024.

    Comments: This submission was intended instead as the replacement of arXiv:2305.14450 , where it now appears as arXiv:2305.14450v2

  21. arXiv:2409.00107  [pdf, other]

    eess.SY cs.AI cs.LG econ.GN math.OC

    Evaluating the Impact of Multiple DER Aggregators on Wholesale Energy Markets: A Hybrid Mean Field Approach

    Authors: Jun He, Andrew L. Liu

    Abstract: The integration of distributed energy resources (DERs) into wholesale energy markets can greatly enhance grid flexibility, improve market efficiency, and contribute to a more sustainable energy future. As DERs -- such as solar PV panels and energy storage -- proliferate, effective mechanisms are needed to ensure that small prosumers can participate meaningfully in these markets. We study a wholesa…

    Submitted 27 August, 2024; originally announced September 2024.

  22. arXiv:2408.16975  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Technical Report of HelixFold3 for Biomolecular Structure Prediction

    Authors: Lihang Liu, Shanzhuo Zhang, Yang Xue, Xianbin Ye, Kunrui Zhu, Yuxin Li, Yang Liu, Wenlai Zhao, Hongkun Yu, Zhihua Wu, Xiaonan Zhang, Xiaomin Fang

    Abstract: The AlphaFold series has transformed protein structure prediction with remarkable accuracy, often matching experimental methods. AlphaFold2, AlphaFold-Multimer, and the latest AlphaFold3 represent significant strides in predicting single protein chains, protein complexes, and biomolecular structures. While AlphaFold2 and AlphaFold-Multimer are open-sourced, facilitating rapid and reliable predicti…

    Submitted 8 September, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

  23. arXiv:2408.16966  [pdf, other]

    cs.LG cs.AI cs.CL

    UserSumBench: A Benchmark Framework for Evaluating User Summarization Approaches

    Authors: Chao Wang, Neo Wu, Lin Ning, Jiaxing Wu, Luyang Liu, Jun Xie, Shawn O'Banion, Bradley Green

    Abstract: Large language models (LLMs) have shown remarkable capabilities in generating user summaries from a long list of raw user activity data. These summaries capture essential user information such as preferences and interests, and therefore are invaluable for LLM-based personalization applications, such as explainable recommender systems. However, the development of new summarization techniques is hin…

    Submitted 5 September, 2024; v1 submitted 29 August, 2024; originally announced August 2024.

  24. arXiv:2408.15861  [pdf, other]

    cs.CR cs.LG

    Fusing Pruned and Backdoored Models: Optimal Transport-based Data-free Backdoor Mitigation

    Authors: Weilin Lin, Li Liu, Jianze Li, Hui Xiong

    Abstract: Backdoor attacks present a serious security threat to deep neuron networks (DNNs). Although numerous effective defense techniques have been proposed in recent years, they inevitably rely on the availability of either clean or poisoned data. In contrast, data-free defense techniques have evolved slowly and still lag significantly in performance. To address this issue, different from the traditional…

    Submitted 28 August, 2024; originally announced August 2024.

  25. arXiv:2408.15813  [pdf, other]

    cs.CV

    DQFormer: Towards Unified LiDAR Panoptic Segmentation with Decoupled Queries

    Authors: Yu Yang, Jianbiao Mei, Liang Liu, Siliang Du, Yilin Xiao, Jongwon Ra, Yong Liu, Xiao Xu, Huifeng Wu

    Abstract: LiDAR panoptic segmentation, which jointly performs instance and semantic segmentation for things and stuff classes, plays a fundamental role in LiDAR perception tasks. While most existing methods explicitly separate these two segmentation tasks and utilize different branches (i.e., semantic and instance branches), some recent methods have embraced the query-based paradigm to unify LiDAR panoptic…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: 13 pages, 10 figures

  26. arXiv:2408.15065  [pdf, other]

    stat.ML cs.LG math.ST

    The Benefits of Balance: From Information Projections to Variance Reduction

    Authors: Lang Liu, Ronak Mehta, Soumik Pal, Zaid Harchaoui

    Abstract: Data balancing across multiple modalities/sources appears in various forms in several foundation models (e.g., CLIP and DINO) achieving universal representation learning. We show that this iterative algorithm, usually used to avoid representation collapse, enjoys an unsuspected benefit: reducing the variance of estimators that are functionals of the empirical distribution over these sources. We pr…

    Submitted 27 August, 2024; originally announced August 2024.

  27. arXiv:2408.14976  [pdf, other]

    cs.LG cs.AI cs.CV

    Prior-free Balanced Replay: Uncertainty-guided Reservoir Sampling for Long-Tailed Continual Learning

    Authors: Lei Liu, Li Liu, Yawen Cui

    Abstract: Even in the era of large models, one of the well-known issues in continual learning (CL) is catastrophic forgetting, which is significantly challenging when the continual data stream exhibits a long-tailed distribution, termed as Long-Tailed Continual Learning (LTCL). Existing LTCL solutions generally require the label distribution of the data stream to achieve re-balance training. However, obtain…

    Submitted 27 August, 2024; originally announced August 2024.

  28. arXiv:2408.14513  [pdf, other]

    cs.LG cs.AI

    Variational autoencoder-based neural network model compression

    Authors: Liang Cheng, Peiyuan Guan, Amir Taherkordi, Lei Liu, Dapeng Lan

    Abstract: Variational Autoencoders (VAEs), as a form of deep generative model, have been widely used in recent years, and shown great performance in a number of different domains, including image generation and anomaly detection, etc. This paper aims to explore a neural network model compression method based on VAE. The experiment uses different neural network models for MNIST recognition as compression…

    Submitted 25 August, 2024; originally announced August 2024.

  29. arXiv:2408.13985  [pdf, other]

    cs.CL

    TF-Attack: Transferable and Fast Adversarial Attacks on Large Language Models

    Authors: Zelin Li, Kehai Chen, Lemao Liu, Xuefeng Bai, Mingming Yang, Yang Xiang, Min Zhang

    Abstract: With the great advancements in large language models (LLMs), adversarial attacks against LLMs have recently attracted increasing attention. We found that pre-existing adversarial attack methodologies exhibit limited transferability and are notably inefficient, particularly when applied to LLMs. In this paper, we analyze the core mechanisms of previous predominant adversarial attack methods, reveal…

    Submitted 8 September, 2024; v1 submitted 25 August, 2024; originally announced August 2024.

    Comments: 14 pages, 6 figures

  30. arXiv:2408.13858  [pdf, other]

    cs.CV cs.LG

    Draw Like an Artist: Complex Scene Generation with Diffusion Model via Composition, Painting, and Retouching

    Authors: Minghao Liu, Le Zhang, Yingjie Tian, Xiaochao Qu, Luoqi Liu, Ting Liu

    Abstract: Recent advances in text-to-image diffusion models have demonstrated impressive capabilities in image quality. However, complex scene generation remains relatively unexplored, and even the definition of 'complex scene' itself remains unclear. In this paper, we address this gap by providing a precise definition of complex scenes and introducing a set of Complex Decomposition Criteria (CDC) based on…

    Submitted 25 August, 2024; originally announced August 2024.

  31. arXiv:2408.13781  [pdf, other]

    cs.NI

    Demo: Generative Open xG Network Simulation with Multi-Agent LLM and ns-3 (GenOnet)

    Authors: Farhad Rezazadeh, Amir Ashtari Gargari, Sandra Lagén, Josep Mangues, Dusit Niyato, Lingjia Liu

    Abstract: The move toward Sixth-Generation (6G) networks relies on open interfaces and protocols for seamless interoperability across devices, vendors, and technologies. In this context, open 6G development involves multiple disciplines and requires advanced simulation approaches for testing. In this demo paper, we propose a generative simulation approach based on a multi-agent Large Language Model (LLM) an…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: 3 pages, 4 figures

  32. arXiv:2408.12185  [pdf, other]

    cs.LG cs.AI cs.IR

    Rank and Align: Towards Effective Source-free Graph Domain Adaptation

    Authors: Junyu Luo, Zhiping Xiao, Yifan Wang, Xiao Luo, Jingyang Yuan, Wei Ju, Langechuan Liu, Ming Zhang

    Abstract: Graph neural networks (GNNs) have achieved impressive performance in graph domain adaptation. However, extensive source graphs could be unavailable in real-world scenarios due to privacy and storage concerns. To this end, we investigate an underexplored yet practical problem of source-free graph domain adaptation, which transfers knowledge from source models instead of source graphs to a target do…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: Published in IJCAI2024

  33. arXiv:2408.12063  [pdf, other]

    stat.ML cs.AI cs.LG physics.ao-ph

    A Deconfounding Approach to Climate Model Bias Correction

    Authors: Wentao Gao, Jiuyong Li, Debo Cheng, Lin Liu, Jixue Liu, Thuc Duy Le, Xiaojing Du, Xiongren Chen, Yanchang Zhao, Yun Chen

    Abstract: Global Climate Models (GCMs) are crucial for predicting future climate changes by simulating the Earth systems. However, GCM outputs exhibit systematic biases due to model uncertainties, parameterization simplifications, and inadequate representation of complex climate phenomena. Traditional bias correction methods, which rely on historical observation data and statistical techniques, often neglec…

    Submitted 21 August, 2024; originally announced August 2024.

  34. arXiv:2408.11587  [pdf, other]

    cs.CL cs.CR

    Large Language Models are Good Attackers: Efficient and Stealthy Textual Backdoor Attacks

    Authors: Ziqiang Li, Yueqi Zeng, Pengfei Xia, Lei Liu, Zhangjie Fu, Bin Li

    Abstract: With the burgeoning advancements in the field of natural language processing (NLP), the demand for training data has increased significantly. To save costs, it has become common for users and businesses to outsource the labor-intensive task of data collection to third-party entities. Unfortunately, recent research has unveiled the inherent risk associated with this practice, particularly in exposi…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: Under Review

  35. arXiv:2408.11535  [pdf, other]

    cs.CV

    SAM-REF: Rethinking Image-Prompt Synergy for Refinement in Segment Anything

    Authors: Chongkai Yu, Anqi Li, Xiaochao Qu, Luoqi Liu, Ting Liu

    Abstract: The advent of the Segment Anything Model (SAM) marks a significant milestone for interactive segmentation using generalist models. As a late fusion model, SAM extracts image embeddings once and merges them with prompts in later interactions. This strategy limits the model's ability to extract detailed information from the prompted target zone. Current specialist models utilize the early fusion stra…

    Submitted 22 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

  36. arXiv:2408.11492  [pdf, other]

    cs.AI

    Estimating Peer Direct and Indirect Effects in Observational Network Data

    Authors: Xiaojing Du, Jiuyong Li, Debo Cheng, Lin Liu, Wentao Gao, Xiongren Chen

    Abstract: Estimating causal effects is crucial for decision-makers in many applications, but it is particularly challenging with observational network data due to peer interactions. Many algorithms have been proposed to estimate causal effects involving network data, particularly peer effects, but they often overlook the variety of peer effects. To address this issue, we propose a general setting which cons…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: AAAI

  37. arXiv:2408.10627  [pdf, other]

    cs.CV

    Rethinking Video Segmentation with Masked Video Consistency: Did the Model Learn as Intended?

    Authors: Chen Liang, Qiang Guo, Xiaochao Qu, Luoqi Liu, Ting Liu

    Abstract: Video segmentation aims at partitioning video sequences into meaningful segments based on objects or regions of interest within frames. Current video segmentation models are often derived from image segmentation techniques, which struggle to cope with small-scale or class-imbalanced video datasets. This leads to inconsistent segmentation results across frames. To address these issues, we propose a…

    Submitted 20 August, 2024; originally announced August 2024.

  38. arXiv:2408.10556  [pdf, other]

    cs.AI cs.LG

    Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks

    Authors: Yun Qu, Boyuan Wang, Jianzhun Shao, Yuhang Jiang, Chen Chen, Zhenbin Ye, Lin Liu, Junfeng Yang, Lin Lai, Hongyang Qin, Minwen Deng, Juchao Zhuo, Deheng Ye, Qiang Fu, Wei Yang, Guang Yang, Lanxiao Huang, Xiangyang Ji

    Abstract: The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications. However, existing datasets often fall short in their simplicity and lack of realism. To address this gap, we propose Hokoff, a comprehens…

    Submitted 20 August, 2024; originally announced August 2024.

  39. arXiv:2408.10473  [pdf, other]

    cs.CL cs.LG

    Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism

    Authors: Guanchen Li, Xiandong Zhao, Lian Liu, Zeping Li, Dong Li, Lu Tian, Jie He, Ashish Sirasao, Emad Barsoum

    Abstract: Pre-trained language models (PLMs) are engineered to be robust in contextual understanding and exhibit outstanding performance in various natural language processing tasks. However, their considerable size incurs significant computational and storage costs. Modern pruning strategies employ one-shot techniques to compress PLMs without the need for retraining on task-specific or otherwise general da…

    Submitted 19 August, 2024; originally announced August 2024.

  40. arXiv:2408.10135  [pdf, other]

    cs.CV

    $R^2$-Mesh: Reinforcement Learning Powered Mesh Reconstruction via Geometry and Appearance Refinement

    Authors: Haoyang Wang, Liming Liu, Quanlu Jia, Jiangkai Wu, Haodan Zhang, Peiheng Wang, Xinggong Zhang

    Abstract: Mesh reconstruction based on Neural Radiance Fields (NeRF) is popular in a variety of applications such as computer graphics, virtual reality, and medical imaging due to its efficiency in handling complex geometric structures and facilitating real-time rendering. However, existing works often fail to capture fine geometric details accurately and struggle with optimizing rendering quality. To addre…

    Submitted 19 August, 2024; originally announced August 2024.

  41. arXiv:2408.09937  [pdf, other]

    quant-ph cs.LG

    The curse of random quantum data

    Authors: Kaining Zhang, Junyu Liu, Liu Liu, Liang Jiang, Min-Hsiu Hsieh, Dacheng Tao

    Abstract: Quantum machine learning, which involves running machine learning algorithms on quantum devices, may be one of the most significant flagship applications for these devices. Unlike its classical counterparts, the role of data in quantum machine learning has not been fully understood. In this work, we quantify the performances of quantum machine learning in the landscape of quantum data. Provided th…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 40 pages, 8 figures

  42. arXiv:2408.09651  [pdf, other]

    cs.IR cs.AI

    Data-driven Conditional Instrumental Variables for Debiasing Recommender Systems

    Authors: Zhirong Huang, Shichao Zhang, Debo Cheng, Jiuyong Li, Lin Liu, Guangquan Lu

    Abstract: In recommender systems, latent variables can cause user-item interaction data to deviate from true user preferences. This biased data is then used to train recommendation models, further amplifying the bias and ultimately compromising both recommendation accuracy and user satisfaction. Instrumental Variable (IV) methods are effective tools for addressing the confounding bias introduced by latent v…

    Submitted 18 August, 2024; originally announced August 2024.

  43. arXiv:2408.09646  [pdf, other]

    cs.IR cs.AI

    Debiased Contrastive Representation Learning for Mitigating Dual Biases in Recommender Systems

    Authors: Zhirong Huang, Shichao Zhang, Debo Cheng, Jiuyong Li, Lin Liu, Guixian Zhang

    Abstract: In recommender systems, popularity and conformity biases undermine recommender effectiveness by disproportionately favouring popular items, leading to their over-representation in recommendation lists and causing an unbalanced distribution of user-item historical data. We construct a causal graph to address both biases and describe the abstract data generation mechanism. Then, we use it as a guide…

    Submitted 18 August, 2024; originally announced August 2024.

  44. arXiv:2408.09600  [pdf, other]

    cs.AI cs.CR

    Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning

    Authors: Tiansheng Huang, Gautam Bhattacharya, Pratik Joshi, Josh Kimball, Ling Liu

    Abstract: Safety-aligned Large Language Models (LLMs) are vulnerable to harmful fine-tuning attacks \cite{qi2023fine}: a small amount of harmful data mixed into the fine-tuning dataset can break the LLMs' safety alignment. Existing mitigation strategies include alignment stage solutions \cite{huang2024vaccine, rosati2024representation} and fine-tuning stage solutions \cite{huang2024lazy,mukhoti2023fine}. However, our e…

    Submitted 2 September, 2024; v1 submitted 18 August, 2024; originally announced August 2024.

  45. arXiv:2408.08315  [pdf, other]

    cs.CV cs.AI

    Segment Anything for Videos: A Systematic Survey

    Authors: Chunhui Zhang, Yawen Cui, Weilin Lin, Guanjie Huang, Yan Rong, Li Liu, Shiguang Shan

    Abstract: The recent wave of foundation models has witnessed tremendous success in computer vision (CV) and beyond, with the segment anything model (SAM) having sparked a passion for exploring task-agnostic visual foundation models. Empowered by its remarkable zero-shot generalization, SAM is currently challenging numerous traditional paradigms in CV, delivering extraordinary performance not only in various…

    Submitted 30 July, 2024; originally announced August 2024.

    Comments: https://github.com/983632847/SAM-for-Videos

  46. arXiv:2408.07484  [pdf, other]

    cs.CV eess.IV

    GRFormer: Grouped Residual Self-Attention for Lightweight Single Image Super-Resolution

    Authors: Yuzhen Li, Zehang Deng, Yuxin Cao, Lihua Liu

    Abstract: Previous works have shown that reducing parameter overhead and computations for transformer-based single image super-resolution (SISR) models (e.g., SwinIR) usually leads to a reduction of performance. In this paper, we present GRFormer, an efficient and lightweight method, which not only reduces the parameter overhead and computations, but also greatly improves performance. The core of GRFormer i…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: Accepted for ACM MM 2024

  47. arXiv:2408.07219  [pdf, other]

    cs.LG stat.ME

    Causal Effect Estimation using identifiable Variational AutoEncoder with Latent Confounders and Post-Treatment Variables

    Authors: Yang Xie, Ziqi Xu, Debo Cheng, Jiuyong Li, Lin Liu, Yinghao Zhang, Zaiwen Feng

    Abstract: Estimating causal effects from observational data is challenging, especially in the presence of latent confounders. Much work has been done on addressing this challenge, but most of the existing research ignores the bias introduced by the post-treatment variables. In this paper, we propose a novel method of joint Variational AutoEncoder (VAE) and identifiable Variational AutoEncoder (iVAE) for lea…

    Submitted 13 August, 2024; originally announced August 2024.

  48. arXiv:2408.06831  [pdf, other]

    cs.CG

    Polynomial 2D Green Coordinates for High-order Cages

    Authors: Shibo Liu, Ligang Liu, Xiao-Ming Fu

    Abstract: We propose conformal polynomial coordinates for 2D closed high-order cages, which consist of polynomial curves of any order. The coordinates enable the transformation of the input polynomial curves into polynomial curves of any order. We extend the classical 2D Green coordinates to define our coordinates, thereby leading to cage-aware conformal harmonic deformations. We extensively test our method…

    Submitted 13 August, 2024; originally announced August 2024.

  49. arXiv:2408.06556  [pdf]

    cs.CE

    An improved point-to-surface contact algorithm with penalty method for peridynamics

    Authors: Haoran Zhang, Lisheng Liu, Xin Lai, Jun Li

    Abstract: It is significantly challenging to obtain accurate contact forces in peridynamics (PD) simulations due to the difficulty of surface particles identification, particularly for complex geometries. Here, an improved point-to-surface contact model is proposed for PD with high accuracy. First, the outer surface is identified using the eigenvalue method and then we construct a Verlet list to identify po…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 27 pages, 27 figures, 1 table

  50. arXiv:2408.05981  [pdf, other]

    cs.RO

    CAD-Mesher: A Convenient, Accurate, Dense Mesh-based Mapping Module in SLAM for Dynamic Environments

    Authors: Yanpeng Jia, Fengkui Cao, Ting Wang, Yandong Tang, Shiliang Shao, Lianqing Liu

    Abstract: Most LiDAR odometry and SLAM systems construct maps in point clouds, which are discrete and sparse when zoomed in, making them not directly suitable for navigation. Mesh maps represent a dense and continuous map format with low memory consumption, which can approximate complex structures with simple elements, attracting significant attention of researchers in recent years. However, most implementa…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: 9 pages, 7 figures