Search | arXiv e-print repository

Showing 1–50 of 136 results for author: Ge, C

  1. arXiv:2409.07097  [pdf, ps, other]

    math.CO math.SP

    Spectral bounds of multi-way Cheeger constants via cyclomatic number

    Authors: Chuanyuan Ge

    Abstract: As a non-trivial extension of the celebrated Cheeger inequality, the higher-order Cheeger inequalities for graphs due to Lee, Oveis Gharan and Trevisan provide for each $k$ an upper bound for the $k$-way Cheeger constant in forms of $C(k)\sqrt{\lambda_k(G)}$, where $\lambda_k(G)$ is the $k$-th eigenvalue of the graph Laplacian and $C(k)$ is a constant depending only on $k$. In this article, we prove some new…

    Submitted 11 September, 2024; originally announced September 2024.

    Comments: 13 pages, 1 figure

  2. arXiv:2409.05229  [pdf, other]

    astro-ph.GA

    Two Channels of Metal-Rich Compact Stellar System Formation: Starbursts Under High Ram Pressure vs. Tidal Stripping

    Authors: Yuan Bian, Min Du, Victor P. Debattista, Dylan Nelson, Mark A. Norris, Luis C. Ho, Shuai Lu, Renyue Cen, Shuo Ma, Chong Ge, Taotao Fang, Hui Li

    Abstract: Most galaxies follow well-defined scaling relations of metallicity and stellar mass; however, some outliers at the low mass end of the observed galaxy population exhibit unusually high metallicity for their mass. Understanding how these objects get to be so metal-rich is vital for understanding the role of feedback in galaxy formation. Using the TNG50 simulation, we explore the origins of this phe…

    Submitted 8 September, 2024; originally announced September 2024.

    Comments: 28 pages, 13 figures. Submitted

  3. arXiv:2408.12534  [pdf, other]

    eess.IV cs.AI cs.CV

    Automatic Organ and Pan-cancer Segmentation in Abdomen CT: the FLARE 2023 Challenge

    Authors: Jun Ma, Yao Zhang, Song Gu, Cheng Ge, Ershuai Wang, Qin Zhou, Ziyan Huang, Pengju Lyu, Jian He, Bo Wang

    Abstract: Organ and cancer segmentation in abdomen Computed Tomography (CT) scans is the prerequisite for precise cancer diagnosis and treatment. Most existing benchmarks and algorithms are tailored to specific cancer types, limiting their ability to provide comprehensive cancer analysis. This work presents the first international competition on abdominal organ and pan-cancer segmentation by providing a lar…

    Submitted 22 August, 2024; originally announced August 2024.

    Comments: MICCAI 2024 FLARE Challenge Summary

  4. arXiv:2408.06885  [pdf, other]

    cs.CR

    Voltran: Unlocking Trust and Confidentiality in Decentralized Federated Learning Aggregation

    Authors: Hao Wang, Yichen Cai, Jun Wang, Chuan Ma, Chunpeng Ge, Xiangmou Qu, Lu Zhou

    Abstract: The decentralized Federated Learning (FL) paradigm built upon blockchain architectures leverages distributed node clusters to replace the single server for executing FL model aggregation. This paradigm tackles the vulnerability of the centralized malicious server in vanilla FL and inherits the trustfulness and robustness offered by blockchain. However, existing blockchain-enabled schemes face chal…

    Submitted 13 August, 2024; originally announced August 2024.

  5. arXiv:2407.14771  [pdf, other]

    quant-ph physics.optics

    Post-Measurement Pairing Quantum Key Distribution with Local Optical Frequency Standard

    Authors: Chengfang Ge, Lai Zhou, Jinping Lin, Hua-Lei Yin, Qiang Zeng, Zhiliang Yuan

    Abstract: The idea of post-measurement coincidence pairing simplifies substantially long-distance, repeater-like quantum key distribution (QKD) by eliminating the need for tracking the differential phase of the users' lasers. However, optical frequency tracking remains necessary and can become a severe burden in future deployment of multi-node quantum networks. Here, we resolve this problem by referencing e…

    Submitted 20 July, 2024; originally announced July 2024.

  6. arXiv:2407.11784  [pdf, other]

    cs.AI cs.CV cs.LG

    Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development

    Authors: Daoyuan Chen, Haibin Wang, Yilun Huang, Ce Ge, Yaliang Li, Bolin Ding, Jingren Zhou

    Abstract: The emergence of large-scale multi-modal generative models has drastically advanced artificial intelligence, introducing unprecedented levels of performance and functionality. However, optimizing these models remains challenging due to historically isolated paths of model-centric and data-centric developments, leading to suboptimal outcomes and inefficient resource utilization. In response, we pre…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 26 pages, 9 figures, 5 tables

  7. arXiv:2407.04281  [pdf, other]

    cs.RO

    WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning

    Authors: Yiheng Li, Chongjian Ge, Chenran Li, Chenfeng Xu, Masayoshi Tomizuka, Chen Tang, Mingyu Ding, Wei Zhan

    Abstract: We propose Waymo Open Motion Dataset-Reasoning (WOMD-Reasoning), a language annotation dataset built on WOMD, with a focus on describing and reasoning interactions and intentions in driving scenarios. Previous language datasets primarily captured interactions caused by close distances. However, interactions induced by traffic rules and human intentions, which can occur over long distances, are yet…

    Submitted 5 July, 2024; originally announced July 2024.

  8. arXiv:2406.19964  [pdf, other]

    cs.CR

    Secure Outsourced Decryption for FHE-based Privacy-preserving Cloud Computing

    Authors: Xirong Ma, Chuan Li, Yuchang Hu, Yunting Tao, Yali Jiang, Yanbin Li, Fanyu Kong, Chunpeng Ge

    Abstract: The demand for processing vast volumes of data has surged dramatically due to the advancement of machine learning technology. Large-scale data processing necessitates substantial computational resources, prompting individuals and enterprises to turn to cloud services. Accompanying this trend is a growing concern regarding data leakage and misuse. Homomorphic encryption (HE) is one solution for saf…

    Submitted 9 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: content and title updated

  9. arXiv:2406.10308  [pdf, other]

    stat.ME stat.AP

    Quick and Simple Kernel Differential Equation Regression Estimators for Data with Sparse Design

    Authors: Chunlei Ge, W. John Braun

    Abstract: Local polynomial regression of order at least one often performs poorly in regions of sparse data. Local constant regression is exceptional in this regard, though it is the least accurate method in general, especially at the boundaries of the data. Incorporating information from differential equations which may approximately or exactly hold is one way of extending the sparse design capacity of loc…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 18 pages, 3 figures

    MSC Class: 62G08 (Primary); 62-07 (Secondary)

  10. arXiv:2406.04295  [pdf, other]

    cs.CV

    Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

    Authors: Jiayi Guo, Junhao Zhao, Chunjiang Ge, Chaoqun Du, Zanlin Ni, Shiji Song, Humphrey Shi, Gao Huang

    Abstract: Test-time adaptation (TTA) aims to enhance the performance of source-domain pretrained models when tested on unknown shifted target domains. Traditional TTA methods primarily adapt model weights based on target data streams, making model performance sensitive to the amount and order of target data. Recently, diffusion-driven TTA methods have demonstrated strong performance by using an unconditiona…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: GitHub: https://github.com/SHI-Labs/Diffusion-Driven-Test-Time-Adaptation-via-Synthetic-Domain-Alignment

  11. arXiv:2405.16605  [pdf, other]

    cs.CV

    Demystify Mamba in Vision: A Linear Attention Perspective

    Authors: Dongchen Han, Ziyi Wang, Zhuofan Xia, Yizeng Han, Yifan Pu, Chunjiang Ge, Jun Song, Shiji Song, Bo Zheng, Gao Huang

    Abstract: Mamba is an effective state space model with linear computation complexity. It has recently shown impressive efficiency in dealing with high-resolution inputs across various vision tasks. In this paper, we reveal that the powerful Mamba model shares surprising similarities with linear attention Transformer, which typically underperform conventional Transformer in practice. By exploring the similar…

    Submitted 26 May, 2024; originally announced May 2024.

  12. arXiv:2405.15738  [pdf, other]

    cs.CV

    ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

    Authors: Chunjiang Ge, Sijie Cheng, Ziming Wang, Jiale Yuan, Yuan Gao, Jun Song, Shiji Song, Gao Huang, Bo Zheng

    Abstract: High-resolution Large Multimodal Models (LMMs) encounter the challenges of excessive visual tokens and quadratic visual complexity. Current high-resolution LMMs address the quadratic complexity while still generating excessive visual tokens. However, the redundancy in visual tokens is the key problem as it leads to more substantial compute. To mitigate this issue, we propose ConvLLaVA, which emplo…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 17 pages

  13. arXiv:2405.14908  [pdf, other]

    cs.LG cs.AI cs.CL

    Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining

    Authors: Ce Ge, Zhijian Ma, Daoyuan Chen, Yaliang Li, Bolin Ding

    Abstract: Large language models exhibit exceptional generalization capabilities, primarily attributed to the utilization of diversely sourced data. However, conventional practices in integrating this diverse data heavily rely on heuristic schemes, lacking theoretical guidance. This research tackles these limitations by investigating strategies based on low-cost proxies for data mixtures, with the aim of str…

    Submitted 11 July, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: typos corrected

  14. arXiv:2405.10357  [pdf, other]

    cs.CV

    RGB Guided ToF Imaging System: A Survey of Deep Learning-based Methods

    Authors: Xin Qiao, Matteo Poggi, Pengchao Deng, Hao Wei, Chenyang Ge, Stefano Mattoccia

    Abstract: Integrating an RGB camera into a ToF imaging system has become a significant technique for perceiving the real world. The RGB guided ToF imaging system is crucial to several applications, including face anti-spoofing, saliency detection, and trajectory prediction. Depending on the distance of the working range, the implementation schemes of the RGB guided ToF imaging systems are different. Specifi…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: To appear on International Journal of Computer Vision (IJCV)

  15. arXiv:2404.18820  [pdf, other]

    eess.IV cs.CV

    Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior

    Authors: Zhiyuan Li, Yanhui Zhou, Hao Wei, Chenyang Ge, Jingwen Jiang

    Abstract: Image compression at extremely low bitrates (below 0.1 bits per pixel (bpp)) is a significant challenge due to substantial information loss. In this work, we propose a novel two-stage extreme image compression framework that exploits the powerful generative capability of pre-trained diffusion models to achieve realistic image reconstruction at extremely low bitrates. In the first stage, we treat t…

    Submitted 3 September, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE TCSVT

  16. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  17. arXiv:2403.11703  [pdf, other]

    cs.CV cs.AI

    LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

    Authors: Ruyi Xu, Yuan Yao, Zonghao Guo, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Maosong Sun, Gao Huang

    Abstract: Visual encoding constitutes the basis of large multimodal models (LMMs) in understanding the visual world. Conventional LMMs process images in fixed sizes and limited resolutions, while recent explorations in this direction are limited in adaptivity, efficiency, and even correctness. In this work, we first take GPT-4V and LLaVA-1.5 as representative examples and expose systematic flaws rooted in t…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Preprint

  18. arXiv:2403.04692  [pdf, other]

    cs.CV

    PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

    Authors: Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, Zhenguo Li

    Abstract: In this paper, we introduce PixArt-Σ, a Diffusion Transformer model (DiT) capable of directly generating images at 4K resolution. PixArt-Σ represents a significant advancement over its predecessor, PixArt-α, offering images of markedly higher fidelity and improved alignment with text prompts. A key feature of PixArt-Σ is its training efficiency. Leveraging the foundational pre-training of PixArt-α,…

    Submitted 17 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Project Page: https://pixart-alpha.github.io/PixArt-sigma-project/

  19. Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication

    Authors: Lukas Gianinazzi, Alexandros Nikolaos Ziogas, Langwen Huang, Piotr Luczynski, Saleh Ashkboos, Florian Scheidl, Armon Carigiet, Chio Ge, Nabil Abubaker, Maciej Besta, Tal Ben-Nun, Torsten Hoefler

    Abstract: We propose a novel approach to iterated sparse matrix dense matrix multiplication, a fundamental computational kernel in scientific computing and graph neural network training. In cases where matrix sizes exceed the memory of a single compute node, data transfer becomes a bottleneck. An approach based on dense matrix multiplication algorithms leads to suboptimal scalability and fails to exploit th…

    Submitted 20 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    ACM Class: F.2.1

    Journal ref: PPoPP'24: Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (2024) 404-416

  20. arXiv:2402.16117  [pdf, other]

    cs.RO cs.AI cs.CV

    RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis

    Authors: Yao Mu, Junting Chen, Qinglong Zhang, Shoufa Chen, Qiaojun Yu, Chongjian Ge, Runjian Chen, Zhixuan Liang, Mengkang Hu, Chaofan Tao, Peize Sun, Haibao Yu, Chao Yang, Wenqi Shao, Wenhai Wang, Jifeng Dai, Yu Qiao, Mingyu Ding, Ping Luo

    Abstract: Robotic behavior synthesis, the problem of understanding multimodal inputs and generating precise physical control for robots, is an important part of Embodied AI. Despite successes in applying multimodal large language models for high-level understanding, it remains challenging to translate these conceptual understandings into detailed robotic actions while achieving generalization across various…

    Submitted 25 February, 2024; originally announced February 2024.

  21. arXiv:2402.15744  [pdf, other]

    eess.IV cs.CV

    Traditional Transformation Theory Guided Model for Learned Image Compression

    Authors: Zhiyuan Li, Chenyang Ge, Shun Li

    Abstract: Recently, many deep image compression methods have been proposed and achieved remarkable performance. However, these methods are dedicated to optimizing the compression performance and speed at medium and high bitrates, while research on ultra low bitrates is limited. In this work, we propose a ultra low bitrates enhanced invertible encoding network guided by traditional transformation theory, exp…

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: 6 pages, 8 figures, accepted by ICCE 2024

  22. arXiv:2401.17982  [pdf, other]

    astro-ph.HE

    Diagnosing the particle transport mechanism in the pulsar halo via X-ray observations

    Authors: Qi-Zuo Wu, Chao-Ming Li, Xuan-Han Liang, Chong Ge, Ruo-Yu Liu

    Abstract: Pulsar halos (also termed 'TeV halo') are a new class of $\gamma$-ray sources in Galaxy, which manifest as extended $\gamma$-ray emission around middle-age pulsars, as discovered around the Geminga pulsar, the Monogem pulsar and PSR J0622+3749 by HAWC and LHAASO. A consensus has been reached that the TeV emission comes from the inverse Compton scattering of escaping electrons/positrons from the PWN off soft…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 7 figures

  23. PeVatron Candidate SNR G106.3+2.7 in a Low-density Cavity: a Multiwavelength Test

    Authors: Yiwei Bao, Ruo-Yu Liu, Chong Ge, Yang Chen

    Abstract: In this paper, we constrain the density of the interstellar medium (ISM) around the hadronic PeVatron candidate, supernova remnant (SNR) G106.3+2.7, based on X-ray and $\gamma$-ray observations. The purpose of this investigation is to understand the influence of the gaseous environment on this SNR as a proton PeVatron candidate. By modelling the self-regulated propagation of the CRs injected from the S…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: submitted to MNRAS

    Journal ref: 2024MNRAS.528.5487B

  24. arXiv:2312.08776  [pdf, other]

    cs.DS cs.AI

    Approximate Integer Solution Counts over Linear Arithmetic Constraints

    Authors: Cunjing Ge

    Abstract: Counting integer solutions of linear constraints has found interesting applications in various fields. It is equivalent to the problem of counting lattice points inside a polytope. However, state-of-the-art algorithms for this problem become too slow for even a modest number of variables. In this paper, we propose a new framework to approximate the lattice counts inside a polytope with a new rando…

    Submitted 14 December, 2023; originally announced December 2023.

  25. arXiv:2311.17400  [pdf, other]

    cs.CL cs.CR cs.LG

    Improving the Robustness of Transformer-based Large Language Models with Dynamic Attention

    Authors: Lujia Shen, Yuwen Pu, Shouling Ji, Changjiang Li, Xuhong Zhang, Chunpeng Ge, Ting Wang

    Abstract: Transformer-based models, such as BERT and GPT, have been widely adopted in natural language processing (NLP) due to their exceptional performance. However, recent studies show their vulnerability to textual adversarial attacks where the model's output can be misled by intentionally manipulating the text inputs. Despite various methods that have been proposed to enhance the model's robustness and…

    Submitted 29 November, 2023; v1 submitted 29 November, 2023; originally announced November 2023.

  26. arXiv:2311.16405  [pdf]

    cond-mat.mtrl-sci

    Metal-to-insulator transition in oxide semimetals by anion doping

    Authors: Haitao Hong, Huimin Zhang, Shan Lin, Jeffrey A. Dhas, Binod Paudel, Shuai Xu, Shengru Chen, Ting Cui, Yiyan Fan, Dongke Rong, Qiao Jin, Zihua Zhu, Yingge Du, Scott A. Chambers, Chen Ge, Can Wang, Qinghua Zhang, Le Wang, Kui-juan Jin, Shuai Dong, Er-Jia Guo

    Abstract: Oxide semimetals exhibiting both nontrivial topological characteristics stand as exemplary parent compounds and multiple degrees of freedom, offering great promise for the realization of novel electronic states. In this study, we present compelling evidence of profound structural and transport phase shifts in a recently uncovered oxide semimetal, SrNbO3, achieved through effective in-situ anion do…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 18 pages, 4 figures

  27. arXiv:2311.15157  [pdf, other]

    cs.CV

    Advancing Vision Transformers with Group-Mix Attention

    Authors: Chongjian Ge, Xiaohan Ding, Zhan Tong, Li Yuan, Jiangliu Wang, Yibing Song, Ping Luo

    Abstract: Vision Transformers (ViTs) have been shown to enhance visual recognition through modeling long-range dependencies with multi-head self-attention (MHSA), which is typically formulated as Query-Key-Value computation. However, the attention map generated from the Query and Key captures only token-to-token correlations at one single granularity. In this paper, we argue that self-attention should have…

    Submitted 25 November, 2023; originally announced November 2023.

  28. arXiv:2311.14580  [pdf, other]

    cs.CV

    Large Language Models as Automated Aligners for benchmarking Vision-Language Models

    Authors: Yuanfeng Ji, Chongjian Ge, Weikai Kong, Enze Xie, Zhengying Liu, Zhengguo Li, Ping Luo

    Abstract: With the advancements in Large Language Models (LLMs), Vision-Language Models (VLMs) have reached a new level of sophistication, showing notable competence in executing intricate cognition and reasoning tasks. However, existing evaluation benchmarks, primarily relying on rigid, hand-crafted datasets to measure task-specific performance, face significant limitations in assessing the alignment of th…

    Submitted 24 November, 2023; originally announced November 2023.

  29. arXiv:2311.13231  [pdf, other]

    cs.LG cs.AI cs.CV

    Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

    Authors: Kai Yang, Jian Tao, Jiafei Lyu, Chunjiang Ge, Jiaxin Chen, Qimai Li, Weihan Shen, Xiaolong Zhu, Xiu Li

    Abstract: Using reinforcement learning with human feedback (RLHF) has shown significant promise in fine-tuning diffusion models. Previous methods start by training a reward model that aligns with human preferences, then leverage RL techniques to fine-tune the underlying models. However, crafting an efficient reward model demands extensive datasets, optimal architecture, and manual hyperparameter tuning, mak…

    Submitted 23 March, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: CVPR 2024 accepted; huggingface daily paper

  30. arXiv:2311.13228  [pdf]

    cond-mat.supr-con cond-mat.mes-hall cond-mat.mtrl-sci cond-mat.str-el

    Strain mediated phase crossover in Ruddlesden Popper nickelates

    Authors: Ting Cui, Songhee Choi, Ting Lin, Chen Liu, Gang Wang, Ningning Wang, Shengru Chen, Haitao Hong, Dongke Rong, Qianying Wang, Qiao Jin, Jia-Ou Wang, Lin Gu, Chen Ge, Can Wang, Jin Guang Cheng, Qinghua Zhang, Liang Si, Kui-juan Jin, Er-Jia Guo

    Abstract: Recent progress on the signatures of pressure-induced high temperature superconductivity in Ruddlesden Popper (RP) nickelates (Lan+1NinO3n+1) has attracted growing interest in both theoretical calculations and experimental efforts. The fabrication of high-quality single crystalline RP nickelate thin films is critical for possible reducing the superconducting transition pressure and advancing appli…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: 29 pages, 5 figures, one supplementary materials

  31. arXiv:2310.08189  [pdf, ps, other]

    math.SP math.CO

    New graph invariants based on $p$-Laplacian eigenvalues

    Authors: Chuanyuan Ge, Shiping Liu, Dong Zhang

    Abstract: We present monotonicity inequalities for certain functions involving eigenvalues of $p$-Laplacians on signed graphs with respect to $p$. Inspired by such monotonicity, we propose new spectrum-based graph invariants, called (variational) cut-off adjacency eigenvalues, that are relevant to certain eigenvector-dependent nonlinear eigenvalue problem. Using these invariants, we obtain new lower bounds…

    Submitted 31 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: 30 pages

  32. arXiv:2310.05136  [pdf, other]

    cs.AI cs.CV

    InstructDET: Diversifying Referring Object Detection with Generalized Instructions

    Authors: Ronghao Dang, Jiangyan Feng, Haodong Zhang, Chongjian Ge, Lin Song, Lijun Gong, Chengju Liu, Qijun Chen, Feng Zhu, Rui Zhao, Yibing Song

    Abstract: We propose InstructDET, a data-centric method for referring object detection (ROD) that localizes target objects based on user instructions. While deriving from referring expressions (REC), the instructions we leverage are greatly diversified to encompass common user intentions related to object detection. For one image, we produce tremendous instructions that refer to every single object and diff…

    Submitted 11 March, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: 29 pages (include Appendix) Published in ICLR

  33. arXiv:2310.00426  [pdf, other]

    cs.CV

    PixArt-$\alpha$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

    Authors: Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li

    Abstract: The most advanced text-to-image (T2I) models require significant training costs (e.g., millions of GPU hours), seriously hindering the fundamental innovation for the AIGC community while increasing CO2 emissions. This paper introduces PIXART-$\alpha$, a Transformer-based T2I diffusion model whose image generation quality is competitive with state-of-the-art image generators (e.g., Imagen, SDXL, and eve…

    Submitted 29 December, 2023; v1 submitted 30 September, 2023; originally announced October 2023.

    Comments: Project Page: https://pixart-alpha.github.io

  34. arXiv:2309.16855  [pdf, other]

    stat.ME math.ST

    A Variational Spike-and-Slab Approach for Group Variable Selection

    Authors: Buyu Lin, Changhao Ge, Jun S. Liu

    Abstract: We introduce a class of generic spike-and-slab priors for high-dimensional linear regression with grouped variables and present a Coordinate-ascent Variational Inference (CAVI) algorithm for obtaining an optimal variational Bayes approximation. Using parameter expansion for a specific, yet comprehensive, family of slab distributions, we obtain a further gain in computational efficiency. The method…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 64 pages, 6 figures

  35. arXiv:2309.13942  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training

    Authors: Jiangliu Wang, Jianbo Jiao, Yibing Song, Stephen James, Zhan Tong, Chongjian Ge, Pieter Abbeel, Yun-hui Liu

    Abstract: This work aims to improve unsupervised audio-visual pre-training. Inspired by the efficacy of data augmentation in visual contrastive learning, we propose a novel speed co-augmentation method that randomly changes the playback speeds of both audio and video data. Despite its simplicity, the speed co-augmentation method possesses two compelling attributes: (1) it increases the diversity of audio-vi…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: Published at the CVPR 2023 Sight and Sound workshop

  36. arXiv:2309.09244  [pdf, other]

    astro-ph.HE

    Constraining baryon loading efficiency of AGNs with diffuse neutrino flux from galaxy clusters

    Authors: Xin-Yue Shi, Ruo-Yu Liu, Chong Ge, Xiang-Yu Wang

    Abstract: The active galactic nuclei (AGNs) are widely believed to be one of the promising acceleration sites of ultrahigh-energy cosmic rays (CRs). Essentially, AGNs are powered by the gravitational energy of matter falling to supermassive black holes. However, the conversion efficiency of gravitational to kinetic energy of CRs in AGNs, which is defined as baryon loading factor $\eta_p$, is not well known yet…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: 13 pages, 4 figures, accepted for publication in ApJ

  37. arXiv:2309.02033  [pdf, other]

    cs.LG cs.DB cs.DC

    Data-Juicer: A One-Stop Data Processing System for Large Language Models

    Authors: Daoyuan Chen, Yilun Huang, Zhijian Ma, Hesen Chen, Xuchen Pan, Ce Ge, Dawei Gao, Yuexiang Xie, Zhaoyang Liu, Jinyang Gao, Yaliang Li, Bolin Ding, Jingren Zhou

    Abstract: The immense evolution in Large Language Models (LLMs) has underscored the importance of massive, heterogeneous, and high-quality data. A data recipe is a mixture of data from different sources for training LLMs, which plays a vital role in LLMs' performance. Existing open-source tools for LLM data processing are mostly tailored for specific data recipes. To continuously uncover the potential of LL…

    Submitted 20 December, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: 20 Pages, 10 figures, 9 tables. The system, data recipes, and demos are continuously maintained at https://github.com/alibaba/data-juicer

  38. arXiv:2308.05864  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.QM

    The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions

    Authors: Jun Ma, Ronald Xie, Shamini Ayyadhury, Cheng Ge, Anubha Gupta, Ritu Gupta, Song Gu, Yao Zhang, Gihun Lee, Joonkee Kim, Wei Lou, Haofeng Li, Eric Upschulte, Timo Dickscheid, José Guilherme de Almeida, Yixin Wang, Lin Han, Xin Yang, Marco Labagnara, Vojislav Gligorovski, Maxime Scheder, Sahand Jamal Rahi, Carly Kempster, Alice Pollitt, Leon Espinosa, et al. (15 additional authors not shown)

    Abstract: Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require manual interventions to specify hyper-parameters in different experimental settings. Here, we present a multi-modality cell segmentation benchmark, comprising over 1500 labeled images derived from more than 50 diver…

    Submitted 1 April, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

    Comments: NeurIPS22 Cell Segmentation Challenge: https://neurips22-cellseg.grand-challenge.org/ . Nature Methods (2024)

  39. arXiv:2308.05862  [pdf, other]

    eess.IV cs.AI cs.CV

    Unleashing the Strengths of Unlabeled Data in Pan-cancer Abdominal Organ Quantification: the FLARE22 Challenge

    Authors: Jun Ma, Yao Zhang, Song Gu, Cheng Ge, Shihao Ma, Adamo Young, Cheng Zhu, Kangkang Meng, Xin Yang, Ziyan Huang, Fan Zhang, Wentao Liu, YuanKe Pan, Shoujin Huang, Jiacheng Wang, Mingze Sun, Weixin Xu, Dengqiang Jia, Jae Won Choi, Natália Alves, Bram de Wilde, Gregor Koehler, Yajun Wu, Manuel Wiesenfarth, Qiongjie Zhu, et al. (4 additional authors not shown)

    Abstract: Quantitative organ assessment is an essential step in automated abdominal disease diagnosis and treatment planning. Artificial intelligence (AI) has shown great potential to automatize this process. However, most existing AI algorithms rely on many expert annotations and lack a comprehensive evaluation of accuracy and efficiency in real-world multinational settings. To overcome these limitations,…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: MICCAI FLARE22: https://flare22.grand-challenge.org/

  40. arXiv:2308.00328  [pdf, other]

    astro-ph.GA astro-ph.HE

    A detached double X-ray tail in the merging galaxy cluster Z8338 with a large double tail

    Authors: Chong Ge, Ming Sun, Paul E. J. Nulsen, Craig Sarazin, Maxim Markevitch, Gerrit Schellenberger

    Abstract: When subhalos infall into galaxy clusters, their gas content is ram pressure stripped by the intracluster medium (ICM) and may turn into cometary tails. We report the discovery of two spectacular X-ray double tails in a single galaxy cluster, Z8338, revealed by 70 ks Chandra observations. The brighter one, with an X-ray bolometric luminosity of $3.9 \times 10^{42}{\rm\ erg\ s}^{-1}$, is a detached…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: 11 pages, 8 figures, MNRAS accepted

  41. arXiv:2307.02106  [pdf, other]

    cs.CR cs.DB cs.LG

    SoK: Privacy-Preserving Data Synthesis

    Authors: Yuzheng Hu, Fan Wu, Qinbin Li, Yunhui Long, Gonzalo Munilla Garrido, Chang Ge, Bolin Ding, David Forsyth, Bo Li, Dawn Song

    Abstract: As the prevalence of data analysis grows, safeguarding data privacy has become a paramount concern. Consequently, there has been an upsurge in the development of mechanisms aimed at privacy-preserving data analyses. However, these approaches are task-specific; designing algorithms for new tasks is a cumbersome process. As an alternative, one can create synthetic data that is (ideally) devoid of pr…

    Submitted 5 August, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: Accepted at IEEE S&P (Oakland) 2024

  42. arXiv:2305.02155   

    cs.NI

    Optimized Live 4K Video Multicast

    Authors: Zhaoyuan He, Changhan Ge, Wangyang Li, Lili Qiu, Peijie Li, Ghufran Baig

    Abstract: 4K videos are becoming increasingly popular. However, despite advances in wireless technology, streaming 4K videos over mmWave to multiple users is facing significant challenges arising from directional communication, unpredictable channel fluctuation and high bandwidth requirements. This paper develops a novel 4K layered video multicast system. We (i) develop a video quality model for layered vid…

    Submitted 22 July, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

  43. arXiv:2304.14636  [pdf, other]

    cs.NI

    PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture Search

    Authors: Haibin Wang, Ce Ge, Hesen Chen, Xiuyu Sun

    Abstract: The wide application of pre-trained models is driving the trend of once-for-all training in one-shot neural architecture search (NAS). However, training within a huge sample space damages the performance of individual subnets and requires much computation to search for an optimal model. In this paper, we present PreNAS, a search-free NAS approach that accentuates target models in one-shot training…

    Submitted 16 June, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

    Comments: Accepted by ICML 2023

  44. arXiv:2304.09801  [pdf, other]

    cs.CV

    MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

    Authors: Chongjian Ge, Junsong Chen, Enze Xie, Zhongdao Wang, Lanqing Hong, Huchuan Lu, Zhenguo Li, Ping Luo

    Abstract: Perception systems in modern autonomous driving vehicles typically take inputs from complementary multi-modal sensors, e.g., LiDAR and cameras. However, in real-world applications, sensor corruptions and failures lead to inferior performances, thus compromising autonomous safety. In this paper, we propose a robust framework, called MetaBEV, to address extreme real-world environments involving over…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: Project page: https://chongjiange.github.io/metabev.html

  45. arXiv:2304.01168  [pdf, other]

    cs.CV cs.LG cs.RO

    DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving

    Authors: Tianqi Wang, Sukmin Kim, Wenxuan Ji, Enze Xie, Chongjian Ge, Junsong Chen, Zhenguo Li, Ping Luo

    Abstract: Safety is the primary priority of autonomous driving. Nevertheless, no published dataset currently supports the direct and explainable safety evaluation for autonomous driving. In this work, we propose DeepAccident, a large-scale dataset generated via a realistic simulator containing diverse accident scenarios that frequently occur in real-world driving. The proposed DeepAccident dataset includes…

    Submitted 17 December, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

  46. arXiv:2303.17142  [pdf, other]

    cs.CV

    Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning

    Authors: Chongjian Ge, Jiangliu Wang, Zhan Tong, Shoufa Chen, Yibing Song, Ping Luo

    Abstract: Contrastive learning methods train visual encoders by comparing views from one instance to others. Typically, the views created from one instance are set as positive, while views from other instances are negative. This binary instance discrimination is studied extensively to improve feature representations in self-supervised learning. In this paper, we rethink the instance discrimination framework…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: Accepted by ICLR23

  47. Revisiting the Chandra Observation on the Region of PSR J1809-1917: Indication of an X-ray Halo and Implication for the Origin of HESS J1809-193

    Authors: Chao-Ming Li, Chong Ge, Ruo-Yu Liu

    Abstract: HESS J1809-193 is an extended TeV $γ$-ray source and the origin of its $γ$-ray emission remains ambiguous. Pulsar wind nebula (PWN) of PSR J1809-1917 lying inside the extended $γ$-ray emission is a possible candidate. Powered by the central pulsar, ultrarelativistic electrons in PWN can produce radio to X-ray emission through synchrotron and $γ$-ray emission by inverse Compton (IC) scattering. To…

    Submitted 29 March, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: 20 pages, 8 figures, accepted by APJ. Comments are welcome

  48. arXiv:2303.09307  [pdf, other]

    cs.CV

    Depth Super-Resolution from Explicit and Implicit High-Frequency Features

    Authors: Xin Qiao, Chenyang Ge, Youmin Zhang, Yanhui Zhou, Fabio Tosi, Matteo Poggi, Stefano Mattoccia

    Abstract: We propose a novel multi-stage depth super-resolution network, which progressively reconstructs high-resolution depth maps from explicit and implicit high-frequency features. The former are extracted by an efficient transformer processing both local and global contexts, while the latter are obtained by projecting color images into the frequency domain. Both are combined together with depth feature…

    Submitted 30 May, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

  49. arXiv:2302.13763  [pdf, other]

    cs.CR cs.LG

    Efficient and Low Overhead Website Fingerprinting Attacks and Defenses based on TCP/IP Traffic

    Authors: Guodong Huang, Chuan Ma, Ming Ding, Yuwen Qian, Chunpeng Ge, Liming Fang, Zhe Liu

    Abstract: Website fingerprinting attack is an extensively studied technique used in a web browser to analyze traffic patterns and thus infer confidential information about users. Several website fingerprinting attacks based on machine learning and deep learning tend to use the most typical features to achieve a satisfactory performance of attacking rate. However, these attacks suffer from several practical…

    Submitted 27 February, 2023; originally announced February 2023.

  50. arXiv:2302.05892  [pdf, other]

    cs.CL cs.AI cs.CR

    TextDefense: Adversarial Text Detection based on Word Importance Entropy

    Authors: Lujia Shen, Xuhong Zhang, Shouling Ji, Yuwen Pu, Chunpeng Ge, Xing Yang, Yanghe Feng

    Abstract: Currently, natural language processing (NLP) models are widely used in various scenarios. However, NLP models, like all deep models, are vulnerable to adversarially generated text. Numerous works have been working on mitigating the vulnerability from adversarial attacks. Nevertheless, there is no comprehensive defense in existing works where each work targets a specific attack category or suffers…

    Submitted 12 February, 2023; originally announced February 2023.