Search | arXiv e-print repository

Showing 1–50 of 196 results for author: Zheng, G

Searching in archive cs. Search in all archives.
  1. arXiv:2406.08785  [pdf, other]

    cs.CV

    BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection

    Authors: Wenjie Wang, Yehao Lu, Guangcong Zheng, Shuigen Zhan, Xiaoqing Ye, Zichang Tan, Jingdong Wang, Gaoang Wang, Xi Li

    Abstract: Vision-based roadside 3D object detection has attracted rising attention in the autonomous driving domain, since it encompasses inherent advantages in reducing blind spots and expanding perception range. Previous work mainly focuses on accurately estimating depth or height for 2D-to-3D mapping, ignoring the position approximation error in the voxel pooling process. Inspired by this insight, we p…

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2406.05316  [pdf, other]

    cs.LG

    C-Mamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting

    Authors: Chaolv Zeng, Zhanyu Liu, Guanjie Zheng, Linghe Kong

    Abstract: In recent years, significant progress has been made in multivariate time series forecasting using Linear-based, Transformer-based, and Convolution-based models. However, these approaches face notable limitations: linear forecasters struggle with representation capacities, attention mechanisms suffer from quadratic complexity, and convolutional models have a restricted receptive field. These constr…

    Submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2406.03511  [pdf, other]

    cs.LG cs.AI

    MagiNet: Mask-Aware Graph Imputation Network for Incomplete Traffic Data

    Authors: Jianping Zhou, Bin Lu, Zhanyu Liu, Siyu Pan, Xuejun Feng, Hua Wei, Guanjie Zheng, Xinbing Wang, Chenghu Zhou

    Abstract: Due to detector malfunctions and communication failures, missing data is ubiquitous during the collection of traffic data. Therefore, it is of vital importance to impute the missing values to facilitate data analysis and decision-making for Intelligent Transportation System (ITS). However, existing imputation methods generally perform zero pre-filling techniques to initialize missing values, intro…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 19 pages, 7 figures

  4. arXiv:2406.03098  [pdf, ps, other]

    cs.IT eess.SP

    A Data and Model-Driven Deep Learning Approach to Robust Downlink Beamforming Optimization

    Authors: Kai Liang, Gan Zheng, Zan Li, Kai-Kit Wong, Chan-Byoung Chae

    Abstract: This paper investigates the optimization of the long-standing probabilistically robust transmit beamforming problem with channel uncertainties in the multiuser multiple-input single-output (MISO) downlink transmission. This problem poses significant analytical and computational challenges. Currently, the state-of-the-art optimization method relies on convex restrictions as tractable approximations…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted for publication in the IEEE Journal on Selected Areas in Communications, Special Issue on Advanced Optimization Theory and Algorithms for Next Generation Wireless Communication Networks

  5. arXiv:2406.02614  [pdf, other]

    cs.LG cs.AI

    Frequency Enhanced Pre-training for Cross-city Few-shot Traffic Forecasting

    Authors: Zhanyu Liu, Jianrong Ding, Guanjie Zheng

    Abstract: The field of Intelligent Transportation Systems (ITS) relies on accurate traffic forecasting to enable various downstream applications. However, developing cities often face challenges in collecting sufficient training traffic data due to limited resources and outdated infrastructure. Recognizing this obstacle, the concept of cross-city few-shot forecasting has emerged as a viable approach. While…

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted by ECMLPKDD 2024 (Research Track)

  6. arXiv:2406.02131  [pdf, other]

    cs.LG cs.AI

    CondTSF: One-line Plugin of Dataset Condensation for Time Series Forecasting

    Authors: Jianrong Ding, Zhanyu Liu, Guanjie Zheng, Haiming Jin, Linghe Kong

    Abstract: Dataset condensation is a newborn technique that generates a small dataset that can be used in training deep neural networks to lower training costs. The objective of dataset condensation is to ensure that the model trained with the synthetic dataset can perform comparably to the model trained with full datasets. However, existing methods predominantly concentrate on classification tasks, posing c…

    Submitted 11 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 23 pages, 13 figures

  7. arXiv:2405.18035  [pdf, other]

    cs.CL

    Instruction Tuning with Retrieval-based Examples Ranking for Aspect-based Sentiment Analysis

    Authors: Guangmin Zheng, Jin Wang, Liang-Chih Yu, Xuejie Zhang

    Abstract: Aspect-based sentiment analysis (ABSA) identifies sentiment information related to specific aspects and provides deeper market insights to businesses and organizations. With the emergence of large language models (LMs), recent studies have proposed using fixed examples for instruction tuning to reformulate ABSA as a generation task. However, the performance is sensitive to the selection of in-cont…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: ACL Findings 2024

  8. arXiv:2405.15274  [pdf, other]

    cs.CV cs.HC

    Talk to Parallel LiDARs: A Human-LiDAR Interaction Method Based on 3D Visual Grounding

    Authors: Yuhang Liu, Boyi Sun, Guixu Zheng, Yishuo Wang, Jing Wang, Fei-Yue Wang

    Abstract: LiDAR sensors play a crucial role in various applications, especially in autonomous driving. Current research primarily focuses on optimizing perceptual models with point cloud data as input, while the exploration of deeper cognitive intelligence remains relatively limited. To address this challenge, parallel LiDARs have emerged as a novel theoretical framework for the next-generation intelligent…

    Submitted 24 May, 2024; originally announced May 2024.

  9. arXiv:2405.09848  [pdf, other]

    cs.CL cs.AI

    Enhancing Semantics in Multimodal Chain of Thought via Soft Negative Sampling

    Authors: Guangmin Zheng, Jin Wang, Xiaobing Zhou, Xuejie Zhang

    Abstract: Chain of thought (CoT) has proven useful for problems requiring complex reasoning. Many of these problems are both textual and multimodal. Given the inputs in different modalities, a model generates a rationale and then uses it to answer a question. Because of the hallucination issue, the generated soft negative rationales with high textual quality but illogical semantics do not always help improv…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted by LREC-COLING 2024

  10. arXiv:2405.03697  [pdf, other]

    cs.HC

    GeoViz: A Multi-View Visualization Platform for Spatio-temporal Knowledge Graph

    Authors: Jianping Zhou, Junhao Li, Guanjie Zheng, Yunqiang Zhu, Xinbing Wang, Chenghu Zhou

    Abstract: In this paper, we propose a multi-view visualization technology for spatio-temporal knowledge graphs (STKG), which utilizes three distinct perspectives: knowledge tree, knowledge net, and knowledge map, to facilitate a comprehensive analysis of the STKG. The knowledge tree enables the visualization of hierarchical interrelation within the STKG, while the knowledge net elucidates semantic relationshi…

    Submitted 29 April, 2024; originally announced May 2024.

    Comments: 4 pages, 2 figures

  11. arXiv:2405.03649  [pdf, other]

    cs.LG cs.CV

    Learning Robust Classifiers with Self-Guided Spurious Correlation Mitigation

    Authors: Guangtao Zheng, Wenqian Ye, Aidong Zhang

    Abstract: Deep neural classifiers tend to rely on spurious correlations between spurious attributes of inputs and targets to make predictions, which could jeopardize their generalization capability. Training classifiers robust to spurious correlations typically relies on annotations of spurious correlations in data, which are often expensive to get. In this paper, we tackle an annotation-free setting and pr…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted to IJCAI 2024

  12. arXiv:2403.18306  [pdf, other]

    cs.DB

    Sm-Nd Isotope Data Compilation from Geoscientific Literature Using an Automated Tabular Extraction Method

    Authors: Zhixin Guo, Tao Wang, Chaoyang Wang, Jianping Zhou, Guanjie Zheng, Xinbing Wang, Chenghu Zhou

    Abstract: The rare earth elements Sm and Nd significantly address fundamental questions about crustal growth, such as its spatiotemporal evolution and the interplay between orogenesis and crustal accretion. Their relative immobility during high-grade metamorphism makes the Sm-Nd isotopic system crucial for inferring crustal formation times. Historically, data have been disseminated sporadically in the scien…

    Submitted 27 March, 2024; originally announced March 2024.

  13. arXiv:2403.11681  [pdf, other]

    cs.RO cs.CV

    MASSTAR: A Multi-Modal and Large-Scale Scene Dataset with a Versatile Toolchain for Surface Prediction and Completion

    Authors: Guiyong Zheng, Jinqi Jiang, Chen Feng, Shaojie Shen, Boyu Zhou

    Abstract: Surface prediction and completion have been widely studied in various applications. Recently, research in surface completion has evolved from small objects to complex large-scale scenes. As a result, researchers have begun increasing the volume of data and leveraging a greater variety of data modalities including rendered RGB images, descriptive texts, depth images, etc, to enhance algorithm perfo…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Submitted to IROS2024. Code: https://github.com/SYSU-STAR/MASSTAR. Project Page: https://github.com/SYSU-STAR/MASSTAR

  14. arXiv:2403.11624  [pdf, other]

    cs.IR cs.LG

    Dual-Channel Multiplex Graph Neural Networks for Recommendation

    Authors: Xiang Li, Chaofan Fu, Zhongying Zhao, Guanjie Zheng, Chao Huang, Junyu Dong, Yanwei Yu

    Abstract: Efficient recommender systems play a crucial role in accurately capturing user and item attributes that mirror individual preferences. Some existing recommendation techniques have started to shift their focus towards modeling various types of interaction relations between users and items in real-world recommendation scenarios, such as clicks, marking favorites, and purchases on online shopping pla…

    Submitted 29 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

  15. arXiv:2403.11186  [pdf, other]

    cs.CV

    NetTrack: Tracking Highly Dynamic Objects with a Net

    Authors: Guangze Zheng, Shijie Lin, Haobo Zuo, Changhong Fu, Jia Pan

    Abstract: The complex dynamicity of open-world objects presents non-negligible challenges for multi-object tracking (MOT), often manifested as severe deformations, fast motion, and occlusions. Most methods that solely depend on coarse-grained object cues, such as boxes and the overall appearance of the object, are susceptible to degradation due to distorted internal relationships of dynamic objects. To addr…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  16. arXiv:2403.07294  [pdf, other]

    cs.LG cs.AI cs.SI

    Graph Data Condensation via Self-expressive Graph Structure Reconstruction

    Authors: Zhanyu Liu, Chaolv Zeng, Guanjie Zheng

    Abstract: With the increasing demands of training graph neural networks (GNNs) on large-scale graphs, graph data condensation has emerged as a critical technique to relieve the storage and time costs during the training phase. It aims to condense the original large-scale graph to a much smaller synthetic graph while preserving the essential information necessary for efficiently training a downstream GNN. Ho…

    Submitted 7 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

  17. arXiv:2403.07245  [pdf, other]

    cs.LG

    Dataset Condensation for Time Series Classification via Dual Domain Matching

    Authors: Zhanyu Liu, Ke Hao, Guanjie Zheng, Yanwei Yu

    Abstract: Time series data has been demonstrated to be crucial in various research fields. The management of large quantities of time series data presents challenges in terms of deep learning tasks, particularly for training a deep neural network. Recently, a technique named Dataset Condensation has emerged as a solution to this problem. This technique generates a smaller synthetic dataset that has…

    Submitted 10 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted by KDD 2024 research track

  18. arXiv:2403.02576  [pdf, other]

    cs.DL cs.LG cs.SI

    AceMap: Knowledge Discovery through Academic Graph

    Authors: Xinbing Wang, Luoyi Fu, Xiaoying Gan, Ying Wen, Guanjie Zheng, Jiaxin Ding, Liyao Xiang, Nanyang Ye, Meng Jin, Shiyu Liang, Bin Lu, Haiwen Wang, Yi Xu, Cheng Deng, Shao Zhang, Huquan Kang, Xingli Wang, Qi Li, Zhixin Guo, Jiexing Qi, Pan Liu, Yuyang Ren, Lyuwen Wu, Jungang Yang, Jianping Zhou, et al. (1 additional author not shown)

    Abstract: The exponential growth of scientific literature requires effective management and extraction of valuable insights. While existing scientific search engines excel at delivering search results based on relational databases, they often neglect the analysis of collaborations between scientific entities and the evolution of ideas, as well as the in-depth analysis of content within scientific publicatio…

    Submitted 14 April, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Technical Report for AceMap (https://www.acemap.info)

  19. arXiv:2402.12715  [pdf, other]

    cs.LG

    Spurious Correlations in Machine Learning: A Survey

    Authors: Wenqian Ye, Guangtao Zheng, Xu Cao, Yunsheng Ma, Aidong Zhang

    Abstract: Machine learning systems are known to be sensitive to spurious correlations between non-essential features of the inputs (e.g., background, texture, and secondary objects) and the corresponding labels. These features and their correlations with the labels are known as "spurious" because they tend to change with shifts in real-world data distributions, which can negatively impact the model's genera…

    Submitted 16 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Version 2; Github Link: https://github.com/wenqian-ye/Awesome-Spurious-Correlations

  20. A Lattice-Reduction Aided Vector Perturbation Precoder Relying on Quantum Annealing

    Authors: Samuel Winter, Yangyishi Zhang, Gan Zheng, Lajos Hanzo

    Abstract: Quantum annealing (QA) is proposed for vector perturbation precoding (VPP) in multiple input multiple output (MIMO) communications systems. The mathematical framework of VPP is presented, outlining the problem formulation and the benefits of lattice reduction algorithms. Lattice reduction aided quantum vector perturbation (LRAQVP) is designed by harnessing physical quantum hardware, and the optimi…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: accepted by IEEE Wireless Communications Letters

  21. arXiv:2402.03049  [pdf, other]

    cs.CL cs.AI cs.HC cs.IR cs.LG

    EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models

    Authors: Yixin Ou, Ningyu Zhang, Honghao Gui, Ziwen Xu, Shuofei Qiao, Yida Xue, Runnan Fang, Kangwei Liu, Lei Li, Zhen Bi, Guozhou Zheng, Huajun Chen

    Abstract: In recent years, instruction tuning has gained increasing attention and emerged as a crucial technique to enhance the capabilities of Large Language Models (LLMs). To construct high-quality instruction datasets, many instruction processing approaches have been proposed, aiming to achieve a delicate balance between data quantity and data quality. Nevertheless, due to inconsistencies that persist am…

    Submitted 21 March, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Project website: https://zjunlp.github.io/project/EasyInstruct Code: https://github.com/zjunlp/EasyInstruct Video: https://youtu.be/rfQOWYfziFo Demo: https://huggingface.co/spaces/zjunlp/EasyInstruct

  22. arXiv:2402.00397  [pdf, other]

    cs.LG cs.AI

    Multi-scale Traffic Pattern Bank for Cross-city Few-shot Traffic Forecasting

    Authors: Zhanyu Liu, Guanjie Zheng, Yanwei Yu

    Abstract: Traffic forecasting is crucial for intelligent transportation systems (ITS), aiding in efficient resource allocation and effective traffic control. However, its effectiveness often relies heavily on abundant traffic data, while many cities lack sufficient data due to limited device support, posing a significant challenge for traffic forecasting. Recognizing this challenge, we have made a noteworth…

    Submitted 26 February, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Under review. Text overlap with arXiv:2308.09727

  23. arXiv:2401.17221  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    MouSi: Poly-Visual-Expert Vision-Language Models

    Authors: Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, Ming Zhang, Caishuang Huang, Rui Zheng, Zhiheng Xi, Yuhao Zhou, Shihan Dou, Junjie Ye, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

    Abstract: Current large vision-language models (VLMs) often encounter challenges such as insufficient capabilities of a single visual component and excessively long visual tokens. These issues can limit the model's effectiveness in accurately interpreting complex visual information and over-lengthy contextual information. Addressing these challenges is crucial for enhancing the performance and applicability…

    Submitted 30 January, 2024; originally announced January 2024.

  24. arXiv:2401.15071  [pdf, other]

    cs.CV

    From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

    Authors: Chaochao Lu, Chen Qian, Guodong Zheng, Hongxing Fan, Hongzhi Gao, Jie Zhang, Jing Shao, Jingyi Deng, Jinlan Fu, Kexin Huang, Kunchang Li, Lijun Li, Limin Wang, Lu Sheng, Meiqi Chen, Ming Zhang, Qibing Ren, Sirui Chen, Tao Gui, Wanli Ouyang, Yali Wang, Yan Teng, Yaru Wang, Yi Wang, Yinan He, et al. (11 additional authors not shown)

    Abstract: Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents. However, there is still a wide gap between the performance of recent MLLM-based applications and the expectation of the broad public, even though the most powerful OpenAI's GPT-4 and Google's Gemini have been deployed. This paper strives to enhance unde…

    Submitted 29 January, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

  25. arXiv:2401.05850  [pdf, other]

    cs.SD eess.AS

    Contrastive Loss Based Frame-wise Feature disentanglement for Polyphonic Sound Event Detection

    Authors: Yadong Guan, Jiqing Han, Hongwei Song, Wenjie Song, Guibin Zheng, Tieran Zheng, Yongjun He

    Abstract: Overlapping sound events are ubiquitous in real-world environments, but existing end-to-end sound event detection (SED) methods still struggle to detect them effectively. A critical reason is that these methods represent overlapping events using shared and entangled frame-wise features, which degrades the feature discrimination. To solve the problem, we propose a disentangled feature learning fram…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  26. arXiv:2312.13472  [pdf, other]

    math.OC cs.RO

    Task Planning for Multiple Item Insertion using ADMM

    Authors: Gavin Zheng

    Abstract: Mixed-integer nonlinear programs (MINLPs) are powerful formulation tools for task planning. However, they suffer from long solving times, especially for large-scale problems. In this work, we first formulate the task planning problem for item stowing as a mixed-integer nonlinear programming problem, then solve it using the Alternating Direction Method of Multipliers (ADMM). ADMM separates the compl…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:2208.13158 by other authors

  27. arXiv:2312.12720  [pdf, other]

    cs.CV

    AdvST: Revisiting Data Augmentations for Single Domain Generalization

    Authors: Guangtao Zheng, Mengdi Huai, Aidong Zhang

    Abstract: Single domain generalization (SDG) aims to train a robust model against unknown target domain shifts using data from a single source domain. Data augmentation has been proven an effective approach to SDG. However, the utility of standard augmentations, such as translate, or invert, has not been fully exploited in SDG; practically, these augmentations are used as a part of a data preprocessing proc…

    Submitted 14 February, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024

  28. arXiv:2312.05402  [pdf, other]

    cs.CL

    Towards Controlled Table-to-Text Generation with Scientific Reasoning

    Authors: Zhixin Guo, Jianping Zhou, Jiexing Qi, Mingxuan Yan, Ziwei He, Guanjie Zheng, Zhouhan Lin, Xinbing Wang, Chenghu Zhou

    Abstract: The sheer volume of scientific experimental results and complex technical statements, often presented in tabular formats, presents a formidable barrier to individuals acquiring preferred information. The realms of scientific reasoning and content generation that adhere to user preferences encounter distinct challenges. In this work, we present a new task for generating fluent and logical descripti…

    Submitted 8 December, 2023; originally announced December 2023.

  29. arXiv:2312.02206  [pdf, other]

    cs.AI cs.CL

    Axiomatic Preference Modeling for Longform Question Answering

    Authors: Corby Rosset, Guoqing Zheng, Victor Dibia, Ahmed Awadallah, Paul Bennett

    Abstract: The remarkable abilities of large language models (LLMs) like GPT-4 partially stem from post-training processes like Reinforcement Learning from Human Feedback (RLHF) involving human preferences encoded in a reward model. However, these reward models (RMs) often lack direct knowledge of why, or under what principles, the preferences annotations were made. In this study, we identify principles that…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: Accepted to EMNLP 2023

  30. arXiv:2311.11045  [pdf, other]

    cs.AI

    Orca 2: Teaching Small Language Models How to Reason

    Authors: Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agarwal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal, Hamid Palangi, Guoqing Zheng, Corby Rosset, Hamed Khanpour, Ahmed Awadallah

    Abstract: Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We…

    Submitted 21 November, 2023; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: Added URL to model weights; fixed typo in author name

  31. arXiv:2311.08770  [pdf]

    cs.DL cs.DB

    A proof-of-concept online metadata catalogue service of Earth observation datasets for human health research in exposomics

    Authors: Keumseok Koh, Maged N. Kamel Boulos, Gang Zheng, Hongsheng Zhang, Muralikrishna V. Iyyanki, Bosco Bwambale, Ashraf Dewan

    Abstract: This article describes research carried out during 2023 under an International Society for Photogrammetry and Remote Sensing (ISPRS)-funded project to develop and disseminate a metadata catalogue of Earth observation data sources/products and types that are relevant to human health research in exposomics, as a free service to interested researchers worldwide. The proof-of-concept catalogue was inf…

    Submitted 1 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: 6 figures

    ACM Class: J.3

  32. arXiv:2310.16436  [pdf, other]

    cs.CV cs.CL

    DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models

    Authors: Ge Zheng, Bin Yang, Jiajin Tang, Hong-Yu Zhou, Sibei Yang

    Abstract: A long-standing goal of AI systems is to perform complex multimodal reasoning like humans. Recently, large language models (LLMs) have made remarkable strides in such multi-step reasoning on the language modality solely by leveraging the chain of thought (CoT) to mimic human thinking. However, the transfer of these advancements to multimodal contexts introduces heightened challenges, including but…

    Submitted 26 October, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

    Comments: 24 pages, 13 figures, to be published in NeurIPS 2023

  33. arXiv:2310.14355  [pdf]

    cs.LG eess.IV

    A global product of fine-scale urban building height based on spaceborne lidar

    Authors: Xiao Ma, Guang Zheng, Chi Xu, L. Monika Moskal, Peng Gong, Qinghua Guo, Huabing Huang, Xuecao Li, Yong Pang, Cheng Wang, Huan Xie, Bailang Yu, Bo Zhao, Yuyu Zhou

    Abstract: Characterizing urban environments with broad coverages and high precision is more important than ever for achieving the UN's Sustainable Development Goals (SDGs) as half of the world's populations are living in cities. Urban building height as a fundamental 3D urban structural feature has far-reaching applications. However, so far, producing readily available datasets of recent urban building heig…

    Submitted 22 October, 2023; originally announced October 2023.

  34. arXiv:2310.04185  [pdf, other]

    cs.NI

    Cross-Edge Orchestration of Serverless Functions with Probabilistic Caching

    Authors: Chen Chen, Manuel Herrera, Ge Zheng, Liqiao Xia, Zhengyang Ling, Jiangtao Wang

    Abstract: Serverless edge computing adopts an event-based paradigm that provides back-end services on an as-used basis, resulting in efficient resource utilization. To improve the end-to-end latency and revenue, service providers need to optimize the number and placement of serverless containers while considering the system cost incurred by the provisioning. The particular reason for this circumstance is th…

    Submitted 6 October, 2023; originally announced October 2023.

  35. arXiv:2310.03750  [pdf]

    eess.SP cond-mat.mtrl-sci cs.LG physics.app-ph

    Health diagnosis and recuperation of aged Li-ion batteries with data analytics and equivalent circuit modeling

    Authors: Riko I Made, Jing Lin, Jintao Zhang, Yu Zhang, Lionel C. H. Moh, Zhaolin Liu, Ning Ding, Sing Yang Chiam, Edwin Khoo, Xuesong Yin, Guangyuan Wesley Zheng

    Abstract: Battery health assessment and recuperation play a crucial role in the utilization of second-life Li-ion batteries. However, due to ambiguous aging mechanisms and lack of correlations between the recovery effects and operational states, it is challenging to accurately estimate battery health and devise a clear strategy for cell rejuvenation. This paper presents aging and reconditioning experiments…

    Submitted 21 September, 2023; originally announced October 2023.

    Comments: 20 pages, 5 figures, 1 table

    Journal ref: iScience (2024)

  36. arXiv:2310.02842  [pdf, other]

    cs.CL cs.AI

    Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation

    Authors: Chen Dun, Mirian Hipolito Garcia, Guoqing Zheng, Ahmed Hassan Awadallah, Anastasios Kyrillidis, Robert Sim

    Abstract: Large Language Models (LLMs) have the ability to solve a variety of tasks, such as text summarization and mathematical questions, just out of the box, but they are often trained with a single task in mind. Due to high computational costs, the current trend is to use prompt instruction tuning to better adjust monolithic, pretrained LLMs for new -- but often individual -- downstream tasks. Thus, how…

    Submitted 5 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

  37. arXiv:2310.02031  [pdf, other]

    cs.CL cs.AI cs.CE cs.LG cs.RO

    OceanGPT: A Large Language Model for Ocean Science Tasks

    Authors: Zhen Bi, Ningyu Zhang, Yida Xue, Yixin Ou, Daxiong Ji, Guozhou Zheng, Huajun Chen

    Abstract: Ocean science, which delves into the oceans that are reservoirs of life and biodiversity, is of great significance given that oceans cover over 70% of our planet's surface. Recently, advances in Large Language Models (LLMs) have transformed the paradigm in science. Despite the success in other domains, current LLMs often fall short in catering to the needs of domain experts like oceanographers, an…

    Submitted 23 May, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: ACL2024. Project Website: https://oceangpt.zjukg.cn/

  38. arXiv:2309.16990  [pdf, other]

    cs.RO

    Simultaneous Synchronization and Calibration for Wide-baseline Stereo Event Cameras

    Authors: Wanli Xing, Shijie Lin, Guangze Zheng, Yanjun Du, Jia Pan

    Abstract: Event-based cameras are increasingly utilized in various applications, owing to their high temporal resolution and low power consumption. However, a fundamental challenge arises when deploying multiple such cameras: they operate on independent time systems, leading to temporal misalignment. This misalignment can significantly degrade performance in downstream applications. Traditional solutions, w…

    Submitted 29 September, 2023; originally announced September 2023.

  39. arXiv:2309.13611  [pdf]

    eess.IV cs.IR physics.optics

    Sparsity-regularized coded ptychography for robust and efficient lensless microscopy on a chip

    Authors: Ninghe Liu, Qianhao Zhao, Guoan Zheng

    Abstract: In ptychographic imaging, the trade-off between the number of acquisitions and the resultant imaging quality presents a complex optimization problem. Increasing the number of acquisitions typically yields reconstructions with higher spatial resolution and finer details. Conversely, a reduction in measurement frequency often compromises the quality of the reconstructed images, manifesting as increa…

    Submitted 24 September, 2023; originally announced September 2023.

    Comments: 15 pages, 7 figures

  40. arXiv:2309.12783  [pdf, ps, other]

    cs.NI eess.SP

    Multi-objective Optimization of Space-Air-Ground Integrated Network Slicing Relying on a Pair of Central and Distributed Learning Algorithms

    Authors: Guorong Zhou, Liqiang Zhao, Gan Zheng, Shenghui Song, Jiankang Zhang, Lajos Hanzo

    Abstract: As an attractive enabling technology for next-generation wireless communications, network slicing supports diverse customized services in the global space-air-ground integrated network (SAGIN) with diverse resource constraints. In this paper, we dynamically consider three typical classes of radio access network (RAN) slices, namely high-throughput slices, low-delay slices and wide-coverage slices,…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: 19 pages, 14 figures, journal

  41. Uncertainty-aware Traffic Prediction under Missing Data

    Authors: Hao Mei, Junxian Li, Zhiming Liang, Guanjie Zheng, Bin Shi, Hua Wei

    Abstract: Traffic prediction is a crucial topic because of its broad scope of applications in the transportation domain. Recently, various studies have achieved promising results. However, most studies assume the prediction locations have complete or at least partial historical records and cannot be extended to non-historical recorded locations. In real-life scenarios, the deployment of sensors could be lim…

    Submitted 29 November, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: 11 pages, 3 figures, a short version of this paper is accepted by ICDM 2023

  42. arXiv:2309.03473  [pdf, other]

    cs.CV

    Temporal Collection and Distribution for Referring Video Object Segmentation

    Authors: Jiajin Tang, Ge Zheng, Sibei Yang

    Abstract: Referring video object segmentation aims to segment a referent throughout a video sequence according to a natural language expression. It requires aligning the natural language expression with the objects' motions and their dynamic associations at the global video level but segmenting objects at the frame level. To achieve this goal, we propose to simultaneously maintain a global referent token an…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted by ICCV 2023; Project page: https://toneyaya.github.io/tempcd/

  43. arXiv:2309.01093  [pdf, other]

    cs.CV

    CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection

    Authors: Jiajin Tang, Ge Zheng, Jingyi Yu, Sibei Yang

    Abstract: Task driven object detection aims to detect object instances suitable for affording a task in an image. Its challenge lies in object categories available for the task being too diverse to be limited to a closed set of object vocabulary for traditional object detection. Simply mapping categories and visual features of common objects to the task cannot address the challenge. In this paper, we propos…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: Accepted by ICCV 2023

  44. arXiv:2309.01017  [pdf, other]

    cs.CV

    Contrastive Grouping with Transformer for Referring Image Segmentation

    Authors: Jiajin Tang, Ge Zheng, Cheng Shi, Sibei Yang

    Abstract: Referring image segmentation aims to segment the target referent in an image conditioning on a natural language expression. Existing one-stage methods employ per-pixel classification frameworks, which attempt straightforwardly to align vision and language at the pixel level, thus failing to capture critical object-level information. In this paper, we propose a mask classification framework, Contra…

    Submitted 2 September, 2023; originally announced September 2023.

    Comments: Accepted by CVPR 2023

  45. arXiv:2308.15452  [pdf, other]

    cs.CL cs.AI cs.LG cs.SE

    When Do Program-of-Thoughts Work for Reasoning?

    Authors: Zhen Bi, Ningyu Zhang, Yinuo Jiang, Shumin Deng, Guozhou Zheng, Huajun Chen

    Abstract: In the realm of embodied artificial intelligence, the reasoning capabilities of Large Language Models (LLMs) play a pivotal role. Although there are effective methods like program-of-thought prompting for LLMs which uses programming language to tackle complex reasoning tasks, the specific impact of code data on the improvement of reasoning capabilities remains under-explored. To address this gap,…

    Submitted 18 December, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

    Comments: AAAI 2024

  46. arXiv:2308.09727  [pdf, other]

    cs.LG

    Cross-city Few-Shot Traffic Forecasting via Traffic Pattern Bank

    Authors: Zhanyu Liu, Guanjie Zheng, Yanwei Yu

    Abstract: Traffic forecasting is a critical service in Intelligent Transportation Systems (ITS). Utilizing deep models to tackle this task relies heavily on data from traffic sensors or vehicle devices, while some cities might lack device support and thus have few available data. So, it is necessary to learn from data-rich cities and transfer the knowledge to data-scarce cities in order to improve the perfo…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Accepted by CIKM 2023 (Long Paper)

  47. arXiv:2308.07269  [pdf, other]

    cs.CL cs.AI cs.CV cs.IR cs.LG

    EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models

    Authors: Peng Wang, Ningyu Zhang, Bozhong Tian, Zekun Xi, Yunzhi Yao, Ziwen Xu, Mengru Wang, Shengyu Mao, Xiaohan Wang, Siyuan Cheng, Kangwei Liu, Yuansheng Ni, Guozhou Zheng, Huajun Chen

    Abstract: Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy issues, which means they are unaware of unseen events or generate text with incorrect facts owing to outdated/noisy data. To this end, many knowledge editing approaches for LLMs have emerged -- aiming to subtly inject/edit updated knowledge or adjust undesired behavior while minimizing the impact on unrelated inputs. Neve…

    Submitted 19 March, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: Code: https://github.com/zjunlp/EasyEdit; HF Demo: https://huggingface.co/spaces/zjunlp/EasyEdit; Video: https://youtu.be/Gm6T0QaaskU; Docs: https://zjunlp.gitbook.io/easyedit

  48. arXiv:2308.04215  [pdf, other]

    cs.CL cs.AI cs.DC

    Hybrid Retrieval-Augmented Generation for Real-time Composition Assistance

    Authors: Menglin Xia, Xuchao Zhang, Camille Couturier, Guoqing Zheng, Saravan Rajmohan, Victor Ruhle

    Abstract: Retrieval augmentation enhances performance of traditional language models by incorporating additional context. However, the computational demands for retrieval augmented large language models (LLMs) pose a challenge when applying them to real-time tasks, such as composition assistance. To address this limitation, we propose the Hybrid Retrieval-Augmented Generation (HybridRAG) framework, a novel…

    Submitted 5 February, 2024; v1 submitted 8 August, 2023; originally announced August 2023.

  49. arXiv:2308.03658  [pdf, other]

    eess.SP cs.IT eess.SY

    Control-Oriented Deep Space Communications For Unmanned Space Exploration

    Authors: Xinran Fang, Wei Feng, Yunfei Chen, Ning Ge, Gan Zheng

    Abstract: In unmanned space exploration, the cooperation among space robots requires advanced communication techniques. In this paper, we propose a communication optimization scheme for a specific cooperation system named the "mother-daughter system". In this setup, the mother spacecraft orbits the planet, while daughter probes are distributed across the planetary surface. During each control cycle, the mot…

    Submitted 11 December, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  50. arXiv:2307.14064  [pdf, other]

    cs.IT

    Relay-Enabled Backscatter Communications: Linear Mapping and Resource Allocation

    Authors: Rui Xu, Liqin Shi, Yinghui Ye, Haijian Sun, Gan Zheng

    Abstract: Relay-enabled backscatter communication (BC) is an intriguing paradigm to alleviate energy shortage and improve throughput of Internet-of-Things (IoT) devices. Most of the existing works focus on the resource allocation that considered the unequal and continuous time allocation for both source-relay and relay-destination links. However, the continuous time allocation may be infeasible since in pra…

    Submitted 26 July, 2023; originally announced July 2023.