Search | arXiv e-print repository

Showing 1–50 of 195 results for author: Huang, X

Searching in archive eess.
  1. arXiv:2405.14210  [pdf, other]

    cs.CV eess.IV

    Eidos: Efficient, Imperceptible Adversarial 3D Point Clouds

    Authors: Hanwei Zhang, Luo Cheng, Qisong He, Wei Huang, Renjue Li, Ronan Sicre, Xiaowei Huang, Holger Hermanns, Lijun Zhang

    Abstract: Classification of 3D point clouds is a challenging machine learning (ML) task with important real-world applications in a spectrum from autonomous driving and robot-assisted surgery to earth observation from low orbit. As with other ML tasks, classification models are notoriously brittle in the presence of adversarial attacks. These are rooted in imperceptible changes to inputs with the effect tha…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Preprint

  2. arXiv:2405.06230  [pdf]

    eess.IV

    Fire in SRRN: Next-Gen 3D Temperature Field Reconstruction Technology

    Authors: Shenxiang Feng, Xiaojian Hao, Xiaodong Huang, Pan Pei, Tong Wei, Chenyang Xu

    Abstract: In aerospace and energy engineering, accurate 3D combustion field temperature measurement is critical. The resolution of traditional methods based on algebraic iteration is limited by the initial voxel division. This study introduces a novel method for reconstructing three-dimensional temperature fields using the Spatial Radiation Representation Network (SRRN). This method utilizes the flame therm…

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2404.16305  [pdf, other]

    cs.MM cs.SD eess.AS

    Semantically consistent Video-to-Audio Generation using Multimodal Language Large Model

    Authors: Gehui Chen, Guan'an Wang, Xiaowen Huang, Jitao Sang

    Abstract: Existing works have made strides in video generation, but the lack of sound effects (SFX) and background music (BGM) hinders a complete and immersive viewer experience. We introduce a novel semantically consistent video-to-audio generation framework, namely SVA, which automatically generates audio semantically consistent with the given video content. The framework harnesses the power of multimoda…

    Submitted 25 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  4. arXiv:2404.15620  [pdf, other]

    eess.IV

    A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution

    Authors: Zhixiong Yang, Jingyuan Xia, Shengxi Li, Xinghua Huang, Shuanghui Zhang, Zhen Liu, Yaowen Fu, Yongxiang Liu

    Abstract: Deep learning-based methods have achieved significant successes on solving the blind super-resolution (BSR) problem. However, most of them request supervised pre-training on labelled datasets. This paper proposes an unsupervised kernel estimation model, named dynamic kernel prior (DKP), to realize an unsupervised and pre-training-free learning-based algorithm for solving the BSR problem. DKP can a…

    Submitted 25 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted for publication in CVPR 2024

  5. arXiv:2404.13786  [pdf, other]

    eess.SY cs.AI cs.DC cs.LG

    Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving

    Authors: Shuyao Shi, Neiwen Ling, Zhehao Jiang, Xuan Huang, Yuze He, Xiaoguang Zhao, Bufang Yang, Chen Bian, Jingfei Xia, Zhenyu Yan, Raymond Yeung, Guoliang Xing

    Abstract: Recently, smart roadside infrastructure (SRI) has demonstrated the potential of achieving fully autonomous driving systems. To explore the potential of infrastructure-assisted autonomous driving, this paper presents the design and deployment of Soar, the first end-to-end SRI system specifically designed to support autonomous driving systems. Soar consists of both software and hardware components ca…

    Submitted 21 April, 2024; originally announced April 2024.

  6. arXiv:2404.12973  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.QM

    Cross-modal Diffusion Modelling for Super-resolved Spatial Transcriptomics

    Authors: Xiaofei Wang, Xingxu Huang, Stephen J. Price, Chao Li

    Abstract: The recent advancement of spatial transcriptomics (ST) allows to characterize spatial gene expression within tissue for discovery research. However, current ST platforms suffer from low resolution, hindering in-depth understanding of spatial gene expression. Super-resolution approaches promise to enhance ST maps by integrating histology images with gene expressions of profiled tissue spots. Howeve…

    Submitted 27 May, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  7. arXiv:2404.12595  [pdf, other]

    eess.SP

    Deep Reinforcement Learning-aided Transmission Design for Energy-efficient Link Optimization in Vehicular Communications

    Authors: Zhengpeng Wang, Yanqun Tang, Yingzhe Mao, Tao Wang, Xiunan Huang

    Abstract: This letter presents a deep reinforcement learning (DRL) approach for transmission design to optimize the energy efficiency in vehicle-to-vehicle (V2V) communication links. Considering the dynamic environment of vehicular communications, the optimization problem is non-convex and mathematically difficult to solve. Hence, we propose scenario identification-based double and Dueling deep Q-Network (S…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 5 pages, 3 figures

  8. arXiv:2403.02566  [pdf, other]

    eess.IV cs.CV

    Enhancing Weakly Supervised 3D Medical Image Segmentation through Probabilistic-aware Learning

    Authors: Zhaoxin Fan, Runmin Jiang, Junhao Wu, Xin Huang, Tianyang Wang, Heng Huang, Min Xu

    Abstract: 3D medical image segmentation is a challenging task with crucial implications for disease diagnosis and treatment planning. Recent advances in deep learning have significantly enhanced fully supervised medical image segmentation. However, this approach heavily relies on labor-intensive and time-consuming fully annotated ground-truth labels, particularly for 3D volumes. To overcome this limitation,…

    Submitted 4 March, 2024; originally announced March 2024.

  9. arXiv:2402.09372  [pdf, other]

    eess.IV cs.AI cs.CV

    Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge

    Authors: Jiancheng Yang, Rui Shi, Liang Jin, Xiaoyang Huang, Kaiming Kuang, Donglai Wei, Shixuan Gu, Jianying Liu, Pengfei Liu, Zhizhong Chai, Yongjie Xiao, Hao Chen, Liming Xu, Bang Du, Xiangyi Yan, Hao Tang, Adam Alessio, Gregory Holste, Jiapeng Zhang, Xiaoming Wang, Jianye He, Lixuan Che, Hanspeter Pfister, Ming Li, Bingbing Ni

    Abstract: Rib fractures are a common and potentially severe injury that can be challenging and labor-intensive to detect in CT scans. While there have been efforts to address this field, the lack of large-scale annotated datasets and evaluation benchmarks has hindered the development and validation of deep learning algorithms. To address this issue, the RibFrac Challenge was introduced, providing a benchmar…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: Challenge paper for MICCAI RibFrac Challenge (https://ribfrac.grand-challenge.org/)

  10. arXiv:2402.09048  [pdf, other]

    eess.SP

    Sensing in Bi-Static ISAC Systems with Clock Asynchronism: A Signal Processing Perspective

    Authors: Kai Wu, Jacopo Pegoraro, Francesca Meneghello, J. Andrew Zhang, Jesus O. Lacruz, Joerg Widmer, Francesco Restuccia, Michele Rossi, Xiaojing Huang, Daqing Zhang, Giuseppe Caire, Y. Jay Guo

    Abstract: Integrated Sensing and Communication (ISAC) has been identified as a pillar usage scenario for the impending 6G era. Bi-static sensing, a major type of sensing in ISAC, is promising to expedite ISAC in the near future, as it requires minimal changes to the existing network infrastructure. However, a critical challenge for bi-static sensing is clock asynchronism due to the use of different clocks a…

    Submitted 24 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: 20 pages, 6 figures, 1 table

  11. arXiv:2402.06841  [pdf]

    eess.IV cs.CV

    Point cloud-based registration and image fusion between cardiac SPECT MPI and CTA

    Authors: Shaojie Tang, Penpen Miao, Xingyu Gao, Yu Zhong, Dantong Zhu, Haixing Wen, Zhihui Xu, Qiuyue Wei, Hongping Yao, Xin Huang, Rui Gao, Chen Zhao, Weihua Zhou

    Abstract: A method was proposed for the point cloud-based registration and image fusion between cardiac single photon emission computed tomography (SPECT) myocardial perfusion images (MPI) and cardiac computed tomography angiograms (CTA). Firstly, the left ventricle (LV) epicardial regions (LVERs) in SPECT and CTA images were segmented by using different U-Net neural networks trained to generate the point c…

    Submitted 9 February, 2024; originally announced February 2024.

  12. arXiv:2402.01546  [pdf, other]

    cs.LG cs.AI cs.CR cs.DC cs.MA eess.SY

    Privacy-Preserving Distributed Learning for Residential Short-Term Load Forecasting

    Authors: Yi Dong, Yingjie Wang, Mariana Gama, Mustafa A. Mustafa, Geert Deconinck, Xiaowei Huang

    Abstract: In the realm of power systems, the increasing involvement of residential users in load forecasting applications has heightened concerns about data privacy. Specifically, the load data can inadvertently reveal the daily routines of residential users, thereby posing a risk to their property security. While federated learning (FL) has been employed to safeguard user privacy by enabling model training…

    Submitted 2 February, 2024; originally announced February 2024.

  13. arXiv:2401.16592  [pdf]

    physics.med-ph eess.IV

    A compact and cost-effective laser-powered speckle visibility spectroscopy (SVS) device for measuring cerebral blood flow

    Authors: Yu Xi Huang, Simon Mahler, Maya Dickson, Aidin Abedi, Julian M. Tyszka, Jack Lo Yu Tung, Jonathan Russin, Charles Liu, Changhuei Yang

    Abstract: In the realm of cerebrovascular monitoring, primary metrics typically include blood pressure, which influences cerebral blood flow (CBF) and is contingent upon vessel radius. Measuring CBF non-invasively poses a persistent challenge, primarily attributed to the difficulty of accessing and obtaining signal from the brain. This study aims to introduce a compact speckle visibility spectroscopy (SVS)…

    Submitted 8 February, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  14. arXiv:2401.16520  [pdf, other]

    cs.LG cs.CV eess.SP

    MT-HCCAR: Multi-Task Deep Learning with Hierarchical Classification and Attention-based Regression for Cloud Property Retrieval

    Authors: Xingyan Li, Andrew M. Sayer, Ian T. Carroll, Xin Huang, Jianwu Wang

    Abstract: In the realm of Earth science, effective cloud property retrieval, encompassing cloud masking, cloud phase classification, and cloud optical thickness (COT) prediction, remains pivotal. Traditional methodologies necessitate distinct models for each sensor instrument due to their unique spectral characteristics. Recent strides in Earth Science research have embraced machine learning and deep learni…

    Submitted 5 July, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: 14 pages, 3 figures, accepted by ECML PKDD 2024

    MSC Class: 68T07 ACM Class: I.2.6

  15. arXiv:2401.12826  [pdf, other]

    cs.NI eess.IV

    Digital Twin-Based Network Management for Better QoE in Multicast Short Video Streaming

    Authors: Xinyu Huang, Shisheng Hu, Haojun Yang, Xinghan Wang, Yingying Pei, Xuemin Shen

    Abstract: Multicast short video streaming can enhance bandwidth utilization by enabling simultaneous video transmission to multiple users over shared wireless channels. The existing network management schemes mainly rely on the sequential buffering principle and general quality of experience (QoE) model, which may deteriorate QoE when users' swipe behaviors exhibit distinct spatiotemporal variation. In this…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 13 pages, 12 figures

  16. arXiv:2401.10070  [pdf, other]

    cs.CL cs.SD eess.AS

    Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks

    Authors: Yichao Du, Zhirui Zhang, Linan Yue, Xu Huang, Yuqing Zhang, Tong Xu, Linli Xu, Enhong Chen

    Abstract: To protect privacy and meet legal regulations, federated learning (FL) has gained significant attention for training speech-to-text (S2T) systems, including automatic speech recognition (ASR) and speech translation (ST). However, the commonly used FL approach (i.e., FedAvg) in S2T tasks typically suffers from extensive communication overhead due to multi-round interactions based on the wh…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: ICASSP 2024

  17. arXiv:2311.14925  [pdf, other]

    cs.CV eess.IV

    Coordinate-based Neural Network for Fourier Phase Retrieval

    Authors: Tingyou Li, Zixin Xu, Yong S. Chu, Xiaojing Huang, Jizhou Li

    Abstract: Fourier phase retrieval is essential for high-definition imaging of nanoscale structures across diverse fields, notably coherent diffraction imaging. This study presents the Single impliCit neurAl Network (SCAN), a tool built upon coordinate neural networks meticulously designed for enhanced phase retrieval performance. Remedying the drawbacks of conventional iterative methods which are easily tr…

    Submitted 8 January, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

  18. arXiv:2311.12223  [pdf, other]

    cs.NI cs.AI eess.SP

    Digital Twin-Based User-Centric Edge Continual Learning in Integrated Sensing and Communication

    Authors: Shisheng Hu, Jie Gao, Xinyu Huang, Mushu Li, Kaige Qu, Conghao Zhou, Xuemin Shen

    Abstract: In this paper, we propose a digital twin (DT)-based user-centric approach for processing sensing data in an integrated sensing and communication (ISAC) system with high accuracy and efficient resource utilization. The considered scenario involves an ISAC device with a lightweight deep neural network (DNN) and a mobile edge computing (MEC) server with a large DNN. After collecting sensing data, the…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: submitted to IEEE ICC 2024

  19. arXiv:2311.00483  [pdf, other]

    eess.IV cs.CV

    DEFN: Dual-Encoder Fourier Group Harmonics Network for Three-Dimensional Indistinct-Boundary Object Segmentation

    Authors: Xiaohua Jiang, Yihao Guo, Jian Huang, Yuting Wu, Meiyi Luo, Zhaoyang Xu, Qianni Zhang, Xingru Huang, Hong He, Shaowei Jiang, Jing Ye, Mang Xiao

    Abstract: The precise spatial and quantitative delineation of indistinct-boundary medical objects is paramount for the accuracy of diagnostic protocols, efficacy of surgical interventions, and reliability of postoperative assessments. Despite their significance, the effective segmentation and instantaneous three-dimensional reconstruction are significantly impeded by the paucity of representative samples in…

    Submitted 19 June, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 36 pages, 16 figures, 7 tables

    MSC Class: 68; 92 ACM Class: I.4; J.3

  20. arXiv:2310.12039  [pdf, other]

    cs.IT eess.SP

    Ordered Reliability Direct Error Pattern Testing Decoding Algorithm

    Authors: Reza Hadavian, Xiaoting Huang, Dmitri Truhachev, Kamal El-Sankary, Hamid Ebrahimzad, Hossein Najafi

    Abstract: We introduce a novel universal soft-decision decoding algorithm for binary block codes called ordered reliability direct error pattern testing (ORDEPT). Our results, obtained for a variety of popular short high-rate codes, demonstrate that ORDEPT outperforms state-of-the-art decoding algorithms of comparable complexity such as ordered reliability bits guessing random additive noise decoding (ORBGR…

    Submitted 18 October, 2023; originally announced October 2023.

  21. arXiv:2310.04456  [pdf, other]

    cs.CL cs.SD eess.AS

    Multimodal Prompt Transformer with Hybrid Contrastive Learning for Emotion Recognition in Conversation

    Authors: Shihao Zou, Xianying Huang, Xudong Shen

    Abstract: Emotion Recognition in Conversation (ERC) plays an important role in driving the development of human-machine interaction. Emotions can exist in multiple modalities, and multimodal ERC mainly faces two problems: (1) the noise problem in the cross-modal information fusion process, and (2) the prediction problem of less sample emotion labels that are semantically similar but different categories. To…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: Accepted to ACM MM 2023

  22. arXiv:2310.01861  [pdf, other]

    eess.IV cs.CV cs.GR

    Shifting More Attention to Breast Lesion Segmentation in Ultrasound Videos

    Authors: Junhao Lin, Qian Dai, Lei Zhu, Huazhu Fu, Qiong Wang, Weibin Li, Wenhao Rao, Xiaoyang Huang, Liansheng Wang

    Abstract: Breast lesion segmentation in ultrasound (US) videos is essential for diagnosing and treating axillary lymph node metastasis. However, the lack of a well-established and large-scale ultrasound video dataset with high-quality annotations has posed a persistent challenge for the research community. To overcome this issue, we meticulously curated a US video breast lesion segmentation dataset comprisi…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: 10 pages

  23. arXiv:2309.15867  [pdf]

    cs.LG eess.IV q-bio.QM

    Identifying factors associated with fast visual field progression in patients with ocular hypertension based on unsupervised machine learning

    Authors: Xiaoqin Huang, Asma Poursoroush, Jian Sun, Michael V. Boland, Chris Johnson, Siamak Yousefi

    Abstract: Purpose: To identify ocular hypertension (OHT) subtypes with different trends of visual field (VF) progression based on unsupervised machine learning and to discover factors associated with fast VF progression. Participants: A total of 3133 eyes of 1568 ocular hypertension treatment study (OHTS) participants with at least five follow-up VF tests were included in the study. Methods: We used a laten…

    Submitted 26 September, 2023; originally announced September 2023.

  24. arXiv:2309.15697  [pdf, other]

    cs.CV eess.IV

    Physics Inspired Hybrid Attention for SAR Target Recognition

    Authors: Zhongling Huang, Chong Wu, Xiwen Yao, Zhicheng Zhao, Xiankai Huang, Junwei Han

    Abstract: There has been a recent emphasis on integrating physical models and deep neural networks (DNNs) for SAR target recognition, to improve performance and achieve a higher level of physical interpretability. The attributed scattering center (ASC) parameters garnered the most interest, being considered as additional input data or features for fusion in most methods. However, the performance greatly dep…

    Submitted 27 September, 2023; originally announced September 2023.

  25. arXiv:2309.02124  [pdf, other]

    cs.LG eess.SP

    Exploiting Spatial-temporal Data for Sleep Stage Classification via Hypergraph Learning

    Authors: Yuze Liu, Ziming Zhao, Tiehua Zhang, Kang Wang, Xin Chen, Xiaowei Huang, Jun Yin, Zhishu Shen

    Abstract: Sleep stage classification is crucial for detecting patients' health conditions. Existing models, which mainly use Convolutional Neural Networks (CNN) for modelling Euclidean data and Graph Convolution Networks (GNN) for modelling non-Euclidean data, are unable to consider the heterogeneity and interactivity of multimodal data as well as the spatial-temporal correlation simultaneously, which hinde…

    Submitted 5 September, 2023; originally announced September 2023.

  26. arXiv:2307.11784  [pdf, other]

    cs.LG cs.AI cs.SE eess.SY

    What, Indeed, is an Achievable Provable Guarantee for Learning-Enabled Safety Critical Systems

    Authors: Saddek Bensalem, Chih-Hong Cheng, Wei Huang, Xiaowei Huang, Changshun Wu, Xingyu Zhao

    Abstract: Machine learning has made remarkable advancements, but confidently utilising learning-enabled components in safety-critical domains still poses challenges. Among the challenges, it is known that a rigorous, yet practical, way of achieving safety guarantees is one of the most prominent. In this paper, we first discuss the engineering and research challenges associated with the design and verificati…

    Submitted 20 July, 2023; originally announced July 2023.

  27. arXiv:2307.02514  [pdf, other]

    eess.AS cs.AI cs.SD

    Exploring Multimodal Approaches for Alzheimer's Disease Detection Using Patient Speech Transcript and Audio Data

    Authors: Hongmin Cai, Xiaoke Huang, Zhengliang Liu, Wenxiong Liao, Haixing Dai, Zihao Wu, Dajiang Zhu, Hui Ren, Quanzheng Li, Tianming Liu, Xiang Li

    Abstract: Alzheimer's disease (AD) is a common form of dementia that severely impacts patient health. As AD impairs the patient's language understanding and expression ability, the speech of AD patients can serve as an indicator of this disease. This study investigates various methods for detecting AD using patients' speech and transcripts data from the DementiaBank Pitt database. The proposed approach invo…

    Submitted 5 July, 2023; originally announced July 2023.

  28. arXiv:2306.17419  [pdf]

    eess.IV eess.SP physics.optics

    Interferometric speckle visibility spectroscopy (iSVS) for measuring decorrelation time and dynamics of moving samples with enhanced signal-to-noise ratio and relaxed reference requirements

    Authors: Yu Xi Huang, Simon Mahler, Jerome Mertz, Changhuei Yang

    Abstract: Diffusing wave spectroscopy (DWS) is a group of techniques used to measure the dynamics of a scattering medium in a non-invasive manner. DWS methods rely on detecting the speckle light field from the moving scattering media and measuring the speckle decorrelation time to quantify the scattering medium's dynamics. For DWS, the signal-to-noise (SNR) is determined by the ratio between measured decorre…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: 14 pages, 5 figures

    MSC Class: 92C55

  29. arXiv:2306.05946  [pdf, other]

    eess.IV cs.NI

    Digital Twin-Assisted Resource Demand Prediction for Multicast Short Video Streaming

    Authors: Xinyu Huang, Wen Wu, Xuemin Sherman Shen

    Abstract: In this paper, we propose a digital twin (DT)-assisted resource demand prediction scheme to enhance prediction accuracy for multicast short video streaming. Particularly, we construct user DTs (UDTs) for collecting real-time user status, including channel condition, location, watching duration, and preference. A reinforcement learning-empowered K-means++ algorithm is developed to cluster users bas…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 2 pages, 3 figures

  30. arXiv:2305.14838  [pdf, other]

    cs.CL cs.SD eess.AS

    ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

    Authors: Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng, Xuedong Huang

    Abstract: Joint speech-language training is challenging due to the large demand for training data and GPU consumption, as well as the modality gap between speech and language. We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models and optimized data-efficiently for spoken language tasks. Particularly, we propose to incorporate…

    Submitted 14 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023, Poster

  31. arXiv:2305.12311  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG eess.AS

    i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

    Authors: Ziyi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

    Abstract: The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence, however the current Vision-Language-Speech landscape is dominated by encoder-only models which lack generative abilities. We propose closing this gap with i-Code V2, the first model capable of generating natural language from any combination of Vision, Language, and Speech data. i-Code V2 is a…

    Submitted 20 May, 2023; originally announced May 2023.

  32. arXiv:2305.11548  [pdf, ps, other]

    eess.SP

    Sensing Aided Uplink Transmission in OTFS ISAC with Joint Parameter Association, Channel Estimation and Signal Detection

    Authors: Xi Yang, Hang Li, Qinghua Guo, J. Andrew Zhang, Xiaojing Huang, Zhiqun Cheng

    Abstract: In this work, we study sensing-aided uplink transmission in an integrated sensing and communication (ISAC) vehicular network with the use of orthogonal time frequency space (OTFS) modulation. To exploit sensing parameters for improving uplink communications, the parameters must be first associated with the transmitters, which is a challenging task. We propose a scheme that jointly conducts paramet…

    Submitted 19 May, 2023; originally announced May 2023.

  33. arXiv:2304.00729  [pdf, other]

    eess.SY

    Data-Driven Safe Controller Synthesis for Deterministic Systems: A Posteriori Method With Validation Tests

    Authors: Yu Chen, Chao Shang, Xiaolin Huang, Xiang Yin

    Abstract: In this work, we investigate the data-driven safe control synthesis problem for unknown dynamic systems. We first formulate the safety synthesis problem as a robust convex program (RCP) based on notion of control barrier function. To resolve the issue of unknown system dynamic, we follow the existing approach by converting the RCP to a scenario convex program (SCP) by randomly collecting finite sa…

    Submitted 3 April, 2023; originally announced April 2023.

  34. arXiv:2303.13859  [pdf, other]

    cs.MM eess.IV

    XGC-VQA: A unified video quality assessment model for User, Professionally, and Occupationally-Generated Content

    Authors: Xinhui Huang, Chunyi Li, Abdelhak Bentaleb, Roger Zimmermann, Guangtao Zhai

    Abstract: With the rapid growth of Internet video data amounts and types, a unified Video Quality Assessment (VQA) is needed to inspire video communication with perceptual quality. To meet the real-time and universal requirements in providing such inspiration, this study proposes a VQA model from a classification of User Generated Content (UGC), Professionally Generated Content (PGC), and Occupationally Gen…

    Submitted 24 March, 2023; originally announced March 2023.

    Comments: 6 pages, 4 figures

  35. arXiv:2303.08015  [pdf, ps, other]

    q-bio.QM cs.IT eess.SP

    Molecular Communication for Quorum Sensing Inspired Cooperative Drug Delivery

    Authors: Yuting Fang, Stuart T. Johnston, Matt Faria, Xinyu Huang, Andrew W. Eckford, Jamie Evans

    Abstract: A cooperative drug delivery system is proposed, where quorum sensing (QS), a density-dependent bacterial behavior coordination mechanism, is employed by synthetic bacterium-based nanomachines (B-NMs) for controllable drug delivery. In our proposed system, drug delivery is only triggered when there are enough QS molecules, which in turn only happens when there are enough B-NMs. This makes the propo…

    Submitted 14 February, 2023; originally announced March 2023.

    Comments: 9 pages; 9 figures

  36. arXiv:2302.13755  [pdf, ps, other]

    eess.SY cs.MA

    Neuroadaptive Distributed Event-triggered Control of Networked Uncertain Pure-feedback Systems with Polluted Feedback

    Authors: Libei Sun, Zhirong Zhang, Xinjian Huang, Xiucai Huang

    Abstract: This paper investigates the distributed event-triggered control problem for a class of uncertain pure-feedback nonlinear multi-agent systems (MASs) with polluted feedback. Under the setting of event-triggered control, substantial challenges exist in both control design and stability analysis for systems in more general non-affine pure-feedback forms wherein all state variables are not directly and…

    Submitted 27 February, 2023; originally announced February 2023.

  37. arXiv:2302.01728  [pdf, other]

    eess.SY cs.DC

    Decentralised and Cooperative Control of Multi-Robot Systems through Distributed Optimisation

    Authors: Yi Dong, Zhongguo Li, Xingyu Zhao, Zhengtao Ding, Xiaowei Huang

    Abstract: Multi-robot cooperative control has gained extensive research interest due to its wide applications in civil, security, and military domains. This paper proposes a cooperative control algorithm for multi-robot systems with general linear dynamics. The algorithm is based on distributed cooperative optimisation and output regulation, and it achieves global optimum by utilising only information share…

    Submitted 3 February, 2023; originally announced February 2023.

    Comments: Accepted by AAMAS'23

  38. AnycostFL: Efficient On-Demand Federated Learning over Heterogeneous Edge Devices

    Authors: Peichun Li, Guoliang Cheng, Xumin Huang, Jiawen Kang, Rong Yu, Yuan Wu, Miao Pan

    Abstract: In this work, we investigate the challenging problem of on-demand federated learning (FL) over heterogeneous edge devices with diverse resource constraints. We propose a cost-adjustable FL framework, named AnycostFL, that enables diverse edge devices to efficiently perform local updates under a wide range of efficiency constraints. To this end, we design the model shrinking to support local model…

    Submitted 8 January, 2023; originally announced January 2023.

    Comments: Accepted to IEEE INFOCOM 2023

    Journal ref: IEEE INFOCOM 2023 - IEEE Conference on Computer Communications, New York City, NY, USA, 2023, pp. 1-10

  39. arXiv:2212.03329  [pdf, other]

    cs.LG eess.SP q-bio.NC

    Enhancing Low-Density EEG-Based Brain-Computer Interfaces with Similarity-Keeping Knowledge Distillation

    Authors: Xin-Yao Huang, Sung-Yu Chen, Chun-Shu Wei

    Abstract: Electroencephalogram (EEG) has been one of the common neuromonitoring modalities for real-world brain-computer interfaces (BCIs) because of its non-invasiveness, low cost, and high temporal resolution. Recently, light-weight and portable EEG wearable devices based on low-density montages have increased the convenience and usability of BCI applications. However, loss of EEG decoding performance is…

    Submitted 6 December, 2022; originally announced December 2022.

  40. arXiv:2211.15036  [pdf, ps, other]

    eess.SY

    Quantized control of non-Lipschitz nonlinear systems: a novel control framework with prescribed transient performance and lower design complexity

    Authors: Zongcheng Liu, Jiangshuai Huang, Changyun Wen, Jing Zhou, Xiucai Huang

    Abstract: A novel control design framework is proposed for a class of non-Lipschitz nonlinear systems with quantized states, meanwhile prescribed transient performance and lower control design complexity could be guaranteed. Firstly, different from all existing control methods for systems with state quantization, global stability of strict-feedback nonlinear systems is achieved without requiring the conditi…

    Submitted 27 November, 2022; originally announced November 2022.

  41. arXiv:2211.06906  [pdf, other]

    eess.IV eess.SP

    Digital Twin-Assisted Collaborative Transcoding for Better User Satisfaction in Live Streaming

    Authors: Xinyu Huang, Mushu Li, Wen Wu, Conghao Zhou, Xuemin Sherman Shen

    Abstract: In this paper, we propose a digital twin (DT)-assisted cloud-edge collaborative transcoding scheme to enhance user satisfaction in live streaming. We first present a DT-assisted transcoding workload estimation (TWE) model for the cloud-edge collaborative transcoding. Particularly, two DTs are constructed for emulating the cloud-edge collaborative transcoding process by analyzing spatial-temporal i…

    Submitted 13 November, 2022; originally announced November 2022.

    Comments: Submitted to ICC 2023

  42. arXiv:2210.15022  [pdf, other]

    eess.IV cs.CV

    Automatic Assessment of Infant Face and Upper-Body Symmetry as Early Signs of Torticollis

    Authors: Michael Wan, Xiaofei Huang, Bethany Tunik, Sarah Ostadabbas

    Abstract: We apply computer vision pose estimation techniques developed expressly for the data-scarce infant domain to the study of torticollis, a common condition in infants for which early identification and treatment is critical. Specifically, we use a combination of facial landmark and body joint estimation techniques designed for infants to estimate a range of geometric measures pertaining to face and…

    Submitted 7 November, 2022; v1 submitted 26 October, 2022; originally announced October 2022.

  43. arXiv:2210.08369  [pdf]

    physics.optics eess.IV

    Metasurface Smart Glass for Object Recognition

    Authors: Cheng-Chia Tsai, Xiaoyan Huang, Zhicheng Wu, Zongfu Yu, Nanfang Yu

    Abstract: Recent years have seen a considerable surge of research on developing heuristic approaches to realize analog computing using physical waves. Among these, neuromorphic computing using light waves is envisioned to feature performance metrics such as computational speed and energy efficiency exceeding those of conventional digital techniques by many orders of magnitude. Yet, neuromorphic computing ba…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: 30 pages, 6 figures

  44. arXiv:2210.07098  [pdf]

    cs.LG eess.SY

    Meta-learning Based Short-Term Passenger Flow Prediction for Newly-Operated Urban Rail Transit Stations

    Authors: Kuo Han, Jinlei Zhang, Chunqi Zhu, Lixing Yang, Xiaoyu Huang, Songsong Li

    Abstract: Accurate short-term passenger flow prediction in urban rail transit stations has great benefits for reasonably allocating resources, easing congestion, and reducing operational risks. However, compared with data-rich stations, the passenger flow prediction in newly-operated stations is limited by passenger flow data volume, which would reduce the prediction accuracy and increase the difficulty for…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: 37 pages, 13 figures, 3 tables

  45. arXiv:2210.04435  [pdf, other]

    cs.RO cs.AI eess.SY

    Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning

    Authors: Xiaoyu Huang, Zhongyu Li, Yanzhen Xiang, Yiming Ni, Yufeng Chi, Yunhao Li, Lizhi Yang, Xue Bin Peng, Koushil Sreenath

    Abstract: We present a reinforcement learning (RL) framework that enables quadrupedal robots to perform soccer goalkeeping tasks in the real world. Soccer goalkeeping using quadrupeds is a challenging problem, that combines highly dynamic locomotion with precise and fast non-prehensile object (ball) manipulation. The robot needs to react to and intercept a potentially flying ball using dynamic locomotion ma…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: First two authors contributed equally. Accompanying video is at https://youtu.be/iX6OgG67-ZQ

  46. arXiv:2209.06261  [pdf, other]

    cs.RO cs.AI cs.GR cs.LG eess.SY

    Real2Sim2Real Transfer for Control of Cable-driven Robots via a Differentiable Physics Engine

    Authors: Kun Wang, William R. Johnson III, Shiyang Lu, Xiaonan Huang, Joran Booth, Rebecca Kramer-Bottiglio, Mridul Aanjaneya, Kostas Bekris

    Abstract: Tensegrity robots, composed of rigid rods and flexible cables, exhibit high strength-to-weight ratios and significant deformations, which enable them to navigate unstructured terrains and survive harsh impacts. They are hard to control, however, due to high dimensionality, complex dynamics, and a coupled architecture. Physics-based simulation is a promising avenue for developing locomotion policie…

    Submitted 17 September, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

    Comments: Accepted to IROS2023; https://sites.google.com/view/sim2real

  47. arXiv:2208.09792  [pdf, other]

    eess.SP cs.IT eess.SY

    Simultaneous Beam and User Selection for the Beamspace mmWave/THz Massive MIMO Downlink

    Authors: Kai Wu, J. Andrew Zhang, Xiaojing Huang, Y. Jay Guo, Lajos Hanzo

    Abstract: Beamspace millimeter-wave (mmWave) and terahertz (THz) massive MIMO constitute attractive schemes for next-generation communications, given their abundant bandwidth and high throughput. However, their user and beam selection problem has to be efficiently addressed. Inspired by this challenge, we develop low-complexity solutions explicitly. We introduce the dirty paper coding (DPC) into the joint u…

    Submitted 15 January, 2023; v1 submitted 20 August, 2022; originally announced August 2022.

    Comments: 12 pages, 8 figures; to appear in IEEE Transactions on Communications

  48. arXiv:2208.09791  [pdf, other]

    eess.SP cs.IT eess.SY

    Joint Communications and Sensing Employing Optimized MIMO-OFDM Signals

    Authors: Kai Wu, J. Andrew Zhang, Zhitong Ni, Xiaojing Huang, Y. Jay Guo, Shanzhi Chen

    Abstract: Joint communication and sensing (JCAS) has the potential to improve the overall energy, cost and frequency efficiency of IoT systems. As a first effort, we propose to optimize the MIMO-OFDM data symbols carried by sub-carriers for better time- and spatial-domain signal orthogonality. This not only boosts the availability of usable signals for JCAS, but also significantly facilitates Internet-of-Th…

    Submitted 20 August, 2022; originally announced August 2022.

    Comments: 15 pages, 7 figures; submitted to an IEEE journal

  49. arXiv:2208.09782  [pdf, other]

    eess.SP

    Green Joint Communications and Sensing Employing Analog Multi-Beam Antenna Arrays

    Authors: Kai Wu, J. Andrew Zhang, Xiaojing Huang, Robert W. Heath Jr., Y. Jay Guo

    Abstract: Joint communications and sensing (JCAS) is potentially a hallmark technology for the sixth generation mobile network (6G). Most existing JCAS designs are based on digital arrays, analog arrays with tunable phase shifters, or hybrid arrays, which are effective but are generally complicated to design and power inefficient. This article introduces the energy-efficient and easy-to-design multi-beam an…

    Submitted 1 January, 2023; v1 submitted 20 August, 2022; originally announced August 2022.

    Comments: to appear in IEEE Communications Magazine; 7 pages, 5 figures, 1 table

  50. arXiv:2208.05616  [pdf, other]

    eess.IV cs.CV cs.LG

    OpenMedIA: Open-Source Medical Image Analysis Toolbox and Benchmark under Heterogeneous AI Computing Platforms

    Authors: Jia-Xin Zhuang, Xiansong Huang, Yang Yang, Jiancong Chen, Yue Yu, Wei Gao, Ge Li, Jie Chen, Tong Zhang

    Abstract: In this paper, we present OpenMedIA, an open-source toolbox library containing a rich set of deep learning methods for medical image analysis under heterogeneous Artificial Intelligence (AI) computing platforms. Various medical image analysis methods, including 2D/3D medical image classification, segmentation, localisation, and detection, have been included in the toolbox with PyTorch and/or MindS…

    Submitted 7 September, 2022; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: 12 pages, 1 figure