Search | arXiv e-print repository

Showing 1–50 of 194 results for author: Zhang, F

Searching in archive eess. Search in all archives.
  1. arXiv:2407.07318  [pdf]

    physics.optics eess.IV

    Serial coherent diffraction imaging of dynamic samples based on inter-frame constraint

    Authors: Pengju Sheng, Fucai Zhang

    Abstract: We proposed a novel approach to coherent imaging of dynamic samples. The inter-frame similarity of the sample's local structures is found to be a powerful constraint in phasing a sequence of diffraction patterns. We devised a new image reconstruction algorithm that exploits this inter-frame constraint enabled by an adaptive similar region determination approach. We demonstrated the feasibility of…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2407.05749  [pdf, other]

    eess.SP cs.HC cs.LG

    LDGCN: An Edge-End Lightweight Dual GCN Based on Single-Channel EEG for Driver Drowsiness Monitoring

    Authors: Jingwei Huang, Chuansheng Wang, Jiayan Huang, Haoyi Fan, Antoni Grau, Fuquan Zhang

    Abstract: Driver drowsiness electroencephalography (EEG) signal monitoring can timely alert drivers of their drowsiness status, thereby reducing the probability of traffic accidents. Graph convolutional networks (GCNs) have shown significant advancements in processing the non-stationary, time-varying, and non-Euclidean nature of EEG signals. However, the existing single-channel EEG adjacency graph construct…

    Submitted 8 July, 2024; originally announced July 2024.

  3. arXiv:2407.02675  [pdf, other]

    eess.IV cs.CV

    Depth-Aware Endoscopic Video Inpainting

    Authors: Francis Xiatian Zhang, Shuang Chen, Xianghua Xie, Hubert P. H. Shum

    Abstract: Video inpainting fills in corrupted video content with plausible replacements. While recent advances in endoscopic video inpainting have shown potential for enhancing the quality of endoscopic videos, they mainly repair 2D visual information without effectively preserving crucial 3D spatial details for clinical reference. Depth-aware inpainting methods attempt to preserve these details by incorpor…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI 2024

  4. arXiv:2406.13268  [pdf, other]

    eess.AS cs.SD

    CEC: A Noisy Label Detection Method for Speaker Recognition

    Authors: Yao Shen, Yingying Gao, Yaqian Hao, Chenguang Hu, Fulin Zhang, Junlan Feng, Shilei Zhang

    Abstract: Noisy labels are inevitable, even in well-annotated datasets. The detection of noisy labels is of significant importance to enhance the robustness of speaker recognition models. In this paper, we propose a novel noisy label detection approach based on two new statistical metrics: Continuous Inconsistent Counting (CIC) and Total Inconsistent Counting (TIC). These metrics are calculated through Cros…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: interspeech 2024

  5. arXiv:2406.11653  [pdf, other]

    eess.SY

    Communication-Efficient MARL for Platoon Stability and Energy-efficiency Co-optimization in Cooperative Adaptive Cruise Control of CAVs

    Authors: Min Hua, Dong Chen, Kun Jiang, Fanggang Zhang, Jinhai Wang, Bo Wang, Quan Zhou, Hongming Xu

    Abstract: Cooperative adaptive cruise control (CACC) has been recognized as a fundamental function of autonomous driving, in which platoon stability and energy efficiency are outstanding challenges that are difficult to accommodate in real-world operations. This paper studied the CACC of connected and autonomous vehicles (CAVs) based on the multi-agent reinforcement learning algorithm (MARL) to optimize pla…

    Submitted 17 June, 2024; originally announced June 2024.

  6. arXiv:2406.06640  [pdf]

    physics.comp-ph eess.IV physics.optics

    A high-performance reconstruction method for partially coherent ptychography

    Authors: Wenhui Xu, Shoucong Ning, Pengju Sheng, Huixiang Lin, Angus I Kirkland, Yong Peng, Fucai Zhang

    Abstract: Ptychography is now integrated as a tool in mainstream microscopy allowing quantitative and high-resolution imaging capabilities over a wide field of view. However, its ultimate performance is inevitably limited by the available coherent flux when implemented using electrons or laboratory X-ray sources. We present a universal reconstruction algorithm with high tolerance to low coherence for both f…

    Submitted 9 June, 2024; originally announced June 2024.

  7. arXiv:2406.00212  [pdf, other]

    eess.IV cs.CV

    MVAD: A Multiple Visual Artifact Detector for Video Streaming

    Authors: Chen Feng, Duolikun Danier, Fan Zhang, David Bull

    Abstract: Visual artifacts are often introduced into streamed video content, due to prevailing conditions during content production and/or delivery. Since these can degrade the quality of the user's experience, it is important to automatically and accurately detect them in order to enable effective quality measurement and enhancement. Existing detection methods often focus on a single type of artifact and/o…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 9 pages

  8. arXiv:2405.19336  [pdf]

    eess.SP

    Image-based retrieval of all-day cloud physical parameters for FY4A/AGRI and its application over the Tibetan Plateau

    Authors: Zhijun Zhao, Feng Zhang, Wenwen Li, Jingwei Li

    Abstract: Satellite remote sensing serves as a crucial means to acquire cloud physical parameters. However, existing official cloud products derived from the advanced geostationary radiation imager (AGRI) onboard the Fengyun-4A geostationary satellite suffer from limitations in computational precision and efficiency. In this study, an image-based transfer learning model (ITLM) was developed to realize all-d…

    Submitted 28 March, 2024; originally announced May 2024.

  9. Survey on Visual Signal Coding and Processing with Generative Models: Technologies, Standards and Optimization

    Authors: Zhibo Chen, Heming Sun, Li Zhang, Fan Zhang

    Abstract: This paper provides a survey of the latest developments in visual signal coding and processing with generative models. Specifically, our focus is on presenting the advancement of generative models and their influence on research in the domain of visual signal coding and processing. This survey study begins with a brief introduction of well-established generative models, including the Variational A…

    Submitted 23 May, 2024; originally announced May 2024.

  10. arXiv:2405.08621  [pdf, other]

    eess.IV cs.CV

    RMT-BVQA: Recurrent Memory Transformer-based Blind Video Quality Assessment for Enhanced Video Content

    Authors: Tianhao Peng, Chen Feng, Duolikun Danier, Fan Zhang, David Bull

    Abstract: With recent advances in deep learning, numerous algorithms have been developed to enhance video quality, reduce visual artefacts and improve perceptual quality. However, little research has been reported on the quality assessment of enhanced content - the evaluation of enhancement methods is often based on quality metrics that were designed for compression applications. In this paper, we propose a…

    Submitted 15 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: 8pages, 2figures

  11. arXiv:2405.06159  [pdf, other]

    eess.SP

    Near-Field Channel Characterization for Mid-band ELAA Systems: Sounding, Parameter Estimation, and Modeling

    Authors: Wei Fan, Zhiqiang Yuan, Yejian Lyu, Jianhua Zhang, Gert Pedersen, Jonathan Borrill, Fengchun Zhang

    Abstract: 6G communication will greatly benefit from using extremely large-scale antenna arrays (ELAAs) and new mid-band spectrums (7-24 GHz). These techniques require a thorough exploration of the challenges and potentials of the associated near-field (NF) phenomena. It is crucial to develop accurate NF channel models that include spherical wave propagation and spatial non-stationarity (SnS). However, chan…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Submitted to IEEE Communication Magazine

  12. arXiv:2404.18580  [pdf, other]

    cs.RO eess.SY

    Data-Driven Dynamics Modeling of Miniature Robotic Blimps Using Neural ODEs With Parameter Auto-Tuning

    Authors: Yongjian Zhu, Hao Cheng, Feitian Zhang

    Abstract: Miniature robotic blimps, as one type of lighter-than-air aerial vehicles, have attracted increasing attention in the science and engineering community for their enhanced safety, extended endurance, and quieter operation compared to quadrotors. Accurately modeling the dynamics of these robotic blimps poses a significant challenge due to the complex aerodynamics stemming from their large lifting bo…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 8 pages, 8 figures

  13. arXiv:2404.10240  [pdf, other]

    eess.SY

    Disturbance Rejection-Guarded Learning for Vibration Suppression of Two-Inertia Systems

    Authors: Fan Zhang, Jinfeng Chen, Yu Hu, Zhiqiang Gao, Ge Lv, Qin Lin

    Abstract: Model uncertainty presents significant challenges in vibration suppression of multi-inertia systems, as these systems often rely on inaccurate nominal mathematical models due to system identification errors or unmodeled dynamics. An observer, such as an extended state observer (ESO), can estimate the discrepancy between the inaccurate nominal model and the true model, thus improving control perfor…

    Submitted 15 April, 2024; originally announced April 2024.

  14. arXiv:2404.09571  [pdf, other]

    eess.IV cs.CV

    MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution

    Authors: Yuxuan Jiang, Chen Feng, Fan Zhang, David Bull

    Abstract: Knowledge distillation (KD) has emerged as a promising technique in deep learning, typically employed to enhance a compact student network through learning from their high-performance but more complex teacher variant. When applied in the context of image super-resolution, most KD approaches are modified versions of methods developed for other computer vision tasks, which are based on training stra…

    Submitted 15 April, 2024; originally announced April 2024.

  15. arXiv:2403.19001  [pdf, other]

    cs.CV cs.AI eess.IV q-bio.NC

    Cross-domain Fiber Cluster Shape Analysis for Language Performance Cognitive Score Prediction

    Authors: Yui Lo, Yuqian Chen, Dongnan Liu, Wan Liu, Leo Zekelman, Fan Zhang, Yogesh Rathi, Nikos Makris, Alexandra J. Golby, Weidong Cai, Lauren J. O'Donnell

    Abstract: Shape plays an important role in computer graphics, offering informative features to convey an object's morphology and functionality. Shape analysis in brain imaging can help interpret structural and functionality correlations of the human brain. In this work, we investigate the shape of the brain's 3D white matter connections and its potential predictive relationship to human cognitive function.…

    Submitted 29 March, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: 2 figures, 11 pages

  16. arXiv:2403.11074  [pdf, other]

    cs.CV cs.AI cs.MM cs.SD eess.AS

    Audio-Visual Segmentation via Unlabeled Frame Exploitation

    Authors: Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Ya Zhang, Yanfeng Wang

    Abstract: Audio-visual segmentation (AVS) aims to segment the sounding objects in video frames. Although great progress has been witnessed, we experimentally reveal that current methods reach marginal performance gain within the use of the unlabeled frames, leading to the underutilization issue. To fully explore the potential of the unlabeled frames for AVS, we explicitly divide them into two categories bas…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  17. arXiv:2403.10805  [pdf, other]

    cs.SD cs.AI cs.CV cs.GR cs.HC eess.AS

    Speech-driven Personalized Gesture Synthetics: Harnessing Automatic Fuzzy Feature Inference

    Authors: Fan Zhang, Zhaohan Wang, Xin Lyu, Siyuan Zhao, Mengjian Li, Weidong Geng, Naye Ji, Hui Du, Fuxing Gao, Hao Wu, Shunman Li

    Abstract: Speech-driven gesture generation is an emerging field within virtual human creation. However, a significant challenge lies in accurately determining and processing the multitude of input features (such as acoustic, semantic, emotional, personality, and even subtle unknown features). Traditional approaches, reliant on various explicit feature inputs and complex multimodal processing, constrain the…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: 12 pages,

  18. arXiv:2402.11356  [pdf, other]

    eess.SP

    Experimental Study of Spatial Statistics for Ultra-Reliable Communications

    Authors: Tobias Kallehauge, Anders E. Kalør, Fengchun Zhang, Petar Popovski

    Abstract: This paper presents an experimental validation for prediction of rare fading events using channel distribution information (CDI) maps that predict channel statistics from measurements acquired at surrounding locations using spatial interpolation. Using experimental channel measurements from 127 locations, we demonstrate the use case of providing statistical guarantees for rate selection in ultra-r…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: Accepted for IEEE International Conference on Communications (ICC) in june 2024

  19. arXiv:2402.01596  [pdf, other]

    eess.IV cs.CV

    Immersive Video Compression using Implicit Neural Representations

    Authors: Ho Man Kwan, Fan Zhang, Andrew Gower, David Bull

    Abstract: Recent work on implicit neural representations (INRs) has evidenced their potential for efficiently representing and encoding conventional video content. In this paper we, for the first time, extend their application to immersive (multi-view) videos, by proposing MV-HiNeRV, a new INR-based immersive video codec. MV-HiNeRV is an enhanced version of a state-of-the-art INR-based video codec, HiNeRV,…

    Submitted 23 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  20. arXiv:2401.05915  [pdf, other]

    eess.IV

    Neural Implicit Surface Reconstruction of Freehand 3D Ultrasound Volume with Geometric Constraints

    Authors: Hongbo Chen, Logiraj Kumaralingam, Shuhang Zhang, Sheng Song, Fayi Zhang, Haibin Zhang, Thanh-Tu Pham, Edmond H. M. Lou, Kumaradevan Punithakumar, Yuyao Zhang, Lawrence H. Le, Rui Zheng

    Abstract: Three-dimensional (3D) freehand ultrasound (US) is a widely used imaging modality that allows non-invasive imaging of medical anatomy without radiation exposure. Surface reconstruction of US volume is vital to acquire the accurate anatomical structures needed for modeling, registration, and visualization. However, traditional methods cannot produce a high-quality surface due to image noise. Despit…

    Submitted 11 July, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: Preprint

  21. arXiv:2401.04579  [pdf]

    q-bio.QM cs.AI eess.IV

    A Deep Network for Explainable Prediction of Non-Imaging Phenotypes using Anatomical Multi-View Data

    Authors: Yuxiang Wei, Yuqian Chen, Tengfei Xue, Leo Zekelman, Nikos Makris, Yogesh Rathi, Weidong Cai, Fan Zhang, Lauren J. O'Donnell

    Abstract: Large datasets often contain multiple distinct feature sets, or views, that offer complementary information that can be exploited by multi-view learning methods to improve results. We investigate anatomical multi-view data, where each brain anatomical structure is described with multiple feature sets. In particular, we focus on sets of white matter microstructure and connectivity features from dif…

    Submitted 13 January, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: 2023 The Medical Image Computing and Computer Assisted Intervention Society workshop

  22. arXiv:2401.03396  [pdf]

    eess.SP

    A Closed-loop Brain-Machine Interface SoC Featuring a 0.2$μ$J/class Multiplexer Based Neural Network

    Authors: Chao Zhang, Yongxiang Guo, Dawid Sheng, Zhixiong Ma, Chao Sun, Yuwei Zhang, Wenxin Zhao, Fenyan Zhang, Tongfei Wang, Xing Sheng, Milin Zhang

    Abstract: This work presents the first fabricated electrophysiology-optogenetic closed-loop bidirectional brain-machine interface (CL-BBMI) system-on-chip (SoC) with electrical neural signal recording, on-chip sleep staging and optogenetic stimulation. The first multiplexer with static assignment based table lookup solution (MUXnet) for multiplier-free NN processor was proposed. A state-of-the-art average a…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: 2 pages, 6 figures. Accepted by IEEE Custom Integrated Circuits Conference (CICC) 2024. The codes for the MUXnet (constructing neural networks using multiplexers instead of multipliers) will be open-sourced after the Journal version of this work is accepted

  23. arXiv:2401.00523  [pdf, other]

    eess.IV cs.CV

    Compressing Deep Image Super-resolution Models

    Authors: Yuxuan Jiang, Jakub Nawala, Fan Zhang, David Bull

    Abstract: Deep learning techniques have been applied in the context of image super-resolution (SR), achieving remarkable advances in terms of reconstruction performance. Existing techniques typically employ highly complex model structures which result in large model sizes and slow inference speeds. This often leads to high energy consumption and restricts their adoption for practical applications. To addres…

    Submitted 21 February, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

  24. arXiv:2312.14364  [pdf, other]

    eess.SY

    GreenScan: Towards large-scale terrestrial monitoring the health of urban trees using mobile sensing

    Authors: Akshit Gupta, Simone Mora, Fan Zhang, Martine Rutten, R. Venkatesha Prasad, Carlo Ratti

    Abstract: Healthy urban greenery is a fundamental asset to mitigate climate change phenomena such as extreme heat and air pollution. However, urban trees are often affected by abiotic and biotic stressors that hamper their functionality, and whenever not timely managed, even their survival. While the current greenery inspection techniques can help in taking effective measures, they often require a high amou…

    Submitted 6 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: 13 pages, submitted to IEEE Sensors

  25. Full-reference Video Quality Assessment for User Generated Content Transcoding

    Authors: Zihao Qi, Chen Feng, Duolikun Danier, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull

    Abstract: Unlike video coding for professional content, the delivery pipeline of User Generated Content (UGC) involves transcoding where unpristine reference content needs to be compressed repeatedly. In this work, we observe that existing full-/no-reference quality metrics fail to accurately predict the perceptual quality difference between transcoded UGC content and the corresponding unpristine references…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 5 pages, 4 figures

  26. arXiv:2312.08864  [pdf, other]

    eess.IV cs.CV

    RankDVQA-mini: Knowledge Distillation-Driven Deep Video Quality Assessment

    Authors: Chen Feng, Duolikun Danier, Haoran Wang, Fan Zhang, Benoit Vallade, Alex Mackin, David Bull

    Abstract: Deep learning-based video quality assessment (deep VQA) has demonstrated significant potential in surpassing conventional metrics, with promising improvements in terms of correlation with human perception. However, the practical deployment of such deep VQA models is often limited due to their high computational complexity and large memory requirements. To address this issue, we aim to significantl…

    Submitted 7 March, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: The paper has been accepted by Picture Coding Symposium (PCS) 2024

  27. Accelerating Learnt Video Codecs with Gradient Decay and Layer-wise Distillation

    Authors: Tianhao Peng, Ge Gao, Heming Sun, Fan Zhang, David Bull

    Abstract: In recent years, end-to-end learnt video codecs have demonstrated their potential to compete with conventional coding algorithms in term of compression efficiency. However, most learning-based video compression models are associated with high computational complexity and latency, in particular at the decoder side, which limits their deployment in practical applications. In this paper, we present a…

    Submitted 5 December, 2023; originally announced December 2023.

    Report number: 2312.02605

  28. arXiv:2311.14935  [pdf]

    eess.IV

    A Novel Deep Clustering Framework for Fine-Scale Parcellation of Amygdala Using dMRI Tractography

    Authors: Haolin He, Ce Zhu, Le Zhang, Yipeng Liu, Xiao Xu, Yuqian Chen, Leo Zekelman, Jarrett Rushmore, Yogesh Rathi, Nikos Makris, Lauren J. O'Donnell, Fan Zhang

    Abstract: The amygdala plays a vital role in emotional processing and exhibits structural diversity that necessitates fine-scale parcellation for a comprehensive understanding of its anatomico-functional correlations. Diffusion MRI tractography is an advanced imaging technique that can estimate the brain's white matter structural connectivity to potentially reveal the topography of the amygdala for studying…

    Submitted 25 November, 2023; originally announced November 2023.

  29. arXiv:2311.08415  [pdf]

    eess.IV physics.optics

    Scanning phase imaging without accurate positioning system

    Authors: Tao Liu, Bingyang Wang, JiangTao Zhao, Fu rong Chen, Fucai Zhang

    Abstract: Ptychography, a high-resolution phase imaging technique using precise in-plane translation information, has been widely applied in modern synchrotron radiation sources across the globe. A key requirement for successful ptychographic reconstruction is the precise knowledge of the scanning positions, which are typically obtained by a physical interferometric positioning system. Whereas high-throughp…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: 9 pages,4 figures

  30. arXiv:2311.06276  [pdf, other]

    eess.IV cs.CV

    Enhancing the machine vision performance with multi-spectral light sources

    Authors: Feng Zhang, Rui Bao, Congqi Dai, Wanlu Zhang, Shu Liu, Ruiqian Guo

    Abstract: This study mainly focuses on the performance of different multi-spectral light sources on different object colors in machine vision and tries to enhance machine vision with multi-spectral light sources. Using different color pencils as samples, by recognizing the collected images with two classical neural networks, AlexNet and VGG19, the performance was investigated under 35 different multi-spectr…

    Submitted 20 October, 2023; originally announced November 2023.

    Comments: 12 pages, 7 figures

  31. arXiv:2311.04483  [pdf, other]

    eess.SP

    Cross-Domain Dual-Functional OFDM Waveform Design for Accurate Sensing/Positioning

    Authors: Fan Zhang, Tianqi Mao, Ruiqi Liu, Zhu Han, Sheng Chen, Zhaocheng Wang

    Abstract: Orthogonal frequency division multiplexing (OFDM) has been widely recognized as the representative waveform for 5G wireless networks, which can directly support sensing/positioning with existing infrastructure. To guarantee superior sensing/positioning accuracy while supporting high-speed communications simultaneously, the dual functions tend to be assigned with different resource elements (REs) d…

    Submitted 19 March, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

  32. arXiv:2311.02894  [pdf]

    eess.SY

    Design and Performance Analysis of a Class of Generalized Predictive Controllers

    Authors: Feilong Zhang

    Abstract: The design and structure of generalized predictive control (GPC) are not simple and intuitive. The performance analysis does not deeply analyze how the controller parameters affect the system characteristics and the relationship between the tracking error caused by the noise and the selected controller parameters. This paper proposes a generalized predictive control, and its design is simple and i…

    Submitted 6 November, 2023; originally announced November 2023.

  33. arXiv:2311.02139  [pdf]

    physics.bio-ph eess.IV physics.optics

    Broadband ptychographic imaging of biological samples using a deconvolution algorithm

    Authors: Huixiang Lin, Fucai Zhang

    Abstract: Ptychography is an attractive advance of coherent diffraction imaging (CDI), which can provide high lateral resolution and wide field of view. The theoretical resolution of ptychography is dose-limited, therefore making ptychography workable with a broadband source will be highly beneficial. However, broad spectra of light source conflict with the high coherence assumption in CDI that the current…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: 6 pages, 2 figures

  34. arXiv:2310.17190  [pdf, other]

    cs.CV eess.IV

    Lookup Table meets Local Laplacian Filter: Pyramid Reconstruction Network for Tone Mapping

    Authors: Feng Zhang, Ming Tian, Zhiqiang Li, Bin Xu, Qingbo Lu, Changxin Gao, Nong Sang

    Abstract: Tone mapping aims to convert high dynamic range (HDR) images to low dynamic range (LDR) representations, a critical task in the camera imaging pipeline. In recent years, 3-Dimensional LookUp Table (3D LUT) based methods have gained attention due to their ability to strike a favorable balance between enhancement performance and computational efficiency. However, these methods often fail to deliver…

    Submitted 3 January, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: 12 pages, 6 figures, accepted by NeurlPS 2023

  35. arXiv:2310.14515  [pdf]

    physics.optics eess.IV

    First realization of macroscopic Fourier ptychography for hundred-meter distance sub-diffraction imaging

    Authors: Qi Zhang, Yuran Lu, Yinghui Guo, Yingjie Shang, Mingbo Pu, Yulong Fan, Rui Zhou, Xiaoyin Li, Fei Zhang, Mingfeng Xu, Xiangang Luo

    Abstract: Fourier ptychography (FP) imaging, drawing on the idea of synthetic aperture, has been demonstrated as a potential approach for remote sub-diffraction-limited imaging. Nevertheless, the farthest imaging distance is still limited around 10 m even though there has been a significant improvement in macroscopic FP. The most severely issue in increasing the imaging distance is FoV limitation caused by…

    Submitted 22 October, 2023; originally announced October 2023.

  36. arXiv:2310.08089  [pdf, other]

    cs.GT eess.SY stat.ML

    Learning Regularized Monotone Graphon Mean-Field Games

    Authors: Fengzhuo Zhang, Vincent Y. F. Tan, Zhaoran Wang, Zhuoran Yang

    Abstract: This paper studies two fundamental problems in regularized Graphon Mean-Field Games (GMFGs). First, we establish the existence of a Nash Equilibrium (NE) of any $λ$-regularized GMFG (for $λ\geq 0$). This result relies on weaker conditions than those in previous works for analyzing both unregularized GMFGs ($λ=0$) and $λ$-regularized MFGs, which are special cases of GMFGs. Second, we propose provab…

    Submitted 12 October, 2023; originally announced October 2023.

  37. arXiv:2309.11715   

    cs.CV eess.IV

    Deshadow-Anything: When Segment Anything Model Meets Zero-shot shadow removal

    Authors: Xiao Feng Zhang, Tian Yi Song, Jia Wei Yao

    Abstract: Segment Anything (SAM), an advanced universal image segmentation model trained on an expansive visual dataset, has set a new benchmark in image segmentation and computer vision. However, it faced challenges when it came to distinguishing between shadows and their backgrounds. To address this, we developed Deshadow-Anything, considering the generalization of large-scale datasets, and we performed F…

    Submitted 2 January, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: it needs revised

  38. arXiv:2309.11109  [pdf, other]

    cs.CV eess.IV

    Self-supervised Domain-agnostic Domain Adaptation for Satellite Images

    Authors: Fahong Zhang, Yilei Shi, Xiao Xiang Zhu

    Abstract: Domain shift caused by, e.g., different geographical regions or acquisition conditions is a common issue in machine learning for global scale satellite image processing. A promising method to address this problem is domain adaptation, where the training and the testing datasets are split into two or multiple domains according to their distributions, and an adaptation method is applied to improve t…

    Submitted 25 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  39. Snapp: An Agile Robotic Fish with 3-D Maneuverability for Open Water Swim

    Authors: Timothy J. K. Ng, Nan Chen, Fu Zhang

    Abstract: Fish exhibit impressive locomotive performance and agility in complex underwater environments, using their undulating tails and pectoral fins for propulsion and maneuverability. Replicating these abilities in robotic fish is challenging; existing designs focus on either fast swimming or directional control at limited speeds, mainly within a confined environment. To address these limitations, we de…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: 8 pages, 17 figures, to be publish in IEEE Robotics and Automation Letters The accompanying video can be found at this link: https://youtu.be/1bGmlN0Jriw

  40. arXiv:2308.05995   

    cs.SD cs.AI cs.GR cs.MM eess.AS

    Audio is all in one: speech-driven gesture synthetics using WavLM pre-trained model

    Authors: Fan Zhang, Naye Ji, Fuxing Gao, Siyuan Zhao, Zhaohan Wang, Shunman Li

    Abstract: The generation of co-speech gestures for digital humans is an emerging area in the field of virtual human creation. Prior research has made progress by using acoustic and semantic information as input and adopting classify method to identify the person's ID and emotion for driving co-speech gesture generation. However, this endeavour still faces significant challenges. These challenges go beyond t…

    Submitted 13 April, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

    Comments: This article needs major revision

  41. arXiv:2308.05862  [pdf, other]

    eess.IV cs.AI cs.CV

    Unleashing the Strengths of Unlabeled Data in Pan-cancer Abdominal Organ Quantification: the FLARE22 Challenge

    Authors: Jun Ma, Yao Zhang, Song Gu, Cheng Ge, Shihao Ma, Adamo Young, Cheng Zhu, Kangkang Meng, Xin Yang, Ziyan Huang, Fan Zhang, Wentao Liu, YuanKe Pan, Shoujin Huang, Jiacheng Wang, Mingze Sun, Weixin Xu, Dengqiang Jia, Jae Won Choi, Natália Alves, Bram de Wilde, Gregor Koehler, Yajun Wu, Manuel Wiesenfarth, Qiongjie Zhu , et al. (4 additional authors not shown)

    Abstract: Quantitative organ assessment is an essential step in automated abdominal disease diagnosis and treatment planning. Artificial intelligence (AI) has shown great potential to automatize this process. However, most existing AI algorithms rely on many expert annotations and lack a comprehensive evaluation of accuracy and efficiency in real-world multinational settings. To overcome these limitations,…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: MICCAI FLARE22: https://flare22.grand-challenge.org/

  42. arXiv:2307.16508  [pdf, other]

    cs.CV cs.MM eess.IV

    Towards General Low-Light Raw Noise Synthesis and Modeling

    Authors: Feng Zhang, Bin Xu, Zhiqiang Li, Xinran Liu, Qingbo Lu, Changxin Gao, Nong Sang

    Abstract: Modeling and synthesizing low-light raw noise is a fundamental problem for computational photography and image processing applications. Although most recent works have adopted physics-based models to synthesize noise, the signal-independent noise in low-light conditions is far more complicated and varies dramatically across camera sensors, which is beyond the description of these models. To addres…

    Submitted 17 August, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: 11 pages, 7 figures. Accepted by ICCV 2023

    Journal ref: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 10820-10830

  43. arXiv:2307.07713  [pdf, other]

    eess.SY cs.RO

    Data-Driven Optimal Control of Tethered Space Robot Deployment with Learning Based Koopman Operator

    Authors: Ao Jin, Fan Zhang, Panfeng Huang

    Abstract: To avoid complex constraints of the traditional nonlinear method for tethered space robot (TSR) deployment, this paper proposes a data-driven optimal control framework with an improved deep learning based Koopman operator that could be applied to complex environments. In consideration of TSR's nonlinearity, its finite dimensional lifted representation is derived with the state-dependent only embed…

    Submitted 15 July, 2023; originally announced July 2023.

    Comments: 10 pages, 10 figures

  44. arXiv:2307.07366  [pdf, other]

    eess.IV

    Reconstructing Three-decade Global Fine-Grained Nighttime Light Observations by a New Super-Resolution Framework

    Authors: Jinyu Guo, Feng Zhang, Hang Zhao, Baoxiang Pan, Linlu Mei

    Abstract: Satellite-collected nighttime light provides a unique perspective on human activities, including urbanization, population growth, and epidemics. Yet, long-term and fine-grained nighttime light observations are lacking, leaving the analysis and applications of decades of light changes in urban facilities undeveloped. To fill this gap, we developed an innovative framework and used it to design a new…

    Submitted 14 July, 2023; originally announced July 2023.

  45. arXiv:2307.00861  [pdf, other]

    cs.RO eess.SY

    Perch a quadrotor on planes by the ceiling effect

    Authors: Yuying Zou, Haotian Li, Yunfan Ren, Wei Xu, Yihang Li, Yixi Cai, Shenji Zhou, Fu Zhang

    Abstract: Perching is a promising solution for a small unmanned aerial vehicle (UAV) to save energy and extend operation time. This paper proposes a quadrotor that can perch on planar structures using the ceiling effect. Compared with the existing work, this perching method does not require any claws, hooks, or adhesive pads, leading to a simpler system design. This method does not limit the perching by sur…

    Submitted 3 July, 2023; originally announced July 2023.

  46. HiNeRV: Video Compression with Hierarchical Encoding-based Neural Representation

    Authors: Ho Man Kwan, Ge Gao, Fan Zhang, Andrew Gower, David Bull

    Abstract: Learning-based video compression is currently a popular research topic, offering the potential to compete with conventional standard video codecs. In this context, Implicit Neural Representations (INRs) have previously been used to represent and compress image and video content, demonstrating relatively high decoding speed compared to other methods. However, existing INR-based methods have failed…

    Submitted 26 January, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

  47. arXiv:2306.09361  [pdf, other]

    eess.AS cs.CL cs.SD

    MFSN: Multi-perspective Fusion Search Network For Pre-training Knowledge in Speech Emotion Recognition

    Authors: Haiyang Sun, Fulin Zhang, Yingying Gao, Zheng Lian, Shilei Zhang, Junlan Feng

    Abstract: Speech Emotion Recognition (SER) is an important research topic in human-computer interaction. Many recent works focus on directly extracting emotional cues through pre-trained knowledge, frequently overlooking considerations of appropriateness and comprehensiveness. Therefore, we propose a novel framework for pre-training knowledge in SER, called Multi-perspective Fusion Search Network (MFSN). Co…

    Submitted 26 June, 2024; v1 submitted 12 June, 2023; originally announced June 2023.

  48. RGBlimp: Robotic Gliding Blimp -- Design, Modeling, Development, and Aerodynamics Analysis

    Authors: Hao Cheng, Zeyu Sha, Yongjian Zhu, Feitian Zhang

    Abstract: A miniature robotic blimp, as one type of lighter-than-air aerial vehicle, has attracted increasing attention in the science and engineering field for its long flight duration and safe aerial locomotion. While a variety of miniature robotic blimps have been developed over the past decade, most of them utilize the buoyant lift and neglect the aerodynamic lift in their design, thus leading to a medi…

    Submitted 20 October, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 11, pp. 7273-7280, Nov. 2023

  49. arXiv:2305.07234  [pdf, other]

    eess.SP

    Doppler-Resilient Design of CAZAC Sequences for mmWave/THz Sensing Applications

    Authors: Fan Zhang, Tianqi Mao, Zhaocheng Wang

    Abstract: Ultra-high-resolution target sensing has emerged as a key enabler for various cutting-edge applications, which can be realized by utilizing the millimeter wave/terahertz frequencies. However, the extremely high operating frequency inevitably leads to significant Doppler shift effects, especially for high-mobility applications, causing the degradation of sensing performance with high false alarm ra…

    Submitted 12 May, 2023; originally announced May 2023.

  50. arXiv:2304.13096  [pdf, other]

    cs.RO eess.SY

    Real-time Autonomous Glider Navigation Software

    Authors: Ruochu Yang, Mengxue Hou, Chad Lembke, Catherine Edwards, Fumin Zhang

    Abstract: Underwater gliders are widely utilized for ocean sampling, surveillance, and other various oceanic applications. In the context of complex ocean environments, gliders may yield poor navigation performance due to strong ocean currents, thus requiring substantial human effort during the manual piloting process. To enhance navigation accuracy, we developed a real-time autonomous glider navigation sof…

    Submitted 20 December, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: OCEANS 2023 Limerick