-
Synchronous Multi-modal Semantic Communication System with Packet-level Coding
Authors:
Yun Tian,
Jingkai Ying,
Zhijin Qin,
Ye Jin,
Xiaoming Tao
Abstract:
Although semantic communication with joint semantic-channel coding design has shown promising performance in transmitting data of different modalities over physical layer channels, the synchronization and packet-level forward error correction of multimodal semantics have not been well studied. Due to the independent design of semantic encoders, synchronizing multimodal features in both the semantic and time domains is a challenging problem. In this paper, we take facial video and speech transmission as an example and propose a Synchronous Multimodal Semantic Communication System (SyncSC) with Packet-Level Coding. To achieve semantic and time synchronization, 3D Morphable Model (3DMM) coefficients and text are transmitted as semantics, and we propose a semantic codec that achieves similar quality of reconstruction and synchronization with lower bandwidth, compared to traditional methods. To protect semantic packets under the erasure channel, we propose a packet-level Forward Error Correction (FEC) method, called PacSC, that maintains a certain visual quality performance even at high packet loss rates. Particularly, for text packets, a text packet loss concealment module, called TextPC, based on Bidirectional Encoder Representations from Transformers (BERT) is proposed, which significantly improves the performance of traditional FEC methods. The simulation results show that our proposed SyncSC reduces transmission overhead and achieves high-quality synchronous transmission of video and speech over the packet loss network.
Submitted 10 August, 2024; v1 submitted 8 August, 2024;
originally announced August 2024.
-
A Secure and Efficient Distributed Semantic Communication System for Heterogeneous Internet of Things Devices
Authors:
Weihao Zeng,
Xinyu Xu,
Qianyun Zhang,
Jiting Shi,
Zhijin Qin,
Zhenyu Guan
Abstract:
Semantic communications have emerged as a promising solution to address the challenge of efficient communication in rapidly evolving and increasingly complex Internet of Things (IoT) networks. However, protecting the security of semantic communication systems within distributed and heterogeneous IoT networks is a critical issue that needs to be addressed. We develop a secure and efficient distributed semantic communication system in IoT scenarios, focusing on three aspects: secure system maintenance, efficient system update, and privacy-preserving system usage. Firstly, we propose a blockchain-based interaction framework that ensures the integrity, authentication, and availability of interactions among IoT devices to securely maintain the system. This framework includes a novel digital signature verification mechanism designed for semantic communications, enabling secure and efficient interactions with semantic communications. Secondly, to improve the efficiency of interactions, we develop a flexible semantic communication scheme that leverages compressed semantic knowledge bases. This scheme reduces the data exchange required for system update and adapts to dynamic task requirements and the diversity of device capabilities. Thirdly, we exploit the integration of differential privacy into semantic communications. We analyze the implementation of differential privacy taking into account the lossy nature of semantic communications and wireless channel distortions. A joint model-channel noise mechanism is introduced to achieve differential privacy preservation in semantic communications without compromising the system's functionality. Experiments show that the system is able to achieve integrity, availability, efficiency and the preservation of privacy.
Submitted 19 July, 2024;
originally announced July 2024.
-
DreamVoice: Text-Guided Voice Conversion
Authors:
Jiarui Hai,
Karan Thakkar,
Helin Wang,
Zengyi Qin,
Mounya Elhilali
Abstract:
Generative voice technologies are rapidly evolving, offering opportunities for more personalized and inclusive experiences. Traditional one-shot voice conversion (VC) requires a target recording during inference, limiting ease of use in generating desired voice timbres. Text-guided generation offers an intuitive solution to convert voices to desired "DreamVoices" according to the users' needs. Our paper presents two major contributions to VC technology: (1) DreamVoiceDB, a robust dataset of voice timbre annotations for 900 speakers from VCTK and LibriTTS; and (2) two text-guided VC methods: DreamVC, an end-to-end diffusion-based text-guided VC model, and DreamVG, a versatile text-to-voice generation plugin that can be combined with any one-shot VC model. The experimental results demonstrate that our proposed methods trained on the DreamVoiceDB dataset generate voice timbres accurately aligned with the text prompt and achieve high-quality VC.
Submitted 24 June, 2024;
originally announced June 2024.
-
Computational and Statistical Guarantees for Tensor-on-Tensor Regression with Tensor Train Decomposition
Authors:
Zhen Qin,
Zhihui Zhu
Abstract:
Recently, a tensor-on-tensor (ToT) regression model has been proposed to generalize tensor recovery, encompassing scenarios like scalar-on-tensor regression and tensor-on-vector regression. However, the exponential growth in tensor complexity poses challenges for storage and computation in ToT regression. To overcome this hurdle, tensor decompositions have been introduced, with the tensor train (TT)-based ToT model proving efficient in practice due to reduced memory requirements, enhanced computational efficiency, and decreased sampling complexity. Despite these practical benefits, a disparity exists between theoretical analysis and real-world performance. In this paper, we delve into the theoretical and algorithmic aspects of the TT-based ToT regression model. Assuming the regression operator satisfies the restricted isometry property (RIP), we conduct an error analysis for the solution to a constrained least-squares optimization problem. This analysis includes an upper error bound and a minimax lower bound, revealing that such error bounds depend polynomially on the order $N+M$. To efficiently find solutions meeting such error bounds, we propose two optimization algorithms: the iterative hard thresholding (IHT) algorithm (employing gradient descent with TT-singular value decomposition (TT-SVD)) and the factorization approach using the Riemannian gradient descent (RGD) algorithm. When RIP is satisfied, spectral initialization facilitates proper initialization, and we establish the linear convergence rate of both IHT and RGD.
Submitted 9 June, 2024;
originally announced June 2024.
-
Hybrid Digital-Analog Semantic Communications
Authors:
Huiqiang Xie,
Zhijin Qin,
Zhu Han,
Khaled B. Letaief
Abstract:
Digital and analog semantic communications (SemCom) face inherent limitations such as data security concerns in analog SemCom, as well as leveling-off and cliff-edge effects in digital SemCom. In order to overcome these challenges, we propose a novel SemCom framework and a corresponding system called HDA-DeepSC, which leverages a hybrid digital-analog approach for multimedia transmission. This is achieved through the introduction of digital-analog allocation and fusion modules. To strike a balance between data rate and distortion, we design new loss functions that take into account long-distance dependencies in the semantic distortion constraint, essential information recovery in the channel distortion constraint, and optimal bit stream generation in the rate constraint. Additionally, we propose denoising diffusion-based signal detection techniques, which involve carefully designed variance schedules and sampling algorithms to refine transmitted signals. Through extensive numerical experiments, we demonstrate that HDA-DeepSC exhibits robustness to channel variations and is capable of supporting various communication scenarios. Our proposed framework outperforms existing benchmarks in terms of peak signal-to-noise ratio and multi-scale structural similarity, showcasing its superiority in semantic communication quality.
Submitted 27 May, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
-
Semantic MIMO Systems for Speech-to-Text Transmission
Authors:
Zhenzi Weng,
Zhijin Qin,
Huiqiang Xie,
Xiaoming Tao,
Khaled B. Letaief
Abstract:
Semantic communications have been utilized to execute numerous intelligent tasks by transmitting task-related semantic information instead of bits. In this article, we propose a semantic-aware speech-to-text transmission system for the single-user multiple-input multiple-output (MIMO) and multi-user MIMO communication scenarios, named SAC-ST. Particularly, a semantic communication system to serve the speech-to-text task at the receiver is first designed, which compresses the semantic information and generates the low-dimensional semantic features by leveraging the transformer module. In addition, a novel semantic-aware network is proposed to facilitate transmission with high semantic fidelity by identifying the critical semantic information and guaranteeing that it is recovered accurately. Furthermore, we extend the SAC-ST with a neural network-enabled channel estimation network to mitigate the dependence on accurate channel state information and validate the feasibility of SAC-ST in practical communication environments. Simulation results show that the proposed SAC-ST outperforms the communication framework without the semantic-aware network for speech-to-text transmission over the MIMO channels in terms of the speech-to-text metrics, especially in the low signal-to-noise regime. Moreover, the SAC-ST with the developed channel estimation network is comparable to the SAC-ST with perfect channel state information.
Submitted 13 May, 2024;
originally announced May 2024.
-
Hybrid Bit and Semantic Communications
Authors:
Kaiwen Yu,
Renhe Fan,
Gang Wu,
Zhijin Qin
Abstract:
Semantic communication technology is regarded as a method surpassing the Shannon limit of bit transmission, capable of effectively enhancing transmission efficiency. However, current approaches that directly map content to transmission symbols are challenging to deploy in practice, imposing significant limitations on the development of semantic communication. To address this challenge, we propose a hybrid bit and semantic communication system, named HybridBSC, in which encoded semantic information is inserted into bit information for transmission via conventional digital communication systems utilizing the same spectrum resources. The system can be easily deployed using existing communication architecture to achieve bit and semantic information transmission. Particularly, we design a semantic insertion and extraction scheme to implement this strategy. Furthermore, we conduct experimental validation based on the Pluto-based software-defined radio (SDR) platform in a real wireless channel, demonstrating that the proposed strategy can simultaneously transmit semantic and bit information.
Submitted 30 April, 2024;
originally announced April 2024.
-
Are Watermarks Bugs for Deepfake Detectors? Rethinking Proactive Forensics
Authors:
Xiaoshuai Wu,
Xin Liao,
Bo Ou,
Yuling Liu,
Zheng Qin
Abstract:
AI-generated content has accelerated the topic of media synthesis, particularly Deepfake, which can manipulate our portraits for positive or malicious purposes. Before releasing these threatening face images, one promising forensics solution is the injection of robust watermarks to track their provenance. However, we argue that current watermarking models, originally devised for genuine images, may harm the deployed Deepfake detectors when directly applied to forged images, since the watermarks are prone to overlap with the forgery signals used for detection. To bridge this gap, we propose AdvMark, on behalf of proactive forensics, to exploit the adversarial vulnerability of passive detectors for good. Specifically, AdvMark serves as a plug-and-play procedure for fine-tuning any robust watermarking into adversarial watermarking, to enhance the forensic detectability of watermarked images; meanwhile, the watermarks can still be extracted for provenance tracking. Extensive experiments demonstrate the effectiveness of the proposed AdvMark, leveraging robust watermarking to fool Deepfake detectors, which can help improve the accuracy of downstream Deepfake detection without tuning the in-the-wild detectors. We believe this work will shed some light on harmless proactive forensics against Deepfake.
Submitted 27 April, 2024;
originally announced April 2024.
-
A Robust Semantic Communication System for Image
Authors:
Xiang Peng,
Zhijin Qin,
Xiaoming Tao,
Jianhua Lu,
Khaled B. Letaief
Abstract:
Semantic communications have gained significant attention as a promising approach to address the transmission bottleneck, especially with the continuous development of 6G techniques. Distinct from the well-investigated physical channel impairments, this paper focuses on semantic impairments in images, particularly those arising from adversarial perturbations. Specifically, we propose a novel metric for quantifying the intensity of semantic impairment and develop a semantic impairment dataset. Furthermore, we introduce a deep learning enabled semantic communication system, termed DeepSC-RI, to enhance the robustness of image transmission, which incorporates a multi-scale semantic extractor with a dual-branch architecture for extracting semantics with varying granularity, thereby improving the robustness of the system. The fine-grained branch incorporates a semantic importance evaluation module to identify and prioritize crucial semantics, while the coarse-grained branch adopts a hierarchical approach for capturing the robust semantics. These two streams of semantics are seamlessly integrated via an advanced cross-attention-based semantic fusion module. Experimental results demonstrate the superior performance of DeepSC-RI under various levels of semantic impairment intensity.
Submitted 14 March, 2024;
originally announced March 2024.
-
Robust Semantic Communications for Speech Transmission
Authors:
Zhenzi Weng,
Zhijin Qin
Abstract:
In this paper, we propose a robust semantic communication system for speech transmission, named Ross-S2T, which delivers the essential semantic information. Particularly, we consider speech-to-text translation (S2TT) as the transmission goal. First, a deep semantic encoder is developed to directly convert speech in the source language to textual features associated with the target language, facilitating the end-to-end (E2E) semantic exchange to perform the S2TT task and reducing the transmitted data without performance degradation. To mitigate semantic impairments inherent in the corrupted speech, a novel generative adversarial network (GAN)-enabled deep semantic compensator is established to estimate the lost semantic information within the speech and extract deep semantic features simultaneously, which enables robust semantic transmission for corrupted speech. Furthermore, a semantic probe-aided compensator is devised to enhance the semantic fidelity of recovered semantic features and improve the understandability of the target text. According to simulation results, the proposed Ross-S2T exhibits superior S2TT performance compared to conventional approaches and high robustness against semantic impairments.
Submitted 25 April, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Computational Offloading in Semantic-Aware Cloud-Edge-End Collaborative Networks
Authors:
Zelin Ji,
Zhijin Qin
Abstract:
The trend of massive connectivity pushes forward the explosive growth of end devices. The emergence of various applications has prompted a demand for pervasive connectivity and more efficient computing paradigms. On the other hand, the lack of computational capacity of the end devices restricts the implementation of intelligent applications and becomes a bottleneck of the multiple access for supporting massive connectivity. Mobile cloud computing (MCC) and mobile edge computing (MEC) techniques enable end devices to offload local computation-intensive tasks to servers over networks. In this paper, we consider the cloud-edge-end collaborative networks to utilize distributed computing resources. Furthermore, we apply task-oriented semantic communications to tackle the fast-varying channel between the end devices and MEC servers and reduce the communication cost. To minimize long-term energy consumption subject to constraints on queue stability and computational delay, a Lyapunov-guided deep reinforcement learning hybrid (DRLH) framework is proposed to solve the mixed integer non-linear programming (MINLP) problem. The long-term energy consumption minimization problem is transformed into a deterministic problem in each time frame. The DRLH framework integrates a model-free deep reinforcement learning algorithm with a model-based mathematical optimization algorithm to mitigate computational complexity and leverage the scenario information, thereby improving the convergence performance. Numerical results demonstrate that the proposed DRLH framework achieves near-optimal performance on energy consumption while stabilizing all queues.
Submitted 19 May, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Towards Intelligent Communications: Large Model Empowered Semantic Communications
Authors:
Huiqiang Xie,
Zhijin Qin,
Xiaoming Tao,
Zhu Han
Abstract:
Deep learning enabled semantic communications have shown great potential to significantly improve transmission efficiency and alleviate spectrum scarcity, by effectively exchanging the semantics behind the data. Recently, the emergence of large models, boasting billions of parameters, has unveiled remarkable human-like intelligence, offering a promising avenue for advancing semantic communication by enhancing semantic and contextual understanding. This article systematically investigates large model-empowered semantic communication systems from potential applications to system design. First, we propose a new semantic communication architecture that seamlessly integrates large models into semantic communication through the introduction of a memory module. Then, the typical applications are illustrated to show the benefits of the new architecture. In addition, we discuss the key designs in implementing the new semantic communication systems from module design to system training. Finally, the potential research directions are identified to boost large model-empowered semantic communications.
Submitted 19 March, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery
Authors:
Zhen Qin,
Michael B. Wakin,
Zhihui Zhu
Abstract:
In this paper, we provide the first convergence guarantee for the factorization approach. Specifically, to avoid the scaling ambiguity and to facilitate theoretical analysis, we optimize over the so-called left-orthogonal TT format which enforces orthonormality among most of the factors. To ensure the orthonormal structure, we utilize the Riemannian gradient descent (RGD) for optimizing those factors over the Stiefel manifold. We first delve into the TT factorization problem and establish the local linear convergence of RGD. Notably, the rate of convergence only experiences a linear decline as the tensor order increases. We then study the sensing problem that aims to recover a TT format tensor from linear measurements. Assuming the sensing operator satisfies the restricted isometry property (RIP), we show that with a proper initialization, which could be obtained through spectral initialization, RGD also converges to the ground-truth tensor at a linear rate. Furthermore, we expand our analysis to encompass scenarios involving Gaussian noise in the measurements. We prove that RGD can reliably recover the ground truth at a linear rate, with the recovery error exhibiting only polynomial growth in relation to the tensor order. We conduct various experiments to validate our theoretical findings.
Submitted 4 January, 2024;
originally announced January 2024.
-
Federated Multi-View Synthesizing for Metaverse
Authors:
Yiyu Guo,
Zhijin Qin,
Xiaoming Tao,
Geoffrey Ye Li
Abstract:
The metaverse is expected to provide immersive entertainment, education, and business applications. However, virtual reality (VR) transmission over wireless networks is data- and computation-intensive, making it critical to introduce novel solutions that meet stringent quality-of-service requirements. With recent advances in edge intelligence and deep learning, we have developed a novel multi-view synthesizing framework that can efficiently provide computation, storage, and communication resources for wireless content delivery in the metaverse. We propose a three-dimensional (3D)-aware generative model that uses collections of single-view images. These single-view images are transmitted to a group of users with overlapping fields of view, which avoids massive content transmission compared to transmitting tiles or whole 3D models. We then present a federated learning approach to guarantee an efficient learning process. The training performance can be improved by characterizing the vertical and horizontal data samples with a large latent feature space, while low-latency communication can be achieved with a reduced number of transmitted parameters during federated learning. We also propose a federated transfer learning framework to enable fast domain adaptation to different target domains. Simulation results have demonstrated the effectiveness of our proposed federated multi-view synthesizing framework for VR content delivery.
Submitted 18 December, 2023;
originally announced January 2024.
-
OpenVoice: Versatile Instant Voice Cloning
Authors:
Zengyi Qin,
Wenliang Zhao,
Xumin Yu,
Xin Sun
Abstract:
We introduce OpenVoice, a versatile voice cloning approach that requires only a short audio clip from the reference speaker to replicate their voice and generate speech in multiple languages. OpenVoice represents a significant advancement in addressing the following open challenges in the field: 1) Flexible Voice Style Control. OpenVoice enables granular control over voice styles, including emotion, accent, rhythm, pauses, and intonation, in addition to replicating the tone color of the reference speaker. The voice styles are not directly copied from and constrained by the style of the reference speaker. Previous approaches lacked the ability to flexibly manipulate voice styles after cloning. 2) Zero-Shot Cross-Lingual Voice Cloning. OpenVoice achieves zero-shot cross-lingual voice cloning for languages not included in the massive-speaker training set. Unlike previous approaches, which typically require an extensive massive-speaker multi-lingual (MSML) dataset for all languages, OpenVoice can clone voices into a new language without any massive-speaker training data for that language. OpenVoice is also computationally efficient, costing tens of times less than commercially available APIs that offer even inferior performance. To foster further research in the field, we have made the source code and trained model publicly accessible. We also provide qualitative results on our demo website. Prior to its public release, our internal version of OpenVoice was used tens of millions of times by users worldwide between May and October 2023, serving as the backend of MyShell.
Submitted 2 January, 2024; v1 submitted 3 December, 2023;
originally announced December 2023.
-
An End-Cloud Computing Enabled Surveillance Video Transmission System
Authors:
Dingxi Yang,
Zhijin Qin,
Liting Wang,
Xiaoming Tao,
Fang Cui,
Hengjiang Wang
Abstract:
The enormous data volume of video poses a significant burden on the network. Particularly, transferring high-definition surveillance videos to the cloud consumes a significant amount of spectrum resources. To address these issues, we propose a surveillance video transmission system enabled by end-cloud computing. Specifically, the cameras actively down-sample the original video, and a redundant frame elimination module is then employed to further reduce the data volume of surveillance videos. We then develop a key-frame assisted video super-resolution model to reconstruct the high-quality video at the cloud side. Moreover, we propose a strategy of extracting key frames from source videos for better reconstruction performance by utilizing the peak signal-to-noise ratio (PSNR) of adjacent frames to measure the propagation distance of key frame information. Simulation results show that the developed system can effectively reduce the data volume through end-cloud collaboration and significantly outperforms existing video super-resolution models in terms of PSNR and structural similarity index (SSIM).
Submitted 8 November, 2023;
originally announced November 2023.
-
Compression Ratio Learning and Semantic Communications for Video Imaging
Authors:
Bowen Zhang,
Zhijin Qin,
Geoffrey Ye Li
Abstract:
Camera sensors have been widely used in intelligent robotic systems. Developing camera sensors with high sensing efficiency has always been important to reduce the power, memory, and other related resources. Inspired by recent success on programmable sensors and deep optic methods, we design a novel video compressed sensing system with spatially-variant compression ratios, which achieves higher imaging quality than the existing snapshot compressed imaging methods with the same sensing costs. In this article, we also investigate the data transmission methods for programmable sensors, where the performance of communication systems is evaluated by the reconstructed images or videos rather than the transmission of sensor data itself. Usually, different reconstruction algorithms are designed for applications in high dynamic range imaging, video compressive sensing, or motion debluring. This task-aware property inspires a semantic communication framework for programmable sensors. In this work, a policy-gradient based reinforcement learning method is introduced to achieve the explicit trade-off between the compression (or transmission) rate and the image distortion. Numerical results show the superiority of the proposed methods over existing baselines.
△ Less
Submitted 9 October, 2023;
originally announced October 2023.
-
Low-complexity eigenvector prediction-based precoding matrix prediction in massive MIMO with mobility
Authors:
Ziao Qin,
Haifan Yin,
Weidong Li
Abstract:
In practical massive multiple-input multiple-output (MIMO) systems, the precoding matrix is often obtained from the eigenvectors of channel matrices and is challenging to update in time due to finite computation resources at the base station, especially in mobile scenarios. In order to reduce the precoding complexity while enhancing the spectral efficiency (SE), a novel precoding matrix prediction method based on the eigenvector prediction (EGVP) is proposed. The basic idea is to decompose the periodic uplink channel eigenvector samples into a linear combination of the channel state information (CSI) and channel weights. We further prove that the channel weights can be interpolated by an exponential model corresponding to the Doppler characteristics of the CSI. A fast matrix pencil prediction (FMPP) method is also devised to predict the CSI. We also prove that our scheme achieves asymptotically error-free precoder prediction with a distinct complexity advantage. Simulation results show that under perfect non-delayed CSI, the proposed EGVP method reduces floating point operations by 80% without losing SE performance compared to the traditional full-time precoding scheme. In more realistic cases with CSI delays, the proposed EGVP-FMPP scheme has clear SE performance gains compared to the precoding scheme widely used in current communication systems.
Submitted 30 June, 2024; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Semantic-Aware Image Compressed Sensing
Authors:
Bowen Zhang,
Zhijin Qin,
Geoffrey Ye Li
Abstract:
Deep learning based image compressed sensing (CS) has achieved great success. However, existing CS systems mainly apply a fixed measurement matrix to all images, ignoring the fact that the optimal measurement numbers and bases are different for different images. To further improve the sensing efficiency, we propose a novel semantic-aware image CS system. In our system, the encoder first uses a fixed number of base CS measurements to sense different images. According to the base CS results, the encoder then employs a policy network to analyze the semantic information in images and determines the measurement matrix for different image areas. At the decoder side, a semantic-aware initial reconstruction network is developed to deal with the changes of measurement matrices used at the encoder. A rate-distortion training loss is further introduced to dynamically adjust the average compression ratio for the semantic-aware CS system, and the policy network is trained jointly with the encoder and the decoder in an end-to-end manner by using some proxy functions. Numerical results show that the proposed semantic-aware image CS system is superior to the traditional ones with fixed measurement matrices.
Submitted 10 July, 2023; v1 submitted 6 July, 2023;
originally announced July 2023.
-
Meta Federated Reinforcement Learning for Distributed Resource Allocation
Authors:
Zelin Ji,
Zhijin Qin,
Xiaoming Tao
Abstract:
In cellular networks, resource allocation is usually performed in a centralized way, which brings huge computation complexity to the base station (BS) and high transmission overhead. This paper explores a distributed resource allocation method that aims to maximize energy efficiency (EE) while ensuring the quality of service (QoS) for users. Specifically, in order to address wireless channel conditions, we propose a robust meta federated reinforcement learning (MFRL) framework that allows local users to optimize transmit power and assign channels using locally trained neural network models, so as to offload computational burden from the cloud server to the local users, reducing transmission overhead associated with local channel state information. The BS performs the meta learning procedure to initialize a general global model, enabling rapid adaptation to different environments with improved EE performance. The federated learning technique, based on decentralized reinforcement learning, promotes collaboration and mutual benefits among users. Analysis and numerical results demonstrate that the proposed MFRL framework accelerates the reinforcement learning process, decreases transmission overhead, and offloads computation, while outperforming the conventional decentralized reinforcement learning algorithm in terms of convergence speed and EE performance across various scenarios.
Submitted 9 July, 2023; v1 submitted 6 July, 2023;
originally announced July 2023.
-
Quantum State Tomography for Matrix Product Density Operators
Authors:
Zhen Qin,
Casey Jameson,
Zhexuan Gong,
Michael B. Wakin,
Zhihui Zhu
Abstract:
The reconstruction of quantum states from experimental measurements, often achieved using quantum state tomography (QST), is crucial for the verification and benchmarking of quantum devices. However, performing QST for a generic unstructured quantum state requires an enormous number of state copies that grows exponentially with the number of individual quanta in the system, even for optimal measurement settings. Fortunately, many physical quantum states, such as states generated by noisy, intermediate-scale quantum computers, are usually structured. In one dimension, such states are expected to be well approximated by matrix product operators (MPOs) with a finite matrix/bond dimension independent of the number of qubits, therefore enabling efficient state representation. Nevertheless, it is still unclear whether efficient QST can be performed for these states in general.
In this paper, we attempt to bridge this gap and establish theoretical guarantees for the stable recovery of MPOs using tools from compressive sensing and the theory of empirical processes. We begin by studying two types of random measurement settings: Gaussian measurements and Haar random rank-one Positive Operator Valued Measures (POVMs). We show that the information contained in an MPO with a finite bond dimension can be preserved using a number of random measurements that depends only linearly on the number of qubits, assuming no statistical error of the measurements. We then study MPO-based QST with physical quantum measurements through Haar random rank-one POVMs that can be implemented on quantum computers. We prove that only a polynomial number of state copies in the number of qubits is required to guarantee bounded recovery error of an MPO state.
Submitted 18 February, 2024; v1 submitted 15 June, 2023;
originally announced June 2023.
-
QoE-based Semantic-Aware Resource Allocation for Multi-Task Networks
Authors:
Lei Yan,
Zhijin Qin,
Chunfeng Li,
Rui Zhang,
Yongzhao Li,
Xiaoming Tao
Abstract:
By transmitting task-related information only, semantic communications yield significant performance gains over conventional communications. However, the lack of a mature theory for semantic information quantification and performance evaluation makes it challenging to perform resource allocation for semantic communications, especially when multiple tasks coexist in the network. To cope with this challenge, we propose a quality-of-experience (QoE) based semantic-aware resource allocation method for multi-task networks in this paper. First, semantic entropy is defined to quantify the semantic information for different tasks, and the relationship between semantic entropy and Shannon entropy is analyzed. Then, we develop a novel QoE model to formulate the semantic-aware resource allocation in terms of semantic compression, channel assignment, and transmit power. The compatibility of the formulated problem with conventional communications is further demonstrated. To solve this problem, we decouple it into two subproblems and solve them with a deep Q-network (DQN) based method and a proposed low-complexity matching algorithm, respectively. Finally, simulation results validate the effectiveness and superiority of the proposed method, as well as its compatibility with conventional communications.
Submitted 8 April, 2024; v1 submitted 10 May, 2023;
originally announced May 2023.
-
A manifold learning-based CSI feedback framework for FDD massive MIMO
Authors:
Yandi Cao,
Haifan Yin,
Ziao Qin,
Weidong Li,
Weimin Wu,
Merouane Debbah
Abstract:
Massive multi-input multi-output (MIMO) in Frequency Division Duplex (FDD) mode suffers from heavy feedback overhead for Channel State Information (CSI). In this paper, a novel manifold learning-based CSI feedback framework (MLCF) is proposed to reduce the feedback and improve the spectral efficiency of FDD massive MIMO. Manifold learning (ML) is an effective method for dimensionality reduction. However, most ML algorithms focus only on data compression, and lack the corresponding recovery methods. Moreover, the computational complexity is high when dealing with incremental data. To solve these problems, we propose a landmark selection algorithm to characterize the topological skeleton of the manifold where the CSI sample resides. Based on the learned skeleton, the local patch of the incremental CSI on the manifold can be easily determined by its nearest landmarks. This motivates us to propose a low-complexity compression and reconstruction scheme by keeping the local geometric relationships with landmarks unchanged. We theoretically prove the convergence of the proposed algorithm. Meanwhile, the upper bound on the error of approximating the CSI samples using landmarks is derived. Simulation results under an industrial channel model of 3GPP demonstrate that the proposed MLCF method outperforms existing algorithms based on compressed sensing and deep learning.
Submitted 27 April, 2023;
originally announced April 2023.
-
Dynamic Compressive Sensing based on RLS for Underwater Acoustic Communications
Authors:
Zhen Qin
Abstract:
Sparse structures are widely recognized and utilized in channel estimation. Two typical mechanisms, namely proportionate updating (PU) and zero-attracting (ZA) techniques, achieve better performance, but their computational complexity is higher than that of their non-sparse counterparts. In this paper, we propose a dynamic compressive sensing (DCS) technique based on the recursive least squares (RLS) algorithm which can simultaneously achieve improved performance and reduced computational complexity. Specifically, we develop the sparse adaptive subspace pursuit-improved RLS (SpAdSP-IRLS) algorithm by updating only the sparse structure in the IRLS to track significant coefficients. The complexity of the SpAdSP-IRLS algorithm is reduced to $\mathcal{O}(L^2+2L(s+1)+10s)$, compared with the order of $\mathcal{O}(3L^2+4L)$ for the standard RLS. Here, $L$ represents the length of the channel, and $s$ represents the size of the support set. Our experiments on both synthetic and real data show the superiority of the proposed SpAdSP-IRLS, even though only $s$ elements are updated in the channel estimation.
Submitted 4 May, 2023; v1 submitted 24 April, 2023;
originally announced April 2023.
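To get a feel for the two complexity orders quoted in the abstract above, the expressions can be evaluated for sample values. This is only a rough sketch: it treats the terms inside the big-O notation as if they were literal operation counts, which is an assumption, and the function names and the values of L and s are hypothetical, not taken from the paper.

```python
def sparse_rls_order(L, s):
    """Evaluate L^2 + 2L(s+1) + 10s, the order quoted for SpAdSP-IRLS."""
    return L ** 2 + 2 * L * (s + 1) + 10 * s


def standard_rls_order(L):
    """Evaluate 3L^2 + 4L, the order quoted for the standard RLS."""
    return 3 * L ** 2 + 4 * L


# Hypothetical channel length and support size, chosen only for illustration.
L, s = 100, 10
sparse = sparse_rls_order(L, s)      # 10000 + 2200 + 100 = 12300
standard = standard_rls_order(L)     # 30000 + 400 = 30400
print(sparse, standard, round(sparse / standard, 3))
```

With these illustrative values the sparse expression is roughly 40% of the standard one; the gap narrows as s grows towards L.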
-
Patching Neural Barrier Functions Using Hamilton-Jacobi Reachability
Authors:
Sander Tonkens,
Alex Toofanian,
Zhizhen Qin,
Sicun Gao,
Sylvia Herbert
Abstract:
Learning-based control algorithms have led to major advances in robotics at the cost of decreased safety guarantees. Recently, neural networks have also been used to characterize safety through the use of barrier functions for complex nonlinear systems. Learned barrier functions approximately encode and enforce a desired safety constraint through a value function, but do not provide any formal guarantees. In this paper, we propose a local dynamic programming (DP) based approach to "patch" an almost-safe learned barrier at potentially unsafe points in the state space. This algorithm, HJ-Patch, obtains a novel barrier that provides formal safety guarantees, yet retains the global structure of the learned barrier. Our local DP based reachability algorithm, HJ-Patch, updates the barrier function "minimally" at points that both (a) neighbor the barrier safety boundary and (b) do not satisfy the safety condition. We view this as a key step to bridging the gap between learning-based barrier functions and Hamilton-Jacobi reachability analysis, providing a framework for further integration of these approaches. We demonstrate that for well-trained barriers we reduce the computational load by 2 orders of magnitude with respect to standard DP-based reachability, and demonstrate scalability to a 6-dimensional system, which is at the limit of standard DP-based reachability.
Submitted 19 April, 2023;
originally announced April 2023.
-
Semantic Communication with Memory
Authors:
Huiqiang Xie,
Zhijin Qin,
Geoffrey Ye Li
Abstract:
While semantic communication transmits efficiently thanks to its strong capability to extract essential semantic information, it is still far from intelligent or human-like communication. In this paper, we introduce an essential component, memory, into semantic communications to mimic human communications. Particularly, we investigate a deep learning (DL) based semantic communication system with memory, named Mem-DeepSC, by considering the scenario question answering task. We exploit the universal Transformer based transceiver to extract the semantic information and introduce the memory module to process the context information. Moreover, we derive the relationship between the length of the semantic signal and the channel noise to validate the possibility of dynamic transmission. Specifically, we propose two dynamic transmission methods to enhance the transmission reliability as well as to reduce the communication overhead by masking some unessential elements, which are recognized through training the model with mutual information. Numerical results show that the proposed Mem-DeepSC is superior to benchmarks in terms of answer accuracy and transmission efficiency, i.e., number of transmitted symbols.
Submitted 22 March, 2023;
originally announced March 2023.
-
A review of codebooks for CSI feedback in 5G new radio and beyond
Authors:
Ziao Qin,
Haifan Yin
Abstract:
Codebooks have been indispensable for wireless communication standards since the first release of the Long-Term Evolution in 2009. They offer an efficient way to acquire the channel state information (CSI) for multiple antenna systems. Nowadays, a codebook is not limited to a set of pre-defined precoders; it refers to a CSI feedback framework, which is increasingly sophisticated. In this paper, we review the codebooks in 5G New Radio (NR) standards. The codebook timeline and the evolution trend are shown. Each codebook is elaborated with its motivation, the corresponding feedback mechanism, and the format of the precoding matrix indicator. Some insights are given to help grasp the underlying reasons and intuitions of these codebooks. Finally, we point out some unresolved challenges of the codebooks for future evolution of the standards. In general, this paper provides a comprehensive review of the codebooks in 5G NR and aims to help researchers understand the CSI feedback schemes from a standard and industrial perspective.
Submitted 13 June, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
Semantic Communications with Variable-Length Coding for Extended Reality
Authors:
Bowen Zhang,
Zhijin Qin,
Geoffrey Ye Li
Abstract:
Wireless extended reality (XR) has attracted wide attention as a promising technology to improve users' mobility and quality of experience. However, the ultra-high data rate requirement of wireless XR has hindered its development for many years. To overcome this challenge, we develop a semantic communication framework, where semantically unimportant information is highly compressed or discarded in semantic coders, significantly improving the transmission efficiency. Besides, considering the fact that some source content may contain less semantic information or have higher tolerance to channel noise, we propose a universal variable-length semantic-channel coding method. In particular, we first use a rate allocation network to estimate the best code length for semantic information and then adjust the coding process accordingly. By adopting some proxy functions, the whole framework is trained in an end-to-end manner. Numerical results show that our semantic system significantly outperforms traditional transmission methods and the proposed variable-length coding scheme is superior to the fixed-length coding methods.
Submitted 11 March, 2023; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Resource Optimization for Semantic-Aware Networks with Task Offloading
Authors:
Zelin Ji,
Zhijin Qin,
Xiaoming Tao,
Han Zhu
Abstract:
The limited capabilities of user equipment restrict the local implementation of computation-intensive applications. Edge computing, especially the edge intelligence system, enables local users to offload the computation tasks to the edge servers to reduce the computational energy consumption of user equipment and accelerate task execution. However, the limited bandwidth of upstream channels may increase the task transmission latency and affect the computation offloading performance. To overcome the challenge arising from scarce wireless communication resources, we propose a semantic-aware multi-modal task offloading system that facilitates the extraction and offloading of semantic task information to edge servers. To cope with the different tasks with multi-modal data, a unified quality of experience (QoE) criterion is designed. Furthermore, a proximal policy optimization-based multi-agent reinforcement learning algorithm (MAPPO) is proposed to coordinate the resource management for wireless communications and computation in a distributed manner with low computational complexity. Simulation results verify that the proposed MAPPO algorithm outperforms other reinforcement learning algorithms and fixed schemes in terms of task execution speed and the overall system QoE.
Submitted 29 January, 2024; v1 submitted 19 January, 2023;
originally announced January 2023.
-
A Generalized Semantic Communication System: from Sources to Channels
Authors:
Zhijin Qin,
Feifei Gao,
Bo Lin,
Xiaoming Tao,
Guangyi Liu,
Chengkang Pan
Abstract:
Semantic communication is regarded as the breakthrough beyond the Shannon paradigm, which transmits only semantic information to significantly improve communication efficiency. This article introduces a framework for a generalized semantic communication system, which exploits the semantic information in both the multimodal source and the wireless channel environment. Subsequently, the developed deep learning enabled end-to-end semantic communication and environment semantics aided wireless communication techniques are demonstrated through two examples. The article concludes with several research challenges to boost the development of such a generalized semantic communication system.
Submitted 14 March, 2023; v1 submitted 11 January, 2023;
originally announced January 2023.
-
Semantic Sensing and Communications for Ultimate Extended Reality
Authors:
Bowen Zhang,
Zhijin Qin,
Yiyu Guo,
Geoffrey Ye Li
Abstract:
As a key technology in the metaverse, wireless ultimate extended reality (XR) has attracted extensive attention from both industry and academia. However, the stringent latency and ultra-high data rate requirements have hindered the development of wireless ultimate XR. Instead of transmitting the original source data bit-by-bit, semantic communications focus on the successful delivery of semantic information contained in the source, which has shown great potential in reducing the data traffic of wireless systems. Inspired by semantic communications, this article develops a joint semantic sensing, rendering, and communication framework for wireless ultimate XR. In particular, semantic sensing is used to improve the sensing efficiency by exploring the spatial-temporal distributions of semantic information. Semantic rendering is designed to reduce the costs on semantically redundant pixels. Next, semantic communications are adopted for high data transmission efficiency in wireless ultimate XR. Then, two case studies are provided to demonstrate the effectiveness of the proposed framework. Finally, potential research directions are identified to boost the development of semantic-aware wireless ultimate XR.
Submitted 16 December, 2022;
originally announced December 2022.
-
Semantic Communication for Internet of Vehicles: A Multi-User Cooperative Approach
Authors:
Wenjun Xu,
Yimeng Zhang,
Fengyu Wang,
Zhijin Qin,
Chenyao Liu,
Ping Zhang
Abstract:
Internet of Vehicles (IoV) is expected to become the central infrastructure to provide advanced services to connected vehicles and users for higher transportation efficiency and security. A variety of emerging applications/services bring explosively growing demands for mobile data traffic between connected vehicles and roadside units (RSUs), imposing the significant challenge of spectrum scarcity on IoV. In this paper, we propose a cooperative semantic-aware architecture to convey essential semantics from collaborating users to servers for lowering the data traffic. In contrast to current solutions that are mainly based on piling up highly complex signal processing techniques and multiple access capabilities in terms of syntactic communications, this paper puts forth the idea of semantic-aware content delivery in IoV. Specifically, the successful transmission of essential semantics of the source data is pursued, rather than the accurate reception of symbols regardless of their meaning as in conventional syntactic communications. To assess the benefits of the proposed architecture, we provide a case study of the image retrieval task for vehicles in intelligent transportation systems. Simulation results demonstrate that the proposed architecture outperforms the existing solutions with fewer radio resources, especially in a low signal-to-noise-ratio (SNR) regime, which sheds light on the potential of the proposed architecture for extending applications to extreme environments.
Submitted 6 December, 2022;
originally announced December 2022.
-
Spatio-temporal Incentives Optimization for Ride-hailing Services with Offline Deep Reinforcement Learning
Authors:
Yanqiu Wu,
Qingyang Li,
Zhiwei Qin
Abstract:
A fundamental question in any peer-to-peer ride-sharing system is how to meet passengers' requests both effectively and efficiently so as to balance supply and demand in real time. On the passenger side, traditional approaches focus on pricing strategies that increase the probability of users' calls to adjust the distribution of demand. However, previous methods do not take into account the impact of strategy changes on future supply and demand: drivers are repositioned to different destinations by passengers' calls, which affects their income for a period of time in the future. Motivated by this observation, we attempt to optimize the distribution of demand by learning long-term spatio-temporal values as a guideline for the pricing strategy. In this study, we propose an offline deep reinforcement learning based method focusing on the demand side to improve the utilization of transportation resources and customer satisfaction. We adopt a spatio-temporal learning method to learn the value of different times and locations, then incentivize the ride requests of passengers to adjust the distribution of demand and balance supply and demand in the system. In particular, we model the problem as a Markov Decision Process (MDP).
Submitted 6 November, 2022;
originally announced November 2022.
-
Proportionate Recursive Maximum Correntropy Criterion Adaptive Filtering Algorithms and their Performance Analysis
Authors:
Zhen Qin,
Jun Tao,
Le Yang,
Ming Jiang
Abstract:
The maximum correntropy criterion (MCC) has been employed to design outlier-robust adaptive filtering algorithms, among which the recursive MCC (RMCC) algorithm is a typical one. Motivated by the success of our recently proposed proportionate recursive least squares (PRLS) algorithm for sparse system identification, we propose to introduce the proportionate updating (PU) mechanism into the RMCC, leading to two sparsity-aware RMCC algorithms: the proportionate recursive MCC (PRMCC) algorithm and the combinational PRMCC (CPRMCC) algorithm. The CPRMCC is implemented as an adaptive convex combination of two PRMCC filters. For PRMCC, its stability condition and mean-square performance were analyzed. Based on the analysis, optimal parameter selection in nonstationary environments was obtained. Performance study of CPRMCC was also provided and showed that the CPRMCC performs at least as well as the better component PRMCC filter in steady state. Numerical simulations of sparse system identification corroborate the advantage of proposed algorithms as well as the validity of theoretical analysis.
Submitted 7 October, 2023; v1 submitted 21 October, 2022;
originally announced October 2022.
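As a rough illustration of why a correntropy-based criterion is robust to outliers, the sketch below evaluates the Gaussian kernel commonly used in correntropy, exp(-e^2 / (2 sigma^2)), on a few error values. This is only the kernel weighting, not the PRMCC or CPRMCC algorithm from the abstract; the function name `correntropy_weight` and the kernel width `sigma` are assumptions made for this example.

```python
import math


def correntropy_weight(error, sigma):
    """Gaussian kernel value for a prediction error (hypothetical helper).

    Small errors give a value near 1; large (outlier) errors give a value
    near 0, so they contribute little to a correntropy-based objective.
    """
    return math.exp(-(error ** 2) / (2.0 * sigma ** 2))


# Illustrative errors: two ordinary samples and one impulsive outlier.
errors = [0.0, 0.5, 10.0]
weights = [correntropy_weight(e, sigma=1.0) for e in errors]
```

With `sigma = 1.0`, the outlier at `10.0` receives a weight many orders of magnitude smaller than the ordinary samples, whereas a squared-error criterion would let it dominate.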
-
Vector Quantized Semantic Communication System
Authors:
Qifan Fu,
Huiqiang Xie,
Zhijin Qin,
Gregory Slabaugh,
Xiaoming Tao
Abstract:
Although analog semantic communication systems have received considerable attention in the literature, there is less work on digital semantic communication systems. In this paper, we develop a deep learning (DL)-enabled vector quantized (VQ) semantic communication system for image transmission, named VQ-DeepSC. Specifically, we propose a convolutional neural network (CNN)-based transceiver to extract multi-scale semantic features of images and introduce multi-scale semantic embedding spaces to perform semantic feature quantization, rendering the data compatible with digital communication systems. Furthermore, we employ adversarial training to improve the quality of received images by introducing a PatchGAN discriminator. Experimental results demonstrate that the proposed VQ-DeepSC is more robust than BPG in digital communication systems and has comparable MS-SSIM performance to the DeepJSCC method.
Submitted 12 April, 2023; v1 submitted 23 September, 2022;
originally announced September 2022.
-
A Validation Approach to Over-parameterized Matrix and Image Recovery
Authors:
Lijun Ding,
Zhen Qin,
Liwei Jiang,
Jinxin Zhou,
Zhihui Zhu
Abstract:
In this paper, we study the problem of recovering a low-rank matrix from a number of noisy random linear measurements. We consider the setting where the rank of the ground-truth matrix is unknown a priori and use an overspecified factored representation of the matrix variable, where the global optimal solutions overfit and do not correspond to the underlying ground truth. We then solve the associated nonconvex problem using gradient descent with small random initialization. We show that as long as the measurement operators satisfy the restricted isometry property (RIP) with its rank parameter scaling with the rank of the ground-truth matrix rather than with that of the overspecified matrix variable, gradient descent iterations are on a particular trajectory towards the ground-truth matrix and achieve nearly information-theoretically optimal recovery when stopped appropriately. We then propose an efficient early stopping strategy based on the common hold-out method and show that it provably detects a nearly optimal estimator. Moreover, experiments show that the proposed validation approach can also be efficiently used for image restoration with deep image prior, which over-parameterizes an image with a deep network.
Submitted 21 September, 2022;
originally announced September 2022.
-
A Unified Multi-Task Semantic Communication System for Multimodal Data
Authors:
Guangyi Zhang,
Qiyu Hu,
Zhijin Qin,
Yunlong Cai,
Guanding Yu,
Xiaoming Tao
Abstract:
Task-oriented semantic communications have achieved significant performance gains. However, the employed deep neural networks in semantic communications have to be updated when the task is changed or multiple models need to be stored for performing different tasks. To address this issue, we develop a unified deep learning-enabled semantic communication system (U-DeepSC), where a unified end-to-end framework can serve many different tasks with multiple modalities of data. As the number of required features varies from task to task, we propose a vector-wise dynamic scheme that can adjust the number of transmitted symbols for different tasks. Moreover, our dynamic scheme can also adaptively adjust the number of transmitted features under different channel conditions to optimize the transmission efficiency. Particularly, we devise a lightweight feature selection module (FSM) to evaluate the importance of feature vectors, which can hierarchically drop redundant feature vectors and significantly accelerate the inference. To reduce the transmission overhead, we then design a unified codebook for feature representation to serve multiple tasks, where only the indices of these task-specific features in the codebook are transmitted. According to the simulation results, the proposed U-DeepSC achieves comparable performance to the task-oriented semantic communication system designed for a specific task but with significant reduction in both transmission overhead and model size.
Submitted 8 June, 2024; v1 submitted 15 September, 2022;
originally announced September 2022.
-
A Multi-Dimensional Matrix Pencil-Based Channel Prediction Method for Massive MIMO with Mobility
Authors:
Weidong Li,
Haifan Yin,
Ziao Qin,
Yandi Cao,
Merouane Debbah
Abstract:
This paper addresses the mobility problem in massive multiple-input multiple-output systems, which leads to significant performance losses in the practical deployment of fifth-generation mobile communication networks. We propose a novel channel prediction method based on multi-dimensional matrix pencil (MDMP), which estimates the path parameters by exploiting the angular-frequency-domain and angular-time-domain structures of the wideband channel. The MDMP method also entails a novel path pairing scheme to pair the delay and Doppler, based on the super-resolution property of the angle estimation. Our method is able to deal with the realistic constraint of time-varying path delays introduced by user movements, which has not been considered so far in the literature. We prove theoretically that in the scenario with time-varying path delays, the prediction error converges to zero as the number of base station (BS) antennas increases, provided that only two arbitrary channel samples are known. We also derive a lower bound on the number of BS antennas needed to achieve satisfactory performance. Simulation results under the industrial channel model of 3GPP demonstrate that our proposed MDMP method approaches the performance of the stationary scenario even when the users' velocity reaches 120 km/h and the latency of the channel state information is as large as 16 ms.
Submitted 3 August, 2022;
originally announced August 2022.
-
Symbol Rate and Carries Estimation in OFDM Framework: A high Accuracy Technique under Low SNR
Authors:
Zetian Qin,
Yubai Li,
Benye Niu,
Qingyao Li,
Renhao Xue
Abstract:
Under a low Signal-to-Noise Ratio (SNR), the Orthogonal Frequency-Division Multiplexing (OFDM) signal symbol rate is limited. Existing carrier number estimation algorithms lack adequate methods to deal with low SNR. This paper proposes an algorithm with a low error rate under low SNR by correlating the signal and applying a Fast Fourier Transform (FFT) operation. By improving existing algorithms, we improve the performance of the OFDM carrier count algorithm. The performance of the OFDM's useful symbol time estimation algorithm is improved by estimating the number of carriers and symbol rate.
Submitted 27 July, 2022;
originally announced July 2022.
-
Fast optical refocusing through multimode fiber bend using Cake-Cutting Hadamard encoding algorithm to improve robustness
Authors:
Chuncheng Zhang,
Zheyi Yao,
Zhengyue Qin,
Guohua Gu,
Qian Chen,
Zhihua Xie,
Guodong Liu,
Xiubao Sui
Abstract:
Multimode fibers (MMFs) offer the advantages of high resolution and miniaturization over single-mode fibers in the field of optical imaging. However, MMF imaging is susceptible to perturbations of the fiber that can lead to secondary spatial distortions in the transmitted image. Perturbations include random disturbances in the fiber as well as environmental noise. Here, we exploit the fast focusing capability of the Cake-Cutting Hadamard coding algorithm to counteract the effects of perturbations and improve the system's robustness. Simulation shows that it can approach the theoretical enhancement at 2000 measurements. Experimental results show that the algorithm can help the system to refocus in a short time when MMFs are perturbed. This research will further contribute to using MMFs in medicine, communication, and detection.
Submitted 27 July, 2022;
originally announced July 2022.
-
Robust Semantic Communications with Masked VQ-VAE Enabled Codebook
Authors:
Qiyu Hu,
Guangyi Zhang,
Zhijin Qin,
Yunlong Cai,
Guanding Yu,
Geoffrey Ye Li
Abstract:
Although semantic communications have exhibited satisfactory performance for a large number of tasks, the impact of semantic noise and the robustness of the systems have not been well investigated. Semantic noise refers to the mismatch between the intended semantic symbols and the received ones, which causes tasks to fail. In this paper, we first propose a framework for robust end-to-end semantic communication systems to combat semantic noise. In particular, we analyze sample-dependent and sample-independent semantic noise. To combat the semantic noise, adversarial training with weight perturbation is developed to incorporate samples with semantic noise in the training dataset. Then, we propose to mask a portion of the input, where the semantic noise appears frequently, and design the masked vector quantized-variational autoencoder (VQ-VAE) with the noise-related masking strategy. We use a discrete codebook shared by the transmitter and the receiver for encoded feature representation. To further improve the system robustness, we develop a feature importance module (FIM) to suppress the noise-related and task-unrelated features. Thus, the transmitter simply needs to transmit the indices of these important task-related features in the codebook. Simulation results show that the proposed method can be applied in many downstream tasks and significantly improves the robustness against semantic noise, with a remarkable reduction in the transmission overhead.
Submitted 18 April, 2023; v1 submitted 8 June, 2022;
originally announced June 2022.
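The shared-codebook idea in this abstract (transmit only codebook indices, then look the features up again at the receiver) can be sketched generically as nearest-neighbour vector quantization. This is a minimal illustration, not the authors' masked VQ-VAE: the toy `CODEBOOK`, the helper names, and the squared-Euclidean distance are all assumptions made for the example.

```python
def nearest_index(feature, codebook):
    """Index of the codeword closest to `feature` (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(feature, codebook[i]))


def encode(features, codebook):
    """Transmitter side: map each feature vector to a codebook index."""
    return [nearest_index(f, codebook) for f in features]


def decode(indices, codebook):
    """Receiver side: recover quantized features from the received indices."""
    return [codebook[i] for i in indices]


# Hypothetical 2-D codebook known to both transmitter and receiver.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

features = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]
indices = encode(features, CODEBOOK)        # only these integers are sent
reconstructed = decode(indices, CODEBOOK)   # quantized features at the receiver
```

With a four-entry codebook each index needs only two bits, which is where the overhead reduction in such schemes comes from; the cost is the quantization error between `features` and `reconstructed`.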
-
A Robust Deep Learning Enabled Semantic Communication System for Text
Authors:
Xiang Peng,
Zhijin Qin,
Danlan Huang,
Xiaoming Tao,
Jianhua Lu,
Guangyi Liu,
Chengkang Pan
Abstract:
With the advent of the 6G era, the concept of semantic communication has attracted increasing attention. Compared with conventional communication systems, semantic communication systems are not only affected by physical noise existing in the wireless communication environment, e.g., additive white Gaussian noise, but also by semantic noise due to the source and the nature of deep learning-based systems. In this paper, we elaborate on the mechanism of semantic noise. In particular, we categorize semantic noise into two categories: literal semantic noise and adversarial semantic noise. The former is caused by written errors or expression ambiguity, while the latter is caused by perturbations or attacks added to the embedding layer via the semantic channel. To prevent semantic noise from influencing semantic communication systems, we present a robust deep learning enabled semantic communication system (R-DeepSC) that leverages a calibrated self-attention mechanism and adversarial training to tackle semantic noise. Compared with baseline models that only consider physical noise for text transmission, the proposed R-DeepSC achieves remarkable performance in dealing with semantic noise under different signal-to-noise ratios.
Submitted 6 June, 2022;
originally announced June 2022.
-
A Unified Multi-Task Semantic Communication System with Domain Adaptation
Authors:
Guangyi Zhang,
Qiyu Hu,
Zhijin Qin,
Yunlong Cai,
Guanding Yu
Abstract:
Task-oriented semantic communication systems have achieved significant performance gains. However, the paradigm that employs one model for a specific task might be limited, since the system has to be updated once the task is changed, or multiple models have to be stored to serve various tasks. To address this issue, we propose a unified deep learning enabled semantic communication system (U-DeepSC), where a unified model is developed to serve various transmission tasks. To jointly serve these tasks in one model with fixed parameters, we employ domain adaptation in the training procedure to specify the task-specific features for each task. Thus, the system only needs to transmit the task-specific features, rather than all the features, to reduce the transmission overhead. Moreover, since each task is of different difficulty and requires a different number of layers to achieve satisfactory performance, we develop the multi-exit architecture to provide early-exit results for relatively simple tasks. In the experiments, we employ the proposed U-DeepSC to serve five tasks with multiple modalities. Simulation results demonstrate that our proposed U-DeepSC achieves comparable performance to the task-oriented semantic communication system designed for a specific task, with a significant reduction in transmission overhead and far fewer model parameters.
Submitted 1 June, 2022;
originally announced June 2022.
-
QoE-Aware Resource Allocation for Semantic Communication Networks
Authors:
Lei Yan,
Zhijin Qin,
Rui Zhang,
Yongzhao Li,
Geoffrey Ye Li
Abstract:
With the aim of accomplishing intelligence tasks, semantic communications transmit task-related information only, yielding significant performance gains over conventional communications. To guarantee user requirements for different types of tasks, we perform the semantic-aware resource allocation in a multi-cell multi-task network in this paper. Specifically, an approximate measure of semantic entropy is first developed to quantify the semantic information for different tasks, based on which a novel quality-of-experience (QoE) model is proposed. We formulate the QoE-aware semantic resource allocation in terms of the number of transmitted semantic symbols, channel assignment, and power allocation. To solve this problem, we first decouple it into two independent subproblems. The first one is to optimize the number of transmitted semantic symbols with given channel assignment and power allocation, which is solved by the exhaustive searching method. The second one is the channel assignment and power allocation subproblem, which is modeled as a many-to-one matching game and solved by the proposed low-complexity matching algorithm. Simulation results demonstrate the effectiveness and superiority of the proposed method on the overall QoE.
Submitted 28 May, 2022;
originally announced May 2022.
-
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis
Authors:
Zhenzi Weng,
Zhijin Qin,
Xiaoming Tao,
Chengkang Pan,
Guangyi Liu,
Geoffrey Ye Li
Abstract:
In this paper, we develop a deep learning based semantic communication system for speech transmission, named DeepSC-ST. We take speech recognition and speech synthesis as the transmission tasks of the communication system. First, the speech recognition-related semantic features are extracted for transmission by a joint semantic-channel encoder and the text is recovered at the receiver based on the received semantic features, which significantly reduces the required amount of data transmission without performance degradation. Then, we perform speech synthesis at the receiver, which aims to regenerate the speech signals by feeding the recognized text and the speaker information into a neural network module. To make DeepSC-ST adaptive to dynamic channel environments, we identify a robust model to cope with different channel conditions. According to the simulation results, the proposed DeepSC-ST significantly outperforms conventional communication systems and existing DL-enabled communication systems, especially in the low signal-to-noise ratio (SNR) regime. A software demonstration is further developed as a proof of concept of DeepSC-ST.
Submitted 31 March, 2023; v1 submitted 9 May, 2022;
originally announced May 2022.
-
Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast
Authors:
Boqing Zhu,
Kele Xu,
Changjian Wang,
Zheng Qin,
Tao Sun,
Huaimin Wang,
Yuxing Peng
Abstract:
We present an approach to learn voice-face representations from talking face videos, without any identity labels. Previous works employ cross-modal instance discrimination tasks to establish the correlation of voice and face. These methods neglect the semantic content of different videos, introducing false-negative pairs as training noise. Furthermore, the positive pairs are constructed based on the natural correlation between audio clips and visual frames. However, this correlation might be weak or inaccurate in a large amount of real-world data, which introduces deviating positives into the contrastive paradigm. To address these issues, we propose cross-modal prototype contrastive learning (CMPC), which takes advantage of contrastive methods and resists the adverse effects of false negatives and deviating positives. On one hand, CMPC can learn the intra-class invariance by constructing semantic-wise positives via unsupervised clustering in different modalities. On the other hand, by comparing the similarities of cross-modal instances with those of cross-modal prototypes, we dynamically recalibrate the unlearnable instances' contribution to the overall loss. Experiments show that the proposed approach outperforms state-of-the-art unsupervised methods on various voice-face association evaluation protocols. Additionally, in the low-shot supervision setting, our method also shows a significant improvement compared to previous instance-wise contrastive learning.
Submitted 26 May, 2022; v1 submitted 28 April, 2022;
originally announced April 2022.
-
Federated Learning for Distributed Energy-Efficient Resource Allocation
Authors:
Zelin Ji,
Zhijin Qin
Abstract:
In cellular networks, resource allocation is performed in a centralized way, which brings huge computation complexity to the base station (BS) and high transmission overhead. This paper investigates the distributed resource allocation scheme for cellular networks to maximize the energy efficiency (EE) of the system in the uplink transmission, while guaranteeing the quality of service (QoS) for cellular users. Particularly, to cope with the fast-varying channels in the wireless communication environment, we propose a robust federated reinforcement learning (FRL_suc) framework to enable local users to perform distributed resource allocation in terms of transmit power and channel assignment under the guidance of the local neural network trained at each user. Analysis and numerical results show that the proposed FRL_suc framework can lower the transmission overhead and offload the computation from the central server to the local users, while outperforming the conventional multi-agent reinforcement learning algorithm in terms of EE and being more robust to channel variations.
Submitted 17 August, 2022; v1 submitted 20 April, 2022;
originally announced April 2022.
-
A Partial Reciprocity-based Channel Prediction Framework for FDD Massive MIMO with High Mobility
Authors:
Ziao Qin,
Haifan Yin,
Yandi Cao,
Weidong Li,
David Gesbert
Abstract:
Massive multiple-input multiple-output (MIMO) is believed to deliver unprecedented spectral efficiency gains for 5G and beyond. However, a practical challenge arises during its commercial deployment, which is known as the "curse of mobility": the performance of massive MIMO drops alarmingly as user velocity increases. In this paper, we tackle the problem in frequency division duplex (FDD) massive MIMO with a novel Channel State Information (CSI) acquisition framework. A joint angle-delay-Doppler (JADD) wideband precoder is proposed for channel training. Our idea is to exploit the partial channel reciprocity of FDD and the angle-delay-Doppler channel structure. More precisely, the base station (BS) estimates the angle-delay-Doppler information of the uplink (UL) channel based on UL pilots using the Matrix Pencil (MP) method. It then computes the wideband JADD precoders according to the extracted parameters. Afterwards, the user estimates and feeds back some scalar coefficients for the BS to reconstruct the predicted downlink (DL) channel. Asymptotic analysis shows that the CSI prediction error converges to zero as the number of BS antennas and the bandwidth increase. Numerical results with an industrial channel model demonstrate that our framework adapts well to high speed (350 km/h), large CSI delay (10 ms) and channel sample noise.
Submitted 26 May, 2022; v1 submitted 11 February, 2022;
originally announced February 2022.
-
Robust Semantic Communications Against Semantic Noise
Authors:
Qiyu Hu,
Guangyi Zhang,
Zhijin Qin,
Yunlong Cai,
Guanding Yu,
Geoffrey Ye Li
Abstract:
Although semantic communications have exhibited satisfactory performance in a large number of tasks, the impact of semantic noise and the robustness of the systems have not been well investigated. Semantic noise is a particular kind of noise in semantic communication systems, which refers to the mismatch between the intended semantic symbols and the received ones. In this paper, we first propose a framework for robust end-to-end semantic communication systems to combat semantic noise. Particularly, we analyze the causes of semantic noise and propose a practical method to generate it. To remove the effect of semantic noise, adversarial training is proposed to incorporate samples with semantic noise in the training dataset. Then, the masked autoencoder (MAE) is designed as the architecture of a robust semantic communication system, where a portion of the input is masked. To further improve the robustness of semantic communication systems, we employ the vector quantization-variational autoencoder (VQ-VAE) to design a discrete codebook shared by the transmitter and the receiver for encoded feature representation. Thus, the transmitter simply needs to transmit the indices of these features in the codebook. Simulation results show that our proposed method significantly improves the robustness of semantic communication systems against semantic noise, with a significant reduction in the transmission overhead.
Submitted 22 May, 2022; v1 submitted 7 February, 2022;
originally announced February 2022.
-
Resource allocation for text semantic communications
Authors:
Lei Yan,
Zhijin Qin,
Rui Zhang,
Yongzhao Li,
Geoffrey Ye Li
Abstract:
Semantic communications have shown great potential to improve transmission reliability, especially in the low signal-to-noise regime. However, resource allocation for semantic communications remains unexplored, although it is critical to guaranteeing semantic transmission reliability and communication efficiency. To fill this gap, we investigate the spectral efficiency in the semantic domain and rethink the semantic-aware resource allocation issue. Specifically, taking text semantic communication as an example, the semantic spectral efficiency (S-SE) is defined for the first time and is used to optimize resource allocation in terms of channel assignment and the number of transmitted semantic symbols. Additionally, for a fair comparison of semantic and conventional communication systems, a transform method is developed to convert the conventional bit-based spectral efficiency to the S-SE. Simulation results demonstrate the validity and feasibility of the proposed resource allocation method, as well as the superiority of semantic communications in terms of the S-SE.
Submitted 25 April, 2022; v1 submitted 16 January, 2022;
originally announced January 2022.