-
MTDSense: AI-Based Fingerprinting of Moving Target Defense Techniques in Software-Defined Networking
Authors:
Tina Moghaddam,
Guowei Yang,
Chandra Thapa,
Seyit Camtepe,
Dan Dongseong Kim
Abstract:
Moving target defenses (MTD) are proactive security techniques that enhance network security by confusing the attacker and limiting their attack window. MTDs have been shown to have significant benefits when evaluated against traditional network attacks, most of which are automated and untargeted. However, little has been done to address an attacker who is aware the network uses an MTD. In this wo…
▽ More
Moving target defenses (MTD) are proactive security techniques that enhance network security by confusing the attacker and limiting their attack window. MTDs have been shown to have significant benefits when evaluated against traditional network attacks, most of which are automated and untargeted. However, little has been done to address an attacker who is aware the network uses an MTD. In this work, we propose a novel approach named MTDSense, which can determine when the MTD has been triggered using the footprints the MTD operation leaves in the network traffic. MTDSense uses unsupervised clustering to identify traffic following an MTD trigger and extract the MTD interval. An attacker can use this information to maximize their attack window and tailor their attacks, which has been shown to significantly reduce the effectiveness of MTD. Through analyzing the attacker's approach, we propose and evaluate two new MTD update algorithms that aim to reduce the information leaked into the network by the MTD. We present an extensive experimental evaluation by creating, to our knowledge, the first dataset of the operation of an IP-shuffling MTD in a software-defined network. Our work reveals that despite previous results showing the effectiveness of MTD as a defense, traditional implementations of MTD are highly susceptible to a targeted attacker.
△ Less
Submitted 7 August, 2024;
originally announced August 2024.
-
Elevating Software Trust: Unveiling and Quantifying the Risk Landscape
Authors:
Sarah Ali Siddiqui,
Chandra Thapa,
Rayne Holland,
Wei Shao,
Seyit Camtepe
Abstract:
Considering the ever-evolving threat landscape and rapid changes in software development, we propose a risk assessment framework SRiQT (Software Risk Quantification through Trust). This framework is based on the necessity of a dynamic, data-driven, and adaptable process to quantify risk in the software supply chain. Usually, when formulating such frameworks, static pre-defined weights are assigned…
▽ More
Considering the ever-evolving threat landscape and rapid changes in software development, we propose a risk assessment framework SRiQT (Software Risk Quantification through Trust). This framework is based on the necessity of a dynamic, data-driven, and adaptable process to quantify risk in the software supply chain. Usually, when formulating such frameworks, static pre-defined weights are assigned to reflect the impact of each contributing parameter while aggregating these individual parameters to compute resulting risk scores. This leads to inflexibility, a lack of adaptability, and reduced accuracy, making them unsuitable for the changing nature of the digital world. We adopt a novel perspective by examining risk through the lens of trust and incorporating the human aspect. Moreover, we quantify risk associated with individual software by assessing and formulating risk elements quantitatively and exploring dynamic data-driven weight assignment. This enhances the sensitivity of the framework to cater to the evolving risk factors associated with software development and the different actors involved in the entire process. The devised framework is tested through a dataset containing 9000 samples, comprehensive scenarios, assessments, and expert opinions. Furthermore, a comparison between scores computed by the OpenSSF scorecard, OWASP risk calculator, and the proposed SRiQT framework has also been presented. The results suggest that SRiQT mitigates subjectivity and yields dynamic data-driven weights as well as risk scores.
△ Less
Submitted 5 August, 2024;
originally announced August 2024.
-
One-Shot Collaborative Data Distillation
Authors:
William Holland,
Chandra Thapa,
Sarah Ali Siddiqui,
Wei Shao,
Seyit Camtepe
Abstract:
Large machine-learning training datasets can be distilled into small collections of informative synthetic data samples. These synthetic sets support efficient model learning and reduce the communication cost of data sharing. Thus, high-fidelity distilled data can support the efficient deployment of machine learning applications in distributed network environments. A naive way to construct a synthe…
▽ More
Large machine-learning training datasets can be distilled into small collections of informative synthetic data samples. These synthetic sets support efficient model learning and reduce the communication cost of data sharing. Thus, high-fidelity distilled data can support the efficient deployment of machine learning applications in distributed network environments. A naive way to construct a synthetic set in a distributed environment is to allow each client to perform local data distillation and to merge local distillations at a central server. However, the quality of the resulting set is impaired by heterogeneity in the distributions of the local data held by clients. To overcome this challenge, we introduce the first collaborative data distillation technique, called CollabDM, which captures the global distribution of the data and requires only a single round of communication between client and server. Our method outperforms the state-of-the-art one-shot learning method on skewed data in distributed learning environments. We also show the promising practical benefits of our method when applied to attack detection in 5G networks.
△ Less
Submitted 12 August, 2024; v1 submitted 5 August, 2024;
originally announced August 2024.
-
ST-DPGAN: A Privacy-preserving Framework for Spatiotemporal Data Generation
Authors:
Wei Shao,
Rongyi Zhu,
Cai Yang,
Chandra Thapa,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Rui Zhang,
DuYong Kim,
Hamid Menouar,
Flora D. Salim
Abstract:
Spatiotemporal data is prevalent in a wide range of edge devices, such as those used in personal communication and financial transactions. Recent advancements have sparked a growing interest in integrating spatiotemporal analysis with large-scale language models. However, spatiotemporal data often contains sensitive information, making it unsuitable for open third-party access. To address this cha…
▽ More
Spatiotemporal data is prevalent in a wide range of edge devices, such as those used in personal communication and financial transactions. Recent advancements have sparked a growing interest in integrating spatiotemporal analysis with large-scale language models. However, spatiotemporal data often contains sensitive information, making it unsuitable for open third-party access. To address this challenge, we propose a Graph-GAN-based model for generating privacy-protected spatiotemporal data. Our approach incorporates spatial and temporal attention blocks in the discriminator and a spatiotemporal deconvolution structure in the generator. These enhancements enable efficient training under Gaussian noise to achieve differential privacy. Extensive experiments conducted on three real-world spatiotemporal datasets validate the efficacy of our model. Our method provides a privacy guarantee while maintaining the data utility. The prediction model trained on our generated data maintains a competitive performance compared to the model trained on the original data.
△ Less
Submitted 4 June, 2024;
originally announced June 2024.
-
Mitigation of Channel Tampering Attacks in Continuous-Variable Quantum Key Distribution
Authors:
Sebastian P. Kish,
Chandra Thapa,
Mikhael Sayat,
Hajime Suzuki,
Josef Pieprzyk,
Seyit Camtepe
Abstract:
Despite significant advancements in continuous-variable quantum key distribution (CV-QKD), practical CV-QKD systems can be compromised by various attacks. Consequently, identifying new attack vectors and countermeasures for CV-QKD implementations is important for the continued robustness of CV-QKD. In particular, as CV-QKD relies on a public quantum channel, vulnerability to communication disrupti…
▽ More
Despite significant advancements in continuous-variable quantum key distribution (CV-QKD), practical CV-QKD systems can be compromised by various attacks. Consequently, identifying new attack vectors and countermeasures for CV-QKD implementations is important for the continued robustness of CV-QKD. In particular, as CV-QKD relies on a public quantum channel, vulnerability to communication disruption persists from potential adversaries employing Denial-of-Service (DoS) attacks. Inspired by DoS attacks, this paper introduces a novel threat in CV-QKD called the Channel Amplification (CA) attack, wherein Eve manipulates the communication channel through amplification. We specifically model this attack in a CV-QKD optical fiber setup. To counter this threat, we propose a detection and mitigation strategy. Detection involves a machine learning (ML) model based on a decision tree classifier, classifying various channel tampering attacks, including CA and DoS attacks. For mitigation, Bob, post-selects quadrature data by classifying the attack type and frequency. Our ML model exhibits high accuracy in distinguishing and categorizing these attacks. The CA attack's impact on the secret key rate (SKR) is explored concerning Eve's location and the relative intensity noise of the local oscillator (LO). The proposed mitigation strategy improves the attacked SKR for CA attacks and, in some cases, for hybrid CA-DoS attacks. Our study marks a novel application of both ML classification and post-selection in this context. These findings are important for enhancing the robustness of CV-QKD systems against emerging threats on the channel.
△ Less
Submitted 12 June, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Entropy Causal Graphs for Multivariate Time Series Anomaly Detection
Authors:
Falih Gozi Febrinanto,
Kristen Moore,
Chandra Thapa,
Mujie Liu,
Vidya Saikrishna,
Jiangang Ma,
Feng Xia
Abstract:
Many multivariate time series anomaly detection frameworks have been proposed and widely applied. However, most of these frameworks do not consider intrinsic relationships between variables in multivariate time series data, thus ignoring the causal relationship among variables and degrading anomaly detection performance. This work proposes a novel framework called CGAD, an entropy Causal Graph for…
▽ More
Many multivariate time series anomaly detection frameworks have been proposed and widely applied. However, most of these frameworks do not consider intrinsic relationships between variables in multivariate time series data, thus ignoring the causal relationship among variables and degrading anomaly detection performance. This work proposes a novel framework called CGAD, an entropy Causal Graph for multivariate time series Anomaly Detection. CGAD utilizes transfer entropy to construct graph structures that unveil the underlying causal relationships among time series data. Weighted graph convolutional networks combined with causal convolutions are employed to model both the causal graph structures and the temporal patterns within multivariate time series data. Furthermore, CGAD applies anomaly scoring, leveraging median absolute deviation-based normalization to improve the robustness of the anomaly identification process. Extensive experiments demonstrate that CGAD outperforms state-of-the-art methods on real-world datasets with a 15% average improvement based on three different multivariate time series anomaly detection metrics.
△ Less
Submitted 14 December, 2023;
originally announced December 2023.
-
Radio Signal Classification by Adversarially Robust Quantum Machine Learning
Authors:
Yanqiu Wu,
Eromanga Adermann,
Chandra Thapa,
Seyit Camtepe,
Hajime Suzuki,
Muhammad Usman
Abstract:
Radio signal classification plays a pivotal role in identifying the modulation scheme used in received radio signals, which is essential for demodulation and proper interpretation of the transmitted information. Researchers have underscored the high susceptibility of ML algorithms for radio signal classification to adversarial attacks. Such vulnerability could result in severe consequences, includ…
▽ More
Radio signal classification plays a pivotal role in identifying the modulation scheme used in received radio signals, which is essential for demodulation and proper interpretation of the transmitted information. Researchers have underscored the high susceptibility of ML algorithms for radio signal classification to adversarial attacks. Such vulnerability could result in severe consequences, including misinterpretation of critical messages, interception of classified information, or disruption of communication channels. Recent advancements in quantum computing have revolutionized theories and implementations of computation, bringing the unprecedented development of Quantum Machine Learning (QML). It is shown that quantum variational classifiers (QVCs) provide notably enhanced robustness against classical adversarial attacks in image classification. However, no research has yet explored whether QML can similarly mitigate adversarial threats in the context of radio signal classification. This work applies QVCs to radio signal classification and studies their robustness to various adversarial attacks. We also propose the novel application of the approximate amplitude encoding (AAE) technique to encode radio signal data efficiently. Our extensive simulation results present that attacks generated on QVCs transfer well to CNN models, indicating that these adversarial examples can fool neural networks that they are not explicitly designed to attack. However, the converse is not true. QVCs primarily resist the attacks generated on CNNs. Overall, with comprehensive simulations, our results shed new light on the growing field of QML by bridging knowledge gaps in QAML in radio signal classification and uncovering the advantages of applying QML methods in practical applications.
△ Less
Submitted 12 December, 2023;
originally announced December 2023.
-
Federated Split Learning with Only Positive Labels for resource-constrained IoT environment
Authors:
Praveen Joshi,
Chandra Thapa,
Mohammed Hasanuzzaman,
Ted Scully,
Haithem Afli
Abstract:
Distributed collaborative machine learning (DCML) is a promising method in the Internet of Things (IoT) domain for training deep learning models, as data is distributed across multiple devices. A key advantage of this approach is that it improves data privacy by removing the necessity for the centralized aggregation of raw data but also empowers IoT devices with low computational power. Among vari…
▽ More
Distributed collaborative machine learning (DCML) is a promising method in the Internet of Things (IoT) domain for training deep learning models, as data is distributed across multiple devices. A key advantage of this approach is that it improves data privacy by removing the necessity for the centralized aggregation of raw data but also empowers IoT devices with low computational power. Among various techniques in a DCML framework, federated split learning, known as splitfed learning (SFL), is the most suitable for efficient training and testing when devices have limited computational capabilities. Nevertheless, when resource-constrained IoT devices have only positive labeled data, multiclass classification deep learning models in SFL fail to converge or provide suboptimal results. To overcome these challenges, we propose splitfed learning with positive labels (SFPL). SFPL applies a random shuffling function to the smashed data received from clients before supplying it to the server for model training. Additionally, SFPL incorporates the local batch normalization for the client-side model portion during the inference phase. Our results demonstrate that SFPL outperforms SFL: (i) by factors of 51.54 and 32.57 for ResNet-56 and ResNet-32, respectively, with the CIFAR-100 dataset, and (ii) by factors of 9.23 and 8.52 for ResNet-32 and ResNet-8, respectively, with CIFAR-10 dataset. Overall, this investigation underscores the efficacy of the proposed SFPL framework in DCML.
△ Less
Submitted 25 July, 2023;
originally announced July 2023.
-
ACE: A Consent-Embedded privacy-preserving search on genomic database
Authors:
Sara Jafarbeiki,
Amin Sakzad,
Ron Steinfeld,
Shabnam Kasra Kermanshahi,
Chandra Thapa,
Yuki Kume
Abstract:
In this paper, we introduce ACE, a consent-embedded searchable encryption scheme. ACE enables dynamic consent management by supporting the physical deletion of associated data at the time of consent revocation. This ensures instant real deletion of data, aligning with privacy regulations and preserving individuals' rights. We evaluate ACE in the context of genomic databases, demonstrating its abil…
▽ More
In this paper, we introduce ACE, a consent-embedded searchable encryption scheme. ACE enables dynamic consent management by supporting the physical deletion of associated data at the time of consent revocation. This ensures instant real deletion of data, aligning with privacy regulations and preserving individuals' rights. We evaluate ACE in the context of genomic databases, demonstrating its ability to perform the addition and deletion of genomic records and related information based on ID, which especially complies with the requirements of deleting information of a particular data owner. To formally prove that ACE is secure under non-adaptive attacks, we present two new definitions of forward and backward privacy. We also define a new hard problem, which we call D-ACE, that facilitates the proof of our theorem (we formally prove its hardness by a security reduction from DDH to D-ACE). We finally present implementation results to evaluate the performance of ACE.
△ Less
Submitted 23 July, 2023;
originally announced July 2023.
-
Discretization-based ensemble model for robust learning in IoT
Authors:
Anahita Namvar,
Chandra Thapa,
Salil S. Kanhere
Abstract:
IoT device identification is the process of recognizing and verifying connected IoT devices to the network. This is an essential process for ensuring that only authorized devices can access the network, and it is necessary for network management and maintenance. In recent years, machine learning models have been used widely for automating the process of identifying devices in the network. However,…
▽ More
IoT device identification is the process of recognizing and verifying connected IoT devices to the network. This is an essential process for ensuring that only authorized devices can access the network, and it is necessary for network management and maintenance. In recent years, machine learning models have been used widely for automating the process of identifying devices in the network. However, these models are vulnerable to adversarial attacks that can compromise their accuracy and effectiveness. To better secure device identification models, discretization techniques enable reduction in the sensitivity of machine learning models to adversarial attacks contributing to the stability and reliability of the model. On the other hand, Ensemble methods combine multiple heterogeneous models to reduce the impact of remaining noise or errors in the model. Therefore, in this paper, we integrate discretization techniques and ensemble methods and examine it on model robustness against adversarial attacks. In other words, we propose a discretization-based ensemble stacking technique to improve the security of our ML models. We evaluate the performance of different ML-based IoT device identification models against white box and black box attacks using a real-world dataset comprised of network traffic from 28 IoT devices. We demonstrate that the proposed method enables robustness to the models for IoT device identification.
△ Less
Submitted 17 July, 2023;
originally announced July 2023.
-
Access-based Lightweight Physical Layer Authentication for the Internet of Things Devices
Authors:
Saud Khan,
Chandra Thapa,
Salman Durrani,
Seyit Camtepe
Abstract:
Physical-layer authentication is a popular alternative to the conventional key-based authentication for internet of things (IoT) devices due to their limited computational capacity and battery power. However, this approach has limitations due to poor robustness under channel fluctuations, reconciliation overhead, and no clear safeguard distance to ensure the secrecy of the generated authentication…
▽ More
Physical-layer authentication is a popular alternative to the conventional key-based authentication for internet of things (IoT) devices due to their limited computational capacity and battery power. However, this approach has limitations due to poor robustness under channel fluctuations, reconciliation overhead, and no clear safeguard distance to ensure the secrecy of the generated authentication keys. In this regard, we propose a novel, secure, and lightweight continuous authentication scheme for IoT device authentication. Our scheme utilizes the inherent properties of the IoT devices' transmission model as its source for seed generation and device authentication. Specifically, our proposed scheme provides continuous authentication by checking the access time slots and spreading sequences of the IoT devices instead of repeatedly generating and verifying shared keys. Due to this, access to a coherent key is not required in our proposed scheme, resulting in the concealment of the seed information from attackers. Our proposed authentication scheme for IoT devices demonstrates improved performance compared to the benchmark schemes relying on physical channels. Our empirical results find a near threefold decrease in the misdetection rate of illegitimate devices and close to zero false alarm rate in various system settings with varied numbers of active devices up to 200 and signal-to-noise ratio from 0 dB to 25 dB. Our proposed authentication scheme also has a lower computational complexity of at least half the computational cost of the benchmark schemes based on support vector machine and binary hypothesis testing in our studies. This further corroborates the practicality of our scheme for IoT deployments.
△ Less
Submitted 6 November, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
-
Vertical Federated Learning: Taxonomies, Threats, and Prospects
Authors:
Qun Li,
Chandra Thapa,
Lawrence Ong,
Yifeng Zheng,
Hua Ma,
Seyit A. Camtepe,
Anmin Fu,
Yansong Gao
Abstract:
Federated learning (FL) is the most popular distributed machine learning technique. FL allows machine-learning models to be trained without acquiring raw data to a single point for processing. Instead, local models are trained with local data; the models are then shared and combined. This approach preserves data privacy as locally trained models are shared instead of the raw data themselves. Broad…
▽ More
Federated learning (FL) is the most popular distributed machine learning technique. FL allows machine-learning models to be trained without acquiring raw data to a single point for processing. Instead, local models are trained with local data; the models are then shared and combined. This approach preserves data privacy as locally trained models are shared instead of the raw data themselves. Broadly, FL can be divided into horizontal federated learning (HFL) and vertical federated learning (VFL). For the former, different parties hold different samples over the same set of features; for the latter, different parties hold different feature data belonging to the same set of samples. In a number of practical scenarios, VFL is more relevant than HFL as different companies (e.g., bank and retailer) hold different features (e.g., credit history and shopping history) for the same set of customers. Although VFL is an emerging area of research, it is not well-established compared to HFL. Besides, VFL-related studies are dispersed, and their connections are not intuitive. Thus, this survey aims to bring these VFL-related studies to one place. Firstly, we classify existing VFL structures and algorithms. Secondly, we present the threats from security and privacy perspectives to VFL. Thirdly, for the benefit of future researchers, we discussed the challenges and prospects of VFL in detail.
△ Less
Submitted 3 February, 2023;
originally announced February 2023.
-
Enabling All In-Edge Deep Learning: A Literature Review
Authors:
Praveen Joshi,
Mohammed Hasanuzzaman,
Chandra Thapa,
Haithem Afli,
Ted Scully
Abstract:
In recent years, deep learning (DL) models have demonstrated remarkable achievements on non-trivial tasks such as speech recognition and natural language understanding. One of the significant contributors to its success is the proliferation of end devices that acted as a catalyst to provide data for data-hungry DL models. However, computing DL training and inference is the main challenge. Usually,…
▽ More
In recent years, deep learning (DL) models have demonstrated remarkable achievements on non-trivial tasks such as speech recognition and natural language understanding. One of the significant contributors to its success is the proliferation of end devices that acted as a catalyst to provide data for data-hungry DL models. However, computing DL training and inference is the main challenge. Usually, central cloud servers are used for the computation, but it opens up other significant challenges, such as high latency, increased communication costs, and privacy concerns. To mitigate these drawbacks, considerable efforts have been made to push the processing of DL models to edge servers. Moreover, the confluence point of DL and edge has given rise to edge intelligence (EI). This survey paper focuses primarily on the fifth level of EI, called all in-edge level, where DL training and inference (deployment) are performed solely by edge servers. All in-edge is suitable when the end devices have low computing resources, e.g., Internet-of-Things, and other requirements such as latency and communication cost are important in mission-critical applications, e.g., health care. Firstly, this paper presents all in-edge computing architectures, including centralized, decentralized, and distributed. Secondly, this paper presents enabling technologies, such as model parallelism and split learning, which facilitate DL training and deployment at edge servers. Thirdly, model adaptation techniques based on model compression and conditional computation are described because the standard cloud-based DL deployment cannot be directly applied to all in-edge due to its limited computational resources. Fourthly, this paper discusses eleven key performance metrics to evaluate the performance of DL at all in-edge efficiently. Finally, several open research challenges in the area of all in-edge are presented.
△ Less
Submitted 12 December, 2022; v1 submitted 7 April, 2022;
originally announced April 2022.
-
Transformer-Based Language Models for Software Vulnerability Detection
Authors:
Chandra Thapa,
Seung Ick Jang,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Josef Pieprzyk,
Surya Nepal
Abstract:
The large transformer-based language models demonstrate excellent performance in natural language processing. By considering the transferability of the knowledge gained by these models in one domain to other related domains, and the closeness of natural languages to high-level programming languages, such as C/C++, this work studies how to leverage (large) transformer-based language models in detec…
▽ More
The large transformer-based language models demonstrate excellent performance in natural language processing. By considering the transferability of the knowledge gained by these models in one domain to other related domains, and the closeness of natural languages to high-level programming languages, such as C/C++, this work studies how to leverage (large) transformer-based language models in detecting software vulnerabilities and how good are these models for vulnerability detection tasks. In this regard, firstly, a systematic (cohesive) framework that details source code translation, model preparation, and inference is presented. Then, an empirical analysis is performed with software vulnerability datasets with C/C++ source codes having multiple vulnerabilities corresponding to the library function call, pointer usage, array usage, and arithmetic expression. Our empirical results demonstrate the good performance of the language models in vulnerability detection. Moreover, these language models have better performance metrics, such as F1-score, than the contemporary models, namely bidirectional long short-term memory and bidirectional gated recurrent unit. Experimenting with the language models is always challenging due to the requirement of computing resources, platforms, libraries, and dependencies. Thus, this paper also analyses the popular platforms to efficiently fine-tune these models and present recommendations while choosing the platforms.
△ Less
Submitted 5 September, 2022; v1 submitted 7 April, 2022;
originally announced April 2022.
-
Graph Lifelong Learning: A Survey
Authors:
Falih Gozi Febrinanto,
Feng Xia,
Kristen Moore,
Chandra Thapa,
Charu Aggarwal
Abstract:
Graph learning is a popular approach for performing machine learning on graph-structured data. It has revolutionized the machine learning ability to model graph data to address downstream tasks. Its application is wide due to the availability of graph data ranging from all types of networks to information systems. Most graph learning methods assume that the graph is static and its complete structu…
▽ More
Graph learning is a popular approach for performing machine learning on graph-structured data. It has revolutionized the machine learning ability to model graph data to address downstream tasks. Its application is wide due to the availability of graph data ranging from all types of networks to information systems. Most graph learning methods assume that the graph is static and its complete structure is known during training. This limits their applicability since they cannot be applied to problems where the underlying graph grows over time and/or new tasks emerge incrementally. Such applications require a lifelong learning approach that can learn the graph continuously and accommodate new information whilst retaining previously learned knowledge. Lifelong learning methods that enable continuous learning in regular domains like images and text cannot be directly applied to continuously evolving graph data, due to its irregular structure. As a result, graph lifelong learning is gaining attention from the research community. This survey paper provides a comprehensive overview of recent advancements in graph lifelong learning, including the categorization of existing methods, and the discussions of potential applications and open research problems.
△ Less
Submitted 3 November, 2022; v1 submitted 22 February, 2022;
originally announced February 2022.
-
Splitfed learning without client-side synchronization: Analyzing client-side split network portion size to overall performance
Authors:
Praveen Joshi,
Chandra Thapa,
Seyit Camtepe,
Mohammed Hasanuzzamana,
Ted Scully,
Haithem Afli
Abstract:
Federated Learning (FL), Split Learning (SL), and SplitFed Learning (SFL) are three recent developments in distributed machine learning that are gaining attention due to their ability to preserve the privacy of raw data. Thus, they are widely applicable in various domains where data is sensitive, such as large-scale medical image classification, internet-of-medical-things, and cross-organization p…
▽ More
Federated Learning (FL), Split Learning (SL), and SplitFed Learning (SFL) are three recent developments in distributed machine learning that are gaining attention due to their ability to preserve the privacy of raw data. Thus, they are widely applicable in various domains where data is sensitive, such as large-scale medical image classification, internet-of-medical-things, and cross-organization phishing email detection. SFL is developed on the confluence point of FL and SL. It brings the best of FL and SL by providing parallel client-side machine learning model updates from the FL paradigm and a higher level of model privacy (while training) by splitting the model between the clients and server coming from SL. However, SFL has communication and computation overhead at the client-side due to the requirement of client-side model synchronization. For the resource-constrained client-side, removal of such requirements is required to gain efficiency in the learning. In this regard, this paper studies SFL without client-side model synchronization. The resulting architecture is known as Multi-head Split Learning. Our empirical studies considering the ResNet18 model on MNIST data under IID data distribution among distributed clients find that Multi-head Split Learning is feasible. Its performance is comparable to the SFL. Moreover, SFL provides only 1%-2% better accuracy than Multi-head Split Learning on the MNIST test set. To further strengthen our results, we study the Multi-head Split Learning with various client-side model portions and its impact on the overall performance. To this end, our results find a minimal impact on the overall performance of the model.
△ Less
Submitted 19 September, 2021;
originally announced September 2021.
-
FedDICE: A ransomware spread detection in a distributed integrated clinical environment using federated learning and SDN based mitigation
Authors:
Chandra Thapa,
Kallol Krishna Karmakar,
Alberto Huertas Celdran,
Seyit Camtepe,
Vijay Varadharajan,
Surya Nepal
Abstract:
An integrated clinical environment (ICE) enables the connection and coordination of the internet of medical things around the care of patients in hospitals. However, ransomware attacks and their spread on hospital infrastructures, including ICE, are rising. Often the adversaries are targeting multiple hospitals with the same ransomware attacks. These attacks are detected by using machine learning…
▽ More
An integrated clinical environment (ICE) enables the connection and coordination of the internet of medical things around the care of patients in hospitals. However, ransomware attacks and their spread on hospital infrastructures, including ICE, are rising. Often the adversaries are targeting multiple hospitals with the same ransomware attacks. These attacks are detected by using machine learning algorithms. But the challenge is devising the anti-ransomware learning mechanisms and services under the following conditions: (1) provide immunity to other hospitals if one of them got the attack, (2) hospitals are usually distributed over geographical locations, and (3) direct data sharing is avoided due to privacy concerns. In this regard, this paper presents a federated distributed integrated clinical environment, aka. FedDICE. FedDICE integrates federated learning (FL), which is privacy-preserving learning, to SDN-oriented security architecture to enable collaborative learning, detection, and mitigation of ransomware attacks. We demonstrate the importance of FedDICE in a collaborative environment with up to four hospitals and four popular ransomware families, namely WannaCry, Petya, BadRabbit, and PowerGhost. Our results find that in both IID and non-IID data setups, FedDICE achieves the centralized baseline performance that needs direct data sharing for detection. However, as a trade-off to data privacy, FedDICE observes overhead in the anti-ransomware model training, e.g., 28x for the logistic regression model. Besides, FedDICE utilizes SDN's dynamic network programmability feature to remove the infected devices in ICE.
△ Less
Submitted 9 June, 2021;
originally announced June 2021.
-
Evaluation and Optimization of Distributed Machine Learning Techniques for Internet of Things
Authors:
Yansong Gao,
Minki Kim,
Chandra Thapa,
Sharif Abuadbba,
Zhi Zhang,
Seyit A. Camtepe,
Hyoungshick Kim,
Surya Nepal
Abstract:
Federated learning (FL) and split learning (SL) are state-of-the-art distributed machine learning techniques to enable machine learning training without accessing raw data on clients or end devices. However, their \emph{comparative training performance} under real-world resource-restricted Internet of Things (IoT) device settings, e.g., Raspberry Pi, remains barely studied, which, to our knowledge…
▽ More
Federated learning (FL) and split learning (SL) are state-of-the-art distributed machine learning techniques to enable machine learning training without accessing raw data on clients or end devices. However, their \emph{comparative training performance} under real-world resource-restricted Internet of Things (IoT) device settings, e.g., Raspberry Pi, remains barely studied, which, to our knowledge, have not yet been evaluated and compared, rendering inconvenient reference for practitioners. This work firstly provides empirical comparisons of FL and SL in real-world IoT settings regarding (i) learning performance with heterogeneous data distributions and (ii) on-device execution overhead. Our analyses in this work demonstrate that the learning performance of SL is better than FL under an imbalanced data distribution but worse than FL under an extreme non-IID data distribution. Recently, FL and SL are combined to form splitfed learning (SFL) to leverage each of their benefits (e.g., parallel training of FL and lightweight on-device computation requirement of SL). This work then considers FL, SL, and SFL, and mount them on Raspberry Pi devices to evaluate their performance, including training time, communication overhead, power consumption, and memory usage. Besides evaluations, we apply two optimizations. Firstly, we generalize SFL by carefully examining the possibility of a hybrid type of model training at the server-side. The generalized SFL merges sequential (dependent) and parallel (independent) processes of model training and is thus beneficial for a system with large-scaled IoT devices, specifically at the server-side operations. Secondly, we propose pragmatic techniques to substantially reduce the communication overhead by up to four times for the SL and (generalized) SFL.
△ Less
Submitted 3 March, 2021;
originally announced March 2021.
-
Advancements of federated learning towards privacy preservation: from federated learning to split learning
Authors:
Chandra Thapa,
M. A. P. Chamikara,
Seyit A. Camtepe
Abstract:
In the distributed collaborative machine learning (DCML) paradigm, federated learning (FL) recently attracted much attention due to its applications in health, finance, and the latest innovations such as industry 4.0 and smart vehicles. FL provides privacy-by-design. It trains a machine learning model collaboratively over several distributed clients (ranging from two to millions) such as mobile ph…
▽ More
In the distributed collaborative machine learning (DCML) paradigm, federated learning (FL) recently attracted much attention due to its applications in health, finance, and the latest innovations such as industry 4.0 and smart vehicles. FL provides privacy-by-design. It trains a machine learning model collaboratively over several distributed clients (ranging from two to millions) such as mobile phones, without sharing their raw data with any other participant. In practical scenarios, all clients do not have sufficient computing resources (e.g., Internet of Things), the machine learning model has millions of parameters, and its privacy between the server and the clients while training/testing is a prime concern (e.g., rival parties). In this regard, FL is not sufficient, so split learning (SL) is introduced. SL is reliable in these scenarios as it splits a model into multiple portions, distributes them among clients and server, and trains/tests their respective model portions to accomplish the full model training/testing. In SL, the participants do not share both data and their model portions to any other parties, and usually, a smaller network portion is assigned to the clients where data resides. Recently, a hybrid of FL and SL, called splitfed learning, is introduced to elevate the benefits of both FL (faster training/testing time) and SL (model split and training). Following the developments from FL to SL, and considering the importance of SL, this chapter is designed to provide extensive coverage in SL and its variants. The coverage includes fundamentals, existing findings, integration with privacy measures such as differential privacy, open problems, and code implementation.
△ Less
Submitted 25 November, 2020;
originally announced November 2020.
-
Precision Health Data: Requirements, Challenges and Existing Techniques for Data Security and Privacy
Authors:
Chandra Thapa,
Seyit Camtepe
Abstract:
Precision health leverages information from various sources, including omics, lifestyle, environment, social media, medical records, and medical insurance claims to enable personalized care, prevent and predict illness, and precise treatments. It extensively uses sensing technologies (e.g., electronic health monitoring devices), computations (e.g., machine learning), and communication (e.g., inter…
▽ More
Precision health leverages information from various sources, including omics, lifestyle, environment, social media, medical records, and medical insurance claims to enable personalized care, prevent and predict illness, and precise treatments. It extensively uses sensing technologies (e.g., electronic health monitoring devices), computations (e.g., machine learning), and communication (e.g., interaction between the health data centers). As health data contain sensitive private information, including the identity of patient and carer and medical conditions of the patient, proper care is required at all times. Leakage of these private information affects the personal life, including bullying, high insurance premium, and loss of job due to the medical history. Thus, the security, privacy of and trust on the information are of utmost importance. Moreover, government legislation and ethics committees demand the security and privacy of healthcare data. Herein, in the light of precision health data security, privacy, ethical and regulatory requirements, finding the best methods and techniques for the utilization of the health data, and thus precision health is essential. In this regard, firstly, this paper explores the regulations, ethical guidelines around the world, and domain-specific needs. Then it presents the requirements and investigates the associated challenges. Secondly, this paper investigates secure and privacy-preserving machine learning methods suitable for the computation of precision health data along with their usage in relevant health projects. Finally, it illustrates the best available techniques for precision health data security and privacy with a conceptual system model that enables compliance, ethics clearance, consent management, medical innovations, and developments in the health domain.
△ Less
Submitted 24 August, 2020;
originally announced August 2020.
-
Evaluation of Federated Learning in Phishing Email Detection
Authors:
Chandra Thapa,
Jun Wen Tang,
Alsharif Abuadbba,
Yansong Gao,
Seyit Camtepe,
Surya Nepal,
Mahathir Almashor,
Yifeng Zheng
Abstract:
The use of Artificial Intelligence (AI) to detect phishing emails is primarily dependent on large-scale centralized datasets, which opens it up to a myriad of privacy, trust, and legal issues. Moreover, organizations are loathed to share emails, given the risk of leakage of commercially sensitive information. So, it is uncommon to obtain sufficient emails to train a global AI model efficiently. Ac…
▽ More
The use of Artificial Intelligence (AI) to detect phishing emails is primarily dependent on large-scale centralized datasets, which opens it up to a myriad of privacy, trust, and legal issues. Moreover, organizations are loathed to share emails, given the risk of leakage of commercially sensitive information. So, it is uncommon to obtain sufficient emails to train a global AI model efficiently. Accordingly, privacy-preserving distributed and collaborative machine learning, particularly Federated Learning (FL), is a desideratum. Already prevalent in the healthcare sector, questions remain regarding the effectiveness and efficacy of FL-based phishing detection within the context of multi-organization collaborations. To the best of our knowledge, the work herein is the first to investigate the use of FL in email anti-phishing. This paper builds upon a deep neural network model, particularly RNN and BERT for phishing email detection. It analyzes the FL-entangled learning performance under various settings, including balanced and asymmetrical data distribution. Our results corroborate comparable performance statistics of FL in phishing email detection to centralized learning for balanced datasets, and low organization counts. Moreover, we observe a variation in performance when increasing organizational counts. For a fixed total email dataset, the global RNN based model suffers by a 1.8% accuracy drop when increasing organizational counts from 2 to 10. In contrast, BERT accuracy rises by 0.6% when going from 2 to 5 organizations. However, if we allow increasing the overall email dataset with the introduction of new organizations in the FL framework, the organizational level performance is improved by achieving a faster convergence speed. Besides, FL suffers in its overall global model performance due to highly unstable outputs if the email dataset distribution is highly asymmetric.
△ Less
Submitted 21 May, 2021; v1 submitted 26 July, 2020;
originally announced July 2020.
-
SplitFed: When Federated Learning Meets Split Learning
Authors:
Chandra Thapa,
M. A. P. Chamikara,
Seyit Camtepe,
Lichao Sun
Abstract:
Federated learning (FL) and split learning (SL) are two popular distributed machine learning approaches. Both follow a model-to-data scenario; clients train and test machine learning models without sharing raw data. SL provides better model privacy than FL due to the machine learning model architecture split between clients and the server. Moreover, the split model makes SL a better option for res…
▽ More
Federated learning (FL) and split learning (SL) are two popular distributed machine learning approaches. Both follow a model-to-data scenario; clients train and test machine learning models without sharing raw data. SL provides better model privacy than FL due to the machine learning model architecture split between clients and the server. Moreover, the split model makes SL a better option for resource-constrained environments. However, SL performs slower than FL due to the relay-based training across multiple clients. In this regard, this paper presents a novel approach, named splitfed learning (SFL), that amalgamates the two approaches eliminating their inherent drawbacks, along with a refined architectural configuration incorporating differential privacy and PixelDP to enhance data privacy and model robustness. Our analysis and empirical results demonstrate that (pure) SFL provides similar test accuracy and communication efficiency as SL while significantly decreasing its computation time per global epoch than in SL for multiple clients. Furthermore, as in SL, its communication efficiency over FL improves with the number of clients. Besides, the performance of SFL with privacy and robustness measures is further evaluated under extended experimental settings.
△ Less
Submitted 16 February, 2022; v1 submitted 25 April, 2020;
originally announced April 2020.
-
End-to-End Evaluation of Federated Learning and Split Learning for Internet of Things
Authors:
Yansong Gao,
Minki Kim,
Sharif Abuadbba,
Yeonjae Kim,
Chandra Thapa,
Kyuyeon Kim,
Seyit A. Camtepe,
Hyoungshick Kim,
Surya Nepal
Abstract:
This work is the first attempt to evaluate and compare felderated learning (FL) and split neural networks (SplitNN) in real-world IoT settings in terms of learning performance and device implementation overhead. We consider a variety of datasets, different model architectures, multiple clients, and various performance metrics. For learning performance, which is specified by the model accuracy and…
▽ More
This work is the first attempt to evaluate and compare felderated learning (FL) and split neural networks (SplitNN) in real-world IoT settings in terms of learning performance and device implementation overhead. We consider a variety of datasets, different model architectures, multiple clients, and various performance metrics. For learning performance, which is specified by the model accuracy and convergence speed metrics, we empirically evaluate both FL and SplitNN under different types of data distributions such as imbalanced and non-independent and identically distributed (non-IID) data. We show that the learning performance of SplitNN is better than FL under an imbalanced data distribution, but worse than FL under an extreme non-IID data distribution. For implementation overhead, we end-to-end mount both FL and SplitNN on Raspberry Pis, and comprehensively evaluate overheads including training time, communication overhead under the real LAN setting, power consumption and memory usage. Our key observations are that under IoT scenario where the communication traffic is the main concern, the FL appears to perform better over SplitNN because FL has the significantly lower communication overhead compared with SplitNN, which empirically corroborate previous statistical analysis. In addition, we reveal several unrecognized limitations about SplitNN, forming the basis for future research.
△ Less
Submitted 2 August, 2020; v1 submitted 30 March, 2020;
originally announced March 2020.
-
Can We Use Split Learning on 1D CNN Models for Privacy Preserving Training?
Authors:
Sharif Abuadbba,
Kyuyeon Kim,
Minki Kim,
Chandra Thapa,
Seyit A. Camtepe,
Yansong Gao,
Hyoungshick Kim,
Surya Nepal
Abstract:
A new collaborative learning, called split learning, was recently introduced, aiming to protect user data privacy without revealing raw input data to a server. It collaboratively runs a deep neural network model where the model is split into two parts, one for the client and the other for the server. Therefore, the server has no direct access to raw data processed at the client. Until now, the spl…
▽ More
A new collaborative learning, called split learning, was recently introduced, aiming to protect user data privacy without revealing raw input data to a server. It collaboratively runs a deep neural network model where the model is split into two parts, one for the client and the other for the server. Therefore, the server has no direct access to raw data processed at the client. Until now, the split learning is believed to be a promising approach to protect the client's raw data; for example, the client's data was protected in healthcare image applications using 2D convolutional neural network (CNN) models. However, it is still unclear whether the split learning can be applied to other deep learning models, in particular, 1D CNN.
In this paper, we examine whether split learning can be used to perform privacy-preserving training for 1D CNN models. To answer this, we first design and implement an 1D CNN model under split learning and validate its efficacy in detecting heart abnormalities using medical ECG data. We observed that the 1D CNN model under split learning can achieve the same accuracy of 98.9\% like the original (non-split) model. However, our evaluation demonstrates that split learning may fail to protect the raw data privacy on 1D CNN models. To address the observed privacy leakage in split learning, we adopt two privacy leakage mitigation techniques: 1) adding more hidden layers to the client side and 2) applying differential privacy. Although those mitigation techniques are helpful in reducing privacy leakage, they have a significant impact on model accuracy. Hence, based on those results, we conclude that split learning alone would not be sufficient to maintain the confidentiality of raw sequential data in 1D CNN models.
△ Less
Submitted 16 March, 2020;
originally announced March 2020.
-
Structural Characteristics of Two-Sender Index Coding
Authors:
Chandra Thapa,
Lawrence Ong,
Sarah J. Johnson,
Min Li
Abstract:
This paper studies index coding with two senders. In this setup, source messages are distributed among the senders possibly with common messages. In addition, there are multiple receivers, with each receiver having some messages a priori, known as side-information, and requesting one unique message such that each message is requested by only one receiver. Index coding in this setup is called two-s…
▽ More
This paper studies index coding with two senders. In this setup, source messages are distributed among the senders possibly with common messages. In addition, there are multiple receivers, with each receiver having some messages a priori, known as side-information, and requesting one unique message such that each message is requested by only one receiver. Index coding in this setup is called two-sender unicast index coding (TSUIC). The main goal is to find the shortest aggregate normalized codelength, which is expressed as the optimal broadcast rate. In this work, firstly, for a given TSUIC problem, we form three independent sub-problems each consisting of the only subset of the messages, based on whether the messages are available only in one of the senders or in both senders. Then we express the optimal broadcast rate of the TSUIC problem as a function of the optimal broadcast rates of those independent sub-problems. In this way, we discover the structural characteristics of TSUIC. For the proofs of our results, we utilize confusion graphs and coding techniques used in single-sender index coding. To adapt the confusion graph technique in TSUIC, we introduce a new graph-coloring approach that is different from the normal graph coloring, which we call two-sender graph coloring, and propose a way of grouping the vertices to analyze the number of colors used. We further determine a class of TSUIC instances where a certain type of side-information can be removed without affecting their optimal broadcast rates. Finally, we generalize the results of a class of TSUIC problems to multiple senders.
△ Less
Submitted 19 June, 2019; v1 submitted 22 November, 2017;
originally announced November 2017.
-
Graph-Theoretic Approaches to Two-Sender Index Coding
Authors:
Chandra Thapa,
Lawrence Ong,
Sarah J. Johnson
Abstract:
Consider a communication scenario over a noiseless channel where a sender is required to broadcast messages to multiple receivers, each having side information about some messages. In this scenario, the sender can leverage the receivers' side information during the encoding of messages in order to reduce the required transmissions. This type of encoding is called index coding. In this paper, we st…
▽ More
Consider a communication scenario over a noiseless channel where a sender is required to broadcast messages to multiple receivers, each having side information about some messages. In this scenario, the sender can leverage the receivers' side information during the encoding of messages in order to reduce the required transmissions. This type of encoding is called index coding. In this paper, we study index coding with two cooperative senders, each with some subset of messages, and multiple receivers, each requesting one unique message. The index coding in this setup is called two-sender unicast index coding (TSUIC). The main aim of TSUIC is to minimize the total number of transmissions required by the two senders. Based on graph-theoretic approaches, we prove that TSUIC is equivalent to single-sender unicast index coding (SSUIC) for some special cases. Moreover, we extend the existing schemes for SSUIC, viz., the cycle-cover scheme, the clique-cover scheme, and the local-chromatic scheme to the corresponding schemes for TSUIC.
△ Less
Submitted 28 September, 2016;
originally announced September 2016.
-
Interlinked Cycles for Index Coding: Generalizing Cycles and Cliques
Authors:
Chandra Thapa,
Lawrence Ong,
Sarah J. Johnson
Abstract:
We consider a graphical approach to index coding. While cycles have been shown to provide coding gain, only disjoint cycles and cliques (a specific type of overlapping cycles) have been exploited in existing literature. In this paper, we define a more general form of overlapping cycles, called the interlinked-cycle (IC) structure, that generalizes cycles and cliques. We propose a scheme, called th…
▽ More
We consider a graphical approach to index coding. While cycles have been shown to provide coding gain, only disjoint cycles and cliques (a specific type of overlapping cycles) have been exploited in existing literature. In this paper, we define a more general form of overlapping cycles, called the interlinked-cycle (IC) structure, that generalizes cycles and cliques. We propose a scheme, called the interlinked-cycle-cover (ICC) scheme, that leverages IC structures in digraphs to construct scalar linear index codes. We characterize a class of infinitely many digraphs where our proposed scheme is optimal over all linear and non-linear index codes. Consequently, for this class of digraphs, we indirectly prove that scalar linear index codes are optimal. Furthermore, we show that the ICC scheme can outperform all existing graph-based schemes (including partial-clique-cover and fractional-local-chromatic number schemes), and a random-coding scheme (namely, composite coding) for certain graphs.
△ Less
Submitted 24 February, 2018; v1 submitted 29 February, 2016;
originally announced March 2016.
-
Generalized Interlinked Cycle Cover for Index Coding
Authors:
Chandra Thapa,
Lawrence Ong,
Sarah J. Johnson
Abstract:
A source coding problem over a noiseless broadcast channel where the source is pre-informed about the contents of the cache of all receivers, is an index coding problem. Furthermore, if each message is requested by one receiver, then we call this an index coding problem with a unicast message setting. This problem can be represented by a directed graph. In this paper, we first define a structure (…
▽ More
A source coding problem over a noiseless broadcast channel where the source is pre-informed about the contents of the cache of all receivers, is an index coding problem. Furthermore, if each message is requested by one receiver, then we call this an index coding problem with a unicast message setting. This problem can be represented by a directed graph. In this paper, we first define a structure (we call generalized interlinked cycles (GIC)) in directed graphs. A GIC consists of cycles which are interlinked in some manner (i.e., not disjoint), and it turns out that the GIC is a generalization of cliques and cycles. We then propose a simple scalar linear encoding scheme with linear time encoding complexity. This scheme exploits GICs in the digraph. We prove that our scheme is optimal for a class of digraphs with message packets of any length. Moreover, we show that our scheme can outperform existing techniques, e.g., partial clique cover, local chromatic number, composite-coding, and interlinked cycle cover.
△ Less
Submitted 20 July, 2015; v1 submitted 19 April, 2015;
originally announced April 2015.
-
A New Index Coding Scheme Exploiting Interlinked Cycles
Authors:
Chandra Thapa,
Lawrence Ong,
Sarah J. Johnson
Abstract:
We study the index coding problem in the unicast message setting, i.e., where each message is requested by one unique receiver. This problem can be modeled by a directed graph. We propose a new scheme called interlinked cycle cover, which exploits interlinked cycles in the directed graph, for designing index codes. This new scheme generalizes the existing clique cover and cycle cover schemes. We p…
▽ More
We study the index coding problem in the unicast message setting, i.e., where each message is requested by one unique receiver. This problem can be modeled by a directed graph. We propose a new scheme called interlinked cycle cover, which exploits interlinked cycles in the directed graph, for designing index codes. This new scheme generalizes the existing clique cover and cycle cover schemes. We prove that for a class of infinitely many digraphs with messages of any length, interlinked cycle cover provides an optimal index code. Furthermore, the index code is linear with linear time encoding complexity.
△ Less
Submitted 8 April, 2015;
originally announced April 2015.