-
A Decentralized and Self-Adaptive Approach for Monitoring Volatile Edge Environments
Authors:
Shashikant Ilager,
Jakob Fahringer,
Alessandro Tundo,
Ivona Brandić
Abstract:
Edge computing provides resources for IoT workloads at the network edge. Monitoring systems are vital for efficiently managing resources and application workloads by collecting, storing, and providing relevant information about the state of the resources. However, traditional monitoring systems have a centralized architecture for both data plane and control plane, which increases latency, creates a failure bottleneck, and faces challenges in providing quick and trustworthy data in volatile edge environments, especially where infrastructures are often built upon failure-prone, unsophisticated computing and network resources. Thus, we propose DEMon, a decentralized, self-adaptive monitoring system for edge. DEMon leverages the stochastic gossip communication protocol at its core. It develops efficient protocols for information dissemination, communication, and retrieval, avoiding a single point of failure and ensuring fast and trustworthy data access. Its decentralized control enables self-adaptive management of monitoring parameters, addressing the trade-offs between the quality of service of monitoring and resource consumption. We implement the proposed system as a lightweight and portable container-based system and evaluate it through experiments. We also present a use case demonstrating its feasibility. The results show that DEMon efficiently disseminates and retrieves the monitoring information, addressing the challenges of edge monitoring.
Submitted 13 May, 2024;
originally announced May 2024.
-
Paving the Way to Hybrid Quantum-Classical Scientific Workflows
Authors:
Sandeep Suresh Cranganore,
Vincenzo De Maio,
Ivona Brandic,
Ewa Deelman
Abstract:
The increasing growth of data volume, and the consequent explosion in demand for computational power, are affecting scientific computing, as shown by the rise of extreme data scientific workflows. As the need for computing power increases, quantum computing has been proposed as a way to deliver it. It may provide significant theoretical speedups for many scientific applications (e.g., molecular dynamics, quantum chemistry, combinatorial optimization, and machine learning). Therefore, integrating quantum computers into the computing continuum constitutes a promising way to speed up scientific computation. However, the scientific computing community still lacks the necessary tools and expertise to fully harness the power of quantum computers in the execution of complex applications such as scientific workflows. In this work, we describe the main characteristics of quantum computing and its main benefits for scientific applications, then we formalize hybrid quantum-classic workflows and explore how to identify quantum components and map them onto resources. We demonstrate concepts on a real use case and define a software architecture for a hybrid workflow management system.
Submitted 16 April, 2024;
originally announced April 2024.
-
On Optimizing Hyperparameters for Quantum Neural Networks
Authors:
Sabrina Herbst,
Vincenzo De Maio,
Ivona Brandic
Abstract:
The increasing capabilities of Machine Learning (ML) models go hand in hand with an immense amount of data and computational power required for training. Therefore, training is usually outsourced into HPC facilities, where we have started to experience limits in scaling conventional HPC hardware, as theorized by Moore's law. Despite heavy parallelization and optimization efforts, current state-of-the-art ML models require weeks for training, which is associated with an enormous $CO_2$ footprint. Quantum Computing, and specifically Quantum Machine Learning (QML), can offer significant theoretical speed-ups and enhanced expressive power. However, training QML models requires tuning various hyperparameters, which is a nontrivial task and suboptimal choices can highly affect the trainability and performance of the models. In this study, we identify the most impactful hyperparameters and collect data about the performance of QML models. We compare different configurations and provide researchers with performance data and concrete suggestions for hyperparameter selection.
Submitted 27 March, 2024;
originally announced March 2024.
-
FLIGAN: Enhancing Federated Learning with Incomplete Data using GAN
Authors:
Paul Joe Maliakel,
Shashikant Ilager,
Ivona Brandic
Abstract:
Federated Learning (FL) provides a privacy-preserving mechanism for distributed training of machine learning models on networked devices (e.g., mobile devices, IoT edge nodes). It enables Artificial Intelligence (AI) at the edge by creating models without sharing actual data across the network. Existing research typically focuses on generic aspects of non-IID data and heterogeneity in clients' system characteristics, but it often neglects the issue of insufficient data for model development, which can arise from uneven class label distribution and highly variable data volumes across edge nodes. In this work, we propose FLIGAN, a novel approach to address the issue of data incompleteness in FL. First, we leverage Generative Adversarial Networks (GANs) to adeptly capture complex data distributions and generate synthetic data that closely resemble real-world data. Then, we use synthetic data to enhance the robustness and completeness of datasets across nodes. Our methodology adheres to FL's privacy requirements by generating synthetic data in a federated manner without sharing the actual data in the process. We incorporate techniques such as classwise sampling and node grouping, designed to improve the federated GAN's performance, enabling the creation of high-quality synthetic datasets and facilitating efficient FL training. Empirical results from our experiments demonstrate that FLIGAN significantly improves model accuracy, especially in scenarios with high class imbalances, achieving up to a 20% increase in model accuracy over traditional FL baselines.
Submitted 2 April, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Training Computer Scientists for the Challenges of Hybrid Quantum-Classical Computing
Authors:
Vincenzo De Maio,
Meerzhan Kanatbekova,
Felix Zilk,
Nicolai Friis,
Tobias Guggemos,
Ivona Brandic
Abstract:
As we enter the post-Moore era, we experience the rise of various non-von-Neumann-architectures to address the increasing computational demand for modern applications, with quantum computing being among the most prominent and promising technologies. However, this development creates a gap in current computer science curricula since most quantum computing lectures are strongly physics-oriented and have little intersection with the remaining curriculum of computer science. This fact makes designing an appealing course very difficult, in particular for non-physicists. Furthermore, in the academic community, there is consensus that quantum computers are going to be used only for specific computational tasks (e.g., in computational science), where hybrid systems - combined classical and quantum computers - facilitate the execution of an application on both quantum and classical computing resources. A hybrid system thus executes only certain suitable parts of an application on the quantum machine, while other parts are executed on the classical components of the system. To fully exploit the capabilities of hybrid systems and to meet future requirements in this emerging field, we need to prepare a new generation of computer scientists with skills in both distributed computing and quantum computing. To bridge this existing gap in standard computer science curricula, we designed a new lecture and exercise series on Hybrid Quantum-Classical Systems, where students learn how to decompose applications and implement computational tasks on a hybrid quantum-classical computational continuum. While learning the inherent concepts underlying quantum systems, students are obligated to apply techniques and methods they are already familiar with, making the entrance to the field of quantum computing comprehensive yet appealing and accessible to students of computer science.
Submitted 1 March, 2024;
originally announced March 2024.
-
Streaming IoT Data and the Quantum Edge: A Classic/Quantum Machine Learning Use Case
Authors:
Sabrina Herbst,
Vincenzo De Maio,
Ivona Brandic
Abstract:
With the advent of the Post-Moore era, the scientific community is faced with the challenge of addressing the demands of current data-intensive machine learning applications, which are the cornerstone of urgent analytics in distributed computing. Quantum machine learning could be a solution for the increasing demand of urgent analytics, providing potential theoretical speedups and increased space efficiency. However, challenges such as (1) the encoding of data from the classical to the quantum domain, (2) hyperparameter tuning, and (3) the integration of quantum hardware into a distributed computing continuum limit the adoption of quantum machine learning for urgent analytics. In this work, we investigate the use of Edge computing for the integration of quantum machine learning into a distributed computing continuum, identifying the main challenges and possible solutions. Furthermore, exploring the data encoding and hyperparameter tuning challenges, we present preliminary results for quantum machine learning analytics on an IoT scenario.
Submitted 23 February, 2024;
originally announced February 2024.
-
SymED: Adaptive and Online Symbolic Representation of Data on the Edge
Authors:
Daniel Hofstätter,
Shashikant Ilager,
Ivan Lujic,
Ivona Brandic
Abstract:
The edge computing paradigm helps handle the Internet of Things (IoT) generated data in proximity to its source. Challenges occur in transferring, storing, and processing this rapidly growing amount of data on resource-constrained edge devices. Symbolic Representation (SR) algorithms are promising solutions to reduce the data size by converting actual raw data into symbols. Also, they allow data analytics (e.g., anomaly detection and trend prediction) directly on symbols, benefiting large classes of edge applications. However, existing SR algorithms are centralized in design and work offline with batch data, which is infeasible for real-time cases. We propose SymED - Symbolic Edge Data representation method, i.e., an online, adaptive, and distributed approach for symbolic representation of data on edge. SymED is based on the Adaptive Brownian Bridge-based Aggregation (ABBA), where we assume low-powered IoT devices do initial data compression (senders) and the more robust edge devices do the symbolic conversion (receivers). We evaluate SymED by measuring compression performance, reconstruction accuracy through Dynamic Time Warping (DTW) distance, and computational latency. The results show that SymED is able to (i) reduce the raw data with an average compression rate of 9.5%; (ii) keep a low reconstruction error of 13.25 in the DTW space; (iii) simultaneously provide real-time adaptability for online streaming IoT data at typical latencies of 42ms per symbol, reducing the overall network traffic.
Submitted 6 September, 2023;
originally announced September 2023.
-
An Energy-Aware Approach to Design Self-Adaptive AI-based Applications on the Edge
Authors:
Alessandro Tundo,
Marco Mobilio,
Shashikant Ilager,
Ivona Brandić,
Ezio Bartocci,
Leonardo Mariani
Abstract:
The advent of edge devices dedicated to machine learning tasks enabled the execution of AI-based applications that efficiently process and classify the data acquired by the resource-constrained devices populating the Internet of Things. The proliferation of such applications (e.g., critical monitoring in smart cities) demands new strategies to make these systems also sustainable from an energetic point of view.
In this paper, we present an energy-aware approach for the design and deployment of self-adaptive AI-based applications that can balance application objectives (e.g., accuracy in object detection and frames processing rate) with energy consumption. We address the problem of determining the set of configurations that can be used to self-adapt the system with a meta-heuristic search procedure that only needs a small number of empirical samples. The final set of configurations is selected using weighted gray relational analysis, and mapped to the operation modes of the self-adaptive application.
We validate our approach on an AI-based application for pedestrian detection. Results show that our self-adaptive application can outperform non-adaptive baseline configurations by saving up to 81% of energy while losing only between 2% and 6% in accuracy.
Submitted 31 August, 2023;
originally announced September 2023.
-
AI for Next Generation Computing: Emerging Trends and Future Directions
Authors:
Sukhpal Singh Gill,
Minxian Xu,
Carlo Ottaviani,
Panos Patros,
Rami Bahsoon,
Arash Shaghaghi,
Muhammed Golec,
Vlado Stankovski,
Huaming Wu,
Ajith Abraham,
Manmeet Singh,
Harshit Mehta,
Soumya K. Ghosh,
Thar Baker,
Ajith Kumar Parlikad,
Hanan Lutfiyya,
Salil S. Kanhere,
Rizos Sakellariou,
Schahram Dustdar,
Omer Rana,
Ivona Brandic,
Steve Uhlig
Abstract:
Autonomic computing investigates how systems can achieve (user) specified control outcomes on their own, without the intervention of a human operator. Autonomic computing fundamentals have been substantially influenced by those of control theory for closed and open-loop systems. In practice, complex systems may exhibit a number of concurrent and inter-dependent control loops. Despite research into autonomic models for managing computer resources, ranging from individual resources (e.g., web servers) to a resource ensemble (e.g., multiple resources within a data center), research into integrating Artificial Intelligence (AI) and Machine Learning (ML) to improve resource autonomy and performance at scale continues to be a fundamental challenge. The integration of AI/ML to achieve such autonomic and self-management of systems can be achieved at different levels of granularity, from full to human-in-the-loop automation. In this article, leading academics, researchers, practitioners, engineers, and scientists in the fields of cloud computing, AI/ML, and quantum computing join to discuss current research and potential future directions for these fields. Further, we discuss challenges and opportunities for leveraging AI and ML in next generation computing for emerging computing paradigms, including cloud, fog, edge, serverless and quantum computing environments.
Submitted 5 March, 2022;
originally announced March 2022.
-
Multi-agent Bayesian Deep Reinforcement Learning for Microgrid Energy Management under Communication Failures
Authors:
Hao Zhou,
Atakan Aral,
Ivona Brandic,
Melike Erol-Kantarci
Abstract:
Microgrids (MGs) are important players for the future transactive energy systems where a number of intelligent Internet of Things (IoT) devices interact for energy management in the smart grid. Although there have been many works on MG energy management, most studies assume a perfect communication environment, where communication failures are not considered. In this paper, we consider the MG as a multi-agent environment with IoT devices in which AI agents exchange information with their peers for collaboration. However, the collaboration information may be lost due to communication failures or packet loss. Such events may affect the operation of the whole MG. To this end, we propose a multi-agent Bayesian deep reinforcement learning (BA-DRL) method for MG energy management under communication failures. We first define a multi-agent partially observable Markov decision process (MA-POMDP) to describe agents under communication failures, in which each agent can update its beliefs on the actions of its peers. Then, we apply a double deep Q-learning (DDQN) architecture for Q-value estimation in BA-DRL, and propose a belief-based correlated equilibrium for the joint-action selection of multi-agent BA-DRL. Finally, the simulation results show that BA-DRL is robust to both power supply uncertainty and communication failure uncertainty. BA-DRL has 4.1% and 10.3% higher reward than Nash Deep Q-learning (Nash-DQN) and alternating direction method of multipliers (ADMM) respectively under 1% communication failure probability.
Submitted 21 November, 2021;
originally announced November 2021.
-
On the Future of Cloud Engineering
Authors:
David Bermbach,
Abhishek Chandra,
Chandra Krintz,
Aniruddha Gokhale,
Aleksander Slominski,
Lauritz Thamsen,
Everton Cavalcante,
Tian Guo,
Ivona Brandic,
Rich Wolski
Abstract:
Ever since the commercial offerings of the Cloud started appearing in 2006, the landscape of cloud computing has been undergoing remarkable changes with the emergence of many different types of service offerings, developer productivity enhancement tools, and new application classes as well as the manifestation of cloud functionality closer to the user at the edge. The notion of utility computing, however, has remained constant throughout its evolution, which means that cloud users always seek to save costs of leasing cloud resources while maximizing their use. On the other hand, cloud providers try to maximize their profits while assuring service-level objectives of the cloud-hosted applications and keeping operational costs low. All these outcomes require systematic and sound cloud engineering principles. The aim of this paper is to highlight the importance of cloud engineering, survey the landscape of best practices in cloud engineering and its evolution, discuss many of the existing cloud engineering advances, and identify both the inherent technical challenges and research opportunities for the future of cloud computing in general and cloud engineering in particular.
Submitted 19 August, 2021;
originally announced August 2021.
-
Towards an Integrated Platform for Big Data Analysis
Authors:
Mahdi Bohlouli,
Frank Schulz,
Lefteris Angelis,
David Pahor,
Ivona Brandic,
David Atlan,
Rosemary Tate
Abstract:
The amount of data in the world is expanding rapidly. Every day, huge amounts of data are created by scientific experiments, companies, and end users' activities. These large data sets have been labeled as "Big Data", and their storage, processing and analysis presents a plethora of new challenges to computer science researchers and IT professionals. In addition to efficient data management, additional complexity arises from dealing with semi-structured or unstructured data, and from time critical processing requirements. In order to understand these massive amounts of data, advanced visualization and data exploration techniques are required. Innovative approaches to these challenges have been developed during recent years, and continue to be a hot topic for research and industry in the future. An investigation of current approaches reveals that usually only one or two aspects are addressed, either in the data management, processing, analysis or visualization. This paper presents the vision of an integrated platform for big data analysis that combines all these aspects. Main benefits of this approach are an enhanced scalability of the whole platform, a better parameterization of algorithms, a more efficient usage of system resources, and an improved usability during the end-to-end data analysis process.
Submitted 26 April, 2020;
originally announced April 2020.
-
Performance-Based Pricing in Multi-Core Geo-Distributed Cloud Computing
Authors:
Dražen Lučanin,
Ilia Pietri,
Simon Holmbacka,
Ivona Brandic,
Johan Lilius,
Rizos Sakellariou
Abstract:
New pricing policies are emerging where cloud providers charge resource provisioning based on the allocated CPU frequencies. As a result, resources are offered to users as combinations of different performance levels and prices which can be configured at runtime. With such new pricing schemes and the increasing energy costs in data centres, balancing energy savings with performance and revenue losses is a challenging problem for cloud providers. CPU frequency scaling can be used to reduce power dissipation, but also impacts VM performance and therefore revenue. In this paper, we firstly propose a non-linear power model that estimates power dissipation of a multi-core PM and secondly a pricing model that adjusts the pricing based on the VM's CPU-boundedness characteristics. Finally, we present a cloud controller that uses these models to allocate VMs and scale CPU frequencies of the PMs to achieve energy cost savings that exceed service revenue losses. We evaluate the proposed approach using simulations with realistic VM workloads, electricity price and temperature traces and estimate energy savings of up to 14.57%.
Submitted 16 September, 2018;
originally announced September 2018.
-
A Cloud Controller for Performance-Based Pricing
Authors:
Dražen Lučanin,
Ilia Pietri,
Ivona Brandic,
Rizos Sakellariou
Abstract:
New dynamic cloud pricing options are emerging with cloud providers offering resources as a wide range of CPU frequencies and matching prices that can be switched at runtime. On the other hand, cloud providers are facing the problem of growing operational energy costs. This raises a trade-off problem between energy savings and revenue loss when performing actions such as CPU frequency scaling. Although existing cloud controllers for managing cloud resources deploy frequency scaling, they only consider fixed virtual machine (VM) pricing. In this paper we propose a performance-based pricing model adapted for VMs with different CPU-boundedness properties. We present a cloud controller that scales CPU frequencies to achieve energy cost savings that exceed service revenue losses. We evaluate the approach in a simulation based on real VM workload, electricity price and temperature traces, estimating energy cost savings up to 32% in certain scenarios.
Submitted 16 September, 2018;
originally announced September 2018.
-
Pervasive Cloud Controller for Geotemporal Inputs
Authors:
Dražen Lučanin,
Ivona Brandic
Abstract:
The rapid cloud computing growth has turned data center energy consumption into a global problem. At the same time, modern cloud providers operate multiple geographically-distributed data centers. Distributed data center infrastructure changes the rules of cloud control, as energy costs depend on current regional electricity prices and temperatures. Furthermore, to account for emerging technologies surrounding the cloud ecosystem, a maintainable control solution needs to be forward-compatible. Existing cloud controllers are focused on VM consolidation methods suitable only for a single data center or consider migration just in case of workload peaks, not accounting for all the aspects of geographically distributed data centers. In this paper, we propose a pervasive cloud controller for dynamic resource reallocation adapting to volatile time- and location-dependent factors, while considering the QoS impact of too frequent migrations and the data quality limits of time series forecasting methods. The controller is designed with extensible decision support components. We evaluate it in a simulation using historical traces of electricity prices and temperatures. By optimising for these additional factors, we estimate 28.6% energy cost savings compared to baseline dynamic VM consolidation. We provide a range of guidelines for cloud providers, showing the environment conditions necessary to achieve significant cost savings and we validate the controller's extensibility.
Submitted 16 September, 2018;
originally announced September 2018.
-
Using Meta-heuristics and Machine Learning for Software Optimization of Parallel Computing Systems: A Systematic Literature Review
Authors:
Suejb Memeti,
Sabri Pllana,
Alecio Binotto,
Joanna Kolodziej,
Ivona Brandic
Abstract:
While modern parallel computing systems offer high performance, utilizing these powerful computing resources to the highest possible extent demands advanced knowledge of various hardware architectures and parallel programming models. Furthermore, optimized software execution on parallel computing systems demands consideration of many parameters at compile-time and run-time. Determining the optimal set of parameters in a given execution context is a complex task, and therefore to address this issue researchers have proposed different approaches that use heuristic search or machine learning. In this paper, we undertake a systematic literature review to aggregate, analyze and classify the existing software optimization methods for parallel computing systems. We review approaches that use machine learning or meta-heuristics for software optimization at compile-time and run-time. Additionally, we discuss challenges and future research directions. The results of this study may help to better understand the state-of-the-art techniques that use machine learning and meta-heuristics to deal with the complexity of software optimization for parallel computing systems. Furthermore, it may aid in understanding the limitations of existing approaches and identification of areas for improvement.
Submitted 2 May, 2018; v1 submitted 29 January, 2018;
originally announced January 2018.
-
Challenges and Recommendations for Preparing HPC Applications for Exascale
Authors:
Erika Abraham,
Costas Bekas,
Ivona Brandic,
Samir Genaim,
Einar Broch Johnsen,
Ivan Kondov,
Sabri Pllana,
Achim Streit
Abstract:
While the HPC community is working towards the development of the first Exaflop computer (expected around 2020), after reaching the Petaflop milestone in 2008, still only a few HPC applications are able to fully exploit the capabilities of Petaflop systems. In this paper we argue that efforts for preparing HPC applications for Exascale should start before such systems become available. We identify challenges that need to be addressed and recommend solutions in key areas of interest, including formal modeling, static analysis and optimization, runtime analysis and optimization, and autonomic computing. Furthermore, we outline a conceptual framework for porting HPC applications to future Exascale computing systems and propose steps for its implementation.
Submitted 9 June, 2015; v1 submitted 24 March, 2015;
originally announced March 2015.
-
Energy-Aware Cloud Management through Progressive SLA Specification
Authors:
Dražen Lučanin,
Foued Jrad,
Ivona Brandic,
Achim Streit
Abstract:
Novel energy-aware cloud management methods dynamically reallocate computation across geographically distributed data centers to leverage regional electricity price and temperature differences. As a result, a managed VM may suffer occasional downtimes. Current cloud providers only offer high-availability VMs, without enough flexibility to apply such energy-aware management. In this paper we show how to analyse past traces of dynamic cloud management actions based on electricity prices and temperatures to estimate VM availability and price values. We propose a novel SLA specification approach for offering VMs with different availability and price values guaranteed over multiple SLAs to enable flexible energy-aware cloud management. We determine the optimal number of such SLAs as well as their guaranteed availability and price values. We evaluate our approach in a user SLA selection simulation using Wikipedia and Grid'5000 workloads. The results show higher customer conversion and 39% average energy savings per VM.
Submitted 1 September, 2014;
originally announced September 2014.
-
Take a break: cloud scheduling optimized for real-time electricity pricing
Authors:
Dražen Lučanin,
Ivona Brandić
Abstract:
Cloud computing revolutionised the industry with its elastic, on-demand approach to computational resources, but has led to a tremendous impact on the environment. Data centers constitute 1.1-1.5% of total electricity usage in the world. Taking a more informed view of the electrical grid by analysing real-time electricity prices, we set the foundations of a grid-conscious cloud. We propose a scheduling algorithm that predicts electricity price peaks and throttles energy consumption by pausing virtual machines. We evaluate the approach empirically on the OpenStack cloud manager and show reductions in energy consumption and costs. Finally, we define green instances in which cloud providers can offer such services to their customers under better pricing options.
Submitted 26 July, 2013;
originally announced July 2013.
-
Energy Efficient Service Delivery in Clouds in Compliance with the Kyoto Protocol
Authors:
Drazen Lucanin,
Michael Maurer,
Toni Mastelic,
Ivona Brandic
Abstract:
Cloud computing is revolutionizing the ICT landscape by providing scalable and efficient computing resources on demand. The ICT industry, especially data centers, is responsible for considerable amounts of CO2 emissions and will very soon be faced with legislative restrictions, such as the Kyoto protocol, defining caps at different organizational levels (country, industry branch, etc.). A lot has been done around energy efficient data centers, yet there is very little work done in defining flexible models considering CO2. In this paper we present a first attempt at modeling data centers in compliance with the Kyoto protocol. We discuss a novel approach for trading credits for emission reductions across data centers to comply with their constraints. CO2 caps can be integrated with Service Level Agreements and juxtaposed to other computing commodities (e.g. computational power, storage), setting a foundation for implementing next-generation schedulers and pricing models that support Kyoto-compliant CO2 trading schemes.
Submitted 30 April, 2012;
originally announced April 2012.