-
Revisiting the Robust Alignment of Circuit Breakers
Authors:
Leo Schwinn,
Simon Geisler
Abstract:
Over the past decade, adversarial training has emerged as one of the few reliable methods for enhancing model robustness against adversarial attacks [Szegedy et al., 2014, Madry et al., 2018, Xhonneux et al., 2024], while many alternative approaches have failed to withstand rigorous subsequent evaluations. Recently, an alternative defense mechanism, namely "circuit breakers" [Zou et al., 2024], has shown promising results for aligning LLMs. In this report, we show that the robustness claims of "Improving Alignment and Robustness with Circuit Breakers" against unconstrained continuous attacks in the embedding space of the input tokens may be overestimated [Zou et al., 2024]. Specifically, we demonstrate that by implementing a few simple changes to embedding space attacks [Schwinn et al., 2024a,b], we achieve 100% attack success rate (ASR) against circuit breaker models. Without conducting any further hyperparameter tuning, these adjustments increase the ASR by more than 80% compared to the original evaluation. Code is accessible at: https://github.com/SchwinnL/circuit-breakers-eval
Submitted 2 August, 2024; v1 submitted 22 July, 2024;
originally announced July 2024.
-
Relaxing Graph Transformers for Adversarial Attacks
Authors:
Philipp Foth,
Lukas Gosch,
Simon Geisler,
Leo Schwinn,
Stephan Günnemann
Abstract:
Existing studies have shown that Graph Neural Networks (GNNs) are vulnerable to adversarial attacks. Even though Graph Transformers (GTs) surpassed Message-Passing GNNs on several benchmarks, their adversarial robustness properties are unexplored. However, attacking GTs is challenging due to their Positional Encodings (PEs) and special attention mechanisms which can be difficult to differentiate. We overcome these challenges by targeting three representative architectures based on (1) random-walk PEs, (2) pair-wise-shortest-path PEs, and (3) spectral PEs - and propose the first adaptive attacks for GTs. We leverage our attacks to evaluate robustness to (a) structure perturbations on node classification; and (b) node injection attacks for (fake-news) graph classification. Our evaluation reveals that they can be catastrophically fragile and underlines our work's importance and the necessity for adaptive attacks.
Submitted 16 July, 2024;
originally announced July 2024.
-
Explainable Graph Neural Networks Under Fire
Authors:
Zhong Li,
Simon Geisler,
Yuhang Wang,
Stephan Günnemann,
Matthijs van Leeuwen
Abstract:
Predictions made by graph neural networks (GNNs) usually lack interpretability due to their complex computational behavior and the abstract nature of graphs. In an attempt to tackle this, many GNN explanation methods have emerged. Their goal is to explain a model's predictions and thereby obtain trust when GNN models are deployed in decision-critical applications. Most GNN explanation methods work in a post-hoc manner and provide explanations in the form of a small subset of important edges and/or nodes. In this paper we demonstrate that these explanations can unfortunately not be trusted, as common GNN explanation methods turn out to be highly susceptible to adversarial perturbations. That is, even small perturbations of the original graph structure that preserve the model's predictions may yield drastically different explanations. This calls into question the trustworthiness and practical utility of post-hoc explanation methods for GNNs. To be able to attack GNN explanation models, we devise a novel attack method dubbed GXAttack, the first optimization-based adversarial attack method for post-hoc GNN explanations under such settings. Due to the devastating effectiveness of our attack, we call for an adversarial evaluation of future GNN explainers to demonstrate their robustness.
Submitted 10 June, 2024;
originally announced June 2024.
-
Spatio-Spectral Graph Neural Networks
Authors:
Simon Geisler,
Arthur Kosmala,
Daniel Herbst,
Stephan Günnemann
Abstract:
Spatial Message Passing Graph Neural Networks (MPGNNs) are widely used for learning on graph-structured data. However, key limitations of l-step MPGNNs are that their "receptive field" is typically limited to the l-hop neighborhood of a node and that information exchange between distant nodes is limited by over-squashing. Motivated by these limitations, we propose Spatio-Spectral Graph Neural Networks (S$^2$GNNs) -- a new modeling paradigm for Graph Neural Networks (GNNs) that synergistically combines spatially and spectrally parametrized graph filters. Parameterizing filters partially in the frequency domain enables global yet efficient information propagation. We show that S$^2$GNNs vanquish over-squashing and yield strictly tighter approximation-theoretic error bounds than MPGNNs. Further, rethinking graph convolutions at a fundamental level unlocks new design spaces. For example, S$^2$GNNs allow for free positional encodings that make them strictly more expressive than the 1-Weisfeiler-Lehman (WL) test. Moreover, to obtain general-purpose S$^2$GNNs, we propose spectrally parametrized filters for directed graphs. S$^2$GNNs outperform spatial MPGNNs, graph transformers, and graph rewirings, e.g., on the peptide long-range benchmark tasks, and are competitive with state-of-the-art sequence modeling. On a 40 GB GPU, S$^2$GNNs scale to millions of nodes.
Submitted 2 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
Towards Enabling FAIR Dataspaces Using Large Language Models
Authors:
Benedikt T. Arnold,
Johannes Theissen-Lipp,
Diego Collarana,
Christoph Lange,
Sandra Geisler,
Edward Curry,
Stefan Decker
Abstract:
Dataspaces have recently gained adoption across various sectors, including traditionally less digitized domains such as culture. Leveraging Semantic Web technologies helps to make dataspaces FAIR, but their complexity poses a significant challenge to the adoption of dataspaces and increases their cost. The advent of Large Language Models (LLMs) raises the question of how these models can support the adoption of FAIR dataspaces. In this work, we demonstrate the potential of LLMs in dataspaces with a concrete example. We also derive a research agenda for exploring this emerging field.
Submitted 18 March, 2024;
originally announced March 2024.
-
Attacking Large Language Models with Projected Gradient Descent
Authors:
Simon Geisler,
Tom Wollschläger,
M. H. I. Abdalla,
Johannes Gasteiger,
Stephan Günnemann
Abstract:
Current LLM alignment methods are readily broken through specifically crafted adversarial prompts. While crafting adversarial prompts using discrete optimization is highly effective, such attacks typically use more than 100,000 LLM calls. This high computational cost makes them unsuitable for, e.g., quantitative analyses and adversarial training. To remedy this, we revisit Projected Gradient Descent (PGD) on the continuously relaxed input prompt. Although previous attempts with ordinary gradient-based attacks largely failed, we show that carefully controlling the error introduced by the continuous relaxation tremendously boosts their efficacy. Our PGD for LLMs is up to one order of magnitude faster than state-of-the-art discrete optimization to achieve the same devastating attack results.
Submitted 14 February, 2024;
originally announced February 2024.
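The abstract above describes Projected Gradient Descent on a continuously relaxed prompt. As a rough illustration only, the sketch below assumes each token position is relaxed to a probability vector over the vocabulary and that the projection is the standard Euclidean projection onto the probability simplex. The function names `project_to_simplex` and `pgd_step` are hypothetical, the gradients are supplied by the caller, and this is not the authors' implementation, which the abstract does not spell out.

```python
def project_to_simplex(values):
    """Euclidean projection of a vector onto the probability simplex."""
    sorted_desc = sorted(values, reverse=True)
    cumulative = 0.0
    threshold = 0.0
    for count, value in enumerate(sorted_desc, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / count
        # The condition holds for a prefix of the sorted entries;
        # the last candidate that satisfies it is the final threshold.
        if value - candidate > 0:
            threshold = candidate
    return [max(v - threshold, 0.0) for v in values]


def pgd_step(relaxed_tokens, gradients, step_size):
    """One gradient descent step followed by a row-wise projection.

    relaxed_tokens: list of per-position probability vectors (assumed relaxation).
    gradients: loss gradients with the same shape, computed elsewhere.
    """
    updated = []
    for row, grad_row in zip(relaxed_tokens, gradients):
        moved = [p - step_size * g for p, g in zip(row, grad_row)]
        updated.append(project_to_simplex(moved))
    return updated
```

After each step every row is again a valid distribution over tokens, which is what keeps the relaxed prompt inside the feasible set during optimization.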
-
Poisoning $\times$ Evasion: Symbiotic Adversarial Robustness for Graph Neural Networks
Authors:
Ege Erdogan,
Simon Geisler,
Stephan Günnemann
Abstract:
It is well-known that deep learning models are vulnerable to small input perturbations. Such perturbed instances are called adversarial examples. Adversarial examples are commonly crafted to fool a model either at training time (poisoning) or test time (evasion). In this work, we study the symbiosis of poisoning and evasion. We show that combining both threat models can substantially improve the devastating efficacy of adversarial attacks. Specifically, we study the robustness of Graph Neural Networks (GNNs) under structure perturbations and devise a memory-efficient adaptive end-to-end attack for the novel threat model using first-order optimization.
Submitted 9 December, 2023;
originally announced December 2023.
-
On the Adversarial Robustness of Graph Contrastive Learning Methods
Authors:
Filippo Guerranti,
Zinuo Yi,
Anna Starovoit,
Rafiq Kamel,
Simon Geisler,
Stephan Günnemann
Abstract:
Contrastive learning (CL) has emerged as a powerful framework for learning representations of images and text in a self-supervised manner while enhancing model robustness against adversarial attacks. More recently, researchers have extended the principles of contrastive learning to graph-structured data, giving birth to the field of graph contrastive learning (GCL). However, whether GCL methods can deliver the same advantages in adversarial robustness as their counterparts in the image and text domains remains an open question. In this paper, we introduce a comprehensive robustness evaluation protocol tailored to assess the robustness of GCL models. We subject these models to adaptive adversarial attacks targeting the graph structure, specifically in the evasion scenario. We evaluate node and graph classification tasks using diverse real-world datasets and attack strategies. With our work, we aim to offer insights into the robustness of GCL methods and hope to open avenues for potential future research directions.
Submitted 30 November, 2023; v1 submitted 29 November, 2023;
originally announced November 2023.
-
Topology-Matching Normalizing Flows for Out-of-Distribution Detection in Robot Learning
Authors:
Jianxiang Feng,
Jongseok Lee,
Simon Geisler,
Stephan Gunnemann,
Rudolph Triebel
Abstract:
To facilitate reliable deployments of autonomous robots in the real world, Out-of-Distribution (OOD) detection capabilities are often required. A powerful approach for OOD detection is based on density estimation with Normalizing Flows (NFs). However, we find that prior work with NFs attempts to match the complex target distribution topologically with naive base distributions leading to adverse implications. In this work, we circumvent this topological mismatch using an expressive class-conditional base distribution trained with an information-theoretic objective to match the required topology. The proposed method enjoys the merits of wide compatibility with existing learned models without any performance degradation and minimum computation overhead while enhancing OOD detection capabilities. We demonstrate superior results in density estimation and 2D object detection benchmarks in comparison with extensive baselines. Moreover, we showcase the applicability of the method with a real-robot deployment.
Submitted 11 November, 2023;
originally announced November 2023.
-
Adversarial Training for Graph Neural Networks: Pitfalls, Solutions, and New Directions
Authors:
Lukas Gosch,
Simon Geisler,
Daniel Sturm,
Bertrand Charpentier,
Daniel Zügner,
Stephan Günnemann
Abstract:
Despite its success in the image domain, adversarial training did not (yet) stand out as an effective defense for Graph Neural Networks (GNNs) against graph structure perturbations. In the pursuit of fixing adversarial training (1) we show and overcome fundamental theoretical as well as practical limitations of the adopted graph learning setting in prior work; (2) we reveal that more flexible GNNs based on learnable graph diffusion are able to adjust to adversarial perturbations, while the learned message passing scheme is naturally interpretable; (3) we introduce the first attack for structure perturbations that, while targeting multiple nodes at once, is capable of handling global (graph-level) as well as local (node-level) constraints. Including these contributions, we demonstrate that adversarial training is a state-of-the-art defense against adversarial structure perturbations.
Submitted 2 December, 2023; v1 submitted 27 June, 2023;
originally announced June 2023.
-
Evolving the Digital Industrial Infrastructure for Production: Steps Taken and the Road Ahead
Authors:
Jan Pennekamp,
Anastasiia Belova,
Thomas Bergs,
Matthias Bodenbenner,
Andreas Bührig-Polaczek,
Markus Dahlmanns,
Ike Kunze,
Moritz Kröger,
Sandra Geisler,
Martin Henze,
Daniel Lütticke,
Benjamin Montavon,
Philipp Niemietz,
Lucia Ortjohann,
Maximilian Rudack,
Robert H. Schmitt,
Uwe Vroomen,
Klaus Wehrle,
Michael Zeng
Abstract:
The Internet of Production (IoP) leverages concepts such as digital shadows, data lakes, and a World Wide Lab (WWL) to advance today's production. Consequently, it requires a technical infrastructure that can support the agile deployment of these concepts and corresponding high-level applications, which, e.g., demand the processing of massive data in motion and at rest. As such, key research aspects are the support for low-latency control loops, concepts on scalable data stream processing, deployable information security, and semantically rich and efficient long-term storage. In particular, such an infrastructure cannot continue to be limited to machines and sensors, but additionally needs to encompass networked environments: production cells, edge computing, and location-independent cloud infrastructures. Finally, in light of the envisioned WWL, i.e., the interconnection of production sites, the technical infrastructure must be advanced to support secure and privacy-preserving industrial collaboration. To evolve today's production sites and lay the infrastructural foundation for the IoP, we identify five broad streams of research: (1) adapting data and stream processing to heterogeneous data from distributed sources, (2) ensuring data interoperability between systems and production sites, (3) exchanging and sharing data with different stakeholders, (4) network security approaches addressing the risks of increasing interconnectivity, and (5) security architectures to enable secure and privacy-preserving industrial collaboration. With our research, we evolve the underlying infrastructure from isolated, sparsely networked production sites toward an architecture that supports high-level applications and sophisticated digital shadows while facilitating the transition toward a WWL.
Submitted 17 May, 2023;
originally announced May 2023.
-
GALOIS: A Hybrid and Platform-Agnostic Stream Processing Architecture
Authors:
Tarek Stolz,
István Koren,
Liam Tirpitz,
Sandra Geisler
Abstract:
With the increasing prevalence of IoT environments, the demand for processing massive distributed data streams has become a critical challenge. Data Stream Processing on the Edge (DSPoE) systems have emerged as a solution to address this challenge, but they often struggle to cope with the heterogeneity of hardware and platforms. To address this issue, we propose a new hybrid DSPoE architecture named GALOIS, which is based on WebAssembly (Wasm) and is hardware-, platform-, and language-agnostic. GALOIS employs a multi-layered approach that combines P2P and master-worker concepts for communication between components. We present experimental results showing that operators executed in Wasm outperform those in Docker in terms of energy and CPU consumption, making it a promising option for streaming operators in DSPoE. We therefore expect Wasm-based solutions to significantly improve the performance and resilience of DSPoE systems.
Submitted 3 May, 2023;
originally announced May 2023.
-
Revisiting Robustness in Graph Machine Learning
Authors:
Lukas Gosch,
Daniel Sturm,
Simon Geisler,
Stephan Günnemann
Abstract:
Many works show that node-level predictions of Graph Neural Networks (GNNs) are unrobust to small, often termed adversarial, changes to the graph structure. However, because manual inspection of a graph is difficult, it is unclear if the studied perturbations always preserve a core assumption of adversarial examples: that of unchanged semantic content. To address this problem, we introduce a more principled notion of an adversarial graph, which is aware of semantic content change. Using Contextual Stochastic Block Models (CSBMs) and real-world graphs, our results uncover: $i)$ for a majority of nodes the prevalent perturbation models include a large fraction of perturbed graphs violating the unchanged semantics assumption; $ii)$ surprisingly, all assessed GNNs show over-robustness - that is robustness beyond the point of semantic change. We find this to be a complementary phenomenon to adversarial examples and show that including the label-structure of the training graph into the inference process of GNNs significantly reduces over-robustness, while having a positive effect on test accuracy and adversarial robustness. Theoretically, leveraging our new semantics-aware notion of robustness, we prove that there is no robustness-accuracy tradeoff for inductively classifying a newly added node.
Submitted 2 May, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Transformers Meet Directed Graphs
Authors:
Simon Geisler,
Yujia Li,
Daniel Mankowitz,
Ali Taylan Cemgil,
Stephan Günnemann,
Cosmin Paduraru
Abstract:
Transformers were originally proposed as a sequence-to-sequence model for text but have become vital for a wide range of modalities, including images, audio, video, and undirected graphs. However, transformers for directed graphs are a surprisingly underexplored topic, despite their applicability to ubiquitous domains, including source code and logic circuits. In this work, we propose two direction- and structure-aware positional encodings for directed graphs: (1) the eigenvectors of the Magnetic Laplacian - a direction-aware generalization of the combinatorial Laplacian; (2) directional random walk encodings. Empirically, we show that the extra directionality information is useful in various downstream tasks, including correctness testing of sorting networks and source code understanding. Together with a data-flow-centric graph construction, our model outperforms the prior state of the art on the Open Graph Benchmark Code2 relatively by 14.7%.
Submitted 31 August, 2023; v1 submitted 31 January, 2023;
originally announced February 2023.
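The abstract above names the Magnetic Laplacian as a direction-aware generalization of the combinatorial Laplacian. A minimal sketch of one common definition follows, assuming the symmetrization A_s = (A + A^T) / 2, the phase 2πq(A_uv - A_vu), and a potential q = 0.25. These choices and the function name `magnetic_laplacian` are illustrative assumptions, and the paper's exact normalization may differ.

```python
import cmath
import math


def magnetic_laplacian(adjacency, q=0.25):
    """Build L_q = D_s - A_s * exp(i * theta) for a directed graph.

    adjacency: square list of lists, adjacency[u][v] is the weight of edge u -> v.
    q: potential controlling how strongly direction is encoded (assumed default).
    """
    n = len(adjacency)
    laplacian = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            # Symmetrized edge weight (assumed: average of both directions).
            sym = 0.5 * (adjacency[u][v] + adjacency[v][u])
            # Antisymmetric phase encodes the edge direction.
            phase = 2.0 * math.pi * q * (adjacency[u][v] - adjacency[v][u])
            laplacian[u][v] -= sym * cmath.exp(1j * phase)
            laplacian[u][u] += sym  # accumulates the symmetrized degree
    return laplacian


# Directed 3-cycle: 0 -> 1 -> 2 -> 0
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
lap = magnetic_laplacian(cycle)
```

The resulting matrix is Hermitian, so its eigenvectors can be computed with a Hermitian eigensolver such as `numpy.linalg.eigh`. With q = 0 the phase vanishes and the matrix reduces to the combinatorial Laplacian of the symmetrized graph.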
-
Are Defenses for Graph Neural Networks Robust?
Authors:
Felix Mujkanovic,
Simon Geisler,
Stephan Günnemann,
Aleksandar Bojchevski
Abstract:
A cursory reading of the literature suggests that we have made a lot of progress in designing effective adversarial defenses for Graph Neural Networks (GNNs). Yet, the standard methodology has a serious flaw - virtually all of the defenses are evaluated against non-adaptive attacks leading to overly optimistic robustness estimates. We perform a thorough robustness analysis of 7 of the most popular defenses spanning the entire spectrum of strategies, i.e., aimed at improving the graph, the architecture, or the training. The results are sobering - most defenses show no or only marginal improvement compared to an undefended baseline. We advocate using custom adaptive attacks as a gold standard and we outline the lessons we learned from successfully designing such attacks. Moreover, our diverse collection of perturbed graphs forms a (black-box) unit test offering a first glance at a model's robustness.
Submitted 31 January, 2023;
originally announced January 2023.
-
Randomized Message-Interception Smoothing: Gray-box Certificates for Graph Neural Networks
Authors:
Yan Scholten,
Jan Schuchardt,
Simon Geisler,
Aleksandar Bojchevski,
Stephan Günnemann
Abstract:
Randomized smoothing is one of the most promising frameworks for certifying the adversarial robustness of machine learning models, including Graph Neural Networks (GNNs). Yet, existing randomized smoothing certificates for GNNs are overly pessimistic since they treat the model as a black box, ignoring the underlying architecture. To remedy this, we propose novel gray-box certificates that exploit the message-passing principle of GNNs: We randomly intercept messages and carefully analyze the probability that messages from adversarially controlled nodes reach their target nodes. Compared to existing certificates, we certify robustness to much stronger adversaries that control entire nodes in the graph and can arbitrarily manipulate node features. Our certificates provide stronger guarantees for attacks at larger distances, as messages from farther-away nodes are more likely to get intercepted. We demonstrate the effectiveness of our method on various models and datasets. Since our gray-box certificates consider the underlying graph structure, we can significantly improve certifiable robustness by applying graph sparsification.
Submitted 5 January, 2023;
originally announced January 2023.
-
Patterns of Sociotechnical Design Preferences of Chatbots for Intergenerational Collaborative Innovation: A Q Methodology Study
Authors:
Irawan Nurhas,
Pouyan Jahanbin,
Jan Pawlowski,
Stephen Wingreen,
Stefan Geisler
Abstract:
Chatbot technology is increasingly emerging as a virtual assistant. Chatbots could allow individuals and organizations to accomplish objectives that are currently not fully optimized for collaboration across an intergenerational context. This paper explores the preferences of chatbots as a companion in intergenerational innovation. The Q methodology was used to investigate different types of collaborators and determine how different choices occur between collaborators that merge the problem and solution domains of chatbots' design within intergenerational settings. The study's findings reveal that various chatbot design priorities are more diverse among younger adults than senior adults. Additionally, our research further outlines the principles of chatbot design and how chatbots will support both generations. This research is the first step towards cultivating a deeper understanding of different age groups' subjective design preferences for chatbots functioning as a companion in the workplace. Moreover, this study demonstrates how the Q methodology can guide technological development by shifting the approach from an age-focused design to a common goal-oriented design within a multigenerational context.
Submitted 7 December, 2022;
originally announced December 2022.
-
On the Robustness and Anomaly Detection of Sparse Neural Networks
Authors:
Morgane Ayle,
Bertrand Charpentier,
John Rachwan,
Daniel Zügner,
Simon Geisler,
Stephan Günnemann
Abstract:
The robustness and anomaly detection capability of neural networks are crucial topics for their safe adoption in the real-world. Moreover, the over-parameterization of recent networks comes with high computational costs and raises questions about its influence on robustness and anomaly detection. In this work, we show that sparsity can make networks more robust and better anomaly detectors. To motivate this even further, we show that a pre-trained neural network contains, within its parameter space, sparse subnetworks that are better at these tasks without any further training. We also show that structured sparsity greatly helps in reducing the complexity of expensive robustness and detection methods, while maintaining or even improving their results on these tasks. Finally, we introduce a new method, SensNorm, which uses the sensitivity of weights derived from an appropriate pruning method to detect anomalous samples in the input.
Submitted 9 July, 2022;
originally announced July 2022.
-
Winning the Lottery Ahead of Time: Efficient Early Network Pruning
Authors:
John Rachwan,
Daniel Zügner,
Bertrand Charpentier,
Simon Geisler,
Morgane Ayle,
Stephan Günnemann
Abstract:
Pruning, the task of sparsifying deep neural networks, has received increasing attention recently. Although state-of-the-art pruning methods extract highly sparse models, they neglect two main challenges: (1) the process of finding these sparse models is often very expensive; (2) unstructured pruning does not provide benefits in terms of GPU memory, training time, or carbon emissions. We propose Early Compression via Gradient Flow Preservation (EarlyCroP), which efficiently extracts state-of-the-art sparse models before or early in training addressing challenge (1), and can be applied in a structured manner addressing challenge (2). This enables us to train sparse networks on commodity GPUs whose dense versions would be too large, thereby saving costs and reducing hardware requirements. We empirically show that EarlyCroP outperforms a rich set of baselines for many tasks (incl. classification, regression) and domains (incl. computer vision, natural language processing, and reinforcement learning). EarlyCroP leads to accuracy comparable to dense training while outperforming pruning baselines.
Submitted 21 June, 2022;
originally announced June 2022.
-
Robustness of Graph Neural Networks at Scale
Authors:
Simon Geisler,
Tobias Schmidt,
Hakan Şirin,
Daniel Zügner,
Aleksandar Bojchevski,
Stephan Günnemann
Abstract:
Graph Neural Networks (GNNs) are increasingly important given their popularity and the diversity of applications. Yet, existing studies of their vulnerability to adversarial attacks rely on relatively small graphs. We address this gap and study how to attack and defend GNNs at scale. We propose two sparsity-aware first-order optimization attacks that maintain an efficient representation despite optimizing over a number of parameters which is quadratic in the number of nodes. We show that common surrogate losses are not well-suited for global attacks on GNNs. Our alternatives can double the attack strength. Moreover, to improve GNNs' reliability we design a robust aggregation function, Soft Median, resulting in an effective defense at all scales. We evaluate our attacks and defense with standard GNNs on graphs more than 100 times larger compared to previous work. We even scale one order of magnitude further by extending our techniques to a scalable GNN.
Submitted 30 April, 2023; v1 submitted 26 October, 2021;
originally announced October 2021.
-
Graph Posterior Network: Bayesian Predictive Uncertainty for Node Classification
Authors:
Maximilian Stadler,
Bertrand Charpentier,
Simon Geisler,
Daniel Zügner,
Stephan Günnemann
Abstract:
The interdependence between nodes in graphs is key to improve class predictions on nodes and utilized in approaches like Label Propagation (LP) or in Graph Neural Networks (GNN). Nonetheless, uncertainty estimation for non-independent node-level predictions is under-explored. In this work, we explore uncertainty quantification for node classification in three ways: (1) We derive three axioms explicitly characterizing the expected predictive uncertainty behavior in homophilic attributed graphs. (2) We propose a new model Graph Posterior Network (GPN) which explicitly performs Bayesian posterior updates for predictions on interdependent nodes. GPN provably obeys the proposed axioms. (3) We extensively evaluate GPN and a strong set of baselines on semi-supervised node classification including detection of anomalous features, and detection of left-out classes. GPN outperforms existing approaches for uncertainty estimation in the experiments.
Submitted 26 October, 2021;
originally announced October 2021.
-
Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness
Authors:
Simon Geisler,
Johanna Sommer,
Jan Schuchardt,
Aleksandar Bojchevski,
Stephan Günnemann
Abstract:
End-to-end (geometric) deep learning has seen first successes in approximating the solution of combinatorial optimization problems. However, generating data in the realm of NP-hard/-complete tasks brings practical and theoretical challenges, resulting in evaluation protocols that are too optimistic. Specifically, most datasets only capture a simpler subproblem and likely suffer from spurious features. We investigate these effects by studying adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features. For this purpose, we derive perturbation models for SAT and TSP. Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound, allowing us to determine the true label of perturbed samples without a solver. Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning. Although such robust solvers exist, we show empirically that the assessed neural solvers do not generalize well w.r.t. small perturbations of the problem instance.
Submitted 21 March, 2022; v1 submitted 21 October, 2021;
originally announced October 2021.
-
Knowledge-driven Data Ecosystems Towards Data Transparency
Authors:
Sandra Geisler,
Maria-Esther Vidal,
Cinzia Cappiello,
Bernadette Farias Lóscio,
Avigdor Gal,
Matthias Jarke,
Maurizio Lenzerini,
Paolo Missier,
Boris Otto,
Elda Paja,
Barbara Pernici,
Jakob Rehof
Abstract:
A Data Ecosystem offers a keystone-player or alliance-driven infrastructure that enables the interaction of different stakeholders and the resolution of interoperability issues among shared data. However, despite years of research in data governance and management, trustability is still affected by the absence of transparent and traceable data-driven pipelines. In this work, we focus on requirements and challenges that data ecosystems face when ensuring data transparency. Requirements are derived from the data and organizational management, as well as from broader legal and ethical considerations. We propose a novel knowledge-driven data ecosystem architecture, providing the pillars for satisfying the analyzed requirements. We illustrate the potential of our proposal in a real-world scenario. Lastly, we discuss and rate the potential of the proposed architecture in the fulfillment of these requirements.
Submitted 21 May, 2021; v1 submitted 19 May, 2021;
originally announced May 2021.
-
Natural Posterior Network: Deep Bayesian Uncertainty for Exponential Family Distributions
Authors:
Bertrand Charpentier,
Oliver Borchert,
Daniel Zügner,
Simon Geisler,
Stephan Günnemann
Abstract:
Uncertainty awareness is crucial to develop reliable machine learning models. In this work, we propose the Natural Posterior Network (NatPN) for fast and high-quality uncertainty estimation for any task where the target distribution belongs to the exponential family. Thus, NatPN finds application for both classification and general regression settings. Unlike many previous approaches, NatPN does not require out-of-distribution (OOD) data at training time. Instead, it leverages Normalizing Flows to fit a single density on a learned low-dimensional and task-dependent latent space. For any input sample, NatPN uses the predicted likelihood to perform a Bayesian update over the target distribution. Theoretically, NatPN assigns high uncertainty far away from training data. Empirically, our extensive experiments on calibration and OOD detection show that NatPN delivers highly competitive performance for classification, regression and count prediction tasks.
Submitted 16 March, 2022; v1 submitted 10 May, 2021;
originally announced May 2021.
-
Methods to integrate multinormals and compute classification measures
Authors:
Abhranil Das,
Wilson S Geisler
Abstract:
Univariate and multivariate normal probability distributions are widely used when modeling decisions under uncertainty. Computing the performance of such models requires integrating these distributions over specific domains, which can vary widely across models. Besides some special cases, there exist no general analytical expressions, standard numerical methods or software for these integrals. Here we present mathematical results and open-source software that provide (i) the probability in any domain of a normal in any dimensions with any parameters, (ii) the probability density, cumulative distribution, and inverse cumulative distribution of any function of a normal vector, (iii) the classification errors among any number of normal distributions, the Bayes-optimal discriminability index and relation to the operating characteristic, (iv) ways to scale the discriminability of two distributions, (v) dimension reduction and visualizations for such problems, and (vi) tests for how reliably these methods may be used on given data. We demonstrate these tools with vision research applications of detecting occluding objects in natural scenes, and detecting camouflage.
Submitted 29 July, 2024; v1 submitted 23 December, 2020;
originally announced December 2020.
-
Reliable Graph Neural Networks via Robust Aggregation
Authors:
Simon Geisler,
Daniel Zügner,
Stephan Günnemann
Abstract:
Perturbations targeting the graph structure have proven to be extremely effective in reducing the performance of Graph Neural Networks (GNNs), and traditional defenses such as adversarial training do not seem to be able to improve robustness. This work is motivated by the observation that adversarially injected edges effectively can be viewed as additional samples to a node's neighborhood aggregation function, which results in distorted aggregations accumulating over the layers. Conventional GNN aggregation functions, such as a sum or mean, can be distorted arbitrarily by a single outlier. We propose a robust aggregation function motivated by the field of robust statistics. Our approach exhibits the largest possible breakdown point of 0.5, which means that the bias of the aggregation is bounded as long as the fraction of adversarial edges of a node is less than 50%. Our novel aggregation function, Soft Medoid, is a fully differentiable generalization of the Medoid and therefore lends itself well for end-to-end deep learning. Equipping a GNN with our aggregation improves the robustness with respect to structure perturbations on Cora ML by a factor of 3 (and 5.5 on Citeseer) and by a factor of 8 for low-degree nodes.
Submitted 29 October, 2020;
originally announced October 2020.
-
Implicit Cooperation: Emotion Detection for Validation and Adaptation of Automated Vehicles' Driving Behavior
Authors:
Henrik Detjen,
Stefan Geisler,
Stefan Schneegass
Abstract:
Human emotion detection in automated vehicles helps to improve comfort and safety. Research in the automotive domain focuses a lot on sensing drivers' drowsiness and aggression. We present a new form of implicit driver-vehicle cooperation, where emotion detection is integrated into an automated vehicle's decision-making process. Constant evaluation of the driver's reaction to vehicle behavior allows us to revise decisions and helps to increase the safety of future automated vehicles.
Submitted 29 March, 2020;
originally announced March 2020.
-
Maneuver-based Driving for Intervention in Autonomous Cars
Authors:
Henrik Detjen,
Stefan Geisler,
Stefan Schneegass
Abstract:
The way we communicate with autonomous cars will fundamentally change as soon as manual input is no longer required as back-up for the autonomous system. Maneuver-based driving is a potential way to still allow the user to intervene with the autonomous car to communicate requests such as stopping at the next parking lot. In this work, we highlight different research questions that still need to be explored to gain insights into how such control can be realized in the future.
Submitted 27 March, 2020;
originally announced March 2020.
-
Why Does Cultural Diversity Foster Technology-enabled Intergenerational Collaboration?
Authors:
Irawan Nurhas,
Bayu Rima Aditya,
Stefan Geisler,
Jan Pawlowski
Abstract:
Globalization and information technology enable people to join the movement of global citizenship and work without borders. However, different types of barriers exist that could affect collaboration in today's work environment, in which different generations are involved. Although researchers have identified several technical barriers to intergenerational collaboration (iGOAL), the influence of cultural diversity on iGOAL has rarely been studied. Therefore, using a quantitative study approach, this paper investigates the impact of differences in cultural background on perceived technical and operational barriers to iGOAL. Our study reveals six barriers to IGC that are perceived differently by culturally diverse people (CDP) and non-CDP. Furthermore, CDP can foster IGC because CDP consider the barriers to be less of a reason to avoid working with different generations than do non-CDP.
Submitted 21 January, 2020;
originally announced January 2020.
-
Towards humane digitization: a wellbeing-driven process of personas creation
Authors:
Irawan Nurhas,
Jan Pawlowski,
Stefan Geisler
Abstract:
Digital transformation is a process of digitizing the working and living environment in which people are at the center of digitization. In this paper, we present a personas-based guideline for system developers on how the humanization of digital transformation integrates into the design process. The proposed guideline uses the positive personas from the beginning as a basis for the transformation of the working environment into the digital form. We used the literature research as a preliminary study for the process of wellbeing-driven digital transformation design, consisting of questions for structuring the required information in the positive personas as well as a potential method that could be integrated into the wellbeing-based design process.
Submitted 19 September, 2019;
originally announced September 2019.
-
Why Should the Q-method be Integrated Into the Design Science Research? A Systematic Mapping Study
Authors:
Irawan Nurhas,
Stefan Geisler,
Jan Pawlowski
Abstract:
The Q-method has been utilized over time in various areas, including information systems. In this study, we used a systematic mapping to illustrate how the Q-method was applied within the Information Systems (IS) community and propose integrating the Q-method into the Design Science Research (DSR) process as a tool for future DSR-based IS studies. In this mapping study, we collected peer-reviewed publications from the Basket-of-Eight journals and the digital library of the Association for Information Systems (AIS). Then we grouped the publications according to the process of DSR and the different variables for preparing the Q-method in IS publications. We found that the Q-methodology has the potential to support each main research stage of DSR processes and can serve as a useful tool to evaluate a system in the IS topic of system analysis and design.
Submitted 29 August, 2019; v1 submitted 14 August, 2019;
originally announced August 2019.
-
Positive Personas: Integrating Well-being Determinants into Personas
Authors:
Irawan Nurhas,
Stefan Geisler,
Jan Pawlowski
Abstract:
System design for well-being needs an appropriate tool to help designers determine relevant requirements that can help human well-being to flourish. Personas come as a simple yet powerful tool in the early development stage of the user interface design. Considering well-being determinants in the early design process provides benefits for both the user and the development team. Therefore, in this short paper, we performed a literature study to provide a conceptual model of well-being in personas and propose positive design interventions in the personas creation process.
Submitted 2 April, 2019; v1 submitted 31 March, 2019;
originally announced April 2019.
-
Group-centered framework towards a positive design of digital collaboration in global settings
Authors:
Irawan Nurhas,
Jan Pawlowski,
Stefan Geisler,
Maria Kovtunenko,
Bayu Rima Aditya
Abstract:
Globally distributed groups require collaborative systems to support their work. Besides being able to support the teamwork, these systems should also promote well-being and maximize the human potential that leads to an engaging system and joyful experience. Designing such a system is a significant challenge and requires a thorough understanding of group work. We used field theory as a lens to view the essential aspects of group motivation and then utilized collaboration personas to analyze the elements of group work. We integrated well-being determinants as engagement factors to develop a group-centered framework for digital collaboration in a global setting. Based on the outcomes, we proposed a conceptual framework to design an engaging collaborative system and recommend system values that can be used to evaluate the system further.
Submitted 7 April, 2019; v1 submitted 29 March, 2019;
originally announced April 2019.