Search | arXiv e-print repository

Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking

Authors: Roland Stolz, Hanna Krasowski, Jakob Thumm, Michael Eichelbeck, Philipp Gassert, Matthias Althoff

Abstract: Continuous action spaces in reinforcement learning (RL) are commonly defined as interval sets. While intervals usually reflect the action boundaries for tasks well, they can be challenging for learning because the typically large global action space leads to frequent exploration of irrelevant actions. Yet, little task knowledge can be sufficient to identify significantly smaller state-specific sets of relevant actions. Focusing learning on these relevant actions can significantly improve training efficiency and effectiveness. In this paper, we propose to focus learning on the set of relevant actions and introduce three continuous action masking methods for exactly mapping the action space to the state-dependent set of relevant actions. Thus, our methods ensure that only relevant actions are executed, enhancing the predictability of the RL agent and enabling its use in safety-critical applications. We further derive the implications of the proposed methods on the policy gradient. Using Proximal Policy Optimization (PPO), we evaluate our methods on three control tasks, where the relevant action set is computed based on the system dynamics and a relevant state set. Our experiments show that the three action masking methods achieve higher final rewards and converge faster than the baseline without action masking.

Submitted 5 June, 2024; originally announced June 2024.
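
Illustration (not taken from the paper): the abstract above describes mapping a global interval action space onto a smaller, state-dependent set of relevant actions. A minimal sketch of one way such a mapping could look is shown below, assuming the relevant set is itself an axis-aligned interval and the mapping is a per-dimension affine rescaling. The function name `mask_action` and all parameter names are hypothetical; this is not claimed to be any of the three methods the authors propose.

```python
import numpy as np


def mask_action(action, global_low, global_high, relevant_low, relevant_high):
    """Affinely map an action from the global box onto a state-dependent box.

    Assumption for illustration only: both the global action space and the
    relevant action set are axis-aligned intervals with global_low < global_high.
    """
    action = np.asarray(action, dtype=float)
    global_low = np.asarray(global_low, dtype=float)
    global_high = np.asarray(global_high, dtype=float)
    relevant_low = np.asarray(relevant_low, dtype=float)
    relevant_high = np.asarray(relevant_high, dtype=float)

    # Normalize each dimension to [0, 1] within the global interval.
    unit = (action - global_low) / (global_high - global_low)
    # Rescale into the relevant interval, so every output is a relevant action.
    return relevant_low + unit * (relevant_high - relevant_low)
```

Under these assumptions, every action the policy can emit lands inside the relevant interval, so no clipping or rejection step is needed after the mapping.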

arXiv:2406.03231 [pdf, other]

CommonPower: Supercharging Machine Learning for Smart Grids

Authors: Michael Eichelbeck, Hannah Markgraf, Matthias Althoff

Abstract: The growing complexity of power system management has led to an increased interest in the use of reinforcement learning (RL). However, no tool for comprehensive and realistic benchmarking of RL in smart grids exists. One prerequisite for such a comparison is a safeguarding mechanism since vanilla RL controllers cannot guarantee the satisfaction of system constraints. Other central requirements include flexible modeling of benchmarking scenarios, credible baselines, and the possibility to investigate the impact of forecast uncertainties. Our Python tool CommonPower is the first modular framework addressing these needs. CommonPower offers a unified interface for single-agent and multi-agent RL training algorithms and includes a built-in model predictive control approach based on a symbolic representation of the system equations. This makes it possible to combine model predictive controllers with RL controllers in the same system. Leveraging the symbolic system model, CommonPower facilitates the study of safeguarding strategies via the flexible formulation of safety layers. Furthermore equipped with a generic forecasting interface, CommonPower constitutes a versatile tool significantly augmenting the exploration of safe RL controllers in smart grids on several dimensions.

Submitted 5 June, 2024; originally announced June 2024.

Comments: For the corresponding code repository, see https://github.com/TUMcps/commonpower

arXiv:2404.15065 [pdf, other]

Formal Verification of Graph Convolutional Networks with Uncertain Node Features and Uncertain Graph Structure

Authors: Tobias Ladner, Michael Eichelbeck, Matthias Althoff

Abstract: Graph neural networks are becoming increasingly popular in the field of machine learning due to their unique ability to process data structured in graphs. They have also been applied in safety-critical environments where perturbations inherently occur. However, these perturbations require us to formally verify neural networks before their deployment in safety-critical environments as neural networks are prone to adversarial attacks. While there exists research on the formal verification of neural networks, there is no work verifying the robustness of generic graph convolutional network architectures with uncertainty in the node features and in the graph structure over multiple message-passing steps. This work addresses this research gap by explicitly preserving the non-convex dependencies of all elements in the underlying computations through reachability analysis with (matrix) polynomial zonotopes. We demonstrate our approach on three popular benchmark datasets.

Submitted 23 April, 2024; originally announced April 2024.

Comments: under review

arXiv:2404.11185 [pdf, other]

Approximability of the Containment Problem for Zonotopes and Ellipsotopes

Authors: Adrian Kulmburg, Lukas Schäfer, Matthias Althoff

Abstract: The zonotope containment problem, i.e., whether one zonotope is contained in another, is a central problem in control theory. Applications include detecting faults and robustifying controllers by computing invariant sets, and obtaining fixed points in reachability analysis. Despite the inherent co-NP-hardness of this problem, an approximation algorithm developed by S. Sadraddini and R. Tedrake has gained widespread recognition for its swift execution and consistent reliability in practice. In our study, we substantiate the precision of the algorithm with a definitive proof, elucidating the empirical accuracy observed in practice. Our proof hinges on establishing a connection between the containment problem and the computation of matrix norms, thereby enabling the extension of the approximation algorithm to encompass ellipsotopes -- a broader class of sets derived from zonotopes. We also explore the computational complexity of the ellipsotope containment problem with a focus on approximability. Finally, we present new methods to compute safe sets for linear dynamical systems, demonstrating the practical relevance of approximating the ellipsotope containment problem.

Submitted 19 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.
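
Illustration (not taken from the paper): for readers unfamiliar with the set representation, a zonotope can be stored as a center vector plus a generator matrix, and containment of one zonotope in another implies that the inner support function never exceeds the outer one. The sketch below, under those standard definitions, checks that inequality on a finite list of directions, which yields only a necessary condition for containment. It is not the Sadraddini-Tedrake approximation algorithm discussed in the abstract, and the names `Zonotope` and `may_be_contained` are hypothetical.

```python
import numpy as np


class Zonotope:
    """Zonotope {c + G @ b : b in [-1, 1]^m} with center c and generators G."""

    def __init__(self, center, generators):
        self.c = np.asarray(center, dtype=float)      # shape (n,)
        self.G = np.asarray(generators, dtype=float)  # shape (n, m)

    def linear_map(self, A):
        # The image under a linear map A is again a zonotope.
        A = np.asarray(A, dtype=float)
        return Zonotope(A @ self.c, A @ self.G)

    def minkowski_sum(self, other):
        # Centers add; generator matrices are concatenated column-wise.
        return Zonotope(self.c + other.c, np.hstack([self.G, other.G]))

    def support(self, d):
        # Support function: max of d . x over the zonotope.
        d = np.asarray(d, dtype=float)
        return float(d @ self.c + np.abs(d @ self.G).sum())


def may_be_contained(inner, outer, directions, tol=1e-9):
    """Necessary (not sufficient) containment test on sampled directions."""
    return all(inner.support(d) <= outer.support(d) + tol for d in directions)
```

If `may_be_contained` returns `False`, the inner zonotope is certainly not contained in the outer one; a `True` result is inconclusive, which is one way to see why exact containment is the hard part.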

arXiv:2403.07470 [pdf, other]

DrPlanner: Diagnosis and Repair of Motion Planners Using Large Language Models

Authors: Yuanfei Lin, Chenran Li, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Matthias Althoff

Abstract: Motion planners are essential for the safe operation of automated vehicles across various scenarios. However, no motion planning algorithm has achieved perfection in the literature, and improving its performance is often time-consuming and labor-intensive. To tackle the aforementioned issues, we present DrPlanner, the first framework designed to automatically diagnose and repair motion planners using large language models. Initially, we generate a structured description of the planner and its planned trajectories from both natural and programming languages. Leveraging the profound capabilities of large language models in addressing reasoning challenges, our framework returns repaired planners with detailed diagnostic descriptions. Furthermore, the framework advances iteratively with continuous feedback from the evaluation of the repaired outcomes. Our approach is validated using search-based motion planners; experimental results highlight the need of demonstrations in the prompt and the ability of our framework in identifying and rectifying elusive issues effectively.

Submitted 12 March, 2024; originally announced March 2024.

Comments: ©2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

arXiv:2402.08502 [pdf, other]

doi 10.1109/TIV.2024.3400597

Provable Traffic Rule Compliance in Safe Reinforcement Learning on the Open Sea

Authors: Hanna Krasowski, Matthias Althoff

Abstract: For safe operation, autonomous vehicles have to obey traffic rules that are set forth in legal documents formulated in natural language. Temporal logic is a suitable concept to formalize such traffic rules. Still, temporal logic rules often result in constraints that are hard to solve using optimization-based motion planners. Reinforcement learning (RL) is a promising method to find motion plans for autonomous vehicles. However, vanilla RL algorithms are based on random exploration and do not automatically comply with traffic rules. Our approach accomplishes guaranteed rule-compliance by integrating temporal logic specifications into RL. Specifically, we consider the application of vessels on the open sea, which must adhere to the Convention on the International Regulations for Preventing Collisions at Sea (COLREGS). To efficiently synthesize rule-compliant actions, we combine predicates based on set-based prediction with a statechart representing our formalized rules and their priorities. Action masking then restricts the RL agent to this set of verified rule-compliant actions. In numerical evaluations on critical maritime traffic situations, our agent always complies with the formalized legal rules and never collides while achieving a high goal-reaching rate during training and deployment. In contrast, vanilla and traffic rule-informed RL agents frequently violate traffic rules and collide even after training.

Submitted 16 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

arXiv:2401.14961 [pdf, other]

Set-Based Training for Neural Network Verification

Authors: Lukas Koller, Tobias Ladner, Matthias Althoff

Abstract: Neural networks are vulnerable to adversarial attacks, i.e., small input perturbations can significantly affect the outputs of a neural network. In safety-critical environments, the inputs often contain noisy sensor data; hence, in this case, neural networks that are robust against input perturbations are required. To ensure safety, the robustness of a neural network must be formally verified. However, training and formally verifying robust neural networks is challenging. We address both of these challenges by employing, for the first time, an end-to-end set-based training procedure that trains robust neural networks for formal verification. Our training procedure trains neural networks, which can be easily verified using simple polynomial-time verification algorithms. Moreover, our extensive evaluation demonstrates that our set-based training procedure effectively trains robust neural networks, which are easier to verify. Set-based trained neural networks consistently match or outperform those trained with state-of-the-art robust training approaches.

Submitted 19 April, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

arXiv:2312.08076 [pdf, other]

Provably-Correct Safety Protocol for Cooperative Platooning

Authors: Sebastian Mair, Matthias Althoff

Abstract: Cooperative Adaptive Cruise Control (CACC) is a well-studied technology for forming string-stable vehicle platoons. Ensuring collision avoidance is particularly difficult in CACC due to the small desired inter-vehicle spacing. We propose a safety protocol preventing collisions in a provably-correct manner while still maintaining a small distance to the preceding vehicle, by utilizing communicated braking capabilities. In addition, the safety of the protocol is ensured despite possible communication failures. While our concept can be applied to any CACC system, we particularly consider a class of CACCs, where the platoon vehicles successively agree on a consensus behavior. Our safety protocol is evaluated on various scenarios using the CommonRoad benchmark suite.

Submitted 13 December, 2023; originally announced December 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2310.19083 [pdf, other]

Backward Reachability Analysis of Perturbed Continuous-Time Linear Systems Using Set Propagation

Authors: Mark Wetzlinger, Matthias Althoff

Abstract: Backward reachability analysis computes the set of states that reach a target set under the competing influence of control input and disturbances. Depending on their interplay, the backward reachable set either represents all states that can be steered into the target set or all states that cannot avoid entering it -- the corresponding solutions can be used for controller synthesis and safety verification, respectively. A popular technique for backward reachable set computation solves Hamilton-Jacobi-Isaacs equations, which scales exponentially with the state dimension due to gridding the state space. In this work, we instead use set propagation techniques to design backward reachability algorithms for linear time-invariant systems. Crucially, the proposed algorithms scale only polynomially with the state dimension. Our numerical examples demonstrate the tightness of the obtained backward reachable sets and show an overwhelming improvement of our proposed algorithms over state-of-the-art methods regarding scalability, as systems with well over a hundred states can now be analyzed.

Submitted 5 April, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

Comments: 16 pages

arXiv:2310.06208 [pdf, other]

Human-Robot Gym: Benchmarking Reinforcement Learning in Human-Robot Collaboration

Authors: Jakob Thumm, Felix Trost, Matthias Althoff

Abstract: Deep reinforcement learning (RL) has shown promising results in robot motion planning with first attempts in human-robot collaboration (HRC). However, a fair comparison of RL approaches in HRC under the constraint of guaranteed safety is yet to be made. We, therefore, present human-robot gym, a benchmark suite for safe RL in HRC. Our benchmark suite provides eight challenging, realistic HRC tasks in a modular simulation framework. Most importantly, human-robot gym includes a safety shield that provably guarantees human safety. We are, thereby, the first to provide a benchmark suite to train RL agents that adhere to the safety specifications of real-world HRC. This bridges a critical gap between theoretic RL research and its real-world deployment. Our evaluation of six tasks led to three key results: (a) the diverse nature of the tasks offered by human-robot gym creates a challenging benchmark for state-of-the-art RL methods, (b) incorporating expert knowledge in RL training in the form of an action-based reward can outperform the expert, and (c) our agents negligibly overfit to training data.

Submitted 25 June, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

arXiv:2309.15492 [pdf, other]

EDGAR: An Autonomous Driving Research Platform -- From Feature Development to Real-World Application

Authors: Phillip Karle, Tobias Betz, Marcin Bosk, Felix Fent, Nils Gehrke, Maximilian Geisslinger, Luis Gressenbuch, Philipp Hafemann, Sebastian Huber, Maximilian Hübner, Sebastian Huch, Gemb Kaljavesi, Tobias Kerbl, Dominik Kulmer, Tobias Mascetta, Sebastian Maierhofer, Florian Pfab, Filip Rezabek, Esteban Rivera, Simon Sagmeister, Leander Seidlitz, Florian Sauerbeck, Ilir Tahiraj, Rainer Trauth, Nico Uhlemann, et al. (9 additional authors not shown)

Abstract: While current research and development of autonomous driving primarily focuses on developing new features and algorithms, the transfer from isolated software components into an entire software stack has been covered sparsely. Besides that, due to the complexity of autonomous software stacks and public road traffic, the optimal validation of entire stacks is an open research problem. Our paper targets these two aspects. We present our autonomous research vehicle EDGAR and its digital twin, a detailed virtual duplication of the vehicle. While the vehicle's setup is closely related to the state of the art, its virtual duplication is a valuable contribution as it is crucial for a consistent validation process from simulation to real-world tests. In addition, different development teams can work with the same model, making integration and testing of the software stacks much easier, significantly accelerating the development process. The real and virtual vehicles are embedded in a comprehensive development environment, which is also introduced. All parameters of the digital twin are provided open-source at https://github.com/TUMFTM/edgar_digital_twin.

Submitted 16 January, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.11944 [pdf, other]

Reachability Analysis of ARMAX Models

Authors: Laura Lützow, Matthias Althoff

Abstract: Reachability analysis is a powerful tool for computing the set of states or outputs reachable for a system. While previous work has focused on systems described by state-space models, we present the first methods to compute reachable sets of ARMAX models - one of the most common input-output models originating from data-driven system identification. The first approach we propose can only be used with dependency-preserving set representations such as symbolic zonotopes, while the second one is valid for arbitrary set representations but relies on a reformulation of the ARMAX model. By analyzing the computational complexities, we show that both approaches scale quadratically with respect to the time horizon of the reachability problem when using symbolic zonotopes. To reduce the computational complexity, we propose a third approach that scales linearly with respect to the time horizon when using set representations that are closed under Minkowski addition and linear transformation and that satisfy that the computational complexity of the Minkowski sum is independent of the representation size of the operands. Our numerical experiments demonstrate that the reachable sets of ARMAX models are tighter than the reachable sets of equivalent state space models in case of unknown initial states. Therefore, this methodology has the potential to significantly reduce the conservatism of various verification techniques.

Submitted 28 September, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

Comments: ©2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

arXiv:2309.08399 [pdf, other]

Optimizing Modular Robot Composition: A Lexicographic Genetic Algorithm Approach

Authors: Jonathan Külz, Matthias Althoff

Abstract: Industrial robots are designed as general-purpose hardware with limited ability to adapt to changing task requirements or environments. Modular robots, on the other hand, offer flexibility and can be easily customized to suit diverse needs. The morphology, i.e., the form and structure of a robot, significantly impacts the primary performance metrics acquisition cost, cycle time, and energy efficiency. However, identifying an optimal module composition for a specific task remains an open problem, presenting a substantial hurdle in developing task-tailored modular robots. Previous approaches either lack adequate exploration of the design space or the possibility to adapt to complex tasks. We propose combining a genetic algorithm with a lexicographic evaluation of solution candidates to overcome this problem and navigate search spaces exceeding those in prior work by magnitudes in the number of possible compositions. We demonstrate that our approach outperforms a state-of-the-art baseline and is able to synthesize modular robots for industrial tasks in cluttered environments.

Submitted 4 March, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: ©2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

arXiv:2307.13977 [pdf, ps, other]

doi 10.1016/j.ifacol.2023.10.028

Formal Verification of Robotic Contact Tasks via Reachability Analysis

Authors: Chencheng Tang, Matthias Althoff

Abstract: Verifying the correct behavior of robots in contact tasks is challenging due to model uncertainties associated with contacts. Standard methods for testing often fall short since all (uncountably many) solutions cannot be obtained. Instead, we propose to formally and efficiently verify robot behaviors in contact tasks using reachability analysis, which enables checking all the reachable states against user-provided specifications. To this end, we extend the state of the art in reachability analysis for hybrid (mixed discrete and continuous) dynamics subject to discrete-time input trajectories. In particular, we present a novel and scalable guard intersection approach to reliably compute the complex behavior caused by contacts. We model robots subject to contacts as hybrid automata in which crucial time delays are included. The usefulness of our approach is demonstrated by verifying safe human-robot interaction in the presence of constrained collisions, which was out of reach for existing methods.

Submitted 26 July, 2023; originally announced July 2023.

Comments: This work has been accepted by the 22nd IFAC World Congress (2023 in Yokohama, Japan)

arXiv:2305.17443 [pdf, other]

doi 10.1109/TIV.2023.3317977

Resilience in Platoons of Cooperative Heterogeneous Vehicles: Self-organization Strategies and Provably-correct Design

Authors: Di Liu, Sebastian Mair, Kang Yang, Simone Baldi, Paolo Frasca, Matthias Althoff

Abstract: This work proposes provably-correct self-organizing strategies for platoons of heterogeneous vehicles. We refer to self-organization as the capability of a platoon to autonomously homogenize to a common group behavior. We show that self-organization promotes resilience to acceleration limits and communication failures, i.e., homogenizing to a common group behavior makes the platoon recover from these causes of impairments. In the presence of acceleration limits, resilience is achieved by self-organizing to a common constrained group behavior that prevents the vehicles from hitting their acceleration limits. In the presence of communication failures, resilience is achieved by self-organizing to a common group observer to estimate the missing information. Stability of the self-organization mechanism is studied analytically, and correctness with respect to traffic actions (e.g. emergency braking, cut-in, merging) is realized through a provably-correct safety layer. Numerical validations via the platooning toolbox OpenCDA in CARLA and via the CommonRoad platform confirm improved performance through self-organization and the provably-correct safety layer.

Submitted 22 February, 2024; v1 submitted 27 May, 2023; originally announced May 2023.

arXiv:2305.10080 [pdf, other]

doi 10.1109/ITSC57777.2023.10422422

Automatic Traffic Scenario Conversion from OpenSCENARIO to CommonRoad

Authors: Yuanfei Lin, Michael Ratzel, Matthias Althoff

Abstract: Scenarios are a crucial element for developing, testing, and verifying autonomous driving systems. However, open-source scenarios are often formulated using different terminologies. This limits their usage across different applications as many scenario representation formats are not directly compatible with each other. To address this problem, we present the first open-source converter from the OpenSCENARIO format to the CommonRoad format, which are two of the most popular scenario formats used in autonomous driving. Our converter employs a simulation tool to execute the dynamic elements defined by OpenSCENARIO. The converter is available at commonroad.in.tum.de and we demonstrate its usefulness by converting publicly available scenarios in the OpenSCENARIO format and evaluating them using CommonRoad tools.

Submitted 19 July, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

Comments: 6 pages, 4 figures, ITSC 2023 accepted

arXiv:2305.01932 [pdf, other]

Fully Automatic Neural Network Reduction for Formal Verification

Authors: Tobias Ladner, Matthias Althoff

Abstract: Formal verification of neural networks is essential before their deployment in safety-critical applications. However, existing methods for formally verifying neural networks are not yet scalable enough to handle practical problems involving a large number of neurons. We address this challenge by introducing a fully automatic and sound reduction of neural networks using reachability analysis. The soundness ensures that the verification of the reduced network entails the verification of the original network. To the best of our knowledge, we present the first sound reduction approach that is applicable to neural networks with any type of element-wise activation function, such as ReLU, sigmoid, and tanh. The network reduction is computed on the fly while simultaneously verifying the original network and its specifications. All parameters are automatically tuned to minimize the network size without compromising verifiability. We further show the applicability of our approach to convolutional neural networks by explicitly exploiting similar neighboring pixels. Our evaluation shows that our approach can reduce the number of neurons to a fraction of the original number of neurons with minor outer-approximation and thus reduce the verification time to a similar degree.

Submitted 23 April, 2024; v1 submitted 3 May, 2023; originally announced May 2023.

Comments: under review

arXiv:2303.05173 [pdf, other]

M-Representation of Polytopes

Authors: Sebastian Sigl, Matthias Althoff

Abstract: We introduce the M-representation of polytopes, which makes it possible to compute linear transformations, convex hulls, and Minkowski sums with linear complexity in the dimension of the polytopes. When the polytope is a convex hull of a zonotope and a polytope, the representation size can be smaller than any of the known representations (V-representation, H-representation, and Z-representation). We also provide a variant of the M-representation, the chain representation, which is more compact and can be used directly to compute linear transformations and convex hulls; all other operations on the chain representation require a conversion to the M-representation.

Submitted 9 March, 2023; originally announced March 2023.

arXiv:2303.04218 [pdf, other]

Deep Occupancy-Predictive Representations for Autonomous Driving

Authors: Eivind Meyer, Lars Frederik Peiss, Matthias Althoff

Abstract: Manually specifying features that capture the diversity in traffic environments is impractical. Consequently, learning-based agents cannot realize their full potential as neural motion planners for autonomous vehicles. Instead, this work proposes to learn which features are task-relevant. Given its immediate relevance to motion planning, our proposed architecture encodes the probabilistic occupancy map as a proxy for obtaining pre-trained state representations. By leveraging a map-aware graph formulation of the environment, our agent-centric encoder generalizes to arbitrary road networks and traffic situations. We show that our approach significantly improves the downstream performance of a reinforcement learning agent operating in urban traffic environments.

Submitted 7 March, 2023; originally announced March 2023.

Comments: Accepted at ICRA 2023

arXiv:2303.03339 [pdf, other]

Reducing Safety Interventions in Provably Safe Reinforcement Learning

Authors: Jakob Thumm, Guillaume Pelat, Matthias Althoff

Abstract: Deep Reinforcement Learning (RL) has shown promise in addressing complex robotic challenges. In real-world applications, RL is often accompanied by failsafe controllers as a last resort to avoid catastrophic events. While necessary for safety, these interventions can result in undesirable behaviors, such as abrupt braking or aggressive steering. This paper proposes two safety intervention reduction methods: proactive replacement and proactive projection, which change the action of the agent if it leads to a potential failsafe intervention. These approaches are compared to state-of-the-art constrained RL on the OpenAI safety gym benchmark and a human-robot collaboration task. Our study demonstrates that the combination of our method with provably safe RL leads to high-performing policies with zero safety violations and a low number of failsafe interventions. Our versatile method can be applied to a wide range of real-world robotic tasks, while effectively improving safety without sacrificing task performance.

Submitted 25 September, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

Comments: 8 pages, 6 figures

arXiv:2302.01259 [pdf, other]

Geometric Deep Learning for Autonomous Driving: Unlocking the Power of Graph Neural Networks With CommonRoad-Geometric

Authors: Eivind Meyer, Maurice Brenner, Bowen Zhang, Max Schickert, Bilal Musani, Matthias Althoff

Abstract: Heterogeneous graphs offer powerful data representations for traffic, given their ability to model the complex interaction effects among a varying number of traffic participants and the underlying road infrastructure. With the recent advent of graph neural networks (GNNs) as the accompanying deep learning framework, the graph structure can be efficiently leveraged for various machine learning applications such as trajectory prediction. As a first of its kind, our proposed Python framework offers an easy-to-use and fully customizable data processing pipeline to extract standardized graph datasets from traffic scenarios. Providing a platform for GNN-based autonomous driving research, it improves comparability between approaches and allows researchers to focus on model implementation instead of dataset curation.

Submitted 24 April, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

Comments: Presented at IV 2023

arXiv:2212.06129 [pdf, other]

Safe Reinforcement Learning with Probabilistic Guarantees Satisfying Temporal Logic Specifications in Continuous Action Spaces

Authors: Hanna Krasowski, Prithvi Akella, Aaron D. Ames, Matthias Althoff

Abstract: Vanilla Reinforcement Learning (RL) can efficiently solve complex tasks but does not provide any guarantees on system behavior. To bridge this gap, we propose a three-step safe RL procedure for continuous action spaces that provides probabilistic guarantees with respect to temporal logic specifications. First, our approach probabilistically verifies a candidate controller with respect to a temporal logic specification while randomizing the control inputs to the system within a bounded set. Second, we improve the performance of this probabilistically verified controller by adding an RL agent that optimizes the verified controller for performance in the same bounded set around the control input. Third, we verify probabilistic safety guarantees with respect to temporal logic specifications for the learned agent. Our approach is efficiently implementable for continuous action and state spaces. The separation of safety verification and performance improvement into two distinct steps realizes both explicit probabilistic safety guarantees and a straightforward RL setup that focuses on performance. We evaluate our approach on an evasion task where a robot has to reach a goal while evading a dynamic obstacle with a specific maneuver. Our results show that our safe RL approach leads to efficient learning while maintaining its probabilistic safety specification.

Submitted 28 September, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

arXiv:2210.10691 [pdf, ps, other]

doi 10.1109/OJCSYS.2023.3256305

Provably Safe Reinforcement Learning via Action Projection using Reachability Analysis and Polynomial Zonotopes

Authors: Niklas Kochdumper, Hanna Krasowski, Xiao Wang, Stanley Bak, Matthias Althoff

Abstract: While reinforcement learning produces very promising results for many applications, its main disadvantage is the lack of safety guarantees, which prevents its use in safety-critical systems. In this work, we address this issue by a safety shield for nonlinear continuous systems that solve reach-avoid tasks. Our safety shield prevents applying potentially unsafe actions from a reinforcement learning agent by projecting the proposed action to the closest safe action. This approach is called action projection and is implemented via mixed-integer optimization. The safety constraints for action projection are obtained by applying parameterized reachability analysis using polynomial zonotopes, which makes it possible to accurately capture the nonlinear effects of the actions on the system. In contrast to other state-of-the-art approaches for action projection, our safety shield can efficiently handle input constraints and dynamic obstacles, eases incorporation of the spatial robot dimensions into the safety constraints, guarantees robust safety despite process noise and measurement errors, and is well suited for high-dimensional systems, as we demonstrate on several challenging benchmark systems.

Submitted 14 March, 2023; v1 submitted 19 October, 2022; originally announced October 2022.
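
The abstract above describes action projection as mapping a proposed action to the closest safe action. As a rough illustration of that idea only, the following sketch assumes the safe action set is an axis-aligned box, in which case the Euclidean projection reduces to component-wise clipping. This is a deliberate simplification and not the paper's method, which uses mixed-integer optimization over constraints obtained from polynomial zonotopes. The function name `project_to_box` and the example bounds are hypothetical.

```python
from typing import List, Sequence


def project_to_box(
    action: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> List[float]:
    """Return the point of the box [lower, upper] closest to `action`.

    Assumption: the safe set is an axis-aligned box, so the Euclidean
    projection is simply component-wise clipping.
    """
    if not (len(action) == len(lower) == len(upper)):
        raise ValueError("action and bounds must have the same dimension")
    if any(lo > hi for lo, hi in zip(lower, upper)):
        raise ValueError("empty safe set: a lower bound exceeds its upper bound")
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, lower, upper)]


# Hypothetical example: the agent proposes an action outside the safe box.
proposed = [1.5, -0.2]
safe_action = project_to_box(proposed, lower=[-1.0, -1.0], upper=[1.0, 1.0])
print(safe_action)  # [1.0, -0.2]
```

An action that is already inside the box is returned unchanged, which mirrors the property that a projection-based shield only intervenes when the proposed action is unsafe.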

arXiv:2209.09321 [pdf, other]

doi 10.1109/TAC.2023.3292008

Fully-Automated Verification of Linear Systems Using Inner- and Outer-Approximations of Reachable Sets

Authors: Mark Wetzlinger, Niklas Kochdumper, Stanley Bak, Matthias Althoff

Abstract: Reachability analysis is a formal method to guarantee safety of dynamical systems under the influence of uncertainties. A substantial bottleneck of all reachability algorithms is the necessity to adequately tune specific algorithm parameters, such as the time step size, which requires expert knowledge. In this work, we solve this issue with a fully automated reachability algorithm that tunes all algorithm parameters internally such that the reachable set enclosure respects a user-defined approximation error bound in terms of the Hausdorff distance to the exact reachable set. Moreover, this bound can be used to extract an inner-approximation of the reachable set from the outer-approximation using the Minkowski difference. Finally, we propose a novel verification algorithm that automatically refines the accuracy of the outer-approximation and inner-approximation until specifications given by time-varying safe and unsafe sets can be verified or falsified. The numerical evaluation demonstrates that our verification algorithm successfully verifies or falsifies benchmarks from different domains without requiring manual tuning.

Submitted 22 February, 2024; v1 submitted 19 September, 2022; originally announced September 2022.

Comments: 16 pages

MSC Class: 65L70 ACM Class: G.1.7

arXiv:2209.07881 [pdf, other]

doi 10.1109/LRA.2023.3324582

Model Predictive Robustness of Signal Temporal Logic Predicates

Authors: Yuanfei Lin, Haoxuan Li, Matthias Althoff

Abstract: The robustness of signal temporal logic not only assesses whether a signal adheres to a specification but also provides a measure of how much a formula is fulfilled or violated. The calculation of robustness is based on evaluating the robustness of underlying predicates. However, the robustness of predicates is usually defined in a model-free way, i.e., without including the system dynamics. Moreover, it is often nontrivial to define the robustness of complicated predicates precisely. To address these issues, we propose a notion of model predictive robustness, which provides a more systematic way of evaluating robustness compared to previous approaches by considering model-based predictions. In particular, we use Gaussian process regression to learn the robustness based on precomputed predictions so that robustness values can be efficiently computed online. We evaluate our approach for the use case of autonomous driving with predicates used in formalized traffic rules on a recorded dataset, which highlights the advantage of our approach compared to traditional approaches in terms of precision. By incorporating our robustness definitions into a trajectory planner, autonomous vehicles obey traffic rules more robustly than human drivers in the dataset.

Submitted 14 October, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

Comments: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
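
For orientation, the model-free baseline that the abstract above contrasts against can be sketched in a few lines: the robustness of a predicate `x > c` is taken as the signed margin `x - c`, and the temporal operators "always" and "eventually" take the minimum and maximum over a finite, discrete-time trace. This is the standard quantitative semantics for these operators, not the paper's model predictive robustness. The function names and the sample distance trace are hypothetical.

```python
from typing import List, Sequence


def predicate_robustness(signal: Sequence[float], threshold: float) -> List[float]:
    """Model-free robustness of the predicate `x > threshold` at each time step.

    Positive values mean the predicate holds with that margin,
    negative values mean it is violated by that margin.
    """
    return [x - threshold for x in signal]


def always(robustness: Sequence[float]) -> float:
    """Robustness of 'always phi' over a finite trace: the worst-case margin."""
    if not robustness:
        raise ValueError("trace must not be empty")
    return min(robustness)


def eventually(robustness: Sequence[float]) -> float:
    """Robustness of 'eventually phi' over a finite trace: the best-case margin."""
    if not robustness:
        raise ValueError("trace must not be empty")
    return max(robustness)


# Hypothetical trace: distance to a leading vehicle in metres.
distances = [5.0, 3.0, 2.5, 4.0]
rho = predicate_robustness(distances, threshold=2.0)
print(always(rho))      # 0.5: the distance always exceeds 2.0 by at least 0.5
print(eventually(rho))  # 3.0: at the best time step the margin is 3.0
```

The sign of the result indicates satisfaction or violation and the magnitude indicates by how much, which is the property the abstract refers to when it says robustness measures how much a formula is fulfilled or violated.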

arXiv:2209.06758 [pdf, other]

Timor Python: A Toolbox for Industrial Modular Robotics

Authors: Jonathan Külz, Matthias Mayer, Matthias Althoff

Abstract: Modular Reconfigurable Robots (MRRs) represent an exciting path forward for industrial robotics, opening up new possibilities for robot design. Compared to monolithic manipulators, they promise greater flexibility, improved maintainability, and cost-efficiency. However, there is no tool or standardized way to model and simulate assemblies of modules in the same way it has been done for robotic manipulators for decades. We introduce the Toolbox for Industrial Modular Robotics (Timor), a Python toolbox to bridge this gap and integrate modular robotics into existing simulation and optimization pipelines. Our open-source library offers model generation and task-based configuration optimization for MRRs. It can easily be integrated with existing simulation tools - not least by offering URDF export of arbitrary modular robot assemblies. Moreover, our experimental study demonstrates the effectiveness of Timor as a tool for designing modular robots optimized for specific use cases.

Submitted 15 September, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

arXiv:2207.02715 [pdf, ps, other]

doi 10.1007/978-3-031-33170-1_2

Open- and Closed-Loop Neural Network Verification using Polynomial Zonotopes

Authors: Niklas Kochdumper, Christian Schilling, Matthias Althoff, Stanley Bak

Abstract: We present a novel approach to efficiently compute tight non-convex enclosures of the image through neural networks with ReLU, sigmoid, or hyperbolic tangent activation functions. In particular, we abstract the input-output relation of each neuron by a polynomial approximation, which is evaluated in a set-based manner using polynomial zonotopes. While our approach can also be beneficial for open-loop neural network verification, our main application is reachability analysis of neural network controlled systems, where polynomial zonotopes are able to capture the non-convexity caused by the neural network as well as the system dynamics. This results in a superior performance compared to other methods, as we demonstrate on various benchmarks.

Submitted 17 April, 2023; v1 submitted 6 July, 2022; originally announced July 2022.

Journal ref: NFM 2023

arXiv:2205.06750 [pdf, other]

Provably Safe Reinforcement Learning: Conceptual Analysis, Survey, and Benchmarking

Authors: Hanna Krasowski, Jakob Thumm, Marlon Müller, Lukas Schäfer, Xiao Wang, Matthias Althoff

Abstract: Ensuring the safety of reinforcement learning (RL) algorithms is crucial to unlock their potential for many real-world tasks. However, vanilla RL and most safe RL approaches do not guarantee safety. In recent years, several methods have been proposed to provide hard safety guarantees for RL, which is essential for applications where unsafe actions could have disastrous consequences. Nevertheless, there is no comprehensive comparison of these provably safe RL methods. Therefore, we introduce a categorization of existing provably safe RL methods, present the conceptual foundations for both continuous and discrete action spaces, and empirically benchmark existing methods. We categorize the methods based on how they adapt the action: action replacement, action projection, and action masking. Our experiments on an inverted pendulum and a quadrotor stabilization task indicate that action replacement is the best-performing approach for these applications despite its comparatively simple realization. Furthermore, adding a reward penalty every time the safety verification is engaged improved training performance in our experiments. Finally, we provide practical guidance on selecting provably safe RL approaches depending on the safety specification, RL algorithm, and type of action space.

Submitted 18 November, 2023; v1 submitted 13 May, 2022; originally announced May 2022.

Comments: The published paper is available at https://openreview.net/forum?id=mcN0ezbnzO

Journal ref: Transactions on Machine Learning Research, 2023

arXiv:2205.06311 [pdf, other]

Provably Safe Deep Reinforcement Learning for Robotic Manipulation in Human Environments

Authors: Jakob Thumm, Matthias Althoff

Abstract: Deep reinforcement learning (RL) has shown promising results in the motion planning of manipulators. However, no method guarantees the safety of highly dynamic obstacles, such as humans, in RL-based manipulator control. This lack of formal safety assurances prevents the application of RL for manipulators in real-world human environments. Therefore, we propose a shielding mechanism that ensures ISO-verified human safety while training and deploying RL algorithms on manipulators. We utilize a fast reachability analysis of humans and manipulators to guarantee that the manipulator comes to a complete stop before a human is within its range. Our proposed method guarantees safety and significantly improves the RL performance by preventing episode-ending collisions. We demonstrate the performance of our proposed method in simulation using human motion capture data.

Submitted 12 May, 2022; originally announced May 2022.

Comments: Accepted for ICRA 2022

arXiv:2205.06212 [pdf, other]

Contingency-constrained economic dispatch with safe reinforcement learning

Authors: Michael Eichelbeck, Hannah Markgraf, Matthias Althoff

Abstract: Future power systems will rely heavily on micro grids with a high share of decentralised renewable energy sources and energy storage systems. The high complexity and uncertainty in this context might make conventional power dispatch strategies infeasible. Reinforcement learning (RL)-based controllers can address this challenge; however, they cannot themselves provide safety guarantees, which prevents their deployment in practice. To overcome this limitation, we propose a formally validated RL controller for economic dispatch. We extend conventional constraints by a time-dependent constraint encoding the islanding contingency. The contingency constraint is computed using set-based backwards reachability analysis and actions of the RL agent are verified through a safety layer. Unsafe actions are projected into the safe action space while leveraging constrained zonotope set representations for computational efficiency. The developed approach is demonstrated on a residential use case using real-world measurements.

Submitted 20 July, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

arXiv:2203.09337 [pdf, other]

CoBRA: A Composable Benchmark for Robotics Applications

Authors: Matthias Mayer, Jonathan Külz, Matthias Althoff

Abstract: Selecting an optimal robot, its base pose, and trajectory for a given task is currently mainly done by human expertise or trial and error. To evaluate automatic approaches to this combined optimization problem, we introduce a benchmark suite encompassing a unified format for robots, environments, and task descriptions. Our benchmark suite is especially useful for modular robots, where the multitude of robots that can be assembled creates a host of additional parameters to optimize. We include tasks such as machine tending and welding in synthetic environments and 3D scans of real-world machine shops. All benchmarks are accessible through https://cobra.cps.cit.tum.de, a platform to conveniently share, reference, and compare tasks, robot models, and solutions.

Submitted 21 March, 2024; v1 submitted 17 March, 2022; originally announced March 2022.

Comments: 7 pages, 5 Figures, 5 Tables Final version for IEEE ICRA'24

arXiv:2103.01626 [pdf, other]

doi 10.1109/TRO.2023.3277268

Guarantees for Real Robotic Systems: Unifying Formal Controller Synthesis and Reachset-Conformant Identification

Authors: Stefan B. Liu, Bastian Schürmann, Matthias Althoff

Abstract: Robots are used increasingly often in safety-critical scenarios, such as robotic surgery or human-robot interaction. To ensure stringent performance criteria, formal controller synthesis is a promising direction to guarantee that robots behave as desired. However, formally ensured properties only transfer to the real robot when the model is appropriate. We address this problem by combining the identification of a reachset-conformant model with controller synthesis. Since the reachset-conformant model contains all the measured behaviors of the real robot, the safety properties of the model transfer to the real robot. The transferability is demonstrated by experiments on a real robot, for which we synthesize tracking controllers.

Submitted 12 September, 2023; v1 submitted 2 March, 2021; originally announced March 2021.

Comments: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Journal ref: IEEE Transactions on Robotics

arXiv:2010.11097 [pdf, other]

doi 10.1016/j.ejcon.2023.100786

Privacy Preserving Set-Based Estimation Using Partially Homomorphic Encryption

Authors: Amr Alanwar, Victor Gassmann, Xingkang He, Hazem Said, Henrik Sandberg, Karl Henrik Johansson, Matthias Althoff

Abstract: Set-based estimation has gained a lot of attention due to its ability to guarantee state enclosures for safety-critical systems. However, collecting measurements from distributed sensors often requires outsourcing the set-based operations to an aggregator node, raising many privacy concerns. To address this problem, we present set-based estimation protocols using partially homomorphic encryption that preserve the privacy of the measurements and sets bounding the estimates. We consider a linear discrete-time dynamical system with bounded modeling and measurement uncertainties. Sets are represented by zonotopes and constrained zonotopes as they can compactly represent high-dimensional sets and are closed under linear maps and Minkowski addition. By selectively encrypting parameters of the set representations, we establish the notion of encrypted sets and intersect sets in the encrypted domain, which enables guaranteed state estimation while ensuring privacy. In particular, we show that our protocols achieve computational privacy using the cryptographic notion of computational indistinguishability. We demonstrate the efficiency of our approach by localizing a real mobile quadcopter using ultra-wideband wireless devices.

Submitted 25 February, 2023; v1 submitted 19 October, 2020; originally announced October 2020.

Comments: This paper is accepted at the European Journal of Control
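
The abstract above notes that zonotopes are closed under linear maps and Minkowski addition. The following minimal sketch illustrates why, using the usual center-plus-generators description of a zonotope: a linear map is applied to the center and to every generator, and a Minkowski sum adds the centers and concatenates the generator lists. The class and method names are hypothetical and plain Python lists are used for self-containment. Nothing here reflects the paper's encrypted protocols.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


@dataclass
class Zonotope:
    """The set {center + sum_i beta_i * generators[i] : -1 <= beta_i <= 1}."""

    center: Vector
    generators: List[Vector]

    def linear_map(self, matrix: Matrix) -> "Zonotope":
        # Mapping the center and each generator yields the mapped zonotope.
        return Zonotope(
            mat_vec(matrix, self.center),
            [mat_vec(matrix, g) for g in self.generators],
        )

    def minkowski_sum(self, other: "Zonotope") -> "Zonotope":
        # Centers add; generator lists are concatenated.
        center = [a + b for a, b in zip(self.center, other.center)]
        return Zonotope(center, self.generators + other.generators)

    def interval_hull(self) -> Tuple[Vector, Vector]:
        # Tightest axis-aligned box: radius is the sum of |generator entries|.
        radius = [
            sum(abs(g[i]) for g in self.generators)
            for i in range(len(self.center))
        ]
        lower = [c - r for c, r in zip(self.center, radius)]
        upper = [c + r for c, r in zip(self.center, radius)]
        return lower, upper


unit_box = Zonotope([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
segment = Zonotope([1.0, 1.0], [[0.5, 0.5]])
total = unit_box.minkowski_sum(segment)
print(total.interval_hull())  # ([-0.5, -0.5], [2.5, 2.5])
```

Both operations return another `Zonotope`, and the result of a Minkowski sum has as many generators as the two operands combined, which is one reason the representation stays compact in high dimensions.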

arXiv:2007.00691 [pdf, other]

doi 10.1109/ICMLA51294.2020.00042

Falsification-Based Robust Adversarial Reinforcement Learning

Authors: Xiao Wang, Saasha Nair, Matthias Althoff

Abstract: Reinforcement learning (RL) has achieved enormous progress in solving various sequential decision-making problems, such as control tasks in robotics. Since policies are overfitted to training environments, RL methods have often failed to generalize to safety-critical test scenarios. Robust adversarial RL (RARL) was previously proposed to train an adversarial network that applies disturbances to a system, which improves the robustness in test scenarios. However, an issue of neural network-based adversaries is that integrating system requirements without handcrafting sophisticated reward signals is difficult. Safety falsification methods allow one to find a set of initial conditions and an input sequence, such that the system violates a given property formulated in temporal logic. In this paper, we propose falsification-based RARL (FRARL): this is the first generic framework for integrating temporal logic falsification in adversarial learning to improve policy robustness. By applying our falsification method, we do not need to construct an extra reward function for the adversary. Moreover, we evaluate our approach on a braking assistance system and an adaptive cruise control system of autonomous vehicles. Our experimental results demonstrate that policies trained with a falsification-based adversary generalize better and show less violation of the safety specification in test scenarios than those trained without an adversary or with an adversarial network.

Submitted 20 March, 2023; v1 submitted 1 July, 2020; originally announced July 2020.

Comments: 8 pages, 4 figures

Journal ref: IEEE International Conference on Machine Learning and Applications (ICMLA), 2020, pp. 205-212

arXiv:2006.12091 [pdf, other]

doi 10.1109/CDC42340.2020.9304431

Adaptive Parameter Tuning for Reachability Analysis of Linear Systems

Authors: Mark Wetzlinger, Niklas Kochdumper, Matthias Althoff

Abstract: Despite the possibility to quickly compute reachable sets of large-scale linear systems, current methods are not yet widely applied by practitioners. The main reason for this is probably that current approaches are not push-button-capable and still require manually setting crucial parameters, such as time step sizes and the accuracy of the used set representation -- these settings require expert knowledge. We present a generic framework to automatically find near-optimal parameters for reachability analysis of linear systems given a user-defined accuracy. To limit the computational overhead as much as possible, our methods tune all relevant parameters during runtime. We evaluate our approach on benchmarks from the ARCH competition as well as on random examples. Our results show that our new framework verifies the selected benchmarks faster than manually-tuned parameters and is an order of magnitude faster compared to genetic algorithms.

Submitted 22 February, 2024; v1 submitted 22 June, 2020; originally announced June 2020.

arXiv:2006.04260 [pdf, other]

Formal synthesis of closed-form sampled-data controllers for nonlinear continuous-time systems under STL specifications

Authors: Cees F. Verdier, Niklas Kochdumper, Matthias Althoff, Manuel Mazo Jr

Abstract: We propose a counterexample-guided inductive synthesis framework for the formal synthesis of closed-form sampled-data controllers for nonlinear systems to meet STL specifications over finite-time trajectories. Rather than stating the STL specification for a single initial condition, we consider an (infinite and bounded) set of initial conditions. Candidate solutions are proposed using genetic programming, which evolves controllers based on a finite number of simulations. Subsequently, the best candidate is verified using reachability analysis; if the candidate solution does not satisfy the specification, an initial condition violating the specification is extracted as a counterexample. Based on this counterexample, candidate solutions are refined until eventually a solution is found (or a user-specified number of iterations is met). The resulting sampled-data controller is expressed as a closed-form expression, enabling both interpretability and the implementation in embedded hardware with limited memory and computation power. The effectiveness of our approach is demonstrated for multiple systems.

Submitted 20 March, 2021; v1 submitted 7 June, 2020; originally announced June 2020.

Comments: submitted to Automatica
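
The propose, verify, and refine loop described in the abstract above can be illustrated with a toy sketch. This is not the authors' method: the scalar linear system, the fixed list of candidate gains (standing in for genetic programming), the grid check (standing in for reachability analysis), and all function names are assumptions made purely for illustration.

```python
def final_state(k, x0, steps=20, dt=0.1):
    """Simulate the toy system x_{t+1} = x_t + dt * (x_t + u_t) with u_t = -k * x_t."""
    x = x0
    for _ in range(steps):
        x = x + dt * (x - k * x)
    return x

def satisfies(k, x0, bound=0.5):
    """Toy specification: the final state must lie in [-bound, bound]."""
    return abs(final_state(k, x0)) <= bound

def verify(k, grid):
    """Stand-in verifier: return a violating initial state, or None if none is found."""
    for x0 in grid:
        if not satisfies(k, x0):
            return x0
    return None

def cegis(candidates, samples, grid, max_iterations=10):
    """Counterexample-guided loop over a fixed candidate list (hypothetical helper)."""
    samples = list(samples)
    for _ in range(max_iterations):
        # Propose: the first candidate consistent with every sample seen so far.
        proposal = next(
            (k for k in candidates if all(satisfies(k, x0) for x0 in samples)),
            None,
        )
        if proposal is None:
            return None, samples
        # Verify: a counterexample becomes a new sample for the next round.
        counterexample = verify(proposal, grid)
        if counterexample is None:
            return proposal, samples
        samples.append(counterexample)
    return None, samples

grid = [i / 10 for i in range(-10, 11)]  # initial states in [-1, 1]
gain, seen = cegis([0.5, 1.0, 1.2, 1.5, 2.0], [0.1], grid)
```

In this toy run the first proposal passes the single initial sample but fails verification, and the extracted counterexample rules out the weaker gains in the next round. A grid check gives no formal guarantee, which is why the abstract relies on reachability analysis for the verification step.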

arXiv:2005.08849 [pdf, ps, other]

Constrained Polynomial Zonotopes

Authors: Niklas Kochdumper, Matthias Althoff

Abstract: We introduce constrained polynomial zonotopes, a novel non-convex set representation that is closed under linear map, Minkowski sum, Cartesian product, convex hull, intersection, union, and quadratic as well as higher-order maps. We show that the computational complexity of the above-mentioned set operations for constrained polynomial zonotopes is at most polynomial in the representation size. The fact that constrained polynomial zonotopes are generalizations of zonotopes, polytopes, polynomial zonotopes, Taylor models, and ellipsoids, further substantiates the relevance of this new set representation. The conversion from other set representations to constrained polynomial zonotopes is at most polynomial with respect to the dimension.

Submitted 3 April, 2023; v1 submitted 18 May, 2020; originally announced May 2020.

arXiv:2003.11959 [pdf]

doi 10.1109/TITS.2020.3006767

Pedestrian Models for Autonomous Driving Part II: High-Level Models of Human Behavior

Authors: Fanta Camara, Nicola Bellotto, Serhan Cosar, Florian Weber, Dimitris Nathanael, Matthias Althoff, Jingyuan Wu, Johannes Ruenz, André Dietrich, Gustav Markkula, Anna Schieben, Fabio Tango, Natasha Merat, Charles W. Fox

Abstract: Autonomous vehicles (AVs) must share space with pedestrians, both in carriageway cases such as cars at pedestrian crossings and off-carriageway cases such as delivery vehicles navigating through crowds on pedestrianized high-streets. Unlike static obstacles, pedestrians are active agents with complex, interactive motions. Planning AV actions in the presence of pedestrians thus requires modelling of their probable future behaviour as well as detecting and tracking them. This narrative review article is Part II of a pair, together surveying the current technology stack involved in this process, organising recent research into a hierarchical taxonomy ranging from low-level image detection to high-level psychological models, from the perspective of an AV designer. This self-contained Part II covers the higher levels of this stack, consisting of models of pedestrian behaviour, from prediction of individual pedestrians' likely destinations and paths, to game-theoretic models of interactions between pedestrians and autonomous vehicles. This survey clearly shows that, although there are good models for optimal walking behaviour, high-level psychological and social modelling of pedestrian behaviour still remains an open research question that requires many conceptual issues to be clarified. Early work has been done on descriptive and qualitative models of behaviour, but much work is still needed to translate them into quantitative algorithms for practical AV control.

Submitted 20 July, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

Comments: Accepted for publication in the IEEE Transactions on Intelligent Transportation Systems

arXiv:2003.10347 [pdf, other]

doi 10.1016/j.jfranklin.2023.03.025

Distributed Set-Based Observers Using Diffusion Strategies

Authors: Amr Alanwar, Jagat Jyoti Rath, Hazem Said, Karl Henrik Johansson, Matthias Althoff

Abstract: We propose two distributed set-based observers using strip-based and set-propagation approaches for linear discrete-time dynamical systems with bounded modeling and measurement uncertainties. Both algorithms utilize a set-based diffusion step, which decreases the estimation errors and the size of estimated sets, and can be seen as a lightweight approach to achieve partial consensus between the distributed estimated sets. Every node shares its measurement with its neighbor in the measurement update step. In the diffusion step, the neighbors intersect their estimated sets using our novel lightweight zonotope intersection technique. A localization example demonstrates the applicability of our algorithms.

Submitted 18 June, 2023; v1 submitted 23 March, 2020; originally announced March 2020.

Comments: Accepted at Journal of the Franklin Institute

arXiv:2002.11669 [pdf]

doi 10.1109/TITS.2020.3006768

Pedestrian Models for Autonomous Driving Part I: Low-Level Models, from Sensing to Tracking

Authors: Fanta Camara, Nicola Bellotto, Serhan Cosar, Dimitris Nathanael, Matthias Althoff, Jingyuan Wu, Johannes Ruenz, André Dietrich, Charles W. Fox

Abstract: Autonomous vehicles (AVs) must share space with pedestrians, both in carriageway cases such as cars at pedestrian crossings and off-carriageway cases such as delivery vehicles navigating through crowds on pedestrianized high-streets. Unlike static obstacles, pedestrians are active agents with complex, interactive motions. Planning AV actions in the presence of pedestrians thus requires modelling of their probable future behaviour as well as detecting and tracking them. This narrative review article is Part I of a pair, together surveying the current technology stack involved in this process, organising recent research into a hierarchical taxonomy ranging from low-level image detection to high-level psychology models, from the perspective of an AV designer. This self-contained Part I covers the lower levels of this stack, from sensing, through detection and recognition, up to tracking of pedestrians. Technologies at these levels are found to be mature and available as foundations for use in high-level systems, such as behaviour modelling, prediction and interaction control.

Submitted 20 July, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

Comments: Accepted for publication in the IEEE Transactions on Intelligent Transportation Systems

arXiv:1910.08354 [pdf, other]

doi 10.1145/3365365.3382192

Utilizing Dependencies to Obtain Subsets of Reachable Sets

Authors: Niklas Kochdumper, Bastian Schürmann, Matthias Althoff

Abstract: Reachability analysis, in general, is a fundamental method that supports formally-correct synthesis, robust model predictive control, set-based observers, fault detection, invariant computation, and conformance checking, to name but a few. In many of these applications, one needs to compute a reachable set starting within a previously computed reachable set. While it was previously required to re-compute the entire reachable set, we demonstrate that one can leverage the dependencies of states within the previously computed set. As a result, we almost instantly obtain an over-approximative subset of a previously computed reachable set by evaluating analytical maps. The advantages of our novel method are demonstrated for falsification of systems, optimization over reachable sets, and synthesizing safe maneuver automata. In all of these applications, the computation time is reduced significantly.

Submitted 16 November, 2020; v1 submitted 18 October, 2019; originally announced October 2019.

Journal ref: Proceedings of the 23rd International Conference on Hybrid Systems: Computation and Control, 2020

arXiv:1910.07271 [pdf, other]

Representation of Polytopes as Polynomial Zonotopes

Authors: Niklas Kochdumper, Matthias Althoff

Abstract: We prove that each bounded polytope can be represented as a polynomial zonotope, which we refer to as the Z-representation of polytopes. Previous representations are the vertex representation (V-representation) and the halfspace representation (H-representation). Depending on the polytope, the Z-representation can be more compact than the V-representation and the H-representation. In addition, the Z-representation enables the computation of linear maps, Minkowski addition, and convex hull with a computational complexity that is polynomial in the representation size. The usefulness of the new representation is demonstrated by range bounding within polytopes.

Submitted 16 October, 2019; originally announced October 2019.

arXiv:1901.01780 [pdf, other]

doi 10.1109/TAC.2020.3024348

Sparse Polynomial Zonotopes: A Novel Set Representation for Reachability Analysis

Authors: Niklas Kochdumper, Matthias Althoff

Abstract: We introduce sparse polynomial zonotopes, a new set representation for formal verification of hybrid systems. Sparse polynomial zonotopes can represent non-convex sets and are generalizations of zonotopes, polytopes, and Taylor models. Operations like Minkowski sum, quadratic mapping, and reduction of the representation size can be computed with polynomial complexity w.r.t. the dimension of the system. In particular, for reachability analysis of nonlinear systems, the wrapping effect is substantially reduced using sparse polynomial zonotopes, as demonstrated by numerical examples. In addition, we can significantly reduce the computation time compared to zonotopes when dealing with nonlinear dynamics.

Submitted 16 November, 2020; v1 submitted 7 January, 2019; originally announced January 2019.

Journal ref: IEEE Transactions on Automatic Control 2020

arXiv:1712.00369 [pdf, ps, other]

doi 10.1109/TAC.2019.2906432

Reachability Analysis of Large Linear Systems with Uncertain Inputs in the Krylov Subspace

Authors: Matthias Althoff

Abstract: One often wishes for the ability to formally analyze large-scale systems; typically, however, one can either formally analyze a rather small system or informally analyze a large-scale system. This work tries to further close this performance gap for reachability analysis of linear systems. Reachability analysis can capture the whole set of possible solutions of a dynamic system and is thus used to prove that unsafe states are never reached; this requires full consideration of arbitrarily varying uncertain inputs, since sensor noise or disturbances usually do not follow any patterns. We use Krylov methods in this work to compute reachable sets for large-scale linear systems. While Krylov methods have been used before in reachability analysis, we overcome the previous limitation that inputs must be (piecewise) constant. As a result, we can compute reachable sets of systems with several thousand state variables for bounded, but arbitrarily varying inputs.

Submitted 5 August, 2020; v1 submitted 1 December, 2017; originally announced December 2017.

Journal ref: in IEEE Transactions on Automatic Control, vol. 65, no. 2, pp. 477-492, Feb. 2020

arXiv:1711.00493 [pdf, other]

Event-Triggered Diffusion Kalman Filters

Authors: Amr Alanwar, Hazem Said, Ankur Mehta, Matthias Althoff

Abstract: Distributed state estimation strongly depends on collaborative signal processing, which often requires excessive communication and computation to be executed on resource-constrained sensor nodes. To address this problem, we propose an event-triggered diffusion Kalman filter, which collects measurements and exchanges messages between nodes based on a local signal indicating the estimation error. On this basis, we develop an energy-aware state estimation algorithm that regulates the resource consumption in wireless networks and ensures the effectiveness of every consumed resource. The proposed algorithm does not require the nodes to share their local covariance matrices, and thereby considerably reduces the number of transmitted messages. To confirm its efficiency, we apply the proposed algorithm to the distributed simultaneous localization and time synchronization problem and evaluate it on a physical testbed of a mobile quadrotor node and stationary custom ultra-wideband wireless devices. The obtained experimental results indicate that the proposed algorithm saves 86% of the communication overhead associated with the original diffusion Kalman filter while degrading performance by only 16%. We make the Matlab code and the real testing data available online.

Submitted 18 February, 2020; v1 submitted 1 November, 2017; originally announced November 2017.

arXiv:1512.02794 [pdf, ps, other]

On Computing the Minkowski Difference of Zonotopes

Authors: Matthias Althoff

Abstract: Zonotopes are becoming an increasingly popular set representation for formal verification techniques. This is mainly due to their efficient representation and their favorable computational complexity of important operations in high-dimensional spaces. In particular, zonotopes are closed under Minkowski addition and linear maps, which can be very efficiently implemented. Unfortunately, zonotopes are not closed under Minkowski difference for dimensions greater than two. However, we present an algorithm that efficiently computes a halfspace representation of the Minkowski difference of two zonotopes. In addition, we present an efficient algorithm that computes an approximation of the Minkowski difference in generator representation. The efficiency of the proposed solution is demonstrated by numerical experiments. These experiments show a reduced computation time in comparison to that when first the halfspace representation of zonotopes is obtained and the Minkowski difference is performed subsequently.

Submitted 23 August, 2022; v1 submitted 9 December, 2015; originally announced December 2015.
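
The closure of zonotopes under Minkowski addition and linear maps, mentioned in the abstract above, follows directly from the generator representation. The sketch below is a minimal illustration in plain Python, not code from the paper; the tuple layout `(center, generators)` and the function names are assumptions chosen for this example. The Minkowski difference discussed in the paper is not covered here.

```python
# A zonotope is stored as (center, generators) and denotes the set
# {c + sum_i b_i * g_i : -1 <= b_i <= 1}.

def minkowski_sum(z1, z2):
    """Add the centers and concatenate the two generator lists."""
    (c1, g1), (c2, g2) = z1, z2
    center = [a + b for a, b in zip(c1, c2)]
    return center, g1 + g2

def linear_map(matrix, z):
    """Apply the matrix to the center and to every generator."""
    def apply(v):
        return [sum(m * x for m, x in zip(row, v)) for row in matrix]
    c, gens = z
    return apply(c), [apply(g) for g in gens]

def interval_hull(z):
    """Tightest axis-aligned box enclosing the zonotope."""
    c, gens = z
    radius = [sum(abs(g[i]) for g in gens) for i in range(len(c))]
    return [(ci - r, ci + r) for ci, r in zip(c, radius)]

box = ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])   # the box [-1, 1]^2
segment = ([1.0, 0.0], [[0.5, 0.5]])           # a diagonal line segment
total = minkowski_sum(box, segment)
rotated = linear_map([[0.0, -1.0], [1.0, 0.0]], total)  # rotate by 90 degrees
```

Both operations return another `(center, generators)` pair, so the result is again a zonotope, and the cost is linear in the number of generators for the sum and one matrix-vector product per generator for the map.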

Showing 1–46 of 46 results for author: Althoff, M