-
NLP Verification: Towards a General Methodology for Certifying Robustness
Authors:
Marco Casadio,
Tanvi Dinkar,
Ekaterina Komendantskaya,
Luca Arnaboldi,
Matthew L. Daggitt,
Omri Isac,
Guy Katz,
Verena Rieser,
Oliver Lemon
Abstract:
Deep neural networks have exhibited substantial success in the field of Natural Language Processing and ensuring their safety and reliability is crucial: there are safety critical contexts where such models must be robust to variability or attack, and give guarantees over their output. Unlike Computer Vision, NLP lacks a unified verification methodology and, despite recent advancements in literatu…
▽ More
Deep neural networks have exhibited substantial success in the field of Natural Language Processing and ensuring their safety and reliability is crucial: there are safety critical contexts where such models must be robust to variability or attack, and give guarantees over their output. Unlike Computer Vision, NLP lacks a unified verification methodology and, despite recent advancements in literature, they are often light on the pragmatical issues of NLP verification. In this paper, we attempt to distil and evaluate general components of an NLP verification pipeline, that emerges from the progress in the field to date. Our contributions are two-fold. Firstly, we give a general (i.e. algorithm-independent) characterisation of verifiable subspaces that result from embedding sentences into continuous spaces. We identify, and give an effective method to deal with, the technical challenge of semantic generalisability of verified subspaces; and propose it as a standard metric in the NLP verification pipelines (alongside with the standard metrics of model accuracy and model verifiability). Secondly, we propose a general methodology to analyse the effect of the embedding gap -- a problem that refers to the discrepancy between verification of geometric subspaces, and the semantic meaning of sentences which the geometric subspaces are supposed to represent. In extreme cases, poor choices in embedding of sentences may invalidate verification results. We propose a number of practical NLP methods that can help to quantify the effects of the embedding gap; and in particular we propose the metric of falsifiability of semantic subspaces as another fundamental metric to be reported as part of the NLP verification pipeline. We believe that together these general principles pave the way towards a more consolidated and effective development of this new domain.
△ Less
Submitted 31 May, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
-
ANTONIO: Towards a Systematic Method of Generating NLP Benchmarks for Verification
Authors:
Marco Casadio,
Luca Arnaboldi,
Matthew L. Daggitt,
Omri Isac,
Tanvi Dinkar,
Daniel Kienitz,
Verena Rieser,
Ekaterina Komendantskaya
Abstract:
Verification of machine learning models used in Natural Language Processing (NLP) is known to be a hard problem. In particular, many known neural network verification methods that work for computer vision and other numeric datasets do not work for NLP. Here, we study technical reasons that underlie this problem. Based on this analysis, we propose practical methods and heuristics for preparing NLP…
▽ More
Verification of machine learning models used in Natural Language Processing (NLP) is known to be a hard problem. In particular, many known neural network verification methods that work for computer vision and other numeric datasets do not work for NLP. Here, we study technical reasons that underlie this problem. Based on this analysis, we propose practical methods and heuristics for preparing NLP datasets and models in a way that renders them amenable to known verification methods based on abstract interpretation. We implement these methods as a Python library called ANTONIO that links to the neural network verifiers ERAN and Marabou. We perform evaluation of the tool using an NLP dataset R-U-A-Robot suggested as a benchmark for verifying legally critical NLP applications. We hope that, thanks to its general applicability, this work will open novel possibilities for including NLP verification problems into neural network verification competitions, and will popularise NLP problems within this community.
△ Less
Submitted 15 August, 2023; v1 submitted 6 May, 2023;
originally announced May 2023.
-
Bimanual Motor Strategies and Handedness Role During Human-Exoskeleton Haptic Interaction
Authors:
Elisa Galofaro,
Erika D'Antonio,
Nicola Lotti,
Fabrizio Patane',
Maura Casadio,
Lorenzo Masia
Abstract:
Bimanual object manipulation involves multiple visuo-haptic sensory feedbacks arising from the interaction with the environment that are managed from the central nervous system and consequently translated in motor commands. Kinematic strategies that occur during bimanual coupled tasks are still a scientific debate despite modern advances in haptics and robotics. Current technologies may have the p…
▽ More
Bimanual object manipulation involves multiple visuo-haptic sensory feedbacks arising from the interaction with the environment that are managed from the central nervous system and consequently translated in motor commands. Kinematic strategies that occur during bimanual coupled tasks are still a scientific debate despite modern advances in haptics and robotics. Current technologies may have the potential to provide realistic scenarios involving the entire upper limb extremities during multi-joint movements but are not yet exploited to their full potential. The present study explores how hands dynamically interact when manipulating a shared object through the use of two impedance-controlled exoskeletons programmed to simulate bimanually coupled manipulation of virtual objects. We enrolled twenty-six participants (2 groups: right-handed and left-handed) who were requested to use both hands to grab simulated objects across the robot workspace and place them in specific locations. The virtual objects were rendered with different dynamic proprieties and textures influencing the manipulation strategies to complete the tasks. Results revealed that the roles of hands are related to the movement direction, the haptic features, and the handedness preference. Outcomes suggested that the haptic feedback affects bimanual strategies depending on the movement direction. However, left-handers show better control of the force applied between the two hands, probably due to environmental pressures for right-handed manipulations.
△ Less
Submitted 22 November, 2022;
originally announced November 2022.
-
Automated Classification of General Movements in Infants Using a Two-stream Spatiotemporal Fusion Network
Authors:
Yuki Hashimoto,
Akira Furui,
Koji Shimatani,
Maura Casadio,
Paolo Moretti,
Pietro Morasso,
Toshio Tsuji
Abstract:
The assessment of general movements (GMs) in infants is a useful tool in the early diagnosis of neurodevelopmental disorders. However, its evaluation in clinical practice relies on visual inspection by experts, and an automated solution is eagerly awaited. Recently, video-based GMs classification has attracted attention, but this approach would be strongly affected by irrelevant information, such…
▽ More
The assessment of general movements (GMs) in infants is a useful tool in the early diagnosis of neurodevelopmental disorders. However, its evaluation in clinical practice relies on visual inspection by experts, and an automated solution is eagerly awaited. Recently, video-based GMs classification has attracted attention, but this approach would be strongly affected by irrelevant information, such as background clutter in the video. Furthermore, for reliability, it is necessary to properly extract the spatiotemporal features of infants during GMs. In this study, we propose an automated GMs classification method, which consists of preprocessing networks that remove unnecessary background information from GMs videos and adjust the infant's body position, and a subsequent motion classification network based on a two-stream structure. The proposed method can efficiently extract the essential spatiotemporal features for GMs classification while preventing overfitting to irrelevant information for different recording environments. We validated the proposed method using videos obtained from 100 infants. The experimental results demonstrate that the proposed method outperforms several baseline models and the existing methods.
△ Less
Submitted 4 July, 2022;
originally announced July 2022.
-
Why Robust Natural Language Understanding is a Challenge
Authors:
Marco Casadio,
Ekaterina Komendantskaya,
Verena Rieser,
Matthew L. Daggitt,
Daniel Kienitz,
Luca Arnaboldi,
Wen Kokke
Abstract:
With the proliferation of Deep Machine Learning into real-life applications, a particular property of this technology has been brought to attention: robustness Neural Networks notoriously present low robustness and can be highly sensitive to small input perturbations. Recently, many methods for verifying networks' general properties of robustness have been proposed, but they are mostly applied in…
▽ More
With the proliferation of Deep Machine Learning into real-life applications, a particular property of this technology has been brought to attention: robustness Neural Networks notoriously present low robustness and can be highly sensitive to small input perturbations. Recently, many methods for verifying networks' general properties of robustness have been proposed, but they are mostly applied in Computer Vision. In this paper we propose a Verification specification for Natural Language Understanding classification based on larger regions of interest, and we discuss the challenges of such task. We observe that, although the data is almost linearly separable, the verifier struggles to output positive results and we explain the problems and implications.
△ Less
Submitted 13 July, 2022; v1 submitted 21 June, 2022;
originally announced June 2022.
-
Image-guided Breast Biopsy of MRI-visible Lesions with a Hand-mounted Motorised Needle Steering Tool
Authors:
Marta Lagomarsino,
Vincent Groenhuis,
Maura Casadio,
Marcel K. Welleweerd,
Francoise J. Siepel,
Stefano Stramigioli
Abstract:
A biopsy is the only diagnostic procedure for accurate histological confirmation of breast cancer. When sonographic placement is not feasible, a Magnetic Resonance Imaging(MRI)-guided biopsy is often preferred. The lack of real-time imaging information and the deformations of the breast make it challenging to bring the needle precisely towards the tumour detected in pre-interventional Magnetic Res…
▽ More
A biopsy is the only diagnostic procedure for accurate histological confirmation of breast cancer. When sonographic placement is not feasible, a Magnetic Resonance Imaging(MRI)-guided biopsy is often preferred. The lack of real-time imaging information and the deformations of the breast make it challenging to bring the needle precisely towards the tumour detected in pre-interventional Magnetic Resonance (MR) images. The current manual MRI-guided biopsy workflow is inaccurate and would benefit from a technique that allows real-time tracking and localisation of the tumour lesion during needle insertion. This paper proposes a robotic setup and software architecture to assist the radiologist in targeting MR-detected suspicious tumours. The approach benefits from image fusion of preoperative images with intraoperative optical tracking of markers attached to the patient's skin. A hand-mounted biopsy device has been constructed with an actuated needle base to drive the tip toward the desired direction. The steering commands may be provided both by user input and by computer guidance. The workflow is validated through phantom experiments. On average, the suspicious breast lesion is targeted with a radius down to 2.3 mm. The results suggest that robotic systems taking into account breast deformations have the potentials to tackle this clinical challenge.
△ Less
Submitted 20 June, 2021;
originally announced June 2021.
-
Neural Network Robustness as a Verification Property: A Principled Case Study
Authors:
Marco Casadio,
Ekaterina Komendantskaya,
Matthew L. Daggitt,
Wen Kokke,
Guy Katz,
Guy Amir,
Idan Refaeli
Abstract:
Neural networks are very successful at detecting patterns in noisy data, and have become the technology of choice in many fields. However, their usefulness is hampered by their susceptibility to adversarial attacks. Recently, many methods for measuring and improving a network's robustness to adversarial perturbations have been proposed, and this growing body of research has given rise to numerous…
▽ More
Neural networks are very successful at detecting patterns in noisy data, and have become the technology of choice in many fields. However, their usefulness is hampered by their susceptibility to adversarial attacks. Recently, many methods for measuring and improving a network's robustness to adversarial perturbations have been proposed, and this growing body of research has given rise to numerous explicit or implicit notions of robustness. Connections between these notions are often subtle, and a systematic comparison between them is missing in the literature. In this paper we begin addressing this gap, by setting up general principles for the empirical analysis and evaluation of a network's robustness as a mathematical property - during the network's training phase, its verification, and after its deployment. We then apply these principles and conduct a case study that showcases the practical benefits of our general approach.
△ Less
Submitted 13 July, 2022; v1 submitted 3 April, 2021;
originally announced April 2021.