Search | arXiv e-print repository

NLP Verification: Towards a General Methodology for Certifying Robustness

Authors: Marco Casadio, Tanvi Dinkar, Ekaterina Komendantskaya, Luca Arnaboldi, Matthew L. Daggitt, Omri Isac, Guy Katz, Verena Rieser, Oliver Lemon

Abstract: Deep neural networks have exhibited substantial success in the field of Natural Language Processing and ensuring their safety and reliability is crucial: there are safety critical contexts where such models must be robust to variability or attack, and give guarantees over their output. Unlike Computer Vision, NLP lacks a unified verification methodology and, despite recent advancements in literatu… ▽ More Deep neural networks have exhibited substantial success in the field of Natural Language Processing and ensuring their safety and reliability is crucial: there are safety critical contexts where such models must be robust to variability or attack, and give guarantees over their output. Unlike Computer Vision, NLP lacks a unified verification methodology and, despite recent advancements in literature, they are often light on the pragmatical issues of NLP verification. In this paper, we attempt to distil and evaluate general components of an NLP verification pipeline, that emerges from the progress in the field to date. Our contributions are two-fold. Firstly, we give a general (i.e. algorithm-independent) characterisation of verifiable subspaces that result from embedding sentences into continuous spaces. We identify, and give an effective method to deal with, the technical challenge of semantic generalisability of verified subspaces; and propose it as a standard metric in the NLP verification pipelines (alongside with the standard metrics of model accuracy and model verifiability). Secondly, we propose a general methodology to analyse the effect of the embedding gap -- a problem that refers to the discrepancy between verification of geometric subspaces, and the semantic meaning of sentences which the geometric subspaces are supposed to represent. In extreme cases, poor choices in embedding of sentences may invalidate verification results. We propose a number of practical NLP methods that can help to quantify the effects of the embedding gap; and in particular we propose the metric of falsifiability of semantic subspaces as another fundamental metric to be reported as part of the NLP verification pipeline. We believe that together these general principles pave the way towards a more consolidated and effective development of this new domain. △ Less

Submitted 31 May, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

arXiv:2305.04003 [pdf, other]

ANTONIO: Towards a Systematic Method of Generating NLP Benchmarks for Verification

Authors: Marco Casadio, Luca Arnaboldi, Matthew L. Daggitt, Omri Isac, Tanvi Dinkar, Daniel Kienitz, Verena Rieser, Ekaterina Komendantskaya

Abstract: Verification of machine learning models used in Natural Language Processing (NLP) is known to be a hard problem. In particular, many known neural network verification methods that work for computer vision and other numeric datasets do not work for NLP. Here, we study technical reasons that underlie this problem. Based on this analysis, we propose practical methods and heuristics for preparing NLP… ▽ More Verification of machine learning models used in Natural Language Processing (NLP) is known to be a hard problem. In particular, many known neural network verification methods that work for computer vision and other numeric datasets do not work for NLP. Here, we study technical reasons that underlie this problem. Based on this analysis, we propose practical methods and heuristics for preparing NLP datasets and models in a way that renders them amenable to known verification methods based on abstract interpretation. We implement these methods as a Python library called ANTONIO that links to the neural network verifiers ERAN and Marabou. We perform evaluation of the tool using an NLP dataset R-U-A-Robot suggested as a benchmark for verifying legally critical NLP applications. We hope that, thanks to its general applicability, this work will open novel possibilities for including NLP verification problems into neural network verification competitions, and will popularise NLP problems within this community. △ Less

Submitted 15 August, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

Comments: To appear in proceedings of 6th Workshop on Formal Methods for ML-Enabled Autonomous Systems (Affiliated with CAV 2023)

arXiv:2211.12408 [pdf]

Bimanual Motor Strategies and Handedness Role During Human-Exoskeleton Haptic Interaction

Authors: Elisa Galofaro, Erika D'Antonio, Nicola Lotti, Fabrizio Patane', Maura Casadio, Lorenzo Masia

Abstract: Bimanual object manipulation involves multiple visuo-haptic sensory feedbacks arising from the interaction with the environment that are managed from the central nervous system and consequently translated in motor commands. Kinematic strategies that occur during bimanual coupled tasks are still a scientific debate despite modern advances in haptics and robotics. Current technologies may have the p… ▽ More Bimanual object manipulation involves multiple visuo-haptic sensory feedbacks arising from the interaction with the environment that are managed from the central nervous system and consequently translated in motor commands. Kinematic strategies that occur during bimanual coupled tasks are still a scientific debate despite modern advances in haptics and robotics. Current technologies may have the potential to provide realistic scenarios involving the entire upper limb extremities during multi-joint movements but are not yet exploited to their full potential. The present study explores how hands dynamically interact when manipulating a shared object through the use of two impedance-controlled exoskeletons programmed to simulate bimanually coupled manipulation of virtual objects. We enrolled twenty-six participants (2 groups: right-handed and left-handed) who were requested to use both hands to grab simulated objects across the robot workspace and place them in specific locations. The virtual objects were rendered with different dynamic proprieties and textures influencing the manipulation strategies to complete the tasks. Results revealed that the roles of hands are related to the movement direction, the haptic features, and the handedness preference. Outcomes suggested that the haptic feedback affects bimanual strategies depending on the movement direction. However, left-handers show better control of the force applied between the two hands, probably due to environmental pressures for right-handed manipulations. △ Less

Submitted 22 November, 2022; originally announced November 2022.

arXiv:2207.03344 [pdf, other]

Automated Classification of General Movements in Infants Using a Two-stream Spatiotemporal Fusion Network

Authors: Yuki Hashimoto, Akira Furui, Koji Shimatani, Maura Casadio, Paolo Moretti, Pietro Morasso, Toshio Tsuji

Abstract: The assessment of general movements (GMs) in infants is a useful tool in the early diagnosis of neurodevelopmental disorders. However, its evaluation in clinical practice relies on visual inspection by experts, and an automated solution is eagerly awaited. Recently, video-based GMs classification has attracted attention, but this approach would be strongly affected by irrelevant information, such… ▽ More The assessment of general movements (GMs) in infants is a useful tool in the early diagnosis of neurodevelopmental disorders. However, its evaluation in clinical practice relies on visual inspection by experts, and an automated solution is eagerly awaited. Recently, video-based GMs classification has attracted attention, but this approach would be strongly affected by irrelevant information, such as background clutter in the video. Furthermore, for reliability, it is necessary to properly extract the spatiotemporal features of infants during GMs. In this study, we propose an automated GMs classification method, which consists of preprocessing networks that remove unnecessary background information from GMs videos and adjust the infant's body position, and a subsequent motion classification network based on a two-stream structure. The proposed method can efficiently extract the essential spatiotemporal features for GMs classification while preventing overfitting to irrelevant information for different recording environments. We validated the proposed method using videos obtained from 100 infants. The experimental results demonstrate that the proposed method outperforms several baseline models and the existing methods. △ Less

Submitted 4 July, 2022; originally announced July 2022.

Comments: Accepted by MICCAI 2022

arXiv:2206.14575 [pdf, other]

Why Robust Natural Language Understanding is a Challenge

Authors: Marco Casadio, Ekaterina Komendantskaya, Verena Rieser, Matthew L. Daggitt, Daniel Kienitz, Luca Arnaboldi, Wen Kokke

Abstract: With the proliferation of Deep Machine Learning into real-life applications, a particular property of this technology has been brought to attention: robustness Neural Networks notoriously present low robustness and can be highly sensitive to small input perturbations. Recently, many methods for verifying networks' general properties of robustness have been proposed, but they are mostly applied in… ▽ More With the proliferation of Deep Machine Learning into real-life applications, a particular property of this technology has been brought to attention: robustness Neural Networks notoriously present low robustness and can be highly sensitive to small input perturbations. Recently, many methods for verifying networks' general properties of robustness have been proposed, but they are mostly applied in Computer Vision. In this paper we propose a Verification specification for Natural Language Understanding classification based on larger regions of interest, and we discuss the challenges of such task. We observe that, although the data is almost linearly separable, the verifier struggles to output positive results and we explain the problems and implications. △ Less

Submitted 13 July, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

arXiv:2106.10672 [pdf, other]

doi 10.1109/ISMR48346.2021.9661564

Image-guided Breast Biopsy of MRI-visible Lesions with a Hand-mounted Motorised Needle Steering Tool

Authors: Marta Lagomarsino, Vincent Groenhuis, Maura Casadio, Marcel K. Welleweerd, Francoise J. Siepel, Stefano Stramigioli

Abstract: A biopsy is the only diagnostic procedure for accurate histological confirmation of breast cancer. When sonographic placement is not feasible, a Magnetic Resonance Imaging(MRI)-guided biopsy is often preferred. The lack of real-time imaging information and the deformations of the breast make it challenging to bring the needle precisely towards the tumour detected in pre-interventional Magnetic Res… ▽ More A biopsy is the only diagnostic procedure for accurate histological confirmation of breast cancer. When sonographic placement is not feasible, a Magnetic Resonance Imaging(MRI)-guided biopsy is often preferred. The lack of real-time imaging information and the deformations of the breast make it challenging to bring the needle precisely towards the tumour detected in pre-interventional Magnetic Resonance (MR) images. The current manual MRI-guided biopsy workflow is inaccurate and would benefit from a technique that allows real-time tracking and localisation of the tumour lesion during needle insertion. This paper proposes a robotic setup and software architecture to assist the radiologist in targeting MR-detected suspicious tumours. The approach benefits from image fusion of preoperative images with intraoperative optical tracking of markers attached to the patient's skin. A hand-mounted biopsy device has been constructed with an actuated needle base to drive the tip toward the desired direction. The steering commands may be provided both by user input and by computer guidance. The workflow is validated through phantom experiments. On average, the suspicious breast lesion is targeted with a radius down to 2.3 mm. The results suggest that robotic systems taking into account breast deformations have the potentials to tackle this clinical challenge. △ Less

Submitted 20 June, 2021; originally announced June 2021.

Comments: Submitted to 2021 International Symposium on Medical Robotics (ISMR)

arXiv:2104.01396 [pdf, other]

Neural Network Robustness as a Verification Property: A Principled Case Study

Authors: Marco Casadio, Ekaterina Komendantskaya, Matthew L. Daggitt, Wen Kokke, Guy Katz, Guy Amir, Idan Refaeli

Abstract: Neural networks are very successful at detecting patterns in noisy data, and have become the technology of choice in many fields. However, their usefulness is hampered by their susceptibility to adversarial attacks. Recently, many methods for measuring and improving a network's robustness to adversarial perturbations have been proposed, and this growing body of research has given rise to numerous… ▽ More Neural networks are very successful at detecting patterns in noisy data, and have become the technology of choice in many fields. However, their usefulness is hampered by their susceptibility to adversarial attacks. Recently, many methods for measuring and improving a network's robustness to adversarial perturbations have been proposed, and this growing body of research has given rise to numerous explicit or implicit notions of robustness. Connections between these notions are often subtle, and a systematic comparison between them is missing in the literature. In this paper we begin addressing this gap, by setting up general principles for the empirical analysis and evaluation of a network's robustness as a mathematical property - during the network's training phase, its verification, and after its deployment. We then apply these principles and conduct a case study that showcases the practical benefits of our general approach. △ Less

Submitted 13 July, 2022; v1 submitted 3 April, 2021; originally announced April 2021.

Comments: 11 pages, CAV 2022

Showing 1–7 of 7 results for author: Casadio, M