Search | arXiv e-print repository

Showing 1–6 of 6 results for author: Bombari, S

  1. arXiv:2402.02969  [pdf, other]

    stat.ML cs.CL cs.LG

    Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features

    Authors: Simone Bombari, Marco Mondelli

    Abstract: Understanding the reasons behind the exceptional success of transformers requires a better analysis of why attention layers are suitable for NLP tasks. In particular, such tasks require predictive models to capture contextual meaning which often depends on one or few words, even if the sentence is long. Our work studies this key property, dubbed word sensitivity (WS), in the prototypical setting o…

    Submitted 17 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Revision after ICML2024 reviews

  2. arXiv:2305.12100  [pdf, other]

    stat.ML cs.LG

    How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features

    Authors: Simone Bombari, Marco Mondelli

    Abstract: Deep learning models are known to overfit and memorize spurious features in the training dataset. While numerous empirical studies have aimed at understanding this phenomenon, a rigorous theoretical framework to quantify it is still missing. In this paper, we consider spurious features that are uncorrelated with the learning task, and we provide a precise characterization of how they are memorized…

    Submitted 17 May, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: Revision after ICML2024 acceptance. Motivation of the paper changed from Privacy to Spurious Features. arXiv admin note: text overlap with arXiv:2302.01629

  3. arXiv:2302.01629  [pdf, other]

    stat.ML cs.LG

    Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels

    Authors: Simone Bombari, Shayan Kiyani, Marco Mondelli

    Abstract: Machine learning models are vulnerable to adversarial perturbations, and a thought-provoking paper by Bubeck and Sellke has analyzed this phenomenon through the lens of over-parameterization: interpolating smoothly the data requires significantly more parameters than simply memorizing it. However, this "universal" law provides only a necessary condition for robustness, and it is unable to discrimi…

    Submitted 27 May, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: Second arxiv version, updated to the icml23 version of the paper

  4. arXiv:2205.10217  [pdf, other]

    stat.ML cs.IT cs.LG

    Memorization and Optimization in Deep Neural Networks with Minimum Over-parameterization

    Authors: Simone Bombari, Mohammad Hossein Amani, Marco Mondelli

    Abstract: The Neural Tangent Kernel (NTK) has emerged as a powerful tool to provide memorization, optimization and generalization guarantees in deep neural networks. A line of work has studied the NTK spectrum for two-layer and deep networks with at least a layer with $\Omega(N)$ neurons, $N$ being the number of training samples. Furthermore, there is increasing evidence suggesting that deep networks with sub-li…

    Submitted 21 May, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Uniformed with the published NeurIPS 2022 version

  5. arXiv:2205.08199  [pdf, ps, other]

    cs.IT cs.LG stat.ML

    Sharp asymptotics on the compression of two-layer neural networks

    Authors: Mohammad Hossein Amani, Simone Bombari, Marco Mondelli, Rattana Pukdee, Stefano Rini

    Abstract: In this paper, we study the compression of a target two-layer neural network with N nodes into a compressed network with M<N nodes. More precisely, we consider the setting in which the weights of the target network are i.i.d. sub-Gaussian, and we minimize the population L_2 loss between the outputs of the target and of the compressed network, under the assumption of Gaussian inputs. By using tools…

    Submitted 16 August, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

  6. arXiv:2203.16701  [pdf, other]

    cs.LG cs.CR stat.ML

    Towards Differential Relational Privacy and its use in Question Answering

    Authors: Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, Stefano Soatto

    Abstract: Memorization of the relation between entities in a dataset can lead to privacy issues when using a trained model for question answering. We introduce Relational Memorization (RM) to understand, quantify and control this phenomenon. While bounding general memorization can have detrimental effects on the performance of a trained model, bounding RM does not prevent effective learning. The difference…

    Submitted 30 March, 2022; originally announced March 2022.