Showing 1–50 of 89 results for author: Kailkhura, B

  1. arXiv:2408.05636  [pdf, other]

    cs.CL cs.LG

    Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion

    Authors: Jacob K Christopher, Brian R Bartoldson, Bhavya Kailkhura, Ferdinando Fioretto

    Abstract: Speculative decoding has emerged as a widely adopted method to accelerate large language model inference without sacrificing the quality of the model outputs. While this technique has facilitated notable speed improvements by enabling parallel sequence verification, its efficiency remains inherently limited by the reliance on incremental token generation in existing draft models. To overcome this… ▽ More

    Submitted 10 August, 2024; originally announced August 2024.

  2. arXiv:2406.04273  [pdf, other]

    cs.CV cs.AI

    ELFS: Enhancing Label-Free Coreset Selection via Clustering-based Pseudo-Labeling

    Authors: Haizhong Zheng, Elisa Tsai, Yifu Lu, Jiachen Sun, Brian R. Bartoldson, Bhavya Kailkhura, Atul Prakash

    Abstract: High-quality human-annotated data is crucial for modern deep learning pipelines, yet the human annotation process is both costly and time-consuming. Given a constrained human labeling budget, selecting an informative and representative data subset for labeling can significantly reduce human annotation effort. Well-performing state-of-the-art (SOTA) coreset selection methods require ground-truth la… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2405.18572  [pdf, other]

    cs.LG cs.AI cs.CL

    Low-rank finetuning for LLMs: A fairness perspective

    Authors: Saswat Das, Marco Romanelli, Cuong Tran, Zarreen Reza, Bhavya Kailkhura, Ferdinando Fioretto

    Abstract: Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models (LLMs) due to their reduced computational and memory requirements. This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution. Our findings reveal that there are cases in which low-rank fine-tuning fa… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  4. arXiv:2405.17399  [pdf, other]

    cs.LG cs.AI

    Transformers Can Do Arithmetic with the Right Embeddings

    Authors: Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

    Abstract: The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits. We mend this problem by adding an embedding to each digit that encodes its position relative to the start of the number. In addition to the boost these embeddings provide on their own, we show that this fix ena… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  5. arXiv:2404.18239  [pdf, other]

    cs.LG cs.CL

    SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning

    Authors: Jinghan Jia, Yihua Zhang, Yimeng Zhang, Jiancheng Liu, Bharat Runwal, James Diffenderfer, Bhavya Kailkhura, Sijia Liu

    Abstract: Large Language Models (LLMs) have highlighted the necessity of effective unlearning mechanisms to comply with data regulations and ethical AI practices. LLM unlearning aims at removing undesired data influences and associated model capabilities without compromising utility beyond the scope of unlearning. While interest in studying LLM unlearning is growing, the impact of the optimizer choice for L… ▽ More

    Submitted 24 June, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

  6. arXiv:2404.12241  [pdf, other]

    cs.CL cs.AI

    Introducing v0.5 of the AI Safety Benchmark from MLCommons

    Authors: Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Max Bartolo, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, et al. (75 additional authors not shown)

    Abstract: This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-pu… ▽ More

    Submitted 13 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  7. arXiv:2404.11766  [pdf, other]

    cs.LG math.NA math.OC

    End-to-End Mesh Optimization of a Hybrid Deep Learning Black-Box PDE Solver

    Authors: Shaocong Ma, James Diffenderfer, Bhavya Kailkhura, Yi Zhou

    Abstract: Deep learning has been widely applied to solve partial differential equations (PDEs) in computational fluid dynamics. Recent research proposed a PDE correction framework that leverages deep learning to correct the solution obtained by a PDE solver on a coarse mesh. However, end-to-end training of such a PDE correction model over both solver-dependent parameters such as mesh parameters and neural n… ▽ More

    Submitted 28 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  8. arXiv:2404.09349  [pdf, other]

    cs.LG cs.CR cs.CV

    Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies

    Authors: Brian R. Bartoldson, James Diffenderfer, Konstantinos Parasyris, Bhavya Kailkhura

    Abstract: This paper revisits the simple, long-studied, yet still unsolved problem of making image classifiers robust to imperceptible perturbations. Taking CIFAR10 as an example, SOTA clean accuracy is about $100$%, but SOTA robustness to $\ell_{\infty}$-norm bounded perturbations barely exceeds $70$%. To understand this gap, we analyze how model size, dataset size, and synthetic data quality affect robust… ▽ More

    Submitted 10 July, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: ICML 2024

  9. arXiv:2403.15447  [pdf, other]

    cs.CL cs.AI

    Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

    Authors: Junyuan Hong, Jinhao Duan, Chenhui Zhang, Zhangheng Li, Chulin Xie, Kelsey Lieberman, James Diffenderfer, Brian Bartoldson, Ajay Jaiswal, Kaidi Xu, Bhavya Kailkhura, Dan Hendrycks, Dawn Song, Zhangyang Wang, Bo Li

    Abstract: Compressing high-capability Large Language Models (LLMs) has emerged as a favored strategy for resource-efficient inferences. While state-of-the-art (SoTA) compression methods boast impressive advancements in preserving benign task performance, the potential risks of compression in terms of safety and trustworthiness have been largely neglected. This study conducts the first, thorough evaluation o… ▽ More

    Submitted 4 June, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted to ICML'24

  10. arXiv:2402.12348  [pdf, other]

    cs.CL cs.AI cs.LG

    GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations

    Authors: Jinhao Duan, Renming Zhang, James Diffenderfer, Bhavya Kailkhura, Lichao Sun, Elias Stengel-Eskin, Mohit Bansal, Tianlong Chen, Kaidi Xu

    Abstract: As Large Language Models (LLMs) are integrated into critical real-world applications, their strategic and logical reasoning abilities are increasingly crucial. This paper evaluates LLMs' reasoning abilities in competitive environments through game-theoretic tasks, e.g., board and card games that require pure logic and strategic reasoning to compete with opponents. We first propose GTBench, a langu… ▽ More

    Submitted 10 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: 26 pages; the first two authors contributed equally; GTBench HF Leaderboard: https://huggingface.co/spaces/GTBench/GTBench

  11. arXiv:2401.05561  [pdf, other]

    cs.CL

    TrustLLM: Trustworthiness in Large Language Models

    Authors: Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, et al. (45 additional authors not shown)

    Abstract: Large language models (LLMs), exemplified by ChatGPT, have gained considerable attention for their excellent natural language processing capabilities. Nonetheless, these LLMs present many challenges, particularly in the realm of trustworthiness. Therefore, ensuring the trustworthiness of LLMs emerges as an important topic. This paper introduces TrustLLM, a comprehensive study of trustworthiness in… ▽ More

    Submitted 17 March, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: This work is still under work and we welcome your contribution

  12. arXiv:2312.13131  [pdf, other]

    cs.LG cs.AI cs.CR

    Scaling Compute Is Not All You Need for Adversarial Robustness

    Authors: Edoardo Debenedetti, Zishen Wan, Maksym Andriushchenko, Vikash Sehwag, Kshitij Bhardwaj, Bhavya Kailkhura

    Abstract: The last six years have witnessed significant progress in adversarially robust deep learning. As evidenced by the CIFAR-10 dataset category in RobustBench benchmark, the accuracy under $\ell_\infty$ adversarial perturbations improved from 44% in Madry et al. (2018) to 71% in Peng et al. (2023). Although impressive, existing state-of-the-art is still far from satisfactory. It is further… ▽ More

    Submitted 20 December, 2023; originally announced December 2023.

  13. arXiv:2312.06900  [pdf, other]

    cs.CV

    When Bio-Inspired Computing meets Deep Learning: Low-Latency, Accurate, & Energy-Efficient Spiking Neural Networks from Artificial Neural Networks

    Authors: Gourav Datta, Zeyu Liu, James Diffenderfer, Bhavya Kailkhura, Peter A. Beerel

    Abstract: Bio-inspired Spiking Neural Networks (SNN) are now demonstrating comparable accuracy to intricate convolutional neural networks (CNN), all while delivering remarkable energy and latency efficiency when deployed on neuromorphic hardware. In particular, ANN-to-SNN conversion has recently gained significant traction in developing deep SNNs with close to state-of-the-art (SOTA) test accuracy on comple… ▽ More

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Under review

  14. arXiv:2311.12060  [pdf, other]

    cs.NE

    Pursing the Sparse Limitation of Spiking Deep Learning Structures

    Authors: Hao Cheng, Jiahang Cao, Erjia Xiao, Mengshu Sun, Le Yang, Jize Zhang, Xue Lin, Bhavya Kailkhura, Kaidi Xu, Renjing Xu

    Abstract: Spiking Neural Networks (SNNs), a novel brain-inspired algorithm, are garnering increased attention for their superior computation and energy efficiency over traditional artificial neural networks (ANNs). To facilitate deployment on memory-constrained devices, numerous studies have explored SNN pruning. However, these efforts are hindered by challenges such as scalability challenges in more comple… ▽ More

    Submitted 18 November, 2023; originally announced November 2023.

  15. arXiv:2310.07506  [pdf, other]

    cs.CV cs.LG

    Leveraging Hierarchical Feature Sharing for Efficient Dataset Condensation

    Authors: Haizhong Zheng, Jiachen Sun, Shutong Wu, Bhavya Kailkhura, Zhuoqing Mao, Chaowei Xiao, Atul Prakash

    Abstract: Given a real-world dataset, data condensation (DC) aims to synthesize a small synthetic dataset that captures the knowledge of a natural dataset while being usable for training models with comparable accuracy. Recent works propose to enhance DC with data parameterization, which condenses data into very compact parameterized data containers instead of images. The intuition behind data parameterizat… ▽ More

    Submitted 18 July, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Journal ref: ECCV 2024

  16. arXiv:2310.05914  [pdf, other]

    cs.CL cs.LG

    NEFTune: Noisy Embeddings Improve Instruction Finetuning

    Authors: Neel Jain, Ping-yeh Chiang, Yuxin Wen, John Kirchenbauer, Hong-Min Chu, Gowthami Somepalli, Brian R. Bartoldson, Bhavya Kailkhura, Avi Schwarzschild, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

    Abstract: We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on AlpacaEval, which rises to 64.69% using noisy embeddings. NEFTune also improves over strong baselines on modern instruction datasets. Models trained with Evol-Instru… ▽ More

    Submitted 10 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 25 pages, Code is available on Github: https://github.com/neelsjain/NEFTune

  17. arXiv:2310.02025  [pdf, other]

    cs.LG

    DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training

    Authors: Aochuan Chen, Yimeng Zhang, Jinghan Jia, James Diffenderfer, Jiancheng Liu, Konstantinos Parasyris, Yihua Zhang, Zheng Zhang, Bhavya Kailkhura, Sijia Liu

    Abstract: Zeroth-order (ZO) optimization has become a popular technique for solving machine learning (ML) problems when first-order (FO) information is difficult or impossible to obtain. However, the scalability of ZO optimization remains an open problem: Its use has primarily been limited to relatively small-scale ML problems, such as sample-wise adversarial attack generation. To our best knowledge, no pri… ▽ More

    Submitted 15 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Accepted to ICLR'24. Codes are available at https://github.com/OPTML-Group/DeepZero

  18. arXiv:2307.08657  [pdf, other]

    eess.IV cs.LG

    Neural Image Compression: Generalization, Robustness, and Spectral Biases

    Authors: Kelsey Lieberman, James Diffenderfer, Charles Godfrey, Bhavya Kailkhura

    Abstract: Recent advances in neural image compression (NIC) have produced models that are starting to outperform classic codecs. While this has led to growing excitement about using NIC in real-world applications, the successful adoption of any machine learning system in the wild requires it to generalize (and be robust) to unseen distribution shifts at deployment. Unfortunately, current research lacks comp… ▽ More

    Submitted 27 October, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023

  19. arXiv:2307.08551  [pdf, other]

    cs.CV

    On the Fly Neural Style Smoothing for Risk-Averse Domain Generalization

    Authors: Akshay Mehra, Yunbei Zhang, Bhavya Kailkhura, Jihun Hamm

    Abstract: Achieving high accuracy on data from domains unseen during training is a fundamental challenge in domain generalization (DG). While state-of-the-art DG classifiers have demonstrated impressive performance across various tasks, they have shown a bias towards domain-dependent information, such as image styles, rather than domain-invariant information, such as image content. This bias renders them un… ▽ More

    Submitted 17 July, 2023; originally announced July 2023.

  20. arXiv:2307.01379  [pdf, other]

    cs.CL cs.AI cs.LG

    Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models

    Authors: Jinhao Duan, Hao Cheng, Shiqi Wang, Alex Zavalny, Chenan Wang, Renjing Xu, Bhavya Kailkhura, Kaidi Xu

    Abstract: Large Language Models (LLMs) show promising results in language generation and instruction following but frequently "hallucinate", making their outputs less reliable. Despite Uncertainty Quantification's (UQ) potential solutions, implementing it accurately within LLMs is challenging. Our research introduces a simple heuristic: not all tokens in auto-regressive LLM text equally represent the underl… ▽ More

    Submitted 28 May, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: To appear in ACL 2024

  21. arXiv:2302.12366  [pdf, other]

    cs.LG cs.CV

    Less is More: Data Pruning for Faster Adversarial Training

    Authors: Yize Li, Pu Zhao, Xue Lin, Bhavya Kailkhura, Ryan Goldhahn

    Abstract: Deep neural networks (DNNs) are sensitive to adversarial examples, resulting in fragile and unreliable performance in the real world. Although adversarial training (AT) is currently one of the most effective methodologies to robustify DNNs, it is computationally very expensive (e.g., 5-10X costlier than standard training). To address this challenge, existing approaches focus on single-step AT, ref… ▽ More

    Submitted 27 February, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: The AAAI-23 Workshop on Artificial Intelligence Safety (SafeAI 2023)

  22. arXiv:2210.06640  [pdf, other]

    cs.LG

    Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities

    Authors: Brian R. Bartoldson, Bhavya Kailkhura, Davis Blalock

    Abstract: Although deep learning has made great progress in recent years, the exploding economic and environmental costs of training neural networks are becoming unsustainable. To address this problem, there has been a great deal of research on *algorithmically-efficient deep learning*, which seeks to reduce training costs not at the hardware or implementation level, but through changes in the semantics of… ▽ More

    Submitted 21 March, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: 77 pages

    Journal ref: Journal of Machine Learning Research (2023)

  23. arXiv:2209.12839  [pdf, other]

    cs.LG cs.AI

    Efficient Multi-Prize Lottery Tickets: Enhanced Accuracy, Training, and Inference Speed

    Authors: Hao Cheng, Pu Zhao, Yize Li, Xue Lin, James Diffenderfer, Ryan Goldhahn, Bhavya Kailkhura

    Abstract: Recently, Diffenderfer and Kailkhura proposed a new paradigm for learning compact yet highly accurate binary neural networks simply by pruning and quantizing randomly weighted full precision neural networks. However, the accuracy of these multi-prize tickets (MPTs) is highly sensitive to the optimal prune ratio, which limits their applicability. Furthermore, the original implementation did not att… ▽ More

    Submitted 26 September, 2022; originally announced September 2022.

  24. arXiv:2207.04075  [pdf, other]

    cs.LG

    Models Out of Line: A Fourier Lens on Distribution Shift Robustness

    Authors: Sara Fridovich-Keil, Brian R. Bartoldson, James Diffenderfer, Bhavya Kailkhura, Peer-Timo Bremer

    Abstract: Improving the accuracy of deep neural networks (DNNs) on out-of-distribution (OOD) data is critical to an acceptance of deep learning (DL) in real world applications. It has been observed that accuracies on in-distribution (ID) versus OOD data follow a linear trend and models that outperform this baseline are exceptionally rare (and referred to as "effectively robust"). Recently, some promising ap… ▽ More

    Submitted 8 July, 2022; originally announced July 2022.

  25. arXiv:2206.12364  [pdf, other]

    cs.LG

    On Certifying and Improving Generalization to Unseen Domains

    Authors: Akshay Mehra, Bhavya Kailkhura, Pin-Yu Chen, Jihun Hamm

    Abstract: Domain Generalization (DG) aims to learn models whose performance remains high on unseen domains encountered at test-time by using data from multiple related source domains. Many existing DG algorithms reduce the divergence between source distributions in a representation space to potentially align the unseen domain close to the sources. This is motivated by the analysis that explains generalizati… ▽ More

    Submitted 24 June, 2022; originally announced June 2022.

  26. arXiv:2206.07736  [pdf, other]

    cs.LG cs.CV

    Improving Diversity with Adversarially Learned Transformations for Domain Generalization

    Authors: Tejas Gokhale, Rushil Anirudh, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Chitta Baral, Yezhou Yang

    Abstract: To be successful in single source domain generalization, maximizing diversity of synthesized domains has emerged as one of the most effective strategies. Many of the recent successes have come from methods that pre-specify the types of diversity that a model is exposed to during training, so that it can ultimately generalize well to new domains. However, naïve diversity based augmentations do not… ▽ More

    Submitted 12 December, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: WACV 2023. Code: https://github.com/tejas-gokhale/ALT

  27. arXiv:2206.02785  [pdf, other]

    cs.LG cs.AI

    Zeroth-Order SciML: Non-intrusive Integration of Scientific Software with Deep Learning

    Authors: Ioannis Tsaknakis, Bhavya Kailkhura, Sijia Liu, Donald Loveland, James Diffenderfer, Anna Maria Hiszpanski, Mingyi Hong

    Abstract: Using deep learning (DL) to accelerate and/or improve scientific workflows can yield discoveries that are otherwise impossible. Unfortunately, DL models have yielded limited success in complex scientific domains due to large data requirements. In this work, we propose to overcome this issue by integrating the abundance of scientific knowledge sources (SKS) with the DL training process. Existing kn… ▽ More

    Submitted 4 June, 2022; originally announced June 2022.

  28. arXiv:2205.13757  [pdf, other]

    cond-mat.mtrl-sci cs.LG

    Representing Polymers as Periodic Graphs with Learned Descriptors for Accurate Polymer Property Predictions

    Authors: Evan R. Antoniuk, Peggy Li, Bhavya Kailkhura, Anna M. Hiszpanski

    Abstract: One of the grand challenges of utilizing machine learning for the discovery of innovative new polymers lies in the difficulty of accurately representing the complex structures of polymeric materials. Although a wide array of hand-designed polymer representations have been explored, there has yet to be an ideal solution for how to capture the periodicity of polymer structures, and how to develop po… ▽ More

    Submitted 27 May, 2022; originally announced May 2022.

  29. arXiv:2203.16615  [pdf, other]

    cs.LG math.OC

    A Fast and Convergent Proximal Algorithm for Regularized Nonconvex and Nonsmooth Bi-level Optimization

    Authors: Ziyi Chen, Bhavya Kailkhura, Yi Zhou

    Abstract: Many important machine learning applications involve regularized nonconvex bi-level optimization. However, the existing gradient-based bi-level optimization algorithms cannot handle nonconvex or nonsmooth regularizers, and they suffer from a high computation complexity in nonconvex bi-level optimization. In this work, we study a proximal gradient-type algorithm that adopts the approximate implicit… ▽ More

    Submitted 3 June, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: 20 pages, 1 figure, 1 table

  30. arXiv:2203.11295  [pdf, other]

    cs.LG cs.AR

    Benchmarking Test-Time Unsupervised Deep Neural Network Adaptation on Edge Devices

    Authors: Kshitij Bhardwaj, James Diffenderfer, Bhavya Kailkhura, Maya Gokhale

    Abstract: The prediction accuracy of the deep neural networks (DNNs) after deployment at the edge can suffer with time due to shifts in the distribution of the new data. To improve robustness of DNNs, they must be able to update themselves to enhance their prediction accuracy. This adaptation at the resource-constrained edge is challenging as: (i) new labeled data may not be present; (ii) adaptation needs t… ▽ More

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: This paper was selected for poster presentation in International Symposium on Performance Analysis of Systems and Software (ISPASS), 2022

  31. arXiv:2203.08398  [pdf, other]

    cs.LG cs.CR

    COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks

    Authors: Fan Wu, Linyi Li, Chejian Xu, Huan Zhang, Bhavya Kailkhura, Krishnaram Kenthapadi, Ding Zhao, Bo Li

    Abstract: As reinforcement learning (RL) has achieved near human-level performance in a variety of tasks, its robustness has raised great attention. While a vast body of research has explored test-time (evasion) attacks in RL and corresponding defenses, its robustness against training-time (poisoning) attacks remains largely unanswered. In this work, we focus on certifying the robustness of offline RL in th… ▽ More

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: Published as a conference paper at ICLR 2022

  32. arXiv:2201.12296  [pdf, other]

    cs.LG cs.AI cs.CV

    Benchmarking Robustness of 3D Point Cloud Recognition Against Common Corruptions

    Authors: Jiachen Sun, Qingzhao Zhang, Bhavya Kailkhura, Zhiding Yu, Chaowei Xiao, Z. Morley Mao

    Abstract: Deep neural networks on 3D point cloud data have been widely used in the real world, especially in safety-critical applications. However, their robustness against corruptions is less studied. In this paper, we present ModelNet40-C, the first comprehensive benchmark on 3D point cloud corruption robustness, consisting of 15 common and realistic corruptions. Our evaluation shows a significant gap bet… ▽ More

    Submitted 28 January, 2022; originally announced January 2022.

    Comments: Codebase and dataset are included in https://github.com/jiachens/ModelNet40-C

    Report number: 23 pages

  33. arXiv:2112.00659  [pdf, other]

    cs.LG cs.AI cs.CR

    Certified Adversarial Defenses Meet Out-of-Distribution Corruptions: Benchmarking Robustness and Simple Baselines

    Authors: Jiachen Sun, Akshay Mehra, Bhavya Kailkhura, Pin-Yu Chen, Dan Hendrycks, Jihun Hamm, Z. Morley Mao

    Abstract: Certified robustness guarantee gauges a model's robustness to test-time attacks and can assess the model's readiness for deployment in the real world. In this work, we critically examine how the adversarial robustness guarantees from randomized smoothing-based certification methods change when state-of-the-art certifiably robust models encounter out-of-distribution (OOD) data. Our analysis demonst… ▽ More

    Submitted 1 December, 2021; originally announced December 2021.

    Comments: 21 pages, 15 figures, and 9 tables

  34. arXiv:2107.10873  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV

    On the Certified Robustness for Ensemble Models and Beyond

    Authors: Zhuolin Yang, Linyi Li, Xiaojun Xu, Bhavya Kailkhura, Tao Xie, Bo Li

    Abstract: Recent studies show that deep neural networks (DNN) are vulnerable to adversarial examples, which aim to mislead DNNs by adding perturbations with small magnitude. To defend against such attacks, both empirical and theoretical defense approaches have been extensively studied for a single ML model. In this work, we aim to analyze and provide the certified robustness for ensemble ML models, together… ▽ More

    Submitted 21 April, 2022; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: ICLR 2022. 51 pages, 10 pages for main text. Forum and code: https://openreview.net/forum?id=tUa4REjGjTf

  35. arXiv:2107.03919  [pdf, other]

    cs.LG

    Understanding the Limits of Unsupervised Domain Adaptation via Data Poisoning

    Authors: Akshay Mehra, Bhavya Kailkhura, Pin-Yu Chen, Jihun Hamm

    Abstract: Unsupervised domain adaptation (UDA) enables cross-domain learning without target domain labels by transferring knowledge from a labeled source domain whose distribution differs from that of the target. However, UDA is not always successful and several accounts of `negative transfer' have been reported in the literature. In this work, we prove a simple lower bound on the target domain error that c… ▽ More

    Submitted 3 November, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

    Comments: Neurips 2021

  36. arXiv:2106.13427  [pdf, other]

    cs.LG

    Reliable Graph Neural Network Explanations Through Adversarial Training

    Authors: Donald Loveland, Shusen Liu, Bhavya Kailkhura, Anna Hiszpanski, Yong Han

    Abstract: Graph neural network (GNN) explanations have largely been facilitated through post-hoc introspection. While this has been deemed successful, many post-hoc explanation methods have been shown to fail in capturing a model's learned representation. Due to this problem, it is worthwhile to consider how one might train a model so that it is more amenable to post-hoc analysis. Given the success of adver… ▽ More

    Submitted 25 June, 2021; originally announced June 2021.

    Comments: 4 pages, 3 figures, ICML Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI

  37. arXiv:2106.09129  [pdf, other]

    cs.LG

    A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness

    Authors: James Diffenderfer, Brian R. Bartoldson, Shreya Chaganti, Jize Zhang, Bhavya Kailkhura

    Abstract: Successful adoption of deep learning (DL) in the wild requires models to be: (1) compact, (2) accurate, and (3) robust to distributional shifts. Unfortunately, efforts towards simultaneously meeting these requirements have mostly been unsuccessful. This raises an important question: Is the inability to create Compact, Accurate, and Robust Deep neural networks (CARDs) fundamental? To answer this qu… ▽ More

    Submitted 5 November, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

  38. arXiv:2104.10586  [pdf, ps, other]

    cs.LG cs.AI cs.CR

    Mixture of Robust Experts (MoRE):A Robust Denoising Method towards multiple perturbations

    Authors: Kaidi Xu, Chenan Wang, Hao Cheng, Bhavya Kailkhura, Xue Lin, Ryan Goldhahn

    Abstract: To tackle the susceptibility of deep neural networks to examples, the adversarial training has been proposed which provides a notion of robust through an inner maximization problem presenting the first-order embedded within the outer minimization of the training loss. To generalize the adversarial robustness over different perturbation types, the adversarial training method has been augmented with… ▽ More

    Submitted 20 July, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

    Comments: This paper is a seminar and dicussing paper, which will not be published and printed anywhere. And it will be keep updating

  39. Certifiably-Robust Federated Adversarial Learning via Randomized Smoothing

    Authors: Cheng Chen, Bhavya Kailkhura, Ryan Goldhahn, Yi Zhou

    Abstract: Federated learning is an emerging data-private distributed learning framework, which, however, is vulnerable to adversarial attacks. Although several heuristic defenses are proposed to enhance the robustness of federated learning, they do not provide certifiable robustness guarantees. In this paper, we incorporate randomized smoothing techniques into federated adversarial training to enable data-p… ▽ More

    Submitted 29 March, 2021; originally announced March 2021.

    Comments: 9 pages, 12 figures

    Journal ref: 2021 IEEE 18th International Conference on Mobile Ad Hoc and Smart Systems (MASS), Denver, CO, USA, 2021, pp. 173-179

  40. arXiv:2103.09377  [pdf, other]

    cs.LG cs.CV

    Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network

    Authors: James Diffenderfer, Bhavya Kailkhura

    Abstract: Recently, Frankle & Carbin (2019) demonstrated that randomly-initialized dense networks contain subnetworks that, once found, can be trained to reach test accuracy comparable to the trained dense network. However, finding these high-performing trainable subnetworks is expensive, requiring an iterative process of training and pruning weights. In this paper, we propose (and prove) a stronger Multi-Prize…

    Submitted 16 March, 2021; originally announced March 2021.

  41. arXiv:2101.05950  [pdf, other]

    cs.LG cs.AI

    Robusta: Robust AutoML for Feature Selection via Reinforcement Learning

    Authors: Xiaoyang Wang, Bo Li, Yibo Zhang, Bhavya Kailkhura, Klara Nahrstedt

    Abstract: Several AutoML approaches have been proposed to automate the machine learning (ML) process, such as searching for the ML model architectures and hyper-parameters. However, these AutoML pipelines only focus on improving the learning accuracy of benign samples while ignoring the ML model robustness under adversarial attacks. As ML systems are increasingly being used in a variety of mission-critical…

    Submitted 14 January, 2021; originally announced January 2021.

  42. arXiv:2012.01806  [pdf, other]

    cs.CV cs.LG

    Attribute-Guided Adversarial Training for Robustness to Natural Perturbations

    Authors: Tejas Gokhale, Rushil Anirudh, Bhavya Kailkhura, Jayaraman J. Thiagarajan, Chitta Baral, Yezhou Yang

    Abstract: While existing work in robust deep learning has focused on small pixel-level norm-based perturbations, this may not account for perturbations encountered in several real-world settings. In many such cases, although test data might not be available, broad specifications about the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expe…

    Submitted 7 April, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: AAAI 2021. Camera Ready version + Appendix

  43. arXiv:2012.01478  [pdf, other]

    cond-mat.mtrl-sci cs.CV cs.LG physics.app-ph

    Leveraging Uncertainty from Deep Learning for Trustworthy Materials Discovery Workflows

    Authors: Jize Zhang, Bhavya Kailkhura, T. Yong-Jin Han

    Abstract: In this paper, we leverage the predictive uncertainty of deep neural networks to answer challenging questions that materials scientists usually encounter in machine-learning-based materials application workflows. First, we show that by leveraging predictive uncertainty, a user can determine the training data set size necessary to achieve a certain classification accuracy. Next, we propose uncertai…

    Submitted 22 April, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

  44. arXiv:2012.01274  [pdf, other]

    cs.LG

    How Robust are Randomized Smoothing based Defenses to Data Poisoning?

    Authors: Akshay Mehra, Bhavya Kailkhura, Pin-Yu Chen, Jihun Hamm

    Abstract: Predictions of certifiably robust classifiers remain constant in a neighborhood of a point, making them resilient to test-time attacks with a guarantee. In this work, we present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality in achieving high certified adversarial robustness. Specifically, we propose a novel bilevel optimi…

    Submitted 30 March, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

    Comments: CVPR 2021

  45. FedCluster: Boosting the Convergence of Federated Learning via Cluster-Cycling

    Authors: Cheng Chen, Ziyi Chen, Yi Zhou, Bhavya Kailkhura

    Abstract: We develop FedCluster, a novel federated learning framework with improved optimization efficiency, and investigate its theoretical convergence properties. FedCluster groups the devices into multiple clusters that perform federated learning cyclically in each learning round. Therefore, each learning round of FedCluster consists of multiple cycles of meta-update that boost the overall convergenc…

    Submitted 22 September, 2020; originally announced September 2020.

    Comments: 10 pages, 6 figures

    Journal ref: 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 2020, pp. 5017-5026

  46. arXiv:2007.10800  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    Probabilistic Neighbourhood Component Analysis: Sample Efficient Uncertainty Estimation in Deep Learning

    Authors: Ankur Mallick, Chaitanya Dwivedi, Bhavya Kailkhura, Gauri Joshi, T. Yong-Jin Han

    Abstract: While Deep Neural Networks (DNNs) achieve state-of-the-art accuracy in various applications, they often fall short in accurately estimating their predictive uncertainty and, in turn, fail to recognize when these predictions may be wrong. Several uncertainty-aware models, such as Bayesian Neural Networks (BNNs) and Deep Ensembles, have been proposed in the literature for quantifying predictive uncert…

    Submitted 18 July, 2020; originally announced July 2020.

  47. arXiv:2007.08631  [pdf, other]

    cs.LG cs.CV

    Explainable Deep Learning for Uncovering Actionable Scientific Insights for Materials Discovery and Design

    Authors: Shusen Liu, Bhavya Kailkhura, Jize Zhang, Anna M. Hiszpanski, Emily Robertson, Donald Loveland, T. Yong-Jin Han

    Abstract: The scientific community has been increasingly interested in harnessing the power of deep learning to solve various domain challenges. However, despite the effectiveness in building predictive models, fundamental challenges exist in extracting actionable knowledge from deep neural networks due to their opaque nature. In this work, we propose techniques for exploring the behavior of deep learning m…

    Submitted 16 July, 2020; originally announced July 2020.

    Report number: LLNL-JRNL-811201

  48. arXiv:2007.00067  [pdf, other]

    cs.CL cs.LG stat.ML

    Adversarial Mutual Information for Text Generation

    Authors: Boyuan Pan, Yazheng Yang, Kaizhao Liang, Bhavya Kailkhura, Zhongming Jin, Xian-Sheng Hua, Deng Cai, Bo Li

    Abstract: Recent advances in maximizing mutual information (MI) between the source and target have demonstrated its effectiveness in text generation. However, previous works paid little attention to modeling the backward network of MI (i.e., dependency from the target to the source), which is crucial to the tightness of the variational information maximization lower bound. In this paper, we propose Adversar…

    Submitted 30 June, 2020; originally announced July 2020.

    Comments: Published at ICML 2020

  49. arXiv:2006.16533  [pdf, other]

    cs.CV cs.LG

    Actionable Attribution Maps for Scientific Machine Learning

    Authors: Shusen Liu, Bhavya Kailkhura, Jize Zhang, Anna M. Hiszpanski, Emily Robertson, Donald Loveland, T. Yong-Jin Han

    Abstract: The scientific community has been increasingly interested in harnessing the power of deep learning to solve various domain challenges. However, despite the effectiveness in building predictive models, fundamental challenges exist in extracting actionable knowledge from deep neural networks due to their opaque nature. In this work, we propose techniques for exploring the behavior of deep learnin…

    Submitted 30 June, 2020; originally announced June 2020.

  50. arXiv:2006.06224  [pdf, other]

    cs.LG eess.SP stat.ML

    A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning

    Authors: Sijia Liu, Pin-Yu Chen, Bhavya Kailkhura, Gaoyuan Zhang, Alfred Hero, Pramod K. Varshney

    Abstract: Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many signal processing and machine learning applications. It is used for solving optimization problems similarly to gradient-based methods. However, it does not require the gradient, using only function evaluations. Specifically, ZO optimization iteratively performs three major steps: gradient estimation, desc…

    Submitted 21 June, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: IEEE Signal Processing Magazine