Search | arXiv e-print repository

Fairness under Covariate Shift: Improving Fairness-Accuracy tradeoff with few Unlabeled Test Samples

Authors: Shreyas Havaldar, Jatin Chauhan, Karthikeyan Shanmugam, Jay Nandy, Aravindan Raghuveer

Abstract: Covariate shift in the test data is a common practical phenomena that can significantly downgrade both the accuracy and the fairness performance of the model. Ensuring fairness across different sensitive groups under covariate shift is of paramount importance due to societal implications like criminal justice. We operate in the unsupervised regime where only a small set of unlabeled test samples a… ▽ More Covariate shift in the test data is a common practical phenomena that can significantly downgrade both the accuracy and the fairness performance of the model. Ensuring fairness across different sensitive groups under covariate shift is of paramount importance due to societal implications like criminal justice. We operate in the unsupervised regime where only a small set of unlabeled test samples along with a labeled training set is available. Towards improving fairness under this highly challenging yet realistic scenario, we make three contributions. First is a novel composite weighted entropy based objective for prediction accuracy which is optimized along with a representation matching loss for fairness. We experimentally verify that optimizing with our loss formulation outperforms a number of state-of-the-art baselines in the pareto sense with respect to the fairness-accuracy tradeoff on several standard datasets. Our second contribution is a new setting we term Asymmetric Covariate Shift that, to the best of our knowledge, has not been studied before. Asymmetric covariate shift occurs when distribution of covariates of one group shifts significantly compared to the other groups and this happens when a dominant group is over-represented. While this setting is extremely challenging for current baselines, We show that our proposed method significantly outperforms them. Our third contribution is theoretical, where we show that our weighted entropy term along with prediction loss on the training set approximates test loss under covariate shift. Empirically and through formal sample complexity bounds, we show that this approximation to the unseen test loss does not depend on importance sampling variance which affects many other baselines. △ Less

Submitted 8 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: Accepted at The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

arXiv:2206.12626 [pdf, other]

doi 10.1145/3534678.3539394

Multi-Variate Time Series Forecasting on Variable Subsets

Authors: Jatin Chauhan, Aravindan Raghuveer, Rishi Saket, Jay Nandy, Balaraman Ravindran

Abstract: We formulate a new inference task in the domain of multivariate time series forecasting (MTSF), called Variable Subset Forecast (VSF), where only a small subset of the variables is available during inference. Variables are absent during inference because of long-term data loss (eg. sensor failures) or high -> low-resource domain shift between train / test. To the best of our knowledge, robustness… ▽ More We formulate a new inference task in the domain of multivariate time series forecasting (MTSF), called Variable Subset Forecast (VSF), where only a small subset of the variables is available during inference. Variables are absent during inference because of long-term data loss (eg. sensor failures) or high -> low-resource domain shift between train / test. To the best of our knowledge, robustness of MTSF models in presence of such failures, has not been studied in the literature. Through extensive evaluation, we first show that the performance of state of the art methods degrade significantly in the VSF setting. We propose a non-parametric, wrapper technique that can be applied on top any existing forecast models. Through systematic experiments across 4 datasets and 5 forecast models, we show that our technique is able to recover close to 95\% performance of the models even when only 15\% of the original variables are present. △ Less

Submitted 25 June, 2022; originally announced June 2022.

Journal ref: ACM SIGKDD 2022 Research Track

arXiv:2107.11822 [pdf, other]

Distributional Shifts in Automated Diabetic Retinopathy Screening

Authors: Jay Nandy, Wynne Hsu, Mong Li Lee

Abstract: Deep learning-based models are developed to automatically detect if a retina image is `referable' in diabetic retinopathy (DR) screening. However, their classification accuracy degrades as the input images distributionally shift from their training distribution. Further, even if the input is not a retina image, a standard DR classifier produces a high confident prediction that the image is `refera… ▽ More Deep learning-based models are developed to automatically detect if a retina image is `referable' in diabetic retinopathy (DR) screening. However, their classification accuracy degrades as the input images distributionally shift from their training distribution. Further, even if the input is not a retina image, a standard DR classifier produces a high confident prediction that the image is `referable'. Our paper presents a Dirichlet Prior Network-based framework to address this issue. It utilizes an out-of-distribution (OOD) detector model and a DR classification model to improve generalizability by identifying OOD images. Experiments on real-world datasets indicate that the proposed framework can eliminate the unknown non-retina images and identify the distributionally shifted retina images for human intervention. △ Less

Submitted 25 July, 2021; originally announced July 2021.

Comments: Accepted at IEEE ICIP 2021

arXiv:2102.05096 [pdf, other]

Towards Bridging the gap between Empirical and Certified Robustness against Adversarial Examples

Authors: Jay Nandy, Sudipan Saha, Wynne Hsu, Mong Li Lee, Xiao Xiang Zhu

Abstract: The current state-of-the-art defense methods against adversarial examples typically focus on improving either empirical or certified robustness. Among them, adversarially trained (AT) models produce empirical state-of-the-art defense against adversarial examples without providing any robustness guarantees for large classifiers or higher-dimensional inputs. In contrast, existing randomized smoothin… ▽ More The current state-of-the-art defense methods against adversarial examples typically focus on improving either empirical or certified robustness. Among them, adversarially trained (AT) models produce empirical state-of-the-art defense against adversarial examples without providing any robustness guarantees for large classifiers or higher-dimensional inputs. In contrast, existing randomized smoothing based models achieve state-of-the-art certified robustness while significantly degrading the empirical robustness against adversarial examples. In this paper, we propose a novel method, called \emph{Certification through Adaptation}, that transforms an AT model into a randomized smoothing classifier during inference to provide certified robustness for $\ell_2$ norm without affecting their empirical robustness against adversarial attacks. We also propose \emph{Auto-Noise} technique that efficiently approximates the appropriate noise levels to flexibly certify the test examples using randomized smoothing technique. Our proposed \emph{Certification through Adaptation} with \emph{Auto-Noise} technique achieves an \textit{average certified radius (ACR) scores} up to $1.102$ and $1.148$ respectively for CIFAR-10 and ImageNet datasets using AT models without affecting their empirical robustness or benign accuracy. Therefore, our paper is a step towards bridging the gap between the empirical and certified robustness against adversarial examples by achieving both using the same classifier. △ Less

Submitted 30 July, 2022; v1 submitted 9 February, 2021; originally announced February 2021.

Comments: An abridged version of this work has been presented at ICLR 2021 Workshop on Security and Safety in Machine Learning Systems: https://aisecure-workshop.github.io/aml-iclr2021/papers/2.pdf

arXiv:2010.10474 [pdf, other]

Towards Maximizing the Representation Gap between In-Domain & Out-of-Distribution Examples

Authors: Jay Nandy, Wynne Hsu, Mong Li Lee

Abstract: Among existing uncertainty estimation approaches, Dirichlet Prior Network (DPN) distinctly models different predictive uncertainty types. However, for in-domain examples with high data uncertainties among multiple classes, even a DPN model often produces indistinguishable representations from the out-of-distribution (OOD) examples, compromising their OOD detection performance. We address this shor… ▽ More Among existing uncertainty estimation approaches, Dirichlet Prior Network (DPN) distinctly models different predictive uncertainty types. However, for in-domain examples with high data uncertainties among multiple classes, even a DPN model often produces indistinguishable representations from the out-of-distribution (OOD) examples, compromising their OOD detection performance. We address this shortcoming by proposing a novel loss function for DPN to maximize the \textit{representation gap} between in-domain and OOD examples. Experimental results demonstrate that our proposed approach consistently improves OOD detection performance. △ Less

Submitted 6 January, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

Comments: Accepted at NeurIPS 2020 Workshop version: ICML UDL 2020, Link: http://www.gatsby.ucl.ac.uk/~balaji/udl2020/accepted-papers/UDL2020-paper-134.pdf

arXiv:2004.02183 [pdf, other]

Approximate Manifold Defense Against Multiple Adversarial Perturbations

Authors: Jay Nandy, Wynne Hsu, Mong Li Lee

Abstract: Existing defenses against adversarial attacks are typically tailored to a specific perturbation type. Using adversarial training to defend against multiple types of perturbation requires expensive adversarial examples from different perturbation types at each training step. In contrast, manifold-based defense incorporates a generative network to project an input sample onto the clean data manifold… ▽ More Existing defenses against adversarial attacks are typically tailored to a specific perturbation type. Using adversarial training to defend against multiple types of perturbation requires expensive adversarial examples from different perturbation types at each training step. In contrast, manifold-based defense incorporates a generative network to project an input sample onto the clean data manifold. This approach eliminates the need to generate expensive adversarial examples while achieving robustness against multiple perturbation types. However, the success of this approach relies on whether the generative network can capture the complete clean data manifold, which remains an open problem for complex input domain. In this work, we devise an approximate manifold defense mechanism, called RBF-CNN, for image classification. Instead of capturing the complete data manifold, we use an RBF layer to learn the density of small image patches. RBF-CNN also utilizes a reconstruction layer that mitigates any minor adversarial perturbations. Further, incorporating our proposed reconstruction process for training improves the adversarial robustness of our RBF-CNN models. Experiment results on MNIST and CIFAR-10 datasets indicate that RBF-CNN offers robustness for multiple perturbations without the need for expensive adversarial training. △ Less

Submitted 15 October, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

Comments: Workshop on Machine Learning with Guarantees, NeurIPS 2019. IJCNN, 2020 (full paper)

arXiv:1805.05269 [pdf, other]

Normal Similarity Network for Generative Modelling

Authors: Jay Nandy, Wynne Hsu, Mong Li Lee

Abstract: Gaussian distributions are commonly used as a key building block in many generative models. However, their applicability has not been well explored in deep networks. In this paper, we propose a novel deep generative model named as Normal Similarity Network (NSN) where the layers are constructed with Gaussian-style filters. NSN is trained with a layer-wise non-parametric density estimation algorith… ▽ More Gaussian distributions are commonly used as a key building block in many generative models. However, their applicability has not been well explored in deep networks. In this paper, we propose a novel deep generative model named as Normal Similarity Network (NSN) where the layers are constructed with Gaussian-style filters. NSN is trained with a layer-wise non-parametric density estimation algorithm that iteratively down-samples the training images and captures the density of the down-sampled training images in the final layer. Additionally, we propose NSN-Gen for generating new samples from noise vectors by iteratively reconstructing feature maps in the hidden layers of NSN. Our experiments suggest encouraging results of the proposed model for a wide range of computer vision applications including image generation, styling and reconstruction from occluded images. △ Less

Submitted 14 May, 2018; originally announced May 2018.

Showing 1–7 of 7 results for author: Nandy, J