-
Maximum Weight Entropy
Authors:
Antoine de Mathelin,
François Deheeger,
Mathilde Mougeot,
Nicolas Vayatis
Abstract:
This paper deals with uncertainty quantification and out-of-distribution detection in deep learning using Bayesian and ensemble methods. It proposes a practical solution to the lack of prediction diversity observed recently for standard approaches when used out-of-distribution (Ovadia et al., 2019; Liu et al., 2021). Considering that this issue is mainly related to a lack of weight diversity, we c…
▽ More
This paper deals with uncertainty quantification and out-of-distribution detection in deep learning using Bayesian and ensemble methods. It proposes a practical solution to the lack of prediction diversity observed recently for standard approaches when used out-of-distribution (Ovadia et al., 2019; Liu et al., 2021). Considering that this issue is mainly related to a lack of weight diversity, we claim that standard methods sample in "over-restricted" regions of the weight space due to the use of "over-regularization" processes, such as weight decay and zero-mean centered Gaussian priors. We propose to solve the problem by adopting the maximum entropy principle for the weight distribution, with the underlying idea to maximize the weight diversity. Under this paradigm, the epistemic uncertainty is described by the weight distribution of maximal entropy that produces neural networks "consistent" with the training observations. Considering stochastic neural networks, a practical optimization is derived to build such a distribution, defined as a trade-off between the average empirical risk and the weight distribution entropy. We develop a novel weight parameterization for the stochastic model, based on the singular value decomposition of the neural network's hidden representations, which enables a large increase of the weight entropy for a small empirical risk penalization. We provide both theoretical and numerical results to assess the efficiency of the approach. In particular, the proposed algorithm appears in the top three best methods in all configurations of an extensive out-of-distribution detection benchmark including more than thirty competitors.
△ Less
Submitted 27 September, 2023;
originally announced September 2023.
-
Deep Anti-Regularized Ensembles provide reliable out-of-distribution uncertainty quantification
Authors:
Antoine de Mathelin,
Francois Deheeger,
Mathilde Mougeot,
Nicolas Vayatis
Abstract:
We consider the problem of uncertainty quantification in high dimensional regression and classification for which deep ensemble have proven to be promising methods. Recent observations have shown that deep ensemble often return overconfident estimates outside the training domain, which is a major limitation because shifted distributions are often encountered in real-life scenarios. The principal c…
▽ More
We consider the problem of uncertainty quantification in high dimensional regression and classification for which deep ensemble have proven to be promising methods. Recent observations have shown that deep ensemble often return overconfident estimates outside the training domain, which is a major limitation because shifted distributions are often encountered in real-life scenarios. The principal challenge for this problem is to solve the trade-off between increasing the diversity of the ensemble outputs and making accurate in-distribution predictions. In this work, we show that an ensemble of networks with large weights fitting the training data are likely to meet these two objectives. We derive a simple and practical approach to produce such ensembles, based on an original anti-regularization term penalizing small weights and a control process of the weight increase which maintains the in-distribution loss under an acceptable threshold. The developed approach does not require any out-of-distribution training data neither any trade-off hyper-parameter calibration. We derive a theoretical framework for this approach and show that the proposed optimization can be seen as a "water-filling" problem. Several experiments in both regression and classification settings highlight that Deep Anti-Regularized Ensembles (DARE) significantly improve uncertainty quantification outside the training domain in comparison to recent deep ensembles and out-of-distribution detection methods. All the conducted experiments are reproducible and the source code is available at \url{https://github.com/antoinedemathelin/DARE}.
△ Less
Submitted 8 April, 2023;
originally announced April 2023.
-
Fast and Accurate Importance Weighting for Correcting Sample Bias
Authors:
Antoine de Mathelin,
Francois Deheeger,
Mathilde Mougeot,
Nicolas Vayatis
Abstract:
Bias in datasets can be very detrimental for appropriate statistical estimation. In response to this problem, importance weighting methods have been developed to match any biased distribution to its corresponding target unbiased distribution. The seminal Kernel Mean Matching (KMM) method is, nowadays, still considered as state of the art in this research field. However, one of the main drawbacks o…
▽ More
Bias in datasets can be very detrimental for appropriate statistical estimation. In response to this problem, importance weighting methods have been developed to match any biased distribution to its corresponding target unbiased distribution. The seminal Kernel Mean Matching (KMM) method is, nowadays, still considered as state of the art in this research field. However, one of the main drawbacks of this method is the computational burden for large datasets. Building on previous works by Huang et al. (2007) and de Mathelin et al. (2021), we derive a novel importance weighting algorithm which scales to large datasets by using a neural network to predict the instance weights. We show, on multiple public datasets, under various sample biases, that our proposed approach drastically reduces the computational time on large dataset while maintaining similar sample bias correction performance compared to other importance weighting methods. The proposed approach appears to be the only one able to give relevant reweighting in a reasonable time for large dataset with up to two million data.
△ Less
Submitted 9 September, 2022;
originally announced September 2022.
-
ADAPT : Awesome Domain Adaptation Python Toolbox
Authors:
Antoine de Mathelin,
Mounir Atiq,
Guillaume Richard,
Alejandro de la Concha,
Mouad Yachouti,
François Deheeger,
Mathilde Mougeot,
Nicolas Vayatis
Abstract:
In this paper, we introduce the ADAPT library, an open source Python API providing the implementation of the main transfer learning and domain adaptation methods. The library is designed with a user friendly approach to facilitate the access to domain adaptation for a wide public. ADAPT is compatible with scikit-learn and TensorFlow and a full documentation is proposed online https://adapt-python.…
▽ More
In this paper, we introduce the ADAPT library, an open source Python API providing the implementation of the main transfer learning and domain adaptation methods. The library is designed with a user friendly approach to facilitate the access to domain adaptation for a wide public. ADAPT is compatible with scikit-learn and TensorFlow and a full documentation is proposed online https://adapt-python.github.io/adapt/ with a substantial gallery of examples.
△ Less
Submitted 1 February, 2023; v1 submitted 7 July, 2021;
originally announced July 2021.
-
Discrepancy-Based Active Learning for Domain Adaptation
Authors:
Antoine de Mathelin,
Francois Deheeger,
Mathilde Mougeot,
Nicolas Vayatis
Abstract:
The goal of the paper is to design active learning strategies which lead to domain adaptation under an assumption of Lipschitz functions. Building on previous work by Mansour et al. (2009) we adapt the concept of discrepancy distance between source and target distributions to restrict the maximization over the hypothesis class to a localized class of functions which are performing accurate labelin…
▽ More
The goal of the paper is to design active learning strategies which lead to domain adaptation under an assumption of Lipschitz functions. Building on previous work by Mansour et al. (2009) we adapt the concept of discrepancy distance between source and target distributions to restrict the maximization over the hypothesis class to a localized class of functions which are performing accurate labeling on the source domain. We derive generalization error bounds for such active learning strategies in terms of Rademacher average and localized discrepancy for general loss functions which satisfy a regularity condition. A practical K-medoids algorithm that can address the case of large data set is inferred from the theoretical bounds. Our numerical experiments show that the proposed algorithm is competitive against other state-of-the-art active learning techniques in the context of domain adaptation, in particular on large data sets of around one hundred thousand images.
△ Less
Submitted 14 September, 2022; v1 submitted 5 March, 2021;
originally announced March 2021.
-
Adversarial Weighting for Domain Adaptation in Regression
Authors:
Antoine de Mathelin,
Guillaume Richard,
Francois Deheeger,
Mathilde Mougeot,
Nicolas Vayatis
Abstract:
We present a novel instance-based approach to handle regression tasks in the context of supervised domain adaptation under an assumption of covariate shift. The approach developed in this paper is based on the assumption that the task on the target domain can be efficiently learned by adequately reweighting the source instances during training phase. We introduce a novel formulation of the optimiz…
▽ More
We present a novel instance-based approach to handle regression tasks in the context of supervised domain adaptation under an assumption of covariate shift. The approach developed in this paper is based on the assumption that the task on the target domain can be efficiently learned by adequately reweighting the source instances during training phase. We introduce a novel formulation of the optimization objective for domain adaptation which relies on a discrepancy distance characterizing the difference between domains according to a specific task and a class of hypotheses. To solve this problem, we develop an adversarial network algorithm which learns both the source weighting scheme and the task in one feed-forward gradient descent. We provide numerical evidence of the relevance of the method on public data sets for regression domain adaptation through reproducible experiments.
△ Less
Submitted 15 September, 2021; v1 submitted 15 June, 2020;
originally announced June 2020.
-
Metamodel-based importance sampling for structural reliability analysis
Authors:
V. Dubourg,
F. Deheeger,
B. Sudret
Abstract:
Structural reliability methods aim at computing the probability of failure of systems with respect to some prescribed performance functions. In modern engineering such functions usually resort to running an expensive-to-evaluate computational model (e.g. a finite element model). In this respect simulation methods, which may require $10^{3-6}$ runs cannot be used directly. Surrogate models such as…
▽ More
Structural reliability methods aim at computing the probability of failure of systems with respect to some prescribed performance functions. In modern engineering such functions usually resort to running an expensive-to-evaluate computational model (e.g. a finite element model). In this respect simulation methods, which may require $10^{3-6}$ runs cannot be used directly. Surrogate models such as quadratic response surfaces, polynomial chaos expansions or kriging (which are built from a limited number of runs of the original model) are then introduced as a substitute of the original model to cope with the computational cost. In practice it is almost impossible to quantify the error made by this substitution though. In this paper we propose to use a kriging surrogate of the performance function as a means to build a quasi-optimal importance sampling density. The probability of failure is eventually obtained as the product of an augmented probability computed by substituting the meta-model for the original performance function and a correction term which ensures that there is no bias in the estimation even if the meta-model is not fully accurate. The approach is applied to analytical and finite element reliability problems and proves efficient up to 100 random variables.
△ Less
Submitted 7 May, 2011; v1 submitted 3 May, 2011;
originally announced May 2011.
-
Metamodel-based importance sampling for the simulation of rare events
Authors:
V. Dubourg,
F. Deheeger,
B. Sudret
Abstract:
In the field of structural reliability, the Monte-Carlo estimator is considered as the reference probability estimator. However, it is still untractable for real engineering cases since it requires a high number of runs of the model. In order to reduce the number of computer experiments, many other approaches known as reliability methods have been proposed. A certain approach consists in replacing…
▽ More
In the field of structural reliability, the Monte-Carlo estimator is considered as the reference probability estimator. However, it is still untractable for real engineering cases since it requires a high number of runs of the model. In order to reduce the number of computer experiments, many other approaches known as reliability methods have been proposed. A certain approach consists in replacing the original experiment by a surrogate which is much faster to evaluate. Nevertheless, it is often difficult (or even impossible) to quantify the error made by this substitution. In this paper an alternative approach is developed. It takes advantage of the kriging meta-modeling and importance sampling techniques. The proposed alternative estimator is finally applied to a finite element based structural reliability analysis.
△ Less
Submitted 18 April, 2011;
originally announced April 2011.