
axa-rev-research/comparing-fairML-strategies


# When Mitigating Bias is Unfair: A Comprehensive Study on the Impact of Bias Mitigation Algorithms

This repository contains the codebase for "When Mitigating Bias is Unfair: A Comprehensive Study on the Impact of Bias Mitigation Algorithms" (available on arXiv).

Most works on the fairness of machine learning systems focus on the blind optimization of common fairness metrics, such as Demographic Parity and Equalized Odds. In this paper, we conduct a comparative study of several bias mitigation approaches to investigate their behaviors at a fine grain: the prediction level. Our objective is to characterize the differences between fair models obtained with different approaches. With comparable performance in fairness and accuracy, do the different bias mitigation approaches impact a similar number of individuals? Do they mitigate bias in a similar way? Do they affect the same individuals when debiasing a model? Our findings show that bias mitigation approaches differ considerably in their strategies, both in the number of impacted individuals and in the populations targeted. More surprisingly, these results hold even across several runs of the same mitigation approach. These findings raise questions about the limitations of the current group fairness metrics, as well as the arbitrariness, hence unfairness, of the whole debiasing process.
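To make the prediction-level comparison concrete, here is a minimal sketch (toy predictions, not the repository's actual code) of how one can identify which individuals a mitigation approach impacts, and quantify how much two approaches overlap via Jaccard similarity:

```python
def impacted(y_base, y_mitigated):
    """Indices of individuals whose prediction flips after mitigation."""
    return {i for i, (a, b) in enumerate(zip(y_base, y_mitigated)) if a != b}

def overlap(set_a, set_b):
    """Jaccard similarity between two sets of impacted individuals."""
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

# Hypothetical binary predictions: a baseline model and two debiased variants.
y_base = [1, 0, 1, 1, 0, 0, 1, 0]
y_m1   = [1, 1, 1, 0, 0, 0, 1, 0]  # approach 1 flips individuals 1 and 3
y_m2   = [0, 0, 1, 0, 0, 1, 1, 0]  # approach 2 flips individuals 0, 3 and 5

imp1, imp2 = impacted(y_base, y_m1), impacted(y_base, y_m2)
print(imp1, imp2, overlap(imp1, imp2))  # {1, 3} {0, 3, 5} 0.25
```

Two approaches with similar aggregate fairness scores can still impact largely disjoint sets of individuals, which is exactly the phenomenon the paper studies.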

## Structure

```
.
├── README.md
├── fairlearn_int   <-- Fairlearn (https://fairlearn.org/) with slight modifications to support multiple data structures
├── fairness
│   ├── helpers       <-- file containing helper functions
│   └── avd_helpers   <-- file containing helper functions
├── notebooks       <-- experiments for each analysed dataset, structured as follows:
│   └── name_dataset
│       ├── name
│       └── runs
└── results         <-- experiment results needed for further analyses, per analysed dataset, structured as follows:
    └── name_dataset
        └── results
```
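The `fairness` helpers revolve around group fairness metrics such as Demographic Parity, mentioned in the abstract. As a stand-alone illustration (a hypothetical sketch, not the repository's actual helper code), the demographic parity difference is simply the gap in positive-prediction rates between groups:

```python
def demographic_parity_difference(y_pred, sensitive):
    """Gap between the highest and lowest positive-prediction rate across groups."""
    rates = {}
    for g in set(sensitive):
        group_preds = [p for p, s in zip(y_pred, sensitive) if s == g]
        rates[g] = sum(group_preds) / len(group_preds)
    return max(rates.values()) - min(rates.values())

# Toy example: group "a" gets positives at rate 3/4, group "b" at rate 1/4.
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
sens   = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_difference(y_pred, sens))  # 0.5
```

A perfectly "demographically fair" model scores 0.0 on this metric; note that many different prediction vectors achieve the same score, which is why the paper inspects individual predictions rather than the aggregate alone.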

## Data

The following datasets are used for the analysis: