Eric Balkanski
Columbia University
eb3224@columbia.edu

Will Ma
Columbia University
wm2428@gsb.columbia.edu

Andreas Maggiori
Columbia University
am6292@columbia.edu
Abstract
Algorithms with predictions is a recent framework for decision-making under uncertainty that leverages the power of machine-learned predictions without making any assumption about their quality. The goal in this framework is for algorithms to achieve an improved performance when the predictions are accurate while maintaining acceptable guarantees when the predictions are erroneous. A serious concern with algorithms that use predictions is that these predictions can be biased and, as a result, cause the algorithm to make decisions that are deemed unfair. We show that this concern manifests itself in the classical secretary problem in the learning-augmented setting: the state-of-the-art algorithm can have zero probability of accepting the best candidate, which we deem unfair, despite promising to accept a candidate whose expected value is at least $\max\{\Omega(1), 1 - O(\varepsilon)\}$ times the optimal value, where $\varepsilon$ is the prediction error.
We show how to preserve this promise while also guaranteeing to accept the best candidate with probability $\Omega(1)$. Our algorithm and analysis are based on a new “pegging” idea that diverges from existing works and simplifies/unifies some of their results. Finally, we extend to the $k$-secretary problem and complement our theoretical analysis with experiments.
1 Introduction
As machine learning algorithms are increasingly used in socially impactful decision-making applications, the fairness of those algorithms has become a primary concern.
Many algorithms deployed in recent years have been shown to be explicitly unfair or to reflect bias that is present in training data. Applications where automated decision-making algorithms have been used and fairness is of central importance include loan/credit-risk evaluation (Mukerjee et al., 2002; Khandani et al., 2010; Malhotra and Malhotra, 2003), hiring (Bogen and Rieke, 2018; Cohen et al., 2019), recidivism evaluation (Mustard, 2003; Angwin et al., 2016; Dressel and Farid, 2018; Chouldechova, 2017; COMPAS software), childhood welfare systems (Chouldechova et al., 2018), job recommendations (Lambrecht and Tucker, 2019), price discrimination (Cohen et al., 2022), resource allocation (Manshadi et al., 2023), and others (Hern, 2016; Grossman, 2010; Howard and Borenstein, 2018; Balseiro et al., 2021; Ma et al., 2022).
Much work in recent years has been devoted to formally defining different notions of fairness (Luong et al., 2011; Kamishima et al., 2012; Feldman et al., 2015; Dwork et al., 2012; Corbett-Davies et al., 2017; Kleinberg et al., 2017; Kleinberg and Raghavan, 2017), designing algorithms that satisfy these different definitions (Kamishima et al., 2011; Joseph et al., 2016; Celis et al., 2018; Chierichetti et al., 2019; Yang and Stoyanovich, 2017), and investigating trade-offs between fairness and other optimization objectives (Bertsimas et al., 2011, 2012).
While most fairness work concentrates on classification problems where the instance is known offline, we explore the problem of making fair decisions when the input is revealed in an online manner. Although fairness in online algorithms is an interesting line of research per se, fairness considerations have become increasingly important due to the recent interest in incorporating (possibly biased) machine learning predictions into the design of classical online algorithms. This framework, usually referred to as learning-augmented algorithms or algorithms with predictions, was first formalized in Lykouris and Vassilvitskii (2018). In contrast to classical online problems, where it is assumed that no information is known about the future, learning-augmented online algorithms are given possibly erroneous predictions about the future as input. The main challenge is to simultaneously achieve an improved performance when the predictions are accurate and a robust performance when the predictions are arbitrarily inaccurate.
A long list of online problems have been considered in this setting and we point to Lindermayr and Megow for an up-to-date list of papers.
We enrich this active area of research by investigating how potentially biased predictions affect the fairness of decisions made by learning-augmented algorithms, and ask the following question:
Can we design fair algorithms that take advantage of unfair predictions?
In this paper, we study this question on a parsimonious formulation of the secretary problem with predictions, motivated by fairness in hiring candidates.
The problem. In the classical secretary problem, there are $n$ candidates who each have a value and arrive in a random order. Upon arrival of a candidate, the algorithm observes the value of that candidate and must irrevocably decide whether to accept or reject that candidate. It can only accept one candidate and the goal is to maximize the probability of accepting the candidate with maximum value. In the classical formulation, only the ordinal ranks of candidates matter, and the algorithm of Dynkin (1963) accepts the best candidate with a constant probability that equals the best-possible $1/e$.
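As an illustration of the classical rule, the following is a minimal Python sketch of the textbook $1/e$ stopping rule: observe roughly the first $n/e$ arrivals without accepting, then accept the first candidate that beats everything seen so far. The function names `dynkin` and `success_rate` are hypothetical and chosen only for this example; the sketch is not taken from the authors' code.

```python
import math
import random


def dynkin(values, rng):
    """Return the index accepted by the 1/e stopping rule, or None."""
    n = len(values)
    order = list(range(n))
    rng.shuffle(order)                      # uniformly random arrival order
    cutoff = math.floor(n / math.e)         # length of the observation phase
    best_seen = max((values[i] for i in order[:cutoff]), default=float("-inf"))
    for i in order[cutoff:]:
        if values[i] > best_seen:           # first candidate beating the sample
            return i
    return None                             # nobody after the sample beat it


def success_rate(n, trials, seed=0):
    """Empirical probability that the rule accepts the best of n candidates."""
    rng = random.Random(seed)
    values = list(range(n))                 # distinct values; only ranks matter
    hits = sum(dynkin(values, rng) == n - 1 for _ in range(trials))
    return hits / trials
```

Calling `success_rate(50, 20000)` should return a frequency in the vicinity of $1/e \approx 0.37$; the rule never looks at the magnitude of the values, only at comparisons between them.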
In the learning-augmented formulation of the problem proposed by Fujii and Yoshida (2023), the algorithm is initially given a predicted value about each candidate and the authors focus on comparing the expected cardinal value accepted by the algorithm to the maximum cardinal value. The authors derive an algorithm that obtains expected value at least $\max\{\Omega(1), 1 - O(\varepsilon)\}$ times the maximum value, where $\varepsilon$ is the prediction error. The strength of this guarantee is that it approaches $1$ as the prediction error decreases, and it is a positive constant even when the error is arbitrarily large.
However, because the algorithm is now using predictions that could be biased, the best candidate may no longer have any probability of being accepted. We view this as a form of unfairness, and aim to derive algorithms that are fair to the best candidate by guaranteeing them a constant probability of being accepted (we contrast with other notions of fairness in stopping problems in Section 1.1). Of course, a simple way to be fair by this metric is to ignore the predictions altogether and run the classical algorithm of Dynkin. However, this approach would ignore potentially valuable information and lose the improved guarantee of Fujii and Yoshida (2023) that approaches 1 when the prediction error is low.
Outline of results. We first formally show that the algorithm of Fujii and Yoshida (2023) may in fact accept the best candidate with 0 probability. Our main result is then a new algorithm for secretary with predictions that: obtains expected value at least $\max\{\Omega(1), 1 - O(\varepsilon)\}$ times the maximum value, like Fujii and Yoshida (2023); and ensures that, under any predictions, the best candidate is hired with probability $\Omega(1)$. This result takes advantage of potentially biased predictions to achieve a guarantee on expected value that approaches $1$ when the prediction error is small, while also providing a fairness guarantee for the true best candidate irrespective of the predictions. We note that Antoniadis et al. (2020b) also derive an algorithm for secretary with predictions, where the prediction is of the maximum value (a less informative form of prediction).
This algorithm accepts the best candidate with constant probability but it does not provide a guarantee on the expected value accepted that approaches $1$ as the prediction error approaches $0$. Similarly, Dynkin’s algorithm for the classical secretary problem accepts the best candidate with constant probability but does not make use of predictions at all.
Our algorithm is fundamentally different from existing algorithms for secretary with predictions, as our “pegging” idea, i.e., the idea not to accept a possibly suboptimal candidate if there is a future candidate with high enough predicted value, is important to achieve our fairness desideratum.
We also note that the definitions of the prediction error differ in (Fujii and Yoshida, 2023) and (Antoniadis et al., 2020b); the former error definition uses the maximum ratio over all candidates between their predicted and true value while the latter uses the absolute difference.
Our techniques present an arguably simpler analysis and extend to a general family of prediction error measures that includes both of these error definitions.
We then extend our approach to the multiple choice or -secretary problem where the goal is to accept at most candidates and maximize the total of their values, which is the most technical part of the paper. We design an algorithm that obtains expected total value at least times the optimum (which is the sum of the highest values), while simultaneously guaranteeing the highest-valued candidates a constant probability of being accepted. We also have a refined guarantee that provides a higher acceptance probability for the highest-valued candidates, for any .
Finally, we simulate our algorithms in the exact experimental setup of Fujii and Yoshida (2023). We find that they perform well both in terms of expected value accepted and fairness, whereas benchmark algorithms compromise on one of these desiderata.
1.1 Related work
The secretary problem. After Gardner (1960) introduced the secretary problem, Dynkin (1963) developed a simple and optimal stopping rule algorithm that, with probability at least $1/e$, accepts the candidate with maximum value. Due to its general and simple formulation, the problem has received a lot of attention (see, e.g., Lindley (1961); Gilbert and Mosteller (1966) and references therein) and it was later extended to more general versions such as $k$-secretary (Kleinberg, 2005), matroid-secretary (Babaioff et al., 2007b) and knapsack-secretary (Babaioff et al., 2007a).
Secretaries with predictions. The two works which are closest to our paper are those of Antoniadis et al. (2020b) and Fujii and Yoshida (2023). Both works design algorithms that use predictions regarding the values of the candidates to improve the performance guarantee of Dynkin’s algorithm when the predictions are accurate while also maintaining robustness guarantees when the predictions are arbitrarily wrong. Antoniadis et al. (2020b) uses as prediction only the maximum value and defines the prediction error as the additive difference between the predicted and true maximum value while Fujii and Yoshida (2023) receives a prediction for each candidate and defines the error as the maximum multiplicative difference between true and predicted value among all candidates. Very recently, Choo and Ling (2024) showed that any secretary algorithm that is -consistent cannot achieve robustness better than , even with predictions for each candidate. This result implies that, if we wish to maintain the competitive ratio guarantee from (Fujii and Yoshida, 2023), then the probability of accepting the best candidate cannot be better than .
Secretaries with distributional advice. Another active line of work is to explore how distributional advice can be used to surpass the $1/e$ barrier of the classical secretary problem. Examples of this line of work include the prophet secretary problems, where each candidate draws its valuation from a known distribution (Esfandiari et al., 2017; Correa et al., 2021b, c; Azar et al., 2018), and the sample secretary problem, where the algorithm designer has only sample access to this distribution (Kaplan et al., 2020; Correa et al., 2021a). We note that in the former models, predictions are either samples from distributions or distributions themselves which are assumed to be perfectly correct, while in the learning-augmented setting, we receive point predictions that could be completely incorrect. Dütting et al. (2021) investigate a general model for advice where both values and advice are revealed upon a candidate’s arrival and are drawn from a joint distribution.
For example, their advice can be a noisy binary prediction about whether the current candidate is the best overall. Their main result uses linear programming to design optimal algorithms for a broad family of advice that satisfies two conditions. However, these two conditions are not satisfied by the predictions we consider. Additionally, we do not assume any prior knowledge of the prediction quality, whereas their noisy binary prediction setting assumes that the error probability of the binary advice is known.
Fairness in stopping algorithms.
We say that a learning-augmented algorithm for the secretary problem is $F$-fair if it accepts the candidate with the maximum true value with probability at least $F$. In this definition, we do not quantify unfairness as a property of the predictions but as a property of the algorithm. Since the algorithm has to accept the best candidate with probability at least $F$ no matter how biased the predictions are, our fairness notion is a challenging one.
That notion can be characterized as an individual fairness notion similar to the identity-independent fairness (IIF) and time-independent fairness (TIF) introduced in Arsenis and Kleinberg (2022).
In the context of the secretary problem, IIF and TIF try to mitigate discrimination due to a person’s identity and arrival time respectively. While these are very appealing fairness notions, the fair algorithms designed in Arsenis and Kleinberg (2022) fall in the classical online algorithms setting as they do not make any assumptions about the future. Consequently, their performance is upper bounded by the performance of the best algorithm in the classical worst-case analysis setting. It is also interesting to note the similarities with the poset secretary problem in Salem and Gupta (2023). In the latter work, the set of candidates is split into several groups and candidates belonging to different groups cannot be compared due to different biases in the evaluation. In some sense, we try to do the same: different groups of candidates may have predictions that are affected by different biases, making the comparison difficult before the true value of each candidate is revealed. Again, in Salem and Gupta (2023) no information about the values of future candidates is available and the performance of their algorithms is upper bounded by the best possible performance in the worst-case analysis setting.
2 Preliminaries
Secretary problem with predictions.
Candidates $i = 1, \ldots, n$ have true values $u_i$ and predicted values $\hat{u}_i$.
The number of candidates $n$ and their predicted values are known in advance.
The candidates arrive in a uniformly random order. Every time a new candidate arrives their true value is revealed and the algorithm must immediately decide whether to accept the current candidate or reject them irrevocably and wait for the next arrival. We let $i^* = \arg\max_i u_i$ and $\hat{\imath} = \arg\max_i \hat{u}_i$ denote the indices of the candidates with the maximum true and predicted value respectively. An instance $\mathcal{I}$ consists of the values $u_1, \ldots, u_n, \hat{u}_1, \ldots, \hat{u}_n$ which, for convenience, are assumed to be positive and mutually distinct. Our results for additive error allow negative values, but our extension to multiplicative error in Section A.2 requires positive values. Distinctness is without loss, as adding an arbitrarily small perturbation to each true and predicted value does not change the performance of our algorithms; it allows for a unique $\arg\max$ in the definitions of $i^*$ and $\hat{\imath}$.
We let $\varepsilon(\mathcal{I})$ denote its prediction error.
For simplicity, we focus on the additive prediction error
$\varepsilon(\mathcal{I}) = \max_i |\hat{u}_i - u_i|$, but we consider an abstract generalization that includes the multiplicative prediction error of Fujii and Yoshida (2023) in Appendix A.2.
Objectives. We let $\mathrm{ALG}$ be a random variable denoting the candidate accepted by a given algorithm on a fixed instance, which depends on both the arrival order and any internal randomness in the algorithm. We consider the following desiderata for a given algorithm:
$\mathbb{E}[u_{\mathrm{ALG}}] \ge u_{i^*} - C \cdot \varepsilon(\mathcal{I}), \quad \forall \mathcal{I}$ (smoothness)
$\mathbb{P}[\mathrm{ALG} = i^*] \ge F, \quad \forall \mathcal{I}$ (fairness)
Since the prediction error $\varepsilon(\mathcal{I})$ is an additive prediction error, we define smoothness to provide an additive approximation guarantee that depends on $\varepsilon(\mathcal{I})$. When considering the multiplicative prediction error of Fujii and Yoshida (2023), smoothness is defined to provide an approximation guarantee that is multiplicative instead of additive (see Theorem 4).
We aim to derive algorithms that can satisfy smoothness and fairness with constants $C, F > 0$ that do not depend on the instance $\mathcal{I}$ or the number of candidates $n$. Existing algorithms for secretary with predictions do not simultaneously satisfy these desiderata, as shown by our examples in Appendix A.1.
Comparison to other objectives.
Existing algorithms for secretary with predictions do satisfy a weaker notion called $R$-robustness, where $\mathbb{E}[u_{\mathrm{ALG}}] \ge R \cdot u_{i^*}$ for some constant $R > 0$.
Our desideratum of fairness implies $F$-robustness and aligns with the classical secretary formulation where one is only rewarded for accepting the best candidate.
Another notion of interest in existing literature is consistency, which is how $\mathbb{E}[u_{\mathrm{ALG}}]$ compares to $u_{i^*}$ when $\varepsilon(\mathcal{I}) = 0$.
Our smoothness desideratum implies $1$-consistency, the best possible consistency result, and guarantees a smooth degradation as $\varepsilon(\mathcal{I})$ increases beyond $0$.
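To make the additive error and the smoothness condition concrete, here is a small Python sketch. It assumes the additive error is the maximum absolute difference between predicted and true values over all candidates, and that smoothness lower-bounds the expected accepted value by the maximum true value minus a constant times that error. The names `additive_error` and `satisfies_smoothness`, and the constant `c`, are hypothetical.

```python
def additive_error(true_values, predicted_values):
    """Maximum absolute gap between a prediction and the matching true value."""
    if len(true_values) != len(predicted_values):
        raise ValueError("need exactly one prediction per candidate")
    return max(abs(p - u) for u, p in zip(true_values, predicted_values))


def satisfies_smoothness(true_values, predicted_values, expected_accepted_value, c):
    """Check E[accepted value] >= max true value - c * error on one instance."""
    eps = additive_error(true_values, predicted_values)
    return expected_accepted_value >= max(true_values) - c * eps
```

For example, with true values `[3, 5, 2]` and predictions `[4, 5, 0]` the error is 2, so an algorithm whose expected accepted value is 1 on this instance meets the condition with `c = 2` but not with `c = 1.5`. With zero error the condition forces the expected accepted value to equal the maximum, which is the consistency statement above.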
3 Algorithm and Analysis
We first present and analyze Additive-Pegging in Algorithm 1, which achieves the desiderata from Section 2.
Then, we describe how a more abstract prediction error, together with an almost identical analysis, lets us generalize Additive-Pegging to Pegging, which achieves comparable guarantees for a more general class of error functions that includes the multiplicative error.
Our algorithms assume that each candidate arrives at an independently random arrival time drawn uniformly from . The latter continuous-time arrival model is equivalent to candidates arriving in a uniformly random order and simplifies the algorithm description and analysis. We also write as shorthand for , as shorthand for (so that ) and if .
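The equivalence between continuous arrival times and a uniformly random order can be illustrated with a short Python sketch. It assumes the arrival times are drawn independently and uniformly from the unit interval (the interval itself is an assumption here, chosen to match the half-time threshold used later); the function name `arrival_order` is hypothetical.

```python
import random


def arrival_order(n, rng):
    """Draw i.i.d. uniform arrival times and return them with the induced order."""
    times = [rng.random() for _ in range(n)]          # one time per candidate
    order = sorted(range(n), key=times.__getitem__)   # earliest arrival first
    return times, order
```

Because the times are exchangeable and ties occur with probability zero, every permutation of the candidates is equally likely, so sorting by arrival time yields a uniformly random order. The continuous times additionally let an algorithm refer to thresholds such as "after half of the horizon" without counting arrivals.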
Description of Additive-Pegging.Additive-Pegging ensures smoothness by always accepting a candidate whose value is close to which, as we argue, is at least . To see this, note that , where we used that (by definition of ) and (by definition of ). Consequently, for smoothness, our algorithm defines the literal at each new arrival, which is true if and only if has the highest predicted value. Accepting while holds would maintain smoothness.
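The chain of inequalities behind this kind of claim can be written out as follows. The notation is an assumption, since the symbols are not legible in this passage: $u_i$ and $\hat{u}_i$ are the true and predicted values, $i^*$ and $\hat{\imath}$ are the candidates with the maximum true and predicted value, and $\varepsilon(\mathcal{I}) = \max_i |\hat{u}_i - u_i|$ is the additive error. Under these assumptions, the true value of the candidate with the highest prediction is within $2\varepsilon(\mathcal{I})$ of the optimum:

```latex
\begin{align*}
u_{\hat{\imath}}
  &\ge \hat{u}_{\hat{\imath}} - \varepsilon(\mathcal{I})
     && \text{(error bound applied to candidate } \hat{\imath}\text{)} \\
  &\ge \hat{u}_{i^*} - \varepsilon(\mathcal{I})
     && \text{(} \hat{\imath} \text{ has the maximum predicted value)} \\
  &\ge u_{i^*} - 2\,\varepsilon(\mathcal{I})
     && \text{(error bound applied to candidate } i^*\text{)}
\end{align*}
```

Each step loses at most one $\varepsilon(\mathcal{I})$, which is why accepting a candidate whose value is close to that of $\hat{\imath}$ is compatible with an additive smoothness guarantee.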
For the fairness desideratum, we note that Dynkin’s algorithm (Dynkin, 1963) for the classical secretary problem relies on the observation that if a constant fraction of the candidates have arrived and the candidate who just arrived has the maximum true value so far, then this candidate has a constant probability of being the best overall. The same high-level intuition is used in our algorithm. Every time a new candidate arrives, we check if is the maximum so far and if ; namely, we compute the literal . Accepting when is true, which is what Dynkin’s algorithm does, would ensure fairness.
However, there are two crucial situations where Additive-Pegging differs from Dynkin’s algorithm.
The first such situation is when the candidate with maximum predicted value arrives and we have that is not the maximum so far or , i.e., is true.
In this case, we cannot always reject , as Dynkin’s algorithm would, because that would not guarantee smoothness.
Instead, we reject only if there is a future candidate whose prediction is sufficiently high compared to . We call the set of those candidates.
The main idea behind the pegged set is that it contains the last candidate to arrive who can guarantee the smoothness property, which is why we accept that candidate when they arrive.
The second situation where our algorithm departs from Dynkin’s algorithm is when a candidate arrives with and we have that is true, in which case Algorithm1 executes the if statement under the case . In this situation, we cannot always accept as Dynkin’s algorithm would, because that would again violate smoothness. Instead, we accept only if can be lower bounded by , noting that if conversely is smaller than , then accepting might be detrimental to our quest of ensuring smoothness.
Algorithm 1Additive-Pegging
//* The algorithm stops when it accepts a candidate by executing . *//
Initialization:
while agent arrives at time do
ifthen
ifthen
else
ifthen
elseifthen
(note that )
ifthen
elseifthen
ifthen
Analysis of the Additive-Pegging algorithm.
Lemma 1.
Additive-Pegging satisfies with probability 1.
Proof.
Let denote the last arriving candidate in .
We first argue that Pegging always accepts a candidate irrespective of the random arrival times of the candidates. We focus on any instance where Additive-Pegging does not accept a candidate until time . At time either or are true. Since in the former case candidate is accepted, we focus on the latter case and in particular whenever the set which is computed is non-empty (otherwise, candidate is accepted). In that case, it is guaranteed that by time Additive-Pegging will accept a candidate.
We now argue that in all cases Additive-Pegging maintains smoothness. Using , definitions and the fact that is the candidate with the maximum predicted value we have:
.
If candidate is accepted then using the latter lower bound we get .
If we accept and the if statement of is executed at time then we have .
Finally, we need to lower bound the value in case our algorithm terminates accepting .
Note that from the way the pegged set is updated when is true we always have . Since we can conclude that .
∎
Lemma 2.
Additive-Pegging satisfies .
Proof.
In the following, we assume that the number of candidates is larger or equal to . The proof for the case where is almost identical while the fairness guarantee in that case is . We denote by the index of the candidate with the highest true value except and , i.e., . Note that depending on the value of , might denote the index of the candidate with the second or third highest true value. To prove fairness we distinguish between two cases: either or . For each of those cases, we define an event and argue that: (1) the event happens with constant probability, and (2) if that event happens then Additive-Pegging accepts .
If we define event for which . implies that our algorithm does not accept any candidate until time . Indeed, note that at any point in time before , both literals and are simultaneously false. On the contrary, at time , both and are true and our algorithm accepts .
On the other hand, if we distinguish between two sub-cases. First, we show that either or is true. By contradiction, assume that both inequalities do not hold, then
which is a contradiction. We now define two events and which imply that is always accepted whenever and are true respectively.
If , then we define event which is composed by independent events and it happens with probability .
implies that , thus we can deduce that .
Consequently, if until time all candidates are rejected, implies that is true at time and candidate is hired. To argue that no candidate is accepted before time , note that is false at all times before and at time (when literal is true) the set contains .
If , then we define which happens with probability
Note that until time no candidate is accepted since and are both false at all times. Indeed, between times and only could have been accepted but its arrival time is after , and between times and no candidate has a true value larger than . Finally, note that at time we have and consequently is true and gets accepted.
∎
Theorem 3.
Additive-Pegging satisfies smoothness and fairness with and .
Theorem 3 follows directly from Lemmas 1 and 2. We note that Lemma 1 actually implies a stronger notion of smoothness that holds with probability 1.
The general Pegging algorithm. In Section A.2 we generalize the Additive-Pegging algorithm to the Pegging algorithm to provide fair and smooth algorithms for different prediction error definitions. Additive-Pegging is an instantiation of Pegging when the prediction error is defined as the maximum absolute difference between true and predicted values among candidates. To further demonstrate the generality of Pegging, we also instantiate it over the same prediction error definition as in Fujii and Yoshida (2023) and recover similar smoothness bounds while also ensuring fairness. We name the latter instantiation Multiplicative-Pegging and present its guarantees in Theorem 4.
Theorem 4.
Let and assume . Then Multiplicative-Pegging satisfies fairness with and selects a candidate such that with probability 1.
Fujii and Yoshida (2023) define the prediction error as in Theorem 4 and design an algorithm with a guarantee on the expected value of the candidate it accepts. Their algorithm satisfies a smoothness desideratum similar to the one in Theorem 4 but, as we prove in Section A.1, it violates the fairness desideratum.
4 Extension: $k$-Secretary problem with predictions
We consider the generalization to the $k$-secretary problem, where up to $k$ candidates can be accepted.
To simplify notation we label the candidates in decreasing order of predicted value, so that and denote to be the index of the candidate with the ’th highest true value so that .
The prediction error is again defined as and we let denote the random set of candidates accepted by a given algorithm on a fixed instance. The extension of our two objectives to this setting is
(smoothness for -secretary)
(fairness for -secretary)
The smoothness desideratum compares the expected sum of true values accepted by the algorithm to the sum of the $k$ highest true values that could have been accepted.
The fairness desideratum guarantees each of the candidates ranked to be accepted with probability . The -secretary problem with predictions has been studied by Fujii and Yoshida (2023), who derive an algorithm satisfying but without any fairness guarantees.
We derive an algorithm $k$-Pegging that satisfies the following.
Theorem 5.
-Pegging satisfies smoothness and fairness for -secretary with and for all .
We note that the algorithm of Kleinberg (2005) for -Secretary (without predictions) obtains in expectation at least
which has a better asymptotic dependence on than our smoothness constant if the prediction error is relatively large, i.e., . On the other hand,
regarding our fairness guarantee, if one only considers our fairness desideratum for -secretary, then a simple algorithm suffices to achieve for all , namely: reject all candidates with ; accept any candidate with whose true value is among the highest true values observed so far, space permitting.
For any of the top candidates, i.e., candidate with , their value is always greater than the threshold , which our algorithm recomputes upon the arrival of each candidate. Consequently, candidate is added to our solution if the following conditions are satisfied: (1) ; and (2) there is space available in the solution when arrives.
For condition (2) to hold, it suffices that at least of the candidates with the highest values other than arrive before time . This ensures that at most candidates other than can be accepted after time .
The probability of both conditions (1) and (2) being satisfied is at least , establishing that for all .
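The simple prediction-free algorithm described above can be sketched in Python as follows. Two points are assumptions rather than statements from the text: arrival times are taken to be uniform on the unit interval with the rejection phase ending at time 1/2, and the threshold is taken to be the $k$-th highest true value among earlier arrivals. The function name `simple_fair_k_secretary` is hypothetical.

```python
import heapq
import random


def simple_fair_k_secretary(values, k, rng):
    """Reject everyone in the first half; afterwards accept any candidate whose
    value is among the k highest observed so far, while space remains."""
    n = len(values)
    times = [rng.random() for _ in range(n)]      # assumed uniform arrival times
    accepted = []
    top_k = []                                    # min-heap: k highest values seen
    for i in sorted(range(n), key=times.__getitem__):
        # Threshold = k-th highest value among earlier arrivals (none if < k seen).
        threshold = top_k[0] if len(top_k) == k else float("-inf")
        if times[i] > 0.5 and values[i] > threshold and len(accepted) < k:
            accepted.append(i)
        # Recompute the threshold by recording the current candidate's value.
        if len(top_k) < k:
            heapq.heappush(top_k, values[i])
        elif values[i] > top_k[0]:
            heapq.heapreplace(top_k, values[i])
    return accepted
```

The sketch never looks at predictions, so it can only illustrate the fairness side of the trade-off: any top-$k$ candidate always clears the threshold, and is accepted whenever it arrives in the second half while space remains.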
Assuming is a constant, and in Theorem5 are constants that do not depend on or the instance . For large values of the first term in is exponentially decaying, but the second term still guarantees candidate a probability of acceptance that is independent of as long as is bounded away from 1.
More precisely, for and we have that candidate is accepted with probability at least , i.e., every candidate among the top is accepted with probability at least , thus
. This implies a multiplicative guarantee on total value that does not depend on when is large.
The algorithm. While we defer the proof of Theorem5 to AppendixB, we present the intuition and the main technical difficulties in the design of -Pegging. The algorithm maintains in an online manner the following sets: (1) the solution set which contains all the candidates that have already been accepted; (2) a set that we call the “Hopefuls” and contains the future candidates with highest predicted values; (3) a set that we call the blaming set, which contains a subset of already arrived candidates that pegged a future candidate; and (4) the set of pegged elements which contains all candidates that have been pegged by a candidate in . In addition, we use function to store the “pegging responsibility”, i.e., if , for some candidates where had one of the highest predicted values, then was not accepted at the time of its arrival and pegged . We use to denote that was pegged by .
Algorithm 2 $k$-Pegging
//* The algorithm stops when it accepts candidates, i.e., when . *//
Initialization:
while agent arrives at time do
ifthen Case 1
Add to , remove from , and remove from
,
ifthen Case 2
Add to and remove from
elseifthen Case 3
ifthen subcase a
Add to and remove from
else subcase b
Add to , add to , remove from , and set
elseifthen Case 4
ifthen subcase a
Add to , remove from , and remove from
elseifthen subcase b
Add to and remove from
To satisfy the fairness property, we check if the current candidate has arrived at time and if is larger than the highest value seen so far. We refer to these two conditions as the fairness conditions. If (case 1) or and the fairness conditions hold (case 2), then we accept . If the fairness conditions hold but then we accept if there is a past candidate in with lower true value than (subcase 4a), or a future candidate in with low predicted value compared to (subcase 4b).
The main technical challenge in generalizing the pegging idea to arises when a candidate arrives, but the fairness conditions do not hold (case 3). In this situation, it is unclear whether to reject and peg a future candidate, or accept . For instance, consider a scenario where the prediction error is consistently large (i.e., is always large), such that when arrives, the set is always non-empty.
If and we accept , we risk depleting our budget too quickly before time , leaving insufficient capacity to accept candidates not in who arrive later. Conversely, if we reject , we deny it the possibility of acceptance in the first half of the time horizon, potentially reducing its overall acceptance probability. -Pegging balances this tradeoff while achieving smoothness.
To establish smoothness, we demonstrate that the candidates with the highest predicted values can be mapped to the solution set , ensuring that the true values within our solution set are pairwise “close” to the values of candidates in . This is proven in Lemma8 by constructing an injective function from set to such that for each , .
5 Experiments
We simulate our Additive-Pegging and Multiplicative-Pegging algorithms in the exact experimental setup of Fujii and Yoshida (2023), to test their average-case performance.
Experimental Setup. Fujii and Yoshida (2023) generate various types of instances. We follow their Almost-constant, Uniform, and Adversarial types of instances, and also create the Unfair type of instance to further highlight how slightly biased predictions can lead to very unfair outcomes. Both true and predicted values of candidates in all these instance types are parameterized by a scalar $\varepsilon$ which controls the prediction error.
Setting $\varepsilon = 0$ creates instances with perfect predictions and setting a higher value of $\varepsilon$ creates instances with more erroneous predictions.
Almost-constant models a situation where one candidate has a true value of and the rest of the candidates have a value of . All predictions are set to . In Uniform, we sample each independently from the exponential distribution with parameter . The exponential distribution generates a large value with a small probability and consequently models a situation where one candidate is significantly better than the rest. All predicted values are generated by perturbing the actual value with the uniform distribution, i.e., , where is sampled uniformly and independently from .
In Adversarial, the true values are again independent samples from the exponential distribution with parameter . The predictions are “adversarially” perturbed while maintaining the error to be at most in the following manner: if belongs to the top half of candidates in terms of true value, then ; if belongs to the bottom half, then . Finally, in Unfair all candidates have values that are at most a multiplicative factor apart. Formally, is a uniform value in , and since we have that the smallest and largest value are indeed very close.
We set where is the rank of , i.e., predictions create a completely inverted order.
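The Uniform and Adversarial generators described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the unit rate of the exponential distribution, the perturbation interval [1 - eps, 1 + eps], and the factors (1 - eps) and (1 + eps) are all assumptions, because the exact constants do not appear in this text. Almost-constant and Unfair are omitted for the same reason.

```python
import numpy as np


def uniform_instance(n, eps, rng):
    """Uniform type: exponential true values with independently perturbed predictions."""
    # ASSUMPTION: exponential distribution with rate 1 (scale 1.0).
    values = rng.exponential(scale=1.0, size=n)
    # ASSUMPTION: multiplicative noise drawn uniformly from [1 - eps, 1 + eps].
    delta = rng.uniform(1.0 - eps, 1.0 + eps, size=n)
    return values, delta * values


def adversarial_instance(n, eps, rng):
    """Adversarial type: top half deflated, bottom half inflated."""
    # ASSUMPTION: exponential distribution with rate 1 (scale 1.0).
    values = rng.exponential(scale=1.0, size=n)
    order = np.argsort(-values)          # indices sorted from highest to lowest value
    top = np.zeros(n, dtype=bool)
    top[order[: n // 2]] = True          # mark the top half by true value
    # ASSUMPTION: factors (1 - eps) for the top half and (1 + eps) for the bottom half.
    predictions = np.where(top, (1.0 - eps) * values, (1.0 + eps) * values)
    return values, predictions
```

With eps = 0 both generators return predictions equal to the true values, which matches the description of perfect predictions at the smallest error setting.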
We compare Additive-Pegging and Multiplicative-Pegging against Learned-Dynkin (Fujii and Yoshida, 2023), Highest-prediction, which always accepts the candidate with the highest prediction, and the classical Dynkin algorithm, which does not use the predictions. Following Fujii and Yoshida (2023), we set the number of candidates to be . We experiment with all values of in . For each type of instance and value of in this set, we randomly generate 10000 instances and run each algorithm on each instance. For each algorithm, we compute instance-wise the ratio of the true value it accepted to the maximum true value, and call the average of this ratio across the 10000 instances its competitive ratio. We also compute the fraction of the 10000 instances on which it accepted the candidate with the highest true value, and call this fraction its fairness. We report the competitive ratio and fairness of each algorithm, for each type of instance and each value of , in Figure 1. Our code is written in Python 3.11.5 and we conduct experiments on an M3 Pro CPU with 18 GB of RAM. The total runtime is less than minutes.
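The two metrics and the two simplest baselines can be sketched as below. This is an illustrative sketch under stated assumptions, not the paper's implementation: the function names and the `(values, predictions)` calling convention are hypothetical, each instance is assumed to be stored in uniformly random arrival order, and an algorithm that accepts no candidate is assumed to receive value 0. The pegging algorithms and Learned-Dynkin are not reproduced here.

```python
import math
import numpy as np


def dynkin(values, predictions=None):
    """Classical secretary rule: skip the first floor(n/e) arrivals, then accept
    the first arrival that beats every earlier one. Returns an index or None."""
    n = len(values)
    cutoff = int(n / math.e)
    best_seen = max(values[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if values[i] > best_seen:
            return i
    return None  # the best candidate was in the skipped prefix


def highest_prediction(values, predictions):
    """Accept the candidate whose prediction is highest, ignoring true values."""
    return int(np.argmax(predictions))


def evaluate(choose, instances):
    """Return (competitive ratio, fairness) averaged over the given instances."""
    ratios, hits = [], []
    for values, predictions in instances:
        idx = choose(values, predictions)
        accepted = 0.0 if idx is None else values[idx]  # ASSUMPTION: no hire counts as 0
        ratios.append(accepted / values.max())
        hits.append(idx is not None and idx == int(np.argmax(values)))
    return float(np.mean(ratios)), float(np.mean(hits))
```

Calling `evaluate(dynkin, instances)` on many independently generated instances should give a fairness estimate near 1/e when the number of candidates is large, which is the classical guarantee of the Dynkin rule.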
Results. The results are summarized in Figure 1. Since Additive-Pegging and Multiplicative-Pegging achieve almost the same competitive ratio and fairness for all instance types and values of , we only present Additive-Pegging in Figure 1 but include the code of both in the supplementary material.
Our algorithms are consistently either the best or close to the best in terms of both competitive ratio and fairness for all different pairs of instance types and values.
Before discussing the results for each instance type individually, it is instructive to mention some characteristics of our benchmarks. While Dynkin does not use predictions and is therefore bound to suboptimal competitive ratios when predictions are accurate, we note that it accepts the maximum value candidate with probability at least 1/e, i.e., it is 1/e-fair. When predictions are non-informative, this is an upper bound on the attainable fairness for any algorithm, whether it uses predictions or not. Highest-prediction is expected to perform well when the candidate with the highest prediction is also the candidate with the highest true value, and poorly otherwise.
In Almost-constant, for small values of all candidates have very close true values and all algorithms except Dynkin have a competitive ratio close to . Dynkin may not accept any candidate, which is why its performance is poorer than that of the other algorithms. Note that as increases, both our algorithms perform significantly better than all other benchmarks. In terms of fairness, predictions do not offer any information regarding the ordinal comparison between candidates’ true values, which is why for small values of the probability that Highest-prediction or Learned-Dynkin accepts the best candidate is close to , i.e., random. Here, the fairness of our algorithms and Dynkin is similar and close to .
In both Uniform and Adversarial, we observe that for small values of the candidate with the highest prediction is also the true best, and Additive-Pegging, Learned-Dynkin, and Highest-prediction all accept that candidate, so their performance is very close in terms of both fairness and competitive ratio. For higher values of , the fairness of those algorithms deteriorates similarly and again approaches .
In Unfair, our algorithms outperform all other benchmarks in terms of competitive ratio for all values of and achieve close to optimal fairness. This is expected, as our algorithms are particularly suited to cases where predictions may be accurate but unfair.
Figure 1: Competitive ratio and fairness of different algorithms, for each instance type and level of .
Overall, our algorithms are the best-performing and most robust. The Highest-prediction algorithm does perform slightly better on Uniform instances and Adversarial instances under most values of , but performs consistently worse on Almost-constant and Unfair instances, especially in terms of fairness. Our algorithms perform better than Learned-Dynkin in almost all situations.
6 Limitations and Future Work
We study a notion of fairness that is tailored to the secretary problem with predictions and build our algorithms based on this notion.
However, there are alternative notions of fairness one could consider in applications such as hiring, as well as variations of the secretary problem that capture other features in these applications.
While our model allows for arbitrary bias in the predictions, we assume that the true value of a candidate is fully discovered upon arrival, and define fairness based on hiring the best candidate (who has the highest true value) with a reasonable probability. Thus, we ignore considerations such as bias in how we get the true value of a candidate (e.g., via an interview process).
In addition, as noted in Section 1, we use an individual fairness notion which does not model other natural desiderata like hiring from underprivileged populations or balancing the hiring probabilities across different populations. These are considerations with potentially high societal impact which our algorithms do not consider and are interesting directions for future work on fair selection with predictions.
Regarding trade-offs in our guarantees: for the single-secretary problem, we can improve the fairness guarantee from to by optimizing the constants in our algorithm. However, we choose not to do so, as the performance increase is marginal, and we aim to keep the proof as simple as possible.
Additionally, as we noted in Section 1.1, any constant for smoothness implies an upper bound of for fairness. Finding the Pareto-optimal curve in terms of smoothness and fairness is an interesting direction. The main challenge in achieving a smooth trade-off between fairness and smoothness is as follows: any bound on for smoothness implies a competitive ratio of , which reaches a ratio of 1 when the predictions are exactly correct. Thus, regardless of the smoothness guarantee, we must achieve a competitive ratio of 1 when predictions are fully accurate. This constraint makes it challenging to improve the fairness guarantee , even at the cost of a less favorable smoothness constant .
Antoniadis et al. [2020a]
Antonios Antoniadis, Christian Coester, Marek Elias, Adam Polak, and Bertrand Simon.
Online metric algorithms with untrusted predictions.
In Proceedings of the 37th International Conference on Machine Learning (ICML), pages 345–355, 2020a.
Antoniadis et al. [2020b]
Antonios Antoniadis, Themis Gouleakis, Pieter Kleer, and Pavel Kolev.
Secretary and online matching problems with machine learned advice.
In Proceedings of the 33rd Annual Conference on Neural Information Processing Systems (NeurIPS), pages 7933–7944, 2020b.
Arsenis and Kleinberg [2022]
Makis Arsenis and Robert Kleinberg.
Individual fairness in prophet inequalities.
In Proceedings of the 23rd ACM Conference on Economics and Computation (EC), page 245, 2022.
Azar et al. [2018]
Yossi Azar, Ashish Chiplunkar, and Haim Kaplan.
Prophet secretary: Surpassing the 1-1/e barrier.
In Proceedings of the 19th ACM Conference on Economics and Computation (EC), pages 303–318, 2018.
Babaioff et al. [2007a]
Moshe Babaioff, Nicole Immorlica, David Kempe, and Robert Kleinberg.
A knapsack secretary problem with applications.
In Proceedings of the 10th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), pages 16–28, 2007a.
Babaioff et al. [2007b]
Moshe Babaioff, Nicole Immorlica, and Robert Kleinberg.
Matroids, secretary problems, and online mechanisms.
In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 434–443, 2007b.
Balseiro et al. [2021]
Santiago Balseiro, Haihao Lu, and Vahab Mirrokni.
Regularized online allocation problems: Fairness and beyond.
In Proceedings of the 38th International Conference on Machine Learning (ICML), pages 630–639, 2021.
Bertsimas et al. [2011]
Dimitris Bertsimas, Vivek F. Farias, and Nikolaos Trichakis.
The price of fairness.
Operations Research, 59(1):17–31, 2011.
Bertsimas et al. [2012]
Dimitris Bertsimas, Vivek F. Farias, and Nikolaos Trichakis.
On the efficiency-fairness trade-off.
Management Science, 58(12):2234–2250, 2012.
Bogen and Rieke [2018]
Miranda Bogen and Aaron Rieke.
Help wanted: An examination of hiring algorithms, equity, and bias.
Upturn, December 2018.
Celis et al. [2018]
L. Elisa Celis, Damian Straszak, and Nisheeth K. Vishnoi.
Ranking with fairness constraints.
In Proceedings of the 45th International Colloquium on Automata, Languages and Programming (ICALP), 2018.
Chierichetti et al. [2019]
Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, and Sergei Vassilvitskii.
Matroids, matchings, and fairness.
In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), pages 2212–2220, 2019.
Choo and Ling [2024]
Davin Choo and Chun Kai Ling.
A short note about the learning-augmented secretary problem.
arXiv preprint arXiv:2410.06583, 2024.
Chouldechova [2017]
Alexandra Chouldechova.
Fair prediction with disparate impact: A study of bias in recidivism prediction instruments.
Big Data, 5(2):153–163, June 2017.
Chouldechova et al. [2018]
Alexandra Chouldechova, Diana Benavides-Prado, Oleksandr Fialko, and Rhema Vaithianathan.
A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions.
In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (FAccT), pages 134–148, 2018.
Cohen et al. [2019]
Lee Cohen, Zachary C. Lipton, and Yishay Mansour.
Efficient candidate screening under multiple tests and implications for fairness.
In Proceedings of the 1st Symposium on Foundations of Responsible Computing (FORC), pages 1–20, 2019.
Cohen et al. [2022]
Maxime C. Cohen, Adam N. Elmachtoub, and Xiao Lei.
Price discrimination with fairness constraints.
Management Science, 68(12):8536–8552, 2022.
Corbett-Davies et al. [2017]
Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq.
Algorithmic decision making and the cost of fairness.
In Proceedings of the 23rd International Conference on Knowledge Discovery and Data Mining (KDD), pages 797–806, 2017.
Correa et al. [2021a]
José R. Correa, Andrés Cristi, Laurent Feuilloley, Tim Oosterwijk, and Alexandros Tsigonias-Dimitriadis.
The secretary problem with independent sampling.
In Proceedings of the 32nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2047–2058, 2021a.
Correa et al. [2021b]
José R. Correa, Patricio Foncea, Ruben Hoeksma, Tim Oosterwijk, and Tjark Vredeveld.
Posted price mechanisms and optimal threshold strategies for random arrivals.
Mathematics of Operations Research, 46(4):1452–1478, 2021b.
Correa et al. [2021c]
José R. Correa, Raimundo Saona, and Bruno Ziliotto.
Prophet secretary through blind strategies.
Mathematical Programming, 190(1):483–521, 2021c.
Dressel and Farid [2018]
Julia Dressel and Hany Farid.
The accuracy, fairness, and limits of predicting recidivism.
Science Advances, 4(1), 2018.
Dütting et al. [2021]
Paul Dütting, Silvio Lattanzi, Renato Paes Leme, and Sergei Vassilvitskii.
Secretaries with advice.
In Proceedings of the 22nd ACM Conference on Economics and Computation (EC), pages 409–429, 2021.
Dwork et al. [2012]
Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel.
Fairness through awareness.
In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS), pages 214–226, 2012.
Dynkin [1963]
E. B. Dynkin.
The optimum choice of the instant for stopping a Markov process.
Soviet Mathematics Doklady, 4:627–629, 1963.
Esfandiari et al. [2017]
Hossein Esfandiari, MohammadTaghi Hajiaghayi, Vahid Liaghat, and Morteza Monemizadeh.
Prophet secretary.
SIAM Journal on Discrete Mathematics (SIDMA), 31(3):1685–1701, 2017.
Feldman et al. [2015]
Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian.
Certifying and removing disparate impact.
In Proceedings of the 21st International Conference on Knowledge Discovery and Data Mining (KDD), pages 259–268, 2015.
Fujii and Yoshida [2023]
Kaito Fujii and Yuichi Yoshida.
The secretary problem with predictions.
Mathematics of Operations Research, 2023.
Gardner [1960]
Martin Gardner.
Mathematical games.
Scientific American, 202(3):172–186, 1960.
Gilbert and Mosteller [1966]
John P. Gilbert and Frederick Mosteller.
Recognizing the maximum of a sequence.
Journal of the American Statistical Association, 61:35–73, 1966.
Grossman [2010]
Lev Grossman.
Are face-detection cameras racist?
Time, 2010.
Howard and Borenstein [2018]
A Howard and J Borenstein.
The ugly truth about ourselves and our robot creations: The problem of bias and social inequity.
Science and Engineering Ethics, 24(5):1521–1536, October 2018.
Joseph et al. [2016]
Matthew Joseph, Michael Kearns, Jamie H Morgenstern, and Aaron Roth.
Fairness in learning: Classic and contextual bandits.
In Proceedings of the 29th Annual Conference on Neural Information Processing Systems (NeurIPS), 2016.
Kamishima et al. [2011]
Toshihiro Kamishima, Shotaro Akaho, and Jun Sakuma.
Fairness-aware learning through regularization approach.
In Proceedings of the 11th International Conference on Data Mining Workshops (ICDMW), pages 643–650, 2011.
Kamishima et al. [2012]
Toshihiro Kamishima, Shotaro Akaho, Hideki Asoh, and Jun Sakuma.
Fairness-aware classifier with prejudice remover regularizer.
In Peter A. Flach, Tijl De Bie, and Nello Cristianini, editors, Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), volume 7524, 2012.
Kaplan et al. [2020]
Haim Kaplan, David Naori, and Danny Raz.
Competitive analysis with a sample and the secretary problem.
In Proceedings of the 31st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 2082–2095, 2020.
Kleinberg and Raghavan [2017]
Jon Kleinberg and Manish Raghavan.
Inherent trade-offs in the fair determination of risk scores.
In Proceedings of the 8th Innovations in Theoretical Computer Science Conference (ITCS), 2017.
Kleinberg et al. [2017]
Jon Kleinberg, Himabindu Lakkaraju, Jure Leskovec, Jens Ludwig, and Sendhil Mullainathan.
Human decisions and machine predictions.
The Quarterly Journal of Economics, 133(1):237–293, 2017.
Kleinberg [2005]
Robert D. Kleinberg.
A multiple-choice secretary algorithm with applications to online auctions.
In Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 630–631, 2005.
Lambrecht and Tucker [2019]
Anja Lambrecht and Catherine Tucker.
Algorithmic bias? An empirical study of apparent gender-based discrimination in the display of STEM career ads.
Management Science, 65(7):2966–2981, 2019.
Lindley [1961]
D. V. Lindley.
Dynamic programming and decision theory.
Journal of the Royal Statistical Society: Series C (Applied Statistics), 10(1):39–51, 1961.
Luong et al. [2011]
Binh Thanh Luong, Salvatore Ruggieri, and Franco Turini.
k-NN as an implementation of situation testing for discrimination discovery and prevention.
In Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining (KDD), pages 502–510, 2011.
Lykouris and Vassilvitskii [2018]
Thodoris Lykouris and Sergei Vassilvitskii.
Competitive caching with machine learned advice.
In Proceedings of the 35th International Conference on Machine Learning (ICML), pages 3302–3311, 2018.
Ma et al. [2022]
Will Ma, Pan Xu, and Yifan Xu.
Group-level fairness maximization in online bipartite matching.
In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, pages 1687–1689, 2022.
Malhotra and Malhotra [2003]
Rashmi Malhotra and D. K. Malhotra.
Evaluating con