-
Unbiased Learning to Rank Meets Reality: Lessons from Baidu's Large-Scale Search Dataset
Authors:
Philipp Hager,
Romain Deffayet,
Jean-Michel Renders,
Onno Zoeter,
Maarten de Rijke
Abstract:
Unbiased learning-to-rank (ULTR) is a well-established framework for learning from user clicks, which are often biased by the ranker collecting the data. While theoretically justified and extensively tested in simulation, ULTR techniques lack empirical validation, especially on modern search engines. The Baidu-ULTR dataset released for the WSDM Cup 2023, collected from Baidu's search engine, offers a rare opportunity to assess the real-world performance of prominent ULTR techniques. Despite multiple submissions during the WSDM Cup 2023 and the subsequent NTCIR ULTRE-2 task, it remains unclear whether the observed improvements stem from applying ULTR or other learning techniques.
In this work, we revisit and extend the available experiments on the Baidu-ULTR dataset. We find that standard unbiased learning-to-rank techniques robustly improve click predictions but struggle to consistently improve ranking performance, especially considering the stark differences obtained by choice of ranking loss and query-document features. Our experiments reveal that gains in click prediction do not necessarily translate to enhanced ranking performance on expert relevance annotations, implying that conclusions strongly depend on how success is measured in this benchmark.
Submitted 15 May, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
MIRT: a simultaneous reconstruction and affine motion compensation technique for four dimensional computed tomography (4DCT)
Authors:
Anh-Tuan Nguyen,
Jens Renders,
Domenico Iuso,
Yves Maris,
Jeroen Soete,
Martine Wevers,
Jan Sijbers,
Jan De Beenhouwer
Abstract:
In four-dimensional computed tomography (4DCT), 3D images of moving or deforming samples are reconstructed from a set of 2D projection images. Recent techniques for iterative motion-compensated reconstruction either necessitate a reference acquisition or alternate image reconstruction and motion estimation steps. In these methods, the motion estimation step involves the estimation of either complete deformation vector fields (DVFs) or a limited set of parameters corresponding to the affine motion, including rigid motion or scaling. The majority of these approaches rely on nested iterations, incurring significant computational expenses. Notably, despite the direct benefits of an analytical formulation and a substantial reduction in computational complexity, there has been no exploration into parameterizing DVFs for general affine motion in CT imaging. In this work, we propose the Motion-compensated Iterative Reconstruction Technique (MIRT), an efficient iterative reconstruction scheme that combines image reconstruction and affine motion estimation in a single update step, based on the analytical gradients of the motion towards both the reconstruction and the affine motion parameters. While most state-of-the-art 4DCT methods have not been tested on real data, results from simulation and real experiments show that our method outperforms state-of-the-art CT reconstruction methods with affine motion correction in computational feasibility and projection distance. In particular, this allows accurate reconstruction of a real microscale diamond in the presence of motion from experimentally acquired projection radiographs, which leads to a novel application of 4DCT.
Submitted 6 February, 2024;
originally announced February 2024.
-
SLIM: Skill Learning with Multiple Critics
Authors:
David Emukpere,
Bingbing Wu,
Julien Perez,
Jean-Michel Renders
Abstract:
Self-supervised skill learning aims to acquire useful behaviors that leverage the underlying dynamics of the environment. Latent variable models, based on mutual information maximization, have been successful in this task but still struggle in the context of robotic manipulation. Because manipulation requires affecting a possibly large set of degrees of freedom in the environment, mutual information maximization alone fails to produce useful and safe manipulation behaviors. Furthermore, tackling this by naively combining skill discovery rewards with additional rewards might fail to produce the desired behaviors. To address this limitation, we introduce SLIM, a multi-critic learning approach for skill discovery with a particular focus on robotic manipulation. Our main insight is that utilizing multiple critics in an actor-critic framework to gracefully combine multiple reward functions leads to a significant improvement in latent-variable skill discovery for robotic manipulation while overcoming possible interference among rewards, which hinders convergence to useful skills. Furthermore, in the context of tabletop manipulation, we demonstrate the applicability of our novel skill discovery approach to acquire safe and efficient motor primitives in a hierarchical reinforcement learning fashion and leverage them through planning, significantly surpassing baseline approaches for skill discovery.
Submitted 21 March, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
HIST-Critical Graphs and Malkevitch's Conjecture
Authors:
Jan Goedgebeur,
Kenta Noguchi,
Jarne Renders,
Carol T. Zamfirescu
Abstract:
In a given graph, a HIST is a spanning tree without $2$-valent vertices. Motivated by developing a better understanding of HIST-free graphs, i.e. graphs containing no HIST, in this article's first part we study HIST-critical graphs, i.e. HIST-free graphs in which every vertex-deleted subgraph does contain a HIST (e.g. a triangle). We give an almost complete characterisation of the orders for which these graphs exist and present an infinite family of planar examples which are $3$-connected and in which nearly all vertices are $4$-valent. This leads naturally to the second part in which we investigate planar $4$-regular graphs with and without HISTs, motivated by a conjecture of Malkevitch, which we computationally verify up to order $22$. First we enumerate HISTs in antiprisms, whereafter we present planar $4$-regular graphs with and without HISTs, obtained via line graphs.
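The definitions in this abstract can be checked by brute force on very small graphs. The following is only an illustrative sketch, not the authors' enumeration method: the function names `has_hist`, `delete_vertex`, and `is_hist_critical` are hypothetical, and the exhaustive search over edge subsets is feasible only for tiny inputs.

```python
from itertools import combinations

def has_hist(n, edges):
    """True if the graph on vertices 0..n-1 has a spanning tree with no 2-valent vertex."""
    for tree in combinations(edges, n - 1):
        parent = list(range(n))  # union-find forest used to detect cycles

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        degree = [0] * n
        acyclic = True
        for u, v in tree:
            ru, rv = find(u), find(v)
            if ru == rv:  # this edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
            degree[u] += 1
            degree[v] += 1
        # n - 1 acyclic edges on n vertices always form a spanning tree
        if acyclic and 2 not in degree:
            return True
    return False

def delete_vertex(n, edges, w):
    """Remove vertex w and relabel the remaining vertices to 0..n-2."""
    relabel = lambda x: x if x < w else x - 1
    kept = [(relabel(u), relabel(v)) for u, v in edges if w not in (u, v)]
    return n - 1, kept

def is_hist_critical(n, edges):
    """HIST-free, but every vertex-deleted subgraph contains a HIST."""
    if has_hist(n, edges):
        return False
    return all(has_hist(*delete_vertex(n, edges, w)) for w in range(n))

triangle = [(0, 1), (1, 2), (0, 2)]
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(is_hist_critical(3, triangle), has_hist(4, k4))
```

The triangle is HIST-free because each of its spanning trees is a path with a 2-valent middle vertex, while deleting any vertex leaves a single edge, which is itself a HIST. In K4 the star centred at any vertex is a HIST.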
Submitted 9 January, 2024;
originally announced January 2024.
-
SARDINE: A Simulator for Automated Recommendation in Dynamic and Interactive Environments
Authors:
Romain Deffayet,
Thibaut Thonet,
Dongyoon Hwang,
Vassilissa Lehoux,
Jean-Michel Renders,
Maarten de Rijke
Abstract:
Simulators can provide valuable insights for researchers and practitioners who wish to improve recommender systems, because they allow one to easily tweak the experimental setup in which recommender systems operate, and as a result lower the cost of identifying general trends and uncovering novel findings about the candidate methods. A key requirement to enable this accelerated improvement cycle is that the simulator is able to span the various sources of complexity that can be found in the real recommendation environment that it simulates.
With the emergence of interactive and data-driven methods - e.g., reinforcement learning or online and counterfactual learning-to-rank - that aim to achieve user-related goals beyond the traditional accuracy-centric objectives, adequate simulators are needed. In particular, such simulators must model the various mechanisms that render the recommendation environment dynamic and interactive, e.g., the effect of recommendations on the user or the effect of biased data on subsequent iterations of the recommender system. We therefore propose SARDINE, a flexible and interpretable recommendation simulator that can help accelerate research in interactive and data-driven recommender systems. We demonstrate its usefulness by studying existing methods within nine diverse environments derived from SARDINE, and even uncover novel insights about them.
Submitted 8 April, 2024; v1 submitted 28 November, 2023;
originally announced November 2023.
-
Generation and New Infinite Families of $K_2$-hypohamiltonian Graphs
Authors:
Jan Goedgebeur,
Jarne Renders,
Carol T. Zamfirescu
Abstract:
We present an algorithm which can generate all pairwise non-isomorphic $K_2$-hypohamiltonian graphs, i.e. non-hamiltonian graphs in which the removal of any pair of adjacent vertices yields a hamiltonian graph, of a given order. We introduce new bounding criteria specifically designed for $K_2$-hypohamiltonian graphs, allowing us to improve upon earlier computational results. Specifically, we characterise the orders for which $K_2$-hypohamiltonian graphs exist and improve existing lower bounds on the orders of the smallest planar and the smallest bipartite $K_2$-hypohamiltonian graphs. Furthermore, we describe a new operation for creating $K_2$-hypohamiltonian graphs that preserves planarity under certain conditions and use it to prove the existence of a planar $K_2$-hypohamiltonian graph of order $n$ for every integer $n\geq 134$. Additionally, motivated by a theorem of Thomassen on hypohamiltonian graphs, we show the existence of $K_2$-hypohamiltonian graphs with large maximum degree and size.
Submitted 28 February, 2024; v1 submitted 17 November, 2023;
originally announced November 2023.
-
$K_2$-Hamiltonian Graphs: II
Authors:
Jan Goedgebeur,
Jarne Renders,
Gábor Wiener,
Carol T. Zamfirescu
Abstract:
In this paper we use theoretical and computational tools to continue our investigation of $K_2$-hamiltonian graphs, i.e. graphs in which the removal of any pair of adjacent vertices yields a hamiltonian graph, and their interplay with $K_1$-hamiltonian graphs, i.e. graphs in which every vertex-deleted subgraph is hamiltonian. Perhaps surprisingly, there exist graphs that are both $K_1$- and $K_2$-hamiltonian, yet non-hamiltonian, e.g. the Petersen graph. Grünbaum conjectured that every planar $K_1$-hamiltonian graph must itself be hamiltonian; Thomassen disproved this conjecture. Here we show that even planar graphs that are both $K_1$- and $K_2$-hamiltonian need not be hamiltonian, and that the number of such graphs grows at least exponentially. Motivated by results of Aldred, McKay, and Wormald, we determine for every integer $n$ that is not 14 or 17 whether there exists a $K_2$-hypohamiltonian, i.e. non-hamiltonian and $K_2$-hamiltonian, graph of order $n$, and characterise all orders for which such cubic graphs and such snarks exist. We also describe the smallest cubic planar graph which is $K_2$-hypohamiltonian, as well as the smallest planar $K_2$-hypohamiltonian graph of girth $5$. We conclude with open problems and by correcting two inaccuracies from the first article [Zamfirescu, SIAM J. Disc. Math. 35 (2021) 1706-1728].
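The Petersen graph example from this abstract can be verified directly with a small backtracking search. This is a minimal sketch under stated assumptions: the helper names (`is_hamiltonian`, `remove`, `is_k1_hamiltonian`, `is_k2_hamiltonian`, `petersen`) are hypothetical, and the exhaustive search is practical only for graphs of about this size. It is not the generation algorithm used in the paper.

```python
def is_hamiltonian(adj):
    """adj maps each vertex to its set of neighbours; exhaustive backtracking search."""
    vertices = list(adj)
    n = len(vertices)
    if n < 3:
        return False
    start = vertices[0]

    def extend(v, visited):
        if len(visited) == n:
            return start in adj[v]  # close the cycle
        return any(extend(w, visited | {w}) for w in adj[v] if w not in visited)

    return extend(start, {start})

def remove(adj, removed):
    """Induced subgraph after deleting the vertex set `removed`."""
    return {v: nbrs - removed for v, nbrs in adj.items() if v not in removed}

def is_k1_hamiltonian(adj):
    """Every vertex-deleted subgraph is hamiltonian."""
    return all(is_hamiltonian(remove(adj, {v})) for v in adj)

def is_k2_hamiltonian(adj):
    """Removing any pair of adjacent vertices leaves a hamiltonian graph."""
    return all(is_hamiltonian(remove(adj, {u, v}))
               for u in adj for v in adj[u] if u < v)

def petersen():
    """Outer 5-cycle 0..4, spokes i -- i+5, inner pentagram on 5..9."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        for u, v in ((i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)):
            adj[u].add(v)
            adj[v].add(u)
    return adj

G = petersen()
print(is_hamiltonian(G), is_k1_hamiltonian(G), is_k2_hamiltonian(G))
```

In the terminology of the abstract, a graph for which the first value is false and the last is true is $K_2$-hypohamiltonian.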
Submitted 9 November, 2023;
originally announced November 2023.
-
Distributional Reinforcement Learning with Dual Expectile-Quantile Regression
Authors:
Sami Jullien,
Romain Deffayet,
Jean-Michel Renders,
Paul Groth,
Maarten de Rijke
Abstract:
Distributional reinforcement learning (RL) has proven useful in multiple benchmarks as it enables approximating the full distribution of returns and makes better use of environment samples. The commonly used quantile regression approach to distributional RL -- based on asymmetric $L_1$ losses -- provides a flexible and effective way of learning arbitrary return distributions. In practice, it is often improved by using a more efficient, hybrid asymmetric $L_1$-$L_2$ Huber loss for quantile regression. However, by doing so, distributional estimation guarantees vanish, and we empirically observe that the estimated distribution rapidly collapses to its mean. Indeed, asymmetric $L_2$ losses, corresponding to expectile regression, cannot be readily used for distributional temporal difference learning. Motivated by the efficiency of $L_2$-based learning, we propose to jointly learn expectiles and quantiles of the return distribution in a way that allows efficient learning while keeping an estimate of the full distribution of returns. We prove that our approach approximately learns the correct return distribution, and we benchmark a practical implementation on a toy example and at scale. On the Atari benchmark, our approach matches the performance of the Huber-based IQN-1 baseline after $200$M training frames but avoids distributional collapse and keeps estimates of the full distribution of returns.
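The two loss families named in this abstract can be illustrated on a fixed sample. The sketch below shows only the standard asymmetric $L_1$ (quantile, or pinball) and asymmetric $L_2$ (expectile) losses; it is not the joint expectile-quantile method proposed in the paper. The function names, the sample, and the grid search are assumptions made for illustration.

```python
def quantile_loss(pred, samples, tau):
    """Asymmetric L1 (pinball) loss: minimised by the tau-quantile of the samples."""
    return sum((tau if y >= pred else 1 - tau) * abs(y - pred) for y in samples) / len(samples)

def expectile_loss(pred, samples, tau):
    """Asymmetric L2 loss: minimised by the tau-expectile of the samples."""
    return sum((tau if y >= pred else 1 - tau) * (y - pred) ** 2 for y in samples) / len(samples)

def argmin_on_grid(loss, samples, tau, grid):
    """Crude minimiser: evaluate the loss at every grid point and keep the best."""
    return min(grid, key=lambda p: loss(p, samples, tau))

samples = [0, 1, 2, 3, 14]            # skewed sample: median 2, mean 4
grid = [i / 10 for i in range(141)]   # candidate predictions 0.0, 0.1, ..., 14.0

median = argmin_on_grid(quantile_loss, samples, 0.5, grid)
mean = argmin_on_grid(expectile_loss, samples, 0.5, grid)
print(median, mean)
```

At $\tau = 0.5$ the quantile loss recovers the median and the expectile loss recovers the mean, which is why the two statistics differ on a skewed sample like this one.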
Submitted 18 March, 2024; v1 submitted 26 May, 2023;
originally announced May 2023.
-
Frank number and nowhere-zero flows on graphs
Authors:
Jan Goedgebeur,
Edita Máčajová,
Jarne Renders
Abstract:
An edge $e$ of a graph $G$ is called deletable for some orientation $o$ if the restriction of $o$ to $G-e$ is a strong orientation. Inspired by a problem of Frank, in 2021 Hörsch and Szigeti proposed a new parameter for $3$-edge-connected graphs, called the Frank number, which refines $k$-edge-connectivity. The Frank number is defined as the minimum number of orientations of $G$ for which every edge of $G$ is deletable in at least one of them. They showed that every $3$-edge-connected graph has Frank number at most $7$ and that in case these graphs are also $3$-edge-colourable the parameter is at most $3$. Here we strengthen both results by showing that every $3$-edge-connected graph has Frank number at most $4$ and that every graph which is $3$-edge-connected and $3$-edge-colourable has Frank number $2$. The latter also confirms a conjecture by Barát and Blázsik. Furthermore, we prove two sufficient conditions for cubic graphs to have Frank number $2$ and use them in an algorithm to computationally show that the Petersen graph is the only cyclically $4$-edge-connected cubic graph up to $36$ vertices having Frank number greater than $2$.
Submitted 10 November, 2023; v1 submitted 3 May, 2023;
originally announced May 2023.
-
An Offline Metric for the Debiasedness of Click Models
Authors:
Romain Deffayet,
Philipp Hager,
Jean-Michel Renders,
Maarten de Rijke
Abstract:
A well-known problem when learning from user clicks is the presence of inherent biases in the data, such as position or trust bias. Click models are a common method for extracting information from user clicks, such as document relevance in web search, or to estimate click biases for downstream applications such as counterfactual learning-to-rank, ad placement, or fair ranking. Recent work shows that the current evaluation practices in the community fail to guarantee that a well-performing click model generalizes well to downstream tasks in which the ranking distribution differs from the training distribution, i.e., under covariate shift. In this work, we propose an evaluation metric based on conditional independence testing to detect a lack of robustness to covariate shift in click models. We introduce the concept of debiasedness and a metric for measuring it. We prove that debiasedness is a necessary condition for recovering unbiased and consistent relevance scores and for the invariance of click prediction under covariate shift. In extensive semi-synthetic experiments, we show that our proposed metric helps to predict the downstream performance of click models under covariate shift and is useful in an off-policy model selection setting.
Submitted 11 May, 2023; v1 submitted 19 April, 2023;
originally announced April 2023.
-
Generative Slate Recommendation with Reinforcement Learning
Authors:
Romain Deffayet,
Thibaut Thonet,
Jean-Michel Renders,
Maarten de Rijke
Abstract:
Recent research has employed reinforcement learning (RL) algorithms to optimize long-term user engagement in recommender systems, thereby avoiding common pitfalls such as user boredom and filter bubbles. They capture the sequential and interactive nature of recommendations, and thus offer a principled way to deal with long-term rewards and avoid myopic behaviors. However, RL approaches are intractable in the slate recommendation scenario - where a list of items is recommended at each interaction turn - due to the combinatorial action space. In that setting, an action corresponds to a slate that may contain any combination of items.
While previous work has proposed well-chosen decompositions of actions so as to ensure tractability, these rely on restrictive and sometimes unrealistic assumptions. Instead, in this work we propose to encode slates in a continuous, low-dimensional latent space learned by a variational auto-encoder. Then, the RL agent selects continuous actions in this latent space, which are ultimately decoded into the corresponding slates. By doing so, we are able to (i) relax assumptions required by previous work, and (ii) improve the quality of the action selection by modeling full slates instead of independent items, in particular by enabling diversity. Our experiments performed on a wide array of simulated environments confirm the effectiveness of our generative modeling of slates over baselines in practical scenarios where the restrictive assumptions underlying the baselines are lifted. Our findings suggest that representation learning using generative models is a promising direction towards generalizable RL-based slate recommendation.
Submitted 24 January, 2023; v1 submitted 20 January, 2023;
originally announced January 2023.
-
Offline Evaluation for Reinforcement Learning-based Recommendation: A Critical Issue and Some Alternatives
Authors:
Romain Deffayet,
Thibaut Thonet,
Jean-Michel Renders,
Maarten de Rijke
Abstract:
In this paper, we argue that the paradigm commonly adopted for offline evaluation of sequential recommender systems is unsuitable for evaluating reinforcement learning-based recommenders. We find that most of the existing offline evaluation practices for reinforcement learning-based recommendation are based on a next-item prediction protocol, and detail three shortcomings of such an evaluation protocol. Notably, it cannot reflect the potential benefits that reinforcement learning (RL) is expected to bring while it hides critical deficiencies of certain offline RL agents. Our suggestions for alternative ways to evaluate RL-based recommender systems aim to shed light on the existing possibilities and inspire future research on reliable evaluation protocols.
Submitted 3 January, 2023;
originally announced January 2023.
-
Pareto-Optimal Fairness-Utility Amortizations in Rankings with a DBN Exposure Model
Authors:
Till Kletti,
Jean-Michel Renders,
Patrick Loiseau
Abstract:
In recent years, it has become clear that rankings delivered in many areas need not only be useful to the users but also respect fairness of exposure for the item producers. We consider the problem of finding ranking policies that achieve a Pareto-optimal tradeoff between these two aspects. Several methods were proposed to solve it; for instance a popular one is to use linear programming with a Birkhoff-von Neumann decomposition. These methods, however, are based on a classical Position Based exposure Model (PBM), which assumes independence between the items (hence the exposure only depends on the rank). In many applications, this assumption is unrealistic and the community increasingly moves towards considering other models that include dependences, such as the Dynamic Bayesian Network (DBN) exposure model. For such models, computing (exact) optimal fair ranking policies remains an open question.
We answer this question by leveraging a new geometrical method based on the so-called expohedron proposed recently for the PBM (Kletti et al., WSDM'22). We lay out the structure of a new geometrical object (the DBN-expohedron), and propose for it a Carathéodory decomposition algorithm of complexity $O(n^3)$, where $n$ is the number of documents to rank. Such an algorithm enables expressing any feasible expected exposure vector as a distribution over at most $n$ rankings; furthermore we show that we can compute the whole set of Pareto-optimal expected exposure vectors with the same complexity $O(n^3)$. Our work constitutes the first exact algorithm able to efficiently find a Pareto-optimal distribution of rankings. It is applicable to a broad range of fairness notions, including classical notions of meritocratic and demographic fairness. We empirically evaluate our method on the TREC2020 and MSLR datasets and compare it to several baselines in terms of Pareto-optimality and speed.
Submitted 16 May, 2022;
originally announced May 2022.
-
Introducing the Expohedron for Efficient Pareto-optimal Fairness-Utility Amortizations in Repeated Rankings
Authors:
Till Kletti,
Jean-Michel Renders,
Patrick Loiseau
Abstract:
We consider the problem of computing a sequence of rankings that maximizes consumer-side utility while minimizing producer-side individual unfairness of exposure. While prior work has addressed this problem using linear or quadratic programs on bistochastic matrices, such approaches, relying on Birkhoff-von Neumann (BvN) decompositions, are too slow to be implemented at large scale.
In this paper we introduce a geometrical object, a polytope that we call the expohedron, whose points represent all achievable exposures of items for a Position Based Model (PBM). We exhibit some of its properties and lay out a Carathéodory decomposition algorithm with complexity $O(n^2\log(n))$ able to express any point inside the expohedron as a convex sum of at most $n$ vertices, where $n$ is the number of items to rank. Such a decomposition makes it possible to express any feasible target exposure as a distribution over at most $n$ rankings. Furthermore, we show that we can use this polytope to recover the whole Pareto frontier of the multi-objective fairness-utility optimization problem, using a simple geometrical procedure with complexity $O(n^2\log(n))$. Our approach compares favorably to linear or quadratic programming baselines in terms of algorithmic complexity and empirical runtime and is applicable to any merit that is a non-decreasing function of item relevance. Furthermore, our solution can be expressed as a distribution over only $n$ permutations, instead of the $(n-1)^2 + 1$ achieved with BvN decompositions. We perform experiments on synthetic and real-world datasets, confirming our theoretical results.
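To make the link between exposure vectors and distributions over rankings concrete, the sketch below computes the expected exposure of each item under a PBM for a given mixture of rankings. It is an illustration only and does not implement the Carathéodory decomposition described in the abstract. The function names and the DCG-style position weights $1/\log_2(k+1)$ are assumptions, since the abstract does not fix a particular weight vector.

```python
import math

def pbm_weights(n):
    """Hypothetical position weights: 1 / log2(rank + 1) for ranks 1..n."""
    return [1 / math.log2(k + 2) for k in range(n)]

def expected_exposure(rankings, probs, weights):
    """Exposure per item when ranking r is shown with probability probs[r].

    Each ranking lists item indices from the top position down. Under a PBM
    an item's exposure depends only on the position at which it is shown.
    """
    exposure = [0.0] * len(weights)
    for ranking, p in zip(rankings, probs):
        for position, item in enumerate(ranking):
            exposure[item] += p * weights[position]
    return exposure

weights = pbm_weights(3)
# Alternate items 0 and 1 at the top with equal probability; item 2 stays last.
exposure = expected_exposure([(0, 1, 2), (1, 0, 2)], [0.5, 0.5], weights)
print(exposure)
```

Every such expected exposure vector is a convex combination of permutations of the weight vector, and therefore lies in the polytope whose points the abstract describes as the achievable exposures. Here items 0 and 1 receive equal exposure, and the total exposure equals the sum of the position weights.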
Submitted 7 February, 2022;
originally announced February 2022.
-
SmoothI: Smooth Rank Indicators for Differentiable IR Metrics
Authors:
Thibaut Thonet,
Yagmur Gizem Cinar,
Eric Gaussier,
Minghan Li,
Jean-Michel Renders
Abstract:
Information retrieval (IR) systems traditionally aim to maximize metrics built on rankings, such as precision or NDCG. However, the non-differentiability of the ranking operation prevents direct optimization of such metrics in state-of-the-art neural IR models, which rely entirely on the ability to compute meaningful gradients. To address this shortcoming, we propose SmoothI, a smooth approximation of rank indicators that serves as a basic building block to devise differentiable approximations of IR metrics. We further provide theoretical guarantees on SmoothI and derived approximations, showing in particular that the approximation errors decrease exponentially with an inverse temperature-like hyperparameter that controls the quality of the approximations. Extensive experiments conducted on four standard learning-to-rank datasets validate the efficacy of the listwise losses based on SmoothI, in comparison to previously proposed ones. Additional experiments with a vanilla BERT ranking model on a text-based IR task also confirm the benefits of our listwise approach.
Submitted 3 May, 2021;
originally announced May 2021.
-
Interactive and Explainable Point-of-Interest Recommendation using Look-alike Groups
Authors:
Behrooz Omidvar-Tehrani,
Sruthi Viswanathan,
Jean-Michel Renders
Abstract:
Recommending Points-of-Interest (POIs) is surfacing in many location-based applications. The literature contains personalized and socialized POI recommendation approaches which employ historical check-ins and social links to make recommendations. However, these systems still lack customizability (incorporating session-based user interactions with the system) and contextuality (incorporating the situational context of the user), particularly in cold start situations, where nearly no user information is available. In this paper, we propose LikeMind, a POI recommendation system which tackles the challenges of cold start, customizability, contextuality, and explainability by exploiting look-alike groups mined in public POI datasets. LikeMind reformulates the problem of POI recommendation as recommending explainable look-alike groups (and their POIs) which are in line with the user's interests. LikeMind frames the task of POI recommendation as an exploratory process where users interact with the system by expressing their favorite POIs, and their interactions impact the way look-alike groups are selected. Moreover, LikeMind employs "mindsets", which capture the actual situation and intent of the user, and enforce the semantics of POI interestingness. In an extensive set of experiments, we show the quality of our approach in recommending relevant look-alike groups and their POIs, in terms of efficiency and effectiveness.
Submitted 31 August, 2020;
originally announced September 2020.
-
Real-Time Optimization Of Web Publisher RTB Revenues
Authors:
Pedro Chahuara,
Nicolas Grislain,
Grégoire Jauvion,
Jean-Michel Renders
Abstract:
This paper describes an engine to optimize web publisher revenues from second-price auctions. These auctions are widely used to sell online ad spaces in a mechanism called real-time bidding (RTB). Optimization within these auctions is crucial for web publishers, because setting appropriate reserve prices can significantly increase revenue. We consider a practical real-world setting where the only available information before an auction occurs consists of a user identifier and an ad placement identifier. The real-world challenges we had to tackle consist mainly of tracking the dependencies on both the user and placement in a highly non-stationary environment and of dealing with censored bid observations. These challenges led us to make the following design choices: (i) we adopted a relatively simple non-parametric regression model of auction revenue based on an incremental time-weighted matrix factorization which implicitly builds adaptive users' and placements' profiles; (ii) we jointly used a non-parametric model to estimate the first and second bids' distribution when they are censored, based on an on-line extension of Aalen's additive model.
Our engine is a component of a deployed system handling hundreds of web publishers across the world, serving billions of ads a day to hundreds of millions of visitors. The engine is able to predict, for each auction, an optimal reserve price in approximately one millisecond and yields a significant revenue increase for the web publishers.
Submitted 12 June, 2020;
originally announced June 2020.
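For readers unfamiliar with the auction mechanism the abstract refers to, the following is a minimal sketch of how a reserve price affects publisher revenue in a single sealed-bid second-price auction. It illustrates only the textbook rule, not the engine described in the paper; the function name `second_price_revenue` and the convention that an unsold impression yields zero revenue are assumptions made for illustration.

```python
def second_price_revenue(bids, reserve):
    """Publisher revenue for one second-price auction with a reserve price.

    Textbook rule (illustrative, not the paper's engine):
    - if the highest bid is below the reserve, the impression is unsold;
    - otherwise the winner pays the larger of the second-highest bid
      and the reserve price.
    """
    ranked = sorted(bids, reverse=True)
    if not ranked or ranked[0] < reserve:
        return 0.0  # assumption: an unsold impression earns nothing
    second = ranked[1] if len(ranked) > 1 else 0.0
    return float(max(second, reserve))


if __name__ == "__main__":
    bids = [5.0, 3.0]
    # A reserve between the two bids lifts the price paid by the winner.
    print(second_price_revenue(bids, reserve=4.0))
    # A reserve above the top bid loses the sale entirely.
    print(second_price_revenue(bids, reserve=6.0))
```

The two calls show the trade-off behind reserve-price optimization: a reserve placed between the first and second bids increases revenue, while one placed above the first bid forfeits it.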
-
Modeling ASR Ambiguity for Dialogue State Tracking Using Word Confusion Networks
Authors:
Vaishali Pal,
Fabien Guillot,
Manish Shrivastava,
Jean-Michel Renders,
Laurent Besacier
Abstract:
Spoken dialogue systems typically use a list of top-N ASR hypotheses for inferring the semantic meaning and tracking the state of the dialogue. However, ASR graphs, such as confusion networks (confnets), provide a compact representation of a richer hypothesis space than a top-N ASR list. In this paper, we study the benefits of using confusion networks with a state-of-the-art neural dialogue state tracker (DST). We encode the 2-dimensional confnet into a 1-dimensional sequence of embeddings using an attentional confusion network encoder which can be used with any DST system. Our confnet encoder is plugged into the state-of-the-art 'Global-locally Self-Attentive Dialogue State Tracker' (GLAD) model for DST and obtains significant improvements in both accuracy and inference time compared to using top-N ASR hypotheses.
Submitted 1 August, 2020; v1 submitted 3 February, 2020;
originally announced February 2020.
-
Active Search for High Recall: a Non-Stationary Extension of Thompson Sampling
Authors:
Jean-Michel Renders
Abstract:
We consider the problem of Active Search, where as many relevant objects as possible - ideally all relevant objects - should be retrieved with the minimum effort or minimum time. Typically, there are two main challenges to face when tackling this problem: first, the class of relevant objects often has low prevalence and, secondly, this class can be multi-faceted or multi-modal: objects could be relevant for completely different reasons. To solve this problem and its associated issues, we propose an approach based on a non-stationary (aka restless) extension of Thompson Sampling, a well-known strategy for Multi-Armed Bandit problems. The collection is first soft-clustered into a finite set of components and a posterior distribution of getting a relevant object inside each cluster is updated after receiving the user feedback about the proposed instances. The "next instance" selection strategy is a mixed, two-level decision process, where both the soft clusters and their instances are considered. This method can be considered a form of insurance, where the cost of the insurance is an extra exploration effort in the short run, for achieving a nearly "total" recall with less effort in the long run.
Submitted 21 March, 2018; v1 submitted 27 December, 2017;
originally announced December 2017.
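As background for the abstract above, here is a minimal sketch of standard (stationary) Beta-Bernoulli Thompson Sampling, with one arm per cluster and binary relevance feedback. It does not reproduce the paper's non-stationary extension, its soft clustering, or its two-level instance selection; the class name `BetaBernoulliTS`, the uniform Beta(1, 1) priors, and the hard cluster-to-arm mapping are assumptions made for illustration.

```python
import random


class BetaBernoulliTS:
    """Standard Thompson Sampling with a Beta posterior per arm (cluster)."""

    def __init__(self, n_arms, seed=None):
        # Assumption: uniform Beta(1, 1) prior on each cluster's relevance rate.
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms
        self.rng = random.Random(seed)

    def select(self):
        # Draw one plausible relevance rate per arm, then pick the largest draw.
        draws = [self.rng.betavariate(a, b)
                 for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, relevant):
        # Conjugate update from binary user feedback on the proposed instance.
        if relevant:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0


if __name__ == "__main__":
    # Hypothetical relevance rates for two clusters, used only to simulate feedback.
    true_rates = [0.05, 0.60]
    feedback_rng = random.Random(1)
    sampler = BetaBernoulliTS(n_arms=2, seed=0)
    for _ in range(200):
        arm = sampler.select()
        sampler.update(arm, feedback_rng.random() < true_rates[arm])
    print(sampler.alpha, sampler.beta)
```

Sampling from the posterior, rather than always picking the cluster with the highest estimated rate, is what provides the short-run exploration that the abstract describes as the cost of the insurance.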
-
Efficient Online Learning for Optimizing Value of Information: Theory and Application to Interactive Troubleshooting
Authors:
Yuxin Chen,
Jean-Michel Renders,
Morteza Haghir Chehreghani,
Andreas Krause
Abstract:
We consider the optimal value of information (VoI) problem, where the goal is to sequentially select a set of tests with a minimal cost, so that one can efficiently make the best decision based on the observed outcomes. Existing algorithms are either heuristics with no guarantees, or scale poorly (with exponential run time in terms of the number of available tests). Moreover, these methods assume a known distribution over the test outcomes, which is often not the case in practice. We propose an efficient sampling-based online learning framework to address the above issues. First, assuming the distribution over hypotheses is known, we propose a dynamic hypothesis enumeration strategy, which allows efficient information gathering with strong theoretical guarantees. We show that with a sufficient number of samples, one can identify a near-optimal decision with high probability. Second, when the parameters of the hypotheses distribution are unknown, we propose an algorithm which learns the parameters progressively via posterior sampling in an online fashion. We further establish a rigorous bound on the expected regret. We demonstrate the effectiveness of our approach on a real-world interactive troubleshooting application and show that one can efficiently make high-quality decisions with low cost.
Submitted 17 July, 2017; v1 submitted 15 March, 2017;
originally announced March 2017.
-
Joint Event Detection and Entity Resolution: a Virtuous Cycle
Authors:
Matthias Galle,
Jean-Michel Renders,
Guillaume Jacquet
Abstract:
Clustering web documents has numerous applications, such as aggregating news articles into meaningful events, detecting trends and hot topics on the Web, preserving diversity in search results, etc. At the same time, the importance of named entities and, in particular, the ability to recognize them and to solve the associated co-reference resolution problem are widely recognized as key enabling factors when mining, aggregating and comparing content on the Web.
Instead of considering these two problems separately, we propose in this paper a method that jointly tackles the problem of clustering news articles into events and cross-document co-reference resolution of named entities. The co-occurrence of named entities in the same clusters is used as an additional signal to decide whether two referents should be merged into one entity. These refined entities can in turn be used as enhanced features to re-cluster the documents and then be refined again, entering into a virtuous cycle that simultaneously improves the performance of both tasks. We implemented a prototype system and report results using the TDT5 collection of news articles, demonstrating the potential of our approach.
Submitted 18 July, 2016;
originally announced July 2016.
-
LSTM-based Mixture-of-Experts for Knowledge-Aware Dialogues
Authors:
Phong Le,
Marc Dymetman,
Jean-Michel Renders
Abstract:
We introduce an LSTM-based method for dynamically integrating several word-prediction experts to obtain a conditional language model which can be good simultaneously at several subtasks. We illustrate this general approach with an application to dialogue where we integrate a neural chat model, good at conversational aspects, with a neural question-answering model, good at retrieving precise information from a knowledge-base, and show how the integration combines the strengths of the independent components. We hope that this focused contribution will draw attention to the benefits of using such mixtures of experts in NLP.
Submitted 5 May, 2016;
originally announced May 2016.
-
Assisting Composition of Email Responses: a Topic Prediction Approach
Authors:
Spandana Gella,
Marc Dymetman,
Jean Michel Renders,
Sriram Venkatapathy
Abstract:
We propose an approach for helping agents compose email replies to customer requests. To enable that, we use LDA to extract latent topics from a collection of email exchanges. We then use these latent topics to label our data, obtaining a so-called "silver standard" topic labelling. We exploit this labelled set to train a classifier to: (i) predict the topic distribution of the entire agent's email response, based on features of the customer's email; and (ii) predict the topic distribution of the next sentence in the agent's reply, based on the customer's email features and on features of the agent's current sentence. The experimental results on a large email collection from a contact center in the telecom domain show that the proposed approach is effective in predicting the best topic of the agent's next sentence. In 80% of the cases, the correct topic is present among the top five recommended topics (out of fifty possible ones). This shows the potential of this method to be applied in an interactive setting, where the agent is presented with a small list of likely topics to choose from for the next sentence.
Submitted 7 October, 2015;
originally announced October 2015.