Search | arXiv e-print repository

Showing 1–50 of 81 results for author: Kaufmann, E

  1. arXiv:2406.03033  [pdf, other]

    cs.LG stat.ML

    Optimal Multi-Fidelity Best-Arm Identification

    Authors: Riccardo Poiani, Rémy Degenne, Emilie Kaufmann, Alberto Maria Metelli, Marcello Restelli

    Abstract: In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible. We study multi-fidelity best-arm identification, in which the algorithm can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost. Several methods have been proposed for tackling this problem, but their optimalit… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  2. arXiv:2406.02235  [pdf, other]

    cs.AI

    Power Mean Estimation in Stochastic Monte-Carlo Tree Search

    Authors: Tuan Dam, Odalric-Ambrym Maillard, Emilie Kaufmann

    Abstract: Monte-Carlo Tree Search (MCTS) is a widely-used strategy for online planning that combines Monte-Carlo sampling with forward tree search. Its success relies on the Upper Confidence bound for Trees (UCT) algorithm, an extension of the UCB method for multi-arm bandits. However, the theoretical foundation of UCT is incomplete due to an error in the logarithmic bonus term for action selection, leading… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: UAI 2024 conference

  3. arXiv:2405.17108  [pdf, ps, other]

    cs.LG

    Finding good policies in average-reward Markov Decision Processes without prior knowledge

    Authors: Adrienne Tuynman, Rémy Degenne, Emilie Kaufmann

    Abstract: We revisit the identification of an $\varepsilon$-optimal policy in average-reward Markov Decision Processes (MDP). In such MDPs, two measures of complexity have appeared in the literature: the diameter, $D$, and the optimal bias span, $H$, which satisfy $H\leq D$. Prior work has studied the complexity of $\varepsilon$-optimal policy identification only when a generative model is available. In th… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  4. arXiv:2311.05638  [pdf, ps, other]

    stat.ML cs.LG

    Towards Instance-Optimality in Online PAC Reinforcement Learning

    Authors: Aymen Al-Marjani, Andrea Tirinzoni, Emilie Kaufmann

    Abstract: Several recent works have proposed instance-dependent upper bounds on the number of episodes needed to identify, with probability $1-\delta$, an $\varepsilon$-optimal policy in finite-horizon tabular Markov Decision Processes (MDPs). These upper bounds feature various complexity measures for the MDP, which are defined based on different notions of sub-optimality gaps. However, as of now, no lower bound… ▽ More

    Submitted 31 October, 2023; originally announced November 2023.

  5. arXiv:2311.03992  [pdf, other]

    stat.ML cs.LG

    Bandit Pareto Set Identification: the Fixed Budget Setting

    Authors: Cyrille Kone, Emilie Kaufmann, Laura Richert

    Abstract: We study a multi-objective pure exploration problem in a multi-armed bandit model. Each arm is associated to an unknown multi-variate distribution and the goal is to identify the distributions whose mean is not uniformly worse than that of another distribution: the Pareto optimal set. We propose and analyze the first algorithms for the \emph{fixed budget} Pareto Set Identification task. We propose… ▽ More

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: 42 pages

  6. Agilicious: Open-Source and Open-Hardware Agile Quadrotor for Vision-Based Flight

    Authors: Philipp Foehn, Elia Kaufmann, Angel Romero, Robert Penicka, Sihao Sun, Leonard Bauersfeld, Thomas Laengle, Giovanni Cioffi, Yunlong Song, Antonio Loquercio, Davide Scaramuzza

    Abstract: Autonomous, agile quadrotor flight raises fundamental challenges for robotics research in terms of perception, planning, learning, and control. A versatile and standardized platform is needed to accelerate research and let practitioners focus on the core problems. To this end, we present Agilicious, a co-designed hardware and software framework tailored to autonomous, agile quadrotor flight. It is… ▽ More

    Submitted 12 July, 2023; originally announced July 2023.

    Comments: 14 pages, 5 figures, 2 tables

    Journal ref: Science Robotics Vol. 7, Issue 67, 2022

  7. arXiv:2307.00424  [pdf, other]

    stat.ML cs.LG

    Adaptive Algorithms for Relaxed Pareto Set Identification

    Authors: Cyrille Kone, Emilie Kaufmann, Laura Richert

    Abstract: In this paper we revisit the fixed-confidence identification of the Pareto optimal set in a multi-objective multi-armed bandit model. As the sample complexity to identify the exact Pareto set can be very large, a relaxation allowing to output some additional near-optimal arms has been studied. In this work we also tackle alternative relaxations that allow instead to identify a relevant subset of t… ▽ More

    Submitted 3 November, 2023; v1 submitted 1 July, 2023; originally announced July 2023.

    MSC Class: 68T05

  8. arXiv:2306.13601  [pdf, other]

    cs.LG

    Active Coverage for PAC Reinforcement Learning

    Authors: Aymen Al-Marjani, Andrea Tirinzoni, Emilie Kaufmann

    Abstract: Collecting and leveraging data with good coverage properties plays a crucial role in different aspects of reinforcement learning (RL), including reward-free exploration and offline learning. However, the notion of "good coverage" really depends on the application at hand, as data suitable for one context may not be so for another. In this paper, we formalize the problem of active coverage in episo… ▽ More

    Submitted 23 June, 2023; originally announced June 2023.

    Comments: Accepted at COLT 2023

  9. arXiv:2305.16041  [pdf, other]

    stat.ML cs.LG

    An $\varepsilon$-Best-Arm Identification Algorithm for Fixed-Confidence and Beyond

    Authors: Marc Jourdan, Rémy Degenne, Emilie Kaufmann

    Abstract: We propose EB-TC$\varepsilon$, a novel sampling rule for $\varepsilon$-best arm identification in stochastic bandits. It is the first instance of Top Two algorithm analyzed for approximate best arm identification. EB-TC$\varepsilon$ is an *anytime* sampling rule that can therefore be employed without modification for fixed confidence or fixed budget identification (without prior knowledge of the b… ▽ More

    Submitted 6 November, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 68 pages, 14 figures, 4 tables. To be published in the Thirty-seventh Conference on Neural Information Processing Systems

  10. arXiv:2304.04128  [pdf, other]

    cs.RO

    Learning Agile, Vision-based Drone Flight: from Simulation to Reality

    Authors: Davide Scaramuzza, Elia Kaufmann

    Abstract: We present our latest research in learning deep sensorimotor policies for agile, vision-based quadrotor flight. We show methodologies for the successful transfer of such policies from simulation to the real world. In addition, we discuss the open research questions that still need to be answered to improve the agility and robustness of autonomous drones toward human-pilot performance.

    Submitted 8 April, 2023; originally announced April 2023.

  11. arXiv:2301.13089  [pdf, ps, other]

    cs.CL cs.AI cs.HC

    Can an AI Win Ghana's National Science and Maths Quiz? An AI Grand Challenge for Education

    Authors: George Boateng, Victor Kumbol, Elsie Effah Kaufmann

    Abstract: There is a lack of enough qualified teachers across Africa which hampers efforts to provide adequate learning support such as educational question answering (EQA) to students. An AI system that can enable students to ask questions via text or voice and get instant answers will make high-quality education accessible. Despite advances in the field of AI, there exists no robust benchmark or challenge… ▽ More

    Submitted 30 January, 2023; originally announced January 2023.

  12. Autonomous Drone Racing: A Survey

    Authors: Drew Hanover, Antonio Loquercio, Leonard Bauersfeld, Angel Romero, Robert Penicka, Yunlong Song, Giovanni Cioffi, Elia Kaufmann, Davide Scaramuzza

    Abstract: Over the last decade, the use of autonomous drone systems for surveying, search and rescue, or last-mile delivery has increased exponentially. With the rise of these applications comes the need for highly robust, safety-critical algorithms which can operate drones in complex and uncertain environments. Additionally, flying fast enables drones to cover more ground which in turn increases productivi… ▽ More

    Submitted 8 July, 2024; v1 submitted 4 January, 2023; originally announced January 2023.

    Comments: 26 pages

    Journal ref: IEEE Transactions on Robotics (T-RO), Vol. 40, 2024

  13. arXiv:2211.12181  [pdf, ps, other]

    cs.RO

    User-Conditioned Neural Control Policies for Mobile Robotics

    Authors: Leonard Bauersfeld, Elia Kaufmann, Davide Scaramuzza

    Abstract: Recently, learning-based controllers have been shown to push mobile robotic systems to their limits and provide the robustness needed for many real-world applications. However, only classical optimization-based control frameworks offer the inherent flexibility to be dynamically adjusted during execution by, for example, setting target speeds or actuator limits. We present a framework to overcome t… ▽ More

    Submitted 2 April, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: 6 pages + 1 pages references

    Journal ref: IEEE International Conference on Robotics and Automation (ICRA), London, 2023

  14. arXiv:2210.15287  [pdf, other]

    cs.RO

    Learned Inertial Odometry for Autonomous Drone Racing

    Authors: Giovanni Cioffi, Leonard Bauersfeld, Elia Kaufmann, Davide Scaramuzza

    Abstract: Inertial odometry is an attractive solution to the problem of state estimation for agile quadrotor flight. It is inexpensive, lightweight, and it is not affected by perceptual degradation. However, only relying on the integration of the inertial measurements for state estimation is infeasible. The errors and time-varying biases present in such measurements cause the accumulation of large drift in… ▽ More

    Submitted 28 February, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Journal ref: Robotics and Automation Letters (RA-L), 2023

  15. arXiv:2210.00974  [pdf, other]

    stat.ML cs.LG

    Dealing with Unknown Variances in Best-Arm Identification

    Authors: Marc Jourdan, Rémy Degenne, Emilie Kaufmann

    Abstract: The problem of identifying the best arm among a collection of items having Gaussian rewards distribution is well understood when the variances are known. Despite its practical relevance for many applications, few works studied it for unknown variances. In this paper we introduce and analyze two approaches to deal with unknown variances, either by plugging in the empirical variance or by adapting t… ▽ More

    Submitted 23 January, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 73 pages, 5 figures, 3 tables. To be published in the 34th International Conference on Algorithmic Learning Theory, Singapore, 2023

  16. arXiv:2207.05852  [pdf, other]

    cs.LG

    Optimistic PAC Reinforcement Learning: the Instance-Dependent View

    Authors: Andrea Tirinzoni, Aymen Al-Marjani, Emilie Kaufmann

    Abstract: Optimistic algorithms have been extensively studied for regret minimization in episodic tabular MDPs, both from a minimax and an instance-dependent view. However, for the PAC RL problem, where the goal is to identify a near-optimal policy with high probability, little is known about their instance-dependent sample complexity. A negative result of Wagenmaker et al. (2021) suggests that optimistic s… ▽ More

    Submitted 12 July, 2022; originally announced July 2022.

    Comments: arXiv admin note: text overlap with arXiv:2203.09251

  17. arXiv:2206.05979  [pdf, other]

    stat.ML cs.LG

    Top Two Algorithms Revisited

    Authors: Marc Jourdan, Rémy Degenne, Dorian Baudry, Rianne de Heide, Emilie Kaufmann

    Abstract: Top Two algorithms arose as an adaptation of Thompson sampling to best arm identification in multi-armed bandit models (Russo, 2016), for parametric families of arms. They select the next arm to sample from by randomizing among two candidate arms, a leader and a challenger. Despite their good empirical performance, theoretical guarantees for fixed-confidence best arm identification have only been… ▽ More

    Submitted 4 October, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: 75 pages, 8 figures, 3 tables

  18. arXiv:2206.00121  [pdf, ps, other]

    cs.LG stat.ML

    Near-Optimal Collaborative Learning in Bandits

    Authors: Clémence Réda, Sattar Vakili, Emilie Kaufmann

    Abstract: This paper introduces a general multi-agent bandit model in which each agent is facing a finite set of arms and may communicate with other agents through a central controller in order to identify, in pure exploration, or play, in regret minimization, its optimal arm. The twist is that the optimal arm for each agent is the arm with largest expected mixed reward, where the mixed reward of an arm is… ▽ More

    Submitted 28 October, 2022; v1 submitted 31 May, 2022; originally announced June 2022.

  19. Learning Minimum-Time Flight in Cluttered Environments

    Authors: Robert Penicka, Yunlong Song, Elia Kaufmann, Davide Scaramuzza

    Abstract: We tackle the problem of minimum-time flight for a quadrotor through a sequence of waypoints in the presence of obstacles while exploiting the full quadrotor dynamics. Early works relied on simplified dynamics or polynomial trajectory representations that did not exploit the full actuator potential of the quadrotor, and, thus, resulted in suboptimal solutions. Recent works can plan minimum-time tr… ▽ More

    Submitted 17 June, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Journal ref: IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 7209-7216, July 2022

  20. arXiv:2203.10883  [pdf, other]

    cs.LG

    Efficient Algorithms for Extreme Bandits

    Authors: Dorian Baudry, Yoan Russac, Emilie Kaufmann

    Abstract: In this paper, we contribute to the Extreme Bandit problem, a variant of Multi-Armed Bandits in which the learner seeks to collect the largest possible reward. We first study the concentration of the maximum of i.i.d random variables under mild assumptions on the tail of the rewards distributions. This analysis motivates the introduction of Quantile of Maxima (QoMax). The properties of QoMax are s… ▽ More

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS) 2022

  21. arXiv:2203.09251  [pdf, other]

    cs.LG stat.ML

    Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs

    Authors: Andrea Tirinzoni, Aymen Al-Marjani, Emilie Kaufmann

    Abstract: In probably approximately correct (PAC) reinforcement learning (RL), an agent is required to identify an $\varepsilon$-optimal policy with probability $1-\delta$. While minimax optimal algorithms exist for this problem, its instance-dependent complexity remains elusive in episodic Markov decision processes (MDPs). In this paper, we propose the first nearly matching (up to a horizon squared factor and logarithmic… ▽ More

    Submitted 24 October, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

  22. arXiv:2203.07747  [pdf, other]

    cs.RO cs.LG eess.SY

    Real-time Neural-MPC: Deep Learning Model Predictive Control for Quadrotors and Agile Robotic Platforms

    Authors: Tim Salzmann, Elia Kaufmann, Jon Arrizabalaga, Marco Pavone, Davide Scaramuzza, Markus Ryll

    Abstract: Model Predictive Control (MPC) has become a popular framework in embedded control for high-performance autonomous systems. However, to achieve good control performance using MPC, an accurate dynamics model is key. To maintain real-time operation, the dynamics models used on embedded systems have been limited to simple first-principle models, which substantially limits their representative power. I… ▽ More

    Submitted 25 July, 2023; v1 submitted 15 March, 2022; originally announced March 2022.

    Journal ref: IEEE Robotics and Automation Letters (Volume: 8, Issue: 4, April 2023)

  23. arXiv:2202.10796  [pdf, ps, other]

    cs.RO

    A Benchmark Comparison of Learned Control Policies for Agile Quadrotor Flight

    Authors: Elia Kaufmann, Leonard Bauersfeld, Davide Scaramuzza

    Abstract: Quadrotors are highly nonlinear dynamical systems that require carefully tuned controllers to be pushed to their physical limits. Recently, learning-based control policies have been proposed for quadrotors, as they would potentially allow learning direct mappings from high-dimensional raw sensory observations to actions. Due to sample inefficiency, training such learned controllers on the real pla… ▽ More

    Submitted 22 February, 2022; originally announced February 2022.

    Comments: 6 pages (+1 references)

    Journal ref: IEEE International Conference on Robotics and Automation (ICRA), Philadelphia, 2022

  24. arXiv:2110.05832  [pdf, other]

    astro-ph.EP astro-ph.IM

    Cometary dust analogues for physics experiments

    Authors: A. Lethuillier, C. Feller, E. Kaufmann, P. Becerra, N. Hänni, R. Diethelm, C. Kreuzig, B. Gundlach, J. Blum, A. Pommerol, G. Kargl, E. Kührt, H. Capelo, D. Haack, X. Zhang, J. Knollenberg, N. S. Molinski, T. Gilke, H. Sierks, P. Tiefenbacher, C. Güttler, K. A. Otto, D. Bischoff, M. Schweighart, A. Hagermann, et al. (1 additional author not shown)

    Abstract: The CoPhyLab (Cometary Physics Laboratory) project is designed to study the physics of comets through a series of earth-based experiments. For these experiments, a dust analogue was created with physical properties comparable to those of the non-volatile dust found on comets. This "CoPhyLab dust" is planned to be mixed with water and CO$_2$ ice and placed under cometary conditions in vacuum chambe… ▽ More

    Submitted 12 October, 2021; originally announced October 2021.

  25. arXiv:2110.05113  [pdf, other]

    cs.RO cs.LG eess.SY

    Learning High-Speed Flight in the Wild

    Authors: Antonio Loquercio, Elia Kaufmann, René Ranftl, Matthias Müller, Vladlen Koltun, Davide Scaramuzza

    Abstract: Quadrotors are agile. Unlike most other machines, they can traverse extremely complex environments at high speeds. To date, only expert human pilots have been able to fully exploit their capabilities. Autonomous operation with on-board sensing and computation has been limited to low speeds. State-of-the-art methods generally separate the navigation problem into subtasks: sensing, mapping, and plan… ▽ More

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: 16 pages (+7 supplementary)

    Journal ref: Science Robotics 2021 Vol. 6, Issue 59, abg5810

  26. Performance, Precision, and Payloads: Adaptive Nonlinear MPC for Quadrotors

    Authors: Drew Hanover, Philipp Foehn, Sihao Sun, Elia Kaufmann, Davide Scaramuzza

    Abstract: Agile quadrotor flight in challenging environments has the potential to revolutionize shipping, transportation, and search and rescue applications. Nonlinear model predictive control (NMPC) has recently shown promising results for agile quadrotor control, but relies on highly accurate models for maximum performance. Hence, model uncertainties in the form of unmodeled complex aerodynamic effects, v… ▽ More

    Submitted 3 December, 2021; v1 submitted 9 September, 2021; originally announced September 2021.

    Comments: 8 Pages, 6 figures, Accepted RAL 2021

    Journal ref: IEEE Robotics and Automation Letters 0, 2377-3766, 2021

  27. arXiv:2109.01365  [pdf, other]

    cs.RO

    A Comparative Study of Nonlinear MPC and Differential-Flatness-Based Control for Quadrotor Agile Flight

    Authors: Sihao Sun, Angel Romero, Philipp Foehn, Elia Kaufmann, Davide Scaramuzza

    Abstract: Accurate trajectory tracking control for quadrotors is essential for safe navigation in cluttered environments. However, this is challenging in agile flights due to nonlinear dynamics, complex aerodynamic effects, and actuation constraints. In this article, we empirically compare two state-of-the-art control frameworks: the nonlinear-model-predictive controller (NMPC) and the differential-flatness… ▽ More

    Submitted 4 January, 2024; v1 submitted 3 September, 2021; originally announced September 2021.

    Journal ref: The paper has been published in the IEEE Transactions on Robotics (T-RO), 2022

  28. NeuroBEM: Hybrid Aerodynamic Quadrotor Model

    Authors: Leonard Bauersfeld, Elia Kaufmann, Philipp Foehn, Sihao Sun, Davide Scaramuzza

    Abstract: Quadrotors are extremely agile, so much in fact, that classic first-principle-models come to their limits. Aerodynamic effects, while insignificant at low speeds, become the dominant model defect during high speeds or agile maneuvers. Accurate modeling is needed to design robust high-performance control systems and enable flying close to the platform's physical limits. We propose a hybrid approach… ▽ More

    Submitted 15 June, 2021; originally announced June 2021.

    Comments: 9 pages + 1 pages references

    Journal ref: Robotics: Science and Systems (RSS), 2021

  29. A Statistical Review of Light Curves and the Prevalence of Contact Binaries in the Kuiper Belt

    Authors: Mark R. Showalter, Susan D. Benecchi, Marc W. Buie, William M. Grundy, James T. Keane, Carey M. Lisse, Cathy B. Olkin, Simon B. Porter, Stuart J. Robbins, Kelsi N. Singer, Anne J. Verbiscer, Harold A. Weaver, Amanda M. Zangari, Douglas P. Hamilton, David E. Kaufmann, Tod R. Lauer, D. S. Mehoke, T. S. Mehoke, J. R. Spencer, H. B. Throop, J. W. Parker, S. Alan Stern

    Abstract: We investigate what can be learned about a population of distant KBOs by studying the statistical properties of their light curves. Whereas others have successfully inferred the properties of individual, highly variable KBOs, we show that the fraction of KBOs with low amplitudes also provides fundamental information about a population. Each light curve is primarily the result of two factors: shape… ▽ More

    Submitted 7 May, 2021; originally announced May 2021.

    Journal ref: Icarus 356, id. 114098 (2021)

  30. arXiv:2103.14666  [pdf, other]

    cs.RO cs.AI

    Autonomous Overtaking in Gran Turismo Sport Using Curriculum Reinforcement Learning

    Authors: Yunlong Song, HaoChih Lin, Elia Kaufmann, Peter Duerr, Davide Scaramuzza

    Abstract: Professional race-car drivers can execute extreme overtaking maneuvers. However, existing algorithms for autonomous overtaking either rely on simplified assumptions about the vehicle dynamics or try to solve expensive trajectory-optimization problems online. When the vehicle approaches its physical limits, existing model-based controllers struggle to handle highly nonlinear dynamics, and cannot le… ▽ More

    Submitted 9 May, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

    Comments: Accepted for publication at the IEEE International Conference on Robotics and Automation (ICRA), Xi An, 2021

  31. arXiv:2103.10070  [pdf, other]

    cs.LG cs.AI math.ST q-bio.QM

    Top-m identification for linear bandits

    Authors: Clémence Réda, Emilie Kaufmann, Andrée Delahaye-Duriez

    Abstract: Motivated by an application to drug repurposing, we propose the first algorithms to tackle the identification of the m $\ge$ 1 arms with largest means in a linear bandit model, in the fixed-confidence setting. These algorithms belong to the generic family of Gap-Index Focused Algorithms (GIFA) that we introduce for Top-m identification in linear bandits. We propose a unified analysis of these algo… ▽ More

    Submitted 18 March, 2021; originally announced March 2021.

  32. arXiv:2103.08624  [pdf, other]

    cs.RO cs.AI

    Autonomous Drone Racing with Deep Reinforcement Learning

    Authors: Yunlong Song, Mats Steinweg, Elia Kaufmann, Davide Scaramuzza

    Abstract: In many robotic tasks, such as autonomous drone racing, the goal is to travel through a set of waypoints as fast as possible. A key challenge for this task is planning the time-optimal trajectory, which is typically solved by assuming perfect knowledge of the waypoints to pass in advance. The resulting solution is either highly specialized for a single-track layout, or suboptimal due to simplifyin… ▽ More

    Submitted 2 August, 2021; v1 submitted 15 March, 2021; originally announced March 2021.

    Comments: This paper has been accepted for publication at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, 2021. Copyright @ IEEE

  33. arXiv:2102.05773  [pdf, other]

    cs.RO

    Data-Driven MPC for Quadrotors

    Authors: Guillem Torrente, Elia Kaufmann, Philipp Foehn, Davide Scaramuzza

    Abstract: Aerodynamic forces render accurate high-speed trajectory tracking with quadrotors extremely challenging. These complex aerodynamic effects become a significant disturbance at high speeds, introducing large positional tracking errors, and are extremely difficult to model. To fly at high speeds, feedback control must be able to account for these aerodynamic effects in real-time. This necessitates a… ▽ More

    Submitted 3 March, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: 8 pages

    Journal ref: IEEE Robotics and Automation Letters (RA-L), 2021

  34. Gas flow in Martian spider formation

    Authors: Nicholas Attree, Erika Kaufmann, Axel Hagermann

    Abstract: Martian araneiform terrain, located in the Southern polar regions, consists of features with central pits and radial troughs which are thought to be associated with the solid state greenhouse effect under a CO$_{2}$ ice sheet. Sublimation at the base of this ice leads to gas buildup, fracturing of the ice and the flow of gas and entrained regolith out of vents and onto the surface. There are two p… ▽ More

    Submitted 29 January, 2021; originally announced January 2021.

    Comments: Accepted in Icarus

  35. arXiv:2012.05754  [pdf, other]

    cs.LG

    Optimal Thompson Sampling strategies for support-aware CVaR bandits

    Authors: Dorian Baudry, Romain Gautron, Emilie Kaufmann, Odalric-Ambrym Maillard

    Abstract: In this paper we study a multi-arm bandit problem in which the quality of each arm is measured by the Conditional Value at Risk (CVaR) at some level alpha of the reward distribution. While existing works in this setting mainly focus on Upper Confidence Bound algorithms, we introduce a new Thompson Sampling approach for CVaR bandits on bounded rewards that is flexible enough to solve a variety of p… ▽ More

    Submitted 21 March, 2022; v1 submitted 10 December, 2020; originally announced December 2020.

    Comments: Presented at the Thirty-eighth International Conference on Machine Learning (ICML 2021). In this version we refine Lemma 2 and correct its proof (does not change the main theorems)

  36. arXiv:2010.14323  [pdf, other]

    stat.ML cs.LG

    Sub-sampling for Efficient Non-Parametric Bandit Exploration

    Authors: Dorian Baudry, Emilie Kaufmann, Odalric-Ambrym Maillard

    Abstract: In this paper we propose the first multi-armed bandit algorithm based on re-sampling that achieves asymptotically optimal regret simultaneously for different families of arms (namely Bernoulli, Gaussian and Poisson distributions). Unlike Thompson Sampling which requires to specify a different prior to be optimal in each case, our proposal RB-SDA does not need any distribution-dependent tuning. RB-… ▽ More

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2020, Dec 2020, Vancouver, Canada

  37. arXiv:2010.03531  [pdf, ps, other]

    cs.LG stat.ML

    Episodic Reinforcement Learning in Finite MDPs: Minimax Lower Bounds Revisited

    Authors: Omar Darwiche Domingues, Pierre Ménard, Emilie Kaufmann, Michal Valko

    Abstract: In this paper, we propose new problem-independent lower bounds on the sample complexity and regret in episodic MDPs, with a particular focus on the non-stationary case in which the transition kernel is allowed to change in each stage of the episode. Our main contribution is a novel lower bound of $\Omega((H^3SA/\varepsilon^2)\log(1/\delta))$ on the sample complexity of an $(\varepsilon,\delta)$-PAC algorithm for best poli… ▽ More

    Submitted 7 October, 2020; originally announced October 2020.

  38. arXiv:2009.00563  [pdf, other]

    cs.RO cs.AI

    Flightmare: A Flexible Quadrotor Simulator

    Authors: Yunlong Song, Selim Naji, Elia Kaufmann, Antonio Loquercio, Davide Scaramuzza

    Abstract: State-of-the-art quadrotor simulators have a rigid and highly-specialized structure: they are either really fast, physically accurate, or photo-realistic. In this work, we propose a novel quadrotor simulator: Flightmare. Flightmare is composed of two main components: a configurable rendering engine built on Unity and a flexible physics engine for dynamics simulation. Those two components are total… ▽ More

    Submitted 9 May, 2021; v1 submitted 1 September, 2020; originally announced September 2020.

    Comments: Accepted for publication at 4th Conference on Robot Learning (CoRL), Cambridge MA, USA. 2020

  39. arXiv:2008.07971  [pdf, other]

    cs.AI cs.LG cs.RO

    Super-Human Performance in Gran Turismo Sport Using Deep Reinforcement Learning

    Authors: Florian Fuchs, Yunlong Song, Elia Kaufmann, Davide Scaramuzza, Peter Duerr

    Abstract: Autonomous car racing is a major challenge in robotics. It raises fundamental problems for classical approaches such as planning minimum-time trajectories under uncertain dynamics and controlling the car at the limits of its handling. Besides, the requirement of minimizing the lap time, which is a sparse objective, and the difficulty of collecting training data from human experts have also hindere… ▽ More

    Submitted 9 May, 2021; v1 submitted 18 August, 2020; originally announced August 2020.

    Comments: Accepted for Publication at the IEEE Robotics and Automation Letters (RA-L) 2021, and International Conference on Robots and Automation (ICRA) 2021

    Journal ref: IEEE Robotics and Automation Letters (RAL) 2021

  40. arXiv:2007.13442  [pdf, other]

    cs.LG stat.ML

    Fast active learning for pure exploration in reinforcement learning

    Authors: Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Emilie Kaufmann, Edouard Leurent, Michal Valko

    Abstract: Realistic environments often provide agents with very limited feedback. When the environment is initially unknown, the feedback, in the beginning, can be completely absent, and the agents may first choose to devote all their effort on exploring efficiently. The exploration remains a challenge while it has been addressed with many hand-tuned heuristics with different levels of generality on one sid… ▽ More

    Submitted 10 October, 2020; v1 submitted 27 July, 2020; originally announced July 2020.

  41. arXiv:2007.05078  [pdf, other]

    cs.LG stat.ML

    A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces

    Authors: Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko

    Abstract: In this work, we propose KeRNS: an algorithm for episodic reinforcement learning in non-stationary Markov Decision Processes (MDPs) whose state-action set is endowed with a metric. Using a non-parametric model of the MDP built with time-dependent kernels, we prove a regret bound that scales with the covering dimension of the state-action space and the total variation of the MDP with time, which qu… ▽ More

    Submitted 23 March, 2022; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: Update following the publication in AISTATS 2021. Fixed typos and lemma about runtime

  42. arXiv:2006.06294  [pdf, other]

    cs.LG stat.ML

    Adaptive Reward-Free Exploration

    Authors: Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Anders Jonsson, Edouard Leurent, Michal Valko

    Abstract: Reward-free exploration is a reinforcement learning setting studied by Jin et al. (2020), who address it by running several algorithms with regret guarantees in parallel. In our work, we instead give a more natural adaptive approach for reward-free exploration which directly reduces upper bounds on the maximum MDP estimation error. We show that, interestingly, our reward-free UCRL algorithm can be…

    Submitted 7 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

  43. arXiv:2006.05879  [pdf, other]

    cs.LG stat.ML

    Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

    Authors: Anders Jonsson, Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Edouard Leurent, Michal Valko

    Abstract: We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algorithm for planning in a Markov Decision Process in which transitions have a finite support. We prove an upper bound on the number of calls to the generative models needed for MDP-GapE to identify a near-optimal action with high probability. This problem-dependent sample complexity result is expressed in terms of the sub-optima…

    Submitted 10 June, 2020; originally announced June 2020.

  44. arXiv:2006.05768  [pdf, other]

    cs.RO

    Deep Drone Acrobatics

    Authors: Elia Kaufmann, Antonio Loquercio, René Ranftl, Matthias Müller, Vladlen Koltun, Davide Scaramuzza

    Abstract: Performing acrobatic maneuvers with quadrotors is extremely challenging. Acrobatic flight requires high thrust and extreme angular accelerations that push the platform to its physical limits. Professional drone pilots often measure their level of mastery by flying such maneuvers in competitions. In this paper, we propose to learn a sensorimotor policy that enables an autonomous quadrotor to fly ex…

    Submitted 11 June, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: 8 pages + 2 pages references. Video: https://youtu.be/2N_wKXQ6MXA. Code: https://github.com/uzh-rpg/deep_drone_acrobatics

    Journal ref: Robotics: Science and Systems (RSS), 2020

  45. arXiv:2005.12813  [pdf, other]

    cs.RO cs.CV eess.SY

    AlphaPilot: Autonomous Drone Racing

    Authors: Philipp Foehn, Dario Brescianini, Elia Kaufmann, Titus Cieslewski, Mathias Gehrig, Manasi Muglikar, Davide Scaramuzza

    Abstract: This paper presents a novel system for autonomous, vision-based drone racing combining learned data abstraction, nonlinear filtering, and time-optimal trajectory planning. The system has successfully been deployed at the first autonomous drone racing world championship: the 2019 AlphaPilot Challenge. Contrary to traditional drone racing systems, which only detect the next gate, our approach makes…

    Submitted 20 August, 2021; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: This paper is an extended version of an accepted publication from Robotics: Science and Systems, 2020. This version has been accepted for publication in Autonomous Robots (Springer). Please cite as "AlphaPilot: Autonomous Drone Racing", P. Foehn, Autonomous Robots 2021. Associated video at https://youtu.be/DGjwm5PZQT8

  46. arXiv:2004.05599  [pdf, other]

    cs.LG stat.ML

    Kernel-Based Reinforcement Learning: A Finite-Time Analysis

    Authors: Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Emilie Kaufmann, Michal Valko

    Abstract: We consider the exploration-exploitation dilemma in finite-horizon reinforcement learning problems whose state-action space is endowed with a metric. We introduce Kernel-UCBVI, a model-based optimistic algorithm that leverages the smoothness of the MDP and a non-parametric kernel estimator of the rewards and transitions to efficiently balance exploration and exploitation. For problems with $K$ epi…

    Submitted 23 March, 2022; v1 submitted 12 April, 2020; originally announced April 2020.

    Comments: Update following the publication in ICML 2021, including fixed typos

  47. Initial results from the New Horizons exploration of 2014 MU69, a small Kuiper Belt Object

    Authors: S. A. Stern, H. A. Weaver, J. R. Spencer, C. B. Olkin, G. R. Gladstone, W. M. Grundy, J. M. Moore, D. P. Cruikshank, H. A. Elliott, W. B. McKinnon, J. Wm. Parker, A. J. Verbiscer, L. A. Young, D. A. Aguilar, J. M. Albers, T. Andert, J. P. Andrews, F. Bagenal, M. E. Banks, B. A. Bauer, J. A. Bauman, K. E. Bechtold, C. B. Beddingfield, N. Behrooz, K. B. Beisser , et al. (180 additional authors not shown)

    Abstract: The Kuiper Belt is a distant region of the Solar System. On 1 January 2019, the New Horizons spacecraft flew close to (486958) 2014 MU69, a Cold Classical Kuiper Belt Object, a class of objects that have never been heated by the Sun and are therefore well preserved since their formation. Here we describe initial results from these encounter observations. MU69 is a bi-lobed contact binary with a fl…

    Submitted 2 April, 2020; originally announced April 2020.

    Comments: 43 pages, 8 figures

    Journal ref: Science 364, eaaw9771 (2019)

  48. The Geology and Geophysics of Kuiper Belt Object (486958) Arrokoth

    Authors: J. R. Spencer, S. A. Stern, J. M. Moore, H. A. Weaver, K. N. Singer, C. B. Olkin, A. J. Verbiscer, W. B. McKinnon, J. Wm. Parker, R. A. Beyer, J. T. Keane, T. R. Lauer, S. B. Porter, O. L. White, B. J. Buratti, M. R. El-Maarry, C. M. Lisse, A. H. Parker, H. B. Throop, S. J. Robbins, O. M. Umurhan, R. P. Binzel, D. T. Britt, M. W. Buie, A. F. Cheng , et al. (53 additional authors not shown)

    Abstract: The Cold Classical Kuiper Belt, a class of small bodies in undisturbed orbits beyond Neptune, are primitive objects preserving information about Solar System formation. The New Horizons spacecraft flew past one of these objects, the 36 km long contact binary (486958) Arrokoth (2014 MU69), in January 2019. Images from the flyby show that Arrokoth has no detectable rings, and no satellites (larger t…

    Submitted 1 April, 2020; originally announced April 2020.

    Journal ref: Science, 367, aay3999 (2020)

  49. arXiv:1912.03074  [pdf, other]

    stat.ML cs.LG

    Solving Bernoulli Rank-One Bandits with Unimodal Thompson Sampling

    Authors: Cindy Trinh, Emilie Kaufmann, Claire Vernade, Richard Combes

    Abstract: Stochastic Rank-One Bandits (Katarya et al, (2017a,b)) are a simple framework for regret minimization problems over rank-one matrices of arms. The initially proposed algorithms are proved to have logarithmic regret, but do not match the existing lower bound for this problem. We close this gap by first proving that rank-one bandits are a particular instance of unimodal bandits, and then providing a…

    Submitted 6 December, 2019; originally announced December 2019.

  50. arXiv:1910.10945  [pdf, other]

    cs.LG stat.ML

    Fixed-Confidence Guarantees for Bayesian Best-Arm Identification

    Authors: Xuedong Shang, Rianne de Heide, Emilie Kaufmann, Pierre Ménard, Michal Valko

    Abstract: We investigate and provide new insights on the sampling rule called Top-Two Thompson Sampling (TTTS). In particular, we justify its use for fixed-confidence best-arm identification. We further propose a variant of TTTS called Top-Two Transportation Cost (T3C), which disposes of the computational burden of TTTS. As our main contribution, we provide the first sample complexity analysis of TTTS and T…

    Submitted 28 October, 2019; v1 submitted 24 October, 2019; originally announced October 2019.