-
Learning Optimal Admission Control in Partially Observable Queueing Networks
Authors:
Jonatha Anselmi,
Bruno Gaujal,
Louis-Sébastien Rebuffi
Abstract:
We present an efficient reinforcement learning algorithm that learns the optimal admission control policy in a partially observable queueing network. Specifically, only the arrival and departure times from the network are observable, and optimality refers to the average holding/rejection cost over an infinite horizon.
While reinforcement learning in Partially Observable Markov Decision Processes (POMDPs) is prohibitively expensive in general, we show that our algorithm has a regret that only depends sub-linearly on the maximal number of jobs in the network, $S$. In particular, in contrast with existing regret analyses, our regret bound does not depend on the diameter of the underlying Markov Decision Process (MDP), which in most queueing systems is at least exponential in $S$.
The novelty of our approach is to leverage Norton's equivalent theorem for closed product-form queueing networks and an efficient reinforcement learning algorithm for MDPs with the structure of birth-and-death processes.
Submitted 4 August, 2023;
originally announced August 2023.
-
Reinforcement Learning in a Birth and Death Process: Breaking the Dependence on the State Space
Authors:
Jonatha Anselmi,
Bruno Gaujal,
Louis-Sébastien Rebuffi
Abstract:
In this paper, we revisit the regret of undiscounted reinforcement learning in MDPs with a birth and death structure. Specifically, we consider a controlled queue with impatient jobs and the main objective is to optimize a trade-off between energy consumption and user-perceived performance. Within this setting, the \emph{diameter} $D$ of the MDP is $\Omega(S^S)$, where $S$ is the number of states. Therefore, the existing lower and upper bounds on the regret at time $T$, of order $O(\sqrt{DSAT})$ for MDPs with $S$ states and $A$ actions, may suggest that reinforcement learning is inefficient here. In our main result, however, we exploit the structure of our MDPs to show that the regret of a slightly tweaked version of the classical learning algorithm {\sc Ucrl2} is in fact upper bounded by $\tilde{\mathcal{O}}(\sqrt{E_2AT})$, where $E_2$ is related to the weighted second moment of the stationary measure of a reference policy. Importantly, $E_2$ is bounded independently of $S$. Thus, our bound is asymptotically independent of the number of states and of the diameter. This result is based on a careful study of the number of visits performed by the learning algorithm to the states of the MDP, which is highly non-uniform.
Submitted 21 February, 2023;
originally announced February 2023.
-
Real-time X-ray Phase-contrast Imaging Using SPINNet -- A Speckle-based Phase-contrast Imaging Neural Network
Authors:
Zhi Qiao,
Xianbo Shi,
Yudong Yao,
Michael J. Wojcik,
Luca Rebuffi,
Mathew J. Cherukara,
Lahsen Assoufid
Abstract:
X-ray phase-contrast imaging has become indispensable for visualizing samples with low absorption contrast. In this regard, speckle-based techniques have shown significant advantages in spatial resolution, phase sensitivity, and implementation flexibility compared with traditional methods. However, their computational cost has hindered their wider adoption. By exploiting the power of deep learning, we developed a new speckle-based phase-contrast imaging neural network (SPINNet) that boosts the phase retrieval speed by at least two orders of magnitude compared to existing methods. To achieve this performance, we combined SPINNet with a novel coded-mask-based technique, an enhanced version of the speckle-based method. Using this scheme, we demonstrate a simultaneous reconstruction of absorption and phase images on the order of 100 ms, where a traditional correlation-based analysis would take several minutes even with a cluster. In addition to significant improvement in speed, our experimental results show that the imaging resolution and phase retrieval quality of SPINNet outperform existing single-shot speckle-based methods. Furthermore, we successfully demonstrate its application in 3D X-ray phase-contrast tomography. Our result shows that SPINNet could enable many applications requiring high-resolution and fast data acquisition and processing, such as in-situ and in-operando 2D and 3D phase-contrast imaging and real-time at-wavelength metrology and wavefront sensing.
Submitted 18 January, 2022;
originally announced January 2022.
-
A hierarchical approach for modelling X-ray beamlines. Application to a coherent beamline
Authors:
Manuel Sanchez del Rio,
Rafael Celestre,
Mark Glass,
Giovanni Pirro,
Juan Reyes-Herrera,
Ray Barrett,
Julio Cesar da Silva,
Peter Cloetens,
Xianbo Shi,
Luca Rebuffi
Abstract:
We consider different approaches to simulate a modern X-ray beamline. Several methodologies with increasing complexity are applied to discuss the relevant parameters that quantify the beamline performance. Parameters such as flux, dimensions and intensity distribution of the focused beam, and coherence properties are obtained using methods ranging from simple analytical calculations to sophisticated computer simulations using ray-tracing and wave optics techniques. A latest-generation X-ray nanofocusing beamline for coherent applications (ID16A at the ESRF) has been chosen to study in detail the issues related to highly demagnifying synchrotron sources and exploiting the beam coherence. The performance of the beamline is studied for two storage rings: the old ESRF-1 (emittance 4000~pm) and the new ESRF-EBS (emittance 150~pm). In addition to traditional results in terms of flux and beam sizes, an innovative study on the partial coherence properties based on the propagation of coherent modes is presented. The different algorithms and methodologies are implemented in the software suite OASYS. These are discussed with emphasis on the benefits and limitations of each.
Submitted 17 June, 2019;
originally announced June 2019.