Title: Concentration of Stochastic Approximation Under Markovian Noise
Speaker: Shubhada Agrawal
Abstract: We establish maximal concentration inequalities for the iterates generated by stochastic approximation algorithms under different noise assumptions. First, we consider bounded Markovian noise. If the noise variance is uniformly bounded (for all states), we show that the error is always sub-Gaussian. However, if the noise variance is affine in the squared norm of the state, then the tail of the error exhibits a phase transition: with step size \(1/k\), the error is sub-Weibull; with step size \(1/k^z\) for \(z < 1\), the tail is lighter than any Pareto tail but heavier than any Weibull tail. To establish these results, we use a novel Lyapunov function involving the MGF of the solution of a Poisson equation, together with an auxiliary projected algorithm. Second, for the case where the operator is a contraction, we generalize the noise to have an arbitrary distribution. If the noise variance is uniformly bounded, the error has almost the same tail as the noise. If the variance is affine in the squared norm of the state, then the error has a significantly heavier tail. To establish these results, we develop a novel black-box approach based on truncation that reduces the problem to the bounded Markovian noise setting.
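For intuition, the iterates in question have the generic form \(x_{k+1} = x_k + \alpha_k\,(F(x_k) + \xi_k)\), where the noise \(\xi_k\) is driven by a Markov chain. The following toy simulation (a two-state chain with zero-mean stationary noise, a linear operator, and the two step-size schedules from the abstract) is purely illustrative and not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-state Markov chain driving the noise (illustrative choice).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
# Stationary distribution of P is (2/3, 1/3), so these noise values
# have zero mean under stationarity: (2/3)*1.0 + (1/3)*(-2.0) = 0.
noise_levels = np.array([1.0, -2.0])

def run_sa(z, n_steps=20_000, theta_star=0.5):
    """Run x_{k+1} = x_k + a_k * (theta_star - x_k + xi_k) with
    step size a_k = 1/k^z and Markovian noise xi_k."""
    x, s = 0.0, 0
    for k in range(1, n_steps + 1):
        s = rng.choice(2, p=P[s])      # advance the noise chain
        a_k = 1.0 / k**z
        x += a_k * (theta_star - x + noise_levels[s])
    return x
```

Both schedules drive the iterate to the fixed point \(\theta^* = 0.5\); the talk concerns how the *tail* of the error around that point differs between them, which a single sample path cannot show.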
Title: Ballot design and electoral outcomes: The role of candidate order and party affiliation
Speaker: Alessandro Arlotto
Abstract: We study how designing ballots with and without party designations affects electoral outcomes when partisan voters rely on party-order cues to infer candidate affiliation in races without designations. If the party orders of candidates in races with and without party designations differ, these voters may cast their votes incorrectly. We identify a quasi-randomized natural experiment with contest-level treatment assignment in North Carolina judicial elections and use double machine learning to accurately capture the magnitude of such incorrectly cast votes. Using precinct-level election and demographic data, we estimate that 12.08% (95% confidence interval: 4.95%, 19.20%) of Democratic partisan voters and 13.63% (95% confidence interval: 5.14%, 22.10%) of Republican partisan voters cast their votes incorrectly due to the difference in party orders. Our results indicate that ballots mixing contests with and without party designations mislead many voters, leading to outcomes that do not reflect true voter preferences. To accurately capture voter intent, such ballot designs should be avoided. (Joint work with A. Belloni, F. Fang, and S. Pekec.)
Title: TBA.
Speaker: Baris Ata
Abstract: TBA.
Title: TBA.
Speaker: Yilun Chen
Abstract: TBA.
Title: Reward-Directed Diffusion Models via Continuous-Time Reinforcement Learning
Speaker: Xuefeng Gao
Abstract: In this talk, I will discuss a reinforcement learning approach for training continuous-time score-based diffusion models in generative AI to generate samples that maximize reward functions while keeping the generated distributions close to the unknown target data distributions. Unlike most existing studies, our approach does not involve any pretrained model. Theoretical and numerical results will be presented to demonstrate the efficacy of our approach.
Title: Stopping Self-Excitement
Speaker: Andrew Daw
Abstract: From management of service interactions to studies of neural signal propagation, there is a subtle but fundamental challenge that comes with such endogenously spurred activity: when should you say it’s over? If each point has the potential to beget future points, there is an inherent tradeoff between omission and excess: stop too early and you may miss further observations, but stop too late and you’ve wasted too much time. In this paper, we study an optimal stopping problem that aims to balance this omission-excess tradeoff within a cluster of history-driven events. We analyze and prescribe stopping policies for the self-exciting stochastic process through a transformation to the workload process for an M/D/1 queueing system. Then, to connect these model-based insights to practice, we draw upon ideas from the algorithms-with-predictions literature and improve the model-based policies to adapt to contextual information, such as the texts within a service interaction. Finally, we demonstrate the performance of these policies through application to a publicly available linguistics data set.
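The omission-excess tradeoff can be made concrete with a gap-based stopping rule: declare the cluster over once a silence of length \(\tau\) has elapsed. The sketch below simulates a self-exciting cluster as a subcritical branching process and scores such a rule; the parameter names and scoring are illustrative assumptions, not the policies derived in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_cluster(branching=0.6, decay=1.0, max_events=10_000):
    """Sample event times of a self-exciting cluster as a branching
    process: each event spawns Poisson(branching) children at
    Exp(decay) delays (a toy stand-in for a Hawkes cluster)."""
    times, frontier = [0.0], [0.0]
    while frontier and len(times) < max_events:
        t = frontier.pop()
        for _ in range(rng.poisson(branching)):
            c = t + rng.exponential(1.0 / decay)
            times.append(c)
            frontier.append(c)
    return sorted(times)

def stop_after_gap(times, tau):
    """Stop at the first moment a silence of length tau elapses.
    Returns (stop_time, n_missed): a larger stop_time means more
    excess waiting; n_missed counts omitted later events."""
    last = times[0]
    for t in times[1:]:
        if t - last > tau:
            break
        last = t
    stop = last + tau
    missed = sum(1 for t in times if t > stop)
    return stop, missed
```

Sweeping \(\tau\) traces out the tradeoff directly: small \(\tau\) stops early and misses events, large \(\tau\) misses nothing but wastes waiting time.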
Title: Online learning for dynamic pricing with delay-sensitive customers
Speaker: Guiyu Hong
Abstract: Motivated by applications in the sharing economy and online marketplaces, we introduce an online learning method to optimize surge pricing strategies for service systems. Specifically, we propose a non-parametric framework that does not rely on any assumptions about the functional form of a customer's sensitivity to price and delay. Both our algorithmic design and theoretical analysis arise from a detailed examination of the underlying queueing structure. We demonstrate that our online learning algorithm achieves an improved regret order over standard multi-armed bandit algorithms and showcase its efficiency through a set of numerical examples.
Title: Exploring the Operational Potential of Queue-Ratio Policies in Multiclass Queueing Networks
Speaker: Lucy Huo
Abstract: Motivated by the recent stability analysis of multiclass queueing networks under queue-ratio policies by Zhao, Gurvich, and Hasenbein (2024), we build on their results to explore the operational implications of this policy class. Under their assumptions, together with a P-matrix reflection matrix, we establish that, in a multi-scale heavy traffic regime, the stationary distribution of the scaled workload converges to a product-form limit with exponentially distributed marginals. Together with state space collapse, the scaled queue lengths are also exponentially distributed in the limit. These results yield a tractable closed-form approximation for steady-state performance evaluation without simulation. Leveraging the closed-form approximations, we investigate their potential as a robust engineering tool for operational optimization, for example, identifying near-optimal policies for objectives such as delay minimization and resource balancing. We present numerical experiments and report our ongoing exploration in this direction, concluding with a discussion of open problems.
Title: TBA.
Speaker: Sandeep Juneja
Abstract: TBA.
Title: Data-Driven Matching for Impatient and Heterogeneous Demand and Supply
Speaker: Weiliang Liu
Abstract: Motivated by two-sided service platforms, we study the problem of matching heterogeneous demand (customers) and supply (workers) that arrive randomly over time. Customers and workers arrive with a randomly sampled patience time, and are lost (renege) if forced to wait longer than that time for a match, creating a fundamental tradeoff between immediate matching and waiting for better pairings. We consider a data-driven setting where the system operator designs matching policies based only on historical data on inter-arrival and patience times, without any knowledge of their ground-truth distributions. Finding an optimal policy is intractable even with full information due to potentially non-Markovian arrivals and reneging. To address this, we formulate a deterministic, data-driven fluid matching problem (DDMP) that serves as an approximation to the original stochastic matching problem when demand and supply rates grow large. We establish finite-sample statistical guarantees on the DDMP based on a novel moderated uniform error bound for estimating the patience time quantile function—results that may be of independent interest. Using an optimal DDMP solution, we propose a discrete-review matching policy and prove that it is \(\epsilon\)-asymptotically optimal (on fluid scale, as demand and supply rates grow large) with high probability. Furthermore, we show that the DDMP can be approximately solved via a mixed-integer linear program; we quantify the resulting approximation gap and analyze its impact on policy performance. Overall, this talk provides a framework for integrating statistical learning with queueing asymptotics to develop effective data-driven policies under unknown model primitives.
Title: Inference problems in stochastic networks
Speaker: Michel Mandjes
Abstract: The bulk of the random graph literature concerns models that are of an inherently static nature, in that features of the random graph at a single point in time are considered. There are strong practical motivations, however, to consider random graphs that are stochastically evolving, so as to model networks’ inherent dynamics. I’ll start this talk by briefly discussing a set of dynamic random graph mechanisms and their probabilistic properties. Key results cover functional diffusion limits for subgraph counts (describing the behaviour around the mean) and a sample-path large-deviation principle (describing the rare-event behaviour, thus extending the seminal result for the static case developed by Chatterjee and Varadhan). The second part of my talk will be about estimation of the model parameters from partial information. We for instance demonstrate how the model’s underlying parameters can be estimated from just snapshots of the number of edges. We also consider settings in which particles move around on a dynamically evolving random graph, and in which the graph dynamics are inferred from the movements of the particles (i.e., not observing the graph process). In the third part I will consider a static network, its nodes being infinite-server queues. The goal is to devise a way to estimate key parameters, such as the arrival rates, service-time parameters and routing matrix, using observations of the network population vector at Poisson time points. We propose a method-of-moments estimator and establish its consistency. Numerical experiments demonstrate that the method yields accurate estimates, even in settings with a large number of parameters. We present two variants: one that assumes a known parametric form for the service-time distributions, and a model-free version that does not require such assumptions. (Joint work with Peter Braunsteins, Hritika Gupta, Rajat Hazra, Frank den Hollander, Nikolai Krioukov, Liron Ravner, and Jiesen Wang)
Title: Stochastic Approximation Algorithms for Regulating Queues with Strategic Customers
Speaker: Liron Ravner
Abstract: Performance analysis and control of non-Markovian queueing systems often require simulation, as closed-form solutions are rarely available. This challenge becomes even more complex when customers behave strategically—adjusting their decisions (e.g., whether to join) based on expected delays—creating a feedback loop between customer strategy and system performance. In this talk, we present stochastic approximation algorithms tailored to such settings. We demonstrate their efficiency in two canonical regulation problems: (1) social welfare optimization and (2) revenue maximization. Our approach leverages the regenerative structure of the queueing system, enabling asymptotic performance guarantees under mild regularity assumptions.
Title: Does Stochastic Gradient really succeed for bandits?
Speaker: Patrick Rebeschin
Abstract: Stochastic Gradient Bandit (SGB) is a policy-gradient algorithm for the multi-armed bandit problem that has recently received significant attention due to its simplicity and its role as a fundamental primitive in reinforcement learning. Under certain a priori assumptions about the algorithm’s dynamics, recent work has shown that using a constant learning rate guarantees asymptotic convergence to the optimal policy, and that sufficiently small learning rates can lead to logarithmic regret. However, it remains unclear whether such regret guarantees extend beyond small learning rates. In contrast to much of the existing literature—which primarily adopts an optimization-focused perspective and often relies on additional regularization—we study unregularized SGB from the standpoint of regret minimization under no additional assumptions. We characterize the regret behavior of SGB as a function of its learning rate. For two-armed bandits, we identify a sharp threshold, dependent on the sub-optimality gap, below which SGB achieves logarithmic regret on all instances, and above which it may suffer polynomial regret on certain instances. For general K-armed bandits, we show that the learning rate must scale inversely with K to avoid polynomial regret. Our analysis introduces novel techniques for deriving regret upper bounds for SGB, providing foundational insights for the development of future gradient-based algorithms in bandit settings. (Based on joint work with Dorian Baudry, Emmeran Johnson, Simon Vary, and Ciara Pike-Burke)
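For concreteness, one standard form of the stochastic gradient bandit update maintains arm preferences \(h\), plays from the softmax policy \(\pi = \mathrm{softmax}(h)\), and ascends the stochastic policy gradient \(r\,(e_a - \pi)\). The sketch below uses this unregularized, baseline-free form with illustrative arm means and learning rate; whether it matches the talk's exact variant is an assumption:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(h):
    z = np.exp(h - h.max())  # shift for numerical stability
    return z / z.sum()

def sgb(means, eta, horizon):
    """Unregularized stochastic gradient bandit (no baseline):
    after pulling arm a with reward r, update h += eta * r * (e_a - pi).
    Returns cumulative (pseudo-)regret against the best arm.
    Toy sketch, not the talk's analysis."""
    h = np.zeros(len(means))
    regret, best = 0.0, max(means)
    for _ in range(horizon):
        pi = softmax(h)
        a = rng.choice(len(means), p=pi)
        r = rng.normal(means[a], 0.1)       # noisy reward
        grad = -r * pi                      # gradient of log-softmax ...
        grad[a] += r                        # ... times reward: r * (e_a - pi)
        h += eta * grad
        regret += best - means[a]
    return regret
```

Re-running the sketch across a grid of `eta` values is a quick way to see the learning-rate sensitivity the abstract describes: small rates learn reliably, while aggressive rates can commit to the wrong arm on unlucky runs.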
Title: Experimentation for Different Scheduling Policies on Queues: Mixed Differences-in-Q Estimators Based on Little’s Law
Speaker: Nian Si
Abstract: In data centers, tasks are dispatched to various servers to evenly distribute the workload. When a data center considers implementing a new scheduling algorithm, it typically conducts an A/B test prior to deployment to assess the real-world impact of the new method. However, a straightforward A/B test may suffer from so-called “Markovian” interference. We utilize the Differences-in-Q estimator developed by Farias et al. (2022) and introduce mixed Differences-in-Q estimators grounded in Little’s Law. We show that our A/B testing methods significantly reduce bias and variance when testing various power-of-\(d\) policies. Extensive simulations were conducted under scenarios such as non-stationary arrival rates, heterogeneous service rates, and communication delays; these simulations highlight the robustness and efficacy of our A/B testing approach. This is joint work with Nanshan Jia and Zeyu Zheng from UC Berkeley and Ramesh Johari from Stanford.
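The Little's Law identity underpinning the mixed estimators, \(L = \lambda W\) (mean number in system equals arrival rate times mean sojourn time), can be checked on a toy M/M/1 queue via the Lindley recursion. This is illustrative only; the talk's estimators and the power-of-\(d\) dispatching setting are more involved:

```python
import numpy as np

rng = np.random.default_rng(3)

def mm1_sojourn(lam=0.7, mu=1.0, n_customers=100_000):
    """Estimate the mean sojourn time W of an FCFS M/M/1 queue.
    Each customer starts service when both it has arrived and the
    server is free (Lindley recursion)."""
    t = 0.0            # arrival time of the current customer
    free_at = 0.0      # time at which the server next becomes free
    total_sojourn = 0.0
    for _ in range(n_customers):
        t += rng.exponential(1.0 / lam)          # Poisson arrivals
        start = max(t, free_at)
        free_at = start + rng.exponential(1.0 / mu)  # Exp service
        total_sojourn += free_at - t
    return total_sojourn / n_customers

W = mm1_sojourn()   # theory: W = 1/(mu - lam) = 1/0.3
L = 0.7 * W         # Little's law: L = lam * W; theory: lam/(mu - lam)
```

For M/M/1 the closed forms \(W = 1/(\mu-\lambda)\) and \(L = \lambda/(\mu-\lambda)\) give an independent check on both sides of the identity.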
Title: TBA.
Speaker: Alexandre Thiéry
Abstract: TBA.
Title: Finite-Time Information-Theoretic Bounds in Queueing Control
Speaker: Yunbei Xu
Abstract: We establish the first finite-time information-theoretic lower bounds—and derive new policies that achieve them—for the total queue length in scheduling problems over stochastic processing networks with both adversarial and stochastic arrivals. Prior analyses of MaxWeight guarantee only stability and asymptotic optimality in heavy traffic; we prove that, at finite horizons, MaxWeight can incur strictly larger backlog by problem-dependent factors, which we identify. Our main innovations are: 1) a minimax framework that pinpoints the precise problem parameters governing any policy’s finite-time performance; 2) an information-theoretic lower bound on the total queue length; 3) a proof that MaxWeight is fundamentally limited, in that it is suboptimal in finite time; and 4) a new scheduling rule that minimizes the full Lyapunov drift, including its second-order term, thereby matching the lower bound under certain conditions up to universal constants. These findings reveal a fundamental limitation of “drift-only” methods and point the way toward principled, non-asymptotic optimality in queueing control.
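As a baseline for the comparison, classic MaxWeight for a single server attending N parallel queues picks, in each slot, the queue maximizing \(q_i \mu_i\). A minimal Bernoulli-slot sketch of that baseline (illustrative parameters; the talk's drift-minimizing rule and lower bounds are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(4)

def maxweight(arrival_probs, service_probs, horizon=10_000):
    """Single server, N queues, slotted time. Each slot: Bernoulli
    arrivals to every queue, then serve the queue maximizing
    q_i * mu_i (classic MaxWeight); service succeeds w.p. mu_i.
    Returns the time-average total queue length."""
    q = np.zeros(len(arrival_probs), dtype=int)
    total = 0
    for _ in range(horizon):
        q += rng.random(len(q)) < arrival_probs   # Bernoulli arrivals
        i = int(np.argmax(q * np.asarray(service_probs)))
        if q[i] > 0 and rng.random() < service_probs[i]:
            q[i] -= 1                              # successful service
        total += q.sum()
    return total / horizon
```

The abstract's point is that this drift-only rule, while stable whenever stabilization is possible, can be beaten at finite horizons by rules that also account for the second-order drift term.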