Title: Concentration of Stochastic Approximation under Markovian Noise
Speaker: Shubhada Agrawal
Abstract: We establish maximal concentration inequalities for the iterates generated by stochastic approximation algorithms under different noise assumptions. First, we consider bounded Markovian noise. If the noise variance is uniformly bounded (for all states), we show that the error is always sub-Gaussian. However, if the variance of the noise is affine in the norm square of the state, then we observe a phase transition in the tail of the error: when the step size is \(1/k\), the error is sub-Weibull; when the step size is \(1/k^z\) with \(z < 1\), the tail is lighter than any Pareto but heavier than any Weibull. To establish these results, we use a novel Lyapunov function which involves the MGF of the solution of a Poisson equation, and an auxiliary projected algorithm. Second, for the case where the operator is a contraction, we generalize the noise to have an arbitrary distribution. If the noise variance is uniformly bounded, the error has almost the same tail as the noise. If the variance is affine in the norm square of the state, then the error has a significantly heavier tail. To establish these results, we develop a novel black-box approach based on truncation that reduces the problem to the bounded Markovian noise setting.
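As a toy illustration of the type of recursion analyzed in this abstract, the sketch below runs a one-dimensional stochastic approximation iterate with step size \(1/k^z\). The linear operator, the i.i.d. Gaussian noise, and all constants are illustrative assumptions, not the talk's Markovian-noise model.

```python
import numpy as np

def sa_iterates(z=1.0, n_steps=5000, seed=0):
    """Run x_{k+1} = x_k + a_k * ((x* - x_k) + w_k) with step a_k = 1/k^z.

    A synthetic 1-d contraction toward x* = 2.0; i.i.d. Gaussian noise
    stands in for the Markovian noise of the talk (illustration only).
    """
    rng = np.random.default_rng(seed)
    x, x_star = 0.0, 2.0
    for k in range(1, n_steps + 1):
        a_k = 1.0 / k**z
        x = x + a_k * ((x_star - x) + rng.normal(scale=0.5))
    return x

x_poly = sa_iterates(z=0.7)   # step 1/k^z with z < 1
x_inv = sa_iterates(z=1.0)    # step 1/k
```

Both runs converge to the fixed point 2.0; the talk's results concern how heavy the tail of the residual error is under each step-size choice.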
Title: Ballot Design and Electoral Outcomes: The Role of Candidate Order and Party Affiliation
Speaker: Alessandro Arlotto
Abstract: We study how designing ballots with and without party designations impacts electoral outcomes when partisan voters rely on party-order cues to infer candidate affiliation in races without designations. If the party orders of candidates in races with and without party designations differ, these voters might cast their votes incorrectly. We identify a quasi-randomized natural experiment with contest-level treatment assignment pertaining to North Carolina judicial elections and use double machine learning to accurately capture the magnitude of such incorrectly cast votes. Using precinct-level election and demographic data, we estimate that 12.08% (95% confidence interval: 4.95%, 19.20%) of Democratic partisan voters and 13.63% (95% confidence interval: 5.14%, 22.10%) of Republican partisan voters cast their votes incorrectly due to the difference in party orders. Our results indicate that ballots mixing contests with and without party designations mislead many voters, leading to outcomes that do not reflect true voter preferences. To accurately capture voter intent, such ballot designs should be avoided. (Joint work with A. Belloni, F. Fang, and S. Pekec.)
Title: The Side Effects of Adaptive Learning Algorithms
Speaker: Yilun Chen
Abstract: Adaptive learning algorithms, such as stochastic multi-armed bandits, are widely applicable across diverse settings. Yet, their inherent adaptivity can introduce unintended side effects in downstream tasks. In the first part of this talk, I focus on the side effect of bandit algorithms when they are employed by platforms to allocate demand between competing sellers (arms). We show that optimizing classic regret objectives must induce instability. We establish a fundamental trade-off between regret and the variance of allocations. In particular, the product of worst-case regret and worst-case standard deviation of arm pulls must grow as \(\Omega(T^{3/2})\) with horizon length \(T\). This impossibility result shows that algorithms with optimal regret guarantees inevitably generate highly fluctuating allocations, raising concerns about fairness and risk exposure for the participating agents. We then propose a simple class of UCB-type policies to achieve any point on the Pareto frontier of the two objectives. In the second part, I turn to the statistical property of data collected using bandit algorithms. We show that naive Z-statistics constructed from bandit-sampled data (for each arm) suffer from intrinsic bias. For UCB-type algorithms, we precisely characterize the leading asymptotic term of this bias and show that its decay rate is inversely tied to the regret scaling. Consequently, under regret-minimizing algorithms, the bias decreases only at an extremely slow rate and remains significant in practical, finite-sample regimes. This establishes a fundamental tension between regret minimization and the validity of classical statistical inference.
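The bias phenomenon in the second part of this abstract can be seen in a small simulation: under UCB sampling of two Bernoulli arms, the naive per-arm sample mean is negatively biased. The two-arm setup, the UCB1 bonus, and all parameters below are illustrative assumptions, not the talk's exact model or characterization.

```python
import numpy as np

def ucb_arm_mean_bias(n_reps=1000, horizon=200, mu=(0.5, 0.5), seed=0):
    """Estimate the bias of arm 0's naive sample mean under UCB1 sampling.

    Arms are Bernoulli with the given means (a toy setup for illustration).
    Returns the average of (empirical mean of arm 0 - true mean of arm 0)
    over independent replications.
    """
    rng = np.random.default_rng(seed)
    biases = []
    for _ in range(n_reps):
        counts = np.zeros(2)
        sums = np.zeros(2)
        for t in range(horizon):
            if t < 2:
                a = t  # pull each arm once to initialize
            else:
                ucb = sums / counts + np.sqrt(2 * np.log(t + 1) / counts)
                a = int(np.argmax(ucb))
            sums[a] += rng.random() < mu[a]
            counts[a] += 1
        biases.append(sums[0] / counts[0] - mu[0])
    return float(np.mean(biases))
```

Adaptivity undersamples arms whose empirical means happen to be low, so those low draws are never corrected; the returned bias estimate is negative.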
Title: Stopping Self-excitement
Speaker: Andrew Daw
Abstract: From management of service interactions to studies of neural signal propagation, there is a subtle but fundamental challenge that comes with such endogenously spurred activity: when should you say it’s over? If each point has the potential to beget future points, there is an inherent tradeoff between omission and excess: stop too early and you may miss further observations, but stop too late and you’ve wasted too much time. In this paper, we study an optimal stopping problem that aims to balance this omission-excess tradeoff within a cluster of history-driven events. We analyze and prescribe stopping policies for the self-exciting stochastic process through a transformation to the workload process for an M/D/1 queueing system. Then, to connect these model-based insights to practice, we draw upon ideas from the algorithms-with-predictions literature and improve the model-based policies to adapt to contextual information, such as the texts within a service interaction. Finally, we demonstrate the performance of these policies through application to a publicly available linguistics data set.
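A minimal sketch of the self-exciting dynamics in this abstract: an exponential-kernel Hawkes process simulated by Ogata-style thinning. The kernel, baseline rate, and parameter values are illustrative assumptions; the paper's stopping policies and M/D/1 transformation are not reproduced here.

```python
import numpy as np

def simulate_hawkes(mu=0.5, alpha=0.8, beta=1.0, t_max=20.0, seed=1):
    """Simulate a Hawkes process with intensity
    lam(t) = mu + alpha * sum_i exp(-beta * (t - t_i)) by thinning.

    Between events the intensity is decreasing, so its value at the
    current time is a valid upper bound for the thinning step.
    Requires alpha/beta < 1 (subcritical branching) for stability.
    """
    rng = np.random.default_rng(seed)
    events, t = [], 0.0
    while True:
        lam_bar = mu + alpha * sum(np.exp(-beta * (t - ti)) for ti in events)
        t += rng.exponential(1.0 / lam_bar)   # candidate next point
        if t >= t_max:
            break
        lam_t = mu + alpha * sum(np.exp(-beta * (t - ti)) for ti in events)
        if rng.random() < lam_t / lam_bar:    # accept with prob lam_t / lam_bar
            events.append(t)
    return events
```

Each accepted point raises the intensity and so begets further points; deciding when such a cluster is "over" is exactly the omission-excess tradeoff the talk studies.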
Title: Reward-directed Diffusion Models via Continuous-time Reinforcement Learning
Speaker: Xuefeng Gao
Abstract: In this talk, I will discuss a reinforcement learning approach for training continuous-time score-based diffusion models in generative AI to generate samples that maximize reward functions while keeping the generated distributions close to the unknown target data distributions. Unlike most existing studies, our approach does not involve any pretrained model. Theoretical and numerical results will be presented to demonstrate the efficacy of our approach.
Title: Belief-driven Strategic Queueing Models
Speaker: John J. Hasenbein
Abstract: In various service systems, customers' beliefs regarding system parameters may be driven (or derived) from different data sources. Moreover, these data sources are often inaccurate reflections of the true system statistics. In this work, we use classical models with strategic customers to give some insight into such systems' characteristics. In particular, we analyze models in which the system manager has full (and accurate) knowledge of system data, but the customers do not. We compare three cases: all customers' beliefs are driven by a common source, customers have heterogeneous belief sources, and customers have full and accurate knowledge. This in turn gives insight into the system manager's desire to provide additional data (or not) to disabuse customers of their current beliefs.
Title: Learning When to Surge: Online Queue-based Pricing for Delay-sensitive Customers
Speaker: Guiyu Hong
Abstract: Motivated by applications in the sharing economy and online marketplaces, we introduce an online learning method to optimize surge pricing strategies for service systems. Specifically, we propose a non-parametric framework that does not rely on any assumptions about the functional form of a customer's sensitivity to price and delay. Both our algorithmic design and theoretical analysis arise from a detailed examination of the underlying queueing structure. We demonstrate that our online learning algorithm achieves an improved regret order over standard multi-armed bandit algorithms and showcase its efficiency through a set of numerical examples.
Title: Exploring the Operational Potential of Queue-ratio Policies in Multiclass Queueing Networks
Speaker: Lucy Huo
Abstract: Motivated by the recent stability analysis of multiclass queueing networks under queue-ratio policies by Zhao, Gurvich, and Hasenbein (2024), we build on their results to explore the operational implications of this policy class. Under their assumptions, together with a P-matrix reflection matrix, we establish that, in a multi-scale heavy traffic regime, the stationary distribution of the scaled workload converges to a product-form limit with exponentially distributed marginals. Together with state space collapse, the scaled queue lengths are also exponentially distributed in the limit. These results yield a tractable closed-form approximation for steady-state performance evaluation without simulation. Leveraging the closed-form approximations, we investigate their potential as a robust engineering tool for operational optimization, for example, identifying near-optimal policies for objectives such as delay minimization and resource balancing. We present numerical experiments and report our ongoing exploration in this direction, concluding with a discussion of open problems.
Title: Generating Samples from Exponentially Tilted Distributions via Diffusions
Speaker: Sandeep Juneja
Abstract: In many practical settings, given samples from an original distribution, new independent samples are needed from a distribution obtained by suitably exponentially tilting the original one. Applications include finance, where one may want to generate samples from a distribution close to the physical probability measure but one that satisfies experts' views on the markets. Similarly, in climate modelling, samples may be needed that resemble current conditions but satisfy global-warming constraints. Exponentially tilted distributions are fundamental to rare event simulation and large deviations theory. In our work, we obtain samples from exponentially tilted distributions in two steps. First, we reweight the samples of the original distribution appropriately using a tilted empirical estimator. Subsequently, we perform diffusion sampling on the output of the estimator, thereby obtaining many independent samples from the tilted distributions. We delineate regimes where the empirical estimator performs well and where it does not: informally, large tilts cannot be performed with very few samples, and tilting unbounded distributions is harder than tilting bounded ones. We also provide theoretical guarantees on the accuracy of diffusion sampling in these regimes.
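The first step described in this abstract (reweighting samples toward the tilted law) can be sketched with a self-normalized tilted empirical estimator. The Gaussian example below is an illustrative assumption, chosen because the tilted quantity is known in closed form: tilting \(N(0,1)\) by \(\theta\) shifts the mean to \(\theta\).

```python
import numpy as np

def tilted_mean(samples, theta):
    """Self-normalized estimate of E_theta[X], where
    dP_theta(x) ∝ exp(theta * x) dP(x)."""
    w = np.exp(theta * samples)
    w /= w.sum()                     # self-normalized importance weights
    return float(np.sum(w * samples))

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)         # samples from the original N(0,1)
est = tilted_mean(x, 0.5)            # true tilted mean is 0.5
```

As the abstract notes, the quality of such reweighted estimates degrades for large tilts relative to the sample size, since a few large weights then dominate.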
Title: Data-driven Matching for Impatient and Heterogeneous Demand and Supply
Speaker: Weiliang Liu
Abstract: Motivated by two-sided service platforms, we study the problem of matching heterogeneous demand (customers) and supply (workers) that arrive randomly over time. Customers and workers arrive with a randomly sampled patience time, and are lost (renege) if forced to wait longer than that time for a match, creating a fundamental tradeoff between immediate matching and waiting for better pairings. We consider a data-driven setting where the system operator designs matching policies based only on historical data on inter-arrival and patience times, without any knowledge of their ground-truth distributions. Finding an optimal policy is intractable even with full information due to potentially non-Markovian arrivals and reneging. To address this, we formulate a deterministic, data-driven fluid matching problem (DDMP) that serves as an approximation to the original stochastic matching problem when demand and supply rates grow large. We establish finite-sample statistical guarantees on the DDMP based on a novel moderated uniform error bound for estimating the patience time quantile function—results that may be of independent interest. Using an optimal DDMP solution, we propose a discrete-review matching policy and prove that it is \(\epsilon\)-asymptotically optimal (on fluid scale, as demand and supply rates grow large) with high probability. Furthermore, we show that the DDMP can be approximately solved via a mixed-integer linear program; we quantify the resulting approximation gap and analyze its impact on policy performance. Overall, this talk provides a framework for integrating statistical learning with queueing asymptotics to develop effective data-driven policies under unknown model primitives.
Title: Inference Problems in Stochastic Networks
Speaker: Michel Mandjes
Abstract: The bulk of the random graph literature concerns models that are of an inherently static nature, in that features of the random graph at a single point in time are considered. There are strong practical motivations, however, to consider random graphs that are stochastically evolving, so as to model networks' inherent dynamics. I'll start this talk by briefly discussing a set of dynamic random graph mechanisms and their probabilistic properties. Key results cover functional diffusion limits for subgraph counts (describing the behaviour around the mean) and a sample-path large-deviation principle (describing the rare-event behaviour, thus extending the seminal result for the static case developed by Chatterjee and Varadhan). The second part of my talk will be about estimation of the model parameters from partial information. For instance, we demonstrate how the model's underlying parameters can be estimated from just snapshots of the number of edges. We also consider settings in which particles move around on a dynamically evolving random graph, and in which the graph dynamics are inferred from the movements of the particles (i.e., not observing the graph process). In the third part I will consider a static network, its nodes being infinite-server queues. The goal is to devise a way to estimate key parameters, such as the arrival rates, service-time parameters and routing matrix, using observations of the network population vector at Poisson time points. We propose a method-of-moments estimator and establish its consistency. Numerical experiments demonstrate that the method yields accurate estimates, even in settings with a large number of parameters. We present two variants: one that assumes a known parametric form for the service-time distributions, and a model-free version that does not require such assumptions. (Joint work with Peter Braunsteins, Hritika Gupta, Rajat Hazra, Frank den Hollander, Nikolai Krioukov, Liron Ravner, and Jiesen Wang)
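The method-of-moments idea from the third part of this abstract can be illustrated on the simplest possible node: for a single M/M/∞ queue in stationarity, the population is Poisson(\(\lambda/\mu\)), so the first-moment equation gives \(\hat\lambda = \mu \cdot \overline{N}\). This single-queue toy is an assumption for illustration, not the talk's network estimator.

```python
import numpy as np

def estimate_arrival_rate(lam=3.0, mu=1.0, n_obs=2000, seed=0):
    """Toy moment estimator for a single M/M/inf queue.

    In stationarity the population is Poisson(lam / mu); here the
    stationary snapshots are drawn directly from that law for brevity.
    Matching the first moment gives lam_hat = mu * mean(observations).
    """
    rng = np.random.default_rng(seed)
    obs = rng.poisson(lam / mu, size=n_obs)  # stationary population snapshots
    return mu * obs.mean()
```

The talk's estimator extends this matching of empirical and theoretical moments to a network of such queues, with routing and Poisson-sampled observation times.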
Title: Stochastic Approximation Algorithms for Regulating Queues with Strategic Customers
Speaker: Liron Ravner
Abstract: Performance analysis and control of non-Markovian queueing systems often require simulation, as closed-form solutions are rarely available. This challenge becomes even more complex when customers behave strategically—adjusting their decisions (e.g., whether to join) based on expected delays—creating a feedback loop between customer strategy and system performance. In this talk, we present stochastic approximation algorithms tailored to such settings. We demonstrate their efficiency in two canonical regulation problems: (1) social welfare optimization and (2) revenue maximization. Our approach leverages the regenerative structure of the queueing system, enabling asymptotic performance guarantees under mild regularity assumptions.
Title: Optimal Sequential Inference and Decision Making: From Uniform A/B Testing to Gradient-Based Bandits
Speaker: Patrick Rebeschini
Abstract: Sequential data collection and decision making lie at the core of modern statistical learning, from anytime-valid A/B testing in online experiments to adaptive multi-armed bandit problems. This talk presents recent advances in the design and analysis of optimal algorithms for these two settings. In the first part, I will introduce a framework for anytime-valid, variance-adaptive inference in monotonic processes–such as cumulative distribution functions–that builds on the coin-betting paradigm of game-theoretic statistics and integrates PAC-Bayesian principles to yield tight hypothesis tests that are uniform not only in time but also in space. In the second part, I will focus on stochastic gradient bandits, a fundamental policy-gradient approach to online decision making, and present theoretical results showing how the learning rate governs the algorithm’s regret, revealing sharp thresholds that separate logarithmic and polynomial regimes and depend on the (unknown) sub-optimality gap. (Based on joint work with E. Clerico and H. E. Flynn, and with D. Baudry, E. Johnson, S. Vary, and C. Pike-Burke)
Title: The Out-of-sample Prediction Error of the Square-root Lasso and Related Estimators
Speaker: Cynthia Rush
Abstract: We study the classical problem of predicting an outcome variable, \(Y\), using a linear combination of a d-dimensional covariate vector, \(X\). We are interested in linear predictors whose coefficients solve: \(\inf_{\beta} \Big(\mathbb{E}\big[(Y - \langle \beta, X \rangle)^r\big]\Big)^{1/r} + \delta \left\Vert \beta \right\Vert\), where \(r > 1\) and \(\delta > 0\) is a regularization parameter. We provide conditions under which linear predictors based on these estimators minimize the worst-case prediction error over a ball of distributions determined by a type of max-sliced Wasserstein metric. A detailed analysis of the statistical properties of this metric yields a simple recommendation for the choice of regularization parameter. The suggested order of \(\delta\), after a suitable normalization of the covariates, is typically \(\frac{d}{n}\), up to logarithmic factors. Our recommendation is computationally straightforward to implement, pivotal, has provable out-of-sample performance guarantees, and does not rely on sparsity assumptions about the true data generating process. This is joint work with Jose Montiel Olea, Amilcar Velez and Johannes Wiesel.
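For the \(r = 2\), \(\ell_1\)-norm instance of the objective above (the square-root lasso, in its empirical form \(\Vert y - X\beta \Vert_2/\sqrt{n} + \delta \Vert \beta \Vert_1\)), a small proximal-gradient sketch is below. The solver, synthetic data, and parameter choices are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def sqrt_lasso(X, y, delta, n_iter=3000, lr=0.05):
    """ISTA-style solver for min_b ||y - X b||_2 / sqrt(n) + delta * ||b||_1.

    Gradient step on the (smooth away from r = 0) fit term, followed by
    soft-thresholding for the l1 penalty. A toy solver for small problems.
    """
    n, d = X.shape
    b = np.zeros(d)
    for _ in range(n_iter):
        r = y - X @ b
        nr = max(np.linalg.norm(r), 1e-12)
        grad = -X.T @ r / (np.sqrt(n) * nr)          # gradient of fit term
        z = b - lr * grad
        b = np.sign(z) * np.maximum(np.abs(z) - lr * delta, 0.0)  # prox of l1
    return b

rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
beta_true = np.array([1.0, -2.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)
beta_hat = sqrt_lasso(X, y, delta=0.01)
```

A useful property visible in the objective: dividing the residual norm by its own magnitude makes the penalty's effective scale independent of the noise level, which is why the estimator is pivotal.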
Title: Evaluation of Queueing Scheduling and LLM Inference via Stability Conditions and A/B Tests
Speaker: Nian Si
Abstract: The talk examines the evaluation of scheduling policies in both LLM inference and traditional data center settings. In the first part of the talk, we develop a queueing-theoretic framework for LLM inference that jointly accounts for computation and GPU memory constraints from key–value (KV) caching. This framework yields sharp stability and instability conditions, which we validate with real GPU experiments showing predictions accurate within 10%.
In the second part, we study A/B testing of queueing-based scheduling policies in data centers. Standard A/B tests in this setting suffer from Markovian interference. We therefore utilize the Differences-in-Q estimator developed by Farias et al. (2022) and introduce mixed Differences-in-Q estimators grounded in Little's Law. Our methods substantially reduce bias and variance when testing power-of-d policies, with robustness demonstrated through extensive simulations under non-stationary arrivals, heterogeneous servers, and communication delays.
Title: Finite-time Information-theoretic Bounds in Queueing Control
Speaker: Yunbei Xu
Abstract: We establish the first finite-time information-theoretic lower bounds—and derive new policies that achieve them—for the total queue length in scheduling problems over stochastic processing networks with both adversarial and stochastic arrivals. Prior analyses of MaxWeight guarantee only stability and asymptotic optimality in heavy traffic; we prove that, at finite horizons, MaxWeight can incur strictly larger backlog by problem-dependent factors which we identify. Our main innovations are 1) a minimax framework that pinpoints the precise problem parameters governing any policy's finite-time performance; 2) an information-theoretic lower bound on total queue length; 3) a proof of a fundamental limitation of MaxWeight, namely that it is suboptimal in finite time; and 4) a new scheduling rule that minimizes the full Lyapunov drift—including its second-order term—thereby matching the lower bound under certain conditions, up to universal constants. These findings reveal a fundamental limitation of "drift-only" methods and point the way toward principled, non-asymptotic optimality in queueing control.
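A toy instance of the MaxWeight rule discussed in this abstract: one server and three slotted-time queues, with the server always working on the queue maximizing \(\mu_i Q_i\) (here all service rates are equal, so simply the longest queue). The arrival and service parameters are illustrative assumptions, and the talk's improved drift-minimizing policy is not implemented here.

```python
import numpy as np

def maxweight_avg_queue(horizon=20000, p=(0.2, 0.2, 0.2), q_success=0.9,
                        seed=0):
    """Slotted-time simulation of MaxWeight for one server, three queues.

    Each slot: serve the longest queue (success prob q_success), then add
    Bernoulli(p_i) arrivals. Total load 0.6 / 0.9 < 1, so MaxWeight keeps
    the system stable. Returns the time-average total queue length.
    """
    rng = np.random.default_rng(seed)
    q = np.zeros(3)
    total = 0.0
    for _ in range(horizon):
        i = int(np.argmax(q))              # MaxWeight choice
        if q[i] > 0 and rng.random() < q_success:
            q[i] -= 1                      # one service completion
        q += rng.random(3) < p             # Bernoulli arrivals
        total += q.sum()
    return total / horizon
```

MaxWeight stabilizes this system, consistent with the classical guarantees the talk contrasts with its finite-time lower bounds.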
Title: Optimizing Visitor Load at National Parks: Balancing Accessibility and Overcrowding through Online Reservation Systems
Speaker: Galit Yom-Tov
Abstract: The management of national parks (NPs) involves striking a delicate balance between conserving nature and its inhabitants by limiting human access versus promoting awareness of the park’s wonders by allowing access to designated trails and public areas. Many park authorities (PA) restrict access by establishing visiting hours and limiting the number of visitors allowed to enter per day. Such limitations are managed through a reservation system, requiring visitors to reserve a visiting permit before arrival. We analyze reservation system data provided by the PA of Israel, showing cancellation and no-show behaviors that change dynamically over time and depend on the number of days a reservation is made before the visiting day.
We develop a dynamic policy for the number of reservations the system should be allowed to make every day for a specific focal date. The solution depends on the ratio between the cost of not allowing visitors to make a reservation to the park (“blocking cost”) and the park’s overloading cost normalized by the probability of a reservation being realized (due to cancellations or no-shows). The policy takes the form of a two-threshold policy, where the first is a time threshold determining periods where reservations are unrestricted, and the second is a capacity threshold limiting the total number of reservations made over the entire reservation horizon. We simulate the performance of our fluid policy and compare it to several benchmarks, showing that our proposed policy is the only one inducing minimal cost in all scenarios.
We show how our policies can inform NP authorities regarding the number of days in which the reservation system should be opened for reservations before the focal day. Our model can help NP authorities better conserve natural resources and provide access to public spaces in a balanced way.