Title: Concentration of Stochastic Approximation Under Markovian Noise
Speaker: Shubhada Agrawal
Abstract: We establish maximal concentration inequalities for the iterates generated by stochastic approximation algorithms under different noise assumptions. First, we consider bounded Markovian noise. If the noise variance is uniformly bounded (for all states), we show that the error is always sub-Gaussian. However, if the noise variance is affine in the squared norm of the state, then the tail of the error exhibits a phase transition: with step size \(1/k\), the error is sub-Weibull; with step size \(1/k^z\) for \(z < 1\), the tail is lighter than any Pareto tail but heavier than any Weibull tail. To establish these results, we use a novel Lyapunov function involving the MGF of the solution of a Poisson equation, together with an auxiliary projected algorithm. Second, for the case where the operator is a contraction, we generalize the noise to have an arbitrary distribution. If the noise variance is uniformly bounded, the error has almost the same tail as the noise. If the variance is affine in the squared norm of the state, then the error has a significantly heavier tail. To establish these results, we develop a novel black-box approach based on truncation that reduces the problem to the bounded Markovian noise setting.
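For intuition, the iterates in question have the generic form \(x_{k+1} = x_k + \alpha_k\,(F(x_k) + \xi_k)\), where the noise \(\xi_k\) is driven by a Markov chain. The following toy simulation (a two-state chain with zero-mean stationary noise, a linear operator, and the two step-size schedules from the abstract) is purely illustrative and not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-state Markov chain driving the noise (illustrative choice).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
# Stationary distribution of P is (2/3, 1/3), so these noise values
# have zero mean under stationarity: (2/3)*1.0 + (1/3)*(-2.0) = 0.
noise_levels = np.array([1.0, -2.0])

def run_sa(z, n_steps=20_000, theta_star=0.5):
    """Run x_{k+1} = x_k + a_k * (theta_star - x_k + xi_k) with
    step size a_k = 1/k^z and Markovian noise xi_k."""
    x, s = 0.0, 0
    for k in range(1, n_steps + 1):
        s = rng.choice(2, p=P[s])      # advance the noise chain
        a_k = 1.0 / k**z
        x += a_k * (theta_star - x + noise_levels[s])
    return x
```

Both schedules drive the iterate to the fixed point \(\theta^* = 0.5\); the talk concerns how the *tail* of the error around that point differs between them, which a single sample path cannot show.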
Title: Ballot design and electoral outcomes: The role of candidate order and party affiliation
Speaker: Alessandro Arlotto
Abstract: We study how designing ballots with and without party designations affects electoral outcomes when partisan voters rely on party-order cues to infer candidate affiliation in races without designations. If the party orders of candidates in races with and without party designations differ, these voters may cast their votes incorrectly. We identify a quasi-randomized natural experiment with contest-level treatment assignment in North Carolina judicial elections and use double machine learning to accurately capture the magnitude of such incorrectly cast votes. Using precinct-level election and demographic data, we estimate that 12.08% (95% confidence interval: 4.95%, 19.20%) of Democratic partisan voters and 13.63% (95% confidence interval: 5.14%, 22.10%) of Republican partisan voters cast their votes incorrectly due to the difference in party orders. Our results indicate that ballots mixing contests with and without party designations mislead many voters, leading to outcomes that do not reflect true voter preferences. To accurately capture voter intent, such ballot designs should be avoided. (Joint work with A. Belloni, F. Fang, and S. Pekec.)
Title: TBA.
Speaker: Baris Ata
Abstract: TBA.
Title: TBA.
Speaker: Yilun Chen
Abstract: TBA.
Title: Reward-Directed Diffusion Models via Continuous-Time Reinforcement Learning
Speaker: Xuefeng Gao
Abstract: In this talk, I will discuss a reinforcement learning approach for training continuous-time score-based diffusion models in generative AI to generate samples that maximize reward functions while keeping the generated distributions close to the unknown target data distributions. Unlike most existing studies, our approach does not involve any pretrained model. Theoretical and numerical results will be presented to demonstrate the efficacy of our approach.
Title: Stopping Self-Excitement
Speaker: Andrew Daw
Abstract: From management of service interactions to studies of neural signal propagation, there is a subtle but fundamental challenge that comes with such endogenously spurred activity: when should you say it’s over? If each point has the potential to beget future points, there is an inherent tradeoff between omission and excess: stop too early and you may miss further observations, but stop too late and you’ve wasted too much time. In this paper, we study an optimal stopping problem that aims to balance this omission-excess tradeoff within a cluster of history-driven events. We analyze and prescribe stopping policies for the self-exciting stochastic process through a transformation to the workload process for an M/D/1 queueing system. Then, to connect these model-based insights to practice, we draw upon ideas from the algorithms-with-predictions literature and improve the model-based policies to adapt to contextual information, such as the texts within a service interaction. Finally, we demonstrate the performance of these policies through application to a publicly available linguistics data set.
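The omission-excess tradeoff can be made concrete with a gap-based stopping rule: declare the cluster over once a silence of length \(\tau\) has elapsed. The sketch below simulates a self-exciting cluster as a subcritical branching process and scores such a rule; the parameter names and scoring are illustrative assumptions, not the policies derived in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_cluster(branching=0.6, decay=1.0, max_events=10_000):
    """Sample event times of a self-exciting cluster as a branching
    process: each event spawns Poisson(branching) children at
    Exp(decay) delays (a toy stand-in for a Hawkes cluster)."""
    times, frontier = [0.0], [0.0]
    while frontier and len(times) < max_events:
        t = frontier.pop()
        for _ in range(rng.poisson(branching)):
            c = t + rng.exponential(1.0 / decay)
            times.append(c)
            frontier.append(c)
    return sorted(times)

def stop_after_gap(times, tau):
    """Stop at the first moment a silence of length tau elapses.
    Returns (stop_time, n_missed): a larger stop_time means more
    excess waiting; n_missed counts omitted later events."""
    last = times[0]
    for t in times[1:]:
        if t - last > tau:
            break
        last = t
    stop = last + tau
    missed = sum(1 for t in times if t > stop)
    return stop, missed
```

Sweeping \(\tau\) traces out the tradeoff directly: small \(\tau\) stops early and misses events, large \(\tau\) misses nothing but wastes waiting time.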
Title: Online learning for dynamic pricing with delay-sensitive customers
Speaker: Guiyu Hong
Abstract: Motivated by applications in the sharing economy and online marketplaces, we introduce an online learning method to optimize surge pricing strategies for service systems. Specifically, we propose a non-parametric framework that does not rely on any assumptions about the functional form of a customer's sensitivity to price and delay. Both our algorithmic design and theoretical analysis arise from a detailed examination of the underlying queueing structure. We demonstrate that our online learning algorithm achieves an improved regret order over standard multi-armed bandit algorithms and showcase its efficiency through a set of numerical examples.
Title: Exploring the Operational Potential of Queue-Ratio Policies in Multiclass Queueing Networks
Speaker: Lucy Huo
Abstract: Motivated by the recent stability analysis of multiclass queueing networks under queue-ratio policies by Zhao, Gurvich, and Hasenbein (2024), we build on their results to explore the operational implications of this policy class. Under their assumptions, together with a P-matrix reflection matrix, we establish that, in a multi-scale heavy traffic regime, the stationary distribution of the scaled workload converges to a product-form limit with exponentially distributed marginals. Together with state space collapse, the scaled queue lengths are also exponentially distributed in the limit. These results yield a tractable closed-form approximation for steady-state performance evaluation without simulation. Leveraging the closed-form approximations, we investigate their potential as a robust engineering tool for operational optimization, for example, identifying near-optimal policies for objectives such as delay minimization and resource balancing. We present numerical experiments and report our ongoing exploration in this direction, concluding with a discussion of open problems.
Title: TBA.
Speaker: Sandeep Juneja
Abstract: TBA.
Title: Data-Driven Matching for Impatient and Heterogeneous Demand and Supply
Speaker: Weiliang Liu
Abstract: Motivated by two-sided service platforms, we study the problem of matching heterogeneous demand (customers) and supply (workers) that arrive randomly over time. Customers and workers arrive with a randomly sampled patience time, and are lost (renege) if forced to wait longer than that time for a match, creating a fundamental tradeoff between immediate matching and waiting for better pairings. We consider a data-driven setting where the system operator designs matching policies based only on historical data on inter-arrival and patience times, without any knowledge of their ground-truth distributions. Finding an optimal policy is intractable even with full information due to potentially non-Markovian arrivals and reneging. To address this, we formulate a deterministic, data-driven fluid matching problem (DDMP) that serves as an approximation to the original stochastic matching problem when demand and supply rates grow large. We establish finite-sample statistical guarantees on the DDMP based on a novel moderated uniform error bound for estimating the patience time quantile function—results that may be of independent interest. Using an optimal DDMP solution, we propose a discrete-review matching policy and prove that it is \(\epsilon\)-asymptotically optimal (on fluid scale, as demand and supply rates grow large) with high probability. Furthermore, we show that the DDMP can be approximately solved via a mixed-integer linear program; we quantify the resulting approximation gap and analyze its impact on policy performance. Overall, this talk provides a framework for integrating statistical learning with queueing asymptotics to develop effective data-driven policies under unknown model primitives.
Title: Inference problems in stochastic networks
Speaker: Michel Mandjes
Abstract: The bulk of the random graph literature concerns models that are of an inherently static nature, in that features of the random graph at a single point in time are considered. There are strong practical motivations, however, to consider random graphs that are stochastically evolving, so as to model networks’ inherent dynamics. I’ll start this talk by briefly discussing a set of dynamic random graph mechanisms and their probabilistic properties. Key results cover functional diffusion limits for subgraph counts (describing the behaviour around the mean) and a sample-path large-deviation principle (describing the rare-event behaviour, thus extending the seminal result for the static case developed by Chatterjee and Varadhan). The second part of my talk will be about estimation of the model parameters from partial information. We for instance demonstrate how the model’s underlying parameters can be estimated from just snapshots of the number of edges. We also consider settings in which particles move around on a dynamically evolving random graph, and in which the graph dynamics are inferred from the movements of the particles (i.e., not observing the graph process). In the third part I will consider a static network, its nodes being infinite-server queues. The goal is to devise a way to estimate key parameters, such as the arrival rates, service-time parameters and routing matrix, using observations of the network population vector at Poisson time points. We propose a method-of-moments estimator and establish its consistency. Numerical experiments demonstrate that the method yields accurate estimates, even in settings with a large number of parameters. We present two variants: one that assumes a known parametric form for the service-time distributions, and a model-free version that does not require such assumptions. (Joint work with Peter Braunsteins, Hritika Gupta, Rajat Hazra, Frank den Hollander, Nikolai Krioukov, Liron Ravner, and Jiesen Wang)
Title: Stochastic Approximation Algorithms for Regulating Queues with Strategic Customers
Speaker: Liron Ravner
Abstract: Performance analysis and control of non-Markovian queueing systems often require simulation, as closed-form solutions are rarely available. This challenge becomes even more complex when customers behave strategically—adjusting their decisions (e.g., whether to join) based on expected delays—creating a feedback loop between customer strategy and system performance. In this talk, we present stochastic approximation algorithms tailored to such settings. We demonstrate their efficiency in two canonical regulation problems: (1) social welfare optimization and (2) revenue maximization. Our approach leverages the regenerative structure of the queueing system, enabling asymptotic performance guarantees under mild regularity assumptions.
Title: Does Stochastic Gradient really succeed for bandits?
Speaker: Patrick Rebeschin
Abstract: Stochastic Gradient Bandit (SGB) is a policy-gradient algorithm for the multi-armed bandit problem that has recently received significant attention due to its simplicity and its role as a fundamental primitive in reinforcement learning. Under certain a priori assumptions about the algorithm’s dynamics, recent work has shown that using a constant learning rate guarantees asymptotic convergence to the optimal policy, and that sufficiently small learning rates can lead to logarithmic regret. However, it remains unclear whether such regret guarantees extend beyond small learning rates. In contrast to much of the existing literature—which primarily adopts an optimization-focused perspective and often relies on additional regularization—we study unregularized SGB from the standpoint of regret minimization under no additional assumptions. We characterize the regret behavior of SGB as a function of its learning rate. For two-armed bandits, we identify a sharp threshold, dependent on the sub-optimality gap, below which SGB achieves logarithmic regret on all instances, and above which it may suffer polynomial regret on certain instances. For general K-armed bandits, we show that the learning rate must scale inversely with K to avoid polynomial regret. Our analysis introduces novel techniques for deriving regret upper bounds for SGB, providing foundational insights for the development of future gradient-based algorithms in bandit settings. (Based on joint work with Dorian Baudry, Emmeran Johnson, Simon Vary, and Ciara Pike-Burke)
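For concreteness, one standard form of the stochastic gradient bandit update maintains arm preferences \(h\), plays from the softmax policy \(\pi = \mathrm{softmax}(h)\), and ascends the stochastic policy gradient \(r\,(e_a - \pi)\). The sketch below uses this unregularized, baseline-free form with illustrative arm means and learning rate; whether it matches the talk's exact variant is an assumption:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(h):
    z = np.exp(h - h.max())  # shift for numerical stability
    return z / z.sum()

def sgb(means, eta, horizon):
    """Unregularized stochastic gradient bandit (no baseline):
    after pulling arm a with reward r, update h += eta * r * (e_a - pi).
    Returns cumulative (pseudo-)regret against the best arm.
    Toy sketch, not the talk's analysis."""
    h = np.zeros(len(means))
    regret, best = 0.0, max(means)
    for _ in range(horizon):
        pi = softmax(h)
        a = rng.choice(len(means), p=pi)
        r = rng.normal(means[a], 0.1)       # noisy reward
        grad = -r * pi                      # gradient of log-softmax ...
        grad[a] += r                        # ... times reward: r * (e_a - pi)
        h += eta * grad
        regret += best - means[a]
    return regret
```

Re-running the sketch across a grid of `eta` values is a quick way to see the learning-rate sensitivity the abstract describes: small rates learn reliably, while aggressive rates can commit to the wrong arm on unlucky runs.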
Title: Experimentation for Different Scheduling Policies on Queues: Mixed Differences-in-Q Estimators Based on Little’s Law
Speaker: Nian Si
Abstract: In data centers, tasks are dispatched to various servers to evenly distribute the workload. When a data center considers implementing a new scheduling algorithm, it typically conducts an A/B test prior to deployment to assess the real-world impact of the new method. However, a straightforward A/B test may suffer from so-called “Markovian” interference. We utilize the Differences-in-Q estimator developed by Farias et al. (2022) and introduce mixed Differences-in-Q estimators grounded in Little’s Law. We show that our A/B testing methods significantly reduce bias and variance when testing various power-of-\(d\) policies. Extensive simulations were conducted under scenarios such as non-stationary arrival rates, heterogeneous service rates, and communication delays; these simulations highlight the robustness and efficacy of our A/B testing approach. This is joint work with Nanshan Jia and Zeyu Zheng from UC Berkeley and Ramesh Johari from Stanford.
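The Little's Law identity underpinning the mixed estimators, \(L = \lambda W\) (mean number in system equals arrival rate times mean sojourn time), can be checked on a toy M/M/1 queue via the Lindley recursion. This is illustrative only; the talk's estimators and the power-of-\(d\) dispatching setting are more involved:

```python
import numpy as np

rng = np.random.default_rng(3)

def mm1_sojourn(lam=0.7, mu=1.0, n_customers=100_000):
    """Estimate the mean sojourn time W of an FCFS M/M/1 queue.
    Each customer starts service when both it has arrived and the
    server is free (Lindley recursion)."""
    t = 0.0            # arrival time of the current customer
    free_at = 0.0      # time at which the server next becomes free
    total_sojourn = 0.0
    for _ in range(n_customers):
        t += rng.exponential(1.0 / lam)          # Poisson arrivals
        start = max(t, free_at)
        free_at = start + rng.exponential(1.0 / mu)  # Exp service
        total_sojourn += free_at - t
    return total_sojourn / n_customers

W = mm1_sojourn()   # theory: W = 1/(mu - lam) = 1/0.3
L = 0.7 * W         # Little's law: L = lam * W; theory: lam/(mu - lam)
```

For M/M/1 the closed forms \(W = 1/(\mu-\lambda)\) and \(L = \lambda/(\mu-\lambda)\) give an independent check on both sides of the identity.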
Title: TBA.
Speaker: Alexandre Thiéry
Abstract: TBA.
Title: Finite-Time Information-Theoretic Bounds in Queueing Control
Speaker: Yunbei Xu
Abstract: We establish the first finite-time information-theoretic lower bounds—and derive new policies that achieve them—for the total queue length in scheduling problems over stochastic processing networks with both adversarial and stochastic arrivals. Prior analyses of MaxWeight guarantee only stability and asymptotic optimality in heavy traffic; we prove that, at finite horizons, MaxWeight can incur strictly larger backlog by problem-dependent factors, which we identify. Our main innovations are: 1) a minimax framework that pinpoints the precise problem parameters governing any policy’s finite-time performance; 2) an information-theoretic lower bound on the total queue length; 3) a proof that MaxWeight is fundamentally limited, in that it is suboptimal in finite time; and 4) a new scheduling rule that minimizes the full Lyapunov drift, including its second-order term, thereby matching the lower bound under certain conditions up to universal constants. These findings reveal a fundamental limitation of “drift-only” methods and point the way toward principled, non-asymptotic optimality in queueing control.
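As a baseline for the comparison, classic MaxWeight for a single server attending N parallel queues picks, in each slot, the queue maximizing \(q_i \mu_i\). A minimal Bernoulli-slot sketch of that baseline (illustrative parameters; the talk's drift-minimizing rule and lower bounds are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(4)

def maxweight(arrival_probs, service_probs, horizon=10_000):
    """Single server, N queues, slotted time. Each slot: Bernoulli
    arrivals to every queue, then serve the queue maximizing
    q_i * mu_i (classic MaxWeight); service succeeds w.p. mu_i.
    Returns the time-average total queue length."""
    q = np.zeros(len(arrival_probs), dtype=int)
    total = 0
    for _ in range(horizon):
        q += rng.random(len(q)) < arrival_probs   # Bernoulli arrivals
        i = int(np.argmax(q * np.asarray(service_probs)))
        if q[i] > 0 and rng.random() < service_probs[i]:
            q[i] -= 1                              # successful service
        total += q.sum()
    return total / horizon
```

The abstract's point is that this drift-only rule, while stable whenever stabilization is possible, can be beaten at finite horizons by rules that also account for the second-order drift term.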