OT+MFG reading 1: Variational Mean Field Games

 

(29 May 2023)

This is the first of what I hope will be a complete sequence of posts written as I read about Optimal Transport and/or Mean Field Games. I will not try to give anything like a complete picture or historical introduction here, but will instead discuss and summarize papers in the order that I read them, which will likely result in a very nonlinear chronology. For now I will say that I have fond memories as a graduate student circa 2006-2010, when each year for a few weeks in February Pierre-Louis Lions would come to Austin and lecture on Mean Field Games. At the time I barely knew any PDE and could only follow the first couple of minutes of each lecture, but those lectures were the first place where I heard about hydrodynamic limits, Nash equilibria, and infinite dimensional Hamilton-Jacobi equations. Maybe as I work my way back to those papers my memory will get refreshed and I will be able to add more to these recollections.

I am going to start my reading with the paper Variational Mean Field Games by Benamou, Carlier, and Santambrogio in Active Particles, 2017.

The basic problem

The motivating question behind mean field games is understanding $N$-player differential games when $N$ is large, i.e. in the limit $N\to \infty$.

Let us first describe the $N$-player game. For each $i=1,\ldots,N$, player $i$ chooses their trajectory $x_i(t)$ by optimizing (minimizing) the objective functional

\[x_i \mapsto \int_0^T \frac{1}{2}|\dot x_i(t)|^2 + g_i(x_1(t),\ldots,x_N(t)) \;dt + \Psi_i(x_i(T))\]

This objective functional covers the time interval $[0,T]$ and is made out of three parts. First, there is a ``kinetic energy'' term. Second, there is the integral over time of the quantity $g_i(x_1(t),\ldots,x_N(t))$, which is how the different players interact with each other. Lastly, there is a term $\Psi_i$, a contribution to the objective functional depending only on the position of player $i$ at the final time $T$.

The interaction terms $g_1,\ldots,g_N$ are chosen so as to model a key feature of the game: for each player, the other $N-1$ players are indistinguishable from one another. This means that each $g_i$ is unchanged if one reshuffles the players other than $i$, and that this symmetry takes the same form for every $i$. Indeed, in such a case we can express all the $g_i$ in the form

\[g_i(x_1,\ldots,x_N) = g(x_i, \frac{1}{N-1}\sum \limits_{j\neq i}\delta_{x_j})\]

where $g(x,\mu)$ is a real valued function that depends on a point $x \in \mathbb{R}^d$ and a probability distribution $\mu \in \mathcal{P}(\mathbb{R}^d)$, so $g:\mathbb{R}^d\times \mathcal{P}(\mathbb{R}^d)\to\mathbb{R}$.
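For concreteness, here is a typical example of such a coupling (for illustration only, not taken from the paper), built from a confining potential $V$ and a pairwise interaction potential $W$:

\[g(x,\mu) = V(x) + \int_{\mathbb{R}^d} W(x-y)\;d\mu(y)\]

Plugging in $\mu = \frac{1}{N-1}\sum_{j\neq i}\delta_{x_j}$ recovers $g_i(x_1,\ldots,x_N) = V(x_i) + \frac{1}{N-1}\sum_{j\neq i} W(x_i - x_j)$, which is plainly invariant under permutations of the players other than $i$.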

With this setup for each $N$, one wants to understand Nash equilibria for the game when $N$ is large. As the players are indistinguishable from one another, one cares about the overall distribution of players in such equilibria as $N \to \infty$, and so one comes to the problem of analyzing the $N\to \infty$ limit of the time-dependent probability measures

\[\mu^{(N)}_t := \frac{1}{N}\sum \limits_{i=1}^N \delta_{x_i^{N,*}(t)}\]

where, for each $N$, the trajectories $x_1^{N,*},\ldots,x_N^{N,*}$ form a Nash equilibrium for the corresponding game.
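To fix ideas, recall what a Nash equilibrium means in this setting: no player can lower their own cost by unilaterally changing their trajectory while the others keep theirs, that is, for every $i$

\[x_i^{N,*} \in \operatorname{argmin}_{x_i}\; \int_0^T \frac{1}{2}|\dot x_i(t)|^2 + g_i\big(x_1^{N,*}(t),\ldots,x_i(t),\ldots,x_N^{N,*}(t)\big)\;dt + \Psi_i(x_i(T))\]

where the minimization is over player $i$'s own trajectory only, the trajectories of all the other players being held fixed.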

The continuum model, part 1

The question of the convergence of the Nash equilibria as $N\to \infty$ is an important and delicate one that will not be discussed here. Instead, let us describe the problem one expects to obtain in the limit; by this we mean a problem whose unique solution $\mu_t$ is, in an ideal world, the limit of the measures $\mu^{(N)}_t$ given by the Nash equilibria.

Suppose we are given such an evolution of probability measures $\mu_t$. What does it mean for a single agent to have an optimal trajectory relative to this evolution? It means that its trajectory $\gamma(t)$ minimizes the functional

\[J_{\mu}(\gamma) := \int_0^T\frac{1}{2}|\dot \gamma(t)|^2 + g(\gamma(t),\mu_t)\;dt + \Psi(\gamma(T))\]

over all $\gamma$’s with given initial value $\gamma(0)$. Any minimizer of this functional has a simple characterization in terms of the value function,

\[\dot \gamma(t) = -(\nabla \phi)(\gamma(t),t)\]

where the value function $\phi$ is defined as

\[\phi(x,t) = \inf \left \{ \int_{t}^T\frac{1}{2}|\dot\gamma(s)|^2+g(\gamma(s),\mu_s)\;ds + \Psi(\gamma(T)) \mid \gamma(t) = x \right \}\]

This characterization, at least as stated, works only as long as the value function is differentiable. For a differentiable value function $\phi$, it is a classical fact that it must solve the Hamilton-Jacobi equation

\[-\partial_t \phi + \frac{1}{2}|\nabla \phi|^2 = h(x,t)\]

where the function $h$ is defined by $h(x,t) := g(x,\mu_t)$ for every $(x,t)$. In general the value function might not be differentiable but it will solve the equation above in the viscosity sense.
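A quick way to see, at least formally, where this equation comes from is the dynamic programming principle: assuming $\phi$ is smooth, over a short time interval $[t,t+\varepsilon]$ the agent moves with some constant velocity $v$ and then behaves optimally, so

\[\phi(x,t) = \inf_{v}\left\{ \Big(\frac{1}{2}|v|^2 + h(x,t)\Big)\varepsilon + \phi(x+\varepsilon v,\, t+\varepsilon)\right\} + o(\varepsilon)\]

Expanding $\phi(x+\varepsilon v, t+\varepsilon) = \phi(x,t) + \varepsilon\,\partial_t\phi(x,t) + \varepsilon\, v\cdot\nabla\phi(x,t) + o(\varepsilon)$, cancelling $\phi(x,t)$, dividing by $\varepsilon$, and letting $\varepsilon\to 0$ gives

\[0 = \partial_t \phi + h(x,t) + \inf_v \left\{ \frac{1}{2}|v|^2 + v\cdot \nabla\phi \right\}\]

The infimum is attained at $v = -\nabla\phi$ (which is exactly the optimal dynamics stated above) and equals $-\frac{1}{2}|\nabla\phi|^2$, and rearranging gives the Hamilton-Jacobi equation.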

The continuum model, part 2

Whatever the limiting problem as $N\to \infty$, we expect the curve of empirical probability measures $\mu^{(N)}_t$ just constructed to converge to a curve of probability measures $\mu_t$. In an equilibrium situation, each of the particles that make up $\mu^{(N)}_t$ will be moving according to the Euler-Lagrange equation, that is, their dynamics are governed by a flow (from the previous discussion we expect this flow to be generated by $v(x,t) = -\nabla\phi(x,t)$ for a $\phi$ solving a Hamilton-Jacobi equation).

Therefore, in the limit model one expects to have a curve of measures $\mu_t$ having the form

\[\mu_t = (\Phi_t)_{\#}\mu_0\]

where $\Phi_t:\mathbb{R}^d \to \mathbb{R}^d$ is the flow map generated by some vector field $v(x,t)$, meaning $\frac{d}{dt}\Phi_t(x) = v(\Phi_t(x),t)$ and $\Phi_0 = \mathrm{id}$. In such a case $\mu_t$ and $v$ will solve the continuity equation

\[\partial_t \mu_t + \text{div}(\mu_t v) = 0\]
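This is straightforward to verify, at least formally: for a smooth compactly supported test function $\psi$, the definition of the pushforward and the flow equation $\frac{d}{dt}\Phi_t(x) = v(\Phi_t(x),t)$ give

\[\frac{d}{dt}\int \psi\;d\mu_t = \frac{d}{dt}\int \psi(\Phi_t(x))\;d\mu_0(x) = \int \nabla\psi(\Phi_t(x))\cdot v(\Phi_t(x),t)\;d\mu_0(x) = \int \nabla\psi \cdot v\;d\mu_t\]

which is precisely the weak (distributional) form of the continuity equation.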

Combining this equation with the condition $v= -\nabla \phi$ from earlier, we arrive at the following system of equations

\[\left \{ \begin{array}{rl} \partial_t \mu_t - \text{div}(\mu_t \nabla \phi) & = 0 \\ -\partial_t \phi + \frac{1}{2}|\nabla \phi|^2 & = g(x,\mu_t)\end{array} \right.\]

together with the boundary conditions

\[\phi(x,T) = \Psi(x),\; (\mu_t)_{\mid t=0} = \mu_0\]

These equations and boundary conditions form what is known as a mean field game, and we call the above the Mean Field Game (MFG) equations. Note the forward-backward structure: the continuity equation is posed forward in time from the initial density $\mu_0$, while the Hamilton-Jacobi equation is posed backward in time from the terminal condition $\Psi$. A solution is a pair $(\mu_t,\phi)$ describing an equilibrium situation for the game. Ideally, one expects (and would like to show) that the limit of the measures $\mu^{(N)}_t$ gives a $\mu_t$ which, together with some $\phi$, forms a solution to the MFG equations.

A variational principle

Lasry and Lions showed that the MFG system enjoys a remarkable simplification in comparison to the finite-$N$ case: one can characterize equilibria of the MFG as minimizers of a variational problem.

The problem is the following (here one restricts to measures of the form $\mu_t = \rho(x,t)\;dx$ and assumes the coupling is local, that is, $g(x,\mu_t)$ depends on $\mu_t$ only through the value of its density at $x$, so that we may write $g(x,\rho(x,t))$).

\[\begin{array}{rl} \text{Minimize } & (\rho,v)\mapsto \int_0^T\int \frac{1}{2}\rho(x,t) |v(x,t)|^2 + G(x,\rho(x,t))\;dxdt + \int_{\mathbb{R}^d}\Psi(x)\rho(x,T)\;dx \\ \text{subject to } & \partial_t\rho + \text{div}(\rho v) = 0 \text{ and } \rho(x,0) = \rho_0(x) \end{array}\]

Here, $G:\mathbb{R}^d\times \mathbb{R} \to \mathbb{R}\cup\{+\infty\}$ is the function defined by $\partial_\beta G(\alpha,\beta) = g(\alpha,\beta)$ for $\beta>0$, $G(\alpha,0) = 0$, and $G(\alpha,\beta) = +\infty$ for $\beta<0$.
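To see, at least formally, why minimizers of this problem produce solutions of the MFG system (sweeping all regularity and positivity issues under the rug), introduce a Lagrange multiplier $\phi(x,t)$ for the continuity equation constraint and consider

\[\mathcal{L}(\rho,v,\phi) := \int_0^T\int \frac{1}{2}\rho|v|^2 + G(x,\rho)\;dxdt + \int_{\mathbb{R}^d}\Psi(x)\rho(x,T)\;dx - \int_0^T\int \phi\,\big(\partial_t\rho + \text{div}(\rho v)\big)\;dxdt\]

After integrating by parts in the last term, the variation in $v$ gives $\rho(v+\nabla\phi)=0$, i.e. $v = -\nabla\phi$ wherever $\rho>0$; the variation in $\rho$ gives $\frac{1}{2}|v|^2 + g(x,\rho) + \partial_t\phi + v\cdot\nabla\phi = 0$, which after substituting $v=-\nabla\phi$ becomes the Hamilton-Jacobi equation $-\partial_t\phi + \frac{1}{2}|\nabla\phi|^2 = g(x,\rho)$; and the variation in the terminal density $\rho(\cdot,T)$ gives $\phi(x,T)=\Psi(x)$. Stationarity in $\phi$ simply returns the continuity equation, so the multiplier $\phi$ plays exactly the role of the value function and the full MFG system is recovered.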

For those familiar with optimal transport, this resembles the Benamou-Brenier formulation of the optimal transport problem with quadratic cost; the main difference is the additional terms (the integrals involving $G$ and $\Psi$). Accordingly, the fields of optimal transport and mean field games are closely related, and we will be exploring this connection in future posts.
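For reference, and to make the comparison concrete, recall the Benamou-Brenier formula: the squared quadratic Wasserstein distance between two densities $\rho_0$ and $\rho_1$ is obtained by minimizing exactly the kinetic term above,

\[\frac{1}{2}W_2^2(\rho_0,\rho_1) = \min_{(\rho,v)} \left\{ \int_0^1\int \frac{1}{2}\rho(x,t)|v(x,t)|^2\;dxdt \;\mid\; \partial_t\rho + \text{div}(\rho v) = 0,\ \rho(\cdot,0)=\rho_0,\ \rho(\cdot,1)=\rho_1 \right\}\]

Compared to this, the variational MFG problem adds the coupling term $G$ to the running cost and, instead of prescribing the terminal density, charges the terminal cost $\int \Psi(x)\rho(x,T)\;dx$.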