Stochastic optimal control with random coefficients and associated stochastic Hamilton–Jacobi–Bellman equations – Advances in Continuous and Discrete Models

By Jun Moon

Jan 14, 2022

In this section, we consider the stochastic optimal control problem with random coefficients. The objective functional is of the recursive type, captured by a backward stochastic differential equation (BSDE) with random coefficients. We prove the dynamic programming principle (DPP), the continuity property of the value function, and the verification theorem. We also consider the indefinite linear–quadratic (LQ) problem as an application of the verification theorem.

Problem statement

The stochastic differential equation (SDE) is given by

$$\begin{aligned} \begin{cases} \mathrm{d}x_{s}^{t,a;u} = f(s,x_{s}^{t,a;u},u_{s})\,\mathrm{d}s + \sigma (s,x_{s}^{t,a;u},u_{s}) \,\mathrm{d}B_{s}, \\ x_{t}^{t,a;u} = a, \end{cases} \end{aligned}$$

(5)

where f and σ are the coefficients in (1) and (3). Note that \((x_{s}^{t,a;u})_{s \in [t,T]}\) is the \(\mathbb{R}^{n}\)-valued (forward) state process with the initial condition \(x_{t}^{t,a;u} = a\), and \((u_{s})_{s \in [t,T]}\) is the U-valued control process with the control space U. The space of admissible controls is defined by \(\mathcal{U}_{t,T} := \mathcal{L}^{2}_{\mathcal{F}}(U)\).
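As a rough illustration of how a single path of (5) could be simulated, the following Python sketch applies the Euler–Maruyama scheme on a uniform time grid. It is not taken from the paper: the function name `euler_maruyama`, the feedback form `control(s, x)` of the control, and the example coefficients in the usage note are assumptions made for illustration, and the dependence of f and σ on ω is not modeled beyond the Brownian increments.

```python
import numpy as np

def euler_maruyama(f, sigma, control, a, t, T, n_steps, rng):
    """Simulate one path of dx = f(s,x,u) ds + sigma(s,x,u) dB on [t, T] with x_t = a.

    f(s, x, u) returns an array of shape (n,), sigma(s, x, u) an array of shape (n, r),
    and control(s, x) is a hypothetical feedback rule standing in for the control process.
    """
    dt = (T - t) / n_steps
    times = t + dt * np.arange(n_steps + 1)
    x = np.empty((n_steps + 1,) + np.shape(a))
    x[0] = a
    for k in range(n_steps):
        s = times[k]
        u = control(s, x[k])
        sig = sigma(s, x[k], u)
        # Brownian increment over [s, s + dt], one component per column of sigma
        dB = rng.normal(0.0, np.sqrt(dt), size=sig.shape[1])
        x[k + 1] = x[k] + f(s, x[k], u) * dt + sig @ dB
    return times, x
```

For instance, calling it with `f = lambda s, x, u: -x`, a zero diffusion matrix, and `a = np.array([1.0])` on `[0, 1]` reproduces the deterministic decay towards `exp(-1)` up to the discretization error.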

We introduce the backward SDE (BSDE) given by

$$\begin{aligned} \begin{cases} \mathrm{d}y_{s}^{t,a;u} = -l(s,x_{s}^{t,a;u},u_{s}, y_{s}^{t,a;u}, z_{s}^{t,a;u}) \,\mathrm{d}s + z_{s}^{t,a;u} \,\mathrm{d}B_{s}, \\ y_{T}^{t,a;u} = m(x_{T}^{t,a;u}), \end{cases} \end{aligned}$$

(6)

where l and m are the coefficients in (1) and (3). The pair \((y_{s}^{t,a;u},z_{s}^{t,a;u})_{s \in [t,T]}\) is the \((\mathbb{R},\mathbb{R}^{1 \times r})\)-valued backward process, and \(y_{T}^{t,a;u} = m(x_{T}^{t,a;u})\) is the \(\mathcal{F}_{T}\)-measurable terminal condition. As stated in (2) and (3), \(f: \Omega \times [0,T] \times \mathbb{R}^{n} \times U \rightarrow \mathbb{R}^{n}\), \(\sigma : \Omega \times [0,T] \times \mathbb{R}^{n} \times U \rightarrow \mathbb{R}^{n \times r}\), \(l: \Omega \times [0,T] \times \mathbb{R}^{n} \times U \times \mathbb{R} \times \mathbb{R}^{1 \times r} \rightarrow \mathbb{R}\), and \(m:\Omega \times \mathbb{R}^{n} \rightarrow \mathbb{R}\) are the random coefficients of (5) and (6), where the control space U is a nonempty compact subset of \(\mathbb{R}^{m}\). Note that (5) and (6) constitute a forward–backward SDE with random coefficients, where the BSDE in (6) is coupled with the forward SDE in (5).

The assumptions for (5) and (6) are given as follows:

(H.1) For \(\zeta = f,\sigma\), ζ is \(\mathbb{P} \times \mathcal{B}(\mathbb{R}^{n}) \times \mathcal{B}(U)\)-measurable, where \(\mathcal{B}(\cdot)\) is the Borel σ-algebra. For almost all \(\omega \in \Omega\), ζ is (uniformly) continuous in \((s,u) \in [0,T] \times U\) and Lipschitz continuous in \(x \in \mathbb{R}^{n}\) with the Lipschitz constant L.

(H.2) l and m are \(\mathbb{P} \times \mathcal{B}(\mathbb{R}^{n}) \times \mathcal{B}(U) \times \mathcal{B}(\mathbb{R}) \times \mathcal{B}(\mathbb{R}^{1 \times r})\)-measurable and \(\mathbb{P} \times \mathcal{B}(\mathbb{R}^{n})\)-measurable, respectively. For almost all \(\omega \in \Omega\), l is (uniformly) continuous in \((s,u) \in [0,T] \times U\) and Lipschitz continuous in \((x,y,z) \in \mathbb{R}^{n} \times \mathbb{R} \times \mathbb{R}^{1 \times r}\) with the Lipschitz constant L. For almost all \(\omega \in \Omega\), m is Lipschitz continuous in \(x \in \mathbb{R}^{n}\) with the Lipschitz constant L.

Remark 1

We should mention that in (5) and (6), the coefficients f, σ, l and m are allowed to be random; they are only required to be measurable with respect to \(\omega \in \Omega\). In particular, unlike path-dependent stochastic control problems and differential games, there are no specific assumptions on the coefficients with respect to \(\omega \in \Omega\), and no topology is specified on Ω.

We have the following lemma. The proof can be found in [18, Chaps. 1 and 7] and [13, Chaps. 3, 4 and 8].

Lemma 1

Assume that (H.1) and (H.2) hold. Then, for \(t \in [0,T]\), \(s,l \in [t,T]\) with \(l \leq s\), \(u \in \mathcal{U}_{t,T}\), and \(a,a^{\prime} \in L^{2}(\Omega,\mathcal{F}_{t};\mathbb{R}^{n})\), the following results hold:

(i) (5) admits a unique (strong) solution in \(\mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}^{n})\). Moreover, \((x_{s}^{t,a;u})_{s \in [l,T]} = (x_{s}^{l,x_{l}^{t,a;u};u})_{s \in [l,T]}\), and for \(p \geq 1\) there exists a constant \(C>0\), dependent on L, T and p, such that (\(\mathbb{P}\)-almost surely (a.s.))

$$\begin{aligned} &\mathbb{E}_{\mathcal{F}_{t}} \Bigl[\max_{s \in [t,T]} \lvert x_{s}^{t,a;u} \rvert^{p} \Bigr] \leq C \bigl(1+ \lvert a \rvert^{p} \bigr), \\ &\mathbb{E}_{\mathcal{F}_{t}} \bigl[ \lvert x_{s}^{t,a;u} - x_{l}^{t,a;u} \rvert^{p} \bigr] \leq C \bigl(1+ \lvert a \rvert^{p} \bigr) (s-l)^{\frac{p}{2}}, \\ &\mathbb{E}_{\mathcal{F}_{t}} \Bigl[\max_{s \in [t,T]} \lvert x_{s}^{t,a;u} - x_{s}^{t,a^{\prime};u} \rvert^{p} \Bigr] \leq C \lvert a-a^{\prime} \rvert^{p}; \end{aligned}$$

(ii) (6) admits a unique solution \((y_{s}^{t,a;u},z_{s}^{t,a;u})_{s \in [t,T]} \in \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{R}^{1 \times r})\). Furthermore, for \(p \geq 2\), there exists a constant \(C>0\), dependent on L, p and T, such that (\(\mathbb{P}\)-a.s.)

$$\begin{aligned} &\mathbb{E}_{\mathcal{F}_{t}} \biggl[\max_{s \in [t,T]} \lvert y_{s}^{t,a;u} \rvert^{p} + \biggl( \int_{t}^{T} \lvert z_{s}^{t,a;u} \rvert^{2} \,\mathrm{d}s \biggr)^{\frac{p}{2}} \biggr] \leq C \bigl(1 + \lvert a \rvert^{p} \bigr), \\ &\mathbb{E}_{\mathcal{F}_{t}} \bigl[ \lvert y_{s}^{t,a;u} - y_{t}^{t,a;u} \rvert^{p} \bigr] \leq C \bigl(1+ \lvert a \rvert^{p} \bigr) (s-t)^{\frac{p}{2}}, \\ &\mathbb{E}_{\mathcal{F}_{t}} \Bigl[\max_{s \in [t,T]} \lvert y_{s}^{t,a;u} - y_{s}^{t,a^{\prime};u} \rvert^{p} \Bigr] \leq C \lvert a - a^{\prime} \rvert^{p}; \end{aligned}$$

(iii) Suppose that \((\tilde{y}_{s}^{t,a;u}, \tilde{z}_{s}^{t,a;u})_{s \in [t,T]} \in \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{R}^{1 \times r})\) is the solution of (6) with the perturbed terminal condition \(\tilde{y}_{T}^{t,a;u} = m(x_{T}^{t,a;u}) + \epsilon\), where \(\epsilon > 0\). Then, there exists a constant \(C > 0\), dependent on L and T, such that \(\mathbb{E}_{\mathcal{F}_{t}}[\max_{s \in [t,T]} |y_{s}^{t,a;u} - \tilde{y}_{s}^{t,a;u}|^{2}] < C \epsilon\). Assume that \((\widehat{y}_{s}^{t,a;u}, \widehat{z}_{s}^{t,a;u})_{s \in [t,T]} \in \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{R}^{1 \times r})\) is the solution of (6) with \(\widehat{l}\) and \(\widehat{m}\) in place of l and m, where \(l \geq \widehat{l}\) and \(m \geq \widehat{m}\), \(\mathbb{P}\)-a.s. Then, \(y_{s}^{t,a;u} \geq \widehat{y}_{s}^{t,a;u}\) for \(s \in [t,T]\), \(\mathbb{P}\)-a.s.

The objective functional is of the recursive type and is given by

$$\begin{aligned} J(t,a;u) = y_{t}^{t,a;u} = \mathbb{E}_{\mathcal{F}_{t}} \bigl[y_{t}^{t,a;u} \bigr]. \end{aligned}$$

(7)

Then, the stochastic optimal control problem considered in this paper can be stated as follows:

$$\begin{aligned} \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,T}} J(t,a;u),\quad \text{subject to (3)}. \end{aligned}$$

(P)

Remark 2

When l in (6) does not depend on y and z, the objective functional J in (7) can be simplified as follows:

$$\begin{aligned} J(t,a;u) = \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int_{t}^{T} l \bigl(s,x_{s}^{t,a;u},u_{s} \bigr) \,\mathrm{d}s + m \bigl(x_{T}^{t,a;u} \bigr) \biggr]. \end{aligned}$$

This is a special case of (P), which was considered in [1, 11].
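To make the simplified cost in Remark 2 concrete, the sketch below estimates it by plain Monte Carlo in a toy scalar case. Everything specific here is an assumption for illustration and not from the paper: the dynamics are \(\mathrm{d}x = \sigma\,\mathrm{d}B\) (zero control), the costs are \(l = x^{2}\) and \(m = x^{2}\), the initial state a is deterministic (so the conditional expectation is an ordinary expectation), and `mc_cost` is a hypothetical helper name. In this toy case the exact value is \(a^{2}\tau + \sigma^{2}\tau^{2}/2 + a^{2} + \sigma^{2}\tau\) with \(\tau = T - t\), which gives a reference for the estimate.

```python
import numpy as np

def mc_cost(a, sigma, tau, n_steps, n_paths, seed=0):
    """Monte Carlo estimate of E[ int_t^{t+tau} x_s^2 ds + x_{t+tau}^2 ] for dx = sigma dB, x_t = a."""
    rng = np.random.default_rng(seed)
    dt = tau / n_steps
    x = np.full(n_paths, a, dtype=float)
    running = np.zeros(n_paths)
    for _ in range(n_steps):
        running += x**2 * dt                                       # left-endpoint rule for the running cost
        x += sigma * rng.normal(0.0, np.sqrt(dt), size=n_paths)    # Euler step (exact for this SDE)
    return float(np.mean(running + x**2))

def exact_cost(a, sigma, tau):
    """Closed-form value of the same toy cost, using E[x_s^2] = a^2 + sigma^2 (s - t)."""
    return a**2 * tau + 0.5 * sigma**2 * tau**2 + a**2 + sigma**2 * tau
```

The estimate and the closed form should agree up to sampling and time-discretization error; with a genuinely recursive l (depending on y and z) no such direct averaging is available and the BSDE in (6) has to be solved instead.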

For \(t \in [0,T]\) and \(a \in L^{2}(\Omega,\mathcal{F}_{t};\mathbb{R}^{n})\), the value function of (P) is defined by

$$\begin{aligned} V(t,a) = \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,T}} J(t,a;u),\quad \mathbb{P}\text{-a.s.} \end{aligned}$$

(8)

Note that from Lemma 1, (P) is well posed; hence, the value function in (8) is well defined. If the coefficients in (5) and (6) do not depend on \(\omega \in \Omega\), then the problem above corresponds to stochastic optimal control with deterministic coefficients, which has been studied from various angles in the literature; see [17, 18, 20] and the references therein. Unlike the case of deterministic coefficients, the value function in (8) is a random field.

Remark 3

We mention that the stochastic optimal control framework is chosen to broaden its potential applications. Specifically, stochastic control problems arise in finance, economics, science, and engineering. These applications can be studied from different angles using the approach of this paper, which captures more practical situations, including the general dynamic behavior of the objective functional and random parameter variations due to imprecision (see the detailed discussion in Sect. 1).

Dynamic programming principle and verification theorem

This subsection provides the continuity property of (8). We show that (8) satisfies the DPP, which is the recursive-type value iteration algorithm to solve (P). Then, we prove the verification theorem for (P).

We first state the following result due to Lemma 1:

Lemma 2

Assume that (H.1) and (H.2) hold. Then, there exists a constant \(C>0\) such that for \(a,a^{\prime} \in \mathbb{R}^{n}\),

$$\begin{aligned} \bigl\lvert V(t,a) - V(t,a^{\prime}) \bigr\rvert \leq C \lvert a-a^{\prime} \rvert,\qquad \bigl\lvert V(t,a) \bigr\rvert \leq C \bigl(1+ \lvert a \rvert \bigr),\quad \mathbb{P}\text{-a.s.} \end{aligned}$$

The backward semigroup operator associated with the BSDE is defined as follows: for \(t,t+\tau \in [0,T]\) with \(t < t+\tau\),

$$\begin{aligned} \Phi_{s,t+\tau}^{t,a;u}[b] := \bar{y}_{s}^{t,a;u},\quad s \in [t,t+\tau], \end{aligned}$$

(9)

where \((\bar{y}_{s}^{t,a;u},\bar{z}_{s}^{t,a;u})_{s \in [t,t+\tau]}\) is the solution of the following BSDE on \([t,t+\tau]\):

$$\begin{aligned} &\mathrm{d}\bar{y}_{s}^{t,a;u} = -l \bigl(s,x_{s}^{t,a;u},u_{s}, \bar{y}_{s}^{t,a;u}, \bar{z}_{s}^{t,a;u} \bigr)\,\mathrm{d}s + \bar{z}_{s}^{t,a;u} \,\mathrm{d}B_{s}, \\ &\bar{y}_{t+\tau}^{t,a;u} = b. \end{aligned}$$

Here, \(b \in L^{2}(\Omega,\mathcal{F}_{t+\tau};\mathbb{R})\). Clearly, when \(b=y_{t+\tau}^{t,a;u}\) (note that \(y_{t+\tau}^{t,a;u} \in L^{2}(\Omega,\mathcal{F}_{t+\tau};\mathbb{R})\)), we have \(y_{t}^{t,a;u} = \bar{y}_{t}^{t,a;u} = \Phi_{t,t+\tau}^{t,a;u}[y_{t+\tau}^{t,a;u}]\), \(\mathbb{P}\)-a.s.
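The composition property behind the backward semigroup can be seen in a degenerate toy case. The sketch below assumes, purely for illustration, that the driver depends only on (s, y) and that the terminal value b is deterministic; then z vanishes and the BSDE reduces to the ordinary differential equation \(\mathrm{d}y/\mathrm{d}s = -l(s,y)\) with \(y(t+\tau) = b\). The function `phi` and the driver used in the usage note are hypothetical and are not the paper's operator in its full generality.

```python
import math

def phi(t0, t1, b, l, dt=1e-3):
    """Explicit Euler, run backward in time, for dy/ds = -l(s, y) with y(t1) = b; returns y(t0).

    This mimics Phi_{t0,t1}[b] in the degenerate case z = 0 (deterministic data).
    """
    n = round((t1 - t0) / dt)
    y = b
    for k in range(n):
        s = t1 - k * dt
        # one step from s to s - dt: y(s - dt) ~ y(s) + l(s, y(s)) * dt
        y = y + l(s, y) * dt
    return y

def driver(s, y):
    """Hypothetical linear driver l(s, y) = 1 - 0.5 * y."""
    return 1.0 - 0.5 * y
```

With this driver and b = 0 on [0, 1], `phi(0.0, 1.0, 0.0, driver)` approximates the closed-form value `2 * (1 - math.exp(-0.5))`, and evaluating in two stages, `phi(0.0, 0.4, phi(0.4, 1.0, 0.0, driver), driver)`, gives the same number, which is the time-consistency that Remark 4 relies on.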

Remark 4

By (9) and (i) of Lemma 1, the objective functional in (7) can be rewritten as follows:

$$\begin{aligned} J(t,a;u) = \Phi_{t,T}^{t,a;u} \bigl[m \bigl(x_{T}^{t,a;u} \bigr) \bigr] = \Phi_{t,t+\tau}^{t,a;u} \bigl[y_{t+\tau}^{t,a;u} \bigr] = \Phi_{t,t+\tau}^{t,a;u} \bigl[J \bigl(t+\tau,x_{t+\tau}^{t,a;u};u \bigr) \bigr]. \end{aligned}$$

We now state the DPP for (P).

Theorem 1

Suppose that (H.1) and (H.2) hold. Then, the value function in (8) satisfies the following dynamic programming principle (DPP): for \(t,t+\tau \in [0,T]\) with \(t < t+\tau\) and \(a \in L^{2}(\Omega,\mathcal{F}_{t};\mathbb{R}^{n})\),

$$\begin{aligned} V(t,a) = \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,t+\tau}} \Phi_{t,t+\tau}^{t,a;u} \bigl[ V \bigl(t+\tau, x_{t+\tau}^{t,a;u} \bigr) \bigr], \quad \mathbb{P}\text{-a.s.} \end{aligned}$$

Proof

Note that in view of Lemma 1, the FBSDE in (5) and (6) admits a unique solution \((x_{s}^{t,a;u}, y_{s}^{t,a;u},z_{s}^{t,a;u})_{s \in [t,T]} \in \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}^{n}) \times \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{R}^{1 \times r})\).

Let

$$\begin{aligned} V^{\prime}(t,a) := \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,t+\tau}} \Phi_{t,t+\tau}^{t,a;u} \bigl[ V \bigl(t+\tau, x_{t+\tau}^{t,a;u} \bigr) \bigr], \quad \mathbb{P}\text{-a.s.} \end{aligned}$$

We show that \(V^{\prime}(t,a) \leq V(t,a)\) and \(V^{\prime}(t,a) \geq V(t,a)\).

First, note from (7) and Remark 4 that

$$\begin{aligned} V(t,a) & = \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,T}} \Phi_{t,t+\tau}^{t,a;u} \bigl[J \bigl(t+\tau,x_{t+\tau}^{t,a;u};u \bigr) \bigr] \\ & \geq \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,t+\tau}} \Phi_{t,t+\tau}^{t,a;u} \bigl[V \bigl(t+\tau,x_{t+\tau}^{t,a;u} \bigr) \bigr] = V^{\prime}(t,a), \end{aligned}$$

where the inequality follows from (8) and (iii) of Lemma 1. This implies that \(V(t,a) \geq V^{\prime}(t,a)\).

We now prove \(V(t,a) \leq V^{\prime}(t,a)\). By Lemma 2 and (ii) of Lemma 1, for each \(\epsilon > 0\), there exists \(\delta > 0\) such that whenever \(|x - \hat{x}| < \delta\), it holds that for all \(u \in \mathcal{U}_{t+\tau,T}\),

$$\begin{aligned} \bigl\lvert V(t+\tau,x) - V(t+\tau,\hat{x}) \bigr\rvert + \bigl\lvert J(t+\tau,x;u) - J(t+\tau,\hat{x};u) \bigr\rvert < \epsilon. \end{aligned}$$

(10)

Let \(\{D_{j}\}_{j \geq 1}\) be a (disjoint) Borel partition of \(\mathbb{R}^{n}\) with diameter less than δ, i.e., \(\operatorname{diam}(D_{j}) < \delta\). Equivalently, each \(D_{j}\) is Borel measurable, i.e., \(D_{j} \in \mathcal{B}(\mathbb{R}^{n})\), with \(\bigcup_{j \geq 1} D_{j} = \mathbb{R}^{n}\) and \(D_{j} \cap D_{l} = \emptyset\) for \(j \neq l\). By definition, for \(x,\hat{x} \in D_{j}\), we have \(|x-\hat{x}|< \delta\). For each j, choose \(x^{(j)} \in D_{j}\). Then, by the measurable selection theorem in [11, Theorem A.1] (see also [60, 61]), there exists \(u^{(j)} \in \mathcal{U}_{t+\tau,T}\) such that \(J(t+\tau,x^{(j)};u^{(j)}) \leq V(t+\tau,x^{(j)}) +\epsilon\). Hence, by (10), for any \(x \in D_{j}\),

$$\begin{aligned} & J \bigl(t+\tau,x;u^{(j)} \bigr) - V(t+\tau,x) \\ &\quad \leq \bigl\lvert J \bigl(t+\tau,x;u^{(j)} \bigr) - J \bigl(t+\tau,x^{(j)};u^{(j)} \bigr) \bigr\rvert \\ &\qquad + \bigl\lvert J \bigl(t+\tau,x^{(j)};u^{(j)} \bigr) - V \bigl(t+\tau,x^{(j)} \bigr) \bigr\rvert + \bigl\lvert V \bigl(t+\tau,x^{(j)} \bigr) - V(t+\tau,x) \bigr\rvert \leq 3 \epsilon. \end{aligned}$$

(11)

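For any \(u^{\prime\prime} \in \mathcal{U}_{t,t+\tau}\), we define \(\tilde{u}_{s} := u^{\prime\prime}_{s} \mathbb{1}_{\{s \in [t,t+\tau)\}} + \sum_{j \geq 1} u^{(j)}_{s} \mathbb{1}_{\{x_{t+\tau}^{t,a;u^{\prime\prime}} \in D_{j}\}} \mathbb{1}_{\{s \in [t+\tau,T]\}}\), where \(\mathbb{1}\) is the indicator function. Clearly, \(\tilde{u} \in \mathcal{U}_{t,T}\). Let \(u^{\prime} := \sum_{j \geq 1} u^{(j)} \mathbb{1}_{\{x_{t+\tau}^{t,a;u^{\prime\prime}} \in D_{j}\}}\). Then, by Remark 4,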

$$\begin{aligned} V(t,a) &\leq J(t,a;\tilde{u}) \\ & = \Phi_{t,t+\tau}^{t,a;u^{\prime\prime}} \bigl[J \bigl(t+\tau, x_{t+\tau}^{t,a;u^{\prime\prime}}; u^{\prime} \bigr) \bigr] \leq \Phi_{t,t+\tau}^{t,a;u^{\prime\prime}} \bigl[V \bigl(t+\tau,x_{t+\tau}^{t,a;u^{\prime\prime}} \bigr) \bigr] + 3\epsilon, \end{aligned}$$

(12)

where the second inequality is due to (11) and (iii) of Lemma 1. Since ϵ is arbitrary, taking the essential infimum over \(u^{\prime\prime} \in \mathcal{U}_{t,t+\tau}\) in (12) and using the definition of \(V^{\prime}\) yields \(V(t,a) \leq V^{\prime}(t,a)\). This shows that \(V(t,a) = V^{\prime}(t,a)\), which completes the proof. □
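The recursion expressed by the DPP can be checked by hand in a much simpler setting. The following sketch is a deterministic, discrete-time, finite-control analogue (an assumption made for illustration, not the continuous-time random-coefficient problem of Theorem 1): it compares the value obtained by enumerating all control sequences with the value obtained by backward recursion. The names `U`, `N`, `step`, `running`, and `terminal` are hypothetical.

```python
from functools import lru_cache
from itertools import product

U = (-1, 0, 1)   # hypothetical finite control set
N = 5            # hypothetical number of time steps

def step(x, u):
    """Toy dynamics x_{k+1} = x_k + u_k."""
    return x + u

def running(x, u):
    """Toy running cost."""
    return x * x + u * u

def terminal(x):
    """Toy terminal cost."""
    return x * x

def brute_force(k, x):
    """Minimize the total cost from time k by enumerating every control sequence."""
    best = float("inf")
    for seq in product(U, repeat=N - k):
        cost, y = 0, x
        for u in seq:
            cost += running(y, u)
            y = step(y, u)
        best = min(best, cost + terminal(y))
    return best

@lru_cache(maxsize=None)
def value(k, x):
    """Backward recursion: V(k, x) = min_u [ running(x, u) + V(k + 1, step(x, u)) ]."""
    if k == N:
        return terminal(x)
    return min(running(x, u) + value(k + 1, step(x, u)) for u in U)
```

Both functions return the same number for every starting state, which is the discrete counterpart of \(V = V^{\prime}\): optimizing over the whole horizon is the same as optimizing one step and continuing optimally.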

We now state the continuity property of (8) in \(t \in [0,T]\).

Proposition 1

Suppose that (H.1) and (H.2) hold. Then, (8) is continuous in \(t \in [0,T]\). Specifically, there exists a constant \(C>0\) such that for \(a \in \mathbb{R}^{n}\) and \(t,t+\tau \in [0,T]\) with \(t < t+\tau\),

$$\begin{aligned} \bigl\lvert V(t+\tau,a) - V(t,a) \bigr\rvert \leq C \bigl(1+ \lvert a \rvert \bigr)\tau^{\frac{1}{2}},\quad \mathbb{P}\text{-a.s.} \end{aligned}$$

Proof

We need to prove that

$$\begin{aligned} - C \bigl(1+ \lvert a \rvert \bigr)\tau^{\frac{1}{2}} \leq V(t,a) - V(t+\tau,a) \leq C \bigl(1+ \lvert a \rvert \bigr) \tau^{\frac{1}{2}},\quad \mathbb{P}\text{-a.s.} \end{aligned}$$

Below, it is shown that \(V(t,a) - V(t+\tau,a) \leq C (1+|a|)\tau^{\frac{1}{2}}\).

In view of Theorem 1, for each \(\epsilon > 0\), there exists \(u^{\prime} \in \mathcal{U}_{t,t+\tau}\) such that

$$\begin{aligned} \bigl\lvert V(t,a) - \Phi_{t,t+\tau}^{t,a;u^{\prime}} \bigl[ V \bigl(t+\tau, x_{t+\tau}^{t,a;u^{\prime}} \bigr) \bigr] \bigr\rvert \leq \epsilon,\quad \mathbb{P}\text{-a.s.} \end{aligned}$$

This implies that

$$\begin{aligned} V(t,a) - V(t+\tau,a) \leq I^{(1)} + I^{(2)} + \epsilon, \quad \mathbb{P}\text{-a.s.}, \end{aligned}$$

where

$$\begin{aligned} & I^{(1)} := \Phi_{t,t+\tau}^{t,a;u^{\prime}} \bigl[ V \bigl(t+\tau, x_{t+\tau}^{t,a;u^{\prime}} \bigr) \bigr] - \Phi_{t,t+\tau}^{t,a;u^{\prime}} \bigl[ V(t+\tau,a) \bigr], \\ & I^{(2)} := \Phi_{t,t+\tau}^{t,a;u^{\prime}} \bigl[ V(t+\tau,a) \bigr] - V(t+\tau,a). \end{aligned}$$

From (i) of Lemma 1, Lemma 2, and Jensen’s inequality, (\(\mathbb{P}\)-a.s.)

$$\begin{aligned} \bigl\lvert I^{(1)} \bigr\rvert & \leq C \mathbb{E} \bigl[ \bigl\lvert V \bigl(t+\tau, x_{t+\tau}^{t,a;u^{\prime}} \bigr) - V(t+\tau,a) \bigr\rvert^{2} \,\big|\, \mathcal{F}_{t} \bigr]^{\frac{1}{2}} \\ & \leq C \mathbb{E} \bigl[ \bigl\lvert x_{t+\tau}^{t,a;u^{\prime}} - a \bigr\rvert^{2} \,\big|\, \mathcal{F}_{t} \bigr]^{\frac{1}{2}} \leq C \bigl(1+ \lvert a \rvert \bigr)\tau^{\frac{1}{2}}. \end{aligned}$$

(13)

Moreover, from the definition of Φ and the terminal condition of Φ in \(I^{(2)}\), we use Lemma 1 and (H.2) to obtain

$$\begin{aligned} \bigl\lvert I^{(2)} \bigr\rvert & = \biggl\lvert \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int_{t}^{t+\tau} l \bigl(s,x_{s}^{t,a;u^{\prime}},u_{s}^{\prime}, \bar{y}_{s}^{t,a;u^{\prime}}, \bar{z}_{s}^{t,a;u^{\prime}} \bigr) \,\mathrm{d}s \biggr] \biggr\rvert \\ & \leq \tau^{\frac{1}{2}} \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int_{t}^{t+\tau} \bigl\lvert l \bigl(s,x_{s}^{t,a;u^{\prime}},u_{s}^{\prime}, \bar{y}_{s}^{t,a;u^{\prime}}, \bar{z}_{s}^{t,a;u^{\prime}} \bigr) \bigr\rvert^{2} \,\mathrm{d}s \biggr]^{\frac{1}{2}} \\ & \leq C \tau^{\frac{1}{2}} \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int_{t}^{t+\tau} \bigl[ 1 + \bigl\lvert x_{s}^{t,a;u^{\prime}} \bigr\rvert^{2} + \bigl\lvert \bar{y}_{s}^{t,a;u^{\prime}} \bigr\rvert^{2} + \bigl\lvert \bar{z}_{s}^{t,a;u^{\prime}} \bigr\rvert^{2} \bigr] \,\mathrm{d}s \biggr]^{\frac{1}{2}} \\ & \leq C \bigl(1+ \lvert a \rvert \bigr) \tau^{\frac{1}{2}}, \quad \mathbb{P}\text{-a.s.} \end{aligned}$$

(14)

Note that (13) and (14) lead to

$$\begin{aligned} V(t,a) - V(t+\tau,a) \leq C \bigl(1+ \lvert a \rvert \bigr)\tau^{\frac{1}{2}} + \epsilon, \quad \mathbb{P}\text{-a.s.} \end{aligned}$$

Hence, the arbitrariness of ϵ implies \(V(t,a) - V(t+\tau,a) \leq C(1+|a|) \tau^{1/2}\), \(\mathbb{P}\)-a.s. The other inequality can be proven in a similar way. This completes the proof. □
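For readers who want the first inequality in (14) spelled out, the factor \(\tau^{\frac{1}{2}}\) comes from the Cauchy–Schwarz inequality in the time variable followed by the conditional Jensen inequality. The display below is a short sketch for a generic integrand \(g\) with \(\mathbb{E}[\int_{t}^{t+\tau} |g_{s}|^{2} \,\mathrm{d}s] < \infty\); the symbol \(g\) is a placeholder introduced here for the driver evaluated along the solution.

```latex
\begin{aligned}
\biggl\lvert \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int_{t}^{t+\tau} g_{s} \,\mathrm{d}s \biggr] \biggr\rvert
&\leq \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int_{t}^{t+\tau} 1 \cdot \lvert g_{s} \rvert \,\mathrm{d}s \biggr] \\
&\leq \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \biggl( \int_{t}^{t+\tau} 1 \,\mathrm{d}s \biggr)^{\frac{1}{2}}
      \biggl( \int_{t}^{t+\tau} \lvert g_{s} \rvert^{2} \,\mathrm{d}s \biggr)^{\frac{1}{2}} \biggr] \\
&\leq \tau^{\frac{1}{2}} \, \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int_{t}^{t+\tau} \lvert g_{s} \rvert^{2} \,\mathrm{d}s \biggr]^{\frac{1}{2}}.
\end{aligned}
```

The last step uses the concavity of the square root. The linear growth of l implied by (H.2), together with the estimates of Lemma 1, then bounds the remaining conditional expectation.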

From Lemma 2 and Proposition 1, the following result holds:

Corollary 1

Assume that (H.1) and (H.2) hold. Then, the value function in (8) is continuous on \([0,T] \times \mathbb{R}^{n}\). Specifically, for \(a,a^{\prime} \in \mathbb{R}^{n}\) and \(t,t+\tau \in [0,T]\) with \(t < t+\tau\),

$$\begin{aligned} \bigl\lvert V(t+\tau,a^{\prime}) - V(t,a) \bigr\rvert \leq C \bigl( \lvert a-a^{\prime} \rvert + \bigl(1+ \lvert a \rvert + \lvert a^{\prime} \rvert \bigr)\tau^{\frac{1}{2}} \bigr), \quad \mathbb{P}\text{-a.s.} \end{aligned}$$

We now state the verification theorem for (P).

Theorem 2

Assume that (H.1) and (H.2) hold. Suppose that the pair \((V,q) \in \mathcal{L}^{\infty}_{\mathcal{F}}(C^{2}(\mathbb{R}^{n})) \times \mathcal{L}^{2}_{\mathcal{F}}(C^{2}(\mathbb{R}^{n};\mathbb{R}^{1 \times r}))\) is the solution to the SHJB equation in (3). Then, for \(t \in [0,T]\), \(x \in L^{2}(\Omega,\mathcal{F}_{t};\mathbb{R}^{n})\) and \(u \in \mathcal{U}_{t,T}\), \(V(t,x) \leq J(t,x;u)\), \(\mathbb{P}\)-a.s. Furthermore, assume that \(\widehat{u}_{s} \in U\) with \(\widehat{u} := (\widehat{u}_{s})_{s \in [t,T]} \in \mathcal{U}_{t,T}\) is the minimizer of the Hamiltonian in (3) for \(s \in [t,T]\), \(\mathbb{P}\)-a.s. Then, for \(t \in [0,T]\) and \(x \in L^{2}(\Omega,\mathcal{F}_{t};\mathbb{R}^{n})\), we have \(V(t,x) =J(t,x;\widehat{u})\), \(\mathbb{P}\)-a.s., and \(\widehat{u} \in \mathcal{U}_{t,T}\) is the corresponding optimal control.

Proof

Suppose that \((V,q) \in \mathcal{L}^{\infty}_{\mathcal{F}}(C^{2}(\mathbb{R}^{n})) \times \mathcal{L}^{2}_{\mathcal{F}}(C^{2}(\mathbb{R}^{n};\mathbb{R}^{1 \times r}))\) is the solution of (3). Let \((x_{s}^{t,x;\widehat{u}})_{s \in [t,T]}\) be the state trajectory generated by \(\widehat{u} \in \mathcal{U}_{t,T}\) with \(x_{t}^{t,x;\widehat{u}} = x \in L^{2}(\Omega,\mathcal{F}_{t};\mathbb{R}^{n})\). Note that \(V(T,x_{T}^{t,x;\widehat{u}}) = m(x_{T}^{t,x;\widehat{u}})\) and \(V(t,x_{t}^{t,x;\widehat{u}}) = V(t,x)\), \(\mathbb{P}\)-a.s.

By using the Itô–Kunita formula and the SHJB equation in (3), we have (\(\mathbb{P}\)-a.s.)

$$\begin{aligned} V \bigl(T,x_{T}^{t,x;\widehat{u}} \bigr) ={}& V(t,x) + \int_{t}^{T} \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), f \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) \bigr\rangle \,\mathrm{d}s \\ & + \frac{1}{2} \int_{t}^{T} \operatorname{Tr}\bigl(\sigma \sigma^{\top} \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) D^{2} V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) \,\mathrm{d}s \\ & + \int_{t}^{T} \operatorname{Tr}\bigl(\sigma \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) D q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) \,\mathrm{d}s \\ & + \int_{t}^{T} \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \sigma \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) \bigr\rangle \,\mathrm{d}B_{s} \\ & - \int_{t}^{T} H \bigl(s,x_{s}^{t,x;\widehat{u}}, \bigl(V,D V, D^{2} V, q, D q \bigr) \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) \,\mathrm{d}s \\ & + \int_{t}^{T} q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \,\mathrm{d}B_{s} \\ ={}& V(t,x) - \int_{t}^{T} l \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s}, V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \\ & \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \sigma \bigl(s,x_{s}^{t,x;\widehat{u}},\widehat{u}_{s} \bigr) \bigr\rangle + q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) \,\mathrm{d}s \\ & + \int_{t}^{T} \bigl[ \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \sigma \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) \bigr\rangle + q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr] \,\mathrm{d}B_{s}. \end{aligned}$$

Let \((y_{s}^{t,x;\widehat{u}}, z_{s}^{t,x;\widehat{u}})_{s \in [t,T]}\) be the solution of the BSDE in (6) with \(\widehat{u} \in \mathcal{U}_{t,T}\). Let \(\widehat{y}_{s}^{\widehat{u}} := V(s,x_{s}^{t,x;\widehat{u}}) - y_{s}^{t,x;\widehat{u}}\) and \(\widehat{z}_{s}^{\widehat{u}} := \langle D V(s,x_{s}^{t,x;\widehat{u}}), \sigma (s,x_{s}^{t,x;\widehat{u}},\widehat{u}_{s}) \rangle + q(s,x_{s}^{t,x;\widehat{u}}) - z_{s}^{t,x;\widehat{u}}\). Note that \(\widehat{y}_{T}^{\widehat{u}} = 0\), \(\mathbb{P}\)-a.s. Then, we have

$$\begin{aligned} \mathrm{d}\widehat{y}_{s}^{\widehat{u}} ={}& - \bigl[ l \bigl(s,x_{s}^{t,x;\widehat{u}},\widehat{u}_{s}, V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \sigma \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) \bigr\rangle \\ & + q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) - l \bigl(s,x_{s}^{t,x;\widehat{u}},\widehat{u}_{s}, y_{s}^{t,x;\widehat{u}}, z_{s}^{t,x;\widehat{u}} \bigr) \bigr] \,\mathrm{d}s + \widehat{z}_{s}^{\widehat{u}} \,\mathrm{d}B_{s} \\ ={}& - \bigl[ A_{s}^{(1)} \widehat{y}_{s}^{\widehat{u}} + A_{s}^{(2)} \widehat{z}_{s}^{\widehat{u}} \bigr] \,\mathrm{d}s + \widehat{z}_{s}^{\widehat{u}} \,\mathrm{d}B_{s}, \end{aligned}$$

(15)

where \(A^{(1)}\) and \(A^{(2)}\) are bounded coefficients (independent of \(\widehat{y}\) and \(\widehat{z}\)) due to (H.1) and (H.2). Since (15) is a linear BSDE, in view of [13, Proposition 4.1.2], we have \(\widehat{y}_{s}^{\widehat{u}} = 0\) for \(s \in [t,T]\), \(\mathbb{P}\)-a.s. Hence, it holds that \(V(t,x_{t}^{t,x;\widehat{u}}) = V(t,x) = y_{t}^{t,x;\widehat{u}} = J(t,x;\widehat{u})\), \(\mathbb{P}\)-a.s.

On the other hand, for any \(u \in \mathcal{U}_{t,T}\), by using an approach analogous to that above and (iii) of Lemma 1, we can show that \(\widehat{y}_{s}^{u} \leq 0\) for \(s \in [t,T]\), \(\mathbb{P}\)-a.s., which implies that \(V(t,x_{t}^{t,x;u}) = V(t,x) \leq y_{t}^{t,x;u} = J(t,x;u)\), \(\mathbb{P}\)-a.s. Note that equality is achieved when \(u = \widehat{u} \in \mathcal{U}_{t,T}\). This shows that for any \(u \in \mathcal{U}_{t,T}\) and \(x \in L^{2}(\Omega,\mathcal{F}_{t};\mathbb{R}^{n})\), we have

$$\begin{aligned} J(t,x;u) = y_{t}^{t,x;u} \geq y_{t}^{t,x;\widehat{u}} = J(t,x;\widehat{u}) = V(t,x), \quad \mathbb{P}\text{-a.s.}, \end{aligned}$$

where the last equality follows from the definition of the value function V in (8). This completes the proof of the theorem. □

Remark 5

In Sect. 3, we show the existence and uniqueness of the viscosity solution to the SHJB equation in (3). Furthermore, in the appendix, the existence and uniqueness of the weak solution to (3) is shown via the Sobolev-space technique.

General indefinite linear–quadratic problem with random coefficients

This subsection considers the general indefinite linear–quadratic (LQ) problem of (P) as an application of Theorem 2. For notational simplicity, we assume that \(r=1\), i.e., the Brownian motion is one-dimensional.

The LQ problem in this subsection is referred to as (LQ-P) with

$$\begin{aligned} \begin{cases} f(s,x,u) = A_{s} x + F_{s} u, \qquad \sigma (s,x,u) = C_{s} x + E_{s} u, \\ l(s,x,u,y,z) = \frac{1}{2} [ \langle x, Q_{s} x \rangle + \langle u, R_{s} u \rangle + y ] + z, \\ m(x) = \frac{1}{2}\langle x, M x \rangle, \end{cases} \end{aligned}$$

(16)

where A, F, C, E, Q, R are \(\{\mathcal{F}_{s}\}_{s \geq 0}\)-adapted continuous stochastic processes with appropriate dimensions, which are uniformly bounded in \(\omega \in \Omega\) (they belong to \(\mathcal{L}_{\mathcal{F}}^{\infty}\)), and \(M \in L^{\infty}(\Omega,\mathcal{F}_{T};\mathbb{S}^{n})\). We assume that Q, R, M are symmetric matrices, which need not be definite. When l in (16) is independent of y and z, (LQ-P) reduces to the simplified LQ problem (with random coefficients) studied in [25, 26, 43–45] and the references therein.

From (4), the Hamiltonian can be written as (s argument is suppressed)

$$\begin{aligned} & H(s,x,y,p,P,q,\bar{P}) \\ &\quad = \operatorname*{ess\,inf}_{u} \biggl\{ \langle p, Ax + F u \rangle + \frac{1}{2} \bigl[ \langle x, Q x \rangle + \langle u, R u \rangle \bigr] + \frac{1}{2}y + q + \langle p, C x + E u \rangle \\ &\qquad + \frac{1}{2} \bigl\langle C x + E u, P(C x + E u) \bigr\rangle + \langle Cx + Eu, \bar{P} \rangle \biggr\}. \end{aligned}$$

(17)

Assume that \(R_{s} + E_{s}^{\top}P E_{s}\) is (uniformly) positive-definite for almost all \(\omega \in \Omega\) and \(s \in [0,T]\). Then, the expression inside the braces in (17) is strictly convex and quadratic in u, so H admits a unique minimizer, obtained from the first-order condition:

$$\begin{aligned} \widehat{u} = - \bigl(R + E^{\top}P E \bigr)^{-1} \bigl[ F^{\top}p + E^{\top}p + E^{\top}PC x + E^{\top}\bar{P} \bigr]. \end{aligned}$$

(18)
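The first-order condition behind the minimizer of the Hamiltonian can be checked numerically. The sketch below collects the u-dependent terms of (17) into a function `g(u)`, builds random test matrices (all of them hypothetical data, with `K` standing for \(R + E^{\top}PE\) and forced to be positive-definite while R itself may be indefinite), and evaluates the candidate minimizer \(-(R + E^{\top}PE)^{-1}[F^{\top}p + E^{\top}p + E^{\top}PCx + E^{\top}\bar{P}]\). It is an illustration under these assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2                                   # hypothetical state and control dimensions (r = 1)
F, E = rng.normal(size=(n, m)), rng.normal(size=(n, m))
C = rng.normal(size=(n, n))
S = rng.normal(size=(n, n))
P = S + S.T                                   # symmetric, possibly indefinite
G = rng.normal(size=(m, m))
K = G @ G.T + np.eye(m)                       # positive-definite by construction
R = K - E.T @ P @ E                           # chosen so that R + E^T P E = K
x, p, Pbar = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)

def g(u):
    """u-dependent part of the expression minimized in the Hamiltonian."""
    s = C @ x + E @ u
    return p @ (F @ u) + 0.5 * u @ R @ u + p @ (E @ u) + 0.5 * s @ P @ s + s @ Pbar

# Candidate minimizer from the first-order condition K u + b = 0
b = F.T @ p + E.T @ p + E.T @ P @ C @ x + E.T @ Pbar
u_hat = -np.linalg.solve(K, b)
```

Because `g` is quadratic in u with Hessian `K`, any perturbation `d` satisfies `g(u_hat + d) - g(u_hat) = 0.5 * d @ K @ d`, which is strictly positive for `d != 0`; this is exactly why positive-definiteness of \(R + E^{\top}PE\) is needed even when R is indefinite.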

By substituting (18) into (17), the SHJB equation in (3) becomes

$$\begin{aligned} \begin{cases} \mathrm{d}V(s,x) = - H(s,x,(V,DV,D^{2} V, q, D q)(s,x)) \,\mathrm{d}s + q(s,x) \,\mathrm{d}B_{s}, \\ (s,x) \in [0,T) \times \mathbb{R}^{n}, \\ V(T,x) = \frac{1}{2} x^{\top}M x, \quad x \in \mathbb{R}^{n}, \end{cases} \end{aligned}$$

(19)

where (s argument is suppressed)

$$\begin{aligned} & H(s,x,y,p,P,q,\bar{P}) \\ &\quad = x^{\top}A^{\top}p + \frac{1}{2} x^{\top}Q x + \frac{1}{2} y + x^{\top}C^{\top}p + \frac{1}{2} x^{\top}C^{\top}P C x + x^{\top}C^{\top}\bar{P} + q \\ &\qquad - \frac{1}{2} \bigl[ F^{\top}p + E^{\top}p + E^{\top}PC x + E^{\top}\bar{P} \bigr]^{\top} \bigl(R + E^{\top}P E \bigr)^{-1} \\ &\qquad \times \bigl[ F^{\top}p + E^{\top}p + E^{\top}PC x + E^{\top}\bar{P} \bigr]. \end{aligned}$$

(20)

In view of the verification theorem (Theorem 2), we seek a solution of (19) in order to solve (LQ-P).

We conjecture that the general solutions of (19) are quadratic in x, i.e.,

$$\begin{aligned} V(s,x) = \frac{1}{2} x^{\top}\Lambda_{s} x,\qquad q(s,x) = \frac{1}{2} x^{\top}\bar{\Lambda}_{s} x, \end{aligned}$$

(21)

where it is assumed that \(\Lambda\) and \(\bar{\Lambda}\) are \(\{\mathcal{F}_{s}\}_{s \geq 0}\)-adapted symmetric \(n \times n\)-valued stochastic processes with \(\Lambda_{T} = M\) and \((\Lambda,\bar{\Lambda}) \in \mathcal{L}_{\mathcal{F}}^{\infty}(\mathbb{S}^{n}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{S}^{n})\). Under this assumption, V and q in (21) are smooth, i.e., \((V,q) \in \mathcal{L}^{\infty}_{\mathcal{F}}(C^{2}(\mathbb{R}^{n})) \times \mathcal{L}^{2}_{\mathcal{F}}(C^{2}(\mathbb{R}^{n};\mathbb{R}^{1 \times r}))\), where \(D V(s,x) = \Lambda_{s} x\) and \(D q(s,x) = \bar{\Lambda}_{s} x\) are well defined. Then, by substituting (21) into (20), we can see that the SHJB equation in (19) admits a unique smooth solution if the following stochastic Riccati differential equation (SRDE) admits a unique solution:

$$\begin{aligned} \begin{cases} \mathrm{d}\Lambda_{s} = - \bigl[ A_{s}^{\top}\Lambda_{s} + \Lambda_{s} A_{s} + Q_{s} + \Lambda_{s} + C_{s}^{\top}\Lambda_{s} C_{s} \\ \hphantom{\mathrm{d}\Lambda_{s} =} + \bar{\Lambda}_{s} + C_{s}^{\top}\Lambda_{s} + \Lambda_{s} C_{s} + C_{s}^{\top}\bar{\Lambda}_{s} + \bar{\Lambda}_{s} C_{s} \\ \hphantom{\mathrm{d}\Lambda_{s} =} - [ F_{s}^{\top}\Lambda_{s} + E_{s}^{\top}\Lambda_{s} + E_{s}^{\top}\Lambda_{s} C_{s} + E_{s}^{\top}\bar{\Lambda}_{s} ]^{\top}(R_{s} + E_{s}^{\top}\Lambda_{s} E_{s})^{-1} \\ \hphantom{\mathrm{d}\Lambda_{s} =} \times [ F_{s}^{\top}\Lambda_{s} + E_{s}^{\top}\Lambda_{s} + E_{s}^{\top}\Lambda_{s} C_{s} + E_{s}^{\top}\bar{\Lambda}_{s} ] \bigr]\,\mathrm{d}s + \bar{\Lambda}_{s} \,\mathrm{d}B_{s}, \\ \Lambda_{T} = M. \end{cases} \end{aligned}$$

(22)

Note that (22) is an equation for a symmetric \(n \times n\)-valued stochastic process. Here, the solution of the SRDE in (22) is defined as the adapted pair \((\Lambda,\bar{\Lambda}) \in \mathcal{L}_{\mathcal{F}}^{\infty}(\mathbb{S}^{n}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{S}^{n})\), and (22) can be viewed as a matrix-valued BSDE with random coefficients.

By substituting (21) into (18), from Theorem 2, the optimal control for (LQ-P) is given by

$$\begin{aligned} \widehat{u}_{s} = - \bigl(R_{s} + E_{s}^{\top}\Lambda_{s} E_{s} \bigr)^{-1} \bigl[ F_{s}^{\top}\Lambda_{s} + E_{s}^{\top}\Lambda_{s} + E_{s}^{\top}\Lambda_{s} C_{s} + E_{s}^{\top}\bar{\Lambda}_{s} \bigr] x_{s}^{t,a;\widehat{u}}, \end{aligned}$$

(23)

provided that \(R_{s} + E_{s}^{\top}\Lambda_{s} E_{s}\) is (uniformly) positive-definite for almost all \(\omega \in \Omega\) and \(s \in [0,T]\).

In summary, by applying the verification theorem in Theorem 2, we have the following result:

Proposition 2

Suppose that the pair \((\Lambda,\bar{\Lambda}) \in \mathcal{L}_{\mathcal{F}}^{\infty}(\mathbb{S}^{n}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{S}^{n})\) is the solution of the SRDE in (22) and that \(R_{s} + E_{s}^{\top}\Lambda_{s} E_{s}\) is (uniformly) positive-definite for almost all \(\omega \in \Omega\) and \(s \in [0,T]\). Then, for \(x \in L^{2}(\Omega,\mathcal{F}_{t};\mathbb{R}^{n})\), \(V(t,x) = \frac{1}{2}\langle x, \Lambda_{t} x \rangle\) is the value function of (LQ-P) (equivalently, \(V(t,x) = \frac{1}{2}\langle x, \Lambda_{t} x \rangle\) is the optimal cost), and (23) is the corresponding optimal control.

Remark 6

The solvability of the SRDE in (22) is an open problem. When l does not depend on y and z, the solvability of the corresponding SRDEs has been discussed extensively in the literature; see [25, 26, 43–45] and the references therein. Moreover, the case of jump-diffusion models can also be considered.
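For orientation, the simplest instance of the case where l does not depend on y and z is the classical one with deterministic, constant, scalar coefficients. There the martingale part vanishes and the Riccati equation becomes the ordinary differential equation \(-\dot{\Lambda} = 2A\Lambda + Q + C^{2}\Lambda - (F\Lambda + E\Lambda C)^{2}/(R + E^{2}\Lambda)\) with \(\Lambda_{T} = M\). The sketch below integrates it backward with an explicit Euler scheme; the function names, the scalar restriction, and the step count are assumptions made for illustration, and the code does not address the random-coefficient SRDE in (22).

```python
def riccati_scalar(A, F, C, E, Q, R, M, T, n_steps=10000):
    """Backward explicit Euler for the classical scalar Riccati ODE on [0, T].

    Returns the list [Lambda(0), ..., Lambda(T)] on a uniform grid with n_steps intervals.
    """
    dt = T / n_steps
    lam = M
    path = [lam]
    for _ in range(n_steps):
        denom = R + E * lam * E
        if denom <= 0:
            raise ValueError("R + E^2 * Lambda must stay positive")
        drift = 2 * A * lam + Q + C * lam * C - (F * lam + E * lam * C) ** 2 / denom
        # dLambda/ds = -drift, so stepping from s to s - dt adds drift * dt
        lam = lam + drift * dt
        path.append(lam)
    path.reverse()
    return path

def feedback_gain(lam, F, C, E, R):
    """Gain k in the linear feedback u = k * x of the classical scalar problem."""
    return -(F * lam + E * lam * C) / (R + E * lam * E)
```

As a sanity check, with A = C = E = Q = 0 and F = R = M = T = 1 the equation reduces to \(\dot{\Lambda} = \Lambda^{2}\), whose solution is \(\Lambda_{s} = 1/(1 + T - s)\), so the computed value at s = 0 should be close to 1/2.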