This section studies the stochastic optimal control problem with random coefficients. The objective functional is of recursive type and is defined through a backward stochastic differential equation (BSDE) with random coefficients. We prove the dynamic programming principle (DPP), the continuity of the value function, and the verification theorem. We then treat the indefinite linear–quadratic (LQ) problem as an application of the verification theorem.
Problem statement
The stochastic differential equation (SDE) is given by
$$\begin{aligned} \begin{cases} \mathrm{d}x_{s}^{t,a;u} = f(s,x_{s}^{t,a;u},u_{s})\,\mathrm{d}s + \sigma (s,x_{s}^{t,a;u},u_{s})\,\mathrm{d}B_{s}, \\ x_{t}^{t,a;u} = a, \end{cases} \end{aligned}$$
(5)
where f and σ are the coefficients in (1) and (3). Here \((x_{s}^{t,a;u})_{s \in [t,T]}\) is the \(\mathbb{R}^{n}\)-valued (forward) state process with the initial condition \(x_{t}^{t,a;u} = a\), and \((u_{s})_{s \in [t,T]}\) is the \(U\)-valued control process, where \(U\) is the control space. The space of admissible controls is defined by \(\mathcal{U}_{t,T} := \mathcal{L}^{2}_{\mathcal{F}}(U)\).
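To make the role of the control in (5) concrete, the following is a minimal Euler–Maruyama sketch for a scalar state. It is an illustration only: the function name `euler_maruyama`, the coefficients `f` and `sigma`, and the feedback rule `control` are hypothetical choices, and the coefficients here are deterministic, which is a special case of the random coefficients allowed in this section.

```python
import math
import random


def euler_maruyama(f, sigma, control, a, t, T, n_steps, rng):
    """Simulate one path of dx = f(s,x,u) ds + sigma(s,x,u) dB on [t, T], x_t = a."""
    dt = (T - t) / n_steps
    s, x = t, a
    path = [x]
    for _ in range(n_steps):
        u = control(s, x)                      # control value used on [s, s + dt)
        dB = rng.gauss(0.0, math.sqrt(dt))     # Brownian increment ~ N(0, dt)
        x = x + f(s, x, u) * dt + sigma(s, x, u) * dB
        s += dt
        path.append(x)
    return path


# Hypothetical coefficients, Lipschitz in x as required by (H.1).
f = lambda s, x, u: -x + u
sigma = lambda s, x, u: 0.3 * x
# Hypothetical feedback rule with values in the compact set U = [-1, 1].
control = lambda s, x: max(-1.0, min(1.0, -x))

rng = random.Random(0)
path = euler_maruyama(f, sigma, control, a=1.0, t=0.0, T=1.0, n_steps=200, rng=rng)
print(len(path), path[0])
```

The scheme evaluates the control at the left end of each time step, so the simulated control is adapted to the simulated noise, mirroring the adaptedness required of admissible controls.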
We introduce the backward SDE (BSDE) given by
$$\begin{aligned} \begin{cases} \mathrm{d}y_{s}^{t,a;u} = -l(s,x_{s}^{t,a;u},u_{s}, y_{s}^{t,a;u}, z_{s}^{t,a;u})\,\mathrm{d}s + z_{s}^{t,a;u}\,\mathrm{d}B_{s}, \\ y_{T}^{t,a;u} = m(x_{T}^{t,a;u}), \end{cases} \end{aligned}$$
(6)
where l and m are the coefficients in (1) and (3). The pair \((y_{s}^{t,a;u},z_{s}^{t,a;u})_{s \in [t,T]}\) is the \((\mathbb{R},\mathbb{R}^{1 \times r})\)-valued backward process, and \(y_{T}^{t,a;u} = m(x_{T}^{t,a;u})\) is the \(\mathcal{F}_{T}\)-measurable terminal condition. As stated in (2) and (3), \(f: \Omega \times [0,T] \times \mathbb{R}^{n} \times U \rightarrow \mathbb{R}^{n}\), \(\sigma : \Omega \times [0,T] \times \mathbb{R}^{n} \times U \rightarrow \mathbb{R}^{n \times r}\), \(l: \Omega \times [0,T] \times \mathbb{R}^{n} \times U \times \mathbb{R} \times \mathbb{R}^{1 \times r} \rightarrow \mathbb{R}\), and \(m:\Omega \times \mathbb{R}^{n} \rightarrow \mathbb{R}\) are the random coefficients of (5) and (6), where the control space \(U\) is a nonempty compact subset of \(\mathbb{R}^{m}\). Note that (5) and (6) constitute a forward–backward SDE (FBSDE) with random coefficients, in which the BSDE in (6) is coupled with the forward SDE in (5).
The assumptions for (5) and (6) are given as follows:

(H.1)
For \(\zeta = f,\sigma\), ζ is \(\mathbb{P} \times \mathcal{B}(\mathbb{R}^{n}) \times \mathcal{B}(U)\)-measurable, where \(\mathcal{B}(\cdot )\) is the Borel σ-algebra. For almost all \(\omega \in \Omega\), ζ is (uniformly) continuous in \((s,u) \in [0,T] \times U\) and Lipschitz continuous in \(x \in \mathbb{R}^{n}\) with the Lipschitz constant L.

(H.2)
l and m are \(\mathbb{P} \times \mathcal{B}(\mathbb{R}^{n}) \times \mathcal{B}(U) \times \mathcal{B}(\mathbb{R}) \times \mathcal{B}(\mathbb{R}^{1 \times r})\)-measurable and \(\mathbb{P} \times \mathcal{B}(\mathbb{R}^{n})\)-measurable, respectively. For almost all \(\omega \in \Omega\), l is (uniformly) continuous in \((s,u) \in [0,T] \times U\) and Lipschitz continuous in \((x,y,z) \in \mathbb{R}^{n} \times \mathbb{R} \times \mathbb{R}^{1 \times r}\) with the Lipschitz constant L. For almost all \(\omega \in \Omega\), m is Lipschitz continuous in \(x \in \mathbb{R}^{n}\) with the Lipschitz constant L.
Remark 1
We emphasize that in (5) and (6) the coefficients f, σ, l and m are allowed to be random and are only required to be measurable with respect to \(\omega \in \Omega\). In particular, unlike the path-dependent stochastic control problems and differential games in [53–59], no specific assumptions are imposed on the dependence of the coefficients on \(\omega \in \Omega\), and no topology on Ω is specified.
We have the following lemma. The proof can be found in [18, Chaps. 1 and 7], [13, Chaps. 3, 4 and 8], [20].
Lemma 1
Assume that (H.1) and (H.2) hold. Then, for \(t \in [0,T]\), \(s,l \in [t,T]\) with \(l \leq s\), \(u \in \mathcal{U}_{t,T}\), and \(a,a^{\prime } \in L^{2}(\Omega ,\mathcal{F}_{t};\mathbb{R}^{n})\), the following results hold:

(i)
(5) admits a unique (strong) solution in \(\mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}^{n})\). Moreover, the flow property \((x_{s}^{t,a;u})_{s \in [l,T]} = (x_{s}^{l,x_{l}^{t,a;u};u})_{s \in [l,T]}\) holds and, for \(p \geq 1\), there exists a constant \(C>0\), dependent on L, T and p, such that (\(\mathbb{P}\)-almost surely (a.s.))
$$\begin{aligned} &\mathbb{E}_{\mathcal{F}_{t}} \Bigl[\max_{s \in [t,T]} \bigl\lvert x_{s}^{t,a;u} \bigr\rvert ^{p} \Bigr] \leq C \bigl(1+ \vert a \vert ^{p} \bigr), \\ &\mathbb{E}_{\mathcal{F}_{t}} \bigl[ \bigl\lvert x_{s}^{t,a;u} - x_{l}^{t,a;u} \bigr\rvert ^{p} \bigr]\leq C \bigl(1+ \vert a \vert ^{p} \bigr) (s-l)^{\frac{p}{2}}, \\ &\mathbb{E}_{\mathcal{F}_{t}} \Bigl[\max_{s \in [t,T]} \bigl\lvert x_{s}^{t,a;u} - x_{s}^{t,a^{\prime };u} \bigr\rvert ^{p} \Bigr]\leq C \bigl\lvert a-a^{\prime } \bigr\rvert ^{p}; \end{aligned}$$

(ii)
(6) admits a unique solution \((y_{s}^{t,a;u},z_{s}^{t,a;u})_{s \in [t,T]} \in \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{R}^{1 \times r})\). Furthermore, for \(p \geq 2\), there exists a constant \(C>0\), dependent on L, p and T, such that (\(\mathbb{P}\)-a.s.)
$$\begin{aligned}& \mathbb{E}_{\mathcal{F}_{t}} \biggl[\max_{s \in [t,T]} \bigl\lvert y_{s}^{t,a;u} \bigr\rvert ^{p} + \biggl( \int _{t}^{T} \bigl\lvert z_{s}^{t,a;u} \bigr\rvert ^{2} \,\mathrm{d}s \biggr)^{\frac{p}{2}} \biggr] \leq C \bigl(1 + \vert a \vert ^{p} \bigr), \\& \mathbb{E}_{\mathcal{F}_{t}} \bigl[ \bigl\lvert y_{s}^{t,a;u} - y_{t}^{t,a;u} \bigr\rvert ^{p} \bigr] \leq C \bigl(1+ \vert a \vert ^{p} \bigr) (s-t)^{\frac{p}{2}}, \\& \mathbb{E}_{\mathcal{F}_{t}} \Bigl[\max_{s \in [t,T]} \bigl\lvert y_{s}^{t,a;u} - y_{s}^{t,a^{\prime };u} \bigr\rvert ^{p} \Bigr] \leq C \bigl\lvert a - a^{\prime } \bigr\rvert ^{p}; \end{aligned}$$

(iii)
Suppose that \((\tilde{y}_{s}^{t,a;u}, \tilde{z}_{s}^{t,a;u})_{s \in [t,T]} \in \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{R}^{1 \times r})\) is the solution of (6) with the terminal condition \(\tilde{y}_{T}^{t,a;u} = m(x_{T}^{t,a;u}) + \epsilon\), where \(\epsilon > 0\). Then, there exists a constant \(C > 0\), dependent on L and T, such that \(\mathbb{E}_{\mathcal{F}_{t}}[\max_{s \in [t,T]} \lvert y_{s}^{t,a;u} - \tilde{y}_{s}^{t,a;u} \rvert ^{2}] < C \epsilon\). Assume that \((\widehat{y}_{s}^{t,a;u}, \widehat{z}_{s}^{t,a;u})_{s \in [t,T]} \in \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{R}^{1 \times r})\) is the solution of (6) with l̂ and m̂ in place of l and m, where \(l \geq \widehat{l}\) and \(m \geq \widehat{m}\), \(\mathbb{P}\)-a.s. Then, \(y_{s}^{t,a;u} \geq \widehat{y}_{s}^{t,a;u}\) for \(s \in [t,T]\), \(\mathbb{P}\)-a.s.
The objective functional is of recursive type and is given by
$$\begin{aligned} J(t,a;u) = y_{t}^{t,a;u} = \mathbb{E}_{\mathcal{F}_{t}} \bigl[y_{t}^{t,a;u} \bigr]. \end{aligned}$$
(7)
Then, the stochastic optimal control problem considered in this paper can be stated as follows:
$$\begin{aligned} \quad \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,T}} J(t,a;u),\quad \text{subject to (5) and (6)}. \end{aligned}$$
(P)
Remark 2
When l in (6) does not depend on y and z, the objective functional J in (7) can be simplified as follows:
$$\begin{aligned} J(t,a;u) = \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int _{t}^{T} l \bigl(s,x_{s}^{t,a;u},u_{s} \bigr) \,\mathrm{d}s + m \bigl(x_{T}^{t,a;u} \bigr) \biggr]. \end{aligned}$$
This is a special case of (P), which was considered in [1, 11].
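The simplified cost in Remark 2 can be approximated by plain Monte Carlo. The sketch below is an illustration under stated assumptions: a scalar state, deterministic coefficients, a deterministic initial state at time t (so that the conditional expectation reduces to an ordinary one), and hypothetical names `estimate_cost`, `f`, `sigma`, `l`, `m` and `control`.

```python
import math
import random


def estimate_cost(f, sigma, l, m, control, a, t, T, n_steps=100, n_paths=2000, seed=0):
    """Monte Carlo estimate of E[ int_t^T l(s, x_s, u_s) ds + m(x_T) ] with x_t = a."""
    rng = random.Random(seed)
    dt = (T - t) / n_steps
    total = 0.0
    for _ in range(n_paths):
        s, x, running = t, a, 0.0
        for _ in range(n_steps):
            u = control(s, x)
            running += l(s, x, u) * dt          # left-point rule for the running cost
            x += f(s, x, u) * dt + sigma(s, x, u) * rng.gauss(0.0, math.sqrt(dt))
            s += dt
        total += running + m(x)                 # add the terminal cost m(x_T)
    return total / n_paths


# Hypothetical toy data: quadratic running and terminal costs, linear dynamics.
J_hat = estimate_cost(
    f=lambda s, x, u: -x + u,
    sigma=lambda s, x, u: 0.3 * x,
    l=lambda s, x, u: 0.5 * (x * x + u * u),
    m=lambda x: 0.5 * x * x,
    control=lambda s, x: max(-1.0, min(1.0, -x)),
    a=1.0, t=0.0, T=1.0,
)
print(J_hat)
```

When l depends on (y, z), this direct averaging is no longer available and the BSDE in (6) has to be solved instead.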
For \(t \in [0,T]\) and \(a \in L^{2}(\Omega ,\mathcal{F}_{t};\mathbb{R}^{n})\), the value function of (P) is defined by
$$\begin{aligned} V(t,a) = \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,T}} J(t,a;u),\quad \mathbb{P}\text{-a.s.} \end{aligned}$$
(8)
Note that, by Lemma 1, (P) is well posed; hence, the value function in (8) is well defined. If the coefficients in (5) and (6) do not depend on \(\omega \in \Omega\), then the problem above corresponds to stochastic optimal control with deterministic coefficients, which has been studied in various aspects in the literature; see [17, 18, 20] and the references therein. Unlike the case of deterministic coefficients, the value function in (8) is a random field.
Remark 3
The stochastic optimal control formulation is chosen to broaden the range of potential applications, which include problems in finance, economics, science, and engineering. The approach of this paper allows such applications to be studied in more practical settings, since it captures both the general dynamic behavior of the objective functional and random parameter variations due to imprecisions (see the detailed discussion in Sect. 1).
Dynamic programming principle and verification theorem
This subsection establishes the continuity property of the value function in (8). We show that (8) satisfies the DPP, which is the recursive-type value iteration algorithm to solve (P). Then, we prove the verification theorem for (P).
We first state the following result due to Lemma 1:
Lemma 2
Assume that (H.1) and (H.2) hold. Then, there exists a constant \(C>0\) such that for \(a,a^{\prime } \in \mathbb{R}^{n}\),
$$\begin{aligned} \bigl\lvert V(t,a) - V \bigl(t,a^{\prime } \bigr) \bigr\rvert \leq C \bigl\lvert a-a^{\prime } \bigr\rvert ,\qquad \bigl\lvert V(t,a) \bigr\rvert \leq C \bigl(1+ \vert a \vert \bigr),\quad \mathbb{P}\textit{-a.s.} \end{aligned}$$
The backward semigroup operator associated with the BSDE is defined as follows: for \(t,t+\tau \in [0,T]\) with \(t < t+\tau\),
$$\begin{aligned} \Phi _{s,t+\tau }^{t,a;u}[b] := \bar{y}_{s}^{t,a;u},\quad s \in [t,t+\tau ], \end{aligned}$$
(9)
where \((\bar{y}_{s}^{t,a;u},\bar{z}_{s}^{t,a;u})_{s \in [t,t+\tau ]}\) is the solution of the following BSDE on \([t,t+\tau ]\):
$$\begin{aligned}& \mathrm{d}\bar{y}_{s}^{t,a;u} = -l \bigl(s,x_{s}^{t,a;u},u_{s}, \bar{y}_{s}^{t,a;u}, \bar{z}_{s}^{t,a;u} \bigr)\,\mathrm{d}s + \bar{z}_{s}^{t,a;u} \,\mathrm{d}B_{s}, \\& \bar{y}_{t+\tau }^{t,a;u} = b. \end{aligned}$$
Here, \(b \in L^{2}(\Omega ,\mathcal{F}_{t+\tau };\mathbb{R})\). In particular, when \(b=y_{t+\tau }^{t,a;u}\) (note that \(y_{t+\tau }^{t,a;u} \in L^{2}(\Omega ,\mathcal{F}_{t+\tau };\mathbb{R})\)), the uniqueness of the solution of (6) gives \(y_{t}^{t,a;u} = \bar{y}_{t}^{t,a;u} = \Phi _{t,t+\tau }^{t,a;u}[y_{t+\tau }^{t,a;u}]\), \(\mathbb{P}\)-a.s.
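As a sanity check of (9), consider the special case of Remark 2 in which l does not depend on (y, z). The following identity is a sketch under that assumption only; it uses the definition of Φ and the fact that the stochastic integral of \(\bar{z}\) has zero conditional mean:

$$\begin{aligned} \Phi _{t,t+\tau }^{t,a;u}[b] = \mathbb{E}_{\mathcal{F}_{t}} \biggl[ b + \int _{t}^{t+\tau } l \bigl(s,x_{s}^{t,a;u},u_{s} \bigr) \,\mathrm{d}s \biggr], \quad \mathbb{P}\text{-a.s.} \end{aligned}$$

In this case, the relation \(y_{t}^{t,a;u} = \Phi _{t,t+\tau }^{t,a;u}[y_{t+\tau }^{t,a;u}]\) reduces to the tower property of conditional expectation applied to the cost in Remark 2.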
Remark 4
By (9) and (i) of Lemma 1, the objective functional in (7) can be rewritten as follows:
$$\begin{aligned} J(t,a;u) & = \Phi _{t,T}^{t,a;u} \bigl[m \bigl(x_{T}^{t,a;u} \bigr) \bigr] = \Phi _{t,t+\tau }^{t,a;u} \bigl[y_{t+\tau }^{t,a;u} \bigr] = \Phi _{t,t+\tau }^{t,a;u} \bigl[J \bigl(t+\tau ,x_{t+\tau }^{t,a;u};u \bigr) \bigr]. \end{aligned}$$
We now state the DPP for (P).
Theorem 1
Suppose that (H.1) and (H.2) hold. Then, the value function in (8) satisfies the following dynamic programming principle (DPP): for \(t,t+\tau \in [0,T]\) with \(t < t+\tau\) and \(a \in L^{2}(\Omega ,\mathcal{F}_{t};\mathbb{R}^{n})\),
$$\begin{aligned} V(t,a) = \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,t+\tau }} \Phi _{t,t+\tau }^{t,a;u} \bigl[ V \bigl(t+\tau , x_{t+\tau }^{t,a;u} \bigr) \bigr], \quad \mathbb{P}\textit{-a.s.} \end{aligned}$$
Proof
Note that, in view of Lemma 1, the FBSDE in (5) and (6) admits a unique solution \((x_{s}^{t,a;u}, y_{s}^{t,a;u},z_{s}^{t,a;u})_{s \in [t,T]} \in \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}^{n}) \times \mathcal{C}_{\mathcal{F}}^{2}(\mathbb{R}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{R}^{1 \times r})\).
Let
$$\begin{aligned} V^{\prime }(t,a) & := \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,t+\tau }} \Phi _{t,t+\tau }^{t,a;u} \bigl[ V \bigl(t+\tau , x_{t+\tau }^{t,a;u} \bigr) \bigr], \quad \mathbb{P}\text{-a.s.} \end{aligned}$$
We show that \(V^{\prime }(t,a) \leq V(t,a)\) and \(V^{\prime }(t,a) \geq V(t,a)\).
First, note from (7) and Remark 4 that
$$\begin{aligned} V(t,a) & = \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,T}} \Phi _{t,t+\tau }^{t,a;u} \bigl[J \bigl(t+\tau ,x_{t+\tau }^{t,a;u};u \bigr) \bigr] \\ & \geq \operatorname*{ess\,inf}_{u \in \mathcal{U}_{t,t+\tau }} \Phi _{t,t+\tau }^{t,a;u} \bigl[V \bigl(t+\tau ,x_{t+\tau }^{t,a;u} \bigr) \bigr] = V^{\prime }(t,a), \end{aligned}$$
where the inequality follows from (8) and (iii) of Lemma 1. This implies that \(V(t,a) \geq V^{\prime }(t,a)\).
We now prove \(V(t,a) \leq V^{\prime }(t,a)\). By Lemma 2 and (ii) of Lemma 1, for each \(\epsilon > 0\), there exists \(\delta > 0\) such that whenever \(\lvert x - \hat{x} \rvert < \delta\), it holds that for all \(u \in \mathcal{U}_{t+\tau ,T}\),
$$\begin{aligned} & \bigl\lvert V(t+\tau ,x) - V(t+\tau ,\hat{x}) \bigr\rvert + \bigl\lvert J(t+\tau ,x;u) - J(t+\tau , \hat{x};u) \bigr\rvert < \epsilon . \end{aligned}$$
(10)
Let \(\{D_{j}\}_{j \geq 1}\) be a (disjoint) Borel partition of \(\mathbb{R}^{n}\) whose elements have diameter less than δ, i.e., \(\operatorname{diam}(D_{j}) < \delta\). That is, each \(D_{j}\) is Borel measurable, i.e., \(D_{j} \in \mathcal{B}(\mathbb{R}^{n})\), with \(\bigcup_{j \geq 1} D_{j} = \mathbb{R}^{n}\) and \(D_{j} \cap D_{l} = \emptyset\) for \(j \neq l\). By definition, for \(x,\hat{x} \in D_{j}\), we have \(\lvert x-\hat{x} \rvert < \delta\). For each j, choose \(x^{(j)} \in D_{j}\). Then, by the measurable selection theorem in [11, Theorem A.1] (see also [60, 61]), there exists \(u^{(j)} \in \mathcal{U}_{t+\tau ,T}\) such that \(J(t+\tau ,x^{(j)};u^{(j)}) \leq V(t+\tau ,x^{(j)}) +\epsilon\). Hence, by (10), for any \(x \in D_{j}\),
$$\begin{aligned} & J \bigl(t+\tau ,x;u^{(j)} \bigr) - V(t+\tau ,x) \\ &\quad \leq \bigl\lvert J \bigl(t+\tau ,x;u^{(j)} \bigr) - J \bigl(t+\tau ,x^{(j)};u^{(j)} \bigr) \bigr\rvert \\ &\qquad {} + \bigl\lvert J \bigl(t+\tau ,x^{(j)};u^{(j)} \bigr) - V \bigl(t+\tau ,x^{(j)} \bigr) \bigr\rvert + \bigl\lvert V \bigl(t+\tau ,x^{(j)} \bigr) - V(t+\tau ,x) \bigr\rvert \leq 3 \epsilon . \end{aligned}$$
(11)
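For any \(u^{\prime \prime } \in \mathcal{U}_{t,t+\tau }\), we define
$$\begin{aligned} \tilde{u}_{s} := u^{\prime \prime }_{s} \mathbf{1}_{[t,t+\tau ]}(s) + u^{\prime }_{s} \mathbf{1}_{(t+\tau ,T]}(s), \qquad u^{\prime }_{s} := \sum_{j \geq 1} u^{(j)}_{s} \mathbf{1}_{D_{j}} \bigl(x_{t+\tau }^{t,a;u^{\prime \prime }} \bigr), \end{aligned}$$
where \(\mathbf{1}\) is the indicator function. Clearly, \(\tilde{u} \in \mathcal{U}_{t,T}\). Then, by Remark 4,
$$\begin{aligned} V(t,a) &\leq J(t,a;\tilde{u}) \\ & = \Phi _{t,t+\tau }^{t,a;u^{\prime \prime }} \bigl[J \bigl(t+\tau , x_{t+\tau }^{t,a;u^{\prime \prime }}; u^{\prime } \bigr) \bigr] \leq \Phi _{t,t+\tau }^{t,a;u^{\prime \prime }} \bigl[V \bigl(t+\tau ,x_{t+\tau }^{t,a;u^{\prime \prime }} \bigr) \bigr] + 3\epsilon , \end{aligned}$$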
(12)
where the second inequality is due to (11) and (iii) of Lemma 1. Then, (12) and the definition of \(V^{\prime }\), together with the arbitrariness of ϵ, imply (after taking the essential infimum over \(u^{\prime \prime }\)) that \(V(t,a) \leq V^{\prime }(t,a)\). This shows that \(V(t,a) = V^{\prime }(t,a)\), which completes the proof. □
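The recursion in Theorem 1 can be mimicked numerically in a much simpler setting. The sketch below is not the construction used in the proof; it assumes a scalar state, deterministic coefficients, the non-recursive cost of Remark 2, a finite control set, a binomial approximation of the Brownian increment, and linear interpolation on a state grid. All names (`interp`, `dpp_backward`, the toy data) are hypothetical.

```python
import bisect
import math


def interp(grid, values, x):
    """Piecewise-linear interpolation, clamped to the end points of the grid."""
    if x <= grid[0]:
        return values[0]
    if x >= grid[-1]:
        return values[-1]
    i = bisect.bisect_right(grid, x) - 1
    w = (x - grid[i]) / (grid[i + 1] - grid[i])
    return (1.0 - w) * values[i] + w * values[i + 1]


def dpp_backward(f, sigma, l, m, controls, grid, T, n_steps):
    """Return layers with layers[k][i] approximating V(k * dt, grid[i])."""
    dt = T / n_steps
    root_dt = math.sqrt(dt)
    V = [m(x) for x in grid]                    # terminal condition V(T, x) = m(x)
    layers = [V]
    for k in reversed(range(n_steps)):
        s = k * dt
        new_V = []
        for x in grid:
            best = math.inf
            for u in controls:
                mean = x + f(s, x, u) * dt
                spread = sigma(s, x, u) * root_dt
                # Binomial approximation: the noise increment is +/- sqrt(dt).
                cont = 0.5 * (interp(grid, V, mean + spread)
                              + interp(grid, V, mean - spread))
                best = min(best, l(s, x, u) * dt + cont)
            new_V.append(best)
        V = new_V
        layers.append(V)
    layers.reverse()
    return layers


# Toy problem: dx = u ds with u in {-1, 0, 1}, no running cost, m(x) = |x|.
grid = [i / 10 - 3 for i in range(61)]
layers = dpp_backward(
    f=lambda s, x, u: u, sigma=lambda s, x, u: 0.0,
    l=lambda s, x, u: 0.0, m=abs,
    controls=[-1.0, 0.0, 1.0], grid=grid, T=1.0, n_steps=10,
)
print(layers[0][35], layers[0][50])  # approximations of V(0, 0.5) and V(0, 2.0)
```

For this toy problem the exact value is max(|x| - 1, 0), since the state can be steered toward the origin at unit speed for one unit of time. Each backward step is a one-step instance of the DPP with the minimum taken over the control on a single time interval.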
We now state the continuity property of (8) in \(t \in [0,T]\).
Proposition 1
Suppose that (H.1) and (H.2) hold. Then, (8) is continuous in \(t \in [0,T]\). Specifically, there exists a constant \(C>0\) such that for \(a \in \mathbb{R}^{n}\) and \(t,t+\tau \in [0,T]\) with \(t < t+\tau\),
$$\begin{aligned} \bigl\lvert V(t+\tau ,a) - V(t,a) \bigr\rvert \leq C \bigl(1+ \vert a \vert \bigr)\tau ^{\frac{1}{2}},\quad \mathbb{P}\textit{-a.s.} \end{aligned}$$
Proof
It suffices to prove that
$$\begin{aligned} - C \bigl(1+ \vert a \vert \bigr)\tau ^{\frac{1}{2}} & \leq V(t,a) - V(t+\tau ,a) \leq C \bigl(1+ \vert a \vert \bigr) \tau ^{\frac{1}{2}},\quad \mathbb{P}\text{-a.s.} \end{aligned}$$
We first show that \(V(t,a) - V(t+\tau ,a) \leq C (1+\vert a \vert )\tau ^{\frac{1}{2}}\).
In view of Theorem 1, for each \(\epsilon > 0\), there exists \(u^{\prime } \in \mathcal{U}_{t,t+\tau }\) such that
$$\begin{aligned} \bigl\lvert V(t,a) - \Phi _{t,t+\tau }^{t,a;u^{\prime }} \bigl[ V \bigl(t+\tau , x_{t+\tau }^{t,a;u^{\prime }} \bigr) \bigr] \bigr\rvert \leq \epsilon ,\quad \mathbb{P}\text{-a.s.} \end{aligned}$$
This implies that
$$\begin{aligned} V(t,a) - V(t+\tau ,a) & \leq I^{(1)} + I^{(2)} + \epsilon , \quad \mathbb{P}\text{-a.s.}, \end{aligned}$$
where
$$\begin{aligned}& I^{(1)} := \Phi _{t,t+\tau }^{t,a;u^{\prime }} \bigl[ V \bigl(t+\tau , x_{t+\tau }^{t,a;u^{\prime }} \bigr) \bigr] - \Phi _{t,t+\tau }^{t,a;u^{\prime }} \bigl[ V(t+\tau ,a) \bigr], \\& I^{(2)} := \Phi _{t,t+\tau }^{t,a;u^{\prime }} \bigl[ V(t+\tau ,a) \bigr] - V(t+\tau ,a). \end{aligned}$$
From (i) of Lemma 1, Lemma 2, and Jensen’s inequality, (\(\mathbb{P}\)-a.s.)
$$\begin{aligned} \bigl\lvert I^{(1)} \bigr\rvert & \leq C \mathbb{E} \bigl[ \bigl\lvert V \bigl(t+\tau , x_{t+\tau }^{t,a;u^{\prime }} \bigr) - V(t+\tau ,a) \bigr\rvert ^{2} \,\big\vert \, \mathcal{F}_{t} \bigr]^{\frac{1}{2}} \\ & \leq C \mathbb{E} \bigl[ \bigl\lvert x_{t+\tau }^{t,a;u^{\prime }} - a \bigr\rvert ^{2} \,\big\vert \, \mathcal{F}_{t} \bigr] ^{\frac{1}{2}} \leq C \bigl(1+ \vert a \vert \bigr)\tau ^{\frac{1}{2}}. \end{aligned}$$
(13)
Moreover, from the definition of Φ and the terminal condition of Φ in \(I^{(2)}\), we use the Cauchy–Schwarz inequality, Lemma 1 and (H.2) to obtain
$$\begin{aligned} \bigl\lvert I^{(2)} \bigr\rvert & = \biggl\lvert \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int _{t}^{t+\tau } l \bigl(s,x_{s}^{t,a;u^{\prime }},u_{s}^{\prime }, \bar{y}_{s}^{t,a;u^{\prime }}, \bar{z}_{s}^{t,a;u^{\prime }} \bigr) \,\mathrm{d}s \biggr] \biggr\rvert \\ & \leq \tau ^{\frac{1}{2}} \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int _{t}^{t+\tau } \bigl\lvert l \bigl(s,x_{s}^{t,a;u^{\prime }},u_{s}^{\prime }, \bar{y}_{s}^{t,a;u^{\prime }}, \bar{z}_{s}^{t,a;u^{\prime }} \bigr) \bigr\rvert ^{2} \,\mathrm{d}s \biggr]^{\frac{1}{2}} \\ & \leq C \tau ^{\frac{1}{2}} \mathbb{E}_{\mathcal{F}_{t}} \biggl[ \int _{t}^{t+\tau } \bigl[ 1 + \bigl\lvert x_{s}^{t,a;u^{\prime }} \bigr\rvert ^{2} + \bigl\lvert \bar{y}_{s}^{t,a;u^{\prime }} \bigr\rvert ^{2} + \bigl\lvert \bar{z}_{s}^{t,a;u^{\prime }} \bigr\rvert ^{2} \bigr] \,\mathrm{d}s \biggr]^{\frac{1}{2}} \\ & \leq C \bigl(1+ \vert a \vert \bigr) \tau ^{\frac{1}{2}}, \quad \mathbb{P}\text{-a.s.} \end{aligned}$$
(14)
Note that (13) and (14) lead to
$$\begin{aligned} V(t,a) - V(t+\tau ,a) & \leq C \bigl(1+ \vert a \vert \bigr)\tau ^{\frac{1}{2}} + \epsilon , \quad \mathbb{P}\text{-a.s.} \end{aligned}$$
Hence, the arbitrariness of ϵ implies \(V(t,a) - V(t+\tau ,a) \leq C(1+\vert a \vert ) \tau ^{1/2}\), \(\mathbb{P}\)-a.s. The other inequality can be proven in a similar way. This completes the proof. □
From Lemma 2 and Proposition 1, the following result holds:
Corollary 1
Assume that (H.1) and (H.2) hold. Then, the value function in (8) is continuous on \([0,T] \times \mathbb{R}^{n}\). Specifically, for \(a,a^{\prime } \in \mathbb{R}^{n}\) and \(t,t+\tau \in [0,T]\) with \(t < t+\tau\),
$$\begin{aligned} \bigl\lvert V \bigl(t+\tau ,a^{\prime } \bigr) - V(t,a) \bigr\rvert \leq C \bigl( \bigl\lvert a-a^{\prime } \bigr\rvert + \bigl(1+ \vert a \vert + \bigl\lvert a^{\prime } \bigr\rvert \bigr)\tau ^{\frac{1}{2}} \bigr), \quad \mathbb{P}\textit{-a.s.} \end{aligned}$$
We now state the verification theorem for (P).
Theorem 2
Assume that (H.1) and (H.2) hold. Suppose that the pair \((V,q) \in \mathcal{L}^{\infty }_{\mathcal{F}}(C^{2}(\mathbb{R}^{n})) \times \mathcal{L}^{2}_{\mathcal{F}}(C^{2}(\mathbb{R}^{n};\mathbb{R}^{1 \times r}))\) is the solution to the SHJB equation in (3). Then, for \(t \in [0,T]\), \(x \in L^{2}(\Omega ,\mathcal{F}_{t};\mathbb{R}^{n})\) and \(u \in \mathcal{U}_{t,T}\), \(V(t,x) \leq J(t,x;u)\), \(\mathbb{P}\)-a.s. Furthermore, assume that \(\widehat{u}_{s} \in U\) with \(\widehat{u} := (\widehat{u}_{s})_{s \in [t,T]} \in \mathcal{U}_{t,T}\) is the minimizer of the Hamiltonian in (3) for \(s \in [t,T]\), \(\mathbb{P}\)-a.s. Then, for \(t \in [0,T]\) and \(x \in L^{2}(\Omega ,\mathcal{F}_{t};\mathbb{R}^{n})\), we have \(V(t,x) =J(t,x;\widehat{u})\), \(\mathbb{P}\)-a.s., and \(\widehat{u} \in \mathcal{U}_{t,T}\) is the corresponding optimal control.
Proof
Suppose that \((V,q) \in \mathcal{L}^{\infty }_{\mathcal{F}}(C^{2}(\mathbb{R}^{n})) \times \mathcal{L}^{2}_{\mathcal{F}}(C^{2}(\mathbb{R}^{n};\mathbb{R}^{1 \times r}))\) is the solution of (3). Let \((x_{s}^{t,x;\widehat{u}})_{s \in [t,T]}\) be the state trajectory generated by \(\widehat{u} \in \mathcal{U}_{t,T}\) with \(x_{t}^{t,x;\widehat{u}} = x \in L^{2}(\Omega ,\mathcal{F}_{t}; \mathbb{R}^{n})\). Note that \(V(T,x_{T}^{t,x;\widehat{u}}) = m(x_{T}^{t,x;\widehat{u}})\) and \(V(t,x_{t}^{t,x;\widehat{u}}) = V(t,x)\), \(\mathbb{P}\)-a.s.
By using the Itô–Kunita formula [62] and the SHJB in (3), we have (\(\mathbb{P}\)-a.s.)
$$\begin{aligned} V \bigl(T,x_{T}^{t,x;\widehat{u}} \bigr) ={}& V(t,x) + \int _{t}^{T} \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), f \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) \bigr\rangle \,\mathrm{d}s \\ & {} + \frac{1}{2} \int _{t}^{T} \operatorname{Tr}\bigl(\sigma \sigma ^{\top } \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) D^{2} V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) \,\mathrm{d}s \\ & {} + \int _{t}^{T} \operatorname{Tr}\bigl(\sigma \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) D q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) \,\mathrm{d}s \\ & {} + \int _{t}^{T} \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \sigma \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) \bigr\rangle \,\mathrm{d}B_{s} \\ & {} - \int _{t}^{T} H \bigl(s,x_{s}^{t,x;\widehat{u}}, \bigl(V,D V, D^{2} V, q, D q \bigr) \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) \,\mathrm{d}s \\ & {} + \int _{t}^{T} q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \,\mathrm{d}B_{s} \\ ={}& V(t,x) - \int _{t}^{T} l \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s}, V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \\ &{} \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \sigma \bigl(s,x_{s}^{t,x;\widehat{u}},\widehat{u}_{s} \bigr) \bigr\rangle + q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) \,\mathrm{d}s \\ & {} + \int _{t}^{T} \bigl[ \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \sigma \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) \bigr\rangle + q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr] \,\mathrm{d}B_{s}. \end{aligned}$$
Let \((y_{s}^{t,x;\widehat{u}}, z_{s}^{t,x;\widehat{u}})_{s \in [t,T]}\) be the solution of the BSDE in (6) with \(\widehat{u} \in \mathcal{U}_{t,T}\). Let \(\widehat{y}_{s}^{\widehat{u}} := V(s,x_{s}^{t,x;\widehat{u}}) - y_{s}^{t,x;\widehat{u}}\) and \(\widehat{z}_{s}^{\widehat{u}} := \langle D V(s,x_{s}^{t,x;\widehat{u}}), \sigma (s,x_{s}^{t,x;\widehat{u}},\widehat{u}_{s}) \rangle + q(s,x_{s}^{t,x;\widehat{u}}) - z_{s}^{t,x;\widehat{u}}\). Note that \(\widehat{y}_{T}^{\widehat{u}} = 0\), \(\mathbb{P}\)-a.s. Then, we have
$$\begin{aligned} \mathrm{d}\widehat{y}_{s}^{\widehat{u}} ={}& - \bigl[ l \bigl(s,x_{s}^{t,x;\widehat{u}},\widehat{u}_{s}, V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \bigl\langle D V \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr), \sigma \bigl(s,x_{s}^{t,x;\widehat{u}}, \widehat{u}_{s} \bigr) \bigr\rangle \\ & {} + q \bigl(s,x_{s}^{t,x;\widehat{u}} \bigr) \bigr) - l \bigl(s,x_{s}^{t,x;\widehat{u}},\widehat{u}_{s}, y_{s}^{t,x;\widehat{u}}, z_{s}^{t,x;\widehat{u}} \bigr) \bigr] \,\mathrm{d}s + \widehat{z}_{s}^{\widehat{u}} \,\mathrm{d}B_{s} \\ ={}& - \bigl[ A_{s}^{(1)} \widehat{y}_{s}^{\widehat{u}} + A_{s}^{(2)} \widehat{z}_{s}^{\widehat{u}} \bigr] \,\mathrm{d}s + \widehat{z}_{s}^{\widehat{u}} \,\mathrm{d}B_{s}, \end{aligned}$$
(15)
where \(A^{(1)}\) and \(A^{(2)}\) are bounded coefficients (independent of ŷ and ẑ) due to (H.1) and (H.2). Since (15) is a linear BSDE, in view of [13, Proposition 4.1.2], we have \(\widehat{y}_{s}^{\widehat{u}} = 0\) for \(s \in [t,T]\), \(\mathbb{P}\)-a.s. Hence, it holds that \(V(t,x_{t}^{t,x;\widehat{u}}) = V(t,x) = y_{t}^{t,x;\widehat{u}} = J(t,x; \widehat{u})\), \(\mathbb{P}\)-a.s.
On the other hand, for any \(u \in \mathcal{U}_{t,T}\), by using an approach analogous to that above and (iii) of Lemma 1, we can show that \(\widehat{y}_{s}^{u} \leq 0\) for \(s \in [t,T]\), \(\mathbb{P}\)-a.s., which implies that \(V(t,x_{t}^{t,x;u}) = V(t,x) \leq y_{t}^{t,x;u} = J(t,x;u)\), \(\mathbb{P}\)-a.s. Note that the equality is achieved when \(u = \widehat{u} \in \mathcal{U}_{t,T}\). This shows that for any \(u \in \mathcal{U}_{t,T}\) and \(x \in L^{2}(\Omega ,\mathcal{F}_{t};\mathbb{R}^{n})\), we have
$$\begin{aligned} J(t,x;u) = y_{t}^{t,x;u} \geq y_{t}^{t,x;\widehat{u}} = J(t,x; \widehat{u}) = V(t,x), \quad \mathbb{P}\text{-a.s.}, \end{aligned}$$
where the last equality follows from the definition of the value function V in (8). This completes the proof of the theorem. □
Remark 5
In Sect. 3, we show the existence and uniqueness of the viscosity solution to the SHJB equation in (3). Furthermore, in the appendix, the existence and uniqueness of the weak solution to (3) is shown via the Sobolev-space technique.
General indefinite linear–quadratic problem with random coefficients
This subsection considers the general indefinite linear–quadratic (LQ) problem of (P) as an application of Theorem 2. For notational simplicity, we assume that \(r=1\), i.e., the Brownian motion is one-dimensional.
The LQ problem in this subsection is referred to as (LQP) with
$$\begin{aligned} \begin{cases} f(s,x,u) = A_{s} x + F_{s} u, \qquad \sigma (s,x,u) = C_{s} x + E_{s} u, \\ l(s,x,u,y,z) = \frac{1}{2} [ \langle x, Q_{s} x \rangle + \langle u, R_{s} u \rangle + y ] + z, \\ m(x) = \frac{1}{2}\langle x, M x \rangle , \end{cases} \end{aligned}$$
(16)
where A, F, C, E, Q, R are \(\{\mathcal{F}_{s}\}_{s \geq 0}\)-adapted continuous stochastic processes with appropriate dimensions, which are uniformly bounded in \(\omega \in \Omega\) (they belong to \(\mathcal{L}_{\mathcal{F}}^{\infty }\)), and \(M \in L^{\infty }(\Omega ,\mathcal{F}_{T};\mathbb{S}^{n})\). We assume that Q, R, M are symmetric matrices, which need not be definite.^{Footnote 5} When l in (16) is independent of y and z, (LQP) is reduced to the simplified LQ problem (with random coefficients) studied in [25, 26, 43–45] and the references therein.
From (4), the Hamiltonian can be written as (s argument is suppressed)
$$\begin{aligned} & H(s,x,y,p,P,q,\bar{P}) \\ &\quad = \operatorname*{ess\,inf}_{u} \biggl\{ \langle p, Ax + F u \rangle + \frac{1}{2} \bigl[ \langle x, Q x \rangle + \langle u, R u \rangle \bigr] + \frac{1}{2}y + q + \langle p, C x + E u \rangle \\ &\qquad {} + \frac{1}{2} \bigl\langle C x + E u, P(C x + E u) \bigr\rangle + \langle Cx + Eu, \bar{P} \rangle \biggr\} . \end{aligned}$$
(17)
Assume that \(R_{s} + E_{s}^{\top }P E_{s}\) is (uniformly) positive-definite for almost all \(\omega \in \Omega\) and \(s \in [0,T]\). Then the expression inside the infimum in (17) is strictly convex in u, and setting its gradient with respect to u to zero shows that H in (17) admits a unique minimizer, which can be written as follows:
$$\begin{aligned} \widehat{u} & = - \bigl(R + E^{\top }P E \bigr)^{-1} \bigl[ F^{\top }p + E^{\top }p + E^{\top }PC x + E^{\top }\bar{P} \bigr]. \end{aligned}$$
(18)
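The minimizer in (18) can be checked numerically in the scalar case. The sketch below is an illustration with hypothetical numbers; the function names `hamiltonian_integrand` and `minimizer` are not from the paper. It evaluates the expression inside the infimum in (17) and compares the stationary point with nearby controls, using an indefinite weight R < 0 for which R + E²P is still positive.

```python
def hamiltonian_integrand(u, x, y, p, P, q, Pbar, A, F, C, E, Q, R):
    """Scalar version of the expression inside the infimum in (17)."""
    diffusion = C * x + E * u
    return (p * (A * x + F * u)
            + 0.5 * (Q * x * x + R * u * u) + 0.5 * y + q
            + p * diffusion
            + 0.5 * P * diffusion * diffusion
            + diffusion * Pbar)


def minimizer(x, p, P, Pbar, F, C, E, R):
    """Stationary point in u of the integrand; requires R + E^2 * P > 0."""
    denom = R + E * E * P
    if denom <= 0.0:
        raise ValueError("R + E^2 * P must be positive")
    return -(F * p + E * p + E * P * C * x + E * Pbar) / denom


# Hypothetical data with an indefinite control weight: R < 0 but R + E^2 * P > 0.
coeffs = dict(A=0.5, F=1.0, C=0.2, E=0.3, Q=1.0, R=-0.05)
point = dict(x=1.0, y=0.7, p=0.4, P=2.0, q=0.1, Pbar=-0.3)

u_hat = minimizer(point["x"], point["p"], point["P"], point["Pbar"],
                  coeffs["F"], coeffs["C"], coeffs["E"], coeffs["R"])
h = lambda u: hamiltonian_integrand(u, **point, **coeffs)
print(u_hat, h(u_hat), h(u_hat - 0.5), h(u_hat + 0.5))
```

Because the integrand is quadratic in u with curvature R + E²P, moving a distance d away from the stationary point raises its value by exactly (R + E²P)d²/2, which is what the printed values reflect.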
By substituting (18) into (17), the SHJB in (3) is obtained by
$$\begin{aligned} \begin{cases} \mathrm{d}V(s,x) = - H(s,x,(V,DV,D^{2} V, q, D q)(s,x)) \,\mathrm{d}s + q(s,x) \,\mathrm{d}B_{s}, \\ (s,x) \in [0,T) \times \mathbb{R}^{n}, \\ V(T,x) = \frac{1}{2} x^{\top }M x, \quad x \in \mathbb{R}^{n}, \end{cases} \end{aligned}$$
(19)
where (s argument is suppressed)
$$\begin{aligned} & H(s,x,y,p,P,q,\bar{P}) \\ &\quad = x^{\top }A^{\top }p + \frac{1}{2} x^{\top }Q x + \frac{1}{2} y + x^{\top }C^{\top }p + \frac{1}{2} x^{\top }C^{\top }P C x + x^{\top }C^{\top }\bar{P} + q \\ &\qquad {} - \frac{1}{2} \bigl[ F^{\top }p + E^{\top }p + E^{\top }PC x + E^{\top }\bar{P} \bigr]^{\top } \bigl(R + E^{\top }P E \bigr)^{-1} \\ &\qquad {} \times \bigl[ F^{\top }p + E^{\top }p + E^{\top }PC x + E^{\top }\bar{P} \bigr]. \end{aligned}$$
(20)
In view of the verification theorem in Theorem 2, we need to find the solution of (19) to solve (LQP).
We conjecture that the solution of (19) is quadratic in x, i.e.,
$$\begin{aligned} &V(s,x) = \frac{1}{2} x^{\top }\Lambda _{s} x,\qquad q(s,x) = \frac{1}{2} x^{\top }\bar{\Lambda }_{s} x, \end{aligned}$$
(21)
where it is assumed that Λ, Λ̄ are \(\{\mathcal{F}_{s}\}_{s \geq 0}\)-adapted symmetric \(n \times n\)-valued stochastic processes with \(\Lambda _{T} = M\), i.e., \((\Lambda ,\bar{\Lambda }) \in \mathcal{L}_{\mathcal{F}}^{\infty }(\mathbb{S}^{n}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{S}^{n})\). Under this assumption, V and q in (21) are smooth, i.e., \((V,q) \in \mathcal{L}^{\infty }_{\mathcal{F}}(C^{2}(\mathbb{R}^{n})) \times \mathcal{L}^{2}_{\mathcal{F}}(C^{2}(\mathbb{R}^{n};\mathbb{R}^{1 \times r}))\), where \(D V(s,x) = \Lambda _{s} x\) and \(D q(s,x) = \bar{\Lambda }_{s} x\) are well defined. Then, by substituting (21) into (20), we can easily see that the SHJB equation in (19) admits a unique smooth solution if the following stochastic Riccati differential equation (SRDE) admits a unique solution:
$$\begin{aligned} \begin{cases} \mathrm{d}\Lambda _{s} = - [ A_{s}^{\top }\Lambda _{s} + \Lambda _{s} A_{s} + Q_{s} + \Lambda _{s} + C_{s}^{\top }\Lambda _{s} C_{s} \\ \hphantom{\mathrm{d}\Lambda _{s} =}{} + \bar{\Lambda }_{s} + C_{s}^{\top }\Lambda _{s} + \Lambda _{s} C_{s} + C_{s}^{\top }\bar{\Lambda }_{s} + \bar{\Lambda }_{s} C_{s} \\ \hphantom{\mathrm{d}\Lambda _{s} =}{} - [ F_{s}^{\top }\Lambda _{s} + E_{s}^{\top }\Lambda _{s} + E_{s}^{\top }\Lambda _{s} C_{s} + E_{s}^{\top }\bar{\Lambda }_{s} ]^{\top }(R_{s} + E_{s}^{\top }\Lambda _{s} E_{s})^{-1} \\ \hphantom{\mathrm{d}\Lambda _{s} =}{} \times [ F_{s}^{\top }\Lambda _{s} + E_{s}^{\top }\Lambda _{s} + E_{s}^{\top }\Lambda _{s} C_{s} + E_{s}^{\top }\bar{\Lambda }_{s} ] ]\,\mathrm{d}s + \bar{\Lambda }_{s} \,\mathrm{d}B_{s}, \\ \Lambda _{T} = M. \end{cases} \end{aligned}$$
(22)
Note that the solution Λ of (22) is a symmetric \(n \times n\)-valued stochastic process. Here, a solution of the SRDE in (22) is an adapted pair \((\Lambda ,\bar{\Lambda }) \in \mathcal{L}_{\mathcal{F}}^{\infty }(\mathbb{S}^{n}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{S}^{n})\), and (22) can be viewed as a matrix-valued BSDE with random coefficients.
By substituting (21) into (18), from Theorem 2, the optimal control for (LQP) can be obtained by
$$\begin{aligned} \widehat{u}_{s} & = - \bigl(R_{s} + E_{s}^{\top }\Lambda _{s} E_{s} \bigr)^{-1} \bigl[ F_{s}^{\top }\Lambda _{s} + E_{s}^{\top }\Lambda _{s} + E_{s}^{\top }\Lambda _{s} C_{s} + E_{s}^{\top }\bar{\Lambda }_{s} \bigr] x_{s}^{t,a;\widehat{u}}, \end{aligned}$$
(23)
provided that \(R_{s} + E_{s}^{\top }\Lambda _{s} E_{s}\) is (uniformly) positive-definite for almost all \(\omega \in \Omega\) and \(s \in [0,T]\).^{Footnote 6}
In summary, by applying the verification theorem in Theorem 2, we have the following result:
Proposition 2
Suppose that the pair \((\Lambda ,\bar{\Lambda }) \in \mathcal{L}_{\mathcal{F}}^{\infty }(\mathbb{S}^{n}) \times \mathcal{L}_{\mathcal{F}}^{2}(\mathbb{S}^{n})\) is the solution of the SRDE in (22) and that \(R_{s} + E_{s}^{\top }\Lambda _{s} E_{s}\) is (uniformly) positive-definite for almost all \(\omega \in \Omega\) and \(s \in [0,T]\). Then, for \(x \in L^{2}(\Omega ,\mathcal{F}_{t};\mathbb{R}^{n})\), \(V(t,x) = \frac{1}{2}\langle x, \Lambda _{t} x \rangle\) is the value function of (LQP) (equivalently, \(V(t,x) = \frac{1}{2}\langle x, \Lambda _{t} x \rangle\) is the optimal cost), and (23) is the corresponding optimal control.
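When all coefficients are deterministic constants, a natural special case to examine is Λ̄ = 0, for which (22) becomes an ordinary Riccati differential equation. The sketch below integrates that scalar equation backward from Λ_T = M with an explicit Euler scheme, taking (22) as stated, and evaluates the feedback gain in (23). It is a hedged illustration only: the function names and numerical values are hypothetical, and it says nothing about the solvability of (22) with random coefficients.

```python
import math


def solve_scalar_riccati(A, F, C, E, Q, R, M, T, n_steps=2000):
    """Explicit Euler, backward in time, for the scalar form of (22) with Lambda-bar = 0.

    Returns lams with lams[k] approximating Lambda at time k * T / n_steps.
    """
    h = T / n_steps
    lam = M                                   # terminal condition Lambda_T = M
    lams = [lam]
    for _ in range(n_steps):
        denom = R + E * E * lam
        if denom <= 0.0:
            raise ValueError("R + E^2 * Lambda must stay positive")
        b = (F + E + E * C) * lam
        drift = 2.0 * A * lam + Q + lam + C * C * lam + 2.0 * C * lam - b * b / denom
        lam = lam + h * drift                 # dLambda = -drift ds, stepped from s to s - h
        lams.append(lam)
    lams.reverse()
    return lams


def feedback_gain(lam, F, C, E, R):
    """Gain K in u = K * x from (23) with Lambda-bar = 0 (scalar case)."""
    return -(F + E + E * C) * lam / (R + E * E * lam)


# Hypothetical constant coefficients.
lams = solve_scalar_riccati(A=0.0, F=1.0, C=0.0, E=0.5, Q=1.0, R=1.0, M=1.0, T=1.0)
print(lams[0], feedback_gain(lams[0], F=1.0, C=0.0, E=0.5, R=1.0))
```

The positivity check mirrors the condition on R + EᵀΛE in Proposition 2: the weights Q, R and M may be indefinite as long as this quantity stays positive along the computed solution.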
Remark 6
The solvability of the SRDE in (22) is an open problem. When l does not depend on y and z, the solvability of the corresponding SRDEs has been discussed extensively in the literature; see [25, 26, 43–45] and the references therein. Moreover, we can consider the case of jump-diffusion models as in [63].