This article covers three topics:
- The definition and properties of $\lambda$-strongly convex functions.
- The definition and properties of $\mu$-smooth functions.
- The connection between these two concepts, established through the conjugate subgradient theorem.
Definition 1 [Strongly convex function]: A function $f(\cdot)$ is $\lambda$-strongly convex on a set $C$ if $f(\cdot) - \frac{\lambda}{2} \|\cdot\|^2$ is a convex function on $C$.
Intuitively, a strongly convex function must be at least as "steep" as a quadratic function. The definition has several equivalent characterizations:
Proposition 2: A function $f$ is $\lambda$-strongly convex on a set $C$ if and only if, for all $\boldsymbol{x}, \boldsymbol{y} \in C$ and all $\alpha \in [0, 1]$, the following inequality holds: \begin{align*} \alpha f(\boldsymbol{x}) + (1-\alpha) f(\boldsymbol{y}) \geq f(\alpha \boldsymbol{x} + (1-\alpha) \boldsymbol{y}) + \frac{\lambda \alpha (1-\alpha)}{2} \|\boldsymbol{x}-\boldsymbol{y}\|^2 \end{align*}
Proof: The function $f(\cdot) - \frac{\lambda}{2} \|\cdot\|^2$ is convex if and only if \begin{align*} \alpha \left( f(\boldsymbol{x}) - \frac{\lambda}{2} \|\boldsymbol{x}\|^2 \right) + (1-\alpha) \left( f(\boldsymbol{y}) - \frac{\lambda}{2} \|\boldsymbol{y}\|^2 \right) \geq f(\alpha \boldsymbol{x} + (1-\alpha) \boldsymbol{y}) - \frac{\lambda}{2} \|\alpha \boldsymbol{x} + (1-\alpha) \boldsymbol{y}\|^2 \end{align*} Rearranging the terms gives \begin{align*} \alpha f(\boldsymbol{x}) + (1-\alpha) f(\boldsymbol{y}) & \geq f(\alpha \boldsymbol{x} + (1-\alpha) \boldsymbol{y}) + \frac{\lambda}{2} \alpha \|\boldsymbol{x}\|^2 + \frac{\lambda}{2} (1-\alpha) \|\boldsymbol{y}\|^2 - \frac{\lambda}{2} \|\alpha \boldsymbol{x} + (1-\alpha) \boldsymbol{y}\|^2 \\ & = f(\alpha \boldsymbol{x} + (1-\alpha) \boldsymbol{y}) + \frac{\lambda}{2} \left( \alpha (1-\alpha) \|\boldsymbol{x}\|^2 + \alpha (1-\alpha) \|\boldsymbol{y}\|^2 - 2 \alpha (1-\alpha) \boldsymbol{x}^\top \boldsymbol{y} \right) \\ & = f(\alpha \boldsymbol{x} + (1-\alpha) \boldsymbol{y}) + \frac{\lambda \alpha (1-\alpha)}{2} \|\boldsymbol{x} - \boldsymbol{y}\|^2 \end{align*} Every step is reversible, so both directions of the equivalence follow.
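As a quick sanity check of Proposition 2, consider the one-dimensional function $f(x) = x^2$, which is $2$-strongly convex because $f(x) - \frac{2}{2} x^2 = 0$ is convex. The values $x = 1$, $y = 3$ and $\alpha = \frac{1}{2}$ below are chosen purely for illustration and do not come from the original text:

```latex
\begin{align*}
\alpha f(x) + (1-\alpha) f(y) &= \tfrac{1}{2} \cdot 1 + \tfrac{1}{2} \cdot 9 = 5, \\
f(\alpha x + (1-\alpha) y) + \frac{\lambda \alpha (1-\alpha)}{2} (x - y)^2
  &= f(2) + \frac{2 \cdot \frac{1}{2} \cdot \frac{1}{2}}{2} \cdot 4 = 4 + 1 = 5.
\end{align*}
```

The two sides agree, so a pure quadratic satisfies the inequality with equality. This matches the intuition that a strongly convex function is at least as steep as a quadratic.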
Proposition 3 [Uniqueness]: If $f$ is a $\lambda$-strongly convex function, then its minimizer is unique.
Proof: Suppose that $\boldsymbol{x}$ and $\boldsymbol{y}$ are both minimizers of $f$, so that $f(\boldsymbol{x}) = f(\boldsymbol{y})$. Setting $\alpha = \frac{1}{2}$ in Proposition 2 gives \begin{align*} f(\boldsymbol{x}) \geq f\left( \frac{\boldsymbol{x} + \boldsymbol{y}}{2} \right) + \frac{\lambda}{8} \|\boldsymbol{x} - \boldsymbol{y}\|^2 \geq f\left( \frac{\boldsymbol{x} + \boldsymbol{y}}{2} \right) \geq f(\boldsymbol{x}) \end{align*} Hence every inequality holds with equality, which forces $\|\boldsymbol{x} - \boldsymbol{y}\| = 0$. Therefore $\boldsymbol{x} = \boldsymbol{y}$, and the minimizer is unique.
Proposition 4 [First-order property]: A function $f$ is $\lambda$-strongly convex on a set $C$ if and only if, for all $\boldsymbol{x}, \boldsymbol{y} \in C$, \begin{align} \label{equ:first order} \forall \boldsymbol{g} \in \partial f(\boldsymbol{x}), \ f(\boldsymbol{y}) \geq f(\boldsymbol{x}) + \boldsymbol{g}^\top (\boldsymbol{y} - \boldsymbol{x}) + \frac{\lambda}{2} \|\boldsymbol{y} - \boldsymbol{x}\|^2 \end{align}
Proof: On the one hand, suppose $f$ is $\lambda$-strongly convex, so that $f(\cdot) - \frac{\lambda}{2} \|\cdot\|^2$ is convex. For any $\boldsymbol{g} \in \partial f(\boldsymbol{x})$, the vector $\boldsymbol{g} - \lambda \boldsymbol{x}$ is a subgradient of this convex function at $\boldsymbol{x}$, hence \begin{align*} f(\boldsymbol{y}) - \frac{\lambda}{2} \|\boldsymbol{y}\|^2 \geq f(\boldsymbol{x}) - \frac{\lambda}{2} \|\boldsymbol{x}\|^2 + (\boldsymbol{g} - \lambda \boldsymbol{x})^\top (\boldsymbol{y} - \boldsymbol{x}) \end{align*} Rearranging the terms gives \begin{align*} f(\boldsymbol{y}) & \geq f(\boldsymbol{x}) + \boldsymbol{g}^\top (\boldsymbol{y} - \boldsymbol{x}) + \frac{\lambda}{2} \|\boldsymbol{y}\|^2 - \frac{\lambda}{2} \|\boldsymbol{x}\|^2 - \lambda \boldsymbol{x}^\top (\boldsymbol{y} - \boldsymbol{x}) \\ & = f(\boldsymbol{x}) + \boldsymbol{g}^\top (\boldsymbol{y} - \boldsymbol{x}) + \frac{\lambda}{2} \|\boldsymbol{y}\|^2 - \lambda \boldsymbol{x}^\top \boldsymbol{y} + \frac{\lambda}{2} \|\boldsymbol{x}\|^2 \\ & = f(\boldsymbol{x}) + \boldsymbol{g}^\top (\boldsymbol{y} - \boldsymbol{x}) + \frac{\lambda}{2} \|\boldsymbol{y} - \boldsymbol{x}\|^2 \end{align*}
On the other hand, suppose (\ref{equ:first order}) holds and let $\boldsymbol{z} = \alpha \boldsymbol{x} + (1-\alpha) \boldsymbol{y}$. Then for any $\boldsymbol{g} \in \partial f(\boldsymbol{z})$, \begin{align} \label{equ:first order proof 1} f(\boldsymbol{x}) & \geq f(\boldsymbol{z}) + \boldsymbol{g}^\top (\boldsymbol{x} - \boldsymbol{z}) + \frac{\lambda}{2} \|\boldsymbol{x} - \boldsymbol{z}\|^2 = f(\boldsymbol{z}) + \boldsymbol{g}^\top (\boldsymbol{x} - \boldsymbol{z}) + \frac{\lambda}{2} (1-\alpha)^2 \|\boldsymbol{x} - \boldsymbol{y}\|^2 \\ \label{equ:first order proof 2} f(\boldsymbol{y}) & \geq f(\boldsymbol{z}) + \boldsymbol{g}^\top (\boldsymbol{y} - \boldsymbol{z}) + \frac{\lambda}{2} \|\boldsymbol{y} - \boldsymbol{z}\|^2 = f(\boldsymbol{z}) + \boldsymbol{g}^\top (\boldsymbol{y} - \boldsymbol{z}) + \frac{\lambda}{2} \alpha^2 \|\boldsymbol{y} - \boldsymbol{x}\|^2 \end{align} Computing $(\ref{equ:first order proof 1}) \times \alpha + (\ref{equ:first order proof 2}) \times (1-\alpha)$, the terms in $\boldsymbol{g}$ cancel because $\alpha (\boldsymbol{x} - \boldsymbol{z}) + (1-\alpha) (\boldsymbol{y} - \boldsymbol{z}) = \boldsymbol{0}$, which gives \begin{align*} \alpha f(\boldsymbol{x}) + (1-\alpha) f(\boldsymbol{y}) & \geq f(\boldsymbol{z}) + \frac{\lambda}{2} (1-\alpha)^2 \alpha \|\boldsymbol{x} - \boldsymbol{y}\|^2 + \frac{\lambda}{2} \alpha^2 (1-\alpha) \|\boldsymbol{y} - \boldsymbol{x}\|^2 \\ & = f(\alpha \boldsymbol{x} + (1-\alpha) \boldsymbol{y}) + \frac{\lambda}{2} \alpha (1-\alpha) \|\boldsymbol{x} - \boldsymbol{y}\|^2 \end{align*} By Proposition 2, $f$ is $\lambda$-strongly convex.
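The first-order property also applies to functions that are not differentiable. As an illustrative sketch (this example is an assumption added here, not part of the original text), take the one-dimensional function $f(y) = |y| + \frac{\lambda}{2} y^2$. It is $\lambda$-strongly convex because $f(y) - \frac{\lambda}{2} y^2 = |y|$ is convex, and its subdifferential at $0$ is $\partial f(0) = [-1, 1]$. Since $|y| \geq g y$ whenever $|g| \leq 1$, the inequality of Proposition 4 at $x = 0$ can be checked directly:

```latex
\begin{align*}
f(y) = |y| + \frac{\lambda}{2} y^2
  \geq g y + \frac{\lambda}{2} y^2
  = f(0) + g (y - 0) + \frac{\lambda}{2} (y - 0)^2,
  \quad \forall g \in [-1, 1].
\end{align*}
```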
Proposition 5: If $f$ is differentiable on a set $C$, then $f$ is $\lambda$-strongly convex if and only if, for all $\boldsymbol{x}, \boldsymbol{y} \in C$, \begin{align*} (\nabla f(\boldsymbol{y}) - \nabla f(\boldsymbol{x}))^\top (\boldsymbol{y} - \boldsymbol{x}) \geq \lambda \|\boldsymbol{y} - \boldsymbol{x}\|^2 \end{align*} Moreover, if $f$ is twice differentiable, a sufficient condition for $f$ to be $\lambda$-strongly convex is \begin{align*} \boldsymbol{x}^\top \nabla^2 f(\boldsymbol{y}) \boldsymbol{x} \geq \lambda \|\boldsymbol{x}\|^2, \ \forall \boldsymbol{y}, \boldsymbol{x} \end{align*}
Proof: On the one hand, Proposition 4 gives \begin{align*} f(\boldsymbol{y}) & \geq f(\boldsymbol{x}) + \nabla f(\boldsymbol{x})^\top (\boldsymbol{y} - \boldsymbol{x}) + \frac{\lambda}{2} \|\boldsymbol{y} - \boldsymbol{x}\|^2 \\ f(\boldsymbol{x}) & \geq f(\boldsymbol{y}) + \nabla f(\boldsymbol{y})^\top (\boldsymbol{x} - \boldsymbol{y}) + \frac{\lambda}{2} \|\boldsymbol{x} - \boldsymbol{y}\|^2 \end{align*} Adding the two inequalities yields \begin{align*} (\nabla f(\boldsymbol{y}) - \nabla f(\boldsymbol{x}))^\top (\boldsymbol{y} - \boldsymbol{x}) \geq \lambda \|\boldsymbol{y} - \boldsymbol{x}\|^2 \end{align*}
On the other hand, let $h(\alpha) = f(\boldsymbol{y} + \alpha (\boldsymbol{x} - \boldsymbol{y}))$ and $\boldsymbol{w} = \boldsymbol{y} + \alpha (\boldsymbol{x} - \boldsymbol{y})$, so that $h'(\alpha) = \nabla f(\boldsymbol{w})^\top (\boldsymbol{x} - \boldsymbol{y})$. Since $\boldsymbol{w} - \boldsymbol{y} = \alpha (\boldsymbol{x} - \boldsymbol{y})$, for $\alpha \in (0, 1]$ we have \begin{align*} h'(\alpha) - h'(0) = \nabla f(\boldsymbol{w})^\top (\boldsymbol{x} - \boldsymbol{y}) - \nabla f(\boldsymbol{y})^\top (\boldsymbol{x} - \boldsymbol{y}) \geq \frac{\lambda}{\alpha} \|\boldsymbol{w} - \boldsymbol{y}\|^2 = \lambda \alpha \|\boldsymbol{x} - \boldsymbol{y}\|^2 \end{align*} Integrating over $\alpha$ gives \begin{align*} f(\boldsymbol{x}) - f(\boldsymbol{y}) - \nabla f(\boldsymbol{y})^\top (\boldsymbol{x} - \boldsymbol{y}) = h(1) - h(0) - h'(0) = \int_0^1 (h'(\alpha) - h'(0)) \mbox{d} \alpha \geq \frac{\lambda}{2} \|\boldsymbol{x} - \boldsymbol{y}\|^2 \end{align*}
By Proposition 4, $f$ is therefore $\lambda$-strongly convex.
If $f$ is twice differentiable and the Hessian condition holds, then $h''(\alpha) = (\boldsymbol{x} - \boldsymbol{y})^\top \nabla^2 f(\boldsymbol{w}) (\boldsymbol{x} - \boldsymbol{y}) \geq \lambda \|\boldsymbol{x} - \boldsymbol{y}\|^2$. By Taylor's formula there exists $\theta \in [0, 1]$ such that \begin{align*} h(1) = h(0) + h'(0) + \frac{1}{2} h''(\theta) \end{align*} so \begin{align*} f(\boldsymbol{x}) = h(1) = h(0) + h'(0) + \frac{1}{2} h''(\theta) \geq f(\boldsymbol{y}) + \nabla f(\boldsymbol{y})^\top (\boldsymbol{x} - \boldsymbol{y}) + \frac{\lambda}{2} \|\boldsymbol{x} - \boldsymbol{y}\|^2 \end{align*} By Proposition 4, $f$ is $\lambda$-strongly convex.
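To make the gradient condition concrete, here is a short worked example under assumptions introduced only for illustration: a quadratic $f(\boldsymbol{x}) = \frac{1}{2} \boldsymbol{x}^\top A \boldsymbol{x} - \boldsymbol{b}^\top \boldsymbol{x}$ with $A$ symmetric positive definite, and $\lambda_{\min}(A)$ denoting the smallest eigenvalue of $A$. Its gradient is $\nabla f(\boldsymbol{x}) = A \boldsymbol{x} - \boldsymbol{b}$ and its Hessian is $A$, so:

```latex
\begin{align*}
(\nabla f(\boldsymbol{y}) - \nabla f(\boldsymbol{x}))^\top (\boldsymbol{y} - \boldsymbol{x})
  = (\boldsymbol{y} - \boldsymbol{x})^\top A (\boldsymbol{y} - \boldsymbol{x})
  \geq \lambda_{\min}(A) \|\boldsymbol{y} - \boldsymbol{x}\|^2 .
\end{align*}
```

Under these assumptions the quadratic is $\lambda_{\min}(A)$-strongly convex. The same bound reads $\boldsymbol{z}^\top \nabla^2 f(\boldsymbol{y}) \boldsymbol{z} \geq \lambda_{\min}(A) \|\boldsymbol{z}\|^2$ for every $\boldsymbol{z}$, which is the Hessian form of the condition.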
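Definition 6 [Smooth function]: A function $f(\cdot)$ is $\mu$-smooth on a set $C$ if it is differentiable and its gradient is $\mu$-Lipschitz on $C$, that is, $\|\nabla f(\boldsymbol{x}) - \nabla f(\boldsymbol{y})\| \leq \mu \|\boldsymbol{x} - \boldsymbol{y}\|$ for all $\boldsymbol{x}, \boldsymbol{y} \in C$.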
Intuitively, the gradient of a smooth function cannot change too "drastically".
Proposition 7: If $f$ is a $\mu$-smooth function on $C$, then for all $\boldsymbol{x}, \boldsymbol{y} \in C$, \begin{align*} f(\boldsymbol{x}) \leq f(\boldsymbol{y}) + \nabla f(\boldsymbol{y})^\top (\boldsymbol{x} - \boldsymbol{y}) + \frac{\mu}{2} \|\boldsymbol{x} - \boldsymbol{y}\|^2 \end{align*}
Proof: Let $h(\alpha) = f(\boldsymbol{y} + \alpha (\boldsymbol{x} - \boldsymbol{y}))$ and $\boldsymbol{w} = \boldsymbol{y} + \alpha (\boldsymbol{x} - \boldsymbol{y})$, so that $h'(\alpha) = \nabla f(\boldsymbol{w})^\top (\boldsymbol{x} - \boldsymbol{y})$. Then \begin{align*} f(\boldsymbol{x}) - f(\boldsymbol{y}) - \nabla f(\boldsymbol{y})^\top (\boldsymbol{x} - \boldsymbol{y}) & = h(1) - h(0) - h'(0) \\ & = \int_0^1 (h'(\alpha) - h'(0)) \mbox{d} \alpha \\ & = \int_0^1 (\nabla f(\boldsymbol{w}) - \nabla f(\boldsymbol{y}))^\top (\boldsymbol{x} - \boldsymbol{y}) \mbox{d} \alpha \\ & \leq \int_0^1 \|\nabla f(\boldsymbol{w}) - \nabla f(\boldsymbol{y})\| \|\boldsymbol{x} - \boldsymbol{y}\| \mbox{d} \alpha \\ & \leq \int_0^1 \mu \|\boldsymbol{w} - \boldsymbol{y}\| \|\boldsymbol{x} - \boldsymbol{y}\| \mbox{d} \alpha \\ & = \int_0^1 \mu \alpha \|\boldsymbol{x} - \boldsymbol{y}\|^2 \mbox{d} \alpha \\ & = \frac{\mu}{2} \|\boldsymbol{x} - \boldsymbol{y}\|^2 \end{align*}
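The bound of Proposition 7 can be checked by hand on a quadratic. The following sketch assumes, purely for illustration, that $f(\boldsymbol{x}) = \frac{1}{2} \boldsymbol{x}^\top A \boldsymbol{x} - \boldsymbol{b}^\top \boldsymbol{x}$ with $A$ symmetric positive definite and $\lambda_{\max}(A)$ its largest eigenvalue. The gradient $\nabla f(\boldsymbol{x}) = A \boldsymbol{x} - \boldsymbol{b}$ is then Lipschitz with constant $\lambda_{\max}(A)$, and the gap in Proposition 7 has a closed form:

```latex
\begin{align*}
\|\nabla f(\boldsymbol{x}) - \nabla f(\boldsymbol{y})\|
  &= \|A (\boldsymbol{x} - \boldsymbol{y})\|
  \leq \lambda_{\max}(A) \|\boldsymbol{x} - \boldsymbol{y}\|, \\
f(\boldsymbol{x}) - f(\boldsymbol{y}) - \nabla f(\boldsymbol{y})^\top (\boldsymbol{x} - \boldsymbol{y})
  &= \frac{1}{2} (\boldsymbol{x} - \boldsymbol{y})^\top A (\boldsymbol{x} - \boldsymbol{y})
  \leq \frac{\lambda_{\max}(A)}{2} \|\boldsymbol{x} - \boldsymbol{y}\|^2 .
\end{align*}
```

So in this example $\mu = \lambda_{\max}(A)$, and the quadratic upper bound is attained when $\boldsymbol{x} - \boldsymbol{y}$ is an eigenvector for the largest eigenvalue.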
Finally, strong convexity and smoothness are connected by the following proposition:
Proposition 8: A proper closed convex function $f$ is $\lambda$-strongly convex if and only if its conjugate function $f^*$ is $\frac{1}{\lambda}$-smooth.
Before giving the detailed proof, we need the conjugate subgradient theorem and a corollary of it as tools.
Proposition 9 [Conjugate subgradient theorem]: Let $f: \mathbb{R}^n \mapsto (-\infty, \infty]$ be a proper closed convex function. For a pair of vectors $(\boldsymbol{x}, \boldsymbol{y})$, the following three conditions are equivalent:
- $\boldsymbol{x}^\top \boldsymbol{y} = f(\boldsymbol{x}) + f^*(\boldsymbol{y})$.
- $\boldsymbol{y} \in \partial f(\boldsymbol{x})$.
- $\boldsymbol{x} \in \partial f^*(\boldsymbol{y})$.
Proof: We first show that conditions (1) and (2) are equivalent. Since $f^*(\boldsymbol{y}) = \sup_{\boldsymbol{z}} \{ \boldsymbol{y}^\top \boldsymbol{z} - f(\boldsymbol{z}) \}$, the pair $(\boldsymbol{x}, \boldsymbol{y})$ satisfies condition (1) if and only if \begin{align*} \boldsymbol{x}^\top \boldsymbol{y} - f(\boldsymbol{x}) = f^*(\boldsymbol{y}) \geq \boldsymbol{y}^\top \boldsymbol{z} - f(\boldsymbol{z}), \ \forall \boldsymbol{z} \in \mathbb{R}^n \end{align*} Rearranging, this says that $f(\boldsymbol{z}) \geq f(\boldsymbol{x}) + \boldsymbol{y}^\top (\boldsymbol{z} - \boldsymbol{x})$ for all $\boldsymbol{z} \in \mathbb{R}^n$, that is, $\boldsymbol{y} \in \partial f(\boldsymbol{x})$. Because $f$ is proper closed convex, $f^{**} = f$, so applying the same argument to $f^*$ shows that conditions (1) and (3) are equivalent.
As a corollary, for any vectors $\boldsymbol{y}$ and $\boldsymbol{z}$, the conjugate subgradient theorem gives \begin{align*} \boldsymbol{z} \in \arg \max_{\boldsymbol{x} \in \mathbb{R}^n} \left\{ \boldsymbol{x}^\top \boldsymbol{y} - f(\boldsymbol{x}) \right\} \Leftrightarrow \boldsymbol{z}^\top \boldsymbol{y} - f(\boldsymbol{z}) = f^*(\boldsymbol{y}) \Leftrightarrow \boldsymbol{z}^\top \boldsymbol{y} = f(\boldsymbol{z}) + f^*(\boldsymbol{y}) \Leftrightarrow \boldsymbol{z} \in \partial f^*(\boldsymbol{y}) \end{align*} If $f$ is strongly convex, then Proposition 3, applied to the strongly convex function $f(\boldsymbol{x}) - \boldsymbol{x}^\top \boldsymbol{y}$, shows that the maximizer of $\boldsymbol{x}^\top \boldsymbol{y} - f(\boldsymbol{x})$ is unique. Thus $\partial f^*(\boldsymbol{y})$ contains exactly one element, so $f^*$ is differentiable, with $\nabla f^*(\boldsymbol{y}) = \arg \max_{\boldsymbol{x} \in \mathbb{R}^n} \left\{ \boldsymbol{x}^\top \boldsymbol{y} - f(\boldsymbol{x}) \right\}$.
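The identity $\nabla f^*(\boldsymbol{y}) = \arg\max_{\boldsymbol{x}} \{ \boldsymbol{x}^\top \boldsymbol{y} - f(\boldsymbol{x}) \}$ can be verified on a simple case. The sketch below assumes, for illustration only, that $f(\boldsymbol{x}) = \frac{1}{2} \boldsymbol{x}^\top A \boldsymbol{x}$ with $A$ symmetric positive definite. Setting the gradient of $\boldsymbol{x}^\top \boldsymbol{y} - f(\boldsymbol{x})$ to zero gives $\boldsymbol{y} - A \boldsymbol{x} = \boldsymbol{0}$, hence:

```latex
\begin{align*}
\arg\max_{\boldsymbol{x}} \left\{ \boldsymbol{x}^\top \boldsymbol{y} - \tfrac{1}{2} \boldsymbol{x}^\top A \boldsymbol{x} \right\}
  &= A^{-1} \boldsymbol{y}, \\
f^*(\boldsymbol{y})
  &= \boldsymbol{y}^\top A^{-1} \boldsymbol{y} - \tfrac{1}{2} \boldsymbol{y}^\top A^{-1} \boldsymbol{y}
  = \tfrac{1}{2} \boldsymbol{y}^\top A^{-1} \boldsymbol{y}, \\
\nabla f^*(\boldsymbol{y}) &= A^{-1} \boldsymbol{y} .
\end{align*}
```

The gradient of the conjugate coincides with the unique maximizer, as the corollary predicts.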
Finally, we give the proof of Proposition 8:
On the one hand, suppose $f$ is $\lambda$-strongly convex; the differentiability of $f^*$ was established above. For any $\boldsymbol{x}_1, \boldsymbol{x}_2$ and any $\alpha \in [0, 1]$, let $\boldsymbol{y}_1 \in \partial f(\boldsymbol{x}_1)$, $\boldsymbol{y}_2 \in \partial f(\boldsymbol{x}_2)$ and $\boldsymbol{x} = \alpha \boldsymbol{x}_1 + (1-\alpha) \boldsymbol{x}_2$. By Proposition 4, \begin{align} \label{equ:final proof 1} f(\boldsymbol{x}) & \geq f(\boldsymbol{x}_1) + \boldsymbol{y}_1^\top (\boldsymbol{x} - \boldsymbol{x}_1) + \frac{\lambda}{2} \|\boldsymbol{x} - \boldsymbol{x}_1\|^2 = f(\boldsymbol{x}_1) + (1-\alpha) \boldsymbol{y}_1^\top (\boldsymbol{x}_2 - \boldsymbol{x}_1) + \frac{\lambda}{2} (1-\alpha)^2 \|\boldsymbol{x}_1 - \boldsymbol{x}_2\|^2 \\ \label{equ:final proof 2} f(\boldsymbol{x}) & \geq f(\boldsymbol{x}_2) + \boldsymbol{y}_2^\top (\boldsymbol{x} - \boldsymbol{x}_2) + \frac{\lambda}{2} \|\boldsymbol{x} - \boldsymbol{x}_2\|^2 = f(\boldsymbol{x}_2) + \alpha \boldsymbol{y}_2^\top (\boldsymbol{x}_1 - \boldsymbol{x}_2) + \frac{\lambda}{2} \alpha^2 \|\boldsymbol{x}_1 - \boldsymbol{x}_2\|^2 \end{align} Computing $(\ref{equ:final proof 1}) \times \alpha + (\ref{equ:final proof 2}) \times (1-\alpha)$ gives \begin{align*} f(\alpha \boldsymbol{x}_1 + (1-\alpha) \boldsymbol{x}_2) \geq \alpha f(\boldsymbol{x}_1) + (1-\alpha) f(\boldsymbol{x}_2) - \alpha (1-\alpha) (\boldsymbol{y}_2 - \boldsymbol{y}_1)^\top (\boldsymbol{x}_2 - \boldsymbol{x}_1) + \frac{\lambda}{2} \alpha (1-\alpha) \|\boldsymbol{x}_1 - \boldsymbol{x}_2\|^2 \end{align*} By Proposition 2 we also have \begin{align*} \alpha f(\boldsymbol{x}_1) + (1-\alpha) f(\boldsymbol{x}_2) \geq f(\alpha \boldsymbol{x}_1 + (1-\alpha) \boldsymbol{x}_2) + \frac{\lambda}{2} \alpha (1-\alpha) \|\boldsymbol{x}_1 - \boldsymbol{x}_2\|^2 \end{align*} Combining the two inequalities and dividing by $\alpha (1-\alpha)$ for some $\alpha \in (0, 1)$ yields \begin{align*} (\boldsymbol{y}_2 - \boldsymbol{y}_1)^\top (\boldsymbol{x}_2 - \boldsymbol{x}_1) \geq \lambda \|\boldsymbol{x}_2 - \boldsymbol{x}_1\|^2 \end{align*} By the Cauchy-Schwarz inequality, $(\boldsymbol{y}_2 - \boldsymbol{y}_1)^\top (\boldsymbol{x}_2 - \boldsymbol{x}_1) \leq \|\boldsymbol{y}_2 - \boldsymbol{y}_1\| \|\boldsymbol{x}_2 - \boldsymbol{x}_1\|$, so \begin{align*} \|\boldsymbol{x}_2 - \boldsymbol{x}_1\| \leq \frac{1}{\lambda} \|\boldsymbol{y}_2 - \boldsymbol{y}_1\| \end{align*} By the corollary of the conjugate subgradient theorem, $\boldsymbol{y}_1 \in \partial f(\boldsymbol{x}_1) \Rightarrow \boldsymbol{x}_1 = \nabla f^*(\boldsymbol{y}_1)$ and $\boldsymbol{y}_2 \in \partial f(\boldsymbol{x}_2) \Rightarrow \boldsymbol{x}_2 = \nabla f^*(\boldsymbol{y}_2)$, so \begin{align*} \|\nabla f^*(\boldsymbol{y}_2) - \nabla f^*(\boldsymbol{y}_1)\| \leq \frac{1}{\lambda} \|\boldsymbol{y}_2 - \boldsymbol{y}_1\| \end{align*} This proves that $f^*$ is $\frac{1}{\lambda}$-smooth.
On the other hand, suppose $f^*$ is $\frac{1}{\lambda}$-smooth, and for a fixed $\boldsymbol{x}$ set $g(\boldsymbol{y}) = f^*(\boldsymbol{x} + \boldsymbol{y}) - f^*(\boldsymbol{x}) - \nabla f^*(\boldsymbol{x})^\top \boldsymbol{y}$. By Proposition 7, $g(\boldsymbol{y}) \leq \frac{1}{2\lambda} \|\boldsymbol{y}\|^2 = h(\boldsymbol{y})$, so \begin{align*} \frac{\lambda}{2} \|\boldsymbol{a}\|^2 = h^*(\boldsymbol{a}) = \sup_{\boldsymbol{y}} \{ \boldsymbol{y}^\top \boldsymbol{a} - h(\boldsymbol{y}) \} \leq \sup_{\boldsymbol{y}} \{ \boldsymbol{y}^\top \boldsymbol{a} - g(\boldsymbol{y}) \} = g^*(\boldsymbol{a}) \end{align*} Expanding $g^*$ gives \begin{align*} g^*(\boldsymbol{a}) & = \sup_{\boldsymbol{y}} \{ \boldsymbol{y}^\top \boldsymbol{a} - g(\boldsymbol{y}) \} \\ & = \sup_{\boldsymbol{y}} \{ \boldsymbol{y}^\top \boldsymbol{a} - f^*(\boldsymbol{x} + \boldsymbol{y}) + f^*(\boldsymbol{x}) + \nabla f^*(\boldsymbol{x})^\top \boldsymbol{y} \} \\ & = \sup_{\boldsymbol{y}} \{ \boldsymbol{y}^\top (\boldsymbol{a} + \nabla f^*(\boldsymbol{x})) - f^*(\boldsymbol{x} + \boldsymbol{y}) \} + f^*(\boldsymbol{x}) \\ & = \sup_{\boldsymbol{y}} \{ (\boldsymbol{x} + \boldsymbol{y})^\top (\boldsymbol{a} + \nabla f^*(\boldsymbol{x})) - f^*(\boldsymbol{x} + \boldsymbol{y}) \} + f^*(\boldsymbol{x}) - \boldsymbol{x}^\top (\boldsymbol{a} + \nabla f^*(\boldsymbol{x})) \\ & = f^{**}(\boldsymbol{a} + \nabla f^*(\boldsymbol{x})) + f^*(\boldsymbol{x}) - \boldsymbol{x}^\top (\boldsymbol{a} + \nabla f^*(\boldsymbol{x})) \\ & = f(\boldsymbol{a} + \nabla f^*(\boldsymbol{x})) + f^*(\boldsymbol{x}) - \boldsymbol{x}^\top (\boldsymbol{a} + \nabla f^*(\boldsymbol{x})) \end{align*} Let $\boldsymbol{u} = \nabla f^*(\boldsymbol{x})$. By the conjugate subgradient theorem, $\boldsymbol{x}^\top \boldsymbol{u} = f^*(\boldsymbol{x}) + f(\boldsymbol{u})$, so \begin{align*} g^*(\boldsymbol{a}) = f(\boldsymbol{a} + \boldsymbol{u}) + f^*(\boldsymbol{x}) - \boldsymbol{x}^\top \boldsymbol{a} - \boldsymbol{x}^\top \boldsymbol{u} = f(\boldsymbol{a} + \boldsymbol{u}) - f(\boldsymbol{u}) - \boldsymbol{x}^\top \boldsymbol{a} \end{align*} Together with $g^*(\boldsymbol{a}) \geq \frac{\lambda}{2} \|\boldsymbol{a}\|^2$, this shows that for any $\boldsymbol{a}$ and $\boldsymbol{x}$, \begin{align*} f(\boldsymbol{a} + \boldsymbol{u}) - f(\boldsymbol{u}) - \boldsymbol{x}^\top \boldsymbol{a} \geq \frac{\lambda}{2} \|\boldsymbol{a}\|^2 \end{align*} where $\boldsymbol{u} = \nabla f^*(\boldsymbol{x})$. By the conjugate subgradient theorem, $\boldsymbol{u} = \nabla f^*(\boldsymbol{x}) \Leftrightarrow \boldsymbol{x} \in \partial f(\boldsymbol{u})$, so this is exactly the first-order inequality of Proposition 4 at the point $\boldsymbol{u}$ with subgradient $\boldsymbol{x}$. Hence $f$ is $\lambda$-strongly convex.
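A one-dimensional example shows the constants in Proposition 8 matching exactly. The function $f(x) = \frac{\lambda}{2} x^2$ is chosen here only as an illustration; it is $\lambda$-strongly convex by Definition 1. The supremum defining its conjugate is attained at $x = y / \lambda$, which gives:

```latex
\begin{align*}
f^*(y) &= \sup_{x} \left\{ x y - \frac{\lambda}{2} x^2 \right\}
  = \frac{y^2}{\lambda} - \frac{y^2}{2 \lambda}
  = \frac{y^2}{2 \lambda}, \\
\left| (f^*)'(y_2) - (f^*)'(y_1) \right|
  &= \left| \frac{y_2}{\lambda} - \frac{y_1}{\lambda} \right|
  = \frac{1}{\lambda} |y_2 - y_1| .
\end{align*}
```

The derivative of $f^*$ is Lipschitz with constant exactly $\frac{1}{\lambda}$, so in this case the smoothness constant in Proposition 8 cannot be improved.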