[Image: A 3-dimensional convex polytope.]

Convex analysis is the branch of mathematics devoted to the study of properties of convex functions and convex sets, often with applications in convex minimization, a subdomain of optimization theory. It includes not only the study of convex subsets of Euclidean spaces but also the study of convex functions on abstract spaces.
Convex sets
A subset $C \subseteq X$ of some vector space $X$ is convex if it satisfies any of the following equivalent conditions:

1. If $0 \leq r \leq 1$ is real and $x, y \in C$ then $rx + (1 - r)y \in C$.[1]
2. If $0 < r < 1$ is real and $x, y \in C$ with $x \neq y$, then $rx + (1 - r)y \in C$.
[Figure: Convex function on an interval.]
Throughout, $f : X \to [-\infty, \infty]$ will be a map valued in the extended real numbers $[-\infty, \infty] = \mathbb{R} \cup \{\pm\infty\}$ with a domain $\operatorname{domain} f = X$ that is a convex subset of some vector space.
The map $f : X \to [-\infty, \infty]$ is a convex function if

$$f(rx + (1 - r)y) \leq r f(x) + (1 - r) f(y) \qquad \text{(Convexity } \leq \text{)}$$

holds for any real $0 < r < 1$ and any $x, y \in X$ with $x \neq y$. If this remains true of $f$ when the defining inequality (Convexity $\leq$) is replaced by the strict inequality

$$f(rx + (1 - r)y) < r f(x) + (1 - r) f(y) \qquad \text{(Convexity } < \text{)}$$

then $f$ is called strictly convex.[1]
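As a concrete sanity check (an illustrative sketch, not from the source), the defining inequality can be verified numerically for a familiar convex function such as $f(x) = x^2$; the function and sample points below are arbitrary choices.

```python
# Numerically check the convexity inequality f(rx + (1-r)y) <= r f(x) + (1-r) f(y)
# for the convex function f(x) = x**2 (an illustrative choice).

def f(x):
    return x * x

def satisfies_convexity(f, x, y, r):
    """Return True if the convexity inequality holds at x, y for weight r."""
    return f(r * x + (1 - r) * y) <= r * f(x) + (1 - r) * f(y)

# Test over a grid of points and weights.
points = [-2.0, -0.5, 0.0, 1.0, 3.0]
weights = [0.1, 0.25, 0.5, 0.9]
assert all(
    satisfies_convexity(f, x, y, r)
    for x in points for y in points for r in weights
)
```

A strictly convex function such as $x^2$ in fact satisfies the strict inequality whenever $x \neq y$; a concave function like $-x^2$ fails the check.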
[Figure: A function (in black) is convex if and only if its epigraph, which is the region above its graph (in green), is a convex set.]
[Figure: A graph of the bivariate convex function $x^2 + xy + y^2$.]

Convex functions are related to convex sets. Specifically, the function $f$ is convex if and only if its epigraph

$$\operatorname{epi} f := \left\{ (x, r) \in X \times \mathbb{R} : f(x) \leq r \right\} \qquad \text{(Epigraph def.)}$$

is a convex set. The epigraphs of extended real-valued functions play a role in convex analysis that is analogous to the role played by graphs of real-valued functions in real analysis. Specifically, the epigraph of an extended real-valued function provides geometric intuition that can be used to help formulate or prove conjectures.
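The epigraph characterization can be illustrated numerically: for a convex $f$, any convex combination of two points of $\operatorname{epi} f$ stays in $\operatorname{epi} f$. A small sketch, again using $f(x) = x^2$ as an illustrative choice:

```python
# Sketch: for convex f, convex combinations of epigraph points remain in the epigraph.
# f(x) = x**2 is an illustrative choice, not mandated by the text.

def f(x):
    return x * x

def in_epigraph(f, x, r):
    """(x, r) belongs to epi f  iff  f(x) <= r."""
    return f(x) <= r

p = (-1.0, 2.0)   # f(-1) = 1 <= 2, so p is in epi f
q = (2.0, 5.0)    # f(2)  = 4 <= 5, so q is in epi f
assert in_epigraph(f, *p) and in_epigraph(f, *q)

for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    mid = (t * p[0] + (1 - t) * q[0], t * p[1] + (1 - t) * q[1])
    assert in_epigraph(f, *mid)  # the segment from p to q lies inside epi f
```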
The domain of a function $f : X \to [-\infty, \infty]$ is denoted by $\operatorname{domain} f$ while its effective domain is the set

$$\operatorname{dom} f := \{ x \in X : f(x) < \infty \}. \qquad \text{(dom } f \text{ def.)}$$
The function $f : X \to [-\infty, \infty]$ is called proper if $\operatorname{dom} f \neq \varnothing$ and $f(x) > -\infty$ for all $x \in \operatorname{domain} f$. Alternatively, this means that there exists some $x$ in the domain of $f$ at which $f(x) \in \mathbb{R}$ and $f$ is also never equal to $-\infty$. In words, a function is proper if its domain is not empty, it never takes on the value $-\infty$, and it also is not identically equal to $+\infty$.
If $f : \mathbb{R}^n \to [-\infty, \infty]$ is a proper convex function then there exist some vector $b \in \mathbb{R}^n$ and some $r \in \mathbb{R}$ such that

$$f(x) \geq x \cdot b - r$$

for every $x$, where $x \cdot b$ denotes the dot product of these vectors.
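For instance (an illustrative check, not from the source), $f(x) = x^2$ on $\mathbb{R}$ admits the affine minorant with $b = 2$ and $r = 1$, since $x^2 \geq 2x - 1$ is just $(x - 1)^2 \geq 0$:

```python
# A proper convex function is bounded below by some affine map x . b - r.
# Illustrative check on R with f(x) = x**2, b = 2, r = 1:
# x**2 >= 2*x - 1 for all x, because x**2 - 2*x + 1 = (x - 1)**2 >= 0.

def f(x):
    return x * x

b, r = 2.0, 1.0
for x in [-10.0, -1.0, 0.0, 0.5, 1.0, 2.0, 10.0]:
    assert f(x) >= x * b - r
```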
Convex conjugate
The convex conjugate of an extended real-valued function $f : X \to [-\infty, \infty]$ (not necessarily convex) is the function $f^* : X^* \to [-\infty, \infty]$ from the (continuous) dual space $X^*$ of $X$, defined by

$$f^*(x^*) = \sup_{z \in X} \left\{ \langle x^*, z \rangle - f(z) \right\}$$

where the brackets $\langle \cdot, \cdot \rangle$ denote the canonical duality $\langle x^*, z \rangle := x^*(z)$.
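As an illustrative one-dimensional computation (not part of the source text), the conjugate of $f(z) = \tfrac{1}{2}z^2$ is $f^*(x^*) = \tfrac{1}{2}(x^*)^2$, since the supremum of $x^* z - \tfrac{1}{2}z^2$ is attained at $z = x^*$. A brute-force maximization over a grid approximates this:

```python
# Approximate the convex conjugate f*(y) = sup_z { y*z - f(z) } on a finite grid.
# For f(z) = z**2 / 2 the exact conjugate is f*(y) = y**2 / 2.

def f(z):
    return 0.5 * z * z

def conjugate(f, y, grid):
    """Brute-force approximation of f*(y) over a finite grid of z values."""
    return max(y * z - f(z) for z in grid)

grid = [i / 100.0 for i in range(-500, 501)]   # z in [-5, 5], step 0.01
for y in [-2.0, -1.0, 0.0, 0.5, 3.0]:
    exact = 0.5 * y * y
    assert abs(conjugate(f, y, grid) - exact) < 1e-3
```

The grid must be wide enough to contain the maximizing $z = y$; otherwise the brute-force value underestimates the supremum.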
The biconjugate of $f$ is the map $f^{**} = (f^*)^* : X \to [-\infty, \infty]$ defined by

$$f^{**}(x) := \sup_{z^* \in X^*} \left\{ \langle x, z^* \rangle - f^*(z^*) \right\}$$

for every $x \in X$.
If $\operatorname{Func}(X; Y)$ denotes the set of $Y$-valued functions on $X$, then the map $\operatorname{Func}(X; [-\infty, \infty]) \to \operatorname{Func}(X^*; [-\infty, \infty])$ defined by $f \mapsto f^*$ is called the Legendre–Fenchel transform.
Subdifferential set and the Fenchel-Young inequality
If $f : X \to [-\infty, \infty]$ and $x \in X$ then the subdifferential set is

$$\begin{alignedat}{4}
\partial f(x) :&= \left\{ x^* \in X^* : f(z) \geq f(x) + \langle x^*, z - x \rangle \text{ for all } z \in X \right\} && \quad (\text{``} z \in X \text{''} \text{ can be replaced with ``} z \in X \text{ such that } z \neq x \text{''}) \\
&= \left\{ x^* \in X^* : \langle x^*, x \rangle - f(x) \geq \langle x^*, z \rangle - f(z) \text{ for all } z \in X \right\} && \\
&= \left\{ x^* \in X^* : \langle x^*, x \rangle - f(x) \geq \sup_{z \in X} \langle x^*, z \rangle - f(z) \right\} && \quad \text{the right hand side is } f^*(x^*) \\
&= \left\{ x^* \in X^* : \langle x^*, x \rangle - f(x) = f^*(x^*) \right\} && \quad \text{taking } z := x \text{ in the sup gives the inequality } \leq.
\end{alignedat}$$
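For a differentiable convex function on $\mathbb{R}$ the subdifferential is the singleton containing the derivative; e.g. for $f(x) = x^2$, $\partial f(x) = \{2x\}$. A small numeric check of the defining inequality (illustrative, not from the source):

```python
# For f(x) = x**2 the subdifferential at x is {2x}: the defining inequality
# f(z) >= f(x) + <x*, z - x> holds for slope x* = 2x and fails for other slopes.

def f(x):
    return x * x

def is_subgradient(f, x, slope, zs):
    """Check f(z) >= f(x) + slope*(z - x) on a finite set of test points z."""
    return all(f(z) >= f(x) + slope * (z - x) for z in zs)

zs = [i / 10.0 for i in range(-50, 51)]   # test points in [-5, 5]
x = 1.5
assert is_subgradient(f, x, 2 * x, zs)            # slope 2x = 3 is a subgradient
assert not is_subgradient(f, x, 2 * x + 0.5, zs)  # slope 3.5 is not
```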
For example, in the important special case where $f = \|\cdot\|$ is a norm on $X$, it can be shown[proof 1] that if $0 \neq x \in X$ then this definition reduces to

$$\partial f(x) = \left\{ x^* \in X^* : \langle x^*, x \rangle = \|x\| \text{ and } \|x^*\| = 1 \right\}$$

and

$$\partial f(0) = \left\{ x^* \in X^* : \|x^*\| \leq 1 \right\}.$$
For any $x \in X$ and $x^* \in X^*$,

$$f(x) + f^*(x^*) \geq \langle x^*, x \rangle,$$

which is called the Fenchel–Young inequality. This inequality is an equality (i.e. $f(x) + f^*(x^*) = \langle x^*, x \rangle$) if and only if $x^* \in \partial f(x)$. It is in this way that the subdifferential set $\partial f(x)$ is directly related to the convex conjugate $f^*(x^*)$.
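A classical instance (illustrative, not from the source) is $f(x) = e^x$ on $\mathbb{R}$, whose conjugate is $f^*(y) = y \ln y - y$ for $y > 0$; Fenchel–Young then reads $e^x + y \ln y - y \geq xy$, with equality exactly when $y = e^x$, i.e. when $y \in \partial f(x)$:

```python
# Fenchel-Young for f(x) = exp(x): f*(y) = y*log(y) - y (for y > 0), so
# exp(x) + y*log(y) - y >= x*y, with equality iff y = exp(x).
import math

def f(x):
    return math.exp(x)

def f_star(y):
    return y * math.log(y) - y

for x in [-1.0, 0.0, 0.5, 2.0]:
    for y in [0.1, 1.0, 2.0, 7.0]:
        assert f(x) + f_star(y) >= x * y - 1e-12       # Fenchel-Young inequality
    y_eq = math.exp(x)                                  # the subgradient of f at x
    assert abs(f(x) + f_star(y_eq) - x * y_eq) < 1e-9   # equality case
```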
Biconjugate
The biconjugate of a function $f : X \to [-\infty, \infty]$ is the conjugate of the conjugate, typically written as $f^{**} : X \to [-\infty, \infty]$. The biconjugate is useful for showing when strong or weak duality holds (via the perturbation function).

For any $x \in X$, the inequality $f^{**}(x) \leq f(x)$ follows from the Fenchel–Young inequality. For proper functions, $f = f^{**}$ if and only if $f$ is convex and lower semi-continuous, by the Fenchel–Moreau theorem.[4]
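Numerically, $f^{**}$ is the largest lower semicontinuous convex minorant of $f$ (its convex envelope). A brute-force sketch on a grid, using the nonconvex double-well $f(x) = \min((x-1)^2, (x+1)^2)$ as an illustrative choice, shows $f^{**} \leq f$ everywhere, with strict inequality where $f$ fails to be convex:

```python
# Brute-force biconjugate on a grid: f** is the convex envelope of f,
# so f**(x) <= f(x) everywhere, strictly where f is not convex.
# Illustrative nonconvex example: f(x) = min((x-1)**2, (x+1)**2).

def f(x):
    return min((x - 1) ** 2, (x + 1) ** 2)

grid = [i / 50.0 for i in range(-200, 201)]   # x (and dual variable y) in [-4, 4]

def conjugate(g, y):
    return max(y * z - g(z) for z in grid)

def biconjugate(x):
    return max(x * y - conjugate(f, y) for y in grid)

assert biconjugate(3.0) <= f(3.0) + 1e-6   # f** <= f holds everywhere
assert biconjugate(0.0) < f(0.0) - 0.5     # strictly smaller at 0: envelope is 0, f(0) = 1
```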
Convex minimization
A convex minimization (primal) problem is one of the form

find $\inf_{x \in M} f(x)$

when given a convex function $f : X \to [-\infty, \infty]$ and a convex subset $M \subseteq X$.
Dual problem
In optimization theory, the duality principle states that optimization problems may be viewed from either of two perspectives, the primal problem or the dual problem.

In general, given two dual pairs of separated locally convex spaces $(X, X^*)$ and $(Y, Y^*)$ and a function $f : X \to [-\infty, \infty]$, we can define the primal problem as finding $x$ such that $\inf_{x \in X} f(x)$ is attained. If there are constraint conditions, these can be built into the function $f$ by letting $f = f + I_{\mathrm{constraints}}$ where $I$ is the indicator function. Then let $F : X \times Y \to [-\infty, \infty]$ be a perturbation function such that $F(x, 0) = f(x)$.[5]
The dual problem with respect to the chosen perturbation function is given by

$$\sup_{y^* \in Y^*} -F^*(0, y^*)$$

where $F^*$ is the convex conjugate in both variables of $F$.

The duality gap is the difference of the right and left hand sides of the inequality[5][7]

$$\sup_{y^* \in Y^*} -F^*(0, y^*) \leq \inf_{x \in X} F(x, 0).$$

This principle is the same as weak duality. If the two sides are equal to each other, then the problem is said to satisfy strong duality. There are many conditions under which strong duality holds, such as Slater's condition.
Lagrange duality
For a convex minimization problem with inequality constraints,

$$\min_x f(x) \quad \text{subject to } g_i(x) \leq 0 \text{ for } i = 1, \ldots, m,$$

the Lagrangian dual problem is

$$\sup_u \inf_x L(x, u) \quad \text{subject to } u_i \geq 0 \text{ for } i = 1, \ldots, m,$$

where $L(x, u)$ is the Lagrangian, defined as follows:

$$L(x, u) = f(x) + \sum_{j=1}^m u_j g_j(x)$$
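A worked one-dimensional instance (illustrative, not from the source): minimize $f(x) = x^2$ subject to $g(x) = 1 - x \leq 0$. The Lagrangian is $L(x, u) = x^2 + u(1 - x)$; for $u \geq 0$, $\inf_x L(x, u) = u - u^2/4$ is attained at $x = u/2$, and the dual is maximized at $u = 2$ with value $1$, matching the primal optimum $f(1) = 1$:

```python
# Lagrange duality for: minimize x**2 subject to 1 - x <= 0.
# L(x, u) = x**2 + u*(1 - x); inf_x L(x, u) = u - u**2/4, attained at x = u/2.
# The dual sup_{u>=0} (u - u**2/4) is attained at u = 2 with value 1,
# equal to the primal optimum f(1) = 1, so strong duality holds here.

def dual_function(u):
    x_min = u / 2.0                     # argmin of the Lagrangian in x
    return x_min ** 2 + u * (1.0 - x_min)

us = [i / 100.0 for i in range(0, 501)]           # multipliers u in [0, 5]
dual_value = max(dual_function(u) for u in us)
primal_value = 1.0                                # f(x) at the primal optimum x = 1

assert abs(dual_value - primal_value) < 1e-9      # zero duality gap
assert all(dual_function(u) <= primal_value + 1e-12 for u in us)  # weak duality
```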
Notes
^ a b Rockafellar, R. Tyrrell (1997) [1970]. Convex Analysis. Princeton, NJ: Princeton University Press. ISBN 978-0-691-01586-6.
^ Borwein, Jonathan; Lewis, Adrian (2006). Convex Analysis and Nonlinear Optimization: Theory and Examples (2nd ed.). Springer. pp. 76–77. ISBN 978-0-387-29570-1.
^ a b Boţ, Radu Ioan; Wanka, Gert; Grad, Sorin-Mihai (2009). Duality in Vector Optimization. Springer. ISBN 978-3-642-02885-4.
^ Csetnek, Ernö Robert (2010). Overcoming the failure of the classical generalized interior-point regularity conditions in convex optimization. Applications of the duality theory to enlargements of maximal monotone operators. Logos Verlag Berlin GmbH. ISBN 978-3-8325-2503-3.
^ Borwein, Jonathan; Lewis, Adrian (2006). Convex Analysis and Nonlinear Optimization: Theory and Examples (2nd ed.). Springer. ISBN 978-0-387-29570-1.
^ Boyd, Stephen; Vandenberghe, Lieven (2004). Convex Optimization (PDF). Cambridge University Press. ISBN 978-0-521-83378-3. Retrieved October 3, 2011.
^ The conclusion is immediate if $X = \{0\}$ so assume otherwise. Fix $x \in X$. Replacing $f$ with the norm gives

$$\partial f(x) = \left\{ x^* \in X^* : \langle x^*, x \rangle - \|x\| \geq \langle x^*, z \rangle - \|z\| \text{ for all } z \in X \right\}.$$

If $x^* \in \partial f(x)$ and $r \geq 0$ is real then using $z := rx$ gives

$$\langle x^*, x \rangle - \|x\| \geq \langle x^*, rx \rangle - \|rx\| = r \left[ \langle x^*, x \rangle - \|x\| \right],$$

where in particular, taking $r := 2$ gives $x^*(x) \leq \|x\|$ while taking $r := \frac{1}{2}$ gives $x^*(x) \geq \|x\|$, and thus $x^*(x) = \|x\|$; moreover, if in addition $x \neq 0$ then because $x^*\left(\frac{x}{\|x\|}\right) = 1$, it follows from the definition of the dual norm that $\|x^*\| \geq 1$. Because $\partial f(x) \subseteq \left\{ x^* \in X^* : x^*(x) = \|x\| \right\}$, which is equivalent to $\partial f(x) = \partial f(x) \cap \left\{ x^* \in X^* : x^*(x) = \|x\| \right\}$, it follows that

$$\partial f(x) = \left\{ x^* \in X^* : x^*(x) = \|x\| \text{ and } \|z\| \geq \langle x^*, z \rangle \text{ for all } z \in X \right\},$$

which implies $\|x^*\| \leq 1$ for all $x^* \in \partial f(x)$. From these facts, the conclusion can now be reached. ∎