From Wikipedia, the free encyclopedia
The Davidon–Fletcher–Powell formula (or DFP ; named after
William C. Davidon ,
Roger Fletcher , and
Michael J. D. Powell ) finds the solution to the secant equation that is closest to the current estimate and satisfies the curvature condition. It was the first
quasi-Newton method to generalize the
secant method to a multidimensional problem. This update maintains the symmetry and positive definiteness of the
Hessian matrix .
Given a function
f
(
x
)
{\displaystyle f(x)}
, its
gradient (
∇
f
{\displaystyle \nabla f}
), and
positive-definite
Hessian matrix
B
{\displaystyle B}
, the
Taylor series is
f
(
x
k
+
s
k
)
=
f
(
x
k
)
+
∇
f
(
x
k
)
T
s
k
+
1
2
s
k
T
B
s
k
+
…
,
{\displaystyle f(x_{k}+s_{k})=f(x_{k})+\nabla f(x_{k})^{T}s_{k}+{\frac {1}{2}}s_{k}^{T}{B}s_{k}+\dots ,}
and the
Taylor series of the gradient itself (secant equation)
∇
f
(
x
k
+
s
k
)
=
∇
f
(
x
k
)
+
B
s
k
+
…
{\displaystyle \nabla f(x_{k}+s_{k})=\nabla f(x_{k})+Bs_{k}+\dots }
is used to update
B
{\displaystyle B}
.
The DFP formula finds a solution that is symmetric, positive-definite and closest to the current approximate value of
B
k
{\displaystyle B_{k}}
:
B
k
+
1
=
(
I
−
γ
k
y
k
s
k
T
)
B
k
(
I
−
γ
k
s
k
y
k
T
)
+
γ
k
y
k
y
k
T
,
{\displaystyle B_{k+1}=(I-\gamma _{k}y_{k}s_{k}^{T})B_{k}(I-\gamma _{k}s_{k}y_{k}^{T})+\gamma _{k}y_{k}y_{k}^{T},}
where
y
k
=
∇
f
(
x
k
+
s
k
)
−
∇
f
(
x
k
)
,
{\displaystyle y_{k}=\nabla f(x_{k}+s_{k})-\nabla f(x_{k}),}
γ
k
=
1
y
k
T
s
k
,
{\displaystyle \gamma _{k}={\frac {1}{y_{k}^{T}s_{k}}},}
and
B
k
{\displaystyle B_{k}}
is a symmetric and
positive-definite matrix .
The corresponding update to the inverse Hessian approximation
H
k
=
B
k
−
1
{\displaystyle H_{k}=B_{k}^{-1}}
is given by
H
k
+
1
=
H
k
−
H
k
y
k
y
k
T
H
k
y
k
T
H
k
y
k
+
s
k
s
k
T
y
k
T
s
k
.
{\displaystyle H_{k+1}=H_{k}-{\frac {H_{k}y_{k}y_{k}^{T}H_{k}}{y_{k}^{T}H_{k}y_{k}}}+{\frac {s_{k}s_{k}^{T}}{y_{k}^{T}s_{k}}}.}
B
{\displaystyle B}
is assumed to be positive-definite, and the vectors
s
k
T
{\displaystyle s_{k}^{T}}
and
y
{\displaystyle y}
must satisfy the curvature condition
s
k
T
y
k
=
s
k
T
B
s
k
>
0.
{\displaystyle s_{k}^{T}y_{k}=s_{k}^{T}Bs_{k}>0.}
The DFP formula is quite effective, but it was soon superseded by the
Broyden–Fletcher–Goldfarb–Shanno formula , which is its
dual (interchanging the roles of y and s ).
[1]
See also
References
^ Avriel, Mordecai (1976). Nonlinear Programming: Analysis and Methods . Prentice-Hall. pp. 352–353.
ISBN
0-13-623603-0 .
Further reading
Optimization computes maxima and minima.