In
dual decomposition a problem is broken into smaller subproblems and a solution to the relaxed problem is found. This method can be employed for
MRF optimization.
[1] Dual decomposition is applied to markov logic programs as an inference technique.
[2]
Background
Discrete MRF Optimization (inference) is very important in
Machine Learning and
Computer vision, which is realized on CUDA graphical processing units.
[3] Consider a graph
with nodes
and Edges
. The goal is to assign a label
to each
so that the MRF Energy is minimized:
(1)
Major MRF
Optimization methods are based on
Graph cuts or
Message passing. They rely on the following integer linear programming formulation
(2)
In many applications, the
MRF-variables are {0,1}-variables that satisfy: ![{\displaystyle x_{p}(l)=1}](https://wikimedia.org/api/rest_v1/media/math/render/svg/0c2a0a7c3a25f4edc164b0725f26e0990bcdbb79)
label
is assigned to
, while
, labels
are assigned to
.
Dual Decomposition
The main idea behind
decomposition is surprisingly simple:
- decompose your original complex problem into smaller solvable subproblems,
- extract a solution by cleverly combining the solutions from these subproblems.
A sample problem to decompose:
where
In this problem, separately minimizing every single
over
is easy; but minimizing their sum is a complex problem. So the problem needs to get decomposed using auxiliary variables
and the problem will be as follows:
where
Now we can
relax the constraints by multipliers
which gives us the following Lagrangian dual function:
Now we eliminate
from the dual function by minimizing over
and dual function becomes:
We can set up a Lagrangian dual problem:
(3)
The Master problem
(4)
where
The Slave problems
MRF optimization via Dual Decomposition
The original MRF optimization problem is
NP-hard and we need to transform it into something easier.
is a set of sub-trees of graph
where its trees cover all nodes and edges of the main graph. And MRFs defined for every tree
in
will be smaller. The vector of MRF parameters is
and the vector of MRF variables is
, these two are just smaller in comparison with original MRF vectors
. For all vectors
we'll have the following:
(5)
Where
and
denote all trees of
than contain node
and edge
respectively. We simply can write:
(6)
And our constraints will be:
(7)
Our original MRF problem will become:
(8)
where
and
And we'll have the dual problem we were seeking:
(9)
The Master problem
where each function
is defined as:
(10)
where
The Slave problems
Theoretical Properties
Theorem 1. Lagrangian relaxation (9) is equivalent to the LP relaxation of (2).
Theorem 2. If the sequence of multipliers
satisfies
then the algorithm converges to the optimal solution of (9).
Theorem 3. The distance of the current solution
to the optimal solution
, which decreases at every iteration.
Theorem 4. Any solution obtained by the method satisfies the WTA (weak tree agreement) condition.
Theorem 5. For binary MRFs with sub-modular energies, the method computes a globally optimal solution.
References