lec23 semidef opt
Post on 02-Mar-2018
7/26/2019 Lec23 Semidef Opt
1/54
Introduction to Semidefinite Programming (SDP)
Robert M. Freund
March, 2004
© 2004 Massachusetts Institute of Technology.
1 Introduction
Semidefinite programming (SDP) is the most exciting development in mathematical programming in the 1990s. SDP has applications in such diverse fields as traditional convex constrained optimization, control theory, and combinatorial optimization. Because SDP is solvable via interior point methods, most of these applications can usually be solved very efficiently in practice as well as in theory.
2 Review of Linear Programming
Consider the linear programming problem in standard form:

LP :  minimize  c^T x
      s.t.  a_i^T x = b_i ,  i = 1, . . . , m ,
            x ∈ ℝ^n_+ .

Here x is a vector of n variables, and we write c^T x for the inner-product ∑_{j=1}^n c_j x_j, etc.

Also, ℝ^n_+ := {x ∈ ℝ^n | x ≥ 0}, and we call ℝ^n_+ the nonnegative orthant. In fact, ℝ^n_+ is a closed convex cone, where K is called a closed convex cone if K satisfies the following two conditions:

• If x, w ∈ K, then αx + βw ∈ K for all nonnegative scalars α and β.

• K is a closed set.
In words, LP is the following problem:

Minimize the linear function c^T x, subject to the condition that x must solve the m given equations a_i^T x = b_i , i = 1, . . . , m, and that x must lie in the closed convex cone K = ℝ^n_+.

We will write the standard linear programming dual problem as:

LD :  maximize  ∑_{i=1}^m y_i b_i
      s.t.  ∑_{i=1}^m y_i a_i + s = c ,
            s ∈ ℝ^n_+ .

Given a feasible solution x of LP and a feasible solution (y, s) of LD, the duality gap is simply c^T x − ∑_{i=1}^m y_i b_i = (c − ∑_{i=1}^m y_i a_i)^T x = s^T x ≥ 0, because x ≥ 0 and s ≥ 0. We know from LP duality theory that so long as the primal problem LP is feasible and has bounded optimal objective value, then the primal and the dual both attain their optima with no duality gap. That is, there exist x* and (y*, s*) feasible for the primal and dual, respectively, such that c^T x* − ∑_{i=1}^m y_i* b_i = (s*)^T x* = 0.
3 Facts about Matrices and the Semidefinite Cone
If X is an n × n matrix, then X is a positive semidefinite (psd) matrix if

v^T X v ≥ 0  for any v ∈ ℝ^n .
If X is an n × n matrix, then X is a positive definite (pd) matrix if

v^T X v > 0  for any v ∈ ℝ^n , v ≠ 0 .

Let S^n denote the set of symmetric n × n matrices, and let S^n_+ denote the set of positive semidefinite (psd) n × n symmetric matrices. Similarly, let S^n_{++} denote the set of positive definite (pd) n × n symmetric matrices.

Let X and Y be any symmetric matrices. We write X ⪰ 0 to denote that X is symmetric and positive semidefinite, and we write X ⪰ Y to denote that X − Y ⪰ 0. We write X ≻ 0 to denote that X is symmetric and positive definite, etc.

Remark 1  S^n_+ = {X ∈ S^n | X ⪰ 0} is a closed convex cone in ℝ^{n²} of dimension n × (n + 1)/2.

To see why this remark is true, suppose that X, W ∈ S^n_+. Pick any scalars α, β ≥ 0. For any v ∈ ℝ^n, we have:

v^T (αX + βW) v = α v^T X v + β v^T W v ≥ 0 ,

whereby αX + βW ∈ S^n_+. This shows that S^n_+ is a cone. It is also straightforward to show that S^n_+ is a closed set.

Recall the following properties of symmetric matrices:

• If X ∈ S^n, then X = Q D Q^T for some orthonormal matrix Q and some diagonal matrix D. (Recall that Q orthonormal means that Q^{-1} = Q^T, and that D diagonal means that the off-diagonal entries of D are all zeros.)

• If X = Q D Q^T as above, then the columns of Q form a set of n orthogonal eigenvectors of X, whose eigenvalues are the corresponding diagonal entries of D.

• X ⪰ 0 if and only if X = Q D Q^T where the eigenvalues (i.e., the diagonal entries of D) are all nonnegative.

• X ≻ 0 if and only if X = Q D Q^T where the eigenvalues (i.e., the diagonal entries of D) are all positive.
• If X ⪰ 0 and if X_ii = 0, then X_ij = X_ji = 0 for all j = 1, . . . , n.

• Consider the matrix M defined as follows:

M = [ P    v ]
    [ v^T  d ] ,

where P ≻ 0, v is a vector, and d is a scalar. Then M ⪰ 0 if and only if d − v^T P^{-1} v ≥ 0.
4 Semidefinite Programming
Let X ∈ S^n. We can think of X as a matrix, or equivalently, as an array of n² components of the form (x_11, . . . , x_nn). We can also just think of X as an object (a vector) in the space S^n. All three different equivalent ways of looking at X will be useful.

What will a linear function of X look like? If C(X) is a linear function of X, then C(X) can be written as C • X, where

C • X := ∑_{i=1}^n ∑_{j=1}^n C_ij X_ij .
If X is a symmetric matrix, there is no loss of generality in assuming that the matrix C is also symmetric. With this notation, we are now ready to define a semidefinite program. A semidefinite program (SDP) is an optimization problem of the form:

SDP :  minimize  C • X
       s.t.  A_i • X = b_i ,  i = 1, . . . , m ,
             X ⪰ 0 .

Notice that in an SDP the variable is the matrix X, but it might be helpful to think of X as an array of n² numbers or simply as a vector in S^n. The objective function is the linear function C • X, and there are m linear equations that X must satisfy, namely A_i • X = b_i , i = 1, . . . , m. The variable X also must lie in the (closed convex) cone of positive semidefinite symmetric matrices S^n_+. Note that the data for SDP consists of the symmetric matrix C (which is the data for the objective function), the m symmetric matrices A_1, . . . , A_m, and the m-vector b, which form the m linear equations.
Let us see an example of an SDP for n = 3 and m = 2. Define the following matrices:

A1 = [ 1 0 1 ]       A2 = [ 0 2 8 ]       C = [ 1 2 3 ]
     [ 0 3 7 ] ,          [ 2 6 0 ] ,         [ 2 9 0 ] ,
     [ 1 7 5 ]            [ 8 0 4 ]           [ 3 0 7 ]

and b1 = 11 and b2 = 19. Then the variable X will be the 3 × 3 symmetric matrix:

X = [ x11 x12 x13 ]
    [ x21 x22 x23 ] ,
    [ x31 x32 x33 ]

and so, for example,

C • X = x11 + 2x12 + 3x13 + 2x21 + 9x22 + 0x23 + 3x31 + 0x32 + 7x33
      = x11 + 4x12 + 6x13 + 9x22 + 0x23 + 7x33 ,

since, in particular, X is symmetric. Therefore the SDP can be written as:

SDP :  minimize  x11 + 4x12 + 6x13 + 9x22 + 0x23 + 7x33
       s.t.  x11 + 0x12 + 2x13 + 3x22 + 14x23 + 5x33 = 11
             0x11 + 4x12 + 16x13 + 6x22 + 0x23 + 4x33 = 19

             X = [ x11 x12 x13 ]
                 [ x21 x22 x23 ]  ⪰ 0 .
                 [ x31 x32 x33 ]
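The expanded linear forms above can be checked numerically, since A • X is just the sum ∑_ij A_ij X_ij. A small pure-Python check (the particular numbers chosen for X are arbitrary):

```python
def dot(A, X):
    """Inner product A . X = sum_ij A_ij X_ij."""
    n = len(A)
    return sum(A[i][j] * X[i][j] for i in range(n) for j in range(n))

A1 = [[1, 0, 1], [0, 3, 7], [1, 7, 5]]
A2 = [[0, 2, 8], [2, 6, 0], [8, 0, 4]]
C  = [[1, 2, 3], [2, 9, 0], [3, 0, 7]]

# an arbitrary symmetric X with entries x11..x33
x11, x12, x13, x22, x23, x33 = 2.0, 0.5, -1.0, 3.0, 0.25, 4.0
X = [[x11, x12, x13], [x12, x22, x23], [x13, x23, x33]]

# C . X matches the collapsed form from the notes
assert abs(dot(C, X) - (x11 + 4*x12 + 6*x13 + 9*x22 + 0*x23 + 7*x33)) < 1e-12
# and the two constraint expansions
assert abs(dot(A1, X) - (x11 + 2*x13 + 3*x22 + 14*x23 + 5*x33)) < 1e-12
assert abs(dot(A2, X) - (4*x12 + 16*x13 + 6*x22 + 4*x33)) < 1e-12
```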
Notice that SDP looks remarkably similar to a linear program. However, the standard LP constraint that x must lie in the nonnegative orthant is replaced by the constraint that the variable X must lie in the cone of positive semidefinite matrices. Just as x ≥ 0 states that each of the n components of x must be nonnegative, it may be helpful to think of X ⪰ 0 as stating that each of the n eigenvalues of X must be nonnegative.
It is easy to see that a linear program LP is a special instance of an SDP. To see one way of doing this, suppose that (c, a_1, . . . , a_m, b_1, . . . , b_m) comprise the data for LP. Then define:

A_i = [ a_i1   0   . . .   0   ]
      [  0   a_i2  . . .   0   ]        C = [ c_1   0   . . .   0  ]
      [  .     .    .      .   ]  ,         [  0   c_2  . . .   0  ]
      [  0     0   . . .  a_in ]            [  .    .    .      .  ]
                                            [  0    0   . . .  c_n ]

for i = 1, . . . , m. Then LP can be written as:

SDP :  minimize  C • X
       s.t.  A_i • X = b_i ,  i = 1, . . . , m ,
             X_ij = 0 ,  i = 1, . . . , n , j = i + 1, . . . , n ,
             X ⪰ 0 ,

with the association that

X = [ x_1   0   . . .   0  ]
    [  0   x_2  . . .   0  ]
    [  .    .    .      .  ]
    [  0    0   . . .  x_n ] .

Of course, in practice one would never want to convert an instance of LP into an instance of SDP. The above construction merely shows that SDP includes linear programming as a special case.
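The diagonal embedding can be spot-checked in a few lines: for diagonal matrices the SDP inner products collapse to the LP's vector inner products, and a diagonal X with nonnegative diagonal is psd. A small illustrative sketch (the data below is made up):

```python
def diag_embed(v):
    """Embed a vector as a diagonal matrix, as in the LP-to-SDP construction."""
    n = len(v)
    return [[v[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def dot(A, X):
    n = len(A)
    return sum(A[i][j] * X[i][j] for i in range(n) for j in range(n))

c  = [3.0, 1.0, 4.0]
a1 = [1.0, 2.0, 0.0]
x  = [0.5, 2.0, 1.5]          # a nonnegative LP point

C, A1, X = diag_embed(c), diag_embed(a1), diag_embed(x)

# C . X equals c^T x, and A1 . X equals a1^T x
assert abs(dot(C, X) - sum(ci * xi for ci, xi in zip(c, x))) < 1e-12
assert abs(dot(A1, X) - sum(ai * xi for ai, xi in zip(a1, x))) < 1e-12
# X diagonal with x >= 0 is psd, since v^T X v = sum_i x_i v_i^2 >= 0
```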
5 Semidefinite Programming Duality
The dual problem of SDP is defined (or derived from first principles) to be:

SDD :  maximize  ∑_{i=1}^m y_i b_i
       s.t.  ∑_{i=1}^m y_i A_i + S = C ,
             S ⪰ 0 .

One convenient way of thinking about this problem is as follows. Given multipliers y_1, . . . , y_m, the objective is to maximize the linear function ∑_{i=1}^m y_i b_i. The constraints of SDD state that the matrix S defined as

S = C − ∑_{i=1}^m y_i A_i

must be positive semidefinite. That is,

C − ∑_{i=1}^m y_i A_i ⪰ 0 .
We illustrate this construction with the example presented earlier. The dual problem is:

SDD :  maximize  11y1 + 19y2

       s.t.  y1 [ 1 0 1 ]      [ 0 2 8 ]        [ 1 2 3 ]
                [ 0 3 7 ] + y2 [ 2 6 0 ] + S =  [ 2 9 0 ]
                [ 1 7 5 ]      [ 8 0 4 ]        [ 3 0 7 ]

             S ⪰ 0 ,

which we can rewrite in the following form:

SDD :  maximize  11y1 + 19y2

       s.t.  [ 1 − y1          2 − 2y2         3 − y1 − 8y2  ]
             [ 2 − 2y2         9 − 3y1 − 6y2   0 − 7y1 − 0y2 ]  ⪰ 0 .
             [ 3 − y1 − 8y2    0 − 7y1 − 0y2   7 − 5y1 − 4y2 ]
It is often easier to see and to work with a semidefinite program when it is presented in the format of the dual SDD, since the variables are the m multipliers y_1, . . . , y_m.

As in linear programming, we can switch from one format of SDP (primal or dual) to any other format with great ease, and there is no loss of generality in assuming a particular specific format for the primal or the dual.

The following proposition states that weak duality must hold for the primal and dual of SDP:
Proposition 5.1  Given a feasible solution X of SDP and a feasible solution (y, S) of SDD, the duality gap is C • X − ∑_{i=1}^m y_i b_i = S • X ≥ 0. If C • X − ∑_{i=1}^m y_i b_i = 0, then X and (y, S) are each optimal solutions to SDP and SDD, respectively, and furthermore, S X = 0.

In order to prove Proposition 5.1, it will be convenient to work with the trace of a matrix, defined below:

trace(M) = ∑_{j=1}^n M_jj .

Simple arithmetic can be used to establish the following two elementary identities:

• trace(M N) = trace(N M)

• A • B = trace(A^T B)
Proof of Proposition 5.1.  For the first part of the proposition, we must show that if S ⪰ 0 and X ⪰ 0, then S • X ≥ 0. Let S = P D P^T and X = Q E Q^T, where P, Q are orthonormal matrices and D, E are nonnegative diagonal matrices. We have:

S • X = trace(S^T X) = trace(S X) = trace(P D P^T Q E Q^T)

      = trace(D P^T Q E Q^T P) = ∑_{j=1}^n D_jj (P^T Q E Q^T P)_jj ≥ 0 ,

where the last inequality follows from the fact that all D_jj ≥ 0 and the fact that the diagonal of the symmetric positive semidefinite matrix P^T Q E Q^T P must be nonnegative.

To prove the second part of the proposition, suppose that trace(S X) = 0. Then from the above equalities, we have

∑_{j=1}^n D_jj (P^T Q E Q^T P)_jj = 0 .

However, this implies that for each j = 1, . . . , n, either D_jj = 0 or (P^T Q E Q^T P)_jj = 0. Furthermore, the latter case implies that the jth row of P^T Q E Q^T P is all zeros. Therefore D P^T Q E Q^T P = 0, and so S X = P D P^T Q E Q^T = 0.  q.e.d.
Unlike the case of linear programming, we cannot assert that either SDP or SDD will attain their respective optima, and/or that there will be no duality gap, unless certain regularity conditions hold. One such regularity condition which ensures that strong duality will prevail is a version of the Slater condition, summarized in the following theorem which we will not prove:

Theorem 5.1  Let z_P and z_D denote the optimal objective function values of SDP and SDD, respectively. Suppose that there exists a feasible solution X̂ of SDP such that X̂ ≻ 0, and that there exists a feasible solution (ŷ, Ŝ) of SDD such that Ŝ ≻ 0. Then both SDP and SDD attain their optimal values, and z_P = z_D.
6 Key Properties of Linear Programming that do not extend to SDP

The following summarizes some of the more important properties of linear programming that do not extend to SDP:

• There may be a finite or infinite duality gap. The primal and/or dual may or may not attain their optima. As noted above in Theorem 5.1, both programs will attain their common optimal value if both programs have feasible solutions in the interior of the semidefinite cone.

• There is no finite algorithm for solving SDP. There is a simplex algorithm, but it is not a finite algorithm. There is no direct analog of a basic feasible solution for SDP.

• Given rational data, the feasible region may have no rational solutions. The optimal solution may not have rational components or rational eigenvalues.

• Given rational data whose binary encoding is size L, the norms of any feasible and/or optimal solutions may exceed 2^{2^L} (or worse).

• Given rational data whose binary encoding is size L, the norms of any feasible and/or optimal solutions may be less than 2^{−2^L} (or worse).
7 SDP in Combinatorial Optimization
SDP has wide applicability in combinatorial optimization. A number of NP-hard combinatorial optimization problems have convex relaxations that are semidefinite programs. In many instances, the SDP relaxation is very tight in practice, and in certain instances in particular, the optimal solution to the SDP relaxation can be converted to a feasible solution for the original problem with provably good objective value. An example of the use of SDP in combinatorial optimization is given below.
7.1 An SDP Relaxation of the MAX CUT Problem
Let G be an undirected graph with nodes N = {1, . . . , n} and edge set E. Let w_ij = w_ji be the weight on edge (i, j), for (i, j) ∈ E. We assume that w_ij ≥ 0 for all (i, j) ∈ E. The MAX CUT problem is to determine a subset S of the nodes N for which the sum of the weights of the edges that cross from S to its complement S̄ is maximized (where S̄ := N \ S).

We can formulate MAX CUT as an integer program as follows. Let x_j = 1 for j ∈ S and x_j = −1 for j ∉ S. Then our formulation is:

MAXCUT :  maximize_x  (1/4) ∑_{i=1}^n ∑_{j=1}^n w_ij (1 − x_i x_j)
          s.t.  x_j ∈ {−1, 1} ,  j = 1, . . . , n .

Now let

Y = x x^T ,

whereby

Y_ij = x_i x_j ,  i = 1, . . . , n , j = 1, . . . , n .

Also let W be the matrix whose (i, j)th element is w_ij for i = 1, . . . , n and j = 1, . . . , n. Then MAX CUT can be equivalently formulated as:
MAXCUT :  maximize_{Y,x}  (1/4) ∑_{i=1}^n ∑_{j=1}^n w_ij − (1/4) W • Y
          s.t.  x_j ∈ {−1, 1} ,  j = 1, . . . , n ,
                Y = x x^T .

Notice in this problem that the first set of constraints are equivalent to Y_jj = 1 , j = 1, . . . , n. We therefore obtain:

MAXCUT :  maximize_{Y,x}  (1/4) ∑_{i=1}^n ∑_{j=1}^n w_ij − (1/4) W • Y
          s.t.  Y_jj = 1 ,  j = 1, . . . , n ,
                Y = x x^T .

Last of all, notice that the matrix Y = x x^T is a symmetric rank-1 positive semidefinite matrix. If we relax this condition by removing the rank-1 restriction, we obtain the following relaxation of MAX CUT, which is a semidefinite program:

RELAX :  maximize_Y  (1/4) ∑_{i=1}^n ∑_{j=1}^n w_ij − (1/4) W • Y
         s.t.  Y_jj = 1 ,  j = 1, . . . , n ,
               Y ⪰ 0 .
It is therefore easy to see that RELAX provides an upper bound on MAXCUT, i.e.,

MAXCUT ≤ RELAX .

As it turns out, one can also prove without too much effort that:

0.87856 RELAX ≤ MAXCUT ≤ RELAX .

This is an impressive result, in that it states that the value of the semidefinite relaxation is guaranteed to be no more than 12% higher than the value of the NP-hard problem MAX CUT.
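For intuition about the quantity being relaxed, here is a brute-force evaluation of the ±1 formulation of MAXCUT on a tiny example (a unit-weight 5-cycle); no SDP solver is involved, the code simply enumerates all sign vectors:

```python
from itertools import product

def cut_value(W, x):
    """(1/4) * sum_ij W_ij (1 - x_i x_j) for x in {-1,+1}^n."""
    n = len(W)
    return 0.25 * sum(W[i][j] * (1 - x[i] * x[j])
                      for i in range(n) for j in range(n))

# unit-weight 5-cycle: edges (0,1),(1,2),(2,3),(3,4),(4,0), symmetric W
n = 5
W = [[0.0] * n for _ in range(n)]
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]:
    W[i][j] = W[j][i] = 1.0

best = max(cut_value(W, x) for x in product((-1, 1), repeat=n))
print(best)
```

The 5-cycle is odd, so at most 4 of its 5 edges can cross any cut, and the enumeration confirms that the optimal value is 4.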
8 SDP in Convex Optimization
As stated above, SDP has very wide applications in convex optimization. The types of constraints that can be modeled in the SDP framework include: linear inequalities, convex quadratic inequalities, lower bounds on matrix norms, lower bounds on determinants of symmetric positive semidefinite matrices, lower bounds on the geometric mean of a nonnegative vector, plus many others. Using these and other constructions, the following problems (among many others) can be cast in the form of a semidefinite program: linear programming, optimizing a convex quadratic form subject to convex quadratic inequality constraints, minimizing the volume of an ellipsoid that covers a given set of points and ellipsoids, maximizing the volume of an ellipsoid that is contained in a given polytope, plus a variety of maximum eigenvalue and minimum eigenvalue problems. In the subsections below we demonstrate how some important problems in convex optimization can be re-formulated as instances of SDP.
8.1 SDP for Convex Quadratically Constrained Quadratic Programming, Part I

A convex quadratically constrained quadratic program is a problem of the form:

QCQP :  minimize_x  x^T Q_0 x + q_0^T x + c_0
        s.t.  x^T Q_i x + q_i^T x + c_i ≤ 0 ,  i = 1, . . . , m ,

where Q_0 ⪰ 0 and Q_i ⪰ 0 , i = 1, . . . , m. This problem is the same as:

QCQP :  minimize_{x,θ}  θ
        s.t.  x^T Q_0 x + q_0^T x + c_0 ≤ θ ,
              x^T Q_i x + q_i^T x + c_i ≤ 0 ,  i = 1, . . . , m .

We can factor each Q_i into

Q_i = M_i^T M_i

for some matrix M_i. Then note the equivalence:

[ I            M_i x           ]
[ x^T M_i^T    −c_i − q_i^T x ]  ⪰ 0   ⟺   x^T Q_i x + q_i^T x + c_i ≤ 0 .

In this way we can write QCQP as:

QCQP :  minimize_{x,θ}  θ
        s.t.  [ I            M_0 x               ]
              [ x^T M_0^T    θ − c_0 − q_0^T x ]  ⪰ 0 ,

              [ I            M_i x           ]
              [ x^T M_i^T    −c_i − q_i^T x ]  ⪰ 0 ,  i = 1, . . . , m .

Notice in the above formulation that the variables are θ and x and that all matrix coefficients are linear functions of θ and x.
8.2 SDP for Convex Quadratically Constrained Quadratic Programming, Part II

As it turns out, there is an alternative way to formulate QCQP as a semidefinite program. We begin with the following elementary proposition.

Proposition 8.1  Given a vector x ∈ ℝ^k and a matrix W ∈ ℝ^{k×k}, then

[ 1   x^T ]
[ x   W   ]  ⪰ 0   if and only if   W ⪰ x x^T .

Using this proposition, it is straightforward to show that QCQP is equivalent to the following semidefinite program:
QCQP :  minimize_{x,W,θ}  θ

        s.t.  [ c_0     ½ q_0^T ]   [ 1   x^T ]
              [ ½ q_0   Q_0     ] • [ x   W   ]  ≤ θ ,

              [ c_i     ½ q_i^T ]   [ 1   x^T ]
              [ ½ q_i   Q_i     ] • [ x   W   ]  ≤ 0 ,  i = 1, . . . , m ,

              [ 1   x^T ]
              [ x   W   ]  ⪰ 0 .

Notice in this formulation that there are now (n+1)(n+2)/2 variables, but that the constraints are all linear inequalities as opposed to semidefinite inequalities (apart from the single semidefinite constraint on the matrix formed from 1, x, and W).
8.3 SDP for the Smallest Circumscribed Ellipsoid Problem

A given matrix R ≻ 0 and a given point z can be used to define an ellipsoid in ℝ^n:

E_{R,z} := {y | (y − z)^T R (y − z) ≤ 1} .

One can prove that the volume of E_{R,z} is proportional to √det(R^{-1}).

Suppose we are given a convex set X ⊂ ℝ^n described as the convex hull of k points c_1, . . . , c_k. We would like to find an ellipsoid circumscribing these k points that has minimum volume. Our problem can be written in the following form:

M CP :  minimize_{R,z}  vol(E_{R,z})
        s.t.  c_i ∈ E_{R,z} ,  i = 1, . . . , k ,

which is equivalent to:

M CP :  minimize_{R,z}  −ln(det(R))
        s.t.  (c_i − z)^T R (c_i − z) ≤ 1 ,  i = 1, . . . , k ,
              R ≻ 0 .

Now factor R = M² where M ≻ 0 (that is, M is a square root of R), and now M CP becomes:

M CP :  minimize_{M,z}  −ln(det(M²))
        s.t.  (c_i − z)^T M^T M (c_i − z) ≤ 1 ,  i = 1, . . . , k ,
              M ≻ 0 .

Next notice the equivalence:

[ I                 M c_i − M z ]
[ (M c_i − M z)^T   1           ]  ⪰ 0   ⟺   (c_i − z)^T M^T M (c_i − z) ≤ 1 .
In this way we can write M CP as:

M CP :  minimize_{M,z}  −2 ln(det(M))
        s.t.  [ I                 M c_i − M z ]
              [ (M c_i − M z)^T   1           ]  ⪰ 0 ,  i = 1, . . . , k ,
              M ≻ 0 .

Last of all, make the substitution y = M z to obtain:

M CP :  minimize_{M,y}  −2 ln(det(M))
        s.t.  [ I               M c_i − y ]
              [ (M c_i − y)^T   1         ]  ⪰ 0 ,  i = 1, . . . , k ,
              M ≻ 0 .
Notice that this last program involves semidefinite constraints where all of the matrix coefficients are linear functions of the variables M and y. However, the objective function is not a linear function. It is possible to convert this problem further into a genuine instance of SDP, because there is a way to convert constraints of the form

−ln(det(X)) ≤ θ

to a semidefinite system. Nevertheless, this is not necessary, either from a theoretical or a practical viewpoint, because it turns out that the function

f(X) = −ln(det(X))

is extremely well-behaved and is very easy to optimize (both in theory and in practice).
Finally, note that after solving the formulation of M CP above, we can recover the matrix R and the center z of the optimal ellipsoid by computing

R = M²  and  z = M^{-1} y .
8.4 SDP for the Largest Inscribed Ellipsoid Problem
Recall that a given matrix R ≻ 0 and a given point z can be used to define an ellipsoid in ℝ^n:

E_{R,z} := {x | (x − z)^T R (x − z) ≤ 1} ,

and that the volume of E_{R,z} is proportional to √det(R^{-1}).

Suppose we are given a convex set X ⊂ ℝ^n described as the intersection of k halfspaces {x | (a_i)^T x ≤ b_i} , i = 1, . . . , k, that is,

X = {x | Ax ≤ b} ,

where the ith row of the matrix A consists of the entries of the vector a_i , i = 1, . . . , k. We would like to find an ellipsoid inscribed in X of maximum volume. Our problem can be written in the following form:

M IP :  maximize_{R,z}  vol(E_{R,z})
        s.t.  E_{R,z} ⊂ X ,
which is equivalent to:

M IP :  maximize_{R,z}  det(R^{-1})
        s.t.  E_{R,z} ⊂ {x | (a_i)^T x ≤ b_i} ,  i = 1, . . . , k ,
              R ≻ 0 ,

which is equivalent to:

M IP :  maximize_{R,z}  ln(det(R^{-1}))
        s.t.  max_x {a_i^T x | (x − z)^T R (x − z) ≤ 1} ≤ b_i ,  i = 1, . . . , k ,
              R ≻ 0 .

For a given i = 1, . . . , k, the solution to the optimization problem in the ith constraint is

x* = z + R^{-1} a_i / √(a_i^T R^{-1} a_i) ,

with optimal objective function value

a_i^T z + √(a_i^T R^{-1} a_i) ,

and so M IP can be rewritten as:
M IP :  maximize_{R,z}  ln(det(R^{-1}))
        s.t.  a_i^T z + √(a_i^T R^{-1} a_i) ≤ b_i ,  i = 1, . . . , k ,
              R ≻ 0 .

Now factor R^{-1} = M² where M ≻ 0 (that is, M is a square root of R^{-1}), and now M IP becomes:

M IP :  maximize_{M,z}  ln(det(M²))
        s.t.  a_i^T z + √(a_i^T M^T M a_i) ≤ b_i ,  i = 1, . . . , k ,
              M ≻ 0 ,

which we can re-write as:

M IP :  maximize_{M,z}  2 ln(det(M))
        s.t.  a_i^T M^T M a_i ≤ (b_i − a_i^T z)² ,  i = 1, . . . , k ,
              b_i − a_i^T z ≥ 0 ,  i = 1, . . . , k ,
              M ≻ 0 .

Next notice the equivalence:

[ (b_i − a_i^T z) I   M a_i          ]
[ (M a_i)^T           b_i − a_i^T z ]  ⪰ 0   ⟺   a_i^T M^T M a_i ≤ (b_i − a_i^T z)²  and  b_i − a_i^T z ≥ 0 .
In this way we can write M IP as:

M IP :  minimize_{M,z}  −2 ln(det(M))
        s.t.  [ (b_i − a_i^T z) I   M a_i          ]
              [ (M a_i)^T           b_i − a_i^T z ]  ⪰ 0 ,  i = 1, . . . , k ,
              M ≻ 0 .
Notice that this last program involves semidefinite constraints where all of the matrix coefficients are linear functions of the variables M and z. However, the objective function is not a linear function. It is possible, but not necessary in practice, to convert this problem further into a genuine instance of SDP, because there is a way to convert constraints of the form

−ln(det(X)) ≤ θ

to a semidefinite system. Such a conversion is not necessary, either from a theoretical or a practical viewpoint, because it turns out that the function f(X) = −ln(det(X)) is extremely well-behaved and is very easy to optimize (both in theory and in practice).

Finally, note that after solving the formulation of M IP above, we can recover the matrix R of the optimal ellipsoid by computing

R = M^{-2} .
8.5 SDP for Eigenvalue Optimization
There are many types of eigenvalue optimization problems that can be formulated as SDPs. A typical eigenvalue optimization problem is to create a matrix

S := B − ∑_{i=1}^k w_i A_i ,

given symmetric data matrices B and A_i , i = 1, . . . , k, using weights w_1, . . . , w_k, in such a way as to minimize the difference between the largest and smallest eigenvalues of S. This problem can be written down as:

EOP :  minimize_{w,S}  λ_max(S) − λ_min(S)
       s.t.  S = B − ∑_{i=1}^k w_i A_i ,

where λ_min(S) and λ_max(S) denote the smallest and the largest eigenvalue of S, respectively. We now show how to convert this problem into an SDP.

Recall that S can be factored into S = Q D Q^T, where Q is an orthonormal matrix (i.e., Q^{-1} = Q^T) and D is a diagonal matrix consisting of the eigenvalues of S. The conditions:

λ I ⪯ S ⪯ μ I

can be rewritten as:

Q(λ I)Q^T ⪯ Q D Q^T ⪯ Q(μ I)Q^T .
After premultiplying the above by Q^T and postmultiplying by Q, these conditions become:

λ I ⪯ D ⪯ μ I ,

which are equivalent to:

λ ≤ λ_min(S)  and  λ_max(S) ≤ μ .

Therefore EOP can be written as:

EOP :  minimize_{w,S,λ,μ}  μ − λ
       s.t.  S = B − ∑_{i=1}^k w_i A_i ,
             λ I ⪯ S ⪯ μ I .

This last problem is a semidefinite program.
Using constructs such as those shown above, very many other types of eigenvalue optimization problems can be formulated as SDPs.
9 SDP in Control Theory
A variety of control and systems problems can be cast and solved as instances of SDP. However, this topic is beyond the scope of these notes.
10 Interior-point Methods for SDP
At the heart of an interior-point method is a barrier function that exerts a repelling force from the boundary of the feasible region. For SDP, we need a barrier function whose values approach +∞ as points X approach the boundary of the semidefinite cone S^n_+.

Let X ∈ S^n_+. Then X will have n eigenvalues, say λ_1(X), . . . , λ_n(X) (possibly counting multiplicities). We can characterize the interior of the semidefinite cone as follows:

int S^n_+ = {X ∈ S^n | λ_1(X) > 0, . . . , λ_n(X) > 0} .

A natural barrier function to use to repel X from the boundary of S^n_+ is then

−∑_{j=1}^n ln(λ_j(X)) = −ln( ∏_{j=1}^n λ_j(X) ) = −ln(det(X)) .
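The barrier −ln(det(X)) can be evaluated without computing eigenvalues: from a Cholesky factorization X = L L^T we get ln(det(X)) = 2 ∑_j ln(L_jj). A pure-Python sketch, which also shows the barrier blowing up as X approaches the boundary of S^n_+:

```python
import math

def cholesky(X):
    """Cholesky factor L (lower-triangular) with X = L L^T; X must be pd."""
    n = len(X)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = X[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def log_barrier(X):
    """-ln det(X) = -2 * sum_j ln(L_jj)."""
    L = cholesky(X)
    return -2.0 * sum(math.log(L[j][j]) for j in range(len(X)))

# X(t) = [[1, 1-t], [1-t, 1]] is pd for 0 < t < 2 and singular at t = 0
for t in (0.5, 0.1, 0.01, 0.001):
    X = [[1.0, 1.0 - t], [1.0 - t, 1.0]]
    print(t, log_barrier(X))   # barrier grows without bound as t -> 0
```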
Consider the logarithmic barrier problem BSDP(θ) parameterized by the positive barrier parameter θ:
BSDP(θ) :  minimize  C • X − θ ln(det(X))
           s.t.  A_i • X = b_i ,  i = 1, . . . , m ,
                 X ≻ 0 .

Let f_θ(X) denote the objective function of BSDP(θ). Then it is not too difficult to derive:

∇f_θ(X) = C − θ X^{-1} ,   (1)

and so the Karush-Kuhn-Tucker conditions for BSDP(θ) are:

A_i • X = b_i ,  i = 1, . . . , m ,  X ≻ 0 ,   (2)
C − θ X^{-1} = ∑_{i=1}^m y_i A_i .

Because X is symmetric, we can factorize X into X = L L^T. We then can define

S = θ X^{-1} = θ L^{-T} L^{-1} ,

which implies

(1/θ) L^T S L = I ,
and we can rewrite the Karush-Kuhn-Tucker conditions as:

A_i • X = b_i ,  i = 1, . . . , m ,  X ≻ 0 ,  X = L L^T ,
∑_{i=1}^m y_i A_i + S = C ,   (3)
I − (1/θ) L^T S L = 0 .

From the equations of (3) it follows that if (X, y, S) is a solution of (3), then X is feasible for SDP, (y, S) is feasible for SDD, and the resulting duality gap is

S • X = ∑_{i=1}^n ∑_{j=1}^n S_ij X_ij = ∑_{j=1}^n (S X)_jj = ∑_{j=1}^n θ = nθ .

This suggests that we try solving BSDP(θ) for a variety of values of θ as θ → 0.

However, we cannot usually solve (3) exactly, because the fourth equation group is not linear in the variables. We will instead define a β-approximate solution of the Karush-Kuhn-Tucker conditions (3). Before doing so, we introduce the following norm on matrices, called the Frobenius norm:

‖M‖ := √(M • M) = √( ∑_{i=1}^n ∑_{j=1}^n M_ij² ) .
For some important properties of the Frobenius norm, see the last subsection of this section. A β-approximate solution of BSDP(θ) is defined as any solution (X, y, S) of

A_i • X = b_i ,  i = 1, . . . , m ,  X ≻ 0 ,  X = L L^T ,
∑_{i=1}^m y_i A_i + S = C ,   (4)
‖I − (1/θ) L^T S L‖ ≤ β .
Lemma 10.1  If (X̄, ȳ, S̄) is a β-approximate solution of BSDP(θ) and β < 1, then X̄ is feasible for SDP, (ȳ, S̄) is feasible for SDD, and the duality gap satisfies:

nθ(1 − β) ≤ C • X̄ − ∑_{i=1}^m ȳ_i b_i = X̄ • S̄ ≤ nθ(1 + β) .   (5)

Proof: Primal feasibility is obvious. To prove dual feasibility, we need to show that S̄ ⪰ 0. To see this, define

R = I − (1/θ) L^T S̄ L   (6)

and note that ‖R‖ ≤ β < 1. Rearranging (6), we obtain S̄ = θ L^{-T}(I − R) L^{-1} ⪰ 0, because ‖R‖ < 1 implies I − R ≻ 0. For the bound (5), note that X̄ • S̄ = trace(X̄ S̄) = trace(L^T S̄ L) = θ trace(I − R) = θ(n − trace(R)), and each diagonal entry of R satisfies |R_jj| ≤ ‖R‖ ≤ β, so |trace(R)| ≤ nβ.

q.e.d.
10.1 The Algorithm
Based on the analysis just presented, we are motivated to develop the following algorithm:

Step 0. Initialization. Data is (X0, y0, S0, θ0). k = 0. Assume that (X0, y0, S0) is a β-approximate solution of BSDP(θ0) for some known value of β that satisfies β ≤ 1/4.

[Pages 33-44 of the original notes, containing the remaining steps of the algorithm and the analysis leading up to Proposition 10.2, are not included in this transcript.]

Proposition 10.2  Let X̄ = L̄ L̄^T ≻ 0 be feasible for BSDP(θ0), let (D̄, ȳ) solve the Normal equations (12), suppose that ‖L̄^{-1} D̄ L̄^{-T}‖ > 1/4, and define D̄_α := (α / ‖L̄^{-1} D̄ L̄^{-T}‖) D̄. Then for all α ∈ [0, 1),

f_θ0(X̄ + D̄_α) ≤ f_θ0(X̄) + θ0 ( −α ‖L̄^{-1} D̄ L̄^{-T}‖ + α²/(2(1 − α)) ) .

In particular, for α = 0.2,

f_θ0(X̄ + D̄_α) ≤ f_θ0(X̄) − 0.025 θ0 .   (11)
In order to prove this proposition, we will need two powerful facts about the logarithm function:

Fact 1.  Suppose that |x| ≤ α < 1. Then

x − x²/(2(1 − α)) ≤ ln(1 + x) ≤ x .

Proof: We have:

ln(1 + x) = x − x²/2 + x³/3 − x⁴/4 + . . .
          ≥ x − |x|²/2 − |x|³/3 − |x|⁴/4 − . . .
          ≥ x − |x|²/2 − |x|³/2 − |x|⁴/2 − . . .
          = x − (x²/2)(1 + |x| + |x|² + |x|³ + . . .)
          = x − x²/(2(1 − |x|))
          ≥ x − x²/(2(1 − α)) .
q.e.d.
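Fact 1 is easy to sanity-check on a grid with the standard library (the lower bound degrades as α → 1, but the inequality holds throughout |x| ≤ α):

```python
import math

def fact1_holds(x, alpha):
    """Check x - x^2/(2(1-alpha)) <= ln(1+x) <= x for |x| <= alpha < 1."""
    lower = x - x * x / (2.0 * (1.0 - alpha))
    return lower <= math.log(1.0 + x) <= x + 1e-15

alpha = 0.9
# sample x throughout [-alpha, alpha]
xs = [alpha * (k / 1000.0) for k in range(-1000, 1001)]
assert all(fact1_holds(x, alpha) for x in xs)
```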
Fact 2.  Suppose that R ∈ S^n and that ‖R‖ ≤ α < 1. Then

ln(det(I + R)) ≥ I • R − ‖R‖²/(2(1 − α)) .

Proof: Factor R = Q D Q^T, where Q is orthonormal and D is a diagonal matrix of the eigenvalues of R. Then first note that ‖R‖² = ∑_{j=1}^n D_jj². We then have:

ln(det(I + R)) = ln(det(I + Q D Q^T))
             = ln(det(I + D))
             = ∑_{j=1}^n ln(1 + D_jj)
             ≥ ∑_{j=1}^n ( D_jj − D_jj²/(2(1 − α)) )
             = I • D − ‖D‖²/(2(1 − α))
             = I • (Q^T R Q) − ‖Q^T R Q‖²/(2(1 − α))
             = I • R − ‖R‖²/(2(1 − α)) .

q.e.d.

Proof of Proposition 10.2.  Let

D̄_α = (α / ‖L̄^{-1} D̄ L̄^{-T}‖) D̄ ,

and notice that ‖L̄^{-1} D̄_α L̄^{-T}‖ = α.
Then

f_θ0(X̄ + D̄_α) = C • (X̄ + D̄_α) − θ0 ln(det( L̄(I + L̄^{-1} D̄_α L̄^{-T}) L̄^T ))

= C • X̄ − θ0 ln(det(X̄)) + C • D̄_α − θ0 ln(det(I + L̄^{-1} D̄_α L̄^{-T}))

≤ f_θ0(X̄) + C • D̄_α − θ0 ( I • (L̄^{-1} D̄_α L̄^{-T}) − ‖L̄^{-1} D̄_α L̄^{-T}‖²/(2(1 − α)) )

= f_θ0(X̄) + C • D̄_α − θ0 X̄^{-1} • D̄_α + θ0 α²/(2(1 − α)) .

Now, (D̄, ȳ) solve the Normal equations:

C − θ0 X̄^{-1} + θ0 X̄^{-1} D̄ X̄^{-1} = ∑_{i=1}^m ȳ_i A_i ,   (12)
A_i • D̄ = 0 ,  i = 1, . . . , m .

Taking the inner product of both sides of the first equation above with D̄ and rearranging yields:

−θ0 (X̄^{-1} D̄ X̄^{-1}) • D̄ = (C − θ0 X̄^{-1}) • D̄ .

Substituting this in our inequality above yields:
f_θ0(X̄ + D̄_α) ≤ f_θ0(X̄) − θ0 (X̄^{-1} D̄ X̄^{-1}) • D̄_α + θ0 α²/(2(1 − α))

= f_θ0(X̄) − θ0 α ‖L̄^{-1} D̄ L̄^{-T}‖ + θ0 α²/(2(1 − α))

= f_θ0(X̄) + θ0 ( −α ‖L̄^{-1} D̄ L̄^{-T}‖ + α²/(2(1 − α)) ) .

Substituting α = 0.2 and ‖L̄^{-1} D̄ L̄^{-T}‖ > 1/4 yields the final result.  q.e.d.
Last of all, we prove a bound on the number of iterations that the algorithm will need in order to find a 1/4-approximate solution of BSDP(θ0):

Proposition 10.3  Suppose that X0 satisfies A_i • X0 = b_i , i = 1, . . . , m, and X0 ≻ 0. Let θ0 be given and let f*_θ0 be the optimal objective function value of BSDP(θ0). Then the algorithm initiated at X0 will find a 1/4-approximate solution of BSDP(θ0) in at most

k = ⌈ (f_θ0(X0) − f*_θ0) / (0.025 θ0) ⌉

iterations.

Proof: This follows immediately from (11). Each iteration that does not yield a 1/4-approximate solution decreases the objective function f_θ0(X) of BSDP(θ0) by at least 0.025 θ0. Therefore, there cannot be more than

(f_θ0(X0) − f*_θ0) / (0.025 θ0)
iterations that are not 1/4-approximate solutions of BSDP(θ0).  q.e.d.
10.5 Some Properties of the Frobenius Norm
Proposition 10.4  If M ∈ S^n, then

1. ‖M‖ = √( ∑_{j=1}^n (λ_j(M))² ), where λ_1(M), λ_2(M), . . . , λ_n(M) is an enumeration of the n eigenvalues of M.

2. If λ is any eigenvalue of M, then |λ| ≤ ‖M‖.

3. |trace(M)| ≤ √n ‖M‖.

4. If ‖M‖ < 1, then I − M ≻ 0.

q.e.d.
Proposition 10.5  If A, B ∈ S^n, then ‖A B‖ ≤ ‖A‖ ‖B‖.

Proof: We have

‖A B‖² = ∑_{i=1}^n ∑_{j=1}^n ( ∑_{k=1}^n A_ik B_kj )²

       ≤ ∑_{i=1}^n ∑_{j=1}^n ( ∑_{k=1}^n A_ik² ) ( ∑_{k=1}^n B_kj² )

       = ( ∑_{i=1}^n ∑_{k=1}^n A_ik² ) ( ∑_{j=1}^n ∑_{k=1}^n B_kj² ) = ‖A‖² ‖B‖² .

q.e.d.
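Both Proposition 10.4 (part 3) and Proposition 10.5 are easy to spot-check numerically; a small random test in pure Python:

```python
import math
import random

def frob(M):
    """Frobenius norm of a matrix."""
    return math.sqrt(sum(x * x for row in M for x in row))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def random_sym(n, rng):
    M = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    return [[0.5 * (M[i][j] + M[j][i]) for j in range(n)] for i in range(n)]

rng = random.Random(1)
n = 5
for _ in range(200):
    A, B = random_sym(n, rng), random_sym(n, rng)
    trace_A = sum(A[i][i] for i in range(n))
    assert abs(trace_A) <= math.sqrt(n) * frob(A) + 1e-12   # Prop 10.4(3)
    assert frob(matmul(A, B)) <= frob(A) * frob(B) + 1e-12  # Prop 10.5
```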
11 Issues in Solving SDP using the Ellipsoid Algorithm

To see how the ellipsoid algorithm is used to solve a semidefinite program, assume for convenience that the format of the problem is that of the dual problem SDD. Then the feasible region of the problem can be written as:

F = { (y_1, . . . , y_m) ∈ ℝ^m | C − ∑_{i=1}^m y_i A_i ⪰ 0 } ,

and the objective function is ∑_{i=1}^m y_i b_i. Note that F is just a convex region in ℝ^m.
Recall that at any iteration of the ellipsoid algorithm, the set of solutions of SDD is known to lie in the current ellipsoid, and the center of the current ellipsoid is, say, ȳ = (ȳ_1, . . . , ȳ_m). If ȳ ∈ F, then we perform an optimality cut of the form ∑_{i=1}^m y_i b_i ≥ ∑_{i=1}^m ȳ_i b_i, and use standard formulas to update the ellipsoid and its new center. If ȳ ∉ F, then we perform a feasibility cut by computing a vector h ∈ ℝ^m such that h^T ȳ > h^T y for all y ∈ F.
There are four issues that must be resolved in order to implement the above version of the ellipsoid algorithm to solve SDD:

1. Testing if ȳ ∈ F. This is done by computing the matrix S = C − ∑_{i=1}^m ȳ_i A_i. If S ⪰ 0, then ȳ ∈ F. Testing if the matrix S ⪰ 0 can be done by computing an exact Cholesky factorization of S, which takes O(n³) operations, assuming that computing square roots can be performed (exactly) in one operation.

2. Computing a feasibility cut. As above, testing if S ⪰ 0 can be computed in O(n³) operations. If S is not psd, then again assuming exact arithmetic, we can find an n-vector v such that v^T S v < 0 in O(n³) operations as well. Then the feasibility cut vector h is computed by the formula:

h_i = v^T A_i v ,  i = 1, . . . , m ,

whose computation requires O(mn²) operations. Notice that for any y ∈ F,

h^T ȳ = ∑_{i=1}^m ȳ_i v^T A_i v = v^T ( ∑_{i=1}^m ȳ_i A_i ) v

      > v^T C v ≥ v^T ( ∑_{i=1}^m y_i A_i ) v = ∑_{i=1}^m y_i v^T A_i v = h^T y ,

thereby showing that h indeed provides a feasibility cut for F at y = ȳ.
3. Starting the ellipsoid algorithm. We need to determine an upper bound R on the distance of some optimal solution y* from the origin. This cannot be done by examining the input length of the data, as is the case in linear programming. One needs to know some special information about the specific problem at hand in order to determine R before solving the semidefinite program.
4. Stopping the ellipsoid algorithm. Suppose that we seek an ε-optimal solution of SDD. In order to prove a complexity bound on the number of iterations needed to find an ε-optimal solution, we need to know beforehand the radius r of a Euclidean ball that is contained in the set of ε-optimal solutions of SDD. The value of r also cannot be determined by examining the input length of the data, as is the case in linear programming. One needs to know some special information about the specific problem at hand in order to determine r before solving the semidefinite program.
12 Current Research in SDP
There are many very active research areas in semidefinite programming: in nonlinear (convex) programming, in combinatorial optimization, and in control theory. In the area of convex analysis, recent research topics include the geometry and the boundary structure of SDP feasible regions (including notions of degeneracy), and research related to the computational complexity of SDP, such as decidability questions, certificates of infeasibility, and duality theory. In the area of combinatorial optimization, there has been much research on the practical and the theoretical use of SDP relaxations of hard combinatorial optimization problems. As regards interior point methods, there are a host of research issues, mostly involving the development of different interior point algorithms and their properties, including rates of convergence, performance guarantees, etc.
13 Computational State of the Art of SDP
Because SDP has so many applications, and because interior point methods show so much promise, perhaps the most exciting area of research on SDP has to do with computation and implementation of interior point algorithms for solving SDP. Much research has focused on the practical efficiency of interior point methods for SDP. However, in the research to date, computational issues have arisen that are much more complex than those for linear programming, and these computational issues are only beginning to be well-understood. They probably stem from a variety of factors, including the fact that SDP is not guaranteed to have strictly complementary optimal solutions (as is the case in linear programming). Finally, because SDP is such a new field, there is no representative suite of practical problems on which to test algorithms, i.e., there is no equivalent version of the netlib suite of industrial linear programming problems.
A good website for semidefinite programming is:
http://www.zib.de/helmberg/semidef.html.
14 Exercises
1. For a (square) matrix M ∈ ℝ^{n×n}, define trace(M) = ∑_{j=1}^n M_jj, and for two matrices A, B ∈ ℝ^{k×l} define

A • B := ∑_{i=1}^k ∑_{j=1}^l A_ij B_ij .

Prove that:

(a) A • B = trace(A^T B).
(b) trace(M N) = trace(N M).

2. Let S^{k×k}_+ denote the cone of positive semidefinite symmetric matrices, namely S^{k×k}_+ = {X ∈ S^{k×k} | v^T X v ≥ 0 for all v ∈ ℝ^k}. Considering S^{k×k}_+ as a cone, prove that (S^{k×k}_+)* = S^{k×k}_+, thus showing that S^{k×k}_+ is self-dual.
3. Consider the problem:

(P) :  minimize_d  d^T Q d
       s.t.  d^T M d = 1 .

If M ≻ 0, show that (P) is equivalent to:

(S) :  minimize_X  Q • X
       s.t.  M • X = 1 ,
             X ⪰ 0 .

What is the SDP dual of (S)?

4. Prove that X ⪰ x x^T if and only if

[ X    x ]
[ x^T  1 ]  ⪰ 0 .
5. Let K := {X ∈ S^{k×k} | X ⪰ 0} and define

K* := {S ∈ S^{k×k} | S • X ≥ 0 for all X ∈ K} .

Prove that K = K*.
6. Let λ_min denote the smallest eigenvalue of the symmetric matrix Q. Show that the following three optimization problems each have optimal objective function value equal to λ_min.

(P1) :  minimize_d  d^T Q d
        s.t.  d^T I d = 1 .

(P2) :  maximize  λ
        s.t.  Q − λ I ⪰ 0 .
(P3) :  minimize_X  Q • X
        s.t.  I • X = 1 ,
              X ⪰ 0 .
7. Suppose that Q ⪰ 0. Prove the following:

x^T Q x + q^T x + c ≤ 0

if and only if there exists W for which

[ c     ½ q^T ]   [ 1   x^T ]                    [ 1   x^T ]
[ ½ q   Q     ] • [ x   W   ]  ≤ 0     and       [ x   W   ]  ⪰ 0 .

Hint: use the property shown in Exercise 4.
8. Consider the matrix

M = [ A    B ]
    [ B^T  C ] ,

where A, C are symmetric matrices and A ≻ 0. Prove that M ⪰ 0 if and only if C − B^T A^{-1} B ⪰ 0.