MA3071: Newton & Conjugate Gradient Algorithms
General Algorithm for Unconstrained Nonlinear Optimization Problems¶
-- MA3071 lecture notes, 2021-07-15
Suppose $f(\mathbf{x})$ has continuous first and second derivatives on $\mathbb{R}^n$.
Given a point $\mathbf{x}\in\Omega$, a vector $\mathbf{d}$ is called a feasible direction from $\mathbf{x}$ if there exists $\bar{\alpha}>0$ such that $\mathbf{x}(\alpha)=\mathbf{x}+\alpha\mathbf{d}\in\Omega$ for all $\alpha$ satisfying $0\leq\alpha\leq\bar{\alpha}$.
In the minimization case, what we want is a direction $\mathbf{d}$ that moves us downhill, that is,
$f(\mathbf{x}+\alpha\mathbf{d})<f(\mathbf{x})$
$$\mathbf{x}_{k+1}=\mathbf{x}_k+\alpha \mathbf{d}$$
where $\mathbf{d}$ is a feasible direction vector.
Based on the concept of the directional derivative, any $\mathbf{d}$ with $\nabla f(\mathbf{x}_k)^T\mathbf{d}<0$ is a descent direction; choosing $\mathbf{d}=-\nabla f(\mathbf{x}_k)$ gives the feasible direction of fastest descent. Thus
$$\mathbf{x}_{k+1}=\mathbf{x}_k-\alpha \nabla f(\mathbf{x}_k)$$
-- Algebraic operations in Python: https://www.ridwanreza.com/2021/07/aljabar-python.html
Example 1¶
Minimize $$f(x_1,x_2)=x_1^2+x_1x_2+x_2^2-4x_1-5x_2$$ which can be written in the equivalent form $$f(x_1,x_2)=\frac{1}{2}\begin{bmatrix} x_1&x_2 \end{bmatrix}\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}-\begin{bmatrix} 4&5 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$ so that $$\nabla f(x_1,x_2)=\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}-\begin{bmatrix} 4\\5 \end{bmatrix}=\begin{bmatrix} 2x_1+x_2-4\\ x_1+2x_2-5 \end{bmatrix}$$ $$\nabla^2 f(x_1,x_2)=\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$
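As a quick sanity check (a minimal sketch; the test point is arbitrary), we can verify numerically that the matrix form reproduces $f$ and $\nabla f$:
import numpy as np
Q = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([4.0, 5.0])
x = np.array([3.0, -1.0])  # arbitrary test point
f_direct = x[0]**2 + x[0]*x[1] + x[1]**2 - 4*x[0] - 5*x[1]
f_matrix = 0.5 * x @ Q @ x - b @ x
print(f_direct, f_matrix)  # identical values
print(Q @ x - b)           # equals [2*x1+x2-4, x1+2*x2-5]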
import numpy as np
# General descent algorithm
x = np.array([0, 0])  # initial point
alpha = 0.1  # try the combinations (x1, x2, alpha) = (0,0,1), (0,0,0.1), (0,0,0.5), (1,1,1), (1,1,0.1), (1,1,0.5)
print("x0= ", x)
nablaf = np.array([2*x[0]+x[1]-4, x[0]+2*x[1]-5])  # compute nabla f(x) at the initial x
print("nabla f(x0)= ", nablaf)
print("norm(nabla f(x0))= ", np.linalg.norm(nablaf))
error = 0.00001
iter = 0
maxiter = 1000
while np.linalg.norm(nablaf) > error and iter < maxiter:
    iter = iter + 1
    x = x - alpha*nablaf  # update x
    nablaf = np.array([2*x[0]+x[1]-4, x[0]+2*x[1]-5])  # recompute nabla f(x) at the new x
    if iter % 20 == 0:
        print("\n", "x", iter, "= ", x)
        print("nabla f(x", iter, ")= ", nablaf)
        print("norm(nabla f(x", iter, "))= ", np.linalg.norm(nablaf))
print('\nTotal iteration number', iter); print('x =', x)
print("nabla f(x", iter, ")= ", nablaf)
print("norm(nabla f(x", iter, "))= ", np.linalg.norm(nablaf))
x0=  [0 0]
nabla f(x0)=  [-4 -5]
norm(nabla f(x0))=  6.4031242374328485

 x 20 =  [1.05959144 1.93801479]
nabla f(x 20 )=  [ 0.05719768 -0.06437898]
norm(nabla f(x 20 ))=  0.08611751874066621

 x 40 =  [1.00738949 1.9926086 ]
nabla f(x 40 )=  [ 0.00738758 -0.00739331]
norm(nabla f(x 40 ))=  0.01045166334519779

 x 60 =  [1.0008985  1.99910149]
nabla f(x 60 )=  [ 0.0008985  -0.00089851]
norm(nabla f(x 60 ))=  0.0012706781689357455

 x 80 =  [1.00010924 1.99989076]
nabla f(x 80 )=  [ 0.00010924 -0.00010924]
norm(nabla f(x 80 ))=  0.00015448480083971538

 x 100 =  [1.00001328 1.99998672]
nabla f(x 100 )=  [ 1.32806994e-05 -1.32806994e-05]
norm(nabla f(x 100 ))=  1.8781745271123164e-05

Total iteration number 106
x = [1.00000706 1.99999294]
nabla f(x 106 )=  [ 7.05790819e-06 -7.05790819e-06]
norm(nabla f(x 106 ))=  9.981389488521654e-06
What conclusions can be drawn from the experiment above?
Unconstrained Nonlinear Optimization Problems, the Quadratic Case¶
Minimize $f(\mathbf{x})=\frac{1}{2}\mathbf{x}^T\mathbf{Q}\mathbf{x}-\mathbf{x}^T\mathbf{b}$
where $\mathbf{Q}$ is a symmetric positive definite matrix, so that $f$ is a convex function.
$\nabla f(\mathbf{x})=\mathbf{Q}\mathbf{x}-\mathbf{b}$
$\nabla^2 f(\mathbf{x})=\mathbf{Q}$
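For the $\mathbf{Q}$ of Example 1 we can confirm positive definiteness numerically. A quick sketch (a symmetric matrix is positive definite exactly when all its eigenvalues are positive):
import numpy as np
Q = np.array([[2.0, 1.0], [1.0, 2.0]])
print(np.linalg.eigvalsh(Q))  # [1. 3.] -- both eigenvalues positive, so Q is positive definite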
First-order necessary condition (SPO1):
$\nabla f(\mathbf{x})=\mathbf{Q}\mathbf{x}-\mathbf{b}=\mathbf{0}$
$\mathbf{Q}\mathbf{x}=\mathbf{b}$
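So for a quadratic the minimizer is simply the solution of a linear system. For Example 1 this can be checked directly; a minimal sketch:
import numpy as np
Q = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([4.0, 5.0])
print(np.linalg.solve(Q, b))  # [1. 2.] -- the same minimizer the iterative methods approach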
Steepest Descent Method¶
Let $f$ be the quadratic case.
The steepest descent method minimizes $f$ along the line $\mathbf{x}_{k+1}=\mathbf{x}_k+\alpha\,\mathbf{d}_k$ so that $f(\mathbf{x}_{k+1})\leq f(\mathbf{x}_k)$, for $k=0,1,2,\ldots$
Let $g(\alpha)=f(\mathbf{x}_k+\alpha\,\mathbf{d}_k)$.
The Taylor expansion of $g$ around $\alpha=0$ up to second order is
$$g(\alpha)\approx g(0)+\alpha\, g'(0)+\frac{1}{2}\alpha^2 g''(0)$$
where, by the chain rule, $g'(0)=\mathbf{d}_k^T\nabla f(\mathbf{x}_k)$ and $g''(0)=\mathbf{d}_k^T\nabla^2 f(\mathbf{x}_k)\,\mathbf{d}_k$.
Applying SPO1 to $g(\alpha)$ gives
$g'(\alpha)=0$
$g'(0)+\alpha\, g''(0)=0$
$\alpha=-\dfrac{g'(0)}{g''(0)}$
$\alpha=-\dfrac{\mathbf{d}_k^T\nabla f(\mathbf{x}_k)}{\mathbf{d}_k^T\nabla^2 f(\mathbf{x}_k)\,\mathbf{d}_k}$
So the steepest descent iteration for the quadratic case is $$\mathbf{x}_{k+1}=\mathbf{x}_k+\alpha_k \nabla f(\mathbf{x}_k)$$ with $\alpha_k=-\dfrac{\nabla f(\mathbf{x}_k)^T\nabla f(\mathbf{x}_k)}{\nabla f(\mathbf{x}_k)^T\nabla^2 f(\mathbf{x}_k)\nabla f(\mathbf{x}_k)}$, obtained by substituting $\mathbf{d}_k=\nabla f(\mathbf{x}_k)$ into the step-size formula; $\alpha_k$ is then negative, so the step still moves against the gradient, exactly as in the code below.
#Steepest descent method
x = np.array([0, 0])  # initial point
print("x0= ", x)
nablaf = np.array([2*x[0]+x[1]-4, x[0]+2*x[1]-5])  # compute nabla f(x) at the initial x
nabla2f = np.array([[2, 1], [1, 2]])  # happens to be constant, independent of x
print("nabla f(x0)= ", nablaf)
print("norm(nabla f(x0))= ", np.linalg.norm(nablaf))
error = 0.00001
iter = 0
maxiter = 1000
while np.linalg.norm(nablaf) > error and iter < maxiter:
    iter += 1
    alpha = -(nablaf@nablaf)/(nablaf@nabla2f@nablaf)  # exact line-search step size
    x = x + alpha*nablaf  # update x
    nablaf = np.array([2*x[0]+x[1]-4, x[0]+2*x[1]-5])  # recompute nabla f(x) at the new x
    if iter % 2 == 0:
        print("\n", "x", iter, "= ", x)
        print("nabla f(x", iter, ")= ", nablaf)
        print("norm(nabla f(x", iter, "))= ", np.linalg.norm(nablaf))
print('Total iteration number', iter); print('x =', x)
x0=  [0 0]
nabla f(x0)=  [-4 -5]
norm(nabla f(x0))=  6.4031242374328485

 x 2 =  [0.98419204 1.96838407]
nabla f(x 2 )=  [-0.06323185 -0.07903981]
norm(nabla f(x 2 ))=  0.10122034801562493

 x 4 =  [0.99975011 1.99950022]
nabla f(x 4 )=  [-0.00099957 -0.00124946]
norm(nabla f(x 4 ))=  0.0016000874686308642

 x 6 =  [0.99999605 1.9999921 ]
nabla f(x 6 )=  [-1.58011132e-05 -1.97513915e-05]
norm(nabla f(x 6 ))=  2.5294122747049902e-05
Total iteration number 7
x = [1.00000136 1.99999874]
Newton's Method¶
Let $f$ be the quadratic case.
The Taylor expansion of $f$ around $\mathbf{x}_k$ up to second order is
$$g(\mathbf{x})\approx f(\mathbf{x}_k)+\nabla f(\mathbf{x}_k)^T(\mathbf{x}-\mathbf{x}_k)+\frac{1}{2}(\mathbf{x}-\mathbf{x}_k)^T\nabla^2 f(\mathbf{x}_k)(\mathbf{x}-\mathbf{x}_k)$$
$g(\mathbf{x})$ is a quadratic approximation of the function $f$.
Applying SPO1 to $g(\mathbf{x})$ gives
$\nabla g(\mathbf{x})=\mathbf{0}$
$\nabla f(\mathbf{x}_k)+\nabla^2 f(\mathbf{x}_k)(\mathbf{x}-\mathbf{x}_k)=\mathbf{0}$
$\nabla^2 f(\mathbf{x}_k)\,\mathbf{x}-\nabla^2 f(\mathbf{x}_k)\,\mathbf{x}_k=-\nabla f(\mathbf{x}_k)$
$\nabla^2 f(\mathbf{x}_k)\,\mathbf{x}=\nabla^2 f(\mathbf{x}_k)\,\mathbf{x}_k-\nabla f(\mathbf{x}_k)$
$\mathbf{x}=\mathbf{x}_k-[\nabla^2 f(\mathbf{x}_k)]^{-1}\nabla f(\mathbf{x}_k)$
So the Newton iteration for the quadratic case is $$\mathbf{x}_{k+1}=\mathbf{x}_k-[\nabla^2 f(\mathbf{x}_k)]^{-1}\nabla f(\mathbf{x}_k)$$ where the role of the step size is played by the matrix $-[\nabla^2 f(\mathbf{x}_k)]^{-1}$.
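In practice it is numerically preferable to solve the linear system $\nabla^2 f(\mathbf{x}_k)\,\mathbf{p}=-\nabla f(\mathbf{x}_k)$ for the Newton step $\mathbf{p}$ instead of forming the explicit inverse. A minimal sketch of one Newton step for Example 1 (the code below uses np.linalg.inv, which gives the same result on this small problem):
import numpy as np
Q = np.array([[2.0, 1.0], [1.0, 2.0]])  # constant Hessian of Example 1
x = np.array([0.0, 0.0])
g = Q @ x - np.array([4.0, 5.0])  # gradient at x
p = np.linalg.solve(Q, -g)        # Newton step: solve Q p = -g
print(x + p)                      # [1. 2.] -- one step suffices for a quadratic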
#Newton's method
x = np.array([999999, 123891273])  # initial point (deliberately far from the solution)
print("x0= ", x)
nablaf = np.array([2*x[0]+x[1]-4, x[0]+2*x[1]-5])  # compute nabla f(x) at the initial x
nabla2f = np.array([[2, 1], [1, 2]])  # happens to be constant, independent of x
print("nabla f(x0)= ", nablaf)
print("norm(nabla f(x0))= ", np.linalg.norm(nablaf))
error = 0.00001
iter = 0
maxiter = 100
while np.linalg.norm(nablaf) > error and iter < maxiter:
    iter += 1
    x = x - np.linalg.inv(nabla2f)@nablaf  # update x with the Newton step
    print("\n", "x", iter, "= ", x)
    nablaf = np.array([2*x[0]+x[1]-4, x[0]+2*x[1]-5])  # recompute nabla f(x) at the new x
    print("nabla f(x", iter, ")= ", nablaf)
    print("norm(nabla f(x", iter, "))= ", np.linalg.norm(nablaf))
print('Total iteration number', iter); print('x =', x)
x0=  [   999999 123891273]
nabla f(x0)=  [125891267 248782540]
norm(nabla f(x0))=  278821382.45786834

 x 1 =  [1. 2.]
nabla f(x 1 )=  [0. 0.]
norm(nabla f(x 1 ))=  0.0
Total iteration number 1
x = [1. 2.]
Conjugate Gradient Method¶
Let $f$ be the quadratic case.
Suppose the nonzero vectors $\{\mathbf{d}_i\}_{i=0}^{n-1}$ are $\mathbf{Q}$-orthogonal (conjugate), i.e. $\mathbf{d}_i^T\mathbf{Q}\,\mathbf{d}_j=0$ for $i\neq j$.
For any $\mathbf{x}_0\in\mathbb{R}^n$, the sequence $\{\mathbf{x}_k\}$ generated by
$\mathbf{x}_{k+1}=\mathbf{x}_k+\alpha_k\mathbf{d}_k$
with $\alpha_k=-\dfrac{\mathbf{d}_k^T\nabla f(\mathbf{x}_k)}{\mathbf{d}_k^T\nabla^2 f(\mathbf{x}_k)\,\mathbf{d}_k}$
converges to the solution $\mathbf{x}^*$ of $\mathbf{Q}\mathbf{x}=\mathbf{b}$ after $n$ steps,
that is, $\mathbf{x}_n=\mathbf{x}^*$.
The solution can be obtained in fewer than $n$ steps if $\nabla f(\mathbf{x}_k)=\mathbf{0}$ occurs earlier.
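A quick numerical illustration of $\mathbf{Q}$-orthogonality, using hand-picked vectors for the $\mathbf{Q}$ of Example 1 (the conjugate gradient iteration below constructs such directions automatically):
import numpy as np
Q = np.array([[2.0, 1.0], [1.0, 2.0]])
d0 = np.array([1.0, 0.0])
d1 = np.array([1.0, -2.0])
print(d0 @ Q @ d1)  # 0.0 -- d0 and d1 are Q-orthogonal (conjugate)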
Conjugate Gradient Method Iteration¶
Choose an initial point $\mathbf{x}_0$, then compute
$\mathbf{d}_0=-\nabla f(\mathbf{x}_0)$
$\alpha_0=-\dfrac{\mathbf{d}_0^T\nabla f(\mathbf{x}_0)}{\mathbf{d}_0^T\nabla^2 f(\mathbf{x}_0)\,\mathbf{d}_0}$
$\mathbf{x}_1=\mathbf{x}_0+\alpha_0\mathbf{d}_0$
Check whether $\|\nabla f(\mathbf{x}_1)\|$ is already small enough (close to 0).
Then, for $k=1,2,\ldots$, compute:
$\mathbf{d}_k=-\nabla f(\mathbf{x}_k)+\dfrac{\nabla f(\mathbf{x}_k)^T\,\nabla^2 f(\mathbf{x}_{k-1})\,\mathbf{d}_{k-1}}{\mathbf{d}_{k-1}^T\,\nabla^2 f(\mathbf{x}_{k-1})\,\mathbf{d}_{k-1}}\,\mathbf{d}_{k-1}$
$\alpha_k=-\dfrac{\mathbf{d}_k^T\nabla f(\mathbf{x}_k)}{\mathbf{d}_k^T\nabla^2 f(\mathbf{x}_k)\,\mathbf{d}_k}$
$\mathbf{x}_{k+1}=\mathbf{x}_k+\alpha_k\mathbf{d}_k$
until $\|\nabla f(\mathbf{x}_{k+1})\|$ is small enough (close to 0).
By the theorem above, the solution is obtained after $n$ iterations, where $n$ is the dimension of the square matrix $\mathbf{Q}$.
The solution can be obtained in fewer than $n$ steps if $\nabla f(\mathbf{x}_k)=\mathbf{0}$ occurs earlier.
#Conjugate gradient method
x = np.array([10, 10])  # initial point
print("x0= ", x)
nablaf = np.array([2*x[0]+x[1]-4, x[0]+2*x[1]-5])  # compute nabla f(x) at the initial x
print("norm(nabla f(x0))= ", np.linalg.norm(nablaf))
nabla2f = np.array([[2, 1], [1, 2]])
d = -nablaf  # initial direction d0
iter = 1
maxiter = 100
error = 0.000001
while np.linalg.norm(nablaf) > error and iter < maxiter:
    alpha = -d@nablaf/(d@nabla2f@d)  # exact line-search step size
    x = x + alpha*d
    print("x", iter, "= ", x)
    nablaf = np.array([2*x[0]+x[1]-4, x[0]+2*x[1]-5])
    print("norm(nabla f(x", iter, "))= ", np.linalg.norm(nablaf))
    d = -nablaf + (nabla2f@nablaf@d/(d@nabla2f@d))*d  # next Q-orthogonal direction
    iter = iter + 1
x0=  [10 10]
norm(nabla f(x0))=  36.069377593742864
x 1 =  [1.33111225 1.66453101]
norm(nabla f(x 1 ))=  0.4714347148336458
x 2 =  [1. 2.]
norm(nabla f(x 2 ))=  1.2560739669470201e-15
The solution is obtained after exactly 2 steps, matching the theorem ($n=2$ here).
USING AUTOGRAD¶
Autograd is a Python library that can automatically compute derivatives of functions.
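A minimal example of what autograd provides (grad returns a function that evaluates the gradient):
import autograd.numpy as np
from autograd import grad

def h(x):
    return x[0]**2 + 3*x[1]**2  # simple test function

dh = grad(h)                     # dh(x) returns the gradient of h at x
print(dh(np.array([1.0, 2.0])))  # [ 2. 12.]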
Example 2¶
Consider the problem: minimize $$f(\mathbf{x})=x_1^2+2x_2^2+3x_3^2+x_4^2+2x_5^2+4x_6^2-x_1-x_2-x_3-x_4-x_5-x_6$$
The problem will be solved using:
- General Descent
- Steepest Descent Method
- Newton Method
- Conjugate Gradient Method
import autograd.numpy as np
from autograd import grad, jacobian, hessian

def nablaf(x, f):
    # gradient of f at x (the Jacobian of a scalar function is its gradient)
    j = jacobian(f)
    return j(x)

def nabla2f(x, f):
    # Hessian of f at x
    j = hessian(f)
    return j(x)

def Anablaf(x, f):
    # Hessian-vector product: nabla^2 f(x) @ nabla f(x)
    return nabla2f(x, f)@nablaf(x, f)
def general_descent(x, f, alpha=0.1, error=0.1**8, max_iter=1000):
    stop = False
    iter = 0
    print("Initial x= ", x)
    while stop == False:
        iter += 1
        x = x - alpha*nablaf(x, f)
        if iter % 20 == 1:
            print('Iter =', iter); print('x =', x)
        if iter >= max_iter or np.linalg.norm(nablaf(x, f)) <= error:
            stop = True
    if iter != 1:
        print('Iter =', iter); print('x =', x)
    return x, iter
def steepest_descent(x, f, error=0.1**8, max_iter=1000):
    stop = False
    iter = 0
    print("Initial x= ", x)
    while stop == False:
        iter += 1
        alpha = -(nablaf(x, f)@nablaf(x, f))/(nablaf(x, f)@Anablaf(x, f))  # exact line-search step size
        x = x + alpha*nablaf(x, f)
        if iter % 10 == 1:
            print('Iter =', iter); print('x =', x)
        if iter >= max_iter or np.linalg.norm(nablaf(x, f)) <= error:
            stop = True
    if iter != 1:
        print('Iter =', iter); print('x =', x)
    return x, iter
def newton(x, f, error=0.1**8, max_iter=1000):
    stop = False
    iter = 0
    print("Initial x= ", x)
    while stop == False:
        iter += 1
        alpha = -np.linalg.inv(nabla2f(x, f))  # Newton "step matrix"
        x = x + alpha@nablaf(x, f)
        print('Iter =', iter); print('x =', x)
        if iter >= max_iter or np.linalg.norm(nablaf(x, f)) <= error:
            stop = True
    return x, iter
def conjugate_gradient(x, f, error=0.1**8, max_iter=1000):
    stop = False
    iter = 0
    print("Initial x= ", x)
    d = -nablaf(x, f)  # initial direction
    while stop == False:
        iter += 1
        alpha = -d@nablaf(x, f)/(d@nabla2f(x, f)@d)  # exact line-search step size
        x = x + alpha*d
        print('Iter =', iter); print('x =', x)
        if iter >= max_iter or np.linalg.norm(nablaf(x, f)) <= error:
            stop = True
        d = -nablaf(x, f) + (nabla2f(x, f)@nablaf(x, f)@d/(d@nabla2f(x, f)@d))*d  # next Q-orthogonal direction
    return x, iter
#General descent
def f(x):
    return x[0]**2 + 2*x[1]**2 + 3*x[2]**2 + 1*x[3]**2 + 2*x[4]**2 + 4*x[5]**2 - x[0] - x[1] - x[2] - x[3] - x[4] - x[5]
print('\n=======General Descent Method========')
x = np.array([1, 1, 1, 1/2, 1/2, 1/2], dtype=float)  # initial value
alpha = 0.1  # try other fixed step sizes as well
x, iter = general_descent(x, f, alpha)
print('Total iteration number', iter); print('x =', x)
=======General Descent Method========
Initial x=  [1.  1.  1.  0.5 0.5 0.5]
Iter = 1
x = [0.9 0.7 0.5 0.5 0.4 0.2]
Iter = 21
x = [0.50461169 0.25001645 0.16666667 0.5 0.25000548 0.125 ]
Iter = 41
x = [0.50005317 0.25 0.16666667 0.5 0.25 0.125 ]
Iter = 61
x = [0.50000061 0.25 0.16666667 0.5 0.25 0.125 ]
Iter = 81
x = [0.50000001 0.25 0.16666667 0.5 0.25 0.125 ]
Iter = 83
x = [0.5 0.25 0.16666667 0.5 0.25 0.125 ]
Total iteration number 83
x = [0.5 0.25 0.16666667 0.5 0.25 0.125 ]
#Steepest descent
def f(x):
    return x[0]**2 + 2*x[1]**2 + 3*x[2]**2 + 1*x[3]**2 + 2*x[4]**2 + 4*x[5]**2 - x[0] - x[1] - x[2] - x[3] - x[4] - x[5]
print('\n=======Steepest Descent Method========')
x = np.array([1, 1, 1, 1/2, 1/2, 1/2], dtype=float)  # initial value
x, iter = steepest_descent(x, f)
print('Total iteration number', iter); print('x =', x)
=======Steepest Descent Method========
Initial x=  [1.  1.  1.  0.5 0.5 0.5]
Iter = 1
x = [ 0.82954545 0.48863636 0.14772727 0.5 0.32954545 -0.01136364]
Iter = 11
x = [0.5021882 0.25000004 0.16666667 0.5 0.25000001 0.12447757]
Iter = 21
x = [0.50001314 0.25 0.16666667 0.5 0.25 0.12499686]
Iter = 31
x = [0.50000008 0.25 0.16666667 0.5 0.25 0.12499998]
Iter = 38
x = [0.5 0.25 0.16666667 0.5 0.25 0.125 ]
Total iteration number 38
x = [0.5 0.25 0.16666667 0.5 0.25 0.125 ]
#Newton's method
def f(x):
    return x[0]**2 + 2*x[1]**2 + 3*x[2]**2 + 1*x[3]**2 + 2*x[4]**2 + 4*x[5]**2 - x[0] - x[1] - x[2] - x[3] - x[4] - x[5]
print('\n=======Newton Method========')
x = np.array([1, 1, 1, 1/2, 1/2, 1/2], dtype=float)  # initial value
x, iter = newton(x, f)
print('Total iteration number', iter); print('x =', x)
=======Newton Method========
Initial x=  [1.  1.  1.  0.5 0.5 0.5]
Iter = 1
x = [0.5 0.25 0.16666667 0.5 0.25 0.125 ]
Total iteration number 1
x = [0.5 0.25 0.16666667 0.5 0.25 0.125 ]
#Conjugate gradient
def f(x):
    return x[0]**2 + 2*x[1]**2 + 3*x[2]**2 + 1*x[3]**2 + 2*x[4]**2 + 4*x[5]**2 - x[0] - x[1] - x[2] - x[3] - x[4] - x[5]
print('\n=======Conjugate Gradient Method========')
x = np.array([1, 1, 1, 1/2, 1/2, 1/2], dtype=float)  # initial value
x, iter = conjugate_gradient(x, f)
print('Total iteration number', iter); print('x =', x)
=======Conjugate Gradient Method========
Initial x=  [1.  1.  1.  0.5 0.5 0.5]
Iter = 1
x = [ 0.82954545 0.48863636 0.14772727 0.5 0.32954545 -0.01136364]
Iter = 2
x = [0.68966857 0.26825479 0.11250647 0.5 0.25608493 0.16675298]
Iter = 3
x = [0.57823366 0.21479485 0.1823134 0.5 0.23826495 0.11848053]
Iter = 4
x = [0.5 0.25 0.16666667 0.5 0.25 0.125 ]
Total iteration number 4
x = [0.5 0.25 0.16666667 0.5 0.25 0.125 ]
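All four methods reach the same minimizer, but at very different costs: fixed-step general descent needs 83 iterations, steepest descent with exact line search 38, conjugate gradient 4, and Newton's method just 1 (for a quadratic, the model Newton minimizes is exact). Conjugate gradient finishes in fewer than $n=6$ steps; this is consistent with the fact that CG terminates in at most as many steps as the Hessian has distinct eigenvalues, and here $\nabla^2 f=\mathrm{diag}(2,4,6,2,4,8)$ has only four distinct values.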
If you have any feedback, please share it.
Thank you.