非線形最適化

モデル式 $$\boldsymbol{y}=\boldsymbol{f}(\boldsymbol{x})$$

ここで

  • $\boldsymbol{y} \in \mathbb{R}^n$
  • $\boldsymbol{x} \in \mathbb{R}^m$
  • $n>m$
  • $\boldsymbol{f}:m \rightarrow n$

$\boldsymbol{y}$は各$y_i$に対応する関数$f_i(\boldsymbol{x})$を用いて$[f_1(\boldsymbol{x}),f_2(\boldsymbol{x}),\cdots]^\mathrm{T}$とも書ける

誤差評価関数($m$次元ベクトル→スカラー量を出力) $$C(\boldsymbol{x}) = \epsilon $$

ナブラ演算子

形式的には以下で表す

$$ \nabla = \begin{pmatrix} \frac{\partial}{\partial x_1} \\ \frac{\partial}{\partial x_2} \\ \vdots \\ \frac{\partial}{\partial x_m} \\ \end{pmatrix} \in \mathbb{R}^m $$$$ \nabla^2 = \nabla\nabla^\mathrm{T} \in \mathbb{R}^{m \times m} $$

物理学だと$\nabla^2=\nabla^\mathrm{T}\nabla \in \mathbb{R}$としてラプラシアンが定義されているが、最適化関係では上記の記述が使われることが多い.

ナブラ演算子を用いた重要な行列

ヤコビアン $$J(\boldsymbol{x}) = \boldsymbol{f}(\boldsymbol{x})\nabla^\mathrm{T} = \begin{pmatrix} \nabla f_1(\boldsymbol{x})^\mathrm{T} \\ \nabla f_2(\boldsymbol{x})^\mathrm{T} \\ \vdots \\ \nabla f_n(\boldsymbol{x})^\mathrm{T} \\ \end{pmatrix} \in \mathbb{R}^{n \times m} $$

勾配ベクトル $$\boldsymbol{g} = \nabla C(\boldsymbol{x}) \in \mathbb{R}^m$$

ヘッシアン $$\boldsymbol{H} = \nabla^2 C(\boldsymbol{x}) \in \mathbb{R}^{m \times m}$$

ベクトル値関数$\boldsymbol{f}(\boldsymbol{x})\in \mathbb{R}^m$の1次テーラー展開 $$\boldsymbol{f}(\boldsymbol{x_0}+\Delta\boldsymbol{x}) = \boldsymbol{f}(\boldsymbol{x_0}) + \boldsymbol{J}\Delta\boldsymbol{x} $$ または $$\boldsymbol{f}(\boldsymbol{x}) - \boldsymbol{f}(\boldsymbol{x_0}) = \Delta\boldsymbol{f}(\boldsymbol{x})= \boldsymbol{J}\Delta\boldsymbol{x} \\ $$

スカラー値関数$C(\boldsymbol{x}) \in \mathbb{R}$の2次テーラー展開 $$C(\boldsymbol{x}_0+\Delta\boldsymbol{x})= C(\boldsymbol{x}_0) + \langle \boldsymbol{g}\cdot\Delta\boldsymbol{x} \rangle+ \frac{1}{2}\langle \Delta\boldsymbol{x}\cdot\boldsymbol{H}\Delta\boldsymbol{x} \rangle \\ = C(\boldsymbol{x}_0) + \boldsymbol{g}^\mathrm{T}\Delta\boldsymbol{x} + \frac{1}{2}(\Delta\boldsymbol{x})^\mathrm{T}\boldsymbol{H}\Delta\boldsymbol{x}$$ または $$C(\boldsymbol{x}) - C(\boldsymbol{x}_0) = \Delta C(\boldsymbol{x}) = \boldsymbol{g}^\mathrm{T}\Delta\boldsymbol{x} + \frac{1}{2}(\Delta\boldsymbol{x})^\mathrm{T}\boldsymbol{H}\Delta\boldsymbol{x}$$

※テーラー展開は用途に応じて2つの形式を取ることができる。

  1. $f(x)=f(a)+f'(a)(x-a)+\frac{1}{2}f''(a)(x-a)^2+\cdots$
    • ただし$x\approx a$の場合のみ
    • $f(x)$を$a$近傍にて線形化したい場合に用いる
    • 主に解析計算で使う
  2. $f(a+h)=f(a)+f'(a)h+\frac{1}{2}f''(a)h^2+\cdots$
    • $x=a$近傍における$f(x)$の微小変分を求めたい場合に用いる。
    • 主に数値計算で使う

最小二乗法

誤差評価関数が残差二乗和で与えられるときに有効.

$$C(\boldsymbol{x})= (\boldsymbol{y}-f(\boldsymbol{x}))^{\mathrm{T}}(\boldsymbol{y}-f(\boldsymbol{x})) $$

初期値$\boldsymbol{x}=\boldsymbol{x_0}$を仮定し、モデル式を1次テーラー展開すると $$f(\boldsymbol{x}+\Delta\boldsymbol{x})-f(\boldsymbol{x_0})=\boldsymbol{J}\Delta{x}$$

この方程式について解く $$ \Delta\hat{\boldsymbol{x}} = (\boldsymbol{J}^\mathrm{T}\boldsymbol{J})^{-1}\boldsymbol{J}^\mathrm{T}(\boldsymbol{y}-\boldsymbol{x}) $$ この解を使って次の$\boldsymbol{x}_0$を決めて反復計算すればよい。

ちなみにこの解はガウス・ニュートン法の近似になっているみたい。 数式的には以下の近似が成り立つ。 $$\nabla C(\boldsymbol{x})=\boldsymbol{J}^\mathrm{T}\boldsymbol{f}(\boldsymbol{x})$$ $$\nabla^2 C(\boldsymbol{x})=\boldsymbol{J}^\mathrm{T}\boldsymbol{J}$$

なので、適当な係数$c$を用いて$\boldsymbol{J}^\mathrm{T}\boldsymbol{J} =\boldsymbol{J}^\mathrm{T}\boldsymbol{J} + \boldsymbol{I}c $として反復計算をすればレーベンバーグマーカート法になる。

最急降下法

評価関数の微小変分は1次のテーラー展開を用いて $$\Delta C(\boldsymbol{x})= \boldsymbol{g}^\mathrm{T} \Delta\boldsymbol{x}$$

ここで右辺は評価関数が増加する方向を向いたベクトルになる。 だから解の更新則はある適当な係数$\alpha$を用いて、評価関数が減少する方向に動かせばいい $$\Delta \hat{\boldsymbol{x}} = -\alpha \boldsymbol{g} \Delta \boldsymbol{x}$$

ガウス・ニュートン法

評価関数の微小変分は2次までのテーラー展開を用いて $$C(\boldsymbol{x}_0+\Delta\boldsymbol{x}) = C(\boldsymbol{x}_0) + \boldsymbol{g}^\mathrm{T}\Delta\boldsymbol{x} + \frac{1}{2}(\Delta\boldsymbol{x})^\mathrm{T}\boldsymbol{H}\Delta\boldsymbol{x}$$

だから評価関数の未知数ベクトル$\boldsymbol{x}$による微分は $$\frac{\Delta C(\boldsymbol{x})}{\Delta{\boldsymbol{x}}}= \boldsymbol{g} + \boldsymbol{H} \Delta\boldsymbol{x}$$

※行列をベクトルで微分する公式を使う $$\frac{\partial}{\partial \boldsymbol{x}}\boldsymbol{Ax} = \boldsymbol{A}^\mathrm{T} \\ \frac{\partial}{\partial \boldsymbol{x}}\boldsymbol{x}^\mathrm{T}\boldsymbol{A} = \boldsymbol{A} \\ \frac{\partial}{\partial \boldsymbol{x}}\boldsymbol{x}^\mathrm{T}\boldsymbol{Ax} = (\boldsymbol{A}+\boldsymbol{A}^\mathrm{T})\boldsymbol{x} $$

結果が極値を取るときが最適解だから $$ \Delta\hat{\boldsymbol{x}} = -\boldsymbol{H}^{-1} \boldsymbol{g} $$

この解を使って次の$\boldsymbol{x}_0$を決めて反復計算すればよい。

レーベンバーグ・マーカート法

ガウス・ニュートン法の解の更新則を考察する。 $$ \Delta\hat{\boldsymbol{x}} = -\boldsymbol{H}^{-1} \boldsymbol{g} $$ この更新則は最適解近傍において$\Delta\boldsymbol{x}\approx\boldsymbol{0}$になる。 ということはヘッセ行列の値によらず、勾配$\boldsymbol{g}\approx\boldsymbol{0}$になるから、ヘッセ行列は反復において厳密である必要は実はない。

ではヘッセ行列はどのような影響があるのかというと、収束速度に影響がある。 なので、逆にヘッセ行列を調節してあげれば収束速度と精度を両立した最適化が行える。 具体的には更新則を以下の式の解とする。 $$ (\boldsymbol{H}+\boldsymbol{I}c)\Delta\boldsymbol{x} = - \boldsymbol{g} $$

この式を解いた結果を要素ごとに見るとざっくり以下のような形をしている。 $$ \iff \Delta\boldsymbol{x}[i] = -\frac{1}{c}\boldsymbol{g}[i] + \epsilon $$

ここで$\epsilon$はヘッセ行列の対角要素に影響を受けない項である。 cが大きければ、収束は遅いが更新幅が小さく確実に解に近づく。 cが小さければ収束は早いが更新幅が大きく、解を通り越してしまう可能性がある。

無限小回転

無限小回転は以下の式。 $$\boldsymbol{R} = e^{\boldsymbol{A}} = \boldsymbol{I} + \boldsymbol{A} + \frac{1}{2}\boldsymbol{A}^2 + \cdots + \frac{1}{k!}\boldsymbol{A}^k$$

また,回転行列$\boldsymbol{R}_0$の微小変化は以下で与えられる。 $$\boldsymbol{R}_0+\Delta\boldsymbol{R} = (\boldsymbol{I}+\boldsymbol{A}+\cdots)\boldsymbol{R}_0 \approx \boldsymbol{R}_0+\boldsymbol{AR}_0$$

ここで、$\boldsymbol{A}$は無限小回転の生成子。 $$A= \begin{pmatrix} 0 & -l_z & l_y \\ l_z & 0 & -l_x \\ -l_y & l_x & 0 \\ \end{pmatrix} $$

生成子は3パラメータで記述できる。3パラメータ並べたベクトルは無限小回転ベクトルという $$\boldsymbol{l} = [l_x,l_y,l_z]^\mathrm{T}$$

無限小回転の意味

  • $\boldsymbol{l}$軸回りに
  • $|\boldsymbol{l}| = \sqrt{l_x^2+l_y^2+l_z^2}$ ラジアン回転する

この微小回転が微小時間$\delta t$の間に起きるものだとすると、無限小回転ベクトルは角速度ベクトルを用いて以下で表せる. $$\boldsymbol{l} = \boldsymbol{\omega} \delta t$$

無限小回転の導出はロドリゲス回転から考えると楽. なぜならロドリゲスの回転公式は証明簡単だから. ロドリゲスの回転公式で得られる回転行列について、回転角度が微小、すなわち$sin\theta \approx \theta, cos\theta \approx 1$と近似すれば無限小回転の1次の項まで得られる。 近似を高次化すれば高次の項も導出できるはず。

回転行列を含む評価関数の最適化

回転行列を含む評価関数$C(\boldsymbol{R})$を最小にするような回転行列を最適化計算によって求める。 ある適当な回転行列$\boldsymbol{R}_0$近傍における評価関数を無限小回転ベクトルを使ってパラメータ化する。 $$C(\boldsymbol{R}_0)=C(\boldsymbol{\omega})=C(\boldsymbol{R}_0 + \boldsymbol{A_{\omega=0} R}_0)$$ ここで無限小回転ベクトル$\boldsymbol{\omega}$によってパラメータ化される無限小回転の生成子を$A_\omega$で表す。 上記の式は、ある回転行列$\boldsymbol{R}_0$近傍において評価関数は$\boldsymbol{\omega}$の関数であって、たまたま$\boldsymbol{\omega}=\boldsymbol{0}$の評価値である、という意味。

すると、$\boldsymbol{R}_0$近傍で評価関数$C$は$\boldsymbol{\omega}=\boldsymbol{\omega}_0=\boldsymbol{0}$近傍の微小な無限小回転ベクトル$\Delta\boldsymbol{\omega}$を用いて線形化できる。

$$C(\boldsymbol{\omega}) = C(\boldsymbol{\omega}_0) + \boldsymbol{g}^\mathrm{T}\Delta\boldsymbol{\omega} + \frac{1}{2}(\Delta\boldsymbol{\omega})^\mathrm{T}\boldsymbol{H}\Delta\boldsymbol{\omega}$$

この線形化した評価関数に対する解の更新則は $$\Delta\boldsymbol{\omega}=-\boldsymbol{H}^{-1}\boldsymbol{g}$$ $$\boldsymbol{R}_{1} \leftarrow \mathrm{exp}(\boldsymbol{A}_{\omega = \Delta\omega})\boldsymbol{R}_0$$

プログラムサンプル

数値微分による勾配ベクトル・ヘッセ行列の近似

In [41]:
from numpy import *
from matplotlib.pyplot import *
from mpl_toolkits.mplot3d import Axes3D
import sympy as sp
from IPython.display import display
sp.init_printing()

import matplotlib as mpl
zoom=1.3
mpl.rcParams['figure.figsize'] = [6.4 *zoom , 4.8*zoom]
mpl.rcParams['figure.dpi'] = 80*zoom
mpl.rcParams['lines.linewidth'] = 2.0
ion()

def fprime(f,x0,d):
    # d : [0,0,...,0,dx,0,...,0]
    f0 = f(x0)
    xn = x0+d
    fn = f(xn)
    gn = (fn-f0)/(sum(d))
    return gn

def fprime2(f,x0,d0,d):
    f0 = fprime(f,x0,d0)
    fn = fprime(f,x0,d0+d)
    gn = (fn-f0)/sum(d)

def grad(f,x0,delta):
    # delta : [dx1,dx2,dx3,...]
    g = array([fprime(f,x0,d) for d in diag(delta)])
    return g

def hessian(f,x0,delta):
    # delta : [dx1,dx2,dx3,...]
    D = diag(delta)
    H = zeros(D.shape)
    for i,di in enumerate(D):
        for j,dj in enumerate(D):
            g0 = grad(f,x0,delta)
            gi0 = g0[i]
            gj0 = g0[j]
            gin = grad(f,x0+dj,delta)[i]
            gjn = grad(f,x0+di,delta)[j]
            H[i,j] = (gin-gi0)/(2*sum(dj)) + (gjn-gj0)/(2*sum(di))
    return H

def hessian2(f,x0,delta):
    def fprime2(d1,d2):
        f00 = f(x0)
        f10 = f(x0+d1)
        f01 = f(x0+d2)
        f11 = f(x0+d1+d2)
        gn  = (f11-f10-f01+f00)/(sum(d1)*sum(d2))
        return gn
    D = diag(delta)
    H = zeros(D.shape)
    for i,d1 in enumerate(D):
        for j,d2 in enumerate(D):
            H[i,j] = fprime2(d1,d2)
    return H
In [42]:
sx1,sx2=sp.symbols("x1,x2")
sX = sp.Matrix([[sx1],[sx2]])
sf = 2*sp.sqrt(sx1)+3*sx1*sx2+4*sp.sqrt(sx2)
sg = sp.derive_by_array(sf,sX)
sH = sp.derive_by_array(sg,sX)
print("f(x)=")
display(sf)
print("g(x)=")
display(sg)
print("H(x)=")
display(sH)

sx1_ , sx2_ = (5,5)
gref = array(sp.lambdify((sx1,sx2),sg)(sx1_,sx2_)).squeeze()
Href = array(sp.lambdify((sx1,sx2),sH)(sx1_,sx2_)).squeeze()
  
print("x=",sx1_,sx2_)
print("g=",gref)
print("H=",Href)
f(x)=
$\displaystyle 2 \sqrt{x_{1}} + 3 x_{1} x_{2} + 4 \sqrt{x_{2}}$
g(x)=
$\displaystyle \left[\begin{matrix}3 x_{2} + \frac{1}{\sqrt{x_{1}}}\\3 x_{1} + \frac{2}{\sqrt{x_{2}}}\end{matrix}\right]$
H(x)=
$\displaystyle \left[\begin{matrix}\left[\begin{matrix}- \frac{1}{2 x_{1}^{\frac{3}{2}}}\\3\end{matrix}\right]\\\left[\begin{matrix}3\\- \frac{1}{x_{2}^{\frac{3}{2}}}\end{matrix}\right]\end{matrix}\right]$
x= 5 5
g= [15.4472136  15.89442719]
H= [[-0.04472136  3.        ]
 [ 3.         -0.08944272]]
In [43]:
f = lambda x : 2*sqrt(x[0])+3*x[0]*x[1]+4*sqrt(x[1])
y = array([5.0,5.0])

dt = 0.0001
delta=array([dt,dt])
g = grad(f,y,delta)
H = hessian(f,y,delta)

print("g-1 : numerical diff = {0:.5f} , ref = {1:.5f}".format(g[0],gref[0]))
print("g-2 : numerical diff = {0:.5f} , ref = {1:.5f}".format(g[1],gref[1]))
print("H-11 : numerical diff = {0:.5f} , ref = {1:.5f}".format(H[0,0],Href[0,0]))
print("H-22 : numerical diff = {0:.5f} , ref = {1:.5f}".format(H[1,1],Href[1,1]))
print("H-12 : numerical diff = {0:.5f} , ref = {1:.5f}".format(H[0,1],Href[0,1]))
print("H-21 : numerical diff = {0:.5f} , ref = {1:.5f}".format(H[1,0],Href[1,0]))
g-1 : numerical diff = 15.44721 , ref = 15.44721
g-2 : numerical diff = 15.89442 , ref = 15.89443
H-11 : numerical diff = -0.04472 , ref = -0.04472
H-22 : numerical diff = -0.08944 , ref = -0.08944
H-12 : numerical diff = 3.00000 , ref = 3.00000
H-21 : numerical diff = 3.00000 , ref = 3.00000

無限小回転

In [44]:
#  seed data
thz = deg2rad(15)
thy = deg2rad(40)
thx = deg2rad(65)
Xref = np.array([[0,-1,-1,0,1,1,0],[0.8,0,-1,-0.5,-1,0,0.8],[0,0,0,0,0,0,0]]) 

# refdata
Rz = array([[cos(thz),-sin(thz),0],[sin(thz),cos(thz),0],[0,0,1]])
Ry = array([[cos(thy),0,sin(thy)],[0,1,0],[-sin(thy),0,cos(thy)]])
Rx = array([[1,0,0],[0,cos(thx),-sin(thx)],[0,sin(thx),cos(thx)]])
Rref= Rz@Ry@Rx
N=Xref.shape[1]
Yref= Rref@Xref

ion()
def getfig():
    fig = figure()
    ax = fig.add_subplot(111,projection="3d")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_zlabel("z")
    ax.set_xlim(-1,1)
    ax.set_ylim(-1,1)
    ax.set_zlim(-1,1)
    return fig,ax

def plot_ref():
    fig,ax = getfig()
    ax.plot(Xref[0,:],Xref[1,:],Xref[2,:],'ob:',label="Xref")
    ax.plot(Yref[0,:],Yref[1,:],Yref[2,:],'or:',label="Yref")
    legend()
    title("ref data")
    fig.show()
plot_ref()
C:\Users\acketred\anaconda3\lib\site-packages\ipykernel_launcher.py:33: UserWarning: Matplotlib is currently using module://ipykernel.pylab.backend_inline, which is a non-GUI backend, so cannot show the figure.
In [45]:
 #  infsimal define
 A = lambda w : array([[0,-w[2],w[1]],[w[2],0,-w[0]],[-w[1],w[0],0]])
 def infrot(w,R,dim=1):
     Rot = np.eye(3)
     Aw = A(w)
     for i in range(dim):
         k = i+1
         Rot = Rot + (1/math.factorial(k))*Aw
         Aw = Aw@Aw
     return Rot@R

strs = ["around X","around Y","around Z"]
for n,s in zip(np.eye(3)*deg2rad(30),strs):
    Rn = infrot(n,np.eye(3))
    Yn = Rn@Xref
    fig,ax = getfig()
    ax.plot(Xref[0,:],Xref[1,:],Xref[2,:],'ob:',label="src")
    ax.plot(Yn[0,:],Yn[1,:],Yn[2,:],'ok:',label="tgt")
    title(s)
    legend()
    fig.show()
C:\Users\acketred\anaconda3\lib\site-packages\ipykernel_launcher.py:21: UserWarning: Matplotlib is currently using module://ipykernel.pylab.backend_inline, which is a non-GUI backend, so cannot show the figure.
C:\Users\acketred\anaconda3\lib\site-packages\ipykernel_launcher.py:21: UserWarning: Matplotlib is currently using module://ipykernel.pylab.backend_inline, which is a non-GUI backend, so cannot show the figure.
C:\Users\acketred\anaconda3\lib\site-packages\ipykernel_launcher.py:21: UserWarning: Matplotlib is currently using module://ipykernel.pylab.backend_inline, which is a non-GUI backend, so cannot show the figure.

回転最適化

In [46]:
#  noised data
X = Xref
Y = Yref + random.randn(3,N)*0.05

fig,ax = getfig()
ax.plot(Xref[0,:],Xref[1,:],Xref[2,:],'ob:',label="Xref")
ax.plot(Yref[0,:],Yref[1,:],Yref[2,:],'or:',label="Yref")
ax.plot(Y[0,:],Y[1,:],Y[2,:],'oc:',label="Y noised data")
legend()
fig.show()

def Cost(Rot):
    return sum(sqrt(sum((Y-Rot@X)**2,1)))

def bind_cost_by_Rot(Rot):
    def cost_by_w(w):
        R=infrot(w,Rot)
        return Cost(R)
    return cost_by_w
C:\Users\acketred\anaconda3\lib\site-packages\ipykernel_launcher.py:10: UserWarning: Matplotlib is currently using module://ipykernel.pylab.backend_inline, which is a non-GUI backend, so cannot show the figure.
  # Remove the CWD from sys.path while we load stuff.
In [47]:
# optimize
Cmin = Cost(Rref)
print("Cmin = {0:f}".format(Cmin))
R = eye(3)
w0 = array([.0,.0,.0])
delta = array([1.0,1.0,1.0])*deg2rad(0.1)
LMcoef = 1
LMratio = 10
data = []
for i in range(100):
    print("-------iter = {0:02d} -------".format(i))
    print("LMcoef = {0:.1e}".format(LMcoef))

    CR = Cost(R)
    CRw = Cost(infrot(delta,R))
    DeltaC = CRw-CR
    print("Cost = {0:.3f}, Cost+w = {1:.3f} , deltaC = {2:.2e}".format(CR,CRw,DeltaC))

    costfun = bind_cost_by_Rot(R)
    g = grad(costfun,w0,delta)
    H = hessian(costfun,w0,delta)
    # LMH = diag(diag(H))
    LMH = np.eye(3)

    CRw_appx = Cost(R) + dot(g,delta) + dot(delta,H@delta)
    C_appxerr = CRw_appx - CRw
    print("approx = {0:.3f}, err = {1:.2e}".format(CRw_appx,C_appxerr))

    def linsolve(H_,g_):
        dw_ = linalg.solve(H_,-g_)
        R_ = infrot(dw_,R,dim=3)
        C_ = Cost(R_)
        return dw_,R_,C_
        
    def LMsolve(LMcoef):
        c0 = LMcoef
        ck = LMcoef/LMratio
        dw0,R0,C0 = linsolve(H+c0*LMH,g)
        dwk,Rk,Ck = linsolve(H+ck*LMH,g)
        print("cost_stay = {0:.3f} @c = {1:.1e} , cost_try = {2:.3f} @c = {3:.1e} , cost_before = {4:.3f}".format(C0,c0,Ck,ck,CR))

        cost_has_increased_on_try = Ck > CR
        cost_has_increased_on_stay = C0 > CR
        print("cost has increased on try? = ",cost_has_increased_on_try)
        print("cost has increased on stay? = ",cost_has_increased_on_stay)
        if (cost_has_increased_on_try) and (not cost_has_increased_on_stay):
            print("accept stay")
            return c0,dw0,R0
        if not cost_has_increased_on_try:
            print("accept try")
            return ck,dwk,Rk
        if cost_has_increased_on_stay:
            print("retry")
            return LMsolve(c0*LMratio)
    
    LMcoef_new,dw,R_new = LMsolve(LMcoef)
    Ddw = linalg.norm(dw)
    data.append({
        "itr":i,
        "Cost":CR,"dCost":CRw,"DeltaC":DeltaC,"cfun":costfun,
        "g":g,"H":H,"dCostAppx":CRw_appx,"appx_err":C_appxerr,
        "LMcoef":LMcoef,"LMcoef_new":LMcoef_new,"dw":dw,"Ddw":Ddw,
        "Rnew":R_new,"R":R,
    })
    LMcoef = LMcoef_new
    R = R_new
    if Ddw < 1e-16:
        break

print("")
print("Rref=",Rref)
print("Rest=",R)
vitr = [dec["itr"] for dec in data]
vDdw = [dec["Ddw"] for dec in data]
vCost = [dec["Cost"] for dec in data]
vLMcoef = [dec["LMcoef"] for dec in data]
vR = [dec["R"]@Xref for dec in data]
Cmin = 0.497224
-------iter = 00 -------
LMcoef = 1.0e+00
Cost = 3.746, Cost+w = 3.742 , deltaC = -3.69e-03
approx = 3.742, err = 1.92e-05
cost_stay = 4.707 @c = 1.0e+00 , cost_try = 39177.665 @c = 1.0e-01 , cost_before = 3.746
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 3.305 @c = 1.0e+01 , cost_try = 4.707 @c = 1.0e+00 , cost_before = 3.746
cost has increased on try? =  True
cost has increased on stay? =  False
accept stay
-------iter = 01 -------
LMcoef = 1.0e+01
Cost = 3.305, Cost+w = 3.300 , deltaC = -4.97e-03
approx = 3.300, err = 2.09e-05
cost_stay = 2.652 @c = 1.0e+01 , cost_try = 10.383 @c = 1.0e+00 , cost_before = 3.305
cost has increased on try? =  True
cost has increased on stay? =  False
accept stay
-------iter = 02 -------
LMcoef = 1.0e+01
Cost = 2.652, Cost+w = 2.646 , deltaC = -6.17e-03
approx = 2.646, err = 2.54e-05
cost_stay = 1.781 @c = 1.0e+01 , cost_try = 16.680 @c = 1.0e+00 , cost_before = 2.652
cost has increased on try? =  True
cost has increased on stay? =  False
accept stay
-------iter = 03 -------
LMcoef = 1.0e+01
Cost = 1.781, Cost+w = 1.774 , deltaC = -6.94e-03
approx = 1.774, err = 3.89e-05
cost_stay = 0.856 @c = 1.0e+01 , cost_try = 5.862 @c = 1.0e+00 , cost_before = 1.781
cost has increased on try? =  True
cost has increased on stay? =  False
accept stay
-------iter = 04 -------
LMcoef = 1.0e+01
Cost = 0.856, Cost+w = 0.850 , deltaC = -6.16e-03
approx = 0.850, err = 9.89e-05
cost_stay = 0.486 @c = 1.0e+01 , cost_try = 1.363 @c = 1.0e+00 , cost_before = 0.856
cost has increased on try? =  True
cost has increased on stay? =  False
accept stay
-------iter = 05 -------
LMcoef = 1.0e+01
Cost = 0.486, Cost+w = 0.487 , deltaC = 3.84e-04
approx = 0.487, err = 2.18e-04
cost_stay = 0.479 @c = 1.0e+01 , cost_try = 0.479 @c = 1.0e+00 , cost_before = 0.486
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 06 -------
LMcoef = 1.0e+00
Cost = 0.479, Cost+w = 0.479 , deltaC = -1.33e-04
approx = 0.479, err = 2.47e-04
cost_stay = 0.479 @c = 1.0e+00 , cost_try = 0.479 @c = 1.0e-01 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 07 -------
LMcoef = 1.0e-01
Cost = 0.479, Cost+w = 0.479 , deltaC = -3.58e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-01 , cost_try = 0.479 @c = 1.0e-02 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 08 -------
LMcoef = 1.0e-02
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.93e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-02 , cost_try = 0.479 @c = 1.0e-03 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 09 -------
LMcoef = 1.0e-03
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.87e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-03 , cost_try = 0.479 @c = 1.0e-04 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 10 -------
LMcoef = 1.0e-04
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.87e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-04 , cost_try = 0.479 @c = 1.0e-05 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 11 -------
LMcoef = 1.0e-05
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.87e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-05 , cost_try = 0.479 @c = 1.0e-06 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 12 -------
LMcoef = 1.0e-06
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.87e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-06 , cost_try = 0.479 @c = 1.0e-07 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 13 -------
LMcoef = 1.0e-07
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.87e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-07 , cost_try = 0.479 @c = 1.0e-08 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 14 -------
LMcoef = 1.0e-08
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.87e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-08 , cost_try = 0.479 @c = 1.0e-09 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 15 -------
LMcoef = 1.0e-09
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.87e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-09 , cost_try = 0.479 @c = 1.0e-10 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 16 -------
LMcoef = 1.0e-10
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.87e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-10 , cost_try = 0.479 @c = 1.0e-11 , cost_before = 0.479
cost has increased on try? =  False
cost has increased on stay? =  False
accept try
-------iter = 17 -------
LMcoef = 1.0e-11
Cost = 0.479, Cost+w = 0.479 , deltaC = -2.87e-05
approx = 0.479, err = 2.45e-04
cost_stay = 0.479 @c = 1.0e-11 , cost_try = 0.479 @c = 1.0e-12 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-10 , cost_try = 0.479 @c = 1.0e-11 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-09 , cost_try = 0.479 @c = 1.0e-10 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-08 , cost_try = 0.479 @c = 1.0e-09 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-07 , cost_try = 0.479 @c = 1.0e-08 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-06 , cost_try = 0.479 @c = 1.0e-07 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-05 , cost_try = 0.479 @c = 1.0e-06 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-04 , cost_try = 0.479 @c = 1.0e-05 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-03 , cost_try = 0.479 @c = 1.0e-04 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-02 , cost_try = 0.479 @c = 1.0e-03 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e-01 , cost_try = 0.479 @c = 1.0e-02 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e+00 , cost_try = 0.479 @c = 1.0e-01 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e+01 , cost_try = 0.479 @c = 1.0e+00 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e+02 , cost_try = 0.479 @c = 1.0e+01 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e+03 , cost_try = 0.479 @c = 1.0e+02 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  True
retry
cost_stay = 0.479 @c = 1.0e+04 , cost_try = 0.479 @c = 1.0e+03 , cost_before = 0.479
cost has increased on try? =  True
cost has increased on stay? =  False
accept stay

Rref= [[ 0.73994211  0.45333139  0.49696712]
 [ 0.19826689  0.5589964  -0.80511693]
 [-0.64278761  0.69427204  0.32374437]]
Rest= [[ 0.75879988  0.42453021  0.4980098 ]
 [ 0.20318742  0.56653349 -0.80374073]
 [-0.62257474  0.71165696  0.34356955]]
In [48]:
def plot_Cost():
    figure()
    plot(vitr,vCost,".-",label="cost")
    plot(vitr,ones(len(vitr))*Cmin,label="theoretically min cost")
    title("cost")
    legend()
    xlabel("iter")
    ylabel("Cost")
    grid()

    figure()
    plot(vitr,vCost,".-",label="cost")
    plot(vitr,ones(len(vitr))*Cmin,label="theoretically cost")
    title("cost (zoom)")
    xlabel("iter")
    ylabel("Cost")
    ylim(0,vCost[-1]*1.5)
    legend()
    grid()
plot_Cost()

def plot_Ddw():
    figure()
    plot(vitr,[rad2deg(r) for r in vDdw],".-")
    title("|dw|")
    xlabel("iter")
    ylabel("|dw| deg")
    grid()
plot_Ddw()

def plot_LMcoef():
    figure()
    plot(vitr,vLMcoef,".-")
    yscale("log")
    grid()
    xlabel("iter")
    ylabel("LM coef")
    title("LMcoef")
plot_LMcoef()

def plot_R():
    fig,ax = getfig()
    for Xest in vR:
        ax.plot(Xest[0,:],Xest[1,:],Xest[2,:],'.k:',label="optimize trace")
    ax.plot(Xref[0,:],Xref[1,:],Xref[2,:],'ob:',label="initial condition")
    ax.plot(Yref[0,:],Yref[1,:],Yref[2,:],'or:',label="Yref")
    title("est")

    fig,ax = getfig()
    for Xest in vR:
        ax.plot(Xest[0,:],Xest[1,:],Xest[2,:],'.k:',label="optimize trace")
    ax.plot(Xref[0,:],Xref[1,:],Xref[2,:],'ob:',label="initial condition")
    ax.plot(Yref[0,:],Yref[1,:],Yref[2,:],'or:',label="Yref")
    title("est")
    ax.view_init(0,45)
plot_R()
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]: