Matlab実現FR共役勾配法


この間無拘束最適化法を学び,今日Matlabを用いて無拘束最適化問題を解くFR共役勾配法を実現した.共役勾配法の理論紹介については,私のもう一つの文章の無拘束最適化法を参照してノートを学習してください.
ファイルmは共役勾配法関数の試験に用いられる.テストファイルは、関数fと引数xを定義し、反復初期値x 0と許容誤差を与える必要があります.ϵ .関数にshow_が設定されていますdetail変数は、各ステップの反復情報を表示するかどうかを制御するために使用されます.
% test conjungate gradient method
% by TomHeaven, [email protected], 2015.08.25

%% define function and variable
syms x1 x2;
%f = xs^2+2*ys^2-2*xs*ys + 2*ys + 2;
f = (x1-1)^4 + (x1 - x2)^2;
%f = (1-x1)^2 + 2*(x2 - x1^2)^2;
x = {x1, x2};

% initial value
x0 = [0 0];
% tolerance
epsilon = 1e-1;

%% call conjungate gradient method
show_detail = true;
[bestf, bestx, count] = conjungate_gradient(f, x, x0, epsilon, show_detail);
% print result
fprintf('bestx = %s, bestf = %f, count = %d
'
, num2str(bestx), bestf, count);

ファイルconjungate_gradient.mは共役勾配法の実現関数である.変数nfは関数fの勾配∇f(勾配のギリシャ文字はnablaであるためnf)を表す.
function [fv, bestx, iter_num] = conjungate_gradient(f, x, x0, epsilon, show_detail)
%% conjungate gradient method
% by TomHeaven, [email protected], 2015.08.25
% Input: 
%   f - syms function
%   x - row cell arrow for input syms variables
%   $x_0$ - init point
%   epsilon - tolerance
%   show_detail - a boolean value for wether to print details
% Output:
%   fv - minimum f value
%   bestx - mimimum point
%   iter_num - iteration count

%% init
syms lambdas  % suffix s indicates this is a symbol variable

% n is the dimension
n = length(x);

% compute differential of function f stored in cell nf
nf = cell(1, n); % using row cells, column cells will result in error
for i = 1 : n
    nf{i} = diff(f, x{i});
end

% $
abla f(x_0)$
nfv = subs(nf, x, x0); % init $
abla f(x_k)$
nfv_pre = nfv; % init count, k and xv for x value. count = 0; k = 0; xv = x0; % initial search direction d = - nfv; % show initial info if show_detail fprintf('Initial:
'
); fprintf('f = %s, x0 = %s, epsilon = %f

'
, char(f), num2str(x0), epsilon); end %% loop while (norm(nfv) > epsilon) %% one-dimensional search % define $x_{k+1} = x_{k} + \lambda d$ xv = xv+lambdas*d; % define $\phi$ and do 1-dim search phi = subs(f, x, xv); nphi = diff(phi); % $
abla \phi$
lambda = solve(nphi); % get rid of complex and minus solution lambda = double(lambda); if length(lambda) > 1 lambda = lambda(abs(imag(lambda)) < 1e-5); lambda = lambda(lambda > 0); lambda = lambda(1); end % if $\lambda$ is too small, stop iteration if lambda < 1e-5 break; end %% update % update $x_{k+1} = x_{k} + \lambda d$ xv = subs(xv, lambdas, lambda); % convert sym to double xv = double(xv); % compute the differential nfv = subs(nf, x, xv); % update counters count = count + 1; k = k + 1; % compute alpha based on FR formula alpha = sumsqr(nfv) / sumsqr(nfv_pre); % show iteration info if show_detail fprintf('Iteration: %d
'
, count); fprintf('x(%d) = %s, lambda = %f
'
, count, num2str(xv), lambda); fprintf('nf(x) = %s, norm(nf) = %f
'
, num2str(double(nfv)), norm(double(nfv))); fprintf('d = %s, alpha = %f
'
, num2str(double(d)), double(alpha)); fprintf('
'
); end % update conjungate direction d = -nfv + alpha * d; % save the previous $$
abla f(x_k)$$
nfv_pre = nfv; % reset the conjungate direction and k if k >= n if k >= n k = 0; d = - nfv; end end % while %% output fv = double(subs(f, x, xv)); bestx = double(xv); iter_num = count; end

testConjungateGradientを実行すると、次のように出力されます.
>> testConjungateGradient
Initial:
f = (x1 - x2)^2 + (x1 - 1)^4, x0 = 0  0, epsilon = 0.100000

Iteration: 1
x(1) = 0.41025           0, lambda = 0.102561
nf(x) = 1.08e-16    -0.82049, norm(nf) = 0.820491
d = 4  0, alpha = 0.042075

Iteration: 2
x(2) = 0.52994     0.58355, lambda = 0.711218
nf(x) = -0.52265     0.10721, norm(nf) = 0.533528
d = 0.1683     0.82049, alpha = 0.422831

Iteration: 3
x(3) = 0.63914     0.56115, lambda = 0.208923
nf(x) = -0.031994    -0.15597, norm(nf) = 0.159223
d = 0.52265    -0.10721, alpha = 0.089062

Iteration: 4
x(4) = 0.76439     0.79465, lambda = 1.594673
nf(x) = -0.11285    0.060533, norm(nf) = 0.128062
d = 0.078542     0.14643, alpha = 0.646892

Iteration: 5
x(5) = 0.79174     0.77998, lambda = 0.242379
nf(x) = -0.012614   -0.023517, norm(nf) = 0.026686
d = 0.11285   -0.060533, alpha = 0.043425

bestx = 0.79174     0.77998, bestf = 0.002019, count = 5

許容誤差の変更
epsilon = 1e-8;

より正確な結果が得られます.
Iteration: 6
x(6) = 0.9026      0.9122, lambda = 6.329707
nf(x) = -0.022884    0.019188, norm(nf) = 0.029864
d = 0.017515    0.020888, alpha = 1.252319

Iteration: 7
x(7) = 0.90828     0.90744, lambda = 0.247992
nf(x) = -0.0014077  -0.0016788, norm(nf) = 0.002191
d = 0.022884   -0.019188, alpha = 0.005382

Iteration: 8
x(8) = 0.97476     0.97586, lambda = 43.429293
nf(x) = -0.0022668   0.0022025, norm(nf) = 0.003161
d = 0.0015309   0.0015756, alpha = 2.080989

Iteration: 9
x(9) = 0.97533     0.97531, lambda = 0.249812
nf(x) = -2.9597e-05 -3.0461e-05, norm(nf) = 0.000042
d = 0.0022668  -0.0022025, alpha = 0.000181

Iteration: 10
x(10) = 0.99709     0.99712, lambda = 725.188481
nf(x) = -5.2106e-05  5.2008e-05, norm(nf) = 0.000074
d = 3.0006e-05  3.0063e-05, alpha = 3.004594

Iteration: 11
x(11) = 0.9971      0.9971, lambda = 0.249997
nf(x) = -4.8571e-08 -4.8663e-08, norm(nf) = 0.000000
d = 5.2106e-05 -5.2008e-05, alpha = 0.000001

Iteration: 12
x(12) = 0.99992     0.99992, lambda = 57856.826721
nf(x) = -9.3751e-08  9.3748e-08, norm(nf) = 0.000000
d = 4.8616e-08  4.8617e-08, alpha = 3.718503

Iteration: 13
x(13) = 0.99992     0.99992, lambda = 0.250000
nf(x) = -1.1858e-12 -1.1855e-12, norm(nf) = 0.000000
d = 9.3751e-08 -9.3748e-08, alpha = 0.000000

bestx = 0.99992     0.99992, bestf = 0.000000, count = 13

これは問題の最適解(1,1)Tに非常に近い.
アルゴリズム実装は大量のテストを経ていないので,実際に使用するとBUGがある可能性がある.ここでは基本的な実現原理を説明するために用いられ,興味のある読者はこれに基づいて改善することができる.