Theano Deep Learning Tutorial notes: Classifying MNIST digits using Logistic Regression


Tutorial URL: http://www.deeplearning.net/tutorial/logreg.html#logreg
This section assumes familiarity with the following Theano concepts: shared variables, basic arithmetic ops, T.grad, floatX. If you intend to run the code on GPU, also read GPU.
The code for this section is available for download here.
 
The Model
UFLDL softmax tutorial: http://deeplearning.stanford.edu/wiki/index.php/Softmax%E5%9B%9E%E5%BD%92
My answers to the UFLDL softmax exercise: http://blog.csdn.net/u012816943/article/details/50357801
The tutorial calls this Logistic Regression, but the model is really softmax; Logistic Regression is the special case of softmax with only two classes. Here the digits are 0-9, so this is a 10-class problem. The input vector is projected onto 10 hyperplanes, and the distance to each hyperplane reflects the probability of the corresponding class.
Mathematically, the probability that an input vector x is a member of class i, a value of a stochastic variable Y, can be written as:
P(Y=i \mid x, W, b) = \mathrm{softmax}_i(Wx + b) = \frac{e^{W_i x + b_i}}{\sum_j e^{W_j x + b_j}}
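For intuition, here is a minimal numpy sketch of this formula (my own illustration, not tutorial code; the toy sizes are arbitrary). Note that the Theano code below stores W with one column per class, so the product is written as x·W:
import numpy

def softmax(z):
    # subtracting the max is the usual numerical-stability trick;
    # it does not change the result
    e = numpy.exp(z - z.max())
    return e / e.sum()

# toy sizes chosen for illustration: 4 input features, 3 classes
W = numpy.random.randn(4, 3)
b = numpy.zeros(3)
x = numpy.random.randn(4)

p = softmax(numpy.dot(x, W) + b)  # P(Y=i | x, W, b) for each class i
print(p, p.sum())                 # class probabilities; they sum to 1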
The model's prediction is the class whose probability is maximal, specifically:
y_{pred} = \operatorname{argmax}_i P(Y=i \mid x, W, b)
The code to do this in Theano is the following (from the class's __init__):
        # initialize with 0 the weights W as a matrix of shape (n_in, n_out)
        self.W = theano.shared(
            value=numpy.zeros(
                (n_in, n_out),
                dtype=theano.config.floatX
            ),
            name='W',
            borrow=True
        )
        # initialize the biases b as a vector of n_out 0s
        self.b = theano.shared(
            value=numpy.zeros(
                (n_out,),
                dtype=theano.config.floatX
            ),
            name='b',
            borrow=True
        )

        # symbolic expression for computing the matrix of class-membership
        # probabilities
        # Where:
        # W is a matrix where column-k represents the separation hyperplane for
        # class-k
        # x is a matrix where row-j  represents input training sample-j
        # b is a vector where element-k represents the free parameter of
        # hyperplane-k
        self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)

        # symbolic description of how to compute prediction as class whose
        # probability is maximal
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)
Note: For a complete list of Theano ops, see: list of ops
 
Defining a Loss Function
Let us first start by defining the likelihood and loss:
\mathcal{L}(\theta=\{W,b\}, \mathcal{D}) = \sum_{i=0}^{|\mathcal{D}|} \log P(Y=y^{(i)} \mid x^{(i)}, W, b)
\ell(\theta=\{W,b\}, \mathcal{D}) = -\mathcal{L}(\theta=\{W,b\}, \mathcal{D})
 
def negative_log_likelihood(self, y):
    return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
y.shape[0] is the number of rows of y, i.e. the number of samples n.
T.arange(y.shape[0]) is [0, 1, 2, …, n-1].
T.log(self.p_y_given_x) is a matrix with n rows (one per sample) and 10 columns (one per class).
Indexing it with [T.arange(y.shape[0]), y] selects, for each sample, the log-probability of its correct label, giving an n-dimensional vector; the return value is the negative mean of that vector, a scalar.
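A small numpy analogue of what that indexing does (an illustrative sketch, not tutorial code):
import numpy

# 3 toy samples with uniform class probabilities of 0.1 each
log_p = numpy.log(numpy.full((3, 10), 0.1))
y = numpy.asarray([2, 7, 0])  # correct labels

# rows [0, 1, 2] paired with columns [2, 7, 0]:
# one log-probability per sample
picked = log_p[numpy.arange(y.shape[0]), y]
print(picked)          # [-2.3026 -2.3026 -2.3026]
print(-picked.mean())  # the scalar loss; log(10) ≈ 2.3026 here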
Creating a LogisticRegression class
class LogisticRegression: the full code of this class is not pasted here.
We instantiate this class as follows:
    # generate symbolic variables for input (x and y represent a
    # minibatch)
    x = T.matrix('x')  # data, presented as rasterized images
    y = T.ivector('y')  # labels, presented as 1D vector of [int] labels

    # construct the logistic regression class
    # Each MNIST image has size 28*28
    classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)
Note that the arguments passed in the last line above go to LogisticRegression's __init__(self, input, n_in, n_out), which plays the role of a constructor.
Define the loss function:
cost = classifier.negative_log_likelihood(y)
The symbolic input x is implicitly contained in cost, because classifier was defined as a function of x.
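To see that x really is an input of cost, one can evaluate the symbolic expression on a toy minibatch (a sketch assuming the snippets above have been run, so theano, x, y, and cost exist; the toy values are my own):
import numpy

# arbitrary toy minibatch of 5 "images" and 5 labels
toy_x = numpy.random.rand(5, 28 * 28).astype(theano.config.floatX)
toy_y = numpy.asarray([3, 1, 4, 1, 5], dtype='int32')

# both x and y must be supplied: x enters cost through classifier
print(cost.eval({x: toy_x, y: toy_y}))  # ≈ log(10) ≈ 2.30 with zero W and b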
Learning the Model
With Theano there is no need to derive the gradient of the loss with respect to the parameters by hand; we obtain it directly with the T.grad function.
g_W = T.grad(cost=cost, wrt=classifier.W)
g_b = T.grad(cost=cost, wrt=classifier.b)
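For reference, the gradient T.grad derives here has a well-known closed form, ∂ℓ/∂W = xᵀ(softmax(xW + b) − onehot(y)) / n. A small numpy check of this (my own sketch, again assuming the snippets above have been run):
import numpy

n, n_out = 5, 10
toy_x = numpy.random.rand(n, 28 * 28).astype(theano.config.floatX)
toy_y = numpy.asarray([3, 1, 4, 1, 5], dtype='int32')

p = classifier.p_y_given_x.eval({x: toy_x})   # (n, 10) class probabilities
onehot = numpy.eye(n_out)[toy_y]              # (n, 10) one-hot labels
manual_g_W = toy_x.T.dot(p - onehot) / n      # closed-form gradient

print(numpy.allclose(manual_g_W, g_W.eval({x: toy_x, y: toy_y}),
                     atol=1e-6))              # True (up to float precision)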
train_model is a Theano function; the code calls it repeatedly, once per gradient-descent step. The code is as follows:
    # specify how to update the parameters of the model as a list of
    # (variable, update expression) pairs.
    updates = [(classifier.W, classifier.W - learning_rate * g_W),
               (classifier.b, classifier.b - learning_rate * g_b)]

    # compiling a Theano function `train_model` that returns the cost, but in
    # the same time updates the parameter of the model based on the rules
    # defined in `updates`
    train_model = theano.function(
        inputs=[index],
        outputs=cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )
Clearly, updates specifies how the parameters W and b are updated: updates is a list of (variable, update expression) pairs, and givens is a dictionary that substitutes the index-th minibatch slices of the training set for the symbolic x and y.
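A minimal sketch of the loop that drives learning (the full script also does early stopping; n_epochs is assumed to be defined, e.g. n_epochs = 1000):
# number of minibatches in the training set
n_train_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size

for epoch in range(n_epochs):
    for minibatch_index in range(n_train_batches):
        # one gradient-descent step; W and b are updated in place
        # through the `updates` pairs above
        minibatch_avg_cost = train_model(minibatch_index)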
Testing the model
The LogisticRegression class also contains a function that computes the prediction error rate:
def errors(self, y):
        """Return a float representing the number of errors in the minibatch
        over the total number of examples of the minibatch ; zero one
        loss over the size of the minibatch

        :type y: theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the
                  correct label
        """

        # check if y has same dimension of y_pred
        if y.ndim != self.y_pred.ndim:
            raise TypeError(
                'y should have the same shape as self.y_pred',
                ('y', y.type, 'y_pred', self.y_pred.type)
            )
        # check if y is of the correct datatype
        if y.dtype.startswith('int'):
            # the T.neq operator returns a vector of 0s and 1s, where 1
            # represents a mistake in prediction
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()
test_model and validate_model differ only in the dataset they use; validate_model is used for early stopping (a condensed sketch of that loop follows the code below).
 
    # compiling a Theano function that computes the mistakes that are made by
    # the model on a minibatch
    test_model = theano.function(
        inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    validate_model = theano.function(
        inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )
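A condensed sketch of the early-stopping loop (the tutorial's real loop also tracks patience, validation_frequency, and the test error; this version keeps only the core idea, and assumes n_valid_batches is defined analogously to n_train_batches above and cPickle is imported as in the tutorial):
import numpy

best_validation_loss = numpy.inf
for epoch in range(n_epochs):
    for minibatch_index in range(n_train_batches):
        train_model(minibatch_index)

    # mean zero-one loss over the whole validation set
    this_validation_loss = numpy.mean(
        [validate_model(i) for i in range(n_valid_batches)]
    )
    if this_validation_loss < best_validation_loss:
        best_validation_loss = this_validation_loss
        # keep the best parameters seen so far (see the cPickle code below)
        with open('best_model.pkl', 'wb') as f:
            cPickle.dump(classifier, f)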
 
Later in the script there are two cPickle calls, one to save the best model and one to load it back:
# save the best model (binary mode, since pickle data is binary)
with open('best_model.pkl', 'wb') as f:
    cPickle.dump(classifier, f)
 
# load the saved model
classifier = cPickle.load(open('best_model.pkl', 'rb'))
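The loaded classifier can then be used to label new images; the full tutorial wraps this in a predict() function, roughly as follows (this sketch assumes the class stored its symbolic input as self.input, which the complete tutorial class does):
# compile a predictor from the loaded model's symbolic graph
predict_model = theano.function(
    inputs=[classifier.input],
    outputs=classifier.y_pred
)

test_values = test_set_x.get_value()    # shared variable -> numpy array
print(predict_model(test_values[:10]))  # predicted digits for 10 images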