2017-04-13

Excel VBA: adding a right-click menu item that pastes with the "Values and number formats" option

Something like the following seems to do it.

Sub AddClickMenu()
  With CommandBars("Cell").Controls.Add(Before:=1)
    .Caption = "値と数値の書式を貼り付け"
    .OnAction = "PasteValAndForm"
  End With
End Sub

Sub DelClickMenu()
    CommandBars("Cell").Controls("値と数値の書式を貼り付け").Delete
End Sub


Sub PasteValAndForm()
    ActiveWindow.ActiveCell.PasteSpecial Paste:=xlPasteValuesAndNumberFormats
End Sub

2017-04-12

The amended Act on the Protection of Personal Information comes into full effect on May 30, 2017 (Heisei 29)

I only just found this out...

http://www.ppc.go.jp/personal/preparation/

2017-04-09

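Recognizing handwritten digits "0-9" with the MNIST data (deep learning & python)

Following on from the previous entry, this is recognition of the handwritten digits "0-9" in the MNIST data. Or rather, as before, a hand-typed copy of the sample code.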

#!python
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data

def main():
    np.random.seed(20170409)

    # Download the MNIST data
    mnist = input_data.read_data_sets("tmp/data/", one_hot=True)

    # x: input images (784 pixels each), w / w0: weights and constant terms
    x = tf.placeholder(tf.float32, [None, 784])
    w = tf.Variable(tf.zeros([784, 10]))
    w0 = tf.Variable(tf.zeros([10]))
    f = tf.matmul(x, w) + w0
    p = tf.nn.softmax(f)

    t = tf.placeholder(tf.float32, [None, 10])
    # loss: error function
    loss = -tf.reduce_sum(t * tf.log(p))
    # train_step: training algorithm
    train_step = tf.train.AdamOptimizer().minimize(loss)
    # correct_prediction: array that stores, per sample, whether the
    # predicted value matches the correct value (*1)
    correct_prediction = tf.equal(tf.argmax(p, 1), tf.argmax(t, 1))
    # accuracy: rate of correct answers, computed from correct_prediction
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    sess = tf.InteractiveSession()
    # sess.run(tf.initialize_all_variables())  # for tensorflow ver0.1
    sess.run(tf.global_variables_initializer())

    i = 0
    for _ in range(2000):
        i += 1
        batch_xs, batch_ts = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, t: batch_ts})
        if i % 100 == 0:
            loss_val, acc_val = sess.run([loss, accuracy],
                feed_dict={x:mnist.test.images, t: mnist.test.labels})
            print ('Step: %d, Loss: %f, Accuracy: %f'
                   % (i, loss_val, acc_val))



if __name__ == '__main__':
    main()

Running the code above prints the following:

$ python foo_2_4.py 
Extracting tmp/data/train-images-idx3-ubyte.gz
Extracting tmp/data/train-labels-idx1-ubyte.gz
Extracting tmp/data/t10k-images-idx3-ubyte.gz
Extracting tmp/data/t10k-labels-idx1-ubyte.gz
Step: 100, Loss: 7747.077637, Accuracy: 0.848400
Step: 200, Loss: 5439.363281, Accuracy: 0.879900
Step: 300, Loss: 4556.467773, Accuracy: 0.890900
  :
Step: 2000, Loss: 2848.940674, Accuracy: 0.922500
$

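How tf.equal(tf.argmax(p, 1), tf.argmax(t, 1)) works

As described in the previous entry, row n of the correct-answer data T has a "1" only in the l-th (letter el) position. (Example: for an image of "7", the "1" sits in the position for the digit 7.)

Row n of the prediction P holds P1 to PK, each a probability between 0 and 1. For example, when the image is most likely a "7", P7 is the value closest to 1.

tf.argmax() returns the index of the largest value in the given array, so tf.equal(tf.argmax(p, 1), tf.argmax(t, 1)) evaluates, row by row, whether the prediction is correct or not.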
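The same comparison can be tried outside TensorFlow. The following is only a small NumPy sketch with made-up values (3 samples, 4 classes, not MNIST); np.argmax plays the role of tf.argmax here.

```python
import numpy as np

# Hypothetical toy data: 3 samples, 4 classes (not MNIST)
p = np.array([[0.1, 0.7, 0.1, 0.1],   # largest value at index 1
              [0.3, 0.2, 0.4, 0.1],   # largest value at index 2
              [0.6, 0.2, 0.1, 0.1]])  # largest value at index 0
t = np.array([[0, 1, 0, 0],           # correct class 1
              [0, 0, 0, 1],           # correct class 3
              [1, 0, 0, 0]])          # correct class 0

# argmax along axis 1 returns, per row, the index of the largest value
predicted = np.argmax(p, axis=1)
correct = np.argmax(t, axis=1)

# Element-wise comparison; the mean of the resulting 0/1 values is the accuracy
correct_prediction = (predicted == correct)
accuracy = correct_prediction.astype(np.float32).mean()
print(predicted, correct, accuracy)
```

Two of the three rows match here, so the accuracy is 2/3.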

Training that optimizes while drawing only a part of the full data set at each step, as done here, is apparently called "mini-batch" training.
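To make the idea concrete, here is a rough NumPy sketch of drawing a mini-batch. The helper next_batch below is my own hypothetical function, not the one in tensorflow's input_data module, and it simply samples at random, which is a simplification.

```python
import numpy as np

def next_batch(images, labels, batch_size, rng):
    """Return a random subset (mini-batch) of the data. Hypothetical helper."""
    idx = rng.choice(len(images), size=batch_size, replace=False)
    return images[idx], labels[idx]

rng = np.random.RandomState(20170409)
images = np.arange(20, dtype=np.float32).reshape(10, 2)  # 10 dummy samples
labels = np.arange(10)                                   # dummy labels 0..9

# Each training step would use only these 4 samples instead of all 10
batch_x, batch_t = next_batch(images, labels, 4, rng)
print(batch_x.shape, batch_t.shape)
```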

2017-04-09

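Linear multinomial classification beyond 3 dimensions

The linear multinomial classification in the earlier entry was 3-dimensional, so the model was easy to draw. This time I consider linear multinomial classification that also handles more than 3 dimensions.

The basic prediction function and softmax function

For an M-dimensional space with coordinates (x1, x2, …, xM), the prediction functions and softmax function that divide it into K regions are as follows.

$\displaystyle \Large f_{k} (x_1, x_2, \cdots , x_M) = w_{0k} + w_{1k} x_{1} + w_{2k} x_{2} + \cdots + w_{Mk} x_{M} \\ k = 1 , \cdots , K$

$\displaystyle \Large P_{k} (x_{1}, \cdots ,x_{M}) = \frac{ e^{f_{k}(x_{1}, \cdots ,x_{M})}} { \sum_{l=1}^{K} e^{f_{l}(x_{1}, \cdots ,x_{M})}}$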
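As a quick check of these two formulas, here is a minimal NumPy sketch. The sizes M, K, N and the random parameters are made up for illustration, and subtracting the row maximum inside softmax is a common numerical safeguard that I added, not something taken from the formulas.

```python
import numpy as np

def softmax(f):
    """Row-wise softmax. f has shape (N, K): N points, K classes."""
    # Subtracting the row maximum leaves the result unchanged but avoids overflow
    e = np.exp(f - f.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical sizes and random parameters, for illustration only
M, K, N = 4, 3, 5
rng = np.random.RandomState(0)
x = rng.rand(N, M)     # N points in M-dimensional space
w = rng.randn(M, K)    # w[m, k]: weight of x_m for class k
w0 = rng.randn(K)      # w0[k]: constant term for class k

f = x.dot(w) + w0      # f[n, k] = w0[k] + sum_m w[m, k] * x[n, m]
p = softmax(f)
print(p.sum(axis=1))   # each row sums to 1
```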

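Expressing the prediction function and softmax function in matrix form

Assume N training data items, each a 28x28-pixel (= 784-dimensional) image showing a digit from 0 to 9.

Then M = 784 and K = 10, and the prediction function above can be written in matrix form.

This is then passed through the softmax function.

Rearranging further gives the probability of predicting the correct answer for the n-th item (the n-th row?).

Here $\displaystyle \large t_{n} = (0, \cdots, 0,1,0,\cdots, 0)$ is a vector in which only the l-th (letter el) element is "1", with tln denoting its elements, and the expression relies on the properties $\displaystyle \large x^{0} = 1 , x^{1} = x$.

The Pn above covers a single row only; the probability for the whole of P is the product of Pn over all rows.

That expression contains many multiplications and is inefficient to compute, so the last step is to transform it into the form $\displaystyle \large E = - \log P$.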
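The equations themselves are not present in this text. The following is my own reconstruction, so treat it as an assumption; it is written to agree with the definitions above and with the TensorFlow code in the MNIST entry (f = tf.matmul(x, w) + w0, loss = -tf.reduce_sum(t * tf.log(p))). The index names n, k, l, m and the symbols F, X, W are my notation.

```latex
% n = 1..N: data items, k, l = 1..K: classes (K = 10), m = 1..M: pixels (M = 784)
% X is N x M, W is M x K, and the row vector w_0 is added to every row of XW
F = X W + w_0, \qquad F_{nk} = w_{0k} + \sum_{m=1}^{M} x_{nm} w_{mk}

P_{nk} = \frac{e^{F_{nk}}}{\sum_{l=1}^{K} e^{F_{nl}}}

% t_{ln} is 1 only for the correct class l, so x^0 = 1 and x^1 = x pick out one factor
P_n = \prod_{l=1}^{K} (P_{nl})^{t_{ln}}

P = \prod_{n=1}^{N} P_n = \prod_{n=1}^{N} \prod_{l=1}^{K} (P_{nl})^{t_{ln}}

E = -\log P = -\sum_{n=1}^{N} \sum_{l=1}^{K} t_{ln} \log P_{nl}
```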

2017-04-08

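In Hatena Blog math (TeX notation), a line break is \\\ (three backslashes), not \\

Googling "tex　数式　改行" (TeX math line break) or "はてなブログ数式改行" (Hatena Blog math line break) did not turn this up easily, so here is a note.

On Hatena Blog, and not only for line breaks, even correct TeX notation can render broken; in such cases it seems you may need to escape with "\" (backslash) or wrap the formula in a <pre> tag.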

2017-04-08

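Linear multinomial classification with the softmax function

The entries so far dealt with binary classification (perceptron); this time it is linear multinomial classification, which sorts data into three or more classes.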

end0tknr.hateblo.jp

The basis: the plane formed by the prediction function f(x1,x2)

This linear multinomial classification uses the prediction function f(x1,x2) below and the plane formed by x1, x2 and f(x1,x2).

$\displaystyle f(x_1, x_2) = w_0 + w_1 \cdot x_1 + w_2 \cdot x_2$

f:id:end0tknr:20170408153449p:plain

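To classify into 3 classes, find the intersection point of three intersecting planes

This multinomial classification sorts into 3 classes, and for that we compute the point where three intersecting planes meet.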

f:id:end0tknr:20170408153455p:plain

$\displaystyle f_1(x_1, x_2) = w_{01} + w_{11} \cdot x_1 + w_{21} \cdot x_2$

$\displaystyle f_2(x_1, x_2) = w_{02} + w_{12} \cdot x_1 + w_{22} \cdot x_2$

$\displaystyle f_3(x_1, x_2) = w_{03} + w_{13} \cdot x_1 + w_{23} \cdot x_2$

The intersection of the three planes above can be found from the following simultaneous equations.

$\begin{cases} f_1(x_1, x_2) = f_2(x_1, x_2) \\ f_2(x_1, x_2) = f_3(x_1, x_2) \end{cases}$

Written with matrices, these simultaneous equations become:

$\displaystyle M \cdot \left( \begin{array}{c} x_1 \\ x_2 \end{array} \right) = w$

$\displaystyle M = \left( \begin{array}{cc} w_{11} - w_{12} & w_{21} - w_{22} \\ w_{12} - w_{13} & w_{22} - w_{23} \end{array} \right)$ $\displaystyle w = \left( \begin{array}{c} w_{02} - w_{01} \\ w_{03} - w_{02} \end{array} \right)$

Hence the intersection of the three planes is obtained with the inverse of M.

$\displaystyle \underline{ \left( \begin{array}{c} x_1 \\ x_2 \end{array} \right) = M^{-1} \cdot w }$
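The calculation can be checked numerically. The coefficients below are arbitrary made-up values, and np.linalg.solve is used in place of an explicit inverse, which gives the same point as M^{-1} w.

```python
import numpy as np

# Hypothetical coefficients: element k-1 belongs to plane f_k
w0 = np.array([0.0, 1.0, -1.0])   # w01, w02, w03
w1 = np.array([1.0, -1.0, 0.5])   # w11, w12, w13
w2 = np.array([1.0, 0.5, -1.0])   # w21, w22, w23

M = np.array([[w1[0] - w1[1], w2[0] - w2[1]],
              [w1[1] - w1[2], w2[1] - w2[2]]])
w = np.array([w0[1] - w0[0], w0[2] - w0[1]])

# Solving M x = w is equivalent to x = M^{-1} w without forming the inverse
x1, x2 = np.linalg.solve(M, w)

# At the intersection, all three planes take the same value
f = w0 + w1 * x1 + w2 * x2
print(x1, x2, f)
```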

Expressing f1(x1,x2), f2(x1,x2), f3(x1,x2) as probabilities with softmax

Let P1(x1,x2), P2(x1,x2), P3(x1,x2) be the probabilities that a point (x1, x2) belongs to regions (1) to (3). Then P1(x1,x2) + P2(x1,x2) + P3(x1,x2) = 1 holds, and with the softmax function these are expressed as follows.

$\displaystyle P_i(x_1,x_2) = \frac{ e^{f_i(x_1,x_2)} }{ \sum_{j=1}^{3} e^{f_j(x_1,x_2)} }$ 　　 $\displaystyle i=1, 2, 3$

f:id:end0tknr:20170408153500p:plain

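Deriving the sigmoid function from the softmax function

The softmax function above essentially completes the linear multinomial classification; as a bonus, here is the derivation of the sigmoid function from the softmax function.

In the softmax function above, when there are only two classes (j = 1, 2), the expression for i = 1 becomes:

$\displaystyle P_1(x_1,x_2) = \frac{ e^{f_1(x_1,x_2)} }{ e^{f_1(x_1,x_2)} + e^{f_2(x_1,x_2)} }$

Dividing the numerator and denominator by $\displaystyle e^{f_1(x_1,x_2)}$ and rearranging slightly yields the sigmoid function, applied to the difference f1 - f2. $\displaystyle P_1(x_1,x_2) = \frac{ 1 }{ 1 + e^{f_2(x_1,x_2) - f_1(x_1,x_2)} }$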
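A small numerical check of this relation, assuming arbitrary values for f1 and f2 at some point (the numbers are made up):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(f):
    e = np.exp(f - np.max(f))
    return e / e.sum()

# Hypothetical values of f1 and f2 at some point (x1, x2)
f1, f2 = 1.3, -0.4

p1_softmax = softmax(np.array([f1, f2]))[0]   # two-class softmax, i = 1
p1_sigmoid = sigmoid(f1 - f2)                 # 1 / (1 + e^{f2 - f1})
print(p1_softmax, p1_sigmoid)                 # the two values agree
```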

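Derivative of the softmax function

One more bonus: the derivative of the softmax function, where $y_i$ is the i-th output and $x_j$ the j-th input.

$\displaystyle \frac{\partial y_i}{\partial x_j} = \begin{cases} y_i \cdot (1 - y_i) & (i = j) \\ - y_i \cdot y_j & (i \neq j) \end{cases}$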
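Since the derivation is not written out here, a numerical check may help. The sketch below compares the formula with central differences for one arbitrary input vector; the function names are my own.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def softmax_jacobian(x):
    """Analytic form: J[i, j] = y_i * (1 - y_i) if i == j, else -y_i * y_j."""
    y = softmax(x)
    return np.diag(y) - np.outer(y, y)

def numerical_jacobian(x, eps=1e-6):
    """Central differences: J[i, j] ~ (y_i(x + eps*e_j) - y_i(x - eps*e_j)) / (2*eps)."""
    n = len(x)
    jac = np.zeros((n, n))
    for j in range(n):
        d = np.zeros(n)
        d[j] = eps
        jac[:, j] = (softmax(x + d) - softmax(x - d)) / (2 * eps)
    return jac

x = np.array([0.5, -1.0, 2.0])   # hypothetical input
analytic = softmax_jacobian(x)
numeric = numerical_jacobian(x)
print(np.max(np.abs(analytic - numeric)))   # a very small number
```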

An earlier entry derives the derivative of the sigmoid function, so I will not write out the derivation of the softmax derivative this time.

end0tknr.hateblo.jp

2017-04-08

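Binary classification with logistic regression / perceptron (2/2) ( deep learning & python )

A tensorflow implementation using the sigmoid function (logistic function) from the other day.

end0tknr.hateblo.jp

Or rather, a hand-typed copy of Chapter 2 from the repository below.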

github.com

#!/usr/local/bin/python
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from numpy.random import multivariate_normal, permutation
import pandas as pd
from pandas import DataFrame, Series

def make_training_data():
    np.random.seed(20160512)
    
    # t=0 : the two drugs (X1, X2) had no effect
    mu0, variance0, n0 = [10, 11], 20, 20  # mean, variance, number of data points
    # multivariate_normal() : draws random samples from a multivariate normal distribution
    # ├ param1 : mean
    # ├ param2 : covariance matrix. np.eye(2) creates the 2x2 identity matrix
    # └ param3 : number of data points
    data0 = multivariate_normal(mu0, np.eye(2)*variance0 ,n0)
    df0 = DataFrame(data0, columns=['x1','x2'])
    df0['t'] = 0

    # t=1 : the two drugs (X1, X2) had an effect
    mu1, variance1, n1  = [18, 20], 15, 22  # mean, variance, number of data points
    data1 = multivariate_normal(mu1, np.eye(2)*variance1 ,n1)
    df1 = DataFrame(data1, columns=['x1','x2'])
    df1['t'] = 1

    # Concatenate the two frames row-wise (not a join)
    df = pd.concat([df0, df1], ignore_index=True)

    # Shuffle the rows
    train_set = df.reindex(permutation(df.index)).reset_index(drop=True)

    # Split the x1, x2, t columns of train_set into {x1, x2} and {t}
    # (.values is used because as_matrix() is deprecated in pandas)
    train_x = train_set[['x1','x2']].values
    train_t = train_set['t'].values.reshape([len(train_set), 1])

    return train_x, train_t

# Build the prediction function
def make_predict_func():
    x = tf.placeholder(tf.float32, [None, 2])
    w = tf.Variable(tf.zeros([2, 1]))
    w0 = tf.Variable(tf.zeros([1]))
    f = tf.matmul(x, w) + w0  # f(x) = xw + w0   (x, w, w0 are arrays, not scalars)
    p = tf.sigmoid(f)         # sigmoid function = logistic function
    return p, w, x, w0

# Error function
def make_err_func(p):
    t = tf.placeholder(tf.float32, [None, 1])
    # Error function for maximum likelihood estimation
    loss = -tf.reduce_sum(t*tf.log(p) + (1-t)*tf.log(1-p))
    # Training algorithm (Adam, a gradient-based optimizer)
    train_step = tf.train.AdamOptimizer().minimize(loss)
    # Subtract 0.5 so that p and t can be compared by sign
    correct_prediction = tf.equal(tf.sign(p-0.5), tf.sign(t-0.5))
    # reduce_mean() computes the mean of the vector's elements
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    return loss, t, train_step, accuracy

def main():
    # Training data
    train_x, train_t = make_training_data()
    # Prediction function
    p ,w, x, w0 = make_predict_func()
    # Error function
    loss, t, train_step, accuracy = make_err_func(p)

    # Create the session & initialize the Variables
    sess = tf.Session()
    sess.run(tf.initialize_all_variables())

    # Optimize the parameters by repeating the training step
    i = 0
    for _ in range(30000):
        i += 1
        sess.run(train_step, feed_dict={x:train_x, t:train_t})
        if i % 2000 == 0:
            loss_val, acc_val = sess.run(
                [loss, accuracy], feed_dict={x:train_x, t:train_t})
            print ('itep: %d, loss: %f, accuracy: %f'
                   % (i, loss_val, acc_val))

    # Extract the results (w0, w1, w2)
    w0_val, w_val = sess.run([w0, w])
    w0_val, w1_val, w2_val = w0_val[0], w_val[0][0], w_val[1][0]
    print ('w0: %s  w1: %s  w2: %s' % (w0_val, w1_val, w2_val))


if __name__ == '__main__':
    main()

Running the code above prints the following:

$ ./foo_2.py 
itep: 2000, loss: 17.505960, accuracy: 0.857143
itep: 4000, loss: 12.778822, accuracy: 0.928571
itep: 6000, loss: 9.999125, accuracy: 0.928571
itep: 8000, loss: 8.244436, accuracy: 0.976190
itep: 10000, loss: 7.087447, accuracy: 0.952381
itep: 12000, loss: 6.303907, accuracy: 0.952381
itep: 14000, loss: 5.765183, accuracy: 0.952381
itep: 16000, loss: 5.393257, accuracy: 0.952381
itep: 18000, loss: 5.138913, accuracy: 0.952381
itep: 20000, loss: 4.969873, accuracy: 0.952381
itep: 22000, loss: 4.863929, accuracy: 0.952381
itep: 24000, loss: 4.804683, accuracy: 0.952381
itep: 26000, loss: 4.778569, accuracy: 0.952381
itep: 28000, loss: 4.772072, accuracy: 0.952381
itep: 30000, loss: 4.771708, accuracy: 0.952381
w0: -21.0061  w1: 0.849911  w2: 0.621193

2017-04-06

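Binary classification with logistic regression / perceptron (1/2)

To sort out my understanding of the sigmoid function (logistic function), I predict the effect of administering two drugs (X1, X2) (symptoms resolved or not) as a binary classification with logistic regression.

This entry covers building the prediction function with the sigmoid function and building the error function by maximum likelihood estimation.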

f:id:end0tknr:20170406095513p:plain

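The prediction function using the sigmoid function

Set the boundary function in the upper-left plot of the figure above as a linear expression:

$\displaystyle f(x_1, x_2) = w_0 + w_1 \cdot x_1 + w_2 \cdot x_2$

Next, the probability that the drugs are effective is expressed from this f(x1, x2) using the sigmoid function.

$\displaystyle \sigma(x) = \frac{1}{1 + e^{-x}} \rightarrow \underline{ P(x_1,x_2) = \sigma( f(x_1,x_2)) }$ This is the prediction function.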

Error function for maximum likelihood estimation - STEP 1/2

Next, consider the error function. First, suppose there are N data points (x1, x2), and write the n-th one as (x1n, x2n).

Writing tn = 1 or 0 for whether the n-th data point showed an effect or not, the probability of each case can be expressed as follows.

$\displaystyle t_{n} = 1 \rightarrow P_n = P(x_{1n}, x_{2n})$

$\displaystyle t_{n} = 0 \rightarrow P_n = 1 - P(x_{1n}, x_{2n})$

These two expressions can be combined into one:

$\displaystyle P_n = ( P(x_{1n}, x_{2n}) )^{t_n} \cdot ( 1 - P(x_{1n}, x_{2n}) )^{1-t_n}$

The probability of getting all N right is then

$\displaystyle P = P_1 \times P_2 \times \cdots \times P_N = \prod_{n=1}^{N}P_n$

a product over all data points, so, for the time being (?), the error function is:

$\displaystyle P = \underline{ \prod_{n=1}^{N} ( P(x_{1n}, x_{2n}) )^{t_n} \cdot ( 1 - P(x_{1n}, x_{2n}) )^{1-t_n} }$

Error function for maximum likelihood estimation - STEP 2/2

However, the error function just built contains many multiplications and is inefficient to compute, so

$\displaystyle E = - \log P$ , $\displaystyle \log ab = \log a + \log b$ , $\displaystyle \log a^{n} = n \log a$

are used to transform it.

$\displaystyle E = - \log \prod_{n=1}^{N} ( P(x_{1n}, x_{2n}) )^{t_n} \cdot ( 1 - P(x_{1n}, x_{2n}) )^{1-t_n}$

$\displaystyle = \underline{ - \sum_{n=1}^{N} \left\{ t_n \cdot \log P(x_{1n}, x_{2n}) + (1-t_n) \cdot \log ( 1 - P(x_{1n}, x_{2n}) ) \right\} }$

The above is the final error function.
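To see that the product form and the log-sum form agree, here is a small NumPy sketch. The four predictions and labels are made-up numbers, not the drug data.

```python
import numpy as np

# Hypothetical predictions P(x1n, x2n) and labels tn for N = 4 data points
p = np.array([0.9, 0.2, 0.7, 0.4])
t = np.array([1, 0, 1, 0])

# Product form: P = prod_n p_n^{t_n} * (1 - p_n)^{1 - t_n}
likelihood = np.prod(p**t * (1 - p)**(1 - t))

# Sum-of-logs form: E = -sum_n { t_n log p_n + (1 - t_n) log(1 - p_n) }
E = -np.sum(t * np.log(p) + (1 - t) * np.log(1 - p))

print(-np.log(likelihood), E)   # the two values agree
```

This is the same expression as loss = -tf.reduce_sum(t*tf.log(p) + (1-t)*tf.log(1-p)) in the (2/2) entry.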

2017-04-03

Derivative of the sigmoid function / logistic function

The sigmoid (logistic) function and its derivative

In connection with logistic regression, prove (derive) the following.

$\displaystyle f(x) = \frac{1}{1 + e^{-x}} \ \Longrightarrow \ f'(x) = ( 1 - f(x) ) f(x)$

Proof (derivation) steps

For $\displaystyle f(x) = \frac{1}{1 + e^{-x}} = (1 + e^{-x})^{-1}$ …(1),

let $\displaystyle u = 1 + e^{-x}$ …(2), so that $\displaystyle f(x) = u^{-1}$ …(3).

Next, writing the derivative of (1) with the chain rule (derivative of a composite function) gives

$\displaystyle f'(x) = f'(u) \cdot \frac{du}{dx}$ …(4), and differentiating (3) and (2) respectively gives

$\displaystyle f'(u) = -1 \cdot u^{-2} = - (1 + e^{-x})^{-2}$ …(5) and $\displaystyle \frac{du}{dx} = -e^{-x}$ …(6).

Finally, substitute (5) and (6) into (4) and rearrange.

$\displaystyle f'(x) = \frac {-1}{(1 + e^{-x})^{2}} \cdot ( - e^{-x} ) = \frac {1}{(1 + e^{-x})^{2}} \cdot e^{-x} = \frac{e^{-x}}{(1 + e^{-x})} \cdot \frac{1}{(1 + e^{-x})}$

$\displaystyle = \left( \frac{1+e^{-x}}{1+e^{-x}} - \frac{1}{1+e^{-x}} \right) \cdot \frac{1}{1+e^{-x}}$ $\displaystyle = \underline{ ( 1 - f(x)) \cdot f(x) }$
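The result can be cross-checked numerically. This sketch compares (1 - f(x)) f(x) with a central-difference approximation on a few arbitrary points; the step size eps is my choice.

```python
import numpy as np

def f(x):
    return 1.0 / (1.0 + np.exp(-x))

def f_prime(x):
    # The derived result: f'(x) = (1 - f(x)) * f(x)
    return (1.0 - f(x)) * f(x)

x = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
numeric = (f(x + eps) - f(x - eps)) / (2 * eps)   # central difference
print(np.max(np.abs(numeric - f_prime(x))))       # a very small number
```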

2017-04-01

escapeSql() has been removed in apache commons lang ver.3 for java

Why?

javadoc for ver.2.6

Escapes and unescapes Strings for Java, Java Script, HTML, XML, and SQL.

https://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html

javadoc for ver.3.5

Escapes and unescapes Strings for Java, Java Script, HTML and XML.

https://commons.apache.org/proper/commons-lang/javadocs/api-3.5/org/apache/commons/lang3/StringEscapeUtils.html
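As far as I know, the Commons Lang 3.0 release notes describe escapeSql() as misleading, since it only handled the simplest cases (doubling single quotes), and SQL is outside the library's scope; bind parameters are the usual replacement. A minimal sketch of that idea follows, written in Python with the standard sqlite3 module rather than Java, with a made-up table and value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")   # hypothetical table

# A value containing a single quote; no manual escaping is done
name = "O'Reilly"
conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

# The driver binds the value, so the quote cannot break the statement
rows = conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()
print(rows)
conn.close()
```

In Java the corresponding mechanism would be java.sql.PreparedStatement with ? placeholders.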

2017-04-01

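Gradient descent with tensorflow ( deep learning & python )

github.com

A hand-typed copy of Chapter 1 from the repository above.

Preparation - the relations used

STEP1 : prediction formula - predict the temperature for January to December

$\displaystyle y = w_0 x^{0} + w_1 x^{1} + w_2 x^{2} + w_3 x^{3} + w_4 x^{4} = \sum_{m=0}^{4} w_m x^{m}$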

$\displaystyle y = X w$

$\displaystyle y = \left( \begin{array}{c} y_1 \\ y_2 \\ \vdots \\ y_{12} \end{array} \right)$

$\displaystyle X = \left( \begin{array}{cccc} 1^{0} & 1^{1} & \ldots & 1^{4} \\\ 2^{0} & 2^{1} & \ldots & 2^{4} \\\ \vdots & \vdots & \ddots & \vdots \\\ 12^{0} & 12^{1} & \ldots & 12^{4} \end{array} \right)$

$\displaystyle w = \left( \begin{array}{c} w_0 \\ w_1 \\ \vdots \\ w_{4} \end{array} \right)$

STEP2 : error function

$\displaystyle E = \frac{1}{2}\sum_{n=1}^{12} (y_n - t_n)^{2}$

This brings to mind the least squares method and the Newton-Raphson method.
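Because E is quadratic in w, the minimum can also be found directly by least squares, which gives a reference point for the iterative training. The sketch below uses the same twelve temperatures as train_t in the implementation and np.linalg.lstsq; it is a cross-check of my own, not part of the Chapter 1 sample.

```python
import numpy as np

# Monthly temperatures, the same values as train_t in the implementation
t = np.array([5.2, 5.7, 8.6, 14.9, 18.2, 20.4,
              25.5, 26.4, 22.8, 17.5, 11.1, 6.6])

# X[row][col] = month ** col for month = 1..12, col = 0..4
months = np.arange(1, 13)
X = np.vander(months, 5, increasing=True).astype(np.float64)

# Least-squares solution of X w ~ t
w, residuals, rank, sv = np.linalg.lstsq(X, t, rcond=None)

# Error function E = 1/2 * sum (y_n - t_n)^2 at the solution
E = 0.5 * np.sum((X.dot(w) - t) ** 2)
print(w, E)
```

Note that the TensorFlow code prints the sum of squares without the factor 1/2, so its Loss corresponds to 2E here.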

And now, the implementation

#!/usr/local/bin/python
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

def main():
    # Prediction function
    x = tf.placeholder(tf.float32, [None, 5])
    w = tf.Variable(tf.zeros([5, 1]))
    y = tf.matmul(x, w)
    # Matrix that holds the measured values
    t = tf.placeholder(tf.float32, [None, 1])
    # Error function
    loss = tf.reduce_sum(tf.square(y-t))
    # Training algorithm (Adam, a gradient-based optimizer)
    train_step = tf.train.AdamOptimizer().minimize(loss)

    sess = tf.Session()
    sess.run(tf.initialize_all_variables())

    # Training data
    train_t = np.array([5.2, 5.7, 8.6, 14.9, 18.2, 20.4,
                        25.5, 26.4, 22.8, 17.5, 11.1, 6.6])
    train_t = train_t.reshape([12,1])
    train_x = np.zeros([12, 5])
    for row, month in enumerate(range(1, 13)):
        for col, n in enumerate(range(0, 5)):
            train_x[row][col] = month**n

    # Repeat the optimization step
    i = 0
    for _ in range(1000000):
        i += 1
        sess.run(train_step, feed_dict={x:train_x, t:train_t})
        if i % 10000 == 0:
            loss_val = sess.run(loss, feed_dict={x:train_x, t:train_t})
            print ('Step: %d, Loss: %f' % (i, loss_val))

    # Check the parameter values after training
    w_val = sess.run(w)
    print (w_val)

# Function that computes the predicted temperature from the trained
# parameters (w_val is passed in, since it is local to main())
def predict(w_val, x):
    result = 0.0
    for n in range(0, 5):
        result += w_val[n][0] * x**n
    return result


if __name__ == '__main__':
    main()

Running the code above prints the following:

[endo@cent7 TENSORFLOW]$ ./foo.py 
Step: 10000, Loss: 31.012341
Step: 20000, Loss: 29.450821
  :
Step: 970000, Loss: 12.155926
Step: 980000, Loss: 34.782570
Step: 990000, Loss: 12.154196
Step: 1000000, Loss: 12.153559
[[ 10.88772202]
 [ -9.05010319]
 [  3.99193835]
 [ -0.44603682]
 [  0.01444708]]

2017-04-01

Installing the _tkinter module into python 2.7

Running “import matplotlib.pyplot as plt” on python 2.7 failed with an error because tkinter was missing. I do not fully understand tkinter's dependent libraries/modules, but the following steps resolved it.

# yum install tkinter
# yum install tk tcl tk-devel

$ wget https://www.python.org/ftp/python/2.7.13/Python-2.7.13.tgz
$ tar -zxvf Python-2.7.13.tgz
$ cd Python-2.7.13
$ ./configure --enable-optimizations
$ make
$ make test
$ su -
# make install

Reference url

http://tkinter.unpythonic.net/wiki/How_to_install_Tkinter
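After rebuilding, whether the interpreter can actually see the Tk bindings can be checked with a few lines like these. The helper name tkinter_available is my own; the module is called Tkinter on python 2 and tkinter on python 3.

```python
def tkinter_available():
    """Return True if the Tk bindings can be imported by this interpreter."""
    try:
        import Tkinter  # module name on python 2
    except ImportError:
        try:
            import tkinter  # module name on python 3
        except ImportError:
            return False
    return True

print("tkinter available: %s" % tkinter_available())
```

If no GUI window is needed, calling matplotlib.use('Agg') before importing pyplot is another way to avoid the tkinter dependency.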

2017-03-27

Measuring java cyclomatic complexity (code metrics CyclomaticComplexity) with PMD

https://pmd.github.io/ https://pmd.github.io/pmd-5.5.4/pmd-java/ https://pmd.github.io/pmd-5.5.4/usage/running.html

install

There is probably an eclipse plug-in as well, but this time I install the command-line version.

$ cd /home/endo/local
$ wget https://downloads.sourceforge.net/project/pmd/pmd/5.5.4/pmd-bin-5.5.4.zip
$ unzip pmd-bin-5.5.4.zip

run pmd

As also described at https://pmd.github.io/pmd-5.5.4/usage/running.html, run it like this and the results are printed:

$ ~/local/pmd-bin-5.5.4/bin/run.sh pmd \
     -dir /home/endo/tmp/src \
     -format text \
     -rulesets java-basic,java-codesize 
/home/endo/tmp/src/HttpComm.java:39:    This class has too many methods, consider refactoring it.
/home/endo/tmp/src/JsonUtil.java:1: This class has a bunch of public methods and attributes
/home/endo/tmp/src/JsonUtil.java:19:    Avoid really long classes.
/home/endo/tmp/src/JsonUtil.java:19:    The class 'JsonUtil' has a Cyclomatic Complexity of 3 (Highest = 13).
/home/endo/tmp/src/JsonUtil.java:19:    The class 'JsonUtil' has a Modified Cyclomatic Complexity of 3 (Highest = 13).
/home/endo/tmp/src/JsonUtil.java:19:    The class 'JsonUtil' has a Standard Cyclomatic Complexity of 3 (Highest = 13).
/home/endo/tmp/src/JsonUtil.java:23:    This class has too many methods, consider refactoring it.
/home/endo/tmp/src/JsonUtil.java:221:   The method 'getNode' has a Cyclomatic Complexity of 11.
/home/endo/tmp/src/JsonUtil.java:221:   The method 'getNode' has a Modified Cyclomatic Complexity of 11.
/home/endo/tmp/src/JsonUtil.java:221:   The method 'getNode' has a Standard Cyclomatic Complexity of 11.
    :                                        :
/home/endo/tmp/src/TimeUtil.java:583:   These nested if statements could be combined

※ For rulesets other than java-basic and java-codesize, see the contents of the jar bundled with pmd.
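The text report is easy to post-process. As a rough sketch, the script below pulls the method-level "Cyclomatic Complexity" lines out of output in the format shown above; the regular expression, the function name parse_report and the threshold of 10 are my own assumptions, not anything provided by PMD.

```python
import re

# Matches method-level lines of the text report, e.g.
# /path/File.java:221:   The method 'getNode' has a Cyclomatic Complexity of 11.
PATTERN = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):\s+The method '(?P<method>[^']+)' "
    r"has a Cyclomatic Complexity of (?P<cc>\d+)\.")

def parse_report(text, threshold=10):
    """Return (path, line, method, complexity) for methods at or above threshold."""
    hits = []
    for row in text.splitlines():
        m = PATTERN.match(row)
        if m and int(m.group("cc")) >= threshold:
            hits.append((m.group("path"), int(m.group("line")),
                         m.group("method"), int(m.group("cc"))))
    return hits

# Three lines copied from the report above
sample = (
    "/home/endo/tmp/src/JsonUtil.java:19:    The class 'JsonUtil' has a Cyclomatic Complexity of 3 (Highest = 13).\n"
    "/home/endo/tmp/src/JsonUtil.java:221:   The method 'getNode' has a Cyclomatic Complexity of 11.\n"
    "/home/endo/tmp/src/JsonUtil.java:221:   The method 'getNode' has a Modified Cyclomatic Complexity of 11.\n"
)
print(parse_report(sample))
```

Only the plain "Cyclomatic Complexity" line for getNode is picked up; the class-level line and the "Modified" variant are skipped by the pattern.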

For perl and javascript metrics, see the earlier entry.

end0tknr.hateblo.jp

2017-03-27

yaml load/read with snakeyaml for java

There seem to be several yaml libraries for java; today, for no particular reason, snakeYAML.

package jp.end0tknr;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.util.Map;

import org.yaml.snakeyaml.Yaml;

public class TestSnakeYaml {
    public TestSnakeYaml() {}

    public static void main(String[] args) {
        String confFilePath = "resource/test.yaml";
        String encoding = "UTF-8";
        
        File file = new File(confFilePath);
        FileInputStream input;
        InputStreamReader stream;
        try {
            input = new FileInputStream(file);
            stream = new InputStreamReader(input,encoding);
        } catch (FileNotFoundException | UnsupportedEncodingException e) {
            System.out.println(e.getClass().getName()+ 
                    " fail open file "+ confFilePath);
            return;
        }        
        
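        Yaml yaml = new Yaml();
        // load() returns Object, so cast to a typed Map instead of a raw Map
        @SuppressWarnings("unchecked")
        Map<String, Object> yamlMap = (Map<String, Object>) yaml.load(stream);

        // Print the contents of each top-level section
        for (String atriKey : yamlMap.keySet()) {
            System.out.println( yamlMap.get(atriKey).toString() );
        }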
    }
}

common:
  system_name: ほげほげ
  encode: utf8
  #yes/no
#  debug_mode: yes
db:
  host: localhost
  port: 3306
  db_name: testdb
  db_user: root
  db_pass: 
  db_opt:
    AutoCommit: 0
    mysql_enable_utf8: 1
  client_encoding: utf8

With the code and yaml file above, the following is printed:

{system_name=ほげほげ, encode=utf8}
{host=localhost, port=3306, db_name=testdb, db_user=root, db_pass=null, db_opt={AutoCommit=0, mysql_enable_utf8=1}, client_encoding=utf8}

jar used

snakeyaml-1.18.jar

2017-03-27

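Loading / reading an INI file with apache commons configuration for java

http://commons.apache.org/proper/commons-configuration/

I needed to load an ini-format configuration file, looked around, and came across this.

※ Besides ini, it also seems to support .xml, .properties and others. ( On the other hand, .json and .yaml do not seem to be supported in the version used here. )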

package jp.end0tknr;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.util.Iterator;
import org.apache.commons.configuration2.Configuration;
import org.apache.commons.configuration2.INIConfiguration;
import org.apache.commons.configuration2.ex.ConfigurationException;

public class TestApacheCommonsConfiguration {

    public TestApacheCommonsConfiguration() {}

    public static void main(String[] args) {
        String confFilePath = "resource/test.ini";
        String encoding = "SJIS";
        
        File file = new File(confFilePath);
        FileInputStream input;
        InputStreamReader stream;
        try {
            input = new FileInputStream(file);
            stream = new InputStreamReader(input,encoding);
        } catch (FileNotFoundException | UnsupportedEncodingException e) {
            System.out.println(e.getClass().getName()+ 
                    " fail open file "+ confFilePath);
            return;
        }
        
        INIConfiguration configTmp = new INIConfiguration();
        try {
            configTmp.read( new BufferedReader(stream) );
        } catch (ConfigurationException | IOException e) {
            System.out.println(e.getClass().getName()+" fail read file ");
            return;
        }
        
        Configuration config = configTmp;
        Iterator<String> atriKeys = config.getKeys();
        while(atriKeys.hasNext()) {
            String atriKey = (String)atriKeys.next();
            
            if(! config.containsKey(atriKey) ){
                System.out.println( "not exist key "+ atriKey);
            }
            System.out.println( atriKey+ "="+ config.getString(atriKey) );
        }
    }
}

; Test ini file to be included by a configuration definition
[common]
sysTitle = これは、テスト用のタイトルです
[testini]
loaded=yes

With the code and ini file above, the following is printed:

common.sysTitle=これは、テスト用のタイトルです
testini.loaded=yes

Other - jars referenced

commons-configuration2-2.1.1.jar
commons-logging-1.2.jar
commons-beanutils-1.9.2.jar
commons-lang3-3.5.jar