FewRel Analysis


Classic analysis

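1. Dataset analysis

glove.6B.50d.json

Word-to-vector lookup table.

Training set train.json and validation set val.json

The validation set is split into two parts (in what ratio?) so that the test "sample a pair of input and standard output file from the validation set" can be carried out.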

Format analysis

file_name: JSON file storing the data in the following format:

    {
        "P155": [  # relation id
            {
                "token": ["Hot", "Dance", "Club", ...],  # sentence
                "h": ["song for a future generation", "Q7561099", [[16, 17, ...]]],  # head entity [word, id, location]
                "t": ["whammy kiss", "Q7990594", [[11, 12]]],  # tail entity [word, id, location]
            },
            ...
        ],
        "P177": [
            ...
        ]
        ...
    }

word_vec_file_name: JSON file storing word vectors in the following format:

    [
        {'word': 'the', 'vec': [0.418, 0.24968, ...]},
        {'word': ',', 'vec': [0.013441, 0.23682, ...]},
        ...
    ]

max_length: The length that all sentences are padded or truncated to.
case_sensitive: Whether the data processing is case-sensitive (distinguishes upper and lower case); defaults to False.
reprocess: Whether to redo the pre-processing even if pre-processed files already exist; defaults to False.
cuda: Whether to use CUDA; defaults to True.
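To make the data format concrete, here is a minimal sketch that walks a dictionary laid out as described above. The sample record is hypothetical and abbreviated (the entity ids "Q1" and "Q2" and the helper name summarize are made up for illustration); only the key layout, relation id mapping to a list of instances with "token", "h", and "t", follows the format in these notes.

```python
import json

# Hypothetical, abbreviated sample in the format described above.
raw = """
{
  "P155": [
    {
      "token": ["Hot", "Dance", "Club", "Songs"],
      "h": ["hot dance", "Q1", [[0, 1]]],
      "t": ["songs", "Q2", [[3]]]
    }
  ]
}
"""

data = json.loads(raw)


def summarize(data):
    """Return one (relation, head, tail, head_pos, tail_pos) tuple per instance."""
    rows = []
    for relation, instances in data.items():
        for ins in instances:
            # Each entity is stored as [word, id, location].
            head_word, _head_id, head_pos = ins["h"]
            tail_word, _tail_id, tail_pos = ins["t"]
            # location is a list of mentions; take the first mention.
            rows.append((relation, head_word, tail_word, head_pos[0], tail_pos[0]))
    return rows


rows = summarize(data)
```

The positions are token indices into "token", which is what the later pos1/pos2 and mask steps rely on.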

2. Creating object instances

Create the JSONFileDataLoader instances train_data_loader, val_data_loader, and test_data_loader

If reprocess is set (redo the pre-processing?) or the _processed_data directory does not exist:

Loading data file (train.json) & word vector file (glove.6B.50d.json)
self.ori_data = data file
self.ori_word_vec = word vector file
Decide whether the processing is case-sensitive.
If it is not case-sensitive: iterate over every token of every instance of every relation and convert each letter to lower case.
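The lowercasing pass can be sketched as follows. This is an illustration of the loop described above, not the loader's actual code; lowercase_tokens and the sample dictionary are hypothetical.

```python
def lowercase_tokens(data):
    """Lowercase every token of every instance of every relation, in place."""
    for relation in data:
        for ins in data[relation]:
            ins["token"] = [tok.lower() for tok in ins["token"]]
    return data


# Hypothetical one-instance sample.
sample = {"P155": [{"token": ["Hot", "Dance", "Club"]}]}
lowercase_tokens(sample)
```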

Pre-process word vec

self.word2id

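self.word_vec_tot = 40000: the total number of words in GloVe

UNK and BLANK are placed at the end, after the word_vec_tot words.

self.word_vec_dim = 50: the dimension of each word vector; the log prints "Got 40000 words of 50 dims"

Building word vector matrix and mapping: build the word2id lookup and the word-vector matrix

Initialize the word-vector matrix: a tensor of size word_vec_tot * word_vec_dim

For each word in the word vector file, assign an id according to its position and store it in word2id.

The vector of each word is stored in word_vec_mat (indexed by id).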

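self.word_vec_mat[cur_id] / np.sqrt(np.sum(self.word_vec_mat[cur_id] ** 2)): does this control the range of the values in the matrix??? It divides each vector by its L2 norm, so every row ends up with unit length.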

UNK and BLANK are stored at the end of word2id.
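A minimal sketch of the word-vector step, under a few assumptions: plain Python lists stand in for the NumPy matrix, the two-word vocabulary is made up, and UNK is assumed to come before BLANK. build_vocab is a hypothetical helper name.

```python
import math

# Hypothetical two-word vocabulary with 2-dimensional vectors.
ori_word_vec = [
    {"word": "the", "vec": [3.0, 4.0]},
    {"word": ",", "vec": [0.0, 2.0]},
]


def build_vocab(ori_word_vec):
    """Build word2id and an L2-normalized word-vector matrix."""
    word2id = {}
    word_vec_mat = []
    for cur_id, entry in enumerate(ori_word_vec):
        word2id[entry["word"]] = cur_id  # id = position in the file
        vec = entry["vec"]
        norm = math.sqrt(sum(v * v for v in vec))  # L2 norm of the vector
        word_vec_mat.append([v / norm for v in vec])
    # Assumption: UNK first, then BLANK, both after the real words.
    word_vec_tot = len(ori_word_vec)
    word2id["UNK"] = word_vec_tot
    word2id["BLANK"] = word_vec_tot + 1
    return word2id, word_vec_mat


word2id, word_vec_mat = build_vocab(ori_word_vec)
```

After normalization every row has length 1, so only the direction of each vector is kept.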

Pre-processing data

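self.instance_tot (total number of instances) = the sum of the instance counts of all relations = 700 * the number of relations

self.data_word, self.data_pos1, self.data_pos2, self.data_mask: initialize word, pos1, pos2, and mask; each has size instance_tot * max_length

self.data_length: initialized with length instance_tot; data_length[i] records the number of tokens in sentence i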

self.rel2scope

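Initialize cur_ref_data_word for the i-th instance

cur_ref_data_word[i]: stores the id of each word of the i-th instance; words that are not in the vocabulary become UNK, and sentences shorter than max_length are padded with BLANK.

Sentences longer than max_length are truncated (length limit): data_length[i] <= max_length; pos1, pos2 < max_length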
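The id conversion, padding, and truncation described above can be sketched as follows. encode_tokens and clip_position are hypothetical helpers, and the small word2id table is made up for the example.

```python
def encode_tokens(tokens, word2id, max_length):
    """Map tokens to ids, truncate to max_length, and pad with BLANK."""
    unk, blank = word2id["UNK"], word2id["BLANK"]
    ids = [word2id.get(tok, unk) for tok in tokens[:max_length]]
    length = len(ids)  # recorded length, never more than max_length
    ids += [blank] * (max_length - length)
    return ids, length


def clip_position(pos, max_length):
    """Keep an entity position strictly below max_length."""
    return min(pos, max_length - 1)


# Hypothetical vocabulary.
vocab = {"the": 0, "cat": 1, "UNK": 2, "BLANK": 3}
short_ids, short_len = encode_tokens(["the", "dog"], vocab, 4)
long_ids, long_len = encode_tokens(["the", "cat", "the", "cat", "the"], vocab, 3)
```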

Set self.data_pos1

Set self.data_mask[i][j]

Beyond the original sentence length: mask = 0

Before both entities: mask = 1

Between the two entities: mask = 2

After the entities, up to the original sentence length: mask = 3
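The four mask values can be sketched like this. The handling of the boundary tokens (whether an entity's own position counts as "before" or "between") is an assumption here, and build_mask is a hypothetical helper.

```python
def build_mask(length, pos1, pos2, max_length):
    """Return the per-token mask for one sentence."""
    first, second = min(pos1, pos2), max(pos1, pos2)
    mask = []
    for j in range(max_length):
        if j >= length:
            mask.append(0)  # padding beyond the original sentence
        elif j <= first:
            mask.append(1)  # up to the first entity (boundary is an assumption)
        elif j <= second:
            mask.append(2)  # between the two entities
        else:
            mask.append(3)  # after the second entity, inside the sentence
    return mask


mask = build_mask(length=6, pos1=1, pos2=3, max_length=8)
```

A mask of this kind lets a model pool the three sentence segments separately while ignoring padding.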

self.rel2scope[relation]: records the range of instances that belong to the current relation
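A sketch of how such a scope table could be built while counting instances. Storing each range as a half-open [start, end) pair is an assumption, and build_rel2scope is a hypothetical helper.

```python
def build_rel2scope(data):
    """Map each relation to the [start, end) range of its instance indices."""
    rel2scope = {}
    i = 0
    for relation, instances in data.items():
        start = i
        i += len(instances)
        rel2scope[relation] = [start, i]
    return rel2scope, i  # i is the total number of instances


# Hypothetical data: only the instance counts matter here.
toy = {"P155": [{}, {}, {}], "P177": [{}, {}]}
rel2scope, instance_tot = build_rel2scope(toy)
```

With this table, sampling N instances of a relation reduces to picking indices inside its range.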

Storing processed files: write the processed files into the _processed_data directory

Create the FewShotREFramework instance framework

Create the CNNSentenceEncoder instance

word embedding + position embedding; convolution and pooling after the embedding

3. Model selection

metanet

Loss computation

nn.CrossEntropyLoss()

embedding

encoder

basic_encoder

attention_encoder

linear layer: linear transformation

basic_fc

attention_fc

learner_basic

Linear transformations: (2, 20), (20, 20), (20, 1)

learner_attention

LSTM+linear

training

    ckpt_dir='./checkpoint',
    test_result_dir='./test_result',
    learning_rate=1e-1,
    lr_step_size=20000,  # Decay learning rate every lr_step_size steps
    weight_decay=1e-5,   # Rate of decaying weight
    train_iter=30000,
    val_iter=1000,
    val_step=2000,       # Validate every val_step steps
    test_iter=3000

Initialization

filter

optimizer

scheduler
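The rule "decay learning rate every lr_step_size steps" can be sketched in plain Python. The decay factor gamma = 0.1 is an assumption (it is the default of PyTorch's StepLR scheduler; these notes do not say which scheduler or factor the framework uses), and stepped_lr is a hypothetical helper.

```python
def stepped_lr(step, learning_rate=1e-1, lr_step_size=20000, gamma=0.1):
    """Learning rate at a given training step under step-wise decay.

    gamma is an assumed decay factor, not a value taken from the notes.
    """
    return learning_rate * gamma ** (step // lr_step_size)


lr_start = stepped_lr(0)
lr_before_decay = stepped_lr(19999)
lr_after_decay = stepped_lr(20000)
```

With train_iter=30000 and lr_step_size=20000, the rate would drop exactly once during training under this assumption.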

Load an existing model and compute the current starting step

Cuda

Cuda: converts the submodule parameters to CUDA tensors

if cuda:
     model = model.cuda()

Set the module to training mode

model.train()

Truth value testing

Why is 2 != True?
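Assuming the question is about Python, a short sketch of the likely answer: bool is a subclass of int and True has the integer value 1, so 2 == True compares 2 with 1. A truth-value test such as `if 2:` is a different operation and does succeed. The variable names are made up for the example.

```python
# bool is a subclass of int, and True has the integer value 1.
two_equals_true = (2 == True)   # compares 2 with 1
one_equals_true = (1 == True)   # compares 1 with 1
two_is_truthy = bool(2)         # truth-value test, as used by `if 2:`
```

So a flag should be checked with `if flag:` rather than `if flag == True:` when it may hold values other than 0 and 1.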