[Coursera] How to Win a Data Science Competition - Week 4, Lecture 3
1. Ensemble Method
An ensemble method combines multiple machine learning models to obtain a more powerful prediction.
2. Bagging
Bagging means averaging slightly different versions of the same model to improve accuracy.
(1) Why Bagging?
: Model error comes from both bias (underfitting) and variance (overfitting). Averaging slightly different versions of a model mainly reduces the variance part.
(2) Parameters that control bagging
: Changing the seed, Row sampling or Bootstrapping, Shuffling, Column sampling, Model-specific parameters, Number of models or bags, Parallelism
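As a rough illustration of how these controls can map onto code, here is a minimal sketch using scikit-learn's BaggingRegressor. The synthetic data, the decision tree base model, and all parameter values are assumptions chosen for the example, not settings from the lecture. Shuffling is not shown because it has no dedicated parameter here.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data, used only for illustration (assumption).
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(200, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

bagger = BaggingRegressor(
    DecisionTreeRegressor(max_depth=5),  # model-specific parameters
    n_estimators=10,     # number of models (bags)
    max_samples=0.8,     # row sampling: fraction of rows drawn for each bag
    bootstrap=True,      # bootstrapping: rows are drawn with replacement
    max_features=0.75,   # column sampling: fraction of columns for each bag
    random_state=1,      # seed
    n_jobs=1,            # parallelism: raise this to fit bags on several cores
)
bagger.fit(X, y)
preds = bagger.predict(X)  # average of the predictions of the 10 bags
```

Each fitted bag is available in bagger.estimators_, and the columns it was given are listed in bagger.estimators_features_, which is handy for checking that row and column sampling behave as intended.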
(3) Example of bagging
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# train is the training data
# test is the test data
# y is the target variable
model = RandomForestRegressor()
bags = 10
seed = 1
# accumulator for the summed predictions, one entry per test row
bagged_prediction = np.zeros(test.shape[0])
for n in range(bags):
    model.set_params(random_state=seed + n)  # use a different seed for each bag
    model.fit(train, y)
    preds = model.predict(test)
    bagged_prediction += preds
# take the average of the predictions
bagged_prediction /= bags
3. Boosting
Boosting is a form of weighted averaging of models in which the models are built sequentially, each one taking into account the performance of the models built before it.
(1) Weight based boosting
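Compute a weight for each row according to a chosen rule and add the weight to the data as one more column.
Compute the error according to a chosen rule and redefine the target y from the old prediction.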
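The two ideas above can be sketched as follows. This is a minimal illustration under several assumptions: the data is synthetic, the base model is a shallow decision tree, the weight rule is taken to be 1 + |error|, and the weights are passed through sample_weight rather than appended as a literal feature column. The values n_rounds and learning_rate are hypothetical settings, not values from the lecture.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data for illustration only (assumption).
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(300, 2))
y = np.sin(X[:, 0]) + 0.3 * X[:, 1] ** 2

n_rounds = 20        # number of sequential models (hypothetical setting)
learning_rate = 0.5  # shrinkage applied to each model's contribution

# Redefining y from the old prediction: each new model is fitted to the
# error that the running prediction still makes.
prediction = np.zeros(len(y))
models = []
mse_per_round = []
for _ in range(n_rounds):
    residual = y - prediction  # new target = error of the old prediction
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)
    tree.fit(X, residual)
    prediction += learning_rate * tree.predict(X)
    models.append(tree)
    mse_per_round.append(np.mean((y - prediction) ** 2))

# Weighting rows (one step only): rows with a larger error get a larger weight.
first = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
abs_error = np.abs(y - first.predict(X))
weights = 1.0 + abs_error  # assumed rule: weight = 1 + |error|
second = DecisionTreeRegressor(max_depth=2, random_state=0)
second.fit(X, y, sample_weight=weights)  # weights focus the fit on hard rows
```

In the first loop the final prediction is the sum of the scaled predictions of all models, so the training error recorded in mse_per_round should shrink from round to round.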
4. Stacking
Stacking means making predictions with a number of models on a hold-out set and then training a different meta model on these predictions.
It is the most popular form of ensembling in predictive modelling and is commonly used in the final stage.
(1) Stacking example
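import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# train, y and test are the training data, target and test data, as in the bagging example
# split the training data in two halves: one to fit the base models, one held out
training, valid, ytraining, yvalid = train_test_split(train, y, test_size=0.5)
# specify the base models
model1 = RandomForestRegressor()
model2 = LinearRegression()
# fit the base models on the first half
model1.fit(training, ytraining)
model2.fit(training, ytraining)
# predict the hold-out half and the test data
preds1 = model1.predict(valid)
preds2 = model2.predict(valid)
test_preds1 = model1.predict(test)
test_preds2 = model2.predict(test)
# stack the predictions column-wise (np.column_stack takes a tuple of arrays)
stacked_predictions = np.column_stack((preds1, preds2))
stacked_test_predictions = np.column_stack((test_preds1, test_preds2))
# specify the meta model
meta_model = LinearRegression()
# fit the meta model on the stacked predictions
meta_model.fit(stacked_predictions, yvalid)
# make predictions on the stacked predictions of the test data
final_predictions = meta_model.predict(stacked_test_predictions)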
(2) Things to consider
5. StackNet
StackNet is a scalable meta modelling methodology that uses stacking to combine multiple models in a neural network architecture of multiple levels.
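To make the idea of several levels concrete, here is a minimal sketch of multi-level stacking written with scikit-learn. It does not use or reproduce any StackNet software. The helper run_level, the synthetic data, the choice of models, and the use of 5-fold out-of-fold predictions between levels are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeRegressor

# Synthetic data for illustration only (assumption).
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(300, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)
X_train, y_train, X_test = X[:200], y[:200], X[200:]


def run_level(models, X_tr, y_tr, X_te, cv):
    """Hypothetical helper: one column of predictions per model for train and test."""
    oof_cols, test_cols = [], []
    for model in models:
        # Out-of-fold predictions: each training row is predicted by a copy
        # of the model that was fitted without that row.
        oof_cols.append(cross_val_predict(model, X_tr, y_tr, cv=cv))
        # Refit on all training rows of this level to predict the test rows.
        test_cols.append(model.fit(X_tr, y_tr).predict(X_te))
    return np.column_stack(oof_cols), np.column_stack(test_cols)


cv = KFold(n_splits=5, shuffle=True, random_state=0)
level1 = [RandomForestRegressor(n_estimators=50, random_state=0), LinearRegression()]
level2 = [DecisionTreeRegressor(max_depth=3, random_state=0), Ridge(alpha=1.0)]

# The outputs of one level become the input features of the next level.
l1_train, l1_test = run_level(level1, X_train, y_train, X_test, cv)
l2_train, l2_test = run_level(level2, l1_train, y_train, l1_test, cv)

# The last level is a single meta model that produces the final prediction.
final_model = LinearRegression().fit(l2_train, y_train)
final_predictions = final_model.predict(l2_test)
```

Each level plays the role of a layer whose nodes are whole models instead of single neurons. Out-of-fold predictions are used between levels so that a later level is not trained on predictions made for rows the earlier models had already seen.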
6. Tips and Tricks
(1) 1st level tips
1) simpler algorithms
Reference
Original post ([Coursera] How to Win a Data Science Competition - Week 4, Lecture 3): https://velog.io/@jhbale11/Coursera-How-to-win-a-data-science-competition-4주차-3강
The text may be freely shared or copied, provided that this URL is kept as a reference.