DAY 30, 35

DAY 30

Time-series data: train/test split

Train set: the oldest data

Val set:

Test set: the most recent data

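import pandas as pd

# Convert the str column to datetime (the result must be assigned back)
df['game_date'] = pd.to_datetime(df['game_date'])
# Split into train/test by date
train = df.loc[df.game_date < '2014-01-01', :]
test = df.loc[df.game_date >= '2014-01-01', :]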
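
The notes list a val set but leave it empty. A minimal sketch of a three-way chronological split, assuming the validation window sits between train and test. The toy data and the `val_start` cutoff are made up for illustration; only the `game_date` column and the 2014-01-01 cutoff come from the notes.

```python
import pandas as pd

# Hypothetical toy data; only the 'game_date' column name comes from the notes.
df = pd.DataFrame({
    "game_date": ["2012-05-01", "2013-03-10", "2013-11-20", "2014-02-02", "2014-08-15"],
    "score": [1, 2, 3, 4, 5],
})
df["game_date"] = pd.to_datetime(df["game_date"])

# Assumed cutoffs: validation covers the period between train and test.
val_start = pd.Timestamp("2013-01-01")
test_start = pd.Timestamp("2014-01-01")

train = df.loc[df["game_date"] < val_start]
val = df.loc[(df["game_date"] >= val_start) & (df["game_date"] < test_start)]
test = df.loc[df["game_date"] >= test_start]

print(len(train), len(val), len(test))
```

Splitting by date rather than at random keeps future rows out of the training data.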

Removing the days part from a timedelta object

>>> df['Seconds remaining in the period'] = pd.to_timedelta(df['Seconds remaining in the period'], unit='s')
-----------------------------------------------------------
0 days 00:00:42

>>> df['Seconds remaining in the period'] = df['Seconds remaining in the period'].astype(str).map(lambda x: x[7:])
---------------------------------------------------------------
00:00:42
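
The `x[7:]` slice only works while the prefix is exactly `0 days ` (7 characters). A small sketch of an alternative that formats the raw seconds directly. `format_hms` is a hypothetical helper, not something from the original notes.

```python
def format_hms(total_seconds: int) -> str:
    """Format a non-negative number of seconds as HH:MM:SS."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

print(format_hms(42))
print(format_hms(3725))
```

Assuming the column still holds plain seconds, this could be applied with `.map(format_hms)` instead of going through `to_timedelta` and string slicing.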

DAY 35

count plot

: the number of data points for each category value

import seaborn as sns

sns.countplot(x=train['Inspection Fail']);

pie plot

import matplotlib.pyplot as plt

counts = train['feature'].value_counts()  # compute once, reuse for sizes and labels
plt.subplots(figsize=(5, 5))
plt.pie(counts, labels=counts.index,
        autopct="%.1f%%", shadow=True, startangle=90)
plt.title('Title', size=20)  # placeholder title
plt.show()

Reducing cardinality

"feature": convert to upper case > keep the value_counts top 10 and lump everything else into ETC

df["feature"] = df["feature"].str.upper()
facility_top10 = df["feature"].value_counts().sort_values(ascending=False).head(10).index.to_list()
df["feature"] = [i if i in facility_top10 else "ETC" for i in df['feature']]
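
The same top-N idea as a reusable function, as a rough sketch. `keep_top_n` and the toy Series are hypothetical names for illustration. It uses `Series.where` with `isin` instead of a list comprehension.

```python
import pandas as pd

def keep_top_n(series: pd.Series, n: int = 10, other: str = "ETC") -> pd.Series:
    """Keep the n most frequent values and replace everything else with `other`."""
    top = series.value_counts().head(n).index  # value_counts is already sorted descending
    return series.where(series.isin(top), other)

s = pd.Series(["A", "A", "A", "B", "B", "C", "D"])
reduced = keep_top_n(s, n=2)
print(reduced.tolist())
```

Note that missing values are not in the top-N index, so they also end up as `other` here.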

Using split > len(): fill the missing values in "Violations" with 0, then convert each "|"-separated list of violations into a count.

# Assign the result back instead of using inplace=True on a selected column
df["Violations"] = df["Violations"].fillna(0)
df["Violations"] = [0 if i == 0 else len(i.split("| ")) for i in df["Violations"]]
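
A vectorized variant of the same count, sketched on made-up strings. It assumes violations are separated by `|` and that missing rows should count as 0.

```python
import pandas as pd

# Hypothetical sample values; the real column content may look different.
violations = pd.Series(["1. A | 2. B | 3. C", None, "5. D"])

# split -> list per row, len -> items per list, missing rows -> 0
counts = violations.str.split("|").str.len().fillna(0).astype(int)
print(counts.tolist())
```

This avoids the `i == 0` special case because missing rows stay missing until `fillna(0)` at the end.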

Convert every name matching MCDONALD, MC DONALD'S, or MC DONALDS to MCDONALDS

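df["Name"] = df["DBA Name"].str.upper()
# Collect every distinct name that matches one of the patterns (na=False skips missing names)
macdonald = list(set(df["Name"][df["Name"].str.contains("MCDONALD|MC DONALD'S|MC DONALDS", na=False)].values))
df.replace(macdonald, 'MCDONALDS', inplace=True)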

str.contains()
str.contains() with multiple patterns
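
A small sketch of matching several patterns at once with `str.contains`, where `|` in the regex means "or". The sample names are made up. `Series.mask` is used here as one possible way to overwrite the matching rows.

```python
import pandas as pd

# Hypothetical names, upper-cased first as in the notes.
names = pd.Series(["McDonald's", "MC DONALDS #12", "Burger King", None]).str.upper()

pattern = "MCDONALD|MC DONALD'S|MC DONALDS"
mask = names.str.contains(pattern, na=False)   # na=False: missing names never match
normalized = names.mask(mask, "MCDONALDS")     # overwrite only the matching rows
print(normalized.tolist())
```

Only a single column is touched here, unlike a frame-wide `df.replace`.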

Extracting year and month

df["Date"] = pd.to_datetime(df["Date"])  # vectorized; no need for .apply
df["year"] = df["Date"].dt.year
df["month"] = df["Date"].dt.month

scale_pos_weight

from xgboost import XGBClassifier
model = XGBClassifier(n_estimators=1000, verbosity=0, n_jobs=-1, scale_pos_weight=ratio)
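
The notes do not show how `ratio` was computed. A typical choice suggested by the XGBoost documentation is the number of negative samples divided by the number of positive samples. A minimal sketch under that assumption, with made-up labels:

```python
from collections import Counter

# Hypothetical binary labels: 0 = negative (majority), 1 = positive (minority).
y_train = [0, 0, 0, 0, 0, 0, 1, 1]

counts = Counter(y_train)
ratio = counts[0] / counts[1]  # negatives / positives
print(ratio)
```

With a pandas target column, the same value could be read from `value_counts()`.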

Reference

More information about this topic (DAY 30, 35) can be found at https://velog.io/@ayi4067/DAY-30-35

This text may be freely shared or copied, as long as the URL of this document is kept as a reference URL.

Collection and Share based on the CC Protocol
