Python Data Analysis in Practice [Chapter 3] 2.17 - Pandas Concatenation and Patching: concat, combine_first [python]
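[Lesson 2.17] Concatenation and patching: concat, combine_first
Concatenation → joining objects along an axis
1. Concatenation: concat
2. Join modes: join, join_axes
3. Overriding column names
4. Patching: combine_first()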
Concatenation → joining objects along an axis
1. Concatenation: concat
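import numpy as np
import pandas as pd

s1 = pd.Series([1,2,3])
s2 = pd.Series([2,3,4])
s3 = pd.Series([1,2,3],index = ['a','c','h'])
s4 = pd.Series([2,3,4],index = ['b','e','d'])
# Default axis=0: row + row, the Series are stacked end to end
print(pd.concat([s1,s2]))
print(pd.concat([s3,s4]).sort_index())
print('-----')
# axis=1: column + column, the result is a DataFrame
# sort=True keeps the union index sorted; pandas 1.0+ no longer sorts it by default
print(pd.concat([s3,s4], axis=1, sort=True))
print('-----')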
----------------------------------------------------------------------
0    1
1    2
2    3
0    2
1    3
2    4
dtype: int64
a    1
b    2
c    2
d    4
e    3
h    3
dtype: int64
-----
     0    1
a  1.0  NaN
b  NaN  2.0
c  2.0  NaN
d  NaN  4.0
e  NaN  3.0
h  3.0  NaN
-----
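The first output above shows the labels 0, 1, 2 appearing twice, because concat keeps each input's own index. As a minimal sketch (the variable names stacked and renumbered are only illustrative), the ignore_index parameter of pd.concat can be used when the old labels carry no meaning:

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])
s2 = pd.Series([2, 3, 4])

# Default behaviour: the original index labels are kept, so 0, 1, 2 repeat.
stacked = pd.concat([s1, s2])
print(stacked.index.tolist())      # [0, 1, 2, 0, 1, 2]

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([s1, s2], ignore_index=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3, 4, 5]

# A label lookup on the un-renumbered result returns every matching row.
print(stacked.loc[0].tolist())     # [1, 2]
```

Whether to renumber depends on the data: if the labels identify records (as with s3 and s4), keeping them is usually the safer choice.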
2. Join modes: join, join_axes
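s5 = pd.Series([1,2,3],index = ['a','b','c'])
s6 = pd.Series([2,3,4],index = ['b','c','d'])
print(pd.concat([s5,s6], axis= 1))
print(pd.concat([s5,s6], axis= 1, join='inner'))
# join_axes was deprecated in pandas 0.25 and removed in 1.0; reindex the result instead
print(pd.concat([s5,s6], axis= 1).reindex(['a','b','d']))
# join: {'inner','outer'}, default 'outer'. Controls how the index on the other axis is handled: 'outer' takes the union, 'inner' takes the intersection.
# join_axes: specified the exact index to use for the result (older pandas versions only)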
----------------------------------------------------------------------
     0    1
a  1.0  NaN
b  2.0  2.0
c  3.0  3.0
d  NaN  4.0
   0  1
b  2  2
c  3  3
     0    1
a  1.0  NaN
b  2.0  2.0
d  NaN  4.0
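To make the union and intersection behaviour concrete, the following sketch compares the result indexes against Index.union and Index.intersection, and picks an explicit set of labels with reindex. The names outer, inner and picked are illustrative only.

```python
import pandas as pd

s5 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s6 = pd.Series([2, 3, 4], index=['b', 'c', 'd'])

outer = pd.concat([s5, s6], axis=1)                 # join='outer' is the default
inner = pd.concat([s5, s6], axis=1, join='inner')

# The outer join keeps every label that appears in either input.
print(sorted(outer.index) == sorted(s5.index.union(s6.index)))          # True
# The inner join keeps only the labels shared by both inputs.
print(sorted(inner.index) == sorted(s5.index.intersection(s6.index)))   # True

# Choosing an explicit set of row labels on the joined result.
picked = outer.reindex(['a', 'b', 'd'])
print(picked.index.tolist())   # ['a', 'b', 'd']
```

Labels passed to reindex that are missing from the joined result would simply produce rows filled with NaN.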
3. Overriding column names
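sre = pd.concat([s5,s6], keys = ['one','two'])
print(sre,type(sre))
print(sre.index)
print('-----')
# keys: sequence, default None. Builds a hierarchical index using the passed keys as the outermost level.
sre = pd.concat([s5,s6], axis=1, keys = ['one','two'])
print(sre,type(sre))
# With axis=1, the keys override the column names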
----------------------------------------------------------------------
one  a    1
     b    2
     c    3
two  b    2
     c    3
     d    4
dtype: int64 <class 'pandas.core.series.Series'>
MultiIndex(levels=[['one', 'two'], ['a', 'b', 'c', 'd']],
           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 1, 2, 3]])
-----
   one  two
a  1.0  NaN
b  2.0  2.0
c  3.0  3.0
d  NaN  4.0 <class 'pandas.core.frame.DataFrame'>
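Once keys has built the hierarchical index, each original piece can be recovered by its key. The sketch below is illustrative rather than part of the lesson: the level names 'source' and 'label' passed through the names parameter are made up for the example.

```python
import pandas as pd

s5 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s6 = pd.Series([2, 3, 4], index=['b', 'c', 'd'])

# names= is optional and only labels the two index levels.
sre = pd.concat([s5, s6], keys=['one', 'two'], names=['source', 'label'])

# Selecting on the outer level returns the piece that came from s6.
print(sre.loc['two'].tolist())    # [2, 3, 4]
# A (key, label) tuple addresses a single value.
print(sre.loc[('one', 'b')])      # 2
# unstack moves the outer level into columns, similar to the axis=1 result.
wide = sre.unstack(level=0)
print(wide.columns.tolist())      # ['one', 'two']
```

The unstacked frame has the same shape as the axis=1 result shown above, with NaN where a label is missing from one of the inputs.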
4. Patching: combine_first()
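df1 = pd.DataFrame([[np.nan, 3., 5.], [-4.6, np.nan, np.nan],[np.nan, 7., np.nan]])
df2 = pd.DataFrame([[-42.6, np.nan, -8.2], [-5., 1.6, 4]],index=[1, 2])
print(df1)
print(df2)
print(df1.combine_first(df2))
print('-----')
# combine_first: aligned on index, NaN values in df1 are filled with the values from df2
# If df2 has index labels that df1 lacks, those rows are added to the result, e.g. index=['a',1]
df1.update(df2)
print(df1)
# update: df2's non-NaN values overwrite df1 in place at matching index positions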
----------------------------------------------------------------------
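The difference between the two methods is which frame takes priority. As a minimal sketch on the same data (the names patched and updated are illustrative, and a copy is used so that df1 itself is not modified):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[np.nan, 3., 5.], [-4.6, np.nan, np.nan], [np.nan, 7., np.nan]])
df2 = pd.DataFrame([[-42.6, np.nan, -8.2], [-5., 1.6, 4]], index=[1, 2])

# combine_first returns a new frame: df1 wins wherever it has a value.
patched = df1.combine_first(df2)
print(patched.loc[1, 0])    # -4.6 (df1's value is kept)
print(patched.loc[2, 0])    # -5.0 (df1 had NaN, so df2 fills it)

# update modifies its caller in place: df2 wins wherever it has a value.
updated = df1.copy()        # copy so df1 itself stays unchanged
updated.update(df2)
print(updated.loc[1, 0])    # -42.6 (overwritten by df2)
print(np.isnan(updated.loc[1, 1]))  # True (both frames have NaN there)
```

In short, combine_first fills only the gaps in the caller, while update overwrites the caller with every non-NaN value from the other frame.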