Python Pandas Basics


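Pandas is a Python data analysis package created to solve data analysis tasks.

Pandas incorporates a large number of libraries and standard data models, and provides the tools needed to operate on data sets efficiently.

Pandas offers many functions and methods for processing data quickly and easily.

Pandas is dictionary-like in form and built on NumPy, which makes NumPy-centric applications simpler.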

Installing Pandas

pip3 install pandas

Importing Pandas

import pandas as pd  # import pandas under the alias pd

Data structures

Series

DataFrame

Series

import numpy as np
import pandas as pd
s=pd.Series([1,2,3,np.nan,5,6])
print(s)  # the NaN entry makes the dtype float64

0    1.0
1    2.0
2    3.0
3    NaN
4    5.0
5    6.0
dtype: float64
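
A minimal sketch of how a Series pairs each value with an index label. The name `s_labeled` and the letters used as labels are illustrative assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

# A Series with explicit string labels instead of the default 0..n-1 index
s_labeled = pd.Series([1, 2, 3, np.nan, 5, 6], index=list("abcdef"))

# Label-based and position-based access reach the same element
first_by_label = s_labeled["a"]
first_by_pos = s_labeled.iloc[0]

# Reductions such as sum() skip NaN by default
total = s_labeled.sum()
print(first_by_label, first_by_pos, total)
```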

DataFrame

dates = pd.date_range('20180310', periods=6)
# 6 rows x 4 columns of random values, indexed by date
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=['A', 'B', 'C', 'D'])
print(df)       # the full 6x4 table
print(df['B'])  # a single column
print("----------------\n----------------")

# Build a DataFrame from a dict of column values
df_1 = pd.DataFrame({'A': 1.,
                     'B': pd.Timestamp('20180310'),
                     'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                     'D': np.array([3] * 4, dtype='int32'),
                     'E': pd.Categorical(["test", "train", "test", "train"]),
                     'F': 'foo'
                     })
print(df_1)
print(df_1.dtypes)
print(df_1.index)    # row labels
# Int64Index([0, 1, 2, 3], dtype='int64')
print(df_1.columns)  # column labels
# Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')
print("----------------\n----------------")

print(df_1.values)      # the underlying values as a NumPy array
print(df_1.describe())  # summary statistics
print(df_1.T)           # transpose
print("----------------\n----------------")

# axis=1 sorts by column label (A..F); ascending=False gives descending order
print(df_1.sort_index(axis=1, ascending=False))
print(df_1.sort_values(by='E'))  # sort the rows by the values in column E

                   A         B         C         D
2018-03-10  0.872767  2.188739  0.766781 -0.001429
2018-03-11  0.218740 -0.556263 -0.047700  0.470347
2018-03-12 -0.816785  0.479690  1.722349  1.116260
2018-03-13  0.988138 -0.025760 -0.971384 -0.558211
2018-03-14 -0.581776  1.021027 -1.280569  1.022587
2018-03-15  0.061455 -1.647589 -1.568288 -0.467407
2018-03-10    2.188739
2018-03-11   -0.556263
2018-03-12    0.479690
2018-03-13   -0.025760
2018-03-14    1.021027
2018-03-15   -1.647589
Freq: D, Name: B, dtype: float64
----------------
----------------
     A          B    C  D      E    F
0  1.0 2018-03-10  1.0  3   test  foo
1  1.0 2018-03-10  1.0  3  train  foo
2  1.0 2018-03-10  1.0  3   test  foo
3  1.0 2018-03-10  1.0  3  train  foo
A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object
Int64Index([0, 1, 2, 3], dtype='int64')
Index(['A', 'B', 'C', 'D', 'E', 'F'], dtype='object')
----------------
----------------
[[1.0 Timestamp('2018-03-10 00:00:00') 1.0 3 'test' 'foo']
 [1.0 Timestamp('2018-03-10 00:00:00') 1.0 3 'train' 'foo']
 [1.0 Timestamp('2018-03-10 00:00:00') 1.0 3 'test' 'foo']
 [1.0 Timestamp('2018-03-10 00:00:00') 1.0 3 'train' 'foo']]
         A    C    D
count  4.0  4.0  4.0
mean   1.0  1.0  3.0
std    0.0  0.0  0.0
min    1.0  1.0  3.0
25%    1.0  1.0  3.0
50%    1.0  1.0  3.0
75%    1.0  1.0  3.0
max    1.0  1.0  3.0
                     0                    1                    2  \
A                    1                    1                    1   
B  2018-03-10 00:00:00  2018-03-10 00:00:00  2018-03-10 00:00:00   
C                    1                    1                    1   
D                    3                    3                    3   
E                 test                train                 test   
F                  foo                  foo                  foo   

                     3  
A                    1  
B  2018-03-10 00:00:00  
C                    1  
D                    3  
E                train  
F                  foo  
----------------
----------------
     F      E  D    C          B    A
0  foo   test  3  1.0 2018-03-10  1.0
1  foo  train  3  1.0 2018-03-10  1.0
2  foo   test  3  1.0 2018-03-10  1.0
3  foo  train  3  1.0 2018-03-10  1.0
     A          B    C  D      E    F
0  1.0 2018-03-10  1.0  3   test  foo
2  1.0 2018-03-10  1.0  3   test  foo
1  1.0 2018-03-10  1.0  3  train  foo
3  1.0 2018-03-10  1.0  3  train  foo
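
The describe() output above lists only columns A, C and D. As a rough sketch, select_dtypes can show which columns pandas treats as numeric; the variable name `numeric_cols` is an assumption made for illustration, and which columns describe() summarizes by default may differ between pandas versions.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame({
    "A": 1.0,
    "B": pd.Timestamp("20180310"),
    "C": pd.Series(1, index=list(range(4)), dtype="float32"),
    "D": np.array([3] * 4, dtype="int32"),
    "E": pd.Categorical(["test", "train", "test", "train"]),
    "F": "foo",
})

# Keep only the columns with a numeric dtype (float64, float32, int32 here)
numeric_cols = df_1.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```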

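Selecting data with Pandas

Selecting data from specific columns

Selecting data from specific rows

Selecting data from specific rows and columns

Selecting data by row number with iloc

Filtering by condition

Multiple indexing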

df = pd.DataFrame(np.random.rand(16).reshape(4,4)*100,
                   index = ['one','two','three','four'],
                   columns = ['a','b','c','d'])
df

               a          b          c          d
one    73.506341  75.662735  74.675325   7.697207
two    73.055825  83.222481   4.777599  82.534340
three  89.156683  85.001712  47.443443  73.379189
four   95.648043  64.162408  26.731916  73.839172

Selecting data from specific columns

# Single column
print(df["a"])
print("----------------\n----------------")
# Multiple columns
print(df[["a", "b"]])
print("----------------\n----------------")
# Slice of columns (both end labels are included)
print(df.loc[:, "b":"d"])

one      73.506341
two      73.055825
three    89.156683
four     95.648043
Name: a, dtype: float64
----------------
----------------
               a          b
one    73.506341  75.662735
two    73.055825  83.222481
three  89.156683  85.001712
four   95.648043  64.162408
----------------
----------------
               b          c          d
one    75.662735  74.675325   7.697207
two    83.222481   4.777599  82.534340
three  85.001712  47.443443  73.379189
four   64.162408  26.731916  73.839172

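Selecting data from specific rows

# Single row
print(df.loc["one"])
print("----------------\n----------------")
# Multiple rows
print(df.loc[["one", "two"]])
print("----------------\n----------------")
# Slice of rows: by position (end excluded) and by label (end included)
print(df[0:3])
print(df['one':'three'])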

a    73.506341
b    75.662735
c    74.675325
d     7.697207
Name: one, dtype: float64
----------------
----------------
             a          b          c          d
one  73.506341  75.662735  74.675325   7.697207
two  73.055825  83.222481   4.777599  82.534340
----------------
----------------
               a          b          c          d
one    73.506341  75.662735  74.675325   7.697207
two    73.055825  83.222481   4.777599  82.534340
three  89.156683  85.001712  47.443443  73.379189
               a          b          c          d
one    73.506341  75.662735  74.675325   7.697207
two    73.055825  83.222481   4.777599  82.534340
three  89.156683  85.001712  47.443443  73.379189

Selecting data from specific rows and columns

# Single row and single column
print(df.loc["one", "a"])
print("----------------\n----------------")
# Specific rows and specific columns
print(df.loc['one', ['a', 'c']])
print(df.loc[['one', 'three'], ["a", "b", "c"]])
print("----------------\n----------------")
# Slice of rows and slice of columns
print(df.loc["one":"three", "b":"c"])

73.50634055308014
----------------
----------------
a    73.506341
c    74.675325
Name: one, dtype: float64
               a          b          c
one    73.506341  75.662735  74.675325
three  89.156683  85.001712  47.443443
----------------
----------------
               b          c
one    75.662735  74.675325
two    83.222481   4.777599
three  85.001712  47.443443

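Selecting data by row number with iloc

# Single row
print(df.iloc[0])
print("----------------\n----------------")
# Multiple rows
print(df.iloc[[0, 3]])
print("----------------\n----------------")
# Slice of rows
print(df.iloc[1:3])
print("----------------\n----------------")
# Single row and single column
print(df.iloc[3, 1])  # row 3, column 1 (both zero-based)
print("----------------\n----------------")
# Specific rows and specific columns
print(df.iloc[[1, 2, 3], [0, 2]])  # rows 1, 2, 3 and columns 0, 2
print("----------------\n----------------")
# Slices of rows and columns
print(df.iloc[2:4, 0:2])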

a    73.506341
b    75.662735
c    74.675325
d     7.697207
Name: one, dtype: float64
----------------
----------------
              a          b          c          d
one   73.506341  75.662735  74.675325   7.697207
four  95.648043  64.162408  26.731916  73.839172
----------------
----------------
               a          b          c          d
two    73.055825  83.222481   4.777599  82.534340
three  89.156683  85.001712  47.443443  73.379189
----------------
----------------
64.1624082303679
----------------
----------------
               a          c
two    73.055825   4.777599
three  89.156683  47.443443
four   95.648043  26.731916
----------------
----------------
               a          b
three  89.156683  85.001712
four   95.648043  64.162408
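
One difference between the two indexers is easy to miss: a label slice with loc includes its end label, while a positional slice with iloc excludes its end position. A small sketch, using an illustrative frame `df_demo` with fixed values instead of the random data above:

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame(np.arange(16).reshape(4, 4),
                       index=["one", "two", "three", "four"],
                       columns=["a", "b", "c", "d"])

by_label = df_demo.loc["one":"three"]  # "three" is included
by_position = df_demo.iloc[0:3]        # position 3 ("four") is excluded

# Both slices select the rows one, two, three
same = by_label.equals(by_position)
print(same)
```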

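Filtering by condition

# Condition on a single column
print(df[df["a"] > 0])  # keep the rows where column a is greater than 0
print("----------------\n----------------")
# Condition on several columns
print(df[df[["a", "b"]] > 0])  # cells outside columns a and b become NaN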

               a          b          c          d
one    73.506341  75.662735  74.675325   7.697207
two    73.055825  83.222481   4.777599  82.534340
three  89.156683  85.001712  47.443443  73.379189
four   95.648043  64.162408  26.731916  73.839172
               a          b   c   d
one    73.506341  75.662735 NaN NaN
two    73.055825  83.222481 NaN NaN
three  89.156683  85.001712 NaN NaN
four   95.648043  64.162408 NaN NaN
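
Conditions can also be combined. The sketch below assumes a small frame `df_demo` with made-up values; each condition needs its own parentheses, and `&` / `|` are used instead of Python's `and` / `or`.

```python
import pandas as pd

df_demo = pd.DataFrame({"a": [10, 60, 80, 95], "b": [75, 20, 85, 64]},
                       index=["one", "two", "three", "four"])

# Rows where both columns exceed 50
both = df_demo[(df_demo["a"] > 50) & (df_demo["b"] > 50)]

# Rows where a is above 90 or b is below 30
either = df_demo[(df_demo["a"] > 90) | (df_demo["b"] < 30)]

print(both)
print(either)
```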

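Multiple indexing

print(df['a'].loc[['one', 'three']])  # column a, rows one and three
print("----------------\n----------------")
print(df[['b', 'c', 'd']].iloc[::2])  # columns b, c, d, every second row (one, three)
print("----------------\n----------------")
print(df[df['a'] < 50].iloc[:2])  # first two rows among those where a is below 50
print("----------------\n----------------")
print(df[df < 50][['a', 'b']])  # columns a and b, with values of 50 or more shown as NaN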

one      73.506341
three    89.156683
Name: a, dtype: float64
----------------
----------------
               b          c          d
one    75.662735  74.675325   7.697207
three  85.001712  47.443443  73.379189
----------------
----------------
Empty DataFrame
Columns: [a, b, c, d]
Index: []
----------------
----------------
        a   b
one   NaN NaN
two   NaN NaN
three NaN NaN
four  NaN NaN
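
Chaining two indexers works for reading, but a row mask and a column list can also be passed to loc in a single step. A sketch with illustrative values (the frame `df_demo` is an assumption):

```python
import pandas as pd

df_demo = pd.DataFrame({"a": [73, 40, 89, 95],
                        "b": [75, 83, 85, 64],
                        "c": [74, 4, 47, 26]},
                       index=["one", "two", "three", "four"])

# Two steps: filter the rows, then pick the columns
chained = df_demo[df_demo["a"] > 50][["b", "c"]]

# One step: row mask and column list in the same loc call
single = df_demo.loc[df_demo["a"] > 50, ["b", "c"]]

print(chained.equals(single))
```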

Setting values with Pandas

dates = pd.date_range('20180310', periods=6)
df = pd.DataFrame(np.arange(24).reshape((6,4)), index=dates, columns=['A', 'B', 'C', 'D'])
print(df)
'''
             A   B   C   D
2018-03-10   0   1   2   3
2018-03-11   4   5   6   7
2018-03-12   8   9  10  11
2018-03-13  12  13  14  15
2018-03-14  16  17  18  19
2018-03-15  20  21  22  23
'''

df.iloc[2, 2] = 999  # set a value by position
df.loc['2018-03-13', 'D'] = 999  # set a value by label
print(df)

             A   B   C   D
2018-03-10   0   1   2   3
2018-03-11   4   5   6   7
2018-03-12   8   9  10  11
2018-03-13  12  13  14  15
2018-03-14  16  17  18  19
2018-03-15  20  21  22  23
             A   B    C    D
2018-03-10   0   1    2    3
2018-03-11   4   5    6    7
2018-03-12   8   9  999   11
2018-03-13  12  13   14  999
2018-03-14  16  17   18   19
2018-03-15  20  21   22   23

df[df.A > 10] = 999  # overwrite every row where df.A is greater than 10
print(df)

              A    B    C    D
2018-03-10    0    1    2    3
2018-03-11    4    5    6    7
2018-03-12    8    9  999   11
2018-03-13  999  999  999  999
2018-03-14  999  999  999  999
2018-03-15  999  999  999  999

df['F']=np.nan
print(df)

              A    B    C    D   F
2018-03-10    0    1    2    3 NaN
2018-03-11    4    5    6    7 NaN
2018-03-12    8    9  999   11 NaN
2018-03-13  999  999  999  999 NaN
2018-03-14  999  999  999  999 NaN
2018-03-15  999  999  999  999 NaN

df['E'] = pd.Series([1, 2, 3, 4, 5, 6], index=pd.date_range('20180310', periods=6))  # add a column, aligned on the index
print(df)

             A   B    C    D  E
2018-03-10   0   1    2    3  1
2018-03-11   4   5    6    7  2
2018-03-12   8   9  999   11  3
2018-03-13  12  13   14  999  4
2018-03-14  16  17   18   19  5
2018-03-15  20  21   22   23  6
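
Assigning through a boolean mask alone overwrites whole rows. To change only one column in the matching rows, the mask and the column label can be passed to loc together. A sketch on a fresh frame (`df_demo` is an illustrative name):

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20180310", periods=6)
df_demo = pd.DataFrame(np.arange(24).reshape((6, 4)),
                       index=dates, columns=["A", "B", "C", "D"])

# Only column B changes, and only in the rows where A is greater than 10
df_demo.loc[df_demo["A"] > 10, "B"] = 0
print(df_demo)
```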

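Handling missing data with Pandas

Handling NaN values in the data

Removing rows or columns that contain NaN with dropna()

Replacing NaN values with fillna()

Checking whether data is missing with isnull()

Handling NaN values in the data

dates = pd.date_range('20180310', periods=6)
# Float data, so NaN can be stored without changing the column dtype
df = pd.DataFrame(np.arange(24, dtype=float).reshape((6,4)), index=dates, columns=['A', 'B', 'C', 'D'])
df.iloc[0,1]=np.nan
df.iloc[1]=np.nan
print(df)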

               A     B     C     D
2018-03-10   0.0   NaN   2.0   3.0
2018-03-11   NaN   NaN   NaN   NaN
2018-03-12   8.0   9.0  10.0  11.0
2018-03-13  12.0  13.0  14.0  15.0
2018-03-14  16.0  17.0  18.0  19.0
2018-03-15  20.0  21.0  22.0  23.0

Removing rows or columns that contain NaN with dropna()

# axis=0 drops rows, axis=1 drops columns
# how='any': drop if at least one value is NaN
# how='all': drop only if every value is NaN
print(df.dropna(axis=0,how='any'))
print(df.dropna(axis=0,how='all'))

               A     B     C     D
2018-03-12   8.0   9.0  10.0  11.0
2018-03-13  12.0  13.0  14.0  15.0
2018-03-14  16.0  17.0  18.0  19.0
2018-03-15  20.0  21.0  22.0  23.0
               A     B     C     D
2018-03-10   0.0   NaN   2.0   3.0
2018-03-12   8.0   9.0  10.0  11.0
2018-03-13  12.0  13.0  14.0  15.0
2018-03-14  16.0  17.0  18.0  19.0
2018-03-15  20.0  21.0  22.0  23.0
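
dropna() also accepts `subset` and `thresh`, which give finer control than `how`. The sketch below rebuilds the same NaN pattern in an illustrative frame `df_demo`; the variable names are assumptions.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20180310", periods=6)
df_demo = pd.DataFrame(np.arange(24, dtype=float).reshape((6, 4)),
                       index=dates, columns=["A", "B", "C", "D"])
df_demo.iloc[0, 1] = np.nan
df_demo.iloc[1] = np.nan

# Drop the rows where column B is NaN; other columns are not inspected
b_present = df_demo.dropna(subset=["B"])

# Keep the rows that have at least 3 non-NaN values
mostly_complete = df_demo.dropna(thresh=3)

# Drop the columns containing any NaN; every column has one, so none remain
complete_cols = df_demo.dropna(axis=1, how="any")

print(len(b_present), len(mostly_complete), complete_cols.shape)
```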

Replacing NaN values with fillna()

print(df.fillna(value=233))  # replace every NaN with 233

                A      B      C      D
2018-03-10    0.0  233.0    2.0    3.0
2018-03-11  233.0  233.0  233.0  233.0
2018-03-12    8.0    9.0   10.0   11.0
2018-03-13   12.0   13.0   14.0   15.0
2018-03-14   16.0   17.0   18.0   19.0
2018-03-15   20.0   21.0   22.0   23.0
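
Besides a single fill value, fillna() accepts a dict of per-column values, and ffill() carries the last valid value forward. A sketch on the same NaN pattern; the frame `df_demo` and the fill values 0.0 and -1.0 are illustrative assumptions.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20180310", periods=6)
df_demo = pd.DataFrame(np.arange(24, dtype=float).reshape((6, 4)),
                       index=dates, columns=["A", "B", "C", "D"])
df_demo.iloc[0, 1] = np.nan
df_demo.iloc[1] = np.nan

# A different replacement per column; columns not listed keep their NaN
filled = df_demo.fillna({"A": 0.0, "B": -1.0})

# Forward fill: each NaN takes the value from the row above, if there is one
ffilled = df_demo.ffill()

print(filled)
print(ffilled)
```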

Checking whether data is missing with isnull()

print(pd.isnull(df))  # True where a value is NaN, False otherwise
print("----------------\n----------------")
print(np.any(df.isnull()))  # True if the data contains at least one NaN
# True

                A      B      C      D
2018-03-10  False   True  False  False
2018-03-11   True   True   True   True
2018-03-12  False  False  False  False
2018-03-13  False  False  False  False
2018-03-14  False  False  False  False
2018-03-15  False  False  False  False
----------------
----------------
True
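
The boolean mask can also be summed to count the missing values per column. A sketch on the same NaN pattern; `isna()` is an alias of `isnull()`, and the names `df_demo`, `nan_per_column` and `total_nan` are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20180310", periods=6)
df_demo = pd.DataFrame(np.arange(24, dtype=float).reshape((6, 4)),
                       index=dates, columns=["A", "B", "C", "D"])
df_demo.iloc[0, 1] = np.nan
df_demo.iloc[1] = np.nan

# True counts as 1, so summing the mask counts the NaN values in each column
nan_per_column = df_demo.isna().sum()
total_nan = int(nan_per_column.sum())

# Reduce twice (columns, then the result) to get a single bool
has_nan = bool(df_demo.isna().any().any())

print(nan_per_column)
print(total_nan, has_nan)
```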

Importing and exporting with Pandas

data = pd.read_csv('test1.csv')  # read a CSV file
data.to_pickle('test2.pickle')  # save the data as a pickle file
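
Since test1.csv is not included here, the following sketch writes its own small file into a temporary directory and reads it back. The file names demo.csv and demo.pickle and the frame contents are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

df_demo = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "demo.csv"
    pkl_path = Path(tmp) / "demo.pickle"

    # index=False keeps the row index out of the CSV file
    df_demo.to_csv(csv_path, index=False)
    from_csv = pd.read_csv(csv_path)

    # A pickle round trip preserves dtypes and the index
    from_csv.to_pickle(pkl_path)
    from_pickle = pd.read_pickle(pkl_path)

print(from_pickle)
```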

Concatenating data with Pandas

axis: concatenation direction

join: concatenation mode

append: appending data

axis: concatenation direction

df1 = pd.DataFrame(np.ones((3,4))*0, columns=['a','b','c','d'])
df2 = pd.DataFrame(np.ones((3,4))*1, columns=['a','b','c','d'])
df3 = pd.DataFrame(np.ones((3,4))*2, columns=['a','b','c','d'])
# axis=0 stacks rows, axis=1 stacks columns
# ignore_index=True discards the original index and renumbers the rows 0..8
res = pd.concat([df1, df2, df3], axis=0, ignore_index=True)
print(res)

     a    b    c    d
0  0.0  0.0  0.0  0.0
1  0.0  0.0  0.0  0.0
2  0.0  0.0  0.0  0.0
3  1.0  1.0  1.0  1.0
4  1.0  1.0  1.0  1.0
5  1.0  1.0  1.0  1.0
6  2.0  2.0  2.0  2.0
7  2.0  2.0  2.0  2.0
8  2.0  2.0  2.0  2.0
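
For comparison, here is a sketch of what happens without ignore_index and with axis=1. The frames `df_a` and `df_b` are illustrative.

```python
import numpy as np
import pandas as pd

df_a = pd.DataFrame(np.zeros((3, 2)), columns=["a", "b"])
df_b = pd.DataFrame(np.ones((3, 2)), columns=["a", "b"])

# Without ignore_index the original row labels are kept, so they repeat
kept = pd.concat([df_a, df_b], axis=0)

# axis=1 places the frames side by side, aligned on the row index
side_by_side = pd.concat([df_a, df_b], axis=1)

print(kept.index.tolist())
print(side_by_side.shape)
```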

join: concatenation mode

df1 = pd.DataFrame(np.ones((3,4))*0, columns=['a','b','c','d'], index=[1,2,3])
df2 = pd.DataFrame(np.ones((3,4))*1, columns=['b','c','d', 'e'], index=[2,3,4])
print(df1)
print(df2)
print("----------------\n----------------")

# Union of the indexes: join='outer' (missing cells become NaN)
res=pd.concat([df1,df2],axis=1,join='outer')
print(res)
print("----------------\n----------------")

# Intersection of the indexes: join='inner'
res=pd.concat([df1,df2],axis=1,join='inner')
print(res)
print("----------------\n----------------")

# Keep df1's index; rows missing from df2 are filled with NaN.
# join_axes was removed in pandas 1.0, so df2 is reindexed to df1's index instead.
res=pd.concat([df1,df2.reindex(df1.index)],axis=1)
print(res)

     a    b    c    d
1  0.0  0.0  0.0  0.0
2  0.0  0.0  0.0  0.0
3  0.0  0.0  0.0  0.0
     b    c    d    e
2  1.0  1.0  1.0  1.0
3  1.0  1.0  1.0  1.0
4  1.0  1.0  1.0  1.0
----------------
----------------
     a    b    c    d    b    c    d    e
1  0.0  0.0  0.0  0.0  NaN  NaN  NaN  NaN
2  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0
3  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0
4  NaN  NaN  NaN  NaN  1.0  1.0  1.0  1.0
----------------
----------------
     a    b    c    d    b    c    d    e
2  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0
3  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0
----------------
----------------
     a    b    c    d    b    c    d    e
1  0.0  0.0  0.0  0.0  NaN  NaN  NaN  NaN
2  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0
3  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0

append: appending data

df1 = pd.DataFrame(np.ones((3,4))*0, columns=['a','b','c','d'])
df2 = pd.DataFrame(np.ones((3,4))*1, columns=['a','b','c','d'])
s1 = pd.Series([1,2,3,4], index=['a','b','c','d'])
print(s1)
print("----------------\n----------------")

# Append df2 to df1 and renumber the index.
# DataFrame.append was removed in pandas 2.0; pd.concat is its replacement.
res=pd.concat([df1,df2],ignore_index=True)
print(res)
print("----------------\n----------------")

# Append s1 to df1 as a new row and renumber the index
res=pd.concat([df1,s1.to_frame().T],ignore_index=True)
print(res)

a    1
b    2
c    3
d    4
dtype: int64
----------------
----------------
     a    b    c    d
0  0.0  0.0  0.0  0.0
1  0.0  0.0  0.0  0.0
2  0.0  0.0  0.0  0.0
3  1.0  1.0  1.0  1.0
4  1.0  1.0  1.0  1.0
5  1.0  1.0  1.0  1.0
----------------
----------------
     a    b    c    d
0  0.0  0.0  0.0  0.0
1  0.0  0.0  0.0  0.0
2  0.0  0.0  0.0  0.0
3  1.0  2.0  3.0  4.0
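
When many rows are added in a loop, it is usually cheaper to collect them in a plain list and build the frame once than to concatenate on every iteration. A minimal sketch with made-up rows (`rows` and `df_rows` are illustrative names):

```python
import pandas as pd

# Collect each row as a dict; list.append is cheap
rows = []
for i in range(3):
    rows.append({"a": i, "b": i * 2})

# Build the DataFrame once, after the loop
df_rows = pd.DataFrame(rows)
print(df_rows)
```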

Merging with Pandas

Merging on one key

Merging on two keys

Merging with indicator

Merging on the index

Merging on one key

left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})
print(left)
print("----------------\n----------------")

right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                      'C': ['C0', 'C1', 'C2',  'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3']})
print(right)
print("----------------\n----------------")

# Join the two frames on the shared 'key' column
res=pd.merge(left,right,on='key')
print(res)

  key   A   B
0  K0  A0  B0
1  K1  A1  B1
2  K2  A2  B2
3  K3  A3  B3
----------------
----------------
  key   C   D
0  K0  C0  D0
1  K1  C1  D1
2  K2  C2  D2
3  K3  C3  D3
----------------
----------------
  key   A   B   C   D
0  K0  A0  B0  C0  D0
1  K1  A1  B1  C1  D1
2  K2  A2  B2  C2  D2
3  K3  A3  B3  C3  D3
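
If the key column has a different name in each frame, `left_on` and `right_on` can be used in place of `on`. In this sketch the column name `id` is an assumption made for illustration.

```python
import pandas as pd

left = pd.DataFrame({"key": ["K0", "K1", "K2"],
                     "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"id": ["K0", "K1", "K2"],
                      "C": ["C0", "C1", "C2"]})

# Match left['key'] against right['id']; both key columns stay in the result
res = pd.merge(left, right, left_on="key", right_on="id")
print(res)
```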

Merging on two keys

left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
                     'key2': ['K0', 'K1', 'K0', 'K1'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})
print(left)
print("----------------\n----------------")

right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
                      'key2': ['K0', 'K0', 'K0', 'K0'],
                      'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3']})
print(right)
print("----------------\n----------------")

# Inner join: only key pairs present in both frames
res=pd.merge(left,right,on=['key1','key2'],how='inner')
print(res)
print("----------------\n----------------")

# Outer join: key pairs from either frame
res=pd.merge(left,right,on=['key1','key2'],how='outer')
print(res)
print("----------------\n----------------")

# Left join: every key pair from left
res=pd.merge(left,right,on=['key1','key2'],how='left')
print(res)
print("----------------\n----------------")

# Right join: every key pair from right
res=pd.merge(left,right,on=['key1','key2'],how='right')
print(res)

  key1 key2   A   B
0   K0   K0  A0  B0
1   K0   K1  A1  B1
2   K1   K0  A2  B2
3   K2   K1  A3  B3
----------------
----------------
  key1 key2   C   D
0   K0   K0  C0  D0
1   K1   K0  C1  D1
2   K1   K0  C2  D2
3   K2   K0  C3  D3
----------------
----------------
  key1 key2   A   B   C   D
0   K0   K0  A0  B0  C0  D0
1   K1   K0  A2  B2  C1  D1
2   K1   K0  A2  B2  C2  D2
----------------
----------------
  key1 key2    A    B    C    D
0   K0   K0   A0   B0   C0   D0
1   K0   K1   A1   B1  NaN  NaN
2   K1   K0   A2   B2   C1   D1
3   K1   K0   A2   B2   C2   D2
4   K2   K1   A3   B3  NaN  NaN
5   K2   K0  NaN  NaN   C3   D3
----------------
----------------
  key1 key2   A   B    C    D
0   K0   K0  A0  B0   C0   D0
1   K0   K1  A1  B1  NaN  NaN
2   K1   K0  A2  B2   C1   D1
3   K1   K0  A2  B2   C2   D2
4   K2   K1  A3  B3  NaN  NaN
----------------
----------------
  key1 key2    A    B   C   D
0   K0   K0   A0   B0  C0  D0
1   K1   K0   A2   B2  C1  D1
2   K1   K0   A2   B2  C2  D2
3   K2   K0  NaN  NaN  C3  D3
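
The pair (K1, K0) appears twice in right, which is why the row A2/B2 from left is repeated in the results above. As a sketch, the `validate` argument of merge can check such assumptions about key uniqueness; the try/except wrapper and the names `one_to_one_ok` and `one_to_many` are illustrative.

```python
import pandas as pd

left = pd.DataFrame({"key1": ["K0", "K0", "K1", "K2"],
                     "key2": ["K0", "K1", "K0", "K1"],
                     "A": ["A0", "A1", "A2", "A3"]})
right = pd.DataFrame({"key1": ["K0", "K1", "K1", "K2"],
                      "key2": ["K0", "K0", "K0", "K0"],
                      "C": ["C0", "C1", "C2", "C3"]})

# one_to_one requires unique keys on both sides; right has a duplicate pair
try:
    pd.merge(left, right, on=["key1", "key2"], validate="one_to_one")
    one_to_one_ok = True
except pd.errors.MergeError:
    one_to_one_ok = False

# one_to_many only requires unique keys on the left side
one_to_many = pd.merge(left, right, on=["key1", "key2"], validate="one_to_many")
print(one_to_one_ok, len(one_to_many))
```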

Merging with indicator

df1 = pd.DataFrame({'col1':[0,1], 'col_left':['a','b']})
print(df1)

df2 = pd.DataFrame({'col1':[1,2,2],'col_right':[2,2,2]})
print(df2)
print("----------------\n----------------")

# Merge on col1; indicator=True adds a _merge column showing where each row came from
res=pd.merge(df1,df2,on='col1',how='outer',indicator=True)
print(res)
print("----------------\n----------------")

# Give the indicator column a custom name
res = pd.merge(df1, df2, on='col1', how='outer', indicator='indicator_column')
print(res)

   col1 col_left
0     0        a
1     1        b
   col1  col_right
0     1          2
1     2          2
2     2          2
----------------
----------------
   col1 col_left  col_right      _merge
0     0        a        NaN   left_only
1     1        b        2.0        both
2     2      NaN        2.0  right_only
3     2      NaN        2.0  right_only
----------------
----------------
   col1 col_left  col_right indicator_column
0     0        a        NaN        left_only
1     1        b        2.0             both
2     2      NaN        2.0       right_only
3     2      NaN        2.0       right_only
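
One common use of the indicator column is to keep only the rows that found no match on the other side. A sketch based on the same two frames; the name `left_only` is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"col1": [0, 1], "col_left": ["a", "b"]})
df2 = pd.DataFrame({"col1": [1, 2, 2], "col_right": [2, 2, 2]})

res = pd.merge(df1, df2, on="col1", how="outer", indicator=True)

# Keep the rows that exist only in df1, then drop the helper column
left_only = res[res["_merge"] == "left_only"].drop(columns="_merge")
print(left_only)
```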

Merging on the index

left = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                     'B': ['B0', 'B1', 'B2']},
                    index=['K0', 'K1', 'K2'])
print(left)

right = pd.DataFrame({'C': ['C0', 'C2', 'C3'],
                      'D': ['D0', 'D2', 'D3']},
                     index=['K0', 'K2', 'K3'])
print(right)
print("----------------\n----------------")

# Merge on the index of both frames instead of a key column
res=pd.merge(left,right,left_index=True,right_index=True,how='outer')
print(res)
print("----------------\n----------------")

res=pd.merge(left,right,left_index=True,right_index=True,how='inner')
print(res)

     A   B
K0  A0  B0
K1  A1  B1
K2  A2  B2
     C   D
K0  C0  D0
K2  C2  D2
K3  C3  D3
----------------
----------------
      A    B    C    D
K0   A0   B0   C0   D0
K1   A1   B1  NaN  NaN
K2   A2   B2   C2   D2
K3  NaN  NaN   C3   D3
----------------
----------------
     A   B   C   D
K0  A0  B0  C0  D0
K2  A2  B2  C2  D2
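
For index-on-index merges, DataFrame.join is a shorter spelling of the same operation. A sketch using the same two frames (`joined` and `merged` are illustrative names):

```python
import pandas as pd

left = pd.DataFrame({"A": ["A0", "A1", "A2"],
                     "B": ["B0", "B1", "B2"]},
                    index=["K0", "K1", "K2"])
right = pd.DataFrame({"C": ["C0", "C2", "C3"],
                      "D": ["D0", "D2", "D3"]},
                     index=["K0", "K2", "K3"])

# join() aligns on the index by default; how='outer' keeps every label
joined = left.join(right, how="outer")
merged = pd.merge(left, right, left_index=True, right_index=True, how="outer")

print(joined)
print(joined.equals(merged))
```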
