Python notes: grouping data with groupby()


import pandas as pd
df = pd.DataFrame({'key1':list('aabba'),
                  'key2': ['one','two','one','two','one'],
                  'data1': ['1','3','5','7','9'],
                  'data2': ['2','4','6','8','10']})
print(df)

grouped = df.groupby(['key1']).count()
print(grouped)

Output:
  data1 data2 key1 key2
0     1     2    a  one
1     3     4    a  two
2     5     6    b  one
3     7     8    b  two
4     9    10    a  one
      data1  data2  key2
key1                    
a         3      3     3
b         2      2     2
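
Every column shows the same number above because count() tallies the non-null values in each column per group, and this frame has no missing values. The sketch below uses hypothetical data (one value replaced with NaN, which is not in the original example) to show how count() and size() can differ:

```python
import numpy as np
import pandas as pd

# Hypothetical data: same key1 values as above, but one data1 value is missing.
df = pd.DataFrame({'key1': list('aabba'),
                   'data1': [1, np.nan, 5, 7, 9]})

counts = df.groupby('key1')['data1'].count()  # non-null values per group
sizes = df.groupby('key1').size()             # all rows per group, NaN included

print(counts)
print(sizes)
```

If the goal is simply "how many rows does each key have", size() is usually the safer choice, since it does not depend on which column is inspected.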

Set the as_index parameter of groupby to False. Input:
import pandas as pd
df = pd.DataFrame({'key1':list('aabba'),
                  'key2': ['one','two','one','two','one'],
                  'data1': ['1','3','5','7','9'],
                  'data2': ['2','4','6','8','10']})
print(df)

grouped = df.groupby(['key1'],as_index = False).count()
print(grouped)

Output (note the difference in where the labels appear):
  data1 data2 key1 key2
0     1     2    a  one
1     3     4    a  two
2     5     6    b  one
3     7     8    b  two
4     9    10    a  one
  key1  data1  data2  key2
0    a      3      3     3
1    b      2      2     2
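
As far as I can tell, as_index=False gives the same table as grouping normally and then calling reset_index() on the result. A minimal sketch to check this with the same data (the variable names with_flag and with_reset are just for illustration):

```python
import pandas as pd

df = pd.DataFrame({'key1': list('aabba'),
                   'key2': ['one', 'two', 'one', 'two', 'one'],
                   'data1': ['1', '3', '5', '7', '9'],
                   'data2': ['2', '4', '6', '8', '10']})

# key1 stays an ordinary column; the rows get a default 0..n-1 index.
with_flag = df.groupby('key1', as_index=False).count()

# key1 becomes the index first, then reset_index() moves it back to a column.
with_reset = df.groupby('key1').count().reset_index()

print(with_flag)
print(with_reset)
```

Either form leaves key1 available as a regular column, which is convenient when the result is later merged or written to a file.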

Code for the final count:
import pandas as pd
df = pd.DataFrame({'key1':list('aabba'),
                  'key2': ['one','two','one','two','one'],
                  'data1': ['1','3','5','7','9'],
                  'data2': ['2','4','6','8','10']})
print(df)

grouped = df.groupby(['key1'],as_index = False).count()
print(grouped)
group2 = grouped[['key1', 'data1']]  # keep only the key column and one count column
group2.columns = ['key1', 'count']  # rename group2's column labels
print(group2)

Output:
  data1 data2 key1 key2
0     1     2    a  one
1     3     4    a  two
2     5     6    b  one
3     7     8    b  two
4     9    10    a  one
  key1  data1  data2  key2
0    a      3      3     3
1    b      2      2     2
  key1  count
0    a      3
1    b      2
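
The same two-column table can probably be produced in one step. The sketch below is an alternative to the select-and-rename approach above, not what the original code does: it uses size() and names the resulting column through reset_index(name=...). The variable name tally is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'key1': list('aabba'),
                   'key2': ['one', 'two', 'one', 'two', 'one'],
                   'data1': ['1', '3', '5', '7', '9'],
                   'data2': ['2', '4', '6', '8', '10']})

# size() returns a Series of row counts indexed by key1;
# reset_index(name='count') turns it into a DataFrame with columns key1 and count.
tally = df.groupby('key1').size().reset_index(name='count')
print(tally)
```

Note that size() counts rows, while the original code counts non-null data1 values; the two agree here only because the data has no missing values.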
