Advanced Python: counting word frequencies
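Given an existing list such as
[1, 7, 10, 4, 9, 10, 9, 8, 5, 8]
we want to count how many times each element occurs, ending up with a result of the form {8: 2, 9: 2, ...}, that is, {element: number of occurrences, ...}.
Method 1: First build a dictionary that uses the list's elements as keys, each with an initial value of 0, then loop over the list and add 1 to the count of each element encountered. The session below was recorded under Python 2, hence xrange; under Python 3, use range instead.
>>> from random import randint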
>>> data = [randint(1,10) for x in xrange(10)]
>>> data
[1, 7, 10, 4, 9, 10, 9, 8, 5, 8]
>>> d = dict.fromkeys(data, 0)
>>> d
{1: 0, 4: 0, 5: 0, 7: 0, 8: 0, 9: 0, 10: 0}
>>> for x in data:
...     d[x] += 1
...
>>> d
{1: 1, 4: 1, 5: 1, 7: 1, 8: 2, 9: 2, 10: 2}
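The same fromkeys-based approach can be sketched as a small Python 3 function. The function name count_occurrences and the use of a fixed list (the same values as in the session above, instead of random ones) are choices made here for illustration and are not part of the original session.

```python
def count_occurrences(items):
    """Count how often each element occurs, using a plain dict.

    `items` must be a sequence that can be iterated twice (e.g. a list),
    because it is read once to create the keys and once to count.
    """
    # Start every distinct element at 0, as dict.fromkeys does above.
    counts = dict.fromkeys(items, 0)
    for item in items:
        counts[item] += 1
    return counts


# Same values as the randomly generated list shown above.
data = [1, 7, 10, 4, 9, 10, 9, 8, 5, 8]
result = count_occurrences(data)
print(result)
```

Under Python 3.7 and later the printed dictionary lists keys in first-seen order rather than the order shown in the Python 2 session, but the counts are the same.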
Method 2: Use Counter from the collections module, a simple ready-made counter:
>>> from collections import Counter
>>> c = Counter(data)
>>> c
Counter({8: 2, 9: 2, 10: 2, 1: 1, 4: 1, 5: 1, 7: 1})
>>> isinstance(c, dict)
True
# Counter is a subclass of dict, so a count can be looked up by key
>>> c[1]
1
# most_common(n) returns the n most frequent elements together with their counts
>>> c.most_common(2)
[(8, 2), (9, 2)]
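Applied to actual words rather than integers, the Counter approach might look like the following Python 3 sketch. The sample sentence and the choice to split words with the regular expression \w+ after lowercasing are assumptions made for this example; real text may need a different tokenization rule.

```python
import re
from collections import Counter

# Hypothetical sample text; any string would do.
text = "the quick brown fox jumps over the lazy dog the fox"

# Lowercase the text and extract runs of word characters as words.
words = re.findall(r"\w+", text.lower())

# Counter accepts any iterable of hashable items.
word_counts = Counter(words)

# The two most frequent words with their counts.
top_two = word_counts.most_common(2)
print(top_two)  # [('the', 3), ('fox', 2)]

# A word that never occurred reports a count of 0 instead of raising KeyError.
print(word_counts["cat"])  # 0
```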
Reference documentation (output of help(Counter) under Python 2):
class Counter(__builtin__.dict)
| Dict subclass for counting hashable items. Sometimes called a bag
| or multiset. Elements are stored as dictionary keys and their counts
| are stored as dictionary values.
|
| >>> c = Counter('abcdeabcdabcaba') # count elements from a string
|
| >>> c.most_common(3) # three most common elements
| [('a', 5), ('b', 4), ('c', 3)]
| >>> sorted(c) # list all unique elements
| ['a', 'b', 'c', 'd', 'e']
| >>> ''.join(sorted(c.elements())) # list elements with repetitions
| 'aaaaabbbbcccdde'
| >>> sum(c.values()) # total of all counts
| 15
|
| >>> c['a'] # count of letter 'a'
| 5
| >>> for elem in 'shazam': # update counts from an iterable
| ... c[elem] += 1 # by adding 1 to each element's count
| >>> c['a'] # now there are seven 'a'
| 7
| >>> del c['b'] # remove all 'b'
| >>> c['b'] # now there are zero 'b'
| 0
|
| >>> d = Counter('simsalabim') # make another counter
| >>> c.update(d) # add in the second counter
| >>> c['a'] # now there are nine 'a'
| 9
|
| >>> c.clear() # empty the counter
| >>> c
| Counter()
|
| Note: If a count is set to zero or reduced to zero, it will remain
| in the counter until the entry is deleted or the counter is cleared:
|
| >>> c = Counter('aaabbc')
| >>> c['b'] -= 2 # reduce the count of 'b' by two
| >>> c.most_common() # 'b' is still in, but its count is zero
| [('a', 3), ('c', 1), ('b', 0)]
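The note at the end of the help text means that an entry whose count has dropped to zero has to be removed explicitly. A minimal Python 3 sketch of two ways to do that follows; the variable names still_present and positive_only are illustrative, and the unary plus form assumes Python 3.3 or later.

```python
from collections import Counter

c = Counter('aaabbc')
c['b'] -= 2                # count drops to zero, but the key stays

still_present = 'b' in c   # True: the zero-count entry is kept

# Unary plus returns a new Counter containing only positive counts.
positive_only = +c

# Deleting the key removes the entry from the original counter itself.
del c['b']
```

Unary plus leaves the original counter untouched and is convenient when many entries may have reached zero, while del removes one known key in place.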