A Quick Introductory NumPy Tutorial

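NumPy is a very fast numerical library for Python that is built around arrays. It lets you do vector and matrix computations in Python, and because many of its low-level routines are actually written in C, it reaches speeds that plain Python code cannot.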
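
As a rough illustration of the speed claim, the sketch below squares one million integers once with a pure-Python list comprehension and once with a single NumPy expression. The array size `n` is an arbitrary choice made for this example, and the timings depend on your machine, so no expected timings are shown.

```python
import time

import numpy as np

n = 1_000_000  # arbitrary size chosen for this illustration

# Pure-Python version: square every value in a list comprehension.
values = list(range(n))
start = time.perf_counter()
squares_py = [v * v for v in values]
elapsed_py = time.perf_counter() - start

# NumPy version: the same computation as one vectorized expression.
arr = np.arange(n, dtype=np.int64)
start = time.perf_counter()
squares_np = arr * arr
elapsed_np = time.perf_counter() - start

print(f"pure Python: {elapsed_py:.4f} s")
print(f"NumPy:       {elapsed_np:.4f} s")
```

Both versions produce the same numbers; only the time spent differs.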

#Array basics
#Creating arrays
NumPy revolves around objects called arrays. Strictly speaking, they are called ndarrays.

import numpy as np

# Create a 1D array
my_array = np.array([1,2,3,4,5])
print(my_array)

Output:
[1 2 3 4 5]

Defining a 2D array

# Create a 2D array
my_array = np.array([[1,2,3],[3,4,5]])
print(my_array)

Output:
[[1 2 3]
 [3 4 5]]

Defining special arrays:
np.zeros((x,y))          # x-by-y array filled with 0
np.ones((x,y))           # x-by-y array filled with 1
np.random.random((x,y))  # x-by-y array of random values in [0, 1)

my_zero_array = np.zeros((2,5)) # 2x5 array filled with 0
print(my_zero_array)

my_one_array = np.ones((2,5)) # 2x5 array filled with 1
print(my_one_array)

my_random_array = np.random.random((2,5)) # 2x5 array of random values in [0, 1)
print(my_random_array)

Output (the random values differ on every run):
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
[[0.51120513 0.74959219 0.75214013 0.33745769 0.8484016 ]
 [0.14101515 0.92684853 0.23266783 0.37040777 0.98653671]]

#Array operations

# Element-wise arithmetic
a = np.array([[1,2],[3,4]])
b = np.array([[2,2],[1,1]])
print("a = ",a)
print("b = ",b)
print("a + b = ",a + b)
print("a - b = ",a - b)
print("a * b = ",a * b)     # element-wise product, not the matrix product
print("a / b = ",a / b)     # element-wise division

Output:
a =  [[1 2]
 [3 4]]
b =  [[2 2]
 [1 1]]
a + b =  [[3 4]
 [4 5]]
a - b =  [[-1  0]
 [ 2  3]]
a * b =  [[2 4]
 [3 4]]
a / b =  [[0.5 1. ]
 [3.  4. ]]

Matrix multiplication
a.dot(b)  # matrix product of matrix a and matrix b

print("Matrix product of a and b:", a.dot(b))

Output:
Matrix product of a and b: [[ 4  4]
 [10 10]]
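
The text uses the `a.dot(b)` method. As a small supplementary sketch (the `@` operator and `np.matmul` are not mentioned in the original text), the same product for 2D arrays can also be written in the two forms below, and it is worth confirming that it differs from the element-wise `a * b`.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[2, 2], [1, 1]])

dot_result = a.dot(b)          # method form used in the text
matmul_result = a @ b          # infix matrix-multiplication operator
func_result = np.matmul(a, b)  # function form

print(dot_result)
# For 2D arrays all three forms give the same matrix product.
print(np.array_equal(dot_result, matmul_result))  # True
print(np.array_equal(dot_result, func_result))    # True
# Element-wise multiplication is a different operation.
print(np.array_equal(a * b, dot_result))          # False
```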

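#Array attributes
ndarray.ndim
The rank, i.e. the number of axes (dimensions)
ndarray.shape
The dimensions of the array; for a matrix, n rows and m columns
ndarray.size
The total number of elements, equal to n*m for the .shape above
ndarray.dtype
The element type of the ndarray object
ndarray.itemsize
The size of each element of the ndarray object, in bytes
ndarray.flags
Memory layout information for the ndarray object
ndarray.real
The real part of the ndarray elements
ndarray.imag
The imaginary part of the ndarray elements
ndarray.data
The buffer that holds the actual array elements. Elements are normally accessed by indexing the array, so you rarely need this attribute.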

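print(type(a)) # >>> <class 'numpy.ndarray'>
print(a.dtype) # >>> int32 (platform-dependent; int64 on most Linux and macOS installs)
print(a.size) # >>> 4
print(a.shape) # >>> (2, 2)
print(a.itemsize) # >>> 4 (8 when the dtype is int64)
print(a.ndim) # >>> 2
print(a.nbytes) # >>> 16 (32 when the dtype is int64)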
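
The attributes that the listing above does not exercise (`real`, `imag`) can be tried on a complex array. The following is a minimal sketch; the arrays `f` and `c` are made up for this illustration, and the dtype of `f` is set explicitly so that the printed sizes do not depend on the platform default.

```python
import numpy as np

# Explicit dtype so the results are the same on every platform.
f = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)
print(f.ndim)      # 2
print(f.shape)     # (2, 3)
print(f.size)      # 6
print(f.itemsize)  # 8  (a float64 takes 8 bytes)
print(f.nbytes)    # 48 (size * itemsize)

# A complex array exposes its real and imaginary parts as arrays.
c = np.array([1 + 2j, 3 - 4j])
print(c.dtype)     # complex128
print(c.real)      # [1. 3.]
print(c.imag)      # [ 2. -4.]
```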

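ndarray.shape
ndarray.shape gives the dimensions of the array as a tuple. The length of that tuple is the number of dimensions, i.e. the ndim attribute (rank). For a 2D array, for example, the tuple holds the number of rows and the number of columns.
ndarray.reshape
The numpy.reshape function gives an array a new shape without changing its data. Its form is numpy.reshape(arr, newshape, order='C')

arr: the array whose shape is to be changed

newshape: an integer or a tuple of integers; the new shape must be compatible with the original shape, i.e. it must hold the same number of elements.

order: 'C' -- row-major, 'F' -- column-major, 'A' -- column-major if the array is Fortran-contiguous in memory, row-major otherwise. Unlike some other NumPy functions, reshape does not accept 'K' (the order in which the elements appear in memory).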

import numpy as np

a = np.arange(8)
print('Original array:')
print(a)
print('\n')

b = a.reshape(4,2)
print('Reshaped array:')
print(b)
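
To make the `order` parameter and the compatibility requirement concrete, here is a short sketch. The use of `-1` to let NumPy infer one dimension is an addition that the text above does not cover.

```python
import numpy as np

a = np.arange(8)

c_order = a.reshape(4, 2)             # default order='C': fill row by row
f_order = a.reshape(4, 2, order='F')  # order='F': fill column by column
inferred = a.reshape(2, -1)           # -1 asks NumPy to infer that dimension

print(c_order)         # rows are [0 1], [2 3], [4 5], [6 7]
print(f_order)         # rows are [0 4], [1 5], [2 6], [3 7]
print(inferred.shape)  # (2, 4)

# A shape with a different number of elements is rejected.
try:
    a.reshape(3, 3)
except ValueError as exc:
    print("incompatible shape:", exc)
```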

#Array indexing
NumPy offers several ways to index into arrays.
Slicing: as with Python lists, numpy arrays can be sliced. Because arrays may be multidimensional, you must specify a slice for each dimension of the array.

import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]

# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
print(a[0, 1])   # Prints "2"
b[0, 0] = 77     # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1])   # Prints "77"
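
If you do not want changes to a slice to reach the original array, one option is to take an explicit copy. The sketch below contrasts the two cases; `np.shares_memory` is used here only as a way to check whether two arrays overlap in memory.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]         # a slice: shares its data with a
copy = a[:2, 1:3].copy()  # an explicit, independent copy

copy[0, 0] = 77           # does not touch a
print(a[0, 1])            # Prints "2"

view[0, 0] = 77           # writes through to a
print(a[0, 1])            # Prints "77"

print(np.shares_memory(a, view))  # Prints "True"
print(np.shares_memory(a, copy))  # Prints "False"
```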

You can also mix integer indexing with slice indexing. However, doing so yields an array of lower rank than the original array. Note that this is quite different from the way MATLAB handles array slicing.

import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Two ways of accessing the data in the middle row of the array.
# Mixing integer indexing with slices yields an array of lower rank,
# while using only slices yields an array of the same rank as the
# original array:
row_r1 = a[1, :]    # Rank 1 view of the second row of a
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
print(row_r1, row_r1.shape)  # Prints "[5 6 7 8] (4,)"
print(row_r2, row_r2.shape)  # Prints "[[5 6 7 8]] (1, 4)"

# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)  # Prints "[ 2  6 10] (3,)"
print(col_r2, col_r2.shape)  # Prints "[[ 2]
                             #          [ 6]
                             #          [10]] (3, 1)"

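Integer array indexing: when you index into numpy arrays using slicing, the resulting array view is always a subarray of the original array. In contrast, integer array indexing lets you construct arbitrary arrays using the data from another array. Here is an example: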

import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

# An example of integer array indexing.
# The returned array will have shape (3,).
print(a[[0, 1, 2], [0, 1, 0]])  # Prints "[1 4 5]"

# The above example of integer array indexing is equivalent to this:
print(np.array([a[0, 0], a[1, 1], a[2, 0]]))  # Prints "[1 4 5]"

# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]])  # Prints "[2 2]"

# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]]))  # Prints "[2 2]"
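
One consequence of the distinction above: unlike a slice, the result of integer array indexing is a new array, so modifying it leaves the source untouched. A minimal sketch (the name `picked` is made up for this example):

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

picked = a[[0, 1, 2], [0, 1, 0]]  # integer array indexing
print(picked)                      # Prints "[1 4 5]"

picked[0] = 100                    # modify the result...
print(a[0, 0])                     # ...the source is unchanged: prints "1"
print(np.shares_memory(a, picked))  # Prints "False"
```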

One useful trick with integer array indexing is selecting or mutating one element from each row of a matrix:

import numpy as np

# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])

print(a)  # prints "array([[ 1,  2,  3],
          #                [ 4,  5,  6],
          #                [ 7,  8,  9],
          #                [10, 11, 12]])"

# Create an array of indices
b = np.array([0, 2, 0, 1])

# Select one element from each row of a using the indices in b
print(a[np.arange(4), b])  # Prints "[ 1  6  7 11]"

# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10

print(a)  # prints "array([[11,  2,  3],
          #                [ 4,  5, 16],
          #                [17,  8,  9],
          #                [10, 21, 12]])"

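Boolean array indexing: boolean array indexing lets you pick out arbitrary elements of an array. This type of indexing is most often used to select the elements of an array that satisfy some condition. Here is an example: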

import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)   # Find the elements of a that are bigger than 2;
                     # this returns a numpy array of Booleans of the same
                     # shape as a, where each slot of bool_idx tells
                     # whether that element of a is > 2.

print(bool_idx)      # Prints "[[False False]
                     #          [ True  True]
                     #          [ True  True]]"

# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])  # Prints "[3 4 5 6]"

# We can do all of the above in a single concise statement:
print(a[a > 2])     # Prints "[3 4 5 6]"
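
Boolean masks can also be combined and used on the left-hand side of an assignment. The sketch below goes slightly beyond the example above: the combined condition and the `clipped` array are made up for illustration. Note that conditions are combined with `&` and `|` (not `and` / `or`) and each one needs its own parentheses.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Combine two conditions element-wise.
mask = (a > 2) & (a % 2 == 0)
print(a[mask])                   # Prints "[4 6]"

# Assign through a mask: cap every value above 4 at 4 (on a copy).
clipped = a.copy()
clipped[clipped > 4] = 4
print(clipped)                   # rows are [1 2], [3 4], [4 4]

# Count how many elements satisfy a condition.
print(np.count_nonzero(a > 2))   # Prints "4"
```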

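#Broadcasting
Broadcasting is a powerful mechanism that lets numpy work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.
For example, suppose that we want to add a constant vector to each row of a matrix. We could do it like this: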

import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # Create an empty matrix with the same shape as x

# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v

# Now y is the following
# [[ 2  2  4]
#  [ 5  5  7]
#  [ 8  8 10]
#  [11 11 13]]
print(y)

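This works. However, when the matrix x is very large, an explicit loop in Python can be slow. Note that adding the vector v to each row of the matrix x is equivalent to forming a matrix vv by stacking multiple copies of v vertically and then summing x and vv elementwise. We could implement this approach like this: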

import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
vv = np.tile(v, (4, 1))   # Stack 4 copies of v on top of each other
print(vv)                 # Prints "[[1 0 1]
                          #          [1 0 1]
                          #          [1 0 1]
                          #          [1 0 1]]"
y = x + vv  # Add x and vv elementwise
print(y)  # Prints "[[ 2  2  4]
          #          [ 5  5  7]
          #          [ 8  8 10]
          #          [11 11 13]]"

Numpy broadcasting lets us perform this computation without actually creating multiple copies of v. Consider this version, which uses broadcasting:

import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v  # Add v to each row of x using broadcasting
print(y)  # Prints "[[ 2  2  4]
          #          [ 5  5  7]
          #          [ 8  8 10]
          #          [11 11 13]]"

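The line y = x + v works even though x has shape (4, 3) and v has shape (3,). Because of broadcasting, it behaves as if v actually had shape (4, 3), with each row a copy of v, and the sum is performed elementwise.
Broadcasting two arrays together follows these rules: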

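1. If the arrays do not have the same rank, prepend 1s to the shape of the lower-rank array until both shapes have the same length.

2. Two arrays are said to be compatible in a dimension if they have the same size in that dimension, or if one of the arrays has size 1 in that dimension.

3. The arrays can be broadcast together if they are compatible in all dimensions.

4. After broadcasting, each array behaves as if its shape were the elementwise maximum of the shapes of the two input arrays.

5. In any dimension where one array has size 1 and the other has size greater than 1, the first array behaves as if it were copied along that dimension.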
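
The rules above can be sketched as a small function that computes the broadcast shape of two shape tuples. `broadcast_shape` is a hypothetical helper written for this illustration, not part of NumPy; the result is cross-checked against NumPy's own `np.broadcast`.

```python
import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shapes."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: prepend 1s to the shorter shape until the lengths match.
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        # Rules 2 and 3: the sizes must match, or one of them must be 1.
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not compatible"
            )
        # Rules 4 and 5: the size-1 side stretches to the other size.
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(result)


print(broadcast_shape((4, 3), (3,)))  # (4, 3)
print(broadcast_shape((3, 1), (2,)))  # (3, 2)

# Cross-check against NumPy itself.
x = np.empty((4, 3))
v = np.empty((3,))
print(np.broadcast(x, v).shape)       # (4, 3)

# Shapes (2, 3) and (2,) are incompatible: 3 vs 2 in the last dimension.
try:
    broadcast_shape((2, 3), (2,))
except ValueError as exc:
    print(exc)
```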
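If the explanation above is still unclear, try reading the explanation of broadcasting in the NumPy documentation.
Functions that support broadcasting are known as universal functions. You can find the list of all universal functions in the documentation.
Here are some applications of broadcasting: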

import numpy as np

# Compute outer product of vectors
v = np.array([1,2,3])  # v has shape (3,)
w = np.array([4,5])    # w has shape (2,)
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:
# [[ 4  5]
#  [ 8 10]
#  [12 15]]
print(np.reshape(v, (3, 1)) * w)

# Add a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:
# [[2 4 6]
#  [5 7 9]]
print(x + v)

# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column. Gives the following matrix:
# [[ 5  6  7]
#  [ 9 10 11]]
print((x.T + w).T)
# Another solution is to reshape w to be a column vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print(x + np.reshape(w, (2, 1)))

# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
# [[ 2  4  6]
#  [ 8 10 12]]
print(x * 2)

Broadcasting typically makes your code more concise and faster, so you should use it wherever possible.
