Fundamental Statistics for Data Science (Part 02)

This is a continuation of Fundamental Statistics for Data Science (Part 01).

5) Mean

The mean, also known as the average in mathematics, is obtained by summing all the observations and dividing by the number of data points.

Suppose a random variable takes the following values, where N is the number of data points.

X = { x₁, x₂, x₃, . . . , x_N }

We can calculate the mean using the following equation.

mean = ( x₁ + x₂ + x₃ + . . . + x_N ) / N

mean.py

import numpy as np

# Six observations, so N = 6
x = np.array([1, 2, 3, 4, 5, 6])
x_mean = np.mean(x)  # sum of the observations divided by N

print(x_mean)
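
To connect the definition to the library call, here is a minimal sketch that computes the mean by hand and compares it with np.mean. The helper name manual_mean is hypothetical and only used for illustration.

```python
import numpy as np


def manual_mean(values):
    """Sum all observations and divide by the number of data points."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


x = np.array([1, 2, 3, 4, 5, 6])

print(manual_mean(x))  # 3.5
print(np.mean(x))      # 3.5
```

Both lines print the same value because np.mean applies the same sum-and-divide rule.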

6) Variance

Variance describes the spread of the numbers in a data set. More precisely, it measures how far each number in the set is from the mean of the data set, as the average of the squared distances. Once we have calculated the sample variance, we can use it to estimate the population variance.

var.py

import numpy as np

x = np.array([1, 4, 3, 6])
# By default np.var divides by N (ddof=0), i.e. the population formula
x_variance = np.var(x)

print(x_variance)
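
Since the text distinguishes sample variance from population variance, the following sketch shows both, assuming the usual definitions: divide the summed squared deviations by N for the population formula, and by N - 1 for the sample formula. In NumPy this is controlled by the ddof argument of np.var, which defaults to 0. The variable names are illustrative.

```python
import numpy as np

x = np.array([1, 4, 3, 6])

# Squared distance of every observation from the mean
squared_deviations = (x - x.mean()) ** 2

# Population variance: divide by N (matches np.var(x), where ddof=0)
pop_var_manual = squared_deviations.sum() / len(x)

# Sample variance: divide by N - 1 (matches np.var(x, ddof=1))
sample_var_manual = squared_deviations.sum() / (len(x) - 1)

print(pop_var_manual, np.var(x))             # both 3.25
print(sample_var_manual, np.var(x, ddof=1))  # both approximately 4.33
```

When the data are a sample and the goal is to estimate the population variance, ddof=1 is the usual choice.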

7) Standard Deviation

Standard deviation measures how spread out a dataset is relative to its mean.
We can calculate it as the square root of the variance.
Standard deviation is usually preferred over the variance because it has the same unit as the data points, which makes it easier to interpret.

std.py

import numpy as np

x = np.array([1, 4, 3, 6])
# Square root of the variance; like np.var, np.std uses ddof=0 by default
x_std = np.std(x)

print(x_std)
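
As a quick check of the square-root relationship, this sketch compares np.std with the square root of np.var, for both the default population form (ddof=0) and the sample form (ddof=1). The variable names are illustrative.

```python
import numpy as np

x = np.array([1, 4, 3, 6])

# Population standard deviation: square root of the population variance
pop_std = np.sqrt(np.var(x))

# Sample standard deviation: square root of the sample variance (ddof=1)
sample_std = np.sqrt(np.var(x, ddof=1))

print(np.isclose(pop_std, np.std(x)))             # True
print(np.isclose(sample_std, np.std(x, ddof=1)))  # True
```

The variance here is in squared units of the data, while both standard deviations are back in the original units.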

We will continue in Part 03.

*This article was written by @nuwan, a member of @qualitia_cdev.

Author And Source

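More information on this topic (Fundamental Statistics for Data Science (Part 02)) can be found here: https://qiita.com/qualitia_cdev/items/0bd5a50a73099e5384b1

Author attribution: information about the original author is available at the original URL. Copyright belongs to the original author.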
