Nov.20.21


Prologue


Random values matter in many fields: probability, statistics, machine learning (ML), deep learning (DL), and so on. When Python users need random values, they typically write something like this.
import numpy as np
print(np.random.randint(1, 10, 10))
>>> output
[5 2 8 9 3 9 8 2 7 9]
But the problem is that the random values keep changing every time we re-run the code.
for i in range(3):
  print(np.random.randint(1, 10, 10))
>>> output
[4 1 1 6 8 6 1 9 7 6]
[2 8 5 4 7 2 5 1 9 6]
[5 3 8 2 3 2 1 8 2 9]
For reproducible ML project/research results, getting random values that stay the same across runs is critical.

Random but not literally random


Random numbers don't just come out of nowhere; they are produced by a carefully designed function. That means there are inputs and outputs: at some point the function feeds a value into an algorithm, and out comes a "random" value. We call such an algorithm a pseudo-random number generator (PRNG). If you pass the same value to the generator, it gives you back the same output. Then how do we pass the same value to the generator?
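The point can be seen directly in NumPy: two separate generator objects seeded with the same value produce identical output streams. Here is a minimal sketch using `np.random.RandomState` (the seed value 42 is arbitrary).

```python
import numpy as np

# Two independent generators, each seeded with the same value.
rng_a = np.random.RandomState(42)
rng_b = np.random.RandomState(42)

# Both emit the exact same sequence of "random" numbers.
print(rng_a.randint(1, 10, 5))
print(rng_b.randint(1, 10, 5))
```

The two printed arrays are identical, because both generators started from the same internal state.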

The input value


This is where np.random.seed() comes into play. np.random.seed() fixes the input value (the seed) of the generator. By calling np.random.seed() once at the top of a script, NumPy lets us reproduce the same "random" values on every run. This is why random values are random, but not literally random.
seed_value = 256
np.random.seed(seed_value)
for _ in range(5):
  print(np.random.rand(1))
If the code is run on another machine, the result is exactly the same.
>>> output
[0.0457838]
[0.58612071]
[0.20323985]
[0.08424309]
[0.02599218]
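Because np.random.seed() resets the state of NumPy's global generator, re-seeding with the same value replays the exact same sequence, even within a single run. A quick sketch:

```python
import numpy as np

np.random.seed(256)
first = np.random.rand(5)

np.random.seed(256)  # reset the global generator to the same state
second = np.random.rand(5)

# The two draws are identical element by element.
print((first == second).all())
```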

Epilogue


Going further, we can make test results consistent on any machine once the same seed value is set. Be aware, however, that enforcing full determinism (for example, deterministic operations in DL frameworks) can slow down training.
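In practice, an ML project usually draws randomness from more than one library, so a single helper that seeds them all at once is a common pattern. Below is a minimal sketch covering only Python's built-in `random` module and NumPy; the function name `set_seed` and the seed value are arbitrary choices, and frameworks such as PyTorch or TensorFlow have their own seeding calls (e.g. `torch.manual_seed`) that would need to be added separately.

```python
import random

import numpy as np

def set_seed(seed):
    # Seed Python's built-in RNG and NumPy's global RNG
    # so that every source of randomness is reproducible.
    random.seed(seed)
    np.random.seed(seed)

# Call once at the start of the script / experiment.
set_seed(256)
```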