小狐狸事務所: Python Study Notes: The random Module

Saturday, August 20, 2022

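Python's random module provides a rich set of built-in functions. It can generate random integers or floating-point numbers from a variety of probability distributions, pick characters or elements at random from a string or list, and reorder the elements of a list at random (shuffling). It is commonly used in games and simulations.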

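These numbers are not truly random, however. They are computed by a deterministic algorithm (a random number generator), which is why they are called pseudo-random numbers. Before producing numbers, the generator is initialized with a seed. If no seed is given, Python takes one from the operating system's randomness source when one is available, and otherwise falls back to the current system time. The generator used by the random module is the Mersenne Twister, which produces floats with 53 bits of precision and has a period of 2**19937-1. Because it is completely deterministic, it is not suitable for security purposes; the official documentation recommends the secrets module for those. See: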

# https://docs.python.org/zh-tw/3/library/random.html
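The determinism of the generator can be observed directly. The following is a minimal sketch: the seed value 1234 is an arbitrary choice made for illustration, and random.SystemRandom and the secrets module are the standard-library alternatives that draw from the operating system instead of the Mersenne Twister.

```python
import random
import secrets

# Two independent Mersenne Twister generators created with the same seed
# (1234 is an arbitrary value chosen for this sketch) yield the same sequence.
gen_a = random.Random(1234)
gen_b = random.Random(1234)
seq_a = [gen_a.random() for _ in range(3)]
seq_b = [gen_b.random() for _ in range(3)]
print(seq_a == seq_b)  # True: same seed, same sequence

# SystemRandom uses the operating system's randomness source and cannot be
# made reproducible by seeding.
sys_gen = random.SystemRandom()
dice = sys_gen.randint(1, 6)
print(dice)

# The secrets module is intended for tokens, passwords, and similar secrets.
token = secrets.token_hex(16)  # 16 random bytes rendered as 32 hex characters
print(token)
```

For games and simulations the seeded generator is usually what is wanted; the last two calls are only relevant when the numbers must not be predictable.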

This post collects the notes I made while testing examples from the following books:

Python 零基礎入門班 (碁峰 2018, 文淵閣工作室)
增壓的 Python-讓程式碼進化到全新境界 (碁峰 2020)
Python Bible 自學聖經 (碁峰 2020, 文淵閣工作室)
Python 最強入門-邁向數據科學之路第二版 (深智 2020), Chapter 13

See also:

# Day 15 - 亂數與統計模組

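First, let's inspect the public members of the random module. The following uses a custom module named members, whose list_members() function lists the public members (attributes and methods) of a module or package. See: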

# Python 學習筆記 : 檢視物件成員與取得變數名稱字串的方法
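The members module itself is not shown in this post. As a rough idea of what such a helper could look like, here is a hypothetical sketch that assumes "public" simply means "the name does not start with an underscore"; the function public_members() is an invented name, and the real module may work differently.

```python
import random

def public_members(obj):
    """Return (name, type) pairs for every attribute without a leading underscore."""
    return [(name, type(getattr(obj, name)))
            for name in dir(obj)
            if not name.startswith("_")]

def list_members(obj):
    """Print one public member per line, followed by its type."""
    for name, kind in public_members(obj):
        print(name, kind)

list_members(random)
```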

>>> import random

>>> import members as mbr

>>> mbr.list_members(random)

BPF <class 'int'>

LOG4 <class 'float'>

NV_MAGICCONST <class 'float'>

RECIP_BPF <class 'float'>

Random <class 'type'>

SG_MAGICCONST <class 'float'>

SystemRandom <class 'type'>

TWOPI <class 'float'>

betavariate <class 'method'>

choice <class 'method'>

choices <class 'method'>

expovariate <class 'method'>

gammavariate <class 'method'>

gauss <class 'method'>

getrandbits <class 'builtin_function_or_method'>

getstate <class 'method'>

lognormvariate <class 'method'>

normalvariate <class 'method'>

paretovariate <class 'method'>

randint <class 'method'>

random <class 'builtin_function_or_method'>

randrange <class 'method'>

sample <class 'method'>

seed <class 'method'>

setstate <class 'method'>

shuffle <class 'method'>

triangular <class 'method'>

uniform <class 'method'>

vonmisesvariate <class 'method'>

weibullvariate <class 'method'>

As the listing shows, the random module contains many functions. The most commonly used ones are summarized in the following table:

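Function of the random module	Description
random()	Returns a uniformly distributed random float in [0, 1) (0 included, 1 excluded)
randint(a, b)	Returns a uniformly distributed random integer between a and b (both a and b included)
randrange(stop)	Returns an integer picked at random from range(0, stop) (stop excluded)
randrange(start, stop, step)	Returns an integer picked at random from range(start, stop, step) (stop excluded)
uniform(a, b)	Returns a uniformly distributed random float between a and b (the endpoint b may or may not be included, depending on floating-point rounding)
choice(str/list)	Returns one character or element picked at random from the string str or the list list
sample(str/list, n)	Returns a list of n characters or elements picked at random, without replacement, from the string str or the list list
shuffle(list)	Reorders the elements of list at random (shuffling); the list itself is modified
normalvariate(mean, dev)	Returns a random float from a normal distribution with mean mean and standard deviation dev
seed([a])	Sets the seed a of the random number generator (so that the same results can be reproduced)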

For the detailed API reference, see the official documentation:

# https://docs.python.org/3/library/random.html#module-random

The functions are tested one by one below:

1. random() :

This function takes no arguments and returns a random float between 0 and 1 (1 excluded), for example:

>>> random.random()

0.38414118898124694

>>> random.random()

0.525094715456061

>>> for i in range(10):
        print(random.random())

0.7011734531575016

0.4116055943827881

0.7810757172932943

0.4175523862303292

0.15430764700566946

0.5737323047243192

0.04924526414755204

0.5693762595114389

0.262593445060638

0.789189664609723

Almost all the other functions in the random module are built on top of random(), so it can be regarded as the basic random number function.
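One common way to build on random() is to turn it into an event that occurs with a given probability. The following is a minimal sketch; happens() is a hypothetical helper name, and the seed 0 is arbitrary and only makes the run repeatable.

```python
import random

def happens(p, rng):
    """Return True with probability p by comparing one random() draw with p."""
    return rng.random() < p

rng = random.Random(0)  # local generator; the seed 0 is an arbitrary choice
trials = 10000
hits = sum(happens(0.3, rng) for _ in range(trials))
frequency = hits / trials
print(frequency)  # expected to be close to 0.3
```

Because random() never returns 1, happens(1.0, rng) is always True, and happens(0.0, rng) is always False.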

2. randint(a, b) :

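This function takes two integer arguments a and b and returns a uniformly distributed random integer between a and b (both a and b included); every integer in that range is equally likely to be chosen. For example: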

>>> random.randint(2, 12)

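Note that randint(a, b) requires a <= b; otherwise a ValueError is raised. When a equals b, as in the first call below, that value is the only possible result: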

>>> random.randint(2, 2)

>>> random.randint(12, 2)

Traceback (most recent call last):
  File "<pyshell>", line 1, in <module>
  File "C:\Python37\lib\random.py", line 222, in randint
    return self.randrange(a, b+1)
  File "C:\Python37\lib\random.py", line 200, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (12,3, -9)
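If the bounds may arrive in either order, the error can be caught or avoided by sorting the bounds first. The following is a minimal sketch; randint_any_order() is a hypothetical helper, not part of the random module.

```python
import random

def randint_any_order(a, b):
    """Hypothetical helper: like randint(), but accepts the bounds in either order."""
    low, high = min(a, b), max(a, b)
    return random.randint(low, high)

# Calling randint() directly with reversed bounds raises ValueError.
try:
    random.randint(12, 2)
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

value = randint_any_order(12, 2)
print(value)  # some integer from 2 to 12
```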

3. randrange() :

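This function is similar to randint() in that it also returns a random integer, but it picks the value from range(start, stop, step). It therefore has the same parameter interface as range() and, compared with randint(), adds a step parameter. The other difference is that randrange() excludes stop, whereas randint(a, b) includes its upper bound b. When only one argument is passed, it is taken as stop, with start defaulting to 0 and step defaulting to 1, so randrange(6) is equivalent to randrange(0, 6, 1). For example: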

>>> for i in range(10):
        print(random.randrange(0, 10, 2), end=",")

6,4,6,2,8,6,0,0,8,8,

This example calls random.randrange(0, 10, 2) ten times in a loop; each call picks one of the five values produced by range(0, 10, 2), namely 0, 2, 4, 6, and 8, at random and returns it.
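To check which values randrange(0, 10, 2) can return and how often each one appears, the draws can be tallied. This is a minimal sketch using collections.Counter; the seed 0 and the number of draws are arbitrary choices.

```python
import random
from collections import Counter

rng = random.Random(0)  # the seed 0 is arbitrary; it only makes the run repeatable
counts = Counter(rng.randrange(0, 10, 2) for _ in range(10000))

# Only the values of range(0, 10, 2) appear, each roughly 2000 times.
for value in sorted(counts):
    print(value, counts[value])
```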

4. uniform() :

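This function extends random(): random() returns a random float in [0, 1), whereas uniform(a, b) returns a random float between a and b. Note that random() never returns 1, but uniform(a, b) may return the endpoint b, depending on floating-point rounding. In addition, a does not have to be less than or equal to b. For example: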

>>> random.uniform(1, 2)

1.0127342386889224

>>> random.uniform(2, 1) # the order of the arguments does not matter

1.7649795854056936

>>> random.uniform(1, 100)

77.5095893144651

>>> random.uniform(4.1, 1.93) # the order of the arguments does not matter

3.2929810692406702
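The official documentation describes the result of uniform(a, b) in terms of the expression a + (b-a) * random(), which explains why the argument order does not matter. The following is a minimal sketch of that expression; uniform_sketch() is a hypothetical name used for illustration, not the library's own implementation.

```python
import random

def uniform_sketch(a, b, rng):
    """Scale and shift one random() draw; a and b may be given in either order."""
    return a + (b - a) * rng.random()

rng = random.Random(0)  # the seed 0 is an arbitrary choice
forward = [uniform_sketch(1, 2, rng) for _ in range(1000)]
backward = [uniform_sketch(2, 1, rng) for _ in range(1000)]

# Both lists stay between 1 and 2, whichever bound is passed first.
print(min(forward), max(forward))
print(min(backward), max(backward))
```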

5. choice() :

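This function picks one character at random from the string passed in, or one element at random from the list passed in, and returns it. For example: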

>>> random.choice('abcdefghijk')

'g'

>>> random.choice('abcdefghijk')

'i'

>>> for i in range(10):
        print(random.choice('abcdefghijk'), end=",")

k,a,e,f,i,a,h,j,j,a,

>>> random.choice([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

>>> for i in range(10):
        print(random.choice([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), end=",")

3,9,4,9,0,0,5,4,8,9,
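Two related details are worth a quick sketch: choice() raises IndexError when the sequence is empty, and the choices() function that appears in the member listing draws several items with replacement and accepts weights. The colors, weights, and seed below are arbitrary values made up for illustration.

```python
import random

# choice() cannot pick from an empty sequence.
try:
    random.choice([])
    empty_error = False
except IndexError as exc:
    empty_error = True
    print("IndexError:", exc)

# choices() draws k items with replacement; heavier weights are picked more often.
rng = random.Random(0)  # the seed 0 is an arbitrary choice
draws = rng.choices(["red", "green", "blue"], weights=[8, 1, 1], k=1000)
print(draws.count("red"))  # expected to be around 800
```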

6. sample() :

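This function extends choice(): it takes a second argument n (a non-negative integer), picks n characters at random from the string, or n elements at random from the list, without replacement, and returns them as a list. For example: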

>>> random.sample('abcdefghijk', 3) # pick 3 characters at random from the string

['k', 'h', 'j']

>>> random.sample('abcdefghijk', 3)

['b', 'j', 'h']

>>> for i in range(10):
        print(random.sample('abcdefghijk',3))

['c', 'g', 'j']

['d', 'h', 'f']

['a', 'f', 'b']

['c', 'k', 'b']

['j', 'i', 'g']

['f', 'a', 'k']

['f', 'd', 'g']

['h', 'd', 'e']

['c', 'e', 'i']

['e', 'g', 'c']

>>> random.sample([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 3) # pick 3 elements at random from the list

[6, 8, 9]

>>> random.sample([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 3)

[9, 8, 4]

>>> for i in range(10):
        print(random.sample([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 3))

[8, 7, 5]

[6, 0, 2]

[7, 4, 6]

[0, 4, 7]

[0, 4, 5]

[4, 1, 6]

[0, 3, 4]

[7, 6, 1]

[5, 8, 7]

[9, 8, 0]

sample() is well suited to lottery draws. For example, the book "Python 最強入門-邁向數據科學之路第二版" contains an example that generates the numbers for a 大樂透 lottery ticket:

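import random

def get_lot_number():
    # Draw 7 different numbers from 1..49
    lot_numbers=random.sample(range(1, 50), 7)
    # The last number drawn is the special number
    special_number=lot_numbers.pop()
    print("彩券號碼 : ", end="")              # "彩券號碼" = ticket numbers
    for number in sorted(lot_numbers):
        print(number, end=" ")
    print(f"特別號 : {special_number}")       # "特別號" = special number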

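Here range(1, 50) first produces the 49 lottery numbers 1 to 49, and sample() then draws 7 of them at random. The last one is the special number, which is removed with the list's pop() method; the remaining 6 are the ticket numbers. The built-in function sorted() sorts these six numbers in ascending order, and a loop prints them. A test run: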

>>> def get_lot_number():
        lot_numbers=random.sample(range(1, 50), 7)
        special_number=lot_numbers.pop()
        print("彩券號碼 : ", end="")
        for number in sorted(lot_numbers):
            print(number, end=" ")
        print(f"特別號 : {special_number}")

>>> get_lot_number()

彩券號碼 : 1 11 20 27 31 47 特別號 : 18

>>> get_lot_number()

彩券號碼 : 6 13 24 35 44 47 特別號 : 9

>>> get_lot_number()

彩券號碼 : 14 17 25 27 30 42 特別號 : 23
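A variant that returns the numbers instead of printing them is easier to reuse and to check. The following is a minimal sketch of that idea; draw_lottery() is a hypothetical name and is not taken from the book.

```python
import random

def draw_lottery(rng=random):
    """Return (sorted list of 6 main numbers, special number) drawn from 1..49."""
    numbers = rng.sample(range(1, 50), 7)  # 7 different numbers, no repeats
    special = numbers.pop()                # the last number drawn is the special number
    return sorted(numbers), special

main_numbers, special_number = draw_lottery()
print(main_numbers, special_number)
```

Because sample() draws without replacement, the special number can never repeat one of the six main numbers.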

7. shuffle() :

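This function rearranges the elements of the list passed in into a random order. It is one of the most frequently used functions in the random module and can be used to shuffle cards, for example: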

>>> mylist=['a', 'b', 'c', 'd', 'e']

>>> random.shuffle(mylist)

>>> mylist

['a', 'd', 'c', 'e', 'b']

>>> random.shuffle(mylist)

>>> mylist

['a', 'c', 'b', 'e', 'd']

Note that this function changes the order of the list's elements directly (in place); it does not return a new shuffled list, and its return value is None.
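When the original order has to be kept, a shuffled copy can be produced instead. The following is a minimal sketch that uses sample() with the full length of the list; the variable names are made up for illustration.

```python
import random

original = ["a", "b", "c", "d", "e"]

# sample() with k equal to the list length returns a new, shuffled list.
shuffled_copy = random.sample(original, len(original))

# shuffle() works in place and returns None, so its result should not be assigned.
working = list(original)
return_value = random.shuffle(working)

print(original)       # unchanged
print(shuffled_copy)  # same elements in a random order
print(return_value)   # None
```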

8. normalvariate() :

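This function returns a random number from a normal distribution with the given mean and standard deviation (note that the second argument is the standard deviation, not the variance). The normal distribution is the bell-shaped curve, one of the most common distributions found in nature. Mathematically, it is the shape that the rows of Pascal's triangle, that is, the binomial coefficients given by the binomial theorem, approach as the number of rows grows, once they are suitably rescaled. The distribution is fully determined by two parameters, the mean and the standard deviation. The standard deviation determines how widely the values spread around the mean: about 68% of the values fall within one standard deviation of the mean, about 95% within two, and about 99.7% within three. For example: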

>>> random.normalvariate(10, 2)

8.55279379156435

>>> random.normalvariate(10, 2)

10.094573763355903

>>> random.normalvariate(10, 2)

10.909718174329155

>>> for i in range(10):
        print(random.normalvariate(10, 2))

10.228797351529435

9.407705622044505

9.30178997384523

8.718578976923714

8.403626760307404

9.013059545231357

6.228336022766144

10.882659303335833

11.67608277952475

10.849935394313107
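The 68% / 95% / 99.7% figures can be checked empirically by counting how many samples fall within one, two, and three standard deviations of the mean. This is a minimal sketch; fraction_within() is a hypothetical helper, and the sample size and seed are arbitrary.

```python
import random

def fraction_within(samples, mean, dev, k):
    """Return the fraction of samples lying within k standard deviations of the mean."""
    inside = sum(1 for x in samples if abs(x - mean) <= k * dev)
    return inside / len(samples)

rng = random.Random(0)  # the seed 0 is an arbitrary choice
mean, dev = 10, 2
samples = [rng.normalvariate(mean, dev) for _ in range(100000)]

fractions = {k: fraction_within(samples, mean, dev, k) for k in (1, 2, 3)}
for k, frac in fractions.items():
    print(k, round(frac, 3))  # expected near 0.683, 0.954, 0.997
```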

The book "增壓的 Python-讓程式碼進化到全新境界" uses a custom function that draws the shape of the normal distribution with * characters:

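import random

def pr_normal_chart(n):
    hits=[0]*20                          # 20 bins, each 10 wide, covering 0 to 200
    for i in range(n):
        x=random.normalvariate(100, 30)  # mean 100, standard deviation 30
        j=int(x/10)                      # index of the bin that x falls into
        if 0 <= j < 20:
            hits[j] += 1
    for i in hits:
        print('*' * int(i * 300 / n))    # scale the bar length so it does not depend on n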

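By the law of large numbers, as the number of samples grows, the histogram of the results gets closer and closer to the normal distribution, for example: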

>>> import random

>>> def pr_normal_chart(n):
        hits=[0]*20
        for i in range(n):
            x=random.normalvariate(100, 30)
            j=int(x/10)
            if 0 <= j < 20:
                hits[j] += 1
        for i in hits:
            print('*' * int(i * 300 / n))

>>> pr_normal_chart(500)

*****

*******

****************

************************

**********************

*****************************

**************************************

******************************************

*****************************************

***************************

******************

*********

The result of 500 trials still looks somewhat different from a normal distribution, but raising the count to 200,000 brings it very close:

>>> pr_normal_chart(200000)

***

*******

*************

********************

****************************

***********************************

**************************************

***************************************

***********************************

***************************

********************

************

*******

***

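This chart is very close to a normal distribution. Because the book does not cover the Matplotlib plotting package, the author shows the distribution in text form. I rewrote the function as a Matplotlib version that shows the distribution as a bar chart: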

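import random
import matplotlib.pyplot as plt

def plot_normal_chart(n):
    hits=[0]*20                          # 20 bins, each 10 wide, covering 0 to 200
    for i in range(n):
        x=random.normalvariate(100, 30)
        j=int(x/10)                      # index of the bin that x falls into
        if 0 <= j < 20:
            hits[j] += 1
    plt.bar(range(20), hits)             # one bar per bin
    plt.show()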

For example:

>>> import matplotlib.pyplot as plt

>>> def plot_normal_chart(n):
        hits=[0]*20
        for i in range(n):
            x=random.normalvariate(100, 30)
            j=int(x/10)
            if 0 <= j < 20:
                hits[j] += 1
        plt.bar(range(20), hits)
        plt.show()

>>> plot_normal_chart(500)

The result is a bar chart (the image is not reproduced here).

Raising the number of trials to 200,000:

>>> plot_normal_chart(200000)

The result is a bar chart (the image is not reproduced here).

For more on Matplotlib, see:

# Python 學習筆記 : Matplotlib 資料視覺化 (二) 統計圖
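The text version and the Matplotlib version share the same binning loop, so that part can be separated from the display code. The following is a minimal sketch under that assumption; normal_bin_counts() and its parameters are hypothetical names. It uses math.floor() rather than int(), because int() truncates toward zero and would therefore place values between -10 and 0 into bin 0.

```python
import math
import random

def normal_bin_counts(n, mean=100, dev=30, bins=20, width=10, rng=random):
    """Count how many of n normal samples fall into each bin of the given width."""
    hits = [0] * bins
    for _ in range(n):
        x = rng.normalvariate(mean, dev)
        j = math.floor(x / width)  # floor, so negative values never land in bin 0
        if 0 <= j < bins:
            hits[j] += 1
    return hits

total = 20000
counts = normal_bin_counts(total, rng=random.Random(0))  # the seed 0 is arbitrary

# Text display; the same list could be passed to plt.bar(range(20), counts) instead.
for c in counts:
    print('*' * int(c * 300 / total))
```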

9. seed() :

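This function sets the random seed, that is, the initial value of the pseudo-random algorithm. If seed() is not called before the random functions are used, the generator is seeded from the operating system's randomness source when one is available, and otherwise from the system time, so the seed is hard to guess. That does not make the output safe for security purposes, though, because the Mersenne Twister's later output can be predicted once enough of its earlier output has been observed. If seed() is called with a fixed value, the random functions called afterwards return the same sequence of numbers every time. A fixed seed is therefore commonly set when demonstrating random functions, so that anyone who reruns the code gets the same results, for example: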

>>> random.seed(42)

>>> for i in range(5):
        print(random.random())

0.6394267984578837

0.025010755222666936

0.27502931836911926

0.22321073814882275

0.7364712141640124

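The three lines of code above produce the same result no matter how many times they are run. 42 is the seed that software engineers most habitually use; there is no particular reason for it other than fashion. The number comes from the science fiction series The Hitchhiker’s Guide to the Galaxy. See: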

# 常用亂數種子 42 的由來
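Besides seed(), the getstate() and setstate() functions from the member listing can rewind a generator to an earlier point in its sequence. The following is a minimal sketch; it uses a separate random.Random instance so that the module-level generator is left untouched, and the variable names are made up for illustration.

```python
import random

rng = random.Random(42)
first = [rng.random() for _ in range(3)]

# Re-seeding with the same value restarts the sequence from the beginning.
rng.seed(42)
second = [rng.random() for _ in range(3)]

# getstate() captures the current position; setstate() returns to it later.
state = rng.getstate()
third = [rng.random() for _ in range(3)]
rng.setstate(state)
fourth = [rng.random() for _ in range(3)]

print(first == second)  # True
print(third == fourth)  # True
```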
