小狐狸事務所: Testing the MNIST Handwritten Digit Dataset with Keras

Wednesday, February 28, 2018

Testing the MNIST Handwritten Digit Dataset with Keras

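Yesterday I installed the TensorFlow + Keras deep learning framework on Windows. I ran into error messages on both Win7 with Python 3.6.1 and Win10 with Python 3.6.4, but both installations succeeded in the end. Summing up the experience, the conditions for a successful installation are: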

A 64-bit build of Python is required
Numpy+MKL (download) may be required rather than plain Numpy (not needed for Python 3.6.4)
msvcp.dll (download) may be required

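Note in particular: if a 32-bit Python was installed before, uninstall it and then delete the entire original installation directory (for example C:/Python36) before installing the 64-bit build. Otherwise importing the keras package raises an error.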
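Before reinstalling anything, it can help to confirm which build of Python is actually on the path. The following is a minimal sketch using only the standard library; the helper name python_bitness is made up for this illustration.

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    # struct.calcsize("P") is the size in bytes of a C pointer.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    bits = python_bitness()
    print("Python", sys.version.split()[0], "is a", bits, "bit build")
    if bits != 64:
        print("A 64-bit interpreter is needed for this setup.")
```

If this reports 32, the interpreter being launched is still the old 32-bit one, which matches the import error described above.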

Reference:

# Installing the deep learning frameworks TensorFlow and Keras on Windows

Next I use Keras to test the MNIST handwritten digit dataset, mainly following Chapter 6 of 林大貴's book "TensorFlow+Keras 深度學習人工智慧實務應用". I also consulted these two books:

Deep Learning (齋藤康毅)
Python Deep Learning (Valentino)

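The MNIST handwritten digit dataset is one of the best-known datasets in machine learning. It contains 70,000 images of handwritten Arabic numerals 0~9 together with their correct labels (the digit 0~9 that each image shows). The first 60,000 are training samples, used to train a neural network and build a model; the remaining 10,000 are test samples, used to check whether the model correctly infers or predicts the digit in an image. Every image is a 28*28 (784 pixel) grayscale image, and every pixel is a value from 0 to 255.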

Source : MNIST_database

Reference:

# https://en.wikipedia.org/wiki/MNIST_database

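The MNIST dataset was assembled by Yann LeCun, often called the father of convolutional neural networks, during his image-recognition research at Bell Labs, which is also where he developed convolutional networks. LeCun did postdoctoral research under Geoffrey Hinton at the University of Toronto, and he, Geoffrey Hinton, and Yoshua Bengio are known as the three giants of deep learning. LeCun has headed Facebook's AI lab (FAIR) since 2013. At the time of writing he is stepping back from management to focus on AI research, while continuing to guide Facebook's AI research direction. See: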

# Facebook AI chief Yann LeCun is stepping aside to take on dedicated research role

The test log follows:

1. Import the MNIST module

First, import the mnist module from Keras:

D:\Python\tensorflow>python
Python 3.6.1 (v3.6.1:69c0db5, Mar 21 2017, 18:41:36) [MSC v.1900 64 bit (AMD64)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from keras.datasets import mnist  # import the mnist module
Using TensorFlow backend.

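As the message shows, Keras uses the TensorFlow engine as its default backend for tensor operations.

2. Load the MNIST dataset

Next, call load_data() to load the MNIST dataset. It first checks the .keras subdirectory of the user's home directory on drive C for the data file mnist.npz. If the file is there it is loaded directly; otherwise it is downloaded automatically from the URL below and stored in C:\Users\user_id\.keras\datasets\ :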

# https://s3.amazonaws.com/img-datasets/mnist.npz

>>> (x_train_image, y_train_label), (x_test_image, y_test_label)=mnist.load_data()
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
>>>

If a firewall prevents mnist.load_data() from downloading mnist.npz, download the file with a browser and place it in C:\Users\user_id\.keras\datasets\.
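To check whether a manually downloaded file ended up where Keras will look for it, a small standard-library sketch like the one below may be useful. It assumes the default cache location of .keras/datasets under the home directory; the function name mnist_cache_path and its keras_dir parameter are hypothetical helpers, not part of Keras.

```python
from pathlib import Path


def mnist_cache_path(keras_dir=None):
    """Return the path where mnist.npz is expected to be cached.

    Assumption: Keras keeps downloaded datasets in <home>/.keras/datasets
    unless it has been configured otherwise.
    """
    if keras_dir is None:
        keras_dir = Path.home() / ".keras"
    return Path(keras_dir) / "datasets" / "mnist.npz"


if __name__ == "__main__":
    path = mnist_cache_path()
    status = "found" if path.is_file() else "missing"
    print(path, "->", status)
```

If the script prints "missing" after a manual download, the file was most likely saved to a different directory or under a different name.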

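3. Query the number of samples

The function mnist.load_data() returns two tuples whose elements are NumPy arrays. The first tuple is the training data and the second is the test data; in both, the images come first and the labels second:

(training images, training labels), (test images, test labels)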

Pass the returned arrays to len() to get the number of samples:

>>> print('train image=', len(x_train_image))  # number of training images: 60,000
train image= 60000
>>> print('train label=', len(y_train_label))  # number of training labels: 60,000
train label= 60000
>>> print('test image=', len(x_test_image))  # number of test images: 10,000
test image= 10000
>>> print('test label=', len(y_test_label))  # number of test labels: 10,000
test label= 10000

So there are 60,000 training samples + 10,000 test samples, 70,000 in total.

4. Query the shape of the dataset

Both the image and label arrays have a shape attribute that records their dimensions:

>>> print('x_train_image:',x_train_image.shape)  # shape of the training images
x_train_image: (60000, 28, 28)
>>> print('y_train_label:',y_train_label.shape)  # shape of the training labels
y_train_label: (60000,)
>>> print('x_test_image:',x_test_image.shape)  # shape of the test images
x_test_image: (10000, 28, 28)
>>> print('y_test_label:',y_test_label.shape)  # shape of the test labels
y_test_label: (10000,)

So there are 60,000 training images at a resolution of 28*28, with 60,000 matching labels, and 10,000 test images at 28*28, with 10,000 matching labels.
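How len() and shape relate can be tried out without downloading anything. The sketch below uses a small all-zero NumPy array as a stand-in for x_train_image (an assumption made so the example is self-contained), and also shows how each 28*28 image corresponds to 784 values.

```python
import numpy as np

# Synthetic stand-in laid out like x_train_image / y_train_label
# (assumption: 5 samples instead of 60000, all pixels zero).
images = np.zeros((5, 28, 28), dtype=np.uint8)
labels = np.zeros((5,), dtype=np.uint8)

print(len(images))       # length of the first axis = number of samples
print(images.shape)      # (samples, height, width)
print(labels.shape)      # one label per sample
print(images[0].shape)   # a single image
print(images[0].size)    # 28 * 28 pixels in that image

# Flatten every image into one row of 784 values.
flat = images.reshape(len(images), -1)
print(flat.shape)
```

len() only reports the first axis, which is why it gives the sample count, while shape also exposes the height and width of each image.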

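5. Display the contents of the first training image

The shape attribute above shows that each sample image is 28*28, that is, drawn with 784 pixels. Each pixel is a number from 0 to 255 giving its gray level. In this dataset 0 is the blank background and 255 is the darkest stroke; the 'binary' colormap used for plotting later draws 0 as white and 255 as black: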

>>> print('x_train_image[0]=', x_train_image[0])
x_train_image[0]= [[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 3 18 18 18 126 136
175 26 166 255 247 127 0 0 0 0]
[ 0 0 0 0 0 0 0 0 30 36 94 154 170 253 253 253 253 253
225 172 253 242 195 64 0 0 0 0]
[ 0 0 0 0 0 0 0 49 238 253 253 253 253 253 253 253 253 251
93 82 82 56 39 0 0 0 0 0]
[ 0 0 0 0 0 0 0 18 219 253 253 253 253 253 198 182 247 241
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 80 156 107 253 253 205 11 0 43 154
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 14 1 154 253 90 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 139 253 190 2 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 11 190 253 70 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 35 241 225 160 108 1
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 81 240 253 253 119
25 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 45 186 253 253
150 27 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16 93 252
253 187 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 249
253 249 64 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 46 130 183 253
253 207 2 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 39 148 229 253 253 253
250 182 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 24 114 221 253 253 253 253 201
78 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 23 66 213 253 253 253 253 198 81 2
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 18 171 219 253 253 253 253 195 80 9 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 55 172 226 253 253 253 253 244 133 11 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 136 253 253 253 212 135 132 16 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0]]
>>> print('y_train_label[0]=', y_train_label[0])
y_train_label[0]= 5

So the label of the image stored in this two-dimensional array is 5.
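The digit is hard to spot in a dump of raw numbers. As a rough illustration, the pixel grid can be turned into text by marking every pixel above a threshold. The function to_ascii and the threshold of 128 are choices made for this sketch only.

```python
def to_ascii(image, threshold=128):
    """Render a 2-D grid of 0-255 pixel values as text, one line per row."""
    lines = []
    for row in image:
        # '#' for dark pixels, '.' for background
        lines.append("".join("#" if int(p) >= threshold else "." for p in row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Tiny hypothetical 3x4 "image"; with the real data one would pass
    # x_train_image[0] instead.
    demo = [
        [0, 255, 255, 0],
        [0, 0, 200, 0],
        [0, 0, 253, 0],
    ]
    print(to_ascii(demo))
```

Calling print(to_ascii(x_train_image[0])) in the interactive session should give a coarse picture of the handwritten digit directly in the console.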

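6. Display the image and label of the first training sample

Drawing 2D graphics requires the matplotlib module. The Python program below, written after the example in the book, displays the image and label of the first sample (index 0):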

#show_train_image0.py
from keras.datasets import mnist
import matplotlib.pyplot as plt

def plot_image(image):                  # plotting function
    fig=plt.gcf()                       # get the current figure
    fig.set_size_inches(2, 2)           # set the canvas to 2 x 2 inches
    plt.imshow(image, cmap='binary')    # draw the 28*28 image with the 'binary' (grayscale) colormap
    plt.show()                          # show the figure

(x_train_image, y_train_label), \
(x_test_image, y_test_label)=mnist.load_data()   # load the MNIST dataset
print(y_train_label[0])                 # print the label of the first sample
plot_image(x_train_image[0])            # draw the image of the first sample

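Save this program as show_train_image0.py and run it with the python command from the directory that contains the file. It prints 5, the label of the first MNIST sample, and opens a plot window showing the image of that sample: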

D:\Python\test>python show_train_image0.py
Using TensorFlow backend.
5

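So the first sample is a handwritten 5.

7. Display the image and label of the n-th training sample (using a command-line argument)

The program above can only show the first sample. The index of the sample to display can instead be passed on the command line and read through the argv attribute of the sys module; using it in place of the index 0 in the program above selects the sample dynamically. See: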

# Python command-line arguments

If the python command is run with arguments after the script name, for example:

D:\Python\test>python test.py arg1 arg2 arg3

then the arguments are available in the list sys.argv. The first element of sys.argv is the program name, so for the command above its contents are:

['test.py', 'arg1', 'arg2', 'arg3']

The sys module must first be imported with import sys.
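Reading sys.argv[1] directly fails with an IndexError when no argument is given. The sketch below shows one possible way to fall back to a default; the function name index_from_argv and the default of 0 are assumptions for illustration.

```python
def index_from_argv(argv, default=0):
    """Return the sample index passed as the first command-line argument.

    argv[0] is the script name, so the index is expected in argv[1].
    Falls back to `default` when no argument was given.
    """
    if len(argv) < 2:
        return default
    return int(argv[1])  # command-line arguments arrive as strings


if __name__ == "__main__":
    # Lists shaped like sys.argv, used here instead of the real one.
    print(index_from_argv(["test.py"]))
    print(index_from_argv(["test.py", "1"]))
```

In a real script the call would be index_from_argv(sys.argv) after import sys.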

Adding command-line argument handling to show_train_image0.py gives:

#show_train_image.py
import sys                              # import the sys module
from keras.datasets import mnist
import matplotlib.pyplot as plt

def plot_image(image):
    fig=plt.gcf()
    fig.set_size_inches(2, 2)
    plt.imshow(image, cmap='binary')
    plt.show()

(x_train_image, y_train_label), \
(x_test_image, y_test_label)=mnist.load_data()
i=int(sys.argv[1])                      # the argument is a string, so convert it to an integer
print(y_train_label[i])
plot_image(x_train_image[i])

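Save this program as show_train_image.py and run it in a Command Prompt window with the argument 1 (python show_train_image.py 1) to display the image and label of the second sample.

The second sample turns out to be a handwritten 0.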

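8. Display the images and labels of several training samples (using command-line arguments)

The programs so far show only one sample at a time. matplotlib.pyplot has a subplot() function that draws several subplots in one figure. See:

# matplotlib.pyplot.subplots
# Introduction to Matplotlib and basic pyplot usage: subplot

Following the example in the book, show_train_image.py can be changed into the program below, which displays several samples at once: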

#show_train_images.py
import sys
from keras.datasets import mnist
import matplotlib.pyplot as plt

def plot_images_labels_prediction(images,labels,prediction,idx,num=10):
    fig=plt.gcf()                              # get the current figure
    fig.set_size_inches(12, 14)                # set the canvas to 12 x 14 inches
    if num > 25: num=25                        # show at most 25 subplots
    for i in range(0, num):                    # draw num subplots in turn
        ax=plt.subplot(5, 5, i+1)              # subplot i+1 of the 5*5 grid
        ax.imshow(images[idx], cmap='binary')  # draw the image
        title="label=" + str(labels[idx])
        if len(prediction) > 0:                # add the prediction to the title if there is one
            title += ",predict=" + str(prediction[idx])
        ax.set_title(title, fontsize=10)       # set the title
        ax.set_xticks([])                      # hide the x-axis ticks
        ax.set_yticks([])                      # hide the y-axis ticks
        idx += 1                               # move on to the next sample
    plt.show()                                 # show the figure

(x_train_image, y_train_label), \
(x_test_image, y_test_label)=mnist.load_data()
i=int(sys.argv[1])                             # first argument: start index
j=int(sys.argv[2])                             # second argument: number of samples
plot_images_labels_prediction(x_train_image,y_train_label,[],i,j)   # no predictions yet

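The plotting function plot_images_labels_prediction() in this program takes 5 parameters:

images = the array of sample images
labels = the array of sample labels
prediction = the list of predicted values
idx = the start index in the dataset
num = the number of samples to display at once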

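prediction is the list of predicted values; no recognition has been done yet, so an empty list is passed. num defaults to 10, so calling the function without num displays 10 samples. Note that the script itself always reads sys.argv[2], so both command-line arguments must be supplied when running it. num is capped at 25, so at most 25 samples are displayed at a time.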

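Each sample image is drawn on a pyplot subplot object ax, and the subplot's set_title() method puts the matching label and prediction in the title above it. Because up to 25 images (a 5*5 grid) are shown at once, the canvas is enlarged to 12 x 14 inches. I sized it on the basis of roughly 2 x 2 inches per subplot, with 2 extra inches of height for the titles.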
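The i+1 passed to plt.subplot(5, 5, i+1) is there because subplot cells are numbered from 1, left to right and then top to bottom. The small sketch below makes that mapping explicit; grid_position is a made-up helper and is not needed by the plotting script.

```python
def grid_position(i, ncols=5):
    """Map a 0-based sample counter to (row, column, subplot_number).

    Rows and columns are 0-based; subplot_number is the 1-based value
    that would be passed as the third argument of plt.subplot().
    """
    row, col = divmod(i, ncols)
    return row, col, i + 1


if __name__ == "__main__":
    for i in (0, 4, 5, 24):
        print(i, "->", grid_position(i))
```

With 5 columns, the sixth image (i = 5) starts the second row, and i = 24 fills the last cell of the 5*5 grid.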

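Arguments are still passed on the command line, but show_train_images.py takes two of them, idx and num. idx is a 0-based index that says which sample of the dataset to start from; num is the number of samples to display at once. Save the program as show_train_images.py and run it from the command line. For example, the command below displays 10 samples starting from the second sample of the dataset (index 1):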

D:\test>python show_train_images.py 1 10
Using TensorFlow backend.

The result is as follows:

The next command displays 25 samples from the start of the dataset:

D:\test>python show_train_images.py 0 25
Using TensorFlow backend.

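The result:

This makes it possible to choose any start index and number of samples, up to 25 samples at a time. Note that MNIST has only 60,000 training samples, so the largest valid index is 59999. The last 25 training samples, for example, are the 25 samples starting at index 59975: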

D:\test>python show_train_images.py 59975 25
Using TensorFlow backend.

If the start index is changed to 59976, the last iteration of the loop fails because index 60000 does not exist:

D:\test>python show_train_images.py 59976 25
Using TensorFlow backend.
Traceback (most recent call last):
File "show_train_images.py", line 25, in <module>
plot_images_labels_prediction(x_train_image,y_train_label,[],i,j)
File "show_train_images.py", line 11, in plot_images_labels_prediction
ax.imshow(images[idx], cmap='binary')
IndexError: index 60000 is out of bounds for axis 0 with size 60000
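One way to avoid this IndexError is to reduce the number of samples so that the loop never runs past the end of the array. The following is a minimal sketch of that idea; clamp_count and its max_plots parameter are hypothetical and not part of the book's example.

```python
def clamp_count(total, idx, num, max_plots=25):
    """Return how many samples can safely be shown starting at idx.

    total     -- number of samples in the array (60000 for the training set)
    idx       -- 0-based start index
    num       -- number of samples requested
    max_plots -- upper limit imposed by the 5*5 subplot grid
    """
    if not 0 <= idx < total:
        raise ValueError("start index %d is outside 0..%d" % (idx, total - 1))
    return max(0, min(num, max_plots, total - idx))


if __name__ == "__main__":
    print(clamp_count(60000, 59975, 25))
    print(clamp_count(60000, 59976, 25))
```

Inside the plotting function, num could be replaced by clamp_count(len(images), idx, num) before the loop, so that a start index of 59976 shows the remaining samples instead of raising an error.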

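In the same way, the 10,000 test samples in the MNIST dataset can be displayed several at a time. The first two arguments passed to plot_images_labels_prediction() in show_train_images.py just need to be changed to the test arrays, which gives show_test_images.py: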

#show_test_images.py
import sys
from keras.datasets import mnist
import matplotlib.pyplot as plt

def plot_images_labels_prediction(images,labels,prediction,idx,num=10):
    fig=plt.gcf()
    fig.set_size_inches(12, 14)
    if num > 25: num=25
    for i in range(0, num):
        ax=plt.subplot(5, 5, i+1)
        ax.imshow(images[idx], cmap='binary')
        title="label=" + str(labels[idx])
        if len(prediction) > 0:
            title += ",predict=" + str(prediction[idx])
        ax.set_title(title, fontsize=10)
        ax.set_xticks([])
        ax.set_yticks([])
        idx += 1
    plt.show()

(x_train_image, y_train_label), \
(x_test_image, y_test_label)=mnist.load_data()
i=int(sys.argv[1])
j=int(sys.argv[2])
plot_images_labels_prediction(x_test_image,y_test_label,[],i,j)   # pass the test arrays

Display the first 25 test samples:

D:\test>python show_test_images.py 0 25
Using TensorFlow backend.
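The training and test scripts differ only in which arrays they pass to the plotting function, so a single script could pick the arrays from an extra argument. The sketch below shows just the argument handling with the standard argparse module; the --split option and the function name parse_args are assumptions for illustration and do not appear in the book's example.

```python
import argparse


def parse_args(argv=None):
    """Parse the start index, sample count, and which split to display."""
    parser = argparse.ArgumentParser(description="Show MNIST samples")
    parser.add_argument("idx", type=int, help="0-based start index")
    parser.add_argument("num", type=int, nargs="?", default=10,
                        help="number of samples to show (default 10)")
    parser.add_argument("--split", choices=["train", "test"], default="train",
                        help="which part of the dataset to display")
    return parser.parse_args(argv)


if __name__ == "__main__":
    # A fixed argument list is used here for demonstration; a real script
    # would call parse_args() with no arguments to read the command line.
    args = parse_args(["0", "25", "--split", "test"])
    print(args.idx, args.num, args.split)
```

The script would then pass x_test_image and y_test_label to the plotting function when args.split is "test", and the training arrays otherwise. Because num is optional here, leaving it out falls back to 10, matching the default of the plotting function.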