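# [Open Jarvis] How can Python automatically transcribe speech into text?
I found that the 大數學堂 site has many excellent tutorials, for example on web crawling, the R language, image recognition, trading systems, Open Jarvis and so on. See :
# 大數軟體有限公司 (YouTube)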
# http://www.largitdata.com/course_list/14
# https://devpost.com/software/open-jarvis
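After watching the video above I was surprised at how easy it is to do speech-to-text in Python through the Google speech recognition API. I could not wait to try it, and right after dinner I set out to verify it.
To install the SpeechRecognition package, just run pip install SpeechRecognition (or pip3) in a Command Prompt window :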
D:\Python>pip3 install SpeechRecognition
If you are behind a firewall (for example on a company network), the direct install may fail with an error message like this :
" Could not find a version that satisfies the requirement SpeechRecognition (from versions: )
No matching distribution found for SpeechRecognition"
In that case, first download the whl file (about 32MB) from the PyPI site and install from it :
# https://pypi.python.org/pypi/SpeechRecognition/
D:\>cd python
D:\Python>pip3 install SpeechRecognition-3.7.1-py2.py3-none-any.whl
Processing d:\python\speechrecognition-3.7.1-py2.py3-none-any.whl
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.7.1
Use pip3 list to confirm that the package really is installed :
D:\Python>pip3 list
DEPRECATION: The default format will switch to columns in the future. You can us
e --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.con
f under the [list] section) to disable this warning.
SpeechRecognition (3.7.1)
virtualenv (15.1.0)
websocket-client (0.40.0)
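The same check can also be done from inside Python. The sketch below is only an illustration and not part of the original tutorial: it assumes Python 3.8 or later (for `importlib.metadata`), and `installed_version` is a hypothetical helper name.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report on the two packages this post depends on.
    for name in ("SpeechRecognition", "PyAudio"):
        version = installed_version(name)
        print(name, "->", version if version else "not installed")
```

Because a missing package yields None rather than an exception, a script can run this check before it tries to import speech_recognition.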
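Then I immediately tested it with the sample program from the tutorial video :
>>> import speech_recognition
>>> r=speech_recognition.Recognizer()
>>> with speech_recognition.Microphone() as source:
...     audio=r.listen(source)
But it raised an error as soon as the Microphone object was created, before listen() was even reached :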
Traceback (most recent call last):
File "C:\Python36\lib\site-packages\speech_recognition\__init__.py", line 108, in get_pyaudio
import pyaudio
ModuleNotFoundError: No module named 'pyaudio'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
with speech_recognition.Microphone() as source:
File "C:\Python36\lib\site-packages\speech_recognition\__init__.py", line 79, in __init__
self.pyaudio_module = self.get_pyaudio()
File "C:\Python36\lib\site-packages\speech_recognition\__init__.py", line 110, in get_pyaudio
raise AttributeError("Could not find PyAudio; check installation")
AttributeError: Could not find PyAudio; check installation
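This means the PyAudio module is still missing. Going back to PyPI to read the SpeechRecognition package description, it turns out that PyAudio must be installed if the package is to use a microphone as its audio input :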
"PyAudio is required if and only if you want to use microphone input (Microphone). PyAudio version 0.2.11+ is required, as earlier versions have known memory management bugs when recording from microphones in certain situations.
If not installed, everything in the library will still work, except attempting to instantiate a Microphone object will raise an AttributeError."
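As the quoted note says, only microphone input depends on PyAudio. The sketch below illustrates this under some assumptions: `has_module` and `transcribe_wav` are hypothetical helper names, `hello.wav` is a made-up file name, and the recognition call needs network access to Google. It checks for PyAudio and shows how a WAV file could be transcribed instead, using the library's `AudioFile` and `Recognizer.record()`.

```python
import importlib.util


def has_module(name):
    """True if a top-level module called `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


def transcribe_wav(path, language="zh-TW"):
    """Transcribe a WAV file through Google speech recognition (needs network)."""
    # Imported lazily so that has_module() is usable without the package.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:      # file input does not use PyAudio
        audio = recognizer.record(source)   # read the whole file into memory
    return recognizer.recognize_google(audio, language=language)


if __name__ == "__main__":
    if has_module("pyaudio"):
        print("PyAudio found: Microphone input should be usable")
    else:
        print("PyAudio missing: use an audio file instead of the microphone")
        # print(transcribe_wav("hello.wav"))  # hypothetical file name
```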
Install PyAudio with the following command :
D:\Python> pip3 install PyAudio
Or download the whl installer from PyPI :
# https://pypi.python.org/pypi/PyAudio/0.2.11#downloads
D:\Python>pip3 install PyAudio-0.2.11-cp36-cp36m-win_amd64.whl
Processing d:\python\pyaudio-0.2.11-cp36-cp36m-win_amd64.whl
Installing collected packages: PyAudio
Successfully installed PyAudio-0.2.11
After installing it I ran the code again, but creating the Microphone object still failed, this time with the error message "No Default Input Device Available" :
>>> with speech_recognition.Microphone() as source:
Traceback (most recent call last):
File "<pyshell#7>", line 1, in <module>
with speech_recognition.Microphone() as source:
File "C:\Python36\lib\site-packages\speech_recognition\__init__.py", line 86, in __init__
device_info = audio.get_device_info_by_index(device_index) if device_index is not None else audio.get_default_input_device_info()
File "C:\Python36\lib\site-packages\pyaudio.py", line 949, in get_default_input_device_info
device_index = pa.get_default_input_device()
OSError: No Default Input Device Available
I searched Google and found the post below, where someone suggested also installing "PortAudio" :
# PyAudio IOError: No Default Input Device Available
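I found the PortAudio website and downloaded the tgz file, but after unpacking it I found a pile of files with html and sh extensions and could not tell whether they had to be installed or where to put them. Then it occurred to me: could it simply be that I had not plugged in a microphone yet? I dug out a microphone I had not used in ages, plugged it into the computer's microphone jack, and ran the code again. This time no error was reported.
However, once listen() was running I said "您好嗎" into the microphone and it just sat there for a long time with no response. Later I found the article below: it turns out you first have to call adjust_for_ambient_noise() to calibrate for the ambient noise picked up by the microphone :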
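A missing microphone like this can be spotted up front by asking PyAudio which devices it sees. The sketch below is one hedged way to do that: `input_devices` and `query_devices` are hypothetical helper names, while `get_device_count()`, `get_device_info_by_index()` and the `maxInputChannels` key come from PyAudio itself.

```python
def input_devices(device_infos):
    """Keep only devices that can record (at least one input channel)."""
    return [d for d in device_infos if d.get("maxInputChannels", 0) > 0]


def query_devices():
    """Ask PyAudio for every audio device it can see (requires PyAudio)."""
    import pyaudio  # imported lazily so input_devices() works without it

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i)
                for i in range(pa.get_device_count())]
    finally:
        pa.terminate()  # always release PortAudio resources


if __name__ == "__main__":
    try:
        found = input_devices(query_devices())
    except ImportError:
        found = []
        print("PyAudio is not installed")
    if not found:
        print("No input device: plug in a microphone and try again")
    for d in found:
        print(d["index"], d["name"])
```

An empty result corresponds to the situation in which Microphone() raises "No Default Input Device Available".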
# Easy Speech Recognition in Python with PyAudio and Pocketsphinx
I modified the example program from that article as follows :
import speech_recognition as sr

r = sr.Recognizer()
# obtain audio from the microphone
with sr.Microphone() as source:
    print("Please wait. Calibrating microphone...")
    # listen for 5 seconds and create the ambient noise energy level
    r.adjust_for_ambient_noise(source, duration=5)
    print("Say something!")
    audio = r.listen(source)

# recognize speech using Google Speech Recognition
try:
    print("Google Speech Recognition thinks you said:")
    print(r.recognize_google(audio, language="zh-TW"))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("No response from Google Speech Recognition service: {0}".format(e))
The program above uses the recognize_google() method of the SpeechRecognition module's Recognizer object, which goes through the Google speech recognition API to turn the audio object audio captured from the microphone into text in the specified language :
r.recognize_google(audio, language='zh-TW')
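Here we pass in the audio object audio and the language parameter, which sets the language to Traditional Chinese, "zh-TW". English spoken under the Chinese setting can still be recognized, though. I saved this program as google_sr.py and ran it, and sure enough it worked :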
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
Google Speech Recognition could not understand audio
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
good morning
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
Donald trump is stupid
Google really is impressive! The recognition rate was 100%!
References :
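The calibration step in google_sr.py is what made the difference, so it helps to picture what an energy threshold does. The sketch below is a simplified illustration of the idea and not the library's actual code: `rms`, `calibrate` and `is_speech` are hypothetical names, and the starting threshold, ratio and damping values are illustrative assumptions.

```python
import math


def rms(samples):
    """Root-mean-square energy of one chunk of PCM samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate(chunks, threshold=300.0, ratio=1.5, damping=0.15):
    """Drift the threshold toward ratio * ambient energy, chunk by chunk."""
    for chunk in chunks:
        target = rms(chunk) * ratio
        # Keep a fraction of the old value and move the rest toward target.
        threshold = threshold * damping + target * (1 - damping)
    return threshold


def is_speech(chunk, threshold):
    """A chunk counts as speech when its energy exceeds the threshold."""
    return rms(chunk) > threshold


if __name__ == "__main__":
    quiet_room = [[10, -10] * 100 for _ in range(50)]  # synthetic background noise
    level = calibrate(quiet_room)
    print("calibrated threshold:", round(level, 2))
    print("loud chunk is speech:", is_speech([1000, -1000] * 100, level))
```

When the threshold sits on the wrong side of the room's noise level, a listener of this kind cannot tell where a phrase starts or ends, which is one plausible reading of why listen() appeared to hang before calibration.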
# Coding Jarvis in Python in 2016
# Speech Recognition with Python
# 透過 Python 使用 Google Speech Recognition 語音辨識服務 (Using the Google Speech Recognition service from Python)
# https://www.youtube.com/channel/UCFdTiwvDjyc62DBWrlYDtlQ
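Update 2017-08-25 :
I got to the office early this morning and tried the same thing there, and found that it does not work, probably because of the firewall :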
D:\Python\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
No response from Google Speech Recognition service: recognition connection faile
d: [WinError 10060] 連線嘗試失敗,因為連線對象有一段時間並未正確回應,或是連線建
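A quick reachability test can tell this kind of firewall problem apart from a recognition problem before any audio is recorded. The sketch below is only an illustration: `can_reach` is a hypothetical helper, and the host `www.google.com` on port 443 is an assumption about where the request goes, so a successful connection does not guarantee that recognition itself will work.

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, DNS failure, blocked by a firewall...
        return False


if __name__ == "__main__":
    host = "www.google.com"  # assumed host; adjust if your setup differs
    if can_reach(host):
        print("Network path looks open; recognition requests may still fail")
    else:
        print("Cannot reach", host, "- recognize_google() will likely raise RequestError")
```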
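Hello, I have read many of your articles and benefited a great deal, thank you very much. Could this speech recognition also be implemented with MicroPython?
I think that would be difficult, because these modules do not seem to have MicroPython versions, and the ESP8266 would also have to drive a microphone and a speaker. I think a Raspberry Pi should be able to do it.
Your pip version is probably too old!
Sorry, I am a beginner. I removed my original Python and installed 3.6.4 following http://yhhuang1966.blogspot.tw/2017/04/windows-python.html, but cmd cannot find pip (https://drive.google.com/open?id=1TILSy5E8JYP63ksVIJAIar85oiokfd7n). Thank you~
In the second screenshot of the installer, the pip option must be checked, and in the third screenshot "Add Python to environment variables" must be checked as well.
No. Try pip3. You should set the system environment variable and work in some other directory, not under Program Files.
Under Control Panel / System / Environment Variables, edit the system variable path and append the Python installation paths to the end, each preceded by a ; character, for example ;C:\Python36\Scripts\;C:\Python36\; so that Python can be run from any directory.
The error message seems to be related to permissions. Try opening the Command Prompt window as administrator and then running python.
Start > Command Prompt > Run as administrator
Is it OK now?
Sorry, I followed the steps in your video and got this
The package installation probably did not succeed :
pip3 install SpeechRecognition-3.7.1-py2.py3-none-any.whl
With Anaconda it seems you have to install through conda. I only use the plain command line.
I am using a Pi 3
pip3 list shows SpeechRecognition 3.8.1, so why does this still happen?
I have only tested this successfully on Windows and have not had time to try it on a Pi 3 yet. I will find time to play with it.
Sorry, I have not yet looked into how to upload a recorded audio file. It seems it has to be stored in Google Cloud Storage first, see :
Hello, I would like to ask: can this Google speech recognition provide a wake-word feature? For example, I just say OK GOOGLE and it automatically starts recognition, and then I can speak the content I want recognized. Is there such a feature? Thank you
I am not sure about this, but I am very interested. It is worth looking into further.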