# [Open Jarvis] How can Python automatically transcribe speech into text?
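I recently discovered that 大數學堂 offers many excellent tutorials covering web scraping, the R language, image recognition, trading systems, Open Jarvis, and more. See:
# 大數軟體有限公司 (YouTube)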
# http://www.largitdata.com/course_list/14
# https://devpost.com/software/open-jarvis
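After watching the videos above, I was surprised at how easy it is to turn speech into text from Python through the Google speech recognition API. I couldn't wait to try it, so right after dinner I set out to verify it myself.
To install the SpeechRecognition package, just run pip install SpeechRecognition (or pip3) in a Command Prompt window: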
D:\Python>pip3 install SpeechRecognition
If you are behind a firewall (on a company network, for example) and cannot install directly, you will see an error like this:
" Could not find a version that satisfies the requirement SpeechRecognition (from versions: )
No matching distribution found for SpeechRecognition"
In that case, first download the whl file (about 32 MB) from the PyPI website and install from it:
# https://pypi.python.org/pypi/SpeechRecognition/
D:\>cd python
D:\Python>pip3 install SpeechRecognition-3.7.1-py2.py3-none-any.whl
Processing d:\python\speechrecognition-3.7.1-py2.py3-none-any.whl
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.7.1
Use pip3 list to confirm that the package really is installed:
D:\Python>pip3 list
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
adafruit-ampy (1.0.1)
beautifulsoup4 (4.5.3)
click (6.7)
colorama (0.3.9)
cycler (0.10.0)
Django (1.8.18)
matplotlib (2.0.0)
mpfshell (0.8.0)
numpy (1.12.1+mkl)
olefile (0.44)
pandas (0.19.2)
Pillow (4.1.0)
pip (9.0.1)
py2exe (0.9.2.0)
PyAutoGUI (0.9.36)
pyFirmata (1.0.3)
PyMsgBox (1.0.6)
pyparsing (2.2.0)
PyScreeze (0.1.9)
pyserial (3.3)
python-dateutil (2.6.0)
PyTweening (1.0.3)
pytz (2017.2)
pyudev (0.21.0)
requests (2.13.0)
rshell (0.0.9)
scikit-learn (0.18.1)
scipy (0.19.0)
setuptools (28.8.0)
six (1.10.0)
SpeechRecognition (3.7.1)
virtualenv (15.1.0)
websocket-client (0.40.0)
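The same check can also be made from inside Python. Below is a minimal sketch using only the standard library. The helper name is_installed is made up for this example. Note that the package is imported as speech_recognition even though pip lists it as SpeechRecognition.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# pip package name -> import name (the two are spelled differently)
packages = {"SpeechRecognition": "speech_recognition"}

for pip_name, import_name in packages.items():
    status = "installed" if is_installed(import_name) else "missing"
    print("{0}: {1}".format(pip_name, status))
```

More entries can be added to the packages dictionary to check other modules the same way.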
Then I immediately tried the sample program from the tutorial video:
>>> import speech_recognition
>>> r=speech_recognition.Recognizer()
>>> with speech_recognition.Microphone() as source:
    audio=r.listen(source)
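But this raised an error (as the traceback shows, it happens when the Microphone object is created, before listen() is even reached):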
Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\speech_recognition\__init__.py", line 108, in get_pyaudio
    import pyaudio
ModuleNotFoundError: No module named 'pyaudio'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    with speech_recognition.Microphone() as source:
  File "C:\Python36\lib\site-packages\speech_recognition\__init__.py", line 79, in __init__
    self.pyaudio_module = self.get_pyaudio()
  File "C:\Python36\lib\site-packages\speech_recognition\__init__.py", line 110, in get_pyaudio
    raise AttributeError("Could not find PyAudio; check installation")
AttributeError: Could not find PyAudio; check installation
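This means the PyAudio module is missing. I went back to PyPI and read the SpeechRecognition package description: if the package is to use a microphone as its audio source, PyAudio must be installed: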
"PyAudio is required if and only if you want to use microphone input (Microphone). PyAudio version 0.2.11+ is required, as earlier versions have known memory management bugs when recording from microphones in certain situations.
If not installed, everything in the library will still work, except attempting to instantiate a Microphone object will raise an AttributeError."
Install PyAudio with the following command:
D:\Python> pip3 install PyAudio
Or download the whl installer from PyPI:
# https://pypi.python.org/pypi/PyAudio/0.2.11#downloads
D:\Python>pip3 install PyAudio-0.2.11-cp36-cp36m-win_amd64.whl
Processing d:\python\pyaudio-0.2.11-cp36-cp36m-win_amd64.whl
Installing collected packages: PyAudio
Successfully installed PyAudio-0.2.11
After installing it I ran the same code again, and it still failed, this time with "No Default Input Device Available":
>>> with speech_recognition.Microphone() as source:
    audio=r.listen(source)

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    with speech_recognition.Microphone() as source:
  File "C:\Python36\lib\site-packages\speech_recognition\__init__.py", line 86, in __init__
    device_info = audio.get_device_info_by_index(device_index) if device_index is not None else audio.get_default_input_device_info()
  File "C:\Python36\lib\site-packages\pyaudio.py", line 949, in get_default_input_device_info
    device_index = pa.get_default_input_device()
OSError: No Default Input Device Available
Searching Google, I found the post below, in which someone suggests also installing "PortAudio":
# PyAudio IOError: No Default Input Device Available
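I found the PortAudio website and downloaded the tgz file, but after unpacking it I found a pile of files with html and sh extensions, and I couldn't tell whether it needed installing or where to put it. Then it occurred to me: could it simply be that I hadn't plugged in a microphone yet? I dug out a microphone I hadn't used in ages, plugged it into the computer's microphone jack, ran the code again, and the error was gone.
However, after calling listen() I said "您好嗎" into the microphone and it just sat there for a long time with no response. I then found the article below: it turns out you should first call adjust_for_ambient_noise() so that the recognizer calibrates itself to the background noise picked up by the microphone: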
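Whether any recording device is visible can be checked before creating a Microphone. Below is a rough sketch, assuming PyAudio is installed. The helper names input_devices and query_device_infos are made up for this example. It relies on PyAudio reporting a maxInputChannels value for each device.

```python
def input_devices(device_infos):
    """Keep only devices that can record, i.e. have at least one input channel."""
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def query_device_infos():
    """Ask PyAudio for the info dictionary of every audio device it can see."""
    import pyaudio  # imported here so the helper above works without PyAudio

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
    finally:
        pa.terminate()


if __name__ == "__main__":
    try:
        found = input_devices(query_device_infos())
    except (ImportError, OSError) as exc:
        print("Could not query audio devices: {0}".format(exc))
    else:
        if found:
            for index, name in found:
                print("input device {0}: {1}".format(index, name))
        else:
            print("No input device found; plug in a microphone first.")
```

An index printed here could then be passed as the device_index argument of Microphone(), the parameter visible in the traceback above.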
# Easy Speech Recognition in Python with PyAudio and Pocketsphinx
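To get a feel for why calibration matters, here is a simplified illustration of an energy threshold. It is not the library's actual implementation: the function names, the 1.5 ratio, and the sample values are all made up for this sketch. The idea is that a chunk of audio counts as speech only when its energy is clearly above the ambient level. My guess is that, without calibration, background noise kept looking like speech, so listen() never saw the phrase end.

```python
import math


def rms(samples):
    """Root-mean-square energy of one chunk of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate(noise_chunks, ratio=1.5):
    """Pick a threshold a fixed ratio above the average ambient energy."""
    average = sum(rms(chunk) for chunk in noise_chunks) / len(noise_chunks)
    return average * ratio


def is_speech(chunk, threshold):
    """A chunk counts as speech only if its energy exceeds the threshold."""
    return rms(chunk) > threshold


# Made-up sample values: two chunks of quiet room hum and one louder spoken chunk.
room_noise = [[100, -100, 100, -100], [120, -120, 120, -120]]
spoken = [900, -800, 850, -900]

threshold = calibrate(room_noise)
print(threshold)                            # 165.0
print(is_speech(room_noise[0], threshold))  # False
print(is_speech(spoken, threshold))         # True
```

In SpeechRecognition itself the corresponding value lives in the Recognizer's energy_threshold attribute, which adjust_for_ambient_noise() updates.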
I adapted the sample program from that article as follows:
import speech_recognition as sr

# obtain audio from the microphone
r=sr.Recognizer()
with sr.Microphone() as source:
    print("Please wait. Calibrating microphone...")
    # listen for 5 seconds and create the ambient noise energy level
    r.adjust_for_ambient_noise(source, duration=5)
    print("Say something!")
    audio=r.listen(source)

# recognize speech using Google Speech Recognition
try:
    print("Google Speech Recognition thinks you said:")
    print(r.recognize_google(audio, language="zh-TW"))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("No response from Google Speech Recognition service: {0}".format(e))
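The program above calls the recognize_google() method of the SpeechRecognition Recognizer object, which uses the Google speech recognition API to transcribe the audio object captured from the microphone into text in the specified language: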
r.recognize_google(audio, language='zh-TW')
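Two arguments are passed here: the audio object and the language parameter, set to "zh-TW" for Traditional Chinese. English spoken under the Chinese setting is still recognized, though. I saved the program as google_sr.py and ran it, and it worked: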
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
你好嗎
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
Google Speech Recognition could not understand audio
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
新年快樂
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
川普是笨蛋
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
金正恩是瘋子
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
good morning
D:\ESP8266\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
Donald trump is stupid
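Google really is impressive! Apart from one attempt that could not be understood, every phrase was transcribed exactly as spoken.
References: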
# SPEECH RECOGNİTİON WİTH PYTHON
# Coding Jarvis in Python in 2016
# Speech Recognition with Python
# Using the Google Speech Recognition service from Python (透過 Python 使用 Google Speech Recognition 語音辨識服務)
# https://www.youtube.com/channel/UCFdTiwvDjyc62DBWrlYDtlQ
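Update, 2017-08-25:
I got to the office early this morning and tried the same thing there, only to find that it does not work, probably because of the firewall: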
D:\Python\test>python google_sr.py
Please wait. Calibrating microphone...
Say something!
Google Speech Recognition thinks you said:
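No response from Google Speech Recognition service: recognition connection failed: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond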
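A quick reachability test before recording saves waiting for that timeout. Below is a minimal sketch using only the standard library. The helper name can_reach is made up for this example, and the host and port in the usage note are assumptions about how the service is reached.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False
```

For example, can_reach("www.google.com", 443) returning False would suggest that the firewall is blocking the connection. On a network that only allows traffic through a proxy, this direct test can fail even when proxied requests would succeed.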
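21 comments:
Hello, I've read many of your articles and learned a great deal from them, thank you. Could this speech recognition also be implemented with MicroPython?
I think that would be difficult: these modules don't seem to have MicroPython versions, and the ESP8266 would also have to drive a microphone and a speaker. I'd expect a Raspberry Pi to manage it, though.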
https://drive.google.com/open?id=1mpi6PbWNZuCpLkgoMu1e5CBrSme2VSoj
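Sorry, I ran into this problem during installation!
Your pip version is probably too old!
Sorry, I'm a beginner. I removed my original Python and installed 3.6.4 following http://yhhuang1966.blogspot.tw/2017/04/windows-python.html, but now cmd can't find pip (https://drive.google.com/open?id=1TILSy5E8JYP63ksVIJAIar85oiokfd7n). Thank you~
Tick the pip option on the second installer screen, and also tick "Add Python to environment variables" on the third screen.
Sorry, after upgrading pip I still get a red error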
https://drive.google.com/open?id=17CaXMhHHzG5QoDqYV87fDkTequtC6wNR
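Does it need something else installed!?
No! Try pip3. You should set the system environment variable and work from another directory rather than under Program Files.
Sorry, I still get this error using pip3 in another directory. Which environment variable should I set!?
Thanks!!
In Control Panel > System > Environment Variables, edit the Path system variable and append the Python install paths, each preceded by a semicolon, to the end, for example ;C:\Python36\Scripts\;C:\Python36\; so that Python can be run from any directory.
The error message seems to be permission-related. Try opening the Command Prompt window as administrator and then running Python.
Start > Command Prompt > Run as administrator
Did that work?
Sorry, I followed the steps in your video and got this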
https://drive.google.com/file/d/1UlwiOtI7xfZk9pkkmKPknM5nzvzRSseW/view?usp=sharing
It looks like the package installation did not succeed:
pip3 install SpeechRecognition-3.7.1-py2.py3-none-any.whl
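With Anaconda I think you have to install through conda; I only use the plain command line.
I'm using a Pi 3
pip3 list shows SpeechRecognition 3.8.1, so why does this still happen?
I've only tested this successfully on Windows and haven't had time to try it on the Pi 3. I'll find some time to experiment.
If I don't have a microphone and want to use a recording directly, how should I go about it?
Sorry, I haven't looked into how to upload a recording yet. It seems the file must first be stored in Google Cloud Storage; see: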
https://cloud.google.com/speech/docs/sync-recognize
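For a local recording, the SpeechRecognition package also provides an AudioFile source that takes the place of Microphone. Below is a minimal sketch, assuming a PCM WAV recording. The helper names is_pcm_wav and transcribe_file are made up for this example, and I have not tried this on the setup described in this post.

```python
import wave


def is_pcm_wav(path):
    """Return True if the standard wave module can open the file as PCM WAV."""
    try:
        with wave.open(path, "rb") as wav:
            return wav.getnframes() >= 0
    except (wave.Error, EOFError, OSError):
        return False


def transcribe_file(path, language="zh-TW"):
    """Send a recorded file to Google through SpeechRecognition's AudioFile."""
    import speech_recognition as sr  # needs the SpeechRecognition package

    r = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = r.record(source)  # read the whole file into an audio object
    return r.recognize_google(audio, language=language)
```

Calling transcribe_file("hello.wav"), with hello.wav being a hypothetical file name, should raise the same UnknownValueError and RequestError exceptions as the microphone version, so the same try/except handling applies.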
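Hello teacher, I'd like to ask: can this Google speech recognition do wake-word activation? For example, I just say "OK Google" and it starts recognizing automatically, and then I can speak what I want recognized. Is there such a feature? Thank you.
I'm not sure about that, but I'm very interested. It's worth looking into further.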