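Friday, October 19, 2018

Python Fintech Study Notes: Google Finance No Longer Allows Historical Data Downloads

Yesterday I pulled "Python 網頁程式交易 APP 實作" off my shelf, a book I bought months ago but never found time to read. I rushed out to buy it when I heard that 明儀 was going out of business. To be honest, I am not very happy with the book's layout; I was only interested in Chapter 14, which explains how to fetch historical Taiwan stock data from Google Finance.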



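Earlier posts in this Python Fintech test series:

Python Fintech Study Notes: Installing the TA-Lib Technical Indicator Package

Entering "TPE: stock code" in Google Finance looks up the intraday real-time price of that security and how its price has been moving: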

https://www.google.com/finance





To retrieve historical prices, you need the Python program introduced in Chapter 14 of the book. The sample code can be downloaded from GitHub:

https://github.com/letylin/pyptbook

Download the second-edition sample code, unzip it, and find the E_14_2 folder for Chapter 14. In E_14_2.py inside that folder, edit line 19 and change GetYahooFinance to GetGoogleFinance:

gf=Gf.GetGoogleFinance(stkno,start,end,False)

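Both E_14_1.py and E_14_2.py fetch from Yahoo, so I suspect the author made a mistake. After saving the change, running python E_14_2.py failed because the pymysql and xlrd modules could not be found. Installing them with pip3 fixes that: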

D:\Python\test>pip3 install pymysql 
Collecting pymysql
  Downloading https://files.pythonhosted.org/packages/a7/7d/682c4a7da195a678047c8f1c51bb7682aaedee1dca7547883c3993ca9282/PyMySQL-0.9.2-py2.py3-none-any.whl (47kB)
Collecting cryptography (from pymysql)
  Downloading https://files.pythonhosted.org/packages/f1/01/a144ec664d3f9ae5837bd72c4d11bdd2d8d403318898e4092457e8af9272/cryptography-2.3.1-cp36-cp36m-win_amd64.whl (1.3MB)
Collecting asn1crypto>=0.21.0 (from cryptography->pymysql)
  Downloading https://files.pythonhosted.org/packages/ea/cd/35485615f45f30a510576f1a56d1e0a7ad7bd8ab5ed7cdc600ef7cd06222/asn1crypto-0.24.0-py2.py3-none-any.whl (101kB)
Requirement already satisfied: six>=1.4.1 in c:\python36\lib\site-packages (from cryptography->pymysql) (1.11.0)
Collecting cffi!=1.11.3,>=1.7 (from cryptography->pymysql)
  Downloading https://files.pythonhosted.org/packages/2f/85/a9184548ad4261916d08a50d9e272bf6f93c54f3735878fbfc9335efd94b/cffi-1.11.5-cp36-cp36m-win_amd64.whl (166kB)
Collecting idna>=2.1 (from cryptography->pymysql)
  Downloading https://files.pythonhosted.org/packages/4b/2a/0276479a4b3caeb8a8c1af2f8e4355746a97fab05a372e4a2c6a6b876165/idna-2.7-py2.py3-none-any.whl (58kB)
Collecting pycparser (from cffi!=1.11.3,>=1.7->cryptography->pymysql)
  Downloading https://files.pythonhosted.org/packages/68/9e/49196946aee219aead1290e00d1e7fdeab8567783e83e1b9ab5585e6206a/pycparser-2.19.tar.gz (158kB)
Building wheels for collected packages: pycparser
  Running setup.py bdist_wheel for pycparser ... done
  Stored in directory: C:\Users\Tony Huang\AppData\Local\pip\Cache\wheels\f2\9a\90\de94f8556265ddc9d9c8b271b0f63e57b26fb1d67a45564511
Successfully built pycparser
Installing collected packages: asn1crypto, pycparser, cffi, idna, cryptography, pymysql
Successfully installed asn1crypto-0.24.0 cffi-1.11.5 cryptography-2.3.1 idna-2.7 pycparser-2.19 pymysql-0.9.2

D:\Python\test>pip3 install xlrd
Collecting xlrd
  Downloading https://files.pythonhosted.org/packages/07/e6/e95c4eec6221bfd8528bcc4ea252a850bffcc4be88ebc367e23a1a84b0bb/xlrd-1.1.0-py2.py3-none-any.whl (108kB)
Installing collected packages: xlrd
Successfully installed xlrd-1.1.0
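
Before re-running a script like this, it can help to check up front which third-party modules are missing instead of hitting them one error at a time. The following is a minimal sketch using only the standard library; the helper name missing_modules is my own and is not part of the book's sample code.

```python
import importlib.util


def missing_modules(names):
    """Return the module names in `names` that cannot be located for import."""
    # find_spec() returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling missing_modules(["pymysql", "xlrd"]) returns the subset that still needs a pip3 install; an empty list means both are available.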

Running E_14_2.py again produced an "HTTP Error 403" (access denied) error:

D:\Python\E_14_2>python E_14_2.py 
https://finance.google.com/finance/historical?q=TPE:1102&startdate=Jan+1%2C+2017&enddate=Oct+18%2C+2018&num=200&ei=1zPaWcjPMc3E0QTivbGwBw&start=1
HTTP Error 403: Forbidden 
程式執行時間 = 1秒
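
The URL printed in the log above shows the query parameters the request relies on. As a rough sketch of how such a URL could be assembled (this is not the book's Gf module, whose source I am not reproducing here), the function and constant names below are hypothetical, and the ei token seen in the log is deliberately left out because I do not know how it is generated.

```python
from datetime import date
from urllib.parse import urlencode

# Endpoint taken from the log output above.
BASE_URL = "https://finance.google.com/finance/historical"

# Fixed English month abbreviations so the result does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def google_date(d):
    """Format a date like 'Jan 1, 2017', the style seen in the logged URL."""
    return f"{MONTHS[d.month - 1]} {d.day}, {d.year}"


def build_historical_url(symbol, start, end, num=200, first_row=1):
    """Build the historical-price URL for a symbol such as 'TPE:1102'."""
    params = {
        "q": symbol,
        "startdate": google_date(start),
        "enddate": google_date(end),
        "num": num,          # rows per page, as in the log
        "start": first_row,  # index of the first row, as in the log
    }
    # safe=":" keeps the colon in 'TPE:1102' unescaped, matching the log.
    return BASE_URL + "?" + urlencode(params, safe=":")
```

For example, build_historical_url("TPE:1102", date(2017, 1, 1), date(2018, 10, 18)) yields the same URL as in the log, minus the ei parameter.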

I assumed the server was rejecting the request because it came from a program, but visiting this URL in a browser is refused as well:

http://finance.google.com/finance/historical?q=TPE:2330
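
The same check can be approximated from a script by sending the request with a browser-like User-Agent header and looking at the status code. This is only a sketch under assumptions: the helper names are hypothetical, and the User-Agent value is just an example string. If the server answers 403 even with such a header, the refusal is probably not aimed at scripts specifically.

```python
import urllib.error
import urllib.request

# Example browser-like User-Agent (an assumed value; any realistic one would do).
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def make_browser_request(url):
    """Create a request object that identifies itself like a desktop browser."""
    return urllib.request.Request(url, headers={"User-Agent": BROWSER_UA})


def status_of(request, timeout=10):
    """Send the request and return the HTTP status code, e.g. 200 or 403."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return exc.code
```

Usage would be status_of(make_browser_request("http://finance.google.com/finance/historical?q=TPE:2330")). Network failures other than HTTP error responses (for example a DNS error) are not handled here and would raise urllib.error.URLError.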




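Google really did not see this good deed through: its own crawlers collect data everywhere, yet it will not let anyone else fetch its data. Or perhaps, once historical data was opened up for download, the traffic grew so heavy that even its cloud servers could not cope?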

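References:

Getting Taiwan (listed/OTC) and US stock quotes with Google Sheets (Spreadsheet) (updated 2017-05-14)
Google Finance supports Taiwan stock price lookup and market charts
Google Finance's feature for looking up Taiwan stocks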
Google Finance has removed Historical Prices?
Your Financial Markets Data API https://intrinio.com
