Python Fintech 測試系列此前文章參考 :
# Python Fintech 學習筆記 : 安裝技術指標套件 TA-Lib
# Python Fintech 學習筆記 : Google Finance 無法下載歷史資料
"Python 網頁程式交易 APP 實作" 範例程式下載連結 :
# https://github.com/letylin/pyptbook
我下載的是第二版範例 :
# 二版課本範例程式.zip
Yahoo 版抓取程式為其中的 E_14_1.py, 但在我的電腦直接執行時卻出現缺少 pymysql, xlrd, fix_yahoo_finance 與 openpyxl 等模組之錯誤訊息, 須用 pip3 安裝 :
D:\Python\E_14_1>pip3 install pymysql
Collecting pymysql
Downloading https://files.pythonhosted.org/packages/a7/7d/682c4a7da195a678047c8f1c51bb7682aaedee1dca7547883c3993ca9282/PyMySQL-0.9.2-py2.py3-none-any.whl (47kB)
Collecting cryptography (from pymysql)
Downloading https://files.pythonhosted.org/packages/f1/01/a144ec664d3f9ae5837bd72c4d11bdd2d8d403318898e4092457e8af9272/cryptography-2.3.1-cp36-cp36m-win_amd64.whl (1.3MB)
Requirement already satisfied: six>=1.4.1 in c:\python36\lib\site-packages (from cryptography->pymysql) (1.11.0)
Requirement already satisfied: idna>=2.1 in c:\python36\lib\site-packages (from cryptography->pymysql) (2.6)
Collecting asn1crypto>=0.21.0 (from cryptography->pymysql)
Downloading https://files.pythonhosted.org/packages/ea/cd/35485615f45f30a510576f1a56d1e0a7ad7bd8ab5ed7cdc600ef7cd06222/asn1crypto-0.24.0-py2.py3-none-any.whl (101kB)
Collecting cffi!=1.11.3,>=1.7 (from cryptography->pymysql)
Downloading https://files.pythonhosted.org/packages/2f/85/a9184548ad4261916d08a50d9e272bf6f93c54f3735878fbfc9335efd94b/cffi-1.11.5-cp36-cp36m-win_amd64.whl (166kB)
Collecting pycparser (from cffi!=1.11.3,>=1.7->cryptography->pymysql)
Downloading https://files.pythonhosted.org/packages/68/9e/49196946aee219aead1290e00d1e7fdeab8567783e83e1b9ab5585e6206a/pycparser-2.19.tar.gz (158kB)
Building wheels for collected packages: pycparser
Running setup.py bdist_wheel for pycparser: started
Running setup.py bdist_wheel for pycparser: finished with status 'done'
Stored in directory: C:\Users\Tony\AppData\Local\pip\Cache\wheels\f2\9a\90\de94f8556265ddc9d9c8b271b0f63e57b26fb1d67a45564511
Successfully built pycparser
Installing collected packages: asn1crypto, pycparser, cffi, cryptography, pymysql
Successfully installed asn1crypto-0.24.0 cffi-1.11.5 cryptography-2.3.1 pycparser-2.19 pymysql-0.9.2
D:\Python\E_14_1>pip3 install xlrd
Collecting xlrd
Downloading https://files.pythonhosted.org/packages/07/e6/e95c4eec6221bfd8528bcc4ea252a850bffcc4be88ebc367e23a1a84b0bb/xlrd-1.1.0-py2.py3-none-any.whl (108kB)
Installing collected packages: xlrd
Successfully installed xlrd-1.1.0
D:\Python\E_14_1>pip3 install fix_yahoo_finance
Collecting fix_yahoo_finance
Downloading https://files.pythonhosted.org/packages/0a/96/d44330e427f5368cb8abd25997b72956a31b52073d285c4d5cd56e5fdc17/fix-yahoo-finance-0.0.22.tar.gz
Requirement already satisfied: pandas in c:\python36\lib\site-packages (from fix_yahoo_finance) (0.22.0)
Requirement already satisfied: numpy in c:\python36\lib\site-packages (from fix_yahoo_finance) (1.14.1)
Requirement already satisfied: requests in c:\python36\lib\site-packages (from fix_yahoo_finance) (2.18.4)
Collecting multitasking (from fix_yahoo_finance)
Downloading https://files.pythonhosted.org/packages/ac/1a/0750416c5e3683d170757e423f097fdf78ceb9ccdc65658b24341664e53e/multitasking-0.0.7.tar.gz
Requirement already satisfied: python-dateutil>=2 in c:\python36\lib\site-packages (from pandas->fix_yahoo_finance) (2.6.1)
Requirement already satisfied: pytz>=2011k in c:\python36\lib\site-packages (from pandas->fix_yahoo_finance) (2018.3)
Requirement already satisfied: chardet<3 .1.0="">=3.0.2 in c:\python36\lib\site-packages (from requests->fix_yahoo_finance) (3.0.4)3>
Requirement already satisfied: certifi>=2017.4.17 in c:\python36\lib\site-packages (from requests->fix_yahoo_finance) (2018.4.16)
Requirement already satisfied: urllib3<1 .23="">=1.21.1 in c:\python36\lib\site-packages (from requests->fix_yahoo_finance) (1.22)1>
Requirement already satisfied: idna<2 .7="">=2.5 in c:\python36\lib\site-packages (from requests->fix_yahoo_finance) (2.6)2>
Requirement already satisfied: six>=1.5 in c:\python36\lib\site-packages (from python-dateutil>=2->pandas->fix_yahoo_finance) (1.11.0)
Building wheels for collected packages: fix-yahoo-finance, multitasking
Running setup.py bdist_wheel for fix-yahoo-finance: started
Running setup.py bdist_wheel for fix-yahoo-finance: finished with status 'done'
Stored in directory: C:\Users\Tony\AppData\Local\pip\Cache\wheels\2c\ca\ce\218a19aaecf63fd74c75d6a6772b1a799fa05826d8762bfd83
Running setup.py bdist_wheel for multitasking: started
Running setup.py bdist_wheel for multitasking: finished with status 'done'
Stored in directory: C:\Users\Tony\AppData\Local\pip\Cache\wheels\41\e4\48\af808a1c57f43f104042abdaf80fa623ab213ca0268ba4189c
Successfully built fix-yahoo-finance multitasking
Installing collected packages: multitasking, fix-yahoo-finance
Successfully installed fix-yahoo-finance-0.0.22 multitasking-0.0.7
Collecting openpyxl
Downloading https://files.pythonhosted.org/packages/e5/0a/e0a095149a23cedd9c8db6cdde2af7f82105e219e14edea0c31a19aeff9e/openpyxl-2.5.8.tar.gz (1.9MB)
Collecting jdcal (from openpyxl)
Downloading https://files.pythonhosted.org/packages/a0/38/dcf83532480f25284f3ef13f8ed63e03c58a65c9d3ba2a6a894ed9497207/jdcal-1.4-py2.py3-none-any.whl
Collecting et_xmlfile (from openpyxl)
Downloading https://files.pythonhosted.org/packages/22/28/a99c42aea746e18382ad9fb36f64c1c1f04216f41797f2f0fa567da11388/et_xmlfile-1.0.1.tar.gz
Building wheels for collected packages: openpyxl, et-xmlfile
Running setup.py bdist_wheel for openpyxl ... done
Stored in directory: C:\Users\Tony Huang\AppData\Local\pip\Cache\wheels\3f\37\28\5ab3dffb7ff261e6fa21455ec9d157f95958e818c6b89f024c
Running setup.py bdist_wheel for et-xmlfile ... done
Stored in directory: C:\Users\Tony Huang\AppData\Local\pip\Cache\wheels\2a\77\35\0da0965a057698121fc7d8c5a7a9955cdbfb3cc4e2423cad39
Successfully built openpyxl et-xmlfile
Installing collected packages: jdcal, et-xmlfile, openpyxl
Successfully installed et-xmlfile-1.0.1 jdcal-1.4 openpyxl-2.5.8
另外還需要 pandas 模組處理 DataFrame 物件以儲存所擷取之資料, 需用 pip3 安裝 pandas, 但我電腦之前已安裝過, 因此不會有找不到 pandas 問題.
書中範例 E_14_1.py 中我只改了要抓取的起迄日期 :
start = datetime.datetime(2017,10,1)
end = datetime.datetime(2018,10,19)
執行結果如下 :
D:\Python\E_14_1>python E_14_1.py
[*********************100%***********************] 1 of 1 downloaded
Date Open High Low Close Volume
0 2007-10-01 45.558601 46.860199 45.233200 45.558601 16490774
1 2007-10-02 45.558601 46.372101 44.907700 45.151798 10906577
2 2007-10-03 45.151798 45.883999 43.524700 44.419601 17748233
3 2007-10-04 43.931499 43.931499 41.409500 41.979000 22471999
4 2007-10-05 41.816200 43.036598 41.490799 42.792500 18974962
5 2007-10-08 43.117901 43.768799 42.711201 43.524700 9772038
6 2007-10-09 43.117901 43.199299 41.653500 43.199299 8877189
7 2007-10-11 43.199299 46.209400 43.199299 46.209400 38372764
8 2007-10-12 46.209400 47.917900 45.151798 45.151798 25902660
9 2007-10-15 45.558601 46.778900 44.500900 45.151798 12476249
10 2007-10-16 44.419601 44.826401 43.850101 44.175499 10468987
11 2007-10-17 43.931499 44.338200 42.873901 43.850101 10294442
12 2007-10-18 43.850101 44.744999 43.117901 44.338200 7405852
13 2007-10-19 44.338200 44.744999 43.850101 44.012798 4603306
14 2007-10-22 42.304401 42.467098 41.490799 42.223000 11138894
15 2007-10-23 42.223000 42.711201 41.979000 42.385700 8129844
16 2007-10-24 42.711201 43.850101 41.897598 41.897598 10767679
17 2007-10-25 42.385700 43.036598 41.897598 41.897598 7598835
18 2007-10-26 42.304401 43.524700 42.223000 43.524700 11097101
19 2007-10-29 43.768799 44.744999 43.117901 44.744999 11185603
20 2007-10-30 44.582298 44.582298 43.605999 43.687401 5092522
21 2007-10-31 43.687401 43.768799 42.060299 42.629799 8389202
22 2007-11-01 43.280602 43.443298 40.514599 40.921299 12767567
23 2007-11-02 40.270500 40.270500 38.968800 39.050201 16188394
24 2007-11-05 38.684101 38.684101 38.236599 38.562099 15356235
25 2007-11-06 39.457001 39.863701 38.765499 39.457001 14665432
26 2007-11-07 40.270500 40.473900 39.619701 39.945099 11753487
27 2007-11-08 39.212898 39.212898 38.318001 38.846802 7270642
28 2007-11-09 38.846802 39.538300 37.870602 39.538300 9424178
29 2007-11-12 38.236599 38.562099 36.772301 37.016300 10525529
... ... ... ... ... ... ...
2442 2017-08-21 27.200001 27.350000 27.049999 27.250000 1832333
2443 2017-08-22 27.299999 27.400000 27.250000 27.299999 1246889
2444 2017-08-23 27.400000 27.400000 27.100000 27.250000 1961836
2445 2017-08-24 27.200001 27.450001 27.150000 27.450001 1893695
2446 2017-08-25 27.500000 27.549999 27.350000 27.350000 2735921
2447 2017-08-28 27.450001 27.450001 27.350000 27.400000 2451080
2448 2017-08-29 27.400000 27.400000 27.250000 27.400000 3203620
2449 2017-08-30 26.600000 27.000000 26.600000 26.900000 3838522
2450 2017-08-31 27.000000 27.350000 27.000000 27.350000 3695972
2451 2017-09-01 27.400000 27.600000 27.299999 27.450001 3451475
2452 2017-09-04 27.299999 27.450001 27.150000 27.450001 1711265
2453 2017-09-05 27.450001 27.600000 27.299999 27.450001 2508250
2454 2017-09-06 27.350000 27.350000 27.100000 27.200001 2100535
2455 2017-09-07 27.200001 27.200001 26.950001 27.100000 2700381
2456 2017-09-08 27.100000 27.400000 27.000000 27.299999 2028684
2457 2017-09-11 27.299999 27.350000 27.100000 27.200001 1863056
2458 2017-09-12 27.250000 27.250000 27.049999 27.150000 1988872
2459 2017-09-13 27.100000 27.100000 26.950001 26.950001 1732912
2460 2017-09-14 26.799999 27.250000 26.799999 27.250000 1714303
2461 2017-09-15 27.250000 27.500000 27.000000 27.500000 3562428
2462 2017-09-18 27.500000 27.650000 27.299999 27.650000 3323904
2463 2017-09-19 27.700001 27.700001 27.450001 27.600000 2012329
2464 2017-09-20 27.299999 27.600000 27.250000 27.549999 1858452
2465 2017-09-21 27.500000 27.500000 27.200001 27.250000 1967469
2466 2017-09-22 27.200001 27.200001 26.950001 26.950001 2382746
2467 2017-09-25 26.950001 27.450001 26.850000 27.350000 2749546
2468 2017-09-26 27.400000 27.400000 27.049999 27.250000 1039137
2469 2017-09-27 27.049999 27.200001 27.000000 27.049999 1616491
2470 2017-09-28 27.049999 27.100000 26.650000 26.700001 4079153
2471 2017-09-29 26.700001 26.950001 26.650000 26.750000 1903258
[2472 rows x 6 columns]
C:\Python36\lib\site-packages\pandas\io\excel.py:784: DeprecationWarning: Call to deprecated function remove_sheet (Use wb.remove(worksheet) or del wb[sheetname]).
self.book.remove_sheet(self.book.worksheets[0])
程式執行時間 = 5秒
如果擷取失敗會出現下列訊息 :
D:\Python\test\E_14_1>python E_14_1.py
[*********************100%***********************] 1 of 1 downloaded
zero-size array to reduction operation maximum which has no identity
Traceback (most recent call last):
File "E_14_1.py", line 33, in <module>
df = main(stkno,start,end) #呼叫主程式
File "E_14_1.py", line 21, in main
RowData =gf.getstock(True)
File "D:\Python\test\E_14_1\ProgramTrade\class_GetYahooFinance.py", line 54, in getstock
if len(Data) != 0: #有取得資料才進行轉換
UnboundLocalError: local variable 'Data' referenced before assignment
範例程式 E_14_1.py 將擷取網頁與儲存等功能寫成 GetYahooFinance.py 類別, 然後在
E_14_1.py 中建立物件來抓取網頁資料, 這使得程式閱讀起來很彆扭. 事實上此程式是利用第三方模組 fix_yahoo_finance 與 pandas_datareader 來抓資料, 垂回 Pandas 的 DataFrame 物件.
我將其中的核心部分改寫成函數 get_yahoo_twstock(), 如下列範例所示 :
import datetime
import pandas as pd
import fix_yahoo_finance as yf
from pandas_datareader import data as pdr
def get_yahoo_twstock(stock_no, start_date, end_date):
start=datetime.datetime.strptime(start_date,'%Y%m%d')
end=datetime.datetime.strptime(end_date,'%Y%m%d')
yf.pdr_override()
df=pdr.get_data_yahoo(stock_no + '.tw', start, end)
del df['Adj Close']
df=df.reset_index()
return df
stock_no='0050'
start_date='20170101'
end_date='20181022'
data=get_yahoo_twstock(stock_no, start_date, end_date)
print(data)
print(type(data))
注意, get_yahoo_twstock() 的三個參數都是字串, 起訖日期在函數裡面要用 datetime.datetime.strptime() 將字串轉成 datetime.datetime 物件.
>>> %Run test.py
[ 0% ]
[*********************100%***********************] 1 of 1 downloaded
Date Open High Low Close Volume
0 2018-10-01 86.900002 87.400002 86.900002 87.349998 5912234
1 2018-10-02 87.250000 87.250000 86.000000 86.050003 5434332
2 2018-10-03 86.050003 86.300003 85.800003 85.949997 4369690
3 2018-10-04 85.699997 85.699997 84.800003 84.900002 9736968
4 2018-10-05 84.500000 84.500000 83.000000 83.449997 26744096
5 2018-10-08 82.949997 83.400002 82.500000 83.000000 46512204
6 2018-10-09 83.050003 83.400002 82.800003 83.250000 20530897
7 2018-10-11 80.150002 80.150002 77.349998 77.400002 68551624
<class 'pandas.core.frame.DataFrame'>
可見它沒辦法抓到今天為止的歷史資料, 只能到 11 天前左右, 而且有時候會執行失敗, 但這樣總比 Google Finance 完全不能下載歷史資料好吧! 就是沒辦法抓到今天為止的歷史資料有點可惜, 只能用在回測.
沒有留言:
張貼留言