Monday, November 5, 2018

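Python Study Notes: Testing twstock, a Taiwan Stock Data Retrieval Module (Part 2)

Because the test notes ran too long, the methods of the Stock object, the centerpiece of the twstock package, are recorded here. For the tests of its attributes, see:

Python Study Notes: Testing twstock, a Taiwan Stock Data Retrieval Module (Part 1)

For the twstock documentation, see GitHub: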

https://github.com/mlouielu/twstock

The methods of the Stock object are listed in the table below:

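 Stock method                            Description
 fetch(year, month)                      Returns a list of trading data (Data objects) for the given year and month
 fetch_from(year, month)                 Returns a list of trading data (Data objects) from the given year and month up to now
 fetch_31()                              Returns a list of trading data (Data objects) for the most recent 31 trading days (same as the data attribute)
 ma_bias_ratio(day1, day2)               Returns a list of bias values between the short-period (day1) and long-period (day2) average prices
 ma_bias_ratio_pivot(data [, size=5])    Returns a tuple describing the positive/negative bias values
 moving_average(data, days)              Returns a list of days-day moving averages of the list data
 continuous(data)                        Returns the number of days the values in the list data have kept rising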

Reference:

https://github.com/mlouielu/twstock/blob/master/docs/reference/stock.rst

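The fetch(), fetch_31(), and fetch_from() methods retrieve historical trading data for a given range. fetch_31() does the same thing as the data attribute: both return Data objects for the most recent 31 trading days. fetch() retrieves the historical data for a given year and month, and fetch_from() retrieves the historical data from a given year and month up to now. For example: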

>>> tw2330.fetch_31() 
[Data(date=datetime.datetime(2018, 9, 20, 0, 0), capacity=35577071, turnover=9225009310, open=261.5, high=261.5, low=257.5, close=260.0, change=2.0, transaction=9444), Data(date=datetime.datetime(2018, 9, 21, 0, 0), capacity=36500974, turnover=9510363894, open=261.5, high=261.5, low=258.0, close=261.5, change=1.5, transaction=10159), Data(date=datetime.datetime(2018, 9, 25, 0, 0), capacity=24978062, turnover=6552894296, open=261.5, high=264.0, low=260.5, close=263.5, change=2.0, transaction=10082), Data(date=datetime.datetime(2018, 9, 26, 0, 0), capacity=25061115, turnover=6577367245, open=263.0, high=263.5, low=261.0, close=263.5, change=0.0, transaction=7007), Data(date=datetime.datetime(2018, 9, 27, 0, 0), capacity=38495371, turnover=10185827315, open=264.0, high=266.0, low=262.0, close=265.0, change=1.5, transaction=11536), Data(date=datetime.datetime(2018, 9, 28, 0, 0), capacity=39645486, turnover=10412455506, open=266.0, high=266.0, low=260.0, close=262.5, change=-2.5, transaction=13060), Data(date=datetime.datetime(2018, 10, 1, 0, 0), capacity=22409380, turnover=5882532290, open=262.0, high=264.0, low=261.0, close=263.0, change=0.5, transaction=11914), Data(date=datetime.datetime(2018, 10, 2, 0, 0), capacity=38391491, turnover=9926505357, open=262.0, high=263.0, low=257.0, close=257.5, change=-5.5, transaction=17095), Data(date=datetime.datetime(2018, 10, 3, 0, 0), capacity=25228536, turnover=6527393860, open=257.5, high=260.0, low=257.0, close=260.0, change=2.5, transaction=9992), Data(date=datetime.datetime(2018, 10, 4, 0, 0), capacity=38009727, turnover=9690178362, open=257.0, high=257.5, low=254.0, close=254.0, change=-6.0, transaction=15094), Data(date=datetime.datetime(2018, 10, 5, 0, 0), capacity=40396660, turnover=10101451270, open=250.0, high=253.0, low=248.5, close=250.0, change=-4.0, transaction=16031), Data(date=datetime.datetime(2018, 10, 8, 0, 0), capacity=51083958, turnover=12426290240, open=245.5, high=246.5, low=241.0, close=243.5, 
change=-6.5, transaction=20745), Data(date=datetime.datetime(2018, 10, 9, 0, 0), capacity=28933345, turnover=7051600725, open=243.5, high=245.0, low=242.0, close=244.0, change=0.5, transaction=11743), Data(date=datetime.datetime(2018, 10, 11, 0, 0), capacity=96033657, turnover=22115862239, open=233.5, high=233.5, low=227.0, close=227.5, change=-16.5, transaction=37543), Data(date=datetime.datetime(2018, 10, 12, 0, 0), capacity=54439769, turnover=12650322522, open=231.0, high=237.0, low=229.0, close=237.0, change=9.5, transaction=17980), Data(date=datetime.datetime(2018, 10, 15, 0, 0), capacity=46471280, turnover=10766953220, open=234.0, high=234.0, low=230.5, close=230.5, change=-6.5, transaction=16558), Data(date=datetime.datetime(2018, 10, 16, 0, 0), capacity=39129077, turnover=9155829047, open=229.5, high=237.0, low=229.0, close=237.0, change=6.5, transaction=15182), Data(date=datetime.datetime(2018, 10, 17, 0, 0), capacity=42887858, turnover=10318955161, open=241.5, high=243.0, low=238.0, close=238.5, change=1.5, transaction=17550), Data(date=datetime.datetime(2018, 10, 18, 0, 0), capacity=27793430, turnover=6587887632, open=238.0, high=239.0, low=235.5, close=236.5, change=-2.0, transaction=10599), Data(date=datetime.datetime(2018, 10, 19, 0, 0), capacity=29648460, turnover=6927405460, open=231.0, high=236.5, low=230.0, close=236.0, change=-0.5, transaction=12378), Data(date=datetime.datetime(2018, 10, 22, 0, 0), capacity=29275777, turnover=6872467149, open=232.5, high=238.0, low=231.5, close=237.0, change=1.0, transaction=10384), Data(date=datetime.datetime(2018, 10, 23, 0, 0), capacity=36757745, turnover=8538586595, open=232.0, high=234.0, low=230.0, close=230.0, change=-7.0, transaction=13071), Data(date=datetime.datetime(2018, 10, 24, 0, 0), capacity=45480950, turnover=10426514872, open=230.0, high=231.0, low=227.0, close=229.5, change=-0.5, transaction=17339), Data(date=datetime.datetime(2018, 10, 25, 0, 0), capacity=75524118, turnover=16643542368, 
open=220.5, high=222.0, low=219.5, close=219.5, change=-10.0, transaction=27514), Data(date=datetime.datetime(2018, 10, 26, 0, 0), capacity=54073644, turnover=11941421696, open=223.0, high=224.0, low=217.0, close=221.0, change=1.5, transaction=17106), Data(date=datetime.datetime(2018, 10, 29, 0, 0), capacity=18463792, turnover=4107663616, open=223.0, high=224.0, low=221.0, close=222.5, change=1.5, transaction=7679), Data(date=datetime.datetime(2018, 10, 30, 0, 0), capacity=29085931, turnover=6487213284, open=221.0, high=225.0, low=220.5, close=223.0, change=0.5, transaction=8574), Data(date=datetime.datetime(2018, 10, 31, 0, 0), capacity=59933201, turnover=13844478353, open=228.0, high=234.0, low=228.0, close=234.0, change=11.0, transaction=19183), Data(date=datetime.datetime(2018, 11, 1, 0, 0), capacity=44514797, turnover=10480329245, open=236.0, high=237.0, low=233.0, close=235.5, change=1.5, transaction=11435), Data(date=datetime.datetime(2018, 11, 2, 0, 0), capacity=30785361, turnover=7249667833, open=236.5, high=236.5, low=233.5, close=236.5, change=1.0, transaction=11223), Data(date=datetime.datetime(2018, 11, 5, 0, 0), capacity=28157065, turnover=6592130635, open=233.0, high=235.5, low=232.0, close=235.0, change=-1.5, transaction=6994)]
>>> tw2330.fetch(2018, 11)    
[Data(date=datetime.datetime(2018, 11, 1, 0, 0), capacity=44514797, turnover=10480329245, open=236.0, high=237.0, low=233.0, close=235.5, change=1.5, transaction=11435), Data(date=datetime.datetime(2018, 11, 2, 0, 0), capacity=30785361, turnover=7249667833, open=236.5, high=236.5, low=233.5, close=236.5, change=1.0, transaction=11223), Data(date=datetime.datetime(2018, 11, 5, 0, 0), capacity=28157065, turnover=6592130635, open=233.0, high=235.5, low=232.0, close=235.0, change=-1.5, transaction=6994)]
>>> tw2330.fetch_from(2018, 11) 
[Data(date=datetime.datetime(2018, 11, 1, 0, 0), capacity=44514797, turnover=10480329245, open=236.0, high=237.0, low=233.0, close=235.5, change=1.5, transaction=11435), Data(date=datetime.datetime(2018, 11, 2, 0, 0), capacity=30785361, turnover=7249667833, open=236.5, high=236.5, low=233.5, close=236.5, change=1.0, transaction=11223), Data(date=datetime.datetime(2018, 11, 5, 0, 0), capacity=28157065, turnover=6592130635, open=233.0, high=235.5, low=232.0, close=235.0, change=-1.5, transaction=6994)]
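
To make the difference between fetch() and fetch_from() concrete, here is a minimal sketch of the month range that a "from this month until now" request has to cover. The helper name months_from and its today parameter are my own, introduced only for illustration; I am assuming, without having checked the source, that fetch_from() walks through the months one at a time in a similar way.

```python
from datetime import date

def months_from(year, month, today=None):
    """Yield (year, month) pairs from the given month through today's month."""
    today = today or date.today()
    y, m = year, month
    while (y, m) <= (today.year, today.month):
        yield y, m
        # Advance one month, rolling over into the next year after December.
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)

# Pretend "today" is the date of the session above (2018-11-05).
print(list(months_from(2018, 9, today=date(2018, 11, 5))))
# [(2018, 9), (2018, 10), (2018, 11)]
```

With a start of (2018, 11) the range contains only the current month, which is why fetch(2018, 11) and fetch_from(2018, 11) return the same three records in the session above.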

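These fetch methods live in stock.py in the twstock source code. The work is mainly done by the TWSEFetcher class, or more precisely by its fetch() method. This code is a good model for writing a web scraper: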

    TWSE_BASE_URL = 'http://www.twse.com.tw/'
    REPORT_URL = urllib.parse.urljoin(TWSE_BASE_URL, 'exchangeReport/STOCK_DAY')
    .....

    def fetch(self, year: int, month: int, sid: str, retry: int=5):
        params = {'date': '%d%02d01' % (year, month), 'stockNo': sid}
        for retry_i in range(retry):
            r = requests.get(self.REPORT_URL, params=params)
            try:
                data = r.json()
            except JSONDecodeError:
                continue
            else:
                break
        else:
            # Fail in all retries
            data = {'stat': '', 'data': []}

        if data['stat'] == 'OK':
            data['data'] = self.purify(data)
        else:
            data['data'] = []
        return data

Reference:

https://github.com/mlouielu/twstock/blob/master/twstock/stock.py#L49
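
The part of that excerpt worth copying is the for/else retry loop: the else branch of a for loop runs only when the loop finishes without hitting break, so it is a natural place for the "all retries failed" fallback. Below is a minimal, network-free sketch of the same pattern. The function fetch_json_with_retry and the get_text callable are hypothetical stand-ins for the HTTP request; they are not part of twstock.

```python
import json

def fetch_json_with_retry(get_text, retry=5):
    """Call get_text() up to `retry` times until its result parses as JSON."""
    for _ in range(retry):
        try:
            data = json.loads(get_text())
        except json.JSONDecodeError:
            continue  # body was not valid JSON: try again
        else:
            break     # parsed fine: leave the loop
    else:
        # Reached only if the loop never hit `break`, i.e. every attempt failed.
        data = {'stat': '', 'data': []}
    return data

# Hypothetical stand-in for the server: two bad bodies, then a good one.
responses = iter(['<html>busy</html>', '', '{"stat": "OK", "data": [[1, 2]]}'])
print(fetch_json_with_retry(lambda: next(responses)))
# {'stat': 'OK', 'data': [[1, 2]]}
```

Passing the request in as a callable keeps the retry logic testable without touching the network.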


The other four methods live in the file analytics.py; see:

https://github.com/mlouielu/twstock/blob/master/twstock/analytics.py

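moving_average(data, days) returns the moving average of a list of values over a given period. It takes two arguments, the data list data and the averaging period days, and returns a list of averages. continuous(data) returns the number of consecutive days the values in data have been rising; pass in a list of average prices and you get the number of consecutive days the average price has risen. For example: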

>>> ma5=tw2330.moving_average(tw2330.price, 5)   # 5-day average price
>>> ma5 
[262.7, 263.2, 263.5, 262.3, 261.6, 259.4, 256.9, 253.0, 250.3, 243.8, 240.4, 236.5, 235.2, 234.1, 235.9, 235.7, 237.0, 235.6, 233.8, 230.4, 227.4, 224.5, 223.1, 224.0, 227.2, 230.3, 232.8]
>>> tw2330.continuous(ma5)    # consecutive days the 5-day average price has risen
4
>>> tw2330.moving_average(tw2330.capacity, 5)   # 5-day average volume
[32122518.6, 32936201.6, 30117882.8, 32800568.6, 32834052.8, 32736924.0, 32887158.8, 38622074.4, 36730445.2, 50891469.4, 54177477.8, 55392401.8, 53001425.6, 55792328.2, 42144282.8, 37186021.0, 33746920.4, 33272654.0, 33791272.4, 43337410.0, 48222446.8, 46060049.8, 44525687.0, 47416137.2, 41214273.0, 36556616.4, 38495271.0]
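
To check my understanding of these two methods, here is a plain-Python sketch that reproduces the numbers above from the 31 closing prices of this session. These are my own re-implementations, not twstock's code: rising_streak is a hypothetical name, it only counts rises, and twstock's continuous() may treat a falling run differently.

```python
def moving_average(data, days):
    """Simple moving average of `data` over a sliding window of `days` items."""
    return [round(sum(data[i:i + days]) / days, 2)
            for i in range(len(data) - days + 1)]

def rising_streak(data):
    """Count how many consecutive rises the series ends with."""
    streak = 0
    for i in range(len(data) - 1, 0, -1):
        if data[i] > data[i - 1]:
            streak += 1
        else:
            break
    return streak

# Closing prices of 2330 from 2018-09-20 to 2018-11-05 (the price list in this post).
price = [260.0, 261.5, 263.5, 263.5, 265.0, 262.5, 263.0, 257.5, 260.0, 254.0,
         250.0, 243.5, 244.0, 227.5, 237.0, 230.5, 237.0, 238.5, 236.5, 236.0,
         237.0, 230.0, 229.5, 219.5, 221.0, 222.5, 223.0, 234.0, 235.5, 236.5,
         235.0]

ma5 = moving_average(price, 5)
print(len(ma5))            # 27 values: 31 prices minus the 5-day window, plus 1
print(ma5[-5:])            # [223.1, 224.0, 227.2, 230.3, 232.8]
print(rising_streak(ma5))  # 4
```

The last five averages rise four times in a row (223.1 to 232.8) after a drop from 224.5 to 223.1, which matches the 4 returned by continuous(ma5).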

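The ma_bias_ratio() method is meant to compute the bias ratio (%) of the average price. The bias ratio is defined as:

Bias ratio = (current price - moving average price) / moving average price × 100%

The bias ratio (BIAS) describes how far the current price is from its average over a past period, that is, how much the price deviates from the moving average line. A larger bias ratio means the price is farther from the moving average; a large positive bias means the price has risen too far. If the moving average is treated as the average holding cost, the bias ratio is roughly the return on investment. See:

What is the bias ratio (BIAS)?

ma_bias_ratio(day1, day2) computes the bias between the short-period and long-period moving averages, not the bias ratio of the closing price. It takes two period arguments, day1 for the short period and day2 for the long period, computes the moving average for each, and subtracts the long-period list from the short-period list. The values it returns are therefore price differences, not percentages. For example: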

>>> tw2330.ma_bias_ratio(5, 10)  # bias values between the 5-day and 10-day averages
[-1.650000000000034, -3.150000000000034, -5.25, -6.0, -8.899999999999977, -9.5, -10.199999999999989, -8.900000000000006, -8.099999999999994, -3.9499999999999886, -2.3500000000000227, 0.25, 0.19999999999998863, -0.14999999999997726, -2.75, -4.150000000000006, -6.25, -6.25, -4.900000000000006, -1.6000000000000227, 1.450000000000017, 4.150000000000006]
>>> tw2330.ma_bias_ratio_pivot(tw2330.price) 
(False, 4, 223.0)
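
The difference between the textbook bias ratio and what ma_bias_ratio() returns is easier to see side by side. The sketch below uses my own helper names (ma_difference and bias_percent are hypothetical, not twstock functions) and only the last 11 closing prices of this session, which is enough to reproduce the last two values of the list above.

```python
def moving_average(data, days):
    """Simple moving average of `data` over a sliding window of `days` items."""
    return [round(sum(data[i:i + days]) / days, 2)
            for i in range(len(data) - days + 1)]

def ma_difference(data, short, long):
    """Short-period MA minus long-period MA, aligned on the most recent day."""
    ma_short = moving_average(data, short)
    ma_long = moving_average(data, long)
    n = min(len(ma_short), len(ma_long))
    return [s - l for s, l in zip(ma_short[-n:], ma_long[-n:])]

def bias_percent(price, average):
    """Textbook bias ratio: deviation of price from its average, in percent."""
    return (price - average) / average * 100

# Last 11 closing prices of 2330 in this session (2018-10-22 to 2018-11-05).
closes = [237.0, 230.0, 229.5, 219.5, 221.0, 222.5, 223.0, 234.0, 235.5, 236.5, 235.0]

diff = ma_difference(closes, 5, 10)
print([round(d, 2) for d in diff])  # [1.45, 4.15]

# Bias of the latest close against its own 5-day average, as a percentage.
ma5_last = moving_average(closes, 5)[-1]
print(bias_percent(closes[-1], ma5_last))
```

The first result is in price units (the 5-day average sits 4.15 above the 10-day average), while the second is a percentage of a little under 1%.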

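The source code of ma_bias_ratio_pivot() has no comments, so it is hard to tell what it is meant to do. It appears to be detecting turning points; I will add more once I understand it.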

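In addition, the analysis features of the twstock module include a BestFourPoint class for computing trading signals. Following a trend-following approach, the author uses the simplest technical data, moving averages and trading volume, to decide when to buy or sell a stock. There are four buy conditions:
  1. High volume, closing up (red)
  2. Low volume, price not falling
  3. 3-day average price turning upward
  4. 3-day average price above the 6-day average price
There are likewise four sell conditions (hence the name BestFourPoint):
  1. High volume, closing down (black)
  2. Low volume, price falling
  3. 3-day average price turning downward
  4. 3-day average price below the 6-day average price
The BestFourPoint class provides the following three methods for generating trading signals: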

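 BestFourPoint method          Description
 best_four_point_to_buy()      Returns the buy reason string, or False (no buy signal)
 best_four_point_to_sell()     Returns the sell reason string, or False (no sell signal)
 best_four_point()             Returns a trading-signal tuple (True for buy or False for sell, "reason")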

None of these three methods takes arguments; just create a BestFourPoint object and call them. To create the object, pass a Stock object to its constructor. For example:

>>> from twstock import Stock
>>> from twstock import BestFourPoint
>>> stock=Stock('2330')
>>> stock.data[30].date 
datetime.datetime(2018, 11, 5, 0, 0) 
>>> stock.price 
[260.0, 261.5, 263.5, 263.5, 265.0, 262.5, 263.0, 257.5, 260.0, 254.0, 250.0, 243.5, 244.0, 227.5, 237.0, 230.5, 237.0, 238.5, 236.5, 236.0, 237.0, 230.0, 229.5, 219.5, 221.0, 222.5, 223.0, 234.0, 235.5, 236.5, 235.0]
>>> stock.capacity
[35577071, 36500974, 24978062, 25061115, 38495371, 39645486, 22409380, 38391491, 25228536, 38009727, 40396660, 51083958, 28933345, 96033657, 54439769, 46471280, 39129077, 42887858, 27793430, 29648460, 29275777, 36757745, 45480950, 75524118, 54073644, 18463792, 29085931, 59933201, 44514797, 30785361, 28157065]
>>> bfp=BestFourPoint(stock) 
>>> bfp.best_four_point_to_buy()
'三日均價大於六日均價'
>>> bfp.best_four_point_to_sell() 
'量縮價跌'
>>> bfp.best_four_point()
(True, '三日均價大於六日均價')

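These are the trading signals derived from the after-market data of 2018-11-05. best_four_point_to_buy() returns the buy signal '三日均價大於六日均價' (the 3-day average price is above the 6-day average price); best_four_point_to_sell() returns the sell signal '量縮價跌' (low volume, price falling); and best_four_point() returns the combined signal as buy (True), with the reason '三日均價大於六日均價'.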
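
To see why the 2018-11-05 data produces these signals, here is a rough sketch of the eight conditions applied to the last six trading days of this session. This is my own reading of the condition names, not twstock's implementation: the function names, the English reason labels, the exact comparisons (for example, what counts as "turning upward"), and the rule that a buy signal takes precedence over a sell signal are all assumptions, and the real class may apply extra filters.

```python
def mean(values):
    return sum(values) / len(values)

def _averages(closes):
    """3-day averages for the last three days, plus the latest 6-day average."""
    ma3 = [mean(closes[-5:-2]), mean(closes[-4:-1]), mean(closes[-3:])]
    return ma3, mean(closes[-6:])

def buy_reasons(opens, closes, volumes):
    ma3, ma6 = _averages(closes)
    vol_up = volumes[-1] > volumes[-2]
    vol_down = volumes[-1] < volumes[-2]
    checks = [
        ("high volume, closing up", vol_up and closes[-1] > opens[-1]),
        ("low volume, price not falling", vol_down and closes[-1] >= closes[-2]),
        ("3-day average turning upward", ma3[1] <= ma3[0] and ma3[2] > ma3[1]),
        ("3-day average above 6-day average", ma3[2] > ma6),
    ]
    return [name for name, ok in checks if ok]

def sell_reasons(opens, closes, volumes):
    ma3, ma6 = _averages(closes)
    vol_up = volumes[-1] > volumes[-2]
    vol_down = volumes[-1] < volumes[-2]
    checks = [
        ("high volume, closing down", vol_up and closes[-1] < opens[-1]),
        ("low volume, price falling", vol_down and closes[-1] < closes[-2]),
        ("3-day average turning downward", ma3[1] >= ma3[0] and ma3[2] < ma3[1]),
        ("3-day average below 6-day average", ma3[2] < ma6),
    ]
    return [name for name, ok in checks if ok]

def signal(opens, closes, volumes):
    """Assumed combination rule: report buy first, then sell, else None."""
    buy = buy_reasons(opens, closes, volumes)
    sell = sell_reasons(opens, closes, volumes)
    if buy:
        return True, ", ".join(buy)
    if sell:
        return False, ", ".join(sell)
    return None

# 2330 from 2018-10-29 to 2018-11-05, taken from the fetch_31() output in this post.
opens = [223.0, 221.0, 228.0, 236.0, 236.5, 233.0]
closes = [222.5, 223.0, 234.0, 235.5, 236.5, 235.0]
volumes = [18463792, 29085931, 59933201, 44514797, 30785361, 28157065]

print(buy_reasons(opens, closes, volumes))   # ['3-day average above 6-day average']
print(sell_reasons(opens, closes, volumes))  # ['low volume, price falling']
print(signal(opens, closes, volumes))        # (True, '3-day average above 6-day average')
```

On 2018-11-05 the volume shrank and the close fell from 236.5 to 235.0, which triggers the sell condition, while the 3-day average (about 235.7) is still above the 6-day average (about 231.1), which triggers the buy condition. Under these assumptions the sketch gives the same buy reason, sell reason, and combined signal as the session above.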

