小狐狸事務所: Python Study Notes: Computing Technical Indicators with a Language Model (1)

Wednesday, April 29, 2026

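I have recently been rereading "最強 AI 投資分析" from Flag Publishing. I bought the book at the end of 2023, read the first few chapters, and then set it aside without ever finding time to test anything. After rereading Chapter 4 today I decided to try the examples myself, because the OpenAI API key I topped up with 5 USD on October 7 last year has used only 0.01 USD so far, and the balance will be zeroed out in about half a year, so I had better spend it before then (in the age of vibe coding, writing code by hand has been reduced to a pure hobby).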

# https://platform.openai.com/settings/organization/billing/overview

Download URL for the book's sample code:

# https://www.flag.com.tw/bk/t/f3933

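1. Computing SMA indicators with pandas_ta:

As a warm-up, I first use pandas_ta to compute the moving-average indicators SMA8 and SMA13, since I have not touched it for nearly half a year. For pandas_ta usage, see:

# Python quantitative investing notes index

The program below uses yfinance to fetch price data, then calls df.ta.sma() through the pandas_ta DataFrame extension (accessor) to compute the SMA indicators and stores each result in the specified column of df. Finally it draws a candlestick chart with the kbar package. For kbar usage, see: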

# https://github.com/tony1966/kbar

# ai_stock_test_1.py
import yfinance as yf
import pandas as pd
import pandas_ta as ta    # importing pandas_ta registers the df.ta accessor
from kbar import KBar

if __name__ == "__main__":
    df=yf.download('0050.tw', start='2024-07-01', end='2024-08-21', auto_adjust=True)
    # yfinance returns two-level column labels; keep only the first level (Close, High, ...)
    df.columns=df.columns.map(lambda x: x[0])
    df['SMA_8']=df.ta.sma(length=8)
    df['SMA_13']=df.ta.sma(length=13)
    print(df.tail())
    kb=KBar(df)
    kb.addplot(df['SMA_8'], panel=2, ylabel='SMA_8')
    kb.addplot(df['SMA_13'], panel=2, ylabel='SMA_13')
    kb.plot(volume=True, mav=[8, 13])

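Besides plotting SMA8 and SMA13 on panel 2, the call to plot() also passes mav=[8, 13] so that the same moving averages are overlaid on the candlestick chart itself (default panel=0). The result is as follows: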

>>> %Run ai_stock_test_1.py
[*********************100%***********************]  1 of 1 completed
                Close       High        Low  ...    Volume      SMA_8     SMA_13
Date                                         ...
2024-08-14  43.643597  43.909202  43.450429  ...  74857276  41.775311  42.438161
2024-08-15  43.305553  43.703958  43.233115  ...  45926588  42.397066  42.414943
2024-08-16  44.283455  44.343819  44.029927  ...  52823660  42.876964  42.466949
2024-08-19  44.343822  44.597354  44.223093  ...  37122372  43.163695  42.518955
2024-08-20  44.367966  44.718080  44.355892  ...  43139504  43.562101  42.514312

[5 rows x 7 columns]
設定字型為: Microsoft JhengHei
使用指定字型: Microsoft JhengHei
字型候選清單: ['Microsoft JhengHei', 'DejaVu Sans', 'Arial']
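For reference, a simple moving average is just the arithmetic mean of the last N closing prices. The sketch below reproduces that with plain pandas; the helper name sma() and the price list are made up for illustration (they are not real 0050.TW data), and it assumes pandas_ta's default SMA is this plain windowed mean.

```python
import pandas as pd

def sma(series: pd.Series, length: int) -> pd.Series:
    """Simple moving average: mean of the most recent `length` values."""
    return series.rolling(window=length).mean()

# Made-up closing prices, for illustration only.
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="Close")
sma_3 = sma(close, 3)

# The first length-1 rows have no full window yet, so they are NaN.
print(sma_3)
```

This also explains why the first 7 rows of SMA_8 and the first 12 rows of SMA_13 are empty when the data starts on 2024-07-01.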

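2. Computing SMA indicators through the OpenAI API:

Next, I connect to the OpenAI API and have the LLM generate the code that computes the technical indicators, then run that code with exec(). The advantage is that there is no need to learn the function-call interfaces of packages such as pandas_ta, ta, or TA-Lib; you direct the LLM in natural language and it returns the calculation code. The approach follows the example in Section 4-1 of the book:

# Letting AI automatically generate technical-indicator code

The original code uses English prompts; the author says that in testing, English gave more stable responses. But LLMs improve by the day and now understand Chinese very accurately, so I rewrote the prompts in Chinese. The code is as follows: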

# ai_stock_test_2.py
from openai import OpenAI, APIError
import yfinance as yf
import pandas as pd
from dotenv import dotenv_values
from kbar import KBar

def ask_gpt(
    messages: list[dict[str, str]],
    model: str='gpt-3.5-turbo'
    ) -> str:
    # Uses the module-level `client` created in the main block
    try:
        reply=client.chat.completions.create(
            model=model,
            messages=messages
            )
        return reply.choices[0].message.content or ''
    except APIError as e:
        # Note: on failure the error text is returned in place of code
        return e.message

def ai_helper(df, user_msg):
    role=f'''
    作為一個專業的程式碼生成機器人，
    我需要您的協助來根據特定的用戶需求生成 Python 程式碼。
    為了進行下去，我將提供給您一個遵循格式 {list(df.columns)} 的 DataFrame（df）。
    您的任務是仔細分析用戶的需求並相應地生成 Python 程式碼。
    請注意，您的回應須僅包含代碼本身，並且不應包含任何額外的資訊。
    '''
    # Embed user_msg in the task description so the model knows what to compute
    task=f'''
    您的任務是開發一個名為 'calculate(df)' 的 Python 函式。
    這個函式應接受一個 DataFrame 作為其參數。確保您僅使用資料集中存在的欄，
    特別是 {list(df.columns)}。
    用戶的具體運算需求為：【 {user_msg} 】
    處理後，該函式應返回處理過的 DataFrame。
    您的回應應嚴格包含 'calculate(df)' 函式的 Python 程式碼，
    並排除任何無關的內容。
    '''
    msg=[{"role": "system", "content": role},
         {"role": "user", "content": task}]
    reply_data=ask_gpt(msg)
    # Strip Markdown fence syntax (note: this removes every occurrence of "python")
    cleaned_code=reply_data.replace("```", "")
    cleaned_code=cleaned_code.replace("python", "")
    cleaned_code=cleaned_code.strip()   # drop leading/trailing whitespace and newlines
    # Return the code string
    return cleaned_code

if __name__ == "__main__":
    config=dotenv_values('.env')
    openai_api_key=config.get('OPENAI_API_KEY')
    client=OpenAI(api_key=openai_api_key)
    df=yf.download('0050.tw', start='2024-07-01', end='2024-08-21', auto_adjust=True)
    df.columns=df.columns.map(lambda x: x[0])
    code_str=ai_helper(df, "計算 8 日 MA (欄名 SMA_8) 與 13 日 MA (欄名 SMA_13)")
    print(code_str)
    exec(code_str)    # defines calculate() in the current namespace
    new_df=calculate(df)
    print(new_df.tail())
    kb=KBar(new_df)
    kb.addplot(new_df['SMA_8'], panel=2, ylabel='SMA_8')
    kb.addplot(new_df['SMA_13'], panel=2, ylabel='SMA_13')
    kb.plot(volume=True, mav=[8, 13])

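In this program, ask_gpt() is responsible for sending the question to GPT and returning the response. Note that the parameters of ask_gpt() all use type-hint syntax to make the code more readable. For example, messages: list[dict[str, str]] in ask_gpt() means:

messages is a list in which every element is a dictionary.
Both the keys and the values of each dictionary are strings, e.g. {"role": "user", "content": "hello"}

See:

# AI application project (1): press release generator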
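As a small illustration of that type hint, the sketch below builds a value of the same shape. The alias Message and the helper build_messages() are hypothetical names introduced here, not part of the program above, and the built-in generic syntax dict[str, str] assumes Python 3.9 or later.

```python
# A hypothetical alias for one chat message: string keys, string values.
Message = dict[str, str]

def build_messages(system_text: str, user_text: str) -> list[Message]:
    """Return a two-element message list in the shape ask_gpt() expects."""
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]

msgs = build_messages("You write Python code.", "Compute SMA_8.")
print(msgs)
```

Type hints are not enforced at run time; they document intent and let editors and static checkers flag mismatches.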

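ai_helper(), in turn, assembles the prompt (a list of dictionaries) and calls ask_gpt(). It cleans up the indicator-calculation code in the response and returns plain Python code to the main block, which runs it with exec(). The result is as follows: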

>>> %Run ai_stock_test_2.py
[*********************100%***********************]  1 of 1 completed
def calculate(df):
    df['SMA_8'] = df['Close'].rolling(window=8).mean()
    df['SMA_13'] = df['Close'].rolling(window=13).mean()
    return df
                Close       High        Low  ...    Volume      SMA_8     SMA_13
Date                                         ...
2024-08-14  43.643597  43.909202  43.450429  ...  74857276  41.775311  42.438161
2024-08-15  43.305553  43.703958  43.233115  ...  45926588  42.397066  42.414943
2024-08-16  44.283455  44.343819  44.029927  ...  52823660  42.876964  42.466949
2024-08-19  44.343822  44.597354  44.223093  ...  37122372  43.163695  42.518955
2024-08-20  44.367966  44.718080  44.355892  ...  43139504  43.562101  42.514312

[5 rows x 7 columns]
設定字型為: Microsoft JhengHei
使用指定字型: Microsoft JhengHei
字型候選清單: ['Microsoft JhengHei', 'DejaVu Sans', 'Arial']

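The computed SMA values are identical to those from pandas_ta, which shows that even without having learned a technical-indicator package, you can use an LLM to carry out quantitative analysis with technical indicators.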
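The cleanup-then-exec() step can be made a little more defensive. The following is only a sketch under stated assumptions, not the book's method: extract_code() and load_calculate() are hypothetical helper names, and the reply string is a hard-coded stand-in for what ask_gpt() might return. It pulls the body out of a Markdown fence with a regular expression (so the word "python" inside the code is left alone) and runs exec() in its own dictionary, so a reply that does not define calculate() fails with a clear error.

```python
import re
import pandas as pd

# Matches a Markdown code fence (three backticks) with an optional language tag.
FENCE_RE = re.compile(r"`{3}[\w+-]*[ \t]*\n(.*?)`{3}", re.DOTALL)

def extract_code(reply: str) -> str:
    """Return the body of the first fenced block, or the whole reply if unfenced."""
    match = FENCE_RE.search(reply)
    return (match.group(1) if match else reply).strip()

def load_calculate(code_str: str):
    """exec() the code in a separate namespace and return its calculate() function."""
    namespace = {}
    exec(code_str, namespace)
    func = namespace.get("calculate")
    if not callable(func):
        raise ValueError("reply did not define calculate(df)")
    return func

# Hypothetical reply text standing in for an actual model response.
fence = "`" * 3
reply = (
    f"{fence}python\n"
    "def calculate(df):\n"
    "    df['SMA_3'] = df['Close'].rolling(window=3).mean()\n"
    "    return df\n"
    f"{fence}"
)

calculate = load_calculate(extract_code(reply))
demo = calculate(pd.DataFrame({"Close": [1.0, 2.0, 3.0, 4.0]}))
print(demo)
```

A separate namespace only keeps the generated names out of the script's globals; it does not sandbox the code, which still runs with the full privileges of the script.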

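3. Computing SMA indicators through the Gemini API:

For the Gemini version, the query function becomes ask_gemini(), while ai_helper() stays essentially the same. Only the prompt type differs: the OpenAI prompt is a list of dictionaries, whereas the Gemini prompt is a plain string. The code is as follows: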

# ai_stock_test_3.py
from google import genai
from google.genai.errors import APIError
import yfinance as yf
import pandas as pd
from dotenv import dotenv_values
from kbar import KBar

def ask_gemini(messages: str, model: str='gemini-2.5-flash') -> str:
    # Uses the module-level `client` created in the main block
    try:
        reply=client.models.generate_content(
            model=model,
            contents=messages
            )
        return reply.text or ''
    except APIError as e:
        # Note: on failure the error text is returned in place of code
        return e.message

def ai_helper(df, user_msg):
    role=f'''
    作為一個專業的程式碼生成機器人，
    我需要您的協助來根據特定的用戶需求生成 Python 程式碼。
    為了進行下去，我將提供給您一個遵循格式 {list(df.columns)} 的 DataFrame（df）。
    您的任務是仔細分析用戶的需求並相應地生成 Python 程式碼。
    請注意，您的回應須僅包含代碼本身，並且不應包含任何額外的資訊。
    '''
    task=f'''
    您的任務是開發一個名為 'calculate(df)' 的 Python 函式。
    這個函式應接受一個 DataFrame 作為其參數。確保您僅使用資料集中存在的欄，
    特別是 {list(df.columns)}。
    用戶的具體運算需求為：【 {user_msg} 】
    處理後，該函式應返回處理過的 DataFrame。
    您的回應應嚴格包含 'calculate(df)' 函式的 Python 程式碼，
    並排除任何無關的內容。
    '''
    # The Gemini prompt is a string: merge the system setup and the task into one string
    msg=f"{role}\n\n{task}"
    # Call ask_gemini
    reply_data=ask_gemini(msg)
    # Strip Markdown fence syntax (note: this removes every occurrence of "python")
    cleaned_code=reply_data.replace("```", "")
    cleaned_code=cleaned_code.replace("python", "")
    cleaned_code=cleaned_code.strip()   # drop leading/trailing whitespace and newlines
    # Return the code string
    return cleaned_code

if __name__ == "__main__":
    config=dotenv_values('.env')
    gemini_api_key=config.get('GEMINI_API_KEY')
    client=genai.Client(api_key=gemini_api_key)
    df=yf.download('0050.tw', start='2024-07-01', end='2024-08-21', auto_adjust=True)
    df.columns=df.columns.map(lambda x: x[0])
    code_str=ai_helper(df, "計算 8 日 MA (欄名 SMA_8) 與 13 日 MA (欄名 SMA_13)")
    print(code_str)
    exec(code_str)    # defines calculate() in the current namespace
    new_df=calculate(df)
    print(new_df.tail())
    kb=KBar(new_df)
    kb.addplot(new_df['SMA_8'], panel=2, ylabel='SMA_8')
    kb.addplot(new_df['SMA_13'], panel=2, ylabel='SMA_13')
    kb.plot(volume=True, mav=[8, 13])
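To make the difference between the two prompt shapes concrete, here is a minimal sketch that turns an OpenAI-style message list into the single string used for Gemini. The helper name flatten_messages() is hypothetical and the message texts are placeholders; joining with a blank line mirrors the f"{role}\n\n{task}" expression in ai_helper().

```python
def flatten_messages(messages: list[dict[str, str]]) -> str:
    """Join the 'content' of each message, in order, separated by a blank line."""
    return "\n\n".join(m["content"].strip() for m in messages)

# Placeholder messages in the OpenAI list-of-dictionaries shape.
msg_list = [
    {"role": "system", "content": "You are a code generator."},
    {"role": "user", "content": "Write calculate(df)."},
]

prompt = flatten_messages(msg_list)
print(prompt)
```

With a helper like this, one ai_helper() could build the message list once and adapt it per provider, at the cost of losing the explicit role labels in the flattened form.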

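Running ai_stock_test_3.py gives essentially the same result as the earlier runs; a few cells differ only in the last displayed decimal place. This time the generated code also includes an import statement and a docstring: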

>>> %Run ai_stock_test_3.py
[*********************100%***********************]  1 of 1 completed
import pandas as pd

def calculate(df):
    """
    計算 8 日 MA (欄名 SMA_8) 與 13 日 MA (欄名 SMA_13)。

    Args:
        df (pd.DataFrame): 包含 'Close', 'High', 'Low', 'Open', 'Volume' 欄位的 DataFrame。

    Returns:
        pd.DataFrame: 處理後包含 'SMA_8' 和 'SMA_13' 欄位的 DataFrame。
    """
    df['SMA_8'] = df['Close'].rolling(window=8).mean()
    df['SMA_13'] = df['Close'].rolling(window=13).mean()
    return df
                Close       High        Low  ...    Volume      SMA_8     SMA_13
Date                                         ...
2024-08-14  43.643597  43.909202  43.450429  ...  74857276  41.775310  42.438160
2024-08-15  43.305553  43.703958  43.233115  ...  45926588  42.397066  42.414943
2024-08-16  44.283459  44.343823  44.029930  ...  52823660  42.876964  42.466949
2024-08-19  44.343822  44.597354  44.223093  ...  37122372  43.163696  42.518955
2024-08-20  44.367966  44.718080  44.355892  ...  43139504  43.562102  42.514312

[5 rows x 7 columns]
設定字型為: Microsoft JhengHei
使用指定字型: Microsoft JhengHei
字型候選清單: ['Microsoft JhengHei', 'DejaVu Sans', 'Arial']
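The OpenAI and Gemini runs print values that differ in the last displayed digit in a few cells (for example SMA_8 on 2024-08-19: 43.163695 versus 43.163696). When checking that two runs agree, a comparison with a tolerance is therefore more useful than exact equality. A minimal sketch, using only the last two SMA_8 values copied from the two printed tables; the tolerance 1e-5 is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# SMA_8 for 2024-08-19 and 2024-08-20, copied from the two printed outputs.
run_openai = pd.Series([43.163695, 43.562101])
run_gemini = pd.Series([43.163696, 43.562102])

exact_match = bool((run_openai == run_gemini).all())
close_match = bool(np.allclose(run_openai, run_gemini, atol=1e-5))

print("exact:", exact_match)
print("within tolerance:", close_match)
```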
