Today we continue testing the commonly used parameters of the OpenAI API. The previous two test articles in this series:
Index of all articles in this series:
This post mainly tests the OpenAI API's stream parameter.
9. The stream parameter:
This parameter controls whether streaming response mode is enabled; its value is a boolean (True/False):
- False (default):
Streaming is off. The API waits until the entire response has been generated and then returns it all at once. Suitable for short replies or non-real-time applications.
- True:
Streaming is on. The generated content is returned piece by piece. Suitable for chatbots and other real-time applications.
Before testing, first import the OpenAI class and create an OpenAI object:
>>> from openai import OpenAI
>>> api_key='your API key here'
>>> client=OpenAI(api_key=api_key)
>>> type(client)
<class 'openai.OpenAI'>
For example, with streaming off under the default stream=False:
>>> chat_completion=client.chat.completions.create(
messages=[{'role': 'user', 'content': '嗨'}],
model='gpt-3.5-turbo'
)
>>> print(chat_completion.choices[0].message.content)
您好!有什么可以帮助您的吗?
If stream=True is passed instead, the API returns an iterator object of type Stream; looping over it yields, one after another, ChatCompletionChunk objects carrying the tokens the model generates.
To keep the generated response short enough to inspect, the call below uses max_tokens to limit generation to 2 tokens, so the model returns ChatCompletionChunk objects for just the first two characters of the reply:
>>> chunks=client.chat.completions.create(
messages=[{'role': 'user', 'content': '嗨?'}],
model='gpt-3.5-turbo',
max_tokens=2,
stream=True
)
>>> type(chunks)
<class 'openai.Stream'>
To display the objects in a nicely formatted way, the following uses the print() function of the third-party rich module, imported under the name pprint so as not to shadow Python's built-in print():
>>> from rich import print as pprint
For the usage of the rich module, see:
Next, iterate over the Stream object and display each generated ChatCompletionChunk object with pprint():
>>> for chunk in chunks:
print(type(chunk))
pprint(chunk)
<class 'openai.types.chat.chat_completion_chunk.ChatCompletionChunk'>
ChatCompletionChunk(
id='chatcmpl-B9TA5q5SiomPcrOddT9nd5QG1JE9i',
choices=[
Choice(
delta=ChoiceDelta(
content='', => the opening chunk's content is an empty string
function_call=None,
refusal=None,
role='assistant', => marks the role as the AI assistant
tool_calls=None
),
finish_reason=None,
index=0, => index identifies which response this chunk belongs to
logprobs=None
)
],
created=1741596749,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
<class 'openai.types.chat.chat_completion_chunk.ChatCompletionChunk'>
ChatCompletionChunk(
id='chatcmpl-B9TA5q5SiomPcrOddT9nd5QG1JE9i',
choices=[
Choice(
delta=ChoiceDelta(
content='你',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=0,
logprobs=None
)
],
created=1741596749,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
<class 'openai.types.chat.chat_completion_chunk.ChatCompletionChunk'>
ChatCompletionChunk(
id='chatcmpl-B9TA5q5SiomPcrOddT9nd5QG1JE9i',
choices=[
Choice(
delta=ChoiceDelta(
content='好',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=0,
logprobs=None
)
],
created=1741596749,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
<class 'openai.types.chat.chat_completion_chunk.ChatCompletionChunk'>
ChatCompletionChunk(
id='chatcmpl-B9TA5q5SiomPcrOddT9nd5QG1JE9i',
choices=[
Choice(
delta=ChoiceDelta(
content=None, => the closing chunk has no content
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason='length', => generation stopped due to the length limit
index=0,
logprobs=None
)
],
created=1741596749,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
As shown, the stream returned 4 ChatCompletionChunk objects in total. The generated tokens are carried in the content attribute of the ChoiceDelta object; the opening and closing chunks have content of '' (empty string) and None respectively. The opening chunk marks the role as the AI assistant, the closing chunk carries the finish reason, and the generated reply sits in the chunks between them.
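Given this structure, the three kinds of chunks can be told apart while consuming the stream. Below is a minimal sketch (it reuses the client object created above; the exact reply will vary from run to run):
>>> chunks=client.chat.completions.create(
        messages=[{'role': 'user', 'content': '嗨'}],
        model='gpt-3.5-turbo',
        stream=True
        )
>>> for chunk in chunks:
        choice=chunk.choices[0]
        if choice.delta.role:        # opening chunk: carries the role
            print(f'[role={choice.delta.role}]')
        if choice.delta.content:     # middle chunks: carry the generated tokens
            print(choice.delta.content, end='')
        if choice.finish_reason:     # closing chunk: carries the finish reason
            print(f'\n[finish_reason={choice.finish_reason}]')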
Comparing this with the default stream=False response, we can see that a streamed reply lives in the delta attribute of the Choice objects (a non-streamed reply lives in the message attribute). The full reply can therefore be obtained by concatenating chunk.choices[0].delta.content while iterating over the Stream object, for example:
>>> chunks=client.chat.completions.create(
messages=[{'role': 'user', 'content': '嗨'}],
model='gpt-3.5-turbo',
stream=True
)
>>> for chunk in chunks:
print(chunk.choices[0].delta.content, end='')
你好!有什么可以帮助你的吗?None
Note that print() here is given end='' so that the tokens are joined without line breaks, but this also prints the closing chunk's None. The fix is simply to or the streamed content with an empty string:
>>> chunks=client.chat.completions.create(
messages=[{'role': 'user', 'content': '嗨'}],
model='gpt-3.5-turbo',
stream=True
)
>>> for chunk in chunks:
print(chunk.choices[0].delta.content or '', end='')
你好!有什么可以帮助你的吗?
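The same effect can be had with an explicit None check instead of or. Note that a new request is needed here, since a Stream can only be iterated over once:
>>> chunks=client.chat.completions.create(
        messages=[{'role': 'user', 'content': '嗨'}],
        model='gpt-3.5-turbo',
        stream=True
        )
>>> for chunk in chunks:
        content=chunk.choices[0].delta.content
        if content is not None:   # skip the closing chunk, whose content is None
            print(content, end='')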
The or-based version above can also be written as a function, using the yield statement to turn the streamed results into a generator:
>>> def ask_gpt_s(prompt, model='gpt-4o-mini'):
replies=client.chat.completions.create(
messages=[{"role": "user", "content": prompt}],
model=model,
stream=True
)
for reply in replies:
yield reply.choices[0].delta.content or ''
Just iterate over the generator and join the streamed fragments with print() to obtain the full reply:
>>> for reply in ask_gpt_s('嗨', 'gpt-3.5-turbo'):
print(reply, end='')
你好!有什么可以帮助你的吗?
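Since ask_gpt_s() returns a generator, str.join() can also collect the whole reply into a single string instead of printing it token by token, e.g.:
>>> full_reply=''.join(ask_gpt_s('嗨', 'gpt-3.5-turbo'))
This form is handy when the reply is needed as a value rather than displayed incrementally.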
When the n parameter is passed to request multiple streamed responses, index must be used to tell the responses apart, for example:
>>> chunks=client.chat.completions.create(
messages=[{'role': 'user', 'content': '嗨?'}],
model='gpt-3.5-turbo',
max_tokens=4,
stream=True,
n=2
)
This example passes n=2 to request two responses, and also sets max_tokens=4 to shorten the output and make the result easier to inspect:
>>> for chunk in chunks:
pprint(chunk)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='',
function_call=None,
refusal=None,
role='assistant',
tool_calls=None
),
finish_reason=None,
index=0,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='你',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=0,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='',
function_call=None,
refusal=None,
role='assistant',
tool_calls=None
),
finish_reason=None,
index=1,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='您',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=1,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='好',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=0,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='好',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=1,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='!',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=0,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='!',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=1,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='有',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=0,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content='有',
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason=None,
index=1,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content=None,
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason='length',
index=0,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
ChatCompletionChunk(
id='chatcmpl-B9sD8eQhh6vVWt84oYryrR26Z23vC',
choices=[
Choice(
delta=ChoiceDelta(
content=None,
function_call=None,
refusal=None,
role=None,
tool_calls=None
),
finish_reason='length',
index=1,
logprobs=None
)
],
created=1741693038,
model='gpt-3.5-turbo-0125',
object='chat.completion.chunk',
service_tier='default',
system_fingerprint=None,
usage=None
)
As can be seen, the stream interleaves two responses, index=0 and index=1, namely '你好! ...' and '您好! ...'.
To assemble the streamed fragments into complete replies, first create an empty dict to hold each response, keyed by response number (i.e. index), with a list for the generated tokens as each value:
>>> responses={i: [] for i in range(2)}
>>> responses
{0: [], 1: []}
Then issue the request again with n=2:
>>> chunks=client.chat.completions.create(
messages=[{'role': 'user', 'content': '嗨?'}],
model='gpt-3.5-turbo',
stream=True,
n=2
)
Then, while iterating over the stream object, append each fragment to its response's list according to index:
>>> for chunk in chunks:
for choice in chunk.choices:
index=choice.index
content=choice.delta.content or ''   # or '' filters out the None
responses[index].append(content)
Inspect the contents of the responses dict:
>>> responses
{0: ['', '你', '好', '!', '有', '什', '么', '可以', '帮', '助', '你', '的', '吗', '?', ''], 1: ['', '您', '好', '!', ' ', '有', '什', '么', '我', '可以', '帮', '助', '您', '的', '吗', '?', '']}
As shown, the fragments of both responses were appended to the list under their index key. A dict comprehension then joins the fragments into complete sentences:
>>> responses={i: ''.join(responses[i]) for i in range(2)}
>>> responses
{0: '你好!有什么可以帮助你的吗?', 1: '您好! 有什么我可以帮助您的吗?'}
Print the contents of the responses dict with a loop:
>>> for i, response in responses.items():
print(f"回應 {i+1}:{response}")
回應 1:你好!有什么可以帮助你的吗?
回應 2:您好! 有什么我可以帮助您的吗?
With that, both responses have been retrieved.
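The per-index bookkeeping above can also be folded into one helper function. The sketch below simply rearranges the calls used in this post (the name ask_gpt_n is made up for illustration, and it reuses the client object created earlier):
>>> def ask_gpt_n(prompt, n=2, model='gpt-3.5-turbo'):
        chunks=client.chat.completions.create(
            messages=[{'role': 'user', 'content': prompt}],
            model=model,
            stream=True,
            n=n
            )
        responses={i: [] for i in range(n)}   # one token list per choice index
        for chunk in chunks:
            for choice in chunk.choices:      # each chunk may carry any index
                responses[choice.index].append(choice.delta.content or '')
        return {i: ''.join(tokens) for i, tokens in responses.items()}
Calling ask_gpt_n('嗨') then returns a dict that maps each index to its complete reply.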