小狐狸事務所: AI Application Project (2): YouTube Subtitle Summary Generator

Friday, April 10, 2026

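This post continues working through the O'Reilly book "AI 應用程式開發", testing App Project No. 2 from Chapter 3: a YouTube video summarizer. The goal is to fetch a YouTube video's subtitles with a third-party package and hand them to an AI model to generate a summary of the video. The book's example code can be downloaded from GitHub: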

# https://oreil.ly/DevAppsGPT_GitHub

Source code for this project's examples:

# https://github.com/malywut/gpt_examples/blob/main/Chap3_02_YoutubeSummarizer/run.py

# https://github.com/malywut/gpt_examples/blob/main/Chap3_02_YoutubeSummarizerVision/run.py

For an index of all the test posts in this series, see:

# OpenAI API study notes index

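1. Installing the YouTube subtitle downloader yt-dlp:

I asked Gemini how to download YouTube subtitle files, and it recommended youtube-transcript-api. In my tests, however, that package could not download the subtitles, most likely because YouTube was blocking the requests. The second option was the yt-dlp package, which downloaded the subtitle files without problems.

First, install the package with pip: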

(myvenv) D:\python\test>pip install yt-dlp

Collecting yt-dlp

Downloading yt_dlp-2026.3.17-py3-none-any.whl.metadata (182 kB)

Downloading yt_dlp-2026.3.17-py3-none-any.whl (3.3 MB)

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 7.5 MB/s 0:00:00

Installing collected packages: yt-dlp

Successfully installed yt-dlp-2026.3.17
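To confirm from inside Python which version ended up in the virtual environment, something like the following minimal sketch could be used. It relies only on the standard library's importlib.metadata; the helper name installed_version is hypothetical and not part of yt-dlp.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "yt-dlp" is the distribution name passed to pip above.
    print(installed_version("yt-dlp"))
```

Returning None instead of raising keeps the check usable at the top of a script that wants to print a friendly hint when the package is missing.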

I picked a short Python tutorial video that has both Chinese and English subtitles for testing:

# https://www.youtube.com/watch?v=OndBl1H1rwM

The test program is as follows (generated by ChatGPT):

# get_youtube_transcript_1.py
import sys
import os
import yt_dlp


def get_yt_subtitle_ytdlp(video_id):
    url = f"https://www.youtube.com/watch?v={video_id}"
    # yt-dlp options
    ydl_opts = {
        'skip_download': True,      # do not download the video itself
        'writesubtitles': True,     # fetch manually created subtitles
        'writeautomaticsub': True,  # fall back to auto-generated subtitles if there are none
        'subtitleslangs': ['zh-Hant', 'zh-TW', 'en'],  # languages to request; every available match is downloaded
        'outtmpl': '%(title)s.%(ext)s',  # use the video title as the output base name
        'quiet': True,
        'no_warnings': True,
    }
    try:
        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
            # "Requesting info for video ... via yt-dlp"
            print(f"DEBUG: 正在透過 yt-dlp 請求影片 {video_id} 的資訊...")
            info = ydl.extract_info(url, download=False)
            subtitles = info.get('requested_subtitles')
            if subtitles:
                for lang, sub_info in subtitles.items():
                    print(f"✅ 成功找到語言: {lang}")  # "Found language"
                # Download the subtitle files
                ydl.download([url])
                # Base file name that yt-dlp derives from the output template
                base_filename = ydl.prepare_filename(info)
                base_name = os.path.splitext(base_filename)[0]
                # Look for the downloaded subtitle files
                found_files = []
                for lang in subtitles.keys():
                    possible_file = f"{base_name}.{lang}.vtt"
                    if os.path.exists(possible_file):
                        found_files.append(possible_file)
                if found_files:
                    for f in found_files:
                        print(f"🎉 字幕已下載: {f}")  # "Subtitle downloaded"
                else:
                    # "Download finished, but the actual file name could not be found"
                    print("⚠️ 字幕下載完成，但找不到實際檔案名稱")
            else:
                # "No matching Traditional Chinese or English subtitles found"
                print("❌ 找不到符合的繁體中文或英文字幕")
    except Exception as e:
        print(f"❌ yt-dlp 抓取失敗: {e}")  # "yt-dlp fetch failed"


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("用法: python script.py [影片ID]")  # "Usage: python script.py [video ID]"
        sys.exit(1)
    get_yt_subtitle_ytdlp(sys.argv[1])

The output is as follows:

(myvenv) D:\python\test>python get_youtube_transcript_1.py OndBl1H1rwM

DEBUG: 正在透過 yt-dlp 請求影片 OndBl1H1rwM 的資訊...

✅ 成功找到語言: zh-TW

✅ 成功找到語言: en

🎉 字幕已下載: 【Code Gym】Python基礎教學(5) - for迴圈和while迴圈.zh-TW.vtt

🎉 字幕已下載: 【Code Gym】Python基礎教學(5) - for迴圈和while迴圈.en.vtt
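The program above hard-codes the .vtt extension when it looks for the downloaded files. A slightly more defensive variant might read the extension from each entry of requested_subtitles instead. The sketch below assumes each entry is a dict carrying an 'ext' key, which should be verified against the installed yt-dlp version; the helper name expected_subtitle_paths and the sample dict are hypothetical.

```python
import os


def expected_subtitle_paths(base_name: str, requested_subtitles: dict) -> list:
    """Build '<base>.<lang>.<ext>' for each requested subtitle track.

    Falls back to 'vtt' when an entry has no 'ext' key (an assumption,
    matching the extension seen in the output above).
    """
    paths = []
    for lang, sub_info in requested_subtitles.items():
        ext = (sub_info or {}).get("ext", "vtt")
        paths.append(f"{base_name}.{lang}.{ext}")
    return paths


if __name__ == "__main__":
    # Hand-written stand-in for info["requested_subtitles"]; not real yt-dlp output.
    sample = {"zh-TW": {"ext": "vtt"}, "en": {"ext": "vtt"}}
    for path in expected_subtitle_paths("OndBl1H1rwM", sample):
        print(path, os.path.exists(path))
```

Keeping the path construction in a pure function makes it easy to check without touching the network.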

Opening the Traditional Chinese subtitle file shows the following content:

WEBVTT

Kind: captions

Language: zh-TW

00:00:05.940 --> 00:00:10.400

我們撰寫程式的目的，除了是要建立商業邏輯中判斷的條件

00:00:10.400 --> 00:00:12.960

還需要善用電腦快速運算的能力

00:00:13.140 --> 00:00:16.320

在商業邏輯中執行反覆出現的規則運算

00:00:16.320 --> 00:00:20.820

其中for迴圈和while迴圈就是我們兩個好用的工具

00:00:21.160 --> 00:00:23.680

如果你想要指定程式執行的次數

00:00:23.680 --> 00:00:27.820

或是從容器型態的物件中依序取出裡面的值

00:00:27.820 --> 00:00:31.060

像是我先前介紹過的List, Tuple型態

... (omitted) ...

00:08:12.700 --> 00:08:17.100

Code Gym頻道主要是分享程式語言教學和電腦網路相關知識

00:08:17.100 --> 00:08:20.900

像是今天影片中所介紹的「for迴圈和while迴圈」

00:08:21.880 --> 00:08:24.240

如果你想要收到最新影片消息

00:08:24.240 --> 00:08:26.080

歡迎訂閱Code Gym頻道

00:08:26.080 --> 00:08:27.020

開小鈴鐺

00:08:27.020 --> 00:08:29.020

我們下次再見，掰掰！
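The file layout above (a short header, then cue timing lines each followed by text) can be read into structured cues with a few lines of Python. This is a minimal sketch that assumes timestamps always take the hh:mm:ss.mmm form seen in this file; the names TIMING, to_ms, and parse_cues are hypothetical, and this is not a full WebVTT parser.

```python
import re

# Matches cue timing lines such as "00:00:05.940 --> 00:00:10.400".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(\d{2}):(\d{2}):(\d{2})\.(\d{3})"
)


def to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert timestamp fields to integer milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_cues(vtt_text: str) -> list:
    """Return a list of (start_ms, end_ms, text) tuples, one per cue."""
    cues = []
    current = None  # [start, end, text_lines] for the cue being read
    for raw in vtt_text.splitlines():
        line = raw.strip()
        match = TIMING.match(line)
        if match:
            if current:
                cues.append((current[0], current[1], " ".join(current[2])))
            g = match.groups()
            current = [to_ms(*g[:4]), to_ms(*g[4:]), []]
        elif line and current is not None:
            # Header lines come before the first timing line and are skipped.
            current[2].append(line)
    if current:
        cues.append((current[0], current[1], " ".join(current[2])))
    return cues


if __name__ == "__main__":
    sample = (
        "WEBVTT\nKind: captions\nLanguage: zh-TW\n\n"
        "00:00:10.400 --> 00:00:12.960\n還需要善用電腦快速運算的能力\n"
    )
    for start, end, text in parse_cues(sample):
        print(start, end, text)
```

Keeping the timings is optional for summarization, but it leaves the door open to summaries that point back to a position in the video.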

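The program above has one drawback, though: the subtitle files take the video title as their base name, which can cause trouble later when opening them from code (for example, when the title contains unusual characters). A better approach is to use the video ID as the base name, which only requires changing the 'outtmpl' key in the yt-dlp options to '%(id)s.%(ext)s':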

'outtmpl': '%(id)s.%(ext)s'

Running it again gives:

(myvenv) D:\python\test>python get_youtube_transcript_1.py OndBl1H1rwM

DEBUG: 正在透過 yt-dlp 請求影片 OndBl1H1rwM 的資訊...

✅ 成功找到語言: zh-TW

✅ 成功找到語言: en

🎉 字幕已下載: OndBl1H1rwM.zh-TW.vtt

🎉 字幕已下載: OndBl1H1rwM.en.vtt
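Since the video ID now doubles as the file name, it may be convenient to let the script accept either a bare ID or a full URL. The sketch below uses only the standard library. It assumes video IDs are 11 characters drawn from letters, digits, "_" and "-", and it only recognizes the watch?v= and youtu.be URL shapes; the helper name extract_video_id is hypothetical.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: a video ID is 11 characters from [A-Za-z0-9_-].
VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]{11}$")
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(value: str) -> Optional[str]:
    """Return the video ID from a bare ID or a YouTube URL, else None."""
    value = value.strip()
    if VIDEO_ID.match(value):
        return value
    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/")
    elif host in YOUTUBE_HOSTS:
        candidate = parse_qs(parsed.query).get("v", [None])[0]
    if candidate and VIDEO_ID.match(candidate):
        return candidate
    return None


if __name__ == "__main__":
    print(extract_video_id("https://www.youtube.com/watch?v=OndBl1H1rwM"))
```

The result could be passed to get_yt_subtitle_ytdlp() in place of sys.argv[1], with None treated as a usage error.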

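2. Calling the OpenAI API to generate a summary of the video subtitles:

Building on the subtitle download program above, the subtitle content is cleaned to strip everything except the text and then passed to a GPT model to generate a summary. The code is as follows: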

# get_youtube_transcript_2.py
import sys
import os
import re
import yt_dlp
from openai import OpenAI, APIError  # APIError is needed by the except clause in ask_gpt()
from dotenv import dotenv_values

config = dotenv_values('.env')
openai_api_key = config.get('OPENAI_API_KEY')
client = OpenAI(api_key=openai_api_key)


def clean_vtt(file_path):
    """
    Clean a VTT subtitle file: strip the header, cue timings, and
    consecutive duplicate lines, and return plain text.
    """
    if not os.path.exists(file_path):
        return ""
    with open(file_path, 'r', encoding='utf-8') as f:
        lines = f.readlines()
    clean_text_list = []
    for line in lines:
        # Skip the WEBVTT header, cue timing lines (-->), and metadata lines
        if "-->" in line or line.startswith("WEBVTT") or line.startswith("Kind:") or line.startswith("Language:"):
            continue
        # Strip inline tags such as <c>
        line = re.sub(r'<[^>]+>', '', line).strip()
        # Skip blank lines and lines that repeat the previous one
        # (VTT files often repeat the same text across consecutive cues)
        if line and (not clean_text_list or line != clean_text_list[-1]):
            clean_text_list.append(line)
    return "\n".join(clean_text_list)


def ask_gpt(
    messages: list[dict[str, str]],
    model: str = 'gpt-3.5-turbo'
) -> str:
    try:
        reply = client.chat.completions.create(
            model=model,
            messages=messages
        )
        return reply.choices[0].message.content or ''
    except APIError as e:
        return e.message


def summarizer(text):
    if not text:
        return "無字幕內容可生成摘要。"  # "No subtitle content to summarize."
    print("\n--- [摘要生成中] ---")  # "Generating summary"
    # "Received N characters of subtitle content, preparing to summarize"
    print(f"（已接收到 {len(text)} 字的字幕內容，準備進行摘要...）")
    # Ask the model for a summary; the prompt reads "Please summarize the following subtitles"
    return ask_gpt([{"role": "user",
                     "content": f"請摘要下列字幕內容 : \n{text}"}])


def get_yt_subtitle_ytdlp(video_id):
    url = f"https://www.youtube.com/watch?v={video_id}"
    # Language priority: Traditional Chinese -> Simplified Chinese -> English
    lang_priority = ['zh-Hant', 'zh-TW', 'zh-Hans', 'zh-CN', 'en']
    ydl_opts = {
        'skip_download': True,
        'writesubtitles': True,
        'writeautomaticsub': True,
        'subtitleslangs': lang_priority,
        'outtmpl': '%(id)s.%(ext)s',  # always use the video ID as the base name
        'quiet': True,
        'no_warnings': True,
    }
    try:
        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
            # "Requesting info for video ... via yt-dlp"
            print(f"DEBUG: 正在透過 yt-dlp 請求影片 {video_id} 的資訊...")
            info = ydl.extract_info(url, download=False)
            subtitles = info.get('requested_subtitles')
            if not subtitles:
                print("❌ 找不到符合要求的字幕。")  # "No matching subtitles found."
                return
            # Download the subtitle files
            ydl.download([url])
            # Pick the first downloaded file in priority order
            selected_file = None
            for lang in lang_priority:
                possible_file = f"{video_id}.{lang}.vtt"
                if os.path.exists(possible_file):
                    selected_file = possible_file
                    # "Selected the best-matching subtitle language"
                    print(f"✅ 已選定最優語言字幕: {lang} ({selected_file})")
                    break
            if selected_file:
                # 1. Clean the subtitles
                print("🧹 正在清理字幕格式...")  # "Cleaning subtitle format"
                cleaned_content = clean_vtt(selected_file)
                # 2. Generate the summary
                summary_result = summarizer(cleaned_content)
                print("\n[摘要結果]:")  # "Summary result"
                print(summary_result)
                # Optional: delete the temporary vtt file once done
                # os.remove(selected_file)
            else:
                # "Download finished, but the file could not be found when reading"
                print("⚠️ 檔案下載完成，但讀取時找不到檔案。")
    except Exception as e:
        print(f"❌ 執行過程中發生錯誤: {e}")  # "An error occurred while running"


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("用法: python script.py [影片ID]")  # "Usage: python script.py [video ID]"
        sys.exit(1)
    get_yt_subtitle_ytdlp(sys.argv[1])

The output is as follows:

(myvenv) D:\python\test>python get_youtube_transcript_2.py OndBl1H1rwM

DEBUG: 正在透過 yt-dlp 請求影片 OndBl1H1rwM 的資訊...

✅ 已選定最優語言字幕: zh-TW (OndBl1H1rwM.zh-TW.vtt)

🧹 正在清理字幕格式...

--- [摘要生成中] ---

（已接收到 2339 字的字幕內容，準備進行摘要...）

[摘要結果]:

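本文介紹了在撰寫程式中使用for迴圈和while迴圈的基本概念和用法。for迴圈主要用於從容器型態中依序取出值，可以指定程式執行的次數或範圍，使用range()函式可以簡化處理。在for迴圈中，可以使用break和continue來控制迴圈的流程。而while迴圈則是根據條件式的判斷結果來決定是否執行程式區塊，可以用來進行猜數字等互動式遊戲。最後，介紹了如何匯入Python模組，在學習完本文後可以在程式編輯軟體上實際練習程式碼。

(In English: This article introduces the basic concepts and usage of for loops and while loops in programming. A for loop is mainly used to take values out of a container type one by one and can specify the number of iterations or a range; the range() function simplifies this. Inside a for loop, break and continue can be used to control the flow. A while loop decides whether to run a block based on the result of a condition, and can be used for interactive games such as number guessing. Finally, it covers how to import Python modules, and after working through the material the code can be practiced in a code editor.)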
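The transcript in this test is only 2339 characters, so it fits in a single request. For much longer videos the cleaned text may exceed what a model accepts in one prompt. One common workaround, sketched below, is to split the text on line boundaries, summarize each piece, and then summarize the partial summaries. The helper name split_into_chunks and the 3000-character default are hypothetical; the real limit is counted in tokens and depends on the model.

```python
def split_into_chunks(text: str, max_chars: int = 3000) -> list:
    """Split text on line boundaries into chunks of at most max_chars characters.

    A single line longer than max_chars is hard-split into pieces.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    current = []  # lines collected for the chunk being built
    size = 0      # length of "\n".join(current)
    for line in text.splitlines():
        # Break up any line that could not fit in a chunk by itself.
        pieces = [line[i:i + max_chars] for i in range(0, len(line), max_chars)] or [""]
        for piece in pieces:
            extra = len(piece) + (1 if current else 0)  # +1 for the joining newline
            if current and size + extra > max_chars:
                chunks.append("\n".join(current))
                current, size = [], 0
                extra = len(piece)
            current.append(piece)
            size += extra
    if current:
        chunks.append("\n".join(current))
    return chunks


if __name__ == "__main__":
    demo = "\n".join(f"line {i}" for i in range(10))
    for chunk in split_into_chunks(demo, max_chars=20):
        print(repr(chunk))
```

Each chunk could then be passed to summarizer() from the program above, and the joined partial summaries passed through summarizer() once more to get a single result.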
