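# 📚 Prompt Library Management System - Project Development Documentation

## Table of Contents

1. [Project Overview](#1-project-overview)
2. [System Architecture](#2-system-architecture)
3. [Environment Setup](#3-environment-setup)
4. [Detailed Implementation Steps](#4-detailed-implementation-steps)
5. [Core Code Implementation](#5-core-code-implementation)
6. [Deployment and Configuration](#6-deployment-and-configuration)
7. [Usage Guide](#7-usage-guide)
8. [Maintenance and Extension](#8-maintenance-and-extension)
9. [Troubleshooting](#9-troubleshooting)
10. [Best Practices](#10-best-practices)

---

## 1. Project Overview

### 1.1 Background

This project syncs a prompt library kept in Google Sheets to a GitHub repository, which adds version control and convenient access.

### 1.2 Core Features

- ✅ Automatically sync Google Sheets data to GitHub
- ✅ Preserve the iteration history of every prompt
- ✅ Manage categories as separate worksheets
- ✅ Generate readable Markdown documents
- ✅ Provide programmatic access (raw files plus a JSON index)

### 1.3 Tech Stack

- **Data source**: Google Sheets
- **Version control**: Git/GitHub
- **Language**: Python 3.8+
- **Automation**: GitHub Actions
- **Formats**: Markdown, JSON

---
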
## 2. System Architecture

### 2.1 Architecture Diagram

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  Google Sheets  │────▶│  Python Scripts  │────▶│   GitHub Repo   │
│  (data source)  │     │   (conversion)   │     │ (storage/view)  │
└─────────────────┘     └──────────────────┘     └─────────────────┘
         │                       │                         │
         │                       │                         ▼
         │                       │                 ┌───────────────┐
         │                       │                 │ GitHub Pages  │
         └───────────────────────┴────────────────▶│  (optional)   │
                   API calls                       └───────────────┘
```

### 2.2 Data Flow

1. **Input**: Google Sheets spreadsheet data
2. **Processing**: Python scripts read, parse, and convert the data
3. **Output**: structured Markdown files and a JSON index
4. **Storage**: the GitHub repository
5. **Presentation**: GitHub pages or API access

### 2.3 Directory Layout

```
prompt-library/
├── .github/
│   ├── workflows/
│   │   └── sync.yml             # GitHub Actions workflow
│   └── ISSUE_TEMPLATE/          # Issue templates
├── prompts/                     # Core prompt directory
│   ├── [category name]/         # One folder per worksheet
│   │   ├── (row,col)_title.md   # A single prompt file
│   │   └── index.md             # Category index
│   └── index.json               # Master index
├── scripts/                     # Scripts
│   ├── sync_sheets.py           # Sync script
│   ├── convert_local.py         # Local conversion script
│   ├── requirements.txt         # Python dependencies
│   └── config.yaml              # Configuration file
├── docs/                        # Documentation
├── tests/                       # Test scripts
├── .gitignore                   # Git ignore rules
├── README.md                    # Project README
└── LICENSE                      # License
```
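
The `(row,col)_title.md` naming pattern in the tree above can be sketched as a small function. This is a minimal illustration, not the project's actual code: `prompt_filename` is a hypothetical helper, and the 30-character title limit is an assumed default.

```python
import re


def prompt_filename(row: int, col: int, title: str, max_len: int = 30) -> str:
    """Build a file name such as '(3,2)_Summarize a report.md'."""
    # Drop characters that are not allowed in file names on common platforms
    clean = re.sub(r'[<>:"/\\|?*\n\r]', '', title)[:max_len].strip()
    # Fall back to a fixed name when nothing usable is left
    return f"({row},{col})_{clean or 'untitled'}.md"


print(prompt_filename(3, 2, "Summarize: a report?"))  # (3,2)_Summarize a report.md
print(prompt_filename(1, 1, "???"))                   # (1,1)_untitled.md
```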
---

## 3. Environment Setup

### 3.1 Local Development Environment

#### Windows
```bash
# 1. Install Python
# Download: https://www.python.org/downloads/

# 2. Install Git
# Download: https://git-scm.com/download/win

# 3. Verify the installation
python --version
git --version

# 4. Create and activate a virtual environment
python -m venv venv
venv\Scripts\activate

# 5. Install dependencies (run from the project root)
pip install -r scripts/requirements.txt
```

#### macOS/Linux
```bash
# 1. Install Python (if not already installed)
brew install python3  # macOS
sudo apt-get install python3 python3-pip python3-venv  # Ubuntu

# 2. Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate

# 3. Install dependencies (run from the project root)
pip install -r scripts/requirements.txt
```

### 3.2 Dependencies
```txt
# scripts/requirements.txt
pandas==2.0.3
openpyxl==3.1.2
google-auth==2.22.0
google-auth-oauthlib==1.0.0
google-auth-httplib2==0.1.0
google-api-python-client==2.96.0
PyYAML==6.0.1
python-dotenv==1.0.0
```

### 3.3 Google Sheets API Setup

#### Step 1: Create a Google Cloud Project
1. Go to the [Google Cloud Console](https://console.cloud.google.com/)
2. Create a new project or select an existing one
3. Note the project ID

#### Step 2: Enable the APIs
```text
# Search for and enable these in the Cloud Console
Google Sheets API
Google Drive API (optional)
```

#### Step 3: Create a Service Account
1. Navigate to "APIs & Services" → "Credentials"
2. Click "Create credentials" → "Service account"
3. Fill in the service account details:
   - Name: `sheets-sync-bot`
   - ID: `sheets-sync-bot`
   - Description: `Used to sync Google Sheets data`
4. Create and download a JSON key file

#### Step 4: Grant Access to the Spreadsheet
1. Open your Google Sheets spreadsheet
2. Click "Share" in the top-right corner
3. Add the service account email (e.g. `sheets-sync-bot@PROJECT_ID.iam.gserviceaccount.com`)
4. Set the permission to "Viewer"

---

## 4. Detailed Implementation Steps

### 4.1 Project Initialization

```bash
# 1. Create the project directory
mkdir prompt-library
cd prompt-library

# 2. Initialize Git
git init

# 3. Create the directory structure
mkdir -p prompts scripts docs tests .github/workflows

# 4. Create the base files
touch README.md LICENSE .gitignore
touch scripts/config.yaml
```

### 4.2 Configuration Files

#### config.yaml
```yaml
# scripts/config.yaml
# Relative paths are resolved against the directory the script is run from
google_sheets:
  sheet_id: "1ngoQOhJqdguwNAilCl1joNwTje7FWWN9WiI2bo5VhpU"
  credentials_path: "./credentials.json"

output:
  prompts_dir: "./prompts"
  use_timestamp: true   # Stamp each file with the sync time (every run then rewrites every file)

naming:
  max_title_length: 30
  row_col_format: "({row},{col})"
  separator: "_"

sync:
  skip_rows: 2          # Skip the leading description rows
  skip_keywords:        # Skip rows whose first cell contains any of these keywords
    - "http"
    - "广告"             # "advertisement"
    - "tron"
    - "eth"
    - "btc"
```
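
The `skip_keywords` rule is a case-insensitive substring match on the first cell of a row. The sketch below shows that behaviour in isolation, including one side effect worth knowing about; `should_skip` is a hypothetical helper written for this illustration.

```python
from typing import Iterable


def should_skip(first_cell: str, keywords: Iterable[str]) -> bool:
    """Return True when the first cell contains any keyword (case-insensitive)."""
    text = first_cell.lower()
    return any(kw.lower() in text for kw in keywords)


keywords = ["http", "广告", "tron", "eth", "btc"]
print(should_skip("https://example.com", keywords))  # True
print(should_skip("Write a haiku", keywords))        # False
print(should_skip("Describe a method", keywords))    # True: "eth" occurs inside "method"
```

Because short keywords such as `eth` also match ordinary words, a prompt whose first cell contains "method" would be skipped. Longer or more specific keywords may be a safer choice if that matters for your sheet.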

#### .gitignore
```gitignore
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
env/
ENV/

# Credentials and keys
credentials.json
token.json
*.key
.env

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Temporary files
*.tmp
*.bak
*.log
temp/
```

### 4.3 Core Script Development

#### Structure of the Main Sync Script
```python
# scripts/sync_sheets.py: basic structure (method bodies are filled in in section 5)

import os
import yaml
import json
from datetime import datetime


class SheetsSyncer:
    def __init__(self, config_path='config.yaml'):
        self.config = self.load_config(config_path)
        self.service = self.authenticate()

    def load_config(self, path):
        """Load the configuration file."""
        pass

    def authenticate(self):
        """Authenticate against the Google Sheets API."""
        pass

    def fetch_sheets_data(self):
        """Fetch the data of every worksheet."""
        pass

    def process_sheet(self, sheet_name, data):
        """Process a single worksheet."""
        pass

    def create_prompt_file(self, folder, row, col, content, title):
        """Create a single prompt file."""
        pass

    def generate_index(self, folder, prompts_info):
        """Generate the index file."""
        pass

    def sync(self):
        """Main sync flow."""
        pass


if __name__ == "__main__":
    syncer = SheetsSyncer()
    syncer.sync()
```

---

## 5. Core Code Implementation

### 5.1 Complete Sync Script

```python
# scripts/sync_sheets.py - full implementation

import json
import os
import re
import sys
import traceback
from datetime import datetime
from typing import Dict, List

import yaml
from google.oauth2 import service_account
from googleapiclient.discovery import build

# config.yaml sits next to this script, so the default works from any directory
DEFAULT_CONFIG_PATH = os.path.join(
    os.path.dirname(os.path.abspath(__file__)), 'config.yaml'
)


class PromptLibrarySyncer:
    """Prompt library syncer."""

    def __init__(self, config_path: str = DEFAULT_CONFIG_PATH):
        """Initialize the syncer."""
        self.config = self._load_config(config_path)
        self.service = self._authenticate()
        self.stats = {'sheets': 0, 'prompts': 0, 'versions': 0}

    def _load_config(self, path: str) -> dict:
        """Load the configuration file."""
        with open(path, 'r', encoding='utf-8') as f:
            return yaml.safe_load(f)

    def _authenticate(self):
        """Authenticate against the Google Sheets API."""
        SCOPES = ['https://www.googleapis.com/auth/spreadsheets.readonly']

        creds_path = self.config['google_sheets']['credentials_path']
        if os.path.exists(creds_path):
            creds = service_account.Credentials.from_service_account_file(
                creds_path, scopes=SCOPES
            )
        else:
            # Fall back to the environment variable
            creds_json = os.environ.get('GOOGLE_SHEETS_CREDENTIALS')
            if not creds_json:
                raise RuntimeError(
                    f"No credentials: {creds_path} does not exist and "
                    "GOOGLE_SHEETS_CREDENTIALS is not set"
                )
            creds_dict = json.loads(creds_json)
            creds = service_account.Credentials.from_service_account_info(
                creds_dict, scopes=SCOPES
            )

        return build('sheets', 'v4', credentials=creds)

    def _sanitize_filename(self, text: str) -> str:
        """Make text safe to use as a file or folder name."""
        if not text:
            return "untitled"

        # Remove characters that are illegal in file names
        text = re.sub(r'[<>:"/\\|?*\n\r]', '', str(text))

        # Limit the length
        max_length = self.config['naming']['max_title_length']
        text = text[:max_length].strip()

        return text if text else "untitled"

    def _extract_title(self, content: str) -> str:
        """Derive a title from the prompt content."""
        if not content:
            return "untitled"

        # Use the first line as the title
        lines = content.split('\n')
        title = lines[0] if lines else content

        # Keep at most the first five whitespace-separated words
        words = title.split()[:5]
        title = ' '.join(words)

        return self._sanitize_filename(title)

    def fetch_all_sheets(self) -> Dict[str, List[List[str]]]:
        """Fetch the data of every worksheet."""
        sheet_id = self.config['google_sheets']['sheet_id']

        # Get the names of all worksheets
        spreadsheet = self.service.spreadsheets().get(
            spreadsheetId=sheet_id
        ).execute()

        sheets = spreadsheet.get('sheets', [])
        all_data = {}

        for sheet in sheets:
            sheet_name = sheet['properties']['title']
            print(f"📖 Reading worksheet: {sheet_name}")

            # Quote the sheet name (doubling any apostrophes) so that names
            # containing spaces or punctuation still form a valid A1 range
            quoted_name = sheet_name.replace("'", "''")
            range_name = f"'{quoted_name}'!A:ZZ"
            try:
                result = self.service.spreadsheets().values().get(
                    spreadsheetId=sheet_id,
                    range=range_name
                ).execute()

                values = result.get('values', [])
                all_data[sheet_name] = values
                self.stats['sheets'] += 1

            except Exception as e:
                print(f"  ⚠️ Read failed: {e}")

        return all_data

    def process_sheet_data(
        self,
        sheet_name: str,
        data: List[List[str]]
    ) -> Dict[int, Dict]:
        """Process the data of a single worksheet."""
        prompts_info = {}
        skip_rows = self.config['sync']['skip_rows']
        skip_keywords = self.config['sync']['skip_keywords']

        # Create the output folder
        output_dir = self.config['output']['prompts_dir']
        folder_name = self._sanitize_filename(sheet_name)
        folder_path = os.path.join(output_dir, folder_name)
        os.makedirs(folder_path, exist_ok=True)

        # Process each row
        for row_idx, row in enumerate(data):
            # Skip the description rows
            if row_idx < skip_rows:
                continue

            # Skip empty rows
            if not row or all(not cell for cell in row):
                continue

            # Skip rows whose first cell contains a skip keyword
            first_cell = str(row[0]) if row else ""
            if any(kw in first_cell.lower() for kw in skip_keywords):
                continue

            # Prompt number, counted from the first data row
            row_num = row_idx - skip_rows + 1
            title = None
            versions = {}

            # Each non-empty column is one version
            for col_idx, cell in enumerate(row):
                if not cell or not cell.strip():
                    continue

                col_num = col_idx + 1
                content = cell.strip()

                # The first non-empty cell provides the title
                if title is None:
                    title = self._extract_title(content)

                # Create the file
                filename = self._create_prompt_file(
                    folder_path, row_num, col_num, content, title
                )
                versions[col_num] = filename
                self.stats['versions'] += 1

            if versions:
                prompts_info[row_num] = {
                    'title': title,
                    'versions': versions
                }
                self.stats['prompts'] += 1

        # Generate the category index
        if prompts_info:
            self._create_index(folder_path, sheet_name, prompts_info)

        return prompts_info

    def _create_prompt_file(
        self,
        folder: str,
        row: int,
        col: int,
        content: str,
        title: str
    ) -> str:
        """Create a single prompt file."""
        # Build the file name
        row_col = self.config['naming']['row_col_format'].format(
            row=row, col=col
        )
        separator = self.config['naming']['separator']
        filename = f"{row_col}{separator}{title}.md"
        filepath = os.path.join(folder, filename)

        # Write the file
        with open(filepath, 'w', encoding='utf-8') as f:
            # File header
            f.write(f"# {title}\n\n")
            f.write("## 📍 Metadata\n\n")
            f.write(f"- **Position**: row {row}, column {col}\n")

            if col == 1:
                f.write("- **Version**: original\n")
            else:
                f.write(f"- **Version**: iteration {col-1}\n")

            if self.config['output']['use_timestamp']:
                f.write(f"- **Updated**: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")

            f.write("\n---\n\n")

            # Prompt content
            f.write("## 📝 Prompt\n\n")
            f.write("```\n")
            f.write(content)
            f.write("\n```\n\n")

            # Version navigation
            f.write("---\n\n")
            f.write("## 🔄 Version History\n\n")

            # Links to the other versions could be added here

        print(f"  ✅ Created: {filename}")
        return filename

    def _create_index(
        self,
        folder: str,
        sheet_name: str,
        prompts_info: Dict
    ):
        """Create the category index file."""
        index_path = os.path.join(folder, "index.md")

        with open(index_path, 'w', encoding='utf-8') as f:
            # Title
            f.write(f"# 📂 {sheet_name}\n\n")
            f.write(f"Last synced: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n")

            # Statistics
            total_prompts = len(prompts_info)
            total_versions = sum(
                len(info['versions']) for info in prompts_info.values()
            )
            avg_versions = total_versions / total_prompts if total_prompts else 0

            f.write("## 📊 Statistics\n\n")
            f.write(f"- Prompts: {total_prompts}\n")
            f.write(f"- Versions: {total_versions}\n")
            f.write(f"- Average versions per prompt: {avg_versions:.1f}\n\n")

            # Prompt list
            f.write("## 📋 Prompts\n\n")
            f.write("| No. | Title | Versions | View |\n")
            f.write("|-----|-------|----------|------|\n")

            for row_num in sorted(prompts_info.keys()):
                info = prompts_info[row_num]
                title = info['title']
                versions = info['versions']

                # Build the version links
                links = []
                for col, filename in sorted(versions.items()):
                    links.append(f"[v{col}](./{filename})")

                links_str = " / ".join(links)
                f.write(f"| {row_num} | {title} | {len(versions)} | {links_str} |\n")

            # Version matrix
            f.write("\n## 🗂️ Version Matrix\n\n")
            self._create_version_matrix(f, prompts_info)

    def _create_version_matrix(self, f, prompts_info: Dict):
        """Write a row-by-version matrix view."""
        if not prompts_info:
            return

        # Find the highest column number in use
        max_col = max(
            max(info['versions'].keys())
            for info in prompts_info.values()
        )

        # Table header
        headers = ["Row"] + [f"v{i}" for i in range(1, max_col + 1)]
        f.write("| " + " | ".join(headers) + " |\n")
        f.write("|" + "---|" * len(headers) + "\n")

        # Fill in the matrix
        for row_num in sorted(prompts_info.keys()):
            info = prompts_info[row_num]
            row_data = [str(row_num)]

            for col in range(1, max_col + 1):
                if col in info['versions']:
                    row_data.append("✅")
                else:
                    row_data.append("-")

            f.write("| " + " | ".join(row_data) + " |\n")

    def create_main_index(self, all_sheets_info: Dict):
        """Create the master index (index.json) and the top-level README."""
        output_dir = self.config['output']['prompts_dir']

        # JSON index
        index_json = {
            'last_updated': datetime.now().isoformat(),
            'stats': self.stats,
            'categories': []
        }

        for sheet_name, prompts_info in all_sheets_info.items():
            if prompts_info:
                index_json['categories'].append({
                    'name': sheet_name,
                    'prompt_count': len(prompts_info),
                    'version_count': sum(
                        len(info['versions'])
                        for info in prompts_info.values()
                    )
                })

        # Save the JSON
        json_path = os.path.join(output_dir, 'index.json')
        with open(json_path, 'w', encoding='utf-8') as f:
            json.dump(index_json, f, ensure_ascii=False, indent=2)

        # Create the top-level README next to the prompts directory
        readme_path = os.path.join(
            os.path.dirname(output_dir),
            'README.md'
        )

        with open(readme_path, 'w', encoding='utf-8') as f:
            f.write("# 📚 Prompt Library\n\n")

            f.write(f"Last updated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n\n")

            f.write("## 📊 Overview\n\n")
            f.write(f"- **Categories**: {self.stats['sheets']}\n")
            f.write(f"- **Prompts**: {self.stats['prompts']}\n")
            f.write(f"- **Versions**: {self.stats['versions']}\n\n")

            f.write("## 📂 Categories\n\n")

            for cat in index_json['categories']:
                folder_name = self._sanitize_filename(cat['name'])
                f.write(
                    f"- [{cat['name']}](./prompts/{folder_name}/) - "
                    f"{cat['prompt_count']} prompts, "
                    f"{cat['version_count']} versions\n"
                )

            f.write("\n## 🚀 Quick Start\n\n")
            f.write("1. Browse the categories above and pick the type of prompt you need\n")
            f.write("2. Open the category folder to see all of its prompts\n")
            f.write("3. Open a specific version to read its content\n\n")

            f.write("## 📖 Usage Notes\n\n")
            f.write("- File name format: `(row,col)_title.md`\n")
            f.write("- Row: the prompt's sequence number\n")
            f.write("- Column: the version number (1 = original, 2+ = iterations)\n\n")

            f.write("## 🔄 Sync Information\n\n")
            f.write(f"- Data source: [Google Sheets]({self._get_sheet_url()})\n")
            f.write("- Sync method: GitHub Actions / manual\n")
            f.write("- Sync frequency: daily / manually triggered\n\n")

            f.write("## 📝 License\n\n")
            f.write("This project is licensed under the MIT License\n")

    def _get_sheet_url(self) -> str:
        """Return the Google Sheets URL."""
        sheet_id = self.config['google_sheets']['sheet_id']
        return f"https://docs.google.com/spreadsheets/d/{sheet_id}"

    def sync(self):
        """Run the sync."""
        print("🚀 Starting prompt library sync...\n")

        # Fetch all data
        all_sheets_data = self.fetch_all_sheets()

        if not all_sheets_data:
            print("❌ No worksheet data found")
            return

        print(f"\n📊 Found {len(all_sheets_data)} worksheets\n")

        # Process each worksheet
        all_sheets_info = {}
        for sheet_name, data in all_sheets_data.items():
            print(f"\n🔄 Processing: {sheet_name}")
            prompts_info = self.process_sheet_data(sheet_name, data)
            all_sheets_info[sheet_name] = prompts_info

            if prompts_info:
                print(f"  ✅ Done: {len(prompts_info)} prompts")
            else:
                print("  ⚠️ No valid data")

        # Create the master index
        print("\n📝 Generating the master index...")
        self.create_main_index(all_sheets_info)

        # Print statistics
        print("\n✨ Sync complete!\n")
        print("📊 Statistics:")
        print(f"  - Worksheets: {self.stats['sheets']}")
        print(f"  - Prompts: {self.stats['prompts']}")
        print(f"  - Versions: {self.stats['versions']}")


def main():
    """Entry point."""
    try:
        syncer = PromptLibrarySyncer()
        syncer.sync()
    except Exception as e:
        print(f"\n❌ Error: {e}")
        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    main()
```
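
The row and column numbering used by the script can be checked offline, without calling the API. The sketch below is a simplified stand-in for the row loop: `rows_to_versions` is a hypothetical function, the sample data is invented, and keyword skipping is left out to keep the numbering visible.

```python
from typing import Dict, List


def rows_to_versions(
    data: List[List[str]], skip_rows: int = 2
) -> Dict[int, Dict[int, str]]:
    """Map prompt number -> {version column -> content}, both counted from 1."""
    result: Dict[int, Dict[int, str]] = {}
    for row_idx, row in enumerate(data):
        if row_idx < skip_rows:
            continue
        # Keep only non-empty cells; the column position is the version number
        versions = {
            col_idx + 1: cell.strip()
            for col_idx, cell in enumerate(row)
            if cell and cell.strip()
        }
        if versions:
            # Empty rows still consume a number, so numbering follows the sheet
            result[row_idx - skip_rows + 1] = versions
    return result


sample = [
    ["Description"],                                    # skipped (skip_rows = 2)
    ["Header"],                                         # skipped
    ["Summarize this", "Summarize this in 3 bullets"],  # prompt 1, versions 1 and 2
    [],                                                 # empty row
    ["Translate", "", "Translate to French"],           # prompt 3, versions 1 and 3
]
print(rows_to_versions(sample))
# {1: {1: 'Summarize this', 2: 'Summarize this in 3 bullets'}, 3: {1: 'Translate', 3: 'Translate to French'}}
```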

---

## 6. Deployment and Configuration

### 6.1 GitHub Actions Automation

```yaml
# .github/workflows/sync.yml
name: Sync prompt library

on:
  # Scheduled trigger
  schedule:
    - cron: '0 0 * * *'  # Daily at 00:00 UTC (08:00 Beijing time)

  # Manual trigger
  workflow_dispatch:
    inputs:
      force_sync:
        description: 'Force a sync of all data'
        required: false
        default: 'false'

  # Push trigger
  push:
    branches:
      - main
    paths:
      - 'scripts/**'
      - '.github/workflows/sync.yml'

jobs:
  sync:
    runs-on: ubuntu-latest
    permissions:
      contents: write  # Needed so the job can push commits and tags

    steps:
      - name: 📥 Check out code
        uses: actions/checkout@v3
        with:
          fetch-depth: 0  # Fetch the full history

      - name: 🐍 Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
          cache: 'pip'

      - name: 📦 Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r scripts/requirements.txt

      - name: 🔐 Set up credentials
        env:
          GOOGLE_SHEETS_CREDENTIALS: ${{ secrets.GOOGLE_SHEETS_CREDENTIALS }}
        run: |
          echo "$GOOGLE_SHEETS_CREDENTIALS" > credentials.json

      - name: 🔄 Sync data
        env:
          FORCE_SYNC: ${{ github.event.inputs.force_sync }}
        run: |
          python scripts/sync_sheets.py

      - name: 📊 Generate report
        # Assumes scripts/generate_report.py exists; it is not shown in this document
        run: |
          python scripts/generate_report.py

      - name: 💾 Commit changes
        run: |
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git add .
          git diff --staged --quiet || git commit -m "🔄 Auto sync: $(date +'%Y-%m-%d %H:%M:%S')"

      - name: 📤 Push changes
        uses: ad-m/github-push-action@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          branch: ${{ github.ref }}

      - name: 🏷️ Create tag (weekly)
        if: github.event_name == 'schedule'
        run: |
          # Tag only on Mondays (UTC), and only if this week's tag does not exist yet
          if [ "$(date -u +%u)" = "1" ]; then
            TAG="v$(date -u +'%Y.%W')"
            if ! git rev-parse -q --verify "refs/tags/$TAG" >/dev/null; then
              git tag "$TAG"
              git push origin "$TAG"
            fi
          fi
```
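
The workflow passes the `force_sync` input to the sync step as the `FORCE_SYNC` environment variable, but the script in section 5.1 never reads it. One possible way to consume it is sketched below. `force_sync_requested` is a hypothetical helper, and what a forced sync should actually do is left open because the document does not define it.

```python
import os
from typing import Mapping, Optional


def force_sync_requested(env: Optional[Mapping[str, str]] = None) -> bool:
    """Interpret the FORCE_SYNC environment variable as a boolean flag."""
    env = os.environ if env is None else env
    # On scheduled and push runs no input is supplied, so the value is empty
    return env.get("FORCE_SYNC", "").strip().lower() in {"1", "true", "yes"}


print(force_sync_requested({"FORCE_SYNC": "true"}))  # True
print(force_sync_requested({"FORCE_SYNC": ""}))      # False
print(force_sync_requested({}))                      # False
```

Passing the environment in as a mapping keeps the function easy to test; in the script itself it would be called with no arguments.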

### 6.2 GitHub Secrets

Add the following Secrets in the GitHub repository settings:

1. **GOOGLE_SHEETS_CREDENTIALS**
   - Content: the full content of the service account JSON key
   - Source: the JSON file downloaded from the Google Cloud Console

2. **GITHUB_TOKEN** (provided automatically by GitHub Actions, so it does not need to be added by hand)
   - Used to push changes
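
A malformed or truncated `GOOGLE_SHEETS_CREDENTIALS` secret tends to surface as an opaque authentication error. The sketch below checks the secret before it is used. `parse_service_account_secret` is a hypothetical helper, and the set of required fields is an assumption based on the usual layout of a service account key file.

```python
import json

# Fields assumed to be present in every service account key file
REQUIRED_KEYS = {"type", "client_email", "private_key"}


def parse_service_account_secret(raw: str) -> dict:
    """Parse the secret and check that it looks like a service account key."""
    if not raw or not raw.strip():
        raise ValueError("GOOGLE_SHEETS_CREDENTIALS is empty or not set")
    try:
        info = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"GOOGLE_SHEETS_CREDENTIALS is not valid JSON: {exc}") from exc
    if not isinstance(info, dict):
        raise ValueError("GOOGLE_SHEETS_CREDENTIALS must be a JSON object")
    missing = REQUIRED_KEYS - info.keys()
    if missing:
        raise ValueError(f"Missing fields in credentials: {sorted(missing)}")
    if info["type"] != "service_account":
        raise ValueError(f"Unexpected credential type: {info['type']!r}")
    return info


example = json.dumps({
    "type": "service_account",
    "client_email": "sheets-sync-bot@example-project.iam.gserviceaccount.com",
    "private_key": "dummy-value-for-illustration",
})
print(parse_service_account_secret(example)["client_email"])
```

The returned dictionary could then be handed to `service_account.Credentials.from_service_account_info`, which the sync script already uses for its environment-variable fallback.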

### 6.3 Local Deployment Script

```bash
#!/bin/bash
# deploy.sh - local deployment script
set -e  # Stop at the first failing command

echo "🚀 Deploying the prompt library sync system"

# 1. Check prerequisites
echo "📦 Checking prerequisites..."
command -v python3 >/dev/null 2>&1 || { echo "❌ Python 3 is required"; exit 1; }
command -v git >/dev/null 2>&1 || { echo "❌ Git is required"; exit 1; }

# 2. Create the virtual environment
echo "🐍 Creating the virtual environment..."
python3 -m venv venv
source venv/bin/activate

# 3. Install dependencies
echo "📦 Installing Python packages..."
pip install -r scripts/requirements.txt

# 4. Check the configuration
echo "🔧 Checking the configuration..."
if [ ! -f "scripts/config.yaml" ]; then
    echo "❌ config.yaml is missing"
    exit 1
fi

if [ ! -f "credentials.json" ]; then
    echo "⚠️ credentials.json is missing; falling back to the environment variable"
fi

# 5. Run the sync
echo "🔄 Running the sync..."
python scripts/sync_sheets.py

# 6. Git operations
echo "📤 Committing to Git..."
git add .
# Commit only when something changed, so an unchanged sync does not fail
git diff --staged --quiet || git commit -m "🔄 Manual sync: $(date +'%Y-%m-%d %H:%M:%S')"
git push

echo "✅ Deployment complete!"
```

---

## 7. Usage Guide

### 7.1 Day-to-Day Workflow

#### Adding a New Prompt
1. Add a new row in Google Sheets
2. Enter the prompt text in the appropriate column
3. Wait for the automatic sync or trigger one manually

#### Updating a Prompt Version
1. Add the new iteration in the next column of the same row
2. The next sync creates the new version file automatically

#### Viewing Prompts
1. Open the GitHub repository
2. Go to the `prompts/<category name>/` folder
3. Open the prompt file you want

### 7.2 API Access

#### JavaScript Example
```javascript
// Fetch the index
fetch('https://raw.githubusercontent.com/USERNAME/REPO/main/prompts/index.json')
  .then(res => res.json())
  .then(data => {
    console.log('Categories:', data.categories);
  });

// Fetch a specific prompt
fetch('https://raw.githubusercontent.com/USERNAME/REPO/main/prompts/CATEGORY/(1,1)_TITLE.md')
  .then(res => res.text())
  .then(content => {
    console.log('Prompt content:', content);
  });
```

#### Python Example
```python
import requests

# Fetch the index
url = "https://raw.githubusercontent.com/USERNAME/REPO/main/prompts/index.json"
response = requests.get(url, timeout=10)
response.raise_for_status()  # Fail loudly on HTTP errors such as 404
data = response.json()

print(f"Total prompts: {data['stats']['prompts']}")
for category in data['categories']:
    print(f"- {category['name']}: {category['prompt_count']} prompts")
```
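
Prompt file names contain parentheses, a comma, and often non-ASCII characters, so it can help to percent-encode the path when building a raw URL by hand. The sketch below does this with the standard library only. `raw_prompt_url` is a hypothetical helper, and `USERNAME`, `REPO`, and the category name are placeholders.

```python
from urllib.parse import quote, unquote

RAW_BASE = "https://raw.githubusercontent.com"


def raw_prompt_url(
    owner: str, repo: str, category: str, filename: str, branch: str = "main"
) -> str:
    """Build a raw.githubusercontent.com URL with each path segment percent-encoded."""
    parts = ("prompts", category, filename)
    # safe="" also encodes "/" so a slash inside a segment cannot split the path
    path = "/".join(quote(part, safe="") for part in parts)
    return f"{RAW_BASE}/{owner}/{repo}/{branch}/{path}"


url = raw_prompt_url("USERNAME", "REPO", "写作", "(1,1)_标题.md")
print(url)
print(unquote(url))  # decodes back to the readable form
```

The encoded URL can be passed to `requests.get` or `fetch` unchanged.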

### 7.3 Batch Operations

#### Exporting All Prompts
```bash
# Clone the repository
git clone https://github.com/USERNAME/REPO.git
cd REPO

# Export as a ZIP archive
zip -r prompts.zip prompts/
```

#### Local Search
```bash
# Search for prompts containing a keyword
grep -r "keyword" prompts/

# Count version files (index files excluded)
find prompts -name "*.md" ! -name "index.md" | wc -l
```

---

## 8. Maintenance and Extension

### 8.1 Routine Maintenance

#### Daily
- ✅ Check the sync logs
- ✅ Verify newly added prompts

#### Weekly
- ✅ Clean up invalid files
- ✅ Update the documentation
- ✅ Back up data

#### Monthly
- ✅ Performance tuning
- ✅ Dependency updates
- ✅ Security audit

### 8.2 Extensions

#### Adding Search
```python
# scripts/search.py
import os
import sys
from typing import Dict, List


def search_prompts(keyword: str, prompts_dir: str = "./prompts") -> List[Dict]:
    """Case-insensitive search over prompt files (index.md files are skipped)."""
    results = []
    needle = keyword.lower()

    for root, dirs, files in os.walk(prompts_dir):
        for file in files:
            if file.endswith('.md') and file != 'index.md':
                filepath = os.path.join(root, file)
                with open(filepath, 'r', encoding='utf-8') as f:
                    content = f.read()
                if needle in content.lower():
                    results.append({
                        'file': filepath,
                        'category': os.path.basename(root),
                        'title': file[:-len('.md')]  # strip only the extension
                    })

    return results


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit('Usage: python scripts/search.py "keyword"')
    for hit in search_prompts(sys.argv[1]):
        print(f"{hit['category']}: {hit['file']}")
```

#### Adding Version Comparison
```python
# scripts/diff.py
import difflib


def compare_versions(file1: str, file2: str) -> str:
    """Return a unified diff of two prompt versions."""
    with open(file1, 'r', encoding='utf-8') as f1:
        content1 = f1.read().splitlines()

    with open(file2, 'r', encoding='utf-8') as f2:
        content2 = f2.read().splitlines()

    # The lines carry no trailing newline, so lineterm='' plus the join
    # below yields exactly one newline per diff line
    diff = difflib.unified_diff(
        content1, content2,
        fromfile=file1,
        tofile=file2,
        lineterm=''
    )

    return '\n'.join(diff)
```

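### 8.3 Performance Optimization

#### Incremental Sync
```python
def incremental_sync(self):
    """Run a sync only when the spreadsheet changed since the last sync."""
    # read_last_sync_time, get_sheet_modified_time and update_last_sync_time
    # are helper methods that still have to be implemented

    # Read the time of the last sync
    last_sync = self.read_last_sync_time()

    # Get the spreadsheet's last-modified time
    sheet_modified = self.get_sheet_modified_time()

    if sheet_modified > last_sync:
        # Run the sync
        self.sync()
        self.update_last_sync_time()
    else:
        print("Nothing to sync: the data has not changed")
```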
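
The snippet above leaves its helper methods undefined. The sketch below shows one possible way to persist and compare the last sync time using a small JSON state file. The file name `.last_sync.json` and the standalone function form are assumptions made for illustration. Obtaining the spreadsheet's modification time is not shown; one option might be the Drive API's `modifiedTime` file field, which would need the optional Drive API from section 3.3 and an additional scope.

```python
import json
import os
import tempfile
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def read_last_sync_time(state_path: str) -> datetime:
    """Return the stored sync time, or the epoch when none has been recorded."""
    try:
        with open(state_path, "r", encoding="utf-8") as f:
            return datetime.fromisoformat(json.load(f)["last_sync"])
    except (FileNotFoundError, KeyError, ValueError):
        # A missing or unreadable state file means "never synced"
        return EPOCH


def update_last_sync_time(state_path: str, when: datetime) -> None:
    """Record the given timezone-aware time as the last sync time (stored in UTC)."""
    with open(state_path, "w", encoding="utf-8") as f:
        json.dump({"last_sync": when.astimezone(timezone.utc).isoformat()}, f)


def needs_sync(sheet_modified: datetime, state_path: str) -> bool:
    """True when the sheet changed after the recorded sync time."""
    return sheet_modified > read_last_sync_time(state_path)


state = os.path.join(tempfile.mkdtemp(), ".last_sync.json")
modified = datetime(2025, 2, 2, 12, 0, tzinfo=timezone.utc)
print(needs_sync(modified, state))   # True: nothing recorded yet
update_last_sync_time(state, modified)
print(needs_sync(modified, state))   # False: unchanged since the last sync
```

Keeping every timestamp timezone-aware avoids comparison errors between naive and aware `datetime` values.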

#### Parallel Processing
```python
from concurrent.futures import ThreadPoolExecutor


def parallel_process_sheets(self, sheets_data):
    """Process several worksheets in parallel; returns {sheet_name: prompts_info}."""
    # Note: process_sheet_data updates self.stats without a lock, so the
    # counters may be inaccurate when sheets are processed concurrently
    with ThreadPoolExecutor(max_workers=5) as executor:
        futures = {
            sheet_name: executor.submit(self.process_sheet_data, sheet_name, data)
            for sheet_name, data in sheets_data.items()
        }

        # Wait for every task; result() re-raises any exception from a worker
        return {name: future.result() for name, future in futures.items()}
```
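
When worksheets are processed in threads, shared counters such as the sync statistics are best updated under a lock. The sketch below shows one way to do that. `SyncStats` is a hypothetical class for illustration, not part of the existing script.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SyncStats:
    """Counters that can be incremented safely from several threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts = {"sheets": 0, "prompts": 0, "versions": 0}

    def add(self, key: str, amount: int = 1) -> None:
        # The lock makes the read-modify-write step atomic
        with self._lock:
            self._counts[key] += amount

    def snapshot(self) -> dict:
        """Return a copy so callers cannot mutate the live counters."""
        with self._lock:
            return dict(self._counts)


stats = SyncStats()
with ThreadPoolExecutor(max_workers=5) as executor:
    for _ in range(1000):
        executor.submit(stats.add, "versions")
# Leaving the with-block waits for all submitted tasks to finish
print(stats.snapshot()["versions"])  # 1000
```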

---

## 9. Troubleshooting

### 9.1 Common Problems

#### Problem 1: API Authentication Fails
```
Error: Invalid credentials
```
**Solution**:
1. Check that the service account key is correct
2. Confirm that the Google Sheets API is enabled
3. Verify that the service account has access to the spreadsheet

#### Problem 2: Sync Times Out
```
Error: Read timed out
```
**Solution**:
1. Check the network connection
2. Reduce the amount of data fetched per sync
3. Increase the timeout
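
Transient timeouts can often be absorbed by retrying with exponential backoff. The sketch below shows the idea in general form. `with_retries` is a hypothetical helper, and the set of exceptions treated as retryable is an assumption that should be adapted to the errors you actually observe.

```python
import socket
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Errors assumed to be transient for the purpose of this sketch
RETRYABLE = (TimeoutError, ConnectionError, socket.timeout)


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on transient errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except RETRYABLE:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")


calls = {"n": 0}


def flaky() -> str:
    """Fail twice with a timeout, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("Read timed out")
    return "ok"


print(with_retries(flaky, sleep=lambda seconds: None))  # ok
```

In the sync script the wrapped call would be a request's `execute()` call. The Google API client's `execute()` also accepts a `num_retries` argument, which may be enough on its own for simple cases.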

#### Problem 3: File Name Conflicts
```
Error: File already exists
```
**Solution**:
1. Check for duplicate prompt titles
2. Adjust the file naming rule
3. Add a timestamp or UUID
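
One way to avoid a name collision is to pick a free name before writing. The sketch below appends a numeric suffix rather than the timestamp or UUID suggested above, so that names stay short and predictable; a UUID fragment could be substituted in the same place. `unique_path` is a hypothetical helper.

```python
import os
import tempfile


def unique_path(folder: str, filename: str) -> str:
    """Return a path in folder that does not exist yet, adding -2, -3, ... if needed."""
    stem, ext = os.path.splitext(filename)
    candidate = os.path.join(folder, filename)
    counter = 2
    while os.path.exists(candidate):
        candidate = os.path.join(folder, f"{stem}-{counter}{ext}")
        counter += 1
    return candidate


folder = tempfile.mkdtemp()
first = unique_path(folder, "(1,1)_Title.md")
open(first, "w", encoding="utf-8").close()  # create the file so the name is taken
second = unique_path(folder, "(1,1)_Title.md")
print(os.path.basename(first))   # (1,1)_Title.md
print(os.path.basename(second))  # (1,1)_Title-2.md
```

Note that a suffix changes the file name between runs if the colliding file appears later, so links in the generated indexes would need to use the returned name.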

### 9.2 Debugging Tips

#### Enabling Verbose Logging
```python
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('sync.log'),
        logging.StreamHandler()
    ]
)
```

#### Testing Individual Functions
```python
# test_sync.py
# Assumes this file sits next to sync_sheets.py (or that scripts/ is on sys.path)
from sync_sheets import PromptLibrarySyncer


def test_authentication():
    """Test API authentication."""
    syncer = PromptLibrarySyncer()
    assert syncer.service is not None
    print("✅ Authentication succeeded")


def test_fetch_data():
    """Test data fetching."""
    syncer = PromptLibrarySyncer()
    data = syncer.fetch_all_sheets()
    assert len(data) > 0
    print(f"✅ Fetched {len(data)} worksheets")


if __name__ == "__main__":
    test_authentication()
    test_fetch_data()
```

### 9.3 Monitoring and Alerts

#### GitHub Actions Notifications
```yaml
- name: Send notification
  if: failure()
  uses: 8398a7/action-slack@v3
  with:
    status: ${{ job.status }}
    text: 'Sync failed! Please check the logs'
    webhook_url: ${{ secrets.SLACK_WEBHOOK }}
```

---

## 10. Best Practices

### 10.1 Coding Conventions

#### Python Code Style
- Follow PEP 8
- Use type hints
- Write docstrings
- Cover the code with unit tests

#### Git Commit Convention
```
type: short description

Detailed explanation (optional)

Related issue: #123
```

Type tags:
- 🔄 sync: sync-related changes
- ✨ feat: new feature
- 🐛 fix: bug fix
- 📝 docs: documentation update
- ♻️ refactor: refactoring
- 🎨 style: formatting changes

### 10.2 Security Recommendations

1. **Credential management**
   - Never commit credentials to the repository
   - Use environment variables or a secrets management service
   - Rotate keys regularly

2. **Access control**
   - Give the service account only the minimum permissions it needs
   - Review access permissions regularly
   - Use read-only access

3. **Data protection**
   - Mask sensitive information
   - Back up important data
   - Encrypt data in transit

### 10.3 Performance Recommendations

1. **Caching**
   - Cache data that rarely changes
   - Use incremental sync
   - Avoid repeated requests

2. **Resource usage**
   - Limit the number of concurrent requests
   - Process data in batches
   - Use asynchronous operations

3. **Metrics to monitor**
   - Sync duration
   - Number of API calls
   - Error rate

---

## Appendix

### A. Quick Command Reference

```bash
# Install
pip install -r scripts/requirements.txt

# Sync
python scripts/sync_sheets.py

# Search
python scripts/search.py "keyword"

# Generate a report (script not included in this document)
python scripts/generate_report.py

# Clean up (script not included in this document)
python scripts/cleanup.py

# Test
pytest tests/

# Deploy
./deploy.sh
```

### B. Related Links

- [Google Sheets API documentation](https://developers.google.com/sheets/api)
- [GitHub Actions documentation](https://docs.github.com/actions)
- [Python best practices](https://docs.python-guide.org/)
- [Markdown syntax](https://www.markdownguide.org/)

### C. Version History

- v1.0.0 (2024-01-01): Initial release
- v1.1.0 (2024-02-01): Added incremental sync
- v1.2.0 (2024-03-01): Added parallel processing
- v2.0.0 (2024-04-01): Architecture refactoring

---

## Contact and Support

- 📧 Email: your-email@example.com
- 🐛 Issues: https://github.com/username/repo/issues
- 💬 Discussions: https://github.com/username/repo/discussions

---

*This document was last updated on 2025-02-02*

---

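## Additional Information: Results of Processing the Excel Data

### Excel File Processing Overview
File processed: `prompt (2).xlsx`
- The file contains 18 rows and 3 columns of data
- Its main content is a description of the prompt iteration framework, tool links, and cryptocurrency wallet addresses

### The Prompt Iteration Framework in Detail

The full description, taken from the first row of the Excel file:

**Each worksheet tab at the bottom represents one category of prompts. The horizontal axis shows the iteration versions of a prompt (e.g. prompt 1a, prompt 1b, prompt 1c), reflecting how each prompt evolves across stages. The vertical axis lists the different prompts (e.g. prompt 1, prompt 2, ..., prompt y). Each row shows the content of the same prompt across its versions, which makes it easy to compare how each prompt changes from one iteration to the next.**

#### Example Version Iteration Structure
```
Prompt matrix (based on the Excel data):
┌─────────────┬──────────────┬──────────────┬──────────────┐
│ Prompt      │ Version a    │ Version b    │ Version c    │
├─────────────┼──────────────┼──────────────┼──────────────┤
│ Prompt 1    │ Prompt 1a    │ Prompt 1b    │ Prompt 1c    │
│ Prompt 2    │ Prompt 2a    │ Prompt 2b    │ -            │
│ ...         │ ...          │ ...          │ ...          │
│ Prompt y    │ Prompt ya    │ -            │ -            │
└─────────────┴──────────────┴──────────────┴──────────────┘
```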
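
The sync script numbers versions by column (1, 2, 3), while the matrix above labels them a, b, c. A small conversion between the two is sketched below. `version_label` is a hypothetical helper, and continuing with `aa`, `ab` after `z` is an assumption, since the source only shows labels a to c.

```python
import string


def version_label(row: int, col: int) -> str:
    """Return a label such as '1a' for prompt 1, first version."""
    if row < 1 or col < 1:
        raise ValueError("row and col are counted from 1")
    # Bijective base-26: 1 -> a, 26 -> z, 27 -> aa (like spreadsheet column names)
    letters = ""
    n = col
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = string.ascii_lowercase[rem] + letters
    return f"{row}{letters}"


print(version_label(1, 1))   # 1a
print(version_label(2, 3))   # 2c
print(version_label(1, 27))  # 1aa
```

The label could be used in index tables or file names in place of the numeric column, provided the rest of the tooling is adjusted to match.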
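
### Related Tools and Resources

#### AI Optimization Tool Links
- **OpenAI prompt optimization platform**:
  - URL: https://platform.openai.com/chat/edit?models=gpt-5&optimize=true
  - Description: OpenAI's prompt optimization site
  - Purpose: optimizing and improving prompts; supports the GPT-5 model

#### Social Media
- **Twitter/X account**:
  - URL: https://x.com/123olp
  - Description: follow the author's Twitter for the latest updates; the profile page accepts ad placements

### Cryptocurrency Donation Addresses

The wallet addresses below, one per blockchain network, are provided for supporting the project (a polite request for tips):

| Network | Wallet address | Notes |
|---------|----------------|-------|
| **TRON (TRX)** | `TQtBXCSTwLFHjBqTS4rNUp7ufiGx51BRey` | TRON network |
| **Solana (SOL)** | `HjYhozVf9AQmfv7yv79xSNs6uaEU5oUk2USasYQfUYau` | Solana ecosystem |
| **Ethereum (ETH)** | `0xa396923a71ee7D9480b346a17dDeEb2c0C287BBC` | Ethereum mainnet |
| **BSC (BNB)** | `0xa396923a71ee7D9480b346a17dDeEb2c0C287BBC` | BNB Smart Chain |
| **Bitcoin (BTC)** | `bc1plslluj3zq3snpnnczplu7ywf37h89dyudqua04pz4txwh8z5z5vsre7nlm` | Bitcoin network |
| **SUI** | `0xb720c98a48c77f2d49d375932b2867e793029e6337f1562522640e4f84203d2e` | SUI blockchain |

⚠️ **Important**: ad-placement content may carry risk; evaluate it carefully

### Data Processing Statistics

- **Excel file information**:
  - File name: prompt (2).xlsx
  - Dimensions: 18 rows × 3 columns
  - Processed on: 2025-02-02
  - Valid data rows: 15 (excluding header and empty rows)

- **Core content extracted**:
  - Prompt iteration framework description: 1
  - Tool links: 1 (OpenAI optimization platform)
  - Social media links: 1 (Twitter)
  - Cryptocurrency wallets: 6 network addresses

### Integration Recommendations

Based on the Excel data, the prompt library management system should:

1. **Framework alignment**: support the horizontal and vertical iteration layout described in the Excel file
2. **Version management**: implement the a/b/c version naming convention
3. **Tool integration**: consider integrating the OpenAI optimization API
4. **Community building**: use social media for promotion and for collecting user feedback
5. **Support mechanism**: establish sustainable channels for supporting the project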