mirror of
https://devops.liangqichi.top/qichi.liang/Orbitin.git
synced 2026-02-10 07:41:29 +08:00
refactor: complete code review and architecture optimization

Key improvements:
1. Modular architecture refactor: created the Confluence module directory structure, unified the Feishu module architecture, refactored the database module
2. Code quality: introduced unified configuration management, implemented unified logging configuration, improved type hints and exception handling
3. Feature changes: removed the parse-test function, dropped the DEBUG_MODE setting, updated the command-line options
4. Documentation: updated the README.md project structure, added a development guide and a troubleshooting section, improved the configuration notes
5. Verification: all core functionality tests pass, module imports verified, architecture integrity verified
.gitignore (vendored, 1 change)
@@ -11,6 +11,7 @@ data/daily_logs.db

# Cache
*.pyc
*.pyo
docs/

# Debug output
debug/

README.md (137 changes)
@@ -11,6 +11,7 @@

- Manual entry of unaccounted data
- Merging of second-berthing records
- GUI graphical interface (optional)
- Feishu schedule integration (automatically fetches shift personnel)

## Project Structure

@@ -25,15 +26,32 @@ OrbitIn/

├── debug/                      # debug output directory
│   └── layout_output_*.txt     # timestamped debug files
├── data/                       # data directory
│   └── daily_logs.db           # SQLite3 database
│   ├── daily_logs.db           # SQLite3 database
│   └── schedule_cache.json     # schedule data cache
├── logs/                       # log directory
│   └── app.log                 # application log
└── src/                        # source modules
    ├── __init__.py
    ├── confluence.py           # Confluence API client
    ├── extractor.py            # HTML text extractor
    ├── parser.py               # log parser
    ├── database.py             # database operations
    ├── config.py               # unified configuration management
    ├── logging_config.py       # unified logging configuration
    ├── report.py               # report generator
    └── gui.py                  # GUI graphical interface
    ├── gui.py                  # GUI graphical interface
    ├── database/               # database module
    │   ├── base.py             # database base class
    │   ├── daily_logs.py       # daily logs database
    │   └── schedules.py        # schedules database
    ├── confluence/             # Confluence API module
    │   ├── client.py           # Confluence API client
    │   ├── parser.py           # HTML content parser
    │   ├── text.py             # HTML text extractor
    │   ├── log_parser.py       # log parser
    │   ├── manager.py          # content manager
    │   └── __init__.py         # module exports
    └── feishu/                 # Feishu API module
        ├── client.py           # Feishu API client
        ├── parser.py           # schedule data parser
        ├── manager.py          # Feishu schedule manager
        └── __init__.py         # module exports
```

## Quick Start

@@ -44,15 +62,31 @@ OrbitIn/

pip install requests beautifulsoup4 python-dotenv
```

### Configure Confluence
### Configuration

Configure in the `.env` file:

```bash
# .env
# Confluence settings
CONFLUENCE_BASE_URL=https://your-confluence.atlassian.net/rest/api
CONFLUENCE_TOKEN=your-api-token
CONFLUENCE_CONTENT_ID=155764524

# Feishu sheet settings (used to fetch shift personnel)
FEISHU_BASE_URL=https://open.feishu.cn/open-apis/sheets/v3
FEISHU_TOKEN=your-feishu-api-token
FEISHU_SPREADSHEET_TOKEN=EgNPssi2ghZ7BLtGiTxcIBUmnVh

# Database settings
DATABASE_PATH=data/daily_logs.db

# Business settings
DAILY_TARGET_TEU=300       # daily target TEU, used to compute the completion rate
DUTY_PHONE=13107662315     # duty phone number, shown in the daily report
SEPARATOR_CHAR=─           # separator character for formatted output
SEPARATOR_LENGTH=50        # separator length
SCHEDULE_REFRESH_DAYS=30   # schedule refresh interval (days)
```

Create the `.env` file by following `.env.example`.

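These keys are typically loaded with python-dotenv (installed above). As a self-contained illustration, a minimal stand-in loader that applies the documented defaults might look like this (the helper name is invented; the real project uses `load_dotenv`):

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments ignored."""
    if not os.path.exists(path):
        return
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip an inline comment such as "300  # daily target"
            value = value.split("#", 1)[0].strip()
            os.environ.setdefault(key.strip(), value)

load_env_file()

# Business settings fall back to the defaults documented above
DAILY_TARGET_TEU = int(os.getenv("DAILY_TARGET_TEU", "300"))
SEPARATOR_LENGTH = int(os.getenv("SEPARATOR_LENGTH", "50"))
```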
@@ -63,7 +97,7 @@ CONFLUENCE_CONTENT_ID=155764524

```bash
# Default: fetch, extract, parse, and save to the database
python3 main.py
python3 main.py fetch-save

# Fetch HTML and extract text only (saved to the debug directory)
python3 main.py fetch

@@ -74,11 +108,11 @@ python3 main.py fetch-debug

# Generate a daily report (specific date)
python3 main.py report 2025-12-28

# Generate yesterday's daily report
# Generate today's daily report
python3 main.py report-today

# Parse test (uses an existing layout_output.txt)
python3 main.py parse-test
# Configuration test (verifies all connections)
python3 main.py config-test

# Add unaccounted data
python3 main.py --unaccounted 118 --month 2025-12

@@ -97,10 +131,11 @@ GUI features:

- Fetch and process data
- Fetch (debug mode)
- Generate daily report
- Yesterday's report (automatically fetches the previous day's data)
- Today's report (automatically fetches the previous day's data)
- Add unaccounted data
- Database statistics (shows each vessel's workload for the current month)
- Report content can be copied
- Automatic schedule refresh

## Data Format

@@ -160,13 +195,89 @@ GUI features:

24-hour duty phone: 13107662315
```

## Core Modules

### Confluence module (`src/confluence/`)
- **`client.py`** - Confluence API client; handles HTTP requests and connection management
- **`text.py`** - HTML text extractor; preserves the layout structure
- **`log_parser.py`** - log parser; parses per-vessel operation data
- **`parser.py`** - HTML content parser; extracts links, images, and tables
- **`manager.py`** - content manager; provides higher-level content management

### Feishu module (`src/feishu/`)
- **`client.py`** - Feishu API client
- **`parser.py`** - schedule data parser
- **`manager.py`** - Feishu schedule manager; caches and refreshes schedule data

### Database module (`src/database/`)
- **`base.py`** - database base class; provides unified connection management
- **`daily_logs.py`** - daily handover log database
- **`schedules.py`** - schedules database

## Tech Stack

- Python 3.7+
- SQLite3
- Requests (HTTP client)
- HTMLParser (standard library)
- BeautifulSoup4 (HTML parsing)
- tkinter (GUI, optional)
- Type hints (Python 3.5+)

## Architecture Highlights

1. **Modular design** - each module has a single responsibility, which eases testing and maintenance
2. **Unified configuration** - all environment variables and business settings are managed in one place
3. **Unified logging** - standardized logging configuration with file rotation
4. **Exception handling** - detailed error handling and logging
5. **Type safety** - comprehensive Python type hints

## Development Guide

### Adding Features

1. **Configuration**: define all settings in `src/config.py`
2. **Logging**: obtain a logger with `from src.logging_config import get_logger`
3. **Exceptions**: create a custom exception class for each module
4. **Type hints**: all functions and methods should carry complete type hints
5. **Database access**: use the base class in `src/database/base.py` to ensure proper connection management
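As an illustration only (the module, class, and function names below are invented, and `get_logger` is approximated with the standard library so the sketch runs standalone), a new module following guidelines 1-4 might be skeletoned as:

```python
import logging
from typing import List, Optional

# In the real project this would be: from src.logging_config import get_logger
get_logger = logging.getLogger  # stand-in so this sketch is self-contained
logger = get_logger(__name__)


class ExampleModuleError(Exception):
    """Custom exception class for this module (guideline 3)."""


def summarize_teu(values: List[int], target: Optional[int] = None) -> float:
    """Fully type-hinted helper (guideline 4) with explicit error handling."""
    if not values:
        raise ExampleModuleError("no TEU values to summarize")
    total = sum(values)
    logger.info("summarized %d TEU entries, total=%d", len(values), total)
    # With a target, return the completion ratio; otherwise the raw total
    return total / target if target else float(total)
```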

### Testing

```bash
# Run the configuration test
python3 main.py config-test

# Test specific features
python3 main.py fetch
python3 main.py report-today
```

### Debugging

1. **Logs**: see `logs/app.log` for detailed runtime information
2. **Debug files**: use `python3 main.py fetch-debug` to generate timestamped debug files

### Code Style

- Follow PEP 8
- Format code with Black (optional)
- Sort imports with isort
- All public APIs should have docstrings

## Troubleshooting

### Common Issues

1. **Connection failures**: check the API tokens and URLs in the `.env` file
2. **Database errors**: make sure the `data/` directory exists and is writable
3. **Parse errors**: check whether the Confluence page structure has changed
4. **Feishu fetch failures**: verify the spreadsheet permissions and token validity

### Log Levels

- Default log level: INFO
- Debug log level: DEBUG (set the environment variable `LOG_LEVEL=DEBUG`)
- Log file: `logs/app.log`, rotated automatically

## License

@@ -1,179 +0,0 @@

# Feishu Data Fetch Flow

## Overall Flow

```mermaid
flowchart TD
    A[Start: generate daily report] --> B[Call get_shift_personnel]
    B --> C[Create FeishuScheduleManager]
    C --> D[Call get_schedule_for_date]

    D --> E[Parse date: 2025-12-30 to 12/30]
    E --> F[Check cache: data/schedule_cache.json]

    F --> G{Cache hit?}
    G -->|yes| H[Return cached data]
    G -->|no| I[Fetch data via the API]

    I --> J[Call get_sheets_info]
    J --> K[GET /spreadsheets/{token}/sheets/query]
    K --> L[Sheet list returned: Aug, Sep, Oct, Nov, Dec, 2026...]

    L --> M[Pick sheet by month: Dec maps to sheet_id='zcYLIk']
    M --> N[Call get_sheet_data]
    N --> O[GET /spreadsheets/{token}/values/zcYLIk!A:AF]
    O --> P[Sheet data returned: name, Dec 1, Dec 2...]

    P --> Q[Call ScheduleDataParser.parse]
    Q --> R[Parse date column: find the column index for Dec 30]
    R --> S[Filter shift personnel: day shift='白', night shift='夜']

    S --> T[Return day-shift and night-shift personnel lists]
    T --> U[Save to cache]
    U --> V[Return to the report module]
    V --> W[Fill into the daily report]
```

## API Call Details

### 1. Fetch the sheet list

**Request**:
```
GET https://open.feishu.cn/open-apis/sheets/v3/spreadsheets/EgNPssi2ghZ7BLtGiTxcIBUmnVh/sheets/query
Authorization: Bearer u-dbctiP9qx1wF.wfoMV2ZHGkh1DNl14oriM8aZMI0026k
```

**Response**:
```json
{
  "code": 0,
  "data": {
    "sheets": [
      {"sheet_id": "904236", "title": "8月"},
      {"sheet_id": "ATgwLm", "title": "9月"},
      {"sheet_id": "2ml4B0", "title": "10月"},
      {"sheet_id": "y5xv1D", "title": "11月"},
      {"sheet_id": "zcYLIk", "title": "12月"},
      {"sheet_id": "R35cIj", "title": "2026年排班表"},
      {"sheet_id": "wMXHQg", "title": "12月(副本)"}
    ]
  }
}
```

### 2. Fetch sheet data

**Request**:
```
GET https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/EgNPssi2ghZ7BLtGiTxcIBUmnVh/values/zcYLIk!A:AF
Authorization: Bearer u-dbctiP9qx1wF.wfoMV2ZHGkh1DNl14oriM8aZMI0026k
params: {
  valueRenderOption: "ToString",
  dateTimeRenderOption: "FormattedString"
}
```

**Response**:
```json
{
  "code": 0,
  "data": {
    "valueRange": {
      "range": "zcYLIk!A1:AF11",
      "values": [
        ["姓名", "12月1日", "12月2日", "12月3日", "12月4日", ...],
        ["张勤", "白", "白", "白", "白", ...],
        ["刘炜彬", "白", null, "夜", "夜", ...],
        ["杨俊豪", "白", "白", "白", "白", ...],
        ["梁启迟", "夜", "夜", "夜", "夜", ...],
        ...
      ]
    }
  }
}
```

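The two calls above differ only in their paths; a sketch of how the request URLs might be assembled (the function names are hypothetical, the paths are taken verbatim from the requests shown):

```python
def sheets_query_url(base: str, spreadsheet_token: str) -> str:
    """URL for listing worksheets (the v3 endpoint shown above)."""
    return f"{base.rstrip('/')}/spreadsheets/{spreadsheet_token}/sheets/query"

def sheet_values_url(base: str, spreadsheet_token: str,
                     sheet_id: str, cell_range: str) -> str:
    """URL for reading cell values (the v2 endpoint shown above)."""
    return (f"{base.rstrip('/')}/spreadsheets/{spreadsheet_token}"
            f"/values/{sheet_id}!{cell_range}")

base_v3 = "https://open.feishu.cn/open-apis/sheets/v3"
print(sheets_query_url(base_v3, "EgNPssi2ghZ7BLtGiTxcIBUmnVh"))
```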

## Data Parsing Flow

### 1. Find the date column index

```python
# Find the position of "12月30日" in the header row
headers = ["姓名", "12月1日", "12月2日", ..., "12月30日", ...]
target = "12/30"  # converted from "2025-12-30"

# Scan the header row for the matching date
for i, header in enumerate(headers):
    if header == "12月30日":
        column_index = i
        break
# Result: column_index = 31 (the 32nd column)
```

### 2. Filter shift personnel

```python
# Iterate over all personnel rows
for row in values[1:]:  # skip the header row
    name = row[0]    # name
    shift = row[31]  # the shift on Dec 30

    if shift == "白":
        day_shift_list.append(name)
    elif shift == "夜":
        night_shift_list.append(name)

# Result
# day_shift_list = ["张勤", "杨俊豪", "冯栋", "汪钦良"]
# night_shift_list = ["刘炜彬", "梁启迟"]
```

### 3. Generate the report output

```python
day_shift_str = "、".join(day_shift_list)      # "张勤、杨俊豪、冯栋、汪钦良"
night_shift_str = "、".join(night_shift_list)  # "刘炜彬、梁启迟"

# Format used in the daily report
lines.append(f"12/31 白班人员:{day_shift_str}")
lines.append(f"12/31 夜班人员:{night_shift_str}")
```

## Cache Mechanism

```mermaid
flowchart LR
    A[First request] --> B[Call the API]
    B --> C[Save cache: data/schedule_cache.json]
    C --> D{"Requested again within 1 hour?"}
    D -->|yes| E[Read from the cache]
    D -->|no| F[Call the API again]
```

Cache file format:
```json
{
  "last_update": "2025-12-30T15:00:00",
  "data": {
    "2025-12-12/30": {
      "day_shift": "张勤、杨俊豪、冯栋、汪钦良",
      "night_shift": "刘炜彬、梁启迟",
      "day_shift_list": ["张勤", "杨俊豪", "冯栋", "汪钦良"],
      "night_shift_list": ["刘炜彬", "梁启迟"]
    }
  }
}
```

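Combining the `last_update` field with the 1-hour TTL from the diagram, a validity check over this file could be sketched as follows (the function name is hypothetical):

```python
import json
from datetime import datetime, timedelta

CACHE_TTL = timedelta(hours=1)

def load_cached_schedule(cache_json: str, key: str):
    """Return the cached entry for `key`, or None when missing or stale."""
    cache = json.loads(cache_json)
    last_update = datetime.fromisoformat(cache["last_update"])
    if datetime.now() - last_update > CACHE_TTL:
        return None  # stale: the caller should re-fetch from the API
    return cache["data"].get(key)
```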
## Key Code Locations

| Feature | File | Line |
|------|------|------|
| Feishu API client | [`src/feishu.py`](src/feishu.py:10) | 10 |
| Fetch sheet list | [`src/feishu.py`](src/feishu.py:28) | 28 |
| Fetch sheet data | [`src/feishu.py`](src/feishu.py:42) | 42 |
| Data parser | [`src/feishu.py`](src/feishu.py:58) | 58 |
| Cache management | [`src/feishu.py`](src/feishu.py:150) | 150 |
| Main manager | [`src/feishu.py`](src/feishu.py:190) | 190 |
| Report integration | [`src/report.py`](src/report.py:98) | 98 |

main.py (329 changes)
@@ -2,130 +2,210 @@
"""
Terminal operations log management tool
Fetches handover logs from Confluence and saves them to the database
Updated dependencies; uses the new module structure
"""
import argparse
import sys
import os
from datetime import datetime
from typing import Optional, List

from src.confluence import ConfluenceClient
from src.extractor import HTMLTextExtractor
from src.parser import HandoverLogParser
from src.database import DailyLogsDatabase
from src.report import DailyReportGenerator
from src.config import config
from src.logging_config import setup_logging, get_logger
from src.confluence import ConfluenceClient, ConfluenceClientError, HTMLTextExtractor, HTMLTextExtractorError, HandoverLogParser, ShipLog, LogParserError
from src.database.daily_logs import DailyLogsDatabase
from src.report import DailyReportGenerator, ReportGeneratorError

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

# Configuration (read from environment variables)
CONF_BASE_URL = os.getenv('CONFLUENCE_BASE_URL')
CONF_TOKEN = os.getenv('CONFLUENCE_TOKEN')
CONF_CONTENT_ID = os.getenv('CONFLUENCE_CONTENT_ID')

# Feishu configuration (optional)
FEISHU_BASE_URL = os.getenv('FEISHU_BASE_URL')
FEISHU_TOKEN = os.getenv('FEISHU_TOKEN')
FEISHU_SPREADSHEET_TOKEN = os.getenv('FEISHU_SPREADSHEET_TOKEN')

DEBUG_DIR = 'debug'
# Initialize logging
logger = get_logger(__name__)


def ensure_debug_dir():
    """Ensure the debug directory exists"""
    if not os.path.exists(DEBUG_DIR):
        os.makedirs(DEBUG_DIR)
    if not os.path.exists(config.DEBUG_DIR):
        os.makedirs(config.DEBUG_DIR)
        logger.info(f"Created debug directory: {config.DEBUG_DIR}")


def get_timestamp():
def get_timestamp() -> str:
    """Timestamp for file names"""
    return datetime.now().strftime('%Y%m%d_%H%M%S')


def fetch_html():
    """Fetch the HTML content"""
    if not CONF_BASE_URL or not CONF_TOKEN or not CONF_CONTENT_ID:
        print('Error: Confluence is not configured; check the .env file')
def fetch_html() -> str:
    """
    Fetch the HTML content

    Returns:
        The HTML string

    Raises:
        SystemExit: configuration error or fetch failure
    """
    # Validate the configuration
    if not config.validate():
        logger.error("Configuration validation failed; check the .env file")
        sys.exit(1)

    print('Fetching HTML content from Confluence...')
    client = ConfluenceClient(CONF_BASE_URL, CONF_TOKEN)
    html = client.get_html(CONF_CONTENT_ID)
    if not html:
        print('Error: no HTML content received')
    try:
        logger.info("Fetching HTML content from Confluence...")
        client = ConfluenceClient()
        html = client.get_html(config.CONFLUENCE_CONTENT_ID)
        logger.info(f"Fetch succeeded, {len(html)} characters")
        return html

    except ConfluenceClientError as e:
        logger.error(f"Failed to fetch HTML: {e}")
        sys.exit(1)
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        sys.exit(1)
    print(f'Fetch succeeded, {len(html)} characters')
    return html


def extract_text(html):
    """Extract the layout text"""
    print('Extracting layout text...')
    extractor = HTMLTextExtractor()
    layout_text = extractor.extract(html)
    print(f'Extraction complete, {len(layout_text)} characters')
    return layout_text
def extract_text(html: str) -> str:
    """
    Extract the layout text

    Args:
        html: the HTML string

    Returns:
        The extracted text
    """
    try:
        logger.info("Extracting layout text...")
        extractor = HTMLTextExtractor()
        layout_text = extractor.extract(html)
        logger.info(f"Extraction complete, {len(layout_text)} characters")
        return layout_text

    except HTMLTextExtractorError as e:
        logger.error(f"Text extraction failed: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        raise


def save_debug_file(content, suffix=''):
    """Save a debug file to the debug directory"""
def save_debug_file(content: str, suffix: str = '') -> str:
    """
    Save a debug file to the debug directory

    Args:
        content: the content to save
        suffix: file name suffix

    Returns:
        The path of the saved file
    """
    ensure_debug_dir()
    filename = f'layout_output{suffix}.txt' if suffix else 'layout_output.txt'
    filepath = os.path.join(DEBUG_DIR, filename)
    with open(filepath, 'w', encoding='utf-8') as f:
        f.write(content)
    print(f'Saved to {filepath}')
    return filepath
    filepath = os.path.join(config.DEBUG_DIR, filename)

    try:
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
        logger.info(f"Saved to {filepath}")
        return filepath

    except Exception as e:
        logger.error(f"Failed to save the debug file: {e}")
        raise


def parse_logs(text):
    """Parse the log data"""
    print('Parsing log data...')
    parser = HandoverLogParser()
    logs = parser.parse(text)
    print(f'Parsed {len(logs)} records')
    return logs
def parse_logs(text: str) -> List[ShipLog]:
    """
    Parse the log data

    Args:
        text: the log text

    Returns:
        A list of ship logs
    """
    try:
        logger.info("Parsing log data...")
        parser = HandoverLogParser()
        logs = parser.parse(text)
        logger.info(f"Parsed {len(logs)} records")
        return logs

    except LogParserError as e:
        logger.error(f"Failed to parse logs: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        raise


def save_to_db(logs):
    """Save to the database"""
def save_to_db(logs: List[ShipLog]) -> int:
    """
    Save to the database

    Args:
        logs: a list of ship logs

    Returns:
        The number of records saved
    """
    if not logs:
        print('No records to save')
        logger.warning("No records to save")
        return 0

    db = DailyLogsDatabase()
    count = db.insert_many([log.to_dict() for log in logs])
    print(f'Saved {count} records to the database')
    try:
        db = DailyLogsDatabase()
        count = db.insert_many([log.to_dict() for log in logs])
        logger.info(f"Saved {count} records to the database")

        stats = db.get_stats()
        print(f'\nDatabase statistics:')
        print(f'  Total records: {stats["total"]}')
        print(f'  Ships: {len(stats["ships"])}')
        print(f'  Date range: {stats["date_range"]["start"]} ~ {stats["date_range"]["end"]}')
        stats = db.get_stats()
        logger.info(f"Database statistics: total={stats['total']}, ships={len(stats['ships'])}, "
                    f"date range={stats['date_range']['start']}~{stats['date_range']['end']}")

        db.close()
        return count
        return count

    except Exception as e:
        logger.error(f"Failed to save to the database: {e}")
        raise


def add_unaccounted(year_month, teu, note=''):
    """Add unaccounted data"""
    db = DailyLogsDatabase()
    result = db.insert_unaccounted(year_month, teu, note)
    if result:
        print(f'Added unaccounted data for {year_month}: {teu} TEU')
    else:
        print('Add failed')
    db.close()
def add_unaccounted(year_month: str, teu: int, note: str = ''):
    """
    Add unaccounted data

    Args:
        year_month: year-month string in the "2025-12" format
        teu: unaccounted TEU count
        note: remark
    """
    try:
        db = DailyLogsDatabase()
        result = db.insert_unaccounted(year_month, teu, note)
        if result:
            logger.info(f"Added unaccounted data for {year_month}: {teu} TEU")
        else:
            logger.error("Add failed")
    except Exception as e:
        logger.error(f"Failed to add unaccounted data: {e}")
        raise


def show_stats(date):
    """Show statistics for the given date"""
    g = DailyReportGenerator()
    g.print_report(date)
    g.close()
def show_stats(date: str):
    """
    Show statistics for the given date

    Args:
        date: date string in the "YYYY-MM-DD" format
    """
    try:
        generator = DailyReportGenerator()
        generator.print_report(date)
    except ReportGeneratorError as e:
        logger.error(f"Failed to generate statistics: {e}")
    except Exception as e:
        logger.error(f"Unexpected error: {e}")


def run_fetch():
def run_fetch() -> str:
    """Run: fetch the HTML and extract the text"""
    html = fetch_html()
    text = extract_text(html)
@@ -140,7 +220,7 @@ def run_fetch_and_save():
    save_to_db(logs)


def run_fetch_save_debug():
def run_fetch_save_debug() -> str:
    """Run: fetch, extract, and save to the debug directory"""
    html = fetch_html()
    text = extract_text(html)
@@ -149,33 +229,37 @@ def run_fetch_save_debug():
    return text


def run_report(date=None):
def run_report(date: Optional[str] = None):
    """Run: generate the daily report"""
    if not date:
        date = datetime.now().strftime('%Y-%m-%d')
    show_stats(date)


def run_parser_test():
    """Run: parse test"""
    ensure_debug_file_path = os.path.join(DEBUG_DIR, 'layout_output.txt')
    if os.path.exists('layout_output.txt'):
        filepath = 'layout_output.txt'
    elif os.path.exists(ensure_debug_file_path):
        filepath = ensure_debug_file_path
    else:
        print('layout_output.txt not found')
        return

    print(f'Using file: {filepath}')
    with open(filepath, 'r', encoding='utf-8') as f:
        text = f.read()

    parser = HandoverLogParser()
    logs = parser.parse(text)
    print(f'Parsed {len(logs)} records')
    for log in logs[:5]:
        print(f'  {log.date} {log.shift} {log.ship_name}: {log.teu}TEU')
def run_config_test():
    """Run: configuration test"""
    logger.info("Configuration test:")
    config.print_summary()

    # Test the Confluence connection
    try:
        client = ConfluenceClient()
        if client.test_connection():
            logger.info("Confluence connection test: success")
        else:
            logger.warning("Confluence connection test: failed")
    except Exception as e:
        logger.error(f"Confluence connection test failed: {e}")

    # Test the database connection
    try:
        db = DailyLogsDatabase()
        stats = db.get_stats()
        logger.info(f"Database connection test: success, total records: {stats['total']}")
    except Exception as e:
        logger.error(f"Database connection test failed: {e}")


# Function dispatch table
@@ -185,7 +269,7 @@ FUNCTIONS = {
    'fetch-debug': run_fetch_save_debug,
    'report': lambda: run_report(),
    'report-today': lambda: run_report(datetime.now().strftime('%Y-%m-%d')),
    'parse-test': run_parser_test,
    'config-test': run_config_test,
    'stats': lambda: show_stats(datetime.now().strftime('%Y-%m-%d')),
}

@@ -201,21 +285,21 @@ def main():
  fetch-debug   fetch, extract, and save a timestamped debug file
  report        generate a daily report (defaults to today)
  report-today  generate today's report
  parse-test    parse test (uses an existing layout_output.txt)
  config-test   configuration test
  stats         show today's statistics

Examples:
  python3 main.py fetch
  python3 main.py fetch-save
  python3 main.py report 2025-12-28
  python3 main.py parse-test
  python3 main.py config-test
'''
    )
    parser.add_argument(
        'function',
        nargs='?',
        default='fetch-save',
        choices=list(FUNCTIONS.keys()),
        choices=['fetch', 'fetch-save', 'fetch-debug', 'report', 'report-today', 'config-test', 'stats'],
        help='function to run (default: fetch-save)'
    )
    parser.add_argument(
@@ -242,15 +326,36 @@ def main():
    # Add unaccounted data
    if args.unaccounted:
        year_month = args.month or datetime.now().strftime('%Y-%m')
        add_unaccounted(year_month, args.unaccounted)
        try:
            add_unaccounted(year_month, args.unaccounted)
        except Exception as e:
            logger.error(f"Failed to add unaccounted data: {e}")
            sys.exit(1)
        return

    # Run the selected function
    if args.function == 'report' and args.date:
        run_report(args.date)
    else:
        FUNCTIONS[args.function]()
    try:
        if args.function == 'report' and args.date:
            run_report(args.date)
        else:
            FUNCTIONS[args.function]()
    except KeyboardInterrupt:
        logger.info("Interrupted by user")
        sys.exit(0)
    except Exception as e:
        logger.error(f"Function execution failed: {e}")
        sys.exit(1)


if __name__ == '__main__':
    # Initialize the logging system
    setup_logging()

    # Print startup info
    logger.info("=" * 50)
    logger.info("Terminal operations log management tool - OrbitIn")
    logger.info(f"Start time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    logger.info("=" * 50)

    # Run the main program
    main()

@@ -2,9 +2,7 @@
"""
OrbitIn - Confluence log scraping and processing toolkit
"""
from .confluence import ConfluenceClient
from .extractor import HTMLTextExtractor
from .parser import HandoverLogParser
from .confluence import ConfluenceClient, HTMLTextExtractor, HandoverLogParser
from .database import DailyLogsDatabase

__version__ = '1.0.0'

src/config.py (107 lines, new file)
@@ -0,0 +1,107 @@
#!/usr/bin/env python3
"""
Unified configuration module
Centralizes all settings to avoid hard-coding
"""
import os
from typing import Optional
from dotenv import load_dotenv

# Load environment variables
load_dotenv()


class Config:
    """Application configuration"""

    # Confluence settings
    CONFLUENCE_BASE_URL = os.getenv('CONFLUENCE_BASE_URL')
    CONFLUENCE_TOKEN = os.getenv('CONFLUENCE_TOKEN')
    CONFLUENCE_CONTENT_ID = os.getenv('CONFLUENCE_CONTENT_ID')

    # Feishu settings
    FEISHU_BASE_URL = os.getenv('FEISHU_BASE_URL', 'https://open.feishu.cn/open-apis/sheets/v3')
    FEISHU_TOKEN = os.getenv('FEISHU_TOKEN')
    FEISHU_SPREADSHEET_TOKEN = os.getenv('FEISHU_SPREADSHEET_TOKEN')

    # Database settings
    DATABASE_PATH = os.getenv('DATABASE_PATH', 'data/daily_logs.db')
    SCHEDULE_DATABASE_PATH = os.getenv('SCHEDULE_DATABASE_PATH', 'data/daily_logs.db')

    # Business settings
    DAILY_TARGET_TEU = int(os.getenv('DAILY_TARGET_TEU', '300'))
    DUTY_PHONE = os.getenv('DUTY_PHONE', '13107662315')

    # Cache settings
    CACHE_TTL = int(os.getenv('CACHE_TTL', '3600'))  # 1 hour
    SCHEDULE_CACHE_FILE = os.getenv('SCHEDULE_CACHE_FILE', 'data/schedule_cache.json')

    # Debug directory
    DEBUG_DIR = os.getenv('DEBUG_DIR', 'debug')

    # Feishu sheet settings
    SHEET_RANGE = os.getenv('SHEET_RANGE', 'A:AF')
    REQUEST_TIMEOUT = int(os.getenv('REQUEST_TIMEOUT', '30'))

    # GUI settings
    GUI_FONT_FAMILY = os.getenv('GUI_FONT_FAMILY', 'SimHei')
    GUI_FONT_SIZE = int(os.getenv('GUI_FONT_SIZE', '10'))
    GUI_WINDOW_SIZE = os.getenv('GUI_WINDOW_SIZE', '900x700')

    # Schedule refresh settings
    SCHEDULE_REFRESH_DAYS = int(os.getenv('SCHEDULE_REFRESH_DAYS', '30'))

    # Special constants
    FIRST_DAY_OF_MONTH_SPECIAL = 1
    SEPARATOR_CHAR = '─'
    SEPARATOR_LENGTH = 50

    @classmethod
    def validate(cls) -> bool:
        """Verify that the required settings are present"""
        errors = []

        # Check the Confluence settings
        if not cls.CONFLUENCE_BASE_URL:
            errors.append("CONFLUENCE_BASE_URL is not set")
        if not cls.CONFLUENCE_TOKEN:
            errors.append("CONFLUENCE_TOKEN is not set")
        if not cls.CONFLUENCE_CONTENT_ID:
            errors.append("CONFLUENCE_CONTENT_ID is not set")

        # Check the Feishu settings (optional but recommended)
        if not cls.FEISHU_TOKEN:
            print("Warning: FEISHU_TOKEN is not set; scheduling features will be unavailable")
        if not cls.FEISHU_SPREADSHEET_TOKEN:
            print("Warning: FEISHU_SPREADSHEET_TOKEN is not set; scheduling features will be unavailable")

        if errors:
            print("Configuration validation failed:")
            for error in errors:
                print(f"  - {error}")
            return False

        return True

    @classmethod
    def print_summary(cls):
        """Print a configuration summary"""
        print("Configuration summary:")
        print(f"  Confluence: {'configured' if cls.CONFLUENCE_BASE_URL else 'not configured'}")
        print(f"  Feishu: {'configured' if cls.FEISHU_TOKEN else 'not configured'}")
        print(f"  Database path: {cls.DATABASE_PATH}")
        print(f"  Daily target TEU: {cls.DAILY_TARGET_TEU}")
        print(f"  Schedule refresh days: {cls.SCHEDULE_REFRESH_DAYS}")


# Global configuration instance
config = Config()


if __name__ == '__main__':
    # Test the configuration
    config.print_summary()
    if config.validate():
        print("Configuration valid")
    else:
        print("Configuration invalid")

@@ -1,68 +0,0 @@
#!/usr/bin/env python3
"""
Confluence API client module
"""
import requests
from typing import Optional


class ConfluenceClient:
    """Confluence REST API client"""

    def __init__(self, base_url: str, token: str):
        """
        Initialize the client

        Args:
            base_url: Confluence API base URL (without /content)
            token: Bearer authentication token
        """
        self.base_url = base_url.rstrip('/')
        self.headers = {
            'Authorization': f'Bearer {token}',
            'Accept': 'application/json'
        }

    def fetch_content(self, content_id: str, expand: str = 'body.storage') -> dict:
        """
        Fetch page content

        Args:
            content_id: page ID
            expand: fields to expand

        Returns:
            The API response data
        """
        url = f'{self.base_url}/content/{content_id}'
        params = {'expand': expand}

        response = requests.get(url, headers=self.headers, params=params, timeout=30)
        response.raise_for_status()
        return response.json()

    def get_html(self, content_id: str) -> str:
        """
        Fetch the page's HTML content

        Args:
            content_id: page ID

        Returns:
            The HTML string
        """
        data = self.fetch_content(content_id)
        return data.get('body', {}).get('storage', {}).get('value', '')


if __name__ == '__main__':
    # Usage example
    import os

    client = ConfluenceClient(
        base_url='https://confluence.westwell-lab.com/rest/api',
        token=os.getenv('CONFLUENCE_TOKEN', '')
    )

    html = client.get_html('155764524')
    print(f'Fetched {len(html)} characters of HTML content')
src/confluence/__init__.py (22 lines, new file)
@@ -0,0 +1,22 @@
"""
Confluence API module
Provides Confluence page content fetching and parsing
"""

from .client import ConfluenceClient, ConfluenceClientError
from .parser import HTMLContentParser
from .manager import ConfluenceContentManager
from .text import HTMLTextExtractor, HTMLTextExtractorError
from .log_parser import HandoverLogParser, ShipLog, LogParserError

__all__ = [
    'ConfluenceClient',
    'ConfluenceClientError',
    'HTMLContentParser',
    'ConfluenceContentManager',
    'HTMLTextExtractor',
    'HTMLTextExtractorError',
    'HandoverLogParser',
    'ShipLog',
    'LogParserError'
]
src/confluence/client.py (212 lines, new file)
@@ -0,0 +1,212 @@
#!/usr/bin/env python3
"""
Confluence API client
Provides Confluence page content fetching
"""
import requests
from typing import Optional, Dict, Any
import logging

from src.config import config
from src.logging_config import get_logger

logger = get_logger(__name__)


class ConfluenceClientError(Exception):
    """Confluence API error"""
    pass


class ConfluenceClient:
    """Confluence REST API client"""

    def __init__(self, base_url: Optional[str] = None, token: Optional[str] = None):
        """
        Initialize the client

        Args:
            base_url: Confluence API base URL (without /content); falls back to the config when None
            token: Bearer authentication token; falls back to the config when None
        """
        self.base_url = (base_url or config.CONFLUENCE_BASE_URL).rstrip('/')
        self.token = token or config.CONFLUENCE_TOKEN

        if not self.base_url or not self.token:
            raise ConfluenceClientError("Incomplete Confluence configuration; check the environment variables")

        self.headers = {
            'Authorization': f'Bearer {self.token}',
            'Accept': 'application/json'
        }

        # Reuse connections via a Session
        self.session = requests.Session()
        self.session.headers.update(self.headers)
        self.session.timeout = config.REQUEST_TIMEOUT

        logger.debug(f"Confluence client initialized, base URL: {self.base_url}")

    def fetch_content(self, content_id: str, expand: str = 'body.storage') -> Dict[str, Any]:
        """
        Fetch page content

        Args:
            content_id: page ID
            expand: fields to expand

        Returns:
            The API response data

        Raises:
            ConfluenceClientError: the API call failed
            requests.exceptions.RequestException: the network request failed
        """
        url = f'{self.base_url}/content/{content_id}'
        params = {'expand': expand}

        try:
            logger.debug(f"Fetching Confluence content: {content_id}")
            response = self.session.get(url, params=params, timeout=config.REQUEST_TIMEOUT)
            response.raise_for_status()

            data = response.json()
            logger.info(f"Successfully fetched Confluence content: {content_id}")
            return data

        except requests.exceptions.HTTPError as e:
            status_code = e.response.status_code if e.response else 'unknown'
            error_msg = f"Confluence API HTTP error: {status_code}, URL: {url}"
            logger.error(error_msg)
            raise ConfluenceClientError(error_msg) from e

        except requests.exceptions.RequestException as e:
            error_msg = f"Confluence API network error: {e}"
            logger.error(error_msg)
            raise ConfluenceClientError(error_msg) from e

        except ValueError as e:
            error_msg = f"Failed to parse the Confluence API response: {e}"
            logger.error(error_msg)
            raise ConfluenceClientError(error_msg) from e

    def get_html(self, content_id: str) -> str:
        """
        Fetch the page's HTML content

        Args:
            content_id: page ID

        Returns:
            The HTML string

        Raises:
            ConfluenceClientError: the API call failed or the HTML content is empty
        """
        try:
            data = self.fetch_content(content_id)
            html = data.get('body', {}).get('storage', {}).get('value', '')

            if not html:
                error_msg = f"Confluence page HTML content is empty: {content_id}"
                logger.error(error_msg)
                raise ConfluenceClientError(error_msg)

            logger.info(f"Fetched Confluence HTML content, length: {len(html)} characters")
            return html

        except KeyError as e:
            error_msg = f"Malformed Confluence response, missing field: {e}"
            logger.error(error_msg)
            raise ConfluenceClientError(error_msg) from e

    def test_connection(self, content_id: Optional[str] = None) -> bool:
        """
        Test whether the Confluence connection works

        Args:
            content_id: test page ID; falls back to the config when None

        Returns:
            Whether the connection works
        """
        test_content_id = content_id or config.CONFLUENCE_CONTENT_ID

        try:
            data = self.fetch_content(test_content_id)
            title = data.get('title', 'unknown title')
            logger.info(f"Confluence connection test succeeded, page: {title}")
            return True

        except ConfluenceClientError as e:
            logger.error(f"Confluence connection test failed: {e}")
            return False

        except Exception as e:
            logger.error(f"Confluence connection test exception: {e}")
            return False

    def get_page_info(self, content_id: str) -> Dict[str, Any]:
        """
        Fetch basic page info

        Args:
            content_id: page ID

        Returns:
            A page info dict
        """
        try:
            data = self.fetch_content(content_id)
|
||||
return {
|
||||
'id': data.get('id'),
|
||||
'title': data.get('title'),
|
||||
'version': data.get('version', {}).get('number'),
|
||||
'created': data.get('history', {}).get('createdDate'),
|
||||
'last_updated': data.get('version', {}).get('when'),
|
||||
'space': data.get('space', {}).get('key'),
|
||||
'url': f"{self.base_url.replace('/rest/api', '')}/pages/{content_id}"
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
error_msg = f"获取页面信息失败: {e}"
|
||||
logger.error(error_msg)
|
||||
raise ConfluenceClientError(error_msg) from e
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
# 测试代码
|
||||
import sys
|
||||
|
||||
# 设置日志
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
|
||||
try:
|
||||
# 测试连接
|
||||
client = ConfluenceClient()
|
||||
|
||||
if client.test_connection():
|
||||
print("Confluence连接测试成功")
|
||||
|
||||
# 获取HTML内容
|
||||
content_id = config.CONFLUENCE_CONTENT_ID
|
||||
if content_id:
|
||||
html = client.get_html(content_id)
|
||||
print(f"获取到HTML内容,长度: {len(html)} 字符")
|
||||
|
||||
# 获取页面信息
|
||||
page_info = client.get_page_info(content_id)
|
||||
print(f"页面标题: {page_info.get('title')}")
|
||||
print(f"页面URL: {page_info.get('url')}")
|
||||
else:
|
||||
print("未配置CONFLUENCE_CONTENT_ID,跳过HTML获取")
|
||||
else:
|
||||
print("Confluence连接测试失败")
|
||||
sys.exit(1)
|
||||
|
||||
except ConfluenceClientError as e:
|
||||
print(f"Confluence客户端错误: {e}")
|
||||
sys.exit(1)
|
||||
except Exception as e:
|
||||
print(f"未知错误: {e}")
|
||||
sys.exit(1)
|
||||
350
src/confluence/log_parser.py
Normal file
@@ -0,0 +1,350 @@
#!/usr/bin/env python3
"""
Log parsing module
with full type hints and exception handling
"""
import re
from typing import List, Dict, Optional, Tuple, Any
from dataclasses import dataclass, asdict
import logging

from src.logging_config import get_logger

logger = get_logger(__name__)


@dataclass
class ShipLog:
    """Per-ship log record"""
    date: str
    shift: str
    ship_name: str
    teu: Optional[int] = None
    efficiency: Optional[float] = None
    vehicles: Optional[int] = None

    def to_dict(self) -> Dict[str, Any]:
        """Convert to a dict"""
        return asdict(self)


class LogParserError(Exception):
    """Log parsing error"""
    pass


class HandoverLogParser:
    """Shift-handover log parser"""

    SEPARATOR = '———————————————————————————————————————————————'

    def __init__(self):
        """Initialize the parser"""
        pass

    @staticmethod
    def parse_date(date_str: str) -> str:
        """
        Parse a date string.

        Args:
            date_str: date string in the form "2025.12.30"

        Returns:
            Normalized date string "2025-12-30"

        Raises:
            ValueError: invalid date format
        """
        if not date_str:
            return date_str

        try:
            parts = date_str.split('.')
            if len(parts) == 3:
                # Verify every part is numeric
                year, month, day = parts
                if not (year.isdigit() and month.isdigit() and day.isdigit()):
                    raise ValueError(f"Date contains non-numeric characters: {date_str}")

                # Normalize to YYYY-MM-DD
                return f"{year}-{month.zfill(2)}-{day.zfill(2)}"

            # Not dot-separated; try other formats
            if '-' in date_str:
                # Already in the standard format
                return date_str

            logger.warning(f"Unrecognized date format: {date_str}")
            return date_str

        except Exception as e:
            logger.warning(f"Failed to parse date: {date_str}, error: {e}")
            return date_str

    def parse(self, text: str) -> List[ShipLog]:
        """
        Parse the log text.

        Args:
            text: log text

        Returns:
            List of ship logs (records with the same date, shift, and ship name are merged)

        Raises:
            LogParserError: parsing failed
            ValueError: invalid input
        """
        if not text:
            logger.warning("Log text is empty")
            return []

        if not isinstance(text, str):
            error_msg = f"Log text must be a string, got: {type(text)}"
            logger.error(error_msg)
            raise ValueError(error_msg)

        try:
            logs: List[ShipLog] = []

            # Preprocess: drop stand-alone separators (blank lines on both sides)
            # while keeping real content separators (content on either side)
            lines = text.split('\n')
            processed_lines: List[str] = []
            i = 0
            while i < len(lines):
                line = lines[i]
                if line.strip() == self.SEPARATOR:
                    # Stand-alone separator? (blank lines or separators on both sides)
                    prev_empty = i == 0 or not lines[i-1].strip() or lines[i-1].strip() == self.SEPARATOR
                    next_empty = i == len(lines) - 1 or not lines[i+1].strip() or lines[i+1].strip() == self.SEPARATOR
                    if prev_empty and next_empty:
                        # Stand-alone separator, skip it
                        i += 1
                        continue
                processed_lines.append(line)
                i += 1

            processed_text = '\n'.join(processed_lines)
            blocks = processed_text.split(self.SEPARATOR)

            for block in blocks:
                if not block.strip() or '日期:' not in block:
                    continue

                # Parse the date
                date_match = re.search(r'日期:(\d{4}\.\d{2}\.\d{2})', block)
                if not date_match:
                    continue

                date = self.parse_date(date_match.group(1))
                self._parse_block(block, date, logs)

            # Merge records with the same date, shift, and ship name (summing TEU)
            merged: Dict[Tuple[str, str, str], ShipLog] = {}
            for log in logs:
                key = (log.date, log.shift, log.ship_name)
                if key not in merged:
                    merged[key] = ShipLog(
                        date=log.date,
                        shift=log.shift,
                        ship_name=log.ship_name,
                        teu=log.teu,
                        efficiency=log.efficiency,
                        vehicles=log.vehicles
                    )
                else:
                    # Sum TEU
                    if log.teu:
                        if merged[key].teu is None:
                            merged[key].teu = log.teu
                        else:
                            merged[key].teu += log.teu
                    # Sum vehicle counts
                    if log.vehicles:
                        if merged[key].vehicles is None:
                            merged[key].vehicles = log.vehicles
                        else:
                            merged[key].vehicles += log.vehicles

            result = list(merged.values())
            logger.info(f"Log parsing finished, {len(result)} records")
            return result

        except Exception as e:
            error_msg = f"Log parsing failed: {e}"
            logger.error(error_msg)
            raise LogParserError(error_msg) from e

    def _parse_block(self, block: str, date: str, logs: List[ShipLog]) -> None:
        """Parse one date block"""
        try:
            for shift in ['白班', '夜班']:
                shift_pattern = f'{shift}:'
                if shift_pattern not in block:
                    continue

                shift_start = block.find(shift_pattern) + len(shift_pattern)

                # Use only the next shift marker as the boundary; do not stop at "注意事项:"
                next_pos = len(block)
                for next_shift in ['白班', '夜班']:
                    if next_shift != shift:
                        pos = block.find(f'{next_shift}:', shift_start)
                        if pos != -1 and pos < next_pos:
                            next_pos = pos

                shift_content = block[shift_start:next_pos]
                self._parse_ships(shift_content, date, shift, logs)

        except Exception as e:
            logger.warning(f"Failed to parse date block: {date}, error: {e}")

    def _parse_ships(self, content: str, date: str, shift: str, logs: List[ShipLog]) -> None:
        """Parse ship entries"""
        try:
            parts = content.split('实船作业:')

            for part in parts:
                if not part.strip():
                    continue

                cleaned = part.replace('\xa0', ' ').strip()
                # Match the "xxx# ship name" form (berth number and name separated)
                ship_match = re.search(r'(\d+)#\s*(\S+)', cleaned)

                if not ship_match:
                    continue

                # Keep only the bare ship name (drop the xx# prefix and re-berthing annotations)
                ship_name = ship_match.group(2)
                # Strip re-berthing annotations ("二次靠泊" / "再次靠泊")
                ship_name = re.sub(r'(二次靠泊)|(再次靠泊)|\(二次靠泊\)|\(再次靠泊\)', '', ship_name).strip()

                vehicles_match = re.search(r'上场车辆数:(\d+)', cleaned)
                teu_eff_match = re.search(
                    r'作业量/效率:(\d+)TEU[,,\s]*', cleaned
                )

                # Parse TEU
                teu = None
                if teu_eff_match:
                    try:
                        teu = int(teu_eff_match.group(1))
                    except ValueError as e:
                        logger.warning(f"Failed to parse TEU: {teu_eff_match.group(1)}, error: {e}")

                # Parse vehicle count
                vehicles = None
                if vehicles_match:
                    try:
                        vehicles = int(vehicles_match.group(1))
                    except ValueError as e:
                        logger.warning(f"Failed to parse vehicle count: {vehicles_match.group(1)}, error: {e}")

                log = ShipLog(
                    date=date,
                    shift=shift,
                    ship_name=ship_name,
                    teu=teu,
                    efficiency=None,  # the logs currently carry no efficiency data
                    vehicles=vehicles
                )
                logs.append(log)

        except Exception as e:
            logger.warning(f"Failed to parse ships: {date} {shift}, error: {e}")

    def parse_from_file(self, filepath: str) -> List[ShipLog]:
        """
        Parse a log file.

        Args:
            filepath: file path

        Returns:
            List of ship logs

        Raises:
            FileNotFoundError: file does not exist
            LogParserError: parsing failed
        """
        try:
            with open(filepath, 'r', encoding='utf-8') as f:
                text = f.read()

            return self.parse(text)

        except FileNotFoundError:
            logger.error(f"Log file does not exist: {filepath}")
            raise
        except Exception as e:
            error_msg = f"Failed to parse log file: {filepath}, error: {e}"
            logger.error(error_msg)
            raise LogParserError(error_msg) from e


if __name__ == '__main__':
    # Test code
    import sys

    # Set up logging
    logging.basicConfig(level=logging.INFO)

    parser = HandoverLogParser()

    # Test date parsing
    test_dates = ["2025.12.30", "2025.01.01", "无效日期", "2025-12-30"]
    for date in test_dates:
        parsed = parser.parse_date(date)
        print(f"Parsed date '{date}' -> '{parsed}'")

    # Test log parsing (sample text keeps the Chinese field markers the parser matches)
    test_text = """
日期:2025.12.30
———————————————————————————————————————————————
白班:
实船作业:123# 测试船1
上场车辆数:5
作业量/效率:100TEU,
注意事项:无
———————————————————————————————————————————————
夜班:
实船作业:456# 测试船2
上场车辆数:3
作业量/效率:80TEU,
注意事项:无
"""

    try:
        logs = parser.parse(test_text)
        print(f"\nParsed {len(logs)} records")
        for log in logs:
            print(f"  {log.date} {log.shift} {log.ship_name}: {log.teu}TEU, {log.vehicles} vehicles")
    except LogParserError as e:
        print(f"Log parsing failed: {e}")
        sys.exit(1)

    # Test merging
    duplicate_text = """
日期:2025.12.30
———————————————————————————————————————————————
白班:
实船作业:123# 测试船1
上场车辆数:5
作业量/效率:100TEU,
实船作业:123# 测试船1(二次靠泊)
上场车辆数:3
作业量/效率:50TEU,
"""

    try:
        logs = parser.parse(duplicate_text)
        print(f"\nMerge test: parsed {len(logs)} records")
        for log in logs:
            print(f"  {log.date} {log.shift} {log.ship_name}: {log.teu}TEU, {log.vehicles} vehicles")
    except LogParserError as e:
        print(f"Merge test failed: {e}")
        sys.exit(1)
354
src/confluence/manager.py
Normal file
@@ -0,0 +1,354 @@
#!/usr/bin/env python3
"""
Confluence content manager
High-level helpers for managing Confluence content
"""
from typing import Dict, List, Optional, Any
import logging
from datetime import datetime

from src.logging_config import get_logger
from .client import ConfluenceClient, ConfluenceClientError
from .parser import HTMLContentParser

logger = get_logger(__name__)


class ConfluenceContentManager:
    """Confluence content manager"""

    def __init__(self, client: Optional[ConfluenceClient] = None):
        """
        Initialize the content manager.

        Args:
            client: ConfluenceClient instance; a new one is created if None
        """
        self.client = client or ConfluenceClient()
        self.parser = HTMLContentParser()
        logger.debug("Confluence content manager initialized")

    def get_content_with_analysis(self, content_id: str) -> Dict[str, Any]:
        """
        Fetch content and analyze it.

        Args:
            content_id: page ID

        Returns:
            Dict with the content and analysis results
        """
        try:
            logger.info(f"Fetching and analyzing Confluence content: {content_id}")

            # Fetch page info
            page_info = self.client.get_page_info(content_id)

            # Fetch HTML content
            html = self.client.get_html(content_id)

            # Analyze the content
            analysis = self.parser.analyze_content(html)

            # Extract plain text (first 500 characters)
            plain_text = self.parser.extract_plain_text(html)
            preview_text = plain_text[:500] + "..." if len(plain_text) > 500 else plain_text

            result = {
                'page_info': page_info,
                'html_length': len(html),
                'analysis': analysis,
                'preview_text': preview_text,
                'has_content': len(html) > 0,
                'timestamp': datetime.now().isoformat()
            }

            logger.info(f"Content analysis finished: {content_id}")
            return result

        except ConfluenceClientError as e:
            logger.error(f"Failed to fetch content: {e}")
            raise
        except Exception as e:
            error_msg = f"Content analysis failed: {e}"
            logger.error(error_msg)
            raise ValueError(error_msg) from e

    def check_content_health(self, content_id: str) -> Dict[str, Any]:
        """
        Check content health.

        Args:
            content_id: page ID

        Returns:
            Health-check result
        """
        try:
            logger.info(f"Checking content health: {content_id}")

            # Fetch page info
            page_info = self.client.get_page_info(content_id)

            # Fetch HTML content
            html = self.client.get_html(content_id)

            # Analyze the content
            analysis = self.parser.analyze_content(html)

            # Run health checks
            health_checks = {
                'has_content': len(html) > 0,
                'has_text': analysis['plain_text_length'] > 0,
                'has_structure': analysis['has_tables'] or analysis['has_links'] or analysis['has_images'],
                'content_size_ok': 100 <= len(html) <= 1000000,  # 100 bytes to 1 MB
                'text_ratio_ok': analysis['plain_text_length'] / max(len(html), 1) > 0.1,  # at least 10% text
                'word_count_ok': analysis['word_count'] >= 10,  # at least 10 words
                'has_links': analysis['has_links'],
                'has_images': analysis['has_images'],
                'has_tables': analysis['has_tables']
            }

            # Compute a health score
            passed_checks = sum(1 for check in health_checks.values() if check)
            total_checks = len(health_checks)
            health_score = passed_checks / total_checks

            # Build suggestions
            suggestions = []
            if not health_checks['has_content']:
                suggestions.append("Page content is empty")
            if not health_checks['has_text']:
                suggestions.append("Page has no text content")
            if not health_checks['content_size_ok']:
                suggestions.append("Page content size is abnormal")
            if not health_checks['text_ratio_ok']:
                suggestions.append("Text ratio is too low")
            if not health_checks['word_count_ok']:
                suggestions.append("Too few words")

            result = {
                'page_info': page_info,
                'health_score': health_score,
                'health_status': 'healthy' if health_score >= 0.8 else 'warning' if health_score >= 0.5 else 'abnormal',
                'health_checks': health_checks,
                'analysis': analysis,
                'suggestions': suggestions,
                'timestamp': datetime.now().isoformat()
            }

            logger.info(f"Health check finished: {content_id}, score: {health_score:.2f}")
            return result

        except ConfluenceClientError as e:
            logger.error(f"Health check failed: {e}")
            raise
        except Exception as e:
            error_msg = f"Health check failed: {e}"
            logger.error(error_msg)
            raise ValueError(error_msg) from e

    def extract_content_summary(self, content_id: str, max_length: int = 200) -> Dict[str, Any]:
        """
        Extract a content summary.

        Args:
            content_id: page ID
            max_length: maximum summary length

        Returns:
            Content summary
        """
        try:
            logger.info(f"Extracting content summary: {content_id}")

            # Fetch page info
            page_info = self.client.get_page_info(content_id)

            # Fetch HTML content
            html = self.client.get_html(content_id)

            # Extract plain text
            plain_text = self.parser.extract_plain_text(html)

            # Build the summary
            if len(plain_text) <= max_length:
                summary = plain_text
            else:
                # Try to cut at a sentence boundary
                sentences = plain_text.split('. ')
                summary_parts = []
                current_length = 0

                for sentence in sentences:
                    if current_length + len(sentence) + 2 <= max_length:  # +2 for ". "
                        summary_parts.append(sentence)
                        current_length += len(sentence) + 2
                    else:
                        break

                summary = '. '.join(summary_parts)
                if summary and not summary.endswith('.'):
                    summary += '...'

            # Extract key elements
            links = self.parser.extract_links(html)
            images = self.parser.extract_images(html)
            tables = self.parser.extract_tables(html)

            result = {
                'page_info': page_info,
                'summary': summary,
                'summary_length': len(summary),
                'total_length': len(plain_text),
                'key_elements': {
                    'link_count': len(links),
                    'image_count': len(images),
                    'table_count': len(tables)
                },
                'has_rich_content': len(links) > 0 or len(images) > 0 or len(tables) > 0,
                'timestamp': datetime.now().isoformat()
            }

            logger.info(f"Content summary extracted: {content_id}")
            return result

        except ConfluenceClientError as e:
            logger.error(f"Failed to extract summary: {e}")
            raise
        except Exception as e:
            error_msg = f"Failed to extract summary: {e}"
            logger.error(error_msg)
            raise ValueError(error_msg) from e

    def batch_analyze_pages(self, content_ids: List[str]) -> Dict[str, Any]:
        """
        Analyze multiple pages in one batch.

        Args:
            content_ids: list of page IDs

        Returns:
            Batch analysis result
        """
        try:
            logger.info(f"Batch-analyzing {len(content_ids)} pages")

            results = []
            errors = []

            for content_id in content_ids:
                try:
                    result = self.get_content_with_analysis(content_id)
                    results.append(result)
                    logger.debug(f"Page analysis finished: {content_id}")

                except Exception as e:
                    errors.append({
                        'content_id': content_id,
                        'error': str(e)
                    })
                    logger.warning(f"Page analysis failed: {content_id}, error: {e}")

            # Compute statistics (total must count failures, or the success rate is always 1.0)
            if results:
                successful_pages = len(results)
                failed_pages = len(errors)
                total_pages = successful_pages + failed_pages

                total_html_length = sum(r['html_length'] for r in results)
                avg_html_length = total_html_length / successful_pages if successful_pages > 0 else 0

                stats = {
                    'total_pages': total_pages,
                    'successful_pages': successful_pages,
                    'failed_pages': failed_pages,
                    'success_rate': successful_pages / total_pages if total_pages > 0 else 0,
                    'total_html_length': total_html_length,
                    'avg_html_length': avg_html_length,
                    'has_content_pages': sum(1 for r in results if r['has_content']),
                    'timestamp': datetime.now().isoformat()
                }
            else:
                stats = {
                    'total_pages': len(errors),
                    'successful_pages': 0,
                    'failed_pages': len(errors),
                    'success_rate': 0,
                    'total_html_length': 0,
                    'avg_html_length': 0,
                    'has_content_pages': 0,
                    'timestamp': datetime.now().isoformat()
                }

            batch_result = {
                'stats': stats,
                'results': results,
                'errors': errors,
                'timestamp': datetime.now().isoformat()
            }

            logger.info(f"Batch analysis finished: {len(results)} succeeded, {len(errors)} failed")
            return batch_result

        except Exception as e:
            error_msg = f"Batch analysis failed: {e}"
            logger.error(error_msg)
            raise ValueError(error_msg) from e


if __name__ == '__main__':
    # Test code
    import sys

    # Set up logging
    logging.basicConfig(level=logging.INFO)

    try:
        # Create the manager
        manager = ConfluenceContentManager()

        # Test the connection
        from src.config import config
        content_id = config.CONFLUENCE_CONTENT_ID

        if not content_id:
            print("CONFLUENCE_CONTENT_ID not configured, skipping tests")
            sys.exit(0)

        if manager.client.test_connection(content_id):
            print("Confluence connection test succeeded")

            # Test content analysis
            print("\n1. Content analysis:")
            analysis = manager.get_content_with_analysis(content_id)
            print(f"  Page title: {analysis['page_info'].get('title')}")
            print(f"  Content length: {analysis['html_length']} characters")
            print(f"  Text preview: {analysis['preview_text'][:100]}...")

            # Test the health check
            print("\n2. Health check:")
            health = manager.check_content_health(content_id)
            print(f"  Health score: {health['health_score']:.2f}")
            print(f"  Health status: {health['health_status']}")
            print(f"  Suggestions: {health['suggestions']}")

            # Test the content summary
            print("\n3. Content summary:")
            summary = manager.extract_content_summary(content_id)
            print(f"  Summary: {summary['summary']}")
            print(f"  Summary length: {summary['summary_length']} characters")
            print(f"  Total length: {summary['total_length']} characters")

            print("\nAll tests passed")

        else:
            print("Confluence connection test failed")
            sys.exit(1)

    except ConfluenceClientError as e:
        print(f"Confluence client error: {e}")
        sys.exit(1)
    except Exception as e:
        print(f"Unexpected error: {e}")
        sys.exit(1)
244
src/confluence/parser.py
Normal file
@@ -0,0 +1,244 @@
#!/usr/bin/env python3
"""
Confluence HTML content parser
Parsing and formatting helpers for Confluence HTML content
"""
import re
from typing import Any, Dict, List, Optional, Tuple
import logging

from src.logging_config import get_logger

logger = get_logger(__name__)


class HTMLContentParser:
    """Confluence HTML content parser"""

    def __init__(self):
        """Initialize the parser"""
        logger.debug("HTML content parser initialized")

    def extract_plain_text(self, html: str) -> str:
        """
        Extract plain text from HTML (simple version).

        Args:
            html: HTML string

        Returns:
            Plain-text string
        """
        try:
            # Strip HTML tags
            text = re.sub(r'<[^>]+>', ' ', html)
            # Collapse runs of whitespace
            text = re.sub(r'\s+', ' ', text)
            # Decode HTML entities (simple version); decode &amp; last so
            # escaped entities such as &amp;lt; are not double-decoded
            text = text.replace('&nbsp;', ' ').replace('&lt;', '<').replace('&gt;', '>').replace('&amp;', '&')
            # Trim surrounding whitespace
            text = text.strip()

            logger.debug(f"Plain-text extraction finished, length: {len(text)} characters")
            return text

        except Exception as e:
            error_msg = f"Plain-text extraction failed: {e}"
            logger.error(error_msg)
            raise ValueError(error_msg) from e

    def extract_links(self, html: str) -> List[Dict[str, str]]:
        """
        Extract links from HTML.

        Args:
            html: HTML string

        Returns:
            List of links, each with 'text' and 'url'
        """
        links = []
        try:
            # Simple regex-based link matching
            link_pattern = r'<a\s+[^>]*href=["\']([^"\']+)["\'][^>]*>([^<]+)</a>'
            matches = re.findall(link_pattern, html, re.IGNORECASE)

            for url, text in matches:
                links.append({
                    'text': text.strip(),
                    'url': url.strip()
                })

            logger.debug(f"Extracted {len(links)} links")
            return links

        except Exception as e:
            error_msg = f"Link extraction failed: {e}"
            logger.error(error_msg)
            raise ValueError(error_msg) from e

    def extract_images(self, html: str) -> List[Dict[str, str]]:
        """
        Extract images from HTML.

        Args:
            html: HTML string

        Returns:
            List of images, each with 'src' and 'alt'
        """
        images = []
        try:
            # Simple regex-based image matching
            img_pattern = r'<img\s+[^>]*src=["\']([^"\']+)["\'][^>]*alt=["\']([^"\']*)["\'][^>]*>'
            matches = re.findall(img_pattern, html, re.IGNORECASE)

            for src, alt in matches:
                images.append({
                    'src': src.strip(),
                    'alt': alt.strip()
                })

            logger.debug(f"Extracted {len(images)} images")
            return images

        except Exception as e:
            error_msg = f"Image extraction failed: {e}"
            logger.error(error_msg)
            raise ValueError(error_msg) from e

    def extract_tables(self, html: str) -> List[List[List[str]]]:
        """
        Extract table data from HTML.

        Args:
            html: HTML string

        Returns:
            List of tables; each table is a 2-D list
        """
        tables = []
        try:
            # Simple table extraction (plain tables only)
            table_pattern = r'<table[^>]*>(.*?)</table>'
            table_matches = re.findall(table_pattern, html, re.IGNORECASE | re.DOTALL)

            for table_html in table_matches:
                rows = []
                # Extract rows
                row_pattern = r'<tr[^>]*>(.*?)</tr>'
                row_matches = re.findall(row_pattern, table_html, re.IGNORECASE | re.DOTALL)

                for row_html in row_matches:
                    cells = []
                    # Extract cells
                    cell_pattern = r'<t[dh][^>]*>(.*?)</t[dh]>'
                    cell_matches = re.findall(cell_pattern, row_html, re.IGNORECASE | re.DOTALL)

                    for cell_html in cell_matches:
                        # Clean up the cell content
                        cell_text = re.sub(r'<[^>]+>', '', cell_html)
                        cell_text = re.sub(r'\s+', ' ', cell_text).strip()
                        cells.append(cell_text)

                    rows.append(cells)

                if rows:  # only keep non-empty tables
                    tables.append(rows)

            logger.debug(f"Extracted {len(tables)} tables")
            return tables

        except Exception as e:
            error_msg = f"Table extraction failed: {e}"
            logger.error(error_msg)
            raise ValueError(error_msg) from e

    def analyze_content(self, html: str) -> Dict[str, Any]:
        """
        Analyze HTML content.

        Args:
            html: HTML string

        Returns:
            Content analysis result
        """
        try:
            plain_text = self.extract_plain_text(html)
            links = self.extract_links(html)
            images = self.extract_images(html)
            tables = self.extract_tables(html)

            analysis = {
                'total_length': len(html),
                'plain_text_length': len(plain_text),
                'link_count': len(links),
                'image_count': len(images),
                'table_count': len(tables),
                'word_count': len(plain_text.split()),
                'line_count': plain_text.count('\n') + 1,
                'has_tables': len(tables) > 0,
                'has_images': len(images) > 0,
                'has_links': len(links) > 0
            }

            logger.info(f"Content analysis finished: {analysis}")
            return analysis

        except Exception as e:
            error_msg = f"Content analysis failed: {e}"
            logger.error(error_msg)
            raise ValueError(error_msg) from e


if __name__ == '__main__':
    # Test code
    import sys

    # Set up logging
    logging.basicConfig(level=logging.INFO)

    # Test HTML
    test_html = """
    <html>
    <body>
        <h1>Test page</h1>
        <p>This is a test page with a <a href="https://example.com">link</a> and an image.</p>
        <img src="test.jpg" alt="test image">
        <table>
            <tr><th>Header 1</th><th>Header 2</th></tr>
            <tr><td>Data 1</td><td>Data 2</td></tr>
        </table>
    </body>
    </html>
    """

    try:
        parser = HTMLContentParser()

        # Test plain-text extraction
        text = parser.extract_plain_text(test_html)
        print(f"Plain text: {text[:100]}...")

        # Test link extraction
        links = parser.extract_links(test_html)
        print(f"Links: {links}")

        # Test image extraction
        images = parser.extract_images(test_html)
        print(f"Images: {images}")

        # Test table extraction
        tables = parser.extract_tables(test_html)
        print(f"Tables: {tables}")

        # Test content analysis
        analysis = parser.analyze_content(test_html)
        print(f"Content analysis: {analysis}")

        print("All tests passed")

    except Exception as e:
        print(f"Test failed: {e}")
        sys.exit(1)
309
src/confluence/text.py
Normal file
@@ -0,0 +1,309 @@
#!/usr/bin/env python3
"""
HTML text extraction module
with improved exception handling and type hints
"""
import re
from bs4 import BeautifulSoup, Tag, NavigableString
from typing import List, Optional, Any, Union
import logging

from src.config import config
from src.logging_config import get_logger

logger = get_logger(__name__)


class HTMLTextExtractorError(Exception):
    """HTML text extraction error"""
    pass


class HTMLTextExtractor:
    """HTML text extractor that preserves layout structure"""

    # Block-level elements
    BLOCK_TAGS = {
        'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'div', 'section',
        'table', 'tr', 'td', 'th', 'li', 'ul', 'ol', 'blockquote',
        'pre', 'hr', 'br', 'tbody', 'thead', 'tfoot'
    }

    def __init__(self):
        """Initialize the extractor"""
        self.output_lines: List[str] = []

    def extract(self, html: str) -> str:
        """
        Extract layout-preserving text from HTML.

        Args:
            html: HTML string

        Returns:
            Formatted plain text

        Raises:
            HTMLTextExtractorError: HTML parsing failed
            ValueError: invalid input
        """
        if not html:
            logger.warning("HTML content is empty")
            return ''

        if not isinstance(html, str):
            error_msg = f"HTML must be a string, got: {type(html)}"
            logger.error(error_msg)
            raise ValueError(error_msg)

        try:
            logger.debug(f"Parsing HTML, length: {len(html)} characters")
            soup = BeautifulSoup(html, 'html.parser')

            # Remove unwanted elements
            for tag in soup(["script", "style", "noscript"]):
                tag.decompose()

            # Remove Confluence macros
            for macro in soup.find_all(attrs={"ac:name": True}):
                macro.decompose()

            self.output_lines = []

            # Process the body, or the whole document if there is none
            body = soup.body if soup.body else soup
            for child in body.children:
                self._process_node(child)

            # Clean up the result
            result = ''.join(self.output_lines)
            result = re.sub(r'\n\s*\n\s*\n', '\n\n', result)
            result = '\n'.join(line.rstrip() for line in result.split('\n'))

            logger.info(f"HTML extraction finished, output length: {len(result)} characters")
            return result.strip()

        except Exception as e:
            error_msg = f"HTML parsing failed: {e}"
            logger.error(error_msg)
            raise HTMLTextExtractorError(error_msg) from e

    def _process_node(self, node: Union[Tag, NavigableString], indent: int = 0,
                      list_context: Optional[tuple] = None) -> None:
        """Recursively process a node"""
        if isinstance(node, NavigableString):
            text = str(node).strip()
            if text:
                text = re.sub(r'\s+', ' ', text)
                if self.output_lines and not self.output_lines[-1].endswith('\n'):
                    self.output_lines[-1] += text
                else:
                    self.output_lines.append(' ' * indent + text)
            return

        if not isinstance(node, Tag):
            return

        tag_name = node.name.lower()
        is_block = tag_name in self.BLOCK_TAGS

        # Add a newline before block-level elements
        if is_block and self.output_lines and not self.output_lines[-1].endswith('\n'):
            self.output_lines.append('\n')

        # Handle specific tags
        if tag_name in ('h1', 'h2', 'h3', 'h4', 'h5', 'h6'):
            try:
                level = int(tag_name[1])
                prefix = '#' * level + ' '
                text = node.get_text().strip()
                if text:
                    self.output_lines.append(' ' * indent + prefix + text + '\n')
            except (ValueError, IndexError) as e:
                logger.warning(f"Failed to parse heading tag: {tag_name}, error: {e}")
            return

        elif tag_name == 'p':
            text = node.get_text().strip()
            if text:
                self.output_lines.append(' ' * indent + text + '\n')
            return

        elif tag_name == 'hr':
            self.output_lines.append(' ' * indent + config.SEPARATOR_CHAR * config.SEPARATOR_LENGTH + '\n')
            return

        elif tag_name == 'br':
            self.output_lines.append('\n')
            return

        elif tag_name == 'table':
            self._process_table(node, indent)
            return

        elif tag_name in ('ul', 'ol'):
            self._process_list(node, indent, tag_name)
            return

        elif tag_name == 'li':
            self._process_list_item(node, indent, list_context)
            return

        elif tag_name == 'a':
            try:
                href = node.get('href', '')
                text = node.get_text().strip()
                if href and text:
                    self.output_lines.append(f'{text} ({href})')
                elif text:
                    self.output_lines.append(text)
except Exception as e:
|
||||
logger.warning(f"解析链接标签失败: {e}")
|
||||
return
|
||||
|
||||
elif tag_name in ('strong', 'b'):
|
||||
text = node.get_text().strip()
|
||||
if text:
|
||||
self.output_lines.append(f'**{text}**')
|
||||
return
|
||||
|
||||
elif tag_name in ('em', 'i'):
|
||||
text = node.get_text().strip()
|
||||
if text:
|
||||
self.output_lines.append(f'*{text}*')
|
||||
return
|
||||
|
||||
else:
|
||||
# 默认递归处理子元素
|
||||
for child in node.children:
|
||||
self._process_node(child, indent, list_context)
|
||||
|
||||
if is_block and self.output_lines and not self.output_lines[-1].endswith('\n'):
|
||||
self.output_lines.append('\n')
|
||||
|
||||
def _process_table(self, table: Tag, indent: int) -> None:
|
||||
"""处理表格"""
|
||||
try:
|
||||
rows = []
|
||||
for tr in table.find_all('tr'):
|
||||
row = []
|
||||
for td in tr.find_all(['td', 'th']):
|
||||
row.append(td.get_text().strip())
|
||||
if row:
|
||||
rows.append(row)
|
||||
|
||||
if rows:
|
||||
# 计算列宽
|
||||
col_widths = []
|
||||
max_cols = max(len(r) for r in rows)
|
||||
for i in range(max_cols):
|
||||
col_width = max((len(r[i]) if i < len(r) else 0) for r in rows)
|
||||
col_widths.append(col_width)
|
||||
|
||||
for row in rows:
|
||||
line = ' ' * indent
|
||||
for i, cell in enumerate(row):
|
||||
width = col_widths[i] if i < len(col_widths) else 0
|
||||
line += cell.ljust(width) + ' '
|
||||
self.output_lines.append(line.rstrip() + '\n')
|
||||
self.output_lines.append('\n')
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"处理表格失败: {e}")
|
||||
# 降级处理:简单提取表格文本
|
||||
table_text = table.get_text().strip()
|
||||
if table_text:
|
||||
self.output_lines.append(' ' * indent + table_text + '\n')
|
||||
|
||||
def _process_list(self, ul: Tag, indent: int, list_type: str) -> None:
|
||||
"""处理列表"""
|
||||
try:
|
||||
counter = 1 if list_type == 'ol' else None
|
||||
for child in ul.children:
|
||||
if isinstance(child, Tag) and child.name == 'li':
|
||||
ctx = (list_type, counter) if counter else (list_type, 1)
|
||||
self._process_list_item(child, indent, ctx)
|
||||
if counter:
|
||||
counter += 1
|
||||
else:
|
||||
self._process_node(child, indent, (list_type, 1) if not counter else None)
|
||||
except Exception as e:
|
||||
logger.warning(f"处理列表失败: {e}")
|
||||
|
||||
def _process_list_item(self, li: Tag, indent: int, list_context: Optional[tuple]) -> None:
|
||||
"""处理列表项"""
|
||||
try:
|
||||
prefix = ''
|
||||
if list_context:
|
||||
list_type, num = list_context
|
||||
prefix = '• ' if list_type == 'ul' else f'{num}. '
|
||||
|
||||
# 收集直接文本
|
||||
direct_parts = []
|
||||
for child in li.children:
|
||||
if isinstance(child, NavigableString):
|
||||
text = str(child).strip()
|
||||
if text:
|
||||
direct_parts.append(text)
|
||||
elif isinstance(child, Tag) and child.name == 'a':
|
||||
href = child.get('href', '')
|
||||
link_text = child.get_text().strip()
|
||||
if href and link_text:
|
||||
direct_parts.append(f'{link_text} ({href})')
|
||||
|
||||
if direct_parts:
|
||||
self.output_lines.append(' ' * indent + prefix + ' '.join(direct_parts) + '\n')
|
||||
|
||||
# 处理子元素
|
||||
for child in li.children:
|
||||
if isinstance(child, Tag) and child.name != 'a':
|
||||
self._process_node(child, indent + 2, None)
|
||||
|
||||
except Exception as e:
|
||||
logger.warning(f"处理列表项失败: {e}")
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
# 测试代码
|
||||
import sys
|
||||
|
||||
# 设置日志
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
|
||||
extractor = HTMLTextExtractor()
|
||||
|
||||
# 测试正常HTML
|
||||
html = "<h1>标题</h1><p>段落</p><ul><li>项目1</li><li>项目2</li></ul>"
|
||||
try:
|
||||
result = extractor.extract(html)
|
||||
print(f"测试1 - 正常HTML提取结果:\n{result}")
|
||||
except Exception as e:
|
||||
print(f"测试1失败: {e}")
|
||||
|
||||
# 测试空HTML
|
||||
try:
|
||||
result = extractor.extract("")
|
||||
print(f"测试2 - 空HTML提取结果: '{result}'")
|
||||
except Exception as e:
|
||||
print(f"测试2失败: {e}")
|
||||
|
||||
# 测试无效HTML
|
||||
try:
|
||||
result = extractor.extract("<invalid>html")
|
||||
print(f"测试3 - 无效HTML提取结果:\n{result}")
|
||||
except Exception as e:
|
||||
print(f"测试3失败: {e}")
|
||||
|
||||
# 测试表格
|
||||
table_html = """
|
||||
<table>
|
||||
<tr><th>姓名</th><th>年龄</th></tr>
|
||||
<tr><td>张三</td><td>25</td></tr>
|
||||
<tr><td>李四</td><td>30</td></tr>
|
||||
</table>
|
||||
"""
|
||||
try:
|
||||
result = extractor.extract(table_html)
|
||||
print(f"测试4 - 表格提取结果:\n{result}")
|
||||
except Exception as e:
|
||||
print(f"测试4失败: {e}")
253  src/database.py  (deleted)
@@ -1,253 +0,0 @@
#!/usr/bin/env python3
"""
Database module
"""
import sqlite3
import os
from datetime import datetime
from typing import List, Dict, Optional


class DailyLogsDatabase:
    """Daily shift-handover log database"""

    def __init__(self, db_path: str = 'data/daily_logs.db'):
        """
        Initialize the database.

        Args:
            db_path: path to the database file
        """
        self.db_path = db_path
        self._ensure_directory()
        self.conn = self._connect()
        self._init_schema()

    def _ensure_directory(self):
        """Ensure the data directory exists."""
        data_dir = os.path.dirname(self.db_path)
        if data_dir and not os.path.exists(data_dir):
            os.makedirs(data_dir)

    def _connect(self) -> sqlite3.Connection:
        """Connect to the database."""
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row
        return conn

    def _init_schema(self):
        """Initialize the table schema."""
        cursor = self.conn.cursor()

        cursor.execute('''
            CREATE TABLE IF NOT EXISTS daily_handover_logs (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                date TEXT NOT NULL,
                shift TEXT NOT NULL,
                ship_name TEXT NOT NULL,
                teu INTEGER,
                efficiency REAL,
                vehicles INTEGER,
                created_at TEXT DEFAULT CURRENT_TIMESTAMP,
                UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
            )
        ''')

        # Check whether the old table schema needs migration
        cursor.execute("SELECT sql FROM sqlite_master WHERE type='table' AND name='daily_handover_logs'")
        table_sql = cursor.fetchone()[0]
        if 'UNIQUE' not in table_sql:
            # Old schema detected; migrate
            print("Old table schema detected, migrating...")

            # Rename the old table
            cursor.execute('ALTER TABLE daily_handover_logs RENAME TO daily_handover_logs_old')

            # Create the new table
            cursor.execute('''
                CREATE TABLE daily_handover_logs (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    date TEXT NOT NULL,
                    shift TEXT NOT NULL,
                    ship_name TEXT NOT NULL,
                    teu INTEGER,
                    efficiency REAL,
                    vehicles INTEGER,
                    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
                    UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
                )
            ''')

            # Copy data (ignoring duplicates)
            cursor.execute('''
                INSERT OR IGNORE INTO daily_handover_logs
                (date, shift, ship_name, teu, efficiency, vehicles, created_at)
                SELECT date, shift, ship_name, teu, efficiency, vehicles, created_at
                FROM daily_handover_logs_old
            ''')

            # Drop the old table
            cursor.execute('DROP TABLE daily_handover_logs_old')
            print("Migration complete!")

        # Indexes
        cursor.execute('CREATE INDEX IF NOT EXISTS idx_date ON daily_handover_logs(date)')
        cursor.execute('CREATE INDEX IF NOT EXISTS idx_ship ON daily_handover_logs(ship_name)')

        # Table for unaccounted monthly-report data
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS monthly_unaccounted (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                year_month TEXT NOT NULL UNIQUE,
                teu INTEGER NOT NULL,
                note TEXT,
                created_at TEXT DEFAULT CURRENT_TIMESTAMP
            )
        ''')

        self.conn.commit()

    def insert(self, log: Dict) -> bool:
        """Insert a record (replace if it exists, insert otherwise)."""
        try:
            cursor = self.conn.cursor()
            # Use INSERT OR REPLACE to update existing records
            cursor.execute('''
                INSERT OR REPLACE INTO daily_handover_logs
                (date, shift, ship_name, teu, efficiency, vehicles, created_at)
                VALUES (?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
            ''', (
                log['date'], log['shift'], log['ship_name'],
                log.get('teu'), log.get('efficiency'), log.get('vehicles')
            ))
            self.conn.commit()
            return True
        except sqlite3.Error as e:
            print(f"Database error: {e}")
            return False

    def insert_many(self, logs: List[Dict]) -> int:
        """Bulk insert."""
        count = 0
        for log in logs:
            if self.insert(log):
                count += 1
        return count

    def query_by_date(self, date: str) -> List[Dict]:
        """Query by date."""
        cursor = self.conn.cursor()
        cursor.execute('''
            SELECT * FROM daily_handover_logs
            WHERE date = ? ORDER BY shift, ship_name
        ''', (date,))
        return [dict(row) for row in cursor.fetchall()]

    def query_by_ship(self, ship_name: str) -> List[Dict]:
        """Query by ship name."""
        cursor = self.conn.cursor()
        cursor.execute('''
            SELECT * FROM daily_handover_logs
            WHERE ship_name LIKE ? ORDER BY date DESC
        ''', (f'%{ship_name}%',))
        return [dict(row) for row in cursor.fetchall()]

    def query_all(self, limit: int = 1000) -> List[Dict]:
        """Query all records."""
        cursor = self.conn.cursor()
        cursor.execute('''
            SELECT * FROM daily_handover_logs
            ORDER BY date DESC, shift LIMIT ?
        ''', (limit,))
        return [dict(row) for row in cursor.fetchall()]

    def get_stats(self) -> Dict:
        """Get summary statistics."""
        cursor = self.conn.cursor()

        cursor.execute('SELECT COUNT(*) FROM daily_handover_logs')
        total = cursor.fetchone()[0]

        cursor.execute('SELECT DISTINCT ship_name FROM daily_handover_logs')
        ships = [row[0] for row in cursor.fetchall()]

        cursor.execute('SELECT MIN(date), MAX(date) FROM daily_handover_logs')
        date_range = cursor.fetchone()

        return {
            'total': total,
            'ships': ships,
            'date_range': {'start': date_range[0], 'end': date_range[1]}
        }

    def get_ships_with_monthly_teu(self, year_month: str = None) -> List[Dict]:
        """Get all ships and their total TEU for the month."""
        cursor = self.conn.cursor()

        if year_month:
            # Filter by year-month
            cursor.execute('''
                SELECT ship_name, SUM(teu) as monthly_teu
                FROM daily_handover_logs
                WHERE date LIKE ?
                GROUP BY ship_name
                ORDER BY monthly_teu DESC
            ''', (f'{year_month}%',))
        else:
            # All records
            cursor.execute('''
                SELECT ship_name, SUM(teu) as monthly_teu
                FROM daily_handover_logs
                GROUP BY ship_name
                ORDER BY monthly_teu DESC
            ''')

        return [{'ship_name': row[0], 'monthly_teu': row[1]} for row in cursor.fetchall()]

    def insert_unaccounted(self, year_month: str, teu: int, note: str = '') -> bool:
        """Insert unaccounted data."""
        try:
            cursor = self.conn.cursor()
            cursor.execute('''
                INSERT OR REPLACE INTO monthly_unaccounted
                (year_month, teu, note, created_at)
                VALUES (?, ?, ?, CURRENT_TIMESTAMP)
            ''', (year_month, teu, note))
            self.conn.commit()
            return True
        except sqlite3.Error as e:
            print(f"Database error: {e}")
            return False

    def get_unaccounted(self, year_month: str) -> int:
        """Get the unaccounted data for the given month."""
        cursor = self.conn.cursor()
        cursor.execute(
            'SELECT teu FROM monthly_unaccounted WHERE year_month = ?',
            (year_month,)
        )
        result = cursor.fetchone()
        return result[0] if result else 0

    def close(self):
        """Close the connection."""
        if self.conn:
            self.conn.close()


if __name__ == '__main__':
    db = DailyLogsDatabase()

    # Test insert
    test_log = {
        'date': '2025-12-28',
        'shift': '白班',
        'ship_name': '测试船',
        'teu': 100,
        'efficiency': 3.5,
        'vehicles': 5
    }

    db.insert(test_log)
    print(f'Total records: {db.get_stats()["total"]}')
    db.close()
15  src/database/__init__.py  (new file)
@@ -0,0 +1,15 @@
#!/usr/bin/env python3
"""
Database package
Provides a unified database interface
"""
from src.database.base import DatabaseBase, DatabaseConnectionError
from src.database.daily_logs import DailyLogsDatabase
from src.database.schedules import ScheduleDatabase

__all__ = [
    'DatabaseBase',
    'DatabaseConnectionError',
    'DailyLogsDatabase',
    'ScheduleDatabase'
]
257  src/database/base.py  (new file)
@@ -0,0 +1,257 @@
#!/usr/bin/env python3
"""
Database base-class module
Provides unified connection management and a context manager
"""
import os
import sqlite3
from contextlib import contextmanager
from typing import Generator, Optional, Any
from pathlib import Path

from src.config import config
from src.logging_config import get_logger

logger = get_logger(__name__)


class DatabaseConnectionError(Exception):
    """Database connection error"""
    pass


class DatabaseBase:
    """Database base class providing unified connection management"""

    def __init__(self, db_path: Optional[str] = None):
        """
        Initialize the database base class.

        Args:
            db_path: database file path; uses the default config if None
        """
        self.db_path = db_path or config.DATABASE_PATH
        self._connection: Optional[sqlite3.Connection] = None
        self._ensure_directory()

    def _ensure_directory(self):
        """Ensure the database directory exists."""
        data_dir = os.path.dirname(self.db_path)
        if data_dir and not os.path.exists(data_dir):
            os.makedirs(data_dir)
            logger.info(f"Created database directory: {data_dir}")

    def _connect(self) -> sqlite3.Connection:
        """
        Create a database connection.

        Returns:
            a sqlite3.Connection object

        Raises:
            DatabaseConnectionError: if the connection fails
        """
        try:
            conn = sqlite3.connect(self.db_path)
            conn.row_factory = sqlite3.Row
            logger.debug(f"Database connection established: {self.db_path}")
            return conn
        except sqlite3.Error as e:
            error_msg = f"Database connection failed: {self.db_path}, error: {e}"
            logger.error(error_msg)
            raise DatabaseConnectionError(error_msg) from e

    @contextmanager
    def get_connection(self) -> Generator[sqlite3.Connection, None, None]:
        """
        Context manager for acquiring a database connection.

        Usage:
            with self.get_connection() as conn:
                cursor = conn.cursor()
                cursor.execute(...)

        Returns:
            a database connection object
        """
        conn = None
        try:
            conn = self._connect()
            yield conn
        except sqlite3.Error as e:
            logger.error(f"Database operation failed: {e}")
            raise
        finally:
            if conn:
                conn.close()
                logger.debug("Database connection closed")

    def execute_query(self, query: str, params: tuple = ()) -> list:
        """
        Execute a query and return the results.

        Args:
            query: SQL query
            params: query parameters

        Returns:
            list of result rows
        """
        with self.get_connection() as conn:
            cursor = conn.cursor()
            cursor.execute(query, params)
            return [dict(row) for row in cursor.fetchall()]

    def execute_update(self, query: str, params: tuple = ()) -> int:
        """
        Execute an update statement.

        Args:
            query: SQL update statement
            params: update parameters

        Returns:
            number of affected rows
        """
        with self.get_connection() as conn:
            cursor = conn.cursor()
            cursor.execute(query, params)
            conn.commit()
            return cursor.rowcount

    def execute_many(self, query: str, params_list: list) -> int:
        """
        Execute a statement in bulk.

        Args:
            query: SQL statement
            params_list: list of parameter tuples

        Returns:
            total number of affected rows
        """
        with self.get_connection() as conn:
            cursor = conn.cursor()
            cursor.executemany(query, params_list)
            conn.commit()
            return cursor.rowcount

    def table_exists(self, table_name: str) -> bool:
        """
        Check whether a table exists.

        Args:
            table_name: table name

        Returns:
            whether the table exists
        """
        query = """
            SELECT name FROM sqlite_master
            WHERE type='table' AND name=?
        """
        result = self.execute_query(query, (table_name,))
        return len(result) > 0

    def get_table_info(self, table_name: str) -> list:
        """
        Get table schema information.

        Args:
            table_name: table name

        Returns:
            list of schema info rows
        """
        with self.get_connection() as conn:
            cursor = conn.cursor()
            cursor.execute(f"PRAGMA table_info({table_name})")
            return [dict(row) for row in cursor.fetchall()]

    def vacuum(self):
        """Run database compaction."""
        with self.get_connection() as conn:
            conn.execute("VACUUM")
            logger.info("Database vacuum complete")

    def backup(self, backup_path: Optional[str] = None):
        """
        Back up the database.

        Args:
            backup_path: backup file path; uses a default path if None
        """
        if backup_path is None:
            backup_dir = "backups"
            os.makedirs(backup_dir, exist_ok=True)
            timestamp = os.path.getmtime(self.db_path)
            from datetime import datetime
            dt = datetime.fromtimestamp(timestamp)
            backup_path = os.path.join(
                backup_dir,
                f"backup_{dt.strftime('%Y%m%d_%H%M%S')}.db"
            )

        try:
            with self.get_connection() as src_conn:
                dest_conn = sqlite3.connect(backup_path)
                src_conn.backup(dest_conn)
                dest_conn.close()
            logger.info(f"Database backup complete: {backup_path}")
        except sqlite3.Error as e:
            logger.error(f"Database backup failed: {e}")
            raise


# Global connection pool (optional, for high-throughput scenarios)
class ConnectionPool:
    """Simple database connection pool"""

    def __init__(self, db_path: str, max_connections: int = 5):
        self.db_path = db_path
        self.max_connections = max_connections
        self._connections: list[sqlite3.Connection] = []
        self._in_use: set[sqlite3.Connection] = set()

    @contextmanager
    def get_connection(self) -> Generator[sqlite3.Connection, None, None]:
        """Acquire a connection from the pool."""
        conn = None
        try:
            if self._connections:
                conn = self._connections.pop()
            elif len(self._in_use) < self.max_connections:
                conn = sqlite3.connect(self.db_path)
                conn.row_factory = sqlite3.Row
            else:
                raise DatabaseConnectionError("Connection pool exhausted")

            self._in_use.add(conn)
            yield conn
        finally:
            if conn:
                self._in_use.remove(conn)
                self._connections.append(conn)


if __name__ == '__main__':
    # Test the database base class
    db = DatabaseBase()

    # Test the connection
    with db.get_connection() as conn:
        cursor = conn.cursor()
        cursor.execute("SELECT sqlite_version()")
        version = cursor.fetchone()[0]
        print(f"SQLite version: {version}")

    # Test a query
    if db.table_exists("sqlite_master"):
        print("sqlite_master table exists")

    # Test backup
    try:
        db.backup("test_backup.db")
        print("Backup test complete")
    except Exception as e:
        print(f"Backup test failed: {e}")
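The open-yield-close pattern behind `DatabaseBase.get_connection` can be exercised in isolation. The sketch below is a minimal, self-contained version of that pattern against an in-memory SQLite database; the function name mirrors the method above but is otherwise illustrative, not part of the module:

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def get_connection(db_path: str):
    # Open a connection, hand it to the caller, and always close it afterwards,
    # the same shape as DatabaseBase.get_connection.
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # rows become dict-convertible
    try:
        yield conn
    finally:
        conn.close()

with get_connection(':memory:') as conn:
    cur = conn.cursor()
    cur.execute('CREATE TABLE t (x INTEGER)')
    cur.execute('INSERT INTO t VALUES (1)')
    rows = [dict(r) for r in cur.execute('SELECT x FROM t')]

print(rows)  # [{'x': 1}]
```

Because the `finally` block runs even when the body raises, callers never leak a connection, which is what lets `execute_query`/`execute_update` open a fresh connection per call safely.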
336  src/database/daily_logs.py  (new file)
@@ -0,0 +1,336 @@
#!/usr/bin/env python3
"""
Daily shift-handover log database module
Refactored on top of the new database base class
"""
from typing import List, Dict, Optional, Any
from datetime import datetime

from src.database.base import DatabaseBase
from src.logging_config import get_logger

logger = get_logger(__name__)


class DailyLogsDatabase(DatabaseBase):
    """Daily shift-handover log database"""

    def __init__(self, db_path: Optional[str] = None):
        """
        Initialize the database.

        Args:
            db_path: database file path; uses the default config if None
        """
        super().__init__(db_path)
        self._init_schema()

    def _init_schema(self):
        """Initialize the table schema."""
        with self.get_connection() as conn:
            cursor = conn.cursor()

            # Create the daily handover log table
            cursor.execute('''
                CREATE TABLE IF NOT EXISTS daily_handover_logs (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    date TEXT NOT NULL,
                    shift TEXT NOT NULL,
                    ship_name TEXT NOT NULL,
                    teu INTEGER,
                    efficiency REAL,
                    vehicles INTEGER,
                    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
                    UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
                )
            ''')

            # Check whether the old table schema needs migration
            cursor.execute("SELECT sql FROM sqlite_master WHERE type='table' AND name='daily_handover_logs'")
            table_sql = cursor.fetchone()[0]
            if 'UNIQUE' not in table_sql:
                logger.warning("Old table schema detected, migrating...")

                # Rename the old table
                cursor.execute('ALTER TABLE daily_handover_logs RENAME TO daily_handover_logs_old')

                # Create the new table
                cursor.execute('''
                    CREATE TABLE daily_handover_logs (
                        id INTEGER PRIMARY KEY AUTOINCREMENT,
                        date TEXT NOT NULL,
                        shift TEXT NOT NULL,
                        ship_name TEXT NOT NULL,
                        teu INTEGER,
                        efficiency REAL,
                        vehicles INTEGER,
                        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
                        UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
                    )
                ''')

                # Copy data (ignoring duplicates)
                cursor.execute('''
                    INSERT OR IGNORE INTO daily_handover_logs
                    (date, shift, ship_name, teu, efficiency, vehicles, created_at)
                    SELECT date, shift, ship_name, teu, efficiency, vehicles, created_at
                    FROM daily_handover_logs_old
                ''')

                # Drop the old table
                cursor.execute('DROP TABLE daily_handover_logs_old')
                logger.info("Migration complete!")

            # Create indexes
            cursor.execute('CREATE INDEX IF NOT EXISTS idx_date ON daily_handover_logs(date)')
            cursor.execute('CREATE INDEX IF NOT EXISTS idx_ship ON daily_handover_logs(ship_name)')

            # Table for unaccounted monthly-report data
            cursor.execute('''
                CREATE TABLE IF NOT EXISTS monthly_unaccounted (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    year_month TEXT NOT NULL UNIQUE,
                    teu INTEGER NOT NULL,
                    note TEXT,
                    created_at TEXT DEFAULT CURRENT_TIMESTAMP
                )
            ''')

            conn.commit()
            logger.debug("Database schema initialization complete")

    def insert(self, log: Dict[str, Any]) -> bool:
        """
        Insert a record (replace if it exists, insert otherwise).

        Args:
            log: log record dictionary

        Returns:
            whether the insert succeeded
        """
        try:
            query = '''
                INSERT OR REPLACE INTO daily_handover_logs
                (date, shift, ship_name, teu, efficiency, vehicles, created_at)
                VALUES (?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
            '''
            params = (
                log['date'], log['shift'], log['ship_name'],
                log.get('teu'), log.get('efficiency'), log.get('vehicles')
            )

            self.execute_update(query, params)
            logger.debug(f"Inserted record: {log['date']} {log['shift']} {log['ship_name']}")
            return True

        except Exception as e:
            logger.error(f"Insert failed: {e}, record: {log}")
            return False

    def insert_many(self, logs: List[Dict[str, Any]]) -> int:
        """
        Bulk insert.

        Args:
            logs: list of log records

        Returns:
            number of records inserted successfully
        """
        count = 0
        for log in logs:
            if self.insert(log):
                count += 1

        logger.info(f"Bulk insert finished, {count}/{len(logs)} records succeeded")
        return count

    def query_by_date(self, date: str) -> List[Dict[str, Any]]:
        """
        Query by date.

        Args:
            date: date string

        Returns:
            list of log records
        """
        query = '''
            SELECT * FROM daily_handover_logs
            WHERE date = ? ORDER BY shift, ship_name
        '''
        return self.execute_query(query, (date,))

    def query_by_ship(self, ship_name: str) -> List[Dict[str, Any]]:
        """
        Query by ship name.

        Args:
            ship_name: ship name

        Returns:
            list of log records
        '''
        """
        query = '''
            SELECT * FROM daily_handover_logs
            WHERE ship_name LIKE ? ORDER BY date DESC
        '''
        return self.execute_query(query, (f'%{ship_name}%',))

    def query_all(self, limit: int = 1000) -> List[Dict[str, Any]]:
        """
        Query all records.

        Args:
            limit: maximum number of rows to return

        Returns:
            list of log records
        """
        query = '''
            SELECT * FROM daily_handover_logs
            ORDER BY date DESC, shift LIMIT ?
        '''
        return self.execute_query(query, (limit,))

    def get_stats(self) -> Dict[str, Any]:
        """
        Get summary statistics.

        Returns:
            statistics dictionary
        """
        with self.get_connection() as conn:
            cursor = conn.cursor()

            cursor.execute('SELECT COUNT(*) FROM daily_handover_logs')
            total = cursor.fetchone()[0]

            cursor.execute('SELECT DISTINCT ship_name FROM daily_handover_logs')
            ships = [row[0] for row in cursor.fetchall()]

            cursor.execute('SELECT MIN(date), MAX(date) FROM daily_handover_logs')
            date_range = cursor.fetchone()

            return {
                'total': total,
                'ships': ships,
                'date_range': {'start': date_range[0], 'end': date_range[1]}
            }

    def get_ships_with_monthly_teu(self, year_month: Optional[str] = None) -> List[Dict[str, Any]]:
        """
        Get all ships and their total TEU for the month.

        Args:
            year_month: year-month string, e.g. "2025-12"; all months if None

        Returns:
            list of per-ship statistics
        """
        if year_month:
            query = '''
                SELECT ship_name, SUM(teu) as monthly_teu
                FROM daily_handover_logs
                WHERE date LIKE ?
                GROUP BY ship_name
                ORDER BY monthly_teu DESC
            '''
            return self.execute_query(query, (f'{year_month}%',))
        else:
            query = '''
                SELECT ship_name, SUM(teu) as monthly_teu
                FROM daily_handover_logs
                GROUP BY ship_name
                ORDER BY monthly_teu DESC
            '''
            return self.execute_query(query)

    def insert_unaccounted(self, year_month: str, teu: int, note: str = '') -> bool:
        """
        Insert unaccounted data.

        Args:
            year_month: year-month string, e.g. "2025-12"
            teu: unaccounted TEU count
            note: remarks

        Returns:
            whether the insert succeeded
        """
        try:
            query = '''
                INSERT OR REPLACE INTO monthly_unaccounted
                (year_month, teu, note, created_at)
                VALUES (?, ?, ?, CURRENT_TIMESTAMP)
            '''
            self.execute_update(query, (year_month, teu, note))
            logger.info(f"Inserted unaccounted data: {year_month} {teu} TEU")
            return True

        except Exception as e:
            logger.error(f"Inserting unaccounted data failed: {e}")
            return False

    def get_unaccounted(self, year_month: str) -> int:
        """
        Get the unaccounted data for the given month.

        Args:
            year_month: year-month string, e.g. "2025-12"

        Returns:
            unaccounted TEU count
        """
        query = 'SELECT teu FROM monthly_unaccounted WHERE year_month = ?'
        result = self.execute_query(query, (year_month,))
        return result[0]['teu'] if result else 0

    def delete_by_date(self, date: str) -> int:
        """
        Delete records for the given date.

        Args:
            date: date string

        Returns:
            number of records deleted
        """
        query = 'DELETE FROM daily_handover_logs WHERE date = ?'
        return self.execute_update(query, (date,))


if __name__ == '__main__':
    # Test code
    db = DailyLogsDatabase()

    # Test insert
    test_log = {
        'date': '2025-12-30',
        'shift': '白班',
        'ship_name': '测试船',
        'teu': 100,
        'efficiency': 3.5,
        'vehicles': 5
    }

    success = db.insert(test_log)
    print(f"Insert test: {'ok' if success else 'failed'}")

    # Test query
    logs = db.query_by_date('2025-12-30')
    print(f"Query result: {len(logs)} records")

    # Test stats
    stats = db.get_stats()
    print(f"Stats: {stats}")

    # Test unaccounted data
    db.insert_unaccounted('2025-12', 118, '测试备注')
    unaccounted = db.get_unaccounted('2025-12')
    print(f"Unaccounted data: {unaccounted} TEU")

    # Clean up test data
    db.delete_by_date('2025-12-30')
    print("Test data cleaned up")
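The deduplication in `daily_handover_logs` relies on the `UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE` constraint: a plain `INSERT` with an existing key silently replaces the old row, so re-ingesting the same page never produces duplicates. A minimal sketch with a stand-in table (column names simplified, not the module's schema):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('''
    CREATE TABLE logs (
        date TEXT, shift TEXT, ship_name TEXT, teu INTEGER,
        UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
    )
''')
# Two inserts with the same (date, shift, ship_name) key:
# the second replaces the first instead of raising or duplicating.
conn.execute("INSERT INTO logs VALUES ('2025-12-30', 'day', 'ShipA', 100)")
conn.execute("INSERT INTO logs VALUES ('2025-12-30', 'day', 'ShipA', 120)")
count, teu = conn.execute('SELECT COUNT(*), MAX(teu) FROM logs').fetchone()
print(count, teu)  # 1 120
```

The explicit `INSERT OR REPLACE` used in `insert()` is belt-and-braces on top of this: either mechanism alone would give the same upsert behavior.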
342  src/database/schedules.py  (new file)
@@ -0,0 +1:342 @@
#!/usr/bin/env python3
"""
Schedule personnel database module
Refactored on top of the new database base class
"""
import json
import hashlib
from typing import List, Dict, Optional, Any

from src.database.base import DatabaseBase
from src.logging_config import get_logger

logger = get_logger(__name__)


class ScheduleDatabase(DatabaseBase):
    """Schedule personnel database"""

    def __init__(self, db_path: Optional[str] = None):
        """
        Initialize the database.

        Args:
            db_path: database file path; uses the default config if None
        """
        super().__init__(db_path)
        self._init_schema()

    def _init_schema(self):
        """Initialize the table schema."""
        with self.get_connection() as conn:
            cursor = conn.cursor()

            # Create the schedule personnel table
            cursor.execute('''
                CREATE TABLE IF NOT EXISTS schedule_personnel (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    date TEXT NOT NULL,
                    day_shift TEXT,
                    night_shift TEXT,
                    day_shift_list TEXT,     -- JSON array
                    night_shift_list TEXT,   -- JSON array
                    sheet_id TEXT,
                    sheet_title TEXT,
                    data_hash TEXT,          -- data hash, used to detect updates
                    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
                    updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
                    UNIQUE(date)
                )
            ''')

            # Create the sheet version table (to detect whether a sheet changed)
            cursor.execute('''
                CREATE TABLE IF NOT EXISTS sheet_versions (
                    id INTEGER PRIMARY KEY AUTOINCREMENT,
                    sheet_id TEXT NOT NULL,
                    sheet_title TEXT NOT NULL,
                    revision INTEGER NOT NULL,
                    data_hash TEXT,
                    last_checked_at TEXT DEFAULT CURRENT_TIMESTAMP,
                    UNIQUE(sheet_id)
                )
            ''')

            # Create indexes
            cursor.execute('CREATE INDEX IF NOT EXISTS idx_schedule_date ON schedule_personnel(date)')
            cursor.execute('CREATE INDEX IF NOT EXISTS idx_schedule_sheet ON schedule_personnel(sheet_id)')
            cursor.execute('CREATE INDEX IF NOT EXISTS idx_sheet_versions ON sheet_versions(sheet_id)')

            conn.commit()
            logger.debug("Schedule database schema initialization complete")

    def _calculate_hash(self, data: Dict[str, Any]) -> str:
        """
        Compute a hash of the data.

        Args:
            data: data dictionary

        Returns:
            MD5 hash value
        """
        data_str = json.dumps(data, sort_keys=True, ensure_ascii=False)
        return hashlib.md5(data_str.encode('utf-8')).hexdigest()

    def check_sheet_update(self, sheet_id: str, sheet_title: str, revision: int, data: Dict[str, Any]) -> bool:
        """
        Check whether a sheet has been updated.

        Args:
            sheet_id: sheet ID
            sheet_title: sheet title
            revision: sheet revision number
            data: sheet data

        Returns:
            True: updated, needs to be fetched again
            False: unchanged, the cache can be used
        """
        with self.get_connection() as conn:
            cursor = conn.cursor()

            # Look up the currently stored version
            cursor.execute(
                'SELECT revision, data_hash FROM sheet_versions WHERE sheet_id = ?',
                (sheet_id,)
            )
            result = cursor.fetchone()

            if not result:
                # First fetch: record the version
                data_hash = self._calculate_hash(data)
                cursor.execute('''
                    INSERT INTO sheet_versions (sheet_id, sheet_title, revision, data_hash, last_checked_at)
                    VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP)
                ''', (sheet_id, sheet_title, revision, data_hash))
                conn.commit()
                logger.debug(f"Recorded sheet version for the first time: {sheet_title} (ID: {sheet_id})")
                return True

            # Check whether the revision number or the data changed
            old_revision = result['revision']
            old_hash = result['data_hash']
            new_hash = self._calculate_hash(data)

            if old_revision != revision or old_hash != new_hash:
                # Updated: refresh the stored version info
                cursor.execute('''
                    UPDATE sheet_versions
                    SET revision = ?, data_hash = ?, last_checked_at = CURRENT_TIMESTAMP
                    WHERE sheet_id = ?
                ''', (revision, new_hash, sheet_id))
|
||||
conn.commit()
|
||||
logger.info(f"表格有更新: {sheet_title} (ID: {sheet_id})")
|
||||
return True
|
||||
|
||||
# 无更新,更新检查时间
|
||||
cursor.execute('''
|
||||
UPDATE sheet_versions
|
||||
SET last_checked_at = CURRENT_TIMESTAMP
|
||||
WHERE sheet_id = ?
|
||||
''', (sheet_id,))
|
||||
conn.commit()
|
||||
logger.debug(f"表格无更新: {sheet_title} (ID: {sheet_id})")
|
||||
return False
|
||||
|
||||
def save_schedule(self, date: str, schedule_data: Dict[str, Any],
|
||||
sheet_id: Optional[str] = None, sheet_title: Optional[str] = None) -> bool:
|
||||
"""
|
||||
保存排班信息到数据库
|
||||
|
||||
参数:
|
||||
date: 日期 (YYYY-MM-DD)
|
||||
schedule_data: 排班数据
|
||||
sheet_id: 表格ID
|
||||
sheet_title: 表格标题
|
||||
|
||||
返回:
|
||||
是否成功
|
||||
"""
|
||||
try:
|
||||
# 准备数据
|
||||
day_shift = schedule_data.get('day_shift', '')
|
||||
night_shift = schedule_data.get('night_shift', '')
|
||||
day_shift_list = json.dumps(schedule_data.get('day_shift_list', []), ensure_ascii=False)
|
||||
night_shift_list = json.dumps(schedule_data.get('night_shift_list', []), ensure_ascii=False)
|
||||
data_hash = self._calculate_hash(schedule_data)
|
||||
|
||||
# 使用 INSERT OR REPLACE 来更新已存在的记录
|
||||
query = '''
|
||||
INSERT OR REPLACE INTO schedule_personnel
|
||||
(date, day_shift, night_shift, day_shift_list, night_shift_list,
|
||||
sheet_id, sheet_title, data_hash, updated_at)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
|
||||
'''
|
||||
params = (
|
||||
date, day_shift, night_shift, day_shift_list, night_shift_list,
|
||||
sheet_id, sheet_title, data_hash
|
||||
)
|
||||
|
||||
self.execute_update(query, params)
|
||||
logger.debug(f"保存排班信息: {date}")
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"保存排班信息失败: {e}, 日期: {date}")
|
||||
return False
|
||||
|
||||
def get_schedule(self, date: str) -> Optional[Dict[str, Any]]:
|
||||
"""
|
||||
获取指定日期的排班信息
|
||||
|
||||
参数:
|
||||
date: 日期 (YYYY-MM-DD)
|
||||
|
||||
返回:
|
||||
排班信息字典,未找到返回None
|
||||
"""
|
||||
query = 'SELECT * FROM schedule_personnel WHERE date = ?'
|
||||
result = self.execute_query(query, (date,))
|
||||
|
||||
if not result:
|
||||
return None
|
||||
|
||||
row = result[0]
|
||||
|
||||
# 解析JSON数组
|
||||
day_shift_list = json.loads(row['day_shift_list']) if row['day_shift_list'] else []
|
||||
night_shift_list = json.loads(row['night_shift_list']) if row['night_shift_list'] else []
|
||||
|
||||
return {
|
||||
'date': row['date'],
|
||||
'day_shift': row['day_shift'],
|
||||
'night_shift': row['night_shift'],
|
||||
'day_shift_list': day_shift_list,
|
||||
'night_shift_list': night_shift_list,
|
||||
'sheet_id': row['sheet_id'],
|
||||
'sheet_title': row['sheet_title'],
|
||||
'updated_at': row['updated_at']
|
||||
}
|
||||
|
||||
def get_schedule_by_range(self, start_date: str, end_date: str) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
获取日期范围内的排班信息
|
||||
|
||||
参数:
|
||||
start_date: 开始日期 (YYYY-MM-DD)
|
||||
end_date: 结束日期 (YYYY-MM-DD)
|
||||
|
||||
返回:
|
||||
排班信息列表
|
||||
"""
|
||||
query = '''
|
||||
SELECT * FROM schedule_personnel
|
||||
WHERE date >= ? AND date <= ?
|
||||
ORDER BY date
|
||||
'''
|
||||
results = self.execute_query(query, (start_date, end_date))
|
||||
|
||||
processed_results = []
|
||||
for row in results:
|
||||
day_shift_list = json.loads(row['day_shift_list']) if row['day_shift_list'] else []
|
||||
night_shift_list = json.loads(row['night_shift_list']) if row['night_shift_list'] else []
|
||||
|
||||
processed_results.append({
|
||||
'date': row['date'],
|
||||
'day_shift': row['day_shift'],
|
||||
'night_shift': row['night_shift'],
|
||||
'day_shift_list': day_shift_list,
|
||||
'night_shift_list': night_shift_list,
|
||||
'sheet_id': row['sheet_id'],
|
||||
'sheet_title': row['sheet_title'],
|
||||
'updated_at': row['updated_at']
|
||||
})
|
||||
|
||||
return processed_results
|
||||
|
||||
def delete_old_schedules(self, before_date: str) -> int:
|
||||
"""
|
||||
删除指定日期之前的排班记录
|
||||
|
||||
参数:
|
||||
before_date: 日期 (YYYY-MM-DD)
|
||||
|
||||
返回:
|
||||
删除的记录数
|
||||
"""
|
||||
query = 'DELETE FROM schedule_personnel WHERE date < ?'
|
||||
return self.execute_update(query, (before_date,))
|
||||
|
||||
def get_stats(self) -> Dict[str, Any]:
|
||||
"""获取统计信息"""
|
||||
with self.get_connection() as conn:
|
||||
cursor = conn.cursor()
|
||||
|
||||
cursor.execute('SELECT COUNT(*) FROM schedule_personnel')
|
||||
total = cursor.fetchone()[0]
|
||||
|
||||
cursor.execute('SELECT MIN(date), MAX(date) FROM schedule_personnel')
|
||||
date_range = cursor.fetchone()
|
||||
|
||||
cursor.execute('SELECT COUNT(DISTINCT sheet_id) FROM schedule_personnel')
|
||||
sheet_count = cursor.fetchone()[0]
|
||||
|
||||
return {
|
||||
'total': total,
|
||||
'date_range': {'start': date_range[0], 'end': date_range[1]},
|
||||
'sheet_count': sheet_count
|
||||
}
|
||||
|
||||
def clear_all(self) -> int:
|
||||
"""
|
||||
清空所有排班数据
|
||||
|
||||
返回:
|
||||
删除的记录数
|
||||
"""
|
||||
query1 = 'DELETE FROM schedule_personnel'
|
||||
query2 = 'DELETE FROM sheet_versions'
|
||||
|
||||
count1 = self.execute_update(query1)
|
||||
count2 = self.execute_update(query2)
|
||||
|
||||
logger.info(f"清空排班数据,删除 {count1} 条排班记录和 {count2} 条版本记录")
|
||||
return count1 + count2
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
# 测试代码
|
||||
db = ScheduleDatabase()
|
||||
|
||||
# 测试保存
|
||||
test_schedule = {
|
||||
'day_shift': '张勤、杨俊豪',
|
||||
'night_shift': '刘炜彬、梁启迟',
|
||||
'day_shift_list': ['张勤', '杨俊豪'],
|
||||
'night_shift_list': ['刘炜彬', '梁启迟']
|
||||
}
|
||||
|
||||
success = db.save_schedule('2025-12-31', test_schedule, 'zcYLIk', '12月')
|
||||
print(f"保存测试: {'成功' if success else '失败'}")
|
||||
|
||||
# 测试获取
|
||||
schedule = db.get_schedule('2025-12-31')
|
||||
print(f"获取结果: {schedule}")
|
||||
|
||||
# 测试范围查询
|
||||
schedules = db.get_schedule_by_range('2025-12-01', '2025-12-31')
|
||||
print(f"范围查询: {len(schedules)} 条记录")
|
||||
|
||||
# 测试统计
|
||||
stats = db.get_stats()
|
||||
print(f"统计信息: {stats}")
|
||||
|
||||
# 测试表格版本检查
|
||||
test_data = {'values': [['姓名', '12月31日'], ['张三', '白']]}
|
||||
needs_update = db.check_sheet_update('test_sheet', '测试表格', 1, test_data)
|
||||
print(f"表格更新检查: {'需要更新' if needs_update else '无需更新'}")
|
||||
|
||||
# 清理测试数据
|
||||
db.delete_old_schedules('2026-01-01')
|
||||
print("测试数据已清理")
|
||||
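The update check above keys on an MD5 of the canonically serialized payload: sorting keys before hashing makes the hash independent of dict ordering, so only real content (or revision) changes trigger a refetch. A minimal standalone sketch of the mechanism (function names here are illustrative, not part of the module):

```python
import hashlib
import json


def data_hash(data: dict) -> str:
    # Canonical serialization: sorted keys, no ASCII escaping,
    # so logically equal dicts always hash to the same value.
    data_str = json.dumps(data, sort_keys=True, ensure_ascii=False)
    return hashlib.md5(data_str.encode('utf-8')).hexdigest()


def needs_refresh(stored, revision: int, data: dict) -> bool:
    # stored is (revision, data_hash) from sheet_versions, or None on first sight.
    if stored is None:
        return True
    old_revision, old_hash = stored
    return old_revision != revision or old_hash != data_hash(data)
```

Comparing both the revision number and the hash covers the case where Feishu bumps the revision without changing the cells that matter, as well as silent edits within the same revision window.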
216
src/extractor.py
@@ -1,216 +0,0 @@
#!/usr/bin/env python3
"""
HTML text extraction module.
"""
import re
from bs4 import BeautifulSoup, Tag, NavigableString
from typing import List


class HTMLTextExtractor:
    """HTML text extractor that preserves layout structure."""

    # Block-level elements
    BLOCK_TAGS = {
        'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'div', 'section',
        'table', 'tr', 'td', 'th', 'li', 'ul', 'ol', 'blockquote',
        'pre', 'hr', 'br', 'tbody', 'thead', 'tfoot'
    }

    def __init__(self):
        """Initialize the extractor."""
        self.output_lines: List[str] = []

    def extract(self, html: str) -> str:
        """
        Extract layout-preserving text from HTML.

        Args:
            html: HTML string

        Returns:
            Formatted plain text
        """
        if not html:
            return ''

        soup = BeautifulSoup(html, 'html.parser')

        # Remove unwanted elements
        for tag in soup(["script", "style", "noscript"]):
            tag.decompose()

        # Remove Confluence macros
        for macro in soup.find_all(attrs={"ac:name": True}):
            macro.decompose()

        self.output_lines = []

        # Process the body, or the whole document if there is none
        body = soup.body if soup.body else soup
        for child in body.children:
            self._process_node(child)

        # Clean up the result
        result = ''.join(self.output_lines)
        result = re.sub(r'\n\s*\n\s*\n', '\n\n', result)
        result = '\n'.join(line.rstrip() for line in result.split('\n'))
        return result.strip()

    def _process_node(self, node, indent: int = 0, list_context=None):
        """Recursively process a node."""
        if isinstance(node, NavigableString):
            text = str(node).strip()
            if text:
                text = re.sub(r'\s+', ' ', text)
                if self.output_lines and not self.output_lines[-1].endswith('\n'):
                    self.output_lines[-1] += text
                else:
                    self.output_lines.append(' ' * indent + text)
            return

        if not isinstance(node, Tag):
            return

        tag_name = node.name.lower()
        is_block = tag_name in self.BLOCK_TAGS

        # Add a newline before block-level elements
        if is_block and self.output_lines and not self.output_lines[-1].endswith('\n'):
            self.output_lines.append('\n')

        # Handle specific tags
        if tag_name in ('h1', 'h2', 'h3', 'h4', 'h5', 'h6'):
            level = int(tag_name[1])
            prefix = '#' * level + ' '
            text = node.get_text().strip()
            if text:
                self.output_lines.append(' ' * indent + prefix + text + '\n')
            return

        elif tag_name == 'p':
            text = node.get_text().strip()
            if text:
                self.output_lines.append(' ' * indent + text + '\n')
            return

        elif tag_name == 'hr':
            self.output_lines.append(' ' * indent + '─' * 50 + '\n')
            return

        elif tag_name == 'br':
            self.output_lines.append('\n')
            return

        elif tag_name == 'table':
            self._process_table(node, indent)
            return

        elif tag_name in ('ul', 'ol'):
            self._process_list(node, indent, tag_name)
            return

        elif tag_name == 'li':
            self._process_list_item(node, indent, list_context)
            return

        elif tag_name == 'a':
            href = node.get('href', '')
            text = node.get_text().strip()
            if href and text:
                self.output_lines.append(f'{text} ({href})')
            elif text:
                self.output_lines.append(text)
            return

        elif tag_name in ('strong', 'b'):
            text = node.get_text().strip()
            if text:
                self.output_lines.append(f'**{text}**')
            return

        elif tag_name in ('em', 'i'):
            text = node.get_text().strip()
            if text:
                self.output_lines.append(f'*{text}*')
            return

        else:
            # Default: recurse into children
            for child in node.children:
                self._process_node(child, indent, list_context)

            if is_block and self.output_lines and not self.output_lines[-1].endswith('\n'):
                self.output_lines.append('\n')

    def _process_table(self, table: Tag, indent: int):
        """Process a table."""
        rows = []
        for tr in table.find_all('tr'):
            row = []
            for td in tr.find_all(['td', 'th']):
                row.append(td.get_text().strip())
            if row:
                rows.append(row)

        if rows:
            # Compute column widths
            col_widths = []
            for i in range(max(len(r) for r in rows)):
                col_width = max((len(r[i]) if i < len(r) else 0) for r in rows)
                col_widths.append(col_width)

            for row in rows:
                line = ' ' * indent
                for i, cell in enumerate(row):
                    width = col_widths[i] if i < len(col_widths) else 0
                    line += cell.ljust(width) + ' '
                self.output_lines.append(line.rstrip() + '\n')
            self.output_lines.append('\n')

    def _process_list(self, ul: Tag, indent: int, list_type: str):
        """Process a list."""
        counter = 1 if list_type == 'ol' else None
        for child in ul.children:
            if isinstance(child, Tag) and child.name == 'li':
                ctx = (list_type, counter) if counter else (list_type, 1)
                self._process_list_item(child, indent, ctx)
                if counter:
                    counter += 1
            else:
                self._process_node(child, indent, (list_type, 1) if not counter else None)

    def _process_list_item(self, li: Tag, indent: int, list_context):
        """Process a list item."""
        prefix = ''
        if list_context:
            list_type, num = list_context
            prefix = '• ' if list_type == 'ul' else f'{num}. '

        # Collect direct text children
        direct_parts = []
        for child in li.children:
            if isinstance(child, NavigableString):
                text = str(child).strip()
                if text:
                    direct_parts.append(text)
            elif isinstance(child, Tag) and child.name == 'a':
                href = child.get('href', '')
                link_text = child.get_text().strip()
                if href and link_text:
                    direct_parts.append(f'{link_text} ({href})')

        if direct_parts:
            self.output_lines.append(' ' * indent + prefix + ' '.join(direct_parts) + '\n')

        # Process child elements
        for child in li.children:
            if isinstance(child, Tag) and child.name != 'a':
                self._process_node(child, indent + 2, None)


if __name__ == '__main__':
    # Test
    html = "<h1>标题</h1><p>段落</p><ul><li>项目1</li><li>项目2</li></ul>"
    extractor = HTMLTextExtractor()
    print(extractor.extract(html))
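The table renderer above pads every cell to the widest cell in its column, tolerating ragged rows. The alignment technique can be sketched standalone (the helper name is illustrative, not part of the module):

```python
from typing import List


def format_table(rows: List[List[str]]) -> str:
    # Column width = widest cell in that column; rows may be ragged,
    # so missing cells count as width 0.
    n_cols = max(len(r) for r in rows)
    widths = [max((len(r[i]) if i < len(r) else 0) for r in rows)
              for i in range(n_cols)]
    lines = []
    for row in rows:
        line = ' '.join(cell.ljust(widths[i]) for i, cell in enumerate(row))
        lines.append(line.rstrip())
    return '\n'.join(lines)
```

Note that `len()`-based widths assume fixed-width glyphs; CJK characters render double-width in most terminals, so columns containing Chinese text will align in character count but not visually.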
467
src/feishu.py
@@ -1,467 +0,0 @@
#!/usr/bin/env python3
"""
Feishu Sheets API client module.
Fetches terminal-worker schedule information.
"""
import requests
import json
import os
import time
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple
import logging

logger = logging.getLogger(__name__)


class FeishuSheetsClient:
    """Feishu Sheets API client."""

    def __init__(self, base_url: str, token: str, spreadsheet_token: str):
        """
        Initialize the client.

        Args:
            base_url: Feishu API base URL
            token: Bearer auth token
            spreadsheet_token: spreadsheet token
        """
        self.base_url = base_url.rstrip('/')
        self.spreadsheet_token = spreadsheet_token
        self.headers = {
            'Authorization': f'Bearer {token}',
            'Content-Type': 'application/json',
            'Accept': 'application/json'
        }

    def get_sheets_info(self) -> List[Dict]:
        """
        Get info (sheet_id and title) for all sheets.

        Returns:
            List of sheet info dicts [{'sheet_id': 'xxx', 'title': 'xxx'}, ...]
        """
        url = f'{self.base_url}/spreadsheets/{self.spreadsheet_token}/sheets/query'

        try:
            response = requests.get(url, headers=self.headers, timeout=30)
            response.raise_for_status()
            data = response.json()

            if data.get('code') != 0:
                logger.error(f"飞书API错误: {data.get('msg')}")
                return []

            sheets = data.get('data', {}).get('sheets', [])
            result = []
            for sheet in sheets:
                result.append({
                    'sheet_id': sheet.get('sheet_id'),
                    'title': sheet.get('title')
                })

            logger.info(f"获取到 {len(result)} 个表格")
            return result

        except requests.exceptions.RequestException as e:
            logger.error(f"获取表格信息失败: {e}")
            return []
        except Exception as e:
            logger.error(f"解析表格信息失败: {e}")
            return []

    def get_sheet_data(self, sheet_id: str, range_: str = 'A:AF') -> Dict:
        """
        Get the data of a specific sheet.

        Args:
            sheet_id: sheet ID
            range_: data range, defaults to A:AF (31 columns)

        Returns:
            Raw data returned by the Feishu API
        """
        # Note: fetching sheet values uses the v2 API, not v3:
        # GET 'https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/{token}/values/{sheet_id}!A:AF'
        url = f'{self.base_url.replace("/v3", "/v2")}/spreadsheets/{self.spreadsheet_token}/values/{sheet_id}!{range_}'
        params = {
            'valueRenderOption': 'ToString',
            'dateTimeRenderOption': 'FormattedString'
        }

        try:
            response = requests.get(url, headers=self.headers, params=params, timeout=30)
            response.raise_for_status()
            data = response.json()

            if data.get('code') != 0:
                logger.error(f"飞书API错误: {data.get('msg')}")
                return {}

            return data.get('data', {})

        except requests.exceptions.RequestException as e:
            logger.error(f"获取表格数据失败: {e}")
            return {}
        except Exception as e:
            logger.error(f"解析表格数据失败: {e}")
            return {}


class ScheduleDataParser:
    """Schedule data parser."""

    @staticmethod
    def _parse_chinese_date(date_str: str) -> Optional[str]:
        """
        Parse a Chinese date format.

        Args:
            date_str: Chinese date such as "12月30日", "12/30", or "12月1日"

        Returns:
            Normalized date string "MM月DD日"
        """
        if not date_str:
            return None

        # "12/30" style
        if '/' in date_str:
            try:
                month, day = date_str.split('/')
                # Strip possible spaces
                month = month.strip()
                day = day.strip()
                return f"{int(month)}月{int(day)}日"
            except ValueError:
                return None

        # "12月30日" style
        if '月' in date_str and '日' in date_str:
            return date_str

        # "12月1" style (missing the trailing "日")
        if '月' in date_str:
            if '日' not in date_str:
                return f"{date_str}日"
            return date_str

        return None

    @staticmethod
    def _find_date_column_index(headers: List[str], target_date: str) -> Optional[int]:
        """
        Find the column index matching the target date in the header row.

        Args:
            headers: header row ["姓名", "12月1日", "12月2日", ...]
            target_date: target date "12月30日"

        Returns:
            Column index (0-based), or None if not found
        """
        if not headers or not target_date:
            return None

        # Normalize the target date
        target_std = ScheduleDataParser._parse_chinese_date(target_date)
        if not target_std:
            return None

        # Scan the headers for a matching date
        for i, header in enumerate(headers):
            header_std = ScheduleDataParser._parse_chinese_date(header)
            if header_std == target_std:
                return i

        return None

    def parse(self, values: List[List[str]], target_date: str) -> Dict:
        """
        Parse schedule data and return the shift personnel for a given date.

        Args:
            values: 2-D array returned by the Feishu sheet
            target_date: target date (format "12月30日" or "12/30")

        Returns:
            {
                'day_shift': '张勤、刘炜彬、杨俊豪',
                'night_shift': '梁启迟、江唯、汪钦良',
                'day_shift_list': ['张勤', '刘炜彬', '杨俊豪'],
                'night_shift_list': ['梁启迟', '江唯', '汪钦良']
            }
        """
        if not values or len(values) < 2:
            return {
                'day_shift': '',
                'night_shift': '',
                'day_shift_list': [],
                'night_shift_list': []
            }

        # The first row is the header
        headers = values[0]
        date_column_index = self._find_date_column_index(headers, target_date)

        if date_column_index is None:
            logger.warning(f"未找到日期列: {target_date}")
            return {
                'day_shift': '',
                'night_shift': '',
                'day_shift_list': [],
                'night_shift_list': []
            }

        # Collect day-shift and night-shift personnel
        day_shift_names = []
        night_shift_names = []

        # Personnel rows start from the second row
        for row in values[1:]:
            if len(row) <= date_column_index:
                continue

            name = row[0] if row else ''
            shift = row[date_column_index] if date_column_index < len(row) else ''

            if not name or not shift:
                continue

            if shift == '白':
                day_shift_names.append(name)
            elif shift == '夜':
                night_shift_names.append(name)

        # Format the output
        day_shift_str = '、'.join(day_shift_names) if day_shift_names else ''
        night_shift_str = '、'.join(night_shift_names) if night_shift_names else ''

        return {
            'day_shift': day_shift_str,
            'night_shift': night_shift_str,
            'day_shift_list': day_shift_names,
            'night_shift_list': night_shift_names
        }


class ScheduleCache:
    """Schedule data cache."""

    def __init__(self, cache_file: str = 'data/schedule_cache.json'):
        self.cache_file = cache_file
        self.cache_ttl = 3600  # 1 hour

    def load(self) -> Optional[Dict]:
        """Load the cache."""
        try:
            if not os.path.exists(self.cache_file):
                return None

            with open(self.cache_file, 'r', encoding='utf-8') as f:
                cache_data = json.load(f)

            # Check whether the cache has expired
            last_update = cache_data.get('last_update')
            if last_update:
                last_time = datetime.fromisoformat(last_update)
                if (datetime.now() - last_time).total_seconds() < self.cache_ttl:
                    return cache_data.get('data')

            return None

        except Exception as e:
            logger.warning(f"加载缓存失败: {e}")
            return None

    def save(self, data: Dict):
        """Save the cache."""
        try:
            # Ensure the directory exists
            os.makedirs(os.path.dirname(self.cache_file), exist_ok=True)

            cache_data = {
                'last_update': datetime.now().isoformat(),
                'data': data
            }

            with open(self.cache_file, 'w', encoding='utf-8') as f:
                json.dump(cache_data, f, ensure_ascii=False, indent=2)

            logger.info(f"缓存已保存到 {self.cache_file}")

        except Exception as e:
            logger.error(f"保存缓存失败: {e}")


class FeishuScheduleManager:
    """Feishu schedule manager (main entry point)."""

    def __init__(self, base_url: str = None, token: str = None,
                 spreadsheet_token: str = None):
        """
        Initialize the manager.

        Args:
            base_url: Feishu API base URL; read from env vars when None
            token: Feishu API token; read from env vars when None
            spreadsheet_token: spreadsheet token; read from env vars when None
        """
        # Read configuration from environment variables
        self.base_url = base_url or os.getenv('FEISHU_BASE_URL', 'https://open.feishu.cn/open-apis/sheets/v3')
        self.token = token or os.getenv('FEISHU_TOKEN', '')
        self.spreadsheet_token = spreadsheet_token or os.getenv('FEISHU_SPREADSHEET_TOKEN', '')

        if not self.token or not self.spreadsheet_token:
            logger.warning("飞书配置不完整,请检查环境变量")

        self.client = FeishuSheetsClient(self.base_url, self.token, self.spreadsheet_token)
        self.parser = ScheduleDataParser()
        self.cache = ScheduleCache()

    def get_schedule_for_date(self, date_str: str, use_cache: bool = True) -> Dict:
        """
        Get the schedule for a given date.

        Args:
            date_str: date string, format "2025-12-30" or "12/30"
            use_cache: whether to use the cache

        Returns:
            Schedule dict
        """
        # Convert the date format
        try:
            if '-' in date_str:
                # "2025-12-30" -> "12/30"
                dt = datetime.strptime(date_str, '%Y-%m-%d')
                target_date = dt.strftime('%m/%d')
                year_month = dt.strftime('%Y-%m')
                month_name = dt.strftime('%m月')  # "12月"
            else:
                # Already in "12/30" form
                target_date = date_str
                # Assume the current year
                current_year = datetime.now().year
                month = int(date_str.split('/')[0])
                year_month = f"{current_year}-{month:02d}"
                month_name = f"{month}月"
        except ValueError:
            target_date = date_str
            year_month = datetime.now().strftime('%Y-%m')
            month_name = datetime.now().strftime('%m月')

        # Try the cache first
        if use_cache:
            cached_data = self.cache.load()
            cache_key = f"{year_month}_{target_date}"
            if cached_data and cache_key in cached_data:
                logger.info(f"从缓存获取 {cache_key} 的排班信息")
                return cached_data[cache_key]

        # Fetch sheet info
        sheets = self.client.get_sheets_info()
        if not sheets:
            logger.error("未获取到表格信息")
            return {
                'day_shift': '',
                'night_shift': '',
                'day_shift_list': [],
                'night_shift_list': []
            }

        # Pick the sheet matching the month
        sheet_id = None
        sheet_title = None

        # Prefer a month-named sheet such as "12月"
        for sheet in sheets:
            title = sheet.get('title', '')
            if month_name in title:
                sheet_id = sheet['sheet_id']
                sheet_title = title
                logger.info(f"找到月份表格: {title} (ID: {sheet_id})")
                break

        # Fall back to the first sheet if no month sheet was found
        if not sheet_id and sheets:
            sheet_id = sheets[0]['sheet_id']
            sheet_title = sheets[0]['title']
            logger.warning(f"未找到 {month_name} 表格,使用第一个表格: {sheet_title}")

        if not sheet_id:
            logger.error("未找到可用的表格")
            return {
                'day_shift': '',
                'night_shift': '',
                'day_shift_list': [],
                'night_shift_list': []
            }

        # Fetch the sheet data
        sheet_data = self.client.get_sheet_data(sheet_id)
        if not sheet_data:
            logger.error("未获取到表格数据")
            return {
                'day_shift': '',
                'night_shift': '',
                'day_shift_list': [],
                'night_shift_list': []
            }

        values = sheet_data.get('valueRange', {}).get('values', [])
        if not values:
            logger.error("表格数据为空")
            return {
                'day_shift': '',
                'night_shift': '',
                'day_shift_list': [],
                'night_shift_list': []
            }

        # Parse the data
        result = self.parser.parse(values, target_date)

        # Update the cache
        if use_cache:
            cached_data = self.cache.load() or {}
            cache_key = f"{year_month}_{target_date}"
            cached_data[cache_key] = result
            self.cache.save(cached_data)

        return result

    def get_schedule_for_today(self) -> Dict:
        """Get today's schedule."""
        today = datetime.now().strftime('%Y-%m-%d')
        return self.get_schedule_for_date(today)

    def get_schedule_for_tomorrow(self) -> Dict:
        """Get tomorrow's schedule."""
        tomorrow = (datetime.now() + timedelta(days=1)).strftime('%Y-%m-%d')
        return self.get_schedule_for_date(tomorrow)


if __name__ == '__main__':
    # Test code
    import sys

    # Set up logging
    logging.basicConfig(level=logging.INFO)

    # Configuration is read from environment variables
    manager = FeishuScheduleManager()

    if len(sys.argv) > 1:
        date_str = sys.argv[1]
    else:
        date_str = datetime.now().strftime('%Y-%m-%d')

    print(f"获取 {date_str} 的排班信息...")
    schedule = manager.get_schedule_for_date(date_str)

    print(f"白班人员: {schedule['day_shift']}")
    print(f"夜班人员: {schedule['night_shift']}")
    print(f"白班列表: {schedule['day_shift_list']}")
    print(f"夜班列表: {schedule['night_shift_list']}")
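The parse flow above does two things: locate the target date's column in the header row, then bucket each person by the 白/夜 marker in that column. The core logic can be sketched standalone (the helper name is illustrative; this version matches headers literally rather than normalizing date formats):

```python
from typing import Dict, List, Optional


def parse_shifts(values: List[List[str]], target_date: str) -> Dict[str, List[str]]:
    # values[0] is the header row: ["姓名", "12月1日", ...]; one person per later row.
    headers = values[0]
    col: Optional[int] = next(
        (i for i, h in enumerate(headers) if h == target_date), None)
    day, night = [], []
    if col is not None:
        for row in values[1:]:
            name = row[0] if row else ''
            # Rows may be shorter than the header; treat missing cells as empty.
            shift = row[col] if col < len(row) else ''
            if name and shift == '白':
                day.append(name)
            elif name and shift == '夜':
                night.append(name)
    return {'day_shift_list': day, 'night_shift_list': night}
```

Guarding every cell access with a length check matters here: Feishu drops trailing empty cells, so rows are frequently ragged.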
15
src/feishu/__init__.py
Normal file
@@ -0,0 +1,15 @@
#!/usr/bin/env python3
"""
Feishu package.
Provides a unified Feishu API interface.
"""
from src.feishu.client import FeishuSheetsClient, FeishuClientError
from src.feishu.parser import ScheduleDataParser
from src.feishu.manager import FeishuScheduleManager

__all__ = [
    'FeishuSheetsClient',
    'FeishuClientError',
    'ScheduleDataParser',
    'FeishuScheduleManager'
]
182
src/feishu/client.py
Normal file
@@ -0,0 +1,182 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
飞书表格 API 客户端模块
|
||||
统一版本,支持月度表格和年度表格
|
||||
"""
|
||||
import requests
|
||||
from typing import Dict, List, Optional
|
||||
import logging
|
||||
|
||||
from src.config import config
|
||||
from src.logging_config import get_logger
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class FeishuClientError(Exception):
|
||||
"""飞书客户端异常基类"""
|
||||
pass
|
||||
|
||||
|
||||
class FeishuSheetsClient:
|
||||
"""飞书表格 API 客户端"""
|
||||
|
||||
def __init__(self, base_url: Optional[str] = None, token: Optional[str] = None,
|
||||
spreadsheet_token: Optional[str] = None):
|
||||
"""
|
||||
初始化客户端
|
||||
|
||||
参数:
|
||||
base_url: 飞书 API 基础URL,如果为None则使用配置
|
||||
token: Bearer 认证令牌,如果为None则使用配置
|
||||
spreadsheet_token: 表格 token,如果为None则使用配置
|
||||
"""
|
||||
self.base_url = (base_url or config.FEISHU_BASE_URL).rstrip('/')
|
||||
self.spreadsheet_token = spreadsheet_token or config.FEISHU_SPREADSHEET_TOKEN
|
||||
self.token = token or config.FEISHU_TOKEN
|
||||
|
||||
self.headers = {
|
||||
'Authorization': f'Bearer {self.token}',
|
||||
'Content-Type': 'application/json',
|
||||
'Accept': 'application/json'
|
||||
}
|
||||
|
||||
# 使用 Session 重用连接
|
||||
self.session = requests.Session()
|
||||
self.session.headers.update(self.headers)
|
||||
self.session.timeout = config.REQUEST_TIMEOUT
|
||||
|
||||
logger.debug(f"飞书客户端初始化完成,基础URL: {self.base_url}")
|
||||
|
||||
def get_sheets_info(self) -> List[Dict[str, str]]:
|
||||
"""
|
||||
获取所有表格信息(sheet_id 和 title)
|
||||
|
||||
        返回:
            表格信息列表 [{'sheet_id': 'xxx', 'title': 'xxx'}, ...]

        异常:
            requests.exceptions.RequestException: 网络请求失败
            ValueError: API返回错误
        """
        url = f'{self.base_url}/spreadsheets/{self.spreadsheet_token}/sheets/query'

        try:
            response = self.session.get(url, timeout=config.REQUEST_TIMEOUT)
            response.raise_for_status()
            data = response.json()

            if data.get('code') != 0:
                error_msg = f"飞书API错误: {data.get('msg')}"
                logger.error(error_msg)
                raise ValueError(error_msg)

            sheets = data.get('data', {}).get('sheets', [])
            result = []
            for sheet in sheets:
                result.append({
                    'sheet_id': sheet.get('sheet_id'),
                    'title': sheet.get('title')
                })

            logger.info(f"获取到 {len(result)} 个表格")
            return result

        except requests.exceptions.RequestException as e:
            logger.error(f"获取表格信息失败: {e}")
            raise
        except Exception as e:
            logger.error(f"解析表格信息失败: {e}")
            raise

    def get_sheet_data(self, sheet_id: str, range_: Optional[str] = None) -> Dict:
        """
        获取指定表格的数据

        参数:
            sheet_id: 表格ID
            range_: 数据范围,如果为None则使用配置

        返回:
            飞书API返回的原始数据,包含revision版本号

        异常:
            requests.exceptions.RequestException: 网络请求失败
            ValueError: API返回错误
        """
        if range_ is None:
            range_ = config.SHEET_RANGE

        # 注意:获取表格数据使用 v2 API,而不是 v3
        url = f'{self.base_url.replace("/v3", "/v2")}/spreadsheets/{self.spreadsheet_token}/values/{sheet_id}!{range_}'
        params = {
            'valueRenderOption': 'ToString',
            'dateTimeRenderOption': 'FormattedString'
        }

        try:
            response = self.session.get(url, params=params, timeout=config.REQUEST_TIMEOUT)
            response.raise_for_status()
            data = response.json()

            if data.get('code') != 0:
                error_msg = f"飞书API错误: {data.get('msg')}"
                logger.error(error_msg)
                raise ValueError(error_msg)

            logger.debug(f"获取表格数据成功: {sheet_id}, 范围: {range_}")
            return data.get('data', {})

        except requests.exceptions.RequestException as e:
            logger.error(f"获取表格数据失败: {e}, sheet_id: {sheet_id}")
            raise
        except Exception as e:
            logger.error(f"解析表格数据失败: {e}, sheet_id: {sheet_id}")
            raise

    def test_connection(self) -> bool:
        """
        测试飞书连接是否正常

        返回:
            连接是否正常
        """
        try:
            sheets = self.get_sheets_info()
            if sheets:
                logger.info(f"飞书连接测试成功,找到 {len(sheets)} 个表格")
                return True
            else:
                logger.warning("飞书连接测试成功,但未找到表格")
                return False
        except Exception as e:
            logger.error(f"飞书连接测试失败: {e}")
            return False


if __name__ == '__main__':
    # 测试代码
    import sys

    # 设置日志级别
    logging.basicConfig(level=logging.INFO)

    # 测试连接
    client = FeishuSheetsClient()

    if client.test_connection():
        print("飞书连接测试成功")

        # 获取表格信息
        sheets = client.get_sheets_info()
        for sheet in sheets[:3]:  # 只显示前3个
            print(f"表格: {sheet['title']} (ID: {sheet['sheet_id']})")

        if sheets:
            # 获取第一个表格的数据
            sheet_id = sheets[0]['sheet_id']
            data = client.get_sheet_data(sheet_id, 'A1:C5')
            print(f"获取到表格数据,版本: {data.get('revision', '未知')}")
    else:
        print("飞书连接测试失败")
        sys.exit(1)
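The comment in `get_sheet_data` notes that sheet metadata is read from the v3 API while cell values are read from the v2 API, derived by swapping the version segment of the configured base URL. A minimal sketch of that URL derivation (the tokens below are made-up placeholders, not real credentials):

```python
def sheets_query_url(base_v3: str, spreadsheet_token: str) -> str:
    # Sheet metadata (sheet_id / title) is served by the v3 endpoint.
    return f'{base_v3}/spreadsheets/{spreadsheet_token}/sheets/query'


def values_url(base_v3: str, spreadsheet_token: str, sheet_id: str, range_: str) -> str:
    # Cell values are served by the v2 endpoint; the client derives the
    # v2 base by replacing the version segment of the v3 base URL.
    return f'{base_v3.replace("/v3", "/v2")}/spreadsheets/{spreadsheet_token}/values/{sheet_id}!{range_}'


base = 'https://open.feishu.cn/open-apis/sheets/v3'
print(values_url(base, 'SPREADSHEET_TOKEN', 'SHEET_ID', 'A1:C5'))
```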
323 src/feishu/manager.py Normal file
@@ -0,0 +1,323 @@
#!/usr/bin/env python3
"""
飞书排班管理器模块
统一入口,使用数据库存储和缓存
"""
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple
import logging

from src.config import config
from src.logging_config import get_logger
from src.feishu.client import FeishuSheetsClient
from src.feishu.parser import ScheduleDataParser
from src.database.schedules import ScheduleDatabase

logger = get_logger(__name__)


class FeishuScheduleManager:
    """飞书排班管理器(统一入口)"""

    def __init__(self, base_url: Optional[str] = None, token: Optional[str] = None,
                 spreadsheet_token: Optional[str] = None, db_path: Optional[str] = None):
        """
        初始化管理器

        参数:
            base_url: 飞书API基础URL,如果为None则使用配置
            token: 飞书API令牌,如果为None则使用配置
            spreadsheet_token: 表格token,如果为None则使用配置
            db_path: 数据库路径,如果为None则使用配置
        """
        # 检查配置是否完整
        self._check_config(token, spreadsheet_token)

        # 初始化组件
        self.client = FeishuSheetsClient(base_url, token, spreadsheet_token)
        self.parser = ScheduleDataParser()
        self.db = ScheduleDatabase(db_path)

        logger.info("飞书排班管理器初始化完成")

    def _check_config(self, token: Optional[str], spreadsheet_token: Optional[str]) -> None:
        """检查必要配置"""
        if not token and not config.FEISHU_TOKEN:
            logger.warning("飞书令牌未配置,排班功能将不可用")

        if not spreadsheet_token and not config.FEISHU_SPREADSHEET_TOKEN:
            logger.warning("飞书表格令牌未配置,排班功能将不可用")

    def _select_sheet_for_date(self, sheets: List[Dict[str, str]], target_year_month: str) -> Optional[Dict[str, str]]:
        """
        为指定日期选择最合适的表格

        参数:
            sheets: 表格列表
            target_year_month: 目标年月,格式 "2025-12"

        返回:
            选中的表格信息,未找到返回None
        """
        if not sheets:
            logger.error("表格列表为空")
            return None

        # 提取年份和月份
        try:
            year = target_year_month[:4]
            month = target_year_month[5:7].lstrip('0')
        except (IndexError, ValueError) as e:
            logger.error(f"解析年月失败: {target_year_month}, 错误: {e}")
            return None

        # 对于2026年,优先使用年度表格
        if year == '2026':
            # 查找年度表格,如 "2026年排班表"
            year_name = f"{year}年"
            for sheet in sheets:
                title = sheet.get('title', '')
                if year_name in title and '排班表' in title:
                    logger.info(f"找到2026年年度表格: {title}")
                    return sheet

        # 优先查找月份表格,如 "12月"
        month_name = f"{int(month)}月"
        for sheet in sheets:
            title = sheet.get('title', '')
            if month_name in title:
                logger.info(f"找到月份表格: {title}")
                return sheet

        # 查找年度表格,如 "2026年排班表"
        year_name = f"{year}年"
        for sheet in sheets:
            title = sheet.get('title', '')
            if year_name in title and '排班表' in title:
                logger.info(f"找到年度表格: {title}")
                return sheet

        # 如果没有找到匹配的表格,使用第一个表格
        logger.warning(f"未找到 {target_year_month} 的匹配表格,使用第一个表格: {sheets[0]['title']}")
        return sheets[0]

    def get_schedule_for_date(self, date_str: str) -> Dict[str, any]:
        """
        获取指定日期的排班信息

        参数:
            date_str: 日期字符串,格式 "2025-12-30"

        返回:
            排班信息字典

        异常:
            ValueError: 日期格式无效
            Exception: 其他错误
        """
        try:
            # 解析日期
            dt = datetime.strptime(date_str, '%Y-%m-%d')

            # 生成两种格式的日期字符串,用于匹配不同表格
            target_date_mm_dd = dt.strftime('%m/%d')  # "01/01" 用于月度表格
            target_date_chinese = f"{dt.month}月{dt.day}日"  # "1月1日" 用于年度表格
            target_year_month = dt.strftime('%Y-%m')  # "2025-12"

            logger.info(f"获取 {date_str} 的排班信息 (格式: {target_date_mm_dd}/{target_date_chinese})")

            # 1. 首先尝试从数据库获取
            cached_schedule = self.db.get_schedule(date_str)
            if cached_schedule:
                logger.info(f"从数据库获取 {date_str} 的排班信息")
                return self._format_db_result(cached_schedule)

            # 2. 数据库中没有,需要从飞书获取
            logger.info(f"数据库中没有 {date_str} 的排班信息,从飞书获取")

            # 获取表格信息
            sheets = self.client.get_sheets_info()
            if not sheets:
                logger.error("未获取到表格信息")
                return self._empty_result()

            # 选择最合适的表格
            selected_sheet = self._select_sheet_for_date(sheets, target_year_month)
            if not selected_sheet:
                logger.error("未找到合适的表格")
                return self._empty_result()

            sheet_id = selected_sheet['sheet_id']
            sheet_title = selected_sheet['title']

            # 3. 获取表格数据
            sheet_data = self.client.get_sheet_data(sheet_id)
            if not sheet_data:
                logger.error("未获取到表格数据")
                return self._empty_result()

            values = sheet_data.get('valueRange', {}).get('values', [])
            revision = sheet_data.get('revision', 0)

            if not values:
                logger.error("表格数据为空")
                return self._empty_result()

            # 4. 检查表格是否有更新
            need_update = self.db.check_sheet_update(
                sheet_id, sheet_title, revision, {'values': values}
            )

            if not need_update and cached_schedule:
                # 表格无更新,且数据库中有缓存,直接返回
                logger.info(f"表格无更新,使用数据库缓存")
                return self._format_db_result(cached_schedule)

            # 5. 解析数据 - 根据表格类型选择合适的日期格式
            # 如果是年度表格,使用中文日期格式;否则使用mm/dd格式
            if '年' in sheet_title and '排班表' in sheet_title:
                target_date = target_date_chinese  # "1月1日"
            else:
                target_date = target_date_mm_dd  # "01/01"

            logger.info(f"使用日期格式: {target_date} 解析表格: {sheet_title}")
            result = self.parser.parse(values, target_date, sheet_title)

            # 6. 保存到数据库
            if result['day_shift'] or result['night_shift']:
                self.db.save_schedule(date_str, result, sheet_id, sheet_title)
                logger.info(f"已保存 {date_str} 的排班信息到数据库")

            return result

        except ValueError as e:
            logger.error(f"日期格式无效: {date_str}, 错误: {e}")
            raise
        except Exception as e:
            logger.error(f"获取排班信息失败: {e}")
            # 降级处理:返回空值
            return self._empty_result()

    def get_schedule_for_today(self) -> Dict[str, any]:
        """获取今天的排班信息"""
        today = datetime.now().strftime('%Y-%m-%d')
        return self.get_schedule_for_date(today)

    def get_schedule_for_tomorrow(self) -> Dict[str, any]:
        """获取明天的排班信息"""
        tomorrow = (datetime.now() + timedelta(days=1)).strftime('%Y-%m-%d')
        return self.get_schedule_for_date(tomorrow)

    def refresh_all_schedules(self, days: Optional[int] = None):
        """
        刷新未来指定天数的排班信息

        参数:
            days: 刷新未来多少天的排班信息,如果为None则使用配置
        """
        if days is None:
            days = config.SCHEDULE_REFRESH_DAYS

        logger.info(f"开始刷新未来 {days} 天的排班信息")

        today = datetime.now()
        success_count = 0
        error_count = 0

        for i in range(days):
            date = (today + timedelta(days=i)).strftime('%Y-%m-%d')
            try:
                logger.debug(f"刷新 {date} 的排班信息...")
                self.get_schedule_for_date(date)
                success_count += 1
            except Exception as e:
                logger.error(f"刷新 {date} 的排班信息失败: {e}")
                error_count += 1

        logger.info(f"排班信息刷新完成,成功: {success_count}, 失败: {error_count}")

    def get_schedule_by_range(self, start_date: str, end_date: str) -> List[Dict[str, any]]:
        """
        获取日期范围内的排班信息

        参数:
            start_date: 开始日期 (YYYY-MM-DD)
            end_date: 结束日期 (YYYY-MM-DD)

        返回:
            排班信息列表
        """
        try:
            # 验证日期格式
            datetime.strptime(start_date, '%Y-%m-%d')
            datetime.strptime(end_date, '%Y-%m-%d')

            return self.db.get_schedule_by_range(start_date, end_date)

        except ValueError as e:
            logger.error(f"日期格式无效: {e}")
            return []
        except Exception as e:
            logger.error(f"获取排班范围失败: {e}")
            return []

    def test_connection(self) -> bool:
        """测试飞书连接是否正常"""
        return self.client.test_connection()

    def get_stats(self) -> Dict[str, any]:
        """获取排班数据库统计信息"""
        return self.db.get_stats()

    def _empty_result(self) -> Dict[str, any]:
        """返回空结果"""
        return {
            'day_shift': '',
            'night_shift': '',
            'day_shift_list': [],
            'night_shift_list': []
        }

    def _format_db_result(self, db_result: Dict[str, any]) -> Dict[str, any]:
        """格式化数据库结果"""
        return {
            'day_shift': db_result['day_shift'],
            'night_shift': db_result['night_shift'],
            'day_shift_list': db_result['day_shift_list'],
            'night_shift_list': db_result['night_shift_list']
        }


if __name__ == '__main__':
    # 测试代码
    import sys

    # 设置日志
    logging.basicConfig(level=logging.INFO)

    # 初始化管理器
    manager = FeishuScheduleManager()

    # 测试连接
    if not manager.test_connection():
        print("飞书连接测试失败")
        sys.exit(1)

    print("飞书连接测试成功")

    # 测试获取今天和明天的排班
    today_schedule = manager.get_schedule_for_today()
    print(f"今天排班: 白班={today_schedule['day_shift']}, 夜班={today_schedule['night_shift']}")

    tomorrow_schedule = manager.get_schedule_for_tomorrow()
    print(f"明天排班: 白班={tomorrow_schedule['day_shift']}, 夜班={tomorrow_schedule['night_shift']}")

    # 测试统计
    stats = manager.get_stats()
    print(f"排班统计: {stats}")

    # 测试范围查询(最近7天)
    end_date = datetime.now().strftime('%Y-%m-%d')
    start_date = (datetime.now() - timedelta(days=7)).strftime('%Y-%m-%d')
    schedules = manager.get_schedule_by_range(start_date, end_date)
    print(f"最近7天排班记录: {len(schedules)} 条")
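The sheet-selection order implemented in `_select_sheet_for_date` can be sketched standalone. This is a simplified version under stated assumptions: it keeps only the general path (monthly sheet first, then yearly sheet, then first-sheet fallback) and omits the 2026-first special case and logging; `select_sheet` is a hypothetical helper name, not part of the module.

```python
def select_sheet(sheets, year: str, month: str):
    """Simplified sketch of the selection precedence:
    a monthly sheet like "12月" wins, then a yearly sheet like
    "2025年排班表", else fall back to the first sheet."""
    month_name = f"{int(month)}月"
    for sheet in sheets:
        if month_name in sheet.get('title', ''):
            return sheet

    year_name = f"{year}年"
    for sheet in sheets:
        title = sheet.get('title', '')
        if year_name in title and '排班表' in title:
            return sheet

    return sheets[0] if sheets else None


sheets = [{'sheet_id': 'a', 'title': '2025年排班表'},
          {'sheet_id': 'b', 'title': '12月'}]
print(select_sheet(sheets, '2025', '12')['title'])  # monthly sheet wins
```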
339 src/feishu/parser.py Normal file
@@ -0,0 +1,339 @@
#!/usr/bin/env python3
"""
排班数据解析器模块
支持月度表格和年度表格解析
"""
import re
from typing import Dict, List, Optional, Tuple
import logging

from src.logging_config import get_logger

logger = get_logger(__name__)


class ScheduleDataParser:
    """排班数据解析器(支持月度表格和年度表格)"""

    @staticmethod
    def _parse_chinese_date(date_str: str) -> Optional[str]:
        """
        解析中文日期格式

        参数:
            date_str: 中文日期,如 "12月30日" 或 "12/30" 或 "12月1日" 或 "1月1日"

        返回:
            标准化日期字符串 "M月D日" (不补零)

        异常:
            ValueError: 日期格式无效
        """
        if not date_str or not isinstance(date_str, str):
            return None

        date_str = date_str.strip()

        try:
            # 如果是 "12/30" 格式
            if '/' in date_str:
                month, day = date_str.split('/')
                # 移除可能的空格和前导零
                month = month.strip().lstrip('0')
                day = day.strip().lstrip('0')
                if not month.isdigit() or not day.isdigit():
                    raise ValueError(f"日期格式无效: {date_str}")
                return f"{int(month)}月{int(day)}日"

            # 如果是 "12月30日" 或 "1月1日" 格式
            if '月' in date_str and '日' in date_str:
                # 移除前导零,如 "01月01日" -> "1月1日"
                parts = date_str.split('月')
                if len(parts) == 2:
                    month_part = parts[0].lstrip('0')
                    day_part = parts[1].rstrip('日').lstrip('0')
                    if not month_part or not day_part:
                        raise ValueError(f"日期格式无效: {date_str}")
                    return f"{month_part}月{day_part}日"
                return date_str

            # 如果是 "12月1日" 格式(已经包含"日"字)
            if '月' in date_str:
                # 检查是否已经有"日"字
                if '日' not in date_str:
                    return f"{date_str}日"
                return date_str

            # 如果是纯数字,尝试解析
            if date_str.isdigit() and len(date_str) == 4:
                # 假设是 "1230" 格式
                month = date_str[:2].lstrip('0')
                day = date_str[2:].lstrip('0')
                return f"{month}月{day}日"

            return None

        except Exception as e:
            logger.warning(f"解析日期失败: {date_str}, 错误: {e}")
            return None

    @staticmethod
    def _find_date_column_index(headers: List[str], target_date: str) -> Optional[int]:
        """
        在表头中查找目标日期对应的列索引

        参数:
            headers: 表头行 ["姓名", "12月1日", "12月2日", ...]
            target_date: 目标日期 "12月30日"

        返回:
            列索引(从0开始),未找到返回None
        """
        if not headers or not target_date:
            return None

        # 标准化目标日期
        target_std = ScheduleDataParser._parse_chinese_date(target_date)
        if not target_std:
            logger.warning(f"无法标准化目标日期: {target_date}")
            return None

        # 遍历表头查找匹配的日期
        for i, header in enumerate(headers):
            if not header:
                continue

            header_std = ScheduleDataParser._parse_chinese_date(header)
            if header_std == target_std:
                logger.debug(f"找到日期列: {target_date} -> {header} (索引: {i})")
                return i

        logger.warning(f"未找到日期列: {target_date}, 表头: {headers}")
        return None

    def parse_monthly_sheet(self, values: List[List[str]], target_date: str) -> Dict[str, any]:
        """
        解析月度表格数据(如12月表格)

        参数:
            values: 飞书表格返回的二维数组
            target_date: 目标日期(格式: "12月30日" 或 "12/30")

        返回:
            排班信息字典
        """
        if not values or len(values) < 2:
            logger.warning("表格数据为空或不足")
            return self._empty_result()

        # 第一行是表头
        headers = values[0]
        date_column_index = self._find_date_column_index(headers, target_date)

        if date_column_index is None:
            logger.warning(f"未找到日期列: {target_date}")
            return self._empty_result()

        # 收集白班和夜班人员
        day_shift_names = []
        night_shift_names = []

        # 从第二行开始是人员数据
        for row_idx, row in enumerate(values[1:], start=2):
            if len(row) <= date_column_index:
                continue

            name = row[0] if row else ''
            shift = row[date_column_index] if date_column_index < len(row) else ''

            if not name or not shift:
                continue

            # 清理班次值
            shift = shift.strip()
            if shift == '白':
                day_shift_names.append(name.strip())
            elif shift == '夜':
                night_shift_names.append(name.strip())
            elif shift:  # 其他班次类型
                logger.debug(f"忽略未知班次类型: {shift} (行: {row_idx})")

        return self._format_result(day_shift_names, night_shift_names)

    def parse_yearly_sheet(self, values: List[List[str]], target_date: str) -> Dict[str, any]:
        """
        解析年度表格数据(如2026年排班表)

        参数:
            values: 飞书表格返回的二维数组
            target_date: 目标日期(格式: "12月30日" 或 "12/30")

        返回:
            排班信息字典
        """
        if not values:
            logger.warning("年度表格数据为空")
            return self._empty_result()

        # 查找目标月份的数据块
        target_month = target_date.split('月')[0] if '月' in target_date else ''
        if not target_month:
            logger.warning(f"无法从 {target_date} 提取月份")
            return self._empty_result()

        # 在年度表格中查找对应的月份块
        current_block_start = -1
        current_month = ''

        for i, row in enumerate(values):
            if not row:
                continue

            first_cell = str(row[0]) if row else ''

            # 检查是否是月份标题行,如 "福州港1月排班表"
            if '排班表' in first_cell and '月' in first_cell:
                # 提取月份数字
                month_match = re.search(r'(\d+)月', first_cell)
                if month_match:
                    current_month = month_match.group(1).lstrip('0')
                    current_block_start = i
                    logger.debug(f"找到月份块: {current_month}月 (行: {i+1})")

            # 如果找到目标月份,检查下一行是否是表头行
            if current_month == target_month and i == current_block_start + 1:
                # 当前行是表头行
                headers = row
                date_column_index = self._find_date_column_index(headers, target_date)

                if date_column_index is None:
                    logger.warning(f"在年度表格中未找到日期列: {target_date}")
                    return self._empty_result()

                # 收集人员数据(从表头行的下一行开始)
                day_shift_names = []
                night_shift_names = []

                for j in range(i + 1, len(values)):
                    person_row = values[j]
                    if not person_row:
                        # 遇到空行,继续检查下一行
                        continue

                    # 检查是否是下一个月份块的开始
                    if person_row[0] and isinstance(person_row[0], str) and '排班表' in person_row[0] and '月' in person_row[0]:
                        break

                    # 跳过星期行(第一列为空的行)
                    if not person_row[0]:
                        continue

                    if len(person_row) <= date_column_index:
                        continue

                    name = person_row[0] if person_row else ''
                    shift = person_row[date_column_index] if date_column_index < len(person_row) else ''

                    if not name or not shift:
                        continue

                    # 清理班次值
                    shift = shift.strip()
                    if shift == '白':
                        day_shift_names.append(name.strip())
                    elif shift == '夜':
                        night_shift_names.append(name.strip())

                return self._format_result(day_shift_names, night_shift_names)

        logger.warning(f"在年度表格中未找到 {target_month}月 的数据块")
        return self._empty_result()

    def parse(self, values: List[List[str]], target_date: str, sheet_title: str = '') -> Dict[str, any]:
        """
        解析排班数据,自动判断表格类型

        参数:
            values: 飞书表格返回的二维数组
            target_date: 目标日期(格式: "12月30日" 或 "12/30")
            sheet_title: 表格标题,用于判断表格类型

        返回:
            排班信息字典
        """
        # 根据表格标题判断表格类型
        if '年' in sheet_title and '排班表' in sheet_title:
            # 年度表格
            logger.info(f"使用年度表格解析器: {sheet_title}")
            return self.parse_yearly_sheet(values, target_date)
        else:
            # 月度表格
            logger.info(f"使用月度表格解析器: {sheet_title}")
            return self.parse_monthly_sheet(values, target_date)

    def _empty_result(self) -> Dict[str, any]:
        """返回空结果"""
        return {
            'day_shift': '',
            'night_shift': '',
            'day_shift_list': [],
            'night_shift_list': []
        }

    def _format_result(self, day_shift_names: List[str], night_shift_names: List[str]) -> Dict[str, any]:
        """格式化结果"""
        # 去重并排序
        day_shift_names = sorted(set(day_shift_names))
        night_shift_names = sorted(set(night_shift_names))

        # 格式化输出
        day_shift_str = '、'.join(day_shift_names) if day_shift_names else ''
        night_shift_str = '、'.join(night_shift_names) if night_shift_names else ''

        return {
            'day_shift': day_shift_str,
            'night_shift': night_shift_str,
            'day_shift_list': day_shift_names,
            'night_shift_list': night_shift_names
        }


if __name__ == '__main__':
    # 测试代码
    import sys

    # 设置日志
    logging.basicConfig(level=logging.DEBUG)

    parser = ScheduleDataParser()

    # 测试日期解析
    test_dates = ["12/30", "12月30日", "1月1日", "01/01", "1230", "无效日期"]
    for date in test_dates:
        parsed = parser._parse_chinese_date(date)
        print(f"解析 '{date}' -> '{parsed}'")

    # 测试月度表格解析
    monthly_values = [
        ["姓名", "12月1日", "12月2日", "12月3日"],
        ["张三", "白", "夜", ""],
        ["李四", "夜", "白", "白"],
        ["王五", "", "白", "夜"]
    ]

    result = parser.parse_monthly_sheet(monthly_values, "12月2日")
    print(f"\n月度表格解析结果: {result}")

    # 测试年度表格解析
    yearly_values = [
        ["福州港2026年排班表"],
        ["姓名", "1月1日", "1月2日", "1月3日"],
        ["张三", "白", "夜", ""],
        ["李四", "夜", "白", "白"],
        ["福州港2月排班表"],
        ["姓名", "2月1日", "2月2日"],
        ["王五", "白", "夜"]
    ]

    result = parser.parse_yearly_sheet(yearly_values, "1月2日")
    print(f"年度表格解析结果: {result}")
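The core idea behind `_parse_chinese_date` is that a date may appear as "01/01" in a monthly sheet header but as "1月1日" in a yearly one, so both are normalized to the same zero-free form before comparison. A condensed standalone sketch of that normalization (`normalize_date` is a hypothetical name; the 4-digit "1230" form, validation, and logging of the real method are omitted):

```python
def normalize_date(date_str: str):
    # Both "01/01" and "01月01日" normalize to "1月1日", so header
    # lookup can compare dates written in either style.
    date_str = date_str.strip()
    if '/' in date_str:
        month, day = date_str.split('/')
        return f"{int(month)}月{int(day)}日"
    if '月' in date_str and '日' in date_str:
        month, day = date_str.split('月')
        return f"{int(month)}月{int(day.rstrip('日'))}日"
    return None  # not a recognizable date cell (e.g. the "姓名" column)


print(normalize_date('01/01'))  # -> 1月1日
```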
642 src/feishu_v2.py
@@ -1,642 +0,0 @@
#!/usr/bin/env python3
|
||||
"""
|
||||
飞书表格 API 客户端模块 v2
|
||||
支持数据库存储和2026年全年排班表
|
||||
"""
|
||||
import requests
|
||||
import json
|
||||
import os
|
||||
import time
|
||||
from datetime import datetime, timedelta
|
||||
from typing import Dict, List, Optional, Tuple
|
||||
import logging
|
||||
import hashlib
|
||||
|
||||
from src.schedule_database import ScheduleDatabase
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class FeishuSheetsClient:
|
||||
"""飞书表格 API 客户端"""
|
||||
|
||||
def __init__(self, base_url: str, token: str, spreadsheet_token: str):
|
||||
"""
|
||||
初始化客户端
|
||||
|
||||
参数:
|
||||
base_url: 飞书 API 基础URL
|
||||
token: Bearer 认证令牌
|
||||
spreadsheet_token: 表格 token
|
||||
"""
|
||||
self.base_url = base_url.rstrip('/')
|
||||
self.spreadsheet_token = spreadsheet_token
|
||||
self.headers = {
|
||||
'Authorization': f'Bearer {token}',
|
||||
'Content-Type': 'application/json',
|
||||
'Accept': 'application/json'
|
||||
}
|
||||
|
||||
def get_sheets_info(self) -> List[Dict]:
|
||||
"""
|
||||
获取所有表格信息(sheet_id 和 title)
|
||||
|
||||
返回:
|
||||
表格信息列表 [{'sheet_id': 'xxx', 'title': 'xxx'}, ...]
|
||||
"""
|
||||
url = f'{self.base_url}/spreadsheets/{self.spreadsheet_token}/sheets/query'
|
||||
|
||||
try:
|
||||
response = requests.get(url, headers=self.headers, timeout=30)
|
||||
response.raise_for_status()
|
||||
data = response.json()
|
||||
|
||||
if data.get('code') != 0:
|
||||
logger.error(f"飞书API错误: {data.get('msg')}")
|
||||
return []
|
||||
|
||||
sheets = data.get('data', {}).get('sheets', [])
|
||||
result = []
|
||||
for sheet in sheets:
|
||||
result.append({
|
||||
'sheet_id': sheet.get('sheet_id'),
|
||||
'title': sheet.get('title')
|
||||
})
|
||||
|
||||
logger.info(f"获取到 {len(result)} 个表格")
|
||||
return result
|
||||
|
||||
except requests.exceptions.RequestException as e:
|
||||
logger.error(f"获取表格信息失败: {e}")
|
||||
return []
|
||||
except Exception as e:
|
||||
logger.error(f"解析表格信息失败: {e}")
|
||||
return []
|
||||
|
||||
def get_sheet_data(self, sheet_id: str, range_: str = 'A:AF') -> Dict:
|
||||
"""
|
||||
获取指定表格的数据
|
||||
|
||||
参数:
|
||||
sheet_id: 表格ID
|
||||
range_: 数据范围,默认 A:AF (31列)
|
||||
|
||||
返回:
|
||||
飞书API返回的原始数据,包含revision版本号
|
||||
"""
|
||||
# 注意:获取表格数据使用 v2 API,而不是 v3
|
||||
url = f'{self.base_url.replace("/v3", "/v2")}/spreadsheets/{self.spreadsheet_token}/values/{sheet_id}!{range_}'
|
||||
params = {
|
||||
'valueRenderOption': 'ToString',
|
||||
'dateTimeRenderOption': 'FormattedString'
|
||||
}
|
||||
|
||||
try:
|
||||
response = requests.get(url, headers=self.headers, params=params, timeout=30)
|
||||
response.raise_for_status()
|
||||
data = response.json()
|
||||
|
||||
if data.get('code') != 0:
|
||||
logger.error(f"飞书API错误: {data.get('msg')}")
|
||||
return {}
|
||||
|
||||
return data.get('data', {})
|
||||
|
||||
except requests.exceptions.RequestException as e:
|
||||
logger.error(f"获取表格数据失败: {e}")
|
||||
return {}
|
||||
except Exception as e:
|
||||
logger.error(f"解析表格数据失败: {e}")
|
||||
return {}
|
||||
|
||||
|
||||
class ScheduleDataParser:
|
||||
"""排班数据解析器(支持2026年全年排班表)"""
|
||||
|
||||
@staticmethod
|
||||
def _parse_chinese_date(date_str: str) -> Optional[str]:
|
||||
"""
|
||||
解析中文日期格式
|
||||
|
||||
参数:
|
||||
date_str: 中文日期,如 "12月30日" 或 "12/30" 或 "12月1日" 或 "1月1日"
|
||||
|
||||
返回:
|
||||
标准化日期字符串 "M月D日" (不补零)
|
||||
"""
|
||||
if not date_str:
|
||||
return None
|
||||
|
||||
# 如果是 "12/30" 格式
|
||||
if '/' in date_str:
|
||||
try:
|
||||
month, day = date_str.split('/')
|
||||
# 移除可能的空格和前导零
|
||||
month = month.strip().lstrip('0')
|
||||
day = day.strip().lstrip('0')
|
||||
return f"{int(month)}月{int(day)}日"
|
||||
except:
|
||||
return None
|
||||
|
||||
# 如果是 "12月30日" 或 "1月1日" 格式
|
||||
if '月' in date_str and '日' in date_str:
|
||||
# 移除前导零,如 "01月01日" -> "1月1日"
|
||||
parts = date_str.split('月')
|
||||
if len(parts) == 2:
|
||||
month_part = parts[0].lstrip('0')
|
||||
day_part = parts[1].rstrip('日').lstrip('0')
|
||||
return f"{month_part}月{day_part}日"
|
||||
return date_str
|
||||
|
||||
# 如果是 "12月1日" 格式(已经包含"日"字)
|
||||
if '月' in date_str:
|
||||
# 检查是否已经有"日"字
|
||||
if '日' not in date_str:
|
||||
return f"{date_str}日"
|
||||
return date_str
|
||||
|
||||
return None
|
||||
|
||||
@staticmethod
|
||||
def _find_date_column_index(headers: List[str], target_date: str) -> Optional[int]:
|
||||
"""
|
||||
在表头中查找目标日期对应的列索引
|
||||
|
||||
参数:
|
||||
headers: 表头行 ["姓名", "12月1日", "12月2日", ...]
|
||||
target_date: 目标日期 "12月30日"
|
||||
|
||||
返回:
|
||||
列索引(从0开始),未找到返回None
|
||||
"""
|
||||
if not headers or not target_date:
|
||||
return None
|
||||
|
||||
# 标准化目标日期
|
||||
target_std = ScheduleDataParser._parse_chinese_date(target_date)
|
||||
if not target_std:
|
||||
return None
|
||||
|
||||
# 遍历表头查找匹配的日期
|
||||
for i, header in enumerate(headers):
|
||||
header_std = ScheduleDataParser._parse_chinese_date(header)
|
||||
if header_std == target_std:
|
||||
return i
|
||||
|
||||
return None
|
||||
|
||||
def parse_monthly_sheet(self, values: List[List[str]], target_date: str) -> Dict:
|
||||
"""
|
||||
解析月度表格数据(如12月表格)
|
||||
|
||||
参数:
|
||||
values: 飞书表格返回的二维数组
|
||||
target_date: 目标日期(格式: "12月30日" 或 "12/30")
|
||||
|
||||
返回:
|
||||
排班信息字典
|
||||
"""
|
||||
if not values or len(values) < 2:
|
||||
return {
|
||||
'day_shift': '',
|
||||
'night_shift': '',
|
||||
'day_shift_list': [],
|
||||
'night_shift_list': []
|
||||
}
|
||||
|
||||
# 第一行是表头
|
||||
headers = values[0]
|
||||
date_column_index = self._find_date_column_index(headers, target_date)
|
||||
|
||||
if date_column_index is None:
|
||||
logger.warning(f"未找到日期列: {target_date}")
|
||||
return {
|
||||
'day_shift': '',
|
||||
'night_shift': '',
|
||||
'day_shift_list': [],
|
||||
'night_shift_list': []
|
||||
}
|
||||
|
||||
# 收集白班和夜班人员
|
||||
day_shift_names = []
|
||||
night_shift_names = []
|
||||
|
||||
# 从第二行开始是人员数据
|
||||
for row in values[1:]:
|
||||
if len(row) <= date_column_index:
|
||||
continue
|
||||
|
||||
name = row[0] if row else ''
|
||||
shift = row[date_column_index] if date_column_index < len(row) else ''
|
||||
|
||||
if not name or not shift:
|
||||
continue
|
||||
|
||||
if shift == '白':
|
||||
day_shift_names.append(name)
|
||||
elif shift == '夜':
|
||||
night_shift_names.append(name)
|
||||
|
||||
# 格式化输出
|
||||
day_shift_str = '、'.join(day_shift_names) if day_shift_names else ''
|
||||
night_shift_str = '、'.join(night_shift_names) if night_shift_names else ''
|
||||
|
||||
return {
|
||||
'day_shift': day_shift_str,
|
||||
'night_shift': night_shift_str,
|
||||
'day_shift_list': day_shift_names,
|
||||
'night_shift_list': night_shift_names
|
||||
}
|
||||
|
||||
def parse_yearly_sheet(self, values: List[List[str]], target_date: str) -> Dict:
|
||||
"""
|
||||
解析年度表格数据(如2026年排班表)
|
||||
|
||||
参数:
|
||||
values: 飞书表格返回的二维数组
|
||||
target_date: 目标日期(格式: "12月30日" 或 "12/30")
|
||||
|
||||
返回:
|
||||
排班信息字典
|
||||
"""
|
||||
if not values:
|
||||
return {
|
||||
'day_shift': '',
|
||||
'night_shift': '',
|
||||
'day_shift_list': [],
|
||||
'night_shift_list': []
|
||||
}
|
||||
|
||||
# 查找目标月份的数据块
|
||||
target_month = target_date.split('月')[0] if '月' in target_date else ''
|
||||
if not target_month:
|
||||
logger.warning(f"无法从 {target_date} 提取月份")
|
||||
return {
|
||||
'day_shift': '',
|
||||
'night_shift': '',
|
||||
'day_shift_list': [],
|
||||
'night_shift_list': []
|
||||
}
|
||||
|
||||
# 在年度表格中查找对应的月份块
|
||||
current_block_start = -1
|
||||
current_month = ''
|
||||
|
||||
for i, row in enumerate(values):
|
||||
if not row:
|
||||
continue
|
||||
|
||||
first_cell = str(row[0]) if row else ''
|
||||
|
||||
# 检查是否是月份标题行,如 "福州港1月排班表"
|
||||
if '排班表' in first_cell and '月' in first_cell:
|
||||
# 提取月份数字
|
||||
for char in first_cell:
|
||||
if char.isdigit():
|
||||
month_str = ''
|
||||
j = first_cell.index(char)
|
||||
while j < len(first_cell) and first_cell[j].isdigit():
|
||||
month_str += first_cell[j]
|
||||
j += 1
|
||||
if month_str:
|
||||
current_month = month_str
|
||||
current_block_start = i
|
||||
break
|
||||
|
||||
# 如果找到目标月份,检查下一行是否是表头行
|
||||
if current_month == target_month and i == current_block_start + 1:
|
||||
# 当前行是表头行
|
||||
headers = row
|
||||
date_column_index = self._find_date_column_index(headers, target_date)
|
||||
|
||||
if date_column_index is None:
|
||||
logger.warning(f"在年度表格中未找到日期列: {target_date}")
|
||||
return {
|
||||
'day_shift': '',
|
||||
'night_shift': '',
|
||||
'day_shift_list': [],
|
||||
'night_shift_list': []
|
||||
}
|
||||
|
||||
# 收集人员数据(从表头行的下一行开始)
|
||||
day_shift_names = []
|
||||
night_shift_names = []
|
||||
|
||||
for j in range(i + 1, len(values)):
|
||||
person_row = values[j]
|
||||
if not person_row:
|
||||
# 遇到空行,继续检查下一行
|
||||
continue
|
||||
|
||||
# 检查是否是下一个月份块的开始
|
||||
if person_row[0] and isinstance(person_row[0], str) and '排班表' in person_row[0] and '月' in person_row[0]:
|
||||
break
|
||||
|
||||
# 跳过星期行(第一列为空的行)
|
||||
if not person_row[0]:
|
||||
continue
|
||||
|
||||
if len(person_row) <= date_column_index:
|
||||
continue
|
||||
|
||||
name = person_row[0] if person_row else ''
|
||||
shift = person_row[date_column_index] if date_column_index < len(person_row) else ''
|
||||
|
||||
if not name or not shift:
|
||||
continue
|
||||
|
||||
if shift == '白':
|
||||
day_shift_names.append(name)
|
||||
elif shift == '夜':
|
||||
night_shift_names.append(name)
|
||||
|
||||
# 格式化输出
|
||||
day_shift_str = '、'.join(day_shift_names) if day_shift_names else ''
|
||||
night_shift_str = '、'.join(night_shift_names) if night_shift_names else ''
|
||||
|
||||
return {
|
||||
'day_shift': day_shift_str,
|
||||
'night_shift': night_shift_str,
|
||||
'day_shift_list': day_shift_names,
|
||||
'night_shift_list': night_shift_names
|
||||
}
|
||||
|
||||
logger.warning(f"在年度表格中未找到 {target_month}月 的数据块")
|
||||
return {
|
||||
'day_shift': '',
|
||||
'night_shift': '',
|
||||
'day_shift_list': [],
|
||||
'night_shift_list': []
|
||||
}
|
||||
|
||||
def parse(self, values: List[List[str]], target_date: str, sheet_title: str = '') -> Dict:
|
||||
"""
|
||||
解析排班数据,自动判断表格类型
|
||||
|
||||
参数:
|
||||
values: 飞书表格返回的二维数组
|
||||
target_date: 目标日期(格式: "12月30日" 或 "12/30")
|
||||
sheet_title: 表格标题,用于判断表格类型
|
||||
|
||||
返回:
|
||||
排班信息字典
|
||||
"""
|
||||
# 根据表格标题判断表格类型
|
||||
if '年' in sheet_title and '排班表' in sheet_title:
|
||||
# 年度表格
|
||||
logger.info(f"使用年度表格解析器: {sheet_title}")
|
||||
return self.parse_yearly_sheet(values, target_date)
|
||||
else:
|
||||
# 月度表格
|
||||
logger.info(f"使用月度表格解析器: {sheet_title}")
|
||||
return self.parse_monthly_sheet(values, target_date)
|
||||
|
||||
|
||||
class FeishuScheduleManagerV2:
    """飞书排班管理器 v2(使用数据库存储)"""

    def __init__(self, base_url: str = None, token: str = None,
                 spreadsheet_token: str = None):
        """
        初始化管理器

        参数:
            base_url: 飞书API基础URL,从环境变量读取
            token: 飞书API令牌,从环境变量读取
            spreadsheet_token: 表格token,从环境变量读取
        """
        # 从环境变量读取配置
        self.base_url = base_url or os.getenv('FEISHU_BASE_URL', 'https://open.feishu.cn/open-apis/sheets/v3')
        self.token = token or os.getenv('FEISHU_TOKEN', '')
        self.spreadsheet_token = spreadsheet_token or os.getenv('FEISHU_SPREADSHEET_TOKEN', '')

        if not self.token or not self.spreadsheet_token:
            logger.warning("飞书配置不完整,请检查环境变量")

        self.client = FeishuSheetsClient(self.base_url, self.token, self.spreadsheet_token)
        self.parser = ScheduleDataParser()
        self.db = ScheduleDatabase()

    def _select_sheet_for_date(self, sheets: List[Dict], target_year_month: str) -> Optional[Dict]:
        """
        为指定日期选择最合适的表格

        参数:
            sheets: 表格列表
            target_year_month: 目标年月,格式 "2025-12"

        返回:
            选中的表格信息,未找到返回None
        """
        if not sheets:
            return None

        # 提取年份和月份
        year = target_year_month[:4]
        month = target_year_month[5:7]

        # 对于2026年,优先使用年度表格
        if year == '2026':
            # 查找年度表格,如 "2026年排班表"
            year_name = f"{year}年"
            for sheet in sheets:
                title = sheet.get('title', '')
                if year_name in title and '排班表' in title:
                    logger.info(f"找到2026年年度表格: {title}")
                    return sheet

        # 优先查找月份表格,如 "12月"
        month_name = f"{int(month)}月"
        for sheet in sheets:
            title = sheet.get('title', '')
            if month_name in title:
                logger.info(f"找到月份表格: {title}")
                return sheet

        # 查找年度表格,如 "2026年排班表"
        year_name = f"{year}年"
        for sheet in sheets:
            title = sheet.get('title', '')
            if year_name in title and '排班表' in title:
                logger.info(f"找到年度表格: {title}")
                return sheet

        # 如果没有找到匹配的表格,使用第一个表格
        logger.warning(f"未找到 {target_year_month} 的匹配表格,使用第一个表格: {sheets[0]['title']}")
        return sheets[0]

    def get_schedule_for_date(self, date_str: str) -> Dict:
        """
        获取指定日期的排班信息

        参数:
            date_str: 日期字符串,格式 "2025-12-30"

        返回:
            排班信息字典
        """
        try:
            # 解析日期
            dt = datetime.strptime(date_str, '%Y-%m-%d')
            # 生成两种格式的日期字符串,用于匹配不同表格
            target_date_mm_dd = dt.strftime('%m/%d')  # "01/01" 用于月度表格
            target_date_chinese = f"{dt.month}月{dt.day}日"  # "1月1日" 用于年度表格
            target_year_month = dt.strftime('%Y-%m')  # "2025-12"

            logger.info(f"获取 {date_str} 的排班信息 (格式: {target_date_mm_dd}/{target_date_chinese})")

            # 1. 首先尝试从数据库获取
            cached_schedule = self.db.get_schedule(date_str)
            if cached_schedule:
                logger.info(f"从数据库获取 {date_str} 的排班信息")
                return {
                    'day_shift': cached_schedule['day_shift'],
                    'night_shift': cached_schedule['night_shift'],
                    'day_shift_list': cached_schedule['day_shift_list'],
                    'night_shift_list': cached_schedule['night_shift_list']
                }

            # 2. 数据库中没有,需要从飞书获取
            logger.info(f"数据库中没有 {date_str} 的排班信息,从飞书获取")

            # 获取表格信息
            sheets = self.client.get_sheets_info()
            if not sheets:
                logger.error("未获取到表格信息")
                return {
                    'day_shift': '',
                    'night_shift': '',
                    'day_shift_list': [],
                    'night_shift_list': []
                }

            # 选择最合适的表格
            selected_sheet = self._select_sheet_for_date(sheets, target_year_month)
            if not selected_sheet:
                logger.error("未找到合适的表格")
                return {
                    'day_shift': '',
                    'night_shift': '',
                    'day_shift_list': [],
                    'night_shift_list': []
                }

            sheet_id = selected_sheet['sheet_id']
            sheet_title = selected_sheet['title']

            # 3. 获取表格数据
            sheet_data = self.client.get_sheet_data(sheet_id)
            if not sheet_data:
                logger.error("未获取到表格数据")
                return {
                    'day_shift': '',
                    'night_shift': '',
                    'day_shift_list': [],
                    'night_shift_list': []
                }

            values = sheet_data.get('valueRange', {}).get('values', [])
            revision = sheet_data.get('revision', 0)

            if not values:
                logger.error("表格数据为空")
                return {
                    'day_shift': '',
                    'night_shift': '',
                    'day_shift_list': [],
                    'night_shift_list': []
                }

            # 4. 检查表格是否有更新
            need_update = self.db.check_sheet_update(
                sheet_id, sheet_title, revision, {'values': values}
            )

            if not need_update and cached_schedule:
                # 表格无更新,且数据库中有缓存,直接返回
                logger.info("表格无更新,使用数据库缓存")
                return {
                    'day_shift': cached_schedule['day_shift'],
                    'night_shift': cached_schedule['night_shift'],
                    'day_shift_list': cached_schedule['day_shift_list'],
                    'night_shift_list': cached_schedule['night_shift_list']
                }

            # 5. 解析数据 - 根据表格类型选择合适的日期格式
            # 如果是年度表格,使用中文日期格式;否则使用mm/dd格式
            if '年' in sheet_title and '排班表' in sheet_title:
                target_date = target_date_chinese  # "1月1日"
            else:
                target_date = target_date_mm_dd  # "01/01"

            logger.info(f"使用日期格式: {target_date} 解析表格: {sheet_title}")
            result = self.parser.parse(values, target_date, sheet_title)

            # 6. 保存到数据库
            if result['day_shift'] or result['night_shift']:
                self.db.save_schedule(date_str, result, sheet_id, sheet_title)
                logger.info(f"已保存 {date_str} 的排班信息到数据库")

            return result

        except Exception as e:
            logger.error(f"获取排班信息失败: {e}")
            # 降级处理:返回空值
            return {
                'day_shift': '',
                'night_shift': '',
                'day_shift_list': [],
                'night_shift_list': []
            }

    def get_schedule_for_today(self) -> Dict:
        """获取今天的排班信息"""
        today = datetime.now().strftime('%Y-%m-%d')
        return self.get_schedule_for_date(today)

    def get_schedule_for_tomorrow(self) -> Dict:
        """获取明天的排班信息"""
        tomorrow = (datetime.now() + timedelta(days=1)).strftime('%Y-%m-%d')
        return self.get_schedule_for_date(tomorrow)

    def refresh_all_schedules(self, days: int = 30):
        """
        刷新未来指定天数的排班信息

        参数:
            days: 刷新未来多少天的排班信息
        """
        logger.info(f"开始刷新未来 {days} 天的排班信息")

        today = datetime.now()
        for i in range(days):
            date = (today + timedelta(days=i)).strftime('%Y-%m-%d')
            logger.info(f"刷新 {date} 的排班信息...")
            self.get_schedule_for_date(date)

        logger.info("排班信息刷新完成")


if __name__ == '__main__':
    # 测试代码
    import sys

    # 设置日志
    logging.basicConfig(level=logging.INFO)

    # 从环境变量读取配置
    manager = FeishuScheduleManagerV2()

    if len(sys.argv) > 1:
        date_str = sys.argv[1]
    else:
        date_str = datetime.now().strftime('%Y-%m-%d')

    print(f"获取 {date_str} 的排班信息...")
    schedule = manager.get_schedule_for_date(date_str)

    print(f"白班人员: {schedule['day_shift']}")
    print(f"夜班人员: {schedule['night_shift']}")
    print(f"白班列表: {schedule['day_shift_list']}")
    print(f"夜班列表: {schedule['night_shift_list']}")

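`get_schedule_for_date` derives two lookup keys from one ISO date, one per sheet style ("mm/dd" for monthly sheets, "M月D日" for the yearly sheet). A standalone sketch of that derivation; the `date_keys` helper name is hypothetical:

```python
from datetime import datetime

# Hypothetical helper mirroring the two date formats derived in
# get_schedule_for_date: zero-padded "mm/dd" and Chinese "M月D日".
def date_keys(date_str: str) -> tuple:
    dt = datetime.strptime(date_str, '%Y-%m-%d')
    return dt.strftime('%m/%d'), f"{dt.month}月{dt.day}日"
```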
src/gui.py
@@ -12,11 +12,15 @@ import os
 # 添加项目根目录到 Python 路径
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

-from src.confluence import ConfluenceClient
-from src.extractor import HTMLTextExtractor
-from src.parser import HandoverLogParser
-from src.database import DailyLogsDatabase
-from src.report import DailyReportGenerator
+# 导入新架构的模块
+from src.config import config
+from src.logging_config import get_logger
+from src.confluence import ConfluenceClient, ConfluenceClientError, HTMLTextExtractor, HTMLTextExtractorError, HandoverLogParser, LogParserError
+from src.report import DailyReportGenerator, ReportGeneratorError
+from src.database.base import DatabaseConnectionError
+from src.database.daily_logs import DailyLogsDatabase
+from src.database.schedules import ScheduleDatabase
+from src.feishu import FeishuScheduleManager, FeishuClientError


 class OrbitInGUI:
@@ -25,9 +29,12 @@
     def __init__(self, root):
         self.root = root
         self.root.title("码头作业日志管理工具 - OrbitIn")
-        self.root.geometry("900x700")
+        self.root.geometry(config.GUI_WINDOW_SIZE)
         self.root.resizable(True, True)

+        # 初始化日志器
+        self.logger = get_logger(__name__)
+
         # 设置样式
         style = ttk.Style()
         style.theme_use('clam')
@@ -154,7 +161,7 @@
         self.report_text = scrolledtext.ScrolledText(
             right_frame,
             wrap=tk.WORD,
-            font=('SimHei', 10),
+            font=(config.GUI_FONT_FAMILY, config.GUI_FONT_SIZE),
             bg='white',
             height=18
         )
@@ -227,72 +234,94 @@
         """获取并处理数据"""
         self.set_status("正在获取数据...")
         self.log_message("开始获取数据...")
+        self.logger.info("开始获取数据...")

         try:
-            # 加载配置
-            from dotenv import load_dotenv
-            load_dotenv()
-
-            base_url = os.getenv('CONFLUENCE_BASE_URL')
-            token = os.getenv('CONFLUENCE_TOKEN')
-            content_id = os.getenv('CONFLUENCE_CONTENT_ID')
-
-            if not base_url or not token or not content_id:
+            # 检查配置
+            if not config.CONFLUENCE_BASE_URL or not config.CONFLUENCE_TOKEN or not config.CONFLUENCE_CONTENT_ID:
                 self.log_message("错误: 未配置 Confluence 信息,请检查 .env 文件", is_error=True)
+                self.logger.error("Confluence 配置不完整")
                 return

             # 获取 HTML
             self.log_message("正在从 Confluence 获取 HTML...")
-            client = ConfluenceClient(base_url, token)
-            html = client.get_html(content_id)
+            self.logger.info("正在从 Confluence 获取 HTML...")
+            client = ConfluenceClient(config.CONFLUENCE_BASE_URL, config.CONFLUENCE_TOKEN)
+            html = client.get_html(config.CONFLUENCE_CONTENT_ID)

             if not html:
                 self.log_message("错误: 未获取到 HTML 内容", is_error=True)
+                self.logger.error("未获取到 HTML 内容")
                 return

             self.log_message(f"获取成功,共 {len(html)} 字符")
+            self.logger.info(f"获取成功,共 {len(html)} 字符")

             # 提取文本
             self.log_message("正在提取布局文本...")
+            self.logger.info("正在提取布局文本...")
             extractor = HTMLTextExtractor()
             layout_text = extractor.extract(html)
             self.log_message(f"提取完成,共 {len(layout_text)} 字符")
+            self.logger.info(f"提取完成,共 {len(layout_text)} 字符")

             # 解析数据
             self.log_message("正在解析日志数据...")
+            self.logger.info("正在解析日志数据...")
             parser = HandoverLogParser()
             logs = parser.parse(layout_text)
             self.log_message(f"解析到 {len(logs)} 条记录")
+            self.logger.info(f"解析到 {len(logs)} 条记录")

             # 保存到数据库
             if logs:
                 self.log_message("正在保存到数据库...")
+                self.logger.info("正在保存到数据库...")
                 db = DailyLogsDatabase()
                 count = db.insert_many([log.to_dict() for log in logs])
                 db.close()
                 self.log_message(f"已保存 {count} 条记录")
+                self.logger.info(f"已保存 {count} 条记录")

                 # 显示统计
                 db = DailyLogsDatabase()
                 stats = db.get_stats()
                 db.close()
                 self.log_message(f"数据库总计: {stats['total']} 条记录, {len(stats['ships'])} 艘船")
+                self.logger.info(f"数据库总计: {stats['total']} 条记录, {len(stats['ships'])} 艘船")

                 # 刷新日报显示
                 self.generate_today_report()
             else:
                 self.log_message("未解析到任何记录")
+                self.logger.warning("未解析到任何记录")

             self.set_status("完成")
+            self.logger.info("数据获取完成")

+        except ConfluenceClientError as e:
+            self.log_message(f"Confluence API 错误: {e}", is_error=True)
+            self.logger.error(f"Confluence API 错误: {e}")
+            self.set_status("错误")
+        except HTMLTextExtractorError as e:
+            self.log_message(f"HTML 提取错误: {e}", is_error=True)
+            self.logger.error(f"HTML 提取错误: {e}")
+            self.set_status("错误")
+        except LogParserError as e:
+            self.log_message(f"日志解析错误: {e}", is_error=True)
+            self.logger.error(f"日志解析错误: {e}")
+            self.set_status("错误")
+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
         except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
             self.set_status("错误")

     def fetch_debug(self):
         """Debug模式获取数据"""
         self.set_status("正在获取 Debug 数据...")
         self.log_message("使用本地 layout_output.txt 进行 Debug...")
+        self.logger.info("使用本地 layout_output.txt 进行 Debug...")

         try:
             # 检查本地文件
@@ -302,35 +331,51 @@
                 filepath = 'debug/layout_output.txt'
             else:
                 self.log_message("错误: 未找到 layout_output.txt 文件", is_error=True)
+                self.logger.error("未找到 layout_output.txt 文件")
                 return

             self.log_message(f"使用文件: {filepath}")
+            self.logger.info(f"使用文件: {filepath}")

             with open(filepath, 'r', encoding='utf-8') as f:
                 text = f.read()

             self.log_message(f"读取完成,共 {len(text)} 字符")
+            self.logger.info(f"读取完成,共 {len(text)} 字符")

             # 解析数据
             self.log_message("正在解析日志数据...")
+            self.logger.info("正在解析日志数据...")
             parser = HandoverLogParser()
             logs = parser.parse(text)
             self.log_message(f"解析到 {len(logs)} 条记录")
+            self.logger.info(f"解析到 {len(logs)} 条记录")

             if logs:
                 self.log_message("正在保存到数据库...")
+                self.logger.info("正在保存到数据库...")
                 db = DailyLogsDatabase()
                 count = db.insert_many([log.to_dict() for log in logs])
                 db.close()
                 self.log_message(f"已保存 {count} 条记录")
+                self.logger.info(f"已保存 {count} 条记录")

                 # 刷新日报显示
                 self.generate_today_report()

             self.set_status("完成")
+            self.logger.info("Debug 数据获取完成")

+        except LogParserError as e:
+            self.log_message(f"日志解析错误: {e}", is_error=True)
+            self.logger.error(f"日志解析错误: {e}")
+            self.set_status("错误")
+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
         except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
             self.set_status("错误")

     def generate_report(self):
@@ -339,21 +384,23 @@

         if not date:
             self.log_message("错误: 请输入日期", is_error=True)
+            self.logger.error("未输入日期")
             return

         try:
             datetime.strptime(date, '%Y-%m-%d')
         except ValueError:
             self.log_message("错误: 日期格式无效,请使用 YYYY-MM-DD", is_error=True)
+            self.logger.error(f"日期格式无效: {date}")
             return

         self.set_status("正在生成日报...")
-        self.log_message(f"生成 {date} 的日报...")
+        self.logger.info(f"生成 {date} 的日报...")

         try:
             g = DailyReportGenerator()
             report = g.generate_report(date)
             g.close()

             # 在日报文本框中显示(可复制)
             self.report_text.delete("1.0", tk.END)
@@ -366,9 +413,19 @@
             self.log_message("=" * 40)

             self.set_status("完成")
+            self.logger.info(f"日报生成完成: {date}")

+        except ReportGeneratorError as e:
+            self.log_message(f"日报生成错误: {e}", is_error=True)
+            self.logger.error(f"日报生成错误: {e}")
+            self.set_status("错误")
+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
         except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
             self.set_status("错误")

     def generate_today_report(self):
@@ -384,117 +441,143 @@

         if not year_month or not teu:
             self.log_message("错误: 请输入月份和 TEU", is_error=True)
+            self.logger.error("未输入月份和 TEU")
             return

         try:
             teu = int(teu)
         except ValueError:
             self.log_message("错误: TEU 必须是数字", is_error=True)
+            self.logger.error(f"TEU 不是数字: {teu}")
             return

         self.set_status("正在添加...")
         self.log_message(f"添加 {year_month} 月未统计数据: {teu}TEU")
+        self.logger.info(f"添加 {year_month} 月未统计数据: {teu}TEU")

         try:
             db = DailyLogsDatabase()
             result = db.insert_unaccounted(year_month, teu, '')
             db.close()

             if result:
                 self.log_message("添加成功!")
+                self.logger.info(f"未统计数据添加成功: {year_month} {teu}TEU")
                 # 刷新日报显示
                 self.generate_today_report()
             else:
                 self.log_message("添加失败!", is_error=True)
+                self.logger.error(f"未统计数据添加失败: {year_month} {teu}TEU")

             self.set_status("完成")

+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
         except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
             self.set_status("错误")

     def auto_fetch_data(self):
         """自动获取新数据(GUI启动时调用)"""
         self.set_status("正在自动获取新数据...")
         self.log_message("GUI启动,开始自动获取新数据...")
+        self.logger.info("GUI启动,开始自动获取新数据...")

         try:
             # 1. 检查飞书配置,如果配置完整则刷新排班信息
-            from dotenv import load_dotenv
-            load_dotenv()
-
-            feishu_token = os.getenv('FEISHU_TOKEN')
-            feishu_spreadsheet_token = os.getenv('FEISHU_SPREADSHEET_TOKEN')
-
-            if feishu_token and feishu_spreadsheet_token:
+            if config.FEISHU_TOKEN and config.FEISHU_SPREADSHEET_TOKEN:
                 try:
                     self.log_message("正在刷新排班信息...")
-                    from src.feishu_v2 import FeishuScheduleManagerV2
-                    feishu_manager = FeishuScheduleManagerV2()
+                    self.logger.info("正在刷新排班信息...")
+                    feishu_manager = FeishuScheduleManager()
                     # 只刷新未来7天的排班,减少API调用
                     feishu_manager.refresh_all_schedules(days=7)
                     self.log_message("排班信息刷新完成")
-                except Exception as e:
+                    self.logger.info("排班信息刷新完成")
+                except FeishuClientError as e:
                     self.log_message(f"刷新排班信息时出错: {e}", is_error=True)
+                    self.logger.error(f"刷新排班信息时出错: {e}")
                     self.log_message("将继续处理其他任务...")
+                except Exception as e:
+                    self.log_message(f"刷新排班信息时出现未知错误: {e}", is_error=True)
+                    self.logger.error(f"刷新排班信息时出现未知错误: {e}", exc_info=True)
+                    self.log_message("将继续处理其他任务...")
             else:
                 self.log_message("飞书配置不完整,跳过排班信息刷新")
+                self.logger.warning("飞书配置不完整,跳过排班信息刷新")

             # 2. 尝试获取最新的作业数据
             self.log_message("正在尝试获取最新作业数据...")
+            self.logger.info("正在尝试获取最新作业数据...")

-            base_url = os.getenv('CONFLUENCE_BASE_URL')
-            token = os.getenv('CONFLUENCE_TOKEN')
-            content_id = os.getenv('CONFLUENCE_CONTENT_ID')
-
-            if base_url and token and content_id:
+            if config.CONFLUENCE_BASE_URL and config.CONFLUENCE_TOKEN and config.CONFLUENCE_CONTENT_ID:
                 try:
                     # 获取 HTML
                     self.log_message("正在从 Confluence 获取 HTML...")
-                    from src.confluence import ConfluenceClient
-                    client = ConfluenceClient(base_url, token)
-                    html = client.get_html(content_id)
+                    self.logger.info("正在从 Confluence 获取 HTML...")
+                    client = ConfluenceClient(config.CONFLUENCE_BASE_URL, config.CONFLUENCE_TOKEN)
+                    html = client.get_html(config.CONFLUENCE_CONTENT_ID)

                     if html:
                         self.log_message(f"获取成功,共 {len(html)} 字符")
+                        self.logger.info(f"获取成功,共 {len(html)} 字符")

                         # 提取文本
                         self.log_message("正在提取布局文本...")
-                        from src.extractor import HTMLTextExtractor
+                        self.logger.info("正在提取布局文本...")
                         extractor = HTMLTextExtractor()
                         layout_text = extractor.extract(html)

                         # 解析数据
                         self.log_message("正在解析日志数据...")
-                        from src.parser import HandoverLogParser
+                        self.logger.info("正在解析日志数据...")
                         parser = HandoverLogParser()
                         logs = parser.parse(layout_text)

                         if logs:
                             # 保存到数据库
                             self.log_message("正在保存到数据库...")
+                            self.logger.info("正在保存到数据库...")
                             db = DailyLogsDatabase()
                             count = db.insert_many([log.to_dict() for log in logs])
                             db.close()
                             self.log_message(f"已保存 {count} 条新记录")
+                            self.logger.info(f"已保存 {count} 条新记录")
                         else:
                             self.log_message("未解析到新记录")
+                            self.logger.warning("未解析到新记录")
                     else:
                         self.log_message("未获取到 HTML 内容,跳过数据获取")
-                except Exception as e:
+                        self.logger.warning("未获取到 HTML 内容,跳过数据获取")
+                except ConfluenceClientError as e:
                     self.log_message(f"获取作业数据时出错: {e}", is_error=True)
+                    self.logger.error(f"获取作业数据时出错: {e}")
+                except HTMLTextExtractorError as e:
+                    self.log_message(f"HTML 提取错误: {e}", is_error=True)
+                    self.logger.error(f"HTML 提取错误: {e}")
+                except LogParserError as e:
+                    self.log_message(f"日志解析错误: {e}", is_error=True)
+                    self.logger.error(f"日志解析错误: {e}")
+                except Exception as e:
+                    self.log_message(f"获取作业数据时出现未知错误: {e}", is_error=True)
+                    self.logger.error(f"获取作业数据时出现未知错误: {e}", exc_info=True)
             else:
                 self.log_message("Confluence 配置不完整,跳过数据获取")
+                self.logger.warning("Confluence 配置不完整,跳过数据获取")

             # 3. 显示今日日报
             self.log_message("正在生成今日日报...")
+            self.logger.info("正在生成今日日报...")
             self.generate_today_report()

             self.set_status("就绪")
             self.log_message("自动获取完成,GUI已就绪")
+            self.logger.info("自动获取完成,GUI已就绪")

         except Exception as e:
             self.log_message(f"自动获取过程中出现错误: {e}", is_error=True)
+            self.logger.error(f"自动获取过程中出现错误: {e}", exc_info=True)
             self.log_message("将继续显示GUI界面...")
             self.set_status("就绪")
             # 即使出错也显示今日日报
@@ -505,6 +588,7 @@
         self.set_status("正在统计...")
         self.log_message("数据库统计信息:")
         self.log_message("-" * 30)
+        self.logger.info("显示数据库统计信息...")

         try:
             db = DailyLogsDatabase()
@@ -514,8 +598,6 @@
             current_month = datetime.now().strftime('%Y-%m')
             ships_monthly = db.get_ships_with_monthly_teu(current_month)

-            db.close()
-
             self.log_message(f"总记录数: {stats['total']}")
             self.log_message(f"船次数量: {len(stats['ships'])}")
             self.log_message(f"日期范围: {stats['date_range']['start']} ~ {stats['date_range']['end']}")
@@ -532,9 +614,15 @@
                 self.log_message(f" 本月合计: {total_monthly_teu}TEU")

             self.set_status("完成")
+            self.logger.info(f"数据库统计完成: {stats['total']} 条记录, {len(stats['ships'])} 艘船")

+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
         except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
             self.set_status("错误")

src/logging_config.py (new file)
@@ -0,0 +1,144 @@
#!/usr/bin/env python3
"""
统一日志配置模块
提供统一的日志配置,避免各模块自行配置
"""
import os
import logging
import sys
from logging.handlers import RotatingFileHandler
from typing import Optional

from src.config import config


def setup_logging(
    log_file: Optional[str] = None,
    console_level: int = logging.INFO,
    file_level: int = logging.DEBUG,
    max_bytes: int = 10 * 1024 * 1024,  # 10MB
    backup_count: int = 5
) -> logging.Logger:
    """
    配置统一的日志系统

    参数:
        log_file: 日志文件路径,如果为None则使用默认路径
        console_level: 控制台日志级别
        file_level: 文件日志级别
        max_bytes: 单个日志文件最大大小
        backup_count: 备份文件数量

    返回:
        配置好的根日志器
    """
    # 创建日志目录
    if log_file is None:
        log_dir = 'logs'
        log_file = os.path.join(log_dir, 'app.log')
    else:
        log_dir = os.path.dirname(log_file)

    if log_dir and not os.path.exists(log_dir):
        os.makedirs(log_dir)

    # 获取根日志器
    logger = logging.getLogger()
    logger.setLevel(logging.DEBUG)  # 根日志器设置为最低级别

    # 清除现有handler,避免重复添加
    logger.handlers.clear()

    # 控制台handler
    console_handler = logging.StreamHandler(sys.stdout)
    console_handler.setLevel(console_level)
    console_formatter = logging.Formatter(
        '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    console_handler.setFormatter(console_formatter)
    logger.addHandler(console_handler)

    # 文件handler(轮转)
    file_handler = RotatingFileHandler(
        log_file,
        maxBytes=max_bytes,
        backupCount=backup_count,
        encoding='utf-8'
    )
    file_handler.setLevel(file_level)
    file_formatter = logging.Formatter(
        '%(asctime)s - %(name)s - %(levelname)s - %(filename)s:%(lineno)d - %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    file_handler.setFormatter(file_formatter)
    logger.addHandler(file_handler)

    # 设置第三方库的日志级别
    logging.getLogger('urllib3').setLevel(logging.WARNING)
    logging.getLogger('requests').setLevel(logging.WARNING)

    logger.info(f"日志系统已初始化,日志文件: {log_file}")
    logger.info(f"控制台日志级别: {logging.getLevelName(console_level)}")
    logger.info(f"文件日志级别: {logging.getLevelName(file_level)}")

    return logger


def get_logger(name: str) -> logging.Logger:
    """
    获取指定名称的日志器

    参数:
        name: 日志器名称,通常使用 __name__

    返回:
        配置好的日志器
    """
    return logging.getLogger(name)


# 自动初始化日志系统
if not logging.getLogger().handlers:
    # 只有在没有handler时才初始化,避免重复初始化
    setup_logging()


# 便捷函数
def info(msg: str, *args, **kwargs):
    """记录INFO级别日志"""
    logging.info(msg, *args, **kwargs)


def warning(msg: str, *args, **kwargs):
    """记录WARNING级别日志"""
    logging.warning(msg, *args, **kwargs)


def error(msg: str, *args, **kwargs):
    """记录ERROR级别日志"""
    logging.error(msg, *args, **kwargs)


def debug(msg: str, *args, **kwargs):
    """记录DEBUG级别日志"""
    logging.debug(msg, *args, **kwargs)


def exception(msg: str, *args, **kwargs):
    """记录异常日志"""
    logging.exception(msg, *args, **kwargs)


if __name__ == '__main__':
    # 测试日志配置
    logger = get_logger(__name__)
    logger.info("测试INFO日志")
    logger.warning("测试WARNING日志")
    logger.error("测试ERROR日志")
    logger.debug("测试DEBUG日志")

    try:
        raise ValueError("测试异常")
    except ValueError as e:
        logger.exception("捕获到异常: %s", e)
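The handler wiring in `setup_logging` above splits output by level: INFO and up to the console, everything down to DEBUG into a rotating file. A minimal standalone reduction of that pattern; the `make_logger` function and the 'orbitin.demo' logger name are hypothetical:

```python
import logging
import sys
from logging.handlers import RotatingFileHandler

# Hypothetical minimal reduction of setup_logging above: INFO to stdout,
# DEBUG to a 10MB rotating file, logger itself at the lowest level.
def make_logger(log_file: str) -> logging.Logger:
    logger = logging.getLogger('orbitin.demo')
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers on re-init

    console = logging.StreamHandler(sys.stdout)
    console.setLevel(logging.INFO)
    logger.addHandler(console)

    file_handler = RotatingFileHandler(
        log_file, maxBytes=10 * 1024 * 1024, backupCount=5, encoding='utf-8'
    )
    file_handler.setLevel(logging.DEBUG)
    logger.addHandler(file_handler)
    return logger
```

Because the handler levels differ, a `logger.debug(...)` call reaches the file but never the console.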
src/parser.py (deleted)
@@ -1,192 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
日志解析模块
|
||||
"""
|
||||
import re
|
||||
from typing import List, Dict, Optional
|
||||
from dataclasses import dataclass
|
||||
|
||||
|
||||
@dataclass
|
||||
class ShipLog:
|
||||
"""船次日志数据类"""
|
||||
date: str
|
||||
shift: str
|
||||
ship_name: str
|
||||
teu: Optional[int] = None
|
||||
efficiency: Optional[float] = None
|
||||
vehicles: Optional[int] = None
|
||||
|
||||
def to_dict(self) -> Dict:
|
||||
"""转换为字典"""
|
||||
return {
|
||||
'date': self.date,
|
||||
'shift': self.shift,
|
||||
'ship_name': self.ship_name,
|
||||
'teu': self.teu,
|
||||
'efficiency': self.efficiency,
|
||||
'vehicles': self.vehicles
|
||||
}
|
||||
|
||||
|
||||
class HandoverLogParser:
|
||||
"""交接班日志解析器"""
|
||||
|
||||
SEPARATOR = '———————————————————————————————————————————————'
|
||||
|
||||
def __init__(self):
|
||||
"""初始化解析器"""
|
||||
pass
|
||||
|
||||
@staticmethod
|
||||
def parse_date(date_str: str) -> str:
|
||||
"""解析日期字符串"""
|
||||
try:
|
||||
parts = date_str.split('.')
|
||||
if len(parts) == 3:
|
||||
return f"{parts[0]}-{parts[1]}-{parts[2]}"
|
||||
return date_str
|
||||
except Exception:
|
||||
return date_str
|
||||
|
||||
def parse(self, text: str) -> List[ShipLog]:
|
||||
"""
|
||||
解析日志文本
|
||||
|
||||
参数:
|
||||
text: 日志文本
|
||||
|
||||
返回:
|
||||
船次日志列表(已合并同日期同班次同船名的记录)
|
||||
"""
|
||||
logs = []
|
||||
|
||||
# 预处理:移除单行分隔符(前后都是空行的分隔符)
|
||||
# 保留真正的内容分隔符(前后有内容的)
|
||||
lines = text.split('\n')
|
||||
processed_lines = []
|
||||
i = 0
|
||||
while i < len(lines):
|
||||
line = lines[i]
|
||||
if line.strip() == self.SEPARATOR:
|
||||
# 检查是否是单行分隔符(前后都是空行或分隔符)
|
||||
prev_empty = i == 0 or not lines[i-1].strip() or lines[i-1].strip() == self.SEPARATOR
|
||||
next_empty = i == len(lines) - 1 or not lines[i+1].strip() or lines[i+1].strip() == self.SEPARATOR
|
||||
if prev_empty and next_empty:
|
||||
# 单行分隔符,跳过
|
||||
i += 1
|
||||
continue
|
||||
processed_lines.append(line)
|
||||
i += 1
|
||||
|
||||
processed_text = '\n'.join(processed_lines)
|
||||
blocks = processed_text.split(self.SEPARATOR)
|
||||
|
||||
for block in blocks:
|
||||
if not block.strip() or '日期:' not in block:
|
||||
continue
|
||||
|
||||
# 解析日期
|
||||
date_match = re.search(r'日期:(\d{4}\.\d{2}\.\d{2})', block)
|
||||
if not date_match:
|
||||
continue
|
||||
|
||||
date = self.parse_date(date_match.group(1))
|
||||
self._parse_block(block, date, logs)
|
||||
|
||||
        # Merge records with the same date, shift and ship name (summing TEU)
        merged = {}
        for log in logs:
            key = (log.date, log.shift, log.ship_name)
            if key not in merged:
                merged[key] = ShipLog(
                    date=log.date,
                    shift=log.shift,
                    ship_name=log.ship_name,
                    teu=log.teu,
                    efficiency=log.efficiency,
                    vehicles=log.vehicles
                )
            else:
                # Accumulate TEU
                if log.teu:
                    if merged[key].teu is None:
                        merged[key].teu = log.teu
                    else:
                        merged[key].teu += log.teu
                # Accumulate vehicle count
                if log.vehicles:
                    if merged[key].vehicles is None:
                        merged[key].vehicles = log.vehicles
                    else:
                        merged[key].vehicles += log.vehicles

        return list(merged.values())

    def _parse_block(self, block: str, date: str, logs: List[ShipLog]):
        """Parse one date block."""
        for shift in ['白班', '夜班']:
            shift_pattern = f'{shift}:'
            if shift_pattern not in block:
                continue

            shift_start = block.find(shift_pattern) + len(shift_pattern)

            # Only the next shift marker bounds the block; "注意事项:" is not treated as a boundary
            next_pos = len(block)
            for next_shift in ['白班', '夜班']:
                if next_shift != shift:
                    pos = block.find(f'{next_shift}:', shift_start)
                    if pos != -1 and pos < next_pos:
                        next_pos = pos

            shift_content = block[shift_start:next_pos]
            self._parse_ships(shift_content, date, shift, logs)

    def _parse_ships(self, content: str, date: str, shift: str, logs: List[ShipLog]):
        """Parse ship entries."""
        parts = content.split('实船作业:')

        for part in parts:
            if not part.strip():
                continue

            cleaned = part.replace('\xa0', ' ').strip()
            # Match the "xxx# shipname" format (ship number and ship name are separate)
            ship_match = re.search(r'(\d+)#\s*(\S+)', cleaned)

            if not ship_match:
                continue

            # Keep only the bare ship name (strip the xx# prefix and annotations such as 二次靠泊)
            ship_name = ship_match.group(2)
            # Remove re-berthing annotations (二次靠泊 / 再次靠泊)
            ship_name = re.sub(r'(二次靠泊)|(再次靠泊)|\(二次靠泊\)|\(再次靠泊\)', '', ship_name).strip()

            vehicles_match = re.search(r'上场车辆数:(\d+)', cleaned)
            teu_eff_match = re.search(
                r'作业量/效率:(\d+)TEU[,,\s]*', cleaned
            )

            log = ShipLog(
                date=date,
                shift=shift,
                ship_name=ship_name,
                teu=int(teu_eff_match.group(1)) if teu_eff_match else None,
                efficiency=None,
                vehicles=int(vehicles_match.group(1)) if vehicles_match else None
            )
            logs.append(log)


if __name__ == '__main__':
    # Test
    with open('layout_output.txt', 'r', encoding='utf-8') as f:
        text = f.read()

    parser = HandoverLogParser()
    logs = parser.parse(text)

    print(f'解析到 {len(logs)} 条记录')
    for log in logs[:5]:
        print(f'{log.date} {log.shift} {log.ship_name}: {log.teu}TEU')
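The merge step above folds duplicate (date, shift, ship) records by summing TEU and vehicle counts while treating `None` as "no data yet". A minimal standalone sketch of that accumulation (the `Rec` dataclass and `merge` helper are illustrative stand-ins, not the project's `ShipLog`):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rec:
    date: str
    shift: str
    ship: str
    teu: Optional[int] = None
    vehicles: Optional[int] = None

def merge(records):
    """Fold records sharing (date, shift, ship), summing TEU and vehicles."""
    merged = {}
    for r in records:
        key = (r.date, r.shift, r.ship)
        if key not in merged:
            merged[key] = Rec(r.date, r.shift, r.ship, r.teu, r.vehicles)
        else:
            m = merged[key]
            # None means "no count recorded yet", so the first value replaces it
            if r.teu:
                m.teu = r.teu if m.teu is None else m.teu + r.teu
            if r.vehicles:
                m.vehicles = r.vehicles if m.vehicles is None else m.vehicles + r.vehicles
    return list(merged.values())

logs = [Rec('2025-12-28', '白班', 'A轮', teu=120),
        Rec('2025-12-28', '白班', 'A轮', teu=80, vehicles=5),   # second berthing, same shift
        Rec('2025-12-28', '夜班', 'A轮', teu=60)]
out = merge(logs)
```

This mirrors how a second-berthing record (parsed as a separate entry above) collapses into the first record of the same shift.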
src/report.py (435 lines)
@@ -1,112 +1,181 @@
#!/usr/bin/env python3
"""
Daily report generation module
Updated dependencies: uses the new config and database modules
"""
from datetime import datetime, timedelta
from typing import Dict, List, Optional
import sys
import os
from typing import Dict, List, Optional, Any
import logging

# Add the project root to the Python path
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from src.config import config
from src.logging_config import get_logger
from src.database.daily_logs import DailyLogsDatabase
from src.feishu.manager import FeishuScheduleManager

from src.database import DailyLogsDatabase
from src.feishu_v2 import FeishuScheduleManagerV2 as FeishuScheduleManager
logger = get_logger(__name__)

logger = logging.getLogger(__name__)


class ReportGeneratorError(Exception):
    """Raised when report generation fails."""
    pass


class DailyReportGenerator:
    """Daily operations report generator."""

    DAILY_TARGET = 300  # daily target volume (TEU)
    def __init__(self, db_path: Optional[str] = None):
        """
        Initialize the daily report generator

    def __init__(self, db_path: str = 'data/daily_logs.db'):
        """Initialize."""
        Args:
            db_path: database file path; falls back to config when None
        """
        self.db = DailyLogsDatabase(db_path)
        logger.info("日报生成器初始化完成")

    def get_latest_date(self) -> str:
        """Get the latest date in the database."""
        logs = self.db.query_all(limit=1)
        if logs:
            return logs[0]['date']
        return datetime.now().strftime('%Y-%m-%d')
        """
        Get the latest date in the database

    def get_daily_data(self, date: str) -> Dict:
        """Get the data for a given date."""
        logs = self.db.query_by_date(date)
        Returns:
            latest date string, "YYYY-MM-DD" format
        """
        try:
            logs = self.db.query_all(limit=1)
            if logs:
                return logs[0]['date']
            return datetime.now().strftime('%Y-%m-%d')

        # Aggregate by ship name
        ships = {}
        for log in logs:
            ship = log['ship_name']
            if ship not in ships:
                ships[ship] = 0
            if log.get('teu'):
                ships[ship] += log['teu']
        except Exception as e:
            logger.error(f"获取最新日期失败: {e}")
            return datetime.now().strftime('%Y-%m-%d')

        return {
            'date': date,
            'ships': ships,
            'total_teu': sum(ships.values()),
            'ship_count': len(ships)
        }
    def get_daily_data(self, date: str) -> Dict[str, Any]:
        """
        Get the data for a given date

    def get_monthly_stats(self, date: str) -> Dict:
        """Get monthly statistics (up to the given date)."""
        year_month = date[:7]  # YYYY-MM
        target_date = datetime.strptime(date, '%Y-%m-%d').date()
        Args:
            date: date string, "YYYY-MM-DD" format

        logs = self.db.query_all(limit=10000)
        Returns:
            dict of daily data
        """
        try:
            logs = self.db.query_by_date(date)

            # Only count records in the current month up to the given date
            monthly_logs = [
                log for log in logs
                if log['date'].startswith(year_month)
                and datetime.strptime(log['date'], '%Y-%m-%d').date() <= target_date
            ]
            # Aggregate by ship name
            ships: Dict[str, int] = {}
            for log in logs:
                ship = log['ship_name']
                if ship not in ships:
                    ships[ship] = 0
                if log.get('teu'):
                    ships[ship] += log['teu']

            # Aggregate by date
            daily_totals = {}
            for log in monthly_logs:
                d = log['date']
                if d not in daily_totals:
                    daily_totals[d] = 0
                if log.get('teu'):
                    daily_totals[d] += log['teu']
            return {
                'date': date,
                'ships': ships,
                'total_teu': sum(ships.values()),
                'ship_count': len(ships)
            }

            # Number of elapsed days in the month
            current_date = datetime.strptime(date, '%Y-%m-%d')
            if current_date.day == 1:
                days_passed = 1
            else:
                days_passed = current_date.day
        except Exception as e:
            logger.error(f"获取每日数据失败: {date}, 错误: {e}")
            return {
                'date': date,
                'ships': {},
                'total_teu': 0,
                'ship_count': 0
            }

            # Fetch the manually entered unaccounted volume
            unaccounted = self.db.get_unaccounted(year_month)
    def get_monthly_stats(self, date: str) -> Dict[str, Any]:
        """
        Get monthly statistics (up to the given date)

        planned = days_passed * self.DAILY_TARGET
        actual = sum(daily_totals.values()) + unaccounted
        Args:
            date: date string, "YYYY-MM-DD" format

        return {
            'year_month': year_month,
            'days_passed': days_passed,
            'planned': planned,
            'actual': actual,
            'unaccounted': unaccounted,
            'completion': round(actual / planned * 100, 2) if planned > 0 else 0,
            'daily_totals': daily_totals
        }
        Returns:
            dict of monthly statistics
        """
        try:
            year_month = date[:7]  # YYYY-MM
            target_date = datetime.strptime(date, '%Y-%m-%d').date()

    def get_shift_personnel(self, date: str) -> Dict:
            logs = self.db.query_all(limit=10000)

            # Only count records in the current month up to the given date
            monthly_logs = [
                log for log in logs
                if log['date'].startswith(year_month)
                and datetime.strptime(log['date'], '%Y-%m-%d').date() <= target_date
            ]

            # Aggregate by date
            daily_totals: Dict[str, int] = {}
            for log in monthly_logs:
                d = log['date']
                if d not in daily_totals:
                    daily_totals[d] = 0
                if log.get('teu'):
                    daily_totals[d] += log['teu']

            # Number of elapsed days in the month
            current_date = datetime.strptime(date, '%Y-%m-%d')
            if current_date.day == config.FIRST_DAY_OF_MONTH_SPECIAL:
                days_passed = 1
            else:
                days_passed = current_date.day

            # Fetch the manually entered unaccounted volume
            unaccounted = self.db.get_unaccounted(year_month)

            planned = days_passed * config.DAILY_TARGET_TEU
            actual = sum(daily_totals.values()) + unaccounted

            completion = round(actual / planned * 100, 2) if planned > 0 else 0

            return {
                'year_month': year_month,
                'days_passed': days_passed,
                'planned': planned,
                'actual': actual,
                'unaccounted': unaccounted,
                'completion': completion,
                'daily_totals': daily_totals
            }

        except Exception as e:
            logger.error(f"获取月度统计失败: {date}, 错误: {e}")
            return {
                'year_month': date[:7],
                'days_passed': 0,
                'planned': 0,
                'actual': 0,
                'unaccounted': 0,
                'completion': 0,
                'daily_totals': {}
            }

    def get_shift_personnel(self, date: str) -> Dict[str, str]:
        """
        Get shift personnel (from the Feishu schedule sheet)

        Note: the daily report shows the NEXT day's shift personnel,
        so the schedule for date+1 is fetched.
        e.g. the report for 12/29 shows the personnel for 12/30.

        Args:
            date: date string, "YYYY-MM-DD" format

        Returns:
            dict of shift personnel
        """
        try:
            # Check the Feishu configuration
            if not config.FEISHU_TOKEN or not config.FEISHU_SPREADSHEET_TOKEN:
                logger.warning("飞书配置不完整,跳过排班信息获取")
                return self._empty_personnel()

            # Initialize the Feishu schedule manager
            manager = FeishuScheduleManager()

@@ -116,7 +185,7 @@ class DailyReportGenerator:

            logger.info(f"获取 {date} 日报的班次人员,对应排班表日期: {tomorrow}")

            # Fetch the next day's schedule (cached)
            # Fetch the next day's schedule
            schedule = manager.get_schedule_for_date(tomorrow)

            # Use the Feishu data when available
@@ -124,88 +193,156 @@ class DailyReportGenerator:
                return {
                    'day_shift': schedule.get('day_shift', ''),
                    'night_shift': schedule.get('night_shift', ''),
                    'duty_phone': '13107662315'
                    'duty_phone': config.DUTY_PHONE
                }

            # Return empty values when the Feishu data is empty
            logger.warning(f"无法从飞书获取 {tomorrow} 的排班信息")
            return {
                'day_shift': '',
                'night_shift': '',
                'duty_phone': '13107662315'
            }
            return self._empty_personnel()

        except Exception as e:
            logger.error(f"获取排班信息失败: {e}")
            # Degraded path: return empty values
            return {
                'day_shift': '',
                'night_shift': '',
                'duty_phone': '13107662315'
            }
            return self._empty_personnel()

    def generate_report(self, date: str = None) -> str:
        """Generate the daily report."""
        if not date:
            date = self.get_latest_date()
    def generate_report(self, date: Optional[str] = None) -> str:
        """
        Generate the daily report

        # Convert date format 2025-12-28 -> 12/28, and normalize for queries
        Args:
            date: date string, "YYYY-MM-DD" format; uses the latest date when None

        Returns:
            report text

        Raises:
            ReportGeneratorError: generation failed
        """
        try:
            # Try parsing the various date formats
            parsed = datetime.strptime(date, '%Y-%m-%d')
            display_date = parsed.strftime('%m/%d')
            query_date = parsed.strftime('%Y-%m-%d')  # normalize to zero-padded form
        except ValueError:
            # Already in the standard format, use it directly
            display_date = datetime.strptime(date, '%Y-%m-%d').strftime('%m/%d')
            query_date = date
        if not date:
            date = self.get_latest_date()

        daily_data = self.get_daily_data(query_date)
        monthly_data = self.get_monthly_stats(query_date)
        personnel = self.get_shift_personnel(query_date)
        # Validate the date format
        try:
            parsed = datetime.strptime(date, '%Y-%m-%d')
            display_date = parsed.strftime('%m/%d')
            query_date = parsed.strftime('%Y-%m-%d')
        except ValueError as e:
            error_msg = f"日期格式无效: {date}, 错误: {e}"
            logger.error(error_msg)
            raise ReportGeneratorError(error_msg) from e

        # Monthly statistics
        month_display = date[5:7] + '/' + date[:4]  # MM/YYYY
        # Fetch the data
        daily_data = self.get_daily_data(query_date)
        monthly_data = self.get_monthly_stats(query_date)
        personnel = self.get_shift_personnel(query_date)

        lines = []
        lines.append(f"日期:{display_date}")
        lines.append("")
        # Build the report
        lines: List[str] = []
        lines.append(f"日期:{display_date}")
        lines.append("")

        # Ship entries (compact format, no blank lines)
        ship_lines = []
        for ship, teu in sorted(daily_data['ships'].items(), key=lambda x: -x[1]):
            ship_lines.append(f"船名:{ship}")
            ship_lines.append(f"作业量:{teu}TEU")
        lines.extend(ship_lines)
        lines.append("")
        # Ship entries
        if daily_data['ships']:
            ship_lines: List[str] = []
            for ship, teu in sorted(daily_data['ships'].items(), key=lambda x: -x[1]):
                ship_lines.append(f"船名:{ship}")
                ship_lines.append(f"作业量:{teu}TEU")
            lines.extend(ship_lines)
            lines.append("")

        # Actual volume for the day
        lines.append(f"当日实际作业量:{daily_data['total_teu']}TEU")
        # Actual volume for the day
        lines.append(f"当日实际作业量:{daily_data['total_teu']}TEU")

        # Monthly statistics
        lines.append(f"当月计划作业量:{monthly_data['planned']}TEU (用天数*{self.DAILY_TARGET}TEU)")
        lines.append(f"当月实际作业量:{monthly_data['actual']}TEU")
        lines.append(f"当月完成比例:{monthly_data['completion']}%")
        lines.append("")
        # Monthly statistics
        lines.append(f"当月计划作业量:{monthly_data['planned']}TEU (用天数*{config.DAILY_TARGET_TEU}TEU)")
        lines.append(f"当月实际作业量:{monthly_data['actual']}TEU")
        lines.append(f"当月完成比例:{monthly_data['completion']}%")
        lines.append("")

        # Personnel info (requires shift personnel from the Confluence logs)
        day_personnel = personnel['day_shift']
        night_personnel = personnel['night_shift']
        duty_phone = personnel['duty_phone']
        # Personnel info
        day_personnel = personnel['day_shift']
        night_personnel = personnel['night_shift']
        duty_phone = personnel['duty_phone']

        # Shift dates use the next day
        next_day = (parsed + timedelta(days=1)).strftime('%m/%d')
        lines.append(f"{next_day} 白班人员:{day_personnel}")
        lines.append(f"{next_day} 夜班人员:{night_personnel}")
        lines.append(f"24小时值班手机:{duty_phone}")
        # Shift dates use the next day
        next_day = (parsed + timedelta(days=1)).strftime('%m/%d')
        lines.append(f"{next_day} 白班人员:{day_personnel}")
        lines.append(f"{next_day} 夜班人员:{night_personnel}")
        lines.append(f"24小时值班手机:{duty_phone}")

        return "\n".join(lines)
        report = "\n".join(lines)
        logger.info(f"日报生成完成: {date}")
        return report

    def print_report(self, date: str = None):
        """Print the daily report."""
        report = self.generate_report(date)
        print(report)
        return report
        except ReportGeneratorError:
            raise
        except Exception as e:
            error_msg = f"生成日报失败: {e}"
            logger.error(error_msg)
            raise ReportGeneratorError(error_msg) from e

    def print_report(self, date: Optional[str] = None) -> str:
        """
        Print the daily report

        Args:
            date: date string, "YYYY-MM-DD" format; uses the latest date when None

        Returns:
            report text
        """
        try:
            report = self.generate_report(date)
            print(report)
            return report

        except ReportGeneratorError as e:
            print(f"生成日报失败: {e}")
            return ""

    def save_report_to_file(self, date: Optional[str] = None, filepath: Optional[str] = None) -> bool:
        """
        Save the daily report to a file

        Args:
            date: date string; uses the latest date when None
            filepath: file path; uses the default path when None

        Returns:
            whether it succeeded
        """
        try:
            report = self.generate_report(date)

            if filepath is None:
                # Use the default path
                import os
                report_dir = "reports"
                os.makedirs(report_dir, exist_ok=True)

                if date is None:
                    date = self.get_latest_date()
                filename = f"daily_report_{date}.txt"
                filepath = os.path.join(report_dir, filename)

            with open(filepath, 'w', encoding='utf-8') as f:
                f.write(report)

            logger.info(f"日报已保存到文件: {filepath}")
            return True

        except Exception as e:
            logger.error(f"保存日报到文件失败: {e}")
            return False

    def _empty_personnel(self) -> Dict[str, str]:
        """Return empty personnel info."""
        return {
            'day_shift': '',
            'night_shift': '',
            'duty_phone': config.DUTY_PHONE
        }

    def close(self):
        """Close the database connection."""
@@ -213,6 +350,32 @@ class DailyReportGenerator:


if __name__ == '__main__':
    # Test code
    import sys

    # Configure logging
    logging.basicConfig(level=logging.INFO)

    generator = DailyReportGenerator()
    generator.print_report()
    generator.close()

    try:
        # Test fetching the latest date
        latest_date = generator.get_latest_date()
        print(f"最新日期: {latest_date}")

        # Test report generation
        report = generator.generate_report(latest_date)
        print(f"\n日报内容:\n{report}")

        # Test saving to file
        success = generator.save_report_to_file(latest_date)
        print(f"\n保存到文件: {'成功' if success else '失败'}")

    except ReportGeneratorError as e:
        print(f"日报生成错误: {e}")
        sys.exit(1)
    except Exception as e:
        print(f"未知错误: {e}")
        sys.exit(1)
    finally:
        generator.close()
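Both versions of `get_monthly_stats` above compute the same completion figure: planned volume is elapsed days times the daily target, actual volume is the summed daily totals plus the manually entered unaccounted TEU. A small sketch of that arithmetic in isolation (the 300 TEU/day target matches the diff; the helper name and sample numbers are illustrative):

```python
DAILY_TARGET_TEU = 300  # per-day target, as in the diff

def monthly_completion(daily_totals, unaccounted, days_passed):
    """Return planned/actual volume and completion percentage for the month so far."""
    planned = days_passed * DAILY_TARGET_TEU
    actual = sum(daily_totals.values()) + unaccounted
    completion = round(actual / planned * 100, 2) if planned > 0 else 0
    return {'planned': planned, 'actual': actual, 'completion': completion}

# Two elapsed days, plus 70 TEU entered manually as unaccounted
stats = monthly_completion({'2025-12-01': 280, '2025-12-02': 350}, 70, 2)
```

The `planned > 0` guard is what keeps the degraded path (zero elapsed days) from dividing by zero.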
@@ -1,323 +0,0 @@
#!/usr/bin/env python3
"""
Schedule personnel database module
"""
import sqlite3
import os
import json
from datetime import datetime
from typing import List, Dict, Optional, Tuple
import hashlib


class ScheduleDatabase:
    """Schedule personnel database."""

    def __init__(self, db_path: str = 'data/daily_logs.db'):
        """
        Initialize the database

        Args:
            db_path: database file path
        """
        self.db_path = db_path
        self._ensure_directory()
        self.conn = self._connect()
        self._init_schema()

    def _ensure_directory(self):
        """Ensure the data directory exists."""
        data_dir = os.path.dirname(self.db_path)
        if data_dir and not os.path.exists(data_dir):
            os.makedirs(data_dir)

    def _connect(self) -> sqlite3.Connection:
        """Connect to the database."""
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row
        return conn

    def _init_schema(self):
        """Initialize the table schema."""
        cursor = self.conn.cursor()

        # Create the schedule personnel table
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS schedule_personnel (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                date TEXT NOT NULL,
                day_shift TEXT,
                night_shift TEXT,
                day_shift_list TEXT,  -- JSON array
                night_shift_list TEXT,  -- JSON array
                sheet_id TEXT,
                sheet_title TEXT,
                data_hash TEXT,  -- data hash, used to detect updates
                created_at TEXT DEFAULT CURRENT_TIMESTAMP,
                updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
                UNIQUE(date)
            )
        ''')

        # Create the sheet versions table (used to detect sheet updates)
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS sheet_versions (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                sheet_id TEXT NOT NULL,
                sheet_title TEXT NOT NULL,
                revision INTEGER NOT NULL,
                data_hash TEXT,
                last_checked_at TEXT DEFAULT CURRENT_TIMESTAMP,
                UNIQUE(sheet_id)
            )
        ''')

        # Indexes
        cursor.execute('CREATE INDEX IF NOT EXISTS idx_schedule_date ON schedule_personnel(date)')
        cursor.execute('CREATE INDEX IF NOT EXISTS idx_schedule_sheet ON schedule_personnel(sheet_id)')
        cursor.execute('CREATE INDEX IF NOT EXISTS idx_sheet_versions ON sheet_versions(sheet_id)')

        self.conn.commit()

    def _calculate_hash(self, data: Dict) -> str:
        """Calculate a hash of the data."""
        data_str = json.dumps(data, sort_keys=True, ensure_ascii=False)
        return hashlib.md5(data_str.encode('utf-8')).hexdigest()

    def check_sheet_update(self, sheet_id: str, sheet_title: str, revision: int, data: Dict) -> bool:
        """
        Check whether the sheet has been updated

        Args:
            sheet_id: sheet ID
            sheet_title: sheet title
            revision: sheet revision number
            data: sheet data

        Returns:
            True: updated, needs to be refetched
            False: unchanged, the cache can be used
        """
        cursor = self.conn.cursor()

        # Look up the current version
        cursor.execute(
            'SELECT revision, data_hash FROM sheet_versions WHERE sheet_id = ?',
            (sheet_id,)
        )
        result = cursor.fetchone()

        if not result:
            # First fetch: record the version
            data_hash = self._calculate_hash(data)
            cursor.execute('''
                INSERT INTO sheet_versions (sheet_id, sheet_title, revision, data_hash, last_checked_at)
                VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP)
            ''', (sheet_id, sheet_title, revision, data_hash))
            self.conn.commit()
            return True

        # Check whether the revision number or the data changed
        old_revision = result['revision']
        old_hash = result['data_hash']
        new_hash = self._calculate_hash(data)

        if old_revision != revision or old_hash != new_hash:
            # Updated: refresh the version record
            cursor.execute('''
                UPDATE sheet_versions
                SET revision = ?, data_hash = ?, last_checked_at = CURRENT_TIMESTAMP
                WHERE sheet_id = ?
            ''', (revision, new_hash, sheet_id))
            self.conn.commit()
            return True

        # Unchanged: only refresh the check timestamp
        cursor.execute('''
            UPDATE sheet_versions
            SET last_checked_at = CURRENT_TIMESTAMP
            WHERE sheet_id = ?
        ''', (sheet_id,))
        self.conn.commit()
        return False

    def save_schedule(self, date: str, schedule_data: Dict, sheet_id: str = None, sheet_title: str = None) -> bool:
        """
        Save schedule info to the database

        Args:
            date: date (YYYY-MM-DD)
            schedule_data: schedule data
            sheet_id: sheet ID
            sheet_title: sheet title

        Returns:
            whether it succeeded
        """
        try:
            cursor = self.conn.cursor()

            # Prepare the data
            day_shift = schedule_data.get('day_shift', '')
            night_shift = schedule_data.get('night_shift', '')
            day_shift_list = json.dumps(schedule_data.get('day_shift_list', []), ensure_ascii=False)
            night_shift_list = json.dumps(schedule_data.get('night_shift_list', []), ensure_ascii=False)
            data_hash = self._calculate_hash(schedule_data)

            # Use INSERT OR REPLACE to update existing records
            cursor.execute('''
                INSERT OR REPLACE INTO schedule_personnel
                (date, day_shift, night_shift, day_shift_list, night_shift_list,
                 sheet_id, sheet_title, data_hash, updated_at)
                VALUES (?, ?, ?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
            ''', (
                date, day_shift, night_shift, day_shift_list, night_shift_list,
                sheet_id, sheet_title, data_hash
            ))

            self.conn.commit()
            return True

        except sqlite3.Error as e:
            print(f"数据库错误: {e}")
            return False

    def get_schedule(self, date: str) -> Optional[Dict]:
        """
        Get schedule info for a given date

        Args:
            date: date (YYYY-MM-DD)

        Returns:
            schedule dict, or None when not found
        """
        cursor = self.conn.cursor()
        cursor.execute(
            'SELECT * FROM schedule_personnel WHERE date = ?',
            (date,)
        )
        result = cursor.fetchone()

        if not result:
            return None

        # Parse the JSON arrays
        day_shift_list = json.loads(result['day_shift_list']) if result['day_shift_list'] else []
        night_shift_list = json.loads(result['night_shift_list']) if result['night_shift_list'] else []

        return {
            'date': result['date'],
            'day_shift': result['day_shift'],
            'night_shift': result['night_shift'],
            'day_shift_list': day_shift_list,
            'night_shift_list': night_shift_list,
            'sheet_id': result['sheet_id'],
            'sheet_title': result['sheet_title'],
            'updated_at': result['updated_at']
        }

    def get_schedule_by_range(self, start_date: str, end_date: str) -> List[Dict]:
        """
        Get schedule info for a date range

        Args:
            start_date: start date (YYYY-MM-DD)
            end_date: end date (YYYY-MM-DD)

        Returns:
            list of schedule dicts
        """
        cursor = self.conn.cursor()
        cursor.execute('''
            SELECT * FROM schedule_personnel
            WHERE date >= ? AND date <= ?
            ORDER BY date
        ''', (start_date, end_date))

        results = []
        for row in cursor.fetchall():
            day_shift_list = json.loads(row['day_shift_list']) if row['day_shift_list'] else []
            night_shift_list = json.loads(row['night_shift_list']) if row['night_shift_list'] else []

            results.append({
                'date': row['date'],
                'day_shift': row['day_shift'],
                'night_shift': row['night_shift'],
                'day_shift_list': day_shift_list,
                'night_shift_list': night_shift_list,
                'sheet_id': row['sheet_id'],
                'sheet_title': row['sheet_title'],
                'updated_at': row['updated_at']
            })

        return results

    def delete_old_schedules(self, before_date: str) -> int:
        """
        Delete schedule records before a given date

        Args:
            before_date: date (YYYY-MM-DD)

        Returns:
            number of deleted records
        """
        cursor = self.conn.cursor()
        cursor.execute(
            'DELETE FROM schedule_personnel WHERE date < ?',
            (before_date,)
        )
        deleted_count = cursor.rowcount
        self.conn.commit()
        return deleted_count

    def get_stats(self) -> Dict:
        """Get statistics."""
        cursor = self.conn.cursor()

        cursor.execute('SELECT COUNT(*) FROM schedule_personnel')
        total = cursor.fetchone()[0]

        cursor.execute('SELECT MIN(date), MAX(date) FROM schedule_personnel')
        date_range = cursor.fetchone()

        cursor.execute('SELECT COUNT(DISTINCT sheet_id) FROM schedule_personnel')
        sheet_count = cursor.fetchone()[0]

        return {
            'total': total,
            'date_range': {'start': date_range[0], 'end': date_range[1]},
            'sheet_count': sheet_count
        }

    def close(self):
        """Close the connection."""
        if self.conn:
            self.conn.close()


if __name__ == '__main__':
    # Test code
    db = ScheduleDatabase()

    # Test saving
    test_schedule = {
        'day_shift': '张勤、杨俊豪',
        'night_shift': '刘炜彬、梁启迟',
        'day_shift_list': ['张勤', '杨俊豪'],
        'night_shift_list': ['刘炜彬', '梁启迟']
    }

    success = db.save_schedule('2025-12-31', test_schedule, 'zcYLIk', '12月')
    print(f"保存成功: {success}")

    # Test fetching
    schedule = db.get_schedule('2025-12-31')
    print(f"获取结果: {schedule}")

    # Test statistics
    stats = db.get_stats()
    print(f"统计: {stats}")

    db.close()
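The deleted `check_sheet_update` above implements change detection for the cached Feishu sheet: refetch when either the sheet's revision number or an MD5 of the normalized payload differs from what was last recorded. The core decision, sketched without the SQLite bookkeeping (`needs_refresh` is an illustrative helper, not project code):

```python
import hashlib
import json

def data_hash(data):
    # Stable hash of the payload: sorted keys make the JSON form canonical,
    # mirroring _calculate_hash above
    s = json.dumps(data, sort_keys=True, ensure_ascii=False)
    return hashlib.md5(s.encode('utf-8')).hexdigest()

def needs_refresh(cached, revision, data):
    # cached: None, or a (revision, hash) pair as stored in sheet_versions
    if cached is None:
        return True  # first fetch, nothing recorded yet
    old_rev, old_hash = cached
    return old_rev != revision or old_hash != data_hash(data)

d = {'day_shift': ['张勤'], 'night_shift': ['刘炜彬']}
h = data_hash(d)
```

Checking both the revision and the content hash catches updates even when one signal is unreliable (e.g. a revision bump with identical data, or edits that do not advance the revision).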