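Refactor: complete code review and architecture optimization

Main improvements:
1. Modular architecture refactoring
   - Create the Confluence module directory structure
   - Unify the Feishu module architecture
   - Refactor the database module
2. Code quality improvements
   - Add unified configuration management
   - Implement unified logging configuration
   - Improve type hints and exception handling
3. Feature changes
   - Remove the parse-test feature
   - Remove the DEBUG_MODE setting
   - Update the command-line options
4. Documentation
   - Update the project structure in README.md
   - Add a development guide and a troubleshooting section
   - Improve the configuration documentation
5. System verification
   - All core feature tests pass
   - Module import verification passes
   - Architecture integrity verification passes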
2026-02-10 07:41:29 +08:00 · 2025-12-31 02:04:16 +08:00
parent 90317018b7
commit 5345dc75f2
30 changed files with 4355 additions and 2678 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -11,6 +11,7 @@ data/daily_logs.db
 # Cache
 *.pyc
 *.pyo
+docs/

 # Debug output
 debug/
--- a/README.md
+++ b/README.md
@@ -11,6 +11,7 @@
 - 支持未统计数据手动录入
 - 支持二次靠泊记录合并
 - GUI 图形界面（可选）
+- 飞书排班表集成（自动获取班次人员）

 ## 项目结构

@@ -25,15 +26,32 @@ OrbitIn/
 ├── debug/                     # 调试输出目录
 │   └── layout_output_*.txt    # 带时间戳的调试文件
 ├── data/                      # 数据目录
-│   └── daily_logs.db          # SQLite3 数据库
+│   ├── daily_logs.db          # SQLite3 数据库
+│   └── schedule_cache.json    # 排班数据缓存
+├── logs/                      # 日志目录
+│   └── app.log                # 应用日志
 └── src/                       # 代码模块
    ├── __init__.py
-    ├── confluence.py          # Confluence API 客户端
-    ├── extractor.py           # HTML 文本提取器
-    ├── parser.py              # 日志解析器
-    ├── database.py            # 数据库操作
+    ├── config.py              # 统一配置管理
+    ├── logging_config.py      # 统一日志配置
    ├── report.py              # 报表生成器
-    └── gui.py                 # GUI 图形界面
+    ├── gui.py                 # GUI 图形界面
+    ├── database/              # 数据库模块
+    │   ├── base.py            # 数据库基类
+    │   ├── daily_logs.py      # 每日日志数据库
+    │   └── schedules.py       # 排班数据库
+    ├── confluence/            # Confluence API 模块
+    │   ├── client.py          # Confluence API 客户端
+    │   ├── parser.py          # HTML 内容解析器
+    │   ├── text.py            # HTML 文本提取器
+    │   ├── log_parser.py      # 日志解析器
+    │   ├── manager.py         # 内容管理器
+    │   └── __init__.py        # 模块导出
+    └── feishu/                # 飞书 API 模块
+        ├── client.py          # 飞书 API 客户端
+        ├── parser.py          # 排班数据解析器
+        ├── manager.py         # 飞书排班管理器
+        └── __init__.py        # 模块导出
 ```

 ## 快速开始
@@ -44,15 +62,31 @@ OrbitIn/
 pip install requests beautifulsoup4 python-dotenv
 ```

-### 配置 Confluence
+### 配置

 在 `.env` 文件中配置：

 ```bash
 # .env
+# Confluence 配置
 CONFLUENCE_BASE_URL=https://your-confluence.atlassian.net/rest/api
 CONFLUENCE_TOKEN=your-api-token
 CONFLUENCE_CONTENT_ID=155764524
+
+# 飞书表格配置（用于获取排班人员信息）
+FEISHU_BASE_URL=https://open.feishu.cn/open-apis/sheets/v3
+FEISHU_TOKEN=your-feishu-api-token
+FEISHU_SPREADSHEET_TOKEN=EgNPssi2ghZ7BLtGiTxcIBUmnVh
+
+# 数据库配置
+DATABASE_PATH=data/daily_logs.db
+
+# 业务配置
+DAILY_TARGET_TEU=300          # 每日目标TEU数量，用于计算完成率
+DUTY_PHONE=13107662315        # 值班电话，显示在日报中
+SEPARATOR_CHAR=─              # 分隔线字符，用于格式化输出
+SEPARATOR_LENGTH=50           # 分隔线长度
+SCHEDULE_REFRESH_DAYS=30      # 排班数据刷新间隔（天）
 ```

 参考 `.env.example` 文件创建 `.env` 文件。
@@ -63,7 +97,7 @@ CONFLUENCE_CONTENT_ID=155764524

 ```bash
 # 默认：获取、提取、解析并保存到数据库
-python3 main.py
+python3 main.py fetch-save

 # 仅获取HTML并提取文本（保存到debug目录）
 python3 main.py fetch
@@ -74,11 +108,11 @@ python3 main.py fetch-debug
 # 生成日报（指定日期）
 python3 main.py report 2025-12-28

-# 生成昨日日报
+# 生成今日日报
 python3 main.py report-today

-# 解析测试（使用已有的layout_output.txt）
-python3 main.py parse-test
+# 配置测试（验证所有连接）
+python3 main.py config-test

 # 添加未统计数据
 python3 main.py --unaccounted 118 --month 2025-12
@@ -97,10 +131,11 @@ GUI 功能：
 - 获取并处理数据
 - 获取 (Debug模式)
 - 生成日报
-- 昨日日报（自动获取前一天数据）
+- 今日日报（自动获取前一天数据）
 - 添加未统计数据
 - 数据库统计（显示当月每艘船的作业量）
 - 日报内容可复制
+- 自动刷新排班信息

 ## 数据格式

@@ -160,13 +195,89 @@ GUI 功能：
 24小时值班手机：13107662315
 ```

+## 核心模块说明
+
+### Confluence 模块 (`src/confluence/`)
+- **`client.py`** - Confluence API 客户端，负责 HTTP 请求和连接管理
+- **`text.py`** - HTML 文本提取器，保留布局结构
+- **`log_parser.py`** - 日志解析器，解析船次作业数据
+- **`parser.py`** - HTML 内容解析器，提取链接、图片、表格
+- **`manager.py`** - 内容管理器，提供高级内容管理功能
+
+### 飞书模块 (`src/feishu/`)
+- **`client.py`** - 飞书 API 客户端
+- **`parser.py`** - 排班数据解析器
+- **`manager.py`** - 飞书排班管理器，缓存和刷新排班信息
+
+### 数据库模块 (`src/database/`)
+- **`base.py`** - 数据库基类，提供统一的连接管理
+- **`daily_logs.py`** - 每日交接班日志数据库
+- **`schedules.py`** - 排班数据库
+
 ## 技术栈

 - Python 3.7+
 - SQLite3
 - Requests (HTTP 客户端)
-- HTMLParser (标准库)
+- BeautifulSoup4 (HTML 解析)
 - tkinter (GUI，可选)
+- 类型提示 (Python 3.5+)
+
+## 架构特点
+
+1. **模块化设计** - 每个模块职责单一，便于测试和维护
+2. **统一配置** - 集中管理所有环境变量和业务配置
+3. **统一日志** - 标准化的日志配置和文件轮转
+4. **异常处理** - 详细的错误处理和日志记录
+5. **类型安全** - 全面的 Python 类型提示
+
+## 开发指南
+
+### 添加新功能
+
+1. **配置管理**: 所有配置项应在 `src/config.py` 中定义
+2. **日志记录**: 使用 `from src.logging_config import get_logger` 获取日志器
+3. **异常处理**: 为每个模块创建自定义异常类
+4. **类型提示**: 所有函数和方法都应包含完整的类型提示
+5. **数据库操作**: 使用 `src/database/base.py` 中的基类确保连接管理
+
+### 测试
+
+```bash
+# 运行配置测试
+python3 main.py config-test
+
+# 测试特定功能
+python3 main.py fetch
+python3 main.py report-today
+```
+
+### 调试
+
+1. **日志查看**: 查看 `logs/app.log` 获取详细运行信息
+2. **调试文件**: 使用 `python3 main.py fetch-debug` 生成带时间戳的调试文件
+
+### 代码规范
+
+- 遵循 PEP 8 编码规范
+- 使用 Black 格式化代码（可选）
+- 使用 isort 排序导入
+- 所有公开 API 应有文档字符串
+
+## 故障排除
+
+### 常见问题
+
+1. **连接失败**: 检查 `.env` 文件中的 API 令牌和 URL
+2. **数据库错误**: 确保 `data/` 目录存在且有写入权限
+3. **解析错误**: 检查 Confluence 页面结构是否发生变化
+4. **飞书数据获取失败**: 验证飞书表格权限和 token 有效性
+
+### 日志级别
+
+- 默认日志级别: INFO
+- 调试日志级别: DEBUG (设置环境变量 `LOG_LEVEL=DEBUG`)
+- 日志文件: `logs/app.log`，自动轮转

 ## License

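The README hunk above describes unified logging with rotation to `logs/app.log` and a `LOG_LEVEL` environment variable, but `src/logging_config.py` itself is not shown in this part of the diff. A minimal sketch of what `setup_logging` and `get_logger` could look like, assuming the standard library's `RotatingFileHandler`; the size limit, backup count, and format string are illustrative assumptions, not values taken from the commit:

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def setup_logging(log_file: str = "logs/app.log") -> None:
    """Configure root logging once: console plus a size-rotated file."""
    level_name = os.getenv("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        level = logging.INFO  # fall back to INFO on unknown level names

    log_dir = os.path.dirname(log_file)
    if log_dir:
        os.makedirs(log_dir, exist_ok=True)

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    # maxBytes and backupCount are assumed values for illustration only.
    file_handler = RotatingFileHandler(
        log_file, maxBytes=1_000_000, backupCount=3, encoding="utf-8"
    )
    file_handler.setFormatter(formatter)
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)

    root = logging.getLogger()
    root.setLevel(level)
    root.handlers.clear()  # avoid duplicate handlers if called twice
    root.addHandler(file_handler)
    root.addHandler(console_handler)


def get_logger(name: str) -> logging.Logger:
    """Return a named logger that inherits the root configuration."""
    return logging.getLogger(name)
```

Module-level `get_logger(__name__)` calls work before `setup_logging()` runs because named loggers pick up the root handlers at emit time, which matches how `main.py` creates its logger at import and configures logging in the `__main__` block.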
--- a/docs/feishu_data_flow.md
+++ b/docs/feishu_data_flow.md
@@ -1,179 +0,0 @@
-# 飞书数据获取流程
-
-## 整体流程
-
-```mermaid
-flowchart TD
-    A[开始: 生成日报] --> B[调用 get_shift_personnel]
-    B --> C[创建 FeishuScheduleManager]
-    C --> D[调用 get_schedule_for_date]
-    
-    D --> E[解析日期: 2025-12-30 → 12/30]
-    E --> F[检查缓存: data/schedule_cache.json]
-    
-    F --> G{缓存是否存在?}
-    G -->|是| H[直接返回缓存数据]
-    G -->|否| I[调用 API 获取数据]
-    
-    I --> J[调用 get_sheets_info]
-    J --> K[GET /spreadsheets/{token}/sheets/query]
-    K --> L[返回表格列表: 8月, 9月, 10月, 11月, 12月, 2026年...]
-    
-    L --> M[根据月份选择表格: 12月 → sheet_id='zcYLIk']
-    M --> N[调用 get_sheet_data]
-    N --> O[GET /spreadsheets/{token}/values/zcYLIk!A:AF]
-    O --> P[返回表格数据: 姓名, 12月1日, 12月2日...]
-    
-    P --> Q[调用 ScheduleDataParser.parse]
-    Q --> R[解析日期列: 查找12月30日对应的列索引]
-    R --> S[筛选班次人员: 白班='白', 夜班='夜']
-    
-    S --> T[返回结果: 白班人员列表, 夜班人员列表]
-    T --> U[保存到缓存]
-    U --> V[返回给日报模块]
-    V --> W[填充到日报中]
-```
-
-## API调用详情
-
-### 1. 获取表格列表
-
-**请求**:
-```
-GET https://open.feishu.cn/open-apis/sheets/v3/spreadsheets/EgNPssi2ghZ7BLtGiTxcIBUmnVh/sheets/query
-Authorization: Bearer u-dbctiP9qx1wF.wfoMV2ZHGkh1DNl14oriM8aZMI0026k
-```
-
-**响应**:
-```json
-{
-  "code": 0,
-  "data": {
-    "sheets": [
-      {"sheet_id": "904236", "title": "8月"},
-      {"sheet_id": "ATgwLm", "title": "9月"},
-      {"sheet_id": "2ml4B0", "title": "10月"},
-      {"sheet_id": "y5xv1D", "title": "11月"},
-      {"sheet_id": "zcYLIk", "title": "12月"},
-      {"sheet_id": "R35cIj", "title": "2026年排班表"},
-      {"sheet_id": "wMXHQg", "title": "12月（副本）"}
-    ]
-  }
-}
-```
-
-### 2. 获取表格数据
-
-**请求**:
-```
-GET https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/EgNPssi2ghZ7BLtGiTxcIBUmnVh/values/zcYLIk!A:AF
-Authorization: Bearer u-dbctiP9qx1wF.wfoMV2ZHGkh1DNl14oriM8aZMI0026k
-params: {
-  valueRenderOption: "ToString",
-  dateTimeRenderOption: "FormattedString"
-}
-```
-
-**响应**:
-```json
-{
-  "code": 0,
-  "data": {
-    "valueRange": {
-      "range": "zcYLIk!A1:AF11",
-      "values": [
-        ["姓名", "12月1日", "12月2日", "12月3日", "12月4日", ...],
-        ["张勤", "白", "白", "白", "白", ...],
-        ["刘炜彬", "白", null, "夜", "夜", ...],
-        ["杨俊豪", "白", "白", "白", "白", ...],
-        ["梁启迟", "夜", "夜", "夜", "夜", ...],
-        ...
-      ]
-    }
-  }
-}
-```
-
-## 数据解析流程
-
-### 1. 查找日期列索引
-
-```python
-# 查找 "12月30日" 在表头中的位置
-headers = ["姓名", "12月1日", "12月2日", ..., "12月30日", ...]
-target = "12/30"  # 从 "2025-12-30" 转换而来
-
-# 遍历表头找到匹配的日期
-for i, header in enumerate(headers):
-    if header == "12月30日":
-        column_index = i
-        break
-# 结果: column_index = 30 (第31列)
-```
-
-### 2. 筛选班次人员
-
-```python
-# 遍历所有人员行
-for row in values[1:]:  # 跳过表头
-    name = row[0]           # 姓名
-    shift = row[30]         # 12月30日的班次
-    
-    if shift == "白":
-        day_shift_list.append(name)
-    elif shift == "夜":
-        night_shift_list.append(name)
-
-# 结果
-# day_shift_list = ["张勤", "杨俊豪", "冯栋", "汪钦良"]
-# night_shift_list = ["刘炜彬", "梁启迟"]
-```
-
-### 3. 生成日报输出
-
-```python
-day_shift_str = "、".join(day_shift_list)  # "张勤、杨俊豪、冯栋、汪钦良"
-night_shift_str = "、".join(night_shift_list)  # "刘炜彬、梁启迟"
-
-# 日报中的格式
-lines.append(f"12/31 白班人员：{day_shift_str}")
-lines.append(f"12/31 夜班人员：{night_shift_str}")
-```
-
-## 缓存机制
-
-```mermaid
-flowchart LR
-    A[首次请求] --> B[调用API]
-    B --> C[保存缓存: data/schedule_cache.json]
-    C --> D{"1小时内再次请求"}
-    D -->|是| E[直接读取缓存]
-    D -->|否| F[重新调用API]
-```
-
-缓存文件格式:
-```json
-{
-  "last_update": "2025-12-30T15:00:00",
-  "data": {
-    "2025-12-12/30": {
-      "day_shift": "张勤、杨俊豪、冯栋、汪钦良",
-      "night_shift": "刘炜彬、梁启迟",
-      "day_shift_list": ["张勤", "杨俊豪", "冯栋", "汪钦良"],
-      "night_shift_list": ["刘炜彬", "梁启迟"]
-    }
-  }
-}
-```
-
-## 关键代码位置
-
-| 功能 | 文件 | 行号 |
-|------|------|------|
-| 飞书API客户端 | [`src/feishu.py`](src/feishu.py:10) | 10 |
-| 获取表格列表 | [`src/feishu.py`](src/feishu.py:28) | 28 |
-| 获取表格数据 | [`src/feishu.py`](src/feishu.py:42) | 42 |
-| 数据解析器 | [`src/feishu.py`](src/feishu.py:58) | 58 |
-| 缓存管理 | [`src/feishu.py`](src/feishu.py:150) | 150 |
-| 主管理器 | [`src/feishu.py`](src/feishu.py:190) | 190 |
-| 日报集成 | [`src/report.py`](src/report.py:98) | 98 |
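The parsing steps in the removed document (locate the date column in the header row, then bucket each person by the shift marker in that column) can be sketched as a single function. This is an illustrative reconstruction based on the sample response shown there, not the code of `ScheduleDataParser`; the name `split_shifts` is hypothetical. It looks the column up by header text instead of a fixed index and guards against short rows and `null` cells:

```python
from typing import List, Optional, Tuple

Row = List[Optional[str]]


def split_shifts(values: List[Row], date_header: str) -> Tuple[List[str], List[str]]:
    """Return (day_shift, night_shift) names for the column titled `date_header`."""
    if not values:
        return [], []
    headers = values[0]
    try:
        col = headers.index(date_header)
    except ValueError:
        return [], []  # the date is not present in this sheet

    day_shift: List[str] = []
    night_shift: List[str] = []
    for row in values[1:]:  # skip the header row
        if not row or not row[0]:
            continue  # skip blank rows
        # Defensive assumption: a row may be shorter than the header row.
        shift = row[col] if col < len(row) else None
        if shift == "白":
            day_shift.append(row[0])
        elif shift == "夜":
            night_shift.append(row[0])
    return day_shift, night_shift
```

Calling `split_shifts(values, "12月30日")` on the `values` array from the sheet response would yield the two name lists that the report joins with `、`.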
--- a/main.py
+++ b/main.py
@@ -2,130 +2,210 @@
 """
 码头作业日志管理工具
 从 Confluence 获取交接班日志并保存到数据库
+更新依赖，使用新的模块结构
 """
 import argparse
 import sys
 import os
 from datetime import datetime
+from typing import Optional, List

-from src.confluence import ConfluenceClient
-from src.extractor import HTMLTextExtractor
-from src.parser import HandoverLogParser
-from src.database import DailyLogsDatabase
-from src.report import DailyReportGenerator
+from src.config import config
+from src.logging_config import setup_logging, get_logger
+from src.confluence import ConfluenceClient, ConfluenceClientError, HTMLTextExtractor, HTMLTextExtractorError, HandoverLogParser, ShipLog, LogParserError
+from src.database.daily_logs import DailyLogsDatabase
+from src.report import DailyReportGenerator, ReportGeneratorError

-# 加载环境变量
-from dotenv import load_dotenv
-load_dotenv()
-
-# 配置（从环境变量读取）
-CONF_BASE_URL = os.getenv('CONFLUENCE_BASE_URL')
-CONF_TOKEN = os.getenv('CONFLUENCE_TOKEN')
-CONF_CONTENT_ID = os.getenv('CONFLUENCE_CONTENT_ID')
-
-# 飞书配置（可选）
-FEISHU_BASE_URL = os.getenv('FEISHU_BASE_URL')
-FEISHU_TOKEN = os.getenv('FEISHU_TOKEN')
-FEISHU_SPREADSHEET_TOKEN = os.getenv('FEISHU_SPREADSHEET_TOKEN')
-
-DEBUG_DIR = 'debug'
+# 初始化日志
+logger = get_logger(__name__)


 def ensure_debug_dir():
    """确保debug目录存在"""
-    if not os.path.exists(DEBUG_DIR):
-        os.makedirs(DEBUG_DIR)
+    if not os.path.exists(config.DEBUG_DIR):
+        os.makedirs(config.DEBUG_DIR)
+        logger.info(f"创建调试目录: {config.DEBUG_DIR}")


-def get_timestamp():
+def get_timestamp() -> str:
    """获取时间戳用于文件名"""
    return datetime.now().strftime('%Y%m%d_%H%M%S')


-def fetch_html():
-    """获取HTML内容"""
-    if not CONF_BASE_URL or not CONF_TOKEN or not CONF_CONTENT_ID:
-        print('错误：未配置 Confluence 信息，请检查 .env 文件')
+def fetch_html() -> str:
+    """
+    获取HTML内容
+    
+    返回:
+        HTML字符串
+    
+    异常:
+        SystemExit: 配置错误或获取失败
+    """
+    # 验证配置
+    if not config.validate():
+        logger.error("配置验证失败，请检查 .env 文件")
        sys.exit(1)
    
-    print('正在从 Confluence 获取 HTML 内容...')
-    client = ConfluenceClient(CONF_BASE_URL, CONF_TOKEN)
-    html = client.get_html(CONF_CONTENT_ID)
-    if not html:
-        print('错误：未获取到 HTML 内容')
-        sys.exit(1)
-    print(f'获取成功，共 {len(html)} 字符')
+    try:
+        logger.info("正在从 Confluence 获取 HTML 内容...")
+        client = ConfluenceClient()
+        html = client.get_html(config.CONFLUENCE_CONTENT_ID)
+        logger.info(f"获取成功，共 {len(html)} 字符")
        return html
        
+    except ConfluenceClientError as e:
+        logger.error(f"获取HTML失败: {e}")
+        sys.exit(1)
+    except Exception as e:
+        logger.error(f"未知错误: {e}")
+        sys.exit(1)

-def extract_text(html):
-    """提取布局文本"""
-    print('正在提取布局文本...')
+
+def extract_text(html: str) -> str:
+    """
+    提取布局文本
+    
+    参数:
+        html: HTML字符串
+    
+    返回:
+        提取的文本
+    """
+    try:
+        logger.info("正在提取布局文本...")
        extractor = HTMLTextExtractor()
        layout_text = extractor.extract(html)
-    print(f'提取完成，共 {len(layout_text)} 字符')
+        logger.info(f"提取完成，共 {len(layout_text)} 字符")
        return layout_text
        
+    except HTMLTextExtractorError as e:
+        logger.error(f"提取文本失败: {e}")
+        raise
+    except Exception as e:
+        logger.error(f"未知错误: {e}")
+        raise

-def save_debug_file(content, suffix=''):
-    """保存调试文件到debug目录"""
+
+def save_debug_file(content: str, suffix: str = '') -> str:
+    """
+    保存调试文件到debug目录
+    
+    参数:
+        content: 要保存的内容
+        suffix: 文件名后缀
+    
+    返回:
+        保存的文件路径
+    """
    ensure_debug_dir()
    filename = f'layout_output{suffix}.txt' if suffix else 'layout_output.txt'
-    filepath = os.path.join(DEBUG_DIR, filename)
+    filepath = os.path.join(config.DEBUG_DIR, filename)
+    
+    try:
        with open(filepath, 'w', encoding='utf-8') as f:
            f.write(content)
-    print(f'已保存到 {filepath}')
+        logger.info(f"已保存到 {filepath}")
        return filepath
        
+    except Exception as e:
+        logger.error(f"保存调试文件失败: {e}")
+        raise

-def parse_logs(text):
-    """解析日志数据"""
-    print('正在解析日志数据...')
+
+def parse_logs(text: str) -> List[ShipLog]:
+    """
+    解析日志数据
+    
+    参数:
+        text: 日志文本
+    
+    返回:
+        船次日志列表
+    """
+    try:
+        logger.info("正在解析日志数据...")
        parser = HandoverLogParser()
        logs = parser.parse(text)
-    print(f'解析到 {len(logs)} 条记录')
+        logger.info(f"解析到 {len(logs)} 条记录")
        return logs
        
+    except LogParserError as e:
+        logger.error(f"解析日志失败: {e}")
+        raise
+    except Exception as e:
+        logger.error(f"未知错误: {e}")
+        raise

-def save_to_db(logs):
-    """保存到数据库"""
+
+def save_to_db(logs: List[ShipLog]) -> int:
+    """
+    保存到数据库
+    
+    参数:
+        logs: 船次日志列表
+    
+    返回:
+        保存的记录数
+    """
    if not logs:
-        print('没有记录可保存')
+        logger.warning("没有记录可保存")
        return 0
    
+    try:
        db = DailyLogsDatabase()
        count = db.insert_many([log.to_dict() for log in logs])
-    print(f'已保存 {count} 条记录到数据库')
+        logger.info(f"已保存 {count} 条记录到数据库")
        
        stats = db.get_stats()
-    print(f'\n数据库统计:')
-    print(f'  总记录: {stats["total"]}')
-    print(f'  船次: {len(stats["ships"])}')
-    print(f'  日期范围: {stats["date_range"]["start"]} ~ {stats["date_range"]["end"]}')
+        logger.info(f"数据库统计: 总记录={stats['total']}, 船次={len(stats['ships'])}, "
+                   f"日期范围={stats['date_range']['start']}~{stats['date_range']['end']}")
        
-    db.close()
        return count
        
+    except Exception as e:
+        logger.error(f"保存到数据库失败: {e}")
+        raise

-def add_unaccounted(year_month, teu, note=''):
-    """添加未统计数据"""
+
+def add_unaccounted(year_month: str, teu: int, note: str = ''):
+    """
+    添加未统计数据
+    
+    参数:
+        year_month: 年月字符串，格式 "2025-12"
+        teu: 未统计TEU数量
+        note: 备注
+    """
+    try:
        db = DailyLogsDatabase()
        result = db.insert_unaccounted(year_month, teu, note)
        if result:
-        print(f'已添加 {year_month} 月未统计数据: {teu}TEU')
+            logger.info(f"已添加 {year_month} 月未统计数据: {teu}TEU")
        else:
-        print('添加失败')
-    db.close()
+            logger.error("添加失败")
+    except Exception as e:
+        logger.error(f"添加未统计数据失败: {e}")
+        raise


-def show_stats(date):
-    """显示指定日期的统计"""
-    g = DailyReportGenerator()
-    g.print_report(date)
-    g.close()
+def show_stats(date: str):
+    """
+    显示指定日期的统计
+    
+    参数:
+        date: 日期字符串，格式 "YYYY-MM-DD"
+    """
+    try:
+        generator = DailyReportGenerator()
+        generator.print_report(date)
+    except ReportGeneratorError as e:
+        logger.error(f"生成统计失败: {e}")
+    except Exception as e:
+        logger.error(f"未知错误: {e}")


-def run_fetch():
+def run_fetch() -> str:
    """执行：获取HTML并提取文本"""
    html = fetch_html()
    text = extract_text(html)
@@ -140,7 +220,7 @@ def run_fetch_and_save():
    save_to_db(logs)


-def run_fetch_save_debug():
+def run_fetch_save_debug() -> str:
    """执行：获取、提取、保存到debug目录"""
    html = fetch_html()
    text = extract_text(html)
@@ -149,33 +229,37 @@ def run_fetch_save_debug():
    return text


-def run_report(date=None):
+def run_report(date: Optional[str] = None):
    """执行：生成日报"""
    if not date:
        date = datetime.now().strftime('%Y-%m-%d')
    show_stats(date)


-def run_parser_test():
-    """执行：解析测试"""
-    ensure_debug_file_path = os.path.join(DEBUG_DIR, 'layout_output.txt')
-    if os.path.exists('layout_output.txt'):
-        filepath = 'layout_output.txt'
-    elif os.path.exists(ensure_debug_file_path):
-        filepath = ensure_debug_file_path
+
+
+def run_config_test():
+    """执行：配置测试"""
+    logger.info("配置测试:")
+    config.print_summary()
+    
+    # 测试Confluence连接
+    try:
+        client = ConfluenceClient()
+        if client.test_connection():
+            logger.info("Confluence连接测试: 成功")
        else:
-        print('未找到 layout_output.txt 文件')
-        return
+            logger.warning("Confluence连接测试: 失败")
+    except Exception as e:
+        logger.error(f"Confluence连接测试失败: {e}")
    
-    print(f'使用文件: {filepath}')
-    with open(filepath, 'r', encoding='utf-8') as f:
-        text = f.read()
-    
-    parser = HandoverLogParser()
-    logs = parser.parse(text)
-    print(f'解析到 {len(logs)} 条记录')
-    for log in logs[:5]:
-        print(f'  {log.date} {log.shift} {log.ship_name}: {log.teu}TEU')
+    # 测试数据库连接
+    try:
+        db = DailyLogsDatabase()
+        stats = db.get_stats()
+        logger.info(f"数据库连接测试: 成功，总记录: {stats['total']}")
+    except Exception as e:
+        logger.error(f"数据库连接测试失败: {e}")


 # 功能映射
@@ -185,7 +269,7 @@ FUNCTIONS = {
    'fetch-debug': run_fetch_save_debug,
    'report': lambda: run_report(),
    'report-today': lambda: run_report(datetime.now().strftime('%Y-%m-%d')),
-    'parse-test': run_parser_test,
+    'config-test': run_config_test,
    'stats': lambda: show_stats(datetime.now().strftime('%Y-%m-%d')),
 }

@@ -201,21 +285,21 @@ def main():
  fetch-debug  获取、提取并保存带时间戳的debug文件
  report       生成日报（默认今天）
  report-today 生成今日日报
-  parse-test   解析测试（使用已有的layout_output.txt）
+  config-test  配置测试
  stats        显示今日统计

 示例:
  python3 main.py fetch
  python3 main.py fetch-save
  python3 main.py report 2025-12-28
-  python3 main.py parse-test
+  python3 main.py config-test
 '''
    )
    parser.add_argument(
        'function',
        nargs='?',
        default='fetch-save',
-        choices=list(FUNCTIONS.keys()),
+        choices=['fetch', 'fetch-save', 'fetch-debug', 'report', 'report-today', 'config-test', 'stats'],
        help='要执行的功能 (默认: fetch-save)'
    )
    parser.add_argument(
@@ -242,15 +326,36 @@ def main():
    # 添加未统计数据
    if args.unaccounted:
        year_month = args.month or datetime.now().strftime('%Y-%m')
+        try:
            add_unaccounted(year_month, args.unaccounted)
+        except Exception as e:
+            logger.error(f"添加未统计数据失败: {e}")
+            sys.exit(1)
        return
    
    # 执行功能
+    try:
        if args.function == 'report' and args.date:
            run_report(args.date)
        else:
            FUNCTIONS[args.function]()
+    except KeyboardInterrupt:
+        logger.info("用户中断操作")
+        sys.exit(0)
+    except Exception as e:
+        logger.error(f"执行功能失败: {e}")
+        sys.exit(1)


 if __name__ == '__main__':
+    # 初始化日志系统
+    setup_logging()
+    
+    # 打印启动信息
+    logger.info("=" * 50)
+    logger.info("码头作业日志管理工具 - OrbitIn")
+    logger.info(f"启动时间: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
+    logger.info("=" * 50)
+    
+    # 运行主程序
    main()
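This diff replaces `choices=list(FUNCTIONS.keys())` with a hard-coded list, so the command table and the accepted choices now have to be kept in sync by hand. A minimal sketch of deriving the choices from the mapping instead, with placeholder handlers standing in for the real `run_*` functions (the handler bodies, `build_parser`, and `dispatch` are hypothetical names used only for illustration):

```python
import argparse
from typing import Callable, Dict, List, Optional

# Placeholder handlers; in main.py these would be the real run_* functions.
FUNCTIONS: Dict[str, Callable[[], str]] = {
    "fetch": lambda: "fetch",
    "fetch-save": lambda: "fetch-save",
    "config-test": lambda: "config-test",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose valid commands come from the FUNCTIONS mapping."""
    parser = argparse.ArgumentParser(description="demo")
    parser.add_argument(
        "function",
        nargs="?",
        default="fetch-save",
        choices=list(FUNCTIONS),  # single source of truth for valid commands
    )
    return parser


def dispatch(argv: Optional[List[str]] = None) -> str:
    """Parse argv and run the matching handler."""
    args = build_parser().parse_args(argv)
    return FUNCTIONS[args.function]()
```

With this shape, removing a command such as `parse-test` from the mapping is enough for argparse to reject it; no second list needs editing.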
--- a/src/__init__.py
+++ b/src/__init__.py
@@ -2,9 +2,7 @@
 """
 OrbitIn - Confluence 日志抓取与处理工具包
 """
-from .confluence import ConfluenceClient
-from .extractor import HTMLTextExtractor
-from .parser import HandoverLogParser
+from .confluence import ConfluenceClient, HTMLTextExtractor, HandoverLogParser
 from .database import DailyLogsDatabase

 __version__ = '1.0.0'
--- a/src/config.py
+++ b/src/config.py
@@ -0,0 +1,107 @@
+#!/usr/bin/env python3
+"""
+统一配置模块
+集中管理所有配置项，避免硬编码
+"""
+import os
+from typing import Optional
+from dotenv import load_dotenv
+
+# 加载环境变量
+load_dotenv()
+
+
+class Config:
+    """应用配置类"""
+    
+    # Confluence 配置
+    CONFLUENCE_BASE_URL = os.getenv('CONFLUENCE_BASE_URL')
+    CONFLUENCE_TOKEN = os.getenv('CONFLUENCE_TOKEN')
+    CONFLUENCE_CONTENT_ID = os.getenv('CONFLUENCE_CONTENT_ID')
+    
+    # 飞书配置
+    FEISHU_BASE_URL = os.getenv('FEISHU_BASE_URL', 'https://open.feishu.cn/open-apis/sheets/v3')
+    FEISHU_TOKEN = os.getenv('FEISHU_TOKEN')
+    FEISHU_SPREADSHEET_TOKEN = os.getenv('FEISHU_SPREADSHEET_TOKEN')
+    
+    # 数据库配置
+    DATABASE_PATH = os.getenv('DATABASE_PATH', 'data/daily_logs.db')
+    SCHEDULE_DATABASE_PATH = os.getenv('SCHEDULE_DATABASE_PATH', 'data/daily_logs.db')
+    
+    # 业务配置
+    DAILY_TARGET_TEU = int(os.getenv('DAILY_TARGET_TEU', '300'))
+    DUTY_PHONE = os.getenv('DUTY_PHONE', '13107662315')
+    
+    # 缓存配置
+    CACHE_TTL = int(os.getenv('CACHE_TTL', '3600'))  # 1小时
+    SCHEDULE_CACHE_FILE = os.getenv('SCHEDULE_CACHE_FILE', 'data/schedule_cache.json')
+    
+    # 调试目录配置
+    DEBUG_DIR = os.getenv('DEBUG_DIR', 'debug')
+    
+    # 飞书表格配置
+    SHEET_RANGE = os.getenv('SHEET_RANGE', 'A:AF')
+    REQUEST_TIMEOUT = int(os.getenv('REQUEST_TIMEOUT', '30'))
+    
+    # GUI 配置
+    GUI_FONT_FAMILY = os.getenv('GUI_FONT_FAMILY', 'SimHei')
+    GUI_FONT_SIZE = int(os.getenv('GUI_FONT_SIZE', '10'))
+    GUI_WINDOW_SIZE = os.getenv('GUI_WINDOW_SIZE', '900x700')
+    
+    # 排班刷新配置
+    SCHEDULE_REFRESH_DAYS = int(os.getenv('SCHEDULE_REFRESH_DAYS', '30'))
+    
+    # 特殊常量
+    FIRST_DAY_OF_MONTH_SPECIAL = 1
+    SEPARATOR_CHAR = '─'
+    SEPARATOR_LENGTH = 50
+    
+    @classmethod
+    def validate(cls) -> bool:
+        """验证必要配置是否完整"""
+        errors = []
+        
+        # 检查 Confluence 配置
+        if not cls.CONFLUENCE_BASE_URL:
+            errors.append("CONFLUENCE_BASE_URL 未配置")
+        if not cls.CONFLUENCE_TOKEN:
+            errors.append("CONFLUENCE_TOKEN 未配置")
+        if not cls.CONFLUENCE_CONTENT_ID:
+            errors.append("CONFLUENCE_CONTENT_ID 未配置")
+        
+        # 检查飞书配置（可选，但建议配置）
+        if not cls.FEISHU_TOKEN:
+            print("警告: FEISHU_TOKEN 未配置，排班功能将不可用")
+        if not cls.FEISHU_SPREADSHEET_TOKEN:
+            print("警告: FEISHU_SPREADSHEET_TOKEN 未配置，排班功能将不可用")
+        
+        if errors:
+            print("配置验证失败:")
+            for error in errors:
+                print(f"  - {error}")
+            return False
+        
+        return True
+    
+    @classmethod
+    def print_summary(cls):
+        """打印配置摘要"""
+        print("配置摘要:")
+        print(f"  Confluence: {'已配置' if cls.CONFLUENCE_BASE_URL else '未配置'}")
+        print(f"  飞书: {'已配置' if cls.FEISHU_TOKEN else '未配置'}")
+        print(f"  数据库路径: {cls.DATABASE_PATH}")
+        print(f"  每日目标TEU: {cls.DAILY_TARGET_TEU}")
+        print(f"  排班刷新天数: {cls.SCHEDULE_REFRESH_DAYS}")
+
+
+# 全局配置实例
+config = Config()
+
+
+if __name__ == '__main__':
+    # 测试配置
+    config.print_summary()
+    if config.validate():
+        print("配置验证通过")
+    else:
+        print("配置验证失败")
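`Config` converts several environment variables with `int(os.getenv(...))` directly in the class body. That code runs at import time, so a malformed value such as `REQUEST_TIMEOUT=30s` would raise `ValueError` as soon as `src.config` is imported, before `validate()` can report anything. A hedged sketch of a tolerant helper; `env_int` is a hypothetical name and is not part of the commit:

```python
import os
from typing import Mapping, Optional


def env_int(name: str, default: int, env: Optional[Mapping[str, str]] = None) -> int:
    """Read an integer setting, falling back to `default` when unset or malformed."""
    source = os.environ if env is None else env
    raw = source.get(name)
    if raw is None or not raw.strip():
        return default
    try:
        return int(raw.strip())
    except ValueError:
        return default  # malformed value: keep the documented default
```

A class attribute could then read `REQUEST_TIMEOUT = env_int('REQUEST_TIMEOUT', 30)`. Whether a silent fallback is preferable to failing fast is a design choice; collecting the bad names and reporting them from `validate()` would be a stricter alternative.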
--- a/src/confluence.py
+++ b/src/confluence.py
@@ -1,68 +0,0 @@
-#!/usr/bin/env python3
-"""
-Confluence API 客户端模块
-"""
-import requests
-from typing import Optional
-
-
-class ConfluenceClient:
-    """Confluence REST API 客户端"""
-    
-    def __init__(self, base_url: str, token: str):
-        """
-        初始化客户端
-        
-        参数:
-            base_url: Confluence API 基础URL (不包含 /content)
-            token: Bearer 认证令牌
-        """
-        self.base_url = base_url.rstrip('/')
-        self.headers = {
-            'Authorization': f'Bearer {token}',
-            'Accept': 'application/json'
-        }
-    
-    def fetch_content(self, content_id: str, expand: str = 'body.storage') -> dict:
-        """
-        获取页面内容
-        
-        参数:
-            content_id: 页面ID
-            expand: 展开字段
-            
-        返回:
-            API 响应数据
-        """
-        url = f'{self.base_url}/content/{content_id}'
-        params = {'expand': expand}
-        
-        response = requests.get(url, headers=self.headers, params=params, timeout=30)
-        response.raise_for_status()
-        return response.json()
-    
-    def get_html(self, content_id: str) -> str:
-        """
-        获取页面HTML内容
-        
-        参数:
-            content_id: 页面ID
-            
-        返回:
-            HTML 字符串
-        """
-        data = self.fetch_content(content_id)
-        return data.get('body', {}).get('storage', {}).get('value', '')
-
-
-if __name__ == '__main__':
-    # 使用示例
-    import os
-    
-    client = ConfluenceClient(
-        base_url='https://confluence.westwell-lab.com/rest/api',
-        token=os.getenv('CONFLUENCE_TOKEN', '')
-    )
-    
-    html = client.get_html('155764524')
-    print(f'获取到 {len(html)} 字符的HTML内容')
--- a/src/confluence/__init__.py
+++ b/src/confluence/__init__.py
@@ -0,0 +1,22 @@
+"""
+Confluence API 模块
+提供Confluence页面内容获取和解析功能
+"""
+
+from .client import ConfluenceClient, ConfluenceClientError
+from .parser import HTMLContentParser
+from .manager import ConfluenceContentManager
+from .text import HTMLTextExtractor, HTMLTextExtractorError
+from .log_parser import HandoverLogParser, ShipLog, LogParserError
+
+__all__ = [
+    'ConfluenceClient',
+    'ConfluenceClientError',
+    'HTMLContentParser',
+    'ConfluenceContentManager',
+    'HTMLTextExtractor',
+    'HTMLTextExtractorError',
+    'HandoverLogParser',
+    'ShipLog',
+    'LogParserError'
+]
--- a/src/confluence/client.py
+++ b/src/confluence/client.py
@@ -0,0 +1,212 @@
+#!/usr/bin/env python3
+"""
+Confluence API 客户端
+提供Confluence页面内容获取功能
+"""
+import requests
+from typing import Optional, Dict, Any
+import logging
+
+from src.config import config
+from src.logging_config import get_logger
+
+logger = get_logger(__name__)
+
+
+class ConfluenceClientError(Exception):
+    """Confluence API 错误"""
+    pass
+
+
+class ConfluenceClient:
+    """Confluence REST API 客户端"""
+    
+    def __init__(self, base_url: Optional[str] = None, token: Optional[str] = None):
+        """
+        初始化客户端
+        
+        参数:
+            base_url: Confluence API 基础URL (不包含 /content)，如果为None则使用配置
+            token: Bearer 认证令牌，如果为None则使用配置
+        """
+        self.base_url = (base_url or config.CONFLUENCE_BASE_URL).rstrip('/')
+        self.token = token or config.CONFLUENCE_TOKEN
+        
+        if not self.base_url or not self.token:
+            raise ConfluenceClientError("Confluence配置不完整，请检查环境变量")
+        
+        self.headers = {
+            'Authorization': f'Bearer {self.token}',
+            'Accept': 'application/json'
+        }
+        
+        # 使用 Session 重用连接
+        self.session = requests.Session()
+        self.session.headers.update(self.headers)
+        # requests.Session has no session-wide timeout setting; the timeout is passed per request in fetch_content()
+        
+        logger.debug(f"Confluence客户端初始化完成，基础URL: {self.base_url}")
+    
+    def fetch_content(self, content_id: str, expand: str = 'body.storage') -> Dict[str, Any]:
+        """
+        获取页面内容
+        
+        参数:
+            content_id: 页面ID
+            expand: 展开字段
+        
+        返回:
+            API 响应数据
+        
+        异常:
+            ConfluenceClientError: API调用失败
+            requests.exceptions.RequestException: 网络请求失败
+        """
+        url = f'{self.base_url}/content/{content_id}'
+        params = {'expand': expand}
+        
+        try:
+            logger.debug(f"获取Confluence内容: {content_id}")
+            response = self.session.get(url, params=params, timeout=config.REQUEST_TIMEOUT)
+            response.raise_for_status()
+            
+            data = response.json()
+            logger.info(f"成功获取Confluence内容: {content_id}")
+            return data
+            
+        except requests.exceptions.HTTPError as e:
+            status_code = e.response.status_code if e.response is not None else '未知'
+            error_msg = f"Confluence API HTTP错误: {status_code}, URL: {url}"
+            logger.error(error_msg)
+            raise ConfluenceClientError(error_msg) from e
+            
+        except requests.exceptions.RequestException as e:
+            error_msg = f"Confluence API 网络错误: {e}"
+            logger.error(error_msg)
+            raise ConfluenceClientError(error_msg) from e
+            
+        except ValueError as e:
+            error_msg = f"Confluence API 响应解析失败: {e}"
+            logger.error(error_msg)
+            raise ConfluenceClientError(error_msg) from e
+    
+    def get_html(self, content_id: str) -> str:
+        """
+        获取页面HTML内容
+        
+        参数:
+            content_id: 页面ID
+        
+        返回:
+            HTML 字符串
+        
+        异常:
+            ConfluenceClientError: API调用失败或HTML内容为空
+        """
+        try:
+            data = self.fetch_content(content_id)
+            html = data.get('body', {}).get('storage', {}).get('value', '')
+            
+            if not html:
+                error_msg = f"Confluence页面HTML内容为空: {content_id}"
+                logger.error(error_msg)
+                raise ConfluenceClientError(error_msg)
+            
+            logger.info(f"获取到Confluence HTML内容，长度: {len(html)} 字符")
+            return html
+            
+        except KeyError as e:
+            error_msg = f"Confluence响应格式错误，缺少字段: {e}"
+            logger.error(error_msg)
+            raise ConfluenceClientError(error_msg) from e
+    
+    def test_connection(self, content_id: Optional[str] = None) -> bool:
+        """
+        测试Confluence连接是否正常
+        
+        参数:
+            content_id: 测试页面ID，如果为None则使用配置
+        
+        返回:
+            连接是否正常
+        """
+        test_content_id = content_id or config.CONFLUENCE_CONTENT_ID
+        
+        try:
+            data = self.fetch_content(test_content_id)
+            title = data.get('title', '未知标题')
+            logger.info(f"Confluence连接测试成功，页面: {title}")
+            return True
+            
+        except ConfluenceClientError as e:
+            logger.error(f"Confluence连接测试失败: {e}")
+            return False
+            
+        except Exception as e:
+            logger.error(f"Confluence连接测试异常: {e}")
+            return False
+    
+    def get_page_info(self, content_id: str) -> Dict[str, Any]:
+        """
+        获取页面基本信息
+        
+        参数:
+            content_id: 页面ID
+        
+        返回:
+            页面信息字典
+        """
+        try:
+            data = self.fetch_content(content_id)
+            return {
+                'id': data.get('id'),
+                'title': data.get('title'),
+                'version': data.get('version', {}).get('number'),
+                'created': data.get('history', {}).get('createdDate'),
+                'last_updated': data.get('version', {}).get('when'),
+                'space': data.get('space', {}).get('key'),
+                'url': f"{self.base_url.replace('/rest/api', '')}/pages/{content_id}"
+            }
+            
+        except Exception as e:
+            error_msg = f"获取页面信息失败: {e}"
+            logger.error(error_msg)
+            raise ConfluenceClientError(error_msg) from e
+
+
+if __name__ == '__main__':
+    # 测试代码
+    import sys
+    
+    # 设置日志
+    logging.basicConfig(level=logging.INFO)
+    
+    try:
+        # 测试连接
+        client = ConfluenceClient()
+        
+        if client.test_connection():
+            print("Confluence连接测试成功")
+            
+            # 获取HTML内容
+            content_id = config.CONFLUENCE_CONTENT_ID
+            if content_id:
+                html = client.get_html(content_id)
+                print(f"获取到HTML内容，长度: {len(html)} 字符")
+                
+                # 获取页面信息
+                page_info = client.get_page_info(content_id)
+                print(f"页面标题: {page_info.get('title')}")
+                print(f"页面URL: {page_info.get('url')}")
+            else:
+                print("未配置CONFLUENCE_CONTENT_ID，跳过HTML获取")
+        else:
+            print("Confluence连接测试失败")
+            sys.exit(1)
+            
+    except ConfluenceClientError as e:
+        print(f"Confluence客户端错误: {e}")
+        sys.exit(1)
+    except Exception as e:
+        print(f"未知错误: {e}")
+        sys.exit(1)
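
Review note on `get_page_info`: the chained `.get(..., {})` calls are what let the method return `None` for missing fields instead of raising. Below is a minimal standalone sketch of that pattern. The function name `summarize_page`, the sample payload, and the `wiki.example.com` host are hypothetical, and the response shape is assumed from the code above rather than taken from the Confluence API documentation.

```python
from typing import Any, Dict


def summarize_page(data: Dict[str, Any], base_url: str, content_id: str) -> Dict[str, Any]:
    """Flatten a Confluence-style content dict; missing keys become None."""
    version = data.get('version', {})
    return {
        'id': data.get('id'),
        'title': data.get('title'),
        'version': version.get('number'),
        'last_updated': version.get('when'),
        'space': data.get('space', {}).get('key'),
        # Same URL construction as the diff: strip the REST prefix from the base URL.
        'url': f"{base_url.replace('/rest/api', '')}/pages/{content_id}",
    }


# Hypothetical payload with no 'space' key and no 'when' timestamp.
sample = {'id': '42', 'title': 'Handover log', 'version': {'number': 7}}
info = summarize_page(sample, 'https://wiki.example.com/rest/api', '42')
print(info['version'], info['space'], info['url'])
```

One caveat of this pattern: the `{}` default only applies when the key is absent. If a response carried `"version": null`, `data.get('version', {})` would return `None` and the next `.get` would raise `AttributeError`.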
--- a/src/confluence/log_parser.py
+++ b/src/confluence/log_parser.py
@@ -0,0 +1,350 @@
+#!/usr/bin/env python3
+"""
+Log parsing module
+Improved type hints and exception handling
+"""
+import re
+from typing import List, Dict, Optional, Tuple, Any
+from dataclasses import dataclass, asdict
+import logging
+
+from src.logging_config import get_logger
+
+logger = get_logger(__name__)
+
+
+@dataclass
+class ShipLog:
+    """船次日志数据类"""
+    date: str
+    shift: str
+    ship_name: str
+    teu: Optional[int] = None
+    efficiency: Optional[float] = None
+    vehicles: Optional[int] = None
+    
+    def to_dict(self) -> Dict[str, Any]:
+        """转换为字典"""
+        return asdict(self)
+
+
+class LogParserError(Exception):
+    """日志解析错误"""
+    pass
+
+
+class HandoverLogParser:
+    """交接班日志解析器"""
+    
+    SEPARATOR = '———————————————————————————————————————————————'
+    
+    def __init__(self):
+        """初始化解析器"""
+        pass
+    
+    @staticmethod
+    def parse_date(date_str: str) -> str:
+        """
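+        Normalize a date string.
+        
+        Args:
+            date_str: Date string such as "2025.12.30"
+        
+        Returns:
+            A "2025-12-30" style string; the input is returned unchanged
+            (and a warning is logged) when it cannot be parsed.
+        
+        Never raises: the internal ValueError is caught and logged.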
+        """
+        if not date_str:
+            return date_str
+        
+        try:
+            parts = date_str.split('.')
+            if len(parts) == 3:
+                # 验证每个部分都是数字
+                year, month, day = parts
+                if not (year.isdigit() and month.isdigit() and day.isdigit()):
+                    raise ValueError(f"日期包含非数字字符: {date_str}")
+                
+                # 标准化为YYYY-MM-DD格式
+                return f"{year}-{month.zfill(2)}-{day.zfill(2)}"
+            
+            # 如果不是点分隔格式，尝试其他格式
+            if '-' in date_str:
+                # 已经是标准格式
+                return date_str
+            
+            logger.warning(f"无法解析日期格式: {date_str}")
+            return date_str
+            
+        except Exception as e:
+            logger.warning(f"解析日期失败: {date_str}, 错误: {e}")
+            return date_str
+    
+    def parse(self, text: str) -> List[ShipLog]:
+        """
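+        Parse handover log text.
+        
+        Args:
+            text: Log text
+        
+        Returns:
+            List of ship logs, with records sharing date, shift and ship name merged
+        
+        Raises:
+            LogParserError: Parsing failed
+            ValueError: text is not a string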
+        """
+        if not text:
+            logger.warning("日志文本为空")
+            return []
+        
+        if not isinstance(text, str):
+            error_msg = f"日志文本类型错误，应为字符串，实际为: {type(text)}"
+            logger.error(error_msg)
+            raise ValueError(error_msg)
+        
+        try:
+            logs: List[ShipLog] = []
+            
+            # 预处理：移除单行分隔符（前后都是空行的分隔符）
+            # 保留真正的内容分隔符（前后有内容的）
+            lines = text.split('\n')
+            processed_lines: List[str] = []
+            i = 0
+            while i < len(lines):
+                line = lines[i]
+                if line.strip() == self.SEPARATOR:
+                    # 检查是否是单行分隔符（前后都是空行或分隔符）
+                    prev_empty = i == 0 or not lines[i-1].strip() or lines[i-1].strip() == self.SEPARATOR
+                    next_empty = i == len(lines) - 1 or not lines[i+1].strip() or lines[i+1].strip() == self.SEPARATOR
+                    if prev_empty and next_empty:
+                        # 单行分隔符，跳过
+                        i += 1
+                        continue
+                processed_lines.append(line)
+                i += 1
+            
+            processed_text = '\n'.join(processed_lines)
+            blocks = processed_text.split(self.SEPARATOR)
+            
+            for block in blocks:
+                if not block.strip() or '日期：' not in block:
+                    continue
+                
+                # 解析日期
+                date_match = re.search(r'日期：(\d{4}\.\d{2}\.\d{2})', block)
+                if not date_match:
+                    continue
+                
+                date = self.parse_date(date_match.group(1))
+                self._parse_block(block, date, logs)
+            
+            # 合并同日期同班次同船名的记录（累加TEU）
+            merged: Dict[Tuple[str, str, str], ShipLog] = {}
+            for log in logs:
+                key = (log.date, log.shift, log.ship_name)
+                if key not in merged:
+                    merged[key] = ShipLog(
+                        date=log.date,
+                        shift=log.shift,
+                        ship_name=log.ship_name,
+                        teu=log.teu,
+                        efficiency=log.efficiency,
+                        vehicles=log.vehicles
+                    )
+                else:
+                    # 累加TEU
+                    if log.teu:
+                        if merged[key].teu is None:
+                            merged[key].teu = log.teu
+                        else:
+                            merged[key].teu += log.teu
+                    # 累加车辆数
+                    if log.vehicles:
+                        if merged[key].vehicles is None:
+                            merged[key].vehicles = log.vehicles
+                        else:
+                            merged[key].vehicles += log.vehicles
+            
+            result = list(merged.values())
+            logger.info(f"日志解析完成，共 {len(result)} 条记录")
+            return result
+            
+        except Exception as e:
+            error_msg = f"日志解析失败: {e}"
+            logger.error(error_msg)
+            raise LogParserError(error_msg) from e
+    
+    def _parse_block(self, block: str, date: str, logs: List[ShipLog]) -> None:
+        """解析日期块"""
+        try:
+            for shift in ['白班', '夜班']:
+                shift_pattern = f'{shift}：'
+                if shift_pattern not in block:
+                    continue
+                
+                shift_start = block.find(shift_pattern) + len(shift_pattern)
+                
+                # 只找到下一个班次作为边界，不限制"注意事项："
+                next_pos = len(block)
+                for next_shift in ['白班', '夜班']:
+                    if next_shift != shift:
+                        pos = block.find(f'{next_shift}：', shift_start)
+                        if pos != -1 and pos < next_pos:
+                            next_pos = pos
+                
+                shift_content = block[shift_start:next_pos]
+                self._parse_ships(shift_content, date, shift, logs)
+                
+        except Exception as e:
+            logger.warning(f"解析日期块失败: {date}, 错误: {e}")
+    
+    def _parse_ships(self, content: str, date: str, shift: str, logs: List[ShipLog]) -> None:
+        """解析船次"""
+        try:
+            parts = content.split('实船作业：')
+            
+            for part in parts:
+                if not part.strip():
+                    continue
+                
+                cleaned = part.replace('\xa0', ' ').strip()
+                # 匹配 "xxx# 船名" 格式（船号和船名分开）
+                ship_match = re.search(r'(\d+)#\s*(\S+)', cleaned)
+                
+                if not ship_match:
+                    continue
+                
+                # 船名只取纯船名（去掉xx#前缀和二次靠泊等标注）
+                ship_name = ship_match.group(2)
+                # 移除二次靠泊等标注
+                ship_name = re.sub(r'（二次靠泊）|（再次靠泊）|\(二次靠泊\)|\(再次靠泊\)', '', ship_name).strip()
+                
+                vehicles_match = re.search(r'上场车辆数：(\d+)', cleaned)
+                teu_eff_match = re.search(
+                    r'作业量/效率：(\d+)TEU[，,\s]*', cleaned
+                )
+                
+                # 解析TEU
+                teu = None
+                if teu_eff_match:
+                    try:
+                        teu = int(teu_eff_match.group(1))
+                    except ValueError as e:
+                        logger.warning(f"TEU解析失败: {teu_eff_match.group(1)}, 错误: {e}")
+                
+                # 解析车辆数
+                vehicles = None
+                if vehicles_match:
+                    try:
+                        vehicles = int(vehicles_match.group(1))
+                    except ValueError as e:
+                        logger.warning(f"车辆数解析失败: {vehicles_match.group(1)}, 错误: {e}")
+                
+                log = ShipLog(
+                    date=date,
+                    shift=shift,
+                    ship_name=ship_name,
+                    teu=teu,
+                    efficiency=None,  # 目前日志中没有效率数据
+                    vehicles=vehicles
+                )
+                logs.append(log)
+                
+        except Exception as e:
+            logger.warning(f"解析船次失败: {date} {shift}, 错误: {e}")
+    
+    def parse_from_file(self, filepath: str) -> List[ShipLog]:
+        """
+        从文件解析日志
+        
+        参数:
+            filepath: 文件路径
+        
+        返回:
+            船次日志列表
+        
+        异常:
+            FileNotFoundError: 文件不存在
+            LogParserError: 解析失败
+        """
+        try:
+            with open(filepath, 'r', encoding='utf-8') as f:
+                text = f.read()
+            
+            return self.parse(text)
+            
+        except (FileNotFoundError, LogParserError) as e:
+            # Both are documented above; re-raise them without wrapping them again
+            logger.error(f"从文件解析日志失败: {filepath}, 错误: {e}")
+            raise
+        except Exception as e:
+            error_msg = f"从文件解析日志失败: {filepath}, 错误: {e}"
+            logger.error(error_msg)
+            raise LogParserError(error_msg) from e
+
+
+if __name__ == '__main__':
+    # 测试代码
+    import sys
+    
+    # 设置日志
+    logging.basicConfig(level=logging.INFO)
+    
+    parser = HandoverLogParser()
+    
+    # 测试日期解析
+    test_dates = ["2025.12.30", "2025.01.01", "无效日期", "2025-12-30"]
+    for date in test_dates:
+        parsed = parser.parse_date(date)
+        print(f"解析日期 '{date}' -> '{parsed}'")
+    
+    # Log parsing test. parse() splits on SEPARATOR and reads the date and both shifts from one block.
+    test_text = """
+———————————————————————————————————————————————
+日期：2025.12.30
+白班：
+实船作业：123# 测试船1
+上场车辆数：5
+作业量/效率：100TEU，
+注意事项：无
+夜班：
+实船作业：456# 测试船2
+上场车辆数：3
+作业量/效率：80TEU，
+注意事项：无
+———————————————————————————————————————————————
+"""
+    
+    try:
+        logs = parser.parse(test_text)
+        print(f"\n解析到 {len(logs)} 条记录")
+        for log in logs:
+            print(f"  {log.date} {log.shift} {log.ship_name}: {log.teu}TEU, {log.vehicles}辆车")
+    except LogParserError as e:
+        print(f"日志解析失败: {e}")
+        sys.exit(1)
+    
+    # Merge test. The separator sits before the date so the date and shift share one block.
+    duplicate_text = """
+———————————————————————————————————————————————
+日期：2025.12.30
+白班：
+实船作业：123# 测试船1
+上场车辆数：5
+作业量/效率：100TEU，
+实船作业：123# 测试船1（二次靠泊）
+上场车辆数：3
+作业量/效率：50TEU，
+"""
+    
+    try:
+        logs = parser.parse(duplicate_text)
+        print(f"\n合并测试，解析到 {len(logs)} 条记录")
+        for log in logs:
+            print(f"  {log.date} {log.shift} {log.ship_name}: {log.teu}TEU, {log.vehicles}辆车")
+    except LogParserError as e:
+        print(f"合并测试失败: {e}")
+        sys.exit(1)
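
Review note on the merge step in `HandoverLogParser.parse`: records are keyed by `(date, shift, ship_name)`, and TEU and vehicle counts are summed while `None` is kept as "no data". The sketch below isolates that logic so it can be checked without the parser. `Record`, `add_optional` and `merge_records` are hypothetical names introduced for illustration, not part of the commit.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Record:
    """Hypothetical stand-in for ShipLog, reduced to the merged fields."""
    date: str
    shift: str
    ship_name: str
    teu: Optional[int] = None
    vehicles: Optional[int] = None


def add_optional(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Sum two optional counts; the result is None only if both are None."""
    if a is None:
        return b
    if b is None:
        return a
    return a + b


def merge_records(records: List[Record]) -> List[Record]:
    """Merge records that share date, shift and ship name, summing the counts."""
    merged: Dict[Tuple[str, str, str], Record] = {}
    for rec in records:
        key = (rec.date, rec.shift, rec.ship_name)
        if key not in merged:
            # Copy so the caller's objects are not mutated by later merges.
            merged[key] = Record(rec.date, rec.shift, rec.ship_name, rec.teu, rec.vehicles)
        else:
            kept = merged[key]
            kept.teu = add_optional(kept.teu, rec.teu)
            kept.vehicles = add_optional(kept.vehicles, rec.vehicles)
    return list(merged.values())


rows = [
    Record('2025-12-30', '白班', '测试船1', teu=100, vehicles=5),
    Record('2025-12-30', '白班', '测试船1', teu=50, vehicles=3),
    Record('2025-12-30', '夜班', '测试船2', teu=80),
]
result = merge_records(rows)
for rec in result:
    print(rec)
```

Because dicts preserve insertion order, the merged list keeps the order in which each key first appeared, which matches how the parser builds its result.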
--- a/src/confluence/manager.py
+++ b/src/confluence/manager.py
@@ -0,0 +1,354 @@
+#!/usr/bin/env python3
+"""
+Confluence content manager
+Provides higher-level Confluence content management features
+"""
+from typing import Dict, List, Optional, Any
+import logging
+from datetime import datetime
+
+from src.logging_config import get_logger
+from .client import ConfluenceClient, ConfluenceClientError
+from .parser import HTMLContentParser
+
+logger = get_logger(__name__)
+
+
+class ConfluenceContentManager:
+    """Confluence 内容管理器"""
+    
+    def __init__(self, client: Optional[ConfluenceClient] = None):
+        """
+        初始化内容管理器
+        
+        参数:
+            client: Confluence客户端实例，如果为None则创建新实例
+        """
+        self.client = client or ConfluenceClient()
+        self.parser = HTMLContentParser()
+        logger.debug("Confluence内容管理器初始化完成")
+    
+    def get_content_with_analysis(self, content_id: str) -> Dict[str, Any]:
+        """
+        获取内容并进行分析
+        
+        参数:
+            content_id: 页面ID
+        
+        返回:
+            包含内容和分析结果的字典
+        """
+        try:
+            logger.info(f"获取并分析Confluence内容: {content_id}")
+            
+            # 获取页面信息
+            page_info = self.client.get_page_info(content_id)
+            
+            # 获取HTML内容
+            html = self.client.get_html(content_id)
+            
+            # 分析内容
+            analysis = self.parser.analyze_content(html)
+            
+            # 提取纯文本（前500字符）
+            plain_text = self.parser.extract_plain_text(html)
+            preview_text = plain_text[:500] + "..." if len(plain_text) > 500 else plain_text
+            
+            result = {
+                'page_info': page_info,
+                'html_length': len(html),
+                'analysis': analysis,
+                'preview_text': preview_text,
+                'has_content': len(html) > 0,
+                'timestamp': datetime.now().isoformat()
+            }
+            
+            logger.info(f"内容分析完成: {content_id}")
+            return result
+            
+        except ConfluenceClientError as e:
+            logger.error(f"获取内容失败: {e}")
+            raise
+        except Exception as e:
+            error_msg = f"内容分析失败: {e}"
+            logger.error(error_msg)
+            raise ValueError(error_msg) from e
+    
+    def check_content_health(self, content_id: str) -> Dict[str, Any]:
+        """
+        检查内容健康状况
+        
+        参数:
+            content_id: 页面ID
+        
+        返回:
+            健康检查结果
+        """
+        try:
+            logger.info(f"检查内容健康状况: {content_id}")
+            
+            # 获取页面信息
+            page_info = self.client.get_page_info(content_id)
+            
+            # 获取HTML内容
+            html = self.client.get_html(content_id)
+            
+            # 分析内容
+            analysis = self.parser.analyze_content(html)
+            
+            # 检查健康状况
+            health_checks = {
+                'has_content': len(html) > 0,
+                'has_text': analysis['plain_text_length'] > 0,
+                'has_structure': analysis['has_tables'] or analysis['has_links'] or analysis['has_images'],
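+                'content_size_ok': 100 <= len(html) <= 1000000,  # 100 to 1,000,000 characters (len() counts characters, not bytes)
+                'text_ratio_ok': analysis['plain_text_length'] / max(len(html), 1) > 0.1,  # plain text is more than 10% of the HTML
+                'word_count_ok': analysis['word_count'] >= 10,  # at least 10 whitespace-separated tokens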
+                'has_links': analysis['has_links'],
+                'has_images': analysis['has_images'],
+                'has_tables': analysis['has_tables']
+            }
+            
+            # 计算健康分数
+            passed_checks = sum(1 for check in health_checks.values() if check)
+            total_checks = len(health_checks)
+            health_score = passed_checks / total_checks
+            
+            # 生成建议
+            suggestions = []
+            if not health_checks['has_content']:
+                suggestions.append("页面内容为空")
+            if not health_checks['has_text']:
+                suggestions.append("页面缺少文本内容")
+            if not health_checks['content_size_ok']:
+                suggestions.append("页面内容大小异常")
+            if not health_checks['text_ratio_ok']:
+                suggestions.append("文本占比过低")
+            if not health_checks['word_count_ok']:
+                suggestions.append("单词数量不足")
+            
+            result = {
+                'page_info': page_info,
+                'health_score': health_score,
+                'health_status': '健康' if health_score >= 0.8 else '警告' if health_score >= 0.5 else '异常',
+                'health_checks': health_checks,
+                'analysis': analysis,
+                'suggestions': suggestions,
+                'timestamp': datetime.now().isoformat()
+            }
+            
+            logger.info(f"健康检查完成: {content_id}, 分数: {health_score:.2f}")
+            return result
+            
+        except ConfluenceClientError as e:
+            logger.error(f"健康检查失败: {e}")
+            raise
+        except Exception as e:
+            error_msg = f"健康检查失败: {e}"
+            logger.error(error_msg)
+            raise ValueError(error_msg) from e
+    
+    def extract_content_summary(self, content_id: str, max_length: int = 200) -> Dict[str, Any]:
+        """
+        提取内容摘要
+        
+        参数:
+            content_id: 页面ID
+            max_length: 摘要最大长度
+        
+        返回:
+            内容摘要
+        """
+        try:
+            logger.info(f"提取内容摘要: {content_id}")
+            
+            # 获取页面信息
+            page_info = self.client.get_page_info(content_id)
+            
+            # 获取HTML内容
+            html = self.client.get_html(content_id)
+            
+            # 提取纯文本
+            plain_text = self.parser.extract_plain_text(html)
+            
+            # 生成摘要
+            if len(plain_text) <= max_length:
+                summary = plain_text
+            else:
+                # Try to cut at a sentence boundary ('. ')
+                sentences = plain_text.split('. ')
+                summary_parts = []
+                current_length = 0
+                
+                for sentence in sentences:
+                    if current_length + len(sentence) + 2 > max_length:  # +2 for ". "
+                        break
+                    summary_parts.append(sentence)
+                    current_length += len(sentence) + 2
+                
+                # Fall back to a hard cut when not even the first sentence fits
+                summary = '. '.join(summary_parts) or plain_text[:max_length]
+                if summary and not summary.endswith('.'):
+                    summary += '...'
+            
+            # 提取关键信息
+            links = self.parser.extract_links(html)
+            images = self.parser.extract_images(html)
+            tables = self.parser.extract_tables(html)
+            
+            result = {
+                'page_info': page_info,
+                'summary': summary,
+                'summary_length': len(summary),
+                'total_length': len(plain_text),
+                'key_elements': {
+                    'link_count': len(links),
+                    'image_count': len(images),
+                    'table_count': len(tables)
+                },
+                'has_rich_content': len(links) > 0 or len(images) > 0 or len(tables) > 0,
+                'timestamp': datetime.now().isoformat()
+            }
+            
+            logger.info(f"内容摘要提取完成: {content_id}")
+            return result
+            
+        except ConfluenceClientError as e:
+            logger.error(f"提取摘要失败: {e}")
+            raise
+        except Exception as e:
+            error_msg = f"提取摘要失败: {e}"
+            logger.error(error_msg)
+            raise ValueError(error_msg) from e
+    
+    def batch_analyze_pages(self, content_ids: List[str]) -> Dict[str, Any]:
+        """
+        批量分析多个页面
+        
+        参数:
+            content_ids: 页面ID列表
+        
+        返回:
+            批量分析结果
+        """
+        try:
+            logger.info(f"批量分析 {len(content_ids)} 个页面")
+            
+            results = []
+            errors = []
+            
+            for content_id in content_ids:
+                try:
+                    result = self.get_content_with_analysis(content_id)
+                    results.append(result)
+                    logger.debug(f"页面分析完成: {content_id}")
+                    
+                except Exception as e:
+                    errors.append({
+                        'content_id': content_id,
+                        'error': str(e)
+                    })
+                    logger.warning(f"页面分析失败: {content_id}, 错误: {e}")
+            
+            # 计算统计信息
+            if results:
+                total_pages = len(content_ids)
+                successful_pages = len(results)
+                failed_pages = len(errors)
+                
+                total_html_length = sum(r['html_length'] for r in results)
+                avg_html_length = total_html_length / successful_pages if successful_pages > 0 else 0
+                
+                stats = {
+                    'total_pages': total_pages,
+                    'successful_pages': successful_pages,
+                    'failed_pages': failed_pages,
+                    'success_rate': successful_pages / total_pages if total_pages > 0 else 0,
+                    'total_html_length': total_html_length,
+                    'avg_html_length': avg_html_length,
+                    'has_content_pages': sum(1 for r in results if r['has_content']),
+                    'timestamp': datetime.now().isoformat()
+                }
+            else:
+                stats = {
+                    'total_pages': len(content_ids),
+                    'successful_pages': 0,
+                    'failed_pages': len(errors),
+                    'success_rate': 0,
+                    'total_html_length': 0,
+                    'avg_html_length': 0,
+                    'has_content_pages': 0,
+                    'timestamp': datetime.now().isoformat()
+                }
+            
+            batch_result = {
+                'stats': stats,
+                'results': results,
+                'errors': errors,
+                'timestamp': datetime.now().isoformat()
+            }
+            
+            logger.info(f"批量分析完成: 成功 {len(results)} 个，失败 {len(errors)} 个")
+            return batch_result
+            
+        except Exception as e:
+            error_msg = f"批量分析失败: {e}"
+            logger.error(error_msg)
+            raise ValueError(error_msg) from e
+
+
+if __name__ == '__main__':
+    # 测试代码
+    import sys
+    
+    # 设置日志
+    logging.basicConfig(level=logging.INFO)
+    
+    try:
+        # 创建管理器
+        manager = ConfluenceContentManager()
+        
+        # 测试连接
+        from src.config import config
+        content_id = config.CONFLUENCE_CONTENT_ID
+        
+        if not content_id:
+            print("未配置CONFLUENCE_CONTENT_ID，跳过测试")
+            sys.exit(0)
+        
+        if manager.client.test_connection(content_id):
+            print("Confluence连接测试成功")
+            
+            # 测试内容分析
+            print("\n1. 测试内容分析:")
+            analysis = manager.get_content_with_analysis(content_id)
+            print(f"   页面标题: {analysis['page_info'].get('title')}")
+            print(f"   内容长度: {analysis['html_length']} 字符")
+            print(f"   文本预览: {analysis['preview_text'][:100]}...")
+            
+            # 测试健康检查
+            print("\n2. 测试健康检查:")
+            health = manager.check_content_health(content_id)
+            print(f"   健康分数: {health['health_score']:.2f}")
+            print(f"   健康状态: {health['health_status']}")
+            print(f"   建议: {health['suggestions']}")
+            
+            # 测试内容摘要
+            print("\n3. 测试内容摘要:")
+            summary = manager.extract_content_summary(content_id)
+            print(f"   摘要: {summary['summary']}")
+            print(f"   摘要长度: {summary['summary_length']} 字符")
+            print(f"   总长度: {summary['total_length']} 字符")
+            
+            print("\n所有测试通过")
+            
+        else:
+            print("Confluence连接测试失败")
+            sys.exit(1)
+            
+    except ConfluenceClientError as e:
+        print(f"Confluence客户端错误: {e}")
+        sys.exit(1)
+    except Exception as e:
+        print(f"未知错误: {e}")
+        sys.exit(1)
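
Review note on `check_content_health`: the score is the fraction of passed checks, and the three structure flags plus `has_structure` count toward it. The sketch below reproduces only the scoring and labeling step, using the same thresholds and labels as the diff. `health_status` is a hypothetical helper name, and the sample dict is made up to show what a page with text but no links, images or tables would score.

```python
from typing import Dict, Tuple


def health_status(checks: Dict[str, bool]) -> Tuple[float, str]:
    """Return (score, label), where score is the fraction of passed checks."""
    if not checks:
        raise ValueError("at least one check is required")
    score = sum(1 for ok in checks.values() if ok) / len(checks)
    if score >= 0.8:
        label = '健康'
    elif score >= 0.5:
        label = '警告'
    else:
        label = '异常'
    return score, label


# A text-only page: the content checks pass, the structure checks fail.
text_only = {
    'has_content': True, 'has_text': True, 'has_structure': False,
    'content_size_ok': True, 'text_ratio_ok': True, 'word_count_ok': True,
    'has_links': False, 'has_images': False, 'has_tables': False,
}
score, label = health_status(text_only)
print(f"{score:.2f} {label}")  # prints "0.56 警告"
```

With nine equally weighted checks, such a page passes five of them and lands in the warning band even though none of the suggestions would fire. Whether that is the intended behavior is a design question for the author; weighting the checks or scoring only the content checks would be one possible alternative.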
--- a/src/confluence/parser.py
+++ b/src/confluence/parser.py
@@ -0,0 +1,244 @@
+#!/usr/bin/env python3
+"""
+Confluence HTML content parser
+Provides parsing and formatting of Confluence HTML content
+"""
+import re
+from typing import Dict, List, Optional, Tuple
+import logging
+
+from src.logging_config import get_logger
+
+logger = get_logger(__name__)
+
+
+class HTMLContentParser:
+    """Confluence HTML 内容解析器"""
+    
+    def __init__(self):
+        """初始化解析器"""
+        logger.debug("HTML内容解析器初始化完成")
+    
+    def extract_plain_text(self, html: str) -> str:
+        """
+        从HTML中提取纯文本（简单版本）
+        
+        参数:
+            html: HTML字符串
+        
+        返回:
+            纯文本字符串
+        """
+        try:
+            # 移除HTML标签
+            text = re.sub(r'<[^>]+>', ' ', html)
+            # 合并多个空格
+            text = re.sub(r'\s+', ' ', text)
+            # Decode HTML entities (simple version); &amp; goes last to avoid double decoding
+            text = text.replace('&nbsp;', ' ').replace('&lt;', '<').replace('&gt;', '>').replace('&amp;', '&')
+            # 去除首尾空格
+            text = text.strip()
+            
+            logger.debug(f"提取纯文本完成，长度: {len(text)} 字符")
+            return text
+            
+        except Exception as e:
+            error_msg = f"提取纯文本失败: {e}"
+            logger.error(error_msg)
+            raise ValueError(error_msg) from e
+    
+    def extract_links(self, html: str) -> List[Dict[str, str]]:
+        """
+        从HTML中提取链接
+        
+        参数:
+            html: HTML字符串
+        
+        返回:
+            链接列表，每个链接包含 'text' 和 'url'
+        """
+        links = []
+        try:
+            # Simple regex match; only finds links whose text contains no nested tags
+            link_pattern = r'<a\s+[^>]*href=["\']([^"\']+)["\'][^>]*>([^<]+)</a>'
+            matches = re.findall(link_pattern, html, re.IGNORECASE)
+            
+            for url, text in matches:
+                links.append({
+                    'text': text.strip(),
+                    'url': url.strip()
+                })
+            
+            logger.debug(f"提取到 {len(links)} 个链接")
+            return links
+            
+        except Exception as e:
+            error_msg = f"提取链接失败: {e}"
+            logger.error(error_msg)
+            raise ValueError(error_msg) from e
+    
+    def extract_images(self, html: str) -> List[Dict[str, str]]:
+        """
+        从HTML中提取图片
+        
+        参数:
+            html: HTML字符串
+        
+        返回:
+            图片列表，每个图片包含 'src' 和 'alt'
+        """
+        images = []
+        try:
+            # Simple regex match; only finds <img> tags that have src followed by alt
+            img_pattern = r'<img\s+[^>]*src=["\']([^"\']+)["\'][^>]*alt=["\']([^"\']*)["\'][^>]*>'
+            matches = re.findall(img_pattern, html, re.IGNORECASE)
+            
+            for src, alt in matches:
+                images.append({
+                    'src': src.strip(),
+                    'alt': alt.strip()
+                })
+            
+            logger.debug(f"提取到 {len(images)} 张图片")
+            return images
+            
+        except Exception as e:
+            error_msg = f"提取图片失败: {e}"
+            logger.error(error_msg)
+            raise ValueError(error_msg) from e
+    
+    def extract_tables(self, html: str) -> List[List[List[str]]]:
+        """
+        从HTML中提取表格数据
+        
+        参数:
+            html: HTML字符串
+        
+        返回:
+            表格列表，每个表格是二维列表
+        """
+        tables = []
+        try:
+            # 简单的表格提取（仅支持简单表格）
+            table_pattern = r'<table[^>]*>(.*?)</table>'
+            table_matches = re.findall(table_pattern, html, re.IGNORECASE | re.DOTALL)
+            
+            for table_html in table_matches:
+                rows = []
+                # 提取行
+                row_pattern = r'<tr[^>]*>(.*?)</tr>'
+                row_matches = re.findall(row_pattern, table_html, re.IGNORECASE | re.DOTALL)
+                
+                for row_html in row_matches:
+                    cells = []
+                    # 提取单元格
+                    cell_pattern = r'<t[dh][^>]*>(.*?)</t[dh]>'
+                    cell_matches = re.findall(cell_pattern, row_html, re.IGNORECASE | re.DOTALL)
+                    
+                    for cell_html in cell_matches:
+                        # 清理单元格内容
+                        cell_text = re.sub(r'<[^>]+>', '', cell_html)
+                        cell_text = re.sub(r'\s+', ' ', cell_text).strip()
+                        cells.append(cell_text)
+                    
+                    rows.append(cells)
+                
+                if rows:  # 只添加非空表格
+                    tables.append(rows)
+            
+            logger.debug(f"提取到 {len(tables)} 个表格")
+            return tables
+            
+        except Exception as e:
+            error_msg = f"提取表格失败: {e}"
+            logger.error(error_msg)
+            raise ValueError(error_msg) from e
+    
+    def analyze_content(self, html: str) -> Dict[str, int]:  # values are counts and bool flags (bool is an int subtype)
+        """
+        分析HTML内容
+        
+        参数:
+            html: HTML字符串
+        
+        返回:
+            内容分析结果
+        """
+        try:
+            plain_text = self.extract_plain_text(html)
+            links = self.extract_links(html)
+            images = self.extract_images(html)
+            tables = self.extract_tables(html)
+            
+            analysis = {
+                'total_length': len(html),
+                'plain_text_length': len(plain_text),
+                'link_count': len(links),
+                'image_count': len(images),
+                'table_count': len(tables),
+                'word_count': len(plain_text.split()),
+                'line_count': plain_text.count('\n') + 1,
+                'has_tables': len(tables) > 0,
+                'has_images': len(images) > 0,
+                'has_links': len(links) > 0
+            }
+            
+            logger.info(f"内容分析完成: {analysis}")
+            return analysis
+            
+        except Exception as e:
+            error_msg = f"内容分析失败: {e}"
+            logger.error(error_msg)
+            raise ValueError(error_msg) from e
+
+
+if __name__ == '__main__':
+    # 测试代码
+    import sys
+    
+    # 设置日志
+    logging.basicConfig(level=logging.INFO)
+    
+    # 测试HTML
+    test_html = """
+    <html>
+    <body>
+        <h1>测试页面</h1>
+        <p>这是一个测试页面，包含<a href="https://example.com">链接</a>和图片。</p>
+        <img src="test.jpg" alt="测试图片">
+        <table>
+            <tr><th>标题1</th><th>标题2</th></tr>
+            <tr><td>数据1</td><td>数据2</td></tr>
+        </table>
+    </body>
+    </html>
+    """
+    
+    try:
+        parser = HTMLContentParser()
+        
+        # 测试纯文本提取
+        text = parser.extract_plain_text(test_html)
+        print(f"纯文本: {text[:100]}...")
+        
+        # 测试链接提取
+        links = parser.extract_links(test_html)
+        print(f"链接: {links}")
+        
+        # 测试图片提取
+        images = parser.extract_images(test_html)
+        print(f"图片: {images}")
+        
+        # 测试表格提取
+        tables = parser.extract_tables(test_html)
+        print(f"表格: {tables}")
+        
+        # 测试内容分析
+        analysis = parser.analyze_content(test_html)
+        print(f"内容分析: {analysis}")
+        
+        print("所有测试通过")
+        
+    except Exception as e:
+        print(f"测试失败: {e}")
+        sys.exit(1)
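
Review note on `extract_plain_text`: the hand-written `replace` chain covers only a few entities. A possible alternative, sketched below under the assumption that the standard library is acceptable here, is `html.unescape`, which decodes all named and numeric character references. `to_plain_text` is a hypothetical name; the order of steps (strip tags, decode, then collapse whitespace) is the point of the example.

```python
import html
import re


def to_plain_text(markup: str) -> str:
    """Strip tags, decode character references, then collapse whitespace."""
    # Tags are removed first so that a decoded "&lt;ok&gt;" is not mistaken for a tag.
    text = re.sub(r'<[^>]+>', ' ', markup)
    # html.unescape turns &nbsp; into U+00A0, &amp; into "&", and so on.
    text = html.unescape(text)
    # For str patterns, \s also matches U+00A0, so non-breaking spaces collapse too.
    return re.sub(r'\s+', ' ', text).strip()


print(to_plain_text('<p>A&nbsp;&amp;&nbsp;B &lt;ok&gt;</p>'))
```

Collapsing whitespace after decoding, rather than before as in the diff, is what lets the non-breaking spaces produced by `&nbsp;` be normalized in the same pass.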
--- a/src/confluence/text.py
+++ b/src/confluence/text.py
@@ -0,0 +1,309 @@
+#!/usr/bin/env python3
+"""
+HTML text extraction module
+Improved exception handling and type hints
+"""
+import re
+from bs4 import BeautifulSoup, Tag, NavigableString
+from typing import List, Optional, Any, Union
+import logging
+
+from src.config import config
+from src.logging_config import get_logger
+
+logger = get_logger(__name__)
+
+
+class HTMLTextExtractorError(Exception):
+    """Raised when HTML text extraction fails."""
+    pass
+
+
+class HTMLTextExtractor:
+    """HTML text extractor that preserves the layout structure."""
+
+    # Tags treated as block-level elements
+    BLOCK_TAGS = {
+        'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'div', 'section',
+        'table', 'tr', 'td', 'th', 'li', 'ul', 'ol', 'blockquote',
+        'pre', 'hr', 'br', 'tbody', 'thead', 'tfoot'
+    }
+    
+    def __init__(self):
+        """Initialize the extractor."""
+        self.output_lines: List[str] = []
+
+    def extract(self, html: str) -> str:
+        """
+        Extract layout-preserving text from HTML.
+
+        Args:
+            html: HTML string
+
+        Returns:
+            Formatted plain text ('' for None or an empty string)
+
+        Raises:
+            HTMLTextExtractorError: HTML parsing failed
+            ValueError: html is neither a string nor None
+        """
+        if html is not None and not isinstance(html, str):
+            error_msg = f"Invalid type for html: expected str, got {type(html)}"
+            logger.error(error_msg)
+            raise ValueError(error_msg)
+
+        if not html:
+            logger.warning("HTML content is empty")
+            return ''
+        
+        try:
+            logger.debug(f"Parsing HTML, length: {len(html)} characters")
+            soup = BeautifulSoup(html, 'html.parser')
+
+            # Remove elements that carry no readable text
+            for tag in soup(["script", "style", "noscript"]):
+                tag.decompose()
+
+            # Remove Confluence macros
+            for macro in soup.find_all(attrs={"ac:name": True}):
+                macro.decompose()
+
+            self.output_lines = []
+
+            # Process the body, or the whole document if there is no body
+            body = soup.body if soup.body else soup
+            for child in body.children:
+                self._process_node(child)
+
+            # Clean up: collapse runs of blank lines and strip trailing spaces
+            result = ''.join(self.output_lines)
+            result = re.sub(r'\n\s*\n\s*\n', '\n\n', result)
+            result = '\n'.join(line.rstrip() for line in result.split('\n'))
+
+            logger.info(f"HTML extraction complete, output length: {len(result)} characters")
+            return result.strip()
+
+        except Exception as e:
+            error_msg = f"HTML parsing failed: {e}"
+            logger.error(error_msg)
+            raise HTMLTextExtractorError(error_msg) from e
+    
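+    def _process_node(self, node: Union[Tag, NavigableString], indent: int = 0, 
+                     list_context: Optional[tuple] = None) -> None:
+        """Recursively process a node; special strings (comments, doctype, CDATA) are skipped."""
+        if isinstance(node, NavigableString):
+            text = str(node).strip() if type(node) is NavigableString else ''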
+            if text:
+                text = re.sub(r'\s+', ' ', text)
+                if self.output_lines and not self.output_lines[-1].endswith('\n'):
+                    self.output_lines[-1] += text
+                else:
+                    self.output_lines.append(' ' * indent + text)
+            return
+        
+        if not isinstance(node, Tag):
+            return
+        
+        tag_name = node.name.lower()
+        is_block = tag_name in self.BLOCK_TAGS
+        
+        # Start block-level elements on a new line
+        if is_block and self.output_lines and not self.output_lines[-1].endswith('\n'):
+            self.output_lines.append('\n')
+
+        # Handle specific tags
+        if tag_name in ('h1', 'h2', 'h3', 'h4', 'h5', 'h6'):
+            try:
+                level = int(tag_name[1])
+                prefix = '#' * level + ' '
+                text = node.get_text().strip()
+                if text:
+                    self.output_lines.append(' ' * indent + prefix + text + '\n')
+            except (ValueError, IndexError) as e:
+                logger.warning(f"Failed to parse heading tag: {tag_name}, error: {e}")
+            return
+        
+        elif tag_name == 'p':
+            text = node.get_text().strip()
+            if text:
+                self.output_lines.append(' ' * indent + text + '\n')
+            return
+        
+        elif tag_name == 'hr':
+            self.output_lines.append(' ' * indent + config.SEPARATOR_CHAR * config.SEPARATOR_LENGTH + '\n')
+            return
+        
+        elif tag_name == 'br':
+            self.output_lines.append('\n')
+            return
+        
+        elif tag_name == 'table':
+            self._process_table(node, indent)
+            return
+        
+        elif tag_name in ('ul', 'ol'):
+            self._process_list(node, indent, tag_name)
+            return
+        
+        elif tag_name == 'li':
+            self._process_list_item(node, indent, list_context)
+            return
+        
+        elif tag_name == 'a':
+            try:
+                href = node.get('href', '')
+                text = node.get_text().strip()
+                if href and text:
+                    self.output_lines.append(f'{text} ({href})')
+                elif text:
+                    self.output_lines.append(text)
+            except Exception as e:
+                logger.warning(f"Failed to parse link tag: {e}")
+            return
+        
+        elif tag_name in ('strong', 'b'):
+            text = node.get_text().strip()
+            if text:
+                self.output_lines.append(f'**{text}**')
+            return
+        
+        elif tag_name in ('em', 'i'):
+            text = node.get_text().strip()
+            if text:
+                self.output_lines.append(f'*{text}*')
+            return
+        
+        else:
+            # Default: recurse into child nodes
+            for child in node.children:
+                self._process_node(child, indent, list_context)
+        
+        if is_block and self.output_lines and not self.output_lines[-1].endswith('\n'):
+            self.output_lines.append('\n')
+    
+    def _process_table(self, table: Tag, indent: int) -> None:
+        """Render a table as whitespace-aligned columns."""
+        try:
+            rows = []
+            for tr in table.find_all('tr'):
+                row = []
+                for td in tr.find_all(['td', 'th']):
+                    row.append(td.get_text().strip())
+                if row:
+                    rows.append(row)
+
+            if rows:
+                # Compute column widths (in characters, via len())
+                col_widths = []
+                max_cols = max(len(r) for r in rows)
+                for i in range(max_cols):
+                    col_width = max((len(r[i]) if i < len(r) else 0) for r in rows)
+                    col_widths.append(col_width)
+
+                for row in rows:
+                    line = ' ' * indent
+                    for i, cell in enumerate(row):
+                        width = col_widths[i] if i < len(col_widths) else 0
+                        line += cell.ljust(width) + '  '
+                    self.output_lines.append(line.rstrip() + '\n')
+                self.output_lines.append('\n')
+
+        except Exception as e:
+            logger.warning(f"Failed to process table: {e}")
+            # Fallback: emit the table's raw text
+            table_text = table.get_text().strip()
+            if table_text:
+                self.output_lines.append(' ' * indent + table_text + '\n')
+    
+    def _process_list(self, ul: Tag, indent: int, list_type: str) -> None:
+        """Process a list (`ul` or `ol`); only ordered lists advance the item counter."""
+        try:
+            counter = 1 if list_type == 'ol' else None
+            for child in ul.children:
+                if isinstance(child, Tag) and child.name == 'li':
+                    ctx = (list_type, counter or 1)
+                    self._process_list_item(child, indent, ctx)
+                    if counter:
+                        counter += 1
+                else:
+                    self._process_node(child, indent, (list_type, 1) if not counter else None)
+        except Exception as e:
+            logger.warning(f"Failed to process list: {e}")
+    
+    def _process_list_item(self, li: Tag, indent: int, list_context: Optional[tuple]) -> None:
+        """Process a list item."""
+        try:
+            prefix = ''
+            if list_context:
+                list_type, num = list_context
+                prefix = '• ' if list_type == 'ul' else f'{num}. '
+
+            # Collect direct text and links; a link without href keeps its text
+            direct_parts = []
+            for child in li.children:
+                if isinstance(child, NavigableString):
+                    text = str(child).strip()
+                    if text:
+                        direct_parts.append(text)
+                elif isinstance(child, Tag) and child.name == 'a':
+                    href = child.get('href', '')
+                    link_text = child.get_text().strip()
+                    if link_text:
+                        direct_parts.append(f'{link_text} ({href})' if href else link_text)
+
+            if direct_parts:
+                self.output_lines.append(' ' * indent + prefix + ' '.join(direct_parts) + '\n')
+
+            # Process the remaining child elements (nested lists, formatting tags, ...)
+            for child in li.children:
+                if isinstance(child, Tag) and child.name != 'a':
+                    self._process_node(child, indent + 2, None)
+
+        except Exception as e:
+            logger.warning(f"Failed to process list item: {e}")
+
+
+if __name__ == '__main__':
+    # Smoke test
+    import sys
+
+    # Configure logging
+    logging.basicConfig(level=logging.INFO)
+
+    extractor = HTMLTextExtractor()
+
+    # Case 1: well-formed HTML
+    html = "<h1>标题</h1><p>段落</p><ul><li>项目1</li><li>项目2</li></ul>"
+    try:
+        result = extractor.extract(html)
+        print(f"Case 1 - well-formed HTML result:\n{result}")
+    except Exception as e:
+        print(f"Case 1 failed: {e}")
+
+    # Case 2: empty HTML
+    try:
+        result = extractor.extract("")
+        print(f"Case 2 - empty HTML result: '{result}'")
+    except Exception as e:
+        print(f"Case 2 failed: {e}")
+
+    # Case 3: malformed HTML
+    try:
+        result = extractor.extract("<invalid>html")
+        print(f"Case 3 - malformed HTML result:\n{result}")
+    except Exception as e:
+        print(f"Case 3 failed: {e}")
+
+    # Case 4: table
+    table_html = """
+    <table>
+        <tr><th>姓名</th><th>年龄</th></tr>
+        <tr><td>张三</td><td>25</td></tr>
+        <tr><td>李四</td><td>30</td></tr>
+    </table>
+    """
+    try:
+        result = extractor.extract(table_html)
+        print(f"Case 4 - table result:\n{result}")
+    except Exception as e:
+        print(f"Case 4 failed: {e}")
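Review note on `_process_table` in `src/confluence/text.py`: cells are padded with `str.ljust`, which counts characters. CJK text such as the sample table above usually takes two terminal columns per character, so columns can drift in a monospace terminal. The following is a minimal sketch of width-aware padding, not part of this commit. It assumes the standard-library `unicodedata.east_asian_width` classification is a good enough proxy for display width; `display_width`, `pad_cell`, and `format_rows` are hypothetical helper names.

```python
import unicodedata
from typing import List


def display_width(text: str) -> int:
    """Approximate terminal columns: wide/fullwidth characters count as 2."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1 for ch in text)


def pad_cell(text: str, width: int) -> str:
    """Right-pad with spaces up to `width` display columns."""
    return text + " " * max(0, width - display_width(text))


def format_rows(rows: List[List[str]]) -> List[str]:
    """Align (possibly ragged) rows into columns separated by two spaces."""
    max_cols = max(len(row) for row in rows)
    widths = [
        max((display_width(row[i]) if i < len(row) else 0) for row in rows)
        for i in range(max_cols)
    ]
    return [
        "  ".join(pad_cell(cell, widths[i]) for i, cell in enumerate(row)).rstrip()
        for row in rows
    ]


if __name__ == "__main__":
    for line in format_rows([["姓名", "年龄"], ["张三", "25"], ["李四", "30"]]):
        print(line)
```

Combining marks and ambiguous-width characters are counted as one column here, so this is an approximation rather than a full terminal-width implementation.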
--- a/src/database.py
+++ b/src/database.py
@@ -1,253 +0,0 @@
-#!/usr/bin/env python3
-"""
-数据库模块
-"""
-import sqlite3
-import os
-from datetime import datetime
-from typing import List, Dict, Optional
-
-
-class DailyLogsDatabase:
-    """每日交接班日志数据库"""
-    
-    def __init__(self, db_path: str = 'data/daily_logs.db'):
-        """
-        初始化数据库
-        
-        参数:
-            db_path: 数据库文件路径
-        """
-        self.db_path = db_path
-        self._ensure_directory()
-        self.conn = self._connect()
-        self._init_schema()
-    
-    def _ensure_directory(self):
-        """确保数据目录存在"""
-        data_dir = os.path.dirname(self.db_path)
-        if data_dir and not os.path.exists(data_dir):
-            os.makedirs(data_dir)
-    
-    def _connect(self) -> sqlite3.Connection:
-        """连接数据库"""
-        conn = sqlite3.connect(self.db_path)
-        conn.row_factory = sqlite3.Row
-        return conn
-    
-    def _init_schema(self):
-        """初始化表结构"""
-        cursor = self.conn.cursor()
-        
-        cursor.execute('''
-            CREATE TABLE IF NOT EXISTS daily_handover_logs (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                date TEXT NOT NULL,
-                shift TEXT NOT NULL,
-                ship_name TEXT NOT NULL,
-                teu INTEGER,
-                efficiency REAL,
-                vehicles INTEGER,
-                created_at TEXT DEFAULT CURRENT_TIMESTAMP,
-                UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
-            )
-        ''')
-        
-        # 检查是否需要迁移旧表结构
-        cursor.execute("SELECT sql FROM sqlite_master WHERE type='table' AND name='daily_handover_logs'")
-        table_sql = cursor.fetchone()[0]
-        if 'UNIQUE' not in table_sql:
-            # 旧表结构，需要迁移
-            print("检测到旧表结构，正在迁移...")
-            
-            # 重命名旧表
-            cursor.execute('ALTER TABLE daily_handover_logs RENAME TO daily_handover_logs_old')
-            
-            # 创建新表
-            cursor.execute('''
-                CREATE TABLE daily_handover_logs (
-                    id INTEGER PRIMARY KEY AUTOINCREMENT,
-                    date TEXT NOT NULL,
-                    shift TEXT NOT NULL,
-                    ship_name TEXT NOT NULL,
-                    teu INTEGER,
-                    efficiency REAL,
-                    vehicles INTEGER,
-                    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
-                    UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
-                )
-            ''')
-            
-            # 复制数据（忽略重复）
-            cursor.execute('''
-                INSERT OR IGNORE INTO daily_handover_logs 
-                (date, shift, ship_name, teu, efficiency, vehicles, created_at)
-                SELECT date, shift, ship_name, teu, efficiency, vehicles, created_at 
-                FROM daily_handover_logs_old
-            ''')
-            
-            # 删除旧表
-            cursor.execute('DROP TABLE daily_handover_logs_old')
-            print("迁移完成！")
-        
-        # 索引
-        cursor.execute('CREATE INDEX IF NOT EXISTS idx_date ON daily_handover_logs(date)')
-        cursor.execute('CREATE INDEX IF NOT EXISTS idx_ship ON daily_handover_logs(ship_name)')
-        
-        # 创建未统计月报数据表
-        cursor.execute('''
-            CREATE TABLE IF NOT EXISTS monthly_unaccounted (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                year_month TEXT NOT NULL UNIQUE,
-                teu INTEGER NOT NULL,
-                note TEXT,
-                created_at TEXT DEFAULT CURRENT_TIMESTAMP
-            )
-        ''')
-        
-        self.conn.commit()
-    
-    def insert(self, log: Dict) -> bool:
-        """插入记录（存在则替换，不存在则插入）"""
-        try:
-            cursor = self.conn.cursor()
-            # 使用 INSERT OR REPLACE 来更新已存在的记录
-            cursor.execute('''
-                INSERT OR REPLACE INTO daily_handover_logs 
-                (date, shift, ship_name, teu, efficiency, vehicles, created_at)
-                VALUES (?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
-            ''', (
-                log['date'], log['shift'], log['ship_name'],
-                log.get('teu'), log.get('efficiency'), log.get('vehicles')
-            ))
-            self.conn.commit()
-            return True
-        except sqlite3.Error as e:
-            print(f"数据库错误: {e}")
-            return False
-    
-    def insert_many(self, logs: List[Dict]) -> int:
-        """批量插入"""
-        count = 0
-        for log in logs:
-            if self.insert(log):
-                count += 1
-        return count
-    
-    def query_by_date(self, date: str) -> List[Dict]:
-        """按日期查询"""
-        cursor = self.conn.cursor()
-        cursor.execute('''
-            SELECT * FROM daily_handover_logs 
-            WHERE date = ? ORDER BY shift, ship_name
-        ''', (date,))
-        return [dict(row) for row in cursor.fetchall()]
-    
-    def query_by_ship(self, ship_name: str) -> List[Dict]:
-        """按船名查询"""
-        cursor = self.conn.cursor()
-        cursor.execute('''
-            SELECT * FROM daily_handover_logs 
-            WHERE ship_name LIKE ? ORDER BY date DESC
-        ''', (f'%{ship_name}%',))
-        return [dict(row) for row in cursor.fetchall()]
-    
-    def query_all(self, limit: int = 1000) -> List[Dict]:
-        """查询所有"""
-        cursor = self.conn.cursor()
-        cursor.execute('''
-            SELECT * FROM daily_handover_logs 
-            ORDER BY date DESC, shift LIMIT ?
-        ''', (limit,))
-        return [dict(row) for row in cursor.fetchall()]
-    
-    def get_stats(self) -> Dict:
-        """获取统计信息"""
-        cursor = self.conn.cursor()
-        
-        cursor.execute('SELECT COUNT(*) FROM daily_handover_logs')
-        total = cursor.fetchone()[0]
-        
-        cursor.execute('SELECT DISTINCT ship_name FROM daily_handover_logs')
-        ships = [row[0] for row in cursor.fetchall()]
-        
-        cursor.execute('SELECT MIN(date), MAX(date) FROM daily_handover_logs')
-        date_range = cursor.fetchone()
-        
-        return {
-            'total': total,
-            'ships': ships,
-            'date_range': {'start': date_range[0], 'end': date_range[1]}
-        }
-    
-    def get_ships_with_monthly_teu(self, year_month: str = None) -> List[Dict]:
-        """获取所有船只及其当月TEU总量"""
-        cursor = self.conn.cursor()
-        
-        if year_month:
-            # 按年月筛选
-            cursor.execute('''
-                SELECT ship_name, SUM(teu) as monthly_teu
-                FROM daily_handover_logs
-                WHERE date LIKE ?
-                GROUP BY ship_name
-                ORDER BY monthly_teu DESC
-            ''', (f'{year_month}%',))
-        else:
-            # 全部
-            cursor.execute('''
-                SELECT ship_name, SUM(teu) as monthly_teu
-                FROM daily_handover_logs
-                GROUP BY ship_name
-                ORDER BY monthly_teu DESC
-            ''')
-        
-        return [{'ship_name': row[0], 'monthly_teu': row[1]} for row in cursor.fetchall()]
-    
-    def insert_unaccounted(self, year_month: str, teu: int, note: str = '') -> bool:
-        """插入未统计数据"""
-        try:
-            cursor = self.conn.cursor()
-            cursor.execute('''
-                INSERT OR REPLACE INTO monthly_unaccounted 
-                (year_month, teu, note, created_at)
-                VALUES (?, ?, ?, CURRENT_TIMESTAMP)
-            ''', (year_month, teu, note))
-            self.conn.commit()
-            return True
-        except sqlite3.Error as e:
-            print(f"数据库错误: {e}")
-            return False
-    
-    def get_unaccounted(self, year_month: str) -> int:
-        """获取指定月份的未统计数据"""
-        cursor = self.conn.cursor()
-        cursor.execute(
-            'SELECT teu FROM monthly_unaccounted WHERE year_month = ?',
-            (year_month,)
-        )
-        result = cursor.fetchone()
-        return result[0] if result else 0
-    
-    def close(self):
-        """关闭连接"""
-        if self.conn:
-            self.conn.close()
-
-
-if __name__ == '__main__':
-    db = DailyLogsDatabase()
-    
-    # 测试插入
-    test_log = {
-        'date': '2025-12-28',
-        'shift': '白班',
-        'ship_name': '测试船',
-        'teu': 100,
-        'efficiency': 3.5,
-        'vehicles': 5
-    }
-    
-    db.insert(test_log)
-    print(f'总记录: {db.get_stats()["total"]}')
-    db.close()
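Review note on `insert_many`: both the removed version above and its replacement in `src/database/daily_logs.py` insert one row at a time, and the new version opens a fresh connection per row through `execute_update`. The following is a hedged sketch of a batched alternative that uses `executemany` inside one transaction. `insert_many_batched` and `SCHEMA_SQL` are hypothetical names that are not part of this commit, and the semantics differ: the batch is all-or-nothing instead of counting per-row successes.

```python
import sqlite3
from typing import Any, Dict, List

# Same table definition as in the diff, repeated so the sketch is self-contained.
SCHEMA_SQL = """
    CREATE TABLE IF NOT EXISTS daily_handover_logs (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        date TEXT NOT NULL,
        shift TEXT NOT NULL,
        ship_name TEXT NOT NULL,
        teu INTEGER,
        efficiency REAL,
        vehicles INTEGER,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
        UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
    )
"""

INSERT_SQL = """
    INSERT OR REPLACE INTO daily_handover_logs
    (date, shift, ship_name, teu, efficiency, vehicles, created_at)
    VALUES (?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
"""


def insert_many_batched(conn: sqlite3.Connection, logs: List[Dict[str, Any]]) -> int:
    """Insert all records in one transaction; returns the number of records submitted."""
    rows = [
        (log["date"], log["shift"], log["ship_name"],
         log.get("teu"), log.get("efficiency"), log.get("vehicles"))
        for log in logs
    ]
    # Using the connection as a context manager commits on success
    # and rolls back if executemany raises.
    with conn:
        conn.executemany(INSERT_SQL, rows)
    return len(rows)


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute(SCHEMA_SQL)
    submitted = insert_many_batched(demo, [
        {"date": "2025-12-28", "shift": "day", "ship_name": "demo", "teu": 100},
    ])
    print(f"submitted: {submitted}")
    demo.close()
```

Records that share the same `(date, shift, ship_name)` key replace each other, so the table can end up with fewer rows than were submitted.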
--- a/src/database/__init__.py
+++ b/src/database/__init__.py
@@ -0,0 +1,15 @@
+#!/usr/bin/env python3
+"""
+Database package.
+Provides a unified database interface.
+"""
+from src.database.base import DatabaseBase, DatabaseConnectionError
+from src.database.daily_logs import DailyLogsDatabase
+from src.database.schedules import ScheduleDatabase
+
+__all__ = [
+    'DatabaseBase',
+    'DatabaseConnectionError',
+    'DailyLogsDatabase',
+    'ScheduleDatabase'
+]
--- a/src/database/base.py
+++ b/src/database/base.py
@@ -0,0 +1,257 @@
+#!/usr/bin/env python3
+"""
+Database base class module.
+Provides unified connection management and a connection context manager.
+"""
+import os
+import sqlite3
+from contextlib import contextmanager
+from typing import Generator, Optional, Any
+from pathlib import Path
+
+from src.config import config
+from src.logging_config import get_logger
+
+logger = get_logger(__name__)
+
+
+class DatabaseConnectionError(Exception):
+    """Raised when a database connection cannot be established."""
+    pass
+
+
+class DatabaseBase:
+    """Database base class providing unified connection management."""
+
+    def __init__(self, db_path: Optional[str] = None):
+        """
+        Initialize the database base class.
+
+        Args:
+            db_path: Path to the database file; if None, the configured default is used
+        """
+        self.db_path = db_path or config.DATABASE_PATH
+        self._connection: Optional[sqlite3.Connection] = None
+        self._ensure_directory()
+
+    def _ensure_directory(self):
+        """Ensure the database directory exists."""
+        data_dir = os.path.dirname(self.db_path)
+        if data_dir and not os.path.exists(data_dir):
+            os.makedirs(data_dir, exist_ok=True)
+            logger.info(f"Created database directory: {data_dir}")
+    
+    def _connect(self) -> sqlite3.Connection:
+        """
+        Create a database connection.
+
+        Returns:
+            A sqlite3.Connection object
+
+        Raises:
+            DatabaseConnectionError: If the connection fails
+        """
+        try:
+            conn = sqlite3.connect(self.db_path)
+            conn.row_factory = sqlite3.Row
+            logger.debug(f"Database connection established: {self.db_path}")
+            return conn
+        except sqlite3.Error as e:
+            error_msg = f"Database connection failed: {self.db_path}, error: {e}"
+            logger.error(error_msg)
+            raise DatabaseConnectionError(error_msg) from e
+
+    @contextmanager
+    def get_connection(self) -> Generator[sqlite3.Connection, None, None]:
+        """
+        Context manager that yields a database connection.
+
+        Example:
+            with self.get_connection() as conn:
+                cursor = conn.cursor()
+                cursor.execute(...)
+
+        Yields:
+            A connection that is closed on exit; callers must commit explicitly
+        """
+        conn = None
+        try:
+            conn = self._connect()
+            yield conn
+        except sqlite3.Error as e:
+            logger.error(f"Database operation failed: {e}")
+            raise
+        finally:
+            if conn:
+                conn.close()
+                logger.debug("Database connection closed")
+    
+    def execute_query(self, query: str, params: tuple = ()) -> list:
+        """
+        Execute a query and return the results.
+
+        Args:
+            query: SQL query statement
+            params: Query parameters
+
+        Returns:
+            List of result rows, each converted to a dict
+        """
+        with self.get_connection() as conn:
+            cursor = conn.cursor()
+            cursor.execute(query, params)
+            return [dict(row) for row in cursor.fetchall()]
+
+    def execute_update(self, query: str, params: tuple = ()) -> int:
+        """
+        Execute a write statement and commit it.
+
+        Args:
+            query: SQL statement
+            params: Statement parameters
+
+        Returns:
+            Number of affected rows
+        """
+        with self.get_connection() as conn:
+            cursor = conn.cursor()
+            cursor.execute(query, params)
+            conn.commit()
+            return cursor.rowcount
+
+    def execute_many(self, query: str, params_list: list) -> int:
+        """
+        Execute a statement once per parameter set and commit.
+
+        Args:
+            query: SQL statement
+            params_list: List of parameter tuples
+
+        Returns:
+            Total number of affected rows
+        """
+        with self.get_connection() as conn:
+            cursor = conn.cursor()
+            cursor.executemany(query, params_list)
+            conn.commit()
+            return cursor.rowcount
+    
+    def table_exists(self, table_name: str) -> bool:
+        """
+        Check whether a table exists.
+
+        Args:
+            table_name: Table name
+
+        Returns:
+            True if the table exists
+        """
+        query = """
+            SELECT name FROM sqlite_master 
+            WHERE type='table' AND name=?
+        """
+        result = self.execute_query(query, (table_name,))
+        return len(result) > 0
+
+    def get_table_info(self, table_name: str) -> list:
+        """
+        Get the column definitions of a table.
+
+        Args:
+            table_name: Table name (quoted as an identifier before it is interpolated)
+
+        Returns:
+            List of column-info dicts from PRAGMA table_info
+        """
+        with self.get_connection() as conn:
+            cursor = conn.cursor()
+            cursor.execute('PRAGMA table_info("{}")'.format(table_name.replace('"', '""')))
+            return [dict(row) for row in cursor.fetchall()]
+
+    def vacuum(self):
+        """Run VACUUM to compact the database file."""
+        with self.get_connection() as conn:
+            conn.execute("VACUUM")
+            logger.info("Database vacuum complete")
+    
+    def backup(self, backup_path: Optional[str] = None):
+        """
+        Back up the database.
+
+        Args:
+            backup_path: Backup file path; if None, a default path under backups/ is used
+        """
+        if backup_path is None:
+            backup_dir = "backups"
+            os.makedirs(backup_dir, exist_ok=True)
+            from datetime import datetime
+            # The default name uses the database file's last-modified time, not the current time
+            dt = datetime.fromtimestamp(os.path.getmtime(self.db_path))
+            backup_path = os.path.join(
+                backup_dir, 
+                f"backup_{dt.strftime('%Y%m%d_%H%M%S')}.db"
+            )
+
+        try:
+            with self.get_connection() as src_conn:
+                dest_conn = sqlite3.connect(backup_path)
+                src_conn.backup(dest_conn)
+                dest_conn.close()
+            logger.info(f"Database backup complete: {backup_path}")
+        except sqlite3.Error as e:
+            logger.error(f"Database backup failed: {e}")
+            raise
+
+
+# Database connection pool (optional, for high-throughput scenarios)
+class ConnectionPool:
+    """Simple database connection pool (no locking, so not safe to share across threads)."""
+
+    def __init__(self, db_path: str, max_connections: int = 5):
+        self.db_path = db_path
+        self.max_connections = max_connections
+        self._connections: list[sqlite3.Connection] = []
+        self._in_use: set[sqlite3.Connection] = set()
+
+    @contextmanager
+    def get_connection(self) -> Generator[sqlite3.Connection, None, None]:
+        """Borrow a connection from the pool and return it on exit."""
+        conn = None
+        try:
+            if self._connections:
+                conn = self._connections.pop()
+            elif len(self._in_use) < self.max_connections:
+                conn = sqlite3.connect(self.db_path)
+                conn.row_factory = sqlite3.Row
+            else:
+                raise DatabaseConnectionError("Connection pool exhausted")
+
+            self._in_use.add(conn)
+            yield conn
+        finally:
+            if conn:
+                self._in_use.remove(conn)
+                self._connections.append(conn)
+
+
+if __name__ == '__main__':
+    # Smoke-test the database base class
+    db = DatabaseBase()
+
+    # Test the connection
+    with db.get_connection() as conn:
+        cursor = conn.cursor()
+        cursor.execute("SELECT sqlite_version()")
+        version = cursor.fetchone()[0]
+        print(f"SQLite version: {version}")
+
+    # Test a query (sqlite_master has no row for itself, so nothing is printed here)
+    if db.table_exists("sqlite_master"):
+        print("sqlite_master table exists")
+
+    # Test backup
+    try:
+        db.backup("test_backup.db")
+        print("Backup test complete")
+    except Exception as e:
+        print(f"Backup test failed: {e}")
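Review note on `get_connection` in `src/database/base.py`: it closes the connection on exit but leaves commit and rollback to the caller, which is why `execute_update` and `execute_many` call `conn.commit()` themselves. The following is a minimal sketch of a variant that commits on success and rolls back on error. `transaction` is a hypothetical name, not part of this commit.

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def transaction(db_path: str) -> Iterator[sqlite3.Connection]:
    """Yield a connection; commit if the block succeeds, roll back if it raises."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        # The connection is always closed, whether or not the block succeeded.
        conn.close()


if __name__ == "__main__":
    import os
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    with transaction(path) as conn:
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
    with transaction(path) as conn:
        print(conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])
```

With this shape a caller cannot forget the commit, and a failure partway through a multi-statement block leaves no partial writes behind.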
--- a/src/database/daily_logs.py
+++ b/src/database/daily_logs.py
@@ -0,0 +1,336 @@
+#!/usr/bin/env python3
+"""
+Daily shift-handover log database module.
+Refactored on top of the new database base class.
+"""
+from typing import List, Dict, Optional, Any
+from datetime import datetime
+
+from src.database.base import DatabaseBase
+from src.logging_config import get_logger
+
+logger = get_logger(__name__)
+
+
+class DailyLogsDatabase(DatabaseBase):
+    """Daily shift-handover log database."""
+
+    def __init__(self, db_path: Optional[str] = None):
+        """
+        Initialize the database.
+
+        Args:
+            db_path: Path to the database file; if None, the configured default is used
+        """
+        super().__init__(db_path)
+        self._init_schema()
+    
+    def _init_schema(self):
+        """Initialize the table schema."""
+        with self.get_connection() as conn:
+            cursor = conn.cursor()
+
+            # Create the daily shift-handover log table
+            cursor.execute('''
+                CREATE TABLE IF NOT EXISTS daily_handover_logs (
+                    id INTEGER PRIMARY KEY AUTOINCREMENT,
+                    date TEXT NOT NULL,
+                    shift TEXT NOT NULL,
+                    ship_name TEXT NOT NULL,
+                    teu INTEGER,
+                    efficiency REAL,
+                    vehicles INTEGER,
+                    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
+                    UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
+                )
+            ''')
+            
+            # Check whether the old table schema needs to be migrated
+            cursor.execute("SELECT sql FROM sqlite_master WHERE type='table' AND name='daily_handover_logs'")
+            table_sql = cursor.fetchone()[0]
+            if 'UNIQUE' not in table_sql:
+                logger.warning("Old table schema detected, migrating...")
+
+                # Rename the old table
+                cursor.execute('ALTER TABLE daily_handover_logs RENAME TO daily_handover_logs_old')
+
+                # Create the new table
+                cursor.execute('''
+                    CREATE TABLE daily_handover_logs (
+                        id INTEGER PRIMARY KEY AUTOINCREMENT,
+                        date TEXT NOT NULL,
+                        shift TEXT NOT NULL,
+                        ship_name TEXT NOT NULL,
+                        teu INTEGER,
+                        efficiency REAL,
+                        vehicles INTEGER,
+                        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
+                        UNIQUE(date, shift, ship_name) ON CONFLICT REPLACE
+                    )
+                ''')
+                
+                # Copy the data (duplicates are ignored)
+                cursor.execute('''
+                    INSERT OR IGNORE INTO daily_handover_logs 
+                    (date, shift, ship_name, teu, efficiency, vehicles, created_at)
+                    SELECT date, shift, ship_name, teu, efficiency, vehicles, created_at 
+                    FROM daily_handover_logs_old
+                ''')
+                
+                # Drop the old table
+                cursor.execute('DROP TABLE daily_handover_logs_old')
+                logger.info("Migration complete")
+
+            # Create indexes
+            cursor.execute('CREATE INDEX IF NOT EXISTS idx_date ON daily_handover_logs(date)')
+            cursor.execute('CREATE INDEX IF NOT EXISTS idx_ship ON daily_handover_logs(ship_name)')
+
+            # Create the table for unaccounted monthly-report data
+            cursor.execute('''
+                CREATE TABLE IF NOT EXISTS monthly_unaccounted (
+                    id INTEGER PRIMARY KEY AUTOINCREMENT,
+                    year_month TEXT NOT NULL UNIQUE,
+                    teu INTEGER NOT NULL,
+                    note TEXT,
+                    created_at TEXT DEFAULT CURRENT_TIMESTAMP
+                )
+            ''')
+            
+            conn.commit()
+            logger.debug("Database schema initialization complete")
+    
+    def insert(self, log: Dict[str, Any]) -> bool:
+        """
+        Insert a record, replacing any existing record with the same key.
+
+        Args:
+            log: Log record dictionary
+
+        Returns:
+            True on success, False on failure
+        """
+        try:
+            query = '''
+                INSERT OR REPLACE INTO daily_handover_logs 
+                (date, shift, ship_name, teu, efficiency, vehicles, created_at)
+                VALUES (?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
+            '''
+            params = (
+                log['date'], log['shift'], log['ship_name'],
+                log.get('teu'), log.get('efficiency'), log.get('vehicles')
+            )
+
+            self.execute_update(query, params)
+            logger.debug(f"Inserted record: {log['date']} {log['shift']} {log['ship_name']}")
+            return True
+
+        except Exception as e:
+            logger.error(f"Failed to insert record: {e}, record: {log}")
+            return False
+    
+    def insert_many(self, logs: List[Dict[str, Any]]) -> int:
+        """
+        Insert records one by one.
+
+        Args:
+            logs: List of log records
+
+        Returns:
+            Number of records inserted successfully
+        """
+        count = 0
+        for log in logs:
+            if self.insert(log):
+                count += 1
+
+        logger.info(f"Batch insert complete: {count}/{len(logs)} records succeeded")
+        return count
+    
+    def query_by_date(self, date: str) -> List[Dict[str, Any]]:
+        """
+        按日期查询
+        
+        参数:
+            date: 日期字符串
+        
+        返回:
+            日志记录列表
+        """
+        query = '''
+            SELECT * FROM daily_handover_logs 
+            WHERE date = ? ORDER BY shift, ship_name
+        '''
+        return self.execute_query(query, (date,))
+    
+    def query_by_ship(self, ship_name: str) -> List[Dict[str, Any]]:
+        """
+        按船名查询
+        
+        参数:
+            ship_name: 船名
+        
+        返回:
+            日志记录列表
+        """
+        query = '''
+            SELECT * FROM daily_handover_logs 
+            WHERE ship_name LIKE ? ORDER BY date DESC
+        '''
+        return self.execute_query(query, (f'%{ship_name}%',))
+    
+    def query_all(self, limit: int = 1000) -> List[Dict[str, Any]]:
+        """
+        Query all records, newest date first.
+        
+        Args:
+            limit: Maximum number of records to return
+        
+        Returns:
+            List of log records
+        """
+        query = '''
+            SELECT * FROM daily_handover_logs 
+            ORDER BY date DESC, shift LIMIT ?
+        '''
+        return self.execute_query(query, (limit,))
+    
+    def get_stats(self) -> Dict[str, Any]:
+        """
+        Get summary statistics.
+        
+        Returns:
+            Statistics dictionary with 'total', 'ships' and 'date_range'
+        """
+        with self.get_connection() as conn:
+            cursor = conn.cursor()
+            
+            cursor.execute('SELECT COUNT(*) FROM daily_handover_logs')
+            total = cursor.fetchone()[0]
+            
+            cursor.execute('SELECT DISTINCT ship_name FROM daily_handover_logs')
+            ships = [row[0] for row in cursor.fetchall()]
+            
+            cursor.execute('SELECT MIN(date), MAX(date) FROM daily_handover_logs')
+            date_range = cursor.fetchone()
+            
+            return {
+                'total': total,
+                'ships': ships,
+                'date_range': {'start': date_range[0], 'end': date_range[1]}
+            }
+    
+    def get_ships_with_monthly_teu(self, year_month: Optional[str] = None) -> List[Dict[str, Any]]:
+        """
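+        Get every ship with its total TEU for the given month.
+        
+        Args:
+            year_month: Year-month string such as "2025-12"; if None, totals cover all dates
+        
+        Returns:
+            List of per-ship totals, highest TEU first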
+        """
+        if year_month:
+            query = '''
+                SELECT ship_name, SUM(teu) as monthly_teu
+                FROM daily_handover_logs
+                WHERE date LIKE ?
+                GROUP BY ship_name
+                ORDER BY monthly_teu DESC
+            '''
+            return self.execute_query(query, (f'{year_month}%',))
+        else:
+            query = '''
+                SELECT ship_name, SUM(teu) as monthly_teu
+                FROM daily_handover_logs
+                GROUP BY ship_name
+                ORDER BY monthly_teu DESC
+            '''
+            return self.execute_query(query)
+    
+    def insert_unaccounted(self, year_month: str, teu: int, note: str = '') -> bool:
+        """
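+        Insert the unaccounted TEU figure for a month.
+        
+        Args:
+            year_month: Year-month string such as "2025-12"
+            teu: Unaccounted TEU count
+            note: Free-text note
+        
+        Returns:
+            True on success, False on failure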
+        """
+        try:
+            query = '''
+                INSERT OR REPLACE INTO monthly_unaccounted 
+                (year_month, teu, note, created_at)
+                VALUES (?, ?, ?, CURRENT_TIMESTAMP)
+            '''
+            self.execute_update(query, (year_month, teu, note))
+            logger.info(f"插入未统计数据: {year_month} {teu}TEU")
+            return True
+            
+        except Exception as e:
+            logger.error(f"插入未统计数据失败: {e}")
+            return False
+    
+    def get_unaccounted(self, year_month: str) -> int:
+        """
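+        Get the unaccounted TEU figure for the given month.
+        
+        Args:
+            year_month: Year-month string such as "2025-12"
+        
+        Returns:
+            Unaccounted TEU count, or 0 if the month has no entry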
+        """
+        query = 'SELECT teu FROM monthly_unaccounted WHERE year_month = ?'
+        result = self.execute_query(query, (year_month,))
+        return result[0]['teu'] if result else 0
+    
+    def delete_by_date(self, date: str) -> int:
+        """
+        Delete all records for the given date.
+        
+        Args:
+            date: Date string
+        
+        Returns:
+            Number of records deleted
+        """
+        query = 'DELETE FROM daily_handover_logs WHERE date = ?'
+        return self.execute_update(query, (date,))
+
+
+if __name__ == '__main__':
+    # Test code
+    db = DailyLogsDatabase()
+    
+    # Test insert
+    test_log = {
+        'date': '2025-12-30',
+        'shift': '白班',
+        'ship_name': '测试船',
+        'teu': 100,
+        'efficiency': 3.5,
+        'vehicles': 5
+    }
+    
+    success = db.insert(test_log)
+    print(f"插入测试: {'成功' if success else '失败'}")
+    
+    # Test query
+    logs = db.query_by_date('2025-12-30')
+    print(f"查询结果: {len(logs)} 条记录")
+    
+    # Test statistics
+    stats = db.get_stats()
+    print(f"统计信息: {stats}")
+    
+    # Test unaccounted data
+    db.insert_unaccounted('2025-12', 118, '测试备注')
+    unaccounted = db.get_unaccounted('2025-12')
+    print(f"未统计数据: {unaccounted}TEU")
+    
+    # Clean up test data (removes every row dated 2025-12-30; the '2025-12' unaccounted entry stays)
+    db.delete_by_date('2025-12-30')
+    print("测试数据已清理")
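Review note on `insert` / `insert_many`: `INSERT OR REPLACE` only replaces a row when the new one conflicts with a `UNIQUE` or `PRIMARY KEY` constraint, and `insert_many` issues one `insert` call per record. The table definition is not visible in this part of the diff, so the sketch below assumes a unique key on `(date, shift, ship_name)`. `insert_many_batched` is a hypothetical alternative, not part of this commit, that writes the whole batch in one transaction with `executemany`.

```python
import sqlite3
from typing import Any, Dict, List, Tuple

# Assumed schema: the real CREATE TABLE is not shown in this part of the diff.
SCHEMA = """
CREATE TABLE daily_handover_logs (
    date TEXT NOT NULL,
    shift TEXT NOT NULL,
    ship_name TEXT NOT NULL,
    teu INTEGER,
    efficiency REAL,
    vehicles INTEGER,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    UNIQUE(date, shift, ship_name)
)
"""

UPSERT = """
INSERT OR REPLACE INTO daily_handover_logs
(date, shift, ship_name, teu, efficiency, vehicles, created_at)
VALUES (?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
"""


def insert_many_batched(conn: sqlite3.Connection, logs: List[Dict[str, Any]]) -> int:
    """Hypothetical batch variant: one transaction for the whole list."""
    params = [
        (log['date'], log['shift'], log['ship_name'],
         log.get('teu'), log.get('efficiency'), log.get('vehicles'))
        for log in logs
    ]
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(UPSERT, params)
    return len(params)


def demo() -> List[Tuple[str, str, int]]:
    conn = sqlite3.connect(':memory:')
    conn.execute(SCHEMA)
    insert_many_batched(conn, [
        {'date': '2025-12-30', 'shift': '白班', 'ship_name': 'A', 'teu': 100},
        # Same (date, shift, ship_name) key: replaces the row above.
        {'date': '2025-12-30', 'shift': '白班', 'ship_name': 'A', 'teu': 120},
        {'date': '2025-12-30', 'shift': '夜班', 'ship_name': 'B', 'teu': 80},
    ])
    rows = conn.execute(
        'SELECT shift, ship_name, teu FROM daily_handover_logs ORDER BY ship_name'
    ).fetchall()
    conn.close()
    return rows
```

The trade-off is that the batch becomes all-or-nothing: a single bad record rolls back every row, whereas the per-row loop in the commit skips failures and reports a partial count.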
--- a/src/database/schedules.py
+++ b/src/database/schedules.py
@@ -0,0 +1,342 @@
+#!/usr/bin/env python3
+"""
+Schedule personnel database module.
+Refactored on top of the new database base class.
+"""
+import json
+import hashlib
+from typing import List, Dict, Optional, Any
+
+from src.database.base import DatabaseBase
+from src.logging_config import get_logger
+
+logger = get_logger(__name__)
+
+
+class ScheduleDatabase(DatabaseBase):
+    """Schedule personnel database"""
+    
+    def __init__(self, db_path: Optional[str] = None):
+        """
+        Initialize the database.
+        
+        Args:
+            db_path: Path to the database file; if None, the default configuration is used
+        """
+        super().__init__(db_path)
+        self._init_schema()
+    
+    def _init_schema(self):
+        """初始化表结构"""
+        with self.get_connection() as conn:
+            cursor = conn.cursor()
+            
+            # Create the schedule personnel table
+            cursor.execute('''
+                CREATE TABLE IF NOT EXISTS schedule_personnel (
+                    id INTEGER PRIMARY KEY AUTOINCREMENT,
+                    date TEXT NOT NULL,
+                    day_shift TEXT,
+                    night_shift TEXT,
+                    day_shift_list TEXT,  -- JSON array
+                    night_shift_list TEXT, -- JSON array
+                    sheet_id TEXT,
+                    sheet_title TEXT,
+                    data_hash TEXT,        -- data hash, used to detect updates
+                    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
+                    updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
+                    UNIQUE(date)
+                )
+            ''')
+            
+            # Create the sheet versions table (used to detect whether a sheet has been updated)
+            cursor.execute('''
+                CREATE TABLE IF NOT EXISTS sheet_versions (
+                    id INTEGER PRIMARY KEY AUTOINCREMENT,
+                    sheet_id TEXT NOT NULL,
+                    sheet_title TEXT NOT NULL,
+                    revision INTEGER NOT NULL,
+                    data_hash TEXT,
+                    last_checked_at TEXT DEFAULT CURRENT_TIMESTAMP,
+                    UNIQUE(sheet_id)
+                )
+            ''')
+            
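+            # Create indexes (UNIQUE(date) and UNIQUE(sheet_id) already create implicit ones, so only idx_schedule_sheet adds a new index)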
+            cursor.execute('CREATE INDEX IF NOT EXISTS idx_schedule_date ON schedule_personnel(date)')
+            cursor.execute('CREATE INDEX IF NOT EXISTS idx_schedule_sheet ON schedule_personnel(sheet_id)')
+            cursor.execute('CREATE INDEX IF NOT EXISTS idx_sheet_versions ON sheet_versions(sheet_id)')
+            
+            conn.commit()
+            logger.debug("排班数据库表结构初始化完成")
+    
+    def _calculate_hash(self, data: Dict[str, Any]) -> str:
+        """
+        Compute a hash of the data.
+        
+        Args:
+            data: Data dictionary
+        
+        Returns:
+            MD5 hex digest
+        """
+        data_str = json.dumps(data, sort_keys=True, ensure_ascii=False)
+        return hashlib.md5(data_str.encode('utf-8')).hexdigest()
+    
+    def check_sheet_update(self, sheet_id: str, sheet_title: str, revision: int, data: Dict[str, Any]) -> bool:
+        """
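+        Check whether a sheet has been updated, recording the new version as a side effect.
+        
+        Args:
+            sheet_id: Sheet ID
+            sheet_title: Sheet title
+            revision: Sheet revision number
+            data: Sheet data
+        
+        Returns:
+            True: updated (or seen for the first time), the data must be fetched again
+            False: unchanged, the cache can be used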
+        """
+        with self.get_connection() as conn:
+            cursor = conn.cursor()
+            
+            # Look up the stored version
+            cursor.execute(
+                'SELECT revision, data_hash FROM sheet_versions WHERE sheet_id = ?',
+                (sheet_id,)
+            )
+            result = cursor.fetchone()
+            
+            if not result:
+                # First fetch: record the version
+                data_hash = self._calculate_hash(data)
+                cursor.execute('''
+                    INSERT INTO sheet_versions (sheet_id, sheet_title, revision, data_hash, last_checked_at)
+                    VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP)
+                ''', (sheet_id, sheet_title, revision, data_hash))
+                conn.commit()
+                logger.debug(f"首次记录表格版本: {sheet_title} (ID: {sheet_id})")
+                return True
+            
+            # Check whether the revision number or the data has changed
+            old_revision = result['revision']
+            old_hash = result['data_hash']
+            new_hash = self._calculate_hash(data)
+            
+            if old_revision != revision or old_hash != new_hash:
+                # Updated: store the new version info
+                cursor.execute('''
+                    UPDATE sheet_versions 
+                    SET revision = ?, data_hash = ?, last_checked_at = CURRENT_TIMESTAMP
+                    WHERE sheet_id = ?
+                ''', (revision, new_hash, sheet_id))
+                conn.commit()
+                logger.info(f"表格有更新: {sheet_title} (ID: {sheet_id})")
+                return True
+            
+            # Unchanged: refresh the last-checked time
+            cursor.execute('''
+                UPDATE sheet_versions 
+                SET last_checked_at = CURRENT_TIMESTAMP
+                WHERE sheet_id = ?
+            ''', (sheet_id,))
+            conn.commit()
+            logger.debug(f"表格无更新: {sheet_title} (ID: {sheet_id})")
+            return False
+    
+    def save_schedule(self, date: str, schedule_data: Dict[str, Any], 
+                     sheet_id: Optional[str] = None, sheet_title: Optional[str] = None) -> bool:
+        """
+        Save schedule information to the database.
+        
+        Args:
+            date: Date (YYYY-MM-DD)
+            schedule_data: Schedule data
+            sheet_id: Sheet ID
+            sheet_title: Sheet title
+        
+        Returns:
+            True on success, False on failure
+        """
+        try:
+            # Prepare the data
+            day_shift = schedule_data.get('day_shift', '')
+            night_shift = schedule_data.get('night_shift', '')
+            day_shift_list = json.dumps(schedule_data.get('day_shift_list', []), ensure_ascii=False)
+            night_shift_list = json.dumps(schedule_data.get('night_shift_list', []), ensure_ascii=False)
+            data_hash = self._calculate_hash(schedule_data)
+            
+            # INSERT OR REPLACE overwrites an existing row (the old row is deleted, so id and created_at are regenerated)
+            query = '''
+                INSERT OR REPLACE INTO schedule_personnel 
+                (date, day_shift, night_shift, day_shift_list, night_shift_list, 
+                 sheet_id, sheet_title, data_hash, updated_at)
+                VALUES (?, ?, ?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
+            '''
+            params = (
+                date, day_shift, night_shift, day_shift_list, night_shift_list,
+                sheet_id, sheet_title, data_hash
+            )
+            
+            self.execute_update(query, params)
+            logger.debug(f"保存排班信息: {date}")
+            return True
+            
+        except Exception as e:
+            logger.error(f"保存排班信息失败: {e}, 日期: {date}")
+            return False
+    
+    def get_schedule(self, date: str) -> Optional[Dict[str, Any]]:
+        """
+        Get the schedule for the given date.
+        
+        Args:
+            date: Date (YYYY-MM-DD)
+        
+        Returns:
+            Schedule dictionary, or None if not found
+        """
+        query = 'SELECT * FROM schedule_personnel WHERE date = ?'
+        result = self.execute_query(query, (date,))
+        
+        if not result:
+            return None
+        
+        row = result[0]
+        
+        # Parse the JSON arrays
+        day_shift_list = json.loads(row['day_shift_list']) if row['day_shift_list'] else []
+        night_shift_list = json.loads(row['night_shift_list']) if row['night_shift_list'] else []
+        
+        return {
+            'date': row['date'],
+            'day_shift': row['day_shift'],
+            'night_shift': row['night_shift'],
+            'day_shift_list': day_shift_list,
+            'night_shift_list': night_shift_list,
+            'sheet_id': row['sheet_id'],
+            'sheet_title': row['sheet_title'],
+            'updated_at': row['updated_at']
+        }
+    
+    def get_schedule_by_range(self, start_date: str, end_date: str) -> List[Dict[str, Any]]:
+        """
+        Get the schedules within a date range (both ends inclusive).
+        
+        Args:
+            start_date: Start date (YYYY-MM-DD)
+            end_date: End date (YYYY-MM-DD)
+        
+        Returns:
+            List of schedule dictionaries
+        """
+        query = '''
+            SELECT * FROM schedule_personnel 
+            WHERE date >= ? AND date <= ?
+            ORDER BY date
+        '''
+        results = self.execute_query(query, (start_date, end_date))
+        
+        processed_results = []
+        for row in results:
+            day_shift_list = json.loads(row['day_shift_list']) if row['day_shift_list'] else []
+            night_shift_list = json.loads(row['night_shift_list']) if row['night_shift_list'] else []
+            
+            processed_results.append({
+                'date': row['date'],
+                'day_shift': row['day_shift'],
+                'night_shift': row['night_shift'],
+                'day_shift_list': day_shift_list,
+                'night_shift_list': night_shift_list,
+                'sheet_id': row['sheet_id'],
+                'sheet_title': row['sheet_title'],
+                'updated_at': row['updated_at']
+            })
+        
+        return processed_results
+    
+    def delete_old_schedules(self, before_date: str) -> int:
+        """
+        Delete schedule records dated strictly before the given date.
+        
+        Args:
+            before_date: Date (YYYY-MM-DD)
+        
+        Returns:
+            Number of records deleted
+        """
+        query = 'DELETE FROM schedule_personnel WHERE date < ?'
+        return self.execute_update(query, (before_date,))
+    
+    def get_stats(self) -> Dict[str, Any]:
+        """Get summary statistics"""
+        with self.get_connection() as conn:
+            cursor = conn.cursor()
+            
+            cursor.execute('SELECT COUNT(*) FROM schedule_personnel')
+            total = cursor.fetchone()[0]
+            
+            cursor.execute('SELECT MIN(date), MAX(date) FROM schedule_personnel')
+            date_range = cursor.fetchone()
+            
+            cursor.execute('SELECT COUNT(DISTINCT sheet_id) FROM schedule_personnel')
+            sheet_count = cursor.fetchone()[0]
+            
+            return {
+                'total': total,
+                'date_range': {'start': date_range[0], 'end': date_range[1]},
+                'sheet_count': sheet_count
+            }
+    
+    def clear_all(self) -> int:
+        """
+        Clear all schedule data.
+        
+        Returns:
+            Total number of rows deleted (schedule rows plus version rows)
+        """
+        query1 = 'DELETE FROM schedule_personnel'
+        query2 = 'DELETE FROM sheet_versions'
+        
+        count1 = self.execute_update(query1)
+        count2 = self.execute_update(query2)
+        
+        logger.info(f"清空排班数据，删除 {count1} 条排班记录和 {count2} 条版本记录")
+        return count1 + count2
+
+
+if __name__ == '__main__':
+    # Test code
+    db = ScheduleDatabase()
+    
+    # Test save
+    test_schedule = {
+        'day_shift': '张勤、杨俊豪',
+        'night_shift': '刘炜彬、梁启迟',
+        'day_shift_list': ['张勤', '杨俊豪'],
+        'night_shift_list': ['刘炜彬', '梁启迟']
+    }
+    
+    success = db.save_schedule('2025-12-31', test_schedule, 'zcYLIk', '12月')
+    print(f"保存测试: {'成功' if success else '失败'}")
+    
+    # Test fetch
+    schedule = db.get_schedule('2025-12-31')
+    print(f"获取结果: {schedule}")
+    
+    # Test range query
+    schedules = db.get_schedule_by_range('2025-12-01', '2025-12-31')
+    print(f"范围查询: {len(schedules)} 条记录")
+    
+    # Test statistics
+    stats = db.get_stats()
+    print(f"统计信息: {stats}")
+    
+    # Test sheet version check
+    test_data = {'values': [['姓名', '12月31日'], ['张三', '白']]}
+    needs_update = db.check_sheet_update('test_sheet', '测试表格', 1, test_data)
+    print(f"表格更新检查: {'需要更新' if needs_update else '无需更新'}")
+    
+    # Clean up test data (removes every schedule dated before 2026-01-01; the 'test_sheet' version row stays)
+    db.delete_old_schedules('2026-01-01')
+    print("测试数据已清理")
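Review note on `_calculate_hash` / `check_sheet_update`: the change check depends on the JSON serialization being canonical, which is why `sort_keys=True` matters. The standalone sketch below mirrors the two steps outside the class. `has_changed` is a hypothetical helper name used only for illustration. MD5 serves here as a change fingerprint, not as a security measure.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def calculate_hash(data: Dict[str, Any]) -> str:
    """Hash a canonical JSON form so that dict key order does not matter."""
    data_str = json.dumps(data, sort_keys=True, ensure_ascii=False)
    return hashlib.md5(data_str.encode('utf-8')).hexdigest()


def has_changed(old_revision: Optional[int], old_hash: Optional[str],
                revision: int, data: Dict[str, Any]) -> bool:
    """Hypothetical helper: True if either the revision or the content differs."""
    return old_revision != revision or old_hash != calculate_hash(data)


if __name__ == '__main__':
    first = {'values': [['姓名', '12月31日'], ['张三', '白']]}
    stored_hash = calculate_hash(first)

    # Same content, same revision: the cache can be reused.
    print(has_changed(1, stored_hash, 1, first))

    # Same revision but a cell was edited: the hash catches it.
    edited = {'values': [['姓名', '12月31日'], ['张三', '夜']]}
    print(has_changed(1, stored_hash, 1, edited))
```

Key order inside dictionaries is normalized, but list order is not, so reordered rows count as a change. The caller also has to download the sheet data before it can be hashed, so the hash comparison saves re-parsing and re-saving rather than the fetch itself.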
--- a/src/extractor.py
+++ b/src/extractor.py
@@ -1,216 +0,0 @@
-#!/usr/bin/env python3
-"""
-HTML 文本提取模块
-"""
-import re
-from bs4 import BeautifulSoup, Tag, NavigableString
-from typing import List
-
-
-class HTMLTextExtractor:
-    """HTML 文本提取器 - 保留布局结构"""
-    
-    # 块级元素列表
-    BLOCK_TAGS = {
-        'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'div', 'section',
-        'table', 'tr', 'td', 'th', 'li', 'ul', 'ol', 'blockquote',
-        'pre', 'hr', 'br', 'tbody', 'thead', 'tfoot'
-    }
-    
-    def __init__(self):
-        """初始化提取器"""
-        self.output_lines: List[str] = []
-    
-    def extract(self, html: str) -> str:
-        """
-        从HTML中提取保留布局的文本
-        
-        参数:
-            html: HTML字符串
-            
-        返回:
-            格式化的纯文本
-        """
-        if not html:
-            return ''
-        
-        soup = BeautifulSoup(html, 'html.parser')
-        
-        # 移除不需要的元素
-        for tag in soup(["script", "style", "noscript"]):
-            tag.decompose()
-        
-        # 移除 Confluence 宏
-        for macro in soup.find_all(attrs={"ac:name": True}):
-            macro.decompose()
-        
-        self.output_lines = []
-        
-        # 处理 body 或整个文档
-        body = soup.body if soup.body else soup
-        for child in body.children:
-            self._process_node(child)
-        
-        # 清理结果
-        result = ''.join(self.output_lines)
-        result = re.sub(r'\n\s*\n\s*\n', '\n\n', result)
-        result = '\n'.join(line.rstrip() for line in result.split('\n'))
-        return result.strip()
-    
-    def _process_node(self, node, indent: int = 0, list_context=None):
-        """递归处理节点"""
-        if isinstance(node, NavigableString):
-            text = str(node).strip()
-            if text:
-                text = re.sub(r'\s+', ' ', text)
-                if self.output_lines and not self.output_lines[-1].endswith('\n'):
-                    self.output_lines[-1] += text
-                else:
-                    self.output_lines.append(' ' * indent + text)
-            return
-        
-        if not isinstance(node, Tag):
-            return
-        
-        tag_name = node.name.lower()
-        is_block = tag_name in self.BLOCK_TAGS
-        
-        # 块级元素前添加换行
-        if is_block and self.output_lines and not self.output_lines[-1].endswith('\n'):
-            self.output_lines.append('\n')
-        
-        # 处理特定标签
-        if tag_name in ('h1', 'h2', 'h3', 'h4', 'h5', 'h6'):
-            level = int(tag_name[1])
-            prefix = '#' * level + ' '
-            text = node.get_text().strip()
-            if text:
-                self.output_lines.append(' ' * indent + prefix + text + '\n')
-            return
-        
-        elif tag_name == 'p':
-            text = node.get_text().strip()
-            if text:
-                self.output_lines.append(' ' * indent + text + '\n')
-            return
-        
-        elif tag_name == 'hr':
-            self.output_lines.append(' ' * indent + '─' * 50 + '\n')
-            return
-        
-        elif tag_name == 'br':
-            self.output_lines.append('\n')
-            return
-        
-        elif tag_name == 'table':
-            self._process_table(node, indent)
-            return
-        
-        elif tag_name in ('ul', 'ol'):
-            self._process_list(node, indent, tag_name)
-            return
-        
-        elif tag_name == 'li':
-            self._process_list_item(node, indent, list_context)
-            return
-        
-        elif tag_name == 'a':
-            href = node.get('href', '')
-            text = node.get_text().strip()
-            if href and text:
-                self.output_lines.append(f'{text} ({href})')
-            elif text:
-                self.output_lines.append(text)
-            return
-        
-        elif tag_name in ('strong', 'b'):
-            text = node.get_text().strip()
-            if text:
-                self.output_lines.append(f'**{text}**')
-            return
-        
-        elif tag_name in ('em', 'i'):
-            text = node.get_text().strip()
-            if text:
-                self.output_lines.append(f'*{text}*')
-            return
-        
-        else:
-            # 默认递归处理子元素
-            for child in node.children:
-                self._process_node(child, indent, list_context)
-        
-        if is_block and self.output_lines and not self.output_lines[-1].endswith('\n'):
-            self.output_lines.append('\n')
-    
-    def _process_table(self, table: Tag, indent: int):
-        """处理表格"""
-        rows = []
-        for tr in table.find_all('tr'):
-            row = []
-            for td in tr.find_all(['td', 'th']):
-                row.append(td.get_text().strip())
-            if row:
-                rows.append(row)
-        
-        if rows:
-            # 计算列宽
-            col_widths = []
-            for i in range(max(len(r) for r in rows)):
-                col_width = max((len(r[i]) if i < len(r) else 0) for r in rows)
-                col_widths.append(col_width)
-            
-            for row in rows:
-                line = ' ' * indent
-                for i, cell in enumerate(row):
-                    width = col_widths[i] if i < len(col_widths) else 0
-                    line += cell.ljust(width) + '  '
-                self.output_lines.append(line.rstrip() + '\n')
-            self.output_lines.append('\n')
-    
-    def _process_list(self, ul: Tag, indent: int, list_type: str):
-        """处理列表"""
-        counter = 1 if list_type == 'ol' else None
-        for child in ul.children:
-            if isinstance(child, Tag) and child.name == 'li':
-                ctx = (list_type, counter) if counter else (list_type, 1)
-                self._process_list_item(child, indent, ctx)
-                if counter:
-                    counter += 1
-            else:
-                self._process_node(child, indent, (list_type, 1) if not counter else None)
-    
-    def _process_list_item(self, li: Tag, indent: int, list_context):
-        """处理列表项"""
-        prefix = ''
-        if list_context:
-            list_type, num = list_context
-            prefix = '• ' if list_type == 'ul' else f'{num}. '
-        
-        # 收集直接文本
-        direct_parts = []
-        for child in li.children:
-            if isinstance(child, NavigableString):
-                text = str(child).strip()
-                if text:
-                    direct_parts.append(text)
-            elif isinstance(child, Tag) and child.name == 'a':
-                href = child.get('href', '')
-                link_text = child.get_text().strip()
-                if href and link_text:
-                    direct_parts.append(f'{link_text} ({href})')
-        
-        if direct_parts:
-            self.output_lines.append(' ' * indent + prefix + ' '.join(direct_parts) + '\n')
-        
-        # 处理子元素
-        for child in li.children:
-            if isinstance(child, Tag) and child.name != 'a':
-                self._process_node(child, indent + 2, None)
-
-
-if __name__ == '__main__':
-    # 测试
-    html = "<h1>标题</h1><p>段落</p><ul><li>项目1</li><li>项目2</li></ul>"
-    extractor = HTMLTextExtractor()
-    print(extractor.extract(html))
--- a/src/feishu.py
+++ b/src/feishu.py
@@ -1,467 +0,0 @@
-#!/usr/bin/env python3
-"""
-飞书表格 API 客户端模块
-用于获取码头作业人员排班信息
-"""
-import requests
-import json
-import os
-import time
-from datetime import datetime, timedelta
-from typing import Dict, List, Optional, Tuple
-import logging
-
-logger = logging.getLogger(__name__)
-
-
-class FeishuSheetsClient:
-    """飞书表格 API 客户端"""
-    
-    def __init__(self, base_url: str, token: str, spreadsheet_token: str):
-        """
-        初始化客户端
-        
-        参数:
-            base_url: 飞书 API 基础URL
-            token: Bearer 认证令牌
-            spreadsheet_token: 表格 token
-        """
-        self.base_url = base_url.rstrip('/')
-        self.spreadsheet_token = spreadsheet_token
-        self.headers = {
-            'Authorization': f'Bearer {token}',
-            'Content-Type': 'application/json',
-            'Accept': 'application/json'
-        }
-    
-    def get_sheets_info(self) -> List[Dict]:
-        """
-        获取所有表格信息（sheet_id 和 title）
-        
-        返回:
-            表格信息列表 [{'sheet_id': 'xxx', 'title': 'xxx'}, ...]
-        """
-        url = f'{self.base_url}/spreadsheets/{self.spreadsheet_token}/sheets/query'
-        
-        try:
-            response = requests.get(url, headers=self.headers, timeout=30)
-            response.raise_for_status()
-            data = response.json()
-            
-            if data.get('code') != 0:
-                logger.error(f"飞书API错误: {data.get('msg')}")
-                return []
-            
-            sheets = data.get('data', {}).get('sheets', [])
-            result = []
-            for sheet in sheets:
-                result.append({
-                    'sheet_id': sheet.get('sheet_id'),
-                    'title': sheet.get('title')
-                })
-            
-            logger.info(f"获取到 {len(result)} 个表格")
-            return result
-            
-        except requests.exceptions.RequestException as e:
-            logger.error(f"获取表格信息失败: {e}")
-            return []
-        except Exception as e:
-            logger.error(f"解析表格信息失败: {e}")
-            return []
-    
-    def get_sheet_data(self, sheet_id: str, range_: str = 'A:AF') -> Dict:
-        """
-        获取指定表格的数据
-        
-        参数:
-            sheet_id: 表格ID
-            range_: 数据范围，默认 A:AF (31列)
-        
-        返回:
-            飞书API返回的原始数据
-        """
-        # 注意：获取表格数据使用 v2 API，而不是 v3
-        # 根据你提供的示例：GET 'https://open.feishu.cn/open-apis/sheets/v2/spreadsheets/{token}/values/{sheet_id}!A:AF'
-        url = f'{self.base_url.replace("/v3", "/v2")}/spreadsheets/{self.spreadsheet_token}/values/{sheet_id}!{range_}'
-        params = {
-            'valueRenderOption': 'ToString',
-            'dateTimeRenderOption': 'FormattedString'
-        }
-        
-        try:
-            response = requests.get(url, headers=self.headers, params=params, timeout=30)
-            response.raise_for_status()
-            data = response.json()
-            
-            if data.get('code') != 0:
-                logger.error(f"飞书API错误: {data.get('msg')}")
-                return {}
-            
-            return data.get('data', {})
-            
-        except requests.exceptions.RequestException as e:
-            logger.error(f"获取表格数据失败: {e}")
-            return {}
-        except Exception as e:
-            logger.error(f"解析表格数据失败: {e}")
-            return {}
-
-
-class ScheduleDataParser:
-    """排班数据解析器"""
-    
-    @staticmethod
-    def _parse_chinese_date(date_str: str) -> Optional[str]:
-        """
-        解析中文日期格式
-        
-        参数:
-            date_str: 中文日期，如 "12月30日" 或 "12/30" 或 "12月1日"
-        
-        返回:
-            标准化日期字符串 "MM月DD日"
-        """
-        if not date_str:
-            return None
-        
-        # 如果是 "12/30" 格式
-        if '/' in date_str:
-            try:
-                month, day = date_str.split('/')
-                # 移除可能的空格
-                month = month.strip()
-                day = day.strip()
-                return f"{int(month)}月{int(day)}日"
-            except:
-                return None
-        
-        # 如果是 "12月30日" 格式
-        if '月' in date_str and '日' in date_str:
-            return date_str
-        
-        # 如果是 "12月1日" 格式（已经包含"日"字）
-        if '月' in date_str:
-            # 检查是否已经有"日"字
-            if '日' not in date_str:
-                return f"{date_str}日"
-            return date_str
-        
-        return None
-    
-    @staticmethod
-    def _find_date_column_index(headers: List[str], target_date: str) -> Optional[int]:
-        """
-        在表头中查找目标日期对应的列索引
-        
-        参数:
-            headers: 表头行 ["姓名", "12月1日", "12月2日", ...]
-            target_date: 目标日期 "12月30日"
-        
-        返回:
-            列索引（从0开始），未找到返回None
-        """
-        if not headers or not target_date:
-            return None
-        
-        # 标准化目标日期
-        target_std = ScheduleDataParser._parse_chinese_date(target_date)
-        if not target_std:
-            return None
-        
-        # 遍历表头查找匹配的日期
-        for i, header in enumerate(headers):
-            header_std = ScheduleDataParser._parse_chinese_date(header)
-            if header_std == target_std:
-                return i
-        
-        return None
-    
-    def parse(self, values: List[List[str]], target_date: str) -> Dict:
-        """
-        解析排班数据，获取指定日期的班次人员
-        
-        参数:
-            values: 飞书表格返回的二维数组
-            target_date: 目标日期（格式: "12月30日" 或 "12/30"）
-        
-        返回:
-            {
-                'day_shift': '张勤、刘炜彬、杨俊豪',
-                'night_shift': '梁启迟、江唯、汪钦良',
-                'day_shift_list': ['张勤', '刘炜彬', '杨俊豪'],
-                'night_shift_list': ['梁启迟', '江唯', '汪钦良']
-            }
-        """
-        if not values or len(values) < 2:
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        # 第一行是表头
-        headers = values[0]
-        date_column_index = self._find_date_column_index(headers, target_date)
-        
-        if date_column_index is None:
-            logger.warning(f"未找到日期列: {target_date}")
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        # 收集白班和夜班人员
-        day_shift_names = []
-        night_shift_names = []
-        
-        # 从第二行开始是人员数据
-        for row in values[1:]:
-            if len(row) <= date_column_index:
-                continue
-            
-            name = row[0] if row else ''
-            shift = row[date_column_index] if date_column_index < len(row) else ''
-            
-            if not name or not shift:
-                continue
-            
-            if shift == '白':
-                day_shift_names.append(name)
-            elif shift == '夜':
-                night_shift_names.append(name)
-        
-        # 格式化输出
-        day_shift_str = '、'.join(day_shift_names) if day_shift_names else ''
-        night_shift_str = '、'.join(night_shift_names) if night_shift_names else ''
-        
-        return {
-            'day_shift': day_shift_str,
-            'night_shift': night_shift_str,
-            'day_shift_list': day_shift_names,
-            'night_shift_list': night_shift_names
-        }
-
-
-class ScheduleCache:
-    """排班数据缓存"""
-    
-    def __init__(self, cache_file: str = 'data/schedule_cache.json'):
-        self.cache_file = cache_file
-        self.cache_ttl = 3600  # 1小时
-    
-    def load(self) -> Optional[Dict]:
-        """加载缓存"""
-        try:
-            if not os.path.exists(self.cache_file):
-                return None
-            
-            with open(self.cache_file, 'r', encoding='utf-8') as f:
-                cache_data = json.load(f)
-            
-            # 检查缓存是否过期
-            last_update = cache_data.get('last_update')
-            if last_update:
-                last_time = datetime.fromisoformat(last_update)
-                if (datetime.now() - last_time).total_seconds() < self.cache_ttl:
-                    return cache_data.get('data')
-            
-            return None
-            
-        except Exception as e:
-            logger.warning(f"加载缓存失败: {e}")
-            return None
-    
-    def save(self, data: Dict):
-        """保存缓存"""
-        try:
-            # 确保目录存在
-            os.makedirs(os.path.dirname(self.cache_file), exist_ok=True)
-            
-            cache_data = {
-                'last_update': datetime.now().isoformat(),
-                'data': data
-            }
-            
-            with open(self.cache_file, 'w', encoding='utf-8') as f:
-                json.dump(cache_data, f, ensure_ascii=False, indent=2)
-            
-            logger.info(f"缓存已保存到 {self.cache_file}")
-            
-        except Exception as e:
-            logger.error(f"保存缓存失败: {e}")
-
-
-class FeishuScheduleManager:
-    """飞书排班管理器（主入口）"""
-    
-    def __init__(self, base_url: str = None, token: str = None, 
-                 spreadsheet_token: str = None):
-        """
-        初始化管理器
-        
-        参数:
-            base_url: 飞书API基础URL，从环境变量读取
-            token: 飞书API令牌，从环境变量读取
-            spreadsheet_token: 表格token，从环境变量读取
-        """
-        # 从环境变量读取配置
-        self.base_url = base_url or os.getenv('FEISHU_BASE_URL', 'https://open.feishu.cn/open-apis/sheets/v3')
-        self.token = token or os.getenv('FEISHU_TOKEN', '')
-        self.spreadsheet_token = spreadsheet_token or os.getenv('FEISHU_SPREADSHEET_TOKEN', '')
-        
-        if not self.token or not self.spreadsheet_token:
-            logger.warning("飞书配置不完整，请检查环境变量")
-        
-        self.client = FeishuSheetsClient(self.base_url, self.token, self.spreadsheet_token)
-        self.parser = ScheduleDataParser()
-        self.cache = ScheduleCache()
-    
-    def get_schedule_for_date(self, date_str: str, use_cache: bool = True) -> Dict:
-        """
-        获取指定日期的排班信息
-        
-        参数:
-            date_str: 日期字符串，格式 "2025-12-30" 或 "12/30"
-            use_cache: 是否使用缓存
-        
-        返回:
-            排班信息字典
-        """
-        # 转换日期格式
-        try:
-            if '-' in date_str:
-                # "2025-12-30" -> "12/30"
-                dt = datetime.strptime(date_str, '%Y-%m-%d')
-                target_date = dt.strftime('%m/%d')
-                year_month = dt.strftime('%Y-%m')
-                month_name = dt.strftime('%m月')  # "12月"
-            else:
-                # "12/30" -> "12/30"
-                target_date = date_str
-                # 假设当前年份
-                current_year = datetime.now().year
-                month = int(date_str.split('/')[0])
-                year_month = f"{current_year}-{month:02d}"
-                month_name = f"{month}月"
-        except:
-            target_date = date_str
-            year_month = datetime.now().strftime('%Y-%m')
-            month_name = datetime.now().strftime('%m月')
-        
-        # 尝试从缓存获取
-        if use_cache:
-            cached_data = self.cache.load()
-            cache_key = f"{year_month}_{target_date}"
-            if cached_data and cache_key in cached_data:
-                logger.info(f"从缓存获取 {cache_key} 的排班信息")
-                return cached_data[cache_key]
-        
-        # 获取表格信息
-        sheets = self.client.get_sheets_info()
-        if not sheets:
-            logger.error("未获取到表格信息")
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        # 根据月份选择对应的表格
-        sheet_id = None
-        sheet_title = None
-        
-        # 优先查找月份表格，如 "12月"
-        for sheet in sheets:
-            title = sheet.get('title', '')
-            if month_name in title:
-                sheet_id = sheet['sheet_id']
-                sheet_title = title
-                logger.info(f"找到月份表格: {title} (ID: {sheet_id})")
-                break
-        
-        # 如果没有找到月份表格，使用第一个表格
-        if not sheet_id and sheets:
-            sheet_id = sheets[0]['sheet_id']
-            sheet_title = sheets[0]['title']
-            logger.warning(f"未找到 {month_name} 表格，使用第一个表格: {sheet_title}")
-        
-        if not sheet_id:
-            logger.error("未找到可用的表格")
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        # 获取表格数据
-        sheet_data = self.client.get_sheet_data(sheet_id)
-        if not sheet_data:
-            logger.error("未获取到表格数据")
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        values = sheet_data.get('valueRange', {}).get('values', [])
-        if not values:
-            logger.error("表格数据为空")
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        # 解析数据
-        result = self.parser.parse(values, target_date)
-        
-        # 更新缓存
-        if use_cache:
-            cached_data = self.cache.load() or {}
-            cache_key = f"{year_month}_{target_date}"
-            cached_data[cache_key] = result
-            self.cache.save(cached_data)
-        
-        return result
-    
-    def get_schedule_for_today(self) -> Dict:
-        """获取今天的排班信息"""
-        today = datetime.now().strftime('%Y-%m-%d')
-        return self.get_schedule_for_date(today)
-    
-    def get_schedule_for_tomorrow(self) -> Dict:
-        """获取明天的排班信息"""
-        tomorrow = (datetime.now() + timedelta(days=1)).strftime('%Y-%m-%d')
-        return self.get_schedule_for_date(tomorrow)
-
-
-if __name__ == '__main__':
-    # 测试代码
-    import sys
-    
-    # 设置日志
-    logging.basicConfig(level=logging.INFO)
-    
-    # 从环境变量读取配置
-    manager = FeishuScheduleManager()
-    
-    if len(sys.argv) > 1:
-        date_str = sys.argv[1]
-    else:
-        date_str = datetime.now().strftime('%Y-%m-%d')
-    
-    print(f"获取 {date_str} 的排班信息...")
-    schedule = manager.get_schedule_for_date(date_str)
-    
-    print(f"白班人员: {schedule['day_shift']}")
-    print(f"夜班人员: {schedule['night_shift']}")
-    print(f"白班列表: {schedule['day_shift_list']}")
-    print(f"夜班列表: {schedule['night_shift_list']}")
--- a/src/feishu/__init__.py
+++ b/src/feishu/__init__.py
@@ -0,0 +1,15 @@
+#!/usr/bin/env python3
+"""
+Feishu package.
+Provides a unified interface to the Feishu API.
+"""
+from src.feishu.client import FeishuSheetsClient, FeishuClientError
+from src.feishu.parser import ScheduleDataParser
+from src.feishu.manager import FeishuScheduleManager
+
+__all__ = [
+    'FeishuSheetsClient',
+    'FeishuClientError',
+    'ScheduleDataParser',
+    'FeishuScheduleManager'
+]
--- a/src/feishu/client.py
+++ b/src/feishu/client.py
@@ -0,0 +1,182 @@
+#!/usr/bin/env python3
+"""
+Feishu Sheets API client module.
+Unified version supporting both monthly and yearly sheets.
+"""
+import requests
+from typing import Dict, List, Optional
+import logging
+
+from src.config import config
+from src.logging_config import get_logger
+
+logger = get_logger(__name__)
+
+
+class FeishuClientError(Exception):
+    """Base class for Feishu client errors."""
+    pass
+
+
+class FeishuSheetsClient:
+    """Feishu Sheets API client."""
+    
+    def __init__(self, base_url: Optional[str] = None, token: Optional[str] = None,
+                 spreadsheet_token: Optional[str] = None):
+        """
+        Initialize the client.
+        
+        Args:
+            base_url: Feishu API base URL; falls back to the config when None
+            token: Bearer auth token; falls back to the config when None
+            spreadsheet_token: spreadsheet token; falls back to the config when None
+        """
+        self.base_url = (base_url or config.FEISHU_BASE_URL).rstrip('/')
+        self.spreadsheet_token = spreadsheet_token or config.FEISHU_SPREADSHEET_TOKEN
+        self.token = token or config.FEISHU_TOKEN
+        
+        self.headers = {
+            'Authorization': f'Bearer {self.token}',
+            'Content-Type': 'application/json',
+            'Accept': 'application/json'
+        }
+        
+        # Reuse connections through a Session. requests.Session does not apply a
+        # default timeout, so every request below passes timeout= explicitly.
+        self.session = requests.Session()
+        self.session.headers.update(self.headers)
+        
+        logger.debug(f"飞书客户端初始化完成，基础URL: {self.base_url}")
+    
+    def get_sheets_info(self) -> List[Dict[str, str]]:
+        """
+        Fetch the metadata (sheet_id and title) of every sheet.
+        
+        Returns:
+            List of sheet records: [{'sheet_id': 'xxx', 'title': 'xxx'}, ...]
+        
+        Raises:
+            requests.exceptions.RequestException: the network request failed
+            ValueError: the API returned an error code
+        """
+        url = f'{self.base_url}/spreadsheets/{self.spreadsheet_token}/sheets/query'
+        
+        try:
+            response = self.session.get(url, timeout=config.REQUEST_TIMEOUT)
+            response.raise_for_status()
+            data = response.json()
+            
+            if data.get('code') != 0:
+                error_msg = f"飞书API错误: {data.get('msg')}"
+                logger.error(error_msg)
+                raise ValueError(error_msg)
+            
+            sheets = data.get('data', {}).get('sheets', [])
+            result = []
+            for sheet in sheets:
+                result.append({
+                    'sheet_id': sheet.get('sheet_id'),
+                    'title': sheet.get('title')
+                })
+            
+            logger.info(f"获取到 {len(result)} 个表格")
+            return result
+            
+        except requests.exceptions.RequestException as e:
+            logger.error(f"获取表格信息失败: {e}")
+            raise
+        except Exception as e:
+            logger.error(f"解析表格信息失败: {e}")
+            raise
+    
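+    def get_sheet_data(self, sheet_id: str, range_: Optional[str] = None) -> Dict:
+        """
+        Fetch the cell values of one sheet.
+        
+        Args:
+            sheet_id: sheet ID
+            range_: cell range; falls back to the config when None
+        
+        Returns:
+            The `data` object of the Feishu API response, including the revision number
+        
+        Raises:
+            requests.exceptions.RequestException: the network request failed
+            ValueError: the API returned an error code
+        """
+        if range_ is None:
+            range_ = config.SHEET_RANGE
+        
+        # Note: reading cell values uses the v2 API rather than v3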
+        url = f'{self.base_url.replace("/v3", "/v2")}/spreadsheets/{self.spreadsheet_token}/values/{sheet_id}!{range_}'
+        params = {
+            'valueRenderOption': 'ToString',
+            'dateTimeRenderOption': 'FormattedString'
+        }
+        
+        try:
+            response = self.session.get(url, params=params, timeout=config.REQUEST_TIMEOUT)
+            response.raise_for_status()
+            data = response.json()
+            
+            if data.get('code') != 0:
+                error_msg = f"飞书API错误: {data.get('msg')}"
+                logger.error(error_msg)
+                raise ValueError(error_msg)
+            
+            logger.debug(f"获取表格数据成功: {sheet_id}, 范围: {range_}")
+            return data.get('data', {})
+            
+        except requests.exceptions.RequestException as e:
+            logger.error(f"获取表格数据失败: {e}, sheet_id: {sheet_id}")
+            raise
+        except Exception as e:
+            logger.error(f"解析表格数据失败: {e}, sheet_id: {sheet_id}")
+            raise
+    
+    def test_connection(self) -> bool:
+        """
+        Check whether the Feishu connection works.
+        
+        Returns:
+            True when the sheet list is fetched and non-empty, otherwise False
+        """
+        try:
+            sheets = self.get_sheets_info()
+            if sheets:
+                logger.info(f"飞书连接测试成功，找到 {len(sheets)} 个表格")
+                return True
+            else:
+                logger.warning("飞书连接测试成功，但未找到表格")
+                return False
+        except Exception as e:
+            logger.error(f"飞书连接测试失败: {e}")
+            return False
+
+
+if __name__ == '__main__':
+    # Manual smoke test
+    import sys
+    
+    # Set the log level
+    logging.basicConfig(level=logging.INFO)
+    
+    # Test the connection
+    client = FeishuSheetsClient()
+    
+    if client.test_connection():
+        print("飞书连接测试成功")
+        
+        # Fetch the sheet list
+        sheets = client.get_sheets_info()
+        for sheet in sheets[:3]:  # show only the first three
+            print(f"表格: {sheet['title']} (ID: {sheet['sheet_id']})")
+        
+        if sheets:
+            # Fetch data from the first sheet
+            sheet_id = sheets[0]['sheet_id']
+            data = client.get_sheet_data(sheet_id, 'A1:C5')
+            print(f"获取到表格数据，版本: {data.get('revision', '未知')}")
+    else:
+        print("飞书连接测试失败")
+        sys.exit(1)
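
Review note on `get_sheet_data` above: it derives the v2 endpoint by calling `str.replace("/v3", "/v2")` on the configured base URL, which rewrites every `/v3` occurrence in the string. The sketch below is only one possible tightening, assuming the base URL ends in a `/v3` segment as the default `https://open.feishu.cn/open-apis/sheets/v3` does. The helpers `to_v2_base` and `build_values_url` are hypothetical names that are not part of this commit, and `TOKEN` and `SHEET` are placeholders.

```python
def to_v2_base(base_url: str) -> str:
    """Return base_url with a trailing /v3 segment swapped for /v2."""
    base = base_url.rstrip("/")
    if base.endswith("/v3"):
        return base[: -len("/v3")] + "/v2"
    return base  # no trailing /v3: leave the URL untouched


def build_values_url(base_url: str, spreadsheet_token: str,
                     sheet_id: str, range_: str) -> str:
    """Assemble the values URL in the same shape the client builds it."""
    return (f"{to_v2_base(base_url)}/spreadsheets/"
            f"{spreadsheet_token}/values/{sheet_id}!{range_}")


if __name__ == "__main__":
    # Placeholder token and sheet ID, for illustration only.
    print(build_values_url("https://open.feishu.cn/open-apis/sheets/v3",
                           "TOKEN", "SHEET", "A1:C5"))
```

Only the final path segment is touched, so a host name or proxy prefix that happens to contain `/v3` would survive unchanged.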
--- a/src/feishu/manager.py
+++ b/src/feishu/manager.py
@@ -0,0 +1,323 @@
+#!/usr/bin/env python3
+"""
+Feishu schedule manager module.
+Single entry point, backed by database storage and caching.
+"""
+from datetime import datetime, timedelta
+from typing import Dict, List, Optional, Tuple
+import logging
+
+from src.config import config
+from src.logging_config import get_logger
+from src.feishu.client import FeishuSheetsClient
+from src.feishu.parser import ScheduleDataParser
+from src.database.schedules import ScheduleDatabase
+
+logger = get_logger(__name__)
+
+
+class FeishuScheduleManager:
+    """Feishu schedule manager (single entry point)."""
+    
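+    def __init__(self, base_url: Optional[str] = None, token: Optional[str] = None,
+                 spreadsheet_token: Optional[str] = None, db_path: Optional[str] = None):
+        """
+        Initialize the manager.
+        
+        Args:
+            base_url: Feishu API base URL; falls back to the config when None
+            token: Feishu API token; falls back to the config when None
+            spreadsheet_token: spreadsheet token; falls back to the config when None
+            db_path: database path; falls back to the config when None
+        """
+        # Check that the required settings are present
+        self._check_config(token, spreadsheet_token)
+        
+        # Initialize the components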
+        self.client = FeishuSheetsClient(base_url, token, spreadsheet_token)
+        self.parser = ScheduleDataParser()
+        self.db = ScheduleDatabase(db_path)
+        
+        logger.info("飞书排班管理器初始化完成")
+    
+    def _check_config(self, token: Optional[str], spreadsheet_token: Optional[str]) -> None:
+        """Warn when a required setting is missing."""
+        if not token and not config.FEISHU_TOKEN:
+            logger.warning("飞书令牌未配置，排班功能将不可用")
+        
+        if not spreadsheet_token and not config.FEISHU_SPREADSHEET_TOKEN:
+            logger.warning("飞书表格令牌未配置，排班功能将不可用")
+    
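+    def _select_sheet_for_date(self, sheets: List[Dict[str, str]], target_year_month: str) -> Optional[Dict[str, str]]:
+        """
+        Pick the best-matching sheet for a date.
+        
+        Args:
+            sheets: list of sheet records
+            target_year_month: target year and month, in the form "2025-12"
+        
+        Returns:
+            The selected sheet record (the first sheet when nothing matches), or None when the list is empty
+        """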
+        if not sheets:
+            logger.error("表格列表为空")
+            return None
+        
+        # Extract the year and the month
+        try:
+            year = target_year_month[:4]
+            month = target_year_month[5:7].lstrip('0')
+        except (IndexError, ValueError) as e:
+            logger.error(f"解析年月失败: {target_year_month}, 错误: {e}")
+            return None
+        
+        # For 2026, prefer the yearly sheet
+        if year == '2026':
+            # Look for a yearly sheet such as "2026年排班表"
+            year_name = f"{year}年"
+            for sheet in sheets:
+                title = sheet.get('title', '')
+                if year_name in title and '排班表' in title:
+                    logger.info(f"找到2026年年度表格: {title}")
+                    return sheet
+        
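+        month_name = f"{int(month)}月"  # monthly sheet title, e.g. "12月"
+        for sheet in sheets:
+            title = sheet.get('title', '')
+            pos = title.find(month_name)  # a hit preceded by 1-9 is ignored: "1月" must not match "11月"
+            if pos == 0 or (pos > 0 and title[pos - 1] not in '123456789'):
+                logger.info(f"找到月份表格: {title}")
+                return sheet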
+        
+        # Otherwise look for a yearly sheet such as "2026年排班表"
+        year_name = f"{year}年"
+        for sheet in sheets:
+            title = sheet.get('title', '')
+            if year_name in title and '排班表' in title:
+                logger.info(f"找到年度表格: {title}")
+                return sheet
+        
+        # No matching sheet was found: fall back to the first one
+        logger.warning(f"未找到 {target_year_month} 的匹配表格，使用第一个表格: {sheets[0]['title']}")
+        return sheets[0]
+    
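+    def get_schedule_for_date(self, date_str: str) -> Dict[str, any]:
+        """
+        Get the schedule for a given date.
+        
+        Args:
+            date_str: date string in the form "2025-12-30"
+        
+        Returns:
+            Schedule dictionary
+        
+        Raises:
+            ValueError: the date is not in YYYY-MM-DD form, or the Feishu API returned an error code
+            (any other error is logged and an empty result is returned)
+        """
+        try:
+            # Parse the date
+            dt = datetime.strptime(date_str, '%Y-%m-%d')
+            
+            # Build both date formats so that either sheet layout can be matched
+            target_date_mm_dd = dt.strftime('%m/%d')  # "01/01" for monthly sheets
+            target_date_chinese = f"{dt.month}月{dt.day}日"  # "1月1日" for yearly sheets
+            target_year_month = dt.strftime('%Y-%m')  # "2025-12"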
+            
+            logger.info(f"获取 {date_str} 的排班信息 (格式: {target_date_mm_dd}/{target_date_chinese})")
+            
+            # 1. Try the database first
+            cached_schedule = self.db.get_schedule(date_str)
+            if cached_schedule:
+                logger.info(f"从数据库获取 {date_str} 的排班信息")
+                return self._format_db_result(cached_schedule)
+            
+            # 2. Not in the database, so fetch from Feishu
+            logger.info(f"数据库中没有 {date_str} 的排班信息，从飞书获取")
+            
+            # Fetch the sheet list
+            sheets = self.client.get_sheets_info()
+            if not sheets:
+                logger.error("未获取到表格信息")
+                return self._empty_result()
+            
+            # Pick the best-matching sheet
+            selected_sheet = self._select_sheet_for_date(sheets, target_year_month)
+            if not selected_sheet:
+                logger.error("未找到合适的表格")
+                return self._empty_result()
+            
+            sheet_id = selected_sheet['sheet_id']
+            sheet_title = selected_sheet['title']
+            
+            # 3. Fetch the sheet data
+            sheet_data = self.client.get_sheet_data(sheet_id)
+            if not sheet_data:
+                logger.error("未获取到表格数据")
+                return self._empty_result()
+            
+            values = sheet_data.get('valueRange', {}).get('values', [])
+            revision = sheet_data.get('revision', 0)
+            
+            if not values:
+                logger.error("表格数据为空")
+                return self._empty_result()
+            
+            # 4. Check whether the sheet has changed
+            need_update = self.db.check_sheet_update(
+                sheet_id, sheet_title, revision, {'values': values}
+            )
+            
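+            if not need_update and cached_schedule:
+                # Sheet unchanged and a cached record exists: return it (unreachable as written, since step 1 already returned whenever cached_schedule was truthy)
+                logger.info("表格无更新，使用数据库缓存")
+                return self._format_db_result(cached_schedule)
+            
+            # 5. Parse the data, choosing the date format by sheet type
+            # Yearly sheets use the Chinese date format; all others use mm/dd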
+            if '年' in sheet_title and '排班表' in sheet_title:
+                target_date = target_date_chinese  # "1月1日"
+            else:
+                target_date = target_date_mm_dd  # "01/01"
+            
+            logger.info(f"使用日期格式: {target_date} 解析表格: {sheet_title}")
+            result = self.parser.parse(values, target_date, sheet_title)
+            
+            # 6. Save to the database
+            if result['day_shift'] or result['night_shift']:
+                self.db.save_schedule(date_str, result, sheet_id, sheet_title)
+                logger.info(f"已保存 {date_str} 的排班信息到数据库")
+            
+            return result
+            
+        except ValueError as e:
+            logger.error(f"日期格式无效: {date_str}, 错误: {e}")
+            raise
+        except Exception as e:
+            logger.error(f"获取排班信息失败: {e}")
+            # Degrade gracefully: return an empty result
+            return self._empty_result()
+    
+    def get_schedule_for_today(self) -> Dict[str, any]:
+        """Get today's schedule."""
+        today = datetime.now().strftime('%Y-%m-%d')
+        return self.get_schedule_for_date(today)
+    
+    def get_schedule_for_tomorrow(self) -> Dict[str, any]:
+        """Get tomorrow's schedule."""
+        tomorrow = (datetime.now() + timedelta(days=1)).strftime('%Y-%m-%d')
+        return self.get_schedule_for_date(tomorrow)
+    
+    def refresh_all_schedules(self, days: Optional[int] = None):
+        """
+        Refresh the schedule for the given number of days, starting today.
+        
+        Args:
+            days: number of days to refresh; falls back to the config when None
+        """
+        if days is None:
+            days = config.SCHEDULE_REFRESH_DAYS
+        
+        logger.info(f"开始刷新未来 {days} 天的排班信息")
+        
+        today = datetime.now()
+        success_count = 0
+        error_count = 0
+        
+        for i in range(days):
+            date = (today + timedelta(days=i)).strftime('%Y-%m-%d')
+            try:
+                logger.debug(f"刷新 {date} 的排班信息...")
+                self.get_schedule_for_date(date)
+                success_count += 1
+            except Exception as e:
+                logger.error(f"刷新 {date} 的排班信息失败: {e}")
+                error_count += 1
+        
+        logger.info(f"排班信息刷新完成，成功: {success_count}, 失败: {error_count}")
+    
+    def get_schedule_by_range(self, start_date: str, end_date: str) -> List[Dict[str, any]]:
+        """
+        Get the schedules stored for a date range.
+        
+        Args:
+            start_date: start date (YYYY-MM-DD)
+            end_date: end date (YYYY-MM-DD)
+        
+        Returns:
+            List of schedule records
+        """
+        try:
+            # Validate the date format
+            datetime.strptime(start_date, '%Y-%m-%d')
+            datetime.strptime(end_date, '%Y-%m-%d')
+            
+            return self.db.get_schedule_by_range(start_date, end_date)
+            
+        except ValueError as e:
+            logger.error(f"日期格式无效: {e}")
+            return []
+        except Exception as e:
+            logger.error(f"获取排班范围失败: {e}")
+            return []
+    
+    def test_connection(self) -> bool:
+        """Check whether the Feishu connection works."""
+        return self.client.test_connection()
+    
+    def get_stats(self) -> Dict[str, any]:
+        """Get statistics from the schedule database."""
+        return self.db.get_stats()
+    
+    def _empty_result(self) -> Dict[str, any]:
+        """Return an empty result."""
+        return {
+            'day_shift': '',
+            'night_shift': '',
+            'day_shift_list': [],
+            'night_shift_list': []
+        }
+    
+    def _format_db_result(self, db_result: Dict[str, any]) -> Dict[str, any]:
+        """Convert a database record into the result format."""
+        return {
+            'day_shift': db_result['day_shift'],
+            'night_shift': db_result['night_shift'],
+            'day_shift_list': db_result['day_shift_list'],
+            'night_shift_list': db_result['night_shift_list']
+        }
+
+
+if __name__ == '__main__':
+    # Manual smoke test
+    import sys
+    
+    # Configure logging
+    logging.basicConfig(level=logging.INFO)
+    
+    # Create the manager
+    manager = FeishuScheduleManager()
+    
+    # Test the connection
+    if not manager.test_connection():
+        print("飞书连接测试失败")
+        sys.exit(1)
+    
+    print("飞书连接测试成功")
+    
+    # Fetch the schedules for today and tomorrow
+    today_schedule = manager.get_schedule_for_today()
+    print(f"今天排班: 白班={today_schedule['day_shift']}, 夜班={today_schedule['night_shift']}")
+    
+    tomorrow_schedule = manager.get_schedule_for_tomorrow()
+    print(f"明天排班: 白班={tomorrow_schedule['day_shift']}, 夜班={tomorrow_schedule['night_shift']}")
+    
+    # Fetch the statistics
+    stats = manager.get_stats()
+    print(f"排班统计: {stats}")
+    
+    # Query a range (the last 7 days)
+    end_date = datetime.now().strftime('%Y-%m-%d')
+    start_date = (datetime.now() - timedelta(days=7)).strftime('%Y-%m-%d')
+    schedules = manager.get_schedule_by_range(start_date, end_date)
+    print(f"最近7天排班记录: {len(schedules)} 条")
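
Review note on `_select_sheet_for_date` above: monthly sheets are found by looking for a label such as "1月" inside each sheet title, and a plain substring test would also accept "11月" for January or "12月" for February. The sketch below shows one digit-aware way to match, under the assumption that titles contain the month as a decimal number followed by "月", optionally zero-padded. `find_month_sheet` is a hypothetical helper that is not part of this commit.

```python
import re
from typing import Dict, List, Optional


def find_month_sheet(sheets: List[Dict[str, str]], month: int) -> Optional[Dict[str, str]]:
    """Return the first sheet whose title names exactly the given month."""
    # (?<!\d) rejects a hit that is preceded by another digit, so month 1
    # cannot match inside "11月"; 0? tolerates zero-padded titles like "01月".
    pattern = re.compile(rf"(?<!\d)0?{month}月")
    for sheet in sheets:
        title = sheet.get("title") or ""  # a missing or None title counts as empty
        if pattern.search(title):
            return sheet
    return None


if __name__ == "__main__":
    demo = [
        {"sheet_id": "a", "title": "11月"},
        {"sheet_id": "b", "title": "1月"},
    ]
    print(find_month_sheet(demo, 1))
```

With the two demo sheets, the January lookup skips the "11月" sheet even though it comes first in the list.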
--- a/src/feishu/parser.py
+++ b/src/feishu/parser.py
@@ -0,0 +1,339 @@
+#!/usr/bin/env python3
+"""
+Schedule data parser module.
+Supports parsing both monthly and yearly sheets.
+"""
+import re
+from typing import Dict, List, Optional, Tuple
+import logging
+
+from src.logging_config import get_logger
+
+logger = get_logger(__name__)
+
+
+class ScheduleDataParser:
+    """Schedule data parser (supports monthly and yearly sheets)."""
+    
+    @staticmethod
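+    def _parse_chinese_date(date_str: str) -> Optional[str]:
+        """
+        Normalize a date label to the Chinese month/day form.
+        
+        Args:
+            date_str: date text such as "12月30日", "12/30", "12月1日" or "1月1日"
+        
+        Returns:
+            Normalized date string "M月D日" (no zero padding), or None when the input cannot be parsed
+        
+        Note:
+            Parse errors are caught and logged; the method returns None instead of raising
+        """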
+        if not date_str or not isinstance(date_str, str):
+            return None
+        
+        date_str = date_str.strip()
+        
+        try:
+            # "12/30" format
+            if '/' in date_str:
+                month, day = date_str.split('/')
+                # Strip surrounding spaces and leading zeros
+                month = month.strip().lstrip('0')
+                day = day.strip().lstrip('0')
+                if not month.isdigit() or not day.isdigit():
+                    raise ValueError(f"日期格式无效: {date_str}")
+                return f"{int(month)}月{int(day)}日"
+            
+            # "12月30日" or "1月1日" format
+            if '月' in date_str and '日' in date_str:
+                # Strip leading zeros, e.g. "01月01日" -> "1月1日"
+                parts = date_str.split('月')
+                if len(parts) == 2:
+                    month_part = parts[0].lstrip('0')
+                    day_part = parts[1].rstrip('日').lstrip('0')
+                    if not month_part or not day_part:
+                        raise ValueError(f"日期格式无效: {date_str}")
+                    return f"{month_part}月{day_part}日"
+                return date_str
+            
+            # Has "月" but no "日" (e.g. "12月30"): append "日"
+            if '月' in date_str:
+                # Always true here: inputs containing both characters returned above
+                if '日' not in date_str:
+                    return f"{date_str}日"
+                return date_str
+            
+            # Four-digit numeric string
+            if date_str.isdigit() and len(date_str) == 4:
+                # Treated as MMDD, e.g. "1230"
+                month = date_str[:2].lstrip('0')
+                day = date_str[2:].lstrip('0')
+                return f"{month}月{day}日"
+            
+            return None
+            
+        except Exception as e:
+            logger.warning(f"解析日期失败: {date_str}, 错误: {e}")
+            return None
+    
+    @staticmethod
+    def _find_date_column_index(headers: List[str], target_date: str) -> Optional[int]:
+        """
+        Find the column index of the target date in the header row.
+        
+        Args:
+            headers: header row ["姓名", "12月1日", "12月2日", ...]
+            target_date: target date, e.g. "12月30日"
+        
+        Returns:
+            Zero-based column index, or None when not found
+        """
+        if not headers or not target_date:
+            return None
+        
+        # Normalize the target date
+        target_std = ScheduleDataParser._parse_chinese_date(target_date)
+        if not target_std:
+            logger.warning(f"无法标准化目标日期: {target_date}")
+            return None
+        
+        # Scan the header row for a matching date
+        for i, header in enumerate(headers):
+            if not header:
+                continue
+                
+            header_std = ScheduleDataParser._parse_chinese_date(header)
+            if header_std == target_std:
+                logger.debug(f"找到日期列: {target_date} -> {header} (索引: {i})")
+                return i
+        
+        logger.warning(f"未找到日期列: {target_date}, 表头: {headers}")
+        return None
+    
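+    def parse_monthly_sheet(self, values: List[List[str]], target_date: str) -> Dict[str, any]:
+        """
+        Parse a monthly sheet (for example the December sheet).
+        
+        Args:
+            values: two-dimensional array returned by the Feishu sheet API
+            target_date: target date (format: "12月30日" or "12/30")
+        
+        Returns:
+            Schedule dictionary
+        """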
+        if not values or len(values) < 2:
+            logger.warning("表格数据为空或不足")
+            return self._empty_result()
+        
+        # The first row is the header
+        headers = values[0]
+        date_column_index = self._find_date_column_index(headers, target_date)
+        
+        if date_column_index is None:
+            logger.warning(f"未找到日期列: {target_date}")
+            return self._empty_result()
+        
+        # Collect the day-shift and night-shift staff
+        day_shift_names = []
+        night_shift_names = []
+        
+        # Staff rows start at the second row
+        for row_idx, row in enumerate(values[1:], start=2):
+            if len(row) <= date_column_index:
+                continue
+            
+            name = row[0] if row else ''
+            shift = row[date_column_index] if date_column_index < len(row) else ''
+            
+            if not name or not shift:
+                continue
+            
+            # Clean up the shift value
+            shift = shift.strip()
+            if shift == '白':
+                day_shift_names.append(name.strip())
+            elif shift == '夜':
+                night_shift_names.append(name.strip())
+            elif shift:  # any other shift type
+                logger.debug(f"忽略未知班次类型: {shift} (行: {row_idx})")
+        
+        return self._format_result(day_shift_names, night_shift_names)
+    
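+    def parse_yearly_sheet(self, values: List[List[str]], target_date: str) -> Dict[str, any]:
+        """
+        Parse a yearly sheet (for example "2026年排班表").
+        
+        Args:
+            values: two-dimensional array returned by the Feishu sheet API
+            target_date: target date in "12月30日" form; the month is read from the text before "月", so "12/30" yields an empty result
+        
+        Returns:
+            Schedule dictionary
+        """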
+        if not values:
+            logger.warning("年度表格数据为空")
+            return self._empty_result()
+        
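+        # Work out the target month; leading zeros are stripped to match the block titles parsed below
+        target_month = target_date.split('月')[0].strip().lstrip('0') if '月' in target_date else ''
+        if not target_month:
+            logger.warning(f"无法从 {target_date} 提取月份")
+            return self._empty_result()
+        
+        # Locate the matching month block in the yearly sheet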
+        current_block_start = -1
+        current_month = ''
+        
+        for i, row in enumerate(values):
+            if not row:
+                continue
+            
+            first_cell = str(row[0]) if row else ''
+            
+            # Check for a month title row such as "福州港1月排班表"
+            if '排班表' in first_cell and '月' in first_cell:
+                # Extract the month number
+                month_match = re.search(r'(\d+)月', first_cell)
+                if month_match:
+                    current_month = month_match.group(1).lstrip('0')
+                    current_block_start = i
+                    logger.debug(f"找到月份块: {current_month}月 (行: {i+1})")
+            
+            # Inside the target month block, the row right after the title is the header row
+            if current_month == target_month and i == current_block_start + 1:
+                # The current row is the header row
+                headers = row
+                date_column_index = self._find_date_column_index(headers, target_date)
+                
+                if date_column_index is None:
+                    logger.warning(f"在年度表格中未找到日期列: {target_date}")
+                    return self._empty_result()
+                
+                # Collect staff rows, starting from the row after the header
+                day_shift_names = []
+                night_shift_names = []
+                
+                for j in range(i + 1, len(values)):
+                    person_row = values[j]
+                    if not person_row:
+                        # Empty row: move on to the next one
+                        continue
+                    
+                    # Stop at the start of the next month block
+                    if person_row[0] and isinstance(person_row[0], str) and '排班表' in person_row[0] and '月' in person_row[0]:
+                        break
+                    
+                    # Skip weekday rows (rows whose first column is empty)
+                    if not person_row[0]:
+                        continue
+                    
+                    if len(person_row) <= date_column_index:
+                        continue
+                    
+                    name = person_row[0] if person_row else ''
+                    shift = person_row[date_column_index] if date_column_index < len(person_row) else ''
+                    
+                    if not name or not shift:
+                        continue
+                    
+                    # Clean up the shift value
+                    shift = shift.strip()
+                    if shift == '白':
+                        day_shift_names.append(name.strip())
+                    elif shift == '夜':
+                        night_shift_names.append(name.strip())
+                
+                return self._format_result(day_shift_names, night_shift_names)
+        
+        logger.warning(f"在年度表格中未找到 {target_month}月 的数据块")
+        return self._empty_result()
+    
+    def parse(self, values: List[List[str]], target_date: str, sheet_title: str = '') -> Dict[str, any]:
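+        """
+        Parse schedule data, detecting the sheet type automatically.
+        
+        Args:
+            values: 2D array returned by the Feishu sheet API
+            target_date: target date (format: "12月30日" or "12/30")
+            sheet_title: sheet title, used to decide the sheet type
+        
+        Returns:
+            Dictionary of schedule information
+        """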
+        # Decide the sheet type from the sheet title
+        if '年' in sheet_title and '排班表' in sheet_title:
+            # Yearly sheet
+            logger.info(f"使用年度表格解析器: {sheet_title}")
+            return self.parse_yearly_sheet(values, target_date)
+        else:
+            # Monthly sheet
+            logger.info(f"使用月度表格解析器: {sheet_title}")
+            return self.parse_monthly_sheet(values, target_date)
+    
+    def _empty_result(self) -> Dict[str, any]:
+        """Return an empty result."""
+        return {
+            'day_shift': '',
+            'night_shift': '',
+            'day_shift_list': [],
+            'night_shift_list': []
+        }
+    
+    def _format_result(self, day_shift_names: List[str], night_shift_names: List[str]) -> Dict[str, any]:
+        """Format the result."""
+        # Deduplicate and sort the names
+        day_shift_names = sorted(set(day_shift_names))
+        night_shift_names = sorted(set(night_shift_names))
+        
+        # Build the display strings
+        day_shift_str = '、'.join(day_shift_names) if day_shift_names else ''
+        night_shift_str = '、'.join(night_shift_names) if night_shift_names else ''
+        
+        return {
+            'day_shift': day_shift_str,
+            'night_shift': night_shift_str,
+            'day_shift_list': day_shift_names,
+            'night_shift_list': night_shift_names
+        }
+
+
+if __name__ == '__main__':
+    # Ad hoc test code
+    import sys
+    
+    # Configure logging
+    logging.basicConfig(level=logging.DEBUG)
+    
+    parser = ScheduleDataParser()
+    
+    # Test date parsing
+    test_dates = ["12/30", "12月30日", "1月1日", "01/01", "1230", "无效日期"]
+    for date in test_dates:
+        parsed = parser._parse_chinese_date(date)
+        print(f"解析 '{date}' -> '{parsed}'")
+    
+    # Test monthly sheet parsing
+    monthly_values = [
+        ["姓名", "12月1日", "12月2日", "12月3日"],
+        ["张三", "白", "夜", ""],
+        ["李四", "夜", "白", "白"],
+        ["王五", "", "白", "夜"]
+    ]
+    
+    result = parser.parse_monthly_sheet(monthly_values, "12月2日")
+    print(f"\n月度表格解析结果: {result}")
+    
+    # Test yearly sheet parsing
+    yearly_values = [
+        ["福州港1月排班表"],
+        ["姓名", "1月1日", "1月2日", "1月3日"],
+        ["张三", "白", "夜", ""],
+        ["李四", "夜", "白", "白"],
+        ["福州港2月排班表"],
+        ["姓名", "2月1日", "2月2日"],
+        ["王五", "白", "夜"]
+    ]
+    
+    result = parser.parse_yearly_sheet(yearly_values, "1月2日")
+    print(f"年度表格解析结果: {result}")
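Reviewer note: the date test cases above ("12/30", "12月30日", "01/01", "1230", "无效日期") all go through one normalization step that turns either input style into the unpadded "M月D日" form used in the sheet headers. The sketch below shows one way to express that step with a single regular expression. `normalize_date` and `_DATE_RE` are hypothetical names introduced here for illustration, not part of this commit, and the range check on month and day is an added assumption.

```python
import re
from typing import Optional

# Hypothetical helper, not part of the commit: accepts "M/D" or "M月D日"
# (with or without leading zeros) and returns "M月D日" without zero padding.
_DATE_RE = re.compile(r'^\s*(\d{1,2})\s*(?:/|月)\s*(\d{1,2})\s*日?\s*$')


def normalize_date(date_str: str) -> Optional[str]:
    """Return the date as "M月D日", or None if the input is not recognized."""
    if not date_str:
        return None
    match = _DATE_RE.match(date_str)
    if match is None:
        return None
    month, day = int(match.group(1)), int(match.group(2))
    # Assumption: reject values that cannot be a calendar month or day.
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return None
    return f"{month}月{day}日"
```

Because both the target date and each header cell would pass through the same function, "01/01" and "1月1日" compare equal after normalization, while non-date headers such as "姓名" simply map to None.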
--- a/src/feishu_v2.py
+++ b/src/feishu_v2.py
@@ -1,642 +0,0 @@
-#!/usr/bin/env python3
-"""
-飞书表格 API 客户端模块 v2
-支持数据库存储和2026年全年排班表
-"""
-import requests
-import json
-import os
-import time
-from datetime import datetime, timedelta
-from typing import Dict, List, Optional, Tuple
-import logging
-import hashlib
-
-from src.schedule_database import ScheduleDatabase
-
-logger = logging.getLogger(__name__)
-
-
-class FeishuSheetsClient:
-    """飞书表格 API 客户端"""
-    
-    def __init__(self, base_url: str, token: str, spreadsheet_token: str):
-        """
-        初始化客户端
-        
-        参数:
-            base_url: 飞书 API 基础URL
-            token: Bearer 认证令牌
-            spreadsheet_token: 表格 token
-        """
-        self.base_url = base_url.rstrip('/')
-        self.spreadsheet_token = spreadsheet_token
-        self.headers = {
-            'Authorization': f'Bearer {token}',
-            'Content-Type': 'application/json',
-            'Accept': 'application/json'
-        }
-    
-    def get_sheets_info(self) -> List[Dict]:
-        """
-        获取所有表格信息（sheet_id 和 title）
-        
-        返回:
-            表格信息列表 [{'sheet_id': 'xxx', 'title': 'xxx'}, ...]
-        """
-        url = f'{self.base_url}/spreadsheets/{self.spreadsheet_token}/sheets/query'
-        
-        try:
-            response = requests.get(url, headers=self.headers, timeout=30)
-            response.raise_for_status()
-            data = response.json()
-            
-            if data.get('code') != 0:
-                logger.error(f"飞书API错误: {data.get('msg')}")
-                return []
-            
-            sheets = data.get('data', {}).get('sheets', [])
-            result = []
-            for sheet in sheets:
-                result.append({
-                    'sheet_id': sheet.get('sheet_id'),
-                    'title': sheet.get('title')
-                })
-            
-            logger.info(f"获取到 {len(result)} 个表格")
-            return result
-            
-        except requests.exceptions.RequestException as e:
-            logger.error(f"获取表格信息失败: {e}")
-            return []
-        except Exception as e:
-            logger.error(f"解析表格信息失败: {e}")
-            return []
-    
-    def get_sheet_data(self, sheet_id: str, range_: str = 'A:AF') -> Dict:
-        """
-        获取指定表格的数据
-        
-        参数:
-            sheet_id: 表格ID
-            range_: 数据范围，默认 A:AF (31列)
-        
-        返回:
-            飞书API返回的原始数据，包含revision版本号
-        """
-        # 注意：获取表格数据使用 v2 API，而不是 v3
-        url = f'{self.base_url.replace("/v3", "/v2")}/spreadsheets/{self.spreadsheet_token}/values/{sheet_id}!{range_}'
-        params = {
-            'valueRenderOption': 'ToString',
-            'dateTimeRenderOption': 'FormattedString'
-        }
-        
-        try:
-            response = requests.get(url, headers=self.headers, params=params, timeout=30)
-            response.raise_for_status()
-            data = response.json()
-            
-            if data.get('code') != 0:
-                logger.error(f"飞书API错误: {data.get('msg')}")
-                return {}
-            
-            return data.get('data', {})
-            
-        except requests.exceptions.RequestException as e:
-            logger.error(f"获取表格数据失败: {e}")
-            return {}
-        except Exception as e:
-            logger.error(f"解析表格数据失败: {e}")
-            return {}
-
-
-class ScheduleDataParser:
-    """排班数据解析器（支持2026年全年排班表）"""
-    
-    @staticmethod
-    def _parse_chinese_date(date_str: str) -> Optional[str]:
-        """
-        解析中文日期格式
-        
-        参数:
-            date_str: 中文日期，如 "12月30日" 或 "12/30" 或 "12月1日" 或 "1月1日"
-        
-        返回:
-            标准化日期字符串 "M月D日" (不补零)
-        """
-        if not date_str:
-            return None
-        
-        # 如果是 "12/30" 格式
-        if '/' in date_str:
-            try:
-                month, day = date_str.split('/')
-                # 移除可能的空格和前导零
-                month = month.strip().lstrip('0')
-                day = day.strip().lstrip('0')
-                return f"{int(month)}月{int(day)}日"
-            except:
-                return None
-        
-        # 如果是 "12月30日" 或 "1月1日" 格式
-        if '月' in date_str and '日' in date_str:
-            # 移除前导零，如 "01月01日" -> "1月1日"
-            parts = date_str.split('月')
-            if len(parts) == 2:
-                month_part = parts[0].lstrip('0')
-                day_part = parts[1].rstrip('日').lstrip('0')
-                return f"{month_part}月{day_part}日"
-            return date_str
-        
-        # 如果是 "12月1日" 格式（已经包含"日"字）
-        if '月' in date_str:
-            # 检查是否已经有"日"字
-            if '日' not in date_str:
-                return f"{date_str}日"
-            return date_str
-        
-        return None
-    
-    @staticmethod
-    def _find_date_column_index(headers: List[str], target_date: str) -> Optional[int]:
-        """
-        在表头中查找目标日期对应的列索引
-        
-        参数:
-            headers: 表头行 ["姓名", "12月1日", "12月2日", ...]
-            target_date: 目标日期 "12月30日"
-        
-        返回:
-            列索引（从0开始），未找到返回None
-        """
-        if not headers or not target_date:
-            return None
-        
-        # 标准化目标日期
-        target_std = ScheduleDataParser._parse_chinese_date(target_date)
-        if not target_std:
-            return None
-        
-        # 遍历表头查找匹配的日期
-        for i, header in enumerate(headers):
-            header_std = ScheduleDataParser._parse_chinese_date(header)
-            if header_std == target_std:
-                return i
-        
-        return None
-    
-    def parse_monthly_sheet(self, values: List[List[str]], target_date: str) -> Dict:
-        """
-        解析月度表格数据（如12月表格）
-        
-        参数:
-            values: 飞书表格返回的二维数组
-            target_date: 目标日期（格式: "12月30日" 或 "12/30"）
-        
-        返回:
-            排班信息字典
-        """
-        if not values or len(values) < 2:
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        # 第一行是表头
-        headers = values[0]
-        date_column_index = self._find_date_column_index(headers, target_date)
-        
-        if date_column_index is None:
-            logger.warning(f"未找到日期列: {target_date}")
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        # 收集白班和夜班人员
-        day_shift_names = []
-        night_shift_names = []
-        
-        # 从第二行开始是人员数据
-        for row in values[1:]:
-            if len(row) <= date_column_index:
-                continue
-            
-            name = row[0] if row else ''
-            shift = row[date_column_index] if date_column_index < len(row) else ''
-            
-            if not name or not shift:
-                continue
-            
-            if shift == '白':
-                day_shift_names.append(name)
-            elif shift == '夜':
-                night_shift_names.append(name)
-        
-        # 格式化输出
-        day_shift_str = '、'.join(day_shift_names) if day_shift_names else ''
-        night_shift_str = '、'.join(night_shift_names) if night_shift_names else ''
-        
-        return {
-            'day_shift': day_shift_str,
-            'night_shift': night_shift_str,
-            'day_shift_list': day_shift_names,
-            'night_shift_list': night_shift_names
-        }
-    
-    def parse_yearly_sheet(self, values: List[List[str]], target_date: str) -> Dict:
-        """
-        解析年度表格数据（如2026年排班表）
-        
-        参数:
-            values: 飞书表格返回的二维数组
-            target_date: 目标日期（格式: "12月30日" 或 "12/30"）
-        
-        返回:
-            排班信息字典
-        """
-        if not values:
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        # 查找目标月份的数据块
-        target_month = target_date.split('月')[0] if '月' in target_date else ''
-        if not target_month:
-            logger.warning(f"无法从 {target_date} 提取月份")
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-        
-        # 在年度表格中查找对应的月份块
-        current_block_start = -1
-        current_month = ''
-        
-        for i, row in enumerate(values):
-            if not row:
-                continue
-            
-            first_cell = str(row[0]) if row else ''
-            
-            # 检查是否是月份标题行，如 "福州港1月排班表"
-            if '排班表' in first_cell and '月' in first_cell:
-                # 提取月份数字
-                for char in first_cell:
-                    if char.isdigit():
-                        month_str = ''
-                        j = first_cell.index(char)
-                        while j < len(first_cell) and first_cell[j].isdigit():
-                            month_str += first_cell[j]
-                            j += 1
-                        if month_str:
-                            current_month = month_str
-                            current_block_start = i
-                            break
-            
-            # 如果找到目标月份，检查下一行是否是表头行
-            if current_month == target_month and i == current_block_start + 1:
-                # 当前行是表头行
-                headers = row
-                date_column_index = self._find_date_column_index(headers, target_date)
-                
-                if date_column_index is None:
-                    logger.warning(f"在年度表格中未找到日期列: {target_date}")
-                    return {
-                        'day_shift': '',
-                        'night_shift': '',
-                        'day_shift_list': [],
-                        'night_shift_list': []
-                    }
-                
-                # 收集人员数据（从表头行的下一行开始）
-                day_shift_names = []
-                night_shift_names = []
-                
-                for j in range(i + 1, len(values)):
-                    person_row = values[j]
-                    if not person_row:
-                        # 遇到空行，继续检查下一行
-                        continue
-                    
-                    # 检查是否是下一个月份块的开始
-                    if person_row[0] and isinstance(person_row[0], str) and '排班表' in person_row[0] and '月' in person_row[0]:
-                        break
-                    
-                    # 跳过星期行（第一列为空的行）
-                    if not person_row[0]:
-                        continue
-                    
-                    if len(person_row) <= date_column_index:
-                        continue
-                    
-                    name = person_row[0] if person_row else ''
-                    shift = person_row[date_column_index] if date_column_index < len(person_row) else ''
-                    
-                    if not name or not shift:
-                        continue
-                    
-                    if shift == '白':
-                        day_shift_names.append(name)
-                    elif shift == '夜':
-                        night_shift_names.append(name)
-                
-                # 格式化输出
-                day_shift_str = '、'.join(day_shift_names) if day_shift_names else ''
-                night_shift_str = '、'.join(night_shift_names) if night_shift_names else ''
-                
-                return {
-                    'day_shift': day_shift_str,
-                    'night_shift': night_shift_str,
-                    'day_shift_list': day_shift_names,
-                    'night_shift_list': night_shift_names
-                }
-        
-        logger.warning(f"在年度表格中未找到 {target_month}月 的数据块")
-        return {
-            'day_shift': '',
-            'night_shift': '',
-            'day_shift_list': [],
-            'night_shift_list': []
-        }
-    
-    def parse(self, values: List[List[str]], target_date: str, sheet_title: str = '') -> Dict:
-        """
-        解析排班数据，自动判断表格类型
-        
-        参数:
-            values: 飞书表格返回的二维数组
-            target_date: 目标日期（格式: "12月30日" 或 "12/30"）
-            sheet_title: 表格标题，用于判断表格类型
-        
-        返回:
-            排班信息字典
-        """
-        # 根据表格标题判断表格类型
-        if '年' in sheet_title and '排班表' in sheet_title:
-            # 年度表格
-            logger.info(f"使用年度表格解析器: {sheet_title}")
-            return self.parse_yearly_sheet(values, target_date)
-        else:
-            # 月度表格
-            logger.info(f"使用月度表格解析器: {sheet_title}")
-            return self.parse_monthly_sheet(values, target_date)
-
-
-class FeishuScheduleManagerV2:
-    """飞书排班管理器 v2（使用数据库存储）"""
-    
-    def __init__(self, base_url: str = None, token: str = None, 
-                 spreadsheet_token: str = None):
-        """
-        初始化管理器
-        
-        参数:
-            base_url: 飞书API基础URL，从环境变量读取
-            token: 飞书API令牌，从环境变量读取
-            spreadsheet_token: 表格token，从环境变量读取
-        """
-        # 从环境变量读取配置
-        self.base_url = base_url or os.getenv('FEISHU_BASE_URL', 'https://open.feishu.cn/open-apis/sheets/v3')
-        self.token = token or os.getenv('FEISHU_TOKEN', '')
-        self.spreadsheet_token = spreadsheet_token or os.getenv('FEISHU_SPREADSHEET_TOKEN', '')
-        
-        if not self.token or not self.spreadsheet_token:
-            logger.warning("飞书配置不完整，请检查环境变量")
-        
-        self.client = FeishuSheetsClient(self.base_url, self.token, self.spreadsheet_token)
-        self.parser = ScheduleDataParser()
-        self.db = ScheduleDatabase()
-    
-    def _select_sheet_for_date(self, sheets: List[Dict], target_year_month: str) -> Optional[Dict]:
-        """
-        为指定日期选择最合适的表格
-        
-        参数:
-            sheets: 表格列表
-            target_year_month: 目标年月，格式 "2025-12"
-        
-        返回:
-            选中的表格信息，未找到返回None
-        """
-        if not sheets:
-            return None
-        
-        # 提取年份和月份
-        year = target_year_month[:4]
-        month = target_year_month[5:7]
-        
-        # 对于2026年，优先使用年度表格
-        if year == '2026':
-            # 查找年度表格，如 "2026年排班表"
-            year_name = f"{year}年"
-            for sheet in sheets:
-                title = sheet.get('title', '')
-                if year_name in title and '排班表' in title:
-                    logger.info(f"找到2026年年度表格: {title}")
-                    return sheet
-        
-        # 优先查找月份表格，如 "12月"
-        month_name = f"{int(month)}月"
-        for sheet in sheets:
-            title = sheet.get('title', '')
-            if month_name in title:
-                logger.info(f"找到月份表格: {title}")
-                return sheet
-        
-        # 查找年度表格，如 "2026年排班表"
-        year_name = f"{year}年"
-        for sheet in sheets:
-            title = sheet.get('title', '')
-            if year_name in title and '排班表' in title:
-                logger.info(f"找到年度表格: {title}")
-                return sheet
-        
-        # 如果没有找到匹配的表格，使用第一个表格
-        logger.warning(f"未找到 {target_year_month} 的匹配表格，使用第一个表格: {sheets[0]['title']}")
-        return sheets[0]
-    
-    def get_schedule_for_date(self, date_str: str) -> Dict:
-        """
-        获取指定日期的排班信息
-        
-        参数:
-            date_str: 日期字符串，格式 "2025-12-30"
-        
-        返回:
-            排班信息字典
-        """
-        try:
-            # 解析日期
-            dt = datetime.strptime(date_str, '%Y-%m-%d')
-            # 生成两种格式的日期字符串，用于匹配不同表格
-            target_date_mm_dd = dt.strftime('%m/%d')  # "01/01" 用于月度表格
-            target_date_chinese = f"{dt.month}月{dt.day}日"  # "1月1日" 用于年度表格
-            target_year_month = dt.strftime('%Y-%m')  # "2025-12"
-            
-            logger.info(f"获取 {date_str} 的排班信息 (格式: {target_date_mm_dd}/{target_date_chinese})")
-            
-            # 1. 首先尝试从数据库获取
-            cached_schedule = self.db.get_schedule(date_str)
-            if cached_schedule:
-                logger.info(f"从数据库获取 {date_str} 的排班信息")
-                return {
-                    'day_shift': cached_schedule['day_shift'],
-                    'night_shift': cached_schedule['night_shift'],
-                    'day_shift_list': cached_schedule['day_shift_list'],
-                    'night_shift_list': cached_schedule['night_shift_list']
-                }
-            
-            # 2. 数据库中没有，需要从飞书获取
-            logger.info(f"数据库中没有 {date_str} 的排班信息，从飞书获取")
-            
-            # 获取表格信息
-            sheets = self.client.get_sheets_info()
-            if not sheets:
-                logger.error("未获取到表格信息")
-                return {
-                    'day_shift': '',
-                    'night_shift': '',
-                    'day_shift_list': [],
-                    'night_shift_list': []
-                }
-            
-            # 选择最合适的表格
-            selected_sheet = self._select_sheet_for_date(sheets, target_year_month)
-            if not selected_sheet:
-                logger.error("未找到合适的表格")
-                return {
-                    'day_shift': '',
-                    'night_shift': '',
-                    'day_shift_list': [],
-                    'night_shift_list': []
-                }
-            
-            sheet_id = selected_sheet['sheet_id']
-            sheet_title = selected_sheet['title']
-            
-            # 3. 获取表格数据
-            sheet_data = self.client.get_sheet_data(sheet_id)
-            if not sheet_data:
-                logger.error("未获取到表格数据")
-                return {
-                    'day_shift': '',
-                    'night_shift': '',
-                    'day_shift_list': [],
-                    'night_shift_list': []
-                }
-            
-            values = sheet_data.get('valueRange', {}).get('values', [])
-            revision = sheet_data.get('revision', 0)
-            
-            if not values:
-                logger.error("表格数据为空")
-                return {
-                    'day_shift': '',
-                    'night_shift': '',
-                    'day_shift_list': [],
-                    'night_shift_list': []
-                }
-            
-            # 4. 检查表格是否有更新
-            need_update = self.db.check_sheet_update(
-                sheet_id, sheet_title, revision, {'values': values}
-            )
-            
-            if not need_update and cached_schedule:
-                # 表格无更新，且数据库中有缓存，直接返回
-                logger.info(f"表格无更新，使用数据库缓存")
-                return {
-                    'day_shift': cached_schedule['day_shift'],
-                    'night_shift': cached_schedule['night_shift'],
-                    'day_shift_list': cached_schedule['day_shift_list'],
-                    'night_shift_list': cached_schedule['night_shift_list']
-                }
-            
-            # 5. 解析数据 - 根据表格类型选择合适的日期格式
-            # 如果是年度表格，使用中文日期格式；否则使用mm/dd格式
-            if '年' in sheet_title and '排班表' in sheet_title:
-                target_date = target_date_chinese  # "1月1日"
-            else:
-                target_date = target_date_mm_dd  # "01/01"
-            
-            logger.info(f"使用日期格式: {target_date} 解析表格: {sheet_title}")
-            result = self.parser.parse(values, target_date, sheet_title)
-            
-            # 6. 保存到数据库
-            if result['day_shift'] or result['night_shift']:
-                self.db.save_schedule(date_str, result, sheet_id, sheet_title)
-                logger.info(f"已保存 {date_str} 的排班信息到数据库")
-            
-            return result
-            
-        except Exception as e:
-            logger.error(f"获取排班信息失败: {e}")
-            # 降级处理：返回空值
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'day_shift_list': [],
-                'night_shift_list': []
-            }
-    
-    def get_schedule_for_today(self) -> Dict:
-        """获取今天的排班信息"""
-        today = datetime.now().strftime('%Y-%m-%d')
-        return self.get_schedule_for_date(today)
-    
-    def get_schedule_for_tomorrow(self) -> Dict:
-        """获取明天的排班信息"""
-        tomorrow = (datetime.now() + timedelta(days=1)).strftime('%Y-%m-%d')
-        return self.get_schedule_for_date(tomorrow)
-    
-    def refresh_all_schedules(self, days: int = 30):
-        """
-        刷新未来指定天数的排班信息
-        
-        参数:
-            days: 刷新未来多少天的排班信息
-        """
-        logger.info(f"开始刷新未来 {days} 天的排班信息")
-        
-        today = datetime.now()
-        for i in range(days):
-            date = (today + timedelta(days=i)).strftime('%Y-%m-%d')
-            logger.info(f"刷新 {date} 的排班信息...")
-            self.get_schedule_for_date(date)
-        
-        logger.info(f"排班信息刷新完成")
-
-
-if __name__ == '__main__':
-    # 测试代码
-    import sys
-    
-    # 设置日志
-    logging.basicConfig(level=logging.INFO)
-    
-    # 从环境变量读取配置
-    manager = FeishuScheduleManagerV2()
-    
-    if len(sys.argv) > 1:
-        date_str = sys.argv[1]
-    else:
-        date_str = datetime.now().strftime('%Y-%m-%d')
-    
-    print(f"获取 {date_str} 的排班信息...")
-    schedule = manager.get_schedule_for_date(date_str)
-    
-    print(f"白班人员: {schedule['day_shift']}")
-    print(f"夜班人员: {schedule['night_shift']}")
-    print(f"白班列表: {schedule['day_shift_list']}")
-    print(f"夜班列表: {schedule['night_shift_list']}")
-            
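Reviewer note: the month-title scan in the removed `parse_yearly_sheet` takes the first run of digits in the title cell, so a title such as "福州港2026年1月排班表" would yield "2026" rather than "1". A minimal sketch of a stricter extraction follows, assuming month titles always contain "排班表" and a one- or two-digit number directly before "月". `month_from_title` is a hypothetical name, not a function from this repository.

```python
import re
from typing import Optional

# Matches a 1-2 digit number immediately followed by "月".
_MONTH_TITLE_RE = re.compile(r'(\d{1,2})月')


def month_from_title(title: str) -> Optional[str]:
    """Return the month number in a block title as a string, or None."""
    if '排班表' not in title:
        return None
    match = _MONTH_TITLE_RE.search(title)
    if match is None:
        return None
    month = int(match.group(1))
    # Return a string so it can be compared with target_date.split('月')[0].
    return str(month) if 1 <= month <= 12 else None
```

A title without a month, such as "福州港2026年排班表", returns None here, so it would be treated as the sheet heading rather than as the start of a month block.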
--- a/src/gui.py
+++ b/src/gui.py
@@ -12,11 +12,15 @@ import os
 # 添加项目根目录到 Python 路径
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

-from src.confluence import ConfluenceClient
-from src.extractor import HTMLTextExtractor
-from src.parser import HandoverLogParser
-from src.database import DailyLogsDatabase
-from src.report import DailyReportGenerator
+# Import modules from the new architecture
+from src.config import config
+from src.logging_config import get_logger
+from src.confluence import ConfluenceClient, ConfluenceClientError, HTMLTextExtractor, HTMLTextExtractorError, HandoverLogParser, LogParserError
+from src.report import DailyReportGenerator, ReportGeneratorError
+from src.database.base import DatabaseConnectionError
+from src.database.daily_logs import DailyLogsDatabase
+from src.database.schedules import ScheduleDatabase
+from src.feishu import FeishuScheduleManager, FeishuClientError


 class OrbitInGUI:
@@ -25,9 +29,12 @@ class OrbitInGUI:
    def __init__(self, root):
        self.root = root
        self.root.title("码头作业日志管理工具 - OrbitIn")
-        self.root.geometry("900x700")
+        self.root.geometry(config.GUI_WINDOW_SIZE)
        self.root.resizable(True, True)
        
+        # Initialize the logger
+        self.logger = get_logger(__name__)
+        
        # 设置样式
        style = ttk.Style()
        style.theme_use('clam')
@@ -154,7 +161,7 @@ class OrbitInGUI:
        self.report_text = scrolledtext.ScrolledText(
            right_frame,
            wrap=tk.WORD,
-            font=('SimHei', 10),
+            font=(config.GUI_FONT_FAMILY, config.GUI_FONT_SIZE),
            bg='white',
            height=18
        )
@@ -227,72 +234,94 @@ class OrbitInGUI:
        """获取并处理数据"""
        self.set_status("正在获取数据...")
        self.log_message("开始获取数据...")
+        self.logger.info("开始获取数据...")
        
        try:
-            # 加载配置
-            from dotenv import load_dotenv
-            load_dotenv()
-            
-            base_url = os.getenv('CONFLUENCE_BASE_URL')
-            token = os.getenv('CONFLUENCE_TOKEN')
-            content_id = os.getenv('CONFLUENCE_CONTENT_ID')
-            
-            if not base_url or not token or not content_id:
+            # Validate the configuration
+            if not config.CONFLUENCE_BASE_URL or not config.CONFLUENCE_TOKEN or not config.CONFLUENCE_CONTENT_ID:
                self.log_message("错误: 未配置 Confluence 信息，请检查 .env 文件", is_error=True)
+                self.logger.error("Confluence 配置不完整")
                return
            
            # 获取 HTML
            self.log_message("正在从 Confluence 获取 HTML...")
-            client = ConfluenceClient(base_url, token)
-            html = client.get_html(content_id)
+            self.logger.info("正在从 Confluence 获取 HTML...")
+            client = ConfluenceClient(config.CONFLUENCE_BASE_URL, config.CONFLUENCE_TOKEN)
+            html = client.get_html(config.CONFLUENCE_CONTENT_ID)
            
            if not html:
                self.log_message("错误: 未获取到 HTML 内容", is_error=True)
+                self.logger.error("未获取到 HTML 内容")
                return
            
            self.log_message(f"获取成功，共 {len(html)} 字符")
+            self.logger.info(f"获取成功，共 {len(html)} 字符")
            
            # 提取文本
            self.log_message("正在提取布局文本...")
+            self.logger.info("正在提取布局文本...")
            extractor = HTMLTextExtractor()
            layout_text = extractor.extract(html)
            self.log_message(f"提取完成，共 {len(layout_text)} 字符")
+            self.logger.info(f"提取完成，共 {len(layout_text)} 字符")
            
            # 解析数据
            self.log_message("正在解析日志数据...")
+            self.logger.info("正在解析日志数据...")
            parser = HandoverLogParser()
            logs = parser.parse(layout_text)
            self.log_message(f"解析到 {len(logs)} 条记录")
+            self.logger.info(f"解析到 {len(logs)} 条记录")
            
            # 保存到数据库
            if logs:
                self.log_message("正在保存到数据库...")
+                self.logger.info("正在保存到数据库...")
                db = DailyLogsDatabase()
                count = db.insert_many([log.to_dict() for log in logs])
-                db.close()
                self.log_message(f"已保存 {count} 条记录")
+                self.logger.info(f"已保存 {count} 条记录")
                
                # 显示统计
-                db = DailyLogsDatabase()
                stats = db.get_stats()
-                db.close()
                self.log_message(f"数据库总计: {stats['total']} 条记录, {len(stats['ships'])} 艘船")
+                self.logger.info(f"数据库总计: {stats['total']} 条记录, {len(stats['ships'])} 艘船")
                
                # 刷新日报显示
                self.generate_today_report()
            else:
                self.log_message("未解析到任何记录")
+                self.logger.warning("未解析到任何记录")
            
            self.set_status("完成")
+            self.logger.info("数据获取完成")
            
+        except ConfluenceClientError as e:
+            self.log_message(f"Confluence API 错误: {e}", is_error=True)
+            self.logger.error(f"Confluence API 错误: {e}")
+            self.set_status("错误")
+        except HTMLTextExtractorError as e:
+            self.log_message(f"HTML 提取错误: {e}", is_error=True)
+            self.logger.error(f"HTML 提取错误: {e}")
+            self.set_status("错误")
+        except LogParserError as e:
+            self.log_message(f"日志解析错误: {e}", is_error=True)
+            self.logger.error(f"日志解析错误: {e}")
+            self.set_status("错误")
+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
        except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
            self.set_status("错误")
    
    def fetch_debug(self):
        """Debug模式获取数据"""
        self.set_status("正在获取 Debug 数据...")
        self.log_message("使用本地 layout_output.txt 进行 Debug...")
+        self.logger.info("使用本地 layout_output.txt 进行 Debug...")
        
        try:
            # 检查本地文件
@@ -302,35 +331,51 @@ class OrbitInGUI:
                filepath = 'debug/layout_output.txt'
            else:
                self.log_message("错误: 未找到 layout_output.txt 文件", is_error=True)
+                self.logger.error("未找到 layout_output.txt 文件")
                return
            
            self.log_message(f"使用文件: {filepath}")
+            self.logger.info(f"使用文件: {filepath}")
            
            with open(filepath, 'r', encoding='utf-8') as f:
                text = f.read()
            
            self.log_message(f"读取完成，共 {len(text)} 字符")
+            self.logger.info(f"读取完成，共 {len(text)} 字符")
            
            # 解析数据
            self.log_message("正在解析日志数据...")
+            self.logger.info("正在解析日志数据...")
            parser = HandoverLogParser()
            logs = parser.parse(text)
            self.log_message(f"解析到 {len(logs)} 条记录")
+            self.logger.info(f"解析到 {len(logs)} 条记录")
            
            if logs:
                self.log_message("正在保存到数据库...")
+                self.logger.info("正在保存到数据库...")
                db = DailyLogsDatabase()
                count = db.insert_many([log.to_dict() for log in logs])
-                db.close()
                self.log_message(f"已保存 {count} 条记录")
+                self.logger.info(f"已保存 {count} 条记录")
                
                # 刷新日报显示
                self.generate_today_report()
            
            self.set_status("完成")
+            self.logger.info("Debug 数据获取完成")
            
+        except LogParserError as e:
+            self.log_message(f"日志解析错误: {e}", is_error=True)
+            self.logger.error(f"日志解析错误: {e}")
+            self.set_status("错误")
+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
        except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
            self.set_status("错误")
    
    def generate_report(self):
@@ -339,21 +384,23 @@ class OrbitInGUI:
        
        if not date:
            self.log_message("错误: 请输入日期", is_error=True)
+            self.logger.error("未输入日期")
            return
        
        try:
            datetime.strptime(date, '%Y-%m-%d')
        except ValueError:
            self.log_message("错误: 日期格式无效，请使用 YYYY-MM-DD", is_error=True)
+            self.logger.error(f"日期格式无效: {date}")
            return
        
        self.set_status("正在生成日报...")
        self.log_message(f"生成 {date} 的日报...")
+        self.logger.info(f"生成 {date} 的日报...")
        
        try:
            g = DailyReportGenerator()
            report = g.generate_report(date)
-            g.close()
            
            # 在日报文本框中显示（可复制）
            self.report_text.delete("1.0", tk.END)
@@ -366,9 +413,19 @@ class OrbitInGUI:
            self.log_message("=" * 40)
            
            self.set_status("完成")
+            self.logger.info(f"日报生成完成: {date}")
            
+        except ReportGeneratorError as e:
+            self.log_message(f"日报生成错误: {e}", is_error=True)
+            self.logger.error(f"日报生成错误: {e}")
+            self.set_status("错误")
+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
        except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
            self.set_status("错误")
    
    def generate_today_report(self):
@@ -384,117 +441,143 @@ class OrbitInGUI:
        
        if not year_month or not teu:
            self.log_message("错误: 请输入月份和 TEU", is_error=True)
+            self.logger.error("未输入月份和 TEU")
            return
        
        try:
            teu = int(teu)
        except ValueError:
            self.log_message("错误: TEU 必须是数字", is_error=True)
+            self.logger.error(f"TEU 不是数字: {teu}")
            return
        
        self.set_status("正在添加...")
        self.log_message(f"添加 {year_month} 月未统计数据: {teu}TEU")
+        self.logger.info(f"添加 {year_month} 月未统计数据: {teu}TEU")
        
        try:
            db = DailyLogsDatabase()
            result = db.insert_unaccounted(year_month, teu, '')
-            db.close()
            
            if result:
                self.log_message("添加成功!")
+                self.logger.info(f"未统计数据添加成功: {year_month} {teu}TEU")
                # 刷新日报显示
                self.generate_today_report()
            else:
                self.log_message("添加失败!", is_error=True)
+                self.logger.error(f"未统计数据添加失败: {year_month} {teu}TEU")
            
            self.set_status("完成")
            
+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
        except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
            self.set_status("错误")
    
    def auto_fetch_data(self):
        """自动获取新数据（GUI启动时调用）"""
        self.set_status("正在自动获取新数据...")
        self.log_message("GUI启动，开始自动获取新数据...")
+        self.logger.info("GUI启动，开始自动获取新数据...")
        
        try:
            # 1. 检查飞书配置，如果配置完整则刷新排班信息
-            from dotenv import load_dotenv
-            load_dotenv()
-            
-            feishu_token = os.getenv('FEISHU_TOKEN')
-            feishu_spreadsheet_token = os.getenv('FEISHU_SPREADSHEET_TOKEN')
-            
-            if feishu_token and feishu_spreadsheet_token:
+            if config.FEISHU_TOKEN and config.FEISHU_SPREADSHEET_TOKEN:
                try:
                    self.log_message("正在刷新排班信息...")
-                    from src.feishu_v2 import FeishuScheduleManagerV2
-                    feishu_manager = FeishuScheduleManagerV2()
+                    self.logger.info("正在刷新排班信息...")
+                    feishu_manager = FeishuScheduleManager()
                    # 只刷新未来7天的排班，减少API调用
                    feishu_manager.refresh_all_schedules(days=7)
                    self.log_message("排班信息刷新完成")
-                except Exception as e:
+                    self.logger.info("排班信息刷新完成")
+                except FeishuClientError as e:
                    self.log_message(f"刷新排班信息时出错: {e}", is_error=True)
+                    self.logger.error(f"刷新排班信息时出错: {e}")
+                    self.log_message("将继续处理其他任务...")
+                except Exception as e:
+                    self.log_message(f"刷新排班信息时出现未知错误: {e}", is_error=True)
+                    self.logger.error(f"刷新排班信息时出现未知错误: {e}", exc_info=True)
                    self.log_message("将继续处理其他任务...")
            else:
                self.log_message("飞书配置不完整，跳过排班信息刷新")
+                self.logger.warning("飞书配置不完整，跳过排班信息刷新")
            
            # 2. 尝试获取最新的作业数据
            self.log_message("正在尝试获取最新作业数据...")
+            self.logger.info("正在尝试获取最新作业数据...")
            
-            base_url = os.getenv('CONFLUENCE_BASE_URL')
-            token = os.getenv('CONFLUENCE_TOKEN')
-            content_id = os.getenv('CONFLUENCE_CONTENT_ID')
-            
-            if base_url and token and content_id:
+            if config.CONFLUENCE_BASE_URL and config.CONFLUENCE_TOKEN and config.CONFLUENCE_CONTENT_ID:
                try:
                    # 获取 HTML
                    self.log_message("正在从 Confluence 获取 HTML...")
-                    from src.confluence import ConfluenceClient
-                    client = ConfluenceClient(base_url, token)
-                    html = client.get_html(content_id)
+                    self.logger.info("正在从 Confluence 获取 HTML...")
+                    client = ConfluenceClient(config.CONFLUENCE_BASE_URL, config.CONFLUENCE_TOKEN)
+                    html = client.get_html(config.CONFLUENCE_CONTENT_ID)
                    
                    if html:
                        self.log_message(f"获取成功，共 {len(html)} 字符")
+                        self.logger.info(f"获取成功，共 {len(html)} 字符")
                        
                        # 提取文本
                        self.log_message("正在提取布局文本...")
-                        from src.extractor import HTMLTextExtractor
+                        self.logger.info("正在提取布局文本...")
                        extractor = HTMLTextExtractor()
                        layout_text = extractor.extract(html)
                        
                        # 解析数据
                        self.log_message("正在解析日志数据...")
-                        from src.parser import HandoverLogParser
+                        self.logger.info("正在解析日志数据...")
                        parser = HandoverLogParser()
                        logs = parser.parse(layout_text)
                        
                        if logs:
                            # 保存到数据库
                            self.log_message("正在保存到数据库...")
+                            self.logger.info("正在保存到数据库...")
                            db = DailyLogsDatabase()
                            count = db.insert_many([log.to_dict() for log in logs])
-                            db.close()
                            self.log_message(f"已保存 {count} 条新记录")
+                            self.logger.info(f"已保存 {count} 条新记录")
                        else:
                            self.log_message("未解析到新记录")
+                            self.logger.warning("未解析到新记录")
                    else:
                        self.log_message("未获取到 HTML 内容，跳过数据获取")
-                except Exception as e:
+                        self.logger.warning("未获取到 HTML 内容，跳过数据获取")
+                except ConfluenceClientError as e:
                    self.log_message(f"获取作业数据时出错: {e}", is_error=True)
+                    self.logger.error(f"获取作业数据时出错: {e}")
+                except HTMLTextExtractorError as e:
+                    self.log_message(f"HTML 提取错误: {e}", is_error=True)
+                    self.logger.error(f"HTML 提取错误: {e}")
+                except LogParserError as e:
+                    self.log_message(f"日志解析错误: {e}", is_error=True)
+                    self.logger.error(f"日志解析错误: {e}")
+                except Exception as e:
+                    self.log_message(f"获取作业数据时出现未知错误: {e}", is_error=True)
+                    self.logger.error(f"获取作业数据时出现未知错误: {e}", exc_info=True)
            else:
                self.log_message("Confluence 配置不完整，跳过数据获取")
+                self.logger.warning("Confluence 配置不完整，跳过数据获取")
            
            # 3. 显示今日日报
            self.log_message("正在生成今日日报...")
+            self.logger.info("正在生成今日日报...")
            self.generate_today_report()
            
            self.set_status("就绪")
            self.log_message("自动获取完成，GUI已就绪")
+            self.logger.info("自动获取完成，GUI已就绪")
            
        except Exception as e:
            self.log_message(f"自动获取过程中出现错误: {e}", is_error=True)
+            self.logger.error(f"自动获取过程中出现错误: {e}", exc_info=True)
            self.log_message("将继续显示GUI界面...")
            self.set_status("就绪")
            # 即使出错也显示今日日报
@@ -505,6 +588,7 @@ class OrbitInGUI:
        self.set_status("正在统计...")
        self.log_message("数据库统计信息:")
        self.log_message("-" * 30)
+        self.logger.info("显示数据库统计信息...")
        
        try:
            db = DailyLogsDatabase()
@@ -514,8 +598,6 @@ class OrbitInGUI:
            current_month = datetime.now().strftime('%Y-%m')
            ships_monthly = db.get_ships_with_monthly_teu(current_month)
            
-            db.close()
-            
            self.log_message(f"总记录数: {stats['total']}")
            self.log_message(f"船次数量: {len(stats['ships'])}")
            self.log_message(f"日期范围: {stats['date_range']['start']} ~ {stats['date_range']['end']}")
@@ -532,9 +614,15 @@ class OrbitInGUI:
                self.log_message(f"  本月合计: {total_monthly_teu}TEU")
            
            self.set_status("完成")
+            self.logger.info(f"数据库统计完成: {stats['total']} 条记录, {len(stats['ships'])} 艘船")
            
+        except DatabaseConnectionError as e:
+            self.log_message(f"数据库连接错误: {e}", is_error=True)
+            self.logger.error(f"数据库连接错误: {e}")
+            self.set_status("错误")
        except Exception as e:
-            self.log_message(f"错误: {e}", is_error=True)
+            self.log_message(f"未知错误: {e}", is_error=True)
+            self.logger.error(f"未知错误: {e}", exc_info=True)
            self.set_status("错误")


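Review note: every GUI handler in this file now repeats the same three steps in each `except` branch (`log_message`, `logger.error`, `set_status("错误")`). The sketch below shows one way this could be consolidated with a context manager. It is not part of the commit: `report_errors` and the stand-in `DatabaseConnectionError` class are hypothetical names for illustration, and the real GUI methods would be passed in instead of the list-based stubs.

```python
import logging
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List, Tuple, Type

logger = logging.getLogger("gui_sketch")


class DatabaseConnectionError(Exception):
    """Stand-in for the project's exception of the same name (assumption)."""


@contextmanager
def report_errors(
    log_message: Callable[..., None],
    set_status: Callable[[str], None],
    labels: Dict[Type[Exception], str],
) -> Iterator[None]:
    """Send any exception to the GUI log, the file logger and the status bar."""
    try:
        yield
    except Exception as e:
        # The first matching known type wins; anything else counts as unknown.
        label = next(
            (text for exc_type, text in labels.items() if isinstance(e, exc_type)),
            None,
        )
        known = label is not None
        if not known:
            label = "未知错误"
        log_message(f"{label}: {e}", is_error=True)
        # Only unknown errors get a traceback, mirroring exc_info=True in the diff.
        logger.error(f"{label}: {e}", exc_info=not known)
        set_status("错误")


# Usage with list-based stubs in place of the real GUI methods.
messages: List[Tuple[str, bool]] = []
statuses: List[str] = []


def log_message(msg: str, is_error: bool = False) -> None:
    messages.append((msg, is_error))


known_errors = {DatabaseConnectionError: "数据库连接错误"}

with report_errors(log_message, statuses.append, known_errors):
    raise DatabaseConnectionError("locked")

with report_errors(log_message, statuses.append, known_errors):
    raise ValueError("bad input")
```

Like the handlers in the diff, the context manager swallows the exception after reporting it, so the GUI keeps running.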
--- a/src/logging_config.py
+++ b/src/logging_config.py
@@ -0,0 +1,144 @@
+#!/usr/bin/env python3
+"""
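+Unified logging configuration module.
+Provides one shared logging setup so that individual modules do not configure logging themselves.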
+"""
+import os
+import logging
+import sys
+from logging.handlers import RotatingFileHandler
+from typing import Optional
+
+from src.config import config
+
+
+def setup_logging(
+    log_file: Optional[str] = None,
+    console_level: int = logging.INFO,
+    file_level: int = logging.DEBUG,
+    max_bytes: int = 10 * 1024 * 1024,  # 10MB
+    backup_count: int = 5
+) -> logging.Logger:
+    """
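+    Configure the shared logging system.
+    
+    Args:
+        log_file: Path of the log file; if None, the default path is used
+        console_level: Log level for the console handler
+        file_level: Log level for the file handler
+        max_bytes: Maximum size of a single log file
+        backup_count: Number of rotated backup files to keep
+    
+    Returns:
+        The configured root logger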
+    """
+    # Create the log directory
+    if log_file is None:
+        log_dir = 'logs'
+        log_file = os.path.join(log_dir, 'app.log')
+    else:
+        log_dir = os.path.dirname(log_file)
+    
+    if log_dir:
+        os.makedirs(log_dir, exist_ok=True)
+    
+    # Get the root logger
+    logger = logging.getLogger()
+    logger.setLevel(logging.DEBUG)  # Root logger at the lowest level; handlers filter
+    
+    # Clear existing handlers to avoid adding duplicates
+    logger.handlers.clear()
+    
+    # Console handler
+    console_handler = logging.StreamHandler(sys.stdout)
+    console_handler.setLevel(console_level)
+    console_formatter = logging.Formatter(
+        '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
+        datefmt='%Y-%m-%d %H:%M:%S'
+    )
+    console_handler.setFormatter(console_formatter)
+    logger.addHandler(console_handler)
+    
+    # File handler (rotating)
+    file_handler = RotatingFileHandler(
+        log_file,
+        maxBytes=max_bytes,
+        backupCount=backup_count,
+        encoding='utf-8'
+    )
+    file_handler.setLevel(file_level)
+    file_formatter = logging.Formatter(
+        '%(asctime)s - %(name)s - %(levelname)s - %(filename)s:%(lineno)d - %(message)s',
+        datefmt='%Y-%m-%d %H:%M:%S'
+    )
+    file_handler.setFormatter(file_formatter)
+    logger.addHandler(file_handler)
+    
+    # Set log levels for third-party libraries
+    logging.getLogger('urllib3').setLevel(logging.WARNING)
+    logging.getLogger('requests').setLevel(logging.WARNING)
+    
+    logger.info(f"日志系统已初始化，日志文件: {log_file}")
+    logger.info(f"控制台日志级别: {logging.getLevelName(console_level)}")
+    logger.info(f"文件日志级别: {logging.getLevelName(file_level)}")
+    
+    return logger
+
+
+def get_logger(name: str) -> logging.Logger:
+    """
+    Get the logger with the given name.
+    
+    Args:
+        name: Logger name, usually __name__
+    
+    Returns:
+        The configured logger
+    """
+    return logging.getLogger(name)
+
+
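+# Initialize the logging system automatically
+if not logging.getLogger().handlers:
+    # Initialize only when the root logger has no handlers, to avoid repeated setup
+    setup_logging()
+
+
+# Convenience functions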
+def info(msg: str, *args, **kwargs):
+    """Log a message at INFO level"""
+    logging.info(msg, *args, **kwargs)
+
+
+def warning(msg: str, *args, **kwargs):
+    """Log a message at WARNING level"""
+    logging.warning(msg, *args, **kwargs)
+
+
+def error(msg: str, *args, **kwargs):
+    """Log a message at ERROR level"""
+    logging.error(msg, *args, **kwargs)
+
+
+def debug(msg: str, *args, **kwargs):
+    """Log a message at DEBUG level"""
+    logging.debug(msg, *args, **kwargs)
+
+
+def exception(msg: str, *args, **kwargs):
+    """Log a message at ERROR level with the current exception's traceback"""
+    logging.exception(msg, *args, **kwargs)
+
+
+if __name__ == '__main__':
+    # Exercise the logging configuration
+    logger = get_logger(__name__)
+    logger.info("测试INFO日志")
+    logger.warning("测试WARNING日志")
+    logger.error("测试ERROR日志")
+    logger.debug("测试DEBUG日志")
+    
+    try:
+        raise ValueError("测试异常")
+    except ValueError as e:
+        logger.exception("捕获到异常: %s", e)
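Review note: the new module relies on per-handler levels, with the logger itself at DEBUG so that the console handler shows INFO and above while the rotating file keeps everything. The sketch below reproduces that behaviour in isolation. It is an illustration only: `build_logger`, the logger name `logging_sketch` and the temporary directory are assumptions made for this example, and it uses a named logger with `propagate = False` instead of the root logger so that it does not interfere with other configuration.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def build_logger(name: str, log_file: str) -> logging.Logger:
    """Attach an INFO console handler and a DEBUG rotating file handler."""
    log = logging.getLogger(name)
    log.setLevel(logging.DEBUG)  # let the handlers do the filtering
    log.propagate = False        # keep the sketch isolated from the root logger
    log.handlers.clear()

    console = logging.StreamHandler()
    console.setLevel(logging.INFO)
    log.addHandler(console)

    file_handler = RotatingFileHandler(
        log_file,
        maxBytes=10 * 1024 * 1024,  # same 10MB limit as the diff
        backupCount=5,
        encoding="utf-8",
    )
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
    log.addHandler(file_handler)
    return log


tmp_dir = tempfile.mkdtemp()
log_path = os.path.join(tmp_dir, "app.log")

demo = build_logger("logging_sketch", log_path)
demo.debug("file only")         # below the console level, written to the file
demo.info("file and console")   # passes both handlers

for handler in demo.handlers:
    handler.flush()

with open(log_path, encoding="utf-8") as f:
    file_lines = f.read().splitlines()
```

One design point worth weighing: the module in the diff calls `setup_logging()` at import time, so merely importing it creates the `logs/` directory and the log file. An explicit call from the entry point would avoid that side effect.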
--- a/src/parser.py
+++ b/src/parser.py
@@ -1,192 +0,0 @@
-#!/usr/bin/env python3
-"""
-日志解析模块
-"""
-import re
-from typing import List, Dict, Optional
-from dataclasses import dataclass
-
-
-@dataclass
-class ShipLog:
-    """船次日志数据类"""
-    date: str
-    shift: str
-    ship_name: str
-    teu: Optional[int] = None
-    efficiency: Optional[float] = None
-    vehicles: Optional[int] = None
-    
-    def to_dict(self) -> Dict:
-        """转换为字典"""
-        return {
-            'date': self.date,
-            'shift': self.shift,
-            'ship_name': self.ship_name,
-            'teu': self.teu,
-            'efficiency': self.efficiency,
-            'vehicles': self.vehicles
-        }
-
-
-class HandoverLogParser:
-    """交接班日志解析器"""
-    
-    SEPARATOR = '———————————————————————————————————————————————'
-    
-    def __init__(self):
-        """初始化解析器"""
-        pass
-    
-    @staticmethod
-    def parse_date(date_str: str) -> str:
-        """解析日期字符串"""
-        try:
-            parts = date_str.split('.')
-            if len(parts) == 3:
-                return f"{parts[0]}-{parts[1]}-{parts[2]}"
-            return date_str
-        except Exception:
-            return date_str
-    
-    def parse(self, text: str) -> List[ShipLog]:
-        """
-        解析日志文本
-        
-        参数:
-            text: 日志文本
-            
-        返回:
-            船次日志列表（已合并同日期同班次同船名的记录）
-        """
-        logs = []
-        
-        # 预处理：移除单行分隔符（前后都是空行的分隔符）
-        # 保留真正的内容分隔符（前后有内容的）
-        lines = text.split('\n')
-        processed_lines = []
-        i = 0
-        while i < len(lines):
-            line = lines[i]
-            if line.strip() == self.SEPARATOR:
-                # 检查是否是单行分隔符（前后都是空行或分隔符）
-                prev_empty = i == 0 or not lines[i-1].strip() or lines[i-1].strip() == self.SEPARATOR
-                next_empty = i == len(lines) - 1 or not lines[i+1].strip() or lines[i+1].strip() == self.SEPARATOR
-                if prev_empty and next_empty:
-                    # 单行分隔符，跳过
-                    i += 1
-                    continue
-            processed_lines.append(line)
-            i += 1
-        
-        processed_text = '\n'.join(processed_lines)
-        blocks = processed_text.split(self.SEPARATOR)
-        
-        for block in blocks:
-            if not block.strip() or '日期：' not in block:
-                continue
-            
-            # 解析日期
-            date_match = re.search(r'日期：(\d{4}\.\d{2}\.\d{2})', block)
-            if not date_match:
-                continue
-            
-            date = self.parse_date(date_match.group(1))
-            self._parse_block(block, date, logs)
-        
-        # 合并同日期同班次同船名的记录（累加TEU）
-        merged = {}
-        for log in logs:
-            key = (log.date, log.shift, log.ship_name)
-            if key not in merged:
-                merged[key] = ShipLog(
-                    date=log.date,
-                    shift=log.shift,
-                    ship_name=log.ship_name,
-                    teu=log.teu,
-                    efficiency=log.efficiency,
-                    vehicles=log.vehicles
-                )
-            else:
-                # 累加TEU
-                if log.teu:
-                    if merged[key].teu is None:
-                        merged[key].teu = log.teu
-                    else:
-                        merged[key].teu += log.teu
-                # 累加车辆数
-                if log.vehicles:
-                    if merged[key].vehicles is None:
-                        merged[key].vehicles = log.vehicles
-                    else:
-                        merged[key].vehicles += log.vehicles
-        
-        return list(merged.values())
-    
-    def _parse_block(self, block: str, date: str, logs: List[ShipLog]):
-        """解析日期块"""
-        for shift in ['白班', '夜班']:
-            shift_pattern = f'{shift}：'
-            if shift_pattern not in block:
-                continue
-            
-            shift_start = block.find(shift_pattern) + len(shift_pattern)
-            
-            # 只找到下一个班次作为边界，不限制"注意事项："
-            next_pos = len(block)
-            for next_shift in ['白班', '夜班']:
-                if next_shift != shift:
-                    pos = block.find(f'{next_shift}：', shift_start)
-                    if pos != -1 and pos < next_pos:
-                        next_pos = pos
-            
-            shift_content = block[shift_start:next_pos]
-            self._parse_ships(shift_content, date, shift, logs)
-    
-    def _parse_ships(self, content: str, date: str, shift: str, logs: List[ShipLog]):
-        """解析船次"""
-        parts = content.split('实船作业：')
-        
-        for part in parts:
-            if not part.strip():
-                continue
-            
-            cleaned = part.replace('\xa0', ' ').strip()
-            # 匹配 "xxx# 船名" 格式（船号和船名分开）
-            ship_match = re.search(r'(\d+)#\s*(\S+)', cleaned)
-            
-            if not ship_match:
-                continue
-            
-            # 船名只取纯船名（去掉xx#前缀和二次靠泊等标注）
-            ship_name = ship_match.group(2)
-            # 移除二次靠泊等标注
-            ship_name = re.sub(r'（二次靠泊）|（再次靠泊）|\(二次靠泊\)|\(再次靠泊\)', '', ship_name).strip()
-            
-            vehicles_match = re.search(r'上场车辆数：(\d+)', cleaned)
-            teu_eff_match = re.search(
-                r'作业量/效率：(\d+)TEU[，,\s]*', cleaned
-            )
-            
-            log = ShipLog(
-                date=date,
-                shift=shift,
-                ship_name=ship_name,
-                teu=int(teu_eff_match.group(1)) if teu_eff_match else None,
-                efficiency=None,
-                vehicles=int(vehicles_match.group(1)) if vehicles_match else None
-            )
-            logs.append(log)
-
-
-if __name__ == '__main__':
-    # 测试
-    with open('layout_output.txt', 'r', encoding='utf-8') as f:
-        text = f.read()
-    
-    parser = HandoverLogParser()
-    logs = parser.parse(text)
-    
-    print(f'解析到 {len(logs)} 条记录')
-    for log in logs[:5]:
-        print(f'{log.date} {log.shift} {log.ship_name}: {log.teu}TEU')
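Review note: the deleted parser merged records that share the same date, shift and ship name, summing TEU and vehicle counts so that a second berthing of the same ship is counted once. According to the README changes, the parser now lives under `src/confluence/log_parser.py`; whether the merge step was carried over unchanged is not visible in this part of the diff. The sketch below restates that merge rule on its own, with a trimmed `ShipLog` and made-up ship names as assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ShipLog:
    """Trimmed copy of the removed data class (efficiency field omitted)."""
    date: str
    shift: str
    ship_name: str
    teu: Optional[int] = None
    vehicles: Optional[int] = None


def merge_logs(logs: List[ShipLog]) -> List[ShipLog]:
    """Merge records with the same (date, shift, ship_name), summing the counts."""
    merged: Dict[Tuple[str, str, str], ShipLog] = {}
    for log in logs:
        key = (log.date, log.shift, log.ship_name)
        if key not in merged:
            # Copy so the caller's objects are never mutated.
            merged[key] = ShipLog(log.date, log.shift, log.ship_name, log.teu, log.vehicles)
            continue
        target = merged[key]
        if log.teu:
            target.teu = (target.teu or 0) + log.teu
        if log.vehicles:
            target.vehicles = (target.vehicles or 0) + log.vehicles
    return list(merged.values())


# Hypothetical sample: DEMO-1 berths twice during the same day shift.
sample = [
    ShipLog("2025-12-29", "白班", "DEMO-1", teu=120, vehicles=8),
    ShipLog("2025-12-29", "白班", "DEMO-1", teu=80),
    ShipLog("2025-12-29", "夜班", "DEMO-1", teu=50, vehicles=4),
]
result = merge_logs(sample)
```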
--- a/src/report.py
+++ b/src/report.py
@@ -1,44 +1,70 @@
 #!/usr/bin/env python3
 """
 Daily report generation module
+Updated to depend on the new config and database modules
 """
 from datetime import datetime, timedelta
-from typing import Dict, List, Optional
-import sys
-import os
+from typing import Dict, List, Optional, Any
 import logging

-# 添加项目根目录到 Python 路径
-sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+from src.config import config
+from src.logging_config import get_logger
+from src.database.daily_logs import DailyLogsDatabase
+from src.feishu.manager import FeishuScheduleManager

-from src.database import DailyLogsDatabase
-from src.feishu_v2 import FeishuScheduleManagerV2 as FeishuScheduleManager
+logger = get_logger(__name__)

-logger = logging.getLogger(__name__)
+
+class ReportGeneratorError(Exception):
+    """Raised when a daily report cannot be generated"""
+    pass


 class DailyReportGenerator:
    """每日作业报告生成器"""
    
-    DAILY_TARGET = 300  # 每日目标作业量
+    def __init__(self, db_path: Optional[str] = None):
+        """
+        Initialize the daily report generator
        
-    def __init__(self, db_path: str = 'data/daily_logs.db'):
-        """初始化"""
+        Args:
+            db_path: Path of the database file; if None, the configured path is used
+        """
        self.db = DailyLogsDatabase(db_path)
+        logger.info("日报生成器初始化完成")
    
    def get_latest_date(self) -> str:
-        """获取数据库中最新的日期"""
+        """
+        Get the most recent date stored in the database
+        
+        Returns:
+            The latest date as a string in "YYYY-MM-DD" format
+        """
+        try:
            logs = self.db.query_all(limit=1)
            if logs:
                return logs[0]['date']
            return datetime.now().strftime('%Y-%m-%d')
            
-    def get_daily_data(self, date: str) -> Dict:
-        """获取指定日期的数据"""
+        except Exception as e:
+            logger.error(f"获取最新日期失败: {e}")
+            return datetime.now().strftime('%Y-%m-%d')
+    
+    def get_daily_data(self, date: str) -> Dict[str, Any]:
+        """
+        Get the data for the given date
+        
+        Args:
+            date: Date string in "YYYY-MM-DD" format
+        
+        Returns:
+            Dictionary with the daily data
+        """
+        try:
            logs = self.db.query_by_date(date)
            
            # 按船名汇总
-        ships = {}
+            ships: Dict[str, int] = {}
            for log in logs:
                ship = log['ship_name']
                if ship not in ships:
@@ -53,8 +79,26 @@ class DailyReportGenerator:
                'ship_count': len(ships)
            }
            
-    def get_monthly_stats(self, date: str) -> Dict:
-        """获取月度统计（截止到指定日期）"""
+        except Exception as e:
+            logger.error(f"获取每日数据失败: {date}, 错误: {e}")
+            return {
+                'date': date,
+                'ships': {},
+                'total_teu': 0,
+                'ship_count': 0
+            }
+    
+    def get_monthly_stats(self, date: str) -> Dict[str, Any]:
+        """
+        Get monthly statistics (up to and including the given date)
+        
+        Args:
+            date: Date string in "YYYY-MM-DD" format
+        
+        Returns:
+            Dictionary with the monthly statistics
+        """
+        try:
            year_month = date[:7]  # YYYY-MM
            target_date = datetime.strptime(date, '%Y-%m-%d').date()
            
@@ -68,7 +112,7 @@ class DailyReportGenerator:
            ]
            
            # 按日期汇总
-        daily_totals = {}
+            daily_totals: Dict[str, int] = {}
            for log in monthly_logs:
                d = log['date']
                if d not in daily_totals:
@@ -78,7 +122,7 @@ class DailyReportGenerator:
            
            # 计算当月天数（已过的天数）
            current_date = datetime.strptime(date, '%Y-%m-%d')
-        if current_date.day == 1:
+            if current_date.day == config.FIRST_DAY_OF_MONTH_SPECIAL:
                days_passed = 1
            else:
                days_passed = current_date.day
@@ -86,27 +130,52 @@ class DailyReportGenerator:
            # 获取未统计数据
            unaccounted = self.db.get_unaccounted(year_month)
            
-        planned = days_passed * self.DAILY_TARGET
+            planned = days_passed * config.DAILY_TARGET_TEU
            actual = sum(daily_totals.values()) + unaccounted
            
+            completion = round(actual / planned * 100, 2) if planned > 0 else 0
+            
            return {
                'year_month': year_month,
                'days_passed': days_passed,
                'planned': planned,
                'actual': actual,
                'unaccounted': unaccounted,
-            'completion': round(actual / planned * 100, 2) if planned > 0 else 0,
+                'completion': completion,
                'daily_totals': daily_totals
            }
            
-    def get_shift_personnel(self, date: str) -> Dict:
+        except Exception as e:
+            logger.error(f"获取月度统计失败: {date}, 错误: {e}")
+            return {
+                'year_month': date[:7],
+                'days_passed': 0,
+                'planned': 0,
+                'actual': 0,
+                'unaccounted': 0,
+                'completion': 0,
+                'daily_totals': {}
+            }
+    
+    def get_shift_personnel(self, date: str) -> Dict[str, str]:
        """
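         Get the shift personnel (read from the Feishu schedule sheet)
         
         Note: the report lists the personnel for the following day, so the schedule for date+1 is fetched
         Example: the report for 12/29 shows the personnel for 12/30
+        
+        Args:
+            date: Date string in "YYYY-MM-DD" format
+        
+        Returns:
+            Dictionary with the shift personnel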
        """
        try:
+            # 检查飞书配置
+            if not config.FEISHU_TOKEN or not config.FEISHU_SPREADSHEET_TOKEN:
+                logger.warning("飞书配置不完整，跳过排班信息获取")
+                return self._empty_personnel()
+            
            # 初始化飞书排班管理器
            manager = FeishuScheduleManager()
            
@@ -116,7 +185,7 @@ class DailyReportGenerator:
            
            logger.info(f"获取 {date} 日报的班次人员，对应排班表日期: {tomorrow}")
            
-            # 获取次日的排班信息（使用缓存）
+            # 获取次日的排班信息
            schedule = manager.get_schedule_for_date(tomorrow)
            
            # 如果从飞书获取到数据，使用飞书数据
@@ -124,55 +193,58 @@ class DailyReportGenerator:
                return {
                    'day_shift': schedule.get('day_shift', ''),
                    'night_shift': schedule.get('night_shift', ''),
-                    'duty_phone': '13107662315'
+                    'duty_phone': config.DUTY_PHONE
                }
            
            # 如果飞书数据为空，返回空值
            logger.warning(f"无法从飞书获取 {tomorrow} 的排班信息")
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'duty_phone': '13107662315'
-            }
+            return self._empty_personnel()
            
        except Exception as e:
            logger.error(f"获取排班信息失败: {e}")
            # 降级处理：返回空值
-            return {
-                'day_shift': '',
-                'night_shift': '',
-                'duty_phone': '13107662315'
-            }
+            return self._empty_personnel()
    
-    def generate_report(self, date: str = None) -> str:
-        """生成日报"""
+    def generate_report(self, date: Optional[str] = None) -> str:
+        """
+        Generate the daily report
+        
+        Args:
+            date: Date string in "YYYY-MM-DD" format; if None, the latest date is used
+        
+        Returns:
+            The report text
+        
+        Raises:
+            ReportGeneratorError: if generation fails
+        """
+        try:
            if not date:
                date = self.get_latest_date()
            
-        # 转换日期格式 2025-12-28 -> 12/28，同时确保查询格式正确
+            # 验证日期格式
            try:
-            # 尝试解析各种日期格式
                parsed = datetime.strptime(date, '%Y-%m-%d')
                display_date = parsed.strftime('%m/%d')
-            query_date = parsed.strftime('%Y-%m-%d')  # 标准化为双数字格式
-        except ValueError:
-            # 如果已经是标准格式，直接使用
-            display_date = datetime.strptime(date, '%Y-%m-%d').strftime('%m/%d')
-            query_date = date
+                query_date = parsed.strftime('%Y-%m-%d')
+            except ValueError as e:
+                error_msg = f"日期格式无效: {date}, 错误: {e}"
+                logger.error(error_msg)
+                raise ReportGeneratorError(error_msg) from e
            
+            # 获取数据
            daily_data = self.get_daily_data(query_date)
            monthly_data = self.get_monthly_stats(query_date)
            personnel = self.get_shift_personnel(query_date)
            
-        # 月度统计
-        month_display = date[5:7] + '/' + date[:4]  # MM/YYYY
-        
-        lines = []
+            # 生成日报
+            lines: List[str] = []
            lines.append(f"日期：{display_date}")
            lines.append("")
            
-        # 船次信息（紧凑格式，不留空行）
-        ship_lines = []
+            # 船次信息
+            if daily_data['ships']:
+                ship_lines: List[str] = []
                for ship, teu in sorted(daily_data['ships'].items(), key=lambda x: -x[1]):
                    ship_lines.append(f"船名：{ship}")
                    ship_lines.append(f"作业量：{teu}TEU")
@@ -183,12 +255,12 @@ class DailyReportGenerator:
            lines.append(f"当日实际作业量：{daily_data['total_teu']}TEU")
            
            # 月度统计
-        lines.append(f"当月计划作业量：{monthly_data['planned']}TEU (用天数*{self.DAILY_TARGET}TEU)")
+            lines.append(f"当月计划作业量：{monthly_data['planned']}TEU (用天数*{config.DAILY_TARGET_TEU}TEU)")
            lines.append(f"当月实际作业量：{monthly_data['actual']}TEU")
            lines.append(f"当月完成比例：{monthly_data['completion']}%")
            lines.append("")
            
-        # 人员信息（需要配合 Confluence 日志中的班次人员信息）
+            # 人员信息
            day_personnel = personnel['day_shift']
            night_personnel = personnel['night_shift']
            duty_phone = personnel['duty_phone']
@@ -199,20 +271,111 @@ class DailyReportGenerator:
            lines.append(f"{next_day} 夜班人员：{night_personnel}")
            lines.append(f"24小时值班手机：{duty_phone}")
            
-        return "\n".join(lines)
+            report = "\n".join(lines)
+            logger.info(f"日报生成完成: {date}")
+            return report
            
-    def print_report(self, date: str = None):
-        """打印日报"""
+        except ReportGeneratorError:
+            raise
+        except Exception as e:
+            error_msg = f"生成日报失败: {e}"
+            logger.error(error_msg)
+            raise ReportGeneratorError(error_msg) from e
+    
+    def print_report(self, date: Optional[str] = None) -> str:
+        """
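+        Print the daily report
+        
+        Args:
+            date: Date string in "YYYY-MM-DD" format; if None, the latest date is used
+        
+        Returns:
+            The report text (an empty string if generation fails)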
+        """
+        try:
            report = self.generate_report(date)
            print(report)
            return report
            
+        except ReportGeneratorError as e:
+            print(f"生成日报失败: {e}")
+            return ""
+    
+    def save_report_to_file(self, date: Optional[str] = None, filepath: Optional[str] = None) -> bool:
+        """
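+        Save the daily report to a file
+        
+        Args:
+            date: Date string; if None, the latest date is used
+            filepath: File path; if None, defaults to reports/daily_report_<date>.txt
+        
+        Returns:
+            True on success, False otherwise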
+        """
+        try:
+            report = self.generate_report(date)
+            
+            if filepath is None:
+                # Use the default path
+                import os
+                report_dir = "reports"
+                os.makedirs(report_dir, exist_ok=True)
+                
+                if date is None:
+                    date = self.get_latest_date()
+                filename = f"daily_report_{date}.txt"
+                filepath = os.path.join(report_dir, filename)
+            
+            with open(filepath, 'w', encoding='utf-8') as f:
+                f.write(report)
+            
+            logger.info(f"日报已保存到文件: {filepath}")
+            return True
+            
+        except Exception as e:
+            logger.error(f"保存日报到文件失败: {e}")
+            return False
+    
+    def _empty_personnel(self) -> Dict[str, str]:
+        """Return personnel info with empty shift rosters and the configured duty phone."""
+        return {
+            'day_shift': '',
+            'night_shift': '',
+            'duty_phone': config.DUTY_PHONE
+        }
+    
    def close(self):
        """Close the database connection."""
        self.db.close()


 if __name__ == '__main__':
+    # Test code
+    import sys
+    
+    # Configure logging
+    logging.basicConfig(level=logging.INFO)
+    
    generator = DailyReportGenerator()
-    generator.print_report()
+    
+    try:
+        # Test fetching the latest date
+        latest_date = generator.get_latest_date()
+        print(f"最新日期: {latest_date}")
+        
+        # Test generating the daily report
+        report = generator.generate_report(latest_date)
+        print(f"\n日报内容:\n{report}")
+        
+        # Test saving to a file
+        success = generator.save_report_to_file(latest_date)
+        print(f"\n保存到文件: {'成功' if success else '失败'}")
+        
+    except ReportGeneratorError as e:
+        print(f"日报生成错误: {e}")
+        sys.exit(1)
+    except Exception as e:
+        print(f"未知错误: {e}")
+        sys.exit(1)
+    finally:
        generator.close()
--- a/src/schedule_database.py
+++ b/src/schedule_database.py
@@ -1,323 +0,0 @@
-#!/usr/bin/env python3
-"""
-Shift schedule personnel database module
-"""
-import sqlite3
-import os
-import json
-from datetime import datetime
-from typing import List, Dict, Optional, Tuple
-import hashlib
-
-
-class ScheduleDatabase:
-    """Shift schedule personnel database."""
-    
-    def __init__(self, db_path: str = 'data/daily_logs.db'):
-        """
-        Initialize the database.
-        
-        Args:
-            db_path: Path to the database file.
-        """
-        self.db_path = db_path
-        self._ensure_directory()
-        self.conn = self._connect()
-        self._init_schema()
-    
-    def _ensure_directory(self):
-        """Ensure the data directory exists."""
-        data_dir = os.path.dirname(self.db_path)
-        if data_dir and not os.path.exists(data_dir):
-            os.makedirs(data_dir)
-    
-    def _connect(self) -> sqlite3.Connection:
-        """Connect to the database."""
-        conn = sqlite3.connect(self.db_path)
-        conn.row_factory = sqlite3.Row
-        return conn
-    
-    def _init_schema(self):
-        """Initialize the table schema."""
-        cursor = self.conn.cursor()
-        
-        # Create the schedule personnel table
-        cursor.execute('''
-            CREATE TABLE IF NOT EXISTS schedule_personnel (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                date TEXT NOT NULL,
-                day_shift TEXT,
-                night_shift TEXT,
-                day_shift_list TEXT,  -- JSON array
-                night_shift_list TEXT, -- JSON array
-                sheet_id TEXT,
-                sheet_title TEXT,
-                data_hash TEXT,        -- data hash, used to detect updates
-                created_at TEXT DEFAULT CURRENT_TIMESTAMP,
-                updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
-                UNIQUE(date)
-            )
-        ''')
-        
-        # Create the sheet versions table (used to detect whether a sheet has been updated)
-        cursor.execute('''
-            CREATE TABLE IF NOT EXISTS sheet_versions (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                sheet_id TEXT NOT NULL,
-                sheet_title TEXT NOT NULL,
-                revision INTEGER NOT NULL,
-                data_hash TEXT,
-                last_checked_at TEXT DEFAULT CURRENT_TIMESTAMP,
-                UNIQUE(sheet_id)
-            )
-        ''')
-        
-        # Indexes
-        cursor.execute('CREATE INDEX IF NOT EXISTS idx_schedule_date ON schedule_personnel(date)')
-        cursor.execute('CREATE INDEX IF NOT EXISTS idx_schedule_sheet ON schedule_personnel(sheet_id)')
-        cursor.execute('CREATE INDEX IF NOT EXISTS idx_sheet_versions ON sheet_versions(sheet_id)')
-        
-        self.conn.commit()
-    
-    def _calculate_hash(self, data: Dict) -> str:
-        """Compute a hash of the data."""
-        data_str = json.dumps(data, sort_keys=True, ensure_ascii=False)
-        return hashlib.md5(data_str.encode('utf-8')).hexdigest()
-    
-    def check_sheet_update(self, sheet_id: str, sheet_title: str, revision: int, data: Dict) -> bool:
-        """
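-        Check whether the sheet has been updated.
-        
-        Args:
-            sheet_id: Sheet ID
-            sheet_title: Sheet title
-            revision: Sheet revision number
-            data: Sheet data
-        
-        Returns:
-            True: updated; the data must be fetched again
-            False: unchanged; the cache can be used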
-        """
-        cursor = self.conn.cursor()
-        
-        # Look up the current version
-        cursor.execute(
-            'SELECT revision, data_hash FROM sheet_versions WHERE sheet_id = ?',
-            (sheet_id,)
-        )
-        result = cursor.fetchone()
-        
-        if not result:
-            # First fetch: record the version
-            data_hash = self._calculate_hash(data)
-            cursor.execute('''
-                INSERT INTO sheet_versions (sheet_id, sheet_title, revision, data_hash, last_checked_at)
-                VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP)
-            ''', (sheet_id, sheet_title, revision, data_hash))
-            self.conn.commit()
-            return True
-        
-        # Check whether the revision number or the data has changed
-        old_revision = result['revision']
-        old_hash = result['data_hash']
-        new_hash = self._calculate_hash(data)
-        
-        if old_revision != revision or old_hash != new_hash:
-            # Updated: refresh the version info
-            cursor.execute('''
-                UPDATE sheet_versions 
-                SET revision = ?, data_hash = ?, last_checked_at = CURRENT_TIMESTAMP
-                WHERE sheet_id = ?
-            ''', (revision, new_hash, sheet_id))
-            self.conn.commit()
-            return True
-        
-        # No update: refresh the last-checked time
-        cursor.execute('''
-            UPDATE sheet_versions 
-            SET last_checked_at = CURRENT_TIMESTAMP
-            WHERE sheet_id = ?
-        ''', (sheet_id,))
-        self.conn.commit()
-        return False
-    
-    def save_schedule(self, date: str, schedule_data: Dict, sheet_id: str = None, sheet_title: str = None) -> bool:
-        """
-        Save schedule info to the database.
-        
-        Args:
-            date: Date (YYYY-MM-DD)
-            schedule_data: Schedule data
-            sheet_id: Sheet ID
-            sheet_title: Sheet title
-        
-        Returns:
-            True on success, False otherwise
-        """
-        try:
-            cursor = self.conn.cursor()
-            
-            # Prepare the data
-            day_shift = schedule_data.get('day_shift', '')
-            night_shift = schedule_data.get('night_shift', '')
-            day_shift_list = json.dumps(schedule_data.get('day_shift_list', []), ensure_ascii=False)
-            night_shift_list = json.dumps(schedule_data.get('night_shift_list', []), ensure_ascii=False)
-            data_hash = self._calculate_hash(schedule_data)
-            
-            # Use INSERT OR REPLACE to update an existing record
-            cursor.execute('''
-                INSERT OR REPLACE INTO schedule_personnel 
-                (date, day_shift, night_shift, day_shift_list, night_shift_list, 
-                 sheet_id, sheet_title, data_hash, updated_at)
-                VALUES (?, ?, ?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
-            ''', (
-                date, day_shift, night_shift, day_shift_list, night_shift_list,
-                sheet_id, sheet_title, data_hash
-            ))
-            
-            self.conn.commit()
-            return True
-            
-        except sqlite3.Error as e:
-            print(f"数据库错误: {e}")
-            return False
-    
-    def get_schedule(self, date: str) -> Optional[Dict]:
-        """
-        Get the schedule info for the given date.
-        
-        Args:
-            date: Date (YYYY-MM-DD)
-        
-        Returns:
-            Schedule info dict, or None if not found
-        """
-        cursor = self.conn.cursor()
-        cursor.execute(
-            'SELECT * FROM schedule_personnel WHERE date = ?',
-            (date,)
-        )
-        result = cursor.fetchone()
-        
-        if not result:
-            return None
-        
-        # Parse the JSON arrays
-        day_shift_list = json.loads(result['day_shift_list']) if result['day_shift_list'] else []
-        night_shift_list = json.loads(result['night_shift_list']) if result['night_shift_list'] else []
-        
-        return {
-            'date': result['date'],
-            'day_shift': result['day_shift'],
-            'night_shift': result['night_shift'],
-            'day_shift_list': day_shift_list,
-            'night_shift_list': night_shift_list,
-            'sheet_id': result['sheet_id'],
-            'sheet_title': result['sheet_title'],
-            'updated_at': result['updated_at']
-        }
-    
-    def get_schedule_by_range(self, start_date: str, end_date: str) -> List[Dict]:
-        """
-        Get the schedule info within a date range.
-        
-        Args:
-            start_date: Start date (YYYY-MM-DD)
-            end_date: End date (YYYY-MM-DD)
-        
-        Returns:
-            List of schedule info dicts
-        """
-        cursor = self.conn.cursor()
-        cursor.execute('''
-            SELECT * FROM schedule_personnel 
-            WHERE date >= ? AND date <= ?
-            ORDER BY date
-        ''', (start_date, end_date))
-        
-        results = []
-        for row in cursor.fetchall():
-            day_shift_list = json.loads(row['day_shift_list']) if row['day_shift_list'] else []
-            night_shift_list = json.loads(row['night_shift_list']) if row['night_shift_list'] else []
-            
-            results.append({
-                'date': row['date'],
-                'day_shift': row['day_shift'],
-                'night_shift': row['night_shift'],
-                'day_shift_list': day_shift_list,
-                'night_shift_list': night_shift_list,
-                'sheet_id': row['sheet_id'],
-                'sheet_title': row['sheet_title'],
-                'updated_at': row['updated_at']
-            })
-        
-        return results
-    
-    def delete_old_schedules(self, before_date: str) -> int:
-        """
-        Delete schedule records before the given date.
-        
-        Args:
-            before_date: Date (YYYY-MM-DD)
-        
-        Returns:
-            Number of records deleted
-        """
-        cursor = self.conn.cursor()
-        cursor.execute(
-            'DELETE FROM schedule_personnel WHERE date < ?',
-            (before_date,)
-        )
-        deleted_count = cursor.rowcount
-        self.conn.commit()
-        return deleted_count
-    
-    def get_stats(self) -> Dict:
-        """Get statistics."""
-        cursor = self.conn.cursor()
-        
-        cursor.execute('SELECT COUNT(*) FROM schedule_personnel')
-        total = cursor.fetchone()[0]
-        
-        cursor.execute('SELECT MIN(date), MAX(date) FROM schedule_personnel')
-        date_range = cursor.fetchone()
-        
-        cursor.execute('SELECT COUNT(DISTINCT sheet_id) FROM schedule_personnel')
-        sheet_count = cursor.fetchone()[0]
-        
-        return {
-            'total': total,
-            'date_range': {'start': date_range[0], 'end': date_range[1]},
-            'sheet_count': sheet_count
-        }
-    
-    def close(self):
-        """Close the connection."""
-        if self.conn:
-            self.conn.close()
-
-
-if __name__ == '__main__':
-    # Test code
-    db = ScheduleDatabase()
-    
-    # Test saving
-    test_schedule = {
-        'day_shift': '张勤、杨俊豪',
-        'night_shift': '刘炜彬、梁启迟',
-        'day_shift_list': ['张勤', '杨俊豪'],
-        'night_shift_list': ['刘炜彬', '梁启迟']
-    }
-    
-    success = db.save_schedule('2025-12-31', test_schedule, 'zcYLIk', '12月')
-    print(f"保存成功: {success}")
-    
-    # Test fetching
-    schedule = db.get_schedule('2025-12-31')
-    print(f"获取结果: {schedule}")
-    
-    # Test statistics
-    stats = db.get_stats()
-    print(f"统计: {stats}")
-    
-    db.close()