docs(docs): renumber the database design and detailed design directories
651
docs/3.数据库设计/1.自动汇总.md
Normal file
@@ -0,0 +1,651 @@
# Database Design: Automatic Summary of Folder File Listings and Page Counts

## Document Information

| Item | Value |
| --- | --- |
| Requirements analysis document | docs/1.需求分析/1.自动汇总.md |
| Functional design document | docs/2.功能设计/1.自动汇总.md |
| Detailed design document | docs/3.详细设计/1.自动汇总.md |
| Database type | SQLite / Django ORM |
| Table name prefix | ra_ |
| Design date | 2026-06-05 |
| Design version | V1.0 |
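
---

## 1. Design Principles

| Principle | Description |
| --- | --- |
| ORM first | The project uses Django; the Django models and migrations are the source of truth for the actual schema |
| SQLite compatible | Field types, indexes, and constraints must, above all, run on SQLite |
| Short table prefix | Tables related to the review agent's file summary use the `ra_` prefix |
| No enum tables | Status enums use Django `TextChoices`; the database stores the string values |
| Conversation isolation | Every attachment, batch, and exported file can be traced back to a Conversation and a User |
| Multi-version attachments | The same file name may be uploaded several times in one conversation; uploads are distinguished by version number |
| Batch pinning | Each summary batch binds the attachment versions it uses through a join table, so files from different runs cannot be mixed up |
| Event trail | WorkflowEvent records are kept for SSE reconnection and resume, page-refresh recovery, and troubleshooting |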

---

## 2. ER Diagram

```mermaid
erDiagram
    AUTH_USER ||--o{ CONVERSATION : owns
    CONVERSATION ||--o{ MESSAGE : contains
    CONVERSATION ||--o{ RA_FILE_ATTACHMENT : has
    CONVERSATION ||--o{ RA_FILE_SUMMARY_BATCH : has
    AUTH_USER ||--o{ RA_FILE_ATTACHMENT : uploads
    AUTH_USER ||--o{ RA_FILE_SUMMARY_BATCH : runs
    MESSAGE ||--o{ RA_FILE_SUMMARY_BATCH : triggers
    RA_FILE_SUMMARY_BATCH ||--o{ RA_FILE_SUMMARY_BATCH_ATTACHMENT : binds
    RA_FILE_ATTACHMENT ||--o{ RA_FILE_SUMMARY_BATCH_ATTACHMENT : selected
    RA_FILE_SUMMARY_BATCH ||--o{ RA_FILE_SUMMARY_ITEM : produces
    RA_FILE_SUMMARY_BATCH ||--o{ RA_WORKFLOW_NODE_RUN : tracks
    RA_FILE_SUMMARY_BATCH ||--o{ RA_WORKFLOW_EVENT : emits
    RA_FILE_SUMMARY_BATCH ||--o{ RA_EXPORTED_SUMMARY_FILE : exports
```

---

## 3. Table Design

### 3.1 ra_file_attachment

Records of attachments that the user uploads in the upload area on the right side of the conversation. A file is stored as soon as it is uploaded; uploading does not by itself start a workflow.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| conversation_id | ForeignKey | bigint | Yes | Owning conversation |
| user_id | ForeignKey | bigint | Yes | Uploading user |
| original_name | CharField(255) | varchar(255) | Yes | Original file name |
| version_no | PositiveIntegerField | integer | Yes | Version number among files with the same name in one conversation; starts at 1 and increments |
| is_active | BooleanField | bool | Yes | Whether this is the current default version |
| storage_path | CharField(500) | varchar(500) | Yes | File storage path |
| file_size | BigIntegerField | bigint | Yes | File size |
| content_type | CharField(120) | varchar(120) | No | MIME type |
| upload_status | CharField(20) | varchar(20) | Yes | uploaded, bound, deleted |
| created_at | DateTimeField | datetime | Yes | Upload time |

Unique constraints:

| Constraint | Fields |
| --- | --- |
| uq_ra_attachment_conv_name_version | conversation_id, original_name, version_no |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_attachment_conv_created | conversation_id, created_at | List the attachments of a conversation |
| idx_ra_attachment_user_created | user_id, created_at | List a user's uploads |
| idx_ra_attachment_active | conversation_id, original_name, is_active | Find the current default version |

---

### 3.2 ra_file_summary_batch

One run (batch) of the file listing and page count summary workflow.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| conversation_id | ForeignKey | bigint | Yes | Owning conversation |
| user_id | ForeignKey | bigint | Yes | User who ran the batch |
| trigger_message_id | ForeignKey | bigint | No | User message that triggered the workflow |
| batch_no | CharField(64) | varchar(64) | Yes | Batch number, unique |
| product_name | CharField(200) | varchar(200) | No | Recognized product name |
| status | CharField(20) | varchar(20) | Yes | pending, running, success, failed |
| total_files | IntegerField | integer | Yes | Total number of files |
| supported_files | IntegerField | integer | Yes | Number of files whose type supports page counting |
| success_files | IntegerField | integer | Yes | Number of files counted successfully |
| failed_files | IntegerField | integer | Yes | Number of files whose counting failed |
| unsupported_files | IntegerField | integer | Yes | Number of unsupported files |
| uncertain_files | IntegerField | integer | Yes | Number of files whose page count cannot be determined |
| total_pages | IntegerField | integer | Yes | Total page count |
| work_dir | CharField(500) | varchar(500) | No | Batch working directory |
| error_message | TextField | text | No | Batch-level error description |
| created_at | DateTimeField | datetime | Yes | Creation time |
| started_at | DateTimeField | datetime | No | Start time |
| finished_at | DateTimeField | datetime | No | Finish time |

Unique constraints:

| Constraint | Fields |
| --- | --- |
| uq_ra_batch_no | batch_no |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_batch_conv_created | conversation_id, created_at | List the batches of a conversation |
| idx_ra_batch_user_created | user_id, created_at | List a user's batches |
| idx_ra_batch_status | status, created_at | Find running or failed batches |

---

### 3.3 ra_file_summary_batch_attachment

Join table that binds a batch to attachment versions. Because the same file name can be uploaded several times in one conversation, each batch must pin the attachment versions it uses.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| batch_id | ForeignKey | bigint | Yes | Summary batch |
| attachment_id | ForeignKey | bigint | Yes | Attachment version used by this batch |
| source_role | CharField(20) | varchar(20) | Yes | archive, multi_file |
| created_at | DateTimeField | datetime | Yes | Binding time |

Unique constraints:

| Constraint | Fields |
| --- | --- |
| uq_ra_batch_attachment | batch_id, attachment_id |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_batch_attachment_batch | batch_id, created_at | List the attachments of a batch |
| idx_ra_batch_attachment_attachment | attachment_id | Find the batches that used an attachment |

---

### 3.4 ra_file_summary_item

File detail table. It records every scanned file and its page count result.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| batch_id | ForeignKey | bigint | Yes | Owning batch |
| file_index | PositiveIntegerField | integer | Yes | File sequence number |
| directory_level | CharField(300) | varchar(300) | No | Directory level |
| file_name | CharField(255) | varchar(255) | Yes | File name |
| file_type | CharField(20) | varchar(20) | Yes | File type |
| relative_path | CharField(500) | varchar(500) | Yes | Relative path, used for display and export |
| storage_path | CharField(500) | varchar(500) | Yes | Path actually used for processing |
| page_count | IntegerField | integer | No | Page count; empty when counting failed or the count cannot be determined |
| statistics_status | CharField(20) | varchar(20) | Yes | success, failed, unsupported, uncertain, skipped |
| retry_count | PositiveIntegerField | integer | Yes | Number of page count retries |
| error_message | TextField | text | No | Error description |
| created_at | DateTimeField | datetime | Yes | Creation time |
| updated_at | DateTimeField | datetime | Yes | Update time |

Unique constraints:

| Constraint | Fields |
| --- | --- |
| uq_ra_item_batch_relative_path | batch_id, relative_path |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_item_batch_index | batch_id, file_index | Show file details in order |
| idx_ra_item_batch_status | batch_id, statistics_status | Find failed or uncertain files |
| idx_ra_item_batch_type | batch_id, file_type | Statistics by file type |
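
The per-status counters stored on `ra_file_summary_batch` can plausibly be derived from these item rows with one grouped query. The sketch below uses the standard-library `sqlite3` module rather than the Django ORM, and it assumes that `total_files` counts every item row; the design does not say how `supported_files` is derived, so it is left out. The sample paths are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ra_file_summary_item (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        batch_id BIGINT NOT NULL,
        relative_path VARCHAR(500) NOT NULL,
        page_count INTEGER NULL,
        statistics_status VARCHAR(20) NOT NULL DEFAULT 'skipped',
        UNIQUE (batch_id, relative_path)
    )
    """
)
# Hypothetical sample data: four files in batch 1, one file in batch 2.
conn.executemany(
    "INSERT INTO ra_file_summary_item "
    "(batch_id, relative_path, page_count, statistics_status) VALUES (?, ?, ?, ?)",
    [
        (1, "a/spec.pdf", 12, "success"),
        (1, "a/report.docx", 30, "success"),
        (1, "a/broken.pdf", None, "failed"),
        (1, "b/photo.png", None, "unsupported"),
        (2, "other.pdf", 5, "success"),
    ],
)


def batch_counters(conn, batch_id):
    """Aggregate item rows of one batch into batch-level counters."""
    counters = {
        "total_files": 0,
        "total_pages": 0,
        "success_files": 0,
        "failed_files": 0,
        "unsupported_files": 0,
        "uncertain_files": 0,
    }
    rows = conn.execute(
        "SELECT statistics_status, COUNT(*), COALESCE(SUM(page_count), 0) "
        "FROM ra_file_summary_item WHERE batch_id = ? GROUP BY statistics_status",
        (batch_id,),
    )
    for status, file_count, pages in rows:
        counters["total_files"] += file_count  # assumption: every row counts
        counters["total_pages"] += pages
        key = f"{status}_files"
        if key in counters:  # 'skipped' has no dedicated counter column
            counters[key] += file_count
    return counters


counters = batch_counters(conn, 1)
```

The `WHERE batch_id = ? GROUP BY statistics_status` shape matches the `idx_ra_item_batch_status` index listed above.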

---

### 3.5 ra_workflow_node_run

Run state of each workflow node, used to restore the state of the workflow cards on the right side.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| batch_id | ForeignKey | bigint | Yes | Owning batch |
| node_code | CharField(40) | varchar(40) | Yes | Node code |
| node_name | CharField(80) | varchar(80) | Yes | Node name |
| status | CharField(20) | varchar(20) | Yes | pending, running, retrying, success, failed, skipped |
| progress | PositiveIntegerField | integer | Yes | Progress percentage, 0-100 |
| message | TextField | text | No | Node hint message |
| started_at | DateTimeField | datetime | No | Start time |
| finished_at | DateTimeField | datetime | No | Finish time |

Unique constraints:

| Constraint | Fields |
| --- | --- |
| uq_ra_node_batch_code | batch_id, node_code |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_node_batch_status | batch_id, status | Query the node states of a batch |
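
Because `(batch_id, node_code)` is unique, each node has exactly one row per batch, and progress updates overwrite that row instead of appending new ones. A minimal sketch of that write path with the standard-library `sqlite3` module follows; the helper name `save_node_state` and the node code `scan_files` are hypothetical, and a Django implementation would more likely use `update_or_create`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ra_workflow_node_run (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        batch_id BIGINT NOT NULL,
        node_code VARCHAR(40) NOT NULL,
        node_name VARCHAR(80) NOT NULL,
        status VARCHAR(20) NOT NULL DEFAULT 'pending',
        progress INTEGER NOT NULL DEFAULT 0,
        UNIQUE (batch_id, node_code)
    )
    """
)


def save_node_state(conn, batch_id, node_code, node_name, status, progress):
    """Create the node row on first use, then overwrite its state."""
    progress = max(0, min(100, progress))  # keep the percentage within 0-100
    with conn:  # both statements commit together
        conn.execute(
            "INSERT OR IGNORE INTO ra_workflow_node_run "
            "(batch_id, node_code, node_name) VALUES (?, ?, ?)",
            (batch_id, node_code, node_name),
        )
        conn.execute(
            "UPDATE ra_workflow_node_run SET status = ?, progress = ? "
            "WHERE batch_id = ? AND node_code = ?",
            (status, progress, batch_id, node_code),
        )


save_node_state(conn, 1, "scan_files", "Scan files", "running", 40)
save_node_state(conn, 1, "scan_files", "Scan files", "success", 130)
node_rows = conn.execute(
    "SELECT node_code, status, progress FROM ra_workflow_node_run WHERE batch_id = 1"
).fetchall()
```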

---

### 3.6 ra_workflow_event

Workflow event table, used to persist SSE events, resume after a dropped connection, and debug.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key; also usable as the event_id |
| batch_id | ForeignKey | bigint | Yes | Owning batch |
| event_type | CharField(40) | varchar(40) | Yes | workflow_started, node_progress, etc. |
| payload | JSONField | text/json | Yes | Event payload |
| created_at | DateTimeField | datetime | Yes | Event time |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_event_batch_id | batch_id, id | SSE resume after a given event id |
| idx_ra_event_batch_created | batch_id, created_at | Query events by time |
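
Resuming an SSE stream then comes down to replaying every event of the batch whose `id` is greater than the last id the client saw (for example, the value a browser sends in the `Last-Event-ID` header). The following is a sketch under that assumption, using the standard-library `sqlite3` module; `append_event` and `events_after` are hypothetical helper names.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ra_workflow_event (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        batch_id BIGINT NOT NULL,
        event_type VARCHAR(40) NOT NULL,
        payload TEXT NOT NULL DEFAULT '{}',
        created_at DATETIME NOT NULL
    )
    """
)
conn.execute("CREATE INDEX idx_ra_event_batch_id ON ra_workflow_event (batch_id, id)")


def append_event(conn, batch_id, event_type, payload):
    """Persist one event and return its id, which doubles as the SSE event id."""
    with conn:
        cursor = conn.execute(
            "INSERT INTO ra_workflow_event (batch_id, event_type, payload, created_at) "
            "VALUES (?, ?, ?, datetime('now'))",
            (batch_id, event_type, json.dumps(payload)),
        )
    return cursor.lastrowid


def events_after(conn, batch_id, last_event_id=0):
    """Return the events of one batch that the client has not seen yet."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM ra_workflow_event "
        "WHERE batch_id = ? AND id > ? ORDER BY id",
        (batch_id, last_event_id),
    ).fetchall()
    return [(event_id, event_type, json.loads(payload))
            for event_id, event_type, payload in rows]


first_id = append_event(conn, 7, "workflow_started", {})
append_event(conn, 8, "workflow_started", {})  # another batch, must not leak
second_id = append_event(conn, 7, "node_progress", {"progress": 50})
missed = events_after(conn, 7, last_event_id=first_id)
```

Ordering by the auto-increment `id` rather than `created_at` avoids ties between events written within the same second.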

---

### 3.7 ra_exported_summary_file

Records of exported files. Download links are generated at runtime from the export_id.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| batch_id | ForeignKey | bigint | Yes | Owning batch |
| export_type | CharField(20) | varchar(20) | Yes | markdown, excel |
| file_name | CharField(255) | varchar(255) | Yes | Exported file name |
| storage_path | CharField(500) | varchar(500) | Yes | Storage path |
| status | CharField(20) | varchar(20) | Yes | success, failed |
| error_message | TextField | text | No | Export error description |
| created_at | DateTimeField | datetime | Yes | Generation time |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_export_batch_type | batch_id, export_type | Find the exported files of a batch |
| idx_ra_export_batch_created | batch_id, created_at | Query by generation time |

---

## 4. Enum Design

This feature has no enum tables. Enums are defined with Django `TextChoices`, and the database stores the string values.

### 4.1 Attachment status: upload_status

| Value | Label | Description |
| --- | --- | --- |
| uploaded | Uploaded | Upload finished; not yet bound to a batch |
| bound | Bound | Already used by some batch |
| deleted | Deleted | Logically deleted by the user; no longer a default candidate |

### 4.2 Batch status: batch.status

| Value | Label | Description |
| --- | --- | --- |
| pending | Pending | Batch has been created |
| running | Running | Background workflow is running |
| success | Success | Workflow finished |
| failed | Failed | Batch-level failure |

### 4.3 Node status: node.status

| Value | Label | Description |
| --- | --- | --- |
| pending | Waiting | Node has not started |
| running | Running | Node is executing |
| retrying | Retrying | Retrying after a single file failed to parse |
| success | Success | Node finished successfully |
| failed | Failed | Node failed |
| skipped | Skipped | The current batch does not need this node |

### 4.4 File statistics status: statistics_status

| Value | Label | Description |
| --- | --- | --- |
| success | Success | Page count succeeded |
| failed | Failed | Still failing after retries |
| unsupported | Unsupported | File type is outside the supported range |
| uncertain | Uncertain | File is readable but has no reliable page count metadata |
| skipped | Skipped | Empty file, hidden file, or skipped by a rule |

### 4.5 Export type: export_type

| Value | Label | Description |
| --- | --- | --- |
| markdown | Markdown | Markdown summary report |
| excel | Excel | Excel detail file |

### 4.6 Export status: export.status

| Value | Label | Description |
| --- | --- | --- |
| success | Success | Export file generated successfully |
| failed | Failed | Export failed |

---

## 5. Relationships and Business Rules

### 5.1 Conversation and attachments

```text
Conversation 1:N ra_file_attachment
```

Rules:

| Rule | Description |
| --- | --- |
| Store on upload | A FileAttachment is created as soon as the user uploads a file |
| Conversation isolation | An attachment can only be used by batches in the same Conversation |
| Multiple versions | The same conversation + original_name may have several version_no values |
| Default version | The record with is_active=true is the default candidate version |
| Logical delete | Deleting an attachment sets upload_status=deleted; the file is not physically removed right away |

### 5.2 Conversation and batches

```text
Conversation 1:N ra_file_summary_batch
```

Rules:

| Rule | Description |
| --- | --- |
| Repeated summaries | The same conversation may trigger the automatic summary several times |
| Prompt triggered | A batch is triggered by a user message and can reference it through trigger_message_id |
| Batch pinning | When a batch starts, it pins the attachment versions it uses |

### 5.3 Batches and attachment versions

```text
ra_file_summary_batch N:M ra_file_attachment
```

Implemented through `ra_file_summary_batch_attachment`.

Rules:

| Rule | Description |
| --- | --- |
| No file mix-ups | The workflow may only read attachments bound through the join table |
| History preserved | Even if a newer version of an attachment is uploaded later, historical batches still point to the old version |
| Version selection | When the user does not choose, the latest active version of each file name is used by default |
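
The binding rules can be enforced at the point where a row is written to the join table. The sketch below uses the standard-library `sqlite3` module with trimmed-down tables; `bind_attachment` is a hypothetical helper, and the ids are invented sample data. It rejects attachments from another conversation and relies on `UNIQUE (batch_id, attachment_id)` to reject duplicate bindings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ra_file_attachment (
        id INTEGER PRIMARY KEY,
        conversation_id BIGINT NOT NULL,
        original_name VARCHAR(255) NOT NULL,
        version_no INTEGER NOT NULL
    );
    CREATE TABLE ra_file_summary_batch (
        id INTEGER PRIMARY KEY,
        conversation_id BIGINT NOT NULL
    );
    CREATE TABLE ra_file_summary_batch_attachment (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        batch_id BIGINT NOT NULL,
        attachment_id BIGINT NOT NULL,
        source_role VARCHAR(20) NOT NULL DEFAULT 'multi_file',
        UNIQUE (batch_id, attachment_id)
    );
    INSERT INTO ra_file_attachment VALUES
        (1, 10, 'a.pdf', 1), (2, 10, 'a.pdf', 2), (3, 99, 'b.pdf', 1);
    INSERT INTO ra_file_summary_batch VALUES (100, 10);
    """
)


def bind_attachment(conn, batch_id, attachment_id):
    """Pin one attachment version to a batch; refuse cross-conversation use."""
    same_conversation = conn.execute(
        "SELECT 1 FROM ra_file_summary_batch b "
        "JOIN ra_file_attachment a ON a.conversation_id = b.conversation_id "
        "WHERE b.id = ? AND a.id = ?",
        (batch_id, attachment_id),
    ).fetchone()
    if same_conversation is None:
        raise ValueError("attachment does not belong to the batch's conversation")
    with conn:
        conn.execute(
            "INSERT INTO ra_file_summary_batch_attachment (batch_id, attachment_id) "
            "VALUES (?, ?)",
            (batch_id, attachment_id),
        )


def batch_attachment_ids(conn, batch_id):
    """The only attachments the workflow is allowed to read for this batch."""
    rows = conn.execute(
        "SELECT attachment_id FROM ra_file_summary_batch_attachment "
        "WHERE batch_id = ? ORDER BY attachment_id",
        (batch_id,),
    ).fetchall()
    return [row[0] for row in rows]


bind_attachment(conn, 100, 1)  # batch 100 is pinned to version 1 of a.pdf
```

Uploading version 2 later does not change what `batch_attachment_ids(conn, 100)` returns, which is the "history preserved" rule in practice.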

### 5.4 Batches and file details

```text
ra_file_summary_batch 1:N ra_file_summary_item
```

Rules:

| Rule | Description |
| --- | --- |
| Unique relative path | relative_path is unique within a batch |
| Processing path kept | relative_path is used for display; storage_path is used for background processing |
| A single failure does not block | A file that fails to parse is recorded as failed, and the batch continues with the other files |
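
The "a single failure does not block" rule could look like the loop below: each file is tried, retried a limited number of times, and recorded as `failed` without aborting the batch. This is a sketch only. The `count_pages` callable, the `max_retries` default of 2, and the `fake_count_pages` stand-in are assumptions, since the design does not fix a retry limit or a parser.

```python
def summarize_files(paths, count_pages, max_retries=2):
    """Count pages per file; a failing file is recorded and the loop goes on."""
    items = []
    for index, path in enumerate(paths, start=1):
        item = {
            "file_index": index,
            "relative_path": path,
            "page_count": None,
            "statistics_status": "failed",
            "retry_count": 0,
            "error_message": "",
        }
        for attempt in range(max_retries + 1):
            item["retry_count"] = attempt  # 0 on the first try
            try:
                item["page_count"] = count_pages(path)
                item["statistics_status"] = "success"
                item["error_message"] = ""
                break
            except Exception as exc:  # any parser error is recorded, not raised
                item["error_message"] = str(exc)
        items.append(item)
    return items


def fake_count_pages(path):
    """Hypothetical stand-in for a real page counter."""
    if "broken" in path:
        raise ValueError(f"cannot parse {path}")
    return 3


items = summarize_files(["a.pdf", "broken.pdf", "c.pdf"], fake_count_pages)
statuses = [item["statistics_status"] for item in items]
total_pages = sum(item["page_count"] or 0 for item in items)
```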

---

## 6. Index Design Summary

| Table | Index/constraint | Fields | Purpose |
| --- | --- | --- | --- |
| ra_file_attachment | uq_ra_attachment_conv_name_version | conversation_id, original_name, version_no | Unique version per attachment name |
| ra_file_attachment | idx_ra_attachment_conv_created | conversation_id, created_at | Conversation attachment list |
| ra_file_attachment | idx_ra_attachment_user_created | user_id, created_at | User upload records |
| ra_file_attachment | idx_ra_attachment_active | conversation_id, original_name, is_active | Default version lookup |
| ra_file_summary_batch | uq_ra_batch_no | batch_no | Unique batch number |
| ra_file_summary_batch | idx_ra_batch_conv_created | conversation_id, created_at | Conversation batch list |
| ra_file_summary_batch | idx_ra_batch_user_created | user_id, created_at | User batch list |
| ra_file_summary_batch | idx_ra_batch_status | status, created_at | Find running or failed batches |
| ra_file_summary_batch_attachment | uq_ra_batch_attachment | batch_id, attachment_id | Unique batch attachment |
| ra_file_summary_item | uq_ra_item_batch_relative_path | batch_id, relative_path | Unique file within a batch |
| ra_file_summary_item | idx_ra_item_batch_index | batch_id, file_index | File detail ordering |
| ra_file_summary_item | idx_ra_item_batch_status | batch_id, statistics_status | Find problem files |
| ra_workflow_node_run | uq_ra_node_batch_code | batch_id, node_code | One row per node per batch |
| ra_workflow_event | idx_ra_event_batch_id | batch_id, id | SSE resume from the last event |
| ra_exported_summary_file | idx_ra_export_batch_type | batch_id, export_type | Find exported files |

---

## 7. SQLite Reference DDL

> Note: the DDL below is a design reference; the Django migrations are the source of truth for the actual schema.

```sql
CREATE TABLE ra_file_attachment (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    conversation_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    original_name VARCHAR(255) NOT NULL,
    version_no INTEGER NOT NULL DEFAULT 1,
    is_active BOOLEAN NOT NULL DEFAULT 1,
    storage_path VARCHAR(500) NOT NULL,
    file_size BIGINT NOT NULL DEFAULT 0,
    content_type VARCHAR(120) NOT NULL DEFAULT '',
    upload_status VARCHAR(20) NOT NULL DEFAULT 'uploaded',
    created_at DATETIME NOT NULL,
    UNIQUE (conversation_id, original_name, version_no)
);

CREATE INDEX idx_ra_attachment_conv_created
    ON ra_file_attachment (conversation_id, created_at);

CREATE INDEX idx_ra_attachment_user_created
    ON ra_file_attachment (user_id, created_at);

CREATE INDEX idx_ra_attachment_active
    ON ra_file_attachment (conversation_id, original_name, is_active);
```

```sql
CREATE TABLE ra_file_summary_batch (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    conversation_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    trigger_message_id BIGINT NULL,
    batch_no VARCHAR(64) NOT NULL UNIQUE,
    product_name VARCHAR(200) NOT NULL DEFAULT '',
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    total_files INTEGER NOT NULL DEFAULT 0,
    supported_files INTEGER NOT NULL DEFAULT 0,
    success_files INTEGER NOT NULL DEFAULT 0,
    failed_files INTEGER NOT NULL DEFAULT 0,
    unsupported_files INTEGER NOT NULL DEFAULT 0,
    uncertain_files INTEGER NOT NULL DEFAULT 0,
    total_pages INTEGER NOT NULL DEFAULT 0,
    work_dir VARCHAR(500) NOT NULL DEFAULT '',
    error_message TEXT NOT NULL DEFAULT '',
    created_at DATETIME NOT NULL,
    started_at DATETIME NULL,
    finished_at DATETIME NULL
);

CREATE INDEX idx_ra_batch_conv_created
    ON ra_file_summary_batch (conversation_id, created_at);

CREATE INDEX idx_ra_batch_user_created
    ON ra_file_summary_batch (user_id, created_at);

CREATE INDEX idx_ra_batch_status
    ON ra_file_summary_batch (status, created_at);
```

```sql
CREATE TABLE ra_file_summary_batch_attachment (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    batch_id BIGINT NOT NULL,
    attachment_id BIGINT NOT NULL,
    source_role VARCHAR(20) NOT NULL DEFAULT 'multi_file',
    created_at DATETIME NOT NULL,
    UNIQUE (batch_id, attachment_id)
);

CREATE INDEX idx_ra_batch_attachment_batch
    ON ra_file_summary_batch_attachment (batch_id, created_at);

CREATE INDEX idx_ra_batch_attachment_attachment
    ON ra_file_summary_batch_attachment (attachment_id);
```

```sql
CREATE TABLE ra_file_summary_item (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    batch_id BIGINT NOT NULL,
    file_index INTEGER NOT NULL,
    directory_level VARCHAR(300) NOT NULL DEFAULT '',
    file_name VARCHAR(255) NOT NULL,
    file_type VARCHAR(20) NOT NULL,
    relative_path VARCHAR(500) NOT NULL,
    storage_path VARCHAR(500) NOT NULL,
    page_count INTEGER NULL,
    statistics_status VARCHAR(20) NOT NULL DEFAULT 'skipped',
    retry_count INTEGER NOT NULL DEFAULT 0,
    error_message TEXT NOT NULL DEFAULT '',
    created_at DATETIME NOT NULL,
    updated_at DATETIME NOT NULL,
    UNIQUE (batch_id, relative_path)
);

CREATE INDEX idx_ra_item_batch_index
    ON ra_file_summary_item (batch_id, file_index);

CREATE INDEX idx_ra_item_batch_status
    ON ra_file_summary_item (batch_id, statistics_status);

CREATE INDEX idx_ra_item_batch_type
    ON ra_file_summary_item (batch_id, file_type);
```

```sql
CREATE TABLE ra_workflow_node_run (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    batch_id BIGINT NOT NULL,
    node_code VARCHAR(40) NOT NULL,
    node_name VARCHAR(80) NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'pending',
    progress INTEGER NOT NULL DEFAULT 0,
    message TEXT NOT NULL DEFAULT '',
    started_at DATETIME NULL,
    finished_at DATETIME NULL,
    UNIQUE (batch_id, node_code)
);

CREATE INDEX idx_ra_node_batch_status
    ON ra_workflow_node_run (batch_id, status);
```

```sql
CREATE TABLE ra_workflow_event (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    batch_id BIGINT NOT NULL,
    event_type VARCHAR(40) NOT NULL,
    payload TEXT NOT NULL DEFAULT '{}',
    created_at DATETIME NOT NULL
);

CREATE INDEX idx_ra_event_batch_id
    ON ra_workflow_event (batch_id, id);

CREATE INDEX idx_ra_event_batch_created
    ON ra_workflow_event (batch_id, created_at);
```

```sql
CREATE TABLE ra_exported_summary_file (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    batch_id BIGINT NOT NULL,
    export_type VARCHAR(20) NOT NULL,
    file_name VARCHAR(255) NOT NULL,
    storage_path VARCHAR(500) NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'success',
    error_message TEXT NOT NULL DEFAULT '',
    created_at DATETIME NOT NULL
);

CREATE INDEX idx_ra_export_batch_type
    ON ra_exported_summary_file (batch_id, export_type);

CREATE INDEX idx_ra_export_batch_created
    ON ra_exported_summary_file (batch_id, created_at);
```

---

## 8. Notes on the Django ORM Implementation

### 8.1 db_table

Each model pins its table name with `class Meta: db_table = "ra_xxx"` so that Django does not generate its longer default table name.

### 8.2 JSONField

`WorkflowEvent.payload` uses Django's `models.JSONField(default=dict)`. On SQLite the value is stored as text, and Django handles serialization and deserialization.

### 8.3 Version number generation

When a file with the same name is uploaded again in the same conversation:

```text
version_no = max(existing version_no) + 1
```

If the new version becomes the default version, `is_active` on the older versions must be set to false.
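
A minimal sketch of this rule with the standard-library `sqlite3` module is shown below. The helper name `add_version` is hypothetical, and the table is trimmed to the columns involved. In the real project the same steps would presumably run inside a Django `transaction.atomic()` block; the unique constraint on `(conversation_id, original_name, version_no)` is what ultimately rejects a duplicate version number if two uploads race.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ra_file_attachment (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        conversation_id BIGINT NOT NULL,
        original_name VARCHAR(255) NOT NULL,
        version_no INTEGER NOT NULL DEFAULT 1,
        is_active BOOLEAN NOT NULL DEFAULT 1,
        UNIQUE (conversation_id, original_name, version_no)
    )
    """
)


def add_version(conn, conversation_id, original_name):
    """Insert the next version of a file and make it the only active one."""
    key = (conversation_id, original_name)
    with conn:  # commit on success, roll back on error
        (current_max,) = conn.execute(
            "SELECT COALESCE(MAX(version_no), 0) FROM ra_file_attachment "
            "WHERE conversation_id = ? AND original_name = ?",
            key,
        ).fetchone()
        # Deactivate every older version before inserting the new default.
        conn.execute(
            "UPDATE ra_file_attachment SET is_active = 0 "
            "WHERE conversation_id = ? AND original_name = ?",
            key,
        )
        conn.execute(
            "INSERT INTO ra_file_attachment "
            "(conversation_id, original_name, version_no, is_active) "
            "VALUES (?, ?, ?, 1)",
            (conversation_id, original_name, current_max + 1),
        )
    return current_max + 1


first_version = add_version(conn, 10, "report.pdf")
second_version = add_version(conn, 10, "report.pdf")
active_versions = conn.execute(
    "SELECT version_no FROM ra_file_attachment "
    "WHERE conversation_id = 10 AND original_name = 'report.pdf' AND is_active = 1"
).fetchall()
```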
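
### 8.4 Logical delete

When an attachment is deleted:

```text
upload_status = deleted
is_active = false
```

Historical batches can still trace the attachment through the join table.

### 8.5 Selecting attachments for a batch

When the user sends a prompt that triggers the workflow:

| Scenario | Handling |
| --- | --- |
| The user explicitly selects an attachment version | Use the selected attachment_id |
| The user does not select a version | Use the attachments in the current conversation that have is_active=true and are not deleted |
| Several active versions of the same name exist (data anomaly) | Take the one with the latest created_at and record the anomaly for later repair |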
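
The default-selection branch, including the fallback for duplicate active rows, might be sketched as follows with the standard-library `sqlite3` module. `default_attachments` is a hypothetical helper and the rows are invented sample data; how the anomaly is actually recorded is left open by the design, so the sketch simply returns the affected file names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ra_file_attachment (
        id INTEGER PRIMARY KEY,
        conversation_id BIGINT NOT NULL,
        original_name VARCHAR(255) NOT NULL,
        version_no INTEGER NOT NULL,
        is_active BOOLEAN NOT NULL DEFAULT 1,
        upload_status VARCHAR(20) NOT NULL DEFAULT 'uploaded',
        created_at DATETIME NOT NULL
    );
    INSERT INTO ra_file_attachment VALUES
        (1, 10, 'a.pdf', 1, 1, 'uploaded', '2026-06-05 09:00:00'),
        (2, 10, 'a.pdf', 2, 1, 'uploaded', '2026-06-05 10:00:00'),
        (3, 10, 'b.pdf', 1, 0, 'deleted',  '2026-06-05 09:30:00'),
        (4, 10, 'c.pdf', 1, 1, 'bound',    '2026-06-05 09:45:00'),
        (5, 11, 'a.pdf', 1, 1, 'uploaded', '2026-06-05 11:00:00');
    """
)


def default_attachments(conn, conversation_id):
    """Pick one active, non-deleted attachment per file name.

    Returns (attachment ids, file names that had more than one active row).
    """
    rows = conn.execute(
        "SELECT id, original_name FROM ra_file_attachment "
        "WHERE conversation_id = ? AND is_active = 1 AND upload_status != 'deleted' "
        "ORDER BY original_name, created_at DESC, id DESC",
        (conversation_id,),
    ).fetchall()
    chosen = {}
    anomalies = []
    for attachment_id, name in rows:
        if name not in chosen:
            chosen[name] = attachment_id  # newest row for this name wins
        elif name not in anomalies:
            anomalies.append(name)  # duplicate active row: flag for repair
    return sorted(chosen.values()), anomalies


selected_ids, anomaly_names = default_attachments(conn, 10)
```

Here rows 1 and 2 are both active for `a.pdf`, so the newer row 2 is chosen and `a.pdf` is reported as an anomaly.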
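
---

## 9. Data Retention Policy

| Data | Demo policy | Recommendation for production |
| --- | --- | --- |
| Uploaded attachment records | Keep permanently | Clean up along with the conversation archive cycle |
| Uploaded original files | Keep permanently | Retention period configurable per user or project |
| Summary batches | Keep permanently | Keep for audit tracing |
| File details | Keep permanently | Keep so that historical reports can be reproduced |
| Workflow events | Keep permanently | Events of finished batches can be purged periodically |
| Exported files | Keep permanently | A download expiry or archiving can be configured |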
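
---

## 10. Open Questions

| No. | Question | Current design | Status |
| --- | --- | --- | --- |
| 1 | Whether production deployment migrates from SQLite to PostgreSQL/MySQL | Designed for SQLite/Django ORM for now, keeping ORM compatibility | To be confirmed later |
| 2 | Whether several active versions of the same attachment name are allowed | Not allowed by design; the code should deactivate the old active version on update | To be implemented |
| 3 | When files are physically deleted | The demo does not delete files physically | To be confirmed later |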

---

## 11. Suggested Development Order

1. Add the 7 models above to `review_agent/models.py`.
2. Define Django `TextChoices` for the status fields.
3. Configure `db_table`, `indexes`, and `constraints`.
4. Run `python manage.py makemigrations review_agent` to generate the migration.
5. Run `python manage.py migrate` to verify that the tables can be created on SQLite.
6. Write model-level tests covering same-name attachment versions, batch attachment binding, unique constraints, and permission-scoped queries.
485
docs/3.数据库设计/2.NMPA注册资料法规核查与整改闭环.md
Normal file
@@ -0,0 +1,485 @@
# Database Design: NMPA Registration Dossier Regulatory Review and Remediation Loop Workflow

## Document Information

| Item | Value |
| --- | --- |
| Requirements analysis document | docs/1.需求分析/2.NMPA注册资料法规核查与整改闭环.md |
| Functional design document | docs/2.功能设计/2.NMPA注册资料法规核查与整改闭环.md |
| Database type | SQLite / Django ORM |
| Table name prefix | ra_ |
| Design date | 2026-06-06 |
| Design version | V1.0 |
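
---

## 1. Design Principles

| Principle | Description |
| --- | --- |
| Reuse summary batches | Regulatory review does not store the file list again; it must reference an existing `ra_file_summary_batch` |
| Independent review batches | One file summary batch can produce several regulatory review batches; a new batch is created when the applicable conditions change |
| Rule versions in the database | Structured rule versions are stored in the database so that rule files, RAG indexes, and activation state can be traced |
| No separate RAG table | RAG index information lives in fields of the rule version and the review batch; no index table is added |
| Enums store values | The database stores the English enum value; the front end or service layer maps it to Chinese for display |
| Key fields stored separately | Frequently queried fields get their own columns; the remaining process context goes into JSON or file artifacts |
| No large text in the database | For process artifacts the database stores only the path, summary, and hash; large text content is written to files |
| Soft delete first | Regulatory review data uses a soft-delete/archive policy to support audit and recovery |
| Keep process artifacts | Condition confirmations, review matrices, risk lists, RAG results, notification records, and re-review records must all be retained |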

---

## 2. ER Diagram

```mermaid
erDiagram
    AUTH_USER ||--o{ CONVERSATION : owns
    CONVERSATION ||--o{ RA_FILE_SUMMARY_BATCH : has
    RA_FILE_SUMMARY_BATCH ||--o{ RA_FILE_SUMMARY_ITEM : produces
    RA_FILE_SUMMARY_BATCH ||--o{ RA_REGULATORY_REVIEW_BATCH : reviews
    AUTH_USER ||--o{ RA_REGULATORY_REVIEW_BATCH : runs
    AUTH_USER ||--o{ RA_REGULATORY_ISSUE : owns
    RA_REGULATORY_RULE_VERSION ||--o{ RA_REGULATORY_REVIEW_BATCH : used_by
    RA_REGULATORY_REVIEW_BATCH ||--o{ RA_REGULATORY_ISSUE : produces
    RA_REGULATORY_REVIEW_BATCH ||--o{ RA_REGULATORY_ARTIFACT : keeps
    RA_REGULATORY_REVIEW_BATCH ||--o{ RA_REGULATORY_NOTIFICATION_RECORD : sends
    RA_REGULATORY_REVIEW_BATCH ||--o{ RA_EXPORTED_SUMMARY_FILE : exports
    RA_REGULATORY_REVIEW_BATCH ||--o{ RA_WORKFLOW_NODE_RUN : tracks
    RA_REGULATORY_REVIEW_BATCH ||--o{ RA_WORKFLOW_EVENT : emits
```

Note: in the phase-one design, `ra_workflow_node_run` and `ra_workflow_event` record node runs and events for file summary batches. The regulatory review workflow reuses the same event mechanism and adds `workflow_type` and `workflow_batch_id` to support multiple workflows; the original `batch_id` is kept for compatibility with the existing file summary logic.

---

## 3. Table Design

### 3.1 ra_regulatory_rule_version

Version table for structured regulatory rules. The rule files themselves are still maintained as YAML/JSON files; the database records version metadata, the file hash, the RAG index version, and the activation state.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| version | CharField(80) | varchar(80) | Yes | Rule version, e.g. nmpa_ivd_2021_v1 |
| source_url | URLField(500) | varchar(500) | Yes | Source URL of the regulation |
| source_path | CharField(500) | varchar(500) | Yes | Local path of the regulation material |
| effective_date | DateField | date | No | Effective date or announcement date of the rule |
| rule_file_path | CharField(500) | varchar(500) | Yes | Path of the structured rule file |
| rule_file_hash | CharField(128) | varchar(128) | Yes | Hash of the rule file |
| rag_index_version | CharField(80) | varchar(80) | No | RAG index version |
| rag_index_path | CharField(500) | varchar(500) | No | RAG index storage path |
| is_active | BooleanField | bool | Yes | Whether this is the currently active version |
| created_by_id | ForeignKey(User) | bigint | No | Creator |
| activated_at | DateTimeField | datetime | No | Activation time |
| description | TextField | text | No | Version notes |
| created_at | DateTimeField | datetime | Yes | Creation time |
| updated_at | DateTimeField | datetime | Yes | Update time |
| is_deleted | BooleanField | bool | Yes | Soft-delete flag |

Unique constraints:

| Constraint | Fields |
| --- | --- |
| uq_ra_reg_rule_version | version |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_reg_rule_active | is_active, is_deleted | Find the currently active rules |
| idx_ra_reg_rule_effective | effective_date | Trace by effective date |
| idx_ra_reg_rule_created | created_at | Browse rule version history |

---

### 3.2 ra_regulatory_review_batch

Regulatory review batch table. One run of the regulatory review workflow corresponds to one record. The same `ra_file_summary_batch` can be linked to several regulatory review batches, for example when the applicable conditions change or a review is rerun.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| conversation_id | ForeignKey | bigint | Yes | Owning conversation |
| user_id | ForeignKey | bigint | Yes | Initiating user |
| file_summary_batch_id | ForeignKey | bigint | Yes | Linked file summary batch |
| rule_version_id | ForeignKey | bigint | No | Rule version used |
| batch_no | CharField(64) | varchar(64) | Yes | Regulatory review batch number, unique |
| status | CharField(30) | varchar(30) | Yes | pending, running, waiting_user, success, failed, reference_only, partial_success, cancelled |
| product_category | CharField(80) | varchar(80) | No | Product category |
| registration_type | CharField(80) | varchar(80) | No | Registration type |
| clinical_evaluation_path | CharField(120) | varchar(120) | No | Clinical evaluation pathway |
| product_name | CharField(200) | varchar(200) | No | Product name |
| model_specification | CharField(200) | varchar(200) | No | Model and specification |
| intended_use | TextField | text | No | Intended use |
| condition_json | JSONField | text/json | No | Other applicable conditions, user confirmation records, and extraction confidence |
| rule_version_value | CharField(80) | varchar(80) | No | Denormalized copy of the rule version value, for historical tracing |
| rule_source_url | URLField(500) | varchar(500) | No | Denormalized copy of the regulation source URL |
| rule_source_path | CharField(500) | varchar(500) | No | Denormalized copy of the local regulation material path |
| rag_index_version | CharField(80) | varchar(80) | No | RAG index version used by this batch |
| risk_summary_json | JSONField | text/json | No | Summary of risk counts |
| artifact_root | CharField(500) | varchar(500) | No | Root directory of this batch's process artifacts |
| error_message | TextField | text | No | Batch-level error description |
| created_at | DateTimeField | datetime | Yes | Creation time |
| started_at | DateTimeField | datetime | No | Start time |
| finished_at | DateTimeField | datetime | No | Finish time |
| archived_at | DateTimeField | datetime | No | Archive time |
| is_deleted | BooleanField | bool | Yes | Soft-delete flag |

Unique constraints:

| Constraint | Fields |
| --- | --- |
| uq_ra_reg_batch_no | batch_no |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_reg_batch_conv_status | conversation_id, status | Query the status of review batches in a conversation |
| idx_ra_reg_batch_summary | file_summary_batch_id | Look up the review history of a file summary batch |
| idx_ra_reg_batch_created | created_at | Query by creation time |
| idx_ra_reg_batch_rule | rule_version_value | Trace by rule version |
| idx_ra_reg_batch_user_created | user_id, created_at | List the reviews a user started |

---

### 3.3 ra_regulatory_issue

Regulatory review issue table. It records business issues concerning completeness, chapter structure, consistency, notification, and re-review, together with their remediation status.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| batch_id | ForeignKey | bigint | Yes | Owning regulatory review batch |
| owner_id | ForeignKey(User) | bigint | No | Responsible person; defaults to the uploader |
| issue_code | CharField(100) | varchar(100) | Yes | Issue code |
| issue_type | CharField(40) | varchar(40) | Yes | completeness, structure, consistency, notification, review |
| risk_level | CharField(20) | varchar(20) | Yes | blocking, high, medium, low, info |
| status | CharField(30) | varchar(30) | Yes | pending_confirm, pending_fix, fixed, review_passed, review_failed, closed |
| title | CharField(255) | varchar(255) | Yes | Issue title |
| description | TextField | text | No | Issue description |
| rule_id | CharField(120) | varchar(120) | No | ID of the rule that was hit |
| regulation_basis | TextField | text | No | Regulatory or rule basis |
| file_item_id | ForeignKey(FileSummaryItem) | bigint | No | Linked file detail row; may be empty |
| file_path | CharField(500) | varchar(500) | No | Path of the main evidence file |
| page_no | PositiveIntegerField | integer | No | Page number of the main evidence |
| field_name | CharField(120) | varchar(120) | No | Name of the field involved in a consistency or field issue |
| evidence_json | JSONField | text/json | No | Evidence details, such as text snippets, values from several sources, and RAG citations |
| suggestion | TextField | text | No | Remediation suggestion |
| source_node | CharField(60) | varchar(60) | No | Workflow node that produced the issue |
| confirmed_by_id | ForeignKey(User) | bigint | No | Confirmed by |
| confirmed_at | DateTimeField | datetime | No | Confirmation time |
| closed_by_id | ForeignKey(User) | bigint | No | Closed by |
| closed_at | DateTimeField | datetime | No | Closing time |
| created_at | DateTimeField | datetime | Yes | Creation time |
| updated_at | DateTimeField | datetime | Yes | Update time |
| is_deleted | BooleanField | bool | Yes | Soft-delete flag |

Unique constraints:

| Constraint | Fields |
| --- | --- |
| uq_ra_reg_issue_batch_code | batch_id, issue_code |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_reg_issue_batch | batch_id, created_at | List the issues of a batch |
| idx_ra_reg_issue_risk_status | risk_level, status | Filter the risk list by remediation status |
| idx_ra_reg_issue_owner_status | owner_id, status | To-do list of a responsible person |
| idx_ra_reg_issue_rule | rule_id | Trace issues by rule |
| idx_ra_reg_issue_file | file_item_id | Issues linked to a file |
| idx_ra_reg_issue_field | field_name | Query field consistency issues |

---

### 3.4 ra_regulatory_artifact

Process artifact table for regulatory review. It stores file metadata only, never the full text of large documents. File content is written to a controlled storage directory, and `file_hash` is required.

| Field | Django type | SQLite type | Required | Description |
| --- | --- | --- | --- | --- |
| id | BigAutoField | integer | Yes | Primary key |
| batch_id | ForeignKey | bigint | Yes | Owning regulatory review batch |
| artifact_type | CharField(60) | varchar(60) | Yes | condition_record, rule_matrix, risk_list, text_extract_json, rag_result_json, notification_record, review_record |
| file_format | CharField(20) | varchar(20) | Yes | markdown, excel, json |
| file_name | CharField(255) | varchar(255) | Yes | File name |
| storage_path | CharField(500) | varchar(500) | Yes | Storage path |
| file_size | BigIntegerField | bigint | Yes | File size |
| file_hash | CharField(128) | varchar(128) | Yes | File hash, used to verify that the retained file has not been tampered with |
| summary | TextField | text | No | Artifact summary |
| created_by_node | CharField(60) | varchar(60) | No | Workflow node that produced the artifact |
| created_at | DateTimeField | datetime | Yes | Creation time |
| is_deleted | BooleanField | bool | Yes | Soft-delete flag |

Indexes:

| Index | Fields | Purpose |
| --- | --- | --- |
| idx_ra_reg_artifact_batch_type | batch_id, artifact_type | Find the process artifacts of a batch |
| idx_ra_reg_artifact_format | file_format | Query by format |
| idx_ra_reg_artifact_created | created_at | Trace by time |
|
||||
|
||||
---
|
||||
|
||||
### 3.5 ra_regulatory_notification_record
|
||||
|
||||
法规核查通知记录表,记录飞书 CLI 发送结果。飞书失败不阻断工作流,但需要留痕。
|
||||
|
||||
| 字段名 | Django 类型 | SQLite 类型 | 必填 | 说明 |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| id | BigAutoField | integer | 是 | 主键 |
|
||||
| batch_id | ForeignKey | bigint | 是 | 所属法规核查批次 |
|
||||
| recipient_id | ForeignKey(User) | bigint | 是 | 通知对象 |
|
||||
| channel | CharField(30) | varchar(30) | 是 | feishu_cli、feishu_api、mock |
|
||||
| risk_levels | JSONField | text/json | 是 | 本次通知包含的风险等级 |
|
||||
| issue_ids | JSONField | text/json | 是 | 本次通知关联的问题 ID 列表 |
|
||||
| message_summary | TextField | text | 是 | 通知内容摘要 |
|
||||
| send_status | CharField(20) | varchar(20) | 是 | pending、success、failed |
|
||||
| retry_count | PositiveIntegerField | integer | 是 | 已重试次数,最多 3 次 |
|
||||
| external_message_id | CharField(120) | varchar(120) | 否 | 飞书外部消息 ID |
|
||||
| error_message | TextField | text | 否 | 失败原因 |
|
||||
| sent_at | DateTimeField | datetime | 否 | 发送成功时间 |
|
||||
| created_at | DateTimeField | datetime | 是 | 创建时间 |
|
||||
| updated_at | DateTimeField | datetime | 是 | 更新时间 |
|
||||
| is_deleted | BooleanField | bool | 是 | 软删除标记 |
|
||||
|
||||
索引:
|
||||
|
||||
| 索引名 | 字段 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| idx_ra_reg_notify_batch | batch_id, created_at | 查询批次通知记录 |
|
||||
| idx_ra_reg_notify_recipient | recipient_id, send_status | 查询用户通知状态 |
|
||||
| idx_ra_reg_notify_status | send_status, retry_count | 查询待重试通知 |
|
||||
|
||||
---
|
||||
|
||||
## 四、枚举设计
|
||||
|
||||
### 4.1 RegulatoryReviewBatch.status
|
||||
|
||||
| value | 中文展示 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| pending | 待执行 | 已创建,等待执行 |
|
||||
| running | 执行中 | 工作流正在执行 |
|
||||
| waiting_user | 等待用户 | 等待用户确认适用条件或关闭复核 |
|
||||
| success | 已完成 | 核查完成且无关键失败 |
|
||||
| failed | 失败 | 关键节点失败,无法输出有效结果 |
|
||||
| reference_only | 仅供参考 | 规则文件加载失败,降级为 RAG 辅助核查 |
|
||||
| partial_success | 部分完成 | 部分节点或通知失败,但已输出主要结果 |
|
||||
| cancelled | 已取消 | 用户或系统取消执行 |
|
||||
|
||||
### 4.2 RegulatoryIssue.status
|
||||
|
||||
| value | 中文展示 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| pending_confirm | 待确认 | 条件性问题或低置信度问题等待人工确认 |
|
||||
| pending_fix | 待处理 | 已确认需要补充或整改 |
|
||||
| fixed | 已补充 | 用户已上传补充资料或声明已处理 |
|
||||
| review_passed | 复核通过 | 系统复核通过,关闭前仍需人工确认 |
|
||||
| review_failed | 复核不通过 | 系统复核后问题仍存在 |
|
||||
| closed | 已关闭 | 用户确认问题解决并关闭 |
|
||||
|
||||
### 4.3 RegulatoryIssue.risk_level
|
||||
|
||||
| value | 中文展示 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| blocking | 阻断项 | 直接影响资料能否进入有效申报或审核 |
|
||||
| high | 高风险 | 可能导致注册审评补正或重大整改 |
|
||||
| medium | 中风险 | 需要补充说明或修改 |
|
||||
| low | 低风险 | 建议修正但影响较小 |
|
||||
| info | 提示项 | 系统无法充分判断或建议人工关注 |
|
||||
|
||||
### 4.4 其他枚举
|
||||
|
||||
| 字段 | value |
|
||||
| --- | --- |
|
||||
| issue_type | completeness、structure、consistency、notification、review |
|
||||
| artifact_type | condition_record、rule_matrix、risk_list、text_extract_json、rag_result_json、notification_record、review_record |
|
||||
| file_format | markdown、excel、json |
|
||||
| send_status | pending、success、failed |
|
||||
| channel | feishu_cli、feishu_api、mock |
|
||||
|
||||
---
|
||||
|
||||
## 五、软删除与归档策略
|
||||
|
||||
| 对象 | 策略 |
|
||||
| --- | --- |
|
||||
| RegulatoryRuleVersion | 使用 `is_deleted` 软删除;已被批次引用的版本不允许物理删除 |
|
||||
| RegulatoryReviewBatch | 使用 `is_deleted` 和 `archived_at` 归档;归档后默认不在对话主列表展示 |
|
||||
| RegulatoryIssue | 使用 `is_deleted` 软删除;删除时保留批次摘要和过程产物 |
|
||||
| RegulatoryArtifact | 使用 `is_deleted` 软删除;正式环境可配合对象存储生命周期归档 |
|
||||
| RegulatoryNotificationRecord | 使用 `is_deleted` 软删除;保留通知失败原因和重试次数 |
|
||||
|
||||
删除 Conversation 时,本期不建议物理级联法规核查数据。应先标记相关批次归档或删除,再由后台清理任务处理文件和产物。
|
||||
|
||||
---
|
||||
|
||||
## 6. Process Artifact Storage Design

### 6.1 Storage directory

Regulatory review process artifacts are stored in a dedicated directory, isolated by user, conversation, and regulatory review batch:

```text
media/regulatory_review/{user_id}/{conversation_id}/{batch_id}/
```

Example:

```text
media/regulatory_review/12/1001/2001/
  condition_record.md
  condition_record.json
  rule_matrix.xlsx
  risk_list.md
  risk_list.json
  text_extract.json
  rag_result.json
  notification_record.md
  review_record.json
```

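
The directory layout above can be sketched as a small path helper. This is a minimal illustration, not the project's actual code: the function name `regulatory_artifact_dir` and the `media_root` parameter are assumptions introduced here.

```python
from pathlib import PurePosixPath


def regulatory_artifact_dir(user_id: int, conversation_id: int, batch_id: int,
                            media_root: str = "media") -> PurePosixPath:
    """Build the per-batch artifact directory (hypothetical helper)."""
    for name, value in (("user_id", user_id),
                        ("conversation_id", conversation_id),
                        ("batch_id", batch_id)):
        # Reject anything that is not a positive integer so that no
        # path separator or ".." segment can slip into the directory name.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
    return (PurePosixPath(media_root) / "regulatory_review"
            / str(user_id) / str(conversation_id) / str(batch_id))


print(regulatory_artifact_dir(12, 1001, 2001))
```

Restricting the segments to positive integers keeps the user, conversation, and batch isolation intact even if an ID arrives from untrusted input.
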
### 6.2 File hash

`ra_regulatory_artifact.file_hash` is required. SHA-256 is recommended.

| Scenario | Handling |
| --- | --- |
| File generated successfully | Compute the hash, then write the record |
| Hash computation fails | Treat artifact generation as failed; the node ends as partial_success or failed |
| File download | Optionally recompute the hash to verify the file |

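
The hash handling described above might look like the following sketch, which uses only the Python standard library. The helper names `sha256_of_file` and `verify_artifact` are assumptions for illustration; the design does not prescribe them.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Chunked reading keeps memory use flat for large artifacts.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: str, expected_hash: str) -> bool:
    """Optional download-time check: recompute and compare with the stored hash."""
    return sha256_of_file(path) == expected_hash.lower()
```

A hex SHA-256 digest is 64 characters long, so it fits comfortably in the `CharField(128)` column.
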
---
|
||||
|
||||
## 7. Suggested JSON Field Structures

### 7.1 condition_json

```json
{
  "extracted": {
    "product_category": {"value": "in_vitro_diagnostic", "confidence": 0.92},
    "registration_type": {"value": "initial_registration", "confidence": 0.76}
  },
  "confirmed": {
    "confirmed_by": 1,
    "confirmed_at": "2026-06-06T00:00:00+08:00",
    "source": "dialog_choice"
  },
  "raw_user_input": "Handle as initial registration of an in vitro diagnostic reagent"
}
```

### 7.2 risk_summary_json

```json
{
  "blocking": 2,
  "high": 1,
  "medium": 3,
  "low": 4,
  "info": 2,
  "notified": {
    "feishu": 6
  }
}
```

### 7.3 evidence_json

```json
{
  "matched_rule": {
    "rule_id": "ivd_registration_test_report",
    "rule_title": "Registration test report"
  },
  "matched_files": [
    {
      "file_item_id": 33,
      "relative_path": "注册检验/检验报告.pdf",
      "matched_by": "directory_keyword"
    }
  ],
  "rag_citations": [
    {
      "source_file": "体外诊断试剂注册申报资料要求及说明.doc",
      "section_title": "Registration application dossier requirements",
      "snippet": "..."
    }
  ]
}
```

---
|
||||
|
||||
## 8. Suggested Changes to Existing Tables

### 8.1 ra_workflow_node_run

In the phase-one design this table references the file summary batch directly through `batch_id`. Regulatory review reuses the same workflow status mechanism and adopts a generic workflow reference:

| Field | Description |
| --- | --- |
| workflow_type | New; distinguishes file_summary from regulatory_review |
| workflow_batch_id | New; stores the ID of the corresponding workflow batch |
| batch_id | Kept for compatibility with the existing file summary logic |

### 8.2 ra_workflow_event

`workflow_type` and `workflow_batch_id` are added here as well, so that SSE can serve both file summary and regulatory review cards.

### 8.3 ra_exported_summary_file

The final regulatory review report reuses the exported file table. The existing `batch_id` references the file summary batch and needs to be generalized:

| Field | Description |
| --- | --- |
| workflow_type | New; distinguishes file_summary from regulatory_review |
| workflow_batch_id | New; stores the ID of the corresponding workflow batch |
| batch_id | Kept for compatibility with the existing file summary logic |
| export_category | New; distinguishes summary_report, risk_report, excel_list, and json_package |

The final regulatory review report goes into `ExportedSummaryFile`; process artifacts go into `RegulatoryArtifact`.

---
|
||||
|
||||
## 9. Suggested Django Model Names

| Table name | Model name |
| --- | --- |
| ra_regulatory_rule_version | RegulatoryRuleVersion |
| ra_regulatory_review_batch | RegulatoryReviewBatch |
| ra_regulatory_issue | RegulatoryIssue |
| ra_regulatory_artifact | RegulatoryArtifact |
| ra_regulatory_notification_record | RegulatoryNotificationRecord |

---
|
||||
|
||||
## 十、验收检查点
|
||||
|
||||
| 序号 | 检查项 | 验收标准 |
|
||||
| --- | --- | --- |
|
||||
| 1 | 规则版本可追溯 | 每个法规核查批次能查到 rule_version、source_path、rule_file_hash 和 rag_index_version |
|
||||
| 2 | 批次可多次核查 | 同一个 FileSummaryBatch 可创建多个 RegulatoryReviewBatch |
|
||||
| 3 | 软删除可用 | 归档或删除法规核查批次后,默认列表不展示但历史可追溯 |
|
||||
| 4 | 问题可筛选 | 可按 risk_level、status、owner 查询待处理问题 |
|
||||
| 5 | 证据可追溯 | Issue 可查到 file_path、page_no、field_name 和 evidence_json |
|
||||
| 6 | 产物可校验 | 每个 RegulatoryArtifact 都有 file_hash |
|
||||
| 7 | 飞书可重试 | NotificationRecord 可记录 retry_count、send_status 和失败原因 |
|
||||
| 8 | 权限可追溯 | 所有法规核查数据可通过 batch -> conversation -> user 校验访问权限 |
|
||||
|
||||
---
|
||||
|
||||
## 十一、后续实现注意事项
|
||||
|
||||
| 序号 | 问题 | 当前建议 |
|
||||
| --- | --- | --- |
|
||||
| 1 | WorkflowNodeRun/Event 通用化 | 已确定新增 workflow_type 和 workflow_batch_id,保留 batch_id 兼容文件汇总 |
|
||||
| 2 | ExportedSummaryFile 通用化 | 已确定新增 workflow_type、workflow_batch_id 和 export_category |
|
||||
| 3 | RegulatoryArtifact 下载接口 | 按 batch -> conversation -> user 校验权限 |
|
||||
| 4 | 飞书用户映射 | 暂通过 User 扩展字段或配置表映射飞书 CLI 可识别账号 |
|
||||
| 5 | 规则文件 hash 计算时机 | 规则导入或激活时计算并写入 RegulatoryRuleVersion |
|
||||
433
docs/3.数据库设计/3.产品关键信息提取与申报文件自动填表.md
Normal file
@@ -0,0 +1,433 @@
|
||||
# 产品关键信息提取与申报文件自动填表数据库设计
|
||||
|
||||
## 文档信息
|
||||
|
||||
| 项目 | 内容 |
|
||||
| --- | --- |
|
||||
| 需求分析文档 | docs/1.需求分析/3.产品关键信息提取与申报文件自动填表.md |
|
||||
| 功能设计文档 | docs/2.功能设计/3.产品关键信息提取与申报文件自动填表.md |
|
||||
| 数据库类型 | SQLite / Django ORM |
|
||||
| 表名前缀 | ra_ |
|
||||
| 设计日期 | 2026-06-07 |
|
||||
| 设计版本 | V1.0 |
|
||||
|
||||
---
|
||||
|
||||
## 一、设计原则
|
||||
|
||||
| 原则 | 说明 |
|
||||
| --- | --- |
|
||||
| 独立填表批次 | 自动填表作为独立工作流,使用独立批次表,不强绑法规核查批次 |
|
||||
| 复用文件来源 | 填表批次必须关联一个成功的 `FileSummaryBatch`,不重复保存文件清单 |
|
||||
| 可选复用法规条件 | 如当前对话已有已确认法规核查批次,可通过可空外键复用注册类型等条件 |
|
||||
| 导出记录复用 | Word、Excel、JSON、PDF 等下载文件继续进入 `ExportedSummaryFile` |
|
||||
| 过程产物独立 | 自动填表过程产物单独建表,避免和法规核查 `RegulatoryArtifact` 混用 |
|
||||
| 通知记录独立 | 自动填表飞书通知单独建表,字段风格与法规通知记录保持一致 |
|
||||
| 大文本不入库 | 字段抽取 JSON、追溯清单和模板副本保存为文件,数据库仅保存路径、hash 和摘要 |
|
||||
| 字段明细暂不入库 | 本期不新增字段级明细表;字段结果保存在 JSON/Excel 产物与批次摘要中 |
|
||||
| SQLite 兼容 | 字段类型、索引和约束优先保证当前 SQLite + Django ORM 可运行 |
|
||||
|
||||
---
|
||||
|
||||
## 二、ER 图
|
||||
|
||||
```mermaid
|
||||
erDiagram
|
||||
AUTH_USER ||--o{ CONVERSATION : owns
|
||||
CONVERSATION ||--o{ RA_FILE_SUMMARY_BATCH : has
|
||||
RA_FILE_SUMMARY_BATCH ||--o{ RA_FILE_SUMMARY_ITEM : produces
|
||||
RA_FILE_SUMMARY_BATCH ||--o{ RA_APPLICATION_FORM_FILL_BATCH : feeds
|
||||
RA_REGULATORY_REVIEW_BATCH ||--o{ RA_APPLICATION_FORM_FILL_BATCH : optionally_confirms
|
||||
AUTH_USER ||--o{ RA_APPLICATION_FORM_FILL_BATCH : runs
|
||||
CONVERSATION ||--o{ RA_APPLICATION_FORM_FILL_BATCH : has
|
||||
MESSAGE ||--o{ RA_APPLICATION_FORM_FILL_BATCH : triggers
|
||||
RA_APPLICATION_FORM_FILL_BATCH ||--o{ RA_APPLICATION_FORM_FILL_ARTIFACT : keeps
|
||||
RA_APPLICATION_FORM_FILL_BATCH ||--o{ RA_APPLICATION_FORM_FILL_NOTIFICATION_RECORD : sends
|
||||
RA_APPLICATION_FORM_FILL_BATCH ||--o{ RA_EXPORTED_SUMMARY_FILE : exports
|
||||
RA_APPLICATION_FORM_FILL_BATCH ||--o{ RA_WORKFLOW_NODE_RUN : tracks
|
||||
RA_APPLICATION_FORM_FILL_BATCH ||--o{ RA_WORKFLOW_EVENT : emits
|
||||
```
|
||||
|
||||
说明:`ra_workflow_node_run`、`ra_workflow_event`、`ra_exported_summary_file` 已在第二批中被通用化,通过 `workflow_type` 与 `workflow_batch_id` 支持多工作流。本功能使用 `workflow_type=application_form_fill`。
|
||||
|
||||
---
|
||||
|
||||
## 三、表结构设计
|
||||
|
||||
### 3.1 ra_application_form_fill_batch
|
||||
|
||||
一次自动填表工作流批次。该表记录本次触发来源、选择模板、输出类型、注册类型、产品名称、冲突摘要、工作目录和状态。
|
||||
|
||||
| 字段名 | Django 类型 | SQLite 类型 | 必填 | 说明 |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| id | BigAutoField | integer | 是 | 主键 |
|
||||
| conversation_id | ForeignKey | bigint | 是 | 绑定对话 |
|
||||
| user_id | ForeignKey | bigint | 是 | 发起用户 |
|
||||
| trigger_message_id | ForeignKey | bigint | 否 | 触发填表工作流的用户消息 |
|
||||
| source_summary_batch_id | ForeignKey | bigint | 是 | 文件来源汇总批次 |
|
||||
| source_regulatory_batch_id | ForeignKey | bigint | 否 | 可选,复用已确认法规核查批次条件 |
|
||||
| batch_no | CharField(64) | varchar(64) | 是 | 填表批次编号,唯一 |
|
||||
| status | CharField(30) | varchar(30) | 是 | pending、running、waiting_user、success、partial_success、failed、cancelled |
|
||||
| requested_templates | JSONField | text/json | 是 | 用户指定模板编码列表;未指定为空数组 |
|
||||
| selected_templates | JSONField | text/json | 是 | 系统实际选择模板编码列表 |
|
||||
| output_types | JSONField | text/json | 是 | 请求输出类型,如 word、excel、json、pdf |
|
||||
| registration_type | CharField(80) | varchar(80) | 否 | 识别出的注册类型 |
|
||||
| registration_type_source | CharField(40) | varchar(40) | 否 | user_message、regulatory_batch、file_extract、unknown |
|
||||
| product_name | CharField(200) | varchar(200) | 否 | 产品名称 |
|
||||
| conflict_summary | JSONField | text/json | 是 | 冲突字段摘要 |
|
||||
| risk_notes | JSONField | text/json | 是 | 不适用模板、低置信度、PDF 待生成等提示 |
|
||||
| template_config_version | CharField(80) | varchar(80) | 否 | 模板配置版本 |
|
||||
| template_config_hash | CharField(128) | varchar(128) | 否 | 模板配置文件 hash |
|
||||
| work_dir | CharField(500) | varchar(500) | 否 | 批次工作目录 |
|
||||
| error_message | TextField | text | 否 | 批次异常说明 |
|
||||
| created_at | DateTimeField | datetime | 是 | 创建时间 |
|
||||
| started_at | DateTimeField | datetime | 否 | 开始时间 |
|
||||
| finished_at | DateTimeField | datetime | 否 | 完成时间 |
|
||||
| archived_at | DateTimeField | datetime | 否 | 归档时间 |
|
||||
| is_deleted | BooleanField | bool | 是 | 软删除标记 |
|
||||
|
||||
唯一约束:
|
||||
|
||||
| 约束名 | 字段 |
|
||||
| --- | --- |
|
||||
| uq_ra_aff_batch_no | batch_no |
|
||||
|
||||
索引:
|
||||
|
||||
| 索引名 | 字段 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| idx_ra_aff_batch_conv_status | conversation_id, status | 查询对话下填表批次状态 |
|
||||
| idx_ra_aff_batch_summary | source_summary_batch_id | 根据文件汇总批次查询填表历史 |
|
||||
| idx_ra_aff_batch_regulatory | source_regulatory_batch_id | 根据法规核查批次查询关联填表历史 |
|
||||
| idx_ra_aff_batch_user_created | user_id, created_at | 查询用户发起记录 |
|
||||
| idx_ra_aff_batch_created | created_at | 按创建时间查询 |
|
||||
|
||||
---
|
||||
|
||||
### 3.2 ra_application_form_fill_artifact
|
||||
|
||||
自动填表过程产物表。仅保存文件元数据,不保存字段抽取大 JSON 的全文。
|
||||
|
||||
| 字段名 | Django 类型 | SQLite 类型 | 必填 | 说明 |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| id | BigAutoField | integer | 是 | 主键 |
|
||||
| batch_id | ForeignKey | bigint | 是 | 所属自动填表批次 |
|
||||
| artifact_type | CharField(60) | varchar(60) | 是 | template_copy、field_extract_result、merged_fields、traceability、filled_template、notification_record |
|
||||
| file_format | CharField(20) | varchar(20) | 是 | json、excel、docx、pdf、markdown |
|
||||
| name | CharField(160) | varchar(160) | 是 | 产物名称 |
|
||||
| file_name | CharField(255) | varchar(255) | 是 | 文件名 |
|
||||
| storage_path | CharField(500) | varchar(500) | 是 | 存储路径 |
|
||||
| file_size | BigIntegerField | bigint | 是 | 文件大小 |
|
||||
| content_hash | CharField(128) | varchar(128) | 是 | 文件 SHA-256 hash |
|
||||
| metadata | JSONField | text/json | 是 | 模板编码、输出类型、生成状态、错误摘要等 |
|
||||
| created_by_node | CharField(60) | varchar(60) | 否 | 产生该产物的节点 |
|
||||
| created_at | DateTimeField | datetime | 是 | 创建时间 |
|
||||
| is_deleted | BooleanField | bool | 是 | 软删除标记 |
|
||||
|
||||
索引:
|
||||
|
||||
| 索引名 | 字段 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| idx_ra_aff_artifact_batch_type | batch_id, artifact_type | 查询批次过程产物 |
|
||||
| idx_ra_aff_artifact_format | file_format | 按文件格式查询 |
|
||||
| idx_ra_aff_artifact_created | created_at | 按时间追溯 |
|
||||
|
||||
---
|
||||
|
||||
### 3.3 ra_application_form_fill_notification_record
|
||||
|
||||
自动填表飞书通知记录表。通知失败不阻断文件下载,但需要留痕和支持后续重试。
|
||||
|
||||
| 字段名 | Django 类型 | SQLite 类型 | 必填 | 说明 |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| id | BigAutoField | integer | 是 | 主键 |
|
||||
| batch_id | ForeignKey | bigint | 是 | 所属自动填表批次 |
|
||||
| recipient_id | ForeignKey(User) | bigint | 是 | 通知对象,默认上传人/发起人 |
|
||||
| channel | CharField(30) | varchar(30) | 是 | feishu_cli、feishu_api、mock |
|
||||
| template_codes | JSONField | text/json | 是 | 本次通知涉及模板 |
|
||||
| export_ids | JSONField | text/json | 是 | 本次通知关联导出文件 ID |
|
||||
| message_summary | TextField | text | 是 | 通知摘要 |
|
||||
| send_status | CharField(20) | varchar(20) | 是 | pending、success、failed |
|
||||
| retry_count | PositiveIntegerField | integer | 是 | 已重试次数 |
|
||||
| external_message_id | CharField(120) | varchar(120) | 否 | 飞书外部消息 ID |
|
||||
| error_message | TextField | text | 否 | 失败原因 |
|
||||
| sent_at | DateTimeField | datetime | 否 | 发送成功时间 |
|
||||
| created_at | DateTimeField | datetime | 是 | 创建时间 |
|
||||
| updated_at | DateTimeField | datetime | 是 | 更新时间 |
|
||||
| is_deleted | BooleanField | bool | 是 | 软删除标记 |
|
||||
|
||||
索引:
|
||||
|
||||
| 索引名 | 字段 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| idx_ra_aff_notify_batch | batch_id, created_at | 查询批次通知记录 |
|
||||
| idx_ra_aff_notify_recipient | recipient_id, send_status | 查询用户通知状态 |
|
||||
| idx_ra_aff_notify_status | send_status, retry_count | 查询待重试通知 |
|
||||
|
||||
---
|
||||
|
||||
## 四、既有表扩展
|
||||
|
||||
### 4.1 ra_exported_summary_file
|
||||
|
||||
继续复用导出文件表,需扩展导出类型。
|
||||
|
||||
| 字段/枚举 | 处理 |
|
||||
| --- | --- |
|
||||
| export_type | 增加 `word`、`pdf` |
|
||||
| workflow_type | 使用 `application_form_fill` |
|
||||
| workflow_batch_id | 记录 `ApplicationFormFillBatch.id` |
|
||||
| export_category | 使用 `filled_template`、`traceability`、`extract_result` |
|
||||
|
||||
导出类型枚举:
|
||||
|
||||
| value | 中文展示 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| markdown | Markdown | 既有报告 |
|
||||
| excel | Excel | 追溯清单 |
|
||||
| json | JSON | 字段抽取结果包 |
|
||||
| word | Word | 填好的 Word 模板 |
|
||||
| pdf | PDF | Word 转换后的 PDF,P1 预留 |
|
||||
|
||||
### 4.2 ra_workflow_node_run
|
||||
|
||||
本功能使用通用工作流字段:
|
||||
|
||||
| 字段 | 值 |
|
||||
| --- | --- |
|
||||
| workflow_type | application_form_fill |
|
||||
| workflow_batch_id | ApplicationFormFillBatch.id |
|
||||
| node_group | form_fill |
|
||||
| batch_id | 可为空或兼容性填充 source_summary_batch_id |
|
||||
|
||||
### 4.3 ra_workflow_event
|
||||
|
||||
本功能事件写入:
|
||||
|
||||
| 字段 | 值 |
|
||||
| --- | --- |
|
||||
| workflow_type | application_form_fill |
|
||||
| workflow_batch_id | ApplicationFormFillBatch.id |
|
||||
| conversation_id | 当前对话 ID |
|
||||
| payload | 节点状态、模板列表、冲突数量、导出文件等 |
|
||||
|
||||
---
|
||||
|
||||
## 5. Enumeration Design

### 5.1 ApplicationFormFillBatch.status

| value | Chinese display label | Description |
| --- | --- | --- |
| pending | 待执行 | Batch created, waiting to run |
| running | 执行中 | Workflow is running |
| waiting_user | 等待用户 | A file summary batch or a key condition is missing |
| success | 成功 | Word output and the required traceability artifacts were generated |
| partial_success | 部分成功 | Some templates, the PDF, the traceability list, or the notification failed |
| failed | 失败 | Every target Word template failed to generate |
| cancelled | 已取消 | Cancelled by the user or the system |

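
One way to read the terminal states in the table above is as a function of per-step outcomes. The sketch below is an interpretation under stated assumptions, not the workflow's actual logic: the function name `derive_fill_status`, its two boolean lists, and the choice to treat "no Word template attempted" as `failed` are all hypothetical.

```python
from typing import List


def derive_fill_status(word_results: List[bool],
                       secondary_results: List[bool]) -> str:
    """Map step outcomes to a terminal batch status (illustrative only).

    word_results: one flag per target Word template (True = generated).
    secondary_results: flags for PDF, traceability list, notification, etc.
    """
    # failed: every target Word template failed (or none was attempted,
    # which is an assumption made for this sketch).
    if not any(word_results):
        return "failed"
    # success: all Word templates and all secondary steps succeeded.
    if all(word_results) and all(secondary_results):
        return "success"
    # Anything in between still produced usable output.
    return "partial_success"
```

The non-terminal values (`pending`, `running`, `waiting_user`, `cancelled`) are set by the workflow lifecycle rather than derived from step results, so they are left out here.
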
### 5.2 artifact_type
|
||||
|
||||
| value | 说明 |
|
||||
| --- | --- |
|
||||
| template_copy | 模板副本 |
|
||||
| field_extract_result | 规则/正则与 LLM 抽取原始结果 |
|
||||
| merged_fields | 合并后的最终字段和冲突 |
|
||||
| traceability | 字段来源追溯清单 |
|
||||
| filled_template | 已填写模板 |
|
||||
| notification_record | 通知记录产物 |
|
||||
|
||||
### 5.3 registration_type_source
|
||||
|
||||
| value | 说明 |
|
||||
| --- | --- |
|
||||
| user_message | 用户话语明确指定 |
|
||||
| regulatory_batch | 复用已确认法规核查条件 |
|
||||
| file_extract | 从文件内容抽取 |
|
||||
| unknown | 未识别 |
|
||||
|
||||
### 5.4 通知枚举
|
||||
|
||||
| 字段 | value |
|
||||
| --- | --- |
|
||||
| channel | feishu_cli、feishu_api、mock |
|
||||
| send_status | pending、success、failed |
|
||||
|
||||
---
|
||||
|
||||
## 六、JSON 字段结构建议
|
||||
|
||||
### 6.1 requested_templates / selected_templates
|
||||
|
||||
```json
|
||||
["registration_certificate", "essential_principles"]
|
||||
```
|
||||
|
||||
### 6.2 output_types
|
||||
|
||||
```json
|
||||
["word", "excel", "json"]
|
||||
```
|
||||
|
||||
PDF 作为 P1 预留,可在后续加入:
|
||||
|
||||
```json
|
||||
["word", "pdf", "excel", "json"]
|
||||
```
|
||||
|
||||
### 6.3 conflict_summary
|
||||
|
||||
```json
|
||||
[
|
||||
{
|
||||
"field_key": "storage_condition",
|
||||
"field_label": "产品储存条件及有效期",
|
||||
"selected_value": "2-8℃保存,有效期12个月",
|
||||
"selected_source": "说明书.docx",
|
||||
"conflict_values": [
|
||||
{
|
||||
"value": "-20℃保存",
|
||||
"source_file": "产品技术要求.docx",
|
||||
"evidence": "储存条件:-20℃保存"
|
||||
}
|
||||
],
|
||||
"handling": "说明书优先,模板内黄底红字高亮"
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
### 6.4 risk_notes
|
||||
|
||||
```json
|
||||
[
|
||||
{
|
||||
"type": "template_registration_mismatch",
|
||||
"message": "用户指定变更注册(备案)文件,但系统识别注册类型为首次注册,需人工确认。"
|
||||
},
|
||||
{
|
||||
"type": "pdf_pending",
|
||||
"message": "PDF 转换为后续增强项,本次优先生成 Word。"
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
### 6.5 artifact.metadata
|
||||
|
||||
```json
|
||||
{
|
||||
"template_code": "registration_certificate",
|
||||
"output_type": "word",
|
||||
"node_code": "word_fill",
|
||||
"status": "success",
|
||||
"conflict_count": 2
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Storage Path Design

The auto-fill working directory is isolated by user, conversation, and batch:

```text
media/application_form_fill/{user_id}/{conversation_id}/{batch_no}/
```

Directory structure:

```text
media/application_form_fill/12/1001/AFF-20260607153000-a1b2c3/
  templates/
    registration_certificate.source.docx
    essential_principles.source.docx
  filled/
    AFF-20260607153000-a1b2c3-甲胎蛋白检测试剂盒-注册证格式.docx
  exports/
    AFF-20260607153000-a1b2c3-甲胎蛋白检测试剂盒-字段来源追溯清单.xlsx
    field_extract_result.json
    merged_fields.json
  notifications/
    notification_record.json
```

Every artifact written to `ApplicationFormFillArtifact` must record its SHA-256 hash.

---
|
||||
|
||||
## 8. Permission and Query Rules

### 8.1 Batch access permission

```text
ApplicationFormFillBatch -> conversation -> user
must equal the current request.user
```

### 8.2 Export download permission

```text
ExportedSummaryFile.workflow_type == application_form_fill
  -> workflow_batch_id
  -> ApplicationFormFillBatch.conversation.user
```

If `workflow_type` is `file_summary` or `regulatory_review`, the existing checks still apply.

### 8.3 File read permission

Auto-fill may only read the files referenced by `source_summary_batch.items`. It must not read files from other conversations or other batches.

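
The ownership chain in 8.1 and 8.2 can be sketched with plain dataclasses standing in for the Django models. Everything named here (`Conversation`, `FormFillBatch`, `ExportedFile`, `can_access_batch`, `can_download_export`, and the `batches_by_id` lookup) is a simplified stand-in assumed for illustration, not the project's real model or view code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Conversation:
    id: int
    user_id: int  # owner of the conversation


@dataclass(frozen=True)
class FormFillBatch:
    id: int
    conversation: Conversation


@dataclass(frozen=True)
class ExportedFile:
    id: int
    workflow_type: str
    workflow_batch_id: int


def can_access_batch(batch: FormFillBatch, request_user_id: int) -> bool:
    # 8.1: batch -> conversation -> user must equal the current user.
    return batch.conversation.user_id == request_user_id


def can_download_export(export: ExportedFile,
                        batches_by_id: Dict[int, FormFillBatch],
                        request_user_id: int) -> bool:
    # 8.2: only the application_form_fill branch is modelled here; other
    # workflow types keep their existing checks and are out of scope.
    if export.workflow_type != "application_form_fill":
        raise NotImplementedError("handled by the existing permission logic")
    batch = batches_by_id.get(export.workflow_batch_id)
    return batch is not None and can_access_batch(batch, request_user_id)
```

In a Django view the dictionary lookup would presumably be an ORM query filtered by the batch ID and the conversation owner, so a foreign batch simply resolves to "not found".
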
---
|
||||
|
||||
## 九、字段级数据库表暂缓说明
|
||||
|
||||
本期不新增 `ApplicationFormFillField` 字段级明细表。原因:
|
||||
|
||||
| 原因 | 说明 |
|
||||
| --- | --- |
|
||||
| Demo 主链路更轻 | 字段结果以 JSON 和 Excel 追溯清单即可满足下载复核 |
|
||||
| 避免过早建模 | 字段结构依赖模板配置和后续人工修改交互,暂不固化表结构 |
|
||||
| 查询需求有限 | 本期主要按批次下载文件,不做字段级统计和在线编辑 |
|
||||
|
||||
后续如需要在线确认、人工修改、字段级审计或批量统计,再新增字段级表。该事项写入 `docs/6.待办计划/第二阶段暂缓事项.md`。
|
||||
|
||||
---
|
||||
|
||||
## 十、Django Model 命名建议
|
||||
|
||||
| 表名 | Model 名称 |
|
||||
| --- | --- |
|
||||
| ra_application_form_fill_batch | ApplicationFormFillBatch |
|
||||
| ra_application_form_fill_artifact | ApplicationFormFillArtifact |
|
||||
| ra_application_form_fill_notification_record | ApplicationFormFillNotificationRecord |
|
||||
|
||||
建议模型仍集中放在 `review_agent/models.py`,与前两批现有模型保持一致;业务逻辑放在 `review_agent/application_form_fill/`。
|
||||
|
||||
---
|
||||
|
||||
## 十一、验收检查点
|
||||
|
||||
| 序号 | 检查项 | 验收标准 |
|
||||
| --- | --- | --- |
|
||||
| 1 | 独立批次 | 触发填表后生成 `ApplicationFormFillBatch` |
|
||||
| 2 | 文件来源 | 每个填表批次都关联一个成功的 `FileSummaryBatch` |
|
||||
| 3 | 可选法规条件 | 如有关联法规核查批次,可记录 `source_regulatory_batch` |
|
||||
| 4 | 过程产物 | 字段抽取 JSON、合并结果、追溯清单、模板副本均可留底 |
|
||||
| 5 | 导出复用 | 填好的 Word 和追溯清单进入 `ExportedSummaryFile` |
|
||||
| 6 | 导出类型 | `ExportedSummaryFile.ExportType` 支持 `word`、`pdf` |
|
||||
| 7 | 通知记录 | 飞书通知记录能保存状态、重试次数、失败原因 |
|
||||
| 8 | 权限隔离 | A 对话的填表批次和导出文件不能被 B 对话访问 |
|
||||
| 9 | 字段表暂缓 | 字段级结果不入库,但能从 JSON/Excel 追溯产物复核 |
|
||||
|
||||
---
|
||||
|
||||
## 12. Suggested Development Order

1. Extend `ExportedSummaryFile.ExportType` with `word` and `pdf`.
2. Add `ApplicationFormFillBatch`, `ApplicationFormFillArtifact`, and `ApplicationFormFillNotificationRecord`.
3. Define Django `TextChoices` for the new status fields.
4. Configure table names, indexes, and unique constraints.
5. Run `python manage.py makemigrations review_agent` and `python manage.py migrate`.
6. Write model tests covering batch creation, artifact hashes, notification retry fields, and export permission queries.
7. Add the field-level database table and PDF conversion to the to-do plan.

302
docs/3.数据库设计/4.飞书通知与问答接入.md
Normal file
@@ -0,0 +1,302 @@
|
||||
# 飞书通知与问答接入数据库设计
|
||||
|
||||
## 文档信息
|
||||
|
||||
| 项目 | 内容 |
|
||||
| --- | --- |
|
||||
| 需求分析文档 | docs/1.需求分析/4.飞书通知与问答接入.md |
|
||||
| 功能设计文档 | docs/2.功能设计/4.飞书通知与问答接入.md |
|
||||
| 数据库类型 | SQLite / Django ORM |
|
||||
| 表名前缀 | ra_ |
|
||||
| 设计日期 | 2026-06-07 |
|
||||
| 设计版本 | V1.0 |
|
||||
|
||||
---
|
||||
|
||||
## 一、设计原则
|
||||
|
||||
| 原则 | 说明 |
|
||||
| --- | --- |
|
||||
| 统一通知抽象 | 三个工作流共用统一通知服务和通用通知记录,减少重复实现 |
|
||||
| 兼容现有表 | 现有法规通知、填表通知可保留;新增通用表作为后续统一入口 |
|
||||
| 可判重 | 通知记录必须支持同一批次、同一流程、同一状态只发送一次 |
|
||||
| 摘要入库 | 只保存发送摘要、状态、错误,不保存完整富文本 payload |
|
||||
| 映射可维护 | 系统用户与飞书用户映射独立建表,通过 Django Admin 维护 |
|
||||
| 问答可扩展 | 预留问答日志表,首期可不接事件回调 |
|
||||
| SQLite 兼容 | 使用 Django ORM 常规字段,避免数据库特有能力 |
|
||||
|
||||
---
|
||||
|
||||
## 二、ER 图
|
||||
|
||||
```mermaid
|
||||
erDiagram
|
||||
AUTH_USER ||--o{ RA_FEISHU_USER_MAPPING : maps
|
||||
AUTH_USER ||--o{ RA_WORKFLOW_NOTIFICATION_RECORD : triggers
|
||||
RA_FEISHU_USER_MAPPING ||--o{ RA_WORKFLOW_NOTIFICATION_RECORD : resolves
|
||||
AUTH_USER ||--o{ RA_FEISHU_QUESTION_LOG : asks
|
||||
|
||||
RA_WORKFLOW_NOTIFICATION_RECORD {
|
||||
bigint id
|
||||
string workflow_type
|
||||
bigint workflow_batch_id
|
||||
string workflow_status
|
||||
string dedupe_key
|
||||
string channel
|
||||
string target
|
||||
string send_status
|
||||
}
|
||||
|
||||
RA_FEISHU_USER_MAPPING {
|
||||
bigint id
|
||||
bigint system_user_id
|
||||
string feishu_open_id
|
||||
string feishu_user_id
|
||||
string feishu_mobile
|
||||
boolean is_active
|
||||
}
|
||||
|
||||
RA_FEISHU_QUESTION_LOG {
|
||||
bigint id
|
||||
bigint system_user_id
|
||||
string feishu_open_id
|
||||
string intent
|
||||
string query_object
|
||||
string status
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 三、表结构设计
|
||||
|
||||
### 3.1 ra_feishu_user_mapping
|
||||
|
||||
系统用户与飞书用户标识映射表。首期通知发送给环境变量中配置的指定个人账号,本表通过 Django Admin 手工维护,用于后续按发起人私聊通知和飞书私聊问答身份识别。
|
||||
|
||||
| 字段名 | Django 类型 | SQLite 类型 | 必填 | 说明 |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| id | BigAutoField | integer | 是 | 主键 |
|
||||
| system_user_id | ForeignKey | bigint | 是 | 关联 Django 用户 |
|
||||
| feishu_display_name | CharField(120) | varchar(120) | 否 | 飞书展示名,便于后台识别 |
|
||||
| feishu_open_id | CharField(120) | varchar(120) | 否 | 飞书 open_id,优先用于 @ |
|
||||
| feishu_user_id | CharField(120) | varchar(120) | 否 | 飞书 user_id,第二优先级 |
|
||||
| feishu_mobile | CharField(40) | varchar(40) | 否 | 飞书手机号,兜底 |
|
||||
| is_active | BooleanField | bool | 是 | 是否启用 |
|
||||
| remark | CharField(255) | varchar(255) | 否 | 备注 |
|
||||
| created_at | DateTimeField | datetime | 是 | 创建时间 |
|
||||
| updated_at | DateTimeField | datetime | 是 | 更新时间 |
|
||||
|
||||
约束:
|
||||
|
||||
| 约束名 | 字段 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| uq_ra_feishu_mapping_user | system_user_id | 一个系统用户首期只维护一条启用映射 |
|
||||
|
||||
索引:
|
||||
|
||||
| 索引名 | 字段 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| idx_ra_feishu_mapping_active | is_active | 后台筛选启用映射 |
|
||||
| idx_ra_feishu_mapping_open | feishu_open_id | 后续私聊事件反查用户 |
|
||||
| idx_ra_feishu_mapping_userid | feishu_user_id | 后续私聊事件反查用户 |
|
||||
| idx_ra_feishu_mapping_mobile | feishu_mobile | 手机号兜底查询 |
|
||||
|
||||
Validation rules:

| Rule | Description |
| --- | --- |
| At least one Feishu identifier | At least one of `feishu_open_id`, `feishu_user_id`, and `feishu_mobile` must be filled in |
| @ priority | `feishu_open_id -> feishu_user_id -> feishu_mobile` |

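
The two validation rules can be expressed together in one small resolver. This is a sketch under assumptions: the function name `resolve_at_identifier` and its return shape are hypothetical, while the returned type labels follow the `at_identifier_type` values (`open_id`, `user_id`, `mobile`) used by the notification record.

```python
from typing import Optional, Tuple


def resolve_at_identifier(open_id: Optional[str],
                          user_id: Optional[str],
                          mobile: Optional[str]) -> Tuple[str, str]:
    """Pick the identifier used for an @ mention, in priority order."""
    candidates = (("open_id", open_id), ("user_id", user_id), ("mobile", mobile))
    for kind, value in candidates:
        # Blank strings count as "not filled in".
        if value and value.strip():
            return kind, value.strip()
    # Rule 1: a mapping without any Feishu identifier is invalid.
    raise ValueError(
        "at least one of feishu_open_id, feishu_user_id, feishu_mobile is required"
    )
```

In a Django model the same check would typically live in the model's `clean()` method so that the Admin form rejects an empty mapping.
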
---
|
||||
|
||||
### 3.2 ra_workflow_notification_record
|
||||
|
||||
通用工作流通知记录表。用于记录自动汇总、法规核查、自动填表的飞书通知发送结果。现有专项通知表可继续保留,后续逐步收敛到本表。
|
||||
|
||||
| 字段名 | Django 类型 | SQLite 类型 | 必填 | 说明 |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| id | BigAutoField | integer | 是 | 主键 |
|
||||
| workflow_type | CharField(40) | varchar(40) | 是 | file_summary、regulatory_review、application_form_fill |
|
||||
| workflow_batch_id | PositiveBigIntegerField | bigint | 是 | 对应工作流批次 ID |
|
||||
| workflow_batch_no | CharField(80) | varchar(80) | 是 | 批次编号冗余,便于展示 |
|
||||
| workflow_status | CharField(40) | varchar(40) | 是 | success、partial_success、failed 等 |
|
||||
| dedupe_key | CharField(160) | varchar(160) | 是 | 判重键 |
|
||||
| trigger_user_id | ForeignKey | bigint | 是 | 发起人或上传人 |
|
||||
| feishu_mapping_id | ForeignKey | bigint | 否 | 命中的飞书用户映射 |
|
||||
| channel | CharField(40) | varchar(40) | 是 | mock、feishu_api、disabled |
|
||||
| target | CharField(160) | varchar(160) | 否 | 指定个人账号名称、open_id、user_id 或目标标识 |
|
||||
| at_display_name | CharField(120) | varchar(120) | 否 | 被 @ 人展示名 |
|
||||
| at_identifier_type | CharField(30) | varchar(30) | 否 | open_id、user_id、mobile、missing |
|
||||
| at_identifier_masked | CharField(120) | varchar(120) | 否 | 脱敏后的 @ 标识 |
|
||||
| send_status | CharField(30) | varchar(30) | 是 | pending、success、failed、skipped_duplicate、disabled |
|
||||
| message_title | CharField(200) | varchar(200) | 是 | 通知标题 |
|
||||
| message_summary | TextField | text | 否 | 发送摘要,不保存完整 payload |
|
||||
| result_url | CharField(500) | varchar(500) | 否 | 系统结果入口 |
|
||||
| external_message_id | CharField(120) | varchar(120) | 否 | Webhook 一般为空,API 发送时保存 |
|
||||
| error_code | CharField(80) | varchar(80) | 否 | 飞书或客户端错误码 |
|
||||
| error_message | TextField | text | 否 | 失败原因 |
|
||||
| request_duration_ms | PositiveIntegerField | integer | 否 | HTTP 请求耗时 |
|
||||
| sent_at | DateTimeField | datetime | 否 | 成功发送时间 |
|
||||
| created_at | DateTimeField | datetime | 是 | 创建时间 |
|
||||
| updated_at | DateTimeField | datetime | 是 | 更新时间 |
|
||||
|
||||
唯一约束:
|
||||
|
||||
| 约束名 | 字段 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| uq_ra_notify_dedupe_key | dedupe_key | 同一批次、流程、状态只保留一个成功发送意图 |
|
||||
|
||||
索引:
|
||||
|
||||
| 索引名 | 字段 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| idx_ra_notify_workflow | workflow_type, workflow_batch_id | 批次详情页查询通知 |
|
||||
| idx_ra_notify_user_created | trigger_user_id, created_at | 用户通知历史 |
|
||||
| idx_ra_notify_status | send_status, created_at | 排查失败通知 |
|
||||
| idx_ra_notify_batch_no | workflow_batch_no | 按批次编号检索 |
|
||||
|
||||
Rule for generating `dedupe_key`:

```text
{workflow_type}:{workflow_batch_id}:{workflow_status}
```

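
A minimal sketch of the key format and of the "send once" decision it enables follows. The helper names `build_dedupe_key` and `next_send_status` are assumptions, and the in-memory set only stands in for the unique constraint `uq_ra_notify_dedupe_key`; a real implementation would rely on the database constraint instead.

```python
from typing import Set

# Workflow types listed for ra_workflow_notification_record.workflow_type.
WORKFLOW_TYPES = {"file_summary", "regulatory_review", "application_form_fill"}
MAX_KEY_LENGTH = 160  # matches CharField(160) on dedupe_key


def build_dedupe_key(workflow_type: str, workflow_batch_id: int,
                     workflow_status: str) -> str:
    """Build {workflow_type}:{workflow_batch_id}:{workflow_status}."""
    if workflow_type not in WORKFLOW_TYPES:
        raise ValueError(f"unknown workflow_type: {workflow_type!r}")
    key = f"{workflow_type}:{workflow_batch_id}:{workflow_status}"
    if len(key) > MAX_KEY_LENGTH:
        raise ValueError("dedupe_key exceeds the column length")
    return key


def next_send_status(seen_keys: Set[str], key: str) -> str:
    """Return 'pending' for a new key, 'skipped_duplicate' for a repeat."""
    if key in seen_keys:
        return "skipped_duplicate"
    seen_keys.add(key)
    return "pending"
```

Because the status is part of the key, a batch that first ends as `partial_success` and later as `success` yields two distinct keys and therefore two notifications.
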
---
|
||||
|
||||
### 3.3 ra_feishu_question_log
|
||||
|
||||
飞书问答日志预留表。首期可创建表但不接入事件回调;后续私聊问答 MVP 使用该表记录问题、意图、查询对象、回答摘要和错误信息。
|
||||
|
||||
| 字段名 | Django 类型 | SQLite 类型 | 必填 | 说明 |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| id | BigAutoField | integer | 是 | 主键 |
|
||||
| system_user_id | ForeignKey | bigint | 否 | 识别出的系统用户 |
|
||||
| feishu_mapping_id | ForeignKey | bigint | 否 | 命中的飞书映射 |
|
||||
| feishu_open_id | CharField(120) | varchar(120) | 否 | 事件中的 open_id |
|
||||
| feishu_user_id | CharField(120) | varchar(120) | 否 | 事件中的 user_id |
|
||||
| source_type | CharField(30) | varchar(30) | 是 | private_chat、group_mention |
|
||||
| message_id | CharField(120) | varchar(120) | 否 | 飞书消息 ID |
|
||||
| question_text | TextField | text | 是 | 用户原始问题 |
|
||||
| intent | CharField(60) | varchar(60) | 否 | batch_status、risk_summary、export_summary 等 |
|
||||
| query_object | JSONField | text/json | 是 | 批次号、工作流类型、最近批次等查询对象 |
|
||||
| answer_summary | TextField | text | 否 | 回答摘要,不保存完整回答正文 |
|
||||
| permission_result | CharField(40) | varchar(40) | 否 | allowed、denied、unbound |
|
||||
| status | CharField(30) | varchar(30) | 是 | success、failed、ignored |
|
||||
| error_message | TextField | text | 否 | 异常说明 |
|
||||
| processed_at | DateTimeField | datetime | 否 | 处理完成时间 |
|
||||
| created_at | DateTimeField | datetime | 是 | 创建时间 |
|
||||
|
||||
索引:
|
||||
|
||||
| 索引名 | 字段 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| idx_ra_feishu_q_user_created | system_user_id, created_at | 用户问答历史 |
|
||||
| idx_ra_feishu_q_intent | intent, created_at | 按意图分析 |
|
||||
| idx_ra_feishu_q_status | status, created_at | 排查失败问答 |
|
||||
| idx_ra_feishu_q_message | message_id | 消息幂等 |
|
||||
|
||||

---

## 4. Status Enums

### 4.1 WorkflowNotificationRecord.channel

| Value | Description |
| --- | --- |
| mock | Mock notification |
| disabled | Real notification not enabled |
| feishu_api | Feishu official agent / enterprise self-built app message API |
| feishu_webhook | Fallback custom bot webhook; not the primary first-phase approach |

### 4.2 WorkflowNotificationRecord.send_status

| Value | Description |
| --- | --- |
| pending | Waiting to be sent |
| success | Sent successfully |
| failed | Send failed |
| skipped_duplicate | Duplicate notification skipped |
| disabled | Real sending not enabled |

### 4.3 FeishuQuestionLog.intent

| Value | Description |
| --- | --- |
| batch_status | Query batch status |
| risk_summary | Query risk summary |
| missing_summary | Query missing-item summary |
| export_summary | Query export summary |
| unknown | Unrecognized |

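
The three enums above store plain strings in the database. A dependency-free sketch using the standard library `enum` module is shown here so it runs without Django; in the project these would more likely be Django `TextChoices` classes. The class names and the `parse_intent` helper are illustrative assumptions, as is the policy of mapping any unrecognized string to `unknown`.

```python
from enum import Enum


class NotificationChannel(str, Enum):
    MOCK = "mock"
    DISABLED = "disabled"
    FEISHU_API = "feishu_api"
    FEISHU_WEBHOOK = "feishu_webhook"


class SendStatus(str, Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILED = "failed"
    SKIPPED_DUPLICATE = "skipped_duplicate"
    DISABLED = "disabled"


class QuestionIntent(str, Enum):
    BATCH_STATUS = "batch_status"
    RISK_SUMMARY = "risk_summary"
    MISSING_SUMMARY = "missing_summary"
    EXPORT_SUMMARY = "export_summary"
    UNKNOWN = "unknown"


def parse_intent(raw: str) -> QuestionIntent:
    """Map a raw string to a known intent, falling back to UNKNOWN (assumed policy)."""
    try:
        return QuestionIntent(raw)
    except ValueError:
        return QuestionIntent.UNKNOWN


print(parse_intent("risk_summary").value)   # risk_summary
print(parse_intent("something_else").value)  # unknown
```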
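
---

## 5. Compatibility with Existing Tables

| Existing table | Recommended handling |
| --- | --- |
| `ra_regulatory_notification_record` | Keep existing data. Real Feishu notifications for regulatory checks can additionally be written to the generic table; whether to migrate is decided later |
| `ra_application_form_fill_notification_record` | Keep existing data. The auto form-fill notification status display can read the generic table first while staying compatible with the legacy table |
| `ra_exported_summary_file` | The exported-file count in the notification summary comes from this table |
| `ra_workflow_event` | Can record notification node events, but does not replace the notification record table |
| `auth_user` | Feishu mappings reference the system user through a foreign key |

---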
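## 6. Data Masking and Security

| Data | Storage policy |
| --- | --- |
| App ID | Not stored in the database; maintained only in environment variables |
| App Secret | Not stored in the database; maintained only in environment variables |
| tenant_access_token | Not persisted to the database; only short-lived in-process caching is allowed |
| Full rich-text payload | Not stored in the database |
| Mobile number | The mapping table stores the original value; notification records store only the masked value |
| open_id/user_id | The mapping table stores the original value; notification records store the masked value |
| User question | The Q&A log stores the original question for auditing; the full answer body is not stored |
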
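
The design does not specify the masking format for mobile numbers and `open_id`/`user_id` values written to notification records. The sketch below shows one possible approach, keeping a few leading and trailing characters and replacing the middle. The helper name `mask_middle`, the default of 3 leading and 4 trailing characters, and the sample values are all assumptions for illustration.

```python
def mask_middle(value: str, keep_start: int = 3, keep_end: int = 4, fill: str = "*") -> str:
    """Keep the first and last few characters and replace the rest with `fill`.

    Values too short to hide anything are masked entirely.
    """
    if len(value) <= keep_start + keep_end:
        return fill * len(value)
    hidden = len(value) - keep_start - keep_end
    # Slice from an explicit index so keep_end=0 yields an empty tail.
    tail = value[len(value) - keep_end:]
    return value[:keep_start] + fill * hidden + tail


# Sample values are made up.
print(mask_middle("13812345678"))          # 138****5678
print(mask_middle("ou_abcdef1234567890"))  # ou_************7890
```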

---

## 7. Migration Plan

| Step | Description |
| --- | --- |
| 1 | Add the `FeishuUserMapping` model and its migration |
| 2 | Add the `WorkflowNotificationRecord` model and its migration |
| 3 | Add the reserved `FeishuQuestionLog` model and its migration |
| 4 | Register the Django Admin management entries |
| 5 | Have the batch detail page query and display the generic notification records |
| 6 | Keep the existing dedicated notification tables; no destructive migration |

---
## 8. Acceptance SQL Examples

Query the notification status of a batch:

```sql
SELECT workflow_type, workflow_batch_no, workflow_status, channel, send_status, sent_at, error_message
FROM ra_workflow_notification_record
WHERE workflow_type = 'application_form_fill'
AND workflow_batch_no = 'AFF-20260607-001'
ORDER BY created_at DESC;
```

Query failed or degraded notifications where no Feishu mapping is configured:

```sql
SELECT workflow_type, workflow_batch_no, trigger_user_id, send_status, message_summary
FROM ra_workflow_notification_record
WHERE at_identifier_type = 'missing'
ORDER BY created_at DESC;
```

Query Feishu user mappings:

```sql
SELECT u.username, m.feishu_display_name, m.feishu_open_id, m.feishu_user_id, m.feishu_mobile, m.is_active
FROM ra_feishu_user_mapping m
JOIN auth_user u ON u.id = m.system_user_id
ORDER BY u.username;
```
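
The batch-status query can be tried without the full application by running it against an in-memory SQLite table. This is a sketch under assumptions: the stand-in table contains only the columns that query touches, all column types are simplified to `TEXT`, and the sample rows (including the `completed` status) are invented. The query is parameterized here instead of using literals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Minimal stand-in: only the columns used by the batch-status query.
conn.execute(
    """
    CREATE TABLE ra_workflow_notification_record (
        id INTEGER PRIMARY KEY,
        workflow_type TEXT, workflow_batch_no TEXT, workflow_status TEXT,
        channel TEXT, send_status TEXT, sent_at TEXT,
        error_message TEXT, created_at TEXT
    )
    """
)
# Invented sample rows: two for the target batch, one for another batch.
conn.executemany(
    "INSERT INTO ra_workflow_notification_record "
    "(workflow_type, workflow_batch_no, workflow_status, channel, send_status, "
    "sent_at, error_message, created_at) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("application_form_fill", "AFF-20260607-001", "completed", "mock",
         "pending", None, None, "2026-06-07 10:00:00"),
        ("application_form_fill", "AFF-20260607-001", "completed", "feishu_api",
         "success", "2026-06-07 10:05:03", None, "2026-06-07 10:05:00"),
        ("application_form_fill", "AFF-20260607-002", "completed", "mock",
         "failed", None, "timeout", "2026-06-07 11:00:00"),
    ],
)

QUERY = """
SELECT workflow_type, workflow_batch_no, workflow_status, channel, send_status, sent_at, error_message
FROM ra_workflow_notification_record
WHERE workflow_type = ? AND workflow_batch_no = ?
ORDER BY created_at DESC
"""
result = conn.execute(QUERY, ("application_form_fill", "AFF-20260607-001")).fetchall()
for row in result:
    print(row)
```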