# Detailed Design: Chapter 1 Regulatory Information Package Generation

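## Document Information

| Item | Content |
| --- | --- |
| Requirements analysis document | docs/1.需求分析/5.第1章监管信息材料包生成.md |
| Functional design document | docs/2.功能设计/5.第1章监管信息材料包生成.md |
| Database design document | docs/3.数据库设计/5.第1章监管信息材料包生成.md |
| Reference detailed design | docs/4.详细设计/3.产品关键信息提取与申报文件自动填表.md |
| Feature name | Chapter 1 Regulatory Information Package Generation (第1章监管信息材料包生成) |
| Workflow code | regulatory_info_package |
| Owning module | Review agent `review_agent` |
| Design date | 2026-06-10 |
| Design version | V1.0 |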

---

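## 1. Detailed Design Goals

This detailed design guides the implementation of the standalone `regulatory_info_package` workflow. From an instruction manual (说明书) that the user uploads or names, the system extracts key product information, generates the Chapter 1 regulatory information package from the sample templates under `docs/0.原始材料/第1章 监管信息`, and offers `第1章 监管信息(预生成版).zip` as the first download entry in the conversation summary.

Core constraints:

| Constraint | Description |
| --- | --- |
| Standalone workflow | Uses `workflow_type=regulatory_info_package` with its own batches, artifacts, notifications, and cards |
| Standalone module | Adds `review_agent/regulatory_info_package/` as a sibling of `application_form_fill` |
| Centralized models | Django models stay in `review_agent/models.py` |
| Input priority | A file name given in the user message wins; next come active attachments; the most recent successful file summary is a compatibility fallback |
| Fixed templates | Always processes the 7 Chapter 1 regulatory information templates |
| Rules first, demo-ready | Rule-based extraction must run end to end on its own; the LLM call is attempted at most 3 times, and the workflow continues after the final failure |
| Concurrent document generation | The workflow as a whole is serial; inside the `generate_docs` node each document may be processed in its own thread |
| `.doc` fallback | Prefer a native `.doc` write; on failure a `.docx` fallback file is allowed |
| zip contains only successful files | The zip packages only files that succeeded directly or through the fallback; failed files are excluded |
| Highlight rules | Missing and LLM-only values get yellow shading; conflicts get yellow shading with red text |
| Traceability output | Users download the Excel file; the JSON is saved only to the backend `logs` directory |
| Minimal frontend integration | No UI for choosing among multiple instruction manuals; when unsure, ask the user in the conversation |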

---

## 2. Code Structure Design

### 2.1 Directory Structure

```text
review_agent/
  models.py
  services.py
  skill_router.py
  regulatory_info_package/
    __init__.py
    constants.py
    schemas.py
    storage.py
    events.py
    workflow.py
    views.py
    services/
      __init__.py
      input_select.py
      template_config.py
      template_repository.py
      instruction_extract.py
      field_extract.py
      field_merge.py
      standard_candidates.py
      document_writer.py
      docx_document.py
      legacy_doc_document.py
      package_generate.py
      traceability_export.py
      zip_export.py
      summary.py
      notifier.py
    templates/
      regulatory_info_package_templates_v1.yaml
    prompts/
      field_extract.md
```

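### 2.2 File Responsibilities

| File | Responsibility |
| --- | --- |
| constants.py | Workflow code, node definitions, trigger keywords, template codes, status constants |
| schemas.py | Dataclass structures such as `TemplateSpec`, `InstructionExtractResult`, `MergedField`, `GeneratedFileResult` |
| storage.py | Batch directory, subdirectories, hashing, artifact creation, path safety checks |
| events.py | Records and serializes `WorkflowEvent` |
| workflow.py | `RegulatoryInfoPackageWorkflowExecutor`, batch creation, workflow start |
| views.py | health, start, status, and select-input endpoints |
| input_select.py | Selects the instruction manual from the user message, active attachments, and file summary |
| template_config.py | YAML loading, validation, hashing |
| template_repository.py | Locates the sample templates and copies them into the batch directory |
| instruction_extract.py | Parses instruction manual paragraphs, sections, tables, and component tables |
| field_extract.py | Runs rule-based and LLM extraction in parallel; the LLM gets at most 3 attempts |
| field_merge.py | Merges fields and outputs missing, LLM-only, and conflict items plus highlight decisions |
| standard_candidates.py | Extracts standard numbers from the instruction manual and queries the existing knowledge base for candidates |
| document_writer.py | Document adapter interface and shared highlight policy |
| docx_document.py | `DocxDocumentAdapter`, handles `.docx` |
| legacy_doc_document.py | `LegacyDocDocumentAdapter`, handles native `.doc` writes and the `.docx` fallback |
| package_generate.py | The 7 document generation strategies; generates files in multiple threads |
| traceability_export.py | Generates `exports/traceability.xlsx` and `logs/traceability.json` |
| zip_export.py | Generates the primary download zip, containing only successful files |
| summary.py | Builds the assistant reply with the zip link first |
| notifier.py | Writes the workflow-specific notification record and calls the unified notification service |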

---

## 3. Data Model Design

The models live in `review_agent/models.py`.

### 3.1 RegulatoryInfoPackageBatch

```python
class RegulatoryInfoPackageBatch(models.Model):
    class Status(models.TextChoices):
        PENDING = "pending", "待执行"
        RUNNING = "running", "执行中"
        WAITING_USER = "waiting_user", "等待用户"
        SUCCESS = "success", "成功"
        PARTIAL_SUCCESS = "partial_success", "部分成功"
        FAILED = "failed", "失败"
        CANCELLED = "cancelled", "已取消"
```

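Key fields:

| Field | Description |
| --- | --- |
| conversation | Owning conversation |
| user | Initiating user |
| trigger_message | Message that triggered the workflow |
| source_attachment | Instruction manual attachment selected directly; nullable |
| source_summary_batch | File summary batch kept for compatibility; nullable |
| source_summary_item_id | File summary item ID; nullable |
| batch_no | `RIP-YYYYMMDDHHMMSS-abcdef` |
| source_file_name | Original file name of the instruction manual |
| source_storage_path | Storage path of the instruction manual |
| product_name | Extracted product name |
| output_zip_name | `第1章 监管信息(预生成版).zip` |
| generated_files | Status of the 7 files |
| missing_fields | Missing fields |
| llm_only_fields | LLM-only fields |
| conflict_fields | Conflicting fields |
| risk_notes | Risk and degradation notes |
| adapter_summary | Summary of what the doc/docx adapters actually did |
| template_config_version/hash | Template configuration version and hash |
| work_dir | Batch working directory |
| is_deleted | Soft-delete flag |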

### 3.2 RegulatoryInfoPackageArtifact

```python
class RegulatoryInfoPackageArtifact(models.Model):
    class ArtifactType(models.TextChoices):
        TEMPLATE_COPY = "template_copy", "模板副本"
        INSTRUCTION_EXTRACT = "instruction_extract", "说明书抽取结果"
        FIELD_EXTRACT_RESULT = "field_extract_result", "字段抽取结果"
        MERGED_FIELDS = "merged_fields", "合并字段"
        GENERATED_DOCUMENT = "generated_document", "生成文件"
        TRACEABILITY = "traceability", "追溯清单"
        ZIP_PACKAGE = "zip_package", "ZIP包"
        NOTIFICATION_RECORD = "notification_record", "通知记录"
```

`file_format` values: `json`, `excel`, `docx`, `doc`, `zip`, `markdown`.

### 3.3 RegulatoryInfoPackageNotificationRecord

The fields mirror the notification record of the automatic form-filling workflow: `batch`, `recipient`, `channel`, `export_ids`, `message_summary`, `send_status`, `retry_count`, `external_message_id`, `error_message`, `sent_at`, `is_deleted`.

### 3.4 ExportedSummaryFile Extension

Add to `ExportedSummaryFile.ExportType`:

```python
ZIP = "zip", "ZIP"
```

The download MIME type falls back to a lookup by extension:

| Condition | MIME |
| --- | --- |
| zip | application/zip |
| .doc | application/msword |
| .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
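
As a rough illustration of the extension fallback, the lookup could look like the sketch below. The names `FALLBACK_MIME_BY_SUFFIX` and `guess_download_mime` are hypothetical, and the `application/octet-stream` default is an assumption that the document does not state.

```python
from pathlib import Path

# The three mappings come from the table above.
FALLBACK_MIME_BY_SUFFIX = {
    ".zip": "application/zip",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
}


def guess_download_mime(file_name: str, default: str = "application/octet-stream") -> str:
    """Return the MIME type for a download based on its file extension."""
    # Assumption: unknown extensions fall back to a generic binary type.
    return FALLBACK_MIME_BY_SUFFIX.get(Path(file_name).suffix.lower(), default)
```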

---

## 4. Constants Design

### 4.1 Workflow Constants

```python
WORKFLOW_TYPE = "regulatory_info_package"
DEFAULT_ZIP_NAME = "第1章 监管信息(预生成版).zip"

REGULATORY_INFO_PACKAGE_NODE_DEFINITIONS = [
    ("prepare", "准备资料", "regulatory_info_package"),
    ("template_copy", "复制模板", "regulatory_info_package"),
    ("text_extract", "抽取说明书", "regulatory_info_package"),
    ("field_extract", "抽取字段", "regulatory_info_package"),
    ("field_merge", "合并字段", "regulatory_info_package"),
    ("generate_docs", "生成材料", "regulatory_info_package"),
    ("highlight_review_items", "标记待确认", "regulatory_info_package"),
    ("trace_export", "追溯清单", "regulatory_info_package"),
    ("zip_export", "打包下载", "regulatory_info_package"),
    ("notify", "通知", "regulatory_info_package"),
    ("completed", "完成", "completed"),
]
```

### 4.2 Trigger Keywords

```python
REGULATORY_INFO_PACKAGE_TRIGGER_KEYWORDS = [
    "根据说明书生成第1章监管信息",
    "生成监管信息材料包",
    "从说明书生成第1章材料",
    "第1章监管信息",
    "监管信息材料包",
]
```

### 4.3 File Statuses

```python
GENERATED_FILE_SUCCESS = "success"
GENERATED_FILE_FALLBACK_SUCCESS = "fallback_success"
GENERATED_FILE_FAILED = "failed"
GENERATED_FILE_SKIPPED = "skipped"
```

---

## 5. Core Data Structures

### 5.1 TemplateSpec

```python
@dataclass(frozen=True)
class TemplateSpec:
    code: str
    output_name: str
    source_file: str
    file_format: str
    strategy: str
    include_in_zip: bool
    require_legacy_doc_native: bool = False
    fields: list[dict[str, Any]] = field(default_factory=list)
```

### 5.2 InstructionExtractResult

```python
@dataclass
class InstructionExtractResult:
    source_file_name: str
    paragraphs: list[str]
    sections: dict[str, str]
    tables: list[list[list[str]]]
    component_tables: list["ComponentTable"]
    front_text: str
```

### 5.3 ProductListRow

```python
@dataclass
class ProductListRow:
    package_specification: str
    item_no: str
    composition: str
    component_name: str
    main_component: str
    quantity: str
    source_table_title: str
    needs_review_fields: list[str] = field(default_factory=list)
```

`item_no` is the catalog number (货号); in this release it is always written as `/` with yellow shading.

### 5.4 MergedField

```python
@dataclass
class MergedField:
    key: str
    label: str
    value: str
    source: str
    evidence: str
    confidence: float
    highlight_reason: str = "none"
    needs_review: bool = False
    rule_value: str = ""
    llm_value: str = ""
```

### 5.5 GeneratedFileResult

```python
@dataclass
class GeneratedFileResult:
    template_code: str
    file_name: str
    requested_format: str
    actual_format: str
    status: str
    path: str = ""
    artifact_id: int | None = None
    export_id: int | None = None
    highlight_count: int = 0
    missing_count: int = 0
    llm_only_count: int = 0
    error_message: str = ""
```

---

## 6. Storage Directory Design

```text
media/regulatory_info_package/{user_id}/{conversation_id}/{batch_no}/
  templates/
  logs/
    instruction_extract.json
    field_extract_result.json
    merged_fields.json
    doc_adapter_result.json
    traceability.json
  generated/
    CH1.2 监管信息目录.docx
    CH1.4 申请表.docx
    CH1.5 产品列表.docx
    CH1.9 产品申报前沟通的说明.doc    (.docx when the fallback path is used)
    CH1.11.1 符合标准的清单.docx
    CH1.11.5 真实性声明.docx
    CH1.11.6 符合性声明.docx
  exports/
    traceability.xlsx
    第1章 监管信息(预生成版).zip
```

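Notes:

| Directory | Description |
| --- | --- |
| templates | Template copies |
| logs | Backend JSON artifacts; not offered as a primary user download |
| generated | Individual files that succeeded directly or through the fallback |
| exports | The traceability Excel file and the zip that users can download |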
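
`storage.py` is responsible for path safety checks on this directory tree. One common way to do that is to resolve the target and confirm it stays under the batch directory. The sketch below assumes that approach; `resolve_inside` is a hypothetical helper name, not an identifier from this design.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, relative: str) -> Path:
    """Resolve `relative` under `base_dir`, rejecting paths that escape it."""
    base = base_dir.resolve()
    target = (base / relative).resolve()
    # An absolute `relative` or a "../" sequence ends up outside `base`.
    if not target.is_relative_to(base):
        raise ValueError(f"path escapes batch directory: {relative}")
    return target
```

`Path.is_relative_to` requires Python 3.9 or later.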

---

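## 7. Input Selection Design

### 7.1 Selection Priority

`input_select.py` selects in this order:

1. When the user message explicitly names a file, fuzzy-match it against the active attachment names.
2. A `.docx` among the conversation's active attachments whose file name contains “说明书”.
3. The only `.docx` among the conversation's active attachments.
4. A `.docx` whose name contains “说明书” in the most recent successful `FileSummaryBatch.items`.
5. With multiple candidates or none, return `InputSelectionResult(status="waiting_user")`.

### 7.2 Multiple Candidates

This release adds no online selection dialog. Handling by scenario:

| Scenario | Handling |
| --- | --- |
| The user message fuzzy-matches exactly one attachment | Select it directly |
| Multiple candidates and no way to decide | Ask the user in the conversation which instruction manual to use |
| No instruction manual | Prompt the user to upload the product instruction manual |

Example follow-up question: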

```text
我找到多个说明书候选，请回复要使用的文件名：A.docx、B.docx。
```
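
The priority order in 7.1 can be sketched over plain file-name lists. This is only an illustration under several assumptions: `pick_instruction_file` is a hypothetical name, "fuzzy match" is approximated as "the file stem appears in the message", the `"selected"` status value is invented for the example, and only the `status` field of `InputSelectionResult` comes from this design.

```python
from dataclasses import dataclass, field

MANUAL_KEYWORD = "说明书"


@dataclass
class InputSelectionResult:
    status: str  # "selected" (assumed value) or "waiting_user"
    file_name: str = ""
    candidates: list[str] = field(default_factory=list)


def _pick(candidates: list[str]) -> InputSelectionResult:
    # Exactly one candidate is selected; anything else waits for the user.
    if len(candidates) == 1:
        return InputSelectionResult("selected", candidates[0])
    return InputSelectionResult("waiting_user", candidates=candidates)


def pick_instruction_file(
    message: str, active_names: list[str], summary_names: list[str]
) -> InputSelectionResult:
    docx = [n for n in active_names if n.lower().endswith(".docx")]
    # 1. File explicitly named in the message (stem substring match).
    named = [n for n in active_names if (stem := n.rsplit(".", 1)[0]) and stem in message]
    if named:
        return _pick(named)
    # 2. Active .docx attachments whose name contains the keyword.
    manuals = [n for n in docx if MANUAL_KEYWORD in n]
    if manuals:
        return _pick(manuals)
    # 3. The only active .docx attachment.
    if len(docx) == 1:
        return InputSelectionResult("selected", docx[0])
    # 4. Keyword .docx files from the latest successful file summary.
    summary = [
        n for n in summary_names
        if n.lower().endswith(".docx") and MANUAL_KEYWORD in n
    ]
    if len(summary) == 1:
        return InputSelectionResult("selected", summary[0])
    # 5. Multiple candidates or none: ask the user.
    return InputSelectionResult("waiting_user", candidates=summary or docx)
```

An empty `candidates` list on a `waiting_user` result corresponds to the "no instruction manual" row, where the reply asks for an upload instead of a choice.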

---

## 8. Template Configuration Design

Configuration path:

```text
review_agent/regulatory_info_package/templates/regulatory_info_package_templates_v1.yaml
```

The configuration must contain these 7 templates:

| code | source_file | strategy |
| --- | --- | --- |
| ch1_2_directory | CH1.2 监管信息目录.docx | directory |
| ch1_4_application_form | CH1.4 申请表.docx | application_form |
| ch1_5_product_list | CH1.5 产品列表.docx | product_list |
| ch1_9_pre_submission | CH1.9 产品申报前沟通的说明.doc | pre_submission |
| ch1_11_1_standard_list | CH1.11.1 符合标准的清单.docx | standard_list |
| ch1_11_5_authenticity | CH1.11.5 真实性声明.docx | authenticity_statement |
| ch1_11_6_compliance | CH1.11.6 符合性声明.docx | compliance_statement |

Validation rules:

| Check | Description |
| --- | --- |
| version is required | Written to the batch |
| source_dir exists | Points to the sample directory |
| code is unique | Prevents artifacts from overwriting each other |
| source_file exists | A missing file is a configuration error |
| strategy is valid | Must match a generation strategy |
| doc template flag | `.doc` templates must declare `require_legacy_doc_native` |
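
The validation rules above could be applied to the parsed YAML roughly as follows. This sketch works on an already-loaded `dict`; the top-level `templates` key and the function name `validate_template_config` are assumptions, while `version`, `source_dir`, and the per-template keys follow the tables above.

```python
from pathlib import Path

ALLOWED_STRATEGIES = {
    "directory", "application_form", "product_list", "pre_submission",
    "standard_list", "authenticity_statement", "compliance_statement",
}


def validate_template_config(config: dict) -> list[str]:
    """Return a list of human-readable configuration errors (empty if valid)."""
    errors: list[str] = []
    if not config.get("version"):
        errors.append("version is required")
    raw_dir = config.get("source_dir") or ""
    source_dir = Path(raw_dir)
    if not raw_dir or not source_dir.is_dir():
        errors.append("source_dir does not exist")
    seen_codes: set[str] = set()
    # Assumption: templates are listed under a top-level "templates" key.
    for template in config.get("templates", []):
        code = template.get("code", "")
        if code in seen_codes:
            errors.append(f"duplicate code: {code}")
        seen_codes.add(code)
        source_file = template.get("source_file", "")
        if not (source_dir / source_file).is_file():
            errors.append(f"missing source_file: {source_file}")
        if template.get("strategy") not in ALLOWED_STRATEGIES:
            errors.append(f"unknown strategy for {code}")
        if source_file.lower().endswith(".doc") and "require_legacy_doc_native" not in template:
            errors.append(f"{code}: .doc template must declare require_legacy_doc_native")
    return errors
```

Collecting every error before failing lets the `failed` batch list all missing templates at once, as the exception matrix requires.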

---

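## 9. Field Extraction Design

### 9.1 Rule-Based Extraction

Rule-based extraction must work on its own and cover:

| Field | Rule |
| --- | --- |
| product_name | The paragraph following `【产品名称】` |
| package_specification | From `【包装规格】` to the next section |
| intended_use | From `【预期用途】` to the next section |
| detection_principle | From `【检测原理】` to the next section |
| main_components | Summary of the table under `【主要组成成分】` |
| storage_condition_and_validity | From `【储存条件及有效期】` to the next section |
| sample_type | “适用样本类型” in the sample requirements section |
| detection_targets | Genes, pathogens, and targets named in the intended use or detection principle |
| applicable_instruments | From `【适用仪器】` to the next section |
| test_method | Summary of `【检验方法】` |
| standards | Standard numbers extracted by regular expression |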
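
A minimal sketch of the section-based rules, assuming the instruction manual text has already been flattened to a string with `【…】` section headings. The function names are hypothetical, and the standard-number pattern (GB, GB/T, YY, YY/T followed by a number and a four-digit year) is an assumption about what the regular expression should accept.

```python
import re

# A section runs from its 【title】 to the next 【title】 or the end of the text.
SECTION_PATTERN = re.compile(r"【(?P<title>[^】]+)】(?P<body>.*?)(?=【[^】]+】|\Z)", re.S)
# Assumed standard-number shapes, e.g. "GB/T 191-2008" or "YY/T 0466.1-2016".
STANDARD_PATTERN = re.compile(r"(?<![A-Za-z])(?:GB|YY)(?:/T)?\s?\d+(?:\.\d+)*-\d{4}(?!\d)")


def split_sections(text: str) -> dict[str, str]:
    """Map each section title to its body text."""
    return {m["title"].strip(): m["body"].strip() for m in SECTION_PATTERN.finditer(text)}


def first_paragraph(body: str) -> str:
    """Return the first non-empty line, used for fields like product_name."""
    for line in body.splitlines():
        if line.strip():
            return line.strip()
    return ""


def extract_standards(text: str) -> list[str]:
    """Return standard numbers in order of first appearance, without duplicates."""
    return list(dict.fromkeys(STANDARD_PATTERN.findall(text)))
```

The pattern uses a letter lookbehind instead of `\b` because Python treats CJK characters as word characters, so `\b` would not match between a Chinese character and `GB`.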

### 9.2 LLM Extraction and Retry

`field_extract.py` runs rule-based extraction and LLM extraction in parallel:

```text
ThreadPoolExecutor(max_workers=2)
  -> rule_extract()
  -> llm_extract_with_retry(max_attempts=3)
```

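LLM retry policy:

| Attempt | Delay |
| --- | --- |
| 1st | Immediate |
| 2nd | Wait 1 second |
| 3rd | Wait 2 seconds |

After three failed attempts:

| Artifact | Handling |
| --- | --- |
| risk_notes | Add `llm_extract_failed` |
| logs/field_extract_result.json | Record an error summary for each attempt |
| Workflow | Continue with the rule-based results |

The LLM must not fill in content that the instruction manual cannot substantiate, such as company information, classification codes, management category, or clinical evaluation pathway.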
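
The retry schedule above can be sketched as a small wrapper. The signature here is an assumption made so the example is self-contained (the design only shows `llm_extract_with_retry(max_attempts=3)`): `call_llm` stands in for the real extraction call, and `sleep` is injectable so tests do not actually wait.

```python
import time
from typing import Any, Callable

# Delay before attempts 1, 2, and 3, taken from the retry table.
RETRY_DELAYS_SECONDS = (0, 1, 2)


def llm_extract_with_retry(
    call_llm: Callable[[], Any],
    *,
    delays: tuple[int, ...] = RETRY_DELAYS_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[Any, list[str]]:
    """Return (result, error_summaries); result is None if every attempt fails."""
    errors: list[str] = []
    for attempt, delay in enumerate(delays, start=1):
        if delay:
            sleep(delay)
        try:
            return call_llm(), errors
        except Exception as exc:  # Any LLM failure counts as a failed attempt.
            errors.append(f"attempt {attempt}: {exc}")
    return None, errors
```

When the result is `None`, the caller would add `llm_extract_failed` to `risk_notes`, write `errors` to `logs/field_extract_result.json`, and continue with the rule-based values.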

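### 9.3 Field Merge

| Scenario | Written value | Highlight | needs_review |
| --- | --- | --- | --- |
| Rule and LLM agree | Rule/LLM value | No | No |
| Rule and LLM conflict | Rule value first, or the configured priority | Yellow shading with red text | Yes |
| Rule missing, LLM hit | LLM value | Yellow shading | Yes |
| Both missing | `/` | Yellow shading | Yes |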
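
The merge table maps directly onto a small decision function. In this sketch `MergeDecision` and `merge_field` are hypothetical names, conflicts are resolved rule-first (the configurable priority is left out), and the case the table does not list, a rule hit with no LLM value, is assumed to behave like agreement.

```python
from dataclasses import dataclass


@dataclass
class MergeDecision:
    value: str
    source: str            # "rule", "llm", or "missing"
    highlight_reason: str  # "none", "conflict", "llm_only", or "missing"
    needs_review: bool


def merge_field(rule_value: str, llm_value: str) -> MergeDecision:
    rule_value, llm_value = rule_value.strip(), llm_value.strip()
    if rule_value and llm_value and rule_value != llm_value:
        # Conflict: keep the rule value, flag for yellow shading with red text.
        return MergeDecision(rule_value, "rule", "conflict", True)
    if rule_value:
        # Agreement, or (assumption) rule hit while the LLM returned nothing.
        return MergeDecision(rule_value, "rule", "none", False)
    if llm_value:
        return MergeDecision(llm_value, "llm", "llm_only", True)
    return MergeDecision("/", "missing", "missing", True)
```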

---

## 10. Document Adapter Design

### 10.1 Unified Interface

```python
from pathlib import Path
from typing import Protocol


class DocumentAdapter(Protocol):
    def replace_text(self, old: str, new: str, *, highlight: bool = False, conflict: bool = False) -> int: ...
    def fill_table_cell(self, row_label: str, value: str, *, highlight: bool = False, conflict: bool = False) -> bool: ...
    def replace_table(self, marker: str, rows: list[ProductListRow], *, highlight_columns: list[str] | None = None) -> bool: ...
    def save(self, path: Path) -> Path: ...
```

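Highlight rules:

| Type | Appearance |
| --- | --- |
| missing | Yellow shading |
| llm_only | Yellow shading |
| conflict | Yellow shading + red text |

### 10.2 DocxDocumentAdapter

Capabilities:

| Method | Description |
| --- | --- |
| replace_text | Replaces text in paragraphs and tables; must handle text split across runs |
| fill_table_cell | Locates the target cell by row label |
| replace_table | Rebuilds the CH1.5 product list table |
| apply_highlight | Sets yellow shading through `w:shd` |
| apply_conflict_style | Yellow shading + red text |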

### 10.3 LegacyDocDocumentAdapter

Interface:

```python
from dataclasses import dataclass


@dataclass
class AdapterCapability:
    adapter_name: str
    supports_native_doc_write: bool
    supports_docx_fallback: bool
    status: str
    error_message: str = ""

class LegacyDocDocumentAdapter:
    @staticmethod
    def detect_available_adapter() -> AdapterCapability: ...
```

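Execution order:

1. First try `WordComDocAdapter` to open the `.doc` natively and save it as `.doc`.
2. If the native path fails, try saving the `.doc` as `.docx` and hand it to `DocxDocumentAdapter`.
3. When the fallback succeeds, output `CH1.9 产品申报前沟通的说明.docx`.
4. When both the native path and the fallback fail, the file status is `failed` and the file is excluded from the zip.

On fallback success, the `doc` entry of `adapter_summary` is: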

```json
{
  "requested_format": "doc",
  "actual_format": "docx",
  "adapter": "ConversionFallbackAdapter",
  "status": "fallback_success"
}
```
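
The native-then-fallback order can be sketched with the two writers passed in as callables, which keeps the example independent of Word COM. `write_legacy_doc`, `native_write`, and `fallback_write` are hypothetical names; only the status values and the `requested_format` / `actual_format` keys follow this design.

```python
from pathlib import Path
from typing import Callable

Writer = Callable[[Path, Path], Path]  # (template, output_path) -> written path


def write_legacy_doc(
    template: Path, out_dir: Path, native_write: Writer, fallback_write: Writer
) -> dict:
    """Try the native .doc writer first, then the .docx fallback."""
    try:
        path = native_write(template, out_dir / template.name)
        return {"requested_format": "doc", "actual_format": "doc",
                "status": "success", "path": str(path)}
    except Exception as native_error:
        try:
            # The fallback keeps the file stem and switches the extension.
            path = fallback_write(template, out_dir / f"{template.stem}.docx")
            return {"requested_format": "doc", "actual_format": "docx",
                    "status": "fallback_success", "path": str(path)}
        except Exception as fallback_error:
            return {"requested_format": "doc", "actual_format": "",
                    "status": "failed", "path": "",
                    "error_message": f"native: {native_error}; fallback: {fallback_error}"}
```

Returning a result instead of raising lets the caller record a `failed` file and keep generating the other documents.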

---

## 11. Document Generation Design

### 11.1 Concurrency in the generate_docs Node

Workflow nodes still run serially, but `generate_docs` generates the individual files concurrently:

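```python
from concurrent.futures import ThreadPoolExecutor

# max(1, ...) keeps the pool valid when specs is empty; max_workers must be positive.
with ThreadPoolExecutor(max_workers=max(1, min(7, len(specs)))) as executor:
    futures = [executor.submit(generate_one_document, spec, context) for spec in specs]
```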

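Concurrency notes:

| Note | Description |
| --- | --- |
| Each document uses its own template copy | Avoids concurrent writes to the same file |
| Shared fields are read-only | `merged_fields` and `product_list_rows` are not modified in worker threads |
| Database writes are centralized | Worker threads return `GeneratedFileResult`; the main thread writes artifacts and exports |
| Failure isolation | One file failing does not affect the others |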
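
One way to get the failure isolation described above is to catch exceptions per future in the main thread and convert them into failed results. The sketch below uses plain dicts in place of `GeneratedFileResult` to stay self-contained; `generate_all` and its `generate_one` parameter are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def generate_all(specs: list[str], generate_one: Callable[[str], dict]) -> list[dict]:
    """Run one generation task per spec; a failing task yields a failed result."""
    if not specs:
        return []
    results: list[dict] = []
    with ThreadPoolExecutor(max_workers=min(7, len(specs))) as executor:
        futures = [(spec, executor.submit(generate_one, spec)) for spec in specs]
        # Results are collected in submission order on the main thread,
        # which is also where database writes would happen.
        for spec, future in futures:
            try:
                results.append(future.result())
            except Exception as exc:
                results.append(
                    {"template_code": spec, "status": "failed", "error_message": str(exc)}
                )
    return results
```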

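### 11.2 The 7 Generation Strategies

| Template | Output rule |
| --- | --- |
| CH1.2 | Replace the product name; page numbers stay as in the sample |
| CH1.4 | Fill in product name, package specification, intended use, components, storage and validity, and method principle; missing items such as company or classification are written as `/` with yellow shading |
| CH1.5 | Rebuild the table with the sample's header; catalog number is `/` with yellow shading |
| CH1.9 | Prefer a native `.doc` write; on failure fall back to `.docx`; if the fallback also fails, no file is output |
| CH1.11.1 | Standard numbers found in the instruction manual are written directly; knowledge base candidates appear only as highlighted items to confirm and in traceability |
| CH1.11.5 | Keep the body text, replace the product name, company name `/` with yellow shading, date set to the current day |
| CH1.11.6 | Keep the body text, replace the product name, company name `/` with yellow shading, date set to the current day |

### 11.3 Missing Product Name

When neither the rules nor the LLM can extract the product name:

| Item | Handling |
| --- | --- |
| File content | Write `/` with yellow shading where the product name goes |
| Batch status | `partial_success` at best |
| zip | Still generated, containing the successful files |
| Summary | States clearly that the product name needs confirmation |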

---

## 12. Traceability and zip Design

### 12.1 Traceability Excel

Available for user download:

```text
exports/traceability.xlsx
```

The export record is created with:

```text
export_category = traceability
export_type = excel
```

Fields:

| Field | Description |
| --- | --- |
| target_file | Target file |
| target_field | Target field |
| final_value | Value written |
| extraction_source | rule, llm, missing, knowledge_candidate |
| evidence | Source excerpt |
| highlight_reason | missing, llm_only, conflict, rag_candidate |
| needs_review | Whether review is required |

### 12.2 Backend JSON

JSON artifacts are written only to `logs/` and are inspected from the backend when needed:

```text
logs/instruction_extract.json
logs/field_extract_result.json
logs/merged_fields.json
logs/traceability.json
logs/doc_adapter_result.json
```

These JSON artifacts are recorded in `RegulatoryInfoPackageArtifact` but are not offered as primary user downloads.

### 12.3 zip Packaging

zip file name:

```text
第1章 监管信息(预生成版).zip
```

Rules:

| Scenario | Included in zip |
| --- | --- |
| File status `success` | Yes |
| File status `fallback_success` | Yes |
| File status `failed` | No |
| File status `skipped` | No |

If the CH1.9 `.doc` falls back to `.docx` successfully, the zip contains:

```text
CH1.9 产品申报前沟通的说明.docx
```
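
The inclusion rules can be sketched with the standard `zipfile` module. Results are plain dicts with `status` and `path` keys here, standing in for `GeneratedFileResult`; `build_package_zip` is a hypothetical name, and storing entries flat under their file name is an assumption.

```python
import zipfile
from pathlib import Path

ZIP_ELIGIBLE_STATUSES = {"success", "fallback_success"}


def build_package_zip(zip_path: Path, results: list[dict]) -> list[str]:
    """Write eligible files into the zip and return the included file names."""
    included: list[str] = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for result in results:
            # failed and skipped files never enter the zip.
            if result["status"] not in ZIP_ELIGIBLE_STATUSES:
                continue
            path = Path(result["path"])
            # Assumption: entries are stored flat, under the file name only.
            archive.write(path, arcname=path.name)
            included.append(path.name)
    return included
```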

---

## 13. Workflow Design

### 13.1 Batch Creation

```python
def create_regulatory_info_package_batch(
    *,
    conversation: Conversation,
    user,
    trigger_message: Message | None = None,
    source_attachment: FileAttachment | None = None,
    source_summary_batch: FileSummaryBatch | None = None,
    source_summary_item_id: int | None = None,
) -> RegulatoryInfoPackageBatch: ...
```

After creation, initialize the nodes from `REGULATORY_INFO_PACKAGE_NODE_DEFINITIONS`.

### 13.2 Executor

```python
class RegulatoryInfoPackageWorkflowExecutor:
    def run(self) -> None: ...
    def _nodes(self): ...
    def _run_node(self, node: WorkflowNodeRun) -> None: ...
    def _execute_node(self, node: WorkflowNodeRun) -> None: ...
```

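Node execution:

| Node | Key action |
| --- | --- |
| prepare | Confirm the instruction manual, or go to waiting_user |
| template_copy | Copy the 7 templates |
| text_extract | Extract the instruction manual's sections and tables |
| field_extract | Rule-based + LLM extraction in parallel |
| field_merge | Merge fields and decide highlights |
| generate_docs | Generate the individual files in multiple threads |
| highlight_review_items | If the generation strategies already applied the highlights, this node only records the confirmation result |
| trace_export | Write the Excel file and the logs JSON |
| zip_export | Package the files that succeeded directly or through the fallback |
| notify | Write the workflow-specific notification and call the unified notification service |
| completed | Write the assistant summary |

### 13.3 Final Status

| Condition | Status |
| --- | --- |
| zip succeeded and all 7 files are success/fallback_success | success |
| zip succeeded but some files are failed/skipped | partial_success |
| zip failed but at least one individual file succeeded | partial_success |
| All files failed or a key input is missing | failed |
| Multiple instruction manual candidates awaiting confirmation | waiting_user |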
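
The status table can be read as a small decision function. The sketch below covers only the rows of this table; other downgrade conditions mentioned elsewhere in this design, such as a missing product name or a failed traceability export, are left out. The name `settle_batch_status` and its parameters are hypothetical.

```python
OK_STATUSES = {"success", "fallback_success"}


def settle_batch_status(
    file_statuses: list[str],
    zip_ok: bool,
    waiting_user: bool = False,
    input_ok: bool = True,
) -> str:
    """Derive the final batch status from per-file statuses and the zip result."""
    if waiting_user:
        return "waiting_user"
    ok_flags = [status in OK_STATUSES for status in file_statuses]
    # A missing key input or zero usable files means the batch failed.
    if not input_ok or not any(ok_flags):
        return "failed"
    if zip_ok and all(ok_flags):
        return "success"
    return "partial_success"
```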

---

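## 14. Routing and API Design

### 14.1 skill_router.py

Additions:

| Item | Content |
| --- | --- |
| ROUTE_ACTIONS | Add `regulatory_info_package` |
| SkillRoute property | `starts_regulatory_info_package` |
| Deterministic route | Return immediately when a trigger keyword matches |
| LLM prompt | Add `regulatory_info_package` to the action list |

### 14.2 services.py

`stream_message` gains a branch:

1. Call `select_instruction_input(conversation, content)`.
2. With multiple candidates, reply with a follow-up question and do not start the workflow.
3. With no candidate, reply asking the user to upload an instruction manual.
4. With exactly one candidate, create the batch and start the workflow.
5. Send `workflow_started` over SSE.

### 14.3 views.py

Endpoints: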

```text
GET  /api/review-agent/regulatory-info-package/health/
POST /api/review-agent/regulatory-info-package/start/
GET  /api/review-agent/regulatory-info-package/<batch_id>/status/
POST /api/review-agent/regulatory-info-package/<batch_id>/select-input/
```

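`status` returns:

| Field | Description |
| --- | --- |
| batch | Status, product name, and counts of missing, LLM-only, and conflict items |
| nodes | Node statuses |
| generated_files | Success, failure, or fallback status of the 7 files |
| exports | Downloads for the zip, individual files, and Excel |
| risk_notes | Risk notes |
| notifications | Notifications |

The zip does not need an `is_primary` field; the frontend and the summary put the zip first by following the returned order.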

---

## 15. Assistant Summary Design

Structure of the completion message:

```markdown
已生成第1章监管信息材料包。

批次号：RIP-...
产品名称：...
状态：success / partial_success

主下载：[第1章 监管信息(预生成版).zip](...)

| 文件 | 状态 | 下载/原因 |
| --- | --- | --- |
| CH1.2 监管信息目录.docx | 成功 | 下载 |
| CH1.9 产品申报前沟通的说明.docx | 兜底成功 | 下载 |
| CH1.11.1 符合标准的清单.docx | 失败 | 失败原因 |

待确认：缺失项 X 个，LLM复核项 Y 个，冲突项 Z 个。
```

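Requirements:

| Requirement | Description |
| --- | --- |
| zip first | The zip link must come before the list of individual files |
| Failures visible | Failed files show their status and reason, with no download link |
| Fallback notice | For `.doc -> .docx`, show “兜底成功” (fallback succeeded) |
| Pending confirmation summary | Show the counts of missing, llm_only, and conflict items |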

---

## 16. Frontend Design

### 16.1 Template

Add a tool chip to `templates/home.html`:

```html
<button
  class="tool-chip"
  type="button"
  data-prompt-template="根据说明书生成第1章监管信息"
>第1章监管信息</button>
```

Add to `summaryPanel`:

```html
data-regulatory-info-package-status-url-template="/api/review-agent/regulatory-info-package/__batch_id__/status/"
```

### 16.2 app.js

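Additions:

| Location | Handling |
| --- | --- |
| Workflow type check | Support `regulatory_info_package` |
| Status URL selection | Use `data-regulatory-info-package-status-url-template` |
| Terminal state check | success, partial_success, failed, waiting_user |
| Export display | Show exports in the returned order; the backend puts the zip first |

### 16.3 No Selection UI

With multiple instruction manual candidates, this release shows no dialog. The assistant asks the user in the conversation to confirm the file name.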

---

## 17. Export Download Permissions

Add to `file_summary.views._export_for_user`:

```python
if exported.workflow_type == "regulatory_info_package":
    allowed = RegulatoryInfoPackageBatch.objects.filter(
        pk=exported.workflow_batch_id,
        conversation__user=user,
        is_deleted=False,
    ).exists()
    return exported if allowed else None
```

The download content type logic gains checks for zip and the `.doc` suffix.

---

## 18. Notification Design

`notifier.py`：

```python
def notify_completion(batch: RegulatoryInfoPackageBatch, exports: list[ExportedSummaryFile]) -> RegulatoryInfoPackageNotificationRecord: ...
```

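Steps:

| Step | Description |
| --- | --- |
| Create the workflow-specific notification record | Write `RegulatoryInfoPackageNotificationRecord` |
| Call the unified notification service | `dispatch_workflow_notification(build_regulatory_info_package_context(batch))` |
| Catch exceptions | A notification failure is written to the record and to risk_notes; batch downloads are unaffected |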

---

## 19. Test Design

| Test file | Coverage |
| --- | --- |
| test_regulatory_info_package_models.py | The three tables, the zip export type, basic relations |
| test_regulatory_info_package_trigger.py | Fixed keywords and the LLM action |
| test_regulatory_info_package_input_select.py | Fuzzy file name matching, active attachments, follow-up question on multiple candidates |
| test_regulatory_info_package_template_config.py | YAML loading, missing templates, unique codes |
| test_regulatory_info_package_instruction_extract.py | Extraction of instruction manual sections and the component table |
| test_regulatory_info_package_field_extract.py | Rule-based extraction, three LLM attempts, degradation on failure |
| test_regulatory_info_package_field_merge.py | missing, llm_only, conflict |
| test_regulatory_info_package_docx_writer.py | Replacement, table filling, yellow shading, red text |
| test_regulatory_info_package_legacy_doc.py | Adapter detection, docx fallback, failed status |
| test_regulatory_info_package_package_generate.py | Results for the 7 files, failure isolation across threads |
| test_regulatory_info_package_traceability.py | Traceability Excel and logs JSON |
| test_regulatory_info_package_zip.py | zip contains only success/fallback_success files |
| test_regulatory_info_package_workflow.py | Node transitions, partial_success, waiting_user |
| test_regulatory_info_package_views.py | start/status/download permissions |
| test_regulatory_info_package_frontend.py | Chip, card, status URL |

---

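## 20. Exception Handling Matrix

| Exception | Batch status | Handling |
| --- | --- | --- |
| No instruction manual | waiting_user, or no batch is created | Prompt the user to upload an instruction manual |
| Multiple candidates that cannot be matched | waiting_user, or no batch is created | Ask the user to confirm the file name |
| Template missing | failed | List the missing templates |
| Rule-based extraction fails | partial_success / continue | Use the LLM results |
| LLM fails three times | continue | Use the rule-based results and write risk_notes |
| Product name missing | partial_success | Write `/` with yellow shading and continue to generate the zip |
| A single docx file fails to generate | partial_success | Excluded from the zip; the summary shows the failure |
| CH1.9 native doc fails but the docx fallback succeeds | success/partial_success | Status fallback_success; included in the zip |
| CH1.9 native doc and docx fallback both fail | partial_success | Excluded from the zip; the summary shows the failure |
| traceability.xlsx fails | partial_success | Does not block the zip |
| zip fails | partial_success | Individual file downloads are kept |
| Notification fails | Main status unaffected | Write the notification failure and risk_notes |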

---

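## 21. Design Conclusions

| No. | Conclusion |
| --- | --- |
| D1 | The detailed design document path is `docs/4.详细设计/5.第1章监管信息材料包生成.md` |
| D2 | Models stay in `review_agent/models.py`; the business module is `review_agent/regulatory_info_package/` |
| D3 | `.doc` handling follows option A+C: prefer native processing through Word COM, and also design an adapter layer with capability detection |
| D4 | When the native `.doc` path fails, a `.docx` fallback is allowed; the fallback file name is `CH1.9 产品申报前沟通的说明.docx` |
| D5 | The zip contains only files that succeeded directly or through the fallback; failed files are excluded |
| D6 | The LLM gets at most 3 attempts; after the final failure the workflow continues with the rule-based results |
| D7 | Missing and LLM-only values get yellow shading; conflicts get yellow shading with red text |
| D8 | The product list uses `ProductListRow`; the catalog number is always `/` with yellow shading |
| D9 | The standards list only reuses the existing knowledge base capability; no separate RAG flow is added |
| D10 | Minimal frontend integration; no instruction manual selection dialog |
| D11 | The traceability Excel is downloadable; JSON stays in the backend logs only |
| D12 | No field-level database table is added in this release |
| D13 | The workflow is serial; the document generation node may use multiple threads internally |
| D14 | This round produces the detailed design only; no code is written and no migrations are generated |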