chore(docs): clean up outdated task plan documents
@@ -1,85 +0,0 @@
# Enum Value Transport And SysEnum Sync Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Unify long-lived structured text fields to use enum values for transport, and rebuild `sys_enum` from enum definitions through a delete-then-insert synchronization flow.

**Architecture:** Add a shared enum definition contract in the backend, let existing enums implement it, and refactor enum initialization to rebuild each `catalog + type` group from code. Update the RAG parse request and frontend page to submit enum values instead of enum names.

**Tech Stack:** Java 21, Spring Boot, MyBatis-Plus, JUnit 5, Vue 3, TypeScript, Vitest

---

### Task 1: Lock Request Protocol With Failing Tests

**Files:**
- Modify: `src/test/java/com/bruce/rag/RagDocumentParseServiceImplTests.java`
- Modify: `frontend/src/pages/rag/__tests__/RagDocumentsPage.spec.ts`

- [ ] **Step 1: Write the failing test**
- [ ] **Step 2: Run targeted backend and frontend tests to verify they fail because the old string protocol is still in place**
- [ ] **Step 3: Update assertions to require integer enum values for `chunkStrategy`**
- [ ] **Step 4: Re-run targeted tests and keep them red until implementation exists**

### Task 2: Implement Backend Enum Definition Contract

**Files:**
- Create: `src/main/java/com/bruce/common/enums/PersistableSysEnumDefinition.java`
- Modify: `src/main/java/com/bruce/common/enums/EnableStatusEnum.java`
- Modify: `src/main/java/com/bruce/common/enums/CommonStatusEnum.java`
- Modify: `src/main/java/com/bruce/rag/enums/RagParseStatusEnum.java`
- Modify: `src/main/java/com/bruce/rag/enums/RagIndexStatusEnum.java`
- Modify: `src/main/java/com/bruce/rag/enums/RagChunkStrategyEnum.java`
- Modify: `src/test/java/com/bruce/common/enumconfig/EnumDefinitionTests.java`

- [ ] **Step 1: Write or extend failing tests for enum metadata access and stable value lookup**
- [ ] **Step 2: Run backend enum tests to verify failure**
- [ ] **Step 3: Implement the shared enum definition contract and make existing enums implement it**
- [ ] **Step 4: Add `fromValue(Integer)` support where needed and rerun tests to green**
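
As a rough illustration of Task 2, the contract and one implementing enum might look like the sketch below. The interface name comes from the file list above, and the accessor names follow the design document removed in the same commit. Everything else is an assumption for illustration: the stand-in enum `ChunkStrategySketch`, the English labels, the `sort` numbers, the empty remark, and the choice to return the constant name from `getStrvalue()`.

```java
/** Sketch of the shared contract; the project's real interface may differ. */
interface PersistableSysEnumDefinition {
    String getCatalog();
    String getType();
    String getName();
    Integer getValue();
    String getStrvalue();
    Integer getSort();
    String getRemark();
}

/** Hypothetical stand-in for RagChunkStrategyEnum; labels and sort order are illustrative. */
enum ChunkStrategySketch implements PersistableSysEnumDefinition {
    FIXED_LENGTH(1, "Fixed length", 1),
    DELIMITER(5, "Delimiter", 2);

    private final Integer value;
    private final String label;
    private final Integer sort;

    ChunkStrategySketch(Integer value, String label, Integer sort) {
        this.value = value;
        this.label = label;
        this.sort = sort;
    }

    // Every constant of one enum class belongs to the same catalog + type group.
    @Override public String getCatalog() { return "rag"; }
    @Override public String getType() { return "chunk_strategy"; }
    @Override public String getName() { return label; }
    @Override public Integer getValue() { return value; }
    // Assumption: the string value mirrors the constant name.
    @Override public String getStrvalue() { return name(); }
    @Override public Integer getSort() { return sort; }
    @Override public String getRemark() { return ""; }
}
```

Keeping `catalog` and `type` on the enum itself means the initialization code can derive the group key from any constant, without a separate registry of group names.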

### Task 3: Rebuild SysEnum Initialization Flow

**Files:**
- Modify: `src/test/java/com/bruce/common/enumconfig/SysEnumDataInitTests.java`
- Create or Modify: `src/test/java/com/bruce/common/enumconfig/...` supporting tests as needed
- Modify: `src/main/java/com/bruce/common/service/ISysEnumService.java`
- Modify: `src/main/java/com/bruce/common/service/impl/SysEnumServiceImpl.java`

- [ ] **Step 1: Write failing tests for duplicate `catalog + type`, duplicate value detection, and delete-then-insert rebuild behavior**
- [ ] **Step 2: Run targeted backend tests to verify failure**
- [ ] **Step 3: Add service support for removing and rebuilding a whole enum group**
- [ ] **Step 4: Refactor enum init test to register enum groups, validate uniqueness, delete old rows, and batch insert the new rows**
- [ ] **Step 5: Re-run targeted backend tests to green**
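
The uniqueness checks named in Step 1 could be sketched roughly as follows. This is a minimal, framework-free illustration: `EnumItem` and `EnumGroupValidator` are hypothetical names, and the real initialization test would presumably read these fields from the enum classes rather than from a record.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified view of one enum item; not the project's real type. */
record EnumItem(String catalog, String type, int value) {}

final class EnumGroupValidator {

    /** Each inner list is one enum group, i.e. the constants of one enum class. */
    static void validate(List<List<EnumItem>> groups) {
        Set<String> seenGroupKeys = new HashSet<>();
        for (List<EnumItem> group : groups) {
            if (group.isEmpty()) {
                continue;
            }
            String key = group.get(0).catalog() + "/" + group.get(0).type();
            // Two groups sharing a key would delete each other's rows during rebuild.
            if (!seenGroupKeys.add(key)) {
                throw new IllegalStateException("Duplicate catalog + type: " + key);
            }
            Set<Integer> seenValues = new HashSet<>();
            for (EnumItem item : group) {
                if (!seenValues.add(item.value())) {
                    throw new IllegalStateException("Duplicate value " + item.value() + " in " + key);
                }
            }
        }
    }
}
```

Running the validation before any delete keeps a bad enum definition from wiping a group that cannot be rebuilt.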

### Task 4: Switch RAG Parse Request To Integer Enum Values

**Files:**
- Modify: `src/main/java/com/bruce/rag/dto/request/RagDocumentParseRequest.java`
- Modify: `src/main/java/com/bruce/rag/parse/RagChunkCommand.java`
- Modify: `src/main/java/com/bruce/rag/service/impl/RagDocumentParseServiceImpl.java`
- Modify: `src/test/java/com/bruce/rag/RagDocumentParseServiceImplTests.java`

- [ ] **Step 1: Confirm failing backend parse tests expect integer values**
- [ ] **Step 2: Change DTOs and validation logic to use enum values**
- [ ] **Step 3: Re-run targeted backend parse tests to green**

### Task 5: Update Frontend To Use Enum Values

**Files:**
- Modify: `frontend/src/api/ragDocuments.ts`
- Modify: `frontend/src/pages/rag/RagDocumentsPage.vue`
- Modify: `frontend/src/pages/rag/__tests__/RagDocumentsPage.spec.ts`

- [ ] **Step 1: Confirm frontend parse request test is red for numeric enum values**
- [ ] **Step 2: Change API typing, page defaults, options, and comparisons to numeric enum values**
- [ ] **Step 3: Re-run targeted frontend tests to green**

### Task 6: Record Project Convention

**Files:**
- Modify: `AGENT.md`
- Modify: `docs/ARCHITECTURE.md`

- [ ] **Step 1: Add the long-term convention that stable structured text uses enum values for transport**
- [ ] **Step 2: Add the rule that enum changes must be synchronized into `sys_enum` through the initialization test**
- [ ] **Step 3: Do a final focused verification run**
@@ -1,463 +0,0 @@
# RAG Chunker Foundation Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Build the standalone chunking foundation for the RAG module with a factory, fixed-length and delimiter implementations, and focused unit tests.

**Architecture:** Add a dedicated chunking abstraction under `com.bruce.rag.parse` so parsing and chunk generation stay decoupled. The factory will resolve a `RagChunkStrategyEnum` to a `Chunker` implementation, and each implementation will convert a command object into in-memory `RagChunk` entities without touching persistence.

**Tech Stack:** Java 21, Spring Boot, MyBatis-Plus entities, JUnit 5, Mockito

---

### Task 1: Define Chunking Contracts

**Files:**
- Create: `src/main/java/com/bruce/rag/parse/RagChunkCommand.java`
- Create: `src/main/java/com/bruce/rag/parse/Chunker.java`
- Create: `src/main/java/com/bruce/rag/parse/ChunkerFactory.java`
- Test: `src/test/java/com/bruce/rag/parse/ChunkerFactoryTests.java`

- [ ] **Step 1: Write the failing test**

```java
package com.bruce.rag.parse;

import com.bruce.rag.entity.RagDocument;
import com.bruce.rag.entity.RagChunk;
import com.bruce.rag.enums.RagChunkStrategyEnum;
import org.junit.jupiter.api.Test;

import java.util.List;

import static org.junit.jupiter.api.Assertions.assertSame;
import static org.junit.jupiter.api.Assertions.assertThrows;

class ChunkerFactoryTests {

    @Test
    void resolveShouldReturnMatchingChunker() {
        Chunker supported = new StubChunker(RagChunkStrategyEnum.FIXED_LENGTH);
        Chunker unsupported = new StubChunker(RagChunkStrategyEnum.DELIMITER);
        ChunkerFactory factory = new ChunkerFactory(List.of(supported, unsupported));

        Chunker resolved = factory.resolve(RagChunkStrategyEnum.FIXED_LENGTH);

        assertSame(supported, resolved);
    }

    @Test
    void resolveShouldRejectUnsupportedStrategy() {
        ChunkerFactory factory = new ChunkerFactory(List.of(new StubChunker(RagChunkStrategyEnum.FIXED_LENGTH)));

        assertThrows(IllegalArgumentException.class, () -> factory.resolve(RagChunkStrategyEnum.SEMANTIC));
    }

    private static class StubChunker implements Chunker {

        private final RagChunkStrategyEnum strategy;

        private StubChunker(RagChunkStrategyEnum strategy) {
            this.strategy = strategy;
        }

        @Override
        public boolean supports(RagChunkStrategyEnum strategy) {
            return this.strategy == strategy;
        }

        @Override
        public List<RagChunk> chunk(RagChunkCommand command) {
            return List.of();
        }
    }
}
```

- [ ] **Step 2: Run test to verify it fails**

Run: `mvn -Dtest=ChunkerFactoryTests test`
Expected: FAIL with compilation errors because `Chunker`, `ChunkerFactory`, and `RagChunkCommand` do not exist yet

- [ ] **Step 3: Write minimal implementation**

```java
package com.bruce.rag.parse;

import com.bruce.common.document.parse.DocumentParseResult;
import com.bruce.rag.entity.RagDocument;
import lombok.Data;

@Data
public class RagChunkCommand {

    private RagDocument document;

    private DocumentParseResult parseResult;

    private String chunkStrategy;

    private Integer chunkSize;

    private Integer chunkOverlap;

    private String delimiter;
}
```

```java
package com.bruce.rag.parse;

import com.bruce.rag.entity.RagChunk;
import com.bruce.rag.enums.RagChunkStrategyEnum;

import java.util.List;

public interface Chunker {

    boolean supports(RagChunkStrategyEnum strategy);

    List<RagChunk> chunk(RagChunkCommand command);
}
```

```java
package com.bruce.rag.parse;

import com.bruce.rag.enums.RagChunkStrategyEnum;
import org.springframework.stereotype.Component;

import java.util.List;

@Component
public class ChunkerFactory {

    // Spring injects every Chunker bean; the first one that supports the strategy wins.
    private final List<Chunker> chunkers;

    public ChunkerFactory(List<Chunker> chunkers) {
        this.chunkers = chunkers;
    }

    public Chunker resolve(RagChunkStrategyEnum strategy) {
        return chunkers.stream()
                .filter(chunker -> chunker.supports(strategy))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("Unsupported chunk strategy: " + strategy));
    }
}
```

- [ ] **Step 4: Run test to verify it passes**

Run: `mvn -Dtest=ChunkerFactoryTests test`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add src/main/java/com/bruce/rag/parse/RagChunkCommand.java src/main/java/com/bruce/rag/parse/Chunker.java src/main/java/com/bruce/rag/parse/ChunkerFactory.java src/test/java/com/bruce/rag/parse/ChunkerFactoryTests.java
git commit -m "feat: add rag chunker contracts"
```

### Task 2: Add Fixed-Length Chunker

**Files:**
- Create: `src/main/java/com/bruce/rag/parse/impl/FixedLengthChunker.java`
- Test: `src/test/java/com/bruce/rag/parse/FixedLengthChunkerTests.java`

- [ ] **Step 1: Write the failing test**

```java
package com.bruce.rag.parse;

import com.bruce.common.document.parse.DocumentParseResult;
import com.bruce.rag.entity.RagChunk;
import com.bruce.rag.entity.RagDocument;
import com.bruce.rag.parse.impl.FixedLengthChunker;
import org.junit.jupiter.api.Test;

import java.util.List;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

class FixedLengthChunkerTests {

    @Test
    void chunkShouldSplitTextByChunkSizeAndOverlap() {
        FixedLengthChunker chunker = new FixedLengthChunker();

        RagChunkCommand command = new RagChunkCommand();
        command.setDocument(buildDocument());
        command.setParseResult(buildParseResult("abcdefghij"));
        command.setChunkStrategy("FIXED_LENGTH");
        command.setChunkSize(4);
        command.setChunkOverlap(1);

        List<RagChunk> chunks = chunker.chunk(command);

        assertEquals(3, chunks.size());
        assertEquals("abcd", chunks.get(0).getChunkContent());
        assertEquals("defg", chunks.get(1).getChunkContent());
        assertEquals("ghij", chunks.get(2).getChunkContent());
        assertEquals(0, chunks.get(0).getChunkIndex());
        assertEquals(1, chunks.get(1).getChunkIndex());
        assertEquals(2, chunks.get(2).getChunkIndex());
        assertEquals(99L, chunks.get(0).getDocumentId());
        assertEquals(88L, chunks.get(0).getStoreId());
        assertTrue(Boolean.TRUE.equals(chunks.get(0).getEnabled()));
    }

    @Test
    void chunkShouldReturnEmptyListForBlankText() {
        FixedLengthChunker chunker = new FixedLengthChunker();

        RagChunkCommand command = new RagChunkCommand();
        command.setDocument(buildDocument());
        command.setParseResult(buildParseResult(" "));
        command.setChunkStrategy("FIXED_LENGTH");
        command.setChunkSize(4);
        command.setChunkOverlap(1);

        assertTrue(chunker.chunk(command).isEmpty());
    }

    private static RagDocument buildDocument() {
        RagDocument document = new RagDocument();
        document.setId(99L);
        document.setStoreId(88L);
        return document;
    }

    private static DocumentParseResult buildParseResult(String text) {
        DocumentParseResult result = new DocumentParseResult();
        result.setText(text);
        result.setTextLength(text.length());
        return result;
    }
}
```

- [ ] **Step 2: Run test to verify it fails**

Run: `mvn -Dtest=FixedLengthChunkerTests test`
Expected: FAIL with compilation errors because `FixedLengthChunker` does not exist yet

- [ ] **Step 3: Write minimal implementation**

```java
package com.bruce.rag.parse.impl;

import com.bruce.common.document.parse.DocumentParseResult;
import com.bruce.rag.entity.RagChunk;
import com.bruce.rag.entity.RagDocument;
import com.bruce.rag.enums.RagChunkStrategyEnum;
import com.bruce.rag.parse.Chunker;
import com.bruce.rag.parse.RagChunkCommand;
import org.springframework.stereotype.Component;
import org.springframework.util.StringUtils;

import java.util.ArrayList;
import java.util.List;

@Component
public class FixedLengthChunker implements Chunker {

    @Override
    public boolean supports(RagChunkStrategyEnum strategy) {
        return RagChunkStrategyEnum.FIXED_LENGTH == strategy;
    }

    @Override
    public List<RagChunk> chunk(RagChunkCommand command) {
        DocumentParseResult parseResult = command.getParseResult();
        String text = parseResult == null ? null : parseResult.getText();
        if (!StringUtils.hasText(text)) {
            return List.of();
        }

        // A missing chunk size means one chunk for the whole text.
        int chunkSize = command.getChunkSize() == null ? text.length() : command.getChunkSize();
        if (chunkSize <= 0) {
            // Zero would emit empty chunks; a negative size would break substring().
            throw new IllegalArgumentException("chunkSize must be positive: " + chunkSize);
        }
        // A negative overlap is treated as no overlap so windows never skip text.
        int overlap = command.getChunkOverlap() == null ? 0 : Math.max(0, command.getChunkOverlap());
        // Keep the step at least 1 so the loop advances even if overlap >= chunkSize.
        int step = Math.max(1, chunkSize - overlap);
        List<RagChunk> chunks = new ArrayList<>();
        for (int start = 0, index = 0; start < text.length(); start += step, index++) {
            int end = Math.min(text.length(), start + chunkSize);
            chunks.add(buildChunk(command.getDocument(), index, text.substring(start, end)));
            // Stop once a window reaches the end, so no trailing overlap-only chunk is emitted.
            if (end >= text.length()) {
                break;
            }
        }
        return chunks;
    }

    private RagChunk buildChunk(RagDocument document, int index, String content) {
        RagChunk chunk = new RagChunk();
        chunk.setStoreId(document.getStoreId());
        chunk.setDocumentId(document.getId());
        chunk.setChunkIndex(index);
        chunk.setChunkContent(content);
        chunk.setEnabled(Boolean.TRUE);
        return chunk;
    }
}
```

- [ ] **Step 4: Run test to verify it passes**

Run: `mvn -Dtest=FixedLengthChunkerTests test`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add src/main/java/com/bruce/rag/parse/impl/FixedLengthChunker.java src/test/java/com/bruce/rag/parse/FixedLengthChunkerTests.java
git commit -m "feat: add fixed length rag chunker"
```
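
The three chunks expected by the Task 2 test follow from the step size `chunkSize - chunkOverlap`. The sketch below reproduces only that window arithmetic on plain strings; `FixedWindowSketch` and `windows` are hypothetical names used for illustration and are not part of the plan.

```java
import java.util.ArrayList;
import java.util.List;

final class FixedWindowSketch {

    /** Returns overlapping windows of at most chunkSize characters. */
    static List<String> windows(String text, int chunkSize, int overlap) {
        // Each window starts (chunkSize - overlap) characters after the previous one.
        int step = Math.max(1, chunkSize - overlap);
        List<String> result = new ArrayList<>();
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(text.length(), start + chunkSize);
            result.add(text.substring(start, end));
            if (end >= text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return result;
    }
}
```

For `"abcdefghij"` with a chunk size of 4 and an overlap of 1, the step is 3, so windows start at offsets 0, 3, and 6, and neighbouring chunks share exactly one character.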

### Task 3: Add Delimiter Chunker

**Files:**
- Create: `src/main/java/com/bruce/rag/parse/impl/DelimiterChunker.java`
- Test: `src/test/java/com/bruce/rag/parse/DelimiterChunkerTests.java`

- [ ] **Step 1: Write the failing test**

```java
package com.bruce.rag.parse;

import com.bruce.common.document.parse.DocumentParseResult;
import com.bruce.rag.entity.RagChunk;
import com.bruce.rag.entity.RagDocument;
import com.bruce.rag.parse.impl.DelimiterChunker;
import org.junit.jupiter.api.Test;

import java.util.List;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

class DelimiterChunkerTests {

    @Test
    void chunkShouldSplitByDelimiterAndIgnoreBlankSegments() {
        DelimiterChunker chunker = new DelimiterChunker();

        RagChunkCommand command = new RagChunkCommand();
        command.setDocument(buildDocument());
        command.setParseResult(buildParseResult("第一段。第二段。。第三段"));
        command.setChunkStrategy("DELIMITER");
        command.setDelimiter("。");

        List<RagChunk> chunks = chunker.chunk(command);

        assertEquals(3, chunks.size());
        assertEquals("第一段", chunks.get(0).getChunkContent());
        assertEquals("第二段", chunks.get(1).getChunkContent());
        assertEquals("第三段", chunks.get(2).getChunkContent());
        assertEquals(0, chunks.get(0).getChunkIndex());
        assertEquals(1, chunks.get(1).getChunkIndex());
        assertEquals(2, chunks.get(2).getChunkIndex());
    }

    @Test
    void chunkShouldReturnEmptyListForBlankText() {
        DelimiterChunker chunker = new DelimiterChunker();

        RagChunkCommand command = new RagChunkCommand();
        command.setDocument(buildDocument());
        command.setParseResult(buildParseResult(" "));
        command.setChunkStrategy("DELIMITER");
        command.setDelimiter("。");

        assertTrue(chunker.chunk(command).isEmpty());
    }

    private static RagDocument buildDocument() {
        RagDocument document = new RagDocument();
        document.setId(66L);
        document.setStoreId(55L);
        return document;
    }

    private static DocumentParseResult buildParseResult(String text) {
        DocumentParseResult result = new DocumentParseResult();
        result.setText(text);
        result.setTextLength(text.length());
        return result;
    }
}
```

- [ ] **Step 2: Run test to verify it fails**

Run: `mvn -Dtest=DelimiterChunkerTests test`
Expected: FAIL with compilation errors because `DelimiterChunker` does not exist yet

- [ ] **Step 3: Write minimal implementation**

```java
package com.bruce.rag.parse.impl;

import com.bruce.common.document.parse.DocumentParseResult;
import com.bruce.rag.entity.RagChunk;
import com.bruce.rag.entity.RagDocument;
import com.bruce.rag.enums.RagChunkStrategyEnum;
import com.bruce.rag.parse.Chunker;
import com.bruce.rag.parse.RagChunkCommand;
import org.springframework.stereotype.Component;
import org.springframework.util.StringUtils;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

@Component
public class DelimiterChunker implements Chunker {

    @Override
    public boolean supports(RagChunkStrategyEnum strategy) {
        return RagChunkStrategyEnum.DELIMITER == strategy;
    }

    @Override
    public List<RagChunk> chunk(RagChunkCommand command) {
        DocumentParseResult parseResult = command.getParseResult();
        String text = parseResult == null ? null : parseResult.getText();
        // hasLength (not hasText) so whitespace delimiters such as "\n" are still accepted.
        if (!StringUtils.hasText(text) || !StringUtils.hasLength(command.getDelimiter())) {
            return List.of();
        }

        // Quote the delimiter so regex metacharacters like "." or "|" are matched literally.
        String[] parts = text.split(Pattern.quote(command.getDelimiter()));
        List<RagChunk> chunks = new ArrayList<>();
        for (String part : parts) {
            // Consecutive delimiters produce blank segments; skip them.
            if (!StringUtils.hasText(part)) {
                continue;
            }
            // chunks.size() is the next index, so indexes stay contiguous after skips.
            chunks.add(buildChunk(command.getDocument(), chunks.size(), part.trim()));
        }
        return chunks;
    }

    private RagChunk buildChunk(RagDocument document, int index, String content) {
        RagChunk chunk = new RagChunk();
        chunk.setStoreId(document.getStoreId());
        chunk.setDocumentId(document.getId());
        chunk.setChunkIndex(index);
        chunk.setChunkContent(content);
        chunk.setEnabled(Boolean.TRUE);
        return chunk;
    }
}
```

- [ ] **Step 4: Run test to verify it passes**

Run: `mvn -Dtest=DelimiterChunkerTests test`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add src/main/java/com/bruce/rag/parse/impl/DelimiterChunker.java src/test/java/com/bruce/rag/parse/DelimiterChunkerTests.java
git commit -m "feat: add delimiter rag chunker"
```
@@ -1,125 +0,0 @@
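# Enum Value Transport and SysEnum Synchronization Design

## 1. Background

The current system has two problems:

1. For long-lived, fixed structured fields, the frontend and backend still exchange string names; for example, `chunkStrategy` is sent as `"FIXED_LENGTH"`.
2. `sys_enum` initialization relies on row-by-row `saveOrUpdate(...)` calls in a test class. Adding or changing an enum requires manual updates in several places, and stale rows under the same `catalog/type` are never cleaned up.

This leads to a redundant protocol, inconsistent constraints between frontend and backend, and database enum configuration that can drift from the code definitions.

## 2. Goals

This change needs to achieve the following:

- Long-lived, fixed structured text fields are transported as enum values, never as name strings.
- Backend Java enums become the single source of truth for structured enums.
- `sys_enum` initialization works per `catalog + type` group: delete first, then rebuild the whole group.
- The frontend keeps displaying Chinese labels, but requests carry only enum values.
- After adding or changing an enum, updating the enum class and running the unified test is enough to synchronize the database.

## 3. Scope

The following enums are brought under the unified convention:

- `EnableStatusEnum`
- `CommonStatusEnum`
- `RagParseStatusEnum`
- `RagIndexStatusEnum`
- `RagChunkStrategyEnum`

This change also switches `RagDocumentParseRequest.chunkStrategy` from a string protocol to a numeric protocol.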

## 4. Design

### 4.1 Backend Enum Contract

Add a unified enum definition interface that describes enum items which can be synchronized into `sys_enum`. The interface provides:

- `getCatalog()`
- `getType()`
- `getName()`
- `getValue()`
- `getStrvalue()`
- `getSort()`
- `getRemark()`

The five existing enum classes listed above all implement this interface, so the code itself carries everything needed for persistence.

### 4.2 Enum Group Uniqueness

Each enum group is uniquely identified by `catalog + type`, for example:

- `common / enable_status`
- `common / common_status`
- `rag / parse_status`
- `rag / index_status`
- `rag / chunk_strategy`

`catalog + type` must not repeat across enum groups; otherwise the delete-then-insert rebuild cannot be performed safely.
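
### 4.3 SysEnum Initialization Mechanism

Rewrite how `SysEnumDataInitTests` performs initialization:

1. Collect all enum groups that need to be synchronized.
2. Validate that no group contains a duplicate `value` or a duplicate `sort`, and that no two groups share the same `catalog + type`.
3. For each enum group, delete the existing rows in the database by `catalog + type`.
4. Write the whole group, as currently defined in code, into `sys_enum`.

This keeps the database state identical to the current code instead of accumulating incremental changes.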
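
Steps 3 and 4 of the flow above can be sketched with an in-memory list standing in for the `sys_enum` table. `SysEnumRow`, `InMemorySysEnumTable`, and `rebuildGroup` are hypothetical names for illustration; the real service goes through MyBatis-Plus, and its actual method names are not shown in this document.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical row shape; the real sys_enum entity has more columns. */
record SysEnumRow(String catalog, String type, String name, int value) {}

/** In-memory stand-in for the sys_enum table, used only to illustrate the flow. */
final class InMemorySysEnumTable {

    private final List<SysEnumRow> rows = new ArrayList<>();

    void insertAll(List<SysEnumRow> newRows) {
        rows.addAll(newRows);
    }

    void deleteGroup(String catalog, String type) {
        rows.removeIf(row -> row.catalog().equals(catalog) && row.type().equals(type));
    }

    /** Delete-then-insert: stale rows in the group vanish, other groups are untouched. */
    void rebuildGroup(String catalog, String type, List<SysEnumRow> definitions) {
        deleteGroup(catalog, type);
        insertAll(definitions);
    }

    List<SysEnumRow> rows() {
        return List.copyOf(rows);
    }
}
```

Against a real database, the delete and the insert would presumably run in one transaction, so that a failed insert does not leave a group empty.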

### 4.4 Backend Request Protocol

`RagDocumentParseRequest.chunkStrategy` becomes `Integer` and accepts only enum values.

`RagChunkStrategyEnum` also gains a lookup-by-value method, for example `fromValue(Integer value)`, which the service layer uses for validation and conversion.

`RagChunkCommand.chunkStrategy` likewise becomes `Integer`, so the whole call chain stays consistent.
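
A lookup-by-value method of the kind described here might look like the following minimal sketch. `ChunkStrategyValueSketch` is a hypothetical stand-in for `RagChunkStrategyEnum`; the values 1 and 5 are the examples given in section 4.5, and the precomputed map is one possible design choice, not necessarily the project's.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for RagChunkStrategyEnum. */
enum ChunkStrategyValueSketch {
    FIXED_LENGTH(1),
    DELIMITER(5);

    // Built once after the constants exist, so lookups do not rescan values().
    private static final Map<Integer, ChunkStrategyValueSketch> BY_VALUE =
            Stream.of(values())
                    .collect(Collectors.toMap(ChunkStrategyValueSketch::getValue, Function.identity()));

    private final int value;

    ChunkStrategyValueSketch(int value) {
        this.value = value;
    }

    int getValue() {
        return value;
    }

    /** Resolves a transported value; null or unknown values are rejected. */
    static ChunkStrategyValueSketch fromValue(Integer value) {
        ChunkStrategyValueSketch match = value == null ? null : BY_VALUE.get(value);
        if (match == null) {
            throw new IllegalArgumentException("Unknown chunk strategy value: " + value);
        }
        return match;
    }
}
```

Throwing on unknown input, rather than returning `null`, matches the stated rule that a nonexistent enum value is treated as an illegal argument.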
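
### 4.5 Frontend Protocol and Display

The frontend no longer sends a string union type; it uses numeric enum constants instead, for example:

- `FIXED_LENGTH = 1`
- `DELIMITER = 5`

Radio options on the page use the numeric value as `value` and keep the Chinese text as `label`. Only the enum value is submitted in the request.

### 4.6 Agent Collaboration Conventions

Add the following long-term rules to `AGENT.md`:

- Long-lived, fixed structured text fields are always transported as enum values.
- Enum definitions must live in Java enum classes.
- Enum changes must also be included in the `sys_enum` initialization test.
- After every enum addition or change, run the corresponding test to synchronize the database.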
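
## 5. Error Handling and Edge Cases

- If a request carries an enum value that does not exist, the backend throws an illegal-argument exception immediately.
- If an enum group defines a duplicate `value` or a duplicate `sort`, the initialization test fails immediately.
- If two enum groups use the same `catalog + type`, the initialization test fails immediately.
- If the frontend sends the old string protocol, the backend provides no backward compatibility and handles the request under the new protocol only.

## 6. Testing Strategy

Backend:

- Extend `EnumDefinitionTests` to verify that key enum values stay stable.
- Add numeric-protocol assertions and invalid-value checks to `RagDocumentParseServiceImplTests`.
- Add unit tests for the new full `sys_enum` initialization logic, covering the uniqueness validation and the rebuild behavior.

Frontend:

- Update `RagDocumentsPage.spec.ts` to assert that the parse request submits numeric enum values.
- Verify that the page still displays the Chinese chunk strategy names.

## 7. Expected Outcome

After the change:

- The structured-field protocol between frontend and backend is more compact and more stable.
- Enum definitions, the values sent by the frontend, and the database configuration all agree.
- Adding an enum follows a fixed procedure and no longer depends on manually patching data incrementally.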