feat: metagpt-Python-PDF-Feedback-UI-20260326
resources/system_design/20260326175651.md
@@ -0,0 +1,63 @@
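## Implementation approach

Use the pypdf library to read PDF files and extract their text. The library is easy to install and has an intuitive API, so it fully satisfies the "as simple as possible" requirement. The program accepts command-line arguments through argparse from the Python standard library, supporting a PDF file path and an optional page range. Error handling relies on Python's built-in exception mechanism and covers common cases such as a missing file, an invalid format, and insufficient permissions. The whole program needs only one main file, which keeps dependencies to a minimum.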
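
As a minimal sketch of the command-line and error-handling parts of this approach, the snippet below parses a file path plus an optional page range and maps the three error cases named above to short messages. The flag names `--start-page` and `--end-page`, the helper `describe_error`, and the message wording are assumptions made for illustration; the design itself does not fix them.

```python
import argparse

try:
    # pypdf raises PdfReadError for malformed or non-PDF input.
    from pypdf.errors import PdfReadError
except ImportError:
    # Stand-in so this sketch still loads when pypdf is not installed.
    class PdfReadError(Exception):
        pass


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the PDF path and an optional page range (flag names are assumed)."""
    parser = argparse.ArgumentParser(
        description="Print basic information and text from a PDF file.")
    parser.add_argument("file_path", help="path to the PDF file")
    parser.add_argument("--start-page", type=int, default=None,
                        help="first page to extract, 1-based (default: first page)")
    parser.add_argument("--end-page", type=int, default=None,
                        help="last page to extract, 1-based inclusive (default: last page)")
    return parser.parse_args(argv)


def describe_error(exc: Exception) -> str:
    """Map an exception to a short user-facing message (hypothetical helper)."""
    if isinstance(exc, FileNotFoundError):
        return f"File not found: {exc.filename}"
    if isinstance(exc, PermissionError):
        return f"Permission denied: {exc.filename}"
    if isinstance(exc, PdfReadError):
        return f"Not a valid PDF: {exc}"
    return f"Unexpected error: {exc}"
```

A caller could wrap the PDF handling in a single `try`/`except Exception` block, print `describe_error(exc)` to standard error, and exit with a non-zero status.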

## File list

- main.py

## Data structures and interfaces

classDiagram
    class PDFOpener {
        -str file_path
        -int start_page
        -int end_page
        +__init__(file_path: str, start_page: int, end_page: int)
        +open() PdfReader
        +get_info(reader: PdfReader) dict
        +extract_text(reader: PdfReader) list[str]
        +display_info(info: dict) None
        +display_text(texts: list[str]) None
        +run() None
    }
    class Main {
        +parse_args() argparse.Namespace
        +main() None
    }
    Main --> PDFOpener : creates and uses
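
The class diagram can be read as the following Python skeleton. This is only a sketch under a few assumptions that the diagram does not state: page numbers are treated as 1-based and inclusive, `None` means "no limit", the info dictionary holds `num_pages`, `title`, and `author`, and no bounds checking is done. The pypdf import sits inside `open()` so that the remaining methods can be exercised with any object exposing `pages` and `metadata`.

```python
from typing import Optional


class PDFOpener:
    """Sketch of the PDFOpener class from the diagram."""

    def __init__(self, file_path: str, start_page: Optional[int] = None,
                 end_page: Optional[int] = None) -> None:
        self.file_path = file_path
        self.start_page = start_page  # assumed 1-based; None means first page
        self.end_page = end_page      # assumed 1-based inclusive; None means last page

    def open(self):
        # Imported here so the rest of the sketch works without pypdf installed.
        from pypdf import PdfReader
        return PdfReader(self.file_path)

    def get_info(self, reader) -> dict:
        meta = reader.metadata  # may be None when the PDF carries no metadata
        return {
            "num_pages": len(reader.pages),
            "title": meta.title if meta else None,
            "author": meta.author if meta else None,
        }

    def extract_text(self, reader) -> list[str]:
        first = (self.start_page or 1) - 1         # convert to a 0-based index
        last = self.end_page or len(reader.pages)  # inclusive 1-based end
        # Fall back to "" so pages without extractable text still yield a string.
        return [reader.pages[i].extract_text() or "" for i in range(first, last)]

    def display_info(self, info: dict) -> None:
        for key, value in info.items():
            print(f"{key}: {value}")

    def display_text(self, texts: list[str]) -> None:
        first = self.start_page or 1
        for offset, text in enumerate(texts):
            print(f"--- page {first + offset} ---")
            print(text)

    def run(self) -> None:
        reader = self.open()
        self.display_info(self.get_info(reader))
        self.display_text(self.extract_text(reader))
```

With this shape, `main()` would only need to build `PDFOpener(args.file_path, args.start_page, args.end_page)` and call `run()`.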

## Program call flow

sequenceDiagram
    participant M as Main
    participant AP as argparse
    participant PO as PDFOpener
    participant PR as PdfReader(pypdf)
    M->>AP: parse_args()
    AP-->>M: args(file_path, start_page, end_page)
    M->>PO: __init__(file_path, start_page, end_page)
    M->>PO: run()
    PO->>PO: open()
    PO->>PR: PdfReader(file_path)
    PR-->>PO: reader
    PO->>PO: get_info(reader)
    PO->>PR: reader.pages, reader.metadata
    PR-->>PO: info(num_pages, metadata)
    PO->>PO: display_info(info)
    PO-->>M: print info to console
    PO->>PO: extract_text(reader)
    loop for each page in range
        PO->>PR: page.extract_text()
        PR-->>PO: page text
    end
    PO->>PO: display_text(texts)
    PO-->>M: print text to console
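
The loop over "each page in range" depends on how `start_page` and `end_page` are turned into page indices, which the diagram leaves open. One possible reading is sketched below; the helper name `resolve_page_range`, the 1-based inclusive convention, and the choice to clamp an oversized end page rather than reject it are all assumptions.

```python
from typing import Optional


def resolve_page_range(num_pages: int, start_page: Optional[int] = None,
                       end_page: Optional[int] = None) -> range:
    """Turn 1-based inclusive page numbers into a range of 0-based indices."""
    start = 1 if start_page is None else start_page
    # Clamp the end so a request past the last page stops at the last page.
    end = num_pages if end_page is None else min(end_page, num_pages)
    if start < 1:
        raise ValueError(f"start page must be >= 1, got {start}")
    if start > end:
        raise ValueError(
            f"empty page range: {start}-{end} (document has {num_pages} pages)")
    return range(start - 1, end)
```

For a 10-page document, `resolve_page_range(10, 2, 4)` covers indices 1 through 3, that is, pages 2 to 4. Raising `ValueError` for an empty range lets the caller report the problem before any text extraction starts.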
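
## Anything UNCLEAR

The current requirements are fairly clear, and the implementation targets the simplest form of text extraction. The program only extracts text content and displays basic information; it does not support advanced operations such as copying, searching, or saving. If decryption of encrypted PDFs or table extraction is needed later, upgrading to the pdfplumber or pymupdf library can be considered.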