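Part 1: Runtime Environment
1.1 The Python Interpreter and Version Management
How the CPython interpreter works. CPython, the reference implementation, compiles .py source to bytecode (cached as .pyc files under __pycache__), which a stack-based bytecode interpreter, the PVM (Python Virtual Machine), then executes instruction by instruction. CPython 3.11+ added a specializing adaptive interpreter (PEP 659) that automatically specializes instructions on hot paths; 3.13 added an experimental JIT (PEP 744).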
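| Dimension | CPython | JVM (Java) | V8 (Node.js) |
|---|---|---|---|
| Intermediate representation | Bytecode (.pyc) | Bytecode (.class) | Bytecode → JIT machine code |
| Execution | Mostly interpreted; experimental JIT in 3.13+ | Heavyweight JIT (HotSpot/C2) | Heavyweight JIT (TurboFan) |
| Multithreaded parallelism | Limited by the GIL (can be disabled in 3.13+ free-threaded builds) | True parallelism | Single-threaded + Workers |
| Startup speed | Very fast | Slow (JVM warm-up) | Fast |
| Memory management | Reference counting + GC | Generational GC | Generational GC |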
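Python versions (as of 2026-04). The latest stable release is Python 3.14.4 (3.14.0 shipped in 2025-10); 3.13 or 3.12 is recommended for production. The PEP 703 free-threaded build (no GIL) became available as an optional build in 3.13; in 3.14, PEP 779 moved it to Phase II, "supported but optional", and the binary is named python3.14t. 3.9 reached EOL in 2025-10; 3.10 reaches EOL in 2026-10. Starting with 3.13, the full bugfix period was extended from 18 to 24 months, while total support remains 5 years.
Version management. The recommended practice in 2026 is to manage Python versions with uv directly: uv python install 3.13 downloads a prebuilt interpreter in one command, more than 10× faster than pyenv, and behaves the same on all three major platforms. pyenv is still a fine choice: brew install pyenv on macOS, curl https://pyenv.run | bash on Linux, and pyenv-win (or simply uv) on Windows. Command mapping: pyenv install 3.13 ≈ nvm install 20, pyenv global ≈ nvm alias default, pyenv local (writes .python-version) ≈ .nvmrc.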
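1.2 Package Management and Virtual Environments
The virtual environment concept. Python has no in-place isolation mechanism like node_modules; by default, dependencies are installed into the global site-packages, which causes conflicts between projects. The solution is a virtual environment (venv / virtualenv): a .venv/ folder in the project directory that holds its own Python interpreter (a symlink or a copy) and its own site-packages. Once the environment is activated, every pip install is confined to that directory.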
| Concept | Python | Java | TypeScript |
|---|---|---|---|
| Isolation unit | .venv/ (per project) | Maven/Gradle local repository + classpath isolation | node_modules/ |
| Manifest | pyproject.toml | pom.xml / build.gradle | package.json |
| Lock file | uv.lock / requirements.txt | None built in (Gradle has a lockfile) | package-lock.json / pnpm-lock.yaml |
| Global tools | uv tool install / pipx | ~/.m2/ | npm i -g |
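The isolation described above can be observed from inside the interpreter. The following is a minimal sketch, not part of the original text: the helper names in_virtualenv and describe_environment are made up for illustration, and it relies only on the standard sys module. Inside a virtual environment sys.prefix points at the environment directory while sys.base_prefix still points at the base installation.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # In a venv, sys.prefix is the environment directory and
    # sys.base_prefix is the installation the venv was created from.
    return sys.prefix != sys.base_prefix


def describe_environment() -> dict[str, str]:
    """Collect the paths that show where packages will be installed."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "in_venv": str(in_virtualenv()),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running the script once with the system interpreter and once through uv run should show different prefix values, which is the whole point of the .venv/ directory.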
uv is the standard answer in 2026. Built by Astral (written in Rust, the same company as Ruff) and currently at 0.11.x, it is 10-100× faster than pip and roughly 10× faster than Poetry. It is a one-stop replacement for pyenv + venv + pip + pip-tools + pipx + twine and most Poetry use cases.
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh                # macOS/Linux
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"     # Windows

# Full workflow
uv init my-app                        # create a project
cd my-app
uv python pin 3.13                    # pin the Python version
uv add fastapi 'uvicorn[standard]'    # add dependencies (writes pyproject.toml + uv.lock)
uv add --dev pytest ruff mypy         # dev dependencies (PEP 735 dependency-groups)
uv sync                               # sync .venv from the lock file
uv run python script.py               # sync automatically, then run
uv run pytest                         # similar to npm run test
uv tool install ruff                  # global tool (≈ pipx install)
uvx ruff check .                      # run once without installing (≈ npx)
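Poetry / PDM / Rye. Poetry is still the second most popular choice; 2.0+ supports PEP 621 standard metadata but is an order of magnitude slower. PDM is standards-oriented but has a small community. Rye has been merged into uv and has not been developed separately since 2025-02. For new projects, uv is the default choice.
pyproject.toml in detail (PEP 621/735). This is the modern Python project's "package.json + pom.xml" in one file: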
[project]                             # PEP 621: standard metadata
name = "my-app"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = ["fastapi>=0.115", "pydantic>=2.9"]

[project.optional-dependencies]       # user-facing extras: pip install pkg[redis]
redis = ["redis>=5"]

[dependency-groups]                   # PEP 735: dev dependencies, not published
dev = ["pytest>=8", "ruff>=0.9", "mypy>=1.13"]

[build-system]
requires = ["hatchling>=1.27"]
build-backend = "hatchling.build"

[tool.uv]
package = true
default-groups = ["dev"]

[tool.ruff]
line-length = 100
target-version = "py312"

[tool.ruff.lint]
select = ["E", "F", "I", "UP", "B", "SIM", "RUF"]

[tool.mypy]
python_version = "3.12"
strict = true
plugins = ["pydantic.mypy"]
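Compared with package.json and pom.xml:
[project] dependencies ↔ dependencies in package.json ↔ <dependencies> in pom.xml
[dependency-groups] dev ↔ devDependencies ↔ <scope>test</scope>
[project.scripts] ↔ bin in package.json ↔ the Maven exec plugin
[tool.*] configuration sections ↔ settings scattered across several .config.js files ↔ <plugin> configuration
uv.lock (generated automatically) pins exact versions and hashes. Do not edit it by hand; in CI, use uv sync --frozen to install strictly from the lock file. requirements.txt is largely unnecessary in new projects, but for Docker deployments it can be generated with uv pip compile pyproject.toml -o requirements.txt --universal.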
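Part 2: IDEs
2.1 Mainstream IDEs Compared
VS Code + Pylance: free, lightweight, and backed by a rich extension ecosystem, this is the first choice for most Python developers. Install ms-python.python (debugging, REPL, notebooks), ms-python.vscode-pylance (a Pyright-based language server providing IntelliSense and type checking), and charliermarsh.ruff. The Python Environments extension introduced in 2025 detects uv/conda environments automatically, much as VS Code picks up node_modules/.bin/tsc when you write TypeScript.
PyCharm 2025.3+: the JetBrains flagship, now a single product that unifies Community and Pro (since 2025.2: free core features plus a Pro subscription). 2025.3 makes uv the default environment manager and adds native Ruff formatting, multiple LSP backends (ty/Pyright/Pyrefly), and the Junie agent (similar to Cursor). For Java developers the migration cost is close to zero: shortcuts, debugger, and refactorings all match IntelliJ IDEA.
Cursor / Claude Code: AI-first tooling. Cursor is a VS Code fork, but because of the Pylance EULA you need to side-load a compatible build or switch to basedpyright. Claude Code ships both a CLI and a VS Code extension, with native support for executing code inside Jupyter notebooks (with Quick Pick confirmation) and for MCP.
Jupyter / JupyterLab: essential for data science and LLM experiments. VS Code, PyCharm, and Cursor all have built-in notebook support and work best over remote SSH or containers, so a standalone JupyterLab is rarely needed anymore.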
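2.2 Key Plugins and Configuration
Language servers: Pylance (the VS Code default) = Pyright + enhancements; Pyright on its own is used in CI; new in 2025 are ty (Astral's Rust implementation, 10-60× faster than Pyright) and Pyrefly (from Meta, used internally at Instagram).
Ruff = a unified replacement for Black + isort + Flake8 + dozens of plugins. Currently at 0.15.x, it is 150-200× faster than Flake8; on a 250k LOC project, a run drops from 2.5 min to 0.4 s. Configuration lives in the [tool.ruff] section of pyproject.toml. In VS Code:
{
  "[python]": {
    "editor.defaultFormatter": "charliermarsh.ruff",
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
      "source.fixAll.ruff": "explicit",
      "source.organizeImports.ruff": "explicit"
    }
  }
}
Type checking: use Pylance/Pyright during development for instant feedback, and mypy --strict in CI as the authoritative gate. With the pydantic.mypy plugin, Pydantic model fields take part in type checking as well.
Debuggers: VS Code bundles debugpy and PyCharm has a powerful built-in debugger; both support remote attach, conditional breakpoints, and variable inspection, so the experience matches Java debugging. For remote development in VS Code, use the Remote-SSH / Dev Containers / WSL extensions.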
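Part 3: Common Syntax (Fundamentals in Detail)
3.1 Data Types and Variables
Python is dynamically typed, with optional type annotations. Variables need no declared type, but Python has supported PEP 484 type annotations since 3.5, and the built-in syntax has become much simpler since: list[int] replaces List[int] from 3.9 (PEP 585), and int | None replaces Optional[int] from 3.10 (PEP 604). Annotations are not enforced by default; a tool such as mypy or Pyright is needed to check them during static analysis.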
# Python 3.10+
name: str = "Alice"
age: int = 30
salary: float | None = None      # PEP 604 union syntax
tags: list[str] = []             # PEP 585 built-in generic containers
config: dict[str, int] = {}

// Java
String name = "Alice";
int age = 30;
Double salary = null;
List<String> tags = new ArrayList<>();
Map<String, Integer> config = new HashMap<>();

// TypeScript
const name: string = "Alice";
const age: number = 30;
const salary: number | null = null;
const tags: string[] = [];
const config: Record<string, number> = {};
Key difference: Python type annotations do not prevent type errors at runtime (unless you use a runtime validation library such as Pydantic), so they behave like TypeScript's compile-time types. Java checks types both at compile time and at runtime, apart from generic type parameters, which are erased.
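To make the point about non-enforced annotations concrete, here is a small sketch. The function names double and double_checked are illustrative and not from the original text; the explicit isinstance guard is just one hand-written way to get a runtime check without a library.

```python
def double(n: int) -> int:
    """Annotated as int -> int, but the annotation is not enforced at runtime."""
    return n * 2


# A static checker would flag this call, yet it runs: "ab" * 2 is valid Python.
unexpected = double("ab")

# Annotations are stored as plain metadata on the function object.
annotations = double.__annotations__


def double_checked(n: int) -> int:
    """Same function with a hand-written runtime guard."""
    if not isinstance(n, int):
        raise TypeError(f"n must be int, got {type(n).__name__}")
    return n * 2


try:
    double_checked("ab")
except TypeError as exc:
    message = str(exc)  # the guard rejects the str argument
```

Libraries such as Pydantic automate the guard in double_checked by reading the same annotation metadata.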
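Basic types: int (arbitrary precision; Python has no long/short), float (double precision), str (Unicode), bool (True/False; bool is a subclass of int, so True == 1 holds), and None (like Java's null or TS's null, except that in Python it is a singleton object).
Mutable vs immutable: list, dict, and set are mutable; tuple, str, frozenset, int, and float are immutable. This parallels Java, where String is immutable and List is mutable.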
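The practical consequence of mutability is aliasing: two names bound to one mutable object see each other's changes, while "modifying" an immutable object always creates a new one. A minimal sketch (all variable names are illustrative):

```python
# Mutable: both names refer to the same list object.
a = [1, 2, 3]
b = a
b.append(4)                 # visible through a as well

# An explicit copy breaks the link.
independent = list(a)
independent.append(5)       # a is unaffected

# Immutable: += on a str builds a new object and rebinds only t.
s = "abc"
t = s
t += "d"

# Only hashable (in practice, immutable) objects can be dict keys.
lookup = {(0, 0): "origin"}         # a tuple key is fine
try:
    lookup[[0, 0]] = "list key"     # a list key is rejected
except TypeError as exc:
    list_key_error = str(exc)
```

This is also why the tuple, not the list, is the type to reach for when a composite value has to serve as a dict key or set member.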
3.2 Data Structures
# list (≈ Java ArrayList ≈ TS Array)
nums = [1, 2, 3]
nums.append(4)
nums[0] = 10
sliced = nums[1:3]                  # slicing, a Python hallmark: [start:stop:step]

# tuple (immutable, usable as a dict key)
point = (3.0, 4.0)
x, y = point                        # unpacking (≈ TS const [x, y] = point)

# dict (≈ Java HashMap ≈ TS Object/Map; preserves insertion order since 3.7)
user = {"name": "Alice", "age": 30}
user["email"] = "a@b.com"
for k, v in user.items(): ...

# set (≈ Java HashSet ≈ TS Set)
unique = {1, 2, 3}
unique.add(4)

# List comprehension (a Python hallmark, terser than Java Streams)
squares = [x * x for x in range(10) if x % 2 == 0]
# Java equivalent: IntStream.range(0,10).filter(x -> x%2==0).map(x -> x*x).toArray()
# TS equivalent: [...Array(10).keys()].filter(x => x%2===0).map(x => x*x)

words = ["a", "bb", "ccc"]
word_count = {w: len(w) for w in words}     # dict comprehension
unique_lens = {len(w) for w in words}       # set comprehension
3.3 Control Flow
# if/elif/else (no parentheses; blocks are defined by indentation)
if x > 0:
    print("positive")
elif x < 0:
    print("negative")
else:
    print("zero")

# for loops are always for-of style; there is no C-style for (int i = 0; ...)
for item in items: ...
for i, item in enumerate(items): ...    # with the index (≈ TS items.entries())
for k, v in d.items(): ...

# while
while not done: ...

# match/case (Python 3.10+): structural pattern matching, far more powerful than a TS switch
match command:
    case {"action": "move", "x": x, "y": y}:
        move(x, y)
    case ["quit"]:
        exit()
    case Point(x=0, y=0):
        print("origin")
    case _:
        print("unknown")

# Exceptions: try/except/else/finally (Java's try/catch/finally plus an else clause)
try:
    result = risky()
except ValueError as e:
    print(f"value error: {e}")
except (KeyError, IndexError) as e:     # several types in one clause
    print(f"lookup error: {e}")
else:
    print("runs only on success")       # executed when no exception was raised
finally:
    cleanup()
3.4 Functions
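Python functions are first-class objects, and the parameter system is very flexible:
def greet(name: str, greeting: str = "Hello", *args, **kwargs) -> str:
    """Docstring (shown by help())."""
    return f"{greeting}, {name}!"

# Ways to call it
greet("Alice")                          # positional argument
greet(name="Alice", greeting="Hi")      # keyword arguments (no native equivalent in Java/TS)
greet("Alice", *["Hi"], **{"x": 1})     # unpacking

# *args collects extra positional arguments into a tuple,
# **kwargs collects extra keyword arguments into a dict
def f(*args, **kwargs):
    print(args, kwargs)

# Positional-only (/, Python 3.8+) and keyword-only (*) parameters
def f(pos_only, /, normal, *, kw_only): ...

# lambda (≈ Java Function / TS arrow function, limited to a single expression)
add = lambda x, y: x + y

# Closure
def make_counter():
    count = 0
    def inc():
        nonlocal count      # required; otherwise count += 1 would treat count as a local
        count += 1
        return count
    return inc
The default-argument pitfall (a famous Python gotcha): default values are evaluated only once, when the function is defined, so a mutable default is shared across calls:
def bad(items=[]):          # ❌ every call shares the same list
    items.append(1)
    return items

def good(items=None):       # ✅ the standard idiom
    if items is None:
        items = []
    items.append(1)
    return items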
3.5 Object-Oriented Programming
from dataclasses import dataclass

class Animal:
    species_count = 0                               # class variable (≈ Java static field)

    def __init__(self, name: str, age: int):        # constructor
        self.name = name                            # self ≈ Java/TS this
        self.age = age
        Animal.species_count += 1

    def describe(self) -> str:                      # an instance method's first parameter is always self
        return f"{self.name}, {self.age}"

    @classmethod
    def from_dict(cls, d: dict) -> "Animal":        # class method (≈ static factory)
        return cls(d["name"], d["age"])

    @staticmethod
    def is_adult(age: int) -> bool:                 # static method
        return age >= 18

    @property
    def is_baby(self) -> bool:                      # property: accessed like a field
        return self.age < 1

    def __str__(self) -> str:                       # magic (dunder) method: used by print()
        return f"Animal({self.name})"

    def __repr__(self) -> str:                      # for debugging; used by the REPL
        return f"Animal(name={self.name!r}, age={self.age})"

    def __eq__(self, other) -> bool:                # == overload (defining __eq__ without __hash__ makes instances unhashable)
        return isinstance(other, Animal) and self.name == other.name

# Inheritance (multiple inheritance is supported; resolved via C3-linearized MRO)
class Dog(Animal):
    def __init__(self, name: str, age: int, breed: str):
        super().__init__(name, age)
        self.breed = breed

    def describe(self) -> str:
        return super().describe() + f" ({self.breed})"

# dataclass (Python 3.7+; slots=True needs 3.10+; ≈ Java record / Kotlin data class / TS interface+class)
@dataclass(frozen=True, slots=True)
class Point:
    x: float
    y: float
    label: str = "origin"
# Generates __init__, __repr__, __eq__, and (with frozen=True) __hash__
Common magic methods at a glance:
| Method | Triggered by | Java counterpart |
|---|---|---|
| __init__ | Construction | constructor |
| __str__ / __repr__ | str(obj) / repr(obj) | toString() |
| __eq__ / __hash__ | == / hash() | equals() / hashCode() |
| __lt__ and friends | < and other comparisons | Comparable |
| __len__ | len(obj) | .size() |
| __iter__ | for x in obj | Iterable |
| __enter__ / __exit__ | with statement | try-with-resources |
| __call__ | obj() | Function.apply() |
Abstract classes (the abc module) ≈ Java abstract class; Protocol ≈ TS interface (duck typing, no inheritance required). Section 4.5 covers this in detail.
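The contrast between the two mechanisms fits in a few lines. This is a sketch with made-up class names (Shape, Square, Closable, Connection): the ABC requires explicit inheritance and refuses to be instantiated, while the Protocol is satisfied by any object with a matching method.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


class Shape(ABC):
    """Nominal typing: subclasses must inherit and implement area()."""

    @abstractmethod
    def area(self) -> float: ...


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side ** 2


@runtime_checkable
class Closable(Protocol):
    """Structural typing: anything with a close() method qualifies."""

    def close(self) -> None: ...


class Connection:  # note: does not inherit from Closable
    def __init__(self) -> None:
        self.open = True

    def close(self) -> None:
        self.open = False


def shutdown(resource: Closable) -> None:
    resource.close()


# An abstract class cannot be instantiated directly.
try:
    Shape()
except TypeError:
    abstract_instantiable = False
else:
    abstract_instantiable = True

conn = Connection()
shutdown(conn)  # accepted by type checkers because the shapes match
```

The runtime_checkable decorator is optional; it only enables isinstance checks against the protocol, and static checkers accept Connection without it.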
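3.6 Modules and Packages
# A module is a single .py file; a package is a directory containing __init__.py
# Project layout:
# myproject/
#     __init__.py
#     core.py
#     utils/
#         __init__.py
#         helpers.py

# Import forms
import json                                 # a whole module
from pathlib import Path                    # a single name
from collections import abc as abc_mod      # with an alias
from .utils.helpers import normalize        # relative import (inside a package)

# __init__.py controls the package's public interface
# myproject/__init__.py:
__all__ = ["Foo", "bar"]
from .core import Foo
from .utils.helpers import bar

# The __name__ == "__main__" idiom
def main(): ...

if __name__ == "__main__":      # true only when the file is run directly (python xx.py)
    main()                      # not executed when the file is imported
Module search path: sys.path is searched in order: the script's directory (the current directory in the REPL) → PYTHONPATH → the standard library → site-packages. Absolute imports (from myproject.utils import x) are preferred over relative imports (from .utils import x); the latter work only inside a package.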
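Part 4: Advanced Syntax
4.1 Iterators and Generators
# Iterator protocol: __iter__ + __next__
class Counter:
    def __init__(self, n):
        self.n, self.i = n, 0
    def __iter__(self):
        return self
    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        self.i += 1
        return self.i

# Generator function: uses yield and implements the iterator protocol automatically
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        yield a             # pause and return; the next call to next() resumes here
        a, b = b, a + b

list(fib(10))               # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Generator expression (lazy, memory-friendly; like a list comprehension with parentheses)
squares = (x * x for x in range(10**9))     # does not exhaust memory
total = sum(squares)

# itertools offers many lazy iteration tools
import itertools
list(itertools.chain([1, 2], [3, 4]))           # [1, 2, 3, 4]
list(itertools.islice(itertools.count(), 5))    # [0, 1, 2, 3, 4]
# groupby yields (key, group-iterator) pairs, so materialize each group
[(k, list(g)) for k, g in itertools.groupby([1, 1, 2, 3, 3], key=lambda x: x)]
Generators are central to LLM streaming responses: FastAPI's StreamingResponse accepts an async generator directly as its content.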
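The streaming pattern mentioned above can be sketched without any web framework. Everything here is a stand-in: fake_token_stream plays the role of an LLM client, and to_sse formats each token as a Server-Sent Events frame, which is the shape an async generator handed to a streaming response typically produces.

```python
import asyncio
from collections.abc import AsyncIterator


async def fake_token_stream(text: str, delay: float = 0.0) -> AsyncIterator[str]:
    """Hypothetical stand-in for an LLM client: yields one word at a time."""
    for word in text.split():
        await asyncio.sleep(delay)  # simulates waiting on the network
        yield word


async def to_sse(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Wrap each token in an SSE data frame, then emit a terminator frame."""
    async for token in tokens:
        yield f"data: {token}\n\n"
    yield "data: [DONE]\n\n"


async def collect() -> list[str]:
    # An async comprehension drains the generator chain lazily, frame by frame.
    return [frame async for frame in to_sse(fake_token_stream("hello streaming world"))]


frames = asyncio.run(collect())
```

Because both stages are generators, no frame is produced until the consumer asks for it, so memory use stays flat regardless of response length.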
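4.2 Decorators
A decorator is a higher-order function that takes a function and returns a new one. The syntax @decorator is equivalent to func = decorator(func).
import functools, time

def timing(func):
    @functools.wraps(func)              # preserve the original name/docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        print(f"{func.__name__} took {time.perf_counter()-start:.3f}s")
        return result
    return wrapper

@timing
def slow_op():
    time.sleep(1)

# Decorator with arguments (one more layer of closure)
def retry(times=3):
    def deco(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for i in range(times):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if i == times - 1:
                        raise           # re-raise after the final attempt
        return wrapper
    return deco

@retry(times=5)
def fetch(): ...

# Commonly used built-in decorators
@functools.lru_cache(maxsize=128)       # automatic memoization (cache)
def expensive(n): ...

@functools.cache                        # 3.9+, unbounded cache
def fib(n): ...
Comparison: Java annotations (@Override) are metadata markers that need the compiler or framework reflection to act on them. TS decorators (stage 3, supported natively since TypeScript 5.0; the older experimentalDecorators mode sits behind a compiler flag) are close to Python's in power but apply only to classes and their members. A Python decorator replaces the function with whatever the decorator returns, and the pattern is everywhere: LangChain @tool, FastAPI @app.get, pytest @pytest.fixture.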
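The equivalence between the @ syntax and plain reassignment can be checked directly. In this sketch the decorator shout and the two greeting functions are invented for illustration; one function is decorated with @, the other by hand.

```python
import functools


def shout(func):
    """Decorator that upper-cases the wrapped function's string result."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()

    return wrapper


@shout
def hello(name: str) -> str:
    """Greet someone."""
    return f"hello, {name}"


def goodbye(name: str) -> str:
    return f"goodbye, {name}"


# Manual application: exactly what the @shout line does for hello.
goodbye = shout(goodbye)
```

Thanks to functools.wraps, hello.__name__ is still "hello" and the original function remains reachable as hello.__wrapped__, which matters for debuggers and for frameworks that inspect signatures.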
4.3 Context Managers
The with statement guarantees that resources (files, locks, database connections) are released properly after use, like Java's try-with-resources or C#'s using.
import time

# Built-in: files
with open("data.txt") as f:     # closed automatically on exit
    content = f.read()

# Custom: the __enter__ / __exit__ protocol
class Timer:
    def __enter__(self):
        self.start = time.perf_counter()
        return self
    def __exit__(self, exc_type, exc_val, tb):
        print(f"{time.perf_counter()-self.start:.3f}s")

with Timer():
    expensive_operation()

# contextlib: a decorator that turns a generator into a context manager
from contextlib import contextmanager

@contextmanager
def db_transaction(conn):
    conn.begin()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
4.4 Asynchronous Programming (Essential for LLM Work)
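Core mental model: almost identical to JavaScript's async/await. Coroutines, the event loop, Promise/Future: the whole paradigm carries over. One difference worth knowing is that calling a Python async function only creates a coroutine object; nothing runs until it is awaited or scheduled. Async is also "colored": the result of an async function has to be awaited, so once one part of a project goes async, the whole call chain has to be async.
import asyncio, httpx

# An async function returns a coroutine object; it runs only when awaited or scheduled
async def fetch(url: str) -> str:
    async with httpx.AsyncClient() as c:
        r = await c.get(url)
        return r.text

# asyncio.run is the program entry point (≈ TS top-level await, but invoked explicitly)
async def main():
    # Sequential: 2 seconds (assuming about 1 second per request)
    a = await fetch("https://a.com")
    b = await fetch("https://b.com")

    # Concurrent: about 1 second
    a, b = await asyncio.gather(fetch("https://a.com"), fetch("https://b.com"))

    # TaskGroup (3.11+, structured concurrency, recommended)
    async with asyncio.TaskGroup() as tg:
        t1 = tg.create_task(fetch("https://a.com"))
        t2 = tg.create_task(fetch("https://b.com"))
    # On leaving the with block all tasks are done; if one fails, the others are cancelled

asyncio.run(main())
Async generators (essential for LLM streaming):
async def stream_tokens():
    async for chunk in llm_call():          # async iterator (llm_call is a placeholder)
        yield chunk                         # yield makes this an async generator

async def main():
    async for tok in stream_tokens():       # async for is only valid inside an async function
        print(tok, end="", flush=True)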
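asyncio vs threading vs multiprocessing:
| Dimension | asyncio | threading | multiprocessing |
|---|---|---|---|
| I/O-bound work | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ |
| CPU-bound work | ❌ | ❌ (GIL) | ⭐⭐⭐⭐⭐ |
| Memory overhead | Tiny (KB per task) | Medium (MB per thread) | Large (separate processes) |
| Practical upper limit | Tens of thousands | A few hundred | Number of CPU cores |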
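LLM applications are almost entirely I/O-bound (waiting on the LLM, the database, the vector store), so asyncio is the natural choice. When an async function has to call a blocking API, wrap the call in await asyncio.to_thread(blocking_fn).
GIL and free-threading: in traditional CPython the GIL means multithreaded CPU-bound code effectively runs on a single core. Since Python 3.13, PEP 703 provides a --disable-gil build (the executable is named python3.13t), with a reported speedup of about 4.5× on 4 threads for CPU-bound work. The ecosystem is still adapting, though, so for production the recommendation remains multiprocessing for CPU-bound tasks.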
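A short sketch of the asyncio.to_thread pattern just described. The function blocking_io is a made-up stand-in for any synchronous SDK or driver call; the three calls run in worker threads, so the event loop stays free and the total wall time should be close to one call rather than three.

```python
import asyncio
import time


def blocking_io(seconds: float) -> str:
    """Stand-in for a blocking call (synchronous SDK, legacy driver, file I/O)."""
    time.sleep(seconds)
    return f"slept {seconds}s"


async def main() -> tuple[list[str], float]:
    start = time.perf_counter()
    # Each call is handed to a worker thread; gather awaits all three together.
    results = await asyncio.gather(
        asyncio.to_thread(blocking_io, 0.2),
        asyncio.to_thread(blocking_io, 0.2),
        asyncio.to_thread(blocking_io, 0.2),
    )
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(main())
print(f"{len(results)} calls finished in {elapsed:.2f}s")
```

This helps with blocking I/O only. CPU-bound functions moved to threads still contend for the GIL on a standard build.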
4.5 Advanced Type System
from typing import Optional, Union, Callable, Protocol, TypedDict, Annotated
from typing import TypeVar, Generic
from typing import NotRequired          # 3.11+
from pydantic import Field              # used by the Annotated example below

# Recommended style on Python 3.10+
def find_user(uid: int) -> User | None: ...     # replaces Optional[User]; User is defined elsewhere
handler: Callable[[int, str], bool]             # function type

# New generic syntax in Python 3.12+ (PEP 695)
def first[T](items: list[T]) -> T:
    return items[0]

class Box[T]:
    def __init__(self, value: T):
        self.value = value

type Result[T] = T | None               # type alias (replaces TypeAlias)

# Protocol (structural subtyping, ≈ the duck typing of a TS interface)
class Closable(Protocol):
    def close(self) -> None: ...

def shutdown(x: Closable) -> None:      # any object with a close() method is accepted
    x.close()

# TypedDict (≈ a TS interface for dicts; static checking only, no runtime validation)
class UserDict(TypedDict):
    id: int
    name: str
    email: NotRequired[str]             # 3.11+

# Annotated (type + metadata; heavily used by FastAPI/Pydantic)
PositiveInt = Annotated[int, Field(gt=0)]
Pydantic v2 is the de facto standard for combining Python type annotations with runtime validation; it is the counterpart of Zod in TypeScript:
from pydantic import BaseModel, Field, EmailStr, field_validator

class User(BaseModel):
    id: int = Field(..., gt=0)
    name: str = Field(..., min_length=1, max_length=50)
    email: EmailStr                     # requires the email extra: pydantic[email]
    age: int | None = Field(default=None, ge=0, le=150)

    @field_validator("name")
    @classmethod
    def lowercase(cls, v: str) -> str:
        return v.strip().lower()

# Usage (v2 API)
user = User.model_validate({"id": 1, "name": "Alice", "email": "a@b.com"})
user.model_dump()                   # plain dict
user.model_dump_json(indent=2)      # JSON string (≈ JSON.stringify)
User.model_json_schema()            # JSON Schema, ready to hand to an LLM
4.6 Metaprogramming
# Reflection
hasattr(obj, "method")
getattr(obj, "method")
setattr(obj, "x", 10)

# Metaclass (the class of a class, ≈ Java's Class<?>): rarely needed, but worth understanding
class Singleton(type):
    _instances = {}
    def __call__(cls, *a, **kw):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*a, **kw)
        return cls._instances[cls]

class Config(metaclass=Singleton): ...

# Descriptor: an object that defines __get__/__set__; @property is built on descriptors
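Since descriptors are mentioned above without an example, here is a minimal sketch of one. The names Positive and Order are hypothetical; the descriptor validates on assignment and stores the value on the instance under a private name.

```python
class Positive:
    """Data descriptor that rejects non-positive numbers on assignment."""

    def __set_name__(self, owner, name):
        # Called once at class creation with the attribute's name.
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.private_name[1:]} must be positive, got {value!r}")
        setattr(instance, self.private_name, value)


class Order:
    quantity = Positive()
    price = Positive()

    def __init__(self, quantity, price):
        self.quantity = quantity  # goes through Positive.__set__
        self.price = price


order = Order(3, 9.5)
try:
    order.quantity = 0
except ValueError as exc:
    error = str(exc)  # the invalid assignment is rejected; quantity stays 3
```

One descriptor class serves any number of attributes, which is the advantage over writing a separate @property pair for each validated field.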
Part 5: Common Frameworks (Focused on LLM Application Development)
5.1 FastAPI (First Choice for LLM API Services)
FastAPI is the de facto standard for Python LLM services. Currently at 0.136.1, it combines Spring Boot's strong typing and automatic OpenAPI with Express's simplicity and native async. Pydantic v2 is now a hard dependency.
import json
from contextlib import asynccontextmanager
from typing import Annotated

import httpx
from fastapi import FastAPI, Depends, Header, HTTPException, Request
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

# Lifespan (replaces the deprecated @app.on_event("startup"))
@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.client = httpx.AsyncClient()
    yield
    await app.state.client.aclose()

app = FastAPI(title="LLM API", lifespan=lifespan)

class ChatRequest(BaseModel):
    message: str
    stream: bool = True

# Dependency injection (≈ Spring @Autowired, but more flexible)
# User, valid, load_user, call_llm and stream_llm are application-specific placeholders
async def get_user(token: Annotated[str, Header()]) -> User:
    if not valid(token):
        raise HTTPException(401)
    return load_user(token)

UserDep = Annotated[User, Depends(get_user)]

@app.post("/chat")
async def chat(req: ChatRequest, user: UserDep, request: Request):
    if not req.stream:
        return {"answer": await call_llm(req.message)}

    # SSE streaming response (the core of an LLM application)
    async def event_gen():
        try:
            async for token in stream_llm(req.message):
                if await request.is_disconnected():
                    break
                yield f"data: {json.dumps({'delta': token})}\n\n"
            yield "event: done\ndata: [DONE]\n\n"
        except Exception as e:
            yield f"event: error\ndata: {json.dumps({'error': str(e)})}\n\n"

    return StreamingResponse(
        event_gen(),
        media_type="text/event-stream",
        headers={"Cache-Control": "no-cache", "X-Accel-Buffering": "no"},
    )
Comparison with Spring Boot / Express / NestJS:
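| Dimension | FastAPI | Spring Boot | Express | NestJS |
|---|---|---|---|---|
| Validation | Automatic via Pydantic | Bean Validation annotations | Manual / Joi | class-validator |
| OpenAPI | Generated automatically | Configured via springdoc | Hand-written | Semi-automatic via @nestjs/swagger |
| DI | Function-level Depends, concise | IoC container (@Autowired) | None | @Injectable module system |
| Async | Native asyncio | Synchronous + WebFlux | Native | Native |
| Startup | Seconds | 5-30 s | Seconds | Seconds |
| Learning curve | Low | High | Very low | Medium |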
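5.2 HTTP Client: httpx
The standard answer in 2026 is httpx: one API for sync and async, HTTP/2 support (via the optional http2 extra), and the client recommended by the FastAPI documentation.
import asyncio
import httpx

# Synchronous
with httpx.Client(timeout=10.0) as client:
    r = client.get("https://api.example.com")

# Asynchronous (inside an async function)
async with httpx.AsyncClient() as client:
    r = await client.get("https://api.example.com")

    # Streaming (LLM SSE client)
    async with client.stream("POST", url, json=payload, timeout=None) as resp:
        async for line in resp.aiter_lines():
            if line.startswith("data: "):
                process(line[6:])

    # Concurrent requests (issued while the client is still open)
    results = await asyncio.gather(*[client.get(u) for u in urls])
requests is suitable only for simple scripts; aiohttp is used for extremely high-concurrency scenarios.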
5.3 OpenAI SDK / Anthropic SDK
# OpenAI (v2.x; the Responses API is the main focus in 2026)
from openai import OpenAI, AsyncOpenAI
from pydantic import BaseModel

client = AsyncOpenAI()

# Chat Completions (backward compatible, still widely used)
stream = await client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi"}],
    stream=True,
)
async for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

# Structured output (a Pydantic model serves directly as the schema)
class Event(BaseModel):
    name: str
    date: str
    participants: list[str]

resp = await client.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Alice and Bob meet Friday"}],
    response_format=Event,
)
event: Event = resp.choices[0].message.parsed

# Function calling
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}]
resp = await client.chat.completions.create(model="gpt-4o", messages=[...], tools=tools)

# Anthropic
from anthropic import AsyncAnthropic

client = AsyncAnthropic()

# Streaming (async context manager + text_stream)
async with client.messages.stream(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
) as stream:
    async for text in stream.text_stream:
        print(text, end="", flush=True)

# Extended thinking (Claude 4+)
resp = await client.messages.create(
    model="claude-opus-4-7",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[...],
)

# Prompt caching
resp = await client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,                # max_tokens is required by the Messages API
    system=[
        {"type": "text", "text": "You are an analyst."},
        {"type": "text", "text": "<huge context>", "cache_control": {"type": "ephemeral"}},
    ],
    messages=[...],
)
5.4 LangChain 1.x + LangGraph 1.0
Important update: LangChain and LangGraph released their stable 1.0 versions in 2025-10. The new API centers on from langchain.agents import create_agent (built on the LangGraph runtime); the old AgentExecutor and Memory families have moved to the langchain-classic package. LCEL (the pipe operator |) is still the recommended way to compose retrievers, prompts, and parsers.
# LCEL: compose Runnables with the pipe operator
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

# One interface for sync, async, batch, and streaming
chain.invoke({"topic": "ai"})
await chain.ainvoke({"topic": "ai"})
chain.batch([{"topic": "cats"}, {"topic": "dogs"}])
async for chunk in chain.astream({"topic": "ai"}):
    print(chunk, end="")
LangGraph is the go-to agent framework of 2025:
from typing import Annotated
from typing_extensions import TypedDict
from langchain_core.tools import tool
from langgraph.graph import StateGraph, START, END, MessagesState
from langgraph.prebuilt import ToolNode, tools_condition, create_react_agent
from langgraph.checkpoint.memory import InMemorySaver
from langchain_openai import ChatOpenAI

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

# One-line ReAct agent (recommended starting point)
# Note: LangGraph 1.0 deprecates this prebuilt in favor of langchain.agents.create_agent
agent = create_react_agent(
    model="openai:gpt-4o-mini",
    tools=[multiply],
    prompt="You are a math assistant.",
    checkpointer=InMemorySaver(),
)

# Multi-turn: passing a thread_id is enough to enable memory
config = {"configurable": {"thread_id": "user-1"}}
async for chunk in agent.astream(
    {"messages": [{"role": "user", "content": "3*4?"}]},
    config,
    stream_mode=["updates", "messages"],
):
    print(chunk)
Manual StateGraph (when you need finer control):
class AgentState(MessagesState):
    pass

llm = ChatOpenAI(model="gpt-4o-mini").bind_tools([multiply])

def assistant(state):
    return {"messages": [llm.invoke(state["messages"])]}

builder = StateGraph(AgentState)
builder.add_node("assistant", assistant)
builder.add_node("tools", ToolNode([multiply]))
builder.add_edge(START, "assistant")
builder.add_conditional_edges("assistant", tools_condition)    # tool_calls present → tools, otherwise → END
builder.add_edge("tools", "assistant")
graph = builder.compile(checkpointer=InMemorySaver())
Human-in-the-loop (HITL, a key agent capability):
from langgraph.types import interrupt, Command

def approval_node(state):
    decision = interrupt({"task": "Deploy?", "details": state["plan"]})
    return {"approved": decision == "approve"}

# The first call runs until interrupt() and pauses there
graph.invoke({"plan": "deploy v2"}, config)
# Resume once the user has decided
graph.invoke(Command(resume="approve"), config)
5.5 LlamaIndex (RAG-Focused)
LlamaIndex positions itself as a data framework and is currently at 0.14.x. For RAG it is more focused and easier to use than LangChain:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-4o-mini")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

docs = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(docs)
qe = index.as_query_engine(similarity_top_k=5)
print(qe.query("What are the main themes?"))
Rule of thumb: pure RAG / document analysis → LlamaIndex; agents / workflow orchestration → LangGraph. The two combine well (use LlamaIndex retrieval as a LangGraph tool).
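5.6 Vector Databases
| Library | Characteristics | Recommended for |
|---|---|---|
| Chroma | Embedded, SQLite-backed, zero ops | Prototypes, small-scale production |
| Qdrant | Rust, async and gRPC support | General-purpose production default |
| Pinecone | Fully managed SaaS | Teams that do not want to self-host |
| Weaviate v4 | Multimodal, hybrid search | Complex queries |
| Milvus | Large-scale distributed | Hundreds of millions of vectors |
| FAISS | Purely local index library | Single-machine large scale, no metadata |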
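Whatever the product, the core operation behind the table above is nearest-neighbor search over embedding vectors. The following brute-force sketch in pure Python is only meant to show that operation; the function names and the toy two-dimensional vectors are invented, and real stores add approximate indexes, persistence, and metadata filtering on top.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], docs: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = sorted(
        ((cosine(query, vec), doc_id) for doc_id, vec in docs.items()),
        reverse=True,
    )
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy "embeddings": real ones have hundreds or thousands of dimensions.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
hits = top_k([1.0, 0.1], docs, k=2)
```

Brute force scales linearly with the collection size, which is why dedicated stores rely on approximate indexes once collections grow large.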
# Chroma (the simplest starting point)
import chromadb

client = chromadb.PersistentClient(path="./chroma_db")
col = client.get_or_create_collection("docs")
col.add(documents=["doc 1", "doc 2"], ids=["1", "2"])
res = col.query(query_texts=["query"], n_results=3)

# Qdrant (recommended for production)
from qdrant_client import AsyncQdrantClient, models

client = AsyncQdrantClient(url="http://localhost:6333")
await client.upsert("coll", points=[models.PointStruct(id=1, vector=[...], payload={...})])
5.7 Embedding Models
# OpenAI (most common)
from openai import AsyncOpenAI

resp = await AsyncOpenAI().embeddings.create(model="text-embedding-3-small", input=["hi"])

# Local (sentence-transformers)
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-large-en-v1.5")
emb = model.encode(["hello"], normalize_embeddings=True)
5.8 Agent Frameworks Compared
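| Framework | Philosophy | Learning curve | Best for |
|---|---|---|---|
| LangGraph | State machine / directed graph | Fairly steep | Complex branching, HITL, long-running workflows (the default choice) |
| CrewAI | Roles + tasks | Usable in about 30 minutes | Role-based research/writing, quick POCs |
| AutoGen / Microsoft Agent Framework | Asynchronous message-passing actors | Medium | Free-form conversation, code-execute-fix loops |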
5.9 MCP (Model Context Protocol)
An open protocol led by Anthropic that standardizes the interface between LLMs and external tools or data sources. The official SDK is at 1.27.x.
# Server (using FastMCP)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Demo")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

if __name__ == "__main__":
    mcp.run(transport="stdio")      # or "streamable-http"

# Client
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    params = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(params) as (r, w):
        async with ClientSession(r, w) as session:
            await session.initialize()
            result = await session.call_tool("add", {"a": 3, "b": 5})
With langchain-mcp-adapters, MCP tools can be loaded as LangChain tools in one line:
from langchain_mcp_adapters.client import MultiServerMCPClient

client = MultiServerMCPClient({"example": {"transport": "stdio", "command": "python", "args": ["server.py"]}})
tools = await client.get_tools()    # pass straight to create_react_agent
5.10 Observability
| Tool | Characteristics |
|---|---|
| LangSmith | Official LangChain product, deeply integrated with LangGraph |
| Langfuse | Open source and self-hostable; v3+ is built on OTel |
| Phoenix (Arize) | Runs fully locally; 30+ OpenInference auto-instrumentors |
# LangSmith: enabled through environment variables
# LANGSMITH_TRACING=true, LANGSMITH_API_KEY=...

# Langfuse v3
from langfuse.langchain import CallbackHandler

agent.invoke(inputs, config={"callbacks": [CallbackHandler()]})

# Phoenix: local, zero configuration
from phoenix.otel import register

register(project_name="my-app", auto_instrument=True)
5.11 Prompt Optimization (DSPy) + Local Inference
DSPy turns "prompt engineering" into "programmatic optimization":
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))
qa = dspy.ChainOfThought("question -> answer: str")
print(qa(question="Capital of France?").answer)

# Optimize the prompt automatically (my_metric and examples are user-supplied)
from dspy.teleprompt import MIPROv2

optimizer = MIPROv2(metric=my_metric, auto="light")
optimized = optimizer.compile(qa, trainset=examples)
Local inference: Ollama (simplest), vLLM (high-throughput production), llama.cpp (CPU/GGUF).
from ollama import AsyncClient

# inside an async function
async for part in await AsyncClient().chat(model="llama3.2", messages=[...], stream=True):
    print(part["message"]["content"], end="")
Part 6: Two Complete Best-Practice Examples
Practice 1: RAG Document Q&A System
Tech stack: FastAPI + uv + Pydantic v2 + Chroma + OpenAI/Anthropic + async + SSE
Directory layout:
rag-app/
├── pyproject.toml
├── .env.example
├── src/rag_app/
│   ├── main.py          # FastAPI routes + lifespan
│   ├── config.py        # pydantic-settings
│   ├── models.py        # Pydantic request/response/SSE models
│   ├── chunker.py       # RecursiveCharacterTextSplitter wrapper
│   ├── embedder.py      # AsyncOpenAI embedding
│   ├── vectorstore.py   # Chroma PersistentClient wrapper
│   ├── llm.py           # OpenAI / Anthropic async streaming
│   └── rag.py           # retrieval + generation orchestration
└── tests/
Key dependencies in pyproject.toml:
[project]
name = "rag-app"
requires-python = ">=3.11"
dependencies = [
    "fastapi>=0.115",
    "uvicorn[standard]>=0.32",
    "python-multipart>=0.0.12",
    "pydantic>=2.9",
    "pydantic-settings>=2.6",
    "chromadb>=1.0",
    "openai>=1.54",
    "anthropic>=0.39",
    "httpx>=0.27",
    "tiktoken>=0.8",
    "langchain-text-splitters>=0.3",
]

[project.optional-dependencies]
dev = ["ruff>=0.7", "mypy>=1.13", "pytest>=8.3", "pytest-asyncio>=0.24"]
Core code excerpt (src/rag_app/main.py):
import json
from contextlib import asynccontextmanager

from fastapi import FastAPI, Depends, Request, UploadFile, File, Body, HTTPException
from fastapi.responses import StreamingResponse, JSONResponse

from .config import get_settings
from .embedder import Embedder
from .vectorstore import VectorStore
from .chunker import TextChunker
from .llm import build_llm
from .rag import RAGService
from .models import IngestTextRequest, QueryRequest, HealthResponse

@asynccontextmanager
async def lifespan(app: FastAPI):
    s = get_settings()
    embedder = Embedder(api_key=s.openai_api_key, model=s.embedding_model)
    store = VectorStore(persist_dir=s.chroma_persist_dir, collection_name=s.chroma_collection)
    llm = build_llm(provider=s.llm_provider, openai_api_key=s.openai_api_key,
                    anthropic_api_key=s.anthropic_api_key, model=s.llm_model,
                    max_output_tokens=s.max_output_tokens)
    chunker = TextChunker(s.chunk_size, s.chunk_overlap)
    app.state.rag = RAGService(chunker, embedder, store, llm, s.top_k)
    app.state.store = store
    yield
    await embedder.aclose()
    await llm.aclose()

app = FastAPI(title="RAG App", lifespan=lifespan)

def _sse_pack(event: str, data) -> bytes:
    return f"event: {event}\ndata: {json.dumps(data, ensure_ascii=False)}\n\n".encode()

@app.post("/documents")
async def ingest(request: Request, payload: IngestTextRequest = Body(...)):
    rag = request.app.state.rag
    ids = await rag.ingest_text(payload.text, payload.source, payload.metadata)
    return {"source": payload.source, "chunks_added": len(ids), "document_ids": ids}

@app.post("/query")
async def query(payload: QueryRequest, request: Request):
    rag = request.app.state.rag
    if not payload.stream:      # non-streaming: drain the stream and return a single JSON body
        sources, chunks = [], []
        async for ev_type, ev_data in rag.answer_stream(payload.question, top_k=payload.top_k):
            if ev_type == "sources":
                sources = ev_data
            elif ev_type == "delta":
                chunks.append(str(ev_data))
        return JSONResponse({"answer": "".join(chunks), "sources": sources})

    async def event_gen():
        # answer_stream already emits the final "done" event, so none is added here
        async for ev_type, ev_data in rag.answer_stream(payload.question, top_k=payload.top_k):
            yield _sse_pack(ev_type, ev_data if ev_type != "delta" else {"text": ev_data})

    return StreamingResponse(event_gen(), media_type="text/event-stream",
                             headers={"Cache-Control": "no-cache", "X-Accel-Buffering": "no"})
RAG core (src/rag_app/rag.py):
import asyncio

SYSTEM_PROMPT = """You are a rigorous knowledge assistant. Answer only from the material in <context>; if the material is insufficient, say "cannot answer"; cite sources with [1][2] reference numbers."""
USER_PROMPT_TEMPLATE = "<context>\n{context}\n</context>\n\nUser question: {question}"

class RAGService:
    def __init__(self, chunker, embedder, store, llm, default_top_k):
        self.chunker, self.embedder, self.store, self.llm = chunker, embedder, store, llm
        self.default_top_k = default_top_k

    async def ingest_text(self, text, source, extra_metadata=None):
        meta = {"source": source, **(extra_metadata or {})}
        # Chunking is CPU work, so keep it off the event loop
        chunks = await asyncio.to_thread(self.chunker.split, text, meta)
        if not chunks:
            return []
        texts = [c.text for c in chunks]
        embeddings = await self.embedder.embed_texts(texts)
        return await self.store.add(texts, embeddings, [c.metadata for c in chunks])

    async def answer_stream(self, question, top_k=None):
        q_vec = await self.embedder.embed_query(question)
        hits = await self.store.query(q_vec, top_k=top_k or self.default_top_k)
        yield ("sources", [{"id": h.id, "text": h.text, "score": h.score, "metadata": h.metadata} for h in hits])
        context = "\n\n".join(f"[{i+1}] {h.text}" for i, h in enumerate(hits))
        messages = [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": USER_PROMPT_TEMPLATE.format(context=context, question=question)},
        ]
        async for delta in self.llm.astream(messages):
            yield ("delta", delta)
        yield ("done", {})
Run and test:
uv sync
cp .env.example .env            # fill in OPENAI_API_KEY
uv run uvicorn rag_app.main:app --reload
curl -X POST localhost:8000/documents -H 'content-type: application/json' \
  -d '{"text":"Chroma is an open-source vector database","source":"demo"}'
curl -N -X POST localhost:8000/query -H 'content-type: application/json' \
  -d '{"question":"What is Chroma?","stream":true}'
Practice 2: LangGraph Agent + MCP
技术栈:FastAPI + uv + LangGraph 1.0 + LangChain 0.3 + langchain-mcp-adapters + Langfuse v3 + 异步 SSE
目录结构:
agent-app/├── pyproject.toml├── src/agent_app/│ ├── main.py # FastAPI /chat SSE + /health│ ├── config.py│ ├── models.py│ ├── tools.py # web_search, calculator, query_database│ ├── mcp_client.py # MultiServerMCPClient 封装│ ├── agent.py # create_react_agent│ ├── observability.py # Langfuse v3│ └── streaming.py # astream → SSE 适配└── mcp_servers/ └── example_server.py # FastMCP server (get_weather, read_file)
Core agent (src/agent_app/agent.py):
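```python
from langchain.chat_models import init_chat_model
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.prebuilt import create_react_agent

from .config import settings
from .mcp_client import get_mcp_tools
from .tools import LOCAL_TOOLS

SYSTEM_PROMPT = """You are a rigorous, reliable AI assistant.
Capabilities:
1. web_search: search for up-to-date information
2. calculator: exact math
3. query_database: query the local products table (read-only SELECT)
4. MCP extension tools (e.g. reading files, checking the weather)
Constraints: always call a tool for real-time information; route math through calculator; include links when citing search results; do not retry a failing tool more than 3 times; reply in the user's language."""

_compiled = None


async def build_agent():
    """Build the ReAct agent once and reuse it across requests."""
    global _compiled
    if _compiled is not None:
        return _compiled
    llm = init_chat_model(
        model=settings.llm_model,
        temperature=0.2,
        timeout=settings.llm_timeout,
        max_retries=2,
    )
    mcp_tools = await get_mcp_tools()
    _compiled = create_react_agent(
        model=llm,
        tools=[*LOCAL_TOOLS, *mcp_tools],
        prompt=SYSTEM_PROMPT,
        # In-memory checkpoints keep per-thread history but are lost on restart.
        checkpointer=InMemorySaver(),
        name="react-agent-app",
    )
    return _compiled
```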
Safe tools (excerpt from src/agent_app/tools.py):
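```python
import asyncio
import math
import re

import aiosqlite
import numexpr
import orjson
from langchain_core.tools import tool

from .config import settings

# Whitelist: digits, arithmetic operators, parentheses, whitespace, commas,
# plus the letters needed for scientific notation (e/E) and pi/PI.
_SAFE_EXPR_RE = re.compile(r"^[0-9eE\.\+\-\*\/\%\(\)\s\,piPI]+$")


@tool("calculator", parse_docstring=True)
async def calculator(expression: str) -> str:
    """Safely evaluate a math expression (never executes arbitrary code).

    Args:
        expression: For example "(3+5)*12"
    """
    expr = expression.strip()
    if not _SAFE_EXPR_RE.match(expr):
        return "[calculator] illegal characters detected"
    try:
        # Run off the event loop. pi is passed in explicitly because numexpr
        # has no built-in constant of that name; the empty global_dict keeps
        # it from looking names up in the calling frame.
        result = await asyncio.to_thread(
            numexpr.evaluate, expr,
            local_dict={"pi": math.pi, "PI": math.pi}, global_dict={},
        )
        return str(result.item() if result.ndim == 0 else result.tolist())
    except Exception as e:
        return f"[calculator] invalid expression: {e}"


# A keyword blacklist is only a coarse guard; treat it as one layer of defense.
_FORBIDDEN_SQL = re.compile(r"\b(insert|update|delete|drop|alter|create)\b", re.I)


@tool("query_database", parse_docstring=True)
async def query_database(sql: str) -> str:
    """Run a read-only SELECT against the products(id, name, category, price, stock) table.

    Args:
        sql: A single SELECT statement.
    """
    stmt = sql.strip().rstrip(";").strip()
    if not stmt.lower().startswith("select"):
        return "[query_database] only SELECT is allowed"
    if _FORBIDDEN_SQL.search(stmt):
        return "[query_database] forbidden keyword detected"
    try:
        async with aiosqlite.connect(settings.db_path) as db:
            db.row_factory = aiosqlite.Row
            # asyncio.timeout requires Python 3.11+.
            async with asyncio.timeout(settings.tool_timeout_seconds):
                # Wrap as a subquery so the row cap also works when the
                # statement already carries its own LIMIT.
                cur = await db.execute(f"SELECT * FROM ({stmt}) LIMIT 50")
                rows = await cur.fetchall()
    except (TimeoutError, aiosqlite.Error) as e:
        return f"[query_database] query failed: {e}"
    return orjson.dumps([dict(r) for r in rows]).decode()


# web_search is defined earlier in the same file (omitted from this excerpt).
LOCAL_TOOLS = [web_search, calculator, query_database]
```

SSE streaming output (core of src/agent_app/streaming.py):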
```python
from langchain_core.messages import AIMessage, AIMessageChunk, ToolMessage

# StreamEvent and sse_format are project helpers defined elsewhere (not part of this excerpt).


async def stream_agent_events(agent, user_message, thread_id, callbacks=None, trace_id=None):
    config = {"configurable": {"thread_id": thread_id}, "callbacks": callbacks or []}
    inputs = {"messages": [{"role": "user", "content": user_message}]}
    yield sse_format(StreamEvent(type="metadata", thread_id=thread_id, trace_id=trace_id))

    seen = set()  # tool-call ids that have already been emitted
    # With a list of stream modes, astream yields (mode, chunk) tuples.
    async for stream_mode, chunk in agent.astream(
        inputs, config=config, stream_mode=["updates", "messages"]
    ):
        if stream_mode == "updates":
            # Node-level state updates: surface tool calls and tool results.
            for node_name, node_state in chunk.items():
                for m in node_state.get("messages", []):
                    if isinstance(m, AIMessage) and m.tool_calls:
                        for tc in m.tool_calls:
                            if tc["id"] in seen:
                                continue
                            seen.add(tc["id"])
                            yield sse_format(StreamEvent(type="tool_call", name=tc["name"],
                                                         content=tc["args"], thread_id=thread_id))
                    elif isinstance(m, ToolMessage):
                        # Truncate large tool outputs before sending them to the client.
                        yield sse_format(StreamEvent(type="tool_result", name=m.name,
                                                     content=str(m.content)[:4000], thread_id=thread_id))
        elif stream_mode == "messages":
            # Token-level chunks: forward only text produced by the model node.
            msg, meta = chunk
            if isinstance(msg, AIMessageChunk) and meta.get("langgraph_node") == "agent":
                if msg.content:
                    yield sse_format(StreamEvent(type="thinking", content=msg.content,
                                                 thread_id=thread_id))
    yield sse_format(StreamEvent(type="done", thread_id=thread_id))
```

FastAPI entry point (key part of src/agent_app/main.py):
```python
import uuid

from fastapi.responses import StreamingResponse

# app, ChatRequest, create_trace_id, build_callback_handler, build_agent and
# stream_agent_events are defined or imported elsewhere in the project.


@app.post("/chat")
async def chat(req: ChatRequest):
    thread_id = req.thread_id or f"thread-{uuid.uuid4().hex[:12]}"
    trace_id = create_trace_id(seed=thread_id)
    agent = await build_agent()
    handler = build_callback_handler()  # Langfuse

    async def event_stream():
        async for sse in stream_agent_events(
            agent, req.message, thread_id,
            callbacks=[handler] if handler else [],
            trace_id=trace_id,
        ):
            yield sse

    return StreamingResponse(
        event_stream(),
        media_type="text/event-stream",
        headers={
            "Cache-Control": "no-cache, no-transform",
            "X-Accel-Buffering": "no",  # stop nginx from buffering the stream
            "X-Thread-Id": thread_id,
            **({"X-Trace-Id": trace_id} if trace_id else {}),
        },
    )
```

MCP server example (mcp_servers/example_server.py):
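```python
import random

from fastmcp import FastMCP

mcp = FastMCP(name="agent-app-example")


@mcp.tool
def get_weather(city: str) -> dict:
    """Get the weather for a city (demo only; returns simulated data)."""
    # Seed with the city name itself: str hashes are randomized per process,
    # so hash(city) would produce different "weather" after every restart.
    rng = random.Random(city)
    return {
        "city": city,
        "condition": rng.choice(["Sunny", "Rainy", "Cloudy"]),
        "temperature_c": round(rng.uniform(-5, 35), 1),
    }


if __name__ == "__main__":
    mcp.run()  # stdio
```

Startup and streaming test: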
```bash
uv sync
cp .env.example .env   # fill in OPENAI_API_KEY, TAVILY_API_KEY, LANGFUSE_*
uv run uvicorn agent_app.main:app --reload

# In a second terminal (-N disables curl's output buffering)
curl -N -X POST localhost:8000/chat -H 'content-type: application/json' -d '{
  "message": "Search for what is new in LangGraph 1.0, then compute (123*456+789)/3",
  "thread_id": "demo-001",
  "user_id": "alice"
}'
```

The output looks like this:
```text
event: metadata
data: {"type":"metadata","thread_id":"demo-001","trace_id":"7af1..."}

event: tool_call
data: {"type":"tool_call","name":"web_search","content":{"query":"LangGraph 1.0"}}

event: tool_result
data: {"type":"tool_result","name":"web_search","content":"[...]"}

event: tool_call
data: {"type":"tool_call","name":"calculator","content":{"expression":"(123*456+789)/3"}}

event: thinking
data: {"type":"thinking","content":"Based on the search..."}

event: done
data: {"type":"done"}
```
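These frames follow the Server-Sent Events wire format: an `event:` line, a `data:` line, and a blank line that terminates the frame. `StreamEvent` and `sse_format` are used in `streaming.py` but never shown. A minimal standard-library sketch of how they could produce this output follows; the field set is inferred from the sample frames, and the real project may well use a Pydantic model in `models.py` instead:

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class StreamEvent:
    """One event sent to the client; unset fields are omitted from the payload."""
    type: str
    name: Optional[str] = None
    content: Any = None
    thread_id: Optional[str] = None
    trace_id: Optional[str] = None


def sse_format(event: StreamEvent) -> str:
    """Serialize one event as a single SSE frame."""
    payload = {k: v for k, v in asdict(event).items() if v is not None}
    data = json.dumps(payload, ensure_ascii=False, separators=(",", ":"))
    # Without indent, json.dumps escapes newlines inside strings, so the
    # payload always fits on one data: line. The blank line ends the frame.
    return f"event: {event.type}\ndata: {data}\n\n"


frame = sse_format(
    StreamEvent(type="tool_call", name="calculator",
                content={"expression": "(123*456+789)/3"})
)
print(frame, end="")
```

On the client side, anything that splits the stream on blank lines and parses the `data:` line as JSON can consume these frames.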
Conclusion: Key Migration Takeaways for Python LLM Engineers
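Mindset shift: dynamic typing combined with type annotations gives you TS-style static guarantees (enforced by the type checker, not at runtime) without giving up Python's brevity; async/await is almost identical to JavaScript's, but you need to get used to function "coloring", where async functions can only be awaited from other async code; the "Pythonic" philosophy uses language features such as list comprehensions, generators, and context managers to buy code density: less is more.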
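Two of these points are easy to show in a few lines. In the sketch below (all names are illustrative), calling an `async def` function from synchronous code only creates a coroutine object and runs nothing until an event loop drives it, which is where the "coloring" shows up in practice. The rest shows a comprehension, a generator expression, and a context manager doing work that would otherwise take explicit loops and try/finally blocks:

```python
import asyncio
import inspect
from contextlib import contextmanager


async def fetch_length(text: str) -> int:
    await asyncio.sleep(0)  # stands in for real I/O
    return len(text)


# Calling an async function from sync code runs nothing yet; it only creates
# a coroutine object that must be awaited or handed to an event loop.
coro = fetch_length("hello")
is_coroutine = inspect.iscoroutine(coro)
length = asyncio.run(coro)

# List comprehension and generator expression: one line each.
squares = [n * n for n in range(5)]
total = sum(n * n for n in range(5))  # evaluated lazily, no intermediate list


@contextmanager
def tagged(log: list, tag: str):
    """Context manager: setup before the block, cleanup always after it."""
    log.append(f"enter {tag}")
    try:
        yield
    finally:
        log.append(f"exit {tag}")


log: list = []
with tagged(log, "job"):
    log.append("work")
```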
Toolchain at a glance: uv replaces pyenv + pip + venv + Poetry + pipx; Ruff replaces Black + isort + Flake8; Pylance/mypy take over the type-checking part of ESLint; pytest fills the role of JUnit/Jest; httpx fills the role of axios; Pydantic is Python's Zod.
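Key decisions for the LLM stack: use FastAPI + Pydantic v2 + httpx + async SSE for the service layer; LangGraph 1.0's create_react_agent for agent orchestration; LlamaIndex or LangChain LCEL + Chroma/Qdrant for RAG; Langfuse or LangSmith as the default for observability; MCP first as the tool protocol; DSPy for prompt-optimization scenarios.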
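Trends for the next three years: the free-threaded build in Python 3.13+ will gradually reshape the concurrency model; the middleware architecture of LangChain 1.x will replace the old Memory/Agent patterns; MCP will become the standard protocol for LLM tools, much as LSP is for IDEs; agentic IDEs (Cursor, Claude Code, PyCharm Junie) have already changed day-to-day Python development. Mastering the toolchain above gives you a complete foundation for building LLM applications in 2026.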