Error Handling and Rate Limits

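Once your AI application moves from development to production, error handling and rate limits become problems you have to face. The API returns a range of error codes, requests can be throttled, and the network can drop. This article shows you how to build robust error handling so that your application runs reliably in production.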

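What You Will Learn

The Claude API error code system
How rate limits work and how to respond to them
How to implement retries with exponential backoff
Best practices for production-grade error handling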

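API Error Codes

The Claude API uses standard HTTP status codes:

400 Bad Request: the request parameters are invalid

# Common causes: malformed messages, misspelled model name
# Fix: check the request parameters

401 Unauthorized: the API key is invalid

403 Forbidden: the API key lacks permission to access the resource

429 Too Many Requests: a rate limit was exceeded

500 Internal Server Error: an error on Anthropic's servers

529 Overloaded: the service is overloaded; retry later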
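
One way to act on these codes is to sort them into transient errors (worth retrying) and errors the caller has to fix. The sketch below is only an illustration of that split; `RETRYABLE_STATUS` and `is_retryable` are hypothetical names, not part of the SDK.

```python
# Hypothetical helper: decide whether an HTTP status from the API is worth retrying.
RETRYABLE_STATUS = {429, 500, 529}  # rate limited, server error, overloaded


def is_retryable(status_code: int) -> bool:
    """Return True for transient errors, False for errors the caller must fix."""
    return status_code in RETRYABLE_STATUS


for code in (400, 401, 403, 429, 500, 529):
    action = "retry with backoff" if is_retryable(code) else "fix the request or credentials"
    print(code, "->", action)
```

Retrying a 400, 401, or 403 does not help, because the same request will fail the same way until the parameters or the key change.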

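How Rate Limits Work

Anthropic sets rate limits according to your usage tier:

Limit dimensions:

RPM (Requests Per Minute): number of requests per minute
TPM (Tokens Per Minute): number of tokens per minute
TPD (Tokens Per Day): total number of tokens per day

Response headers: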

anthropic-ratelimit-requests-limit: 60
anthropic-ratelimit-requests-remaining: 55
anthropic-ratelimit-requests-reset: 2026-03-04T12:00:30Z
anthropic-ratelimit-tokens-limit: 100000
anthropic-ratelimit-tokens-remaining: 95000
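
These headers can drive a simple client-side throttle. The sketch below assumes the headers have already been collected into a plain dict (in the Python SDK, `client.messages.with_raw_response.create(...)` is one way to reach them); `seconds_until_reset` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def seconds_until_reset(headers: dict, now: datetime) -> float:
    """Seconds to wait before the request quota resets (0 if quota remains)."""
    remaining = int(headers.get("anthropic-ratelimit-requests-remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_raw = headers["anthropic-ratelimit-requests-reset"]
    # Swap the trailing "Z" for an explicit offset so older Python versions can parse it.
    reset_at = datetime.fromisoformat(reset_raw.replace("Z", "+00:00"))
    return max(0.0, (reset_at - now).total_seconds())


headers = {
    "anthropic-ratelimit-requests-remaining": "0",
    "anthropic-ratelimit-requests-reset": "2026-03-04T12:00:30Z",
}
now = datetime(2026, 3, 4, 12, 0, 0, tzinfo=timezone.utc)
print(seconds_until_reset(headers, now))  # 30.0
```

Passing `now` in explicitly keeps the function easy to test; in real code it would be `datetime.now(timezone.utc)`.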

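Retrying with Exponential Backoff

When you hit a 429 or 529 error, do not retry immediately. Use exponential backoff instead: double the wait after each failed attempt and add random jitter so that many clients do not retry in lockstep: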

import random
import time

import anthropic

# The SDK also retries on its own (max_retries defaults to 2);
# pass max_retries=0 here if you do not want the two retry layers to stack.
client = anthropic.Anthropic()

def call_with_retry(messages, max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=messages,
            )
        except anthropic.APIStatusError as e:
            # Retry only rate limits (429) and server-side errors (5xx, including 529).
            # Checking status_code avoids depending on which exception class 529 maps to.
            retryable = e.status_code == 429 or e.status_code >= 500
            if not retryable or attempt == max_retries - 1:
                raise
            # Exponential backoff plus random jitter
            wait = (2 ** attempt) + random.random()
            print(f"HTTP {e.status_code}, retrying in {wait:.1f}s...")
            time.sleep(wait)
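
The loop above lets the delay grow as `2 ** attempt` with no upper bound. A common refinement is to cap the delay and to prefer a server-provided wait time when one is available. The sketch below isolates that calculation; `backoff_delay`, its parameters, and the 30-second cap are illustrative assumptions, and the `rng` argument exists only to make the function deterministic in tests.

```python
import random
from typing import Callable, Optional


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
    retry_after: Optional[float] = None,
    rng: Callable[[], float] = random.random,
) -> float:
    """Return the number of seconds to sleep before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # If the server said how long to wait, use that instead of guessing.
        return retry_after
    # Exponential growth, capped, plus up to one second of jitter.
    return min(cap, base * (2 ** attempt)) + rng()


# Schedule without jitter, to make the cap visible:
print([backoff_delay(a, rng=lambda: 0.0) for a in range(7)])
# [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Keeping the delay calculation separate from the retry loop makes it possible to unit-test the schedule without sleeping or calling the API.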

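Built-in SDK Retries

Both the Python and TypeScript SDKs have automatic retries built in:

import anthropic

# By default the Python SDK retries connection errors and
# 408, 409, 429, and 5xx responses (including 529) up to 2 times
client = anthropic.Anthropic(
    max_retries=3,  # override the retry count
    timeout=60.0,   # request timeout in seconds
)

// TypeScript SDK
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  maxRetries: 3,   // override the retry count
  timeout: 60000,  // request timeout in milliseconds
});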

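Production-Grade Error Handling

import logging

import anthropic

logger = logging.getLogger(__name__)
client = anthropic.Anthropic()

def safe_call(messages, system=None):
    # Only send `system` when one was provided; the SDK signature does not accept None.
    extra = {"system": system} if system is not None else {}
    try:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            messages=messages,
            **extra,
        )
        # Join all text blocks; the first content block is not guaranteed to be text.
        text = "".join(b.text for b in response.content if b.type == "text")
        return {
            "success": True,
            "text": text,
            "usage": {
                "input": response.usage.input_tokens,
                "output": response.usage.output_tokens,
            }
        }
    # Specific exceptions come first; they are all subclasses of anthropic.APIError.
    except anthropic.AuthenticationError:
        logger.error("Invalid API key")
        return {"success": False, "error": "Authentication failed; check your API key"}
    except anthropic.RateLimitError as e:
        logger.warning("Rate limited: %s", e)
        return {"success": False, "error": "Too many requests; please retry later"}
    except anthropic.BadRequestError as e:
        logger.error("Invalid request parameters: %s", e)
        return {"success": False, "error": "Invalid request parameters"}
    except anthropic.APIError as e:
        # Covers remaining API failures, including connection errors and 5xx responses.
        logger.error("API error: %s", e)
        return {"success": False, "error": "Service temporarily unavailable; please retry later"}
    except Exception:
        # logger.exception records the traceback automatically.
        logger.exception("Unexpected error")
        return {"success": False, "error": "An unexpected error occurred"}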

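Monitoring and Alerting

In production, monitor:

Error rate: how often 429 and 500 errors occur
Latency: P50/P95/P99 of API response time
Token consumption: daily, weekly, and monthly token usage trends
Cost: actual spend versus budget

Tip: Write the usage data from every request to a log or database, and review cost trends regularly.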
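
As a rough illustration of the metrics listed above, the sketch below keeps request records in memory and derives an error rate and latency percentiles from them. `RequestStats` and its methods are hypothetical names; a real deployment would more likely export these numbers to a metrics system than hold them in a Python list.

```python
import statistics


class RequestStats:
    """Hypothetical in-memory collector for per-request metrics."""

    def __init__(self):
        self.latencies = []
        self.status_codes = []
        self.input_tokens = 0
        self.output_tokens = 0

    def record(self, status_code, latency_s, input_tokens=0, output_tokens=0):
        self.status_codes.append(status_code)
        self.latencies.append(latency_s)
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    def error_rate(self):
        """Share of requests that ended in a 429 or a 5xx status."""
        if not self.status_codes:
            return 0.0
        errors = sum(1 for c in self.status_codes if c == 429 or c >= 500)
        return errors / len(self.status_codes)

    def percentile(self, p):
        """Latency percentile for integer p in 1..99; needs at least two samples."""
        cuts = statistics.quantiles(self.latencies, n=100, method="inclusive")
        return cuts[p - 1]


stats = RequestStats()
codes = [200, 200, 200, 429, 200, 200, 200, 529, 200, 200]
for latency, code in enumerate(codes, start=1):
    stats.record(code, float(latency), input_tokens=100, output_tokens=50)

print("error rate:", stats.error_rate())
print("P50:", stats.percentile(50), "P95:", stats.percentile(95))
print("tokens:", stats.input_tokens, stats.output_tokens)
```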

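Raising Your Usage Tier

If you hit rate limits often, you can move to a higher usage tier, with higher limits, by adding more credit to your account. Check your current tier and limits at console.anthropic.com.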

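Hands-On Exercises

Tip: Make your application more robust.

Add complete error handling to your API calls (see safe_call above)
Implement retry logic with exponential backoff
Add logging that tracks token usage and latency for every request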
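
For the logging exercise, one possible starting point is a small wrapper that times any call and logs the token usage found on the response. This is a sketch under assumptions: `timed_call` is a hypothetical helper, and `fake_create` stands in for `client.messages.create` so the example runs without an API key.

```python
import logging
import time
from types import SimpleNamespace

logger = logging.getLogger("api_calls")


def timed_call(fn, *args, **kwargs):
    """Call fn, log its latency and token usage, and return (response, latency_s)."""
    start = time.perf_counter()
    response = fn(*args, **kwargs)
    latency = time.perf_counter() - start
    usage = getattr(response, "usage", None)
    logger.info(
        "latency=%.3fs input_tokens=%s output_tokens=%s",
        latency,
        getattr(usage, "input_tokens", None),
        getattr(usage, "output_tokens", None),
    )
    return response, latency


def fake_create(**kwargs):
    # Stand-in for a real API call; mimics only the response's `usage` attribute.
    return SimpleNamespace(usage=SimpleNamespace(input_tokens=12, output_tokens=34))


logging.basicConfig(level=logging.INFO)
response, latency = timed_call(fake_create, messages=[])
```

In a real application the same wrapper could be applied to `safe_call` or `call_with_retry`, so that the logged latency includes any time spent waiting between retries.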

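Key Takeaways

Note: Summary of this article

Key error codes: 429 (rate limited), 500 (server error), 529 (overloaded)
Rate limits are measured along three dimensions: RPM, TPM, and TPD
Exponential backoff with random jitter is the standard retry strategy
The SDKs retry automatically, but production code still needs its own error handling
Monitoring error rate, latency, and cost is a necessary part of running in production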