POST /v1/chat/completions

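Chat Completions API

Creates a chat completion response. Supports streaming output, multi-turn conversations, Function Calling, and vision understanding.

Request Endpoint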

POST https://nexusflow.hk/v1/chat/completions

Authorization: Bearer <API_KEY>
Content-Type: application/json

Request Parameters
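
For callers who are not using an SDK, the endpoint and the two headers above can be assembled by hand. The following is a minimal sketch using only the Python standard library; `build_request` is a hypothetical helper name, the key is the placeholder used elsewhere on this page, and the request is built but deliberately not sent.

```python
import json
import urllib.request

API_URL = "https://nexusflow.hk/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "sk-air-your-key",  # placeholder key, replace with a real one
    {
        "model": "claude-sonnet-4-6",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
# urllib.request.urlopen(req) would perform the actual call.
```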

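Parameter	Type	Required	Description
`model`	string	Yes	Model ID, for example claude-sonnet-4-6 or qwen3.5-plus. See the model list.
`messages`	array	Yes	Array of conversation messages. Each message contains a role and a content field.
`messages[].role`	string	Yes	Message role: system / user / assistant / tool
`messages[].content`	string | array	Yes	Message content. Either a string or a content array (used for image input).
`stream`	boolean	No	Whether to enable streaming output. When enabled, the response is returned incrementally as server-sent events (SSE). Default: `false`
`temperature`	number	No	Sampling temperature, range [0, 2). Higher values produce more random output. Default: `1.0`
`top_p`	number	No	Nucleus sampling probability threshold, range (0, 1]. Set either this or temperature, not both. Default: `1.0`
`max_tokens`	integer	No	Maximum number of tokens to generate. The upper limit varies by model.
`stop`	string | array	No	Stop sequences. Generation stops when one is encountered. At most 4.
`presence_penalty`	number	No	Presence penalty, range [-2.0, 2.0]. Positive values reduce repeated topics. Default: `0`
`frequency_penalty`	number	No	Frequency penalty, range [-2.0, 2.0]. Positive values reduce repeated words. Default: `0`
`tools`	array	No	List of available tools/functions, used for Function Calling.
`tool_choice`	string | object	No	Tool-calling strategy: "none", "auto", or a specific function. Default: `"auto"`
`seed`	integer	No	Random seed (experimental).
`user`	string	No	End-user ID, used for monitoring and abuse detection.
`n`	integer	No	Number of completions to generate, range 1-4. Default: `1`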
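
The documented ranges can be checked on the client before a request is sent. The sketch below is illustrative only: `validate_params` is a hypothetical helper, not part of the API or the SDK, and it encodes just the constraints listed in the table above. The server may enforce more rules, or enforce these differently.

```python
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def validate_params(params: dict) -> list:
    """Return a list of problems found in a request body (empty if none)."""
    errors = []
    for field in ("model", "messages"):
        if field not in params:
            errors.append(f"missing required field: {field}")
    for i, message in enumerate(params.get("messages", [])):
        if message.get("role") not in ALLOWED_ROLES:
            errors.append(f"messages[{i}].role is not a documented role")

    temperature = params.get("temperature")
    if temperature is not None and not (0 <= temperature < 2):
        errors.append("temperature must be in [0, 2)")
    top_p = params.get("top_p")
    if top_p is not None and not (0 < top_p <= 1):
        errors.append("top_p must be in (0, 1]")
    for name in ("presence_penalty", "frequency_penalty"):
        value = params.get(name)
        if value is not None and not (-2.0 <= value <= 2.0):
            errors.append(f"{name} must be in [-2.0, 2.0]")

    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        errors.append("stop accepts at most 4 sequences")
    n = params.get("n")
    if n is not None and not (1 <= n <= 4):
        errors.append("n must be in 1-4")
    return errors


ok = validate_params({
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hi"}],
    "temperature": 0.7,
})
bad = validate_params({
    "model": "claude-sonnet-4-6",
    "messages": [],
    "temperature": 2.0,  # upper bound is exclusive
    "n": 5,
})
```

Here `ok` is an empty list, while `bad` reports the out-of-range `temperature` and `n`.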

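Code Example

from openai import OpenAI

# Point the OpenAI SDK at the NexusFlow endpoint.
client = OpenAI(
    api_key="sk-air-your-key",  # replace with your API key
    base_url="https://nexusflow.hk/v1",
)

# Send a system prompt plus a single user turn.
response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is machine learning?"}
    ],
    temperature=0.7,
    max_tokens=1000
)

# Print the text of the first (and here, only) choice.
print(response.choices[0].message.content)

Response Structure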
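
The `stream` parameter changes how the reply arrives: instead of one response object, the SDK yields a sequence of chunks. The sketch below shows one way to accumulate them. It assumes the chunks follow the OpenAI SDK shape (`chunk.choices[0].delta.content` and `finish_reason`), which this page does not spell out. `collect_stream` and `stream_answer` are hypothetical helper names, and the demo uses stand-in chunk objects so it runs without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Concatenate streamed text deltas and return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for chunk in chunks:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        choice = chunk.choices[0]
        if choice.delta.content:
            parts.append(choice.delta.content)
        if choice.finish_reason is not None:
            finish_reason = choice.finish_reason
    return "".join(parts), finish_reason


def stream_answer(client, prompt):
    """Call the API with stream=True; `client` is an openai.OpenAI instance."""
    stream = client.chat.completions.create(
        model="claude-sonnet-4-6",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return collect_stream(stream)


def fake_chunk(text, finish_reason=None):
    """Stand-in for an SDK chunk, used only for this offline demo."""
    delta = SimpleNamespace(content=text)
    choice = SimpleNamespace(delta=delta, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice])


demo = [fake_chunk("Machine "), fake_chunk("learning"), fake_chunk(None, "stop")]
text, reason = collect_stream(demo)
```

With a real client, `stream_answer(client, "What is machine learning?")` would return the full text once the final chunk has arrived.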

Field	Type	Description
`id`	string	Unique identifier for this request
`object`	string	Always "chat.completion"
`created`	integer	Creation time as a Unix timestamp
`model`	string	Name of the model actually used
`choices`	array	Array of generated results
`choices[].index`	integer	Index of the result
`choices[].message`	object	The generated message object
`choices[].message.role`	string	Always "assistant"
`choices[].message.content`	string | null	Generated content (may be null when a tool is called)
`choices[].message.tool_calls`	array	Tool call requests
`choices[].finish_reason`	string	Stop reason: stop / length / tool_calls
`usage`	object	Token usage statistics
`usage.prompt_tokens`	integer	Number of input tokens
`usage.completion_tokens`	integer	Number of output tokens
`usage.total_tokens`	integer	Total number of tokens
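
Because `content` may be null and `finish_reason` takes three values, client code usually branches on the stop reason. The following sketch works on the raw JSON form of a choice. The inner layout of each `tool_calls` entry (a `function` object with `name` and a JSON-encoded `arguments` string) is an assumption based on the OpenAI-compatible format and is not documented on this page. `handle_choice` and `get_weather` are hypothetical names.

```python
import json


def handle_choice(choice: dict) -> dict:
    """Normalize one entry of `choices` into a text result or a list of tool calls."""
    message = choice["message"]
    reason = choice["finish_reason"]
    if reason == "tool_calls":
        calls = []
        for call in message.get("tool_calls") or []:
            fn = call["function"]  # assumed OpenAI-compatible layout
            calls.append({"name": fn["name"], "arguments": json.loads(fn["arguments"])})
        return {"kind": "tool_calls", "calls": calls}
    return {
        "kind": "text",
        "text": message.get("content") or "",  # content may be null
        "truncated": reason == "length",       # the max_tokens limit was reached
    }


text_result = handle_choice({
    "index": 0,
    "message": {"role": "assistant", "content": "Machine learning is..."},
    "finish_reason": "length",
})
tool_result = handle_choice({
    "index": 0,
    "message": {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"city\": \"Hong Kong\"}"},
        }],
    },
    "finish_reason": "tool_calls",
})
```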

Response Example

{
  "id": "chatcmpl-abc123xyz789",
  "object": "chat.completion",
  "created": 1709123456,
  "model": "claude-sonnet-4-6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Machine learning is a branch of artificial intelligence..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 256,
    "total_tokens": 284
  }
}

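Notes

The max_tokens upper limit differs by model; refer to the model list.
Set only one of temperature and top_p.
When streaming, only the finish_reason on the final chunk indicates completion.
Image understanding is supported only on Claude and Qwen-VL series models.
For Function Calling, Claude or Qwen series models are recommended.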
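
For image input, `messages[].content` is passed as an array rather than a string. This page does not document the element schema, so the sketch below assumes the OpenAI-compatible content-part format (`type: "text"` and `type: "image_url"` with a data URL). `image_message` is a hypothetical helper, and the bytes used here are placeholders rather than a real image.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a user message that pairs a text prompt with an inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # Assumed OpenAI-compatible image part; check the provider's docs.
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


# Placeholder bytes; in practice read them from a file opened in binary mode.
msg = image_message("What is in this picture?", b"\x89PNG placeholder")
```

The resulting `msg` can be placed in the `messages` list of a request to a vision-capable model.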
