Tool Calling with LangChain - LangChain Blog

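TLDR: We are introducing a new tool_calls attribute on AIMessage. More and more LLM providers are exposing APIs for reliable tool calling. The goal of the new attribute is to provide a standard interface for interacting with tool invocations. It is fully backwards compatible and is supported on all models that have native tool-calling support. To access these latest features, you will need to upgrade your langchain_core and partner package versions.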

YouTube Walkthrough

Python

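List of chat models, showing the status of their tool calling capabilities
Tool calling, which explains the new tool calling interface
Tool calling agent, which shows how to create an agent that uses the standardized tool calling interface
LangGraph notebook, which shows how to create a LangGraph agent that uses the standardized tool calling interface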

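Introduction

Large language models (LLMs) can interact with external data sources through tool calling. Tool calling is a powerful technique that lets developers build sophisticated applications that use LLMs to access, interact with, and manipulate external resources such as databases, files, and APIs.

Providers have been introducing native tool calling capabilities into their models. In practice, this means that when an LLM auto-completes a prompt, it can return a list of tool invocations in addition to plain text. OpenAI was the first to release this, roughly a year ago, under the name "function calling", which quickly evolved into "tool calling" in November. Since then, other model providers have followed: Gemini (December), Mistral (February), Fireworks (March), Together (March), Groq (April), Cohere (April), and Anthropic (April).

All of these providers expose slightly different interfaces (in particular, OpenAI, Anthropic, and Gemini, the three highest-performing models, are mutually incompatible). We heard from the community that a standardized tool calling interface would make it easier to switch between these providers, and we are excited to release one today.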

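The standard interface consists of:

ChatModel.bind_tools(): a method for attaching tool definitions to model calls.
AIMessage.tool_calls: an attribute on the AIMessage returned from the model for easily accessing the tool calls the model decided to make.
create_tool_calling_agent(): an agent constructor that works with any model that implements bind_tools and returns tool_calls.

Let's take a look at each of these components.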

`ChatModel.bind_tools(...)`

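To let a model use tools, we need to tell it which tools are available. We do this by passing the model a list of tool definitions, including a schema for each tool's arguments. The exact format of a tool definition depends on the model provider: OpenAI expects a dictionary with "name", "description", and "parameters" keys, while Anthropic expects "name", "description", and "input_schema".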
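
To make the difference between the two formats concrete, here is a minimal sketch that maps an OpenAI-style definition onto the Anthropic-style key names mentioned above. The helper name openai_to_anthropic_tool is hypothetical and exists only for illustration; it is not part of LangChain or of either provider's SDK.

```python
from typing import Any, Dict


def openai_to_anthropic_tool(tool_def: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: same content, with "parameters" renamed to "input_schema"."""
    return {
        "name": tool_def["name"],
        "description": tool_def["description"],
        "input_schema": tool_def["parameters"],
    }


# An OpenAI-style definition with "name", "description", and "parameters" keys.
add_openai = {
    "name": "add",
    "description": "Add 'x' and 'y'.",
    "parameters": {
        "type": "object",
        "properties": {
            "x": {"type": "number", "description": "First number to add"},
            "y": {"type": "number", "description": "Second number to add"},
        },
        "required": ["x", "y"],
    },
}

add_anthropic = openai_to_anthropic_tool(add_openai)
print(sorted(add_anthropic))
```

The argument schema itself is unchanged; only the key that carries it differs, which is the kind of provider-specific detail a standard interface can hide.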

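ChatModel.bind_tools provides a standard interface, implemented by all tool-calling models, for specifying which tools are available to the model. You can pass in not only a raw tool definition (a dict) but also objects from which a tool definition can be derived: namely Pydantic classes, LangChain tools, and arbitrary functions. This makes it easy to create generic tool definitions that you can use with any tool-calling model.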

from langchain_anthropic import ChatAnthropic
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import tool

# ✅ Pydantic class
class multiply(BaseModel):
    """Return product of 'x' and 'y'."""
    x: float = Field(..., description="First factor")
    y: float = Field(..., description="Second factor")
    
# ✅ LangChain tool
@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the 'y'."""
    return x**y
    
# ✅ Function

def subtract(x: float, y: float) -> float:
    """Subtract 'x' from 'y'."""
    return y-x
    
# ✅ OpenAI-format dict
# Could also pass in a JSON schema with "title" and "description" 
add = {
  "name": "add",
  "description": "Add 'x' and 'y'.",
  "parameters": {
    "type": "object",
    "properties": {
      "x": {"type": "number", "description": "First number to add"},
      "y": {"type": "number", "description": "Second number to add"}
    },
    "required": ["x", "y"]
  }
}

llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0)

# Whenever we invoke `llm_with_tools`, all four of these tool definitions
# are passed to the model.
llm_with_tools = llm.bind_tools([multiply, exponentiate, add, subtract])

If we wanted to use a different tool-calling model, our code would look very similar:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
llm_with_tools = llm.bind_tools([multiply, exponentiate, add, subtract])
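
As a rough illustration of how a tool definition can be derived from a plain function such as subtract, the sketch below builds an OpenAI-format dict from a function's signature and docstring using only the standard library. The helper function_to_tool_def and its type mapping are hypothetical and far simpler than whatever LangChain does internally.

```python
import inspect
from typing import Any, Callable, Dict

# Assumed, deliberately small mapping from Python annotations to JSON Schema types.
_JSON_TYPES = {float: "number", int: "integer", str: "string", bool: "boolean"}


def function_to_tool_def(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Hypothetical helper: derive an OpenAI-format tool definition from a function."""
    properties: Dict[str, Dict[str, str]] = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Fall back to "string" for annotations the mapping does not know.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def subtract(x: float, y: float) -> float:
    """Subtract 'x' from 'y'."""
    return y - x


subtract_def = function_to_tool_def(subtract)
print(subtract_def["name"], subtract_def["parameters"]["required"])
```

The function name becomes the tool name and the docstring becomes the description, which is why descriptive docstrings matter for tools defined this way.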

So what does calling llm_with_tools look like? That is where AIMessage.tool_calls comes in.

`AIMessage.tool_calls`

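Previously, when using a tool-calling model, any tool invocations the model returned were found in either AIMessage.additional_kwargs or AIMessage.content, depending on the model provider's API, and followed a provider-specific format. In other words, you needed custom logic to extract tool invocations from the outputs of different models. Now, AIMessage.tool_calls provides a standardized interface for getting a model's tool invocations. So after calling a model with bound tools, you get an output of the form: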

llm_with_tools.invoke([
	("system", "You're a helpful assistant"), 
	("human", "what's 5 raised to the 2.743"),
])

# 👀 Notice the tool_calls attribute 👀

# -> AIMessage(
#        content=...,
#        additional_kwargs={...},
#        tool_calls=[{'name': 'exponentiate', 'args': {'y': 2.743, 'x': 5.0}, 'id': '54c166b2-f81a-481a-9289-eea68fc84e4f'}],
#        response_metadata={...},
#        id='...'
#    )

where the AIMessage has a tool_calls: List[ToolCall] attribute that is populated if there are any tool calls and follows a standard interface for tool calls:

class ToolCall(TypedDict):
    name: str
    args: Dict[str, Any]
    id: Optional[str]

That is, whether you are calling Anthropic, OpenAI, Gemini, or another provider, any tool invocations will be in AIMessage.tool_calls as ToolCalls.
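
To show the kind of custom extraction logic a standard attribute replaces, here is a sketch that normalizes two provider-style payloads into the name/args/id shape. The payload layouts follow the general shape of OpenAI-style tool calls (JSON-encoded arguments nested under "function") and Anthropic-style "tool_use" content blocks. The helper names and the ids are made up for illustration, and this is not LangChain's actual parsing code.

```python
import json
from typing import Any, Dict, List


def from_openai_style(additional_kwargs: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: tool calls live in additional_kwargs, args are a JSON string."""
    calls = []
    for raw in additional_kwargs.get("tool_calls", []):
        fn = raw["function"]
        calls.append(
            {"name": fn["name"], "args": json.loads(fn["arguments"]), "id": raw.get("id")}
        )
    return calls


def from_anthropic_style(content: List[Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: tool calls are "tool_use" blocks mixed into the content list."""
    return [
        {"name": block["name"], "args": block["input"], "id": block.get("id")}
        for block in content
        if isinstance(block, dict) and block.get("type") == "tool_use"
    ]


# Illustrative payloads; the ids are invented.
openai_kwargs = {
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "exponentiate", "arguments": '{"x": 5.0, "y": 2.743}'},
        }
    ]
}
anthropic_content = [
    {"type": "text", "text": "Let me calculate that."},
    {"type": "tool_use", "id": "toolu_1", "name": "exponentiate", "input": {"x": 5.0, "y": 2.743}},
]

openai_calls = from_openai_style(openai_kwargs)
anthropic_calls = from_anthropic_style(anthropic_content)
print(openai_calls[0]["args"] == anthropic_calls[0]["args"])
```

Once both payloads are in the same shape, downstream code no longer needs to know which provider produced the message.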

We have also added a few other attributes for handling streamed tool call chunks and invalid tool calls. Read more about them in the tool calling docs.
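
To give a feel for why streamed chunks and invalid calls need their own handling, here is a small sketch. It assumes each chunk carries a partial JSON string for the arguments plus an index saying which call it belongs to; the field names and the merge_chunks helper are assumptions made for this example and are not taken from the LangChain docs.

```python
import json
from typing import Any, Dict, List, Tuple


def merge_chunks(chunks: List[Dict[str, Any]]) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Hypothetical helper: join partial chunks per index, then parse the arguments."""
    merged: Dict[int, Dict[str, Any]] = {}
    for chunk in chunks:
        slot = merged.setdefault(chunk["index"], {"name": "", "args": "", "id": None})
        slot["name"] += chunk.get("name") or ""
        slot["args"] += chunk.get("args") or ""
        slot["id"] = slot["id"] or chunk.get("id")

    valid, invalid = [], []
    for index in sorted(merged):
        slot = merged[index]
        try:
            args = json.loads(slot["args"])
        except json.JSONDecodeError as exc:
            # Keep the raw string so the caller can inspect or retry it.
            invalid.append({**slot, "error": str(exc)})
        else:
            valid.append({"name": slot["name"], "args": args, "id": slot["id"]})
    return valid, invalid


# Illustrative stream: call 0 arrives in three pieces, call 1 is cut off mid-JSON.
stream = [
    {"index": 0, "name": "exponentiate", "args": "", "id": "call_1"},
    {"index": 0, "name": None, "args": '{"x": 5.0, ', "id": None},
    {"index": 0, "name": None, "args": '"y": 2.743}', "id": None},
    {"index": 1, "name": "add", "args": '{"x": 3', "id": "call_2"},
]

valid, invalid = merge_chunks(stream)
print(len(valid), len(invalid))
```

Arguments that cannot be parsed are kept separately with an error message instead of being dropped, so the application can decide how to recover.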

`create_tool_calling_agent()`

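One of the most powerful and obvious uses for LLM tool-calling abilities is building agents. LangChain already has a create_openai_tools_agent() constructor that makes it easy to build an agent with tool-calling models that adhere to the OpenAI tool-calling API, but this does not work for models like Anthropic and Gemini. Thanks to the new bind_tools() and tool_calls interfaces, we have added create_tool_calling_agent(), which works with any tool-calling model.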

from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor

@tool
def multiply(x: float, y: float) -> float:
    """Multiply 'x' times 'y'."""
    return x * y

@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the 'y'."""
    return x**y

@tool
def add(x: float, y: float) -> float:
    """Add 'x' and 'y'."""
    return x + y

prompt = ChatPromptTemplate.from_messages([
    ("system", "you're a helpful assistant"), 
    ("human", "{input}"), 
    ("placeholder", "{agent_scratchpad}"),
])

tools = [multiply, exponentiate, add]


llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0)


agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", })

We could use VertexAI instead

from langchain_google_vertexai import ChatVertexAI

llm = ChatVertexAI(
	model="gemini-pro", 
	temperature=0, 
	convert_system_message_to_human=True
)
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", })

or OpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0)

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", })

etc.

For full documentation on the new agent, see the agent docs.
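
For readers curious about what a tool-calling agent loop does conceptually, here is a dependency-free sketch: call the model, run any requested tools, feed the results back, and stop when the model answers without requesting a tool. The scripted_model function stands in for a real chat model, and run_agent is a hypothetical simplification, not the AgentExecutor implementation.

```python
from typing import Any, Callable, Dict, List


def multiply(x: float, y: float) -> float:
    return x * y


def add(x: float, y: float) -> float:
    return x + y


TOOLS: Dict[str, Callable[..., Any]] = {"multiply": multiply, "add": add}


def scripted_model(messages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Stand-in for a tool-calling model: request one tool call, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        call = {"name": "add", "args": {"x": 3, "y": 5}, "id": "call_1"}
        return {"role": "ai", "content": "", "tool_calls": [call]}
    return {
        "role": "ai",
        "content": f"The answer is {tool_results[-1]['content']}.",
        "tool_calls": [],
    }


def run_agent(model, tools, user_input: str, max_steps: int = 5) -> str:
    """Hypothetical loop: alternate between model calls and tool execution."""
    messages: List[Dict[str, Any]] = [{"role": "human", "content": user_input}]
    for _ in range(max_steps):
        ai_message = model(messages)
        messages.append(ai_message)
        if not ai_message["tool_calls"]:
            return ai_message["content"]  # No tool requested: this is the final answer.
        for call in ai_message["tool_calls"]:
            result = tools[call["name"]](**call["args"])
            messages.append(
                {"role": "tool", "content": str(result), "tool_call_id": call["id"]}
            )
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent(scripted_model, TOOLS, "what's 3 plus 5?")
print(answer)
```

Because the loop only reads the name, args, and id of each call, it does not care which provider produced the message, which is the property the standardized interface is meant to provide.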

LangGraph

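If you haven't checked out LangGraph yet, you definitely should. It is an extension of LangChain that makes it easy to build arbitrary agentic and multi-agent flows. As you can imagine, the new tool_calls interface also makes building a LangGraph agent or flow simpler. See the LangGraph notebook for a detailed walkthrough of using tool_calls in a LangGraph agent.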

`with_structured_output`

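We recently released the ChatModel.with_structured_output() interface for getting structured outputs from a model, which is very related to tool calling. While the exact implementation varies by model provider, with_structured_output is built on top of tool calling for most models that support it. Under the hood, with_structured_output uses bind_tools to pass the given structured output schema to the model.

So when should you use with_structured_output, and when should you bind tools and read the tool calls directly?

with_structured_output always returns structured output in the schema you specified. This is useful when you want to force the LLM to output information that matches a specific schema, for example in information extraction tasks.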

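bind_tools is more general and can select a specific tool, or no tool, or multiple tools! This is useful when you want to give the LLM more flexibility in how it responds, for example in agent applications where the model needs to choose which tools to call but also respond to the user.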
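
The contrast can be sketched without any model at all. Below, a hand-written message stands in for a model response; parse_structured is a hypothetical helper showing how a structured-output layer could sit on top of tool calls by treating the schema as a single tool and returning its validated arguments. How LangChain actually implements this for each provider is not shown here.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Person:
    """Schema we want the output to match (illustrative)."""
    name: str
    age: int


def parse_structured(ai_message: Dict[str, Any], schema: type) -> Any:
    """Hypothetical helper: turn the first tool call's args into a schema instance."""
    if not ai_message["tool_calls"]:
        raise ValueError("expected a tool call matching the schema")
    return schema(**ai_message["tool_calls"][0]["args"])


# Hand-written stand-in for a model response; the id is invented.
ai_message = {
    "content": "",
    "tool_calls": [{"name": "Person", "args": {"name": "Ada", "age": 36}, "id": "call_1"}],
}

# Structured-output style: the caller gets one object in the requested schema.
person = parse_structured(ai_message, Person)

# bind_tools style: the caller inspects the raw list, which may hold zero, one, or many calls.
requested_tools = [call["name"] for call in ai_message["tool_calls"]]
print(person, requested_tools)
```

The first style suits extraction, where exactly one well-formed object is wanted; the second leaves room for the model to call several tools or none.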

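Conclusion

We expect the trend of introducing native tool calling capabilities into LLMs to continue. We hope the standardized tool calling interface helps LangChain users save time and effort and lets them switch between different LLM providers more easily.

Remember to update your langchain_core and partner package versions to take advantage of the new interfaces!

We would love to hear any feedback you have!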
