How to Implement Tool Calling with Gemma 4 and Python

2026-06-30 06:31:49

GitHub: https://github.com/mmmayo13/gemma_4_tool_calling

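Introduction

This article shows how to build a local, privacy-first tool-calling agent with the Gemma 4 model family and Ollama.

Topics covered:

An overview of the Gemma 4 model family and its capabilities
How tool calling lets language models interact with external functions
How to implement a local tool-calling system with Python and Ollama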

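Introducing the Gemma 4 Family

The open-weights model ecosystem shifted recently with the release of the Gemma 4 model family. Built by Google, the Gemma 4 variants aim to deliver frontier-level capabilities under a permissive Apache 2.0 license, giving machine learning practitioners full control over their infrastructure and data privacy.

The Gemma 4 release ranges from a parameter-dense 31B model and a structurally complex 26B mixture-of-experts (MoE) model down to lightweight, edge-oriented variants. More importantly for AI engineers, the family natively supports agentic workflows. The models are fine-tuned to reliably generate structured JSON output and to issue function calls natively based on system instructions. This turns them from hit-or-miss reasoning engines into practical systems that can execute workflows and talk to external APIs from a local machine.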

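Tool Calling in Language Models

Language models started out as closed-loop conversational systems. Ask one for a live sensor reading or a real-time exchange rate and at best it apologizes; at worst it hallucinates an answer. Tool calling (also known as function calling) is the foundational architectural shift needed to close this gap.

Tool calling acts as the bridge that turns a static model into a dynamic, autonomous agent. When it is enabled, the model evaluates the user prompt against a provided tool registry (supplied as JSON schemas). Rather than guessing an answer from its internal weights alone, the model halts generation, emits a structured request designed specifically to trigger an external function, and waits for the result. Once the host application has executed that request and handed the result back, the model synthesizes the injected real-time context into a grounded final response.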
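
To make the exchange concrete, here is a minimal sketch of the message sequence in one tool-calling round trip. The dictionary shapes follow the role, tool_calls, and name fields used by the orchestrator code later in this article; the weather text and the final reply are made-up illustrative values, and pending_tool_calls is a hypothetical helper, not part of the original code.

```python
# Illustrative values only: the weather string and final reply are invented
# to show the shape of the exchange, not real model or API output.
conversation = [
    # 1. The user asks something the model cannot know from its weights.
    {"role": "user", "content": "What's the weather in Tokyo?"},
    # 2. The model answers with a structured tool request instead of text.
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "get_current_weather",
                          "arguments": {"city": "Tokyo"}}}
        ],
    },
    # 3. The host application runs the function and appends the result.
    {"role": "tool", "name": "get_current_weather",
     "content": "The current weather in Tokyo (Japan) is 21.0°C."},
    # 4. The model turns the injected result into the final answer.
    {"role": "assistant", "content": "It is currently about 21°C in Tokyo."},
]


def pending_tool_calls(message):
    """Return (name, arguments) pairs requested by an assistant message."""
    return [
        (call["function"]["name"], call["function"]["arguments"])
        for call in message.get("tool_calls") or []
    ]
```

A message with no tool_calls key yields an empty list, which is how the host can tell a final answer apart from a tool request.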

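Environment Setup: Ollama and Gemma 4:E2B

To build a truly local, privacy-first tool-calling system, we use Ollama as the local inference runner together with the gemma4:e2b (Edge, 2 billion parameters) model.

The gemma4:e2b model is built for mobile and IoT applications and represents a paradigm shift in what is possible on consumer hardware, activating an effective 2-billion-parameter footprint during inference. This optimization conserves system memory while enabling near-zero-latency execution. Running inference entirely offline eliminates rate limits and API costs while maintaining strict data privacy.

Despite its very small size, Google designed gemma4:e2b to inherit the multimodal properties and native function-calling capabilities of the larger 31B model, making it an ideal foundation for a fast, responsive desktop agent.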

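Code: Setting Up the Agent

The implementation follows a zero-dependency philosophy, using only standard Python libraries such as urllib and json to maximize portability and transparency.

The architectural flow of the application:

Define local Python functions that serve as tools
Define strict JSON schemas that tell the language model exactly what these tools do and which parameters they expect
Pass the user query and the tool registry to the local Ollama API
Capture the model's response, determine whether a tool call was requested, execute the corresponding local code, and feed the answer back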
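
The orchestrator code later in this article relies on a call_ollama helper that this excerpt does not show. The following is one possible sketch of it using only urllib and json. It assumes Ollama is listening on its default local address and uses the /api/chat endpoint; the timeout value and the build_chat_request helper are illustrative choices, not taken from the original repository.

```python
import json
import urllib.request

# Assumption: Ollama's default local address and its chat endpoint.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_request(payload, url=OLLAMA_CHAT_URL):
    """Encode the payload as JSON and wrap it in a POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_ollama(payload, url=OLLAMA_CHAT_URL, timeout=120):
    """Send the chat payload to the local Ollama server and parse the reply."""
    request = build_chat_request(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from the network call keeps the encoding logic easy to inspect without a running server.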

Building a Tool: get_current_weather

The first function, get_current_weather, connects to the open-source Open-Meteo API to fetch real-time weather data for a specific location:

import json
import urllib.parse
import urllib.request


def get_current_weather(city: str, unit: str = "celsius") -> str:
    """Gets the current temperature for a given city using open-meteo API."""
    try:
        # Geocode the city to get latitude and longitude
        geo_url = f"https://geocoding-api.open-meteo.com/v1/search?name={urllib.parse.quote(city)}&count=1"
        geo_req = urllib.request.Request(geo_url, headers={'User-Agent': 'Gemma4ToolCalling/1.0'})
        with urllib.request.urlopen(geo_req) as response:
            geo_data = json.loads(response.read().decode('utf-8'))
            # An unknown city yields no "results" entry (or an empty list)
            if "results" not in geo_data or not geo_data["results"]:
                return f"Could not find coordinates for city: {city}."
            location = geo_data["results"][0]
            lat = location["latitude"]
            lon = location["longitude"]
            country = location.get("country", "")

        # Fetch the weather
        temp_unit = "fahrenheit" if unit.lower() == "fahrenheit" else "celsius"
        weather_url = f"https://api.open-meteo.com/v1/forecast?latitude={lat}&longitude={lon}&current=temperature_2m,wind_speed_10m&temperature_unit={temp_unit}"
        weather_req = urllib.request.Request(weather_url, headers={'User-Agent': 'Gemma4ToolCalling/1.0'})
        with urllib.request.urlopen(weather_req) as response:
            weather_data = json.loads(response.read().decode('utf-8'))
            if "current" in weather_data:
                current = weather_data["current"]
                temp = current["temperature_2m"]
                wind = current["wind_speed_10m"]
                # Unit labels come from the API response itself
                temp_unit_str = weather_data["current_units"]["temperature_2m"]
                wind_unit_str = weather_data["current_units"]["wind_speed_10m"]
                return f"The current weather in {city.title()} ({country}) is {temp}{temp_unit_str} with wind speeds of {wind}{wind_unit_str}."
            else:
                return f"Weather data for {city} is unavailable from the API."
    except Exception as e:
        # Return the error as text so the model can report it to the user
        return f"Error fetching weather for {city}: {e}"

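The function implements a two-stage API resolution pattern: it first geocodes the city name into coordinates, then calls the weather endpoint with them.

The corresponding JSON schema, which describes the tool to the model: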

{
  "type": "function",
  "function": {
    "name": "get_current_weather",
    "description": "Gets the current temperature for a given city.",
    "parameters": {
      "type": "object",
      "properties": {
        "city": {
          "type": "string",
          "description": "The city name, e.g. Tokyo"
        },
        "unit": {
          "type": "string",
          "enum": ["celsius", "fahrenheit"]
        }
      },
      "required": ["city"]
    }
  }
}
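
Because the schema states which arguments are required and which values are allowed, the host application can check a model-generated call against it before running any code. The sketch below is one hedged way to do that; validate_arguments is a hypothetical helper that is not part of the original project, and it covers only the required and enum keywords used by the schema above.

```python
WEATHER_SCHEMA = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Gets the current temperature for a given city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string",
                         "description": "The city name, e.g. Tokyo"},
                "unit": {"type": "string",
                         "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}


def validate_arguments(schema, arguments):
    """Return a list of problems found in the arguments (empty if none)."""
    params = schema["function"]["parameters"]
    properties = params.get("properties", {})
    errors = []
    # Every required parameter must be present.
    for name in params.get("required", []):
        if name not in arguments:
            errors.append(f"missing required argument: {name}")
    # Every supplied argument must be declared and respect any enum.
    for name, value in arguments.items():
        if name not in properties:
            errors.append(f"unexpected argument: {name}")
            continue
        allowed = properties[name].get("enum")
        if allowed is not None and value not in allowed:
            errors.append(f"invalid value for {name}: {value!r}")
    return errors
```

Returning the problems as text lets the host pass them back to the model as a tool result, so it can retry with corrected arguments.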

Tool Calling Under the Hood

The core logic of the main loop orchestrator:

# Excerpt: user_query, available_tools, TOOL_FUNCTIONS and call_ollama
# are defined elsewhere in the script.

# Initial payload to the model
messages = [{"role": "user", "content": user_query}]
payload = {
    "model": "gemma4:e2b",
    "messages": messages,
    "tools": available_tools,
    "stream": False
}

response_data = call_ollama(payload)
message = response_data.get("message", {})

# Check if the model decided to call tools
if "tool_calls" in message and message["tool_calls"]:
    # Keep the assistant's tool request in the conversation history
    messages.append(message)

    for tool_call in message["tool_calls"]:
        function_name = tool_call["function"]["name"]
        arguments = tool_call["function"]["arguments"]

        if function_name in TOOL_FUNCTIONS:
            func = TOOL_FUNCTIONS[function_name]
            result = func(**arguments)
            messages.append({
                "role": "tool",
                "content": str(result),
                "name": function_name
            })

    # Send the tool results back to the model to get the final answer
    payload["messages"] = messages
    final_response_data = call_ollama(payload)
    print(final_response_data.get("message", {}).get("content", ""))
else:
    # No tool was requested, so the first reply is already the answer
    print(message.get("content", ""))

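Key point: after the tool results are appended to the conversation history under the tool role, a second API call is required so that the model can generate its final reply from the real data.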
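
The orchestrator excerpt calls func(**arguments) directly and silently skips tool names it does not recognize. A slightly more defensive variant of the dispatch step might look like the sketch below. run_tool_calls is a hypothetical helper, not the author's code; it reports unknown tools and failing calls back to the model as tool messages instead of raising or dropping them.

```python
import json


def run_tool_calls(message, tool_functions):
    """Execute each requested tool and return the resulting tool messages."""
    tool_messages = []
    for tool_call in message.get("tool_calls") or []:
        function = tool_call.get("function", {})
        name = function.get("name", "")
        arguments = function.get("arguments") or {}
        # Assumption: some servers send arguments as a JSON string.
        if isinstance(arguments, str):
            try:
                arguments = json.loads(arguments)
            except json.JSONDecodeError:
                arguments = None

        if name not in tool_functions:
            content = f"Error: unknown tool '{name}'."
        elif not isinstance(arguments, dict):
            content = f"Error: arguments for '{name}' are not a JSON object."
        else:
            try:
                content = str(tool_functions[name](**arguments))
            except Exception as exc:
                # Surface the failure to the model rather than crashing.
                content = f"Error running {name}: {exc}"

        tool_messages.append({"role": "tool", "content": content, "name": name})
    return tool_messages
```

With this helper, the loop body reduces to messages.extend(run_tool_calls(message, TOOL_FUNCTIONS)) before the second API call.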

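Extending the Toolset

Beyond the weather tool, three more real-time tools are integrated:

get_current_news: uses the NewsAPI endpoint to retrieve global headlines matching a query keyword
get_current_time: handles time zone logic through TimeAPI.io and returns a human-readable date-time string
convert_currency: relies on ExchangeRate-API to support real-time conversion between fiat currencies

Each tool is configured through the JSON schema registry, with no external orchestration framework required.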
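
As the registry grows, writing each schema by hand becomes repetitive. One possible shortcut, sketched below, derives a schema entry from a function's signature using the standard inspect module. This is not how the original project builds its registry; make_tool_schema is a hypothetical helper, and the convert_currency signature shown is an assumed placeholder, since the article does not list that tool's parameters.

```python
import inspect

# Mapping from Python annotations to JSON schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def make_tool_schema(func, param_descriptions=None):
    """Build a tool schema entry from a function's signature and docstring."""
    param_descriptions = param_descriptions or {}
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unrecognized types fall back to "string".
        prop = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if name in param_descriptions:
            prop["description"] = param_descriptions[name]
        properties[name] = prop
        # Parameters without a default value are required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


# Hypothetical signature, used only to demonstrate the helper.
def convert_currency(amount: float, from_currency: str, to_currency: str = "USD") -> str:
    """Converts an amount between two fiat currencies."""
    raise NotImplementedError("placeholder body for illustration")


currency_schema = make_tool_schema(
    convert_currency,
    {"amount": "The amount to convert", "from_currency": "ISO code, e.g. EUR"},
)
```

The resulting entry can be appended to the tools list sent to the model, alongside a name-to-function mapping for dispatch.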

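Test Results

The author tested several scenarios:

Single-tool query: get_current_weather → success
Single-tool query: convert_currency → success
Stacked tool calls: four questions asked at once, covering time, exchange rates, weather, and news → all succeeded, answered in parallel by four different tools

The author ran hundreds of queries against this setup over a weekend, and the model's reasoning never failed, regardless of how the questions were phrased.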

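Conclusion

The emergence of native tool-calling behavior in open-weights models is one of the most practical recent developments in local AI. With the release of Gemma 4, we can operate securely offline and build complex systems free of cloud and API constraints. Even low-power consumer devices can run autonomously, with capabilities previously limited to cloud-grade hardware.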

This article was written by Matthew Mayo and originally published on Machine Learning Mastery. This translation is provided for study and reference only.
