Sunday, April 5

By the end of this tutorial, you’ll have a working Python implementation that defines custom tools, sends them to Claude, executes the returned tool calls, and feeds results back — all without a framework sitting between you and the API. Claude tool use with Python is one of the highest-leverage patterns in agent development, and it’s simpler to wire up directly than most tutorials suggest.

  1. Install dependencies — set up the Anthropic SDK and any API clients you’ll call
  2. Define tool schemas — write JSON Schema definitions Claude will understand
  3. Send tools with the first API call — attach tools to the messages request
  4. Parse and execute tool calls — handle tool_use blocks in the response
  5. Return tool results and get the final response — close the loop with tool_result
  6. Chain multiple tool calls — handle agentic loops where Claude calls tools repeatedly

Step 1: Install Dependencies

You need the Anthropic SDK and whatever library backs your tool. For this tutorial we’re building a weather tool (calls a real HTTP endpoint) and a calculator tool (pure Python). Install the minimum:

pip install anthropic httpx python-dotenv

Pin versions in production. The SDK’s tool call interface stabilised in anthropic>=0.25.0 but parameter shapes changed between minor versions before that. If you’re running older code that broke, that’s likely why.
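A pinned install might look like this — the version floors here are illustrative, so check against the versions you actually tested:

```shell
pip install "anthropic>=0.25.0,<1" "httpx>=0.27" "python-dotenv>=1.0"
```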

import os
import json
import httpx
import anthropic
from dotenv import load_dotenv

load_dotenv()
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

Step 2: Define Tool Schemas

Claude expects tools as a list of dicts, each with a name, description, and input_schema following JSON Schema draft-07. The description is not decorative — Claude reads it to decide when to call the tool. Vague descriptions produce wrong tool selection.

tools = [
    {
        "name": "get_weather",
        "description": (
            "Fetches current weather for a given city. "
            "Use this when the user asks about weather, temperature, or conditions "
            "in a specific location. Returns temperature in Celsius and a condition string."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'London' or 'Tokyo'"
                },
                "units": {
                    "type": "string",
                    "enum": ["metric", "imperial"],
                    "description": "Unit system. Defaults to metric."
                }
            },
            "required": ["city"]
        }
    },
    {
        "name": "calculate",
        "description": (
            "Evaluates a mathematical expression and returns the numeric result. "
            "Use for arithmetic, percentages, and unit conversions. "
            "Do NOT use for symbolic algebra."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "A valid Python math expression, e.g. '(42 * 1.2) / 3.5'"
                }
            },
            "required": ["expression"]
        }
    }
]

One thing the documentation underplays: Claude will hallucinate tool inputs if your schema is ambiguous. Adding an enum where possible, and being explicit about units and formats in property descriptions, cuts bad calls significantly. This pairs well with the grounding strategies covered in reducing LLM hallucinations in production.
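Even with a tight schema, it's worth validating inputs on your side before executing. Here's a minimal guard for the weather tool — the function name and behaviour are our own sketch, not part of the SDK:

```python
def normalize_weather_input(inputs: dict) -> dict:
    """Validate and normalize get_weather inputs before execution.

    Mirrors the schema: 'city' is required, 'units' defaults to 'metric'
    and must be one of the enum values.
    """
    if not inputs.get("city"):
        raise ValueError("get_weather called without a city")
    units = inputs.get("units", "metric")
    if units not in ("metric", "imperial"):
        raise ValueError(f"Invalid units: {units!r}")
    return {"city": inputs["city"], "units": units}
```

Rejecting a bad call with a clear error message is also useful to Claude itself — returned as a tool result, it lets the model retry with corrected inputs.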

Step 3: Send Tools with the First API Call

Pass your tools list into client.messages.create(). The model is claude-opus-4-5 here; swap to claude-haiku-4-5 if you’re cost-sensitive — tool selection quality holds up well on Haiku for straightforward schemas, and it’s roughly 20x cheaper per token.

def first_turn(user_message: str) -> anthropic.types.Message:
    return client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        tools=tools,
        messages=[{"role": "user", "content": user_message}]
    )

response = first_turn("What's the weather like in Tokyo right now, and what's 15% of 8500?")
print(response.stop_reason)  # Expect: "tool_use"
print(response.content)      # List of TextBlock and/or ToolUseBlock objects

When Claude wants to call a tool, stop_reason will be "tool_use" and response.content will contain one or more ToolUseBlock objects alongside any text it generated before the call. Don’t assume the content list has a fixed structure — iterate it.

Step 4: Parse and Execute Tool Calls

This is where most tutorials hand-wave. You need to extract every ToolUseBlock, dispatch to the right Python function, and collect results with their corresponding tool_use_id values. That ID is what threads the result back to the right call.

def execute_tool(name: str, inputs: dict) -> str:
    """Route tool calls to their implementations. Returns a string result."""
    if name == "get_weather":
        return fetch_weather(inputs["city"], inputs.get("units", "metric"))
    elif name == "calculate":
        return safe_eval(inputs["expression"])
    else:
        raise ValueError(f"Unknown tool: {name}")


def fetch_weather(city: str, units: str) -> str:
    """Real HTTP call to wttr.in — no API key required."""
    url = f"https://wttr.in/{city}?format=j1"
    try:
        r = httpx.get(url, timeout=5.0)
        r.raise_for_status()
        data = r.json()
        current = data["current_condition"][0]
        temp_key = "temp_C" if units == "metric" else "temp_F"
        desc = current["weatherDesc"][0]["value"]
        temp = current[temp_key]
        return json.dumps({"temperature": temp, "condition": desc, "units": units})
    except Exception as e:
        return json.dumps({"error": str(e)})


def safe_eval(expression: str) -> str:
    """Safely evaluate an arithmetic expression by walking its AST against an operator whitelist."""
    import ast, operator

    # Whitelist of allowed operators
    ops = {
        ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.Mod: operator.mod,
        ast.USub: operator.neg,
    }

    def _eval(node):
        if isinstance(node, ast.Constant):
            return node.value
        elif isinstance(node, ast.BinOp):
            return ops[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp):
            return ops[type(node.op)](_eval(node.operand))
        else:
            raise ValueError("Unsupported expression")

    try:
        tree = ast.parse(expression, mode="eval")
        result = _eval(tree.body)
        return json.dumps({"result": result})
    except Exception as e:
        return json.dumps({"error": str(e)})


def collect_tool_results(response: anthropic.types.Message) -> list[dict]:
    """Extract tool calls, execute them, return tool_result blocks."""
    results = []
    for block in response.content:
        if block.type == "tool_use":
            output = execute_tool(block.name, block.input)
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,   # must match exactly
                "content": output
            })
    return results

Never use Python’s built-in eval() for the calculator — you’ll eventually get a user (or a hallucinating model) that passes something destructive. The AST approach above handles real arithmetic without the risk.

Step 5: Return Tool Results and Get the Final Response

The second API call must include the full conversation history: the original user message, Claude’s assistant turn (including its ToolUseBlocks), and a new user turn containing your tool_result blocks. If you drop any of these, you’ll get a 400 error or garbage output.

def second_turn(
    user_message: str,
    first_response: anthropic.types.Message,
    tool_results: list[dict]
) -> anthropic.types.Message:
    messages = [
        {"role": "user", "content": user_message},
        {"role": "assistant", "content": first_response.content},  # ToolUseBlocks preserved
        {"role": "user", "content": tool_results}                   # tool_result blocks
    ]
    return client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )

tool_results = collect_tool_results(response)
final = second_turn("What's the weather like in Tokyo right now, and what's 15% of 8500?", response, tool_results)
print(next((block.text for block in final.content if hasattr(block, "text")), ""))  # first text block — don't assume index 0

Step 6: Chain Multiple Tool Calls with an Agent Loop

For real agents, Claude will call tools repeatedly until it’s satisfied. Wrap the whole thing in a loop that checks stop_reason. This is the core of any web-browsing Claude agent or multi-step workflow.

def run_agent(user_message: str, max_iterations: int = 10) -> str:
    messages = [{"role": "user", "content": user_message}]

    for _ in range(max_iterations):
        response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=2048,
            tools=tools,
            messages=messages
        )

        # Append Claude's full response to history
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason == "end_turn":
            # Claude is done — extract final text
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text
            return ""

        if response.stop_reason == "tool_use":
            tool_results = collect_tool_results(response)
            messages.append({"role": "user", "content": tool_results})
            continue  # Loop back, Claude will process results

        # Unexpected stop reason
        break

    return "Agent reached max iterations without completing."


result = run_agent("What's the weather in Berlin and Paris? Which city is warmer?")
print(result)

Set max_iterations based on your task complexity. For most tool-augmented Q&A, 3–5 iterations is sufficient. Unlimited loops will burn tokens fast on pathological inputs — this is one of the failure modes worth logging. Pair this with the error handling and fallback logic patterns covered separately to make this production-safe.

On cost: with claude-haiku-4-5, a two-turn tool call exchange (first request + tool result + final response) runs roughly $0.0008–$0.002 depending on payload size. On claude-opus-4-5 expect 15–20x that. For high-volume workloads, Haiku handles tool selection surprisingly well on clean schemas — worth benchmarking before committing to Opus.
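If you want to track this per exchange rather than estimate it, the token counts in response.usage feed a simple calculator. The function below is our own sketch — the per-million-token prices are placeholders you should replace with current numbers from Anthropic's pricing page:

```python
def estimate_exchange_cost(
    input_tokens: int,
    output_tokens: int,
    price_in_per_mtok: float,
    price_out_per_mtok: float,
) -> float:
    """Estimate one API call's cost in dollars from token counts.

    Prices are per million tokens. Pull response.usage.input_tokens and
    response.usage.output_tokens from each API response and sum across
    the turns of an exchange.
    """
    return (input_tokens * price_in_per_mtok
            + output_tokens * price_out_per_mtok) / 1_000_000
```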

Common Errors and How to Fix Them

Error: “messages: roles must alternate between ‘user’ and ‘assistant’”

You’re sending two consecutive user messages or two consecutive assistant messages. The most common cause is forgetting to append Claude’s assistant response before appending tool results. Check that your message list always follows the pattern: user → assistant → user → assistant. The assistant turn must include the raw response.content (which contains ToolUseBlock objects), not just the text.
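A quick sanity check you can drop in before each API call catches this earlier and with a clearer message — a small helper of our own, not part of the SDK:

```python
def check_role_alternation(messages: list[dict]) -> None:
    """Raise early if the message list would trigger the alternation error.

    The API expects user → assistant → user → ..., always starting
    with a user message.
    """
    for i, message in enumerate(messages):
        expected = "user" if i % 2 == 0 else "assistant"
        if message["role"] != expected:
            raise ValueError(
                f"Message {i} has role {message['role']!r}, expected {expected!r}"
            )
```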

Error: Tool result not matching any tool_use_id

You’re either generating a fake tool_use_id or extracting it incorrectly. Always use block.id directly from the ToolUseBlock — don’t construct IDs yourself. IDs look like toolu_01XjA... and are case-sensitive.

Claude ignores tools and answers from memory instead

This usually means your tool description overlaps too much with Claude’s training data. If you write “Get the current time” as your description but Claude already knows what time zones are, it might just answer directly. Add specificity: “Call this tool to get the real-time system clock — do not estimate or guess the time.” You can also set tool_choice={"type": "any"} to force at least one tool call, or tool_choice={"type": "tool", "name": "your_tool"} to force a specific one. Writing strong system prompts that establish agent role and behaviour — as described in this guide to Claude agent system prompts — also helps Claude understand when tools are expected.

What to Build Next

The natural extension is a multi-agent tool router: define a “dispatch” tool that Claude can call to spin up specialised sub-agents (one for web search, one for database queries, one for code execution), then aggregate their results. This pattern scales well for complex workflows without requiring a framework like LangChain. If you want to compare the tradeoffs before going further, the LangChain vs LlamaIndex vs plain Python architecture comparison is worth reading first — in most cases, the plain Python approach you just built is already better for controlled production deployments.
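As a starting point, the dispatch tool's schema could look something like this — the name, agent list, and descriptions are illustrative, following the same conventions as the tools defined in Step 2:

```python
# Hypothetical schema for a multi-agent dispatch tool
dispatch_tool = {
    "name": "dispatch",
    "description": (
        "Delegates a subtask to a specialised sub-agent and returns its result. "
        "Use when the task requires web search, database access, or code execution."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "agent": {
                "type": "string",
                "enum": ["web_search", "database", "code_execution"],
                "description": "Which sub-agent should handle the subtask.",
            },
            "task": {
                "type": "string",
                "description": "A self-contained description of the subtask.",
            },
        },
        "required": ["agent", "task"],
    },
}
```

The enum on agent applies the same lesson from Step 2: constrain the choice space and Claude routes more reliably.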

Bottom line: If you’re a solo founder or small team, start with the loop pattern above and Haiku for cost control. If you’re building something that needs reliable multi-step tool chaining across dozens of concurrent users, add observability tooling early — you’ll want to see exactly which tool calls Claude makes, with what inputs, and how often they fail. The Claude tool use Python pattern above gives you full control to instrument every step without a framework’s abstractions hiding what’s actually happening.

Frequently Asked Questions

How do I force Claude to always use a specific tool?

Pass tool_choice={"type": "tool", "name": "your_tool_name"} in your messages.create() call. This forces Claude to call that exact tool on every turn. Use {"type": "any"} if you want Claude to call at least one tool but don’t care which. Note that forcing tool use can produce low-quality inputs if the user’s message doesn’t naturally warrant a tool call.

Can Claude call multiple tools in a single response?

Yes. When Claude wants to call multiple tools in parallel, it returns multiple ToolUseBlock objects in the same response. Your tool_result message should include one result block per tool call, each with its matching tool_use_id. The loop in Step 6 above handles this automatically since collect_tool_results iterates all blocks.

What’s the difference between tool_use and function calling in OpenAI?

They’re conceptually identical — both let the model request execution of a named function with structured JSON arguments. The main structural difference is that Claude returns tool calls as content blocks within the main response, while OpenAI uses a separate tool_calls field. Claude also requires you to pass the full assistant message (including ToolUseBlocks) back into history, which trips up developers coming from the OpenAI SDK.

How much does a tool call round-trip cost with Claude Haiku?

At current pricing, a typical two-turn exchange (user prompt → tool call → tool result → final answer) with claude-haiku-4-5 and modest payloads costs roughly $0.0008–$0.002. That’s around 500–1,250 tool-augmented interactions per dollar. Opus is 15–20x more expensive for the same exchange, so benchmark Haiku first on your specific tool schemas before assuming you need the bigger model.

Can I use Claude tools without the Anthropic Python SDK?

Yes — the tools feature is part of the REST API. You can POST to https://api.anthropic.com/v1/messages directly with the tools array in the request body. The Python SDK is just a typed wrapper. If you’re in a non-Python environment or want minimal dependencies, hitting the HTTP API directly works fine.
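A minimal sketch of assembling that request by hand — the header names and API version string below are the ones the Messages API documents, but verify them against current docs before shipping:

```python
import json


def build_messages_request(
    api_key: str, tools: list, messages: list
) -> tuple[str, dict, bytes]:
    """Build a raw POST to the Messages API as (url, headers, body).

    Send it with any HTTP client — httpx, requests, or urllib.
    """
    url = "https://api.anthropic.com/v1/messages"
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    body = json.dumps({
        "model": "claude-opus-4-5",
        "max_tokens": 1024,
        "tools": tools,
        "messages": messages,
    }).encode("utf-8")
    return url, headers, body
```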


Editorial note: API pricing, model capabilities, and tool features change frequently — always verify current details on the vendor’s website before building in production. Code examples are tested at time of writing; pin your dependency versions to avoid breaking changes. Some links in this article may be affiliate links — we may earn a commission if you sign up, at no extra cost to you.

