Sunday, April 5

By the end of this tutorial, you’ll have a working Claude MCP server that exposes custom tools, and you’ll understand how the protocol routes calls and which patterns actually hold up when traffic hits. Setting up a Claude MCP server is more approachable than the spec makes it look, but there are enough sharp edges to burn you if you skip the architecture thinking.

MCP (Model Context Protocol) is Anthropic’s open standard for giving Claude structured access to external tools and data sources. Instead of bolting ad-hoc function-calling JSON onto every API call, MCP defines a client-server handshake so Claude knows what tools are available, what arguments they expect, and how to interpret results. It’s similar in spirit to the LSP (Language Server Protocol) — one protocol, many implementations.
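Under the hood, that handshake is ordinary JSON-RPC 2.0. A client’s opening initialize request looks roughly like this (field values are illustrative, and the protocolVersion string changes across spec releases):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": { "name": "example-client", "version": "0.1.0" }
  }
}
```

The server replies with its own capabilities, and only then do tool listing and tool calls flow over the same connection.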

This matters practically: if you’ve already read our breakdown of Claude tool use vs function calling, you know that raw function calling gets messy at scale. MCP gives you a cleaner abstraction with proper capability negotiation.

  1. Install dependencies — set up Python environment with the MCP SDK and Anthropic client
  2. Define your tools — register tool schemas with input validation
  3. Implement tool handlers — write the actual execution logic
  4. Wire up the MCP server — create the server, attach transport, run it
  5. Connect Claude to your server — configure the client and send your first tool-using request
  6. Add production hardening — error handling, timeouts, and structured responses

Step 1: Install Dependencies

You need Python 3.10+ and three packages. The official MCP Python SDK (mcp) handles all the protocol machinery; anthropic is the client SDK; httpx is the async HTTP client your tool handlers will use.

pip install mcp anthropic httpx

Create a virtual environment first — MCP is evolving fast and you don’t want transitive dependency conflicts biting you. Pin versions in production:

pip install "mcp==1.0.0" "anthropic==0.34.0" "httpx==0.27.0"

The MCP SDK ships a CLI utility (mcp; depending on the SDK version you may need to install the mcp[cli] extra) that you can use to test servers directly without writing client code. Useful for quick iteration.

Step 2: Define Your Tools

Tools in MCP are declared as JSON Schema objects. The Python SDK offers two styles: a high-level FastMCP API that generates schemas automatically from Python type hints, and the lower-level Server API used below, where you write the schema explicitly and keep full control over it.

Here’s a realistic example: a server that wraps a weather API and a simple database lookup.

from mcp.server import Server
from mcp.server.models import InitializationOptions
from mcp.types import Tool, TextContent
import mcp.types as types

# Create the server instance
app = Server("weather-tools")

@app.list_tools()
async def list_tools() -> list[Tool]:
    """Tell Claude what tools this server exposes."""
    return [
        Tool(
            name="get_weather",
            description="Fetch current weather for a city. Returns temperature, conditions, and humidity.",
            inputSchema={
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. 'London' or 'New York'"
                    },
                    "units": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "default": "celsius"
                    }
                },
                "required": ["city"]
            }
        ),
        Tool(
            name="lookup_customer",
            description="Look up a customer record by email address.",
            inputSchema={
                "type": "object",
                "properties": {
                    "email": {
                        "type": "string",
                        "format": "email"
                    }
                },
                "required": ["email"]
            }
        )
    ]

Write descriptions as if you’re explaining tools to a new junior engineer. Vague descriptions like “gets data” produce bad tool selection. Be specific about what the tool returns, not just what it does.

Step 3: Implement Tool Handlers

The handler receives tool name and arguments, dispatches to your logic, and returns structured content. Keep handlers thin — delegate real work to service classes you can test independently.

import httpx
import json

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    """Route tool calls to their implementations."""
    
    if name == "get_weather":
        return await handle_weather(arguments)
    elif name == "lookup_customer":
        return await handle_customer_lookup(arguments)
    else:
        raise ValueError(f"Unknown tool: {name}")

async def handle_weather(args: dict) -> list[types.TextContent]:
    city = args["city"]
    units = args.get("units", "celsius")
    
    # Replace with your actual weather API
    async with httpx.AsyncClient(timeout=10.0) as client:
        response = await client.get(
            "https://api.openweathermap.org/data/2.5/weather",
            params={
                "q": city,
                "appid": "YOUR_API_KEY",
                "units": "metric" if units == "celsius" else "imperial"
            }
        )
        response.raise_for_status()
        data = response.json()
    
    result = {
        "city": city,
        "temperature": data["main"]["temp"],
        "conditions": data["weather"][0]["description"],
        "humidity": data["main"]["humidity"],
        "units": units
    }
    
    # Return as structured text — Claude parses this well
    return [TextContent(type="text", text=json.dumps(result, indent=2))]

async def handle_customer_lookup(args: dict) -> list[types.TextContent]:
    email = args["email"]
    
    # Simulated DB lookup — replace with your actual data layer
    mock_db = {
        "alice@example.com": {"name": "Alice Chen", "plan": "pro", "mrr": 299},
        "bob@example.com": {"name": "Bob Smith", "plan": "starter", "mrr": 49}
    }
    
    customer = mock_db.get(email)
    if not customer:
        return [TextContent(type="text", text=json.dumps({"error": "Customer not found"}))]
    
    return [TextContent(type="text", text=json.dumps(customer, indent=2))]

Return JSON strings consistently. Claude handles structured data well, and if you want to enforce a schema on the output side, the patterns in our guide on consistent JSON from LLMs apply equally here.
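If you want to enforce that output shape programmatically, a thin validator in front of the return statement is enough. This stdlib-only sketch checks the weather payload from the handler above; the key set mirrors that result dict, so adjust it per tool:

```python
import json

# Expected keys and types for the get_weather result payload
WEATHER_RESULT_KEYS = {
    "city": str,
    "temperature": (int, float),
    "conditions": str,
    "humidity": (int, float),
    "units": str,
}

def validate_weather_result(payload: str) -> dict:
    """Parse a handler's JSON output and check it matches the expected
    shape before it ever reaches Claude. Raises ValueError on mismatch."""
    data = json.loads(payload)
    for key, expected in WEATHER_RESULT_KEYS.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected):
            raise ValueError(f"wrong type for {key}")
    return data
```

Catching a malformed result server-side beats letting Claude hallucinate around a half-broken payload.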

Step 4: Wire Up the MCP Server

MCP supports two transport layers: stdio (for local/subprocess communication) and SSE (for HTTP-based remote servers). For production, SSE is what you want. For development and desktop client integrations like Claude Desktop, stdio is simpler.

import asyncio
from mcp.server import NotificationOptions
from mcp.server.stdio import stdio_server

# stdio transport — works with Claude Desktop and local clients
async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(
            read_stream,
            write_stream,
            InitializationOptions(
                server_name="weather-tools",
                server_version="1.0.0",
                capabilities=app.get_capabilities(
                    # Pass a NotificationOptions instance, not None:
                    # get_capabilities reads attributes off this object
                    notification_options=NotificationOptions(),
                    experimental_capabilities={}
                )
            )
        )

if __name__ == "__main__":
    asyncio.run(main())

Save this as server.py. Run it with python server.py — it’ll wait silently on stdin, which is correct for stdio transport.
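To use the server from Claude Desktop, register it in the app’s claude_desktop_config.json. The paths below are placeholders; use the absolute path to your script and the Python binary from your virtual environment:

```json
{
  "mcpServers": {
    "weather-tools": {
      "command": "/absolute/path/to/.venv/bin/python",
      "args": ["/absolute/path/to/server.py"]
    }
  }
}
```

Restart Claude Desktop after editing the file; it launches the server as a subprocess and talks to it over stdio.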

SSE Transport for HTTP Deployment

from mcp.server import NotificationOptions
from mcp.server.sse import SseServerTransport
from starlette.applications import Starlette
from starlette.routing import Route, Mount
import uvicorn

# SSE transport — use this for remote/production deployments
sse = SseServerTransport("/messages")

async def handle_sse(request):
    async with sse.connect_sse(
        request.scope, request.receive, request._send
    ) as streams:
        await app.run(
            streams[0], streams[1],
            InitializationOptions(
                server_name="weather-tools",
                server_version="1.0.0",
                capabilities=app.get_capabilities(
                    # An instance, not None: see the stdio example above
                    notification_options=NotificationOptions(),
                    experimental_capabilities={}
                )
            )
        )

starlette_app = Starlette(routes=[
    Route("/sse", endpoint=handle_sse),
    Mount("/messages", app=sse.handle_post_message)
])

if __name__ == "__main__":
    uvicorn.run(starlette_app, host="0.0.0.0", port=8000)

This pairs naturally with our Starlette + Claude skills backend guide if you’re building a full API layer around it.

Step 5: Connect Claude to Your Server

Now the client side. You’re using the Anthropic Python SDK with the MCP client to connect, discover tools, and make requests.

import asyncio
import anthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_with_mcp():
    # Point this at your server.py
    server_params = StdioServerParameters(
        command="python",
        args=["server.py"],
        env=None  # Inherit current environment
    )
    
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize — this triggers the capability handshake
            await session.initialize()
            
            # List available tools and convert to Anthropic format
            tools_result = await session.list_tools()
            tools = [
                {
                    "name": t.name,
                    "description": t.description,
                    "input_schema": t.inputSchema
                }
                for t in tools_result.tools
            ]
            
            # Use the async client so API calls don't block the event loop
            client = anthropic.AsyncAnthropic()
            messages = [
                {"role": "user", "content": "What's the weather in Tokyo right now?"}
            ]
            
            # Agentic loop — keep going until Claude stops calling tools
            while True:
                response = await client.messages.create(
                    model="claude-opus-4-5",
                    max_tokens=1024,
                    tools=tools,
                    messages=messages
                )
                
                if response.stop_reason == "end_turn":
                    # Claude is done
                    print(response.content[0].text)
                    break
                
                if response.stop_reason == "tool_use":
                    # Execute each tool call Claude requested
                    tool_results = []
                    for block in response.content:
                        if block.type == "tool_use":
                            result = await session.call_tool(
                                block.name,
                                arguments=block.input
                            )
                            tool_results.append({
                                "type": "tool_result",
                                "tool_use_id": block.id,
                                "content": result.content[0].text
                            })
                    
                    # Feed results back into the conversation
                    messages.append({"role": "assistant", "content": response.content})
                    messages.append({"role": "user", "content": tool_results})

asyncio.run(run_with_mcp())

The agentic loop is the key piece most tutorials skip. Claude can call multiple tools in sequence — you need to keep feeding results back until it signals end_turn. For complex multi-step reasoning, also look at the Claude subagent orchestration patterns we’ve covered separately.

Step 6: Add Production Hardening

Raw MCP servers break in three predictable ways: network timeouts, downstream API failures, and malformed arguments. Handle all three explicitly. The hardened handler below replaces the Step 3 dispatcher; register only one call_tool handler per server.

import asyncio
from mcp.types import TextContent
import json

@app.call_tool()
async def call_tool_hardened(name: str, arguments: dict) -> list[TextContent]:
    try:
        # Enforce a global timeout on all tool calls
        result = await asyncio.wait_for(
            _dispatch_tool(name, arguments),
            timeout=15.0  # seconds — adjust per tool
        )
        return result
    
    except asyncio.TimeoutError:
        error = {"error": "tool_timeout", "tool": name, "message": "Tool exceeded 15s limit"}
        return [TextContent(type="text", text=json.dumps(error))]
    
    except ValueError as e:
        # Bad arguments — Claude needs to know to retry differently
        error = {"error": "invalid_arguments", "tool": name, "message": str(e)}
        return [TextContent(type="text", text=json.dumps(error))]
    
    except Exception as e:
        # Don't let tool failures crash the server
        error = {"error": "tool_error", "tool": name, "message": f"Unexpected error: {type(e).__name__}"}
        return [TextContent(type="text", text=json.dumps(error))]

async def _dispatch_tool(name: str, arguments: dict):
    if name == "get_weather":
        return await handle_weather(arguments)
    elif name == "lookup_customer":
        return await handle_customer_lookup(arguments)
    else:
        raise ValueError(f"Unknown tool: {name}")

Always return errors as valid tool results, never as exceptions that bubble up to the transport layer. If a tool throws an unhandled exception, the MCP session can terminate. Return structured error JSON and let Claude decide whether to retry or report failure to the user.

This mirrors what we’ve written about error handling in AI workflows — the same principle applies: fail gracefully, give the orchestrator enough information to recover.

Common Errors When Setting Up Claude MCP Servers

Error 1: “Capability negotiation failed” on initialization

This usually means a version mismatch between your MCP SDK and what the client expects. Check that both client and server are using the same protocol version; pip show mcp (or importlib.metadata.version("mcp") in code) tells you which SDK release is installed on each side. Fix: pin both sides to the same MCP SDK version in your requirements files.

Error 2: Tool calls hang indefinitely

Your tool handler is blocking the event loop. Any synchronous I/O (DB calls, HTTP requests using requests instead of httpx, file reads) inside an async handler will stall the whole server. Fix: use asyncio.to_thread() for unavoidable synchronous operations, or switch to async libraries. Every HTTP call in handlers should use httpx.AsyncClient.
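As a sketch of the asyncio.to_thread fix, here a hypothetical synchronous lookup (function names are illustrative) is off-loaded to a worker thread so the event loop keeps serving other requests:

```python
import asyncio
import time

def legacy_db_lookup(email: str) -> dict:
    """Stand-in for a synchronous call you can't easily make async."""
    time.sleep(0.1)  # simulates blocking I/O
    return {"email": email, "plan": "pro"}

async def handle_lookup(email: str) -> dict:
    # Off-load the blocking call to a thread; awaiting it here means
    # the event loop (and every other MCP request) keeps running
    return await asyncio.to_thread(legacy_db_lookup, email)

async def main():
    # Both lookups run concurrently instead of serializing on the loop
    results = await asyncio.gather(
        handle_lookup("alice@example.com"),
        handle_lookup("bob@example.com"),
    )
    print(results)

asyncio.run(main())
```

Calling legacy_db_lookup directly inside an async handler would freeze the whole server for the duration of the sleep; wrapped in to_thread, it only occupies a worker thread.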

Error 3: Claude ignores available tools and responds from memory

The tool descriptions aren’t specific enough, or Claude’s context has gotten too long and it’s optimizing away tool calls. Fix: sharpen descriptions to include when the tool should be used, not just what it does. Example — instead of “Gets weather data”, write “Use this when the user asks about current weather conditions, forecasts, or temperature in any specific location.” Also check you’re passing the full tools array on every turn of the agentic loop — it’s easy to accidentally pass it only on the first call.

What to Build Next

The natural extension is a multi-server setup: separate MCP servers for different domains (one for CRM data, one for internal APIs, one for web search), with a routing layer that presents all tools to Claude through a single session. This scales well because each server is independently deployable and testable.

Start by extracting your tool handlers into a proper service layer with dependency injection, then wrap each domain in its own MCP server process. Use SSE transport and put them behind a lightweight reverse proxy. At that point, you’re running something that looks a lot like a microservices architecture — with Claude as the orchestrator deciding which service to call based on natural language intent.
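The routing layer can start as something very simple: prefix each tool name with its server, and strip the prefix when dispatching the call back. A minimal sketch (the double-underscore separator is just a convention chosen here; any separator that can’t appear in tool names works):

```python
def merge_tool_catalogs(catalogs: dict[str, list[dict]]) -> list[dict]:
    """Merge tools from several MCP servers into one list for Claude,
    prefixing names so the router can dispatch calls to the right
    server (e.g. 'crm__lookup_customer')."""
    merged = []
    for server, tools in catalogs.items():
        for tool in tools:
            merged.append({**tool, "name": f"{server}__{tool['name']}"})
    return merged

def route_call(qualified_name: str) -> tuple[str, str]:
    """Split a prefixed tool name back into (server, tool)."""
    server, _, tool = qualified_name.partition("__")
    return server, tool
```

Claude sees one flat tool list; your router uses the prefix to pick which MCP session receives the call_tool request.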

At current Claude Opus pricing (~$15/MTok input), a tool-heavy session with 5-6 round trips adds up quickly, since tool schemas and the growing conversation history are resent on every turn: several thousand input tokens, on the order of $0.05–0.15 per conversation. If you’re running high volume, swap the agentic loop to use Claude Haiku for tool routing and Sonnet only for final synthesis — that can cut costs by 10-15x on the tool selection turns.

Frequently Asked Questions

What is the difference between MCP and regular Claude function calling?

Regular function calling embeds tool schemas directly in each API request — you manage schema distribution yourself. MCP adds a protocol layer where Claude connects to a server, negotiates capabilities at session start, and then calls tools over a persistent connection. MCP is better for production because tools are versioned and discoverable independently of your application code, and the same MCP server can serve multiple clients.

Can I run a Claude MCP server in a serverless environment like AWS Lambda?

Not easily with stdio transport, which requires a long-lived process. SSE transport works better for serverless but you’ll hit cold start latency on the capability handshake. The practical approach is to run MCP servers as always-on containers (ECS, Cloud Run, Fly.io) and keep serverless for stateless processing tasks. Check our AI infrastructure for solo founders guide for cost-effective container hosting options.

How do I authenticate requests to a remote MCP server?

MCP doesn’t define authentication at the protocol level — it’s transport-level. For SSE servers, add Bearer token validation as Starlette middleware before the SSE handler processes requests. Generate per-client tokens, validate them on every connection, and rotate on a schedule. Never expose an MCP server endpoint publicly without auth — it’s effectively an open API execution endpoint.
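As a sketch, the token check can live in a tiny framework-free ASGI wrapper placed in front of the Starlette app from Step 4. The token set here is a placeholder; load real tokens from your secrets store:

```python
# Tokens are an assumption for the sketch; never hardcode them in production
VALID_TOKENS = {"example-token-rotate-me"}

class BearerAuthMiddleware:
    """Reject any HTTP request without a valid Bearer token before it
    reaches the SSE handler. Non-HTTP scopes (lifespan) pass through."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            headers = dict(scope.get("headers", []))
            auth = headers.get(b"authorization", b"").decode()
            token = auth.removeprefix("Bearer ").strip()
            if token not in VALID_TOKENS:
                # Short-circuit with a 401 instead of invoking the app
                await send({"type": "http.response.start", "status": 401,
                            "headers": [(b"content-type", b"text/plain")]})
                await send({"type": "http.response.body", "body": b"unauthorized"})
                return
        await self.app(scope, receive, send)
```

Since any ASGI callable works with uvicorn, wrapping is one line: starlette_app = BearerAuthMiddleware(starlette_app) before the uvicorn.run call.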

How many tools can I register on a single MCP server?

Technically unlimited, but practically you’ll hit Claude’s context window limits when all tool schemas are serialized into the system prompt. Beyond ~30 tools, Claude’s tool selection accuracy degrades noticeably. If you need more, split into domain-specific servers or use a tool registry pattern that dynamically presents only the relevant tools based on conversation context.
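One way to sketch that registry pattern: score each tool’s description against the user’s message and present only the top matches. This is a naive keyword-overlap toy; a production registry would use embeddings or explicit domain tags:

```python
def select_tools(tools: list[dict], query: str, limit: int = 10) -> list[dict]:
    """Rank tools by keyword overlap between the user query and each
    tool's description, returning only the top `limit` schemas to
    keep the tool list small on every turn."""
    words = set(query.lower().split())

    def score(tool: dict) -> int:
        return len(words & set(tool["description"].lower().split()))

    return sorted(tools, key=score, reverse=True)[:limit]
```

The important property is that the filtering happens per turn, so the schemas Claude sees track the conversation instead of growing with your tool inventory.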

Does MCP work with Claude models other than Opus?

Yes — MCP is transport-level and model-agnostic on the client side. You can point the same agentic loop at Claude Haiku, Sonnet, or Opus by changing the model parameter in your messages.create call. Haiku handles tool selection reasonably well for straightforward tasks at roughly 1/25th the price of Opus, which makes it worth testing for high-volume automations.

Put this into practice

Try the MCP Deployment Orchestrator agent — ready to use, no setup required.

Browse Agents →

Editorial note: API pricing, model capabilities, and tool features change frequently — always verify current details on the vendor’s website before building in production. Code examples are tested at time of writing; pin your dependency versions to avoid breaking changes. Some links in this article may be affiliate links — we may earn a commission if you sign up, at no extra cost to you.

