
By the end of this tutorial, you’ll have a working Claude agent that can search Google, fetch and parse web pages, handle pagination, deal with JavaScript-heavy sites, and recover gracefully when browsing fails. Web browsing for Claude agents is one of those capabilities that looks simple in demos and falls apart immediately in production — this guide covers the parts that actually break.

  1. Install dependencies — set up httpx, BeautifulSoup, and the Anthropic SDK
  2. Define browsing tools — register search and fetch tools with Claude’s tool-use API
  3. Build the agentic loop — handle multi-turn tool calls until Claude has enough context
  4. Handle pagination and multi-page results — detect and traverse paginated content
  5. Handle JavaScript-heavy sites — integrate Playwright as a fallback renderer
  6. Add retry and fallback logic — make the agent survive real-world network conditions

Why Web Browsing Agents Are Harder Than They Look

The standard demo goes: Claude calls a search tool, reads a result, done. In production, sites block scrapers, return 403s, serve empty shells to headless clients, paginate across 12 pages, or just time out. Add Claude’s context window limits and you’ve got a system that needs real engineering — not just tool definitions.

I’ve built several of these in production environments, including a competitor monitoring workflow and an SEO audit pipeline. The patterns here come from watching things fail and fixing them. If you want to understand the broader Claude tool use foundation before diving in, Claude Tool Use with Python: Building Custom Skills and API Integrations covers the primitives well.

Step 1: Install Dependencies

# Core dependencies
pip install anthropic httpx beautifulsoup4 lxml

# For JS-heavy sites (Step 5)
pip install playwright
playwright install chromium

Use httpx over requests here — it handles async natively and has better timeout control, which you’ll need when scraping unreliable URLs.

Step 2: Define Browsing Tools

Claude’s tool-use API needs a JSON schema for each tool. We’re defining two: a Google Custom Search wrapper and a URL fetcher. If you don’t have a Google Custom Search API key, SerpAPI works as a drop-in at roughly $0.01 per search — acceptable for most workloads.

import anthropic
import httpx
from bs4 import BeautifulSoup
from typing import Optional
import os

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

TOOLS = [
    {
        "name": "google_search",
        "description": "Search Google and return the top organic results with titles, URLs, and snippets.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query"},
                "num_results": {
                    "type": "integer",
                    "description": "Number of results to return (max 10)",
                    "default": 5
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "fetch_url",
        "description": "Fetch and parse the text content of a URL. Returns cleaned body text.",
        "input_schema": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "The URL to fetch"},
                "extract_links": {
                    "type": "boolean",
                    "description": "Whether to also return links found on the page",
                    "default": False
                }
            },
            "required": ["url"]
        }
    }
]

Step 3: Build the Agentic Loop

The loop is the core of the agent. Claude returns a tool_use stop reason, you execute the tool, feed results back, and repeat until Claude produces a final text response. Most production issues happen here — infinite loops, context overflow, and unhandled tool errors.

def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Dispatch tool calls and return string results."""
    if tool_name == "google_search":
        return google_search(tool_input["query"], tool_input.get("num_results", 5))
    elif tool_name == "fetch_url":
        return fetch_url(tool_input["url"], tool_input.get("extract_links", False))
    return f"Unknown tool: {tool_name}"

def run_browsing_agent(user_query: str, max_iterations: int = 10) -> str:
    """
    Run the web-browsing agent loop.
    max_iterations prevents infinite tool-call loops.
    """
    messages = [{"role": "user", "content": user_query}]
    
    for iteration in range(max_iterations):
        response = client.messages.create(
            model="claude-sonnet-4-5",  # Sonnet handles multi-step tool use well at lower cost
            max_tokens=4096,
            tools=TOOLS,
            messages=messages
        )
        
        # Claude is done — return the final text (joined, in case of multiple blocks)
        if response.stop_reason == "end_turn":
            text_blocks = [b.text for b in response.content if hasattr(b, "text")]
            return "\n".join(text_blocks)
        
        # Claude wants to call tools
        if response.stop_reason == "tool_use":
            # Add Claude's response (including tool_use blocks) to history
            messages.append({"role": "assistant", "content": response.content})
            
            # Execute all tool calls and build tool_result blocks
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result[:8000]  # cap to avoid context overflow
                    })
            
            messages.append({"role": "user", "content": tool_results})
        else:
            # Unexpected stop reason
            break
    
    return "Agent reached iteration limit without completing."

The max_iterations guard is non-negotiable. I’ve seen agents loop 40+ times on ambiguous tasks without it. Ten is conservative but safe for most research tasks.
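Beyond the hard iteration cap, a cheap extra guard (a sketch, not part of the loop above) is to detect when the agent repeats the exact same tool call and short-circuit before it burns iterations:

```python
import json

def make_call_tracker(max_repeats: int = 2):
    """Track (tool, input) pairs; flag when the same call repeats too often."""
    seen: dict = {}

    def is_looping(tool_name: str, tool_input: dict) -> bool:
        # Canonicalize the input so {"q": 1} and {"q": 1} hash identically
        key = tool_name + json.dumps(tool_input, sort_keys=True)
        seen[key] = seen.get(key, 0) + 1
        return seen[key] > max_repeats

    return is_looping

# Inside the loop, before executing a tool call:
#   if is_looping(block.name, block.input):
#       result = "Repeated call blocked; answer with the information you have."
```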

Step 4: Handle Pagination and Multi-Page Results

Most content you actually want to scrape is paginated — documentation, search results, product listings. The trick is teaching Claude to detect pagination signals and continue fetching, rather than stopping at page one.

from urllib.parse import urljoin

def fetch_url(url: str, extract_links: bool = False, max_chars: int = 8000) -> str:
    """
    Fetch a URL and return cleaned text. Detects pagination.
    """
    headers = {
        "User-Agent": "Mozilla/5.0 (compatible; research-bot/1.0)"
    }
    
    try:
        resp = httpx.get(url, headers=headers, timeout=15, follow_redirects=True)
        resp.raise_for_status()
    except httpx.HTTPStatusError as e:
        return f"HTTP {e.response.status_code} fetching {url}"
    except httpx.TimeoutException:
        return f"Timeout fetching {url}"
    except httpx.HTTPError as e:
        return f"Network error fetching {url}: {e}"
    
    soup = BeautifulSoup(resp.text, "lxml")
    
    # Detect pagination before stripping noise elements: "next page"
    # links often live inside <nav>, which gets removed below
    pagination_info = ""
    next_links = soup.find_all("a", string=lambda s: s and any(
        word in s.lower() for word in ["next", "next page", "→", "»", "page 2"]
    ))
    if next_links:
        next_href = next_links[0].get("href", "")
        if next_href and not next_href.startswith("#"):
            # Resolve relative hrefs so Claude can fetch the link directly
            pagination_info = f"\n\n[PAGINATION DETECTED: next page at {urljoin(url, next_href)}]"
    
    # Remove noise elements
    for tag in soup(["script", "style", "nav", "footer", "aside"]):
        tag.decompose()
    
    text = soup.get_text(separator="\n", strip=True)
    
    result = text[:max_chars] + pagination_info
    
    if extract_links:
        links = [a.get("href") for a in soup.find_all("a", href=True)]
        result += "\n\nLINKS ON PAGE:\n" + "\n".join(links[:50])
    
    return result

The pagination hint appended to the returned content is the key piece. Claude will see the [PAGINATION DETECTED: ...] marker and, with a good system prompt, follow the next-page link automatically without you hard-coding that logic.

Step 5: Handle JavaScript-Heavy Sites

About 30-40% of modern sites render nothing useful without JavaScript. Simple httpx fetches return an empty shell. Playwright handles this — it’s heavier (adds ~2s per fetch) but catches what static scrapers miss.

from playwright.sync_api import sync_playwright

def fetch_url_with_js(url: str, max_chars: int = 8000) -> str:
    """
    Fetch a JS-rendered URL using Playwright.
    Falls back to this when static fetch returns < 500 chars.
    """
    try:
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            page = browser.new_page(
                user_agent="Mozilla/5.0 (compatible; research-bot/1.0)"
            )
            # Block images/fonts to speed up rendering
            page.route("**/*.{png,jpg,gif,woff2,woff}", lambda r: r.abort())
            page.goto(url, wait_until="networkidle", timeout=20000)
            
            # Wait for main content to appear
            page.wait_for_timeout(1500)
            
            content = page.inner_text("body")
            browser.close()
            return content[:max_chars]
    except Exception as e:
        return f"Playwright fetch failed: {str(e)}"

def smart_fetch(url: str) -> str:
    """Try static fetch first; fall back to Playwright if content is thin."""
    static_result = fetch_url(url)
    
    # If we got less than 500 chars of real content, try JS rendering
    if len(static_result.strip()) < 500 and "HTTP " not in static_result:
        return fetch_url_with_js(url)
    
    return static_result

Step 6: Add Retry and Fallback Logic

Networks fail. Rate limits hit. Google blocks IP ranges. A production web-browsing agent needs structured retry logic — not just a bare try/except. I cover this in depth in Building LLM Fallback and Retry Logic: Graceful Degradation Patterns for Production, but the browsing-specific version looks like this:

import random
import time
from functools import wraps

def retry_with_backoff(max_retries: int = 3, base_delay: float = 1.0):
    """Decorator: exponential backoff with random jitter."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    result = func(*args, **kwargs)
                    # Treat 5xx and timeout error strings as retryable failures
                    if result.startswith("HTTP 5") or result.startswith("Timeout"):
                        raise ValueError(result)
                    return result
                except Exception as e:
                    if attempt == max_retries - 1:
                        return f"Failed after {max_retries} attempts: {str(e)}"
                    # Exponential backoff plus jitter to avoid synchronized retries
                    delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
                    time.sleep(delay)
            return "Max retries exceeded"
        return wrapper
    return decorator

@retry_with_backoff(max_retries=3)
def google_search(query: str, num_results: int = 5) -> str:
    """
    Google Custom Search API wrapper.
    Replace with SerpAPI if you don't have a CSE key.
    Google CSE gives 100 free queries/day, then $5 per 1000.
    """
    api_key = os.environ["GOOGLE_CSE_API_KEY"]
    cx = os.environ["GOOGLE_CSE_ID"]
    
    resp = httpx.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": api_key, "cx": cx, "q": query, "num": num_results},
        timeout=10
    )
    resp.raise_for_status()
    data = resp.json()
    
    results = []
    for item in data.get("items", []):
        results.append(f"Title: {item['title']}\nURL: {item['link']}\nSnippet: {item['snippet']}\n")
    
    return "\n---\n".join(results) if results else "No results found."

Common Errors

1. Context window overflow mid-loop

Symptom: anthropic.BadRequestError: prompt is too long after 3-4 fetch operations.

Fix: Cap what you return to Claude. The content[:8000] truncation in the loop and max_chars in fetch functions are essential. For deeper content, summarize what you’ve fetched before adding more:

# Before appending a large fetch result, summarize it
if len(result) > 4000:
    summary_response = client.messages.create(
        model="claude-haiku-4-5",  # Use Haiku for cheap summarization
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": f"Summarize this in 300 words, preserving key facts:\n\n{result}"
        }]
    )
    result = summary_response.content[0].text

Using Claude Haiku for intermediate summarization costs roughly $0.0004 per call — negligible versus burning your context window.

2. Claude enters an infinite search loop

Symptom: Agent keeps calling google_search with minor query variations and never produces a final answer.

Fix: Two things. First, enforce max_iterations as shown in Step 3. Second, tighten your system prompt to tell Claude when to stop. Add: “If you’ve fetched 3 or more sources and have sufficient information to answer, stop browsing and respond now.” This also connects to the hallucination patterns discussed in Reducing LLM Hallucinations in Production: Structured Outputs and Verification Patterns — an agent that keeps searching is often trying to resolve uncertainty it should just surface.

3. 403 / bot detection blocks

Symptom: Tool returns HTTP 403 or returns a CAPTCHA page instead of content.

Fix: Rotate user-agent strings, add realistic request headers (Accept, Accept-Language, Referer), and add delays between requests. For persistent blocks, proxy services like Bright Data or Oxylabs run $15-30/month for light usage and solve most anti-scraping measures. As a last resort, cache aggressively — if you’ve fetched a URL successfully in the last 24 hours, return the cached version rather than re-fetching.
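A minimal header-rotation helper might look like this. The user-agent strings are illustrative placeholders, not a vetted pool; swap in current real browser strings:

```python
import random

# Illustrative UA pool; keep these updated with real browser versions
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
]

def browsing_headers(referer: str = "https://www.google.com/") -> dict:
    """Realistic-looking request headers with a rotated user agent."""
    return {
        "User-Agent": random.choice(USER_AGENTS),
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
        "Referer": referer,
    }
```

Pass browsing_headers() in place of the static headers dict in fetch_url, and add a short random sleep between consecutive fetches to the same host.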

Putting It Together: Full System Prompt

The system prompt shapes how well the agent uses the tools. For web browsing specifically, you need explicit instructions about when to search versus fetch, when to paginate, and when to stop.

SYSTEM_PROMPT = """You are a research agent with access to web search and URL fetching tools.

Guidelines:
1. Start with a google_search to find relevant URLs before fetching
2. Fetch the 2-3 most promising URLs — don't fetch everything
3. If a page shows [PAGINATION DETECTED], fetch the next page only if it contains information you still need
4. If a fetch returns less than 200 characters, try fetch_url with a different URL from search results
5. Once you have enough information to answer the question confidently, stop browsing and respond
6. Cite your sources (URL + title) in your final response
7. If you cannot find reliable information after 3 search+fetch cycles, say so explicitly"""

What to Build Next

The natural extension of this agent is persistent memory — storing what it has previously fetched so it doesn’t re-scrape the same URLs across sessions. Combine this browsing layer with the architecture from Building Claude agents with persistent memory across sessions: production architecture guide and you’ve got an agent that builds a genuine knowledge base over time. Concretely: write each fetched URL + extracted text to a vector store, check the store before calling fetch_url, and fall back to live browsing only on cache misses. That pattern also feeds directly into AI-powered competitor monitoring — a real production use case where the browsing agent pays for itself quickly.
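As a sketch of that check-before-fetch pattern, with SQLite standing in for the vector store (fetch_fn is whatever live fetcher you wire in):

```python
import sqlite3
import time

def init_cache(path: str = "browse_cache.db") -> sqlite3.Connection:
    """Create (or open) a persistent URL cache."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, fetched_at REAL, text TEXT)"
    )
    return conn

def fetch_with_memory(conn, url: str, fetch_fn, ttl: float = 86400.0) -> str:
    """Return a cached copy if fetched within ttl; otherwise browse live and store."""
    row = conn.execute(
        "SELECT fetched_at, text FROM pages WHERE url = ?", (url,)
    ).fetchone()
    if row and time.time() - row[0] < ttl:
        return row[1]  # cache hit: skip live browsing entirely
    text = fetch_fn(url)
    conn.execute(
        "INSERT OR REPLACE INTO pages VALUES (?, ?, ?)", (url, time.time(), text)
    )
    conn.commit()
    return text
```

A vector store adds semantic lookup on top of this exact-URL cache, but the miss-then-browse control flow stays the same.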

Bottom Line

For solo founders building research or monitoring tools: Start with Claude Sonnet + Google CSE + static httpx fetching. Add Playwright only after you hit JS-rendering failures in the wild — which you will, but maybe not immediately. Total cost for a 100-query/day research agent runs around $1-3/day depending on page sizes.

For teams running this in production: Add the summarization step before context overflow becomes a problem, implement proper caching with Redis, and log every tool call with its result length and status code. You’ll thank yourself when debugging the inevitable edge cases.

The fundamentals of web browsing with Claude agents are straightforward once you understand the failure modes. Most production problems reduce to three things: context overflow, bot detection, and Claude not knowing when to stop. The code above handles all three.

Frequently Asked Questions

How do I give Claude the ability to browse the web?

You define custom tools in Claude’s tool-use API — specifically a search tool (Google Custom Search or SerpAPI) and a URL fetch tool (httpx + BeautifulSoup). Claude decides when to call them and you execute the actual requests in your application code. There’s no native browsing built into the Claude API; you supply the web access layer.

What is the cheapest way to add Google search to a Claude agent?

Google Custom Search Engine (CSE) gives you 100 free queries per day. Beyond that, it’s $5 per 1000 queries. SerpAPI starts at $50/month for 5000 searches. For most small agents, Google CSE is sufficient — set up a programmable search engine scoped to “the entire web” in the Google Cloud Console.

How do I handle JavaScript-rendered pages in a Claude web browsing agent?

Use Playwright as a fallback when static HTML fetching returns thin content (under 500 characters is a good heuristic). Launch a headless Chromium instance, navigate to the URL with wait_until="networkidle", then extract the body text. Block images and fonts in the Playwright route to cut load time from ~5s to ~2s.

Can Claude get stuck in an infinite loop when browsing?

Yes — especially when searches return ambiguous results. Always set a max_iterations guard (10 is reasonable) and include explicit stopping criteria in your system prompt telling Claude to answer once it has fetched 2-3 quality sources. Without both guards, agents can run 20+ tool calls on simple queries.

Which Claude model should I use for a web browsing agent?

Claude Sonnet for the main reasoning loop — it handles multi-step tool use reliably at a reasonable price (around $3 per million input tokens and $15 per million output at time of writing). Use Claude Haiku for cheap intermediate summarization steps when pages are large. Claude Opus is overkill for most browsing tasks and costs several times more without meaningful quality improvement on structured retrieval work.


Editorial note: API pricing, model capabilities, and tool features change frequently — always verify current details on the vendor’s website before building in production. Code examples are tested at time of writing; pin your dependency versions to avoid breaking changes. Some links in this article may be affiliate links — we may earn a commission if you sign up, at no extra cost to you.

