Backend Architect: The Claude Code Agent That Designs Systems Before You Write a Single Line of Code

Most architecture mistakes don’t happen during development; they happen before it. A poorly defined service boundary discovered three months into a project means refactoring data contracts, rewriting database queries, and having uncomfortable conversations with stakeholders about timelines. The Backend Architect agent exists to surface those decisions early, when changing them is still cheap.

This agent acts as a senior backend engineer embedded directly in your Claude Code workflow. Ask it to design a RESTful API and you get versioned endpoints with request/response examples. Ask it about microservice decomposition and you get a concrete boundary analysis, not a whiteboard sketch. It outputs Mermaid diagrams, database schemas with indexes, technology recommendations with actual rationale, and an honest assessment of where your system will break under load. That’s not documentation you have to write — it’s architecture you can hand directly to your team.

The time savings aren’t marginal. Developers typically spend hours translating vague requirements into coherent system designs, cross-referencing API conventions, and arguing about normalization in pull request comments. Backend Architect compresses that work into minutes and moves those conversations to the beginning of a project, where they belong.

When to Use Backend Architect

The agent description says to use it proactively — that’s the right instinct. Don’t wait until you’re stuck. Reach for it the moment architecture decisions are on the table.

Greenfield API Design

You’re starting a new service and need to establish endpoint conventions, versioning strategy, and error response formats before the first controller gets written. Backend Architect produces contract-first API definitions you can share with frontend teams immediately, eliminating the “what does a 422 return exactly?” back-and-forth mid-sprint.
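To make the 422 question concrete, here is a minimal sketch of the kind of shared error envelope a contract-first design might pin down. The field names (`status`, `code`, `message`, `details`) and the `validation_error` helper are assumptions chosen for illustration, not output copied from the agent.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class FieldError:
    """One problem with one request field (hypothetical shape)."""
    path: str
    issue: str


@dataclass
class ErrorEnvelope:
    """Error body that every endpoint would return, whatever the status code."""
    status: int
    code: str
    message: str
    details: list[FieldError] = field(default_factory=list)

    def to_body(self) -> dict:
        # asdict recurses into the nested FieldError dataclasses.
        return {"error": asdict(self)}


def validation_error(problems: dict[str, str]) -> ErrorEnvelope:
    """Build the 422 body for a set of field-level problems."""
    return ErrorEnvelope(
        status=422,
        code="validation_failed",
        message="One or more fields are invalid.",
        details=[FieldError(path=p, issue=i) for p, i in problems.items()],
    )


body = validation_error({"name": "must not be empty"}).to_body()
```

Agreeing on one envelope like this up front means frontend code can handle every failure with the same parsing logic.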

Microservice Boundary Analysis

Your monolith is growing and you need to extract services without ending up with a distributed monolith. The agent evaluates domain boundaries, identifies data ownership conflicts, and recommends inter-service communication patterns (synchronous vs. event-driven) based on your consistency requirements.

Database Schema Design

Before you run your first migration, you need a schema that won’t require painful alterations when your user table hits ten million rows. Backend Architect designs normalized schemas, recommends index strategies, and flags sharding candidates based on access patterns you describe.

Performance and Scaling Planning

You’re onboarding a large client and traffic is projected to increase tenfold. The agent analyzes your described architecture, identifies bottlenecks, and recommends caching layers, connection pooling configurations, and horizontal scaling strategies — with concrete technology choices, not generic advice.

Legacy System Modernization

You need to wrap a legacy system in a clean API while progressively migrating functionality. Backend Architect can design the strangler fig pattern implementation, defining which endpoints proxy to legacy and which serve new implementations.
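The core of a strangler fig facade is a routing decision: which backend serves a given request. The sketch below shows one way that decision could look, assuming a hypothetical route table and the made-up backend names `legacy` and `new`. Anything not explicitly migrated falls through to the legacy system.

```python
from dataclasses import dataclass

LEGACY = "legacy"
NEW = "new"


@dataclass(frozen=True)
class Route:
    method: str
    prefix: str
    backend: str


# Hypothetical migration state: project reads and creates are migrated,
# but the export endpoint is still served by the legacy system.
ROUTES = [
    Route("GET", "/v1/projects", NEW),
    Route("POST", "/v1/projects", NEW),
    Route("GET", "/v1/projects/export", LEGACY),
]


def choose_backend(method: str, path: str, routes=ROUTES) -> str:
    """Return the backend that should serve the request; default to legacy."""
    matches = [
        r for r in routes
        if r.method == method.upper()
        and (path == r.prefix or path.startswith(r.prefix + "/"))
    ]
    if not matches:
        return LEGACY
    # Longest prefix wins, so a specific rule can override a broad one.
    return max(matches, key=lambda r: len(r.prefix)).backend
```

Defaulting to legacy keeps the migration safe: an endpoint only moves when someone adds a rule for it, and removing that rule is the rollback.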

Real-World Examples

Scenario: Designing a Multi-Tenant SaaS API from Scratch

A team is building a project management SaaS product. They need to establish their core API architecture before development begins. They’ve decided on Node.js and PostgreSQL but haven’t committed to anything beyond that.

Developer: We're building a multi-tenant project management SaaS. 
Users belong to organizations, organizations have projects, projects 
have tasks. We need a REST API. Where do we start?

Backend Architect: Start with your tenant isolation strategy — 
everything downstream depends on it. For your data model, 
row-level security in PostgreSQL is the pragmatic choice over 
separate schemas per tenant until you have specific compliance 
requirements forcing separation.

Here's the core schema:

organizations (id, slug, plan, created_at)
users (id, org_id, email, role, created_at)
projects (id, org_id, name, status, owner_id, created_at)
tasks (id, project_id, assignee_id, title, status, priority, due_at)

Critical indexes: users(org_id, email), projects(org_id, status), 
tasks(project_id, assignee_id), tasks(project_id, status, due_at).

For your API structure, version from day one:

GET    /v1/projects
POST   /v1/projects
GET    /v1/projects/:id
PATCH  /v1/projects/:id
DELETE /v1/projects/:id

GET    /v1/projects/:id/tasks
POST   /v1/projects/:id/tasks

Org context should come from the authenticated JWT, not the URL. 
Your middleware resolves org_id from the token and attaches it to 
every request — never trust an org_id from the request body.

Bottleneck to plan for: task listing queries with filters will 
get expensive. Add a partial index on tasks(project_id) WHERE 
status != 'completed' and paginate with cursor-based pagination 
from day one. Offset pagination on a tasks table will hurt you 
after 100k rows per project.

Technology additions I'd recommend: Redis for session caching 
and rate limiting (per org, not per IP), BullMQ for async 
operations like notification dispatch.
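The cursor-based pagination advice in that response can be sketched as follows. This is an illustration under stated assumptions, not the agent’s output: it uses Python’s built-in `sqlite3` module as a stand-in for PostgreSQL, a trimmed-down `tasks` table, and a hypothetical `fetch_open_tasks` helper that uses the task `id` as the cursor.

```python
import sqlite3


def fetch_open_tasks(conn, project_id, after_id=0, limit=2):
    """Return one page of open tasks plus the cursor for the next page."""
    rows = conn.execute(
        """
        SELECT id, title FROM tasks
        WHERE project_id = ? AND status != 'completed' AND id > ?
        ORDER BY id
        LIMIT ?
        """,
        (project_id, after_id, limit + 1),  # one extra row reveals a next page
    ).fetchall()
    page = rows[:limit]
    next_cursor = page[-1][0] if len(rows) > limit else None
    return page, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER, "
    "title TEXT, status TEXT)"
)
# Partial index from the transcript: only non-completed tasks are indexed.
conn.execute(
    "CREATE INDEX idx_open_tasks ON tasks(project_id) WHERE status != 'completed'"
)
conn.executemany(
    "INSERT INTO tasks (project_id, title, status) VALUES (?, ?, ?)",
    [(1, "spec", "open"), (1, "build", "completed"),
     (1, "test", "open"), (1, "ship", "open")],
)

first_page, cursor = fetch_open_tasks(conn, project_id=1)
second_page, end = fetch_open_tasks(conn, project_id=1, after_id=cursor)
```

Because each page starts from `id > cursor` instead of skipping rows with an offset, the database never re-scans the rows already returned, which is why this approach holds up on large tables.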

Scenario: Decomposing a Monolith into Services

An e-commerce platform has a Rails monolith that’s becoming hard to deploy safely. The team wants to extract services but isn’t sure where to draw boundaries.

Developer: We have a Rails monolith handling orders, inventory, 
users, and email notifications. Deployments are risky because 
everything is coupled. Where do we cut first?

Backend Architect: Notifications first — it's the lowest-risk 
extraction. Email and SMS dispatch has clear inputs, no 
synchronous response requirement, and failure there doesn't 
break order processing. Extract it as an async consumer 
listening to a message queue.

Your extraction sequence:
1. Notification Service (async, event-driven — easiest boundary)
2. Inventory Service (clear aggregate, but watch for 
   order-inventory consistency — use optimistic locking)
3. Order Service (highest business value, hardest to extract, so it goes last)
4. User/Auth Service (extract early if you're adding OAuth or SSO)

The pattern for each extraction:
- Define the service's event schema first
- Run the new service in parallel, consuming events while 
  the monolith still handles the same operations
- Shift writes to the new service, monolith reads from it 
  via internal API
- Cut over reads, deprecate monolith's module

For inter-service communication: notifications are fire-and-forget 
(use RabbitMQ or SQS), inventory checks during checkout need 
synchronous responses (gRPC or REST with circuit breaker), 
order events to trigger downstream workflows use an event bus.

Don't extract the user service last. Auth is referenced 
everywhere — if it's not stable and independently deployable 
before you start extracting business domains, every service 
extraction will be messier than it needs to be.
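The optimistic locking suggested for order-inventory consistency could look roughly like the sketch below. It assumes a hypothetical `inventory` table with a `version` column and again uses `sqlite3` as a stand-in database. The `reserve_stock` helper and its retry count are illustrative choices, not part of the agent’s response.

```python
import sqlite3


def reserve_stock(conn, sku, quantity, max_retries=3):
    """Decrement stock only if nobody changed the row since we read it."""
    for _ in range(max_retries):
        row = conn.execute(
            "SELECT on_hand, version FROM inventory WHERE sku = ?", (sku,)
        ).fetchone()
        if row is None or row[0] < quantity:
            return False  # unknown SKU or not enough stock
        on_hand, version = row
        updated = conn.execute(
            "UPDATE inventory SET on_hand = ?, version = version + 1 "
            "WHERE sku = ? AND version = ?",
            (on_hand - quantity, sku, version),
        )
        if updated.rowcount == 1:
            conn.commit()
            return True
        # Another writer bumped the version first: re-read and try again.
    return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (sku TEXT PRIMARY KEY, on_hand INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO inventory VALUES ('widget', 5, 0)")

first_order = reserve_stock(conn, "widget", 3)   # succeeds, 2 left
second_order = reserve_stock(conn, "widget", 3)  # fails, not enough stock
```

The `WHERE version = ?` clause is what makes this safe without holding a lock: a write based on stale data matches zero rows, so it can be detected and retried.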

What Makes This Agent Powerful

Contract-First Output

The agent is instructed to produce API endpoint definitions with concrete request and response examples. This isn’t pseudocode — it’s output you can paste into an OpenAPI spec or share with a frontend team as a working contract. The difference between “we’ll design the API as we go” and starting with a defined contract is measured in weeks of integration friction.

Visual Architecture Artifacts

Backend Architect generates Mermaid and ASCII architecture diagrams inline. These aren’t throwaway outputs; they can serve as the architectural documentation for your system. Mermaid diagrams render natively in GitHub and Notion, and in many wiki platforms, without additional tooling.
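For readers who haven’t seen Mermaid, a diagram is just plain text. The sketch below builds a small flowchart from a list of edges. The `to_mermaid` helper and the service names are assumptions made up for this illustration.

```python
def to_mermaid(edges, direction="LR"):
    """Render (source, label, target) triples as Mermaid flowchart text."""
    lines = [f"flowchart {direction}"]
    for source, label, target in edges:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {source} {arrow} {target}")
    return "\n".join(lines)


edges = [
    ("Monolith", "order placed", "Queue"),
    ("Queue", "", "NotificationService"),
]
diagram = to_mermaid(edges)
print(diagram)
```

Pasting the printed text into a Mermaid code block in a README is enough for GitHub to render it as a diagram.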

Bottleneck Identification

The system prompt asks for an explicit section on potential bottlenecks and scaling considerations in every response. This isn’t generic “add a cache” advice; it’s tied to the specific schema and access patterns you’ve described. Knowing where a system will fail before it fails is the most valuable thing a senior architect provides, and this agent builds that into its standard output.

Pragmatic Technology Recommendations

Technology recommendations come with rationale. The agent doesn’t just say “use Redis” — it explains which specific use case justifies it and what problem you’re solving. This makes recommendations defensible in engineering discussions and helps teams avoid cargo-culting technology choices.

Principled Simplicity

The agent’s core principles include avoiding premature optimization and keeping designs simple. This counterbalances the tendency to over-engineer. When you ask about microservices, it will tell you if a modular monolith is the right answer for your scale. That kind of honest pushback is what you’d want from a senior engineer, and it prevents the most common architecture mistake: building distributed systems complexity before you’ve earned it.

How to Install Backend Architect

Installation takes about sixty seconds. Claude Code automatically loads agent files from the .claude/agents/ directory in your project.

Create the agent file in your project root:

mkdir -p .claude/agents
touch .claude/agents/backend-architect.md

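Open .claude/agents/backend-architect.md and paste the following system prompt. Claude Code also expects a YAML frontmatter block at the top of the file with the agent’s name and description, so add one above the prompt if your copy doesn’t include it: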

You are a backend system architect specializing in scalable API design 
and microservices.

## Focus Areas
- RESTful API design with proper versioning and error handling
- Service boundary definition and inter-service communication
- Database schema design (normalization, indexes, sharding)
- Caching strategies and performance optimization
- Basic security patterns (auth, rate limiting)

## Approach
1. Start with clear service boundaries
2. Design APIs contract-first
3. Consider data consistency requirements
4. Plan for horizontal scaling from day one
5. Keep it simple - avoid premature optimization

## Output
- API endpoint definitions with example requests/responses
- Service architecture diagram (mermaid or ASCII)
- Database schema with key relationships
- List of technology recommendations with brief rationale
- Potential bottlenecks and scaling considerations

Always provide concrete examples and focus on practical 
implementation over theory.

Save the file. The next time you open Claude Code in that project, the Backend Architect agent is available. Reference it explicitly in your prompts with @backend-architect or simply describe an architecture problem — Claude Code will surface it when the context is appropriate.

The .claude/agents/ directory is project-scoped, so you can commit it to your repository and share the agent with your entire team. Every developer on the project gets consistent architectural guidance from the same system prompt.

Practical Next Steps

Install the agent, then use it immediately on something concrete. Don’t wait for a new project — take your most complex existing service and ask Backend Architect to review its API design. The gap between what you have and what it recommends is often instructive.

If you’re starting something new, run your requirements through the agent before writing any code. Get the schema, the endpoint definitions, and the scaling analysis. Then build against that contract. You’ll spend less time in pull request debates about conventions and more time shipping features.

The agent pairs well with specialized agents for adjacent concerns — a Security Reviewer for validating auth implementations, a Database Specialist for query optimization. Backend Architect handles the macro design; those agents handle the implementation details. Building a small library of focused agents that cover your team’s recurring decision points is how you systematically eliminate the slow, expensive conversations that block every project.

Agent template sourced from the claude-code-templates open source project (MIT License).
