Performance Optimization¶

Maximize speed and minimize costs when using Claude Code.

What You'll Learn¶

Context management strategies
Model selection for tasks
Prompt efficiency
Parallel execution
Cost optimization

Understanding Context¶

What Uses Context¶

Every conversation consumes context (tokens):

Element	Token Usage
Your prompts	Medium
Claude's responses	Medium-High
File contents read	High
Tool results	Variable
Conversation history	Cumulative
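
To get a feel for the table above, here is a rough sketch that estimates how many tokens each element contributes. It assumes the common rule of thumb of about 4 characters per token, which is only an approximation and not Claude's actual tokenizer. The names `estimateTokens` and `estimateSession` are hypothetical.

```javascript
// Rough heuristic: about 4 characters per token for English text and code.
// This is an approximation, not Claude's actual tokenizer.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Sums the estimate for each labeled piece of a session.
function estimateSession(parts) {
  const breakdown = {};
  let total = 0;
  for (const [label, text] of Object.entries(parts)) {
    breakdown[label] = estimateTokens(text);
    total += breakdown[label];
  }
  return { breakdown, total };
}

// Hypothetical session: one short prompt and one file read.
const session = estimateSession({
  prompt: "Explain getUserById in user.js",
  fileContents: "x".repeat(8000), // stand-in for an 8,000-character file
});
console.log(session);
```

Even in this toy case the file read dwarfs the prompt, which is why targeted reads matter more than trimming a few words from a request.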

The Context Window¶

Claude has a limited context window. As it fills:

- Response quality may decrease
- Older context may be summarized
- Costs increase, because each new request is processed together with the accumulated history
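
A minimal sketch of why cost grows as the window fills. It assumes every turn re-sends the full history and ignores prompt caching and compaction, so real sessions will be cheaper, but the growth pattern is similar. The function name and the numbers are hypothetical.

```javascript
// Assumption: every turn re-sends the full history, with no prompt caching
// or compaction applied.
function cumulativeInputTokens(tokensAddedPerTurn) {
  let history = 0;
  let totalProcessed = 0;
  const perTurn = [];
  for (const added of tokensAddedPerTurn) {
    history += added;          // the context now includes this turn
    totalProcessed += history; // the whole history is processed again
    perTurn.push(history);
  }
  return { perTurn, totalProcessed };
}

// Ten turns that each add 2,000 tokens (hypothetical numbers).
const growth = cumulativeInputTokens(Array(10).fill(2000));
console.log(growth.perTurn[growth.perTurn.length - 1]); // context size at turn 10
console.log(growth.totalProcessed);                     // tokens processed overall
```

The context ends at 20,000 tokens, but 110,000 tokens were processed along the way. Compacting or starting fresh resets the running total that every later turn pays for.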

Context Management Strategies¶

Use /compact Regularly¶

The /compact command summarizes the conversation:

> [Long exploration session...]
> [Many files read...]
> [Extensive debugging...]

> /compact

Conversation compacted. Key context preserved.

When to compact:

- After completing a major task
- When exploring many files
- Before starting a new topic
- Every 15-20 exchanges

Targeted File Reading¶

# Inefficient: Reading entire files
> Show me the auth module

# Efficient: Reading specific parts
> Show me the login function in auth.js

Clear Context Signals¶

Help Claude avoid re-reading:

# Less efficient
> What does getUserById do?
> [Claude reads user.js]
> What about createUser?
> [Claude might re-read user.js]

# More efficient
> Explain getUserById and createUser in user.js
> [Claude reads once, explains both]

Model Selection¶

Task-Appropriate Models¶

Different models for different tasks:

Task	Recommended Model
Simple edits	Haiku (fastest, cheapest)
Code review	Sonnet (balanced)
Architecture design	Opus (most capable)
Quick questions	Haiku
Complex debugging	Sonnet or Opus

Switching Models¶

> /model

# Switch the model for the current session
# Or set a default in /config, or start a session with: claude --model <name>

Cost Comparison¶

Relative costs (approximate; the ratios shift between model generations, so check current pricing):

- Haiku: 1x (baseline)
- Sonnet: 5x
- Opus: 15x

Use Haiku for routine tasks to save costs.
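
As a worked example, the sketch below compares running everything on Opus with routing routine work to Haiku. It uses the approximate multipliers listed above as placeholders. The workload numbers and the `relativeCost` helper are hypothetical.

```javascript
// Approximate multipliers from the list above; treat them as placeholders
// and substitute current pricing before relying on the result.
const RELATIVE_COST = { haiku: 1, sonnet: 5, opus: 15 };

// Cost of a workload in "Haiku units": tokens scaled by the model multiplier.
function relativeCost(model, tokens) {
  if (!Object.prototype.hasOwnProperty.call(RELATIVE_COST, model)) {
    throw new Error(`Unknown model: ${model}`);
  }
  return RELATIVE_COST[model] * tokens;
}

// Hypothetical day: 40 routine tasks and 5 hard ones, 10,000 tokens each.
const allOpus = relativeCost("opus", 45 * 10000);
const mixed = relativeCost("haiku", 40 * 10000) + relativeCost("opus", 5 * 10000);
console.log({ allOpus, mixed, saved: (1 - mixed / allOpus).toFixed(2) });
```

Under these assumptions the mixed approach costs roughly a sixth of the all-Opus run, because most of the token volume sits in the routine tasks.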

Prompt Efficiency¶

Concise Prompts¶

# Verbose (more tokens)
> I was wondering if you could possibly take a look at the
  authentication module and give me your thoughts on whether
  there might be any security issues that we should be aware of.

# Concise (fewer tokens)
> Review auth module for security issues.

Batch Requests¶

# Inefficient: Multiple round trips
> Add error handling to login()
> [wait]
> Add error handling to logout()
> [wait]
> Add error handling to refresh()

# Efficient: Single request
> Add error handling to login(), logout(), and refresh() in auth.js
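
If you generate prompts from a script, the batching above can be done mechanically. A minimal sketch, where `batchPrompt` is a hypothetical helper and not part of Claude Code:

```javascript
// Builds one batched prompt from several function names.
function batchPrompt(action, targets, file) {
  if (targets.length === 0) throw new Error("No targets given");
  const names = targets.map((t) => `${t}()`);
  let list;
  if (names.length === 1) {
    list = names[0];
  } else if (names.length === 2) {
    list = names.join(" and ");
  } else {
    list = `${names.slice(0, -1).join(", ")}, and ${names[names.length - 1]}`;
  }
  return `${action} ${list} in ${file}`;
}

console.log(batchPrompt("Add error handling to", ["login", "logout", "refresh"], "auth.js"));
```

This produces the same single request as the efficient example, so three round trips collapse into one.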

Structured Requests¶

> Do these in order:
  1. Read the User model
  2. Add email validation
  3. Add a migration for the email_verified field
  4. Update the tests

Parallel Execution¶

Independent Tasks¶

Claude can run independent searches in parallel:

> Find all deprecated functions AND check test coverage

Two parallel operations, single response.

Subagent Parallelism¶

For complex exploration:

> Analyze the authentication system, database layer,
  and API structure of this project.

Claude may spawn multiple agents working simultaneously.

When Parallel Helps¶

Searching multiple directories
Analyzing independent modules
Fetching from multiple sources

When Sequential is Necessary¶

Changes that depend on previous changes
Operations requiring confirmation
State-dependent commands
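
The difference between the two lists can be sketched with promises. This is an analogy for how independent and dependent work behave, not a description of Claude Code's internals. The `delay` helper stands in for any slow operation, and all names are hypothetical.

```javascript
// Stand-in for any slow operation (a search, a fetch, an analysis).
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Independent tasks: start all of them, then wait for the results together.
async function runParallel(tasks) {
  return Promise.all(tasks.map((task) => task()));
}

// Dependent tasks: each step receives the previous step's result.
async function runSequential(steps, initial) {
  let result = initial;
  for (const step of steps) {
    result = await step(result);
  }
  return result;
}

async function demo() {
  const found = await runParallel([
    () => delay(50, "auth"),
    () => delay(50, "database"),
    () => delay(50, "api"),
  ]);
  const chained = await runSequential(
    [(s) => delay(10, `${s} -> migrate`), (s) => delay(10, `${s} -> test`)],
    "edit model"
  );
  return { found, chained };
}

demo().then(console.log);
```

The three independent lookups finish in about the time of one, while the chained steps cannot overlap because each needs the output of the one before it.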

Session Optimization¶

Plan Before Executing¶

# Get oriented first
> Give me an overview of this codebase

# Then make targeted changes
> Based on that overview, update the auth middleware

Orienting first reduces back-and-forth later.

Use Continue Wisely¶

claude -c  # Continue previous session

Good: Resuming work on the same task
Bad: Starting unrelated work (start fresh instead)

Fresh Starts¶

claude  # New session

Start fresh for:

- New unrelated tasks
- When context is polluted
- After major topic changes

Reducing Token Waste¶

Avoid Redundant Reads¶

# Don't repeat yourself
> Read package.json
> [Claude reads it]
> Read package.json again and show me the scripts
> [Claude reads it AGAIN]

# Better: Ask for what you need upfront
> Show me the scripts section of package.json

Reference Previous Context¶

> Using the auth pattern we just discussed, implement
  the same for the payment module.

Claude uses existing context instead of re-deriving.

Limit Output When Possible¶

# May produce lengthy output
> Show me all the API routes

# More focused
> List API routes in routes/user.js with one-line descriptions

Non-Interactive Mode¶

For scripts and automation:

# Quick question, minimal overhead
claude -p "What testing framework does this project use?"

# Redirect output to a file; no interactive session needed
claude -p "Generate API documentation for routes/" > api-docs.md
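
The same one-shot call can be made from a Node script. The sketch below uses only the `-p` flag shown above and assumes the `claude` CLI is on your PATH. The helpers `buildArgs` and `askClaude` are hypothetical names, and the timeout and buffer sizes are arbitrary choices.

```javascript
const { execFile } = require("child_process");

// Builds the argument list for a one-shot, non-interactive call.
function buildArgs(prompt) {
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new Error("Prompt must be a non-empty string");
  }
  return ["-p", prompt];
}

// Runs `claude -p <prompt>` and resolves with its stdout.
// execFile passes the prompt as a single argument, so no shell quoting is needed.
function askClaude(prompt, { timeoutMs = 120000 } = {}) {
  return new Promise((resolve, reject) => {
    const child = execFile(
      "claude",
      buildArgs(prompt),
      { timeout: timeoutMs, maxBuffer: 10 * 1024 * 1024 },
      (err, stdout) => (err ? reject(err) : resolve(stdout))
    );
    // Close stdin so the CLI does not wait for piped input.
    child.stdin.end();
  });
}

// Usage (requires the claude CLI to be installed):
// askClaude("What testing framework does this project use?").then(console.log);
```

Using `execFile` rather than a shell string avoids quoting problems when the prompt itself contains quotes or special characters.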

Caching Considerations¶

MCP Server Caching¶

If using MCP servers that fetch external data:

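// In your MCP server (sketch assuming the official TypeScript SDK and zod;
// fetchWeather is your own function)
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "weather", version: "1.0.0" });
const cache = new Map();
const TTL_MS = 5 * 60 * 1000; // 5 minutes

server.tool("get_weather", { city: z.string() }, async ({ city }) => {
  const key = city.trim().toLowerCase(); // "Oslo" and "oslo " share one entry
  const cached = cache.get(key);
  if (cached && Date.now() - cached.time < TTL_MS) {
    return cached.result; // still fresh: skip the external call
  }
  const data = await fetchWeather(city);
  const result = { content: [{ type: "text", text: JSON.stringify(data) }] };
  cache.set(key, { result, time: Date.now() });
  return result;
});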
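
The same pattern can be factored into a reusable wrapper so that every tool handler does not repeat the cache logic. This is a minimal sketch: `withTtlCache` is a hypothetical helper, it caches by a single string key, and it does not deduplicate concurrent requests for the same key.

```javascript
// Wraps any async function with a per-key TTL cache.
// `now` is injectable so the expiry logic can be tested without waiting.
function withTtlCache(fn, ttlMs, now = Date.now) {
  const cache = new Map();
  return async function cached(key) {
    const hit = cache.get(key);
    if (hit && now() - hit.time < ttlMs) {
      return hit.value; // still fresh
    }
    const value = await fn(key); // miss or expired: call through
    cache.set(key, { value, time: now() });
    return value;
  };
}

// Usage with a stand-in for a slow external call.
let lookups = 0;
const slowLookup = async (city) => {
  lookups += 1;
  return { city, tempC: 21 }; // placeholder data, not a real API response
};
const cachedLookup = withTtlCache(slowLookup, 5 * 60 * 1000);

cachedLookup("oslo")
  .then(() => cachedLookup("oslo"))
  .then(() => console.log(`external calls made: ${lookups}`));
```

Errors are not cached here, since a rejected call never reaches `cache.set`, so a failed fetch is retried on the next request.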

Avoiding Repeated Exploration¶

# If you'll reference files multiple times
> Read and summarize the key configuration files.
  I'll need to reference these settings throughout our session.

Claude keeps the summary in context, avoiding re-reads.

Measuring Performance¶

Track Token Usage¶

Monitor your usage in the Anthropic Console or via the usage fields in API responses. Inside a session, the /cost command reports usage for the current session.

Time Operations¶

time claude -p "what files are in src/"

Compare Approaches¶

Test different prompting strategies:

- Specific vs general prompts
- Single vs batch requests
- Different models
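
For repeatable comparisons, a small timing harness helps. This is a sketch with hypothetical helper names. Each function you pass to `timed` would wrap whatever call you are measuring, such as a scripted `claude -p` invocation.

```javascript
const { performance } = require("perf_hooks");

// Times an async operation and returns its result with the elapsed time.
async function timed(label, fn) {
  const start = performance.now();
  const result = await fn();
  const ms = performance.now() - start;
  return { label, ms, result };
}

// Picks the fastest of several recorded measurements.
function fastest(measurements) {
  if (measurements.length === 0) throw new Error("No measurements");
  return measurements.reduce((best, m) => (m.ms < best.ms ? m : best));
}
```

Run each strategy several times before comparing, since response times vary from call to call and a single sample can mislead.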

Quick Reference¶

┌─────────────────────────────────────────────────────────────┐
│ PERFORMANCE CHECKLIST                                        │
├─────────────────────────────────────────────────────────────┤
│ □ Use /compact after major tasks                            │
│ □ Read specific parts, not whole files when possible        │
│ □ Batch related requests                                    │
│ □ Use Haiku for simple tasks                                │
│ □ Use non-interactive mode for scripts                      │
│ □ Start fresh sessions for new topics                       │
│ □ Reference previous context instead of re-reading          │
│ □ Request focused output                                    │
└─────────────────────────────────────────────────────────────┘

Try It Yourself¶

Exercise: Optimize a Session¶

Start a session and work on a task
Note how many exchanges it takes
Start fresh and try to complete the same task in fewer exchanges
Compare the efficiency

Exercise: Model Comparison¶

Take a simple task (e.g., "explain this function")
Try with Haiku
Try with Sonnet
Note speed, quality, and when each is appropriate

What's Next?¶

Learn how to integrate Claude Code with n8n for powerful workflow automation in 04-n8n-integration.

Summary:

- Context is precious: Use /compact, be specific, batch requests
- Right model for the job: Haiku for simple tasks, Sonnet or Opus for complex ones
- Efficient prompts: Concise, batched, structured
- Parallel when possible: Independent tasks can run simultaneously
- Fresh starts for new topics: Don't pollute context

Learning Resources¶

Featured Video¶

Edmund Yong: 800+ Hours Claude Code - Optimization (Popular tech channel)

Performance optimization - context management, model selection, and token efficiency.

Additional Resources¶

Type	Resource	Description
🎬 Video	Claude Code Tips	All About AI - Performance tips
📚 Official Docs	Memory Management	Context and memory documentation
📖 Tutorial	Builder.io Guide	Optimization techniques
🎓 Free Course	Anthropic Academy	Performance courses
💼 Commercial	Prompt Engineering	Token optimization course