Performance Optimization¶
Maximize speed and minimize costs when using Claude Code.
What You'll Learn¶
- Context management strategies
- Model selection for tasks
- Prompt efficiency
- Parallel execution
- Cost optimization
Understanding Context¶
What Uses Context¶
Every conversation consumes context (tokens):
| Element | Token Usage |
|---|---|
| Your prompts | Medium |
| Claude's responses | Medium-High |
| File contents read | High |
| Tool results | Variable |
| Conversation history | Cumulative |
The Context Window¶
Claude has a limited context window. As it fills:
- Response quality may decrease
- Older context may be summarized
- Costs increase
Context Management Strategies¶
Use /compact Regularly¶
The /compact command summarizes the conversation:
> [Long exploration session...]
> [Many files read...]
> [Extensive debugging...]
> /compact
Conversation compacted. Key context preserved.
When to compact:
- After completing a major task
- When exploring many files
- Before starting a new topic
- Every 15-20 exchanges
Targeted File Reading¶
# Inefficient: Reading entire files
> Show me the auth module
# Efficient: Reading specific parts
> Show me the login function in auth.js
Clear Context Signals¶
Help Claude avoid re-reading:
# Less efficient
> What does getUserById do?
> [Claude reads user.js]
> What about createUser?
> [Claude might re-read user.js]
# More efficient
> Explain getUserById and createUser in user.js
> [Claude reads once, explains both]
Model Selection¶
Task-Appropriate Models¶
Different models for different tasks:
| Task | Recommended Model |
|---|---|
| Simple edits | Haiku (fastest, cheapest) |
| Code review | Sonnet (balanced) |
| Architecture design | Opus (most capable) |
| Quick questions | Haiku |
| Complex debugging | Sonnet or Opus |
Switching Models¶
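You can change models mid-session with the /model command or pick one when launching the CLI with the --model flag. A quick sketch (the exact model names and aliases available may vary by version and plan):
# Mid-session: open the model picker
> /model
# At launch: set a model for the whole session
claude --model sonnet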
Cost Comparison¶
Relative costs (approximate):
- Haiku: 1x (baseline)
- Sonnet: 5x
- Opus: 15x
Use Haiku for routine tasks to save costs.
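For a rough sense of scale: if a routine task costs about $0.01 on Haiku, expect on the order of $0.05 on Sonnet and $0.15 on Opus (illustrative figures derived only from the ratios above, not actual pricing).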
Prompt Efficiency¶
Concise Prompts¶
# Verbose (more tokens)
> I was wondering if you could possibly take a look at the
authentication module and give me your thoughts on whether
there might be any security issues that we should be aware of.
# Concise (fewer tokens)
> Review auth module for security issues.
Batch Requests¶
# Inefficient: Multiple round trips
> Add error handling to login()
> [wait]
> Add error handling to logout()
> [wait]
> Add error handling to refresh()
# Efficient: Single request
> Add error handling to login(), logout(), and refresh() in auth.js
Structured Requests¶
> Do these in order:
1. Read the User model
2. Add email validation
3. Add a migration for the email_verified field
4. Update the tests
Parallel Execution¶
Independent Tasks¶
Claude can run independent searches in parallel:
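For example (a hypothetical prompt; the paths are placeholders):
> Find every TODO comment in src/, and also list which files import utils/logger.js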
Two parallel operations, single response.
Subagent Parallelism¶
For complex exploration:
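For instance (a hypothetical prompt; directory names are placeholders):
> Explore the frontend/ and backend/ directories and summarize how they communicate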
Claude may spawn multiple agents working simultaneously.
When Parallel Helps¶
- Searching multiple directories
- Analyzing independent modules
- Fetching from multiple sources
When Sequential is Necessary¶
- Changes that depend on previous changes
- Operations requiring confirmation
- State-dependent commands
Session Optimization¶
Plan Before Executing¶
# Get oriented first
> Give me an overview of this codebase
# Then make targeted changes
> Based on what I see, update the auth middleware
This reduces back-and-forth.
Use Continue Wisely¶
- Good: Resuming work on the same task
- Bad: Starting unrelated work (start fresh instead)
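For example (assuming the CLI's --continue flag, which resumes the most recent session):
# Good: pick up yesterday's refactoring where it left off
claude --continue
# Bad: reusing that session for an unrelated task; start a new one instead
claude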
Fresh Starts¶
Start fresh for:
- New unrelated tasks
- When context is polluted
- After major topic changes
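Within an existing session, you can also wipe the history (assuming the /clear command, which resets the conversation without exiting):
> /clear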
Reducing Token Waste¶
Avoid Redundant Reads¶
# Don't repeat yourself
> Read package.json
> [Claude reads it]
> Read package.json again and show me the scripts
> [Claude reads it AGAIN]
# Better: Ask for what you need upfront
> Show me the scripts section of package.json
Reference Previous Context¶
When you reference earlier results explicitly, Claude uses existing context instead of re-deriving it.
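For example (a hypothetical follow-up that builds on files Claude read earlier in the session):
> Based on the auth flow you described earlier, add rate limiting to login()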
Limit Output When Possible¶
# May produce lengthy output
> Show me all the API routes
# More focused
> List API routes in routes/user.js with one-line descriptions
Non-Interactive Mode¶
For scripts and automation:
# Quick question, minimal overhead
claude -p "What testing framework does this project use?"
# Pipe to file, avoid session costs
claude -p "Generate API documentation for routes/" > api-docs.md
Caching Considerations¶
MCP Server Caching¶
If using MCP servers that fetch external data:
// Sketch assuming the TypeScript MCP SDK (@modelcontextprotocol/sdk);
// fetchWeather() is your own helper for calling the external API.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "weather", version: "1.0.0" });

// Simple in-memory cache keyed by city, with a 5-minute TTL
const cache = new Map();
const TTL = 300000; // 5 minutes

server.tool("get_weather", { city: z.string() }, async ({ city }) => {
  const cached = cache.get(city);
  if (cached && Date.now() - cached.time < TTL) {
    return cached.result; // serve from cache, skip the external call
  }
  const data = await fetchWeather(city);
  const result = { content: [{ type: "text", text: JSON.stringify(data) }] };
  cache.set(city, { result, time: Date.now() });
  return result;
});
Avoiding Repeated Exploration¶
# If you'll reference files multiple times
> Read and summarize the key configuration files.
I'll need to reference these settings throughout our session.
Claude keeps the summary in context, avoiding re-reads.
Measuring Performance¶
Track Token Usage¶
Monitor your usage in the Anthropic console or via API responses.
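Inside a session, you can also ask the CLI directly (assuming the /cost command, which reports the current session's usage):
> /cost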
Time Operations¶
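A simple approach is to wrap non-interactive runs with the shell's time builtin (a sketch; the prompt and model aliases here are illustrative):
# Compare wall-clock time for the same prompt on different models
time claude -p "Summarize README.md" --model haiku
time claude -p "Summarize README.md" --model sonnet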
Compare Approaches¶
Test different prompting strategies:
- Specific vs general prompts
- Single vs batch requests
- Different models
Quick Reference¶
┌─────────────────────────────────────────────────────────────┐
│ PERFORMANCE CHECKLIST │
├─────────────────────────────────────────────────────────────┤
│ □ Use /compact after major tasks │
│ □ Read specific parts, not whole files when possible │
│ □ Batch related requests │
│ □ Use Haiku for simple tasks │
│ □ Use non-interactive mode for scripts │
│ □ Start fresh sessions for new topics │
│ □ Reference previous context instead of re-reading │
│ □ Request focused output │
└─────────────────────────────────────────────────────────────┘
Try It Yourself¶
Exercise: Optimize a Session¶
- Start a session and work on a task
- Note how many exchanges it takes
- Start fresh and try to complete the same task in fewer exchanges
- Compare the efficiency
Exercise: Model Comparison¶
- Take a simple task (e.g., "explain this function")
- Try with Haiku
- Try with Sonnet
- Note speed, quality, and when each is appropriate
What's Next?¶
Learn how to integrate Claude Code with n8n for powerful workflow automation in 04-n8n-integration.
Summary:
- Context is precious: Use /compact, be specific, batch requests
- Right model for the job: Haiku for simple, Sonnet for complex
- Efficient prompts: Concise, batched, structured
- Parallel when possible: Independent tasks can run simultaneously
- Fresh starts for new topics: Don't pollute context
Learning Resources¶
Featured Video¶
Edmund Yong: 800+ Hours Claude Code - Optimization (Popular tech channel)
Performance optimization - context management, model selection, and token efficiency.
Additional Resources¶
| Type | Resource | Description |
|---|---|---|
| 🎬 Video | Claude Code Tips | All About AI - Performance tips |
| 📚 Official Docs | Memory Management | Context and memory documentation |
| 📖 Tutorial | Builder.io Guide | Optimization techniques |
| 🎓 Free Course | Anthropic Academy | Performance courses |
| 💼 Commercial | Prompt Engineering | Token optimization course |