System Design Interview Guide: From Requirements to Architecture
System design interviews are where senior engineering candidates often struggle the most. Unlike coding problems with clear right answers, design questions are open-ended conversations where you need to demonstrate breadth of knowledge, structured thinking, and the ability to make and justify trade-offs.
The good news: there’s a repeatable framework you can follow, and the set of building blocks you need to know is finite.
What Interviewers Evaluate
System design interviews assess:
- Requirements gathering — Can you ask the right questions to scope a problem?
- High-level design — Can you identify major components and how they interact?
- Trade-off analysis — Can you articulate why you chose one approach over another?
- Scalability thinking — Do you understand what breaks at scale?
- Technical depth — Can you dive deep into specific components when asked?
You’re not expected to design a production-ready system in 45 minutes. You’re expected to demonstrate how you think about complex systems.
The Framework
Use this structure for every system design question:
Step 1: Clarify Requirements (5 minutes)
Never start designing without understanding what you’re building. Ask about:
Functional requirements:
- What are the core features?
- Who are the users?
- What are the key user flows?
Non-functional requirements:
- What’s the expected scale? (users, requests/sec, data volume)
- What are the latency requirements?
- Is availability or consistency more important?
- Are there any regulatory or compliance constraints?
Constraints:
- What’s the expected growth rate?
- Are there budget or team size constraints?
- Do we need to integrate with existing systems?
Step 2: Estimate Scale (3 minutes)
Back-of-envelope calculations ground your design in reality:
- Users: 10M daily active users
- Requests: If each user makes 10 requests/day → 100M requests/day ÷ 86,400 sec ≈ 1,150 requests/sec average, ~3,500 at peak (assuming peak is roughly 3× average)
- Storage: If each record is 1KB and we store 100M records → 100GB
- Bandwidth: 3,500 req/sec at peak × 1KB per response = 3.5 MB/sec
These numbers inform your choices about databases, caching, and infrastructure.
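The arithmetic above is easy to script, which helps when an interviewer changes an assumption mid-conversation. The sketch below is one way to do it. The function name `estimate`, the 3× peak factor, and treating 1KB as 1,000 bytes are assumptions chosen to reproduce the figures above, not fixed rules.

```python
SECONDS_PER_DAY = 86_400


def estimate(daily_active_users, requests_per_user_per_day,
             record_bytes, record_count, peak_factor=3):
    """Return rough request rates, storage, and peak bandwidth.

    peak_factor is an assumed ratio of peak to average traffic.
    """
    avg_rps = daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY
    peak_rps = avg_rps * peak_factor
    return {
        "avg_rps": avg_rps,
        "peak_rps": peak_rps,
        # Decimal units: 1 GB = 1e9 bytes, 1 MB = 1e6 bytes.
        "storage_gb": record_bytes * record_count / 1e9,
        "peak_bandwidth_mb_s": peak_rps * record_bytes / 1e6,
    }


result = estimate(
    daily_active_users=10_000_000,
    requests_per_user_per_day=10,
    record_bytes=1_000,
    record_count=100_000_000,
)
for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

With the inputs from this section it gives roughly 1,157 requests/sec on average, 3,472 at peak, 100 GB of storage, and about 3.5 MB/sec of peak bandwidth, which matches the rounded figures above.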
Step 3: High-Level Design (10 minutes)
Draw the major components and their interactions:
Client → Load Balancer → API Servers → Database
                                     → Cache
                                     → Message Queue → Workers
Identify:
- Where data flows
- What protocols are used (HTTP, WebSocket, gRPC)
- What type of storage each component needs
- Where you need asynchronous processing
Step 4: Deep Dive (15–20 minutes)
Pick 2–3 components to design in detail. The interviewer will often guide you toward what they want to explore. Common deep dives:
Database design:
- Schema design
- SQL vs NoSQL choice (and why)
- Indexing strategy
- Sharding approach
API design:
- Key endpoints
- Request/response formats
- Authentication and rate limiting
Caching strategy:
- What to cache and where
- Cache invalidation approach
- Cache-aside vs write-through
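To make the cache-aside vs write-through distinction concrete, here is a minimal sketch that uses plain dictionaries as stand-ins for the cache and the database. The class names `CacheAsideStore` and `WriteThroughStore` are hypothetical, and a real system would add TTLs, eviction, and error handling.

```python
class CacheAsideStore:
    """The application manages the cache: load on miss, invalidate on write."""

    def __init__(self, db):
        self.db = db        # stand-in for the database
        self.cache = {}     # stand-in for Redis/Memcached

    def read(self, key):
        if key in self.cache:
            return self.cache[key]          # cache hit
        value = self.db.get(key)            # cache miss: go to the database
        if value is not None:
            self.cache[key] = value         # populate for later reads
        return value

    def write(self, key, value):
        self.db[key] = value
        self.cache.pop(key, None)           # invalidate; next read reloads


class WriteThroughStore:
    """Every write updates the database and the cache together."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        return self.db.get(key)

    def write(self, key, value):
        self.db[key] = value
        self.cache[key] = value             # cache stays warm after writes


store = CacheAsideStore({"user:1": "Ada"})
print(store.read("user:1"))    # miss, then cached
store.write("user:1", "Grace") # cache entry is dropped
print(store.read("user:1"))    # reloaded from the database
```

The trade-off to talk through: cache-aside only caches what is actually read but pays a miss after every write, while write-through keeps reads fast after writes at the cost of caching data that may never be read.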
Scaling approach:
- Horizontal vs vertical scaling
- Database replication
- CDN for static content
- Partitioning strategy
Step 5: Address Bottlenecks (5 minutes)
Identify what would break first at 10x scale and how you’d address it:
- Single points of failure
- Hot spots in your data
- Network bottlenecks
- Consistency challenges
Essential Building Blocks
You need working knowledge of these components:
Load Balancers
- Distribute traffic across servers
- Algorithms: round-robin, least connections, consistent hashing
- Layer 4 (TCP) vs Layer 7 (HTTP)
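Of the algorithms listed, consistent hashing is the one interviewers most often ask you to explain. The sketch below is one common formulation, a hash ring with virtual nodes. The class name `HashRing`, the server names, and the choice of 100 virtual nodes per server are illustrative assumptions, and MD5 is used here only to spread keys, not for security.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []           # sorted list of (ring position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns many points so load spreads evenly.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def get(self, key):
        """Return the first node clockwise from the key's position."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, (_hash(key), ""))
        return self._points[idx % len(self._points)][1]   # wrap around


ring = HashRing(["server-a", "server-b", "server-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get(k) for k in keys}
ring.remove("server-c")
after = {k: ring.get(k) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys moved after removing one server")
```

The property worth pointing out is that only the keys that lived on the removed server move. With a plain `hash(key) % N` scheme, changing `N` would remap most keys.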
Databases
- Relational (PostgreSQL, MySQL) — ACID transactions, complex queries, joins
- Document (MongoDB) — Flexible schema, horizontal scaling
- Key-value (Redis, DynamoDB) — Fast lookups, caching, sessions
- Wide-column (Cassandra) — High write throughput, time-series data
- Graph (Neo4j) — Relationship-heavy data
Caching
- CDN — Static assets, geographically distributed
- Application cache (Redis/Memcached) — Frequently accessed data
- Database query cache — Expensive query results
- Invalidation and write strategies: TTL expiry, write-through, write-behind
Message Queues
- Kafka — High-throughput event streaming, ordered processing
- RabbitMQ/SQS — Task queues, decoupling services
- Use cases: async processing, event-driven architecture, buffering spikes
Storage
- Object storage (S3) — Files, images, backups
- Block storage (EBS) — Database volumes
- File storage (EFS) — Shared file systems
Common Design Questions
Here are frequently asked questions with key considerations:
URL Shortener
- Hash function for generating short codes
- Database choice (key-value store works well)
- Handling collisions
- Analytics and click tracking
- Cache popular URLs
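The first three points can be sketched together: hash the URL, encode part of the digest in base62, and retry with a different input when the code is already taken by another URL. This is one possible approach under stated assumptions. The function names, the 7-character code length, and the dictionary standing in for a key-value store are all hypothetical.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_letters   # 62 characters


def base62(n: int) -> str:
    """Encode a non-negative integer using digits and letters."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))


def shorten(url: str, store: dict, length: int = 7) -> str:
    """Return a short code for url, storing code -> url in store."""
    attempt = 0
    while True:
        # Mixing the attempt number into the input changes the hash on retry.
        digest = hashlib.sha256(f"{url}|{attempt}".encode()).digest()
        number = int.from_bytes(digest[:8], "big")
        code = base62(number)[:length].rjust(length, ALPHABET[0])
        existing = store.get(code)
        if existing is None:
            store[code] = url       # free slot: claim it
            return code
        if existing == url:
            return code             # same URL shortened before
        attempt += 1                # collision with a different URL: retry


store = {}
code = shorten("https://example.com/some/long/path", store)
print(code, "->", store[code])
```

In a real deployment the check-then-insert step would need to be atomic (for example a conditional write in the key-value store), since two servers could otherwise claim the same code at once.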
Social Media Feed
- Fan-out on write vs fan-out on read
- Ranking algorithm
- Caching strategy for hot users
- Handling celebrities (millions of followers)
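A common answer to the celebrity problem is a hybrid: push posts into follower inboxes for ordinary accounts (fan-out on write) and pull celebrity posts at read time (fan-out on read). The in-memory sketch below illustrates the idea. The class name `Feed`, the tiny `celebrity_threshold`, and the logical clock used instead of real timestamps are assumptions made for readability.

```python
import itertools
from collections import defaultdict


class Feed:
    def __init__(self, celebrity_threshold=2):
        self.threshold = celebrity_threshold
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.inbox = defaultdict(list)      # user -> precomputed feed items
        self.posts = defaultdict(list)      # author -> their own posts
        self._clock = itertools.count()     # logical timestamps

    def _is_celebrity(self, author):
        return len(self.followers[author]) >= self.threshold

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def post(self, author, text):
        item = (next(self._clock), author, text)
        self.posts[author].append(item)
        if not self._is_celebrity(author):
            # Fan-out on write: copy into every follower's inbox.
            for follower in self.followers[author]:
                self.inbox[follower].append(item)

    def timeline(self, user, limit=10):
        items = set(self.inbox[user])
        for author in self.following[user]:
            if self._is_celebrity(author):
                # Fan-out on read: merge celebrity posts at query time.
                items.update(self.posts[author])
        newest_first = sorted(items, reverse=True)[:limit]
        return [text for _, _, text in newest_first]


feed = Feed(celebrity_threshold=2)
feed.follow("alice", "bob")
feed.post("bob", "hi from bob")          # pushed to alice's inbox
feed.follow("alice", "star")
feed.follow("carol", "star")
feed.post("star", "hello from star")     # not pushed; pulled at read time
print(feed.timeline("alice"))
```

The set in `timeline` removes duplicates for authors who crossed the threshold after some of their posts had already been pushed.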
Chat Application
- WebSocket for real-time communication
- Message storage and retrieval
- Online presence tracking
- Group chat scaling
- Message delivery guarantees
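For delivery guarantees, one widely used pattern is at-least-once delivery combined with idempotent writes: clients retry until acknowledged, and the server deduplicates by a client-generated message ID. The sketch below shows that idea for a single conversation. The class name `Conversation` and the `client_msg_id` parameter are hypothetical names.

```python
class Conversation:
    """Ordered message log that tolerates client retries."""

    def __init__(self):
        self.messages = []    # list of (seq, sender, text), in order
        self._seen = {}       # client_msg_id -> seq already assigned

    def append(self, client_msg_id, sender, text):
        """Store a message once and return its sequence number."""
        if client_msg_id in self._seen:
            # A retry of a message we already stored: acknowledge again
            # with the original sequence number instead of duplicating it.
            return self._seen[client_msg_id]
        seq = len(self.messages) + 1
        self.messages.append((seq, sender, text))
        self._seen[client_msg_id] = seq
        return seq

    def since(self, last_seq):
        """Messages a reconnecting client has not seen yet."""
        return [m for m in self.messages if m[0] > last_seq]


chat = Conversation()
first = chat.append("m1", "alice", "hello")
retry = chat.append("m1", "alice", "hello")   # network retry, same ID
second = chat.append("m2", "bob", "hi")
print(first, retry, second, chat.since(1))
```

Per-conversation sequence numbers also give clients a simple way to resume after a disconnect: they send the last sequence number they saw and receive everything after it.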
Rate Limiter
- Token bucket vs sliding window algorithms
- Distributed rate limiting across multiple servers
- Where to place it (API gateway vs application layer)
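A token bucket is short enough to write out during the interview. The single-process sketch below is one way to do it. The class name `TokenBucket` and the injectable `clock` parameter are assumptions (the clock makes the behavior easy to demonstrate), and a distributed version would keep the token count in a shared store rather than in memory.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start full
        self.clock = clock
        self.last = clock()

    def allow(self, cost=1):
        """Return True and spend tokens if the request may proceed."""
        now = self.clock()
        # Refill based on elapsed time, never exceeding capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# A fake clock makes the example deterministic.
now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: now[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
now[0] = 1.0                        # one second later: one token refilled
later = [bucket.allow(), bucket.allow()]
print(burst, later)
```

The bucket allows a burst of up to `capacity` requests and then settles to `rate` requests per second, which is the main behavioral difference to contrast with a sliding window.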
Notification System
- Multiple channels (push, email, SMS)
- Priority queues
- User preferences and opt-outs
- Delivery guarantees and retries
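Priority queues and retries can be sketched with the standard library alone. In the example below, the priority names, the `NotificationQueue` class, the `deliver` function, and the backoff parameters (1 second base, 60 second cap, 4 attempts) are all hypothetical choices for illustration. `send` stands in for a real push, email, or SMS provider call.

```python
import heapq
import itertools
import time

PRIORITY = {"high": 0, "normal": 1, "low": 2}   # lower number pops first


class NotificationQueue:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker keeps FIFO order

    def push(self, priority, channel, payload):
        entry = (PRIORITY[priority], next(self._seq), channel, payload)
        heapq.heappush(self._heap, entry)

    def pop(self):
        _, _, channel, payload = heapq.heappop(self._heap)
        return channel, payload

    def __len__(self):
        return len(self._heap)


def deliver(send, channel, payload, max_attempts=4,
            base=1.0, cap=60.0, sleep=time.sleep):
    """Try send() up to max_attempts times with exponential backoff."""
    for attempt in range(max_attempts):
        if send(channel, payload):
            return True
        if attempt < max_attempts - 1:
            sleep(min(cap, base * 2 ** attempt))   # 1s, 2s, 4s, ...
    return False    # caller can move the message to a dead-letter queue


queue = NotificationQueue()
queue.push("low", "email", {"subject": "Weekly digest"})
queue.push("high", "sms", {"body": "Your login code"})
queue.push("normal", "push", {"title": "New follower"})
order = [queue.pop()[0] for _ in range(3)]
print(order)
```

Production systems usually add random jitter to the backoff delay so that many failed sends do not retry at the same instant.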
Trade-Offs to Discuss
Great candidates don’t just make choices — they explain what they’re giving up:
| Decision | Trade-off |
|---|---|
| SQL vs NoSQL | Consistency & joins vs horizontal scalability |
| Caching | Speed vs data freshness |
| Sync vs async | Simplicity vs throughput |
| Monolith vs microservices | Development speed vs independent scaling |
| Replication | Availability vs consistency (CAP theorem) |
| Denormalization | Read speed vs write complexity & storage |
Mistakes to Avoid
- Diving into details too early — Start broad, then go deep
- Not asking clarifying questions — The interviewer expects you to scope the problem
- Ignoring scale — “Just use a database” doesn’t work at 1M requests/sec
- Over-engineering — Don’t add Kafka and microservices to a system serving 100 users
- Single technology answers — “Just use Redis for everything” shows limited thinking
- Not discussing trade-offs — Every choice has downsides; acknowledge them
How to Practice
- Pick a system you use daily (Twitter, Uber, Spotify) and design it from scratch
- Set a 45-minute timer and practice the full framework
- Study real architectures — Engineering blogs from Netflix, Uber, and Stripe are gold
- Practice with a partner — Have them ask follow-up questions and push back on your choices
- Read “Designing Data-Intensive Applications” by Martin Kleppmann — it’s the single best resource for system design fundamentals
At Different Levels
Mid-level (3–5 years): Focus on a working design with reasonable component choices. Show you understand basic scaling patterns.
Senior (5–8 years): Drive the conversation. Proactively discuss trade-offs, failure modes, and evolution over time. Show depth in at least one area.
Staff+ (8+ years): Demonstrate cross-cutting concerns: observability, deployment strategy, team boundaries, migration paths from existing systems. Show you think about organizational constraints, not just technical ones.
Final Advice
System design interviews are conversations, not presentations. The interviewer is your collaborator — they’ll drop hints, ask leading questions, and steer you toward interesting areas. Listen carefully, think out loud, and don’t be afraid to say “I’m not sure about the best approach here, but my instinct is…” and then reason through it.
The best preparation is building real systems. If you’ve designed and operated production services, you already have the intuition — you just need to practice articulating it in an interview setting.