System Design Interviews: A step by step guide

The purpose of a system design interview is to test your ability to organize your thoughts and focus on what is important.

Ask as many questions as necessary to fully understand the problem!

  • Who is going to use it?
  • How are they going to use it?
  • How many users are there?
  • What does the system do?
  • What are the inputs and outputs of the system?
  • How much data do we expect to handle?
  • How many requests per second do we expect?
  • What is the expected read to write ratio?

Keep in mind that a system designed to handle 100,000 requests per second, each 1 KB in size, looks very different from a system designed for 3 requests per minute, each 2 GB in size, even though the two systems have the same data throughput (roughly 100 MB per second).
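The arithmetic behind that comparison can be checked with a short sketch. It assumes decimal units (1 KB = 1,000 bytes, 1 GB = 1,000,000,000 bytes) and a steady request rate; the function name is only illustrative.

```python
KB = 1_000            # bytes, decimal units assumed
GB = 1_000_000_000    # bytes, decimal units assumed


def throughput_bytes_per_second(requests, per_seconds, request_size_bytes):
    """Average data throughput of a steady stream of equally sized requests."""
    return requests * request_size_bytes / per_seconds


# 100,000 requests per second, 1 KB each
small_and_frequent = throughput_bytes_per_second(100_000, 1, 1 * KB)

# 3 requests per minute, 2 GB each
large_and_rare = throughput_bytes_per_second(3, 60, 2 * GB)

print(small_and_frequent)  # 100000000.0
print(large_and_rare)      # 100000000.0
```

Both workloads move about 100 MB per second, yet the first stresses request handling and connection counts, while the second stresses bandwidth per request and large-object storage.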

Step 1: Requirements clarifications

  • Functional: APIs, data types, etc.
  • Non-functional: availability, reliability, latency, daily active users, etc.

Step 2: System interface definition

Define what APIs are expected from the system.

This will not only establish the exact contract expected from the system, but will also confirm that we haven’t gotten any requirements wrong.
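As an illustration of writing down such a contract, here is a minimal sketch for a hypothetical URL-shortening service. The service, the method names, and the parameters are all assumptions chosen for the example, not part of any real API; the in-memory class exists only to show that the contract can be exercised.

```python
from typing import Dict, Optional, Protocol


class UrlShortener(Protocol):
    """Hypothetical contract for an illustrative URL-shortening service."""

    def create_short_url(self, user_id: str, original_url: str) -> str:
        """Return a short key that maps to original_url."""
        ...

    def resolve(self, short_key: str) -> Optional[str]:
        """Return the original URL, or None if the key is unknown."""
        ...


class InMemoryUrlShortener:
    """Toy implementation of the contract, for discussion only."""

    def __init__(self) -> None:
        self._urls: Dict[str, str] = {}

    def create_short_url(self, user_id: str, original_url: str) -> str:
        # Sequential hexadecimal keys: simple, but guessable (fine for a sketch).
        short_key = format(len(self._urls) + 1, "x")
        self._urls[short_key] = original_url
        return short_key

    def resolve(self, short_key: str) -> Optional[str]:
        return self._urls.get(short_key)


service = InMemoryUrlShortener()
key = service.create_short_url("user-1", "https://example.com/some/long/path")
print(key, service.resolve(key))
```

Writing the signatures down tends to surface open questions quickly, for example whether keys expire or whether the same URL should always map to the same key.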

Step 3: Back-of-the-envelope estimation

It is always a good idea to estimate the scale of the system we’re going to design.

This will also help later when we focus on scaling, partitioning, load balancing, and caching.
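A back-of-the-envelope estimate might look like the sketch below. Every input number is an assumption made up for the example (10 million daily active users, 2 writes per user per day, a 100:1 read to write ratio, 1 KB per record, 5 years of retention); in an interview these would come from the clarifying questions.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_scale(daily_active_users, writes_per_user_per_day,
                   reads_per_write, bytes_per_write, retention_days):
    """Rough daily averages only; peak traffic can be noticeably higher."""
    writes_per_day = daily_active_users * writes_per_user_per_day
    write_qps = writes_per_day / SECONDS_PER_DAY
    return {
        "write_qps": write_qps,
        "read_qps": write_qps * reads_per_write,
        "storage_bytes": writes_per_day * bytes_per_write * retention_days,
    }


estimate = estimate_scale(
    daily_active_users=10_000_000,  # assumed
    writes_per_user_per_day=2,      # assumed
    reads_per_write=100,            # assumed 100:1 read to write ratio
    bytes_per_write=1_000,          # assumed 1 KB per record
    retention_days=5 * 365,         # assumed 5 years of retention
)

print(round(estimate["write_qps"]))      # 231
print(round(estimate["read_qps"]))       # 23148
print(estimate["storage_bytes"] / 1e12)  # 36.5 (terabytes)
```

Under these assumptions the system is read-heavy (tens of thousands of reads per second against a few hundred writes), which already hints that caching and read replicas deserve attention later.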

Step 4: Defining data model

Defining the data model early will clarify how data will flow among the different components of the system.
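A first pass at a data model can be as simple as listing the entities and their fields. The sketch below uses a hypothetical URL-shortening service; the entity names and fields are assumptions for illustration, not a recommended schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class User:
    user_id: str          # primary key
    email: str
    created_at: datetime


@dataclass(frozen=True)
class ShortUrl:
    short_key: str        # primary key, looked up on every read
    original_url: str
    owner_id: str         # references User.user_id
    created_at: datetime


now = datetime.now(timezone.utc)
user = User(user_id="user-1", email="user@example.com", created_at=now)
link = ShortUrl(short_key="a1b2", original_url="https://example.com/long",
                owner_id=user.user_id, created_at=now)
print(link)
```

Even a sketch this small raises useful questions: which field each lookup uses, which records grow fastest, and whether a relational or key-value store fits the access pattern.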

Step 5: High-level design

Draw a block diagram with 5-6 boxes representing the core components of our system.

We should identify enough components to solve the actual problem end to end.

Step 6: Detailed design

Dig deeper into two or three components; the interviewer’s feedback should always guide which parts of the system need further discussion.

We should be able to present different approaches, discuss their pros and cons, and explain why we prefer one approach over the other.

Step 7: Identifying and resolving bottlenecks

Try to discuss as many bottlenecks as possible and different approaches to mitigate them:

  • Is there any single point of failure in our system? What are we doing to mitigate it?
  • Do we have enough replicas of the data so that if we lose a few servers we can still serve our users?
  • Similarly, do we have enough copies of different services running such that a few failures will not cause total system shutdown?
  • How are we monitoring the performance of our service? Do we get alerts whenever critical components fail or their performance degrades?
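The questions about single points of failure and replicas can be made concrete with a rough availability calculation. The sketch below assumes that failures are independent, which real systems often violate (shared racks, shared deployments), and the availability figures are made up for the example.

```python
import math


def series_availability(*availabilities: float) -> float:
    """Every component must be up, so the availabilities multiply."""
    return math.prod(availabilities)


def replicated_availability(availability: float, replicas: int) -> float:
    """At least one of n independent replicas must be up."""
    return 1 - (1 - availability) ** replicas


single_db = 0.99   # assumed availability of one database server
app_tier = 0.999   # assumed availability of the application tier

one_replica = replicated_availability(single_db, 1)
three_replicas = replicated_availability(single_db, 3)
whole_system = series_availability(app_tier, three_replicas)

print(one_replica, three_replicas, whole_system)
```

With these numbers, a single database server is up 99% of the time, while three independent replicas reach about 99.9999%. Once the database is replicated, the application tier becomes the limiting component, which is the kind of shifting bottleneck this step is meant to uncover.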