System Roles and Abstract Shapes

This chapter describes the deeper layer that unifies the build-from-scratch problems.

The previous chapter unified the vocabulary. This chapter unifies the abstract roles that those systems are actually made of.

The goal is to stop seeing ten separate interview problems and start seeing a small number of recurring system roles.

The Core Roles

1. Identity

This is the question:

Who or what is this state about?

Examples:

user ID
API key
IP address
tenant
connection ID
cache key

Why it matters:

Many backend systems are not global. They are per-identity systems.

If there is no identity, there is often no way to partition, track, or limit behavior cleanly.

2. Stored State

This is the question:

What do I need to remember between events?

Examples:

request count
cached value
retry count
last-seen timestamp
token count
active subscriptions

Why it matters:

Most backend systems are not just transformations of one input to one output. They depend on remembered state across time.

Rust type references:

when the state is keyed by identity, the default Rust type is often HashMap<K, V>
when the state is a single shared aggregate, it may just be a struct with fields
when ordering matters, BTreeMap<K, V> may be a better fit than HashMap<K, V>
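
As a minimal sketch of per-identity stored state, the block below keeps a request count per identity in a HashMap. The names RequestCounts, record, and get are made up for illustration, and String stands in for whatever identity type the real system uses.

```rust
use std::collections::HashMap;

/// Per-identity request counters. `String` is a stand-in for any identity
/// type (user ID, API key, IP address); the struct name is hypothetical.
#[derive(Default)]
struct RequestCounts {
    counts: HashMap<String, u64>,
}

impl RequestCounts {
    /// Records one request for `identity` and returns the updated count.
    fn record(&mut self, identity: &str) -> u64 {
        let count = self.counts.entry(identity.to_string()).or_insert(0);
        *count += 1;
        *count
    }

    /// Returns the count seen so far, or 0 for an identity never recorded.
    fn get(&self, identity: &str) -> u64 {
        self.counts.get(identity).copied().unwrap_or(0)
    }
}
```

The state survives between calls, which is the whole point of the role: each call to record is one event, and the map is what the system remembers across events.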

3. Pending Work

This is the question:

What has been accepted but not yet processed?

Examples:

queued jobs
pending retries
buffered log lines
events waiting to be handled

Why it matters:

Any time work does not happen instantly, a system needs a way to hold pending work.

Rust type references:

VecDeque<T> is the direct in-memory queue type
std::sync::mpsc channels model queued work across threads
tokio::sync::mpsc models queued work across async tasks
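
A minimal sketch of pending work held in a VecDeque, assuming a single-threaded caller. Job and JobQueue are illustrative names, and a real job would carry a payload rather than just an ID.

```rust
use std::collections::VecDeque;

/// A hypothetical job; a real one would carry a payload or a closure.
#[derive(Debug, PartialEq)]
struct Job {
    id: u32,
}

/// Pending work: accepted but not yet processed, kept in FIFO order.
#[derive(Default)]
struct JobQueue {
    pending: VecDeque<Job>,
}

impl JobQueue {
    /// Accepts a job and places it at the back of the queue.
    fn submit(&mut self, job: Job) {
        self.pending.push_back(job);
    }

    /// Takes the oldest pending job, if there is one.
    fn next(&mut self) -> Option<Job> {
        self.pending.pop_front()
    }

    /// Number of jobs accepted but not yet taken.
    fn len(&self) -> usize {
        self.pending.len()
    }
}
```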

4. Execution Unit

This is the question:

What actually performs the work?

Examples:

thread
async task
worker
background consumer

Why it matters:

Many design questions reduce to how pending work gets attached to an execution unit: how many units run, what each one pulls from, and when they stop.

5. Time Boundary

This is the question:

What changes as time passes?

Examples:

rate-limit window expiration
cache TTL
retry delay
rolling metrics window
token refill
flush interval

Why it matters:

Many backend systems are state plus time. Time is not just metadata; it changes the validity of the state.

Rust type references:

std::time::Instant is the right type for monotonic elapsed-time measurement
std::time::Duration represents intervals and limits
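
One way to see time changing the validity of state is a TTL check. In this sketch the Entry type and its methods are hypothetical, and the current time is passed in as a parameter (an assumption made so the logic can be tested without sleeping).

```rust
use std::time::{Duration, Instant};

/// A stored value plus the moment it was written. The type is illustrative.
struct Entry<V> {
    value: V,
    written_at: Instant,
}

impl<V> Entry<V> {
    fn new(value: V, now: Instant) -> Self {
        Entry { value, written_at: now }
    }

    /// Returns the value only while its age is strictly less than `ttl`.
    fn get(&self, now: Instant, ttl: Duration) -> Option<&V> {
        // saturating_duration_since yields zero if `now` is earlier than
        // `written_at`, so an out-of-order timestamp cannot panic here.
        if now.saturating_duration_since(self.written_at) < ttl {
            Some(&self.value)
        } else {
            None
        }
    }
}
```

The stored value never changes here; only the passage of time turns a hit into a miss.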

6. Capacity Boundary

This is the question:

What is allowed to grow, and what must stay bounded?

Examples:

queue length
pool size
cache size
number of retries
batch size

Why it matters:

Without capacity boundaries, queues, caches, and per-key maps can grow without limit, and that growth eventually shows up as a memory or latency problem.

Rust type references:

capacity often appears as a usize
bounded channels, bounded pools, and bounded caches all turn capacity into explicit program state
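
A small sketch of capacity as explicit state, using the standard library's bounded channel (std::sync::mpsc::sync_channel). The helper fill_bounded is a made-up name, and having no consumer drain the channel is an assumption chosen to make the bound visible.

```rust
use std::sync::mpsc::{sync_channel, Receiver, TrySendError};

/// Tries to enqueue every item into a channel that buffers at most
/// `capacity` items while nothing drains it. Returns the receiver (holding
/// the items that fit) and the items that were turned away.
fn fill_bounded(capacity: usize, items: Vec<u32>) -> (Receiver<u32>, Vec<u32>) {
    let (tx, rx) = sync_channel::<u32>(capacity);
    let mut rejected = Vec::new();
    for item in items {
        match tx.try_send(item) {
            Ok(()) => {}
            // The buffer is full: the item is handed back to the caller.
            Err(TrySendError::Full(item)) => rejected.push(item),
            // The receiver is gone: nothing can be delivered any more.
            Err(TrySendError::Disconnected(item)) => rejected.push(item),
        }
    }
    (rx, rejected)
}
```

try_send returns immediately when the buffer is full, while send on the same channel would block; which of the two fits is an admission decision, not a capacity one.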

7. Admission Boundary

This is the question:

What gets allowed in, and what gets rejected or delayed?

Examples:

rate limiter allow/reject
queue full / backpressure
connection acquire timeout
cache admission rules

Why it matters:

The system needs rules for deciding what enters and what does not.

8. Delivery Boundary

This is the question:

How does something move from one part of the system to another?

Examples:

channel send/receive
event publish/subscribe
queue dispatch
batch flush

Why it matters:

This is where ordering, fan-out, buffering, and backpressure show up.

Rust type references:

Sender<T> / Receiver<T> pairs are the standard typed delivery boundary in channel-based designs
VecDeque<T> serves as the local delivery boundary when work stays inside a single component
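
A minimal sketch of a channel-based delivery boundary with one worker thread. The function double_on_worker and the doubling "work" are placeholders; the part worth noticing is the handoff and the shutdown signal.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends `inputs` to one worker thread over a channel and collects the
/// doubled results in the order the worker produced them.
fn double_on_worker(inputs: Vec<u32>) -> Vec<u32> {
    let (job_tx, job_rx) = mpsc::channel::<u32>();
    let (result_tx, result_rx) = mpsc::channel::<u32>();

    // The worker owns the receiving end of the job channel.
    let worker = thread::spawn(move || {
        // This loop ends once every job sender has been dropped.
        for n in job_rx {
            result_tx.send(n * 2).expect("result receiver dropped");
        }
    });

    for n in inputs {
        job_tx.send(n).expect("worker exited early");
    }
    // Dropping the sender closes the channel, which tells the worker to stop.
    drop(job_tx);

    worker.join().expect("worker panicked");
    // The worker's result sender is gone, so this iterator terminates.
    result_rx.iter().collect()
}
```

With a single worker and a single channel, ordering is preserved; adding more workers would keep the delivery boundary but give up that ordering guarantee.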

9. Resource Reuse

This is the question:

What expensive thing should be reused instead of recreated?

Examples:

database connections
workers
pooled clients
cached values

Why it matters:

Reuse is often the difference between a toy design and a real backend design.

Rust type references:

reused resources are often stored in Vec<T>, VecDeque<T>, or maps keyed by identity depending on how they are acquired and returned
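
A minimal sketch of reuse, assuming a single-threaded pool backed by a Vec. Connection and Pool are hypothetical types, and nothing here opens a real connection.

```rust
/// A hypothetical expensive resource, such as a database connection.
#[derive(Debug, PartialEq)]
struct Connection {
    id: u32,
}

/// Holds idle connections so callers reuse them instead of creating new ones.
struct Pool {
    idle: Vec<Connection>,
}

impl Pool {
    /// Creates all connections up front; the pool never grows after this.
    fn with_connections(count: u32) -> Self {
        Pool {
            idle: (0..count).map(|id| Connection { id }).collect(),
        }
    }

    /// Hands out an idle connection, or None when all are in use.
    fn acquire(&mut self) -> Option<Connection> {
        self.idle.pop()
    }

    /// Returns a connection to the pool so it can be reused.
    fn release(&mut self, conn: Connection) {
        self.idle.push(conn);
    }

    /// Number of connections currently idle.
    fn available(&self) -> usize {
        self.idle.len()
    }
}
```

Returning None from acquire is the simplest policy; waiting with a timeout would be the same structure with an admission rule layered on top.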

10. Lifecycle State

This is the question:

What phases can this thing be in?

Examples:

pending / running / failed / succeeded
active / expired
available / acquired
subscribed / disconnected

Why it matters:

When the system has phases, modeling them explicitly reduces ambiguity.
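
One way to model phases explicitly is an enum plus a transition check. The transition table below is an assumption for illustration; in particular, letting Failed go back to Pending (a retry) is a policy choice, not something every system wants.

```rust
/// The phases a job can be in.
#[derive(Debug, Clone, Copy, PartialEq)]
enum JobState {
    Pending,
    Running,
    Succeeded,
    Failed,
}

impl JobState {
    /// Returns true when moving from `self` to `next` is a legal transition
    /// under this sketch's assumed rules.
    fn can_transition_to(self, next: JobState) -> bool {
        use JobState::*;
        matches!(
            (self, next),
            (Pending, Running)
                | (Running, Succeeded)
                | (Running, Failed)
                | (Failed, Pending) // assumed retry path
        )
    }
}
```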

Mapping the 10 Problems onto the Roles

1. Rate Limiter

identity: user, key, IP, token
stored state: count, current window, token count
time boundary: window expiration, refill timing
admission boundary: allow vs reject
capacity boundary: per-key memory growth
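
The roles above can be sketched as a fixed-window limiter. This is one possible design under stated assumptions: FixedWindowLimiter and its fields are made-up names, the current time is passed in by the caller, and idle keys are never removed, so the per-key memory growth is left unsolved on purpose.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Stored state for one identity: when its window began and how many
/// requests it has admitted since then.
struct Window {
    started_at: Instant,
    count: u32,
}

struct FixedWindowLimiter {
    limit: u32,
    window: Duration,
    windows: HashMap<String, Window>,
}

impl FixedWindowLimiter {
    fn new(limit: u32, window: Duration) -> Self {
        FixedWindowLimiter { limit, window, windows: HashMap::new() }
    }

    /// Admission boundary: returns true if the request is allowed.
    fn allow(&mut self, identity: &str, now: Instant) -> bool {
        let entry = self
            .windows
            .entry(identity.to_string())
            .or_insert(Window { started_at: now, count: 0 });

        // Time boundary: an expired window starts over with a zero count.
        if now.saturating_duration_since(entry.started_at) >= self.window {
            entry.started_at = now;
            entry.count = 0;
        }

        if entry.count < self.limit {
            entry.count += 1;
            true
        } else {
            false
        }
    }
}
```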

2. Worker Pool / Job Queue

pending work: queued jobs
execution unit: worker threads or async tasks
delivery boundary: queue to worker handoff
capacity boundary: queue size
lifecycle state: submitted, running, finished, shutdown

3. In-Memory Cache

identity: key
stored state: cached value
capacity boundary: max items
time boundary: TTL expiration
resource reuse: reused data instead of recomputation

4. Event Bus / Pub-Sub

delivery boundary: producer to subscribers
fan-out: one event to many consumers
stored state: subscriber list
lifecycle state: active or dead subscribers
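
A minimal fan-out sketch built on standard-library channels. EventBus is a hypothetical type with String events; treating a failed send as a dead subscriber and pruning it is one possible policy.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Stored state is the subscriber list; each subscriber is a channel sender.
struct EventBus {
    subscribers: Vec<Sender<String>>,
}

impl EventBus {
    fn new() -> Self {
        EventBus { subscribers: Vec::new() }
    }

    /// Registers a new subscriber and returns its receiving end.
    fn subscribe(&mut self) -> Receiver<String> {
        let (tx, rx) = channel();
        self.subscribers.push(tx);
        rx
    }

    /// Sends a copy of `event` to every subscriber. A send fails only when
    /// the receiver has been dropped, so failed senders are removed.
    /// Returns the number of subscribers still active.
    fn publish(&mut self, event: &str) -> usize {
        self.subscribers
            .retain(|tx| tx.send(event.to_string()).is_ok());
        self.subscribers.len()
    }
}
```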

5. Retry Queue / Task Scheduler

pending work: failed tasks waiting to retry
time boundary: retry delay
lifecycle state: pending, retrying, dead-lettered
admission boundary: max retries or drop policy

6. Rolling Metrics Aggregator

stored state: counters or buckets
time boundary: rolling windows
capacity boundary: retained history
aggregation: combine many events into summaries

7. Connection Pool

resource reuse: reusable connections
capacity boundary: pool size
admission boundary: acquire or timeout
lifecycle state: available, acquired, broken

8. LRU Cache

identity: key
stored state: cached values
capacity boundary: maximum cache size
lifecycle state: recently used vs old
eviction: remove least recently used

9. Token Bucket

identity: key
stored state: token count
time boundary: refill over time
admission boundary: consume token or reject
burst control: temporary spikes allowed within a bounded model
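
A minimal token bucket sketch for a single identity, assuming fractional tokens stored as f64 and a caller-supplied clock. TokenBucket and try_consume are illustrative names; a per-key version would keep one bucket per identity in a map.

```rust
use std::time::Instant;

struct TokenBucket {
    capacity: f64,       // burst size: the most tokens the bucket can hold
    tokens: f64,         // stored state: tokens available right now
    refill_per_sec: f64, // time boundary: tokens added per second
    last_refill: Instant,
}

impl TokenBucket {
    /// Starts full, so an initial burst of `capacity` requests is allowed.
    fn new(capacity: f64, refill_per_sec: f64, now: Instant) -> Self {
        TokenBucket { capacity, tokens: capacity, refill_per_sec, last_refill: now }
    }

    /// Refills based on elapsed time, then tries to spend one token.
    /// Returns true if the request is admitted.
    fn try_consume(&mut self, now: Instant) -> bool {
        let elapsed = now
            .saturating_duration_since(self.last_refill)
            .as_secs_f64();
        // Refill lazily, never exceeding the bucket's capacity.
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;

        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Refilling lazily on each call avoids a background timer: the bucket only needs the time of the last refill to know how many tokens have accrued since.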

10. Log Batcher / Buffered Writer

pending work: buffered log entries
delivery boundary: flush to sink
capacity boundary: batch size
time boundary: flush interval
throughput/latency trade-off: larger batches vs faster flush
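
A minimal sketch of the size-or-time flush rule. Batcher is a hypothetical type, and it makes a simplifying assumption: the flush interval is only checked when a new line arrives, whereas a real batcher would also need a timer to flush a quiet buffer.

```rust
use std::time::{Duration, Instant};

struct Batcher {
    buffer: Vec<String>,      // pending work: lines not yet flushed
    max_batch: usize,         // capacity boundary
    flush_interval: Duration, // time boundary
    last_flush: Instant,
}

impl Batcher {
    fn new(max_batch: usize, flush_interval: Duration, now: Instant) -> Self {
        Batcher { buffer: Vec::new(), max_batch, flush_interval, last_flush: now }
    }

    /// Buffers one line. Returns the whole batch when either the size limit
    /// or the flush interval has been reached, leaving the buffer empty.
    fn push(&mut self, line: String, now: Instant) -> Option<Vec<String>> {
        self.buffer.push(line);
        let full = self.buffer.len() >= self.max_batch;
        let stale = now.saturating_duration_since(self.last_flush) >= self.flush_interval;
        if full || stale {
            self.last_flush = now;
            // Delivery boundary: the caller writes the returned batch to the sink.
            Some(std::mem::take(&mut self.buffer))
        } else {
            None
        }
    }
}
```

Raising max_batch means fewer, larger writes to the sink, while shortening flush_interval means a line waits less time in the buffer; these are the two knobs behind the trade-off.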

The Deep Compression

Most backend interview problems reduce to combinations of these:

identity
stored state
pending work
execution
time
capacity
admission
delivery
reuse
lifecycle

This is a stronger unification than a shared list of data structures or code snippets.

How To Use This in an Interview

When you hear a new problem, ask:

What is the identity here?
What state must be remembered?
Is there pending work?
What is the execution model?
What changes over time?
What must stay bounded?
What gets admitted, delayed, or rejected?
How is work delivered?
What should be reused?
What lifecycle states exist?

If you answer those cleanly, the data structures usually follow naturally.

Very often the first Rust types that fall out of those answers are:

HashMap<K, V> for per-identity state
VecDeque<T> for pending ordered work
Sender<T> / Receiver<T> for delivery across workers or tasks
Instant and Duration for time-aware behavior
structs and enums for lifecycle and policy state

Short Interview Framing

I usually try to reduce backend problems to a few abstract roles: identity, stored state, pending work, execution, time boundaries, capacity boundaries, admission rules, delivery rules, reuse, and lifecycle state. Once those are clear, the implementation choices become much easier to reason about.

Bobby Interview Notes