Rust Types: Arc

This chapter covers Arc<T> as a Rust type in its own right.

Arc<T> stands for "atomically reference counted" shared ownership. It lets multiple owners hold the same value safely across threads.

The Core Idea

A clean way to say it is:

Arc<T> gives shared ownership with thread-safe reference counting.

Example:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let config: Arc<String> = Arc::new("shared".to_string());
let config2 = Arc::clone(&config);
}

Here:

  • both bindings point to the same underlying allocation
  • Arc::clone(&config) does not clone the inner String
  • it only increments the atomic reference count
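
You can watch this directly with Arc::strong_count. A minimal sketch, reusing the bindings from above:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let config: Arc<String> = Arc::new("shared".to_string());
assert_eq!(Arc::strong_count(&config), 1);

let config2 = Arc::clone(&config);
// the count went up; the String itself was never copied
assert_eq!(Arc::strong_count(&config), 2);

drop(config2);
assert_eq!(Arc::strong_count(&config), 1);
}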

Why Arc<T> Exists

Technically, Arc<T> can do everything Rc<T> can do. If you replaced every Rc<T> with Arc<T>, the program would still compile and behave correctly, just with unnecessary atomic overhead.

The reason Rust still has both is:

  • Rc<T> is cheaper for single-threaded shared ownership
  • Arc<T> pays an atomic coordination cost so it is safe across threads

Interview-safe summary:

Rc<T> and Arc<T> both model shared ownership. Rc<T> is the cheaper single-threaded form. Arc<T> is the thread-safe form and pays the atomic cost for that guarantee.
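
To make the thread-safety difference concrete, here is a minimal sketch that hands Arc clones to spawned threads. Swapping Arc for Rc here would not compile, because Rc<T> is not Send:

use std::sync::Arc;
use std::thread;

fn main() {
    let data = Arc::new(vec![1, 2, 3]);

    let handles: Vec<_> = (0..3)
        .map(|id| {
            let data = Arc::clone(&data);
            // each thread takes shared ownership of the same vector
            thread::spawn(move || println!("thread {id} sees {:?}", data))
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}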

Rc<T> vs Arc<T>

Rc<T>

Use Rc<T> when:

  • the data never leaves one thread
  • you want the lightest shared ownership form
  • the type should clearly be thread-local
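
For contrast, a minimal single-threaded sketch with Rc<T>. The API mirrors Arc<T>; only the counts are non-atomic:

#![allow(unused)]
fn main() {
use std::rc::Rc;

let node = Rc::new("thread-local".to_string());
let alias = Rc::clone(&node);
// same shape as Arc, but the count update is a plain integer increment
assert_eq!(Rc::strong_count(&node), 2);
}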

Arc<T>

Use Arc<T> when:

  • multiple threads need shared ownership
  • the data may cross thread boundaries
  • the type must participate in thread-safe sharing

A compact comparison:

  • Rc<T>: faster, single-thread only
  • Arc<T>: slower, cross-thread safe

Arc<T> Alone vs Arc<Mutex<T>>

Arc<T> by itself is for shared ownership. It does not automatically mean shared mutation.

Example:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let table: Arc<Vec<i32>> = Arc::new(vec![1, 2, 3]);
}

This is good when multiple threads only need to read shared data.

If the shared data must be mutated across threads, you usually compose Arc<T> with a synchronization primitive.

Example:

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex};

let counter: Arc<Mutex<u32>> = Arc::new(Mutex::new(0));
}

A good distinction is:

  • Arc<T> solves many owners
  • Mutex<T> solves coordinated mutation

So:

  • Arc<T> means shared ownership
  • Arc<Mutex<T>> means shared ownership plus synchronized mutation
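
A minimal sketch of the second form, with several threads incrementing one counter:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0u32));
    let mut handles = Vec::new();

    for _ in 0..4 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            // Arc shares ownership; Mutex serializes the mutation
            *counter.lock().unwrap() += 1;
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    assert_eq!(*counter.lock().unwrap(), 4);
}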

Why Arc::clone(&x) Is Preferred

You can write:

#![allow(unused)]
fn main() {
let b = a.clone();
}

But idiomatic Rust often prefers:

#![allow(unused)]
fn main() {
let b = Arc::clone(&a);
}

Why?

  • it makes the intent clearer
  • it signals cheap reference-count cloning
  • it does not look like a deep clone of the inner data

This matters because .clone() often looks expensive to the reader. Arc::clone(&a) says clearly: we are cloning the pointer-like owner, not the data itself.

Important Arc APIs

High-yield associated functions:

  • Arc::new(value) creates a new Arc<T>
  • Arc::clone(&x) increments the strong reference count
  • Arc::strong_count(&x) shows how many strong owners exist
  • Arc::ptr_eq(&a, &b) checks whether two Arcs point to the same allocation
  • Arc::get_mut(&mut x) gives &mut T only if x is the unique owner (no other Arc or Weak pointers to the allocation)
  • Arc::make_mut(&mut x) gives mutable access, cloning only if needed

Example:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let mut data = Arc::new(vec![1, 2, 3]);
let other = Arc::clone(&data);

let count = Arc::strong_count(&data);
let same_allocation = Arc::ptr_eq(&data, &other);
}
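
Arc::get_mut deserves its own small sketch, because the Some/None behavior is the whole point:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let mut data = Arc::new(vec![1, 2, 3]);
let other = Arc::clone(&data);

// another strong owner exists, so unique access is refused
assert!(Arc::get_mut(&mut data).is_none());

drop(other);

// unique again, so mutable access is allowed
Arc::get_mut(&mut data).unwrap().push(4);
assert_eq!(*data, vec![1, 2, 3, 4]);
}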

Arc::make_mut and Copy-on-Write

Arc::make_mut is important because it shows that Arc<T> is not only for permanent sharing. It also supports copy-on-write.

Example:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let mut data = Arc::new(vec![1, 2, 3]);
let other = Arc::clone(&data);

let mine = Arc::make_mut(&mut data);
mine.push(4);
}

If the strong count is greater than 1:

  • Rust allocates a new copy for data
  • your binding now points to its own private version
  • the other Arc still points to the old version

If the strong count is 1:

  • no clone happens
  • you get mutable access directly
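
Both branches can be observed with Arc::ptr_eq and a couple of asserts. A minimal sketch:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let mut data = Arc::new(vec![1, 2, 3]);
let other = Arc::clone(&data);

// shared: make_mut clones, so the two handles diverge
Arc::make_mut(&mut data).push(4);
assert!(!Arc::ptr_eq(&data, &other));
assert_eq!(*other, vec![1, 2, 3]);

// unique: no other owner, so mutation happens in place
Arc::make_mut(&mut data).push(5);
assert_eq!(*data, vec![1, 2, 3, 4, 5]);
}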

Interview-safe summary:

Arc::make_mut is copy-on-write. If the value is uniquely owned, mutate in place. If it is shared, clone first so mutation does not affect the other owners.

Why Copy-on-Write Matters

Copy-on-write is useful when:

  • many readers can share the same large value cheaply
  • writes are relatively rare
  • you want to delay cloning until mutation is actually needed

That makes it a good fit for:

  • large shared read-mostly data
  • versioned or branching state
  • situations where deep cloning up front would waste memory

A compact comparison:

  • Arc<Mutex<T>>: shared mutable state with locking
  • Arc<T> plus make_mut: shared read-mostly state with cloning only on write

Use the first when frequent mutation is the real model. Use the second when shared reads dominate and writes are occasional forks.
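
A minimal sketch of the second style, using a hypothetical versioned document: readers keep the version they started with, and a write forks a new version only at the moment of mutation:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let v1: Arc<Vec<String>> = Arc::new(vec!["line 1".to_string()]);
let reader = Arc::clone(&v1); // readers share v1 for free

// an occasional write forks a private new version
let mut v2 = Arc::clone(&v1);
Arc::make_mut(&mut v2).push("line 2".to_string());

assert_eq!(reader.len(), 1); // old version untouched
assert_eq!(v2.len(), 2);     // new version has the edit
}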

When To Reach For Arc<T>

Use Arc<T> when:

  • multiple threads need shared ownership
  • the shared data is mostly read-only
  • you want to avoid deep cloning large values up front

Use Rc<T> instead when:

  • the data stays on one thread
  • you want the lighter-weight shared ownership form

Use Arc<Mutex<T>> or Arc<RwLock<T>> when:

  • multiple threads must both share and mutate the same value

Interview-Safe Summary

Arc<T> is the thread-safe shared ownership type. It is heavier than Rc<T> because it uses atomic reference counting, but that cost buys safe sharing across threads. By itself, Arc<T> is often for shared read-only ownership; when mutation is required, it is usually paired with a synchronization primitive like Mutex<T>, or used with Arc::make_mut when copy-on-write fits the model.

Applied Problem: Live-Update Traffic Router

This is a good systems problem for Arc<T> because it combines:

  • thread-safe shared state
  • read-heavy access
  • low lock contention for readers
  • versioned updates without invalidating in-flight work

Problem shape

Imagine a network router with:

  • worker threads constantly routing packets from a RoutingTable
  • an admin thread that occasionally updates that routing table
  • a requirement that workers must finish their current packet with the version they started with
  • a constraint that reads should not sit behind a long-held Mutex

This is a classic snapshotting pattern, often described as read-copy-update intuition.

Why Arc<T> fits

The key idea is:

  • readers take a cheap Arc snapshot of the current table
  • once they hold that Arc, they can keep using that version without further locking
  • the writer publishes a new Arc when an update happens
  • the old version stays alive until the last reader drops its snapshot

That solves three problems at once:

  • readers do very little work under the lock
  • each worker sees a stable table for the duration of its task
  • memory cleanup happens automatically when old snapshots are no longer used

Example

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone)]
struct RoutingTable {
    version: u32,
    blocked_ips: Vec<String>,
}

struct Router {
    active_config: Mutex<Arc<RoutingTable>>,
}

impl Router {
    pub fn new(version: u32, blocked_ips: Vec<String>) -> Self {
        let table = RoutingTable { version, blocked_ips };
        Self {
            active_config: Mutex::new(Arc::new(table)),
        }
    }

    pub fn get_config(&self) -> Arc<RoutingTable> {
        let guard = self.active_config.lock().unwrap();
        Arc::clone(&*guard)
    }

    pub fn add_blocked_ip(&self, ip: String) {
        let mut guard = self.active_config.lock().unwrap();
        let table = Arc::make_mut(&mut guard);

        table.blocked_ips.push(ip.clone());
        table.version += 1;

        println!("--- Admin: Blocked {} (Version {}) ---", ip, table.version);
    }
}

fn main() {
    let router = Arc::new(Router::new(1, vec!["192.168.1.1".to_string()]));

    for worker_id in 0..3 {
        let router_ptr = Arc::clone(&router);
        thread::spawn(move || loop {
            let current_snapshot = router_ptr.get_config();
            println!(
                "Worker {} routing with Version {}: {:?}",
                worker_id, current_snapshot.version, current_snapshot.blocked_ips
            );

            thread::sleep(Duration::from_millis(800));
        });
    }

    thread::sleep(Duration::from_secs(1));
    router.add_blocked_ip("10.0.0.1".to_string());

    thread::sleep(Duration::from_secs(1));
    router.add_blocked_ip("172.16.0.5".to_string());

    thread::sleep(Duration::from_secs(2));
    println!("Final State: {:?}", router.get_config());
}

The type shape that matters

The most important type here is the router's field:

active_config: Mutex<Arc<RoutingTable>>

A production-oriented variant of the same design swaps the Mutex-wrapped pointer for ArcSwap from the arc_swap crate, so readers load the current snapshot without taking any lock at all, while a dedicated write gate serializes writers and an AtomicBool flag gives the workers a clean shutdown:
use arc_swap::ArcSwap;
use std::sync::{Arc, Mutex, MutexGuard};
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone)]
struct RoutingTable {
    version: u32,
    blocked_ips: Vec<String>,
}

struct Router {
    active_config: ArcSwap<RoutingTable>,
    write_gate: Mutex<()>,
}

impl Router {
    pub fn new(version: u32, blocked_ips: Vec<String>) -> Self {
        let table = RoutingTable { version, blocked_ips };
        Self {
            active_config: ArcSwap::from_pointee(table),
            write_gate: Mutex::new(()),
        }
    }

    pub fn get_config(&self) -> Arc<RoutingTable> {
        self.active_config.load_full()
    }

    pub fn add_blocked_ip(&self, ip: String) {
        let _guard = self.lock_write_gate();

        let mut next = (*self.active_config.load_full()).clone();
        next.blocked_ips.push(ip.clone());
        next.version += 1;

        println!("--- Admin: Blocked {} (Version {}) ---", ip, next.version);
        self.active_config.store(Arc::new(next));
    }

    fn lock_write_gate(&self) -> MutexGuard<'_, ()> {
        match self.write_gate.lock() {
            Ok(guard) => guard,
            Err(poisoned) => {
                eprintln!("write gate was poisoned; recovering inner state");
                poisoned.into_inner()
            }
        }
    }
}

fn main() {
    let router = Arc::new(Router::new(1, vec!["192.168.1.1".to_string()]));
    let running = Arc::new(AtomicBool::new(true));
    let mut handles = Vec::new();

    for worker_id in 0..3 {
        let router_ptr = Arc::clone(&router);
        let running_flag = Arc::clone(&running);
        handles.push(thread::spawn(move || {
            while running_flag.load(Ordering::Acquire) {
                let current_snapshot = router_ptr.get_config();
                println!(
                    "Worker {} routing with Version {}: {:?}",
                    worker_id, current_snapshot.version, current_snapshot.blocked_ips
                );

                thread::sleep(Duration::from_millis(800));
            }
        }));
    }

    thread::sleep(Duration::from_secs(1));
    router.add_blocked_ip("10.0.0.1".to_string());

    thread::sleep(Duration::from_secs(1));
    router.add_blocked_ip("172.16.0.5".to_string());

    thread::sleep(Duration::from_secs(2));
    running.store(false, Ordering::Release);

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final State: {:?}", router.get_config());
}

This can look redundant at first, but each layer has a different job:

  • the outer Arc shares the router itself across threads
  • the Mutex (or the ArcSwap plus write gate) protects swapping the published pointer
  • the inner Arc is the current published snapshot of the routing table

The important performance idea is that the lock protects pointer publication, not long-lived reading of the table itself.

Reader-side intuition

A worker does this:

  1. lock briefly
  2. clone the inner Arc<RoutingTable>
  3. unlock immediately
  4. use the snapshot without further synchronization

That means that once a snapshot is acquired, reading involves no locking at all; the only lock is the brief one taken during acquisition.
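
The reader path can be distilled into one small function. A minimal sketch, assuming a RoutingTable like the one above:

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex};

struct RoutingTable { version: u32 }

fn snapshot(published: &Mutex<Arc<RoutingTable>>) -> Arc<RoutingTable> {
    let guard = published.lock().unwrap(); // 1. lock briefly
    Arc::clone(&*guard)                    // 2. clone the inner Arc
    // 3. the guard drops on return, releasing the lock
}

// 4. the caller keeps using the snapshot with no further locking
let table = Mutex::new(Arc::new(RoutingTable { version: 1 }));
let snap = snapshot(&table);
assert_eq!(snap.version, 1);
}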

A useful clarification about the example: the worker-side thread::sleep(Duration::from_millis(800)) is there mainly to simulate real packet-processing work and to make the version changes easier to observe in the output. It is not required for correctness. Without that sleep, the workers still take snapshots correctly, but they loop much faster, produce much noisier logs, and hold each snapshot for a much shorter time. The example would still show version publication, but it would less clearly show the stronger idea: an in-flight unit of work can keep using an older snapshot while newer work has already moved on to a newer one.

Writer-side intuition

The admin does this:

  1. take the write lock
  2. get mutable access to a fresh version, either with Arc::make_mut on the published handle or by cloning the current table
  3. apply the mutation to that private version
  4. publish the new Arc and release the lock

If readers are still holding the old version, Arc::make_mut clones the table first so they are unaffected. If no one else is holding it, mutation happens in place.

What this pattern guarantees

  • a worker never sees the table change halfway through its own task
  • new workers observe the newly published version
  • old versions stay alive exactly as long as some worker still holds them
  • cleanup happens automatically when the last Arc to an old version is dropped

Interview-safe summary

This pattern uses Arc as a snapshotting mechanism. Readers clone a cheap shared pointer to the current version and then work without holding the lock. Writers publish a new Arc when they update the table, and Arc::make_mut gives copy-on-write behavior so in-flight readers can finish on the old version safely.

All let-bound types in the router example

These are the main let bindings in the Mutex-based example and the types they hold:

  • table (in Router::new): RoutingTable
  • router: std::sync::Arc<Router>
  • router_ptr: std::sync::Arc<Router>
  • current_snapshot: std::sync::Arc<RoutingTable>
  • guard (in get_config): std::sync::MutexGuard<'_, std::sync::Arc<RoutingTable>>
  • guard (in add_blocked_ip): std::sync::MutexGuard<'_, std::sync::Arc<RoutingTable>>
  • table (from Arc::make_mut): &mut RoutingTable

The ArcSwap variant adds:

  • running: std::sync::Arc<std::sync::atomic::AtomicBool>
  • handles: std::vec::Vec<std::thread::JoinHandle<()>>
  • running_flag: std::sync::Arc<std::sync::atomic::AtomicBool>
  • next: RoutingTable
  • _guard: std::sync::MutexGuard<'_, ()>

Loop bindings also introduce values even though they are not written with a separate let line:

  • worker_id: inferred integer from 0..3
  • handle: std::thread::JoinHandle<()> from iterating handles

The main standard-library types behind this example are:

  • std::sync::Arc<T>
  • std::sync::Mutex<T>
  • std::sync::MutexGuard<'a, T>
  • std::sync::atomic::AtomicBool
  • std::thread::JoinHandle<T>
  • std::vec::Vec<T>

More Advanced Applied Problem: Sharded 2PC With Arc

The live-update router example is about publishing new versions safely. A more advanced version of the same idea is a sharded database transaction.

Here the problem is not only snapshotting. It is atomic cross-shard updates.

Imagine you want to transfer money from one key in Shard A to another key in Shard B. If one side updates and the other fails, the system is inconsistent. So the design needs a prepare phase and a commit phase.

Core strategy

A compact way to model it with Arc is:

  1. lock the involved shards in a fixed order
  2. clone the current Arc snapshots for those shards
  3. use Arc::make_mut to stage the changes in private copies
  4. if validation fails, drop the staged copies and return an error
  5. if validation succeeds, swap the master pointers to publish the new versions

That gives a useful property:

rollback is mostly free, because failed staged copies are simply dropped before publication.

Synchronous version

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type ShardData = HashMap<String, i32>;
type Shard = Arc<Mutex<Arc<ShardData>>>;

struct ShardedDb {
    shards: Vec<Shard>,
}

impl ShardedDb {
    fn atomic_transfer(&self, key_a: &str, key_b: &str, amount: i32) -> Result<(), &'static str> {
        let idx_a = key_a.len() % self.shards.len();
        let idx_b = key_b.len() % self.shards.len();

        let (first_idx, second_idx) = if idx_a < idx_b {
            (idx_a, idx_b)
        } else {
            (idx_b, idx_a)
        };

        // Lock in a fixed global order (lowest shard index first) to prevent deadlock.
        let mut guard_1 = self.shards[first_idx].lock().unwrap();
        let maybe_guard_2 = if first_idx != second_idx {
            Some(self.shards[second_idx].lock().unwrap())
        } else {
            None
        };

        if first_idx == second_idx {
            // Same shard: both updates must land in one staged copy.
            let mut staged = Arc::clone(&guard_1);

            {
                let map = Arc::make_mut(&mut staged);
                let balance_a = map.get_mut(key_a).ok_or("Source not found")?;
                if *balance_a < amount {
                    return Err("Insufficient funds");
                }
                *balance_a -= amount;
            }

            {
                let map = Arc::make_mut(&mut staged);
                let balance_b = map.entry(key_b.to_string()).or_insert(0);
                *balance_b += amount;
            }

            *guard_1 = staged;
            return Ok(());
        }

        // Map the ordered guards back to the keys they actually cover, so the
        // debit is staged against key_a's shard and the credit against key_b's.
        let guard_2 = maybe_guard_2.expect("distinct shards are both locked");
        let (mut guard_a, mut guard_b) = if idx_a == first_idx {
            (guard_1, guard_2)
        } else {
            (guard_2, guard_1)
        };

        // Prepare phase: stage the changes in private copies; nothing is visible
        // yet, and dropping the staged copies on any early return is the rollback.
        let mut staged_a_arc = Arc::clone(&guard_a);
        let mut staged_b_arc = Arc::clone(&guard_b);

        {
            let map_a = Arc::make_mut(&mut staged_a_arc);
            let balance_a = map_a.get_mut(key_a).ok_or("Source not found")?;
            if *balance_a < amount {
                return Err("Insufficient funds");
            }
            *balance_a -= amount;
        }

        {
            let map_b = Arc::make_mut(&mut staged_b_arc);
            let balance_b = map_b.entry(key_b.to_string()).or_insert(0);
            *balance_b += amount;
        }

        // Commit phase: publish both snapshots while still holding both locks.
        *guard_a = staged_a_arc;
        *guard_b = staged_b_arc;

        Ok(())
    }
}
}

Why this is more advanced

This version forces you to reason about:

  • lock ordering to prevent deadlock
  • same-shard transfers, where both updates must land in one staged copy
  • prepare versus commit phases
  • rollback by dropping unpublished staged copies
  • cloning only the dirty shards instead of the whole database
  • readers continuing on old snapshots while writers stage a new version

Async Tokio version

A final step up is when the commit path must cross an .await, for example to write a log record before publishing the new state. In that case, std::sync::Mutex is usually the wrong tool if the lock must be held across an .await, because it blocks the executor thread.

That is why the async version uses tokio::sync::Mutex.

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::Mutex;
use tokio::time::{sleep, Duration};

type ShardData = HashMap<String, i32>;
type Shard = Arc<Mutex<Arc<ShardData>>>;

struct AsyncShardedDb {
    shards: Vec<Shard>,
}

impl AsyncShardedDb {
    fn new(num_shards: usize) -> Self {
        let mut shards = Vec::new();
        for _ in 0..num_shards {
            shards.push(Arc::new(Mutex::new(Arc::new(HashMap::new()))));
        }
        Self { shards }
    }

    async fn atomic_transfer(&self, from: &str, to: &str, amount: i32) -> Result<(), &'static str> {
        let idx_a = from.len() % self.shards.len();
        let idx_b = to.len() % self.shards.len();
        let (first, second) = if idx_a < idx_b { (idx_a, idx_b) } else { (idx_b, idx_a) };

        // Lock in a fixed global order to prevent deadlock across tasks.
        let mut guard_1 = self.shards[first].lock().await;
        let maybe_guard_2 = if first != second {
            Some(self.shards[second].lock().await)
        } else {
            None
        };

        if first == second {
            // Same shard: both updates land in one staged copy.
            let mut staged = Arc::clone(&guard_1);

            {
                let map = Arc::make_mut(&mut staged);
                let bal_a = map.get_mut(from).ok_or("Source missing")?;
                if *bal_a < amount {
                    return Err("Insufficient funds");
                }
                *bal_a -= amount;
            }

            {
                let map = Arc::make_mut(&mut staged);
                let bal_b = map.entry(to.to_string()).or_insert(0);
                *bal_b += amount;
            }

            // A tokio::sync::Mutex guard may be held across .await.
            simulate_disk_io().await;
            *guard_1 = staged;
            return Ok(());
        }

        // Map the ordered guards back to the keys they actually cover.
        let guard_2 = maybe_guard_2.expect("distinct shards are both locked");
        let (mut guard_a, mut guard_b) = if idx_a == first {
            (guard_1, guard_2)
        } else {
            (guard_2, guard_1)
        };

        // Prepare phase: stage changes privately; dropping them is the rollback.
        let mut staged_a = Arc::clone(&guard_a);
        let mut staged_b = Arc::clone(&guard_b);

        {
            let map_a = Arc::make_mut(&mut staged_a);
            let bal_a = map_a.get_mut(from).ok_or("Source missing")?;
            if *bal_a < amount {
                return Err("Insufficient funds");
            }
            *bal_a -= amount;
        }

        {
            let map_b = Arc::make_mut(&mut staged_b);
            let bal_b = map_b.entry(to.to_string()).or_insert(0);
            *bal_b += amount;
        }

        // Write the log record before publishing, while the commit gate is held.
        simulate_disk_io().await;

        // Commit phase: publish both snapshots.
        *guard_a = staged_a;
        *guard_b = staged_b;

        Ok(())
    }
}

async fn simulate_disk_io() {
    sleep(Duration::from_millis(10)).await;
}
}

What the async version adds

Now the design has to satisfy all the earlier Arc concerns plus:

  • async lock acquisition with .await
  • holding the commit gate across an async logging step
  • avoiding a blocking std::sync::Mutex guard across .await
  • ensuring the shared state is Send + Sync so Tokio can move tasks between worker threads

A clean way to summarize the type stack is:

  • outer Arc: share the database or shard handle across tasks
  • async Mutex: serialize commit access even across .await
  • inner Arc: let readers and staged writers work from stable snapshots

Interview-safe summary

The router example shows snapshot publication. The sharded database version adds transactional staging and atomic commit. Arc::make_mut gives a private sandbox for updates, pointer swapping publishes the new version, and failed staged changes are dropped for rollback. In async Rust, the same pattern extends naturally, but the commit gate usually becomes tokio::sync::Mutex so waiting for locks or I/O does not block the executor thread.