Rust Types: Arc

This chapter covers Arc<T> as a Rust type in its own right.

Arc<T> stands for "atomically reference counted" shared ownership. It lets multiple owners hold the same value safely across threads.

The Core Idea

A clean way to say it is:

Arc<T> gives shared ownership with thread-safe reference counting.

Example:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let config: Arc<String> = Arc::new("shared".to_string());
let config2 = Arc::clone(&config);
}

Here:

  • both bindings point to the same underlying allocation
  • Arc::clone(&config) does not clone the inner String
  • it only increments the atomic reference count
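
You can watch this directly with Arc::strong_count. A minimal sketch, reusing the bindings from above:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let config: Arc<String> = Arc::new("shared".to_string());
assert_eq!(Arc::strong_count(&config), 1);

let config2 = Arc::clone(&config);
// the count went up; the String itself was never copied
assert_eq!(Arc::strong_count(&config), 2);

drop(config2);
assert_eq!(Arc::strong_count(&config), 1);
}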

Why Arc<T> Exists

Technically, Arc<T> can do everything Rc<T> can do. If you replaced every Rc<T> with Arc<T>, the program would still compile and behave correctly, just with unnecessary atomic overhead.

The reason Rust still has both is:

  • Rc<T> is cheaper for single-threaded shared ownership
  • Arc<T> pays an atomic coordination cost so it is safe across threads

Interview-safe summary:

Rc<T> and Arc<T> both model shared ownership. Rc<T> is the cheaper single-threaded form. Arc<T> is the thread-safe form and pays the atomic cost for that guarantee.
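
To make the thread-safety difference concrete, here is a minimal sketch that hands Arc clones to spawned threads. Swapping Arc for Rc here would not compile, because Rc<T> is not Send:

use std::sync::Arc;
use std::thread;

fn main() {
    let data = Arc::new(vec![1, 2, 3]);

    let handles: Vec<_> = (0..3)
        .map(|id| {
            let data = Arc::clone(&data);
            // each thread takes shared ownership of the same vector
            thread::spawn(move || println!("thread {id} sees {:?}", data))
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}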

Rc<T> vs Arc<T>

Rc<T>

Use Rc<T> when:

  • the data never leaves one thread
  • you want the lightest shared ownership form
  • the type should clearly be thread-local
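
For contrast, a minimal single-threaded sketch with Rc<T>. The API mirrors Arc<T>; only the counts are non-atomic:

#![allow(unused)]
fn main() {
use std::rc::Rc;

let node = Rc::new("thread-local".to_string());
let alias = Rc::clone(&node);
// same shape as Arc, but the count update is a plain integer increment
assert_eq!(Rc::strong_count(&node), 2);
}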

Arc<T>

Use Arc<T> when:

  • multiple threads need shared ownership
  • the data may cross thread boundaries
  • the type must participate in thread-safe sharing

A compact comparison:

  • Rc<T>: faster, single-thread only
  • Arc<T>: slower, cross-thread safe

Arc<T> Alone vs Arc<Mutex<T>>

Arc<T> by itself is for shared ownership. It does not automatically mean shared mutation.

Example:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let table: Arc<Vec<i32>> = Arc::new(vec![1, 2, 3]);
}

This is good when multiple threads only need to read shared data.

If the shared data must be mutated across threads, you usually compose Arc<T> with a synchronization primitive.

Example:

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex};

let counter: Arc<Mutex<u32>> = Arc::new(Mutex::new(0));
}

A good distinction is:

  • Arc<T> solves many owners
  • Mutex<T> solves coordinated mutation

So:

  • Arc<T> means shared ownership
  • Arc<Mutex<T>> means shared ownership plus synchronized mutation
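
A minimal sketch of the second form, with several threads incrementing one counter:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0u32));
    let mut handles = Vec::new();

    for _ in 0..4 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            // Arc shares ownership; Mutex serializes the mutation
            *counter.lock().unwrap() += 1;
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    assert_eq!(*counter.lock().unwrap(), 4);
}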

Why Arc::clone(&x) Is Preferred

You can write:

#![allow(unused)]
fn main() {
let b = a.clone();
}

But idiomatic Rust often prefers:

#![allow(unused)]
fn main() {
let b = Arc::clone(&a);
}

Why?

  • it makes the intent clearer
  • it signals cheap reference-count cloning
  • it does not look like a deep clone of the inner data

This matters because .clone() often looks expensive to the reader. Arc::clone(&a) says clearly: we are cloning the pointer-like owner, not the data itself.

Important Arc APIs

High-yield associated functions:

  • Arc::new(value) creates a new Arc<T>
  • Arc::clone(&x) increments the strong reference count
  • Arc::strong_count(&x) shows how many strong owners exist
  • Arc::ptr_eq(&a, &b) checks whether two Arcs point to the same allocation
  • Arc::get_mut(&mut x) gives &mut T only if x is the unique owner (no other Arc or Weak pointers to the allocation)
  • Arc::make_mut(&mut x) gives mutable access, cloning only if needed

Example:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let mut data = Arc::new(vec![1, 2, 3]);
let other = Arc::clone(&data);

let count = Arc::strong_count(&data);
let same_allocation = Arc::ptr_eq(&data, &other);
}
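
Arc::get_mut deserves its own small sketch, because the Some/None behavior is the whole point:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let mut data = Arc::new(vec![1, 2, 3]);
let other = Arc::clone(&data);

// another strong owner exists, so unique access is refused
assert!(Arc::get_mut(&mut data).is_none());

drop(other);

// unique again, so mutable access is allowed
Arc::get_mut(&mut data).unwrap().push(4);
assert_eq!(*data, vec![1, 2, 3, 4]);
}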

Arc::make_mut and Copy-on-Write

Arc::make_mut is important because it shows that Arc<T> is not only for permanent sharing. It also supports copy-on-write.

Example:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let mut data = Arc::new(vec![1, 2, 3]);
let other = Arc::clone(&data);

let mine = Arc::make_mut(&mut data);
mine.push(4);
}

If the strong count is greater than 1:

  • Rust allocates a new copy for data
  • your binding now points to its own private version
  • the other Arc still points to the old version

If the strong count is 1:

  • no clone happens
  • you get mutable access directly
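
Both branches can be observed with Arc::ptr_eq and a couple of asserts. A minimal sketch:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let mut data = Arc::new(vec![1, 2, 3]);
let other = Arc::clone(&data);

// shared: make_mut clones, so the two handles diverge
Arc::make_mut(&mut data).push(4);
assert!(!Arc::ptr_eq(&data, &other));
assert_eq!(*other, vec![1, 2, 3]);

// unique: no other owner, so mutation happens in place
Arc::make_mut(&mut data).push(5);
assert_eq!(*data, vec![1, 2, 3, 4, 5]);
}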

Interview-safe summary:

Arc::make_mut is copy-on-write. If the value is uniquely owned, mutate in place. If it is shared, clone first so mutation does not affect the other owners.

Why Copy-on-Write Matters

Copy-on-write is useful when:

  • many readers can share the same large value cheaply
  • writes are relatively rare
  • you want to delay cloning until mutation is actually needed

That makes it a good fit for:

  • large shared read-mostly data
  • versioned or branching state
  • situations where deep cloning up front would waste memory

A compact comparison:

  • Arc<Mutex<T>>: shared mutable state with locking
  • Arc<T> plus make_mut: shared read-mostly state with cloning only on write

Use the first when frequent mutation is the real model. Use the second when shared reads dominate and writes are occasional forks.
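
A minimal sketch of the second style, using a hypothetical versioned document: readers keep the version they started with, and a write forks a new version only at the moment of mutation:

#![allow(unused)]
fn main() {
use std::sync::Arc;

let v1: Arc<Vec<String>> = Arc::new(vec!["line 1".to_string()]);
let reader = Arc::clone(&v1); // readers share v1 for free

// an occasional write forks a private new version
let mut v2 = Arc::clone(&v1);
Arc::make_mut(&mut v2).push("line 2".to_string());

assert_eq!(reader.len(), 1); // old version untouched
assert_eq!(v2.len(), 2);     // new version has the edit
}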

When To Reach For Arc<T>

Use Arc<T> when:

  • multiple threads need shared ownership
  • the shared data is mostly read-only
  • you want to avoid deep cloning large values up front

Use Rc<T> instead when:

  • the data stays on one thread
  • you want the lighter-weight shared ownership form

Use Arc<Mutex<T>> or Arc<RwLock<T>> when:

  • multiple threads must both share and mutate the same value

Interview-Safe Summary

Arc<T> is the thread-safe shared ownership type. It is heavier than Rc<T> because it uses atomic reference counting, but that cost buys safe sharing across threads. By itself, Arc<T> is often for shared read-only ownership; when mutation is required, it is usually paired with a synchronization primitive like Mutex<T>, or used with Arc::make_mut when copy-on-write fits the model.

Applied Problem: Live-Update Traffic Router

This is a good systems problem for Arc<T> because it combines:

  • thread-safe shared state
  • read-heavy access
  • low lock contention for readers
  • versioned updates without invalidating in-flight work

Problem shape

Imagine a network router with:

  • worker threads constantly routing packets from a RoutingTable
  • an admin thread that occasionally updates that routing table
  • a requirement that workers must finish their current packet with the version they started with
  • a constraint that reads should not sit behind a long-held Mutex

This is a classic snapshotting pattern, often described as read-copy-update intuition.

Why Arc<T> fits

The key idea is:

  • readers take a cheap Arc snapshot of the current table
  • once they hold that Arc, they can keep using that version without further locking
  • the writer publishes a new Arc when an update happens
  • the old version stays alive until the last reader drops its snapshot

That solves three problems at once:

  • readers do very little work under the lock
  • each worker sees a stable table for the duration of its task
  • memory cleanup happens automatically when old snapshots are no longer used

Example

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone)]
struct RoutingTable {
    version: u32,
    blocked_ips: Vec<String>,
}

struct Router {
    active_config: Mutex<Arc<RoutingTable>>,
}

impl Router {
    pub fn new(version: u32, blocked_ips: Vec<String>) -> Self {
        let table = RoutingTable { version, blocked_ips };
        Self {
            active_config: Mutex::new(Arc::new(table)),
        }
    }

    pub fn get_config(&self) -> Arc<RoutingTable> {
        let guard = self.active_config.lock().unwrap();
        Arc::clone(&*guard)
    }

    pub fn add_blocked_ip(&self, ip: String) {
        let mut guard = self.active_config.lock().unwrap();
        let table = Arc::make_mut(&mut guard);

        table.blocked_ips.push(ip.clone());
        table.version += 1;

        println!("--- Admin: Blocked {} (Version {}) ---", ip, table.version);
    }
}

fn main() {
    let router = Arc::new(Router::new(1, vec!["192.168.1.1".to_string()]));

    for worker_id in 0..3 {
        let router_ptr = Arc::clone(&router);
        thread::spawn(move || loop {
            let current_snapshot = router_ptr.get_config();
            println!(
                "Worker {} routing with Version {}: {:?}",
                worker_id, current_snapshot.version, current_snapshot.blocked_ips
            );

            thread::sleep(Duration::from_millis(800));
        });
    }

    thread::sleep(Duration::from_secs(1));
    router.add_blocked_ip("10.0.0.1".to_string());

    thread::sleep(Duration::from_secs(1));
    router.add_blocked_ip("172.16.0.5".to_string());

    thread::sleep(Duration::from_secs(2));
    println!("Final State: {:?}", router.get_config());
}

The type shape that matters

The most important type here is the router's field:

active_config: Mutex<Arc<RoutingTable>>

A production-oriented variant of the same design swaps the Mutex-wrapped pointer for ArcSwap from the arc_swap crate, so readers load the current snapshot without taking any lock at all, while a dedicated write gate serializes writers and an AtomicBool flag gives the workers a clean shutdown:
use arc_swap::ArcSwap;
use std::sync::{Arc, Mutex, MutexGuard};
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone)]
struct RoutingTable {
    version: u32,
    blocked_ips: Vec<String>,
}

struct Router {
    active_config: ArcSwap<RoutingTable>,
    write_gate: Mutex<()>,
}

impl Router {
    pub fn new(version: u32, blocked_ips: Vec<String>) -> Self {
        let table = RoutingTable { version, blocked_ips };
        Self {
            active_config: ArcSwap::from_pointee(table),
            write_gate: Mutex::new(()),
        }
    }

    pub fn get_config(&self) -> Arc<RoutingTable> {
        self.active_config.load_full()
    }

    pub fn add_blocked_ip(&self, ip: String) {
        let _guard = self.lock_write_gate();

        let mut next = (*self.active_config.load_full()).clone();
        next.blocked_ips.push(ip.clone());
        next.version += 1;

        println!("--- Admin: Blocked {} (Version {}) ---", ip, next.version);
        self.active_config.store(Arc::new(next));
    }

    fn lock_write_gate(&self) -> MutexGuard<'_, ()> {
        match self.write_gate.lock() {
            Ok(guard) => guard,
            Err(poisoned) => {
                eprintln!("write gate was poisoned; recovering inner state");
                poisoned.into_inner()
            }
        }
    }
}

fn main() {
    let router = Arc::new(Router::new(1, vec!["192.168.1.1".to_string()]));
    let running = Arc::new(AtomicBool::new(true));
    let mut handles = Vec::new();

    for worker_id in 0..3 {
        let router_ptr = Arc::clone(&router);
        let running_flag = Arc::clone(&running);
        handles.push(thread::spawn(move || {
            while running_flag.load(Ordering::Acquire) {
                let current_snapshot = router_ptr.get_config();
                println!(
                    "Worker {} routing with Version {}: {:?}",
                    worker_id, current_snapshot.version, current_snapshot.blocked_ips
                );

                thread::sleep(Duration::from_millis(800));
            }
        }));
    }

    thread::sleep(Duration::from_secs(1));
    router.add_blocked_ip("10.0.0.1".to_string());

    thread::sleep(Duration::from_secs(1));
    router.add_blocked_ip("172.16.0.5".to_string());

    thread::sleep(Duration::from_secs(2));
    running.store(false, Ordering::Release);

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Final State: {:?}", router.get_config());
}

This can look redundant at first, but each layer has a different job:

  • the outer Arc shares the router itself across threads
  • the Mutex (or the ArcSwap plus write gate) protects swapping the published pointer
  • the inner Arc is the current published snapshot of the routing table

The important performance idea is that the lock protects pointer publication, not long-lived reading of the table itself.

Reader-side intuition

A worker does this:

  1. lock briefly
  2. clone the inner Arc<RoutingTable>
  3. unlock immediately
  4. use the snapshot without further synchronization

That means that once a snapshot is acquired, reading involves no locking at all; the only lock is the brief one taken during acquisition.
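
The reader path can be distilled into one small function. A minimal sketch, assuming a RoutingTable like the one above:

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex};

struct RoutingTable { version: u32 }

fn snapshot(published: &Mutex<Arc<RoutingTable>>) -> Arc<RoutingTable> {
    let guard = published.lock().unwrap(); // 1. lock briefly
    Arc::clone(&*guard)                    // 2. clone the inner Arc
    // 3. the guard drops on return, releasing the lock
}

// 4. the caller keeps using the snapshot with no further locking
let table = Mutex::new(Arc::new(RoutingTable { version: 1 }));
let snap = snapshot(&table);
assert_eq!(snap.version, 1);
}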

A useful clarification about the example: the worker-side thread::sleep(Duration::from_millis(800)) is there mainly to simulate real packet-processing work and to make the version changes easier to observe in the output. It is not required for correctness. Without that sleep, the workers still take snapshots correctly, but they loop much faster, produce much noisier logs, and hold each snapshot for a much shorter time. The example would still show version publication, but it would less clearly show the stronger idea: an in-flight unit of work can keep using an older snapshot while newer work has already moved on to a newer one.

Writer-side intuition

The admin does this:

  1. take the write lock
  2. get mutable access to a fresh version, either with Arc::make_mut on the published handle or by cloning the current table
  3. apply the mutation to that private version
  4. publish the new Arc and release the lock

If readers are still holding the old version, Arc::make_mut clones the table first so they are unaffected. If no one else is holding it, mutation happens in place.

What this pattern guarantees

  • a worker never sees the table change halfway through its own task
  • new workers observe the newly published version
  • old versions stay alive exactly as long as some worker still holds them
  • cleanup happens automatically when the last Arc to an old version is dropped

Interview-safe summary

This pattern uses Arc as a snapshotting mechanism. Readers clone a cheap shared pointer to the current version and then work without holding the lock. Writers publish a new Arc when they update the table, and Arc::make_mut gives copy-on-write behavior so in-flight readers can finish on the old version safely.

All let-bound types in the router example

These are the main let bindings in the Mutex-based example and the types they hold:

  • table (in Router::new): RoutingTable
  • router: std::sync::Arc<Router>
  • router_ptr: std::sync::Arc<Router>
  • current_snapshot: std::sync::Arc<RoutingTable>
  • guard (in get_config): std::sync::MutexGuard<'_, std::sync::Arc<RoutingTable>>
  • guard (in add_blocked_ip): std::sync::MutexGuard<'_, std::sync::Arc<RoutingTable>>
  • table (from Arc::make_mut): &mut RoutingTable

The ArcSwap variant adds:

  • running: std::sync::Arc<std::sync::atomic::AtomicBool>
  • handles: std::vec::Vec<std::thread::JoinHandle<()>>
  • running_flag: std::sync::Arc<std::sync::atomic::AtomicBool>
  • next: RoutingTable
  • _guard: std::sync::MutexGuard<'_, ()>

Loop bindings also introduce values even though they are not written with a separate let line:

  • worker_id: inferred integer from 0..3
  • handle: std::thread::JoinHandle<()> from iterating handles

The main standard-library types behind this example are:

  • std::sync::Arc<T>
  • std::sync::Mutex<T>
  • std::sync::MutexGuard<'a, T>
  • std::sync::atomic::AtomicBool
  • std::thread::JoinHandle<T>
  • std::vec::Vec<T>

More Advanced Applied Problem: Sharded 2PC With Arc

The live-update router example is about publishing new versions safely. A more advanced version of the same idea is a sharded database transaction.

Here the problem is not only snapshotting. It is atomic cross-shard updates.

Imagine you want to transfer money from one key in Shard A to another key in Shard B. If one side updates and the other fails, the system is inconsistent. So the design needs a prepare phase and a commit phase.

Core strategy

A compact way to model it with Arc is:

  1. lock the involved shards in a fixed order
  2. clone the current Arc snapshots for those shards
  3. use Arc::make_mut to stage the changes in private copies
  4. if validation fails, drop the staged copies and return an error
  5. if validation succeeds, swap the master pointers to publish the new versions

That gives a useful property:

rollback is mostly free, because failed staged copies are simply dropped before publication.

Synchronous version

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type ShardData = HashMap<String, i32>;
type Shard = Arc<Mutex<Arc<ShardData>>>;

struct ShardedDb {
    shards: Vec<Shard>,
}

impl ShardedDb {
    fn atomic_transfer(&self, key_a: &str, key_b: &str, amount: i32) -> Result<(), &'static str> {
        let idx_a = key_a.len() % self.shards.len();
        let idx_b = key_b.len() % self.shards.len();

        let (first_idx, second_idx) = if idx_a < idx_b {
            (idx_a, idx_b)
        } else {
            (idx_b, idx_a)
        };

        // Lock in a fixed global order (lowest shard index first) to prevent deadlock.
        let mut guard_1 = self.shards[first_idx].lock().unwrap();
        let maybe_guard_2 = if first_idx != second_idx {
            Some(self.shards[second_idx].lock().unwrap())
        } else {
            None
        };

        if first_idx == second_idx {
            // Same shard: both updates must land in one staged copy.
            let mut staged = Arc::clone(&guard_1);

            {
                let map = Arc::make_mut(&mut staged);
                let balance_a = map.get_mut(key_a).ok_or("Source not found")?;
                if *balance_a < amount {
                    return Err("Insufficient funds");
                }
                *balance_a -= amount;
            }

            {
                let map = Arc::make_mut(&mut staged);
                let balance_b = map.entry(key_b.to_string()).or_insert(0);
                *balance_b += amount;
            }

            *guard_1 = staged;
            return Ok(());
        }

        // Map the ordered guards back to the keys they actually cover, so the
        // debit is staged against key_a's shard and the credit against key_b's.
        let guard_2 = maybe_guard_2.expect("distinct shards are both locked");
        let (mut guard_a, mut guard_b) = if idx_a == first_idx {
            (guard_1, guard_2)
        } else {
            (guard_2, guard_1)
        };

        // Prepare phase: stage the changes in private copies; nothing is visible
        // yet, and dropping the staged copies on any early return is the rollback.
        let mut staged_a_arc = Arc::clone(&guard_a);
        let mut staged_b_arc = Arc::clone(&guard_b);

        {
            let map_a = Arc::make_mut(&mut staged_a_arc);
            let balance_a = map_a.get_mut(key_a).ok_or("Source not found")?;
            if *balance_a < amount {
                return Err("Insufficient funds");
            }
            *balance_a -= amount;
        }

        {
            let map_b = Arc::make_mut(&mut staged_b_arc);
            let balance_b = map_b.entry(key_b.to_string()).or_insert(0);
            *balance_b += amount;
        }

        // Commit phase: publish both snapshots while still holding both locks.
        *guard_a = staged_a_arc;
        *guard_b = staged_b_arc;

        Ok(())
    }
}
}

Why this is more advanced

This version forces you to reason about:

  • lock ordering to prevent deadlock
  • same-shard transfers, where both updates must land in one staged copy
  • prepare versus commit phases
  • rollback by dropping unpublished staged copies
  • cloning only the dirty shards instead of the whole database
  • readers continuing on old snapshots while writers stage a new version

Async Tokio version

A final step up is when the commit path must cross an .await, for example to write a log record before publishing the new state. In that case, std::sync::Mutex is usually the wrong tool if the lock must be held across an .await, because it blocks the executor thread.

That is why the async version uses tokio::sync::Mutex.

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::Mutex;
use tokio::time::{sleep, Duration};

type ShardData = HashMap<String, i32>;
type Shard = Arc<Mutex<Arc<ShardData>>>;

struct AsyncShardedDb {
    shards: Vec<Shard>,
}

impl AsyncShardedDb {
    fn new(num_shards: usize) -> Self {
        let mut shards = Vec::new();
        for _ in 0..num_shards {
            shards.push(Arc::new(Mutex::new(Arc::new(HashMap::new()))));
        }
        Self { shards }
    }

    async fn atomic_transfer(&self, from: &str, to: &str, amount: i32) -> Result<(), &'static str> {
        let idx_a = from.len() % self.shards.len();
        let idx_b = to.len() % self.shards.len();
        let (first, second) = if idx_a < idx_b { (idx_a, idx_b) } else { (idx_b, idx_a) };

        // Lock in a fixed global order to prevent deadlock across tasks.
        let mut guard_1 = self.shards[first].lock().await;
        let maybe_guard_2 = if first != second {
            Some(self.shards[second].lock().await)
        } else {
            None
        };

        if first == second {
            // Same shard: both updates land in one staged copy.
            let mut staged = Arc::clone(&guard_1);

            {
                let map = Arc::make_mut(&mut staged);
                let bal_a = map.get_mut(from).ok_or("Source missing")?;
                if *bal_a < amount {
                    return Err("Insufficient funds");
                }
                *bal_a -= amount;
            }

            {
                let map = Arc::make_mut(&mut staged);
                let bal_b = map.entry(to.to_string()).or_insert(0);
                *bal_b += amount;
            }

            // A tokio::sync::Mutex guard may be held across .await.
            simulate_disk_io().await;
            *guard_1 = staged;
            return Ok(());
        }

        // Map the ordered guards back to the keys they actually cover.
        let guard_2 = maybe_guard_2.expect("distinct shards are both locked");
        let (mut guard_a, mut guard_b) = if idx_a == first {
            (guard_1, guard_2)
        } else {
            (guard_2, guard_1)
        };

        // Prepare phase: stage changes privately; dropping them is the rollback.
        let mut staged_a = Arc::clone(&guard_a);
        let mut staged_b = Arc::clone(&guard_b);

        {
            let map_a = Arc::make_mut(&mut staged_a);
            let bal_a = map_a.get_mut(from).ok_or("Source missing")?;
            if *bal_a < amount {
                return Err("Insufficient funds");
            }
            *bal_a -= amount;
        }

        {
            let map_b = Arc::make_mut(&mut staged_b);
            let bal_b = map_b.entry(to.to_string()).or_insert(0);
            *bal_b += amount;
        }

        // Write the log record before publishing, while the commit gate is held.
        simulate_disk_io().await;

        // Commit phase: publish both snapshots.
        *guard_a = staged_a;
        *guard_b = staged_b;

        Ok(())
    }
}

async fn simulate_disk_io() {
    sleep(Duration::from_millis(10)).await;
}
}

What the async version adds

Now the design has to satisfy all the earlier Arc concerns plus:

  • async lock acquisition with .await
  • holding the commit gate across an async logging step
  • avoiding a blocking std::sync::Mutex guard across .await
  • ensuring the shared state is Send + Sync so Tokio can move tasks between worker threads

A clean way to summarize the type stack is:

  • outer Arc: share the database or shard handle across tasks
  • async Mutex: serialize commit access even across .await
  • inner Arc: let readers and staged writers work from stable snapshots

Interview-safe summary

The router example shows snapshot publication. The sharded database version adds transactional staging and atomic commit. Arc::make_mut gives a private sandbox for updates, pointer swapping publishes the new version, and failed staged changes are dropped for rollback. In async Rust, the same pattern extends naturally, but the commit gate usually becomes tokio::sync::Mutex so waiting for locks or I/O does not block the executor thread.