Caches In Rust

In this post Ill describe how to implement caches in Rust. It is inspired by two recent refactors I landed at nearcore (nearcore#6549, nearcore#6811). Based on that experience, it seems that implementing caches wrong is rather easy, and making a mistake there risks spilling over, and spoiling the overall architecture of the application a bit.

Lets start with an imaginary setup with an application with some configuration and a database:

struct App {
  config: Config,
  db: Db,
}

The database is an untyped key-value store:

impl Db {
  pub fn load(&self, key: &[u8]) -> io::Result<Option<Vec<u8>>> {
    ...
  }
}

And the App encapsulates database and provides typed access to domain-specific Widget:

#[derive(serde::Serialize, serde::Deserialize)]
struct Widget {
  title: String,
}

impl App {
  pub fn get_widget(
    &self,
    id: u32,
  ) -> io::Result<Option<Widget>> {
    let key = id.to_be_bytes();
    let value = match self.db.load(&key)? {
      None => return Ok(None),
      Some(it) => it,
    };

    let widget: Widget =
      bincode::deserialize(&value).map_err(|it| {
        io::Error::new(io::ErrorKind::InvalidData, it)
      })?;

    Ok(Some(widget))
  }
}

Now, for the sake of argument lets assume that database access and subsequent deserialization are costly, and that we want to add a cache of Widgets in front of the database. Data-oriented thinking would compel us to get rid of deserialization step instead, but we will not pursue that idea this time.

Well use a simple HashMap for the cache:

struct App {
  config: Config,
  db: Db,
  cache: HashMap<u32, Widget>,
}

And we need to modify get_widget method to return the value from the cache, if there is one:

impl App {
  pub fn get_widget(
    &mut self,
    id: u32,
  ) -> io::Result<Option<&Widget>> {

    if self.cache.contains_key(&id) {
      let widget = self.cache.get(&id).unwrap();
      return Ok(Some(widget));
    }

    let key = id.to_be_bytes();
    let value = match self.db.load(&key)? {
      None => return Ok(None),
      Some(it) => it,
    };
    let widget: Widget =
      bincode::deserialize(&value).map_err(|it| {
        io::Error::new(io::ErrorKind::InvalidData, it)
      })?;

    self.cache.insert(id, widget);
    let widget = self.cache.get(&id).unwrap();

    Ok(Some(widget))
  }
}

The biggest change is the &mut self. Even when reading the widget, we need to modify the cache, and the easiest way to get that ability is to require an exclusive reference.

I want to argue that this path of least resistance doesnt lead to a good place. There are many problems with methods of the following-shape:

fn get(&mut self) -> &Widget

First, such methods conflict with each other. For example, the following code wont work, because well try to borrow the app exclusively twice.

let app: &mut App = ...;
let w1 = app.get_widget(1)?;
let w2 = app.get_widget(2)?;

Second, the &mut methods conflict even with & methods. Naively, it would seem that, as get_widget returns a shared reference, we should be able to call & methods. So, one can expect something like this to work:

let w: &Widget = app.get_widget(1)?.unwrap();
let c: &Color = &app.config.main_color;

Alas, it doesnt. Rust borrow checker doesnt distinguish between mut and non-mut lifetimes (for a good reason: doing that would be unsound). So, although w is just &Widget, the lifetime there is the same as on the &mut self, so the app remains mutably borrowed while the widget exists.

Third, perhaps the most important point, the &mut self becomes viral most of functions in the program begin requiring &mut, and you lose type-system distinction between read-only and read-write operations. Theres no distinction between this function can only modify the cache and this function can modify literally everything.

Finally, even implementing get_widget is not pleasant. Seasoned rustaceans among you might twitch at the needlessly-repeated hashmap lookups. But trying to get rid of those with the help of the entry-API runs into current borrow checker limitations.

Lets look at how we can better tackle this!

The general idea for this class of problems is to think what the ownership and borrowing situation should be and try to achieve that, as opposed to merely following suggestions by the compiler. That is, most of the time just using &mut and & as compiler guides you is a path to success, as, it turns out, majority of the code naturally follows simple aliasing rules. But there are exceptions, its important to recognize them as such and make use of interior mutability to implement the aliasing structure which makes sense.

Lets start with a simplified case. Suppose that theres only one Widget to deal with. In this case, wed want something like this:

struct App {
  ...
  cache: Option<Widget>,
}

impl App {
  fn get_widget(&self) -> &Widget {
    if let Some(widget) = &self.cache {
      return widget;
    }
    self.cache = Some(create_widget());
    self.cache.as_ref().unwrap()
  }
}

This doesnt work as is modifying the cache needs &mut which wed very much prefer to avoid. However, thinking about this pattern, it feels like it should be valid we enforce at runtime that the contents of the cache is never overwritten. That is, we actually do have exclusive access to cache on the highlighted line at runtime, we just cant explain that to the type system. But we can reach out for unsafe for that. Whats more, Rusts type system is powerful enough to encapsulate that usage of unsafe into a safe and generally re-usable API. So lets pull once_cell crate for this:

struct App {
  ...
  cache: once_cell::sync::OnceCell<Widget>,
}

impl App {
  fn get_widget(&self) -> &Widget {
    self.cache.get_or_init(create_widget)
  }
}

Coming back to the original hash-map example, we can apply the same logic here: as long as we never overwrite, delete or move values, we can safely return references to them. This is handled by the elsa crate:

struct App {
  config: Config,
  db: Db,
  cache: elsa::map::FrozenMap<u32, Box<Widget>>,
}

impl App {
  pub fn get_widget(
    &self,
    id: u32,
  ) -> io::Result<Option<&Widget>> {
    if let Some(widget) = self.cache.get(&id) {
      return Ok(Some(widget));
    }

    let key = id.to_be_bytes();
    let value = match self.db.load(&key)? {
      None => return Ok(None),
      Some(it) => it,
    };
    let widget: Widget =
      bincode::deserialize(&value).map_err(|it| {
        io::Error::new(io::ErrorKind::InvalidData, it)
      })?;

    let widget = self.cache.insert(id, Box::new(widget));

    Ok(Some(widget))
  }
}

The third case is that of a bounded cache. If you need to evict values, than the above reasoning does not apply. If the user of a cache gets a &T, and than the corresponding entry is evicted, the reference would dangle. In this situations, we want the clients of the cache to co-own the value. This is easily handled by an Rc:

struct App {
  config: Config,
  db: Db,
  cache: RefCell<lru::LruCache<u32, Rc<Widget>>>,
}

impl App {
  pub fn get_widget(
    &self,
    id: u32,
  ) -> io::Result<Option<Rc<Widget>>> {
    {
      let mut cache = self.cache.borrow_mut();
      if let Some(widget) = cache.get(&id) {
        return Ok(Some(Rc::clone(widget)));
      }
    }

    let key = id.to_be_bytes();
    let value = match self.db.load(&key)? {
      None => return Ok(None),
      Some(it) => it,
    };
    let widget: Widget =
      bincode::deserialize(&value).map_err(|it| {
        io::Error::new(io::ErrorKind::InvalidData, it)
      })?;

    let widget = Rc::new(widget);
    {
      let mut cache = self.cache.borrow_mut();
      cache.put(id, Rc::clone(&widget));
    }

    Ok(Some(widget))
  }
}

To sum up: when implementing a cache, the path of the least resistance is to come up with a signature like this:

fn get(&mut self) -> &T

This often leads to problems down the line. Its usually better to employ some interior mutability and get either of these instead:

fn get(&self) -> &T
fn get(&self) -> T

This is an instance of the more general effect: despite the mutability terminology, Rust references track not mutability, but aliasing. Mutability and exclusive access are correlated, but not perfectly. Its important to identify instances where you need to employ interior mutability, often they are architecturally interesting.

To learn more about relationships between aliasing and mutability, I recommend the following two posts:

Rust: A unique perspective

https://limpet.net/mbrubeck/2019/02/07/rust-a-unique-perspective.html

Accurate mental model for Rust’s reference types

https://docs.rs/dtolnay/latest/dtolnay/macro._02__reference_types.html

Finally, the borrow checker limitation is explained (with much skill and humor, I should add), in this document:

Polonius the Crab

https://docs.rs/polonius-the-crab/0.2.1/polonius_the_crab/

Thats all! Discussion on /r/rust.