Today on the bus I've been playing with some of the exercises from chapter 12 of Programming Erlang. I didn't get far, but it was interesting to run the process benchmark. Trying to start the maximum number of processes failed; it turns out there are already about thirty processes running when you start the shell. Good to know.
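A quick way to see how much room is actually left is to ask the VM. This is only a sketch: the module and function names (proc_headroom, headroom) are made up for illustration, and the numbers will vary by Erlang version and startup flags.

```erlang
-module(proc_headroom).
-export([headroom/0]).

%% How many more processes can be spawned before hitting the VM limit.
%% process_limit is the maximum number of simultaneous processes;
%% process_count is how many exist right now (shell and system processes included).
headroom() ->
    Limit = erlang:system_info(process_limit),
    Count = erlang:system_info(process_count),
    Limit - Count.
```

Calling proc_headroom:headroom() from the shell should give a number a little below the limit, which would explain why asking for the full limit fails.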

I've been offhandedly working on the file cache; one thing I need to think about is how to back up the cache metadata. I suppose a snapshot should contain all this data, and it should go to a backup drive. I think the backend needs to take a blocklist, figure out which blocks it doesn't have, remove the ones it no longer needs, and write the ones it should have.
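The blocklist comparison is really just two set differences. Here is a minimal sketch, assuming blocks can be identified by some comparable ID (a hash, say); the module name blocklist_sync and the function plan/2 are hypothetical and not part of the cache yet.

```erlang
-module(blocklist_sync).
-export([plan/2]).

%% plan(Wanted, Stored) -> {ToWrite, ToRemove}
%%   Wanted: block IDs listed in the new blocklist.
%%   Stored: block IDs the backend currently holds.
%% ToWrite is what the backend is missing; ToRemove is what it no longer needs.
plan(Wanted, Stored) ->
    W = ordsets:from_list(Wanted),
    S = ordsets:from_list(Stored),
    {ordsets:subtract(W, S), ordsets:subtract(S, W)}.
```

For example, blocklist_sync:plan([a, b, c], [b, c, d]) returns {[a], [d]}: block a has to be written and block d can go. The order in which those two lists get acted on is a separate question.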

I also need to write unit tests for the ordering code; I keep running into ordering problems.

It might make sense to build the messages around protobufs, but I don't know how that will interact with Erlang.

Maybe the flow looks like:

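block_server() ->
  %% block_meta_file() is a placeholder for wherever the path ends up coming from.
  BlockMetaFile = block_meta_file(),
  %% Load the existing block metadata, or initialise it on first run.
  Blocks = case block_metadata_exists(BlockMetaFile) of
    true  -> blocks:load_blocks(BlockMetaFile);
    false -> init_blocks(BlockMetaFile)
  end,
  block_server(Blocks).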

Then I need to think about the block_server(Blocks) case: should the server wait to write new blocks until it has removed the outdated ones? If not, it could take 2x the space temporarily while a blocklist is being updated. I like the guarantees of writing first, but it means overcommitting a lot of storage space. Right now this is just for me, but it's not a stretch to think of running this for friends, too, and in that case it could add up to a lot of storage. If we're thinking of allocating 16G of storage for the cache, doubling the requirement means 32G (plus about a gig or two of overhead). That halves the number of users a system could support, which isn't trivial. It is, however, a stronger guarantee of data security. One option is to only remove blocks once the file they used to belong to is updated. This is something I need to think more deeply about.