Storing nested objects in Redis

Since Redis supports the usage of a Hash Tables, one dimensional JavaScript-like objects can be modeled it quite easily. The problem though is that hash tables can’t be nested - at least in Redis.

Assume we have the following JavaScript object:

var house = {
  roof: {
    color: 'black'
  },
  street: 'Market',
  buildYear: '1996'
};

This object could be stored in Redis using two different approaches:

  1. Storing the object as a JSON string using SET house:1 "{....}"

    This approach was fairly common before Redis started supporting Hash Tables. There is nothing wrong with this solution. Parsing and stringifying objects is usually not a performance issue and having the option to store nested objects is really convenient. There is only one major drawback: You can not retrieve parts of the object. The only way to access the stored data is by doing a GET house:1. You can not specify the selection of certain keys. You necessarily need to retrieve everything, which is likely to become a performance issue on really large objects.

  2. Storing the object in two hash tables.

    Since hash tables in Redis can not be nested (a.k.a. need to be “flat”), you have to use two hash tables in order to store the house-object. Suppose we have a nested object “roof”, we would store the roof-object in a separate hash table.

    Example:

    • HMSET house:1 roof "house:1:roof" street "Market" buildYear "1996" Here we create our top-level house hash table. Instead of storing the nested roof-object inside the house-object, we store a reference to a second hash table “house:1:roof”. This way, we can achieve constant time lookup on key-value-pairs. If there is a 1-to-1 relationship between house and roof, we wouldn’t event need to store the reference to the roof, since inside our application we could assume that a house always has a roof (which makes sense in many real-world scenarios).
    • HMSET house:1:roof color "black" parent "house:1" Here we actually store our nested roof-object. Again, we don’t necessarily need to keep track of the parent-object by explicitly storing a reference to a different key-value-pair as long as we have semantically reasonable names (e.g. house:1 instead of a randomized UUID for every key).

    The drawback of this approach is that this is not suited for scenarios in which you always need to retrieve all nested objects, since you would need to do n operations if the object is nested on n levels. Nevertheless, Redis operations are every cheap.

The second approach can be extended further by storing a reference to a list, (sorted) set or HyperLogLog. This is fine as long as you don’t mix datatypes, since would be likely to result in really nasty errors.

 
577
Kudos
 
577
Kudos

Now read this

How I hacked the Bower Registry and what I learned from it

Bower is a front-end package manager created at Twitter. It has about 20k active users and solves a real problem: Front-end package management. Two weeks ago, I stumbled upon a huge security risk allowing me to manipulate the underlying... Continue →