Redis in Modern System Design

While many engineers use Redis as a cache for quick access to pre-processed data, Redis also shines in various other practical scenarios:

  1. Nearest Neighbor Search: Redis efficiently helps find nearby drivers in ride-sharing apps.

  2. Leaderboards: It's a go-to for maintaining ranked lists of top scores in online games.

  3. Rate Limiting and Throttling: Easily implement sliding-window rate limiters to control API request rates.

  4. Large Data Membership Checks: Quickly check if an element is part of a vast dataset.

  5. Frequency Estimation: Use Redis to estimate the frequency of elements in a data stream.

  6. Top K: Find the top K most frequent items in an unbounded data stream.

Nearest Neighbor Search

A user should be able to book a nearby ride using Uber.

Drivers and active users both emit their current location every few seconds.

The current location of each user and driver is stored in Redis with a TTL.

Every time a new location is received, the GeoHash of the user or driver is calculated from it and stored like this, again with a TTL.

GeoHashDrivers ( Sorted Set )
ab1cd4 [ <driverId, other_meta_info>, ... ]
ef3gf4 [ <driverId, other_meta_info>, ... ]

When an active user tries to book a ride, their GeoHash is calculated along with the 8 surrounding GeoHashes, and a query is fired to fetch all the drivers from those 9 GeoHash cells.
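
The lookup described above can be sketched in plain Python. This is a minimal illustration, not the article's implementation: the function names (encode, bounds, cell_and_neighbors, nearby_drivers) and the drivers_by_cell dictionary are hypothetical stand-ins for the Redis structure, and the 8 neighbours are found by re-encoding points one cell width or height away from the cell centre.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, value, bits, use_lon = [], 0, 0, True  # bits alternate lon, lat, lon, ...
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid
        else:
            value <<= 1
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            value, bits = 0, 0
    return "".join(chars)


def bounds(geohash):
    """Return (lat_min, lat_max, lon_min, lon_max) of a geohash cell."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    use_lon = True
    for ch in geohash:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):
            rng = lon_range if use_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if (value >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lon = not use_lon
    return lat_range[0], lat_range[1], lon_range[0], lon_range[1]


def cell_and_neighbors(geohash):
    """Return the cell itself plus its (up to) 8 surrounding cells."""
    lat_min, lat_max, lon_min, lon_max = bounds(geohash)
    height, width = lat_max - lat_min, lon_max - lon_min
    lat_c, lon_c = (lat_min + lat_max) / 2, (lon_min + lon_max) / 2
    cells = set()
    for dlat in (-1, 0, 1):
        for dlon in (-1, 0, 1):
            lat = lat_c + dlat * height
            if not -90.0 <= lat <= 90.0:
                continue  # there is no cell beyond the poles
            lon = (lon_c + dlon * width + 180.0) % 360.0 - 180.0  # wrap at +/-180
            cells.add(encode(lat, lon, len(geohash)))
    return cells


def nearby_drivers(drivers_by_cell, lat, lon, precision=6):
    """Collect drivers from the rider's cell and its neighbours."""
    found = set()
    for cell in cell_and_neighbors(encode(lat, lon, precision)):
        found |= drivers_by_cell.get(cell, set())
    return found
```

Querying all 9 cells matters because a rider standing near a cell edge may be closer to drivers in the adjacent cell than to drivers in their own.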

Leaderboards

You can build a leaderboard using the Sorted Set data structure.

Syntax: ZADD key score member

Example:
ZADD game 123 Manish
ZADD game 129 Dipankar

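Query to get the top 10 players:

ZREVRANGE game 0 9 ( displays the names of the top 10 players )

ZREVRANGE game 0 9 WITHSCORES ( displays the names and scores of the top 10 players )

Since Redis 6.2, ZREVRANGE is deprecated; ZRANGE game 0 9 REV returns the same result.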
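
To make the ordering rules concrete, here is a small in-memory sketch that mimics the two commands used above. The Leaderboard class and its method names are hypothetical; it only supports non-negative indexes and re-sorts on every query, whereas Redis keeps the set ordered internally.

```python
class Leaderboard:
    """In-memory stand-in for a Redis Sorted Set (illustration only)."""

    def __init__(self):
        self._scores = {}  # member -> score

    def zadd(self, member, score):
        # Like ZADD: insert the member or overwrite its score.
        self._scores[member] = score

    def zrevrange(self, start, stop, withscores=False):
        # Highest score first; ties are broken by member name, also reversed.
        ranked = sorted(
            self._scores.items(),
            key=lambda kv: (kv[1], kv[0]),
            reverse=True,
        )
        page = ranked[start:stop + 1]  # stop is inclusive, as in Redis
        return page if withscores else [member for member, _ in page]


board = Leaderboard()
board.zadd("Manish", 123)
board.zadd("Dipankar", 129)
print(board.zrevrange(0, 9))                   # names of the top 10 players
print(board.zrevrange(0, 9, withscores=True))  # names and scores
```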

Rate Limiters

Building a rate limiter with Redis is easy because of two commands: INCR and EXPIRE.

This example assumes a time window of 1 minute with the fixed-window rate-limiting algorithm.

GET [user-api-key]:[current minute number]

GET abc:123
// If the key is missing or the value is less than 20 ( rate limit ), go ahead and increment in the next step.
// If the value is >= 20 ( rate limit ), reject the request with an error message.
MULTI
INCR abc:123
EXPIRE abc:123 59
EXEC

There are two crucial insights to glean from this procedure:

  1. The first INCR on a non-existent key creates the key and always returns 1.

  2. EXPIRE is executed within the same MULTI transaction as INCR, so the two run atomically and the counter is never left without a TTL.
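
The procedure can be simulated without a Redis server. The sketch below is an assumption-laden illustration: FakeRedis is a hypothetical in-memory stand-in that implements only incr and expire, and allow_request is an invented helper name. One deliberate difference from the commands above: it increments first and compares the returned count, because a separate GET followed by INCR can race when several requests arrive at once.

```python
import time


class FakeRedis:
    """Minimal in-memory stand-in for INCR and EXPIRE (illustration only)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}  # key -> [value, expires_at or None]

    def _live(self, key):
        # Drop the key lazily if its TTL has passed.
        entry = self._data.get(key)
        if entry and entry[1] is not None and entry[1] <= self._clock():
            del self._data[key]
            return None
        return entry

    def incr(self, key):
        entry = self._live(key)
        if entry is None:
            entry = self._data[key] = [0, None]  # missing key starts at 0
        entry[0] += 1
        return entry[0]

    def expire(self, key, seconds):
        entry = self._live(key)
        if entry is None:
            return False
        entry[1] = self._clock() + seconds
        return True


def allow_request(store, api_key, clock=time.time, limit=20):
    """Fixed-window limiter: at most `limit` requests per minute per key."""
    minute = int(clock() // 60)
    key = f"{api_key}:{minute}"
    # Against real Redis, these two calls would go inside MULTI/EXEC.
    count = store.incr(key)
    store.expire(key, 59)
    return count <= limit
```

Because the minute number is part of the key, each window starts from a fresh counter, and the TTL only serves to clean up old keys.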

Bloom filter

A Bloom filter is a probabilistic data structure that checks whether an element is a member of a large set.

BF.RESERVE {key} {error_rate} {capacity} [EXPANSION expansion] [NONSCALING]

  • error_rate is the desired false-positive probability. It decides the size of the Bloom filter: a lower rate needs more bits per element.

  • capacity is the estimated number of elements in the Bloom filter. Underestimating it fills the filter early, which leads to increased false positives and hence to new sub-filters being added, resulting in increased lookup time.

  • EXPANSION sets the size of each new sub-filter: the size of the last sub-filter multiplied by EXPANSION. The default is 2.

BF.ADD unique_visitors 10.94.214.120
BF.EXISTS unique_visitors 10.94.214.120 ( This can give a false positive, but never a false negative )

  • Once an element is added to the Bloom filter, it can't be removed.

  • The rate of false positives increases as the volume of data grows.
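
The relationship between error_rate, capacity, and filter size can be seen in a small pure-Python sketch. This is not how Redis implements it: the BloomFilter class is hypothetical, it uses the textbook sizing formulas, derives its bit positions from a single SHA-256 digest by double hashing, and does not scale with sub-filters.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter (illustration only, no sub-filter scaling)."""

    def __init__(self, capacity, error_rate):
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes
        self.num_bits = max(
            1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        )
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: position_i = h1 + i * h2 (mod m)
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def exists(self, item):
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


visitors = BloomFilter(capacity=1000, error_rate=0.01)
visitors.add("10.94.214.120")
print(visitors.exists("10.94.214.120"))  # True: added items are always found
```

Bits are only ever set, never cleared, which is why an element cannot be removed: clearing its bits could also erase evidence of other elements that share them.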

Cuckoo filter

A Cuckoo filter is an alternative to the Bloom filter that additionally supports deleting elements from the set.
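
Deletion is possible because a Cuckoo filter stores a short fingerprint of each element in one of two candidate buckets, and a stored fingerprint can simply be taken out again. The sketch below is a simplified illustration under several assumptions: the CuckooFilter class, its 8-bit fingerprints, and its default sizes are all hypothetical choices, not what Redis uses.

```python
import hashlib
import random


class CuckooFilter:
    """Simplified cuckoo filter with 8-bit fingerprints (illustration only)."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500):
        if num_buckets & (num_buckets - 1):
            raise ValueError("num_buckets must be a power of two")
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self._rng = random.Random(0)  # fixed seed keeps runs repeatable

    def _hash(self, data):
        digest = hashlib.sha256(data).digest()
        return int.from_bytes(digest[:4], "big") % self.num_buckets

    def _fingerprint(self, item):
        return hashlib.sha256(item.encode("utf-8")).digest()[-1] or 1

    def _alt_index(self, index, fp):
        # XOR makes the mapping symmetric: alt(alt(i)) == i
        return index ^ self._hash(bytes([fp]))

    def _candidates(self, item):
        fp = self._fingerprint(item)
        i1 = self._hash(item.encode("utf-8"))
        return fp, i1, self._alt_index(i1, fp)

    def add(self, item):
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a fingerprint and relocate it.
        i = self._rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self._rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # filter is full; the last evicted fingerprint is lost

    def __contains__(self, item):
        fp, i1, i2 = self._candidates(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item):
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Only elements that were actually added should be deleted; deleting an element that was never added can remove a colliding fingerprint that belongs to a different element.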

Count Min Sketch

Count-Min Sketch estimates the frequency of an element in a data stream using sub-linear memory.

This works well for an unbounded stream, where a hash table would keep on growing in size, resulting in more and more collisions and hence increased lookup time. Because it is a probabilistic data structure, it can sometimes over-report a frequency, but it never under-reports one. The estimates for tail (low-frequency) items are the least reliable and are better ignored.
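
A rough sketch of the idea: keep a small grid of counters, hash each element to one counter per row, and report the minimum of those counters as the estimate. The CountMinSketch class, its default width and depth, and the SHA-256 based row hashes below are illustrative assumptions rather than Redis internals.

```python
import hashlib


class CountMinSketch:
    """Fixed-memory frequency estimator (illustration only)."""

    def __init__(self, width=2048, depth=4):
        self.width, self.depth = width, depth
        # Memory is width * depth counters, no matter how long the stream is.
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One independent-looking hash per row, derived by salting with the row.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def incr(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def query(self, item):
        # Collisions only ever add to a counter, so the minimum across rows
        # is the tightest estimate and can never be below the true count.
        return min(self.table[row][col] for row, col in self._cells(item))


sketch = CountMinSketch()
for word in ["foo", "foo", "bar", "foo", "baz"]:
    sketch.incr(word)
print(sketch.query("foo"))  # at least 3; exactly 3 unless every row collides
```

Rare items are the ones most distorted by collisions, since a single shared counter can dwarf their true count, which is why the tail estimates deserve the least trust.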


Top K

Top K finds the most frequent items in a data stream. For example:

  • Find the top 10 searched keywords on Google in the last hour.

  • Find the top 10 retweeted tweets in the last 10 minutes.

  • Find the top 10 songs played on Spotify in the last 30 minutes.

Compared to a Sorted Set, this takes less memory and gives faster top K lookups, but it is less accurate.

> TOPK.RESERVE my-topk 10 // Create the structure, tracking the top 10 items
> TOPK.ADD my-topk foo bar baz // Insert into the data structure
> TOPK.LIST my-topk // List top K elements
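
The memory-for-accuracy trade-off can be illustrated with a few lines of Python. Note the swap: Redis documents its Top K type as based on the HeavyKeeper algorithm, while the sketch below uses the simpler Space-Saving algorithm to show the same idea. The SpaceSavingTopK class and its method names are hypothetical.

```python
class SpaceSavingTopK:
    """Space-Saving heavy-hitter tracker with at most k counters."""

    def __init__(self, k):
        self.k = k
        self.counts = {}  # item -> estimated count, never more than k entries

    def add(self, item):
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.k:
            self.counts[item] = 1
        else:
            # Evict the current minimum; the newcomer inherits its count + 1,
            # so counts may overestimate but frequent items are not lost.
            victim = min(self.counts, key=self.counts.get)
            self.counts[item] = self.counts.pop(victim) + 1

    def top(self):
        return sorted(self.counts.items(), key=lambda kv: kv[1], reverse=True)


topk = SpaceSavingTopK(k=3)
for item in ["a"] * 5 + ["b"] * 3 + ["c", "d"]:
    topk.add(item)
print(topk.top())  # [('a', 5), ('b', 3), ('d', 2)]
```

In the output, "d" was seen once but is reported with a count of 2 because it inherited the evicted counter of "c". Memory stays bounded at k counters, at the cost of inflated counts for recently admitted items.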
