In this post I want to share some details about the Redis Engine, one of Centrifugo’s built-in engines.

This is what “redis” means in Russia :)
Maybe you already know from the project docs, or from my first article here, Four years in Centrifuge, that Centrifugo has two built-in engines: the Memory Engine (suitable for a single-node deploy) and the Redis Engine (for a multi-node Centrifugo deploy). Besides multi-node support thanks to its PUB/SUB feature, Redis gives us a way to keep channel data, such as the message history cache and client presence information, and allows all nodes to access this data. But let’s start from the beginning and describe everything in more detail.
What’s an engine?

An engine in Centrifugo must be able to do several things:
- subscribe/unsubscribe a node to/from channels
- publish messages into channels
- maintain a message history cache for channels (and maintain its expiration)
- maintain channel presence information (and maintain its expiration)

Too many responsibilities, yeah? :)
There is not much server software in the world that provides all of the features above out of the box. I can only recall RethinkDB (though it has no built-in expiration) and Tarantool (it has no proper PUB/SUB yet). Redis is a perfect candidate here, as it has everything Centrifugo needs to work. Let’s see in detail how Centrifugo utilizes Redis features.
Publish/Subscribe

Centrifugo is a PUB/SUB server: clients subscribe to channels (topics) and wait for messages published into those channels (by another client or by the application backend). There are many similar real-time solutions in the wild. Redis allows Centrifugo to be more scalable, as it provides a way to run several Centrifugo nodes and load-balance clients between them. Those nodes are connected together using Redis as a PUB/SUB broker.
Every Centrifugo node subscribes to several channels in Redis:
- centrifugo.control: the channel through which all internal communication between Centrifugo nodes happens: ping messages, node statistics, and propagation of some API commands such as disconnect and unsubscribe
- centrifugo.admin: the channel that allows admin websocket connections to live on any node, as messages will be delivered to all of them through this PUB/SUB channel
- and, most importantly, Centrifugo subscribes to the centrifugo.message.CHANNEL channel in Redis as soon as some websocket or SockJS client subscribes to channel CHANNEL.

Now let’s look at what happens. Here again is the simple scheme I already showed you in the previous post.
You can see 4 Centrifugo nodes, all connected via Redis:

Now if one client connects to one Centrifugo node and another client connects to any other Centrifugo node, and both clients subscribe to the same CHANNEL, the application can just PUBLISH a message to a single Centrifugo node. That node will then publish the message into Redis, so every node that has an interested client (one subscribed to CHANNEL) will receive the message and forward it to the client connection.
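The fan-out just described can be modeled with a tiny in-memory stand-in for Redis PUB/SUB. The broker type and channel name below are purely illustrative; in a real deployment Redis plays the role of the broker and each subscriber corresponds to a Centrifugo node’s Redis connection.

```go
package main

import "fmt"

// broker models Redis PUB/SUB: channel name -> subscribed "nodes".
type broker struct {
	subs map[string][]chan string
}

func newBroker() *broker { return &broker{subs: map[string][]chan string{}} }

// subscribe registers a node's delivery channel, like SUBSCRIBE in Redis.
func (b *broker) subscribe(channel string) chan string {
	ch := make(chan string, 1)
	b.subs[channel] = append(b.subs[channel], ch)
	return ch
}

// publish fans a message out to every subscribed node, like PUBLISH,
// and returns the number of receivers (as Redis PUBLISH does).
func (b *broker) publish(channel, msg string) int {
	for _, ch := range b.subs[channel] {
		ch <- msg
	}
	return len(b.subs[channel])
}

func main() {
	b := newBroker()
	// Two Centrifugo nodes interested in the same client channel.
	node1 := b.subscribe("centrifugo.message.news")
	node2 := b.subscribe("centrifugo.message.news")
	// The application publishes once; the broker delivers to both nodes.
	b.publish("centrifugo.message.news", "hello")
	fmt.Println(<-node1, <-node2) // both nodes received "hello"
}
```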
Using PUB/SUB via the internal centrifugo.control channel allows each node to have information about the other running nodes, so there is no need to build a full mesh of connected nodes: with Redis as a PUB/SUB proxy everything stays simple. To add a new Centrifugo node to the cluster, all you need to do is provide the Redis server address in its configuration.
Publish new messages into channels

As I mentioned above, to deliver a message to clients it must be PUBLISHed into Redis. Centrifugo has an HTTP API to receive new messages from the application; after preparing a new message, Centrifugo publishes it into Redis.
The interesting thing here is how to publish efficiently. Since Redis is single-threaded and supports pipelining, the most efficient way is to publish new messages to Redis over a single connection and use a batching mechanism: pipeline as many messages into a batch as were collected during the round trip of the previous pipelined request. This may not be entirely clear from my description alone, so take a look at the Smart Batching article describing this simple technique.
By the way, Centrifugo utilizes pipelining for the SUBSCRIBE/UNSUBSCRIBE commands described above too.
Combining pipelining and batching increased publish throughput more than 20 times. Many thanks to Paul Banks, who contributed this improvement to Centrifugo.
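A minimal sketch of that smart batching loop in Go, assuming messages arrive on a channel and each batch would be sent as one pipelined Redis request (the function name and setup here are mine, for illustration only):

```go
package main

import "fmt"

// batchSizes runs the smart batching loop: take one message, then
// drain everything already queued into the same batch. In Centrifugo
// each batch would go out as a single pipelined Redis request.
func batchSizes(msgs chan string) []int {
	var sizes []int
	for msg := range msgs {
		batch := []string{msg}
	drain:
		for {
			select {
			case m, ok := <-msgs:
				if !ok {
					break drain // channel closed, send what we have
				}
				batch = append(batch, m)
			default:
				break drain // nothing else queued right now
			}
		}
		sizes = append(sizes, len(batch))
	}
	return sizes
}

func main() {
	msgs := make(chan string, 16)
	// Ten messages queued up while a previous "request" was in flight.
	for i := 0; i < 10; i++ {
		msgs <- fmt.Sprintf("message %d", i)
	}
	close(msgs)
	fmt.Println(batchSizes(msgs)) // all ten go out in one batch: [10]
}
```

The key point is that batch size adapts automatically: under light load batches contain one message (lowest latency), and under heavy load they grow to fill each round trip (highest throughput).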
Message history cache

Let’s go further and talk about message history. Centrifugo provides a message history of limited size and lifetime for channels. History for a channel is kept in a Redis LIST data structure. Every time a message is added to the history list, an LTRIM command is called to keep the list at a fixed maximum size, and an EXPIRE command is called as well. So there is no infinite history growth and no memory leaking from old unused channels.
The big win of Redis providing both PUB/SUB and storage capabilities is that publishing a message and saving it into the channel history list can be done atomically. So when publishing a message, Centrifugo actually calls a Lua script which combines both operations into one atomic step. Here is the script:
local n = redis.call("publish", ARGV[1], ARGV[2])
local m = redis.call("lpush", KEYS[1], ARGV[2])
if m > 0 then
    redis.call("ltrim", KEYS[1], 0, ARGV[3])
    redis.call("expire", KEYS[1], ARGV[4])
end
return n

KEYS contains the Redis key names we operate on, and the ARGV array holds the script arguments: the channel name to publish to, the data to publish, and some Centrifugo-specific channel options which determine how to deal with the message.
Again: the combination of PUB/SUB, data storage and atomic, flexible Lua procedures makes Redis unique and very suitable for Centrifugo’s needs.
Presence information

Another important Centrifugo feature is presence information: sometimes an application needs information about the clients currently connected and subscribed to a certain channel. Presence information must expire.
Presence information in Centrifugo is implemented using a combination of two built-in Redis data structures: a sorted set (ZSET) and a HASH. Below is the Lua script that updates presence information for a client in a channel:
redis.call("zadd", KEYS[1], ARGV[2], ARGV[3])
redis.call("hset", KEYS[2], ARGV[3], ARGV[4])
redis.call("expire", KEYS[1], ARGV[1])
redis.call("expire", KEYS[2], ARGV[1])

KEYS contains the key for the sorted set structure and the key for the HASH structure, both built from the channel name.
- ARGV[1]: expiration seconds
- ARGV[2]: expire-at time as Unix seconds for the sorted set member
- ARGV[3]: unique connection ID in Centrifugo
- ARGV[4]: encoded c
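A sketch of how those expire-at scores can be used: members whose score lies in the past are considered gone, which is what lets presence entries for dead connections age out even between EXPIRE calls. The types, connection IDs and timestamps below are illustrative only.

```go
package main

import "fmt"

// Presence models the sorted set from the script:
// connection ID -> expire-at Unix time (the ZADD score).
type Presence map[string]int64

// Add mirrors the ZADD call: the score is an absolute expiration time.
func (p Presence) Add(connID string, expireAt int64) { p[connID] = expireAt }

// Alive returns members whose expire-at score is still in the future,
// the same filtering a score-range query over the sorted set gives.
func (p Presence) Alive(now int64) []string {
	var out []string
	for id, exp := range p {
		if exp > now {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	p := Presence{}
	p.Add("conn-1", 100) // hypothetical IDs and Unix-second timestamps
	p.Add("conn-2", 50)
	fmt.Println(len(p.Alive(75))) // conn-2 already expired at now=75: 1
}
```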