The first release candidate of Redis 4.0 is out

antirez 2793 days ago. 297205 views.

It’s not yet stable but it’s soon to become, and comes with a long list of things that will make Redis more useful for we users: finally Redis 4.0 Release Candidate 1 is here, and is bold enough to call itself 4.0 instead of 3.4. For me semantic versioning is not a thing, what I like instead is try to communicate, using version numbers and jumps, what’s up with the new version, and in this specific case 4.0 means “this is the shit”.

It’s just that Redis 4.0 has a lot of things that Redis should have had since ages, in a different world where one developer can, like Ken The Warrior, duplicate itself in ten copies and start to code. But it does not matter how hard I try to learn about new vim shortcuts, still the duplicate-me thing is not in my chords.

But well, finally with 4.0 we got a lot of these stuff done… and here is the list of the big ones, with a few details and pointers to learn more.

1. Modules

As you probably already know Redis 4.0 got a modules system, and one that allows to do pretty fancy stuff like implementing new data types that are RDB/AOF persisted, create non blocking commands and so forth. The deal here is that all this is done with an higher level abstract API completely separated from the core, so the modules you write are going to work with new releases of Redis. Using modules I wrote Neural Redis, a neural network data type that can be trained inside Redis itself, and many people are doing very interesting stuff: there are new rate limiting commands (implemented in Rust!), Graph DBs on top of Redis, secondary indexes, time series modules, full text indexing, and a number of other stuff, and my feeling is that’s just the start.

This does not just allow Redis to grow and cover new things, while taking the core, if not minimal, just with things that are useful to most users at least, pretty generic things that many people need. But also has the potential to avoid for many tasks the problem of rewriting a networked server, even if the goal is to create something not related to Redis, databases, caching, or whatever Redis is. That is, you can just write a module to use the Redis “infrastructure”: the protocol, the clients people already wrote and so forth. So I’ve a good feeling about it, no pressure for the core, freedom for the users that want to do more crazy stuff.

2. Replication version 2

So, that’s going to be very useful in production from the POV of operations. At some point in the past we introduced what is known as “PSYNC”. It was a new master-slave protocol that allowed the master and the salve to continue from where they were, if the connection between the two broke. Before that, every single connection break in the replication link between master and slave would result into a full synchronization: generate an RDB file in the master, transfer it, load it in the slave, ok you know how this works. So PSYNC was like a real improvement. But not enough…

PSYNC was not good enough when there was a failover. If a slave is promoted to master, the slaves that replicated with the old master were not able to connect to the new promoted slave and PSYNC with it: a full resynchronization was needed. This is not great, and is not good for Redis Cluster as well. However fixing this required to do changes to the replication protocol, because I really wanted to make sure that partial resyncs where going to work after any possible topology change, as long as there was a common replication history among the instances.

So the first change needed was about how “chained replication” works, that is, slaves of slaves of slaves … How do they work? Like, A is the master, and we have something like that:

A —> B —> C —> D

So A is the master of B but B is the master of C and so forth. Before Redis 4.0 what was happening is that B received the replication protocol from A. The replication protocol is, normally, a stream of write commands. B acted as a master for C just doing, internally, what A was doing: at every write it was generating again a suitable replication protocol to pass to C, and so forth.

Now instead B just proxies, verbatim, what it receives from A to C, and C will do the same for D: given that all the sub slaves now receive an identical stream of bytes, they can “tag” a given history, and use the tag and the offset to always try to continue if they have something in common.

The master itself, when it is turned into a slave, is now able to PSYNC with the new master. And slaves can usually PSYNC with the master even after a “clean” restart, using the info inside the RDB file, that now stores the replication tags and offsets.

Details are more complex than that, in order to make it working well, but well the deal here is, don’t be annoyed by full resynchronizations if possible. And PSYNC v2 does this well apparently. Please try it and let me know if you are interested in this feature.

3. Cache eviction improvements

Ok about that I wrote a full article months ago: http://antirez.com/news/109. So going to give you just the TLDR. We now have LFU (Last Frequently Used), and all the other policies switched to a more robust, fast and precise implementation. So big news for caching use cases. Read the full article if you care about this stuff, there are tons of infos.

4. Non blocking DEL and FLUSHALL/FLUSHDB.

Code name “lazy freeing of objects”, but it’s a lame name for a neat feature. There is a new command called UNLINK that just deletes a key reference in the database, and does the actual clean up of the allocations in a separated thread, so if you use UNLINK instead of DEL against a huge key the server will not block. And even better with the ASYNC options of FLUSHALL and FLUSHDB you can do that for whole DBs or for all the data inside the instance, if you want. Combined with the new SWAPDB command, that swaps two Redis databases content, FLUSHDB ASYNC can be quite interesting. Once you, for instance, populated DB 1 with the new version of the data, you can SWAPDB 0 1 and FLUSHDB ASYNC the database with the old data, and create yet a newer version and reiterate. This is only possible now because flushing a whole DB is no longer blocking.

There are reasons why UNLINK is not the default for DEL. I know things… I can’t talk (**).

5. Mixed RDB-AOF persistence format.

Optionally, if you enable it, now AOF rewrites are performed by prefixing the AOF file with an RDB file, which is both faster to generate and to load. This is going to be very useful in certain environments, but makes the AOF file a less transparent file, so it’s an optional thing for now. This feature was talked for ages and finally is “in”.

6. The new MEMORY command.

I love it, as much as I loved LATENCY DOCTOR that once introduced cut the percentage of “My Redis is slow” complains in the mailing list to a minor fraction. Now we have it for the memory issues as well.

127.0.0.1:6379> MEMORY DOCTOR
Hi Sam, this instance is empty or is using very little memory, my issues detector can't be used in these conditions. Please, leave for your mission on Earth and fill it with some data. The new Sam and I will be back to our programming as soon as I finished rebooting.

Movie rights owners will probably sue me for taking inspirations of sci-fi dialogues but that’s fine. Bring oranges for me when I’ll be in jail.

MEMORY does a lot more than that.

127.0.0.1:6379> MEMORY HELP
1) "MEMORY USAGE <key> [SAMPLES <count>] - Estimate memory usage of key"
2) "MEMORY STATS - Show memory usage details"
3) "MEMORY PURGE - Ask the allocator to release memory"
4) "MEMORY MALLOC-STATS - Show allocator internal stats"

The memory usage reporting of the USAGE subcommand is going to be very useful, but also the in depth informations provided by “STATS”.

For now all this is totally not documented, so have fun figuring out what the heck it does.

7. Redis Cluster is now NAT / Docker compatible.

But this is actually a bad news too, because the binary protocol changed in the “Cluster bus” thing nodes use to communicate, so to upgrade to 4.0 you need a mass reboot of Redis Cluster. I’m sorry I was tricked into this NAT / Docker fixes, pardon me. And I tried to make it backward compatible but there was no way to obtain this easily, or even non easily, without doing totally awkward stuff.

The feature is explained in the example redis.conf file. You can’t see how I’m a bit less excited about this feature compared to the other ones, don’t you?

Well… that’s it I guess, those are the major things. If you want also read the release notes here: https://raw.githubusercontent.com/antirez/redis/4.0/00-RELEASENOTES

ETA to get it stable is as usually unknown: I plan to release a new RC every 2-4 weeks more or less. As bugs slow down significantly both from the point of view of severity and frequency of reporting, it will be Redis 4.0-final time. However there is a lot of doc to update as well, so I’ll have plenty of things to do.

Many thanks to all the people that contributed to this release: a lot did in significant ways. In the release note above there is the list of all the commits that you can scan in order to see the names of the committers.

Thanks to the Redis community and Redis Labs for making all this possible, but especially thanks to all the developers that use Redis and apply it well in every day problems to get shit done, because this is the whole point, other than having fun while coding.

P.s. faster way to grab the new code is to fetch the '4.0' branch from Github. Repository is as usually antirez/redis.

** Check https://news.ycombinator.com/item?id=13091370 for more info about UNLINK not being the default behavior for DEL.