And since it's only used for speedy lookup we can even use a fast, cheap and non-secure hashing algorithm, so it's really a low-cost operation!
Thanks! This was really one of those aha-moments where I feel kinda stupid to not have thought of it myself!
yes, that's the crux of the problem. when you have a sharded database, typically you want to be able to add (and/or remove) shards easily and non-disruptively.
for example - your database is currently sharded across N nodes, and it's overloaded due to increased traffic, so you want to increase it to N+1 nodes (or N+M nodes, which can add complexity in some cases)
if adding a shard causes a significant increase in load on the database, that's usually a non-starter for a production workload, because at the time you want to do it, the database is already overloaded
you can read about this in the original Dynamo paper [0] from almost 20 years ago - consistent hashing is used to select 3 of the N nodes to host a given key. when node N+1 is added, it joins the cluster in such a way that it will "take over" hosting 1/Nth of the data, from each of the N nodes - meaning that a) the joining process places a relatively small load on each of those N nodes and b) once the node is fully joined, it reduces overall load evenly across all N nodes.
0: https://www.allthingsdistributed.com/2007/10/amazons_dynamo....