Global Tables vs. Duplicate Indexes
With CockroachDB version 21.1, a new concept of “global tables” was introduced. This functionality is a good fit for tables where read latency needs to be very low and write latency can be much higher than normal: lookups need to be fast, but we can tolerate lengthy writes. It is also a fit when reads must be up to date, whether for business reasons or because the table is referenced by foreign keys, and when latency-critical reads cannot be tied to specific regions. That makes global tables ideal for any kind of reference table, where we want to get the data out very quickly but can live with slow inserts and updates. With global tables, the lengthy writes do all the pre-work up front to prevent the contention and serializable errors we would traditionally see. By the time we read from the table, that work has already been done, so we can read from the closest replica, effectively as if we were doing a follower read but on current data.
Previously, this same scenario was handled with duplicate copies of covering indexes, where each copy of the index was pinned to a specific region. Follower reads could then be used to make lookups outside of those pinned regions even faster, if the bounded staleness inherent to follower reads could be tolerated.
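For reference, the older pattern looked roughly like the sketch below. This is an illustration rather than the exact setup discussed later: the table name, index names, and the two extra region names are hypothetical.

-- Hypothetical pre-21.1 duplicate-index pattern: one covering index per
-- extra region, each pinned to its region through an index zone config.
CREATE TABLE ref_table (id INT PRIMARY KEY, code STRING);
CREATE INDEX idx_us_west1 ON ref_table (id) STORING (code);
CREATE INDEX idx_us_central1 ON ref_table (id) STORING (code);

-- Pin the primary index (and its leaseholder) to one region...
ALTER TABLE ref_table CONFIGURE ZONE USING
    num_replicas = 3,
    constraints = '{+region=us-east1: 1}',
    lease_preferences = '[[+region=us-east1]]';

-- ...and pin each duplicate index to its own region.
ALTER INDEX ref_table@idx_us_west1 CONFIGURE ZONE USING
    num_replicas = 3,
    constraints = '{+region=us-west1: 1}',
    lease_preferences = '[[+region=us-west1]]';

ALTER INDEX ref_table@idx_us_central1 CONFIGURE ZONE USING
    num_replicas = 3,
    constraints = '{+region=us-central1: 1}',
    lease_preferences = '[[+region=us-central1]]';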
The downside, at least for larger enterprise customers, is the new syntax. With the release of 21.1, the new multi-region syntax is declarative: you specify the primary region for your database or table, you specify the survivability goal, and you set a table’s locality to GLOBAL. This is all much simpler than the older syntax for specifying zone configurations. But if you need to be precise about where replicas and leaseholders are placed, whether because of specific business requirements or because the multi-region layout of the cluster is non-trivial, the specificity of the older syntax is preferable.
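As a rough sketch, the declarative flow looks something like this (the table name and the two extra region names are hypothetical; the exact statements run against the test cluster are shown further down):

-- Declarative multi-region syntax introduced in 21.1.
ALTER DATABASE defaultdb PRIMARY REGION "us-east1";
ALTER DATABASE defaultdb ADD REGION "us-west1";
ALTER DATABASE defaultdb ADD REGION "us-central1";
ALTER DATABASE defaultdb SURVIVE ZONE FAILURE;   -- the survivability goal
ALTER TABLE ref_table SET LOCALITY GLOBAL;       -- the table locality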
So why don’t we use the old and new syntax together? Internally, CRDB assumes that things have been set up with either the old syntax or the new one. Once we touch a zone config with the older syntax, we can’t be certain the new syntax will take all of those changes into account. But… we can use the new syntax to create a table, inspect it with our older command set, and then use the new variables in our own zone configurations.
Let’s fire up a test cluster and create a table.
root@localhost:26257/defaultdb> create table postal_codes ( id int primary key, code string) locality global;
ERROR: cannot set LOCALITY on a table in a database that is not multi-region enabled
SQLSTATE: 42P16
Well, that didn’t go as planned. We need to set a primary region for our DB first.
root@localhost:26257/defaultdb> alter database defaultdb primary region "us-east1";
ALTER DATABASE PRIMARY REGION

Time: 94ms total (execution 93ms / network 0ms)

root@localhost:26257/defaultdb> create table postal_codes ( id int primary key, code string) locality global;
CREATE TABLE

Time: 23ms total (execution 23ms / network 0ms)
Now if we look at the SHOW CREATE TABLE output, we see what you would expect after reading the CRDB documentation on the subject.
root@localhost:26257/defaultdb> show create table postal_codes;
   table_name  |                create_statement
---------------+----------------------------------------------------
  postal_codes | CREATE TABLE public.postal_codes (
               |     id INT8 NOT NULL,
               |     code STRING NULL,
               |     CONSTRAINT "primary" PRIMARY KEY (id ASC),
               |     FAMILY "primary" (id, code)
               | ) LOCALITY GLOBAL
(1 row)

Time: 46ms total (execution 46ms / network 0ms)
The LOCALITY GLOBAL is the key element in there.
But what if we look at the zone configuration?
root@localhost:26257/defaultdb> show zone configuration for table postal_codes;
        target       |                 raw_config_sql
---------------------+--------------------------------------------------
  TABLE postal_codes | ALTER TABLE postal_codes CONFIGURE ZONE USING
                     |     range_min_bytes = 134217728,
                     |     range_max_bytes = 536870912,
                     |     gc.ttlseconds = 90000,
                     |     global_reads = true,
                     |     num_replicas = 3,
                     |     num_voters = 3,
                     |     constraints = '{+region=us-east1: 1}',
                     |     voter_constraints = '[+region=us-east1]',
                     |     lease_preferences = '[[+region=us-east1]]'
(1 row)

Time: 17ms total (execution 17ms / network 0ms)
Compared to 20.2 and previous versions, there are some new variables here. Specifically, the new items are global_reads, num_voters, and voter_constraints. All of the other elements look familiar.
So if this new global table concept is just a set of zone configuration variables… why don’t we use those variables in our traditional zone configurations? That way we get the new functionality while keeping the older syntax, for the reasons noted previously. The caveat for using global_reads = true is that we must also specify a lease preference and constraints (both regular and voter constraints). And once we use the older zone configuration syntax, we will not want to use the new declarative syntax for survivability, regions, and global tables.
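As a sketch, and mirroring the values from the zone configuration shown above, applying the new variables through the older syntax would look something like this:

-- Older zone-config syntax carrying the new global-table variables.
ALTER TABLE postal_codes CONFIGURE ZONE USING
    global_reads = true,
    num_replicas = 3,
    num_voters = 3,
    constraints = '{+region=us-east1: 1}',
    voter_constraints = '[+region=us-east1]',
    lease_preferences = '[[+region=us-east1]]';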
In the example above, where we have three regions in our nine-node cluster, we are creating three replicas in total and pinning them, along with the leaseholder, to us-east1. If we take num_replicas and subtract num_voters, we are left with the number of non-voting replicas in the cluster (0 in this case) that global table reads will take advantage of. We can increase the number of replicas to nine, giving us six non-voting replicas and a replica on every node in the cluster, with the voting replicas (the ones that participate in quorum-based writes) staying in us-east1. Now, with nine replicas, there will be a copy of the data close to anywhere we might issue a query.
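Assuming we leave the rest of the zone configuration as shown above, that change is a single statement:

ALTER TABLE postal_codes CONFIGURE ZONE USING num_replicas = 9;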
If instead we set num_replicas = 5, we would have the three voting replicas in us-east1 and a non-voting replica in each of the other two regions. Lookups in us-east1 remain very fast (about 1ms for point lookups with proper indexing), and lookups in the other two regions are relatively fast (about 14ms for point lookups with proper indexing if we aren’t connected to the node holding that region’s replica, and about 1ms if we are).
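Again as a sketch, with the other zone-config values left as shown above:

ALTER TABLE postal_codes CONFIGURE ZONE USING num_replicas = 5;

-- Optionally, inspect where the ranges and their leaseholders ended up.
SHOW RANGES FROM TABLE postal_codes;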
And with this, we no longer need to employ our older technique of creating duplicate covering indexes and specifying which region each of those indexes should reside in. Instead, we can set global_reads = true in our zone configuration for the table and take advantage of the new functionality.