The cost-based optimizer seeks the lowest cost for a query, usually related to time.
In versions prior to 2.1, a heuristic planner was used to generate query execution plans. The heuristic planner is only used in the following cases:
- If your query uses functionality that is not yet supported by the cost-based optimizer. For more information about the types of queries that are supported, see Types of statements supported by the cost-based optimizer.
- If you explicitly turn off the optimizer. For more information, see How to turn the optimizer off.
How is cost calculated?
A given SQL query can have thousands of equivalent query plans with vastly different execution times. The cost-based optimizer enumerates these plans and chooses the lowest cost plan.
Cost is roughly calculated by:
- Estimating how much time each node in the query plan will use to process all results
- Modeling how data flows through the query plan
The most important factor in determining the quality of a plan is cardinality (i.e., the number of rows); the fewer rows each SQL operator needs to process, the faster the query will run.
View query plan
To see whether a query will be run with the cost-based optimizer, run the query with EXPLAIN (OPT). The OPT option displays a query plan tree, along with some information that was used to plan the query.
If the query is unsupported it will return an error message that starts with e.g., pq: unsupported statement. In such cases, the query will be run with the legacy heuristic planner. This should be rare since the optimizer supports most SQL statements.
Types of statements supported by the cost-based optimizer
The cost-based optimizer supports most SQL statements. Specifically, the following types of statements are supported:
- CREATE TABLE
- UPDATE
- INSERT, including:- INSERT .. ON CONFLICT DO NOTHING
- INSERT .. ON CONFLICT .. DO UPDATE
 
- UPSERT
- DELETE
- FILTERclauses on aggregate functions
- Sequences
- Views
- All SELECTstatements that do not include window functions
- All UNIONstatements that do not include window functions
- All VALUESstatements that do not include window functions
- Most correlated subqueries — for exceptions, see Correlated subqueries.
This is not meant to be an exhaustive list. To check whether a particular query will be run with the cost-based optimizer, follow the instructions in the View query plan section.
Table statistics
The cost-based optimizer can often find more performant query plans if it has access to statistical data on the contents of your tables. This data needs to be generated from scratch for new tables, and regenerated periodically for existing tables.
New in v19.1: By default, CockroachDB generates table statistics automatically as tables are updated. It does this using a background job that automatically determines which columns to get statistics on — specifically, it chooses:
- Columns that are part of the primary key or an index (in other words, all indexed columns).
- Up to 100 non-indexed columns.
Schema changes trigger automatic statistics collection for the affected table(s).
Controlling automatic statistics
For best query performance, most users should leave automatic statistics enabled with the default settings. The information provided in this section is useful for troubleshooting or performance tuning by advanced users.
Controlling statistics refresh rate
Statistics are refreshed in the following cases:
- When there are no statistics.
- When it's been a long time since the last refresh, where "long time" is defined according to a moving average of the time across the last several refreshes.
- After each mutation operation (INSERT,UPDATE, orDELETE), the probability of a refresh is calculated using a formula that takes the cluster settings shown below as inputs. These settings define the target number of rows in a table that should be stale before statistics on that table are refreshed. Increasing either setting will reduce the frequency of refreshes. In particular,min_stale_rowsimpacts the frequency of refreshes for small tables, whilefraction_stale_rowshas more of an impact on larger tables.
| Setting | Default Value | Details | 
|---|---|---|
| sql.stats.automatic_collection.fraction_stale_rows | 0.2 | Target fraction of stale rows per table that will trigger a statistics refresh | 
| sql.stats.automatic_collection.min_stale_rows | 500 | Target minimum number of stale rows per table that will trigger a statistics refresh | 
Because the formula for statistics refreshes is probabilistic, you should not expect to see statistics update immediately after changing these settings, or immediately after exactly 500 rows have been updated.
Turning off statistics
If you need to turn off automatic statistics collection, follow the steps below:
- Run the following statement to disable the automatic statistics cluster setting: - > SET CLUSTER SETTING sql.stats.automatic_collection.enabled = false;
- Use the - SHOW STATISTICSstatement to view automatically generated statistics.
- Delete the automatically generated statistics using the following statement: - > DELETE FROM system.table_statistics WHERE true;
- Restart the nodes in your cluster to clear the statistics caches. 
For instructions showing how to manually generate statistics, see the examples in the CREATE STATISTICS documentation.
Query plan cache
New in v19.1: CockroachDB uses a cache for the query plans generated by the optimizer. This can lead to faster query execution since the database can reuse a query plan that was previously calculated, rather than computing a new plan each time a query is executed.
The query plan cache is enabled by default. To disable it, execute the following statement:
> SET CLUSTER SETTING sql.query_cache.enabled = false;
Finally, note that only the following statements use the plan cache:
Join reordering
New in v19.1: The cost-based optimizer will explore additional join orderings in an attempt to find the lowest-cost execution plan for a query involving multiple joins, which can lead to significantly better performance in some cases.
Because this process leads to an exponential increase in the number of possible execution plans for such queries, it's only used to reorder subtrees containing 4 or fewer joins by default.
To change this setting, which is controlled by the reorder_joins_limit session variable, run the statement shown below. To disable this feature, set the variable to 0.
> SET reorder_joins_limit = 6;
We strongly recommend not setting this value higher than 8 to avoid performance degradation. If set too high, the cost of generating and costing execution plans can end up dominating the total execution time of the query.
For more information about the difficulty of selecting an optimal join ordering, see our blog post An Introduction to Join Ordering.
Join hints
New in v19.1: The optimizer supports hint syntax to force the use of a specific join algorithm. The algorithm is specified between the join type (INNER, LEFT, etc.) and the JOIN keyword, for example:
- INNER HASH JOIN
- OUTER MERGE JOIN
- LEFT LOOKUP JOIN
- CROSS HASH JOIN
Note that the hint cannot be specified with a bare hint keyword (e.g., MERGE) - in that case, the INNER keyword must be added. For example, a INNER MERGE JOIN b will work, but a MERGE JOIN b will not work.
Join hints cannot be specified with a bare hint keyword (e.g., MERGE) due to SQL's implicit AS syntax. If you're not careful, you can make MERGE be an alias for a table; for example, a MERGE JOIN b will be interpreted as having an implicit AS and be executed as a AS MERGE JOIN b, which is just a long way of saying a JOIN b. Because the resulting query might execute without returning any hint-related error (because it is valid SQL), it will seem like the join hint "worked", but actually it didn't affect which join algorithm was used. In this case, the correct syntax is a INNER MERGE JOIN b.
Supported join algorithms
- HASH: Forces a hash join; in other words, it disables merge and lookup joins. A hash join is always possible, even if there are no equality columns - CockroachDB considers the nested loop join with no index a degenerate case of the hash join (i.e., a hash table with one bucket).
- MERGE: Forces a merge join, even if it requires re-sorting both sides of the join.
- LOOKUP: Forces a lookup join into the right side; the right side must be a table with a suitable index. Note that- LOOKUPcan only be used with- INNERand- LEFTjoins.
If it is not possible to use the algorithm specified in the hint, an error is signaled.
Additional considerations
- This syntax is consistent with the SQL Server syntax for join hints, except that: - SQL Server uses LOOPinstead ofLOOKUP.
- CockroachDB does not support LOOPand instead supportsLOOKUPfor the specific case of nested loop joins with an index.
 
- SQL Server uses 
- When a join hint is specified, the two tables will not be reordered by the optimizer. The reordering behavior has the following characteristics, which can be affected by hints: - Given a JOIN b, CockroachDB will not try to commute tob JOIN a. This means that you will need to pay attention to this ordering, which is especially important for lookup joins. Without a hint,a JOIN bmight be executed asb INNER LOOKUP JOIN ausing an index intoa, whereasa INNER LOOKUP JOIN brequires an index intob.
- (a JOIN b) JOIN cmight be changed to- a JOIN (b JOIN c), but this does not happen if- a JOIN buses a hint; the hint forces that particular join to happen as written in the query.
 
- Given 
- Hint usage should be reconsidered with each new release of CockroachDB. Due to improvements in the optimizer, hints specified to work with an older version may cause decreased performance in a newer version. 
Preferring the nearest index
New in v19.1: Given multiple identical indexes that have different locality constraints using replication zones, the optimizer will prefer the index that is closest to the gateway node that is planning the query. In a properly configured geo-distributed cluster, this can lead to performance improvements due to improved data locality and reduced network traffic.
This feature enables scenarios such as:
- Reference data such as a table of postal codes that can be replicated to different regions, and queries will use the copy in the same region. See Example - zone constraints for more details.
- Optimizing for local reads (potentially at the expense of writes) by adding leaseholder preferences to your zone configuration. See Example - leaseholder preferences for more details.
This feature is only available to users with an enterprise license.
To take advantage of this feature, you need to:
- Have an enterprise license.
- Determine which data consists of reference tables that are rarely updated (such as postal codes) and can therefore be easily replicated to different regions.
- Create multiple secondary indexes on the reference tables. Note that these indexes must include (in key or using STORED) every column that you wish to query. For example, if you runSELECT * from db.tableand not every column ofdb.tableis in the set of secondary indexes you created, the optimizer will have no choice but to fall back to the primary index.
- Create replication zones for each index.
With the above pieces in place, the optimizer will automatically choose the index nearest the gateway node that is planning the query.
The optimizer does not actually understand geographic locations, i.e., the relative closeness of the gateway node to other nodes that are located to its "east" or "west". It is matching against the node locality constraints you provided when you configured your replication zones.
Examples
Zone constraints
We can demonstrate the necessary configuration steps using a local cluster. The instructions below assume that you are already familiar with:
- How to start a local cluster.
- The syntax for assigning node locality when configuring replication zones.
- Using the built-in SQL client.
First, start 3 local nodes as shown below. Use the --locality flag to put them each in a different region as denoted by region=usa, region=eu, etc.
$ cockroach start --locality=region=usa  --insecure --store=/tmp/node0 --listen-addr=localhost:26257 \
  --http-port=8888  --join=localhost:26257,localhost:26258,localhost:26259 --background
$ cockroach start --locality=region=eu   --insecure --store=/tmp/node1 --listen-addr=localhost:26258 \
  --http-port=8889  --join=localhost:26257,localhost:26258,localhost:26259 --background
$ cockroach start --locality=region=apac --insecure --store=/tmp/node2 --listen-addr=localhost:26259 \
  --http-port=8890  --join=localhost:26257,localhost:26258,localhost:26259 --background
$ cockroach init --insecure --host=localhost --port=26257
Next, from the SQL client, add your organization name and enterprise license:
$ cockroach sql --insecure --host=localhost --port=26257
> SET CLUSTER SETTING cluster.organization = 'FooCorp - Local Testing';
> SET CLUSTER SETTING enterprise.license = 'xxxxx';
Create a test database and table. The table will have 3 indexes into the same data. Later, we'll configure the cluster to associate each of these indexes with a different datacenter using replication zones.
> CREATE DATABASE IF NOT EXISTS test;
> USE test;
CREATE TABLE postal_codes (
    id INT PRIMARY KEY,
    code STRING,
    INDEX idx_eu (id) STORING (code),
    INDEX idx_apac (id) STORING (code)
);
Next, we modify the replication zone configuration via SQL so that:
- Nodes in the USA will use the primary key index.
- Nodes in the EU will use the postal_codes@idx_euindex (which is identical to the primary key index).
- Nodes in APAC will use the postal_codes@idx_apacindex (which is also identical to the primary key index).
ALTER TABLE postal_codes CONFIGURE ZONE USING constraints='["+region=usa"]';
ALTER INDEX postal_codes@idx_eu CONFIGURE ZONE USING constraints='["+region=eu"]';
ALTER INDEX postal_codes@idx_apac CONFIGURE ZONE USING constraints='["+region=apac"]';
To verify this feature is working as expected, we'll query the database from each of our local nodes as shown below. Each node has been configured to be in a different region, and it should now be using the index pinned to that region.
In a geo-distributed scenario with a cluster that spans multiple datacenters, it may take time for the optimizer to fetch schemas from other nodes the first time a query is planned; thereafter, the schema should be cached locally.
For example, if you have 11 nodes, you may see 11 queries with high latency due to schema cache misses. Once all nodes have cached the schema locally, the latencies will drop.
This behavior may also cause the Statements page of the Web UI to show misleadingly high latencies until schemas are cached locally.
As expected, the node in the USA region uses the primary key index.
$ cockroach sql --insecure --host=localhost --port=26257 --database=test -e 'EXPLAIN SELECT * FROM postal_codes WHERE id=1;'
  tree | field |      description
+------+-------+----------------------+
  scan |       |
       | table | postal_codes@primary
       | spans | /1-/1/#
(3 rows)
As expected, the node in the EU uses the idx_eu index.
$ cockroach sql --insecure --host=localhost --port=26258 --database=test -e 'EXPLAIN SELECT * FROM postal_codes WHERE id=1;'
  tree | field |     description
+------+-------+---------------------+
  scan |       |
       | table | postal_codes@idx_eu
       | spans | /1-/2
(3 rows)
As expected, the node in APAC uses the idx_apac index.
$ cockroach sql --insecure --host=localhost --port=26259 --database=test -e 'EXPLAIN SELECT * FROM postal_codes WHERE id=1;'
  tree | field |      description
+------+-------+-----------------------+
  scan |       |
       | table | postal_codes@idx_apac
       | spans | /1-/2
(3 rows)
You'll need to make changes to the above configuration to reflect your production environment, but the concepts will be the same.
Leaseholder preferences
If you provide leaseholder preferences in addition to replication zone constraints, the optimizer will attempt to take your leaseholder preferences into account as well when selecting an index for your query. There are several factors to keep in mind:
- Zone constraints are always respected (hard constraint), whereas lease preferences are taken into account as "additional information" -- as long as they do not contradict the zone constraints. 
- The optimizer does not consider the real-time location of leaseholders when selecting an index; it is pattern matching on the text values passed in the configuration (e.g., the - ALTER INDEXstatements shown below). For the same reason, the optimizer only matches against the first locality in your- lease_preferencesarray.
- The optimizer may use an index that satisfies your leaseholder preferences even though that index has moved to a different node/region due to leaseholder rebalancing. This can cause slower performance than you expected. Therefore, you should only use this feature if you’re confident you know where the leaseholders will end up based on your cluster's usage patterns. We recommend thoroughly testing your configuration to ensure the optimizer is selecting the index(es) you expect. 
In this example, we'll set up an authentication service using the access token / refresh token pattern from OAuth 2. To support fast local reads in our geo-distributed use case, we will have 3 indexes into the same authentication data: one for each region of our cluster. We configure each index using zone configurations and lease preferences so that the optimizer will use the local index for better performance.
The instructions below assume that you are already familiar with:
- How to start a local cluster.
- The syntax for assigning node locality when configuring replication zones.
- Using the built-in SQL client.
First, start 3 local nodes as shown below. Use the --locality flag to put them each in a different region.
$ cockroach start --locality=region=us-east  --insecure --store=/tmp/node0 --listen-addr=localhost:26257 \
  --http-port=8888  --join=localhost:26257,localhost:26258,localhost:26259 --background
$ cockroach start --locality=region=us-central   --insecure --store=/tmp/node1 --listen-addr=localhost:26258 \
  --http-port=8889  --join=localhost:26257,localhost:26258,localhost:26259 --background
$ cockroach start --locality=region=us-west --insecure --store=/tmp/node2 --listen-addr=localhost:26259 \
  --http-port=8890  --join=localhost:26257,localhost:26258,localhost:26259 --background
$ cockroach init --insecure --host=localhost --port=26257
From the SQL client, add your organization name and enterprise license:
$ cockroach sql --insecure --host=localhost --port=26257
> SET CLUSTER SETTING cluster.organization = 'FooCorp - Local Testing';
> SET CLUSTER SETTING enterprise.license = 'xxxxx';
Create an authentication database and table:
> CREATE DATABASE if NOT EXISTS auth;
> USE auth;
> CREATE TABLE token (
    token_id VARCHAR(100) NULL,
    access_token VARCHAR(4000) NULL,
    refresh_token VARCHAR(4000) NULL
  );
Create the indexes for each region:
> CREATE INDEX token_id_west_idx ON token (token_id) STORING (access_token, refresh_token);
> CREATE INDEX token_id_central_idx ON token (token_id) STORING (access_token, refresh_token);
> CREATE INDEX token_id_east_idx ON token (token_id) STORING (access_token, refresh_token);
Enter zone configurations to distribute replicas across the cluster as follows:
- For the "East" index, store 2 replicas in the East, 2 in Central, and 1 in the West. Further, prefer that the leaseholders for that index live in the East or, failing that, in the Central region.
- Follow the same replica and leaseholder patterns for each of the Central and West regions.
The idea is that, for example, token_id_east_idx will have sufficient replicas (2/5) so that even if one replica goes down, the leaseholder will stay in the East region. That way, if a query comes in that accesses the columns covered by that index from the East gateway node, the optimizer will select token_id_east_idx for fast reads.
The ALTER TABLE statement below is not required since it's later made redundant by the token_id_west_idx index. In production, you might go with the ALTER TABLE to put your table's lease preferences in the West, and then create only 2 indexes (for East and Central); however, the use of 3 indexes makes the example easier to understand.
> ALTER TABLE token CONFIGURE ZONE USING
        num_replicas = 5, constraints = '{+region=us-east: 1, +region=us-central: 2, +region=us-west: 2}', lease_preferences = '[[+region=us-west], [+region=us-central]]';
> ALTER INDEX token_id_east_idx CONFIGURE ZONE USING num_replicas = 5,
        constraints = '{+region=us-east: 2, +region=us-central: 2, +region=us-west: 1}', lease_preferences = '[[+region=us-east], [+region=us-central]]';
> ALTER INDEX token_id_central_idx CONFIGURE ZONE USING num_replicas = 5,
        constraints = '{+region=us-east: 2, +region=us-central: 2, +region=us-west: 1}', lease_preferences = '[[+region=us-central], [+region=us-east]]';
> ALTER INDEX token_id_west_idx CONFIGURE ZONE USING num_replicas = 5,
        constraints = '{+region=us-west: 2, +region=us-central: 2, +region=us-east: 1}', lease_preferences = '[[+region=us-west], [+region=us-central]]';
Next let's check our zone configurations to make sure they match our expectation:
> SHOW ZONE CONFIGURATIONS;
The output should include the following:
auth.token                      | ALTER TABLE auth.public.token CONFIGURE ZONE USING
                                |     num_replicas = 5,
                                |     constraints = '{+region=us-central: 2, +region=us-east: 1, +region=us-west: 2}',
                                |     lease_preferences = '[[+region=us-west], [+region=us-central]]'
auth.token@token_id_east_idx    | ALTER INDEX auth.public.token@token_id_east_idx CONFIGURE ZONE USING
                                |     num_replicas = 5,
                                |     constraints = '{+region=us-central: 2, +region=us-east: 2, +region=us-west: 1}',
                                |     lease_preferences = '[[+region=us-east], [+region=us-central]]'
auth.token@token_id_central_idx | ALTER INDEX auth.public.token@token_id_central_idx CONFIGURE ZONE USING
                                |     num_replicas = 5,
                                |     constraints = '{+region=us-central: 2, +region=us-east: 2, +region=us-west: 1}',
                                |     lease_preferences = '[[+region=us-central], [+region=us-east]]'
auth.token@token_id_west_idx    | ALTER INDEX auth.public.token@token_id_west_idx CONFIGURE ZONE USING
                                |     num_replicas = 5,
                                |     constraints = '{+region=us-central: 2, +region=us-east: 1, +region=us-west: 2}',
                                |     lease_preferences = '[[+region=us-west], [+region=us-central]]'
Now that we've set up our indexes the way we want them, we need to insert some data. The first statement below inserts 10,000 rows of placeholder data; the second inserts a row with a specific UUID string that we'll later query against to check which index is used.
On a freshly created cluster like this one, you may need to wait a moment after adding the data to give automatic statistics time to update. Then, the optimizer can generate a query plan that uses the expected index.
> INSERT
  INTO
      token (token_id, access_token, refresh_token)
  SELECT
      gen_random_uuid()::STRING,
      gen_random_uuid()::STRING,
      gen_random_uuid()::STRING
  FROM
      generate_series(1, 10000);
> INSERT
  INTO
      token (token_id, access_token, refresh_token)
  VALUES
      (
          '2E1B5BFE-6152-11E9-B9FD-A7E0F13211D9',
          '49E36152-6152-11E9-8CDC-3682F23211D9',
          '4E0E91B6-6152-11E9-BAC1-3782F23211D9'
      );
Finally, we EXPLAIN a selection query from each node to verify which index is being queried against. For example, when running the query shown below against the us-west node, we expect it to use the token_id_west_idx index.
$ cockroach sql --insecure --host=localhost --port=26259 --database=auth # "West" node
> EXPLAIN
      SELECT
          access_token, refresh_token
      FROM
          token
      WHERE
          token_id = '2E1B5BFE-6152-11E9-B9FD-A7E0F13211D9';
    tree    | field |                                        description
+-----------+-------+-------------------------------------------------------------------------------------------+
  render    |       |
   └── scan |       |
            | table | token@token_id_west_idx
            | spans | /"2E1B5BFE-6152-11E9-B9FD-A7E0F13211D9"-/"2E1B5BFE-6152-11E9-B9FD-A7E0F13211D9"/PrefixEnd
(4 rows)
Time: 787µs
Similarly, queries from the us-east node should use the token_id_east_idx index (and the same should be true for us-central).
$ cockroach sql --insecure --host=localhost --port=26257 --database=auth # "East" node
> EXPLAIN
      SELECT
          access_token, refresh_token
      FROM
          token
      WHERE
          token_id = '2E1B5BFE-6152-11E9-B9FD-A7E0F13211D9';
    tree    | field |                                        description
+-----------+-------+-------------------------------------------------------------------------------------------+
  render    |       |
   └── scan |       |
            | table | token@token_id_east_idx
            | spans | /"2E1B5BFE-6152-11E9-B9FD-A7E0F13211D9"-/"2E1B5BFE-6152-11E9-B9FD-A7E0F13211D9"/PrefixEnd
(4 rows)
Time: 619µs
You'll need to make changes to the above configuration to reflect your production environment, but the concepts will be the same.
How to turn the optimizer off
With the optimizer turned on, the performance of some workloads may change. If your workload performs worse than expected (e.g., lower throughput or higher latency), you can turn off the cost-based optimizer and use the heuristic planner.
To turn the cost-based optimizer off for the current session:
> SET optimizer = 'off';
To turn the cost-based optimizer off for all sessions:
> SET CLUSTER SETTING sql.defaults.optimizer = 'off';
Changing the cluster setting does not immediately turn the optimizer off; instead, it changes the default session setting to off. To see the change, restart your session.
Known limitations
- Some features are not supported by the cost-based optimizer; however, the optimizer will fall back to the heuristic planner for this functionality. If performance is worse than in previous versions of CockroachDB, you can turn the optimizer off to manually force it to fallback to the heuristic planner.