The cockroach debug zip command connects to your cluster and gathers information from each active node into a single .zip file (inactive nodes are not included). For details on the .zip contents, see Files.
You can use the cockroach debug merge-logs command in conjunction with cockroach debug zip to merge the collected logs into one file, making them easier to parse.
The files produced by cockroach debug zip can contain highly sensitive, personally-identifiable information (PII), such as usernames, hashed passwords, and possibly table data. If you intend to share the .zip file with Cockroach Labs, use the --redact flag to have CockroachDB redact sensitive data (excluding range keys) when generating it.
Details
Use cases
cockroach debug zip is an expensive operation and impacts cluster performance.
Only use this command as an emergency measure under the guidance of Cockroach Labs.
In particular, fetching stack traces for all goroutines is a "stop-the-world" operation, which can momentarily but significantly increase SQL service latency. To exclude these goroutine stacks, use the --include-goroutine-stacks=false flag.
There are two scenarios in which debug zip is useful:
- If you experience severe or difficult-to-reproduce issues with your cluster, Cockroach Labs might ask you to send us your cluster's debugging information using cockroach debug zip. We recommend reducing the .zip file size by retrieving debugging information only for the relevant time range of the issue, using the --files-from and/or --files-until flags.
- To collect all of your nodes' logs, which you can then parse to locate issues. You can optionally use the flags to retrieve only the log files. For more information about logs, see Logging. Also note:
  - Nodes that are currently down cannot deliver their logs over the network. For these nodes, you must log on to the machine where the cockroach process would otherwise be running, and gather the files manually.
  - Nodes that are currently up but disconnected from other nodes (e.g., because of a network partition) may not be able to respond to debug zip requests forwarded by other nodes, but can still respond to requests for data when asked directly. In such situations, we recommend using the --host flag to point debug zip at each of the disconnected nodes until data has been gathered for the entire cluster.
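For the disconnected-node case, pointing debug zip at each node in turn can be scripted. The following is a minimal sketch that only prints one command per node so you can review them before running anything. The function name print_zip_commands, the node addresses, and the output file names are all hypothetical placeholders.

```shell
# Print one debug zip command per node address (a dry run; nothing is executed).
# The addresses passed in below are placeholders, not real nodes.
print_zip_commands() {
  for host in "$@"; do
    # Derive a file-system-friendly name so each node gets its own .zip file.
    name=$(printf '%s' "$host" | tr ':.' '__')
    printf 'cockroach debug zip ./debug-%s.zip --host=%s\n' "$name" "$host"
  done
}

print_zip_commands 10.0.0.1:26257 10.0.0.2:26257
```

Depending on your cluster, each printed command would also need the usual client connection flags (for example --insecure or --certs-dir) before it is run.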
Files
cockroach debug zip collects log files, heap profiles, CPU profiles, and goroutine dumps from the last 48 hours, by default.
These files can greatly increase the size of the cockroach debug zip output. To limit the .zip file size for a large cluster, we recommend first experimenting with cockroach debug list-files and then using flags to filter the files.
The following files collected by cockroach debug zip, which are found in the individual node directories, can be filtered using the --exclude-files, --include-files, --files-from, and/or --files-until flags:
| Information | Filename | 
|---|---|
| Log files | cockroach-{log-file-group}.{host}.{user}.{start timestamp in UTC}.{process ID}.log | 
| Goroutine dumps | goroutine_dump.{date-and-time}.{metadata}.double_since_last_dump.{metadata}.txt.gz | 
| Heap profiles | memprof.{date-and-time}.{heapsize}.pprof | 
| Memory statistics | memstats.{date-and-time}.{heapsize}.txt | 
| CPU profiles | cpuprof.{date-and-time} | 
| Active query dumps | activequeryprof.{date-and-time}.csv | 
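To get a feel for how a glob pattern selects among file names like those in the table, the sketch below filters a few made-up names with ordinary shell pattern matching. The file names and the filter_by_glob helper are hypothetical, and shell matching may differ in detail from the matching that cockroach debug zip applies, so treat cockroach debug list-files as the authoritative preview.

```shell
# Hypothetical file names that follow the naming patterns in the table.
files='cockroach-health.host1.root.2023-10-03T13_30_00Z.000123.log
memprof.2023-10-03T13_30_00.000.104857600.pprof
cpuprof.2023-10-03T13_30_00.000'

# Print the names read from stdin that match the glob pattern in $1.
# This relies on shell "case" pattern matching, used here as an approximation.
filter_by_glob() {
  pattern=$1
  while IFS= read -r name; do
    case "$name" in
      $pattern) printf '%s\n' "$name" ;;
    esac
  done
}

# Which of the sample names would a '*.log' pattern select?
printf '%s\n' "$files" | filter_by_glob '*.log'
```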
The following information is also contained in the .zip file, and cannot be filtered:
- System tables. The following system tables are not included:
- system.users
- system.web_sessions
- system.join_tokens
- system.comments
- system.ui
- system.zones
- system.statement_bundle_chunks
- system.statement_statistics
- system.transaction_statistics
 
- Cluster events
- Database details
- Schema change events
- Database, table, node, and range lists
- Node details
- Node liveness
- Gossip data
- Stack traces
- Range details
- Jobs
- Cluster Settings
- Metrics
- CPU profiles
- A script (hot-ranges.sh) that summarizes the hottest ranges (ranges receiving a high number of reads or writes)
Subcommands
While the cockroach debug command has a few subcommands, users are expected to use only the zip, encryption-active-key, merge-logs, list-files, tsdump, and ballast subcommands.
We recommend using the encryption-decrypt and job-trace subcommands only when directed by the Cockroach Labs support team.
The other debug subcommands are useful only to Cockroach Labs. Output of debug commands may contain sensitive or secret information.
Synopsis
$ cockroach debug zip {ZIP file destination} {flags}
The following flags must apply to an active CockroachDB node. If no nodes are live, you must start at least one node.
Flags
The debug zip subcommand supports the following general-use, client connection, and logging flags.
General
| Flag | Description | 
|---|---|
| --cpu-profile-duration | Fetch CPU profiles from the cluster with the specified sample duration in seconds. The debug zip command will block for the duration specified. A value of 0 disables this feature. Default: 5s |
| --concurrency | The maximum number of nodes to concurrently poll for data. This can be any value between 1 and 15. |
| --exclude-files | Files to exclude from the generated .zip. This can be used to limit the size of the generated .zip, and affects logs, heap profiles, goroutine dumps, and/or CPU profiles. The files are specified as a comma-separated list of double-quoted glob patterns. See also --include-files. Use cockroach debug list-files with this flag to see a list of files that will be contained in the .zip. |
| --exclude-nodes | Specify nodes to exclude from inspection as a comma-separated list or range of node IDs. For example: --exclude-nodes=1,10,13-15 | 
| --files-from | Start timestamp for log file, goroutine dump, and heap profile collection. This can be used to limit the size of the generated .zip, which is increased by these files. The timestamp uses the format YYYY-MM-DD, followed optionally by HH:MM:SS or HH:MM. For example: --files-from='2021-07-01 15:00'. When specifying a narrow time window, we recommend adding extra seconds/minutes to account for uncertainties such as clock drift. Default: 48 hours before now |
| --files-until | End timestamp for log file, goroutine dump, and heap profile collection. This can be used to limit the size of the generated .zip, which is increased by these files. The timestamp uses the format YYYY-MM-DD, followed optionally by HH:MM:SS or HH:MM. For example: --files-until='2021-07-01 16:00'. When specifying a narrow time window, we recommend adding extra seconds/minutes to account for uncertainties such as clock drift. Default: 24 hours beyond now (to include files created during .zip creation) |
| --include-files | Files to include in the generated .zip. This can be used to limit the size of the generated .zip, and affects logs, heap profiles, goroutine dumps, and/or CPU profiles. The files are specified as a comma-separated list of double-quoted glob patterns. See also --exclude-files. Use cockroach debug list-files with this flag to see a list of files that will be contained in the .zip. |
| --include-goroutine-stacks | Fetch stack traces for all goroutines running on each targeted node in nodes/*/stacks.txt and nodes/*/stacks_with_labels.txt files. Note that fetching stack traces for all goroutines is a "stop-the-world" operation, which can momentarily have negative impacts on SQL service latency. Exclude these goroutine stacks by using the --include-goroutine-stacks=false flag. Note that any periodic goroutine dumps previously taken on the node will still be included in nodes/*/goroutines/*.txt.gz, as these would have already been generated and don't require any additional stop-the-world operations to be collected. Default: true |
| --include-range-info | Include one file per node with information about the KV ranges stored on that node, in nodes/{node ID}/ranges.json. This information can be vital when debugging issues that involve the KV layer (which includes everything below the SQL layer), such as data placement, load balancing, performance or other behaviors. In certain situations, on large clusters with large numbers of ranges, these files can be omitted if and only if the issue being investigated is already known to be in another layer of the system (for example, an error message about an unsupported feature or incompatible value in a SQL schema change or statement). However, many higher-level issues are ultimately related to the underlying KV layer described by these files. Only set this to false if directed to do so by Cockroach Labs support. In addition, include problem ranges information in reports/problemranges.json. Default: true |
| --include-running-job-traces | Include information about each traceable job that is running or reverting (such as backup, restore, import, physical cluster replication) in jobs/*/*/trace.zip files. This involves collecting cluster-wide traces for each running job in the cluster. Default: true |
| --nodes | Specify nodes to inspect as a comma-separated list or range of node IDs. For example: --nodes=1,10,13-15 | 
| --redact | Redact sensitive data in the generated .zip file. This flag replaces the deprecated --redact-logs flag. For examples of the data that is redacted, see Redact sensitive information. |
| --redact-logs | Deprecated. Redact sensitive data from collected log files only. Use the --redact flag instead, which redacts sensitive data across the entire generated .zip as well as the collected log files. Passing the --redact-logs flag will be interpreted as the --redact flag. |
| --timeout | In the process of generating a debug zip, many internal requests are made. Each request is allowed the maximum duration specified by the timeout. If an internal request does not complete within the timeout duration, an error is displayed for that request and its artifact is not included in the zip file. The timeout is suffixed with s (seconds), m (minutes), or h (hours). Default: 60s |
| --validate-zip-file | Validate debug zip file after generation. This is a quick check to validate whether the generated zip file is valid and not corrupted. Default: true | 
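The --nodes and --exclude-nodes flags accept a mix of single node IDs and ranges, such as 1,10,13-15. As an illustration of how such a value reads, the sketch below expands it into individual IDs. The expand_node_list helper is hypothetical and is not part of the cockroach CLI. It assumes well-formed input with inclusive ranges.

```shell
# Expand a node list such as "1,10,13-15" into space-separated node IDs.
# Illustrative only: assumes well-formed input and inclusive ranges.
expand_node_list() {
  spec=$1
  out=""
  old_ifs=$IFS
  IFS=,
  for part in $spec; do
    case "$part" in
      *-*)
        # A range "A-B": emit every ID from A through B.
        i=${part%-*}
        end=${part#*-}
        while [ "$i" -le "$end" ]; do
          out="$out $i"
          i=$((i + 1))
        done
        ;;
      *)
        # A single node ID.
        out="$out $part"
        ;;
    esac
  done
  IFS=$old_ifs
  echo "${out# }"
}

expand_node_list "1,10,13-15"   # prints: 1 10 13 14 15
```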
Client connection
| Flag | Description | 
|---|---|
| --cert-principal-map | A comma-separated list of <cert-principal>:<db-principal> mappings. This allows mapping the principal in a cert to a DB principal such as node or root or any SQL user. This is intended for use in situations where the certificate management system places restrictions on the Subject.CommonName or SubjectAlternateName fields in the certificate (e.g., disallowing a CommonName like node or root). If multiple mappings are provided for the same <cert-principal>, the last one specified in the list takes precedence. A principal not specified in the map is passed through as-is via the identity function. A cert is allowed to authenticate a DB principal if the DB principal name is contained in the mapped CommonName or DNS-type SubjectAlternateName fields. |
| --certs-dir | The path to the certificate directory containing the CA and client certificates and client key. Env Variable: COCKROACH_CERTS_DIR. Default: ${HOME}/.cockroach-certs/ |
| --cluster-name | The cluster name to use to verify the cluster's identity. If the cluster has a cluster name, you must include this flag. For more information, see cockroach start. | 
| --disable-cluster-name-verification | Disables the cluster name check for this command. This flag must be paired with --cluster-name. For more information, see cockroach start. |
| --host | The server host and port number to connect to. This can be the address of any node in the cluster. Env Variable: COCKROACH_HOST. Default: localhost:26257 |
| --insecure | Use an insecure connection. Env Variable: COCKROACH_INSECURE. Default: false |
| --url | A connection URL to use instead of the other arguments. To convert a connection URL to the syntax that works with your client driver, run cockroach convert-url. Env Variable: COCKROACH_URL. Default: no URL |
Logging
By default, this command logs messages to stderr. This includes events with WARNING severity and higher.
If you need to troubleshoot this command's behavior, you can customize its logging behavior.
Examples
Generate a debug zip file
Generate the debug zip file for an insecure cluster:
$ cockroach debug zip ./cockroach-data/logs/debug.zip --insecure --host=200.100.50.25
Generate the debug zip file for a secure cluster:
$ cockroach debug zip ./cockroach-data/logs/debug.zip --host=200.100.50.25
Secure examples assume you have the appropriate certificates in the default certificate directory, ${HOME}/.cockroach-certs/.
Generate a debug zip file for a time range
Generate a debug zip file containing only debugging information for a specified time range:
$ cockroach debug zip ./cockroach-data/logs/debug.zip --files-from='2023-10-03 13:30' --files-until='2023-10-03 14:30'
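When the incident window is narrow, the flag descriptions recommend padding it to allow for uncertainties such as clock drift. The sketch below widens the window above by five minutes on each side and prints the resulting command instead of running it. It assumes GNU date is available. The shift_timestamp helper and the five-minute padding are illustrative choices, not recommendations from Cockroach Labs.

```shell
# Shift a 'YYYY-MM-DD HH:MM[:SS]' timestamp by a signed number of seconds.
# Assumes GNU date; the timestamp is treated as UTC purely for the arithmetic.
shift_timestamp() {
  epoch=$(date -u -d "$1 UTC" +%s)
  date -u -d "@$((epoch + $2))" '+%Y-%m-%d %H:%M:%S'
}

from_ts=$(shift_timestamp '2023-10-03 13:30' -300)   # 5 minutes earlier
until_ts=$(shift_timestamp '2023-10-03 14:30' 300)   # 5 minutes later

# Print the resulting command for review rather than executing it.
printf "cockroach debug zip ./cockroach-data/logs/debug.zip --files-from='%s' --files-until='%s'\n" "$from_ts" "$until_ts"
```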
Generate a debug zip file with logs only
Generate a debug zip file containing only log files:
$ cockroach debug zip ./cockroach-data/logs/debug.zip --include-files=*.log
Redact sensitive information
Log redaction
Example of a log string without --redact enabled:
server/server.go:1423 ⋮ password of user ‹admin› was set to ‹"s3cr34?!@x_"›
Enable log redaction:
$ cockroach debug zip ./cockroach-data/logs/debug.zip --redact --insecure --host=200.100.50.25
The same log string with --redact enabled:
server/server.go:1423 ⋮ password of user ‹×› was set to ‹×›
Cluster settings redaction
Example cluster settings in crdb_internal.cluster_settings.txt without --redact enabled:
variable                          value                  type public sensitive reportable description                                   default_value origin
...
cluster.organization              Cockroach Labs Testing s    t      f         f          organization name                                           override
...
server.identity_map.configuration <redacted>             s    t      t         f          system-identity to database-username mappings               default
server.identity_map.configuration is always redacted, since sensitive equals true.
Enable redaction:
$ cockroach debug zip ./cockroach-data/logs/debug.zip --redact --insecure --host=200.100.50.25
Cluster settings in crdb_internal.cluster_settings.txt with --redact enabled:
variable                          value      type public sensitive reportable description                                    default_value origin
...
cluster.organization              <redacted> s    t      f         f          organization name                                            override
...
server.identity_map.configuration <redacted> s    t      t         f          system-identity to database-username mappings                default
server.identity_map.configuration is still redacted. cluster.organization is now redacted since reportable equals false and the Cockroach Labs Testing value is not the default value (in this case, the empty string).
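The rule illustrated by this example can be written down as a small predicate. This is a sketch of the behavior as described on this page, not CockroachDB's implementation. The is_setting_redacted function and its argument order are hypothetical, and the t/f values mirror the sensitive and reportable columns of the output above.

```shell
# Return 0 (true) if a cluster setting's value would appear as <redacted>,
# following the rule described in the example above.
# Arguments: sensitive reportable value default_value redact_flag (t or f).
is_setting_redacted() {
  sensitive=$1; reportable=$2; value=$3; default_value=$4; redact_flag=$5
  # Sensitive settings are redacted with or without --redact.
  if [ "$sensitive" = t ]; then
    return 0
  fi
  # With --redact, a non-reportable setting with a non-default value is redacted.
  if [ "$redact_flag" = t ] && [ "$reportable" = f ] && [ "$value" != "$default_value" ]; then
    return 0
  fi
  return 1
}

# cluster.organization from the example, without and then with --redact.
if is_setting_redacted f f 'Cockroach Labs Testing' '' f; then echo redacted; else echo visible; fi   # visible
if is_setting_redacted f f 'Cockroach Labs Testing' '' t; then echo redacted; else echo visible; fi   # redacted
```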
Hostname and IP address redaction
Example of status.json without hostname and IP address redaction enabled:
{
  "node_id": 1,
  "address": {
    "network_field": "tcp",
    "address_field": "200.100.50.25:26257"
  },
  "sql_address": {
    "network_field": "tcp",
    "address_field": "200.100.50.25:26257"
  }
}
Enable hostname and IP address redaction with the debug.zip.redact_addresses.enabled cluster setting:
SET CLUSTER SETTING debug.zip.redact_addresses.enabled = true;
Some hostnames and IP addresses in the nodes.json and gossip.json files are never redacted, even when debug.zip.redact_addresses.enabled is enabled.
Generate .zip with --redact enabled:
$ cockroach debug zip ./cockroach-data/logs/debug.zip --redact --insecure --host=200.100.50.25
status.json with hostname and IP address redaction:
{
  "node_id": 1,
  "address": {
    "network_field": "tcp",
    "address_field": "‹×›"
  },
  "sql_address": {
    "network_field": "tcp",
    "address_field": "‹×›"
  }
}
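Because some hostnames and IP addresses in nodes.json and gossip.json are never redacted, it can be useful to scan extracted files for addresses that remain before sharing them. The following is a minimal sketch that lists IPv4-looking strings from its input. The find_ipv4 helper is hypothetical, and the pattern is deliberately loose: it does not validate octet ranges, may match version-like strings, and does not detect hostnames or IPv6 addresses.

```shell
# List the distinct IPv4-looking strings found on stdin.
# The pattern is loose: it does not check that each octet is at most 255.
find_ipv4() {
  grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | sort -u
}

# Sample lines modeled on the unredacted and redacted status.json shown above.
printf '%s\n' '"address_field": "200.100.50.25:26257"' | find_ipv4
printf '%s\n' '"address_field": "‹×›"' | find_ipv4
```

The first call prints the address, and the second prints nothing. To scan a real file, redirect it into the function, for example find_ipv4 < status.json.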