The IMPORT statement imports tabular data (e.g., CSVs) into a single table.
SET CLUSTER SETTING experimental.importcsv.enabled = trueGlossary
| Term | Definition | 
|---|---|
| Import file | The tabular data file you want to import. | 
| Processing node | The single node processing the IMPORTstatement/ | 
| Temp directory | A location where the processing node can store data from the import file it converts to CockroachDB-compatible key-value data. This directory must be available to all nodes using the same address (i.e., cannot use the processing node's local file storage). | 
Functional Overview
Because importing data is a complex task, it can be useful to have a high-level understanding of the process.
- A single node receives the IMPORTrequest, which becomes the processing node.
- The processing node streams the contents of the import file, converting its contents into CockroachDB-compatible key-value data.
- As the key-value data is generated, the node stores it in the temp directory.
- Once the entire import file has been converted to key-value data, relevant nodes import key-value data from the temp directory.
After the import has completed, you should also delete the files from your temp directory.
Preparation
Before using IMPORT, you should have:
- The schema of the table you want to import.
- The tabular data you want to import (e.g., CSV), preferably hosted on cloud storage.
- A location to store data before it is fully imported into all your nodes (referred to in this document as a "temp" directory). This location must be accessible to all nodes using the same address (i.e., cannot use a node's local file storage). - For ease of use, we recommend using cloud storage. However, if that isn't readily available to you, we also have a guide on easily creating your own file server. 
Details
Import Targets
Imported tables must not exist and must be created in the IMPORT statement. If the table you want to import already exists, you must drop it with DROP TABLE.
You can only import a single table at a time.
Create Table
Your IMPORT statement must include a CREATE TABLE statement (representing the schema of the data you want to import) using one of the following methods:
- A reference to a file that contains a CREATE TABLEstatement
- An inline CREATE TABLEstatement
We also recommend all secondary indexes you want to use in the CREATE TABLE statement. It is possible to add secondary indexes later, but it is significantly faster to specify them during import.
Object Dependencies
When importing tables, you must be mindful of the following rules because IMPORT only creates single tables which must not already exist:
- Objects that the imported table depends on must already exist
- Objects that depend on the imported table can only be created after the import completes
Operational Requirements & Concerns
Because IMPORT has a number of moving parts, there are a number of operational concerns in executing the statement, the most important of which is ensuring that the processing node can execute IMPORT successfully.
Choose Node to Process Request
Because of IMPORT's current implementation, the entire task is executed on a single node. If your deployment is not entirely symmetric, sending the request to a random node might have undesirable effects. Instead, we recommend bypassing any load balancers, connecting to a machine directly, and running the IMPORT statement on it.
It's important to note, though, that after the single machine creates the CockroachDB-compatible key-value data, the process of importing the data is distributed among nodes in the cluster.
IMPORT will let you distribute the entire process among many nodes.Available Storage Requirements
The node's first-listed/default store directory must have enough available storage equal to or greater than the size of the file you're importing.
On cockroach start, if you set --max-disk-temp-storage, it must also be greater than the size of the file you're importing.
For example, if you're importing approximately 10GiB of data, the node that ends up running the IMPORT command must have at least 10GiB of available storage in its store directory.
Import File Location
You can store the tabular data you want to import using either a node's local storage or remote cloud storage (Amazon S3, Google Cloud Platform, etc.).
For simplicity's sake, we highly recommend using cloud/remote storage for the data you want to import.
However, if you do want to store the file locally to import it, there are a number of things to understand.
Importing Data From Local Storage
IMPORT process, we recommend using it instead of local file storage.If you do not have access to cloud storage, you can easily create a file server using this guide.
Because CockroachDB is designed as a distributed system, the ergonomics of local file storage require some understanding to use successfully. Though we do not recommend this process, if you do want to use a locally stored file, this procedure is likely to cause you the fewest headaches:
- Ensure the node you want to use has available storage space at least 2x the size of the data you want to import; 1x for the file itself, and 1x for the converted key-value data. - For example, if you want to import 10GiB of data, your node needs 20GiB of available storage. 
- Upload the tabular data file to a single node, and then connect to that node. 
- Execute the - IMPORTstatement, importing to the locally stored file with the- nodelocalprefix, e.g.,- nodelocal://backup.csv.- However, the "temp" directory you choose must use a location available to all nodes in the cluster (i.e., you cannot use local file storage). You will need to use either cloud storage, a custom HTTP server, or NFS connected to all nodes in the cluster. 
Temp Directory
To distribute the data you want to import to all nodes in your cluster, the IMPORT process requires the CockroachDB-compatible key-value data be stored in a location that is accessible to all nodes in the cluster using the same address. To achieve this you can use:
- Cloud storage, such as Amazon S3 or Google Cloud Platform
- Network file storage mounted to every node
- HTTP file server
The temp directory must have at least as much storage space as the size of the data you want to import.
Temp Directory Cleanup
After completing the IMPORT process, you must manually remove the key-value data stored in the temp directory.
Table Users and Privileges
Imported tables are treated as new tables, so you must GRANT privileges to them.
Performance
Currently, IMPORT uses a single node to convert your tabular data into key-value data, which means the node's CPU and RAM will be partially consumed by the IMPORT task in addition to serving normal traffic.
Later steps of the import process distribute work among many nodes and have less impact on the nodes' resources.
Synopsis
Required Privileges
Only the root user can run IMPORT.
Parameters
| Parameter | Description | 
|---|---|
| table_name | The name of the table you want to import/create. | 
| create_table_file | The URL of a plain text file containing the CREATE TABLEstatement you want to use (see this example for syntax). | 
| table_elem_list | The table definition you want to use (see this example for syntax). | 
| file_to_import | The URL of the file you want to import. | 
| WITHkv_option | Control your import's behavior with these options. The temp option (which represents the temp directory's URL) is required. | 
Import File & Temp Directory URLs
URLs for the file you want to import and your temp directory must use the following format:
[scheme]://[host]/[path]?[parameters]
| Location | scheme | host | parameters | 
|---|---|---|---|
| Amazon S3 | s3 | Bucket name | AWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY | 
| Azure | azure | Container name | AZURE_ACCOUNT_KEY,AZURE_ACCOUNT_NAME | 
| Google Cloud 1 | gs | Bucket name | N/A | 
| HTTP | http | Remote host | N/A | 
| NFS/Local 2 | nodelocal | File system location | N/A | 
Considerations
- 1 GCS connections use Google's default authentication strategy. 
- 2 Because CockroachDB is a distributed system, you cannot meaningfully store backups "locally" on nodes. The entire backup file must be stored in a single location, so attempts to store backups locally must point to an NFS drive to be useful. 
- The location parameters often contain special characters that need to be URI-encoded. Use Javascript's encodeURIComponent function or Go language's url.QueryEscape function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters. 
Notes
1 Only supports instance auth.
2 You can easily create your own HTTP server with Caddy or nginx.
3 If using NFS for your temp directory, each node in the cluster must have access to the NFS using the same URL.
Import Options
You can control the IMPORT process's behavior using any of the following key-value pairs as a kv_option.
temp
A directory accessible by all nodes, which is used to store the CockroachDB-compatible key-value data before all nodes import the data.
| Required? | Yes | 
| Key | temp | 
| Value | The URL of the temp directory | 
| Example | WITH temp = 'azure://acme-co/import-temp?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co' | 
delimiter
If not using comma as your column delimiter, you can specify another Unicode character as the delimiter.
| Required? | No | 
| Key | delimiter | 
| Value | The unicode character that delimits columns in your rows | 
| Example | To use tab-delimited values: WITH temp = '...', delimiter = e'\t' | 
comment
Do not import rows that begin with this character.
| Required? | No | 
| Key | comment | 
| Value | The unicode character that identifies rows to skip | 
| Example | WITH temp = '...', comment = '#' | 
nullif
Convert values to SQL NULL if they match the specified string.
| Required? | No | 
| Key | nullif | 
| Value | The string that should be converted to NULL | 
| Example | To use empty columns as NULL: WITH temp = '...', nullif = '' | 
Examples
Use Create Table Statement from a File
> IMPORT TABLE customers
CREATE USING 'azure://acme-co/customer-create-table.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    temp = 'azure://acme-co/temp/?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
;
Use Create Table Statement from a Statement
> IMPORT TABLE customers (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    temp = 'azure://acme-co/temp/?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
;
Import a Tab-Separated File
> IMPORT TABLE customers (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.tsc?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    temp = 'azure://acme-co/temp/?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co',
    delimiter = e'\t'
;
Skip Commented Lines
> IMPORT TABLE customers (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.tsc?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    temp = 'azure://acme-co/temp/?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co',
    comment = '#'
;
Use Blank Characters as NULL
> IMPORT TABLE customers (
        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.tsc?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    temp = 'azure://acme-co/temp/?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co',
    nullif = ''
;