The instructions on this page require updates. We currently recommend using AWS Database Migration Service (DMS) to migrate data from Oracle to CockroachDB. You can also migrate from CSV.
This page has instructions for migrating data from Oracle into CockroachDB by importing CSV files. Note that IMPORT only works for creating new tables. For information on how to add CSV data to existing tables, see IMPORT INTO.
To illustrate this process, we use the following sample data and tools:
- Swingbench OrderEntry data set, which is based on the oeschema that ships with Oracle Database 11g and Oracle Database 12c.
- Oracle Data Pump, which enables the movement of data and metadata from one database to another, and comes with all Oracle installations.
- SQL*Plus, the interactive and batch query tool that comes with every Oracle Database installation.
For best practices for optimizing import performance in CockroachDB, see Import Performance Best Practices.
Step 1. Export the Oracle schema
Using Oracle's Data Pump Export utility, export the schema:
$ expdp user/password directory=datapump dumpfile=oracle_example.dmp content=metadata_only logfile=example.log
The schema is stored in an Oracle-specific format (e.g., oracle_example.dmp).
Step 2. Convert the Oracle schema to SQL
Using Oracle's Data Pump Import utility, load the exported DMP file to convert it to a SQL file:
$ impdp user/password directory=datapump dumpfile=oracle_example.dmp sqlfile=example_sql.sql TRANSFORM=SEGMENT_ATTRIBUTES:N:table PARTITION_OPTIONS=MERGE
This SQL output will be used later, in Step 7.
Step 3. Export table data
You need to extract each table's data into a data list file (.lst). We wrote a simple SQL script (spool.sql) to do this:
$ cat spool.sql
SET ECHO OFF
SET TERMOUT OFF
SET FEEDBACK OFF
SET PAGESIZE 0
SET TRIMSPOOL ON
SET WRAP OFF
set linesize 30000
SET RECSEP OFF
SET VERIFY OFF
SET ARRAYSIZE 10000
SET COLSEP '|'
SPOOL '&1'
ALTER SESSION SET nls_date_format = 'YYYY-MM-DD HH24:MI:SS';
  # Used to set a properly formatted date for CockroachDB
SELECT * from &1;
SPOOL OFF
SET PAGESIZE 24
SET FEEDBACK ON
SET TERMOUT ON
In the example SQL script, | is used as a delimiter. Choose a delimiter that will not also occur in the rows themselves. For more information, see IMPORT.
To extract the data, we ran the script for each table in SQL*Plus:
$ sqlplus user/password
> @spool CUSTOMERS
  @spool ADDRESSES
  @spool CARD_DETAILS
  @spool WAREHOUSES
  @spool ORDER_ITEMS
  @spool ORDERS
  @spool INVENTORIES
  @spool PRODUCT_INFORMATION
  @spool LOGON
  @spool PRODUCT_DESCRIPTIONS
  @spool ORDERENTRY_METADATA
A data list file (.lst) with leading and trailing spaces is created for each table.
Exit SQL*Plus:
> EXIT
Step 4. Configure and convert the table data to CSV
Each table's data list file needs to be converted to CSV and formatted for CockroachDB. We wrote a simple Python script (fix-example.py) to do this:
$ cat fix-example.py
import csv
import string
import sys
for lstfile in sys.argv[1:]:
  filename = lstfile.split(".")[0]
  with open(lstfile) as f:
    reader = csv.reader(f, delimiter="|")
    with open(filename+".csv", "w") as fo:
      writer = csv.writer(fo)
      for rec in reader:
        writer.writerow(map(string.strip, rec))
$ python3 fix-example.py CUSTOMERS.lst ADDRESSES.lst CARD_DETAILS.lst WAREHOUSES.lst ORDER_ITEMS.lst ORDERS.lst INVENTORIES.lst PRODUCT_INFORMATION.lst LOGON.lst PRODUCT_DESCRIPTIONS.lst ORDERENTRY_METADATA.lst
Format the generated CSV files to meet the CockroachDB's CSV requirements.
CSV requirements
You will need to export one CSV file per table, with the following requirements:
- Files must be in valid CSV format.
- Files must be UTF-8 encoded.
- If one of the following characters appears in a field, the field must be enclosed by double quotes:
- delimiter (,by default)
- double quote (")
- newline (\n)
- carriage return (\r)
 
- delimiter (
- If double quotes are used to enclose fields, then a double quote appearing inside a field must be escaped by preceding it with another double quote.  For example: "aaa","b""bb","ccc"
- If a column is of type BYTES, it can either be a valid UTF-8 string or a hex-encoded byte literal beginning with\x. For example, a field whose value should be the bytes1,2would be written as\x0102.
CSV configuration options
The following options are available to IMPORT ... CSV:
For usage examples, see Migrate from CSV - Configuration Options.
Step 5. Compress the CSV files
Compress the CSV files for a faster import:
$ gzip CUSTOMERS.csv ADDRESSES.csv CARD_DETAILS.csv WAREHOUSES.csv ORDER_ITEMS.csv ORDERS.csv INVENTORIES.csv PRODUCT_INFORMATION.csv LOGON.csv PRODUCT_DESCRIPTIONS.csv ORDERENTRY_METADATA.csv
These compressed CSV files will be used to import your data into CockroachDB.
Step 6. Host the files where the cluster can access them
Each node in the CockroachDB cluster needs to have access to the files being imported. There are several ways for the cluster to access the data; for more information on the types of storage IMPORT can pull from, see the following:
We strongly recommend using cloud storage such as Amazon S3 or Google Cloud to host the data files you want to import.
Step 7. Map Oracle to CockroachDB data types
Using the SQL file created in Step 2, write IMPORT TABLE statements that match the schemas of the table data you're importing.
Remove all Oracle-specific attributes, remap all Oracle data types, refactor all CREATE TABLE statements to include primary keys.
Data type mapping
Use the table below for data type mappings:
| Oracle Data Type | CockroachDB Data Type | 
|---|---|
| BLOB | BYTES1 | 
| CHAR(n),CHARACTER(n)n < 256 | CHAR(n),CHARACTER(n) | 
| CLOB | STRING1 | 
| DATE | DATE | 
| FLOAT(n) | DECIMAL(n) | 
| INTERVAL YEAR(p) TO MONTH | VARCHAR,INTERVAL | 
| INTERVAL DAY(p) TO SECOND(s) | VARCHAR,INTERVAL | 
| JSON | JSON2 | 
| LONG | STRING | 
| LONG RAW | BYTES | 
| NCHAR(n)n < 256 | CHAR(n) | 
| NCHAR(n)n > 255 | VARCHAR,STRING | 
| NCLOB | STRING | 
| NUMBER(p,0),NUMBER(p)1 <= p < 5 | INT23 | 
| NUMBER(p,0),NUMBER(p)5 <= p < 9 | INT43 | 
| NUMBER(p,0),NUMBER(p)9 <= p < 19 | INT83 | 
| NUMBER(p,0),NUMBER(p)19 <= p <= 38 | DECIMAL(p) | 
| NUMBER(p,s)s > 0 | DECIMAL(p,s) | 
| NUMBER,NUMBER(\*) | DECIMAL | 
| NVARCHAR2(n) | VARCHAR(n) | 
| RAW(n) | BYTES | 
| TIMESTAMP(p) | TIMESTAMP | 
| TIMESTAMP(p) WITH TIME ZONE | TIMESTAMP WITH TIMEZONE | 
| VARCHAR(n),VARCHAR2(n) | VARCHAR(n) | 
| XML | JSON2 | 
- 1 BLOBSandCLOBSshould be converted toBYTES, orSTRINGwhere the size is variable, but it's recommended to keep values under 1 MB to ensure performance. Anything above 1 MB would require refactoring into an object store with a pointer embedded in the table in place of the object.
- 2 JSONandXMLtypes can be converted toJSONBusing any XML to JSON conversion.XMLmust be converted toJSONBbefore importing into CockroachDB.
- 3 When converting NUMBER(p,0), considerNUMBERtypes with Base-10 limits map to the Base-10 Limits for CockroachDBINTtypes. Optionally,NUMBERScan be converted toDECIMAL.
When moving from Oracle to CockroachDB data types, consider the following:
- Schema changes within transactions
- Schema changes between executions of prepared statements
- If JSONcolumns are used only for payload, consider switching toBYTES.
- Max size of a single column family (512 MiB by default).
For more information, see Known Limitations, Online Schema Changes, and Transactions.
NULLs
For information on how CockroachDB handles NULLs, see NULL Handling and NOT NULL Constraint.
Primary key, constraints, and secondary indexes
Cockroach distributes a table by the primary key or by a default ROWID when a primary key is not provided. This also requires the primary key creation to be part of the table creation. Using the above data type mapping, refactor each table DDL to include the primary key, constraints, and secondary indexes.
For more information and examples, refer to the following:
Privileges for users and roles
The Oracle privileges for users and roles must be rewritten for CockroachDB. Once the CockroachDB cluster is secured, CockroachDB follows the same role-based access control methodology as Oracle.
Step 8. Import the CSV
For example, to import the data from CUSTOMERS.csv.gz into a new CUSTOMERS table, issue the following statement in the CockroachDB SQL shell:
> IMPORT TABLE customers (
        customer_id       DECIMAL
                          NOT NULL
                          PRIMARY KEY,
        cust_first_name   VARCHAR(40) NOT NULL,
        cust_last_name    VARCHAR(40) NOT NULL,
        nls_language      VARCHAR(3),
        nls_territory     VARCHAR(30),
        credit_limit      DECIMAL(9,2),
        cust_email        VARCHAR(100),
        account_mgr_id    DECIMAL,
        customer_since    DATE,
        customer_class    VARCHAR(40),
        suggestions       VARCHAR(40),
        dob               DATE,
        mailshot          VARCHAR(1),
        partner_mailshot  VARCHAR(1),
        preferred_address DECIMAL,
        preferred_card    DECIMAL,
        INDEX cust_email_ix (cust_email),
        INDEX cust_dob_ix (dob),
        INDEX cust_account_manager_ix (
            account_mgr_id
        )
       )
   CSV DATA (
        'https://your-bucket-name.s3.us-east-2.amazonaws.com/CUSTOMERS.csv.gz'
       )
  WITH delimiter = e'\t',
       "nullif" = '',
       decompress = 'gzip';
       job_id       |  status   | fraction_completed |  rows  | index_entries | system_records |  bytes   
--------------------+-----------+--------------------+--------+---------------+----------------+----------
 381866942129111041 | succeeded |                  1 | 300024 |             0 |              0 | 13258389
(1 row)
As of v21.2 IMPORT TABLE will be deprecated. We recommend using CREATE TABLE followed by IMPORT INTO to import data into a new table. For an example, read Import into a new table from a CSV file.
To import data into an existing table, use IMPORT INTO.
Then add the computed columns, constraints, and function-based indexes. For example:
> UPDATE CUSTOMERS SET credit_limit = 50000 WHERE credit_limit > 50000;
  ALTER TABLE CUSTOMERS ADD CONSTRAINT CUSTOMER_CREDIT_LIMIT_MAX CHECK (credit_limit <= 50000);
  ALTER TABLE CUSTOMERS ADD COLUMN LOW_CUST_LAST_NAME STRING AS (lower(CUST_LAST_NAME)) STORED;
  ALTER TABLE CUSTOMERS ADD COLUMN LOW_CUST_FIRST_NAME STRING AS (lower(CUST_FIRST_NAME)) STORED;
  CREATE INDEX CUST_FUNC_LOWER_NAME_IX on CUSTOMERS (LOW_CUST_LAST_NAME,CUST_FIRST_NAME);
Repeat the above for each CSV file you want to import.
Step 9. Refactor application SQL
The last phase of the migration process is to change the transactional behavior and SQL dialect of your application.
Transactions, locking, and concurrency control
Both Oracle and CockroachDB support multi-statement transactions, which are atomic and guarantee ACID semantics. However, CockroachDB operates in a serializable isolation mode while Oracle defaults to read committed, which can create both non-repeatable reads and phantom reads when a transaction reads data twice. It is typical that Oracle developers will use SELECT FOR UPDATE to work around read committed issues. The SELECT FOR UPDATE statement is also supported in CockroachDB.
Regarding locks, Cockroach utilizes a lightweight latch to serialize access to common keys across concurrent transactions. Oracle and CockroachDB transaction control flows only have a few minor differences; for more details, refer to Transactions - SQL statements.
As CockroachDB does not allow serializable anomalies, transactions may experience deadlocks or read/write contention. This is expected during concurrency on the same keys. These can be addressed with either automatic retries or client-side intervention techniques.
SQL dialect
Cockroach is ANSI SQL compliant with a PostgreSQL dialect, which allows you to use native drivers to connect applications and ORMs to CockroachDB. CockroachDB’s SQL layer supports full relational schema and SQL (similar to Oracle).
You will have to refactor Oracle SQL and functions that do not comply with ANSI SQL-92 in order to work with CockroachDB. For more information about the Cockroach SQL Grammar and a SQL comparison, see below:
- SQL best practices
- Common table expressions (CTE)
- DUALtable- Oracle requires use of the - DUALtable, as Oracle requires a- SELECT ... FROM. In CockroachDB, all reference to the- DUALtable should be eliminated.
- 
CockroachDB supports HASH,MERGE, andLOOKUPjoins. Oracle uses the+operator forLEFTandRIGHTjoins, but CockroachDB uses the ANSI join syntax.
- 
Sequences in CockroachDB do not require a trigger to self-increment; place the sequence in the table DDL: > CREATE TABLE customer_list ( id INT PRIMARY KEY DEFAULT nextval('customer_seq'), customer string, address string );
- SYSDATE- CockroachDB does not support - SYSDATE; however, it does support date and time with the following:- > SELECT transaction_timestamp(), clock_timestamp();- > SELECT current_timestamp- > SELECT now();