Exporting and Importing Data
Export Your Data
If you are migrating data, for example, from a development server to a QA server, follow this section to export the data from your current deployment. Autonomous Identity provides a Python script that exports your data to .csv files and stores them in a folder in your home directory.
On the target machine, change to the dbutils directory.
$ cd /opt/autoid/dbutils
Export the database.
$ python dbutils.py export ~/backup
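After the export finishes, it can help to confirm that the backup folder actually contains .csv files before you move them to another server. The following is a minimal sketch and not part of Autonomous Identity: the helper name summarize_export is hypothetical, and it assumes only that the export wrote .csv files directly into the folder given on the command line (~/backup above). It makes no assumption about file names or header rows.

```python
import csv
from pathlib import Path


def summarize_export(export_dir):
    """Return {file name: number of CSV records} for each .csv file in export_dir.

    Hypothetical helper: assumes the export wrote .csv files directly into
    export_dir. Records are counted as-is, so a header row, if present,
    is included in the count.
    """
    folder = Path(export_dir).expanduser()
    if not folder.is_dir():
        raise FileNotFoundError(f"export folder not found: {folder}")

    summary = {}
    for path in sorted(folder.glob("*.csv")):
        # newline="" lets the csv module handle quoted fields that contain newlines.
        with path.open(newline="", encoding="utf-8") as handle:
            summary[path.name] = sum(1 for _ in csv.reader(handle))
    return summary


# Example usage (uncomment after running the export):
# for name, count in summarize_export("~/backup").items():
#     print(f"{name}: {count} records")
```

An empty result means no .csv files were found in the folder, which usually indicates that the export wrote somewhere else or did not complete.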
Import the Data into the Autonomous Identity Keyspace
If you are moving your data from another server, import your data to the target environment using the following steps.
First, create a zoran_user.cql file. This file is used to drop and re-create the Autonomous Identity user and user_history tables. Place the file in the same directory as the .csv files. Make sure to create this file on the source node, for example, the development server, from which you exported the data.
Start cqlsh in the source environment, and use the output of these commands to create the zoran_user.cql file:
$ describe zoran.user;
$ describe zoran.user_history;
Make sure the DROP TABLE CQL commands precede the CREATE TABLE commands, as shown in the zoran_user.cql example file below:
USE zoran ;
DROP TABLE IF EXISTS zoran.user_history ;
DROP TABLE IF EXISTS zoran.user ;

CREATE TABLE zoran.user (
    user text PRIMARY KEY,
    chiefyesno text,
    city text,
    costcenter text,
    isactive text,
    jobcodename text,
    lineofbusiness text,
    lineofbusinesssubgroup text,
    managername text,
    usrdepartmentname text,
    userdisplayname text,
    usremptype text,
    usrmanagerkey text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

CREATE TABLE zoran.user_history (
    user text,
    batch_id int,
    chiefyesno text,
    city text,
    costcenter text,
    isactive text,
    jobcodename text,
    lineofbusiness text,
    lineofbusinesssubgroup text,
    managername text,
    usrdepartmentname text,
    userdisplayname text,
    usremptype text,
    usrmanagerkey text,
    PRIMARY KEY (user, batch_id)
) WITH CLUSTERING ORDER BY (batch_id ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
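The ordering rule for zoran_user.cql can be illustrated with a short sketch. It is not part of Autonomous Identity: the function names build_user_cql and drops_precede_creates are hypothetical, and the sketch assumes that you have already pasted the cqlsh output of the two describe commands into Python strings. The check is a plain text scan, not a CQL parser, so treat a passing result as a sanity check only.

```python
import re

# user_history is dropped first, matching the order in the example file.
TABLES_IN_DROP_ORDER = ("zoran.user_history", "zoran.user")


def build_user_cql(describe_user, describe_user_history):
    """Assemble file text: USE, the DROP statements, then the CREATE statements.

    Both arguments are assumed to hold the CREATE TABLE text that cqlsh
    printed for `describe zoran.user;` and `describe zoran.user_history;`.
    """
    lines = ["USE zoran ;"]
    lines += [f"DROP TABLE IF EXISTS {table} ;" for table in TABLES_IN_DROP_ORDER]
    lines += [describe_user.strip(), describe_user_history.strip()]
    return "\n".join(lines) + "\n"


def drops_precede_creates(cql_text):
    """Return True if every DROP TABLE appears before the first CREATE TABLE."""
    drops = [m.start() for m in re.finditer(r"\bDROP\s+TABLE\b", cql_text, re.IGNORECASE)]
    creates = [m.start() for m in re.finditer(r"\bCREATE\s+TABLE\b", cql_text, re.IGNORECASE)]
    if not drops or not creates:
        # A usable file needs both kinds of statement.
        return False
    return max(drops) < min(creates)
```

The text returned by build_user_cql can be written to zoran_user.cql in the folder that holds the .csv files. Review the result by hand before using it, because the describe output may contain statements that this sketch does not account for.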
Copy the ui-config.json file from the source environment where you ran an analytics pipeline, usually under /data/config, to the same folder where you have your .csv files.
On the target machine, change to the dbutils directory.
$ cd /opt/autoid/dbutils
Use the dbutils.py import command to populate the Autonomous Identity keyspace with the .csv files generated by the export command in the source environment. Note that before importing the data, the script truncates the existing tables to remove duplicates. Again, make sure that zoran_user.cql and ui-config.json are in /import-dir.
$ python dbutils.py import /import-dir
For example:
$ python dbutils.py import ~/import/AutoID-data
Verify that the data is imported in the directory on your server.
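Because the import truncates the existing tables first, it is worth checking the import folder before running the command. The sketch below is one possible pre-flight check and is not part of dbutils.py: the function name check_import_dir is hypothetical, and it only tests for the files this section names (zoran_user.cql, ui-config.json, and at least one .csv file). It does not validate their contents.

```python
from pathlib import Path

# Files this section says must sit alongside the exported .csv files.
REQUIRED_FILES = ("zoran_user.cql", "ui-config.json")


def check_import_dir(import_dir):
    """Return a list of problems found in import_dir; an empty list means it looks ready."""
    folder = Path(import_dir).expanduser()
    if not folder.is_dir():
        return [f"not a directory: {folder}"]

    problems = [
        f"missing {name}" for name in REQUIRED_FILES if not (folder / name).is_file()
    ]
    if not any(folder.glob("*.csv")):
        problems.append("no .csv files found")
    return problems


# Example usage:
# for problem in check_import_dir("~/import/AutoID-data"):
#     print(problem)
```

Returning a list of problems instead of raising on the first one lets you fix everything in a single pass before the tables are truncated.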