Backup and restore
How backup works
DS directory servers store data in backends. The amount of data in a backend varies depending on your deployment. It can range from very small to very large. A JE backend can hold billions of LDAP entries, for example.
Backup process
A JE backend stores data on disk using append-only log files with names like number.jdb.
The JE backend writes updates to the highest-numbered log file.
The log files grow until they reach a specified size (default: 1 GB).
When the current log file reaches the specified size, the JE backend creates a new log file.
To avoid an endless increase in database size on disk, JE backends clean their log files in the background. A cleaner thread copies active records to new log files. Log files that no longer contain active records are deleted.
The DS backup process takes advantage of this log file structure. Together, a set of log files represents a backend at a point in time. The backup process essentially copies the log files to the backup directory. DS also protects the data and adds metadata to keep track of the log files it needs to restore a JE backend to the state it had when the backup task completed.
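As an illustration of the rotation pattern described above, the following toy bash script writes records to the highest-numbered log file and creates a new file once the current one reaches a size limit. This is a sketch of the pattern only, not JE internals; the 16-byte limit stands in for the 1 GB default.

```shell
#!/usr/bin/env bash
# Toy sketch of append-only log rotation, not JE internals.
# The 16-byte limit stands in for JE's 1 GB default file size.
set -e
tmp=$(mktemp -d)
cd "$tmp"
limit=16

append() {
  # Writes always go to the highest-numbered log file.
  current=$(ls ./*.jdb 2>/dev/null | sort | tail -n 1)
  [ -z "$current" ] && current=00000001.jdb
  if [ -f "$current" ]; then size=$(wc -c < "$current"); else size=0; fi
  if [ "$size" -ge "$limit" ]; then
    # The current file reached the limit: create a new log file.
    base=$(basename "$current" .jdb)
    current=$(printf '%08d.jdb' $((10#$base + 1)))
  fi
  printf '%s\n' "$1" >> "$current"
}

for i in 1 2 3 4 5 6; do append "record-$i"; done
ls    # the records have spilled into three log files
```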
Cumulative backups
DS backups are cumulative. A backup operation reuses the JE files that have not changed since the last backup, and copies only the files the backend created or changed. Unchanged files are shared between backups.
A set of backup files is fully standalone.
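To make the idea concrete, here is a toy bash sketch of cumulative copying. It is not DS's actual mechanism, just the principle of copying only the files that are absent from the backup directory:

```shell
#!/usr/bin/env bash
# Toy sketch: a cumulative backup copies only the files that are not
# already in the backup directory, so unchanged files are shared.
set -e
tmp=$(mktemp -d)
cd "$tmp"
mkdir -p db bak
echo one > db/00000001.jdb
echo two > db/00000002.jdb

backup() {
  for f in db/*.jdb; do
    name=$(basename "$f")
    [ -e "bak/$name" ] || cp "$f" "bak/$name"
  done
}

backup                            # first run copies both files
echo three > db/00000003.jdb
backup                            # second run copies only the new file
ls bak
```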
Purge old backups
Backup tasks keep JE files until you purge them.
The backup purge operation prevents an endless increase in the size of the backup folder on disk. The purge operation does not happen automatically; you choose to run it. When you run a purge operation, it removes the files for old or selected backups. The purge does not impact the integrity of the backups DS keeps. It only removes log files that do not belong to any remaining backups.
Back up
When you set up a directory server, the setup process creates a /path/to/opendj/bak/ directory.
You can use this directory for backups if you have enough local disk space, and when developing or testing backup processes.
In deployment, store backups remotely to avoid losing your data and backups in the same crash.
Back up data (server task)
When you schedule a backup as a server task, the DS server manages task completion. The server must be running when you schedule the task, and when the task runs:
-
Schedule the task on a running server, binding as a user with the backend-backup administrative privilege.
The following example schedules an immediate backup task for the dsEvaluation backend:
$ dsbackup \
  create \
  --hostname localhost \
  --port 4444 \
  --bindDN uid=admin \
  --bindPassword password \
  --usePkcs12TrustStore /path/to/opendj/config/keystore \
  --trustStorePassword:file /path/to/opendj/config/keystore.pin \
  --backupLocation bak \
  --backendName dsEvaluation
To back up all backends, omit the --backendName option.
To back up more than one backend, specify the --backendName option multiple times.
For details, refer to dsbackup.
Back up data (scheduled task)
When you schedule a backup as a server task, the DS server manages task completion. The server must be running when you schedule the task, and when the task runs:
-
Schedule backups using the crontab format with the --recurringTask option.
The following example schedules nightly online backup of all user data at 2 AM, notifying diradmin@example.com when finished, or on error:
$ dsbackup \
  create \
  --hostname localhost \
  --port 4444 \
  --bindDN uid=admin \
  --bindPassword password \
  --usePkcs12TrustStore /path/to/opendj/config/keystore \
  --trustStorePassword:file /path/to/opendj/config/keystore.pin \
  --backupLocation bak \
  --recurringTask "00 02 * * *" \
  --description "Nightly backup at 2 AM" \
  --taskId NightlyBackup \
  --completionNotify diradmin@example.com \
  --errorNotify diradmin@example.com
For details, refer to dsbackup.
Back up data (external command)
When you back up data without contacting the server, the dsbackup create command runs as an external command, independent of the server process. It backs up the data whether the server is running or not.
When you back up LDIF-based backends with this method, the command does not lock the files. To avoid corrupting the backup files, do not run the command while the server might be changing the same files. This applies to LDIF backends, schema files, and the task backend, for example.
Use this method to schedule backups with a third-party tool, such as the cron command:
-
Back up data without contacting the server process, using the --offline option.
The following example backs up the dsEvaluation backend immediately:
$ dsbackup \
  create \
  --offline \
  --backupLocation bak \
  --backendName dsEvaluation
To back up all backends, omit the --backendName option.
To back up more than one backend, specify the --backendName option multiple times.
For details, refer to dsbackup.
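For example, a hypothetical crontab entry along these lines runs the offline backup nightly at 2 AM; the dsbackup path, backup location, and backend name are placeholders to adapt for your deployment:

```
00 2 * * * /path/to/opendj/bin/dsbackup create --offline --backupLocation /path/to/opendj/bak --backendName dsEvaluation
```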
Back up configuration files
When you back up directory data using the dsbackup command, you do not back up server configuration files.
The server stores configuration files under the /path/to/opendj/config/ directory.
The server records snapshots of its configuration under the /path/to/opendj/var/ directory.
You can use snapshots to recover from misconfiguration performed with the dsconfig command.
Snapshots only reflect the main configuration file, config.ldif.
-
Stop the server:
$ stop-ds
-
Back up the configuration files:
$ tar -zcvf backup-config-$(date +%s).tar.gz config
By default, this backup includes the server keystore, so store it securely.
-
Start the server:
$ start-ds
Back up using snapshots
ForgeRock recommends using the dsbackup command for backup and restore operations when possible.
You can use snapshot technology as an alternative to the dsbackup command, but you must be careful how you use it.
While DS directory servers are running, database backend cleanup operations write data even when there are no pending client or replication operations. An ongoing file system backup operation may record database log files that are not in sync with each other.
Successful recovery after restore is only guaranteed under certain conditions.
The snapshots must:
-
Be atomic, capturing the state of all files at exactly the same time.
If you are not sure that the snapshot technology is atomic, do not use it. Use the dsbackup command instead.
-
Capture the state of all data (db/) and changelog (changelogDb/) files together.
When using a file system-level snapshot feature, for example, keep at least all data and changelog files on the same file system. This is the case in a default server setup.
-
Be paired with a specific server configuration.
A snapshot of all files includes configuration files that may be specific to one DS server, and cannot be restored safely on another DS server with a different configuration. If you restore all system files, this principle applies to system configuration as well.
For details on making DS configuration files as generic as possible, refer to Property value substitution.
If snapshots in your deployment do not meet these criteria, you must stop the DS server before taking the snapshot. You must also take care not to restore incompatible configuration files.
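As one illustration, with a snapshot technology known to be atomic, the safest pattern stops the server first. The sequence below is a hypothetical sketch assuming ZFS and a dataset named tank/ds that holds the whole DS instance; adapt it to your own tooling and verify its guarantees:

```
$ stop-ds
$ zfs snapshot tank/ds@ds-backup    # atomic, point-in-time snapshot
$ start-ds
```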
Restore
After you restore a replicated backend, replication brings it up to date with changes newer than the backup.
Replication uses internal change log records to determine which changes to apply.
This process happens even if you only have a single server, provided you configured it for replication at setup time (for example, by setting the replication port with the setup command's --replicationPort option).
Replication purges internal change log records, however, to prevent the change log from growing indefinitely. Replication can only bring the backend up to date if the change log still includes the last change backed up. For this reason, when you restore a replicated backend from backup, the backup must be newer than the last purge of the replication change log (default: 3 days).
If no backups are newer than the replication purge delay, do not restore from a backup. Initialize the replica instead, without using a backup. For details, refer to Manual initialization.
Restore data (server task)
-
Verify the backup you intend to restore.
The following example verifies the most recent backup of the dsEvaluation backend:
$ dsbackup \
  list \
  --backupLocation bak \
  --backendName dsEvaluation \
  --last \
  --verify
-
Schedule the restore operation as a task, binding as a user with the backend-restore administrative privilege.
The following example schedules an immediate restore task for the dsEvaluation backend:
$ dsbackup \
  restore \
  --hostname localhost \
  --port 4444 \
  --bindDN uid=admin \
  --bindPassword password \
  --usePkcs12TrustStore /path/to/opendj/config/keystore \
  --trustStorePassword:file /path/to/opendj/config/keystore.pin \
  --backupLocation bak \
  --backendName dsEvaluation
To restore the latest backups of more than one backend, specify the --backendName option multiple times.
To restore a specific backup, specify the --backupId option. To restore multiple specific backups of different backends, specify the --backupId option multiple times.
To list backup information without performing verification, use the dsbackup list command without the --verify option. The output includes backup IDs for use with the --backupId option.
For details, refer to dsbackup.
Restore data (external command)
-
Stop the server if it is running:
$ stop-ds --quiet
-
Verify the backup you intend to restore.
The following example verifies the most recent backup of the dsEvaluation backend:
$ dsbackup \
  list \
  --backupLocation bak \
  --backendName dsEvaluation \
  --last \
  --verify
-
Restore using the --offline option.
The following example restores the dsEvaluation backend:
$ dsbackup \
  restore \
  --offline \
  --backupLocation bak \
  --backendName dsEvaluation
To restore the latest backups of more than one backend, specify the --backendName option multiple times.
To restore a specific backup, specify the --backupId option. To restore multiple specific backups of different backends, specify the --backupId option multiple times.
To list backup information without performing verification, use the dsbackup list command without the --verify option. The output includes backup IDs for use with the --backupId option.
For details, refer to dsbackup.
-
Start the server:
$ start-ds --quiet
Restore configuration files
-
Stop the server:
$ stop-ds --quiet
-
Restore the configuration files from the backup, overwriting existing files:
$ tar -zxvf backup-config-<date>.tar.gz
-
Start the server:
$ start-ds --quiet
Restore from a snapshot
ForgeRock recommends using the dsbackup command for backup and restore operations when possible.
You can use snapshot technology as an alternative to the dsbackup command, but you must be careful how you use it. For details, refer to Back up using snapshots.
Take the following points into account before restoring a snapshot:
-
When you restore files for a replicated backend, the snapshot must be newer than the last purge of the replication change log (default: 3 days).
-
Stop the DS server before you restore the files.
-
The DS configuration files in the snapshot must match the configuration where you restore the snapshot.
If the configuration uses expressions, define their values for the current server before starting DS.
-
When using snapshot files to initialize replication, only restore the data (db/) files for the target backend.
Depending on the snapshot technology, you might need to restore the files separately, and then move only the target backend files from the restored snapshot.
-
When using snapshot files to restore replicated data to a known state, stop all affected servers before you restore.
Purge old files
Periodically purge old backup files with the dsbackup purge command.
The following example removes all backup files older than the default replication purge delay:
$ dsbackup \
purge \
--offline \
--backupLocation bak \
--olderThan 3d
This example runs the external command without contacting the server process. You can also purge backups by ID, or by backend name, and you can specify the number of backups to keep. For details, refer to dsbackup.
To purge files as a server task, use the task options, such as --recurringTask. The user must have the backend-backup administrative privilege to schedule a purge task.
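For example, a purge task modeled on the backup task examples above might be scheduled nightly as follows; the schedule, description, and connection details are placeholders to adapt for your deployment:

```
$ dsbackup \
  purge \
  --hostname localhost \
  --port 4444 \
  --bindDN uid=admin \
  --bindPassword password \
  --usePkcs12TrustStore /path/to/opendj/config/keystore \
  --trustStorePassword:file /path/to/opendj/config/keystore.pin \
  --backupLocation bak \
  --olderThan 3d \
  --recurringTask "00 03 * * *" \
  --description "Nightly purge at 3 AM"
```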
Cloud storage
You can stream backup files to cloud storage, and restore them directly from cloud storage.
The implementation supports these providers:
-
Amazon AWS S3
-
Azure Cloud Storage
-
Google Cloud Storage
Follow these steps to store backup files in the cloud:
-
If you upgraded in place from DS 6.5 or earlier, activate cloud storage for backup.
-
Get a storage account and space from the cloud provider where the server can store backup files.
This storage space is referred to below as cloud-bak.
-
Get credentials from the cloud provider.
The DS server backing up files must have read, write, and delete access. For information about granting access, refer to the access control documentation for your provider.
If you are not yet familiar with cloud storage, refer to the documentation from your provider for help. The following table provides links to the documentation for supported providers:
Amazon AWS S3
For details on setting up S3 and working with S3 buckets, refer to the Amazon Web Services documentation on Getting started with Amazon Simple Storage Service.
Azure Cloud Storage
DS authenticates to Azure with an Azure storage account. For details, refer to the Microsoft documentation on how to Create an Azure Storage account, or to Create a BlockBlobStorage account.
Google Cloud Storage
DS authenticates to Google Cloud with a service account. For details, refer to the Google documentation on Getting Started with Authentication.
For details about creating and managing storage buckets, refer to the Google How-To documentation on Creating buckets, and Working with buckets.
-
Set environment variables for the credentials:
Amazon AWS S3
export AWS_ACCESS_KEY_ID=aws-access-key
export AWS_SECRET_ACCESS_KEY=aws-secret-key
When using temporary credentials, also export the session token:
export AWS_SESSION_TOKEN=aws-session-token
Azure Cloud Storage
export AZURE_ACCOUNT_NAME=azure-account-name
export AZURE_ACCOUNT_KEY=azure-account-key
Google Cloud Storage
export GOOGLE_CREDENTIALS=/path/to/gcp-credentials.json (optional)
-
Restart the DS server so that it reads the environment variables you set:
$ stop-ds --restart
-
Run dsbackup commands with all required provider-specific options.
The options in the following table use the providers' default storage endpoints:
Amazon AWS S3
--storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
--storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
--backupLocation s3://cloud-bak

# When using temporary credentials, also use the session token:
--storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
--storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
--storageProperty s3.sessionToken.env.var:AWS_SESSION_TOKEN \
--backupLocation s3://cloud-bak
Azure Cloud Storage
--storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
--storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
--backupLocation az://cloud-bak
Google Cloud Storage
--storageProperty gs.credentials.path:/path/to/gcp-credentials.json \
--backupLocation gs://cloud-bak
or
--storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
--backupLocation gs://cloud-bak
In production environments, also set the cloud storage endpoint.
Cloud storage requires working space in the local system temporary directory. Some cloud storage providers require sending the content length with each file.
To send the correct content length, the dsbackup command writes each prepared backup file to the system temporary directory before upload. It deletes each file after successful upload.
Cloud storage endpoint
Backup to cloud storage can use a default endpoint, which simplifies evaluation and testing.
To control where your backup files go, add one of the following options:
-
--storage-property endpoint:endpoint-url
-
--storage-property endpoint.env.var:environment-variable-for-endpoint-url
The endpoint-url depends on your provider. Refer to their documentation for details.
For Azure cloud storage, the endpoint-url starts with the account name. Examples include https://azure-account-name.blob.core.windows.net, https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net, and https://${AZURE_ACCOUNT_NAME}.some.private.azure.endpoint.
Cloud storage samples demonstrate how to use the setting.
Cloud storage samples
Click the samples for your storage provider to expand the section and display the commands:
AWS samples
#
# API keys created through the AWS API gateway console:
#
export AWS_ACCESS_KEY_ID=aws-access-key-id
export AWS_SECRET_ACCESS_KEY=aws-secret-key
# When using temporary credentials:
# export AWS_SESSION_TOKEN=aws-session-token
# These samples use the following S3 bucket, and a non-default endpoint:
# S3 bucket: s3://ds-test-backup
# S3 endpoint: https://s3.us-east-1.amazonaws.com
#
# When using temporary credentials, also add
# the AWS session token storage property option to each of the commands:
# --storageProperty s3.sessionToken.env.var:AWS_SESSION_TOKEN
# Back up the dsEvaluation backend offline:
dsbackup create --backendName dsEvaluation --offline \
--backupLocation s3://ds-test-backup \
--storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
--storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
--storageProperty endpoint:https://s3.us-east-1.amazonaws.com
# List and verify the latest backup files for each backend at this location:
dsbackup list --verify --last \
--backupLocation s3://ds-test-backup \
--storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
--storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
--storageProperty endpoint:https://s3.us-east-1.amazonaws.com
# Restore dsEvaluation from backup offline:
dsbackup restore --backendName dsEvaluation --offline \
--backupLocation s3://ds-test-backup \
--storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
--storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
--storageProperty endpoint:https://s3.us-east-1.amazonaws.com
# Purge all dsEvaluation backup files:
dsbackup purge --backendName dsEvaluation --keepCount 0 --offline \
--backupLocation s3://ds-test-backup \
--storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
--storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
--storageProperty endpoint:https://s3.us-east-1.amazonaws.com
Azure samples
#
# Credentials for Azure storage, where the Azure account key is found under key1 in the Azure console:
#
export AZURE_ACCOUNT_NAME=azure-account-name
export AZURE_ACCOUNT_KEY=azure-account-key
# These samples use the following Azure storage, and a non-default endpoint:
# Azure storage: az://ds-test-backup/test1
# Azure endpoint: https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net
# Back up the dsEvaluation backend offline:
dsbackup create --backendName dsEvaluation --offline \
--backupLocation az://ds-test-backup/test1 \
--storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
--storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
--storageProperty "endpoint:https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net"
# List and verify the latest backup files for each backend at this location:
dsbackup list --verify --last \
--backupLocation az://ds-test-backup/test1 \
--storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
--storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
--storageProperty "endpoint:https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net"
# Restore dsEvaluation from backup offline:
dsbackup restore --backendName dsEvaluation --offline \
--backupLocation az://ds-test-backup/test1 \
--storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
--storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
--storageProperty "endpoint:https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net"
# Purge all dsEvaluation backup files:
dsbackup purge --backendName dsEvaluation --keepCount 0 --offline \
--backupLocation az://ds-test-backup/test1 \
--storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
--storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
--storageProperty "endpoint:https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net"
Google cloud samples
#
# Credentials generated with and downloaded from the Google Cloud console:
#
export GOOGLE_CREDENTIALS=/path/to/gcp-credentials.json
# These samples use the following cloud storage, and endpoint:
# Google storage: gs://ds-test-backup/test1
# Google endpoint: https://www.googleapis.com
# Back up the dsEvaluation backend offline:
dsbackup create --backendName dsEvaluation --offline \
--backupLocation gs://ds-test-backup/test1 \
--storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
--storageProperty endpoint:https://www.googleapis.com
# List and verify the latest backup files for each backend at this location:
dsbackup list --verify --last \
--backupLocation gs://ds-test-backup/test1 \
--storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
--storageProperty endpoint:https://www.googleapis.com
# Restore dsEvaluation from backup offline:
dsbackup restore --backendName dsEvaluation --offline \
--backupLocation gs://ds-test-backup/test1 \
--storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
--storageProperty endpoint:https://www.googleapis.com
# Purge all dsEvaluation backup files:
dsbackup purge --backendName dsEvaluation --keepCount 0 --offline \
--backupLocation gs://ds-test-backup/test1 \
--storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
--storageProperty endpoint:https://www.googleapis.com
Efficiently store backup files
DS backups are collections of files in a backup directory. To restore from backup, DS requires a coherent collection of backup files.
You can use the dsbackup command to purge stale backup files from a backup directory. When you purge stale backup files, the command leaves a coherent collection of files you can use to restore data.
You should also store copies of backup files remotely to guard against the loss of data in a disaster.
Remote storage
Perform the following steps to store copies of backup files remotely in an efficient way. These steps address backup of directory data, which is potentially very large, not backup of configuration data, which is almost always small:
-
Choose a local directory or local network directory to hold backup files.
Alternatively, you can back up to cloud storage.
-
Schedule a regular backup task to back up files to the directory you chose.
Make sure that the backup task runs more often than the replication purge delay. For example, schedule the backup task to run every three hours for a default purge delay of three days. Each time the task runs, it backs up only new directory backend files.
For details, refer to the steps for backing up directory data.
-
Store copies of the local backup files at a remote location for safekeeping:
-
Purge old files in the local backup directory.
As described in How backup works, DS backups are cumulative: when backing up data again, DS reuses the files that have not changed since previous backup operations. The set of backup files is fully standalone.
The purge removes stale files without impacting the integrity of newer backups, reducing the volume of backup files to store when you copy files remotely.
-
Regularly copy the backup directory and all the files it holds to a remote location.
For example, copy all local backup files every day to a remote directory called bak-date:
$ ssh user@remote-storage mkdir /path/to/bak-date
$ scp -r /path/to/bak/* user@remote-storage:/path/to/bak-date/
-
-
Remove old bak-date directories from remote storage in accordance with the backup policy for the deployment.
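The copy step above can be sketched as a small bash script. To keep the sketch self-contained, a local directory stands in for remote storage; in a real deployment, replace the cp step with the ssh and scp commands shown earlier.

```shell
#!/usr/bin/env bash
# Sketch: copy the backup directory to a dated directory for safekeeping.
# A local directory stands in for remote storage here; in production,
# use ssh/scp to user@remote-storage instead of cp.
set -e
bak=$(mktemp -d)       # stands in for /path/to/bak
remote=$(mktemp -d)    # stands in for the remote storage root
echo data > "$bak/00000001.jdb"

stamp=$(date +%F)                  # for example, 2024-01-31
mkdir -p "$remote/bak-$stamp"
cp -r "$bak"/. "$remote/bak-$stamp"/
ls "$remote"
```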
Restore from remote backup
For each DS directory server to restore:
-
Install DS using the same cryptographic keys and deployment ID.
Backup files are protected using keys derived from the DS deployment ID and password. You must use the same ones when recovering from a disaster.
-
Restore directory data from the latest remote backup folder.
After restoring all directory servers, validate that the restore procedure was a success.