PingDS 7.5.1

Backup and restore

  • Backup archives are not guaranteed to be compatible across major and minor server releases. Restore backups only on directory servers of the same major/minor version.

    To share data between servers of different versions, either use replication, or use LDIF.

  • DS servers use cryptographic keys to sign and verify the integrity of backup files, and to encrypt data. Servers protect these keys by encrypting them with the shared master key for a deployment. For portability, servers store the encrypted keys in the backup files.

    Any server can therefore restore a backup taken with the same server version, as long as it holds a copy of the shared master key used to encrypt the keys.

How backup works

DS directory servers store data in backends. The amount of data in a backend varies depending on your deployment. It can range from very small to very large. A JE backend can hold billions of LDAP entries, for example.

Backup process

A JE backend stores data on disk using append-only log files with names like number.jdb. The JE backend writes updates to the highest-numbered log file. The log files grow until they reach a specified size (default: 1 GB). When the current log file reaches the specified size, the JE backend creates a new log file.

To avoid an endless increase in database size on disk, JE backends clean their log files in the background. A cleaner thread copies active records to new log files. Log files that no longer contain active records are deleted.

The DS backup process takes advantage of this log file structure. Together, a set of log files represents a backend at a point in time. The backup process essentially copies the log files to the backup directory. DS also protects the data and adds metadata to keep track of the log files it needs to restore a JE backend to the state it had when the backup task completed.
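The log file layout can be sketched as follows. This is an illustration only: the file names are invented here, and real names and sizes come from the JE backend itself.

```shell
# Illustrative only: simulate the numbered, append-only JE log files.
demo=$(mktemp -d)
touch "$demo"/00000000.jdb "$demo"/00000001.jdb "$demo"/00000002.jdb
# New writes go to the highest-numbered file; when it reaches the size
# limit (default 1 GB), the backend opens the next numbered file.
ls "$demo" | sort | tail -n 1   # the current (highest-numbered) log file
```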

Cumulative backups

DS backups are cumulative. A new backup reuses the JE files that have not changed since the last backup operation, and copies only the files the backend created or changed since then. Unchanged files are shared between backups.

A set of backup files is fully standalone.
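The sharing can be sketched by comparing the file lists of two consecutive backups. The file names below are invented for illustration; files common to both listings were reused rather than copied again.

```shell
# Illustrative only: compare the file lists of two cumulative backups.
demo=$(mktemp -d)
printf '%s\n' 00000000.jdb 00000001.jdb > "$demo/backup-1.files"
printf '%s\n' 00000000.jdb 00000001.jdb 00000002.jdb > "$demo/backup-2.files"
comm -12 "$demo/backup-1.files" "$demo/backup-2.files"   # shared, reused files
comm -13 "$demo/backup-1.files" "$demo/backup-2.files"   # prints 00000002.jdb, the only new file copied
```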

Purge old backups

Backup operations keep copies of JE files until you purge them.

The backup purge operation prevents an endless increase in the size of the backup folder on disk. The purge operation does not happen automatically; you choose to run it. When you run a purge operation, it removes the files for old or selected backups. The purge does not impact the integrity of the backups DS keeps. It only removes log files that do not belong to any remaining backups.

Back up

When you set up a directory server, the process creates a /path/to/opendj/bak/ directory. You can use this for backups if you have enough local disk space, and when developing or testing backup processes. In deployment, store backups remotely to avoid losing your data and backups in the same crash.

Back up data (server task)

When you schedule a backup as a server task, the DS server manages task completion. The server must be running when you schedule the task, and when the task runs:

  1. Schedule the task on a running server, binding as a user with the backend-backup administrative privilege.

    The following example schedules an immediate backup task for the dsEvaluation backend:

    $ dsbackup \
     create \
     --hostname localhost \
     --port 4444 \
     --bindDN uid=admin \
     --bindPassword password \
     --usePkcs12TrustStore /path/to/opendj/config/keystore \
     --trustStorePassword:file /path/to/opendj/config/keystore.pin \
     --backupLocation bak \
     --backendName dsEvaluation

    To back up all backends, omit the --backendName option.

    To back up more than one backend, specify the --backendName option multiple times.

    For details, refer to dsbackup.

Back up data (scheduled task)

When you schedule a recurring backup task, the DS server manages task completion. The server must be running when you schedule the task, and when the task runs:

  1. Schedule backups using the crontab format with the --recurringTask option.

    The following example schedules nightly online backup of all user data at 2 AM, notifying diradmin@example.com when finished, or on error:

    $ dsbackup \
     create \
     --hostname localhost \
     --port 4444 \
     --bindDN uid=admin \
     --bindPassword password \
     --usePkcs12TrustStore /path/to/opendj/config/keystore \
     --trustStorePassword:file /path/to/opendj/config/keystore.pin \
     --backupLocation bak \
     --recurringTask "00 02 * * *" \
     --description "Nightly backup at 2 AM" \
     --taskId NightlyBackup \
     --completionNotify diradmin@example.com \
     --errorNotify diradmin@example.com

    For details, refer to dsbackup.

    Use the manage-tasks command to manage scheduled tasks. For background, read Server tasks. For an example command, refer to Status and tasks.

Back up data (external command)

When you back up data without contacting the server, the dsbackup create command runs as an external command, independent of the server process. It backs up the data whether the server is running or not.

When you back up LDIF-based backends with this method, the command does not lock the files. To avoid corrupting the backup files, do not run the dsbackup create --offline command on an LDIF backend simultaneously with any changes to the backend.

This applies to LDIF backends, schema files, and the task backend, for example.

Use this method to schedule backup with a third-party tool, such as the cron command:

  1. Back up data without contacting the server process, and use the --offline option.

    The following example backs up the dsEvaluation backend immediately:

    $ dsbackup \
     create \
     --offline \
     --backupLocation bak \
     --backendName dsEvaluation

    To back up all backends, omit the --backendName option.

    To back up more than one backend, specify the --backendName option multiple times.

    For details, refer to dsbackup.
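For example, the offline backup above can be scheduled with cron. The following crontab entry is a sketch: the dsbackup path, backup location, and log file are assumptions to adapt to your installation.

```shell
# Hypothetical crontab entry (edit with crontab -e): nightly offline backup at 2 AM.
0 2 * * * /path/to/opendj/bin/dsbackup create --offline --backupLocation /path/to/bak --backendName dsEvaluation >> /path/to/dsbackup-cron.log 2>&1
```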

Back up configuration files

When you back up directory data using the dsbackup command, you do not back up server configuration files. The server stores configuration files under the /path/to/opendj/config/ directory.

The server records snapshots of its configuration under the /path/to/opendj/var/ directory. You can use snapshots to recover from misconfiguration performed with the dsconfig command. Snapshots only reflect the main configuration file, config.ldif.

  1. Stop the server:

    $ stop-ds
  2. Back up the configuration files:

    $ tar -zcvf backup-config-$(date +%s).tar.gz config

    By default, this backup includes the server keystore, so store it securely.

  3. Start the server:

    $ start-ds
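Before relying on a configuration archive, you can list its contents. This sketch builds a throwaway archive in a temporary directory so it is self-contained; real archives live wherever you store configuration backups.

```shell
# Sketch: create a config archive and list its contents before trusting it.
demo=$(mktemp -d)
mkdir -p "$demo/config"
echo "dn: cn=config" > "$demo/config/config.ldif"
tar -C "$demo" -zcf "$demo/backup-config.tar.gz" config
tar -tzf "$demo/backup-config.tar.gz"   # lists config/ and config/config.ldif
```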

Back up using snapshots

Use the dsbackup command when possible for backup and restore operations. You can use snapshot technology as an alternative to the dsbackup command, but you must be careful how you use it.

While DS directory servers are running, database backend cleanup operations write data even when there are no pending client or replication operations. An ongoing file system backup operation may record database log files that are not in sync with each other.

Successful recovery after restore is only guaranteed under certain conditions.

The snapshots must:

  • Be atomic, capturing the state of all files at exactly the same time.

    If you are not sure that the snapshot technology is atomic, do not use it. Use the dsbackup command instead.

    For example, Kubernetes deployments can use volume snapshots when the underlying storage supports atomic snapshots. For details, refer to Backup and restore using volume snapshots.

    In contrast, do not use VMware snapshots to back up a running DS server.

  • Capture the state of all data (db/) and changelog (changelogDb/) files together.

    When using a file system-level snapshot feature, for example, keep at least all data and changelog files on the same file system. This is the case in a default server setup.

  • Be paired with a specific server configuration.

    A snapshot of all files includes configuration files that may be specific to one DS server, and cannot be restored safely on another DS server with a different configuration. If you restore all system files, this principle applies to system configuration as well.

    For details on making DS configuration files as generic as possible, refer to Property value substitution.

If snapshots in your deployment do not meet these criteria, you must stop the DS server before taking the snapshot. You must also take care not to restore incompatible configuration files.
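One way to sanity-check the second criterion is to compare the file systems backing the data and changelog directories. This is a sketch with illustrative paths; in a real check, point df at /path/to/opendj/db and /path/to/opendj/changelogDb.

```shell
# Sketch: confirm both directories reside on the same file system device,
# so a file system snapshot captures them together.
demo=$(mktemp -d)
mkdir -p "$demo/db" "$demo/changelogDb"
db_fs=$(df -P "$demo/db" | awk 'NR==2 {print $1}')
cl_fs=$(df -P "$demo/changelogDb" | awk 'NR==2 {print $1}')
[ "$db_fs" = "$cl_fs" ] && echo "same file system"
```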

Backup and restore options

Compare the dsbackup commands with snapshots on the following criteria:

  • What is backed up

    dsbackup commands: DS backend data only.

    Snapshots: potentially everything; at minimum, DS backends and changelogs.

  • Incremental backups

    dsbackup commands: yes.

    Snapshots: depends on the snapshot tools.

  • Portability

    dsbackup commands: yes; restore backend data on any DS server of the same major/minor version.

    Snapshots: depends; potentially limited to the same environment, as with Kubernetes volume snapshots.

  • Disaster recovery

    dsbackup commands: optimal; restore data and delete the old changelog.

    Snapshots: potentially restore the changelog only to clear it during recovery.

  • Recover a single server

    dsbackup commands: potentially slower while rebuilding the local changelog; impacts the change number index (if enabled).

    Snapshots: optimal; restore everything to the previous state.

  • Choice of what to restore

    dsbackup commands: good; you choose which backends to restore.

    Snapshots: bad; you restore the file system, potentially rolling back multiple backends at once.

  • Ease of use

    dsbackup commands: medium; you must understand the dsbackup commands and choose what to restore.

    Snapshots: medium; you must understand platform tools and the impact of restoring everything at once.

Restore

After you restore a replicated backend, replication brings it up to date with changes newer than the backup. Replication uses internal change log records to determine which changes to apply. This process happens even if you only have a single server that you configured for replication at setup time (by setting the replication port with the --replicationPort port option). To prevent replication from replaying changes newer than the backup you restore, refer to Disaster recovery.

Replication purges internal change log records, however, to prevent the change log from growing indefinitely. Replication can only bring the backend up to date if the change log still includes the last change backed up.

For this reason, when you restore a replicated backend from backup, the backup must be newer than the last purge of the replication change log (default: 3 days).

If no backups are newer than the replication purge delay, do not restore from a backup. Initialize the replica instead, without using a backup. For details, refer to Manual initialization.
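The age check can be sketched with shell date arithmetic. GNU date is assumed here, and the backup timestamp is hypothetical.

```shell
# Sketch: is the backup newer than the 3-day replication purge delay?
purge_delay=$((3 * 24 * 3600))                 # default purge delay in seconds
backup_epoch=$(date -d '2 days ago' +%s)       # hypothetical backup timestamp
age=$(( $(date +%s) - backup_epoch ))
if [ "$age" -lt "$purge_delay" ]; then
  echo "backup is restorable"
else
  echo "too old: initialize the replica instead"
fi
```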

Restore data (server task)

  1. Verify the backup you intend to restore.

    The following example verifies the most recent backup of the dsEvaluation backend:

    $ dsbackup \
     list \
     --backupLocation bak \
     --backendName dsEvaluation \
     --last \
     --verify
  2. Schedule the restore operation as a task, binding as a user with the backend-restore administrative privilege.

    The following example schedules an immediate restore task for the dsEvaluation backend:

    $ dsbackup \
     restore \
     --hostname localhost \
     --port 4444 \
     --bindDN uid=admin \
     --bindPassword password \
     --usePkcs12TrustStore /path/to/opendj/config/keystore \
     --trustStorePassword:file /path/to/opendj/config/keystore.pin \
     --backupLocation bak \
     --backendName dsEvaluation

    To restore the latest backups of more than one backend, specify the --backendName option multiple times.

    To restore a specific backup, specify the --backupId option. To restore multiple specific backups of different backends, specify the --backupId option multiple times.

    To list backup information without performing verification, use the dsbackup list command without the --verify option. The output includes backup IDs for use with the --backupId option.

    For details, refer to dsbackup.

Restore data (external command)

  1. Stop the server if it is running:

    $ stop-ds --quiet
  2. Verify the backup you intend to restore.

    The following example verifies the most recent backup of the dsEvaluation backend:

    $ dsbackup \
     list \
     --backupLocation bak \
     --backendName dsEvaluation \
     --last \
     --verify
  3. Restore using the --offline option.

    The following example restores the dsEvaluation backend:

    $ dsbackup \
     restore \
     --offline \
     --backupLocation bak \
     --backendName dsEvaluation

    To restore the latest backups of more than one backend, specify the --backendName option multiple times.

    To restore a specific backup, specify the --backupId option. To restore multiple specific backups of different backends, specify the --backupId option multiple times.

    To list backup information without performing verification, use the dsbackup list command without the --verify option. The output includes backup IDs for use with the --backupId option.

    For details, refer to dsbackup.

  4. Start the server:

    $ start-ds --quiet

Restore configuration files

  1. Stop the server:

    $ stop-ds --quiet
  2. Restore the configuration files from the backup, overwriting existing files:

    $ tar -zxvf backup-config-<date>.tar.gz
  3. Start the server:

    $ start-ds --quiet

Restore from a snapshot

Use the dsbackup command when possible for backup and restore operations.

You can use snapshot technology as an alternative to the dsbackup command, but you must be careful how you use it. For details, refer to Back up using snapshots.

Take the following points into account before restoring a snapshot:

  • When you restore files for a replicated backend, the snapshot must be newer than the last purge of the replication change log (default: 3 days).

  • Stop the DS server before you restore the files.

  • The DS configuration files in the snapshot must match the configuration where you restore the snapshot.

    If the configuration uses expressions, define their values for the current server before starting DS.

  • When using snapshot files to initialize replication, only restore the data (db/) files for the target backend.

    Depending on the snapshot technology, you might need to restore the files separately, and then move only the target backend files from the restored snapshot.

  • When using snapshot files to restore replicated data to a known state, stop all affected servers before you restore.

Purge old files

Periodically purge old backup files with the dsbackup purge command. The following example removes all backup files older than the default replication purge delay:

$ dsbackup \
 purge \
 --offline \
 --backupLocation bak \
 --olderThan 3d

This example runs the external command without contacting the server process. You can also purge backups by ID, or by backend name, and you can specify the number of backups to keep. For details, refer to dsbackup.

To purge files as a server task, use the task options, such as --recurringTask. The user must have the backend-backup administrative privilege to schedule a purge task.

Cloud storage

You can stream backup files to cloud storage, and restore them directly from cloud storage.

The implementation supports these providers:

  • Amazon AWS S3

  • Azure Cloud Storage

  • Google Cloud Storage

Follow these steps to store backup files in the cloud:

  1. If you upgraded in place from DS 6.5 or earlier, activate cloud storage for backup.

  2. Get a storage account and space from the cloud provider where the server can store backup files.

    This storage space is referred to below as cloud-bak.

  3. Get credentials from the cloud provider.

    The DS server backing up files must have read, write, and delete access. For information about granting access, refer to the access control documentation for your provider.

    If you are not yet familiar with cloud storage, refer to the documentation from your provider for help. The following list links to the documentation for supported providers:

    • Amazon AWS S3: for details on setting up S3 and working with S3 buckets, refer to the Amazon Web Services documentation on Getting started with Amazon Simple Storage Service.

    • Azure Cloud Storage: DS authenticates to Azure with an Azure storage account. For details, refer to the Microsoft documentation on how to Create an Azure Storage account, or to Create a BlockBlobStorage account.

    • Google Cloud Storage: DS authenticates to Google Cloud with a service account. For details, refer to the Google documentation on Getting Started with Authentication. For details about creating and managing storage buckets, refer to the Google How-To documentation on Creating buckets, and Working with buckets.

  4. Set environment variables for the credentials:

    • Amazon AWS S3:

      export AWS_ACCESS_KEY_ID=aws-access-key
      export AWS_SECRET_ACCESS_KEY=aws-secret-key

      When using temporary credentials, also export the session token:

      export AWS_SESSION_TOKEN=aws-session-token

    • Azure Cloud Storage:

      export AZURE_ACCOUNT_NAME=azure-account-name
      export AZURE_ACCOUNT_KEY=azure-account-key

    • Google Cloud Storage (optional):

      export GOOGLE_CREDENTIALS=/path/to/gcp-credentials.json

  5. Restart the DS server so that it reads the environment variables you set:

    $ stop-ds --restart
  6. Run dsbackup commands with all required provider-specific options.

    The options in the following list use the providers' default storage endpoints:

    • Amazon AWS S3:

      --storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
      --storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
      --backupLocation s3://cloud-bak

      When using temporary credentials, also add the session token property:

      --storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
      --storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
      --storageProperty s3.sessionToken.env.var:AWS_SESSION_TOKEN \
      --backupLocation s3://cloud-bak

    • Azure Cloud Storage:

      --storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
      --storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
      --backupLocation az://cloud-bak

    • Google Cloud Storage, using either the credentials file path:

      --storageProperty gs.credentials.path:/path/to/gcp-credentials.json \
      --backupLocation gs://cloud-bak

      or the environment variable:

      --storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
      --backupLocation gs://cloud-bak

    In production environments, also set the cloud storage endpoint.

    Cloud storage requires working space in the local system temporary directory. Some cloud storage providers require sending the content length with each file.

    To send the correct content length, the dsbackup command writes each prepared backup file to the system temporary directory before upload. It deletes each file after successful upload.

Cloud storage endpoint

Backup to cloud storage can use a default endpoint, which simplifies evaluation and testing. In production, control where your backup files go by adding one of the following options:

  • --storageProperty endpoint:endpoint-url

  • --storageProperty endpoint.env.var:environment-variable-for-endpoint-url

The endpoint-url depends on your provider. Refer to their documentation for details. For Azure cloud storage, the endpoint-url starts with the account name. Examples include https://azure-account-name.blob.core.windows.net, https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net, and https://${AZURE_ACCOUNT_NAME}.some.private.azure.endpoint.

Cloud storage samples demonstrate how to use the setting.

Cloud storage samples

The following samples show the commands for each supported storage provider:

AWS samples
#
# API keys created through the AWS API gateway console:
#
export AWS_ACCESS_KEY_ID=aws-access-key-id
export AWS_SECRET_ACCESS_KEY=aws-secret-key
# When using temporary credentials:
# export AWS_SESSION_TOKEN=aws-session-token

# These samples use the following S3 bucket, and a non-default endpoint:
# S3 bucket: s3://ds-test-backup
# S3 endpoint: https://s3.us-east-1.amazonaws.com
#
# When using temporary credentials, also add
# the AWS session token storage property option to each of the commands:
# --storageProperty s3.sessionToken.env.var:AWS_SESSION_TOKEN

# Back up the dsEvaluation backend offline:
dsbackup create --backendName dsEvaluation --offline \
 --backupLocation s3://ds-test-backup \
 --storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
 --storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
 --storageProperty endpoint:https://s3.us-east-1.amazonaws.com

# List and verify the latest backup files for each backend at this location:
dsbackup list --verify --last \
 --backupLocation s3://ds-test-backup \
 --storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
 --storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
 --storageProperty endpoint:https://s3.us-east-1.amazonaws.com

# Restore dsEvaluation from backup offline:
dsbackup restore --backendName dsEvaluation --offline \
 --backupLocation s3://ds-test-backup \
 --storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
 --storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
 --storageProperty endpoint:https://s3.us-east-1.amazonaws.com

# Purge all dsEvaluation backup files:
dsbackup purge --backendName dsEvaluation --keepCount 0 --offline \
 --backupLocation s3://ds-test-backup \
 --storageProperty s3.keyId.env.var:AWS_ACCESS_KEY_ID \
 --storageProperty s3.secret.env.var:AWS_SECRET_ACCESS_KEY \
 --storageProperty endpoint:https://s3.us-east-1.amazonaws.com
Azure samples
#
# Credentials for Azure storage, where the Azure account key is found under key1 in the Azure console:
#
export AZURE_ACCOUNT_NAME=azure-account-name
export AZURE_ACCOUNT_KEY=azure-account-key

# These samples use the following Azure storage, and a non-default endpoint:
# Azure storage: az://ds-test-backup/test1
# Azure endpoint: https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net

# Back up the dsEvaluation backend offline:
dsbackup create --backendName dsEvaluation --offline \
 --backupLocation az://ds-test-backup/test1 \
 --storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
 --storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
 --storageProperty "endpoint:https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net"

# List and verify the latest backup files for each backend at this location:
dsbackup list --verify --last \
 --backupLocation az://ds-test-backup/test1 \
 --storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
 --storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
 --storageProperty "endpoint:https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net"

# Restore dsEvaluation from backup offline:
dsbackup restore --backendName dsEvaluation --offline \
 --backupLocation az://ds-test-backup/test1 \
 --storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
 --storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
 --storageProperty "endpoint:https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net"

# Purge all dsEvaluation backup files:
dsbackup purge --backendName dsEvaluation --keepCount 0 --offline \
 --backupLocation az://ds-test-backup/test1 \
 --storageProperty az.accountName.env.var:AZURE_ACCOUNT_NAME \
 --storageProperty az.accountKey.env.var:AZURE_ACCOUNT_KEY \
 --storageProperty "endpoint:https://${AZURE_ACCOUNT_NAME}.blob.core.windows.net"
Google Cloud samples
#
# Credentials generated with and downloaded from the Google Cloud console:
#
export GOOGLE_CREDENTIALS=/path/to/gcp-credentials.json

# These samples use the following cloud storage, and endpoint:
# Google storage: gs://ds-test-backup/test1
# Google endpoint: https://www.googleapis.com

# Back up the dsEvaluation backend offline:
dsbackup create --backendName dsEvaluation --offline \
 --backupLocation gs://ds-test-backup/test1 \
 --storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
 --storageProperty endpoint:https://www.googleapis.com

# List and verify the latest backup files for each backend at this location:
dsbackup list --verify --last \
 --backupLocation gs://ds-test-backup/test1 \
 --storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
 --storageProperty endpoint:https://www.googleapis.com

# Restore dsEvaluation from backup offline:
dsbackup restore --backendName dsEvaluation --offline \
 --backupLocation gs://ds-test-backup/test1 \
 --storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
 --storageProperty endpoint:https://www.googleapis.com

# Purge all dsEvaluation backup files:
dsbackup purge --backendName dsEvaluation --keepCount 0 --offline \
 --backupLocation gs://ds-test-backup/test1 \
 --storageProperty gs.credentials.env.var:GOOGLE_CREDENTIALS \
 --storageProperty endpoint:https://www.googleapis.com

Efficiently store backup files

DS backups are collections of files in a backup directory. To restore from backup, DS requires a coherent collection of backup files.

You can use the dsbackup command to purge stale backup files from a backup directory. When you purge stale backup files, the command leaves a coherent collection of files you can use to restore data.

You should also store copies of backup files remotely to guard against the loss of data in a disaster.

Remote storage

Perform the following steps to store copies of backup files remotely in an efficient way. These steps address backup of directory data, which is potentially very large, not backup of configuration data, which is almost always small:

  1. Choose a local directory or local network directory to hold backup files.

    Alternatively, you can back up to cloud storage.

  2. Schedule a regular backup task to back up files to the directory you chose.

    Make sure that the backup task runs more often than the replication purge delay. For example, schedule the backup task to run every three hours for a default purge delay of three days. Each time the task runs, it backs up only new directory backend files.

    For details, refer to the steps for backing up directory data.

  3. Store copies of the local backup files at a remote location for safekeeping:

    1. Purge old files in the local backup directory.

      As described in How backup works, DS backups are cumulative in nature; DS reuses common data that has not changed from previous backup operations when backing up data again. The set of backup files is fully standalone.

      The purge removes stale files without impacting the integrity of newer backups, reducing the volume of backup files to store when you copy files remotely.

    2. Regularly copy the backup directory and all the files it holds to a remote location.

      For example, copy all local backup files every day to a remote directory called bak-date:

      $ ssh user@remote-storage mkdir /path/to/bak-date
      $ scp -r /path/to/bak/* user@remote-storage:/path/to/bak-date/
  4. Remove old bak-date directories from remote storage in accordance with the backup policy for the deployment.
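Step 4 can be sketched with find, run on the storage host. The 30-day retention and paths are assumptions to match your backup policy; GNU touch and find are used here to simulate an old directory.

```shell
# Sketch: delete bak-<date> directories older than 30 days.
storage=$(mktemp -d)
mkdir -p "$storage/bak-20240101" "$storage/bak-recent"
touch -d '40 days ago' "$storage/bak-20240101"   # simulate an old backup dir
find "$storage" -mindepth 1 -maxdepth 1 -type d -name 'bak-*' -mtime +30 -exec rm -rf {} +
ls "$storage"   # only bak-recent remains
```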

Restore from remote backup

For each DS directory server to restore:

  1. Install DS using the same cryptographic keys and deployment ID.

    Backup files are protected using keys derived from the DS deployment ID and password. You must use the same ones when recovering from a disaster.

  2. Restore configuration files.

  3. Restore directory data from the latest remote backup folder.

After restoring all directory servers, validate that the restore procedure was a success.

Copyright © 2010-2024 ForgeRock, all rights reserved.