Replication in DS/OpenDJ

This book provides information on configuring, managing, troubleshooting and recovering replication in DS/OpenDJ. Known issues are included along with solutions.


Configuring and Managing Replication


How do I quickly create a new DS/OpenDJ (All versions) replica?

The purpose of this article is to provide information on quickly creating a new DS/OpenDJ replica. It assumes you have an existing replication topology.

Creating a new replica

This process refers to the following example servers in the commands to help distinguish between them:

  • server1 - an existing replica in the replication topology (ds1.example.com in the example commands).
  • serverNew - the new replica server being added to the replication topology (ds2.example.com in the example commands).

You can create a new replica as follows:

  1. Install and configure your new replica server from scratch with non-replicated configuration elements such as password policies, global ACIs, backends and indexes; do not configure replication at this stage.
  2. Enable replication on your new replica server using the dsreplication command applicable to your version (do not use the dsreplication initialize command):
    • DS 5 and later:
      $ ./dsreplication configure --adminUid admin --adminPassword password --baseDn dc=example,dc=com --host1 ds1.example.com --port1 4444 --bindDn1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 ds2.example.com --port2 4444 --bindDn2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 8989 --trustAll --no-prompt
    • Pre-DS 5:
      $ ./dsreplication enable --adminUID admin --adminPassword password --baseDN dc=example,dc=com --host1 ds1.example.com --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 ds2.example.com --port2 4444 --bindDN2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 8989 --trustAll --no-prompt
  3. Back up an existing replica in your replication topology using the backup command, for example:
    $ ./backup --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --backendID [backendID] --backupDirectory /path/to/ds/bak --start 0
  4. Transfer the backup you created in step 3 from server1 to serverNew.
  5. Restore the data you backed up in step 3 on your new replica server using the restore command, for example:
    $ ./restore --hostname ds2.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --backupID [backupid] --backupDirectory /path/to/ds/bak
  6. Tell all the replica servers to recompute the generation ID for this baseDN and to broadcast it to each other; this enables them to replicate again. Ensure you use the same baseDN as in step 2:
    $ ./dsreplication post-external-initialization --hostname ds2.example.com --port 4444 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --trustAll --no-prompt

You have now created a new replica which is included in your replication topology. Any changes made to the other servers between taking the backup and running the post-external-initialization command will now be replayed on the new server, as long as the replication purge delay covers the time between steps 2 and 6. The default replication purge delay is 3 days.
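
The purge delay condition above can be checked with simple arithmetic. The following is a minimal sketch only: the within_purge_window function and its arguments are hypothetical names that are not part of DS/OpenDJ, and it assumes the purge delay is expressed in whole days.

```shell
#!/bin/sh
# Hypothetical helper (not a DS/OpenDJ tool): succeeds when the time elapsed
# since the backup is still inside the replication purge delay.
within_purge_window() {
  backup_epoch=$1   # seconds since the epoch when the backup was taken
  now_epoch=$2      # current time in seconds since the epoch
  purge_days=$3     # replication purge delay in whole days (3 by default)
  elapsed=$((now_epoch - backup_epoch))
  [ "$elapsed" -le $((purge_days * 86400)) ]
}

backup_epoch=$(date +%s)   # record this just before running the backup
# ... back up, transfer and restore here ...
if within_purge_window "$backup_epoch" "$(date +%s)" 3; then
  echo "Within the purge delay: missed changes can still be replayed"
else
  echo "Purge delay exceeded: the backup may be too old to catch up"
fi
```

If you have changed the replication purge delay from its default, pass that value instead of 3.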

See Also

How do I design and implement my backup and restore strategies for DS/OpenDJ (All versions)?

FAQ: Backup and restore in DS/OpenDJ

Installing and Administering DS/OpenDJ

Administration Guide › Managing Data Replication › Configuring Replication

Related Training

N/A

Related Issue Tracker IDs

N/A


How do I use the AWS snapshot feature to quickly create DS/OpenDJ (All versions) instances?

The purpose of this article is to provide information on using the Amazon Web Services™ (AWS™) snapshot feature to quickly provision new DS/OpenDJ instances from an existing running server and add them to the replication topology. Although this article is specific to the AWS snapshot feature, you can use the same concepts with other instantaneous copy systems such as OpenZFS snapshot.

Prerequisites

The following process assumes you already have two DS 5.5 instances installed and running (Master 1 and Master 2), where both instances were installed using the --instancePath setup option. This option allows setup to separate the command line tools and runtime libraries from the instance data, instance libraries, configuration and log files. See How do I install DS/OpenDJ (All versions) so that the instance files are separate to the install files? for further information.

Both Master 1 (opendj-source.forgerock.com) and Master 2 use the same setup install and instance paths. You can verify this as shown in the following example:

master1/$ cat opendj/instance.loc 
/opt/instances/opendjdata

master1/$ ./status
...
...
Installation Path:        /opt/instances/opendj
Instance Path:            /opt/instances/opendjdata
Version:                  ForgeRock Directory Services 5.5.0


master2/$ cat opendj/instance.loc
/opt/instances/opendjdata

master2/$ ./status
...
...
Installation Path:        /opt/instances/opendj
Instance Path:            /opt/instances/opendjdata
Version:                  ForgeRock Directory Services 5.5.0

Using the AWS snapshot feature to create DS/OpenDJ instances

Warning

You cannot use this process to upgrade a server to a new version; you must upgrade as detailed in the Installation Guide.

This example process refers to the following DS/OpenDJ instances:

  • Master 1 - this is the source instance and has the following hostname: opendj-source.forgerock.com
  • Master 2 - this is the other existing instance.
  • Master 3 - this is the new instance you are creating and has the following hostname: opendj-new.forgerock.com

You can create new DS/OpenDJ instances as follows:

  1. Select the Master 1 instance to use in the snapshot process; this is the snapshot source (opendj-source.forgerock.com).
  2. Stop the Master 1 instance. You must stop the DS/OpenDJ instance before using the AWS snapshot feature; this guarantees that the backend databases have been shut down properly and that the lock region/data is recorded correctly. Failing to do so can corrupt the backend databases.
  3. Wait 5-10 seconds after the instance has shut down to allow the system to finish syncing the data to disk; failing to do so can corrupt the changelogDb.
  4. Use the AWS snapshot feature to instantly copy the Master 1 instance system. This snapshot is used as the template for the newly provisioned (auto-scaled) AWS system.
  5. Restart the Master 1 instance.
  6. Use the AWS auto-scaling system to provision the new system (Master 3) based on your snapshot; do not start this new Master 3 instance yet.
  7. Remove the entire installation path from the new Master 3 instance to prepare for the new configuration:
    $ rm -rf /opt/instances/opendj
    
  8. Remove all but the /opt/instances/opendjdata/db and /opt/instances/opendjdata/changelogDb directories from the instance path for the new Master 3 instance:
    $ cd /opt/instances/opendjdata
    $ rm -rf bak classes config import-tmp ldif legal-notices lib locks logs
    
    Your new Master 3 instance now only has the following elements left over from the source instance (Master 1):
    opendj; instances/$ ls -la opendj
    ls: opendj: No such file or directory
    
    opendj; instances/$ ls -la opendjdata
    total 0
    drwxr-xr-x   4 opendj  opendj   136 Jun  9 11:51 .
    drwxr-xr-x  66 opendj  opendj  2244 Jun  9 11:50 ..
    drwxr-xr-x   5 opendj  opendj   170 Jun  9 10:53 changelogDb
    drwxr-xr-x   3 opendj  opendj   102 Jun  9 09:49 db
    You are now ready to install and set up the new Master 3 instance.
  9. Install DS/OpenDJ in the instances path of the new Master 3 instance, for example:
    $ cd /opt/instances
    $ unzip DS-5.5.0.zip
    $ cd opendj
    
  10. In OpenDJ 2.6.x and 3.x only: Create a new instance.loc file in this location:
    $ echo /opt/instances/opendjdata > /opt/instances/opendj/instance.loc
    
  11. Set up the new Master 3 instance with no backend configuration using the setup command applicable to your version:
    • DS 5 and later:
      $ ./setup --instancePath /opt/instances/opendjdata --ldapPort 1389 --adminConnectorPort 4444 --rootUserDN "cn=Directory Manager" --rootUserPassword password --hostname opendj-new.forgerock.com --enableStartTLS --ldapsPort 1636 --acceptLicense
      
    • Pre-DS 5:
      $ ./setup --cli --ldapPort 1389 --adminConnectorPort 4444 --rootUserDN "cn=Directory Manager" --rootUserPassword password --hostname opendj-new.forgerock.com --enableStartTLS --ldapsPort 1636 --generateSelfSignedCertificate --noPropertiesFile --acceptLicense --no-prompt
      
    The Master 3 instance is set up so that:
    • the /opt/instances/opendj directory contains all the command line tools and runtime libraries, for example:
      opendj; cd /opt/instances/opendj
      
      opendj; opendj/$ ls -l
      total 88
      drwxr-xr-x@  3 opendj  opendj   102 Jan  8 16:30 QuickSetup.app
      -rw-r--r--@  1 opendj  opendj  1801 Jan  8 16:22 README
      drwxr-xr-x@  3 opendj  opendj   102 Jan  8 16:30 Uninstall.app
      drwxr-xr-x@ 31 opendj  opendj  1054 Jan  8 16:30 bat
      drwxr-xr-x@ 35 opendj  opendj  1190 Jun  9 12:38 bin
      -rw-r--r--   1 opendj  opendj   278 Jun  9 12:04 install
      -rw-r--r--   1 opendj  opendj    26 Jun  9 11:50 instance.loc
      drwxr-xr-x@  5 opendj  opendj   170 Jan  8 16:30 legal-notices
      drwxr-xr-x@ 66 opendj  opendj  2244 Jan  8 16:30 lib
      -rw-r--r--@  1 opendj  opendj  6501 Jan  8 16:22 opendj_logo.png
      -rwxr-xr-x@  1 opendj  opendj  1838 Jan  8 16:22 setup
      -rw-r--r--@  1 opendj  opendj  2504 Jan  8 16:22 setup.bat
      drwxr-xr-x@  3 opendj  opendj   102 Jan  8 16:30 snmp
      drwxr-xr-x@ 11 opendj  opendj   374 Jan  8 16:30 template
      -rwxr-xr-x@  1 opendj  opendj  1875 Jan  8 16:22 uninstall
      -rw-r--r--@  1 opendj  opendj  2109 Jan  8 16:22 uninstall.bat
      -rwxr-xr-x@  1 opendj  opendj  1754 Jan  8 16:22 upgrade
      -rw-r--r--@  1 opendj  opendj  1840 Jan  8 16:22 upgrade.bat
    • the /opt/instances/opendjdata directory contains the database backend, the changelogDb, instance libraries, logs etc, for example:
      opendj; cd /opt/instances/opendjdata
      
      opendj; opendjdata/$ ls -l
      total 0
      drwxr-xr-x   2 opendj  opendj   68 Jun  9 12:12 bak
      drwxr-xr-x   5 opendj  opendj  170 Jun  9 10:53 changelogDb
      drwxr-xr-x   2 opendj  opendj   68 Jun  9 12:12 classes
      drwxr-xr-x  26 opendj  opendj  884 Jun  9 12:12 config
      drwxr-xr-x   3 opendj  opendj  102 Jun  9 09:49 db
      drwxr-xr-x   2 opendj  opendj   68 Jun  9 12:12 import-tmp
      drwxr-xr-x   2 opendj  opendj   68 Jun  9 12:12 ldif
      drwxr-xr-x   3 opendj  opendj  102 Jun  9 12:12 legal-notices
      drwxr-xr-x   3 opendj  opendj  102 Jun  9 12:12 lib
      drwxr-xr-x   9 opendj  opendj  306 Jun  9 12:12 locks
      drwxr-xr-x   7 opendj  opendj  238 Jun  9 12:12 logs
      
Note

The instance libraries directory (/opt/instances/opendjdata/lib) can be used for installing support patches that should only exist for this instance.
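
The cleanup of the instance path (removing everything except the db and changelogDb directories) can also be written so that it does not depend on a fixed list of directory names. The following is a minimal sketch only: prune_instance_dir is a hypothetical helper, not a DS/OpenDJ tool, and it assumes the instance path contains no hidden files that need removing.

```shell
#!/bin/sh
# Hypothetical helper: remove everything in an instance directory except
# the db and changelogDb directories.
prune_instance_dir() {
  instance_dir=$1
  # Refuse to run without a real directory, so the glob can never hit "/".
  if [ -z "$instance_dir" ] || [ ! -d "$instance_dir" ]; then
    echo "usage: prune_instance_dir DIRECTORY" >&2
    return 1
  fi
  for entry in "$instance_dir"/*; do
    [ -e "$entry" ] || continue        # empty directory: the glob matched nothing
    case "${entry##*/}" in
      db|changelogDb) ;;               # keep the backend and changelog data
      *) rm -rf -- "$entry" ;;
    esac
  done
}

# Example, using the instance path from this article:
# prune_instance_dir /opt/instances/opendjdata
```

Run it only on the new Master 3 system, before installing DS/OpenDJ there.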

  1. Initialize (pre-warm) the EBS volume using the following command:
    $ fio --filename=/dev/xvdg --rw=randread --bs=256k --iodepth=64 --ioengine=libaio --direct=1 --name=volume-initialize --output=/tmp/prefetch-summary.log
    This step can take a long time but is necessary to prevent performance issues in the new Master 3 instance due to the way in which AWS builds a snapshot. AWS builds a snapshot by copying the data to S3 behind the scenes; once the snapshot is applied to an EBS volume, each data block is lazily loaded from S3 the first time it is accessed.
  2. Check the status of the new Master 3 instance; you will notice -No LDAP Databases Found- under Data Sources:
    $ ./status --bindDN "cn=Directory Manager" --bindPassword password
    
              --- Server Status ---
    Server Run Status:        Started
    Open Connections:         1
    
              --- Server Details ---
    Host Name:                opendj-new.forgerock.com
    Administrative Users:     cn=Directory Manager
    Installation Path:        /opt/instances/opendj
    Instance Path:            /opt/instances/opendjdata
    Version:                  ForgeRock Directory Services 5.5.0
    Java Version:             1.8.0_45
    Administration Connector: Port 4444 (LDAPS)
    
              --- Connection Handlers ---
    Address:Port : Protocol               : State
    -------------:------------------------:---------
    --           : LDIF                   : Disabled
    0.0.0.0:161  : SNMP                   : Disabled
    0.0.0.0:1389 : LDAP (allows StartTLS) : Enabled
    0.0.0.0:1636 : LDAPS                  : Enabled
    0.0.0.0:1689 : JMX                    : Disabled
    0.0.0.0:8080 : HTTP                   : Disabled
    
              --- Data Sources ---
    -No LDAP Databases Found-
  3. Add any implementation-specific configuration such as password policies and timeouts.
  4. Add schema to the instance, either by using ldapmodify or by copying the schema files across.
  5. Create the backend configuration using the existing /opt/instances/opendjdata path:
    $ ./dsconfig create-backend --set base-dn:dc=example,dc=com --set enabled:true --set db-directory:/opt/instances/opendjdata/db --type [je or "local-db"] --backend-name userRoot --hostname opendj-new.forgerock.com --port 4444 --trustAll --bindDN "cn=Directory Manager" --bindPassword password --no-prompt
    
    where --type is je for DS / OpenDJ 3.x or "local-db" for OpenDJ 2.6.x.
Note

The backend database is immediately available for use, but is not yet replicated. Do not change database entries in the backend until replication has been fully enabled.

You may notice the following messages in the DS/OpenDJ error log when the backend is created; these can be safely ignored:

[09/Jun/2016:12:24:16 -0600] category=PLUGGABLE severity=NOTICE msgID=org.opends.messages.backend.513 msg=The database backend userRoot containing 1000 entries has started
[09/Jun/2016:12:24:16 -0600] category=CORE severity=WARNING msgID=org.opends.messages.core.724 msg=The search filter "(|(objectClass=subentry)(objectClass=ldapSubentry))" used by subentry manager is not indexed in backend userRoot. Backend initialization for subentry manager processing might take a very long time to complete
[09/Jun/2016:12:24:17 -0600] category=CORE severity=WARNING msgID=org.opends.messages.core.660 msg=The search filter "(&(|(objectClass=groupOfNames)(objectClass=groupOfUniqueNames)(objectClass=groupOfEntries))(!(objectClass=ds-virtual-static-group)))" used by group implementation cn=Static,cn=Group Implementations,cn=config is not indexed in backend userRoot. Backend initialization for this group implementation might take a very long time to complete
[09/Jun/2016:12:24:17 -0600] category=CORE severity=WARNING msgID=org.opends.messages.core.660 msg=The search filter "(objectClass=ds-virtual-static-group)" used by group implementation cn=Virtual Static,cn=Group Implementations,cn=config is not indexed in backend userRoot. Backend initialization for this group implementation might take a very long time to complete
[09/Jun/2016:12:24:17 -0600] category=CORE severity=WARNING msgID=org.opends.messages.core.660 msg=The search filter "(objectClass=groupOfURLs)" used by group implementation cn=Dynamic,cn=Group Implementations,cn=config is not indexed in backend userRoot. Backend initialization for this group implementation might take a very long time to complete
[09/Jun/2016:12:24:17 -0600] category=ACCESS_CONTROL severity=WARNING msgID=org.opends.messages.access_control.96 msg=Backend userRoot does not have a presence index defined for attribute type aci. Access control initialization may take a very long time to complete in this backend
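
To confirm that nothing beyond these expected messages was logged, you could filter the error log for warnings and errors with other message IDs. The following is a minimal sketch only: unexpected_log_lines is a hypothetical helper, the list of ignorable msgIDs is taken from the output above, and the log file location in the usage comment is an assumption based on the instance path used in this article.

```shell
#!/bin/sh
# Hypothetical helper: read a server error log on stdin and print WARNING or
# ERROR lines whose msgID is not one of the three shown above.
unexpected_log_lines() {
  grep -E 'severity=(WARNING|ERROR)' |
    grep -v -E 'msgID=org\.opends\.messages\.(core\.(724|660)|access_control\.96) ' ||
    true   # an empty result is not a failure
}

# Example (assumed log location):
# unexpected_log_lines < /opt/instances/opendjdata/logs/errors
```

No output means that only the messages listed above were found.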
  1. Create any implementation-specific indexes and index configuration.
  2. Check the status of the new Master 3 instance again; you will now see details under Data Sources:
              --- Server Status ---
    Server Run Status:        Started
    Open Connections:         1
    
              --- Server Details ---
    Host Name:                opendj-new.forgerock.com
    Administrative Users:     cn=Directory Manager
    Installation Path:        /opt/instances/opendj
    Instance Path:            /opt/instances/opendjdata
    Version:                  ForgeRock Directory Services 5.5.0
    Java Version:             1.8.0_45
    Administration Connector: Port 4444 (LDAPS)
    
              --- Connection Handlers ---
    Address:Port : Protocol               : State
    -------------:------------------------:---------
    --           : LDIF                   : Disabled
    0.0.0.0:161  : SNMP                   : Disabled
    0.0.0.0:1389 : LDAP (allows StartTLS) : Enabled
    0.0.0.0:1636 : LDAPS                  : Enabled
    0.0.0.0:1689 : JMX                    : Disabled
    0.0.0.0:8080 : HTTP                   : Disabled
    
              --- Data Sources ---
    Base DN:     dc=example,dc=com
    Backend ID:  userRoot
    Entries:     1000
    Replication: 
    
  3. Restart the new Master 3 instance; this will clear up any errors previously seen in the DS/OpenDJ error log.
  4. Add this new Master 3 instance to the replication topology:
    • Log in to the Master 1 system that was used as the snapshot source.
    • Enable replication from the Master 1 instance to the new Master 3 instance using the dsreplication command applicable to your version:
      • DS 5 and later:
        $ ./dsreplication configure --adminUid admin --adminPassword password --baseDn dc=example,dc=com --host1 opendj-source.forgerock.com --port1 4444 --bindDn1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 opendj-new.forgerock.com --port2 4444 --bindDn2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 10989 --trustAll --no-prompt
        
      • Pre-DS 5:
        $ ./dsreplication enable --adminUID admin --adminPassword password --baseDN dc=example,dc=com --host1 opendj-source.forgerock.com --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 opendj-new.forgerock.com --port2 4444 --bindDN2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 10989 --trustAll --no-prompt
        
    The newly provisioned DS/OpenDJ instance (Master 3) is now ready to be placed into the load balancing pool.

Decommissioning a DS/OpenDJ instance

The above process can be used to auto-provision new DS/OpenDJ instances and add them to the replication topology. When removing an unused instance, you must disable replication on the DS/OpenDJ instance being decommissioned before you deprovision the AWS system. Failing to do so leaves configuration elements on the remaining servers, which causes errors for commands such as dsreplication status and may affect replication and its performance.

You can disable replication as follows:

  • DS 5 and later:
    $ ./dsreplication unconfigure --unconfigureAll --port 4444 --hostname opendj-newX.forgerock.com --bindDn "cn=Directory Manager" --adminPassword password --trustAll --no-prompt
  • Pre-DS 5:
    $ ./dsreplication disable --disableAll --port 4444 --hostname opendj-newX.forgerock.com --bindDN "cn=Directory Manager" --adminPassword password --trustAll --no-prompt

See Also

How do I install DS/OpenDJ (All versions) so that the instance files are separate to the install files?

Creating an Amazon EBS Snapshot

Auto Scaling Groups

Scaling the Size of Your Auto Scaling Group

Initializing Amazon EBS Volumes

Related Training

N/A

Related Issue Tracker IDs

N/A


How do I configure DS/OpenDJ (All versions) to ensure accidentally deleted or changed data can be restored when replication is enabled?

The purpose of this article is to provide information on configuring DS/OpenDJ to provide a simple way of restoring accidentally deleted or changed data when replication is enabled.

Warning

Do not compress, tamper with, or otherwise alter changelog database files directly unless specifically instructed to do so by a qualified ForgeRock technical support engineer. External changes to changelog database files can render them unusable by the server. By default, changelog database files are located under the /path/to/ds/changelogDb directory.

Overview

Accidental deletions of data in DS/OpenDJ can be reverted in two ways:

  • The first way, described in this article, configures the replication changelog to record additional information about each change. This allows changes to be reverted at a very fine-grained level and with very little impact to the servers in the replication topology. However, reverting each change requires several manual steps.
  • The second way, described in How do I roll back an entire network of DS/OpenDJ (All versions) replicas to a previous backup?, uses the backup and restore tools. This is comparatively coarse as you can only restore up until a given backup and it does require that every replicating server is reinitialized.

Using the External Changelog

The External Changelog (cn=changelog) records all changes but you must configure it to record additional details if you want to use it for restoring deleted or changed data. You can then manually re-apply each change, for example, using ldapmodify.

Caution

Configuring the changelog to include this additional data will increase the size of the changelog; you must ensure you have sufficient disk space for this prior to making these changes. Information is kept in the changelog for three days by default; you can increase this retention period if required, although you should be aware that this will also increase the size of the changelog. The changelog is stored in the replication changes database (changelogDb directory).

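This article contains the following sections to guide you through this process:

  • Configuring the changelog
  • Retrieving deleted or changed data from the changelog
  • Finding change numbers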

Configuring the changelog

You can configure the changelog to ensure it contains sufficient information as follows:

  1. Enter the following command in your terminal window to record additional information for deleted data:
    $ ./dsconfig set-external-changelog-domain-prop --port 4444 --hostname ds1.example.com --bindDN "cn=Directory Manager" --bindPassword password --provider-name "Multimaster Synchronization" --domain-name dc=example,dc=com --add ecl-include-for-deletes:"*" --add ecl-include-for-deletes:"+"
    When an entry is deleted, the changelogEntry will now have an additional includedAttributes attribute that contains the encoded contents of the deleted entry.
  2. Enter the following command in your terminal window to record additional information for changed data:
    $ ./dsconfig set-external-changelog-domain-prop --port 4444 --hostname ds1.example.com --bindDN "cn=Directory Manager" --bindPassword password --provider-name "Multimaster Synchronization" --domain-name dc=example,dc=com --add ecl-include:"*" --add ecl-include:"+"
    When an entry is changed, the changelogEntry will now have an additional includedAttributes attribute that contains the encoded contents of the changed entry.

Retrieving deleted or changed data from the changelog

You can query your changelog for specific changes using a command similar to the following in your terminal window:

$ ./ldapsearch --hostname ds1.example.com --port 1389 --bindDN "cn=Directory Manager" --bindPassword password --baseDN cn=changelog --searchScope one "(changenumber>=100)" "*" "+"

Note

This example returns change 100 and all later changes; you can filter the search on other attributes, for example, changeTime.

The changelog shows a changeType attribute for each entry so you can identify if it resulted from a change or delete action. The original data that was changed or deleted is encoded in the includedAttributes attribute. You can decode this using a Base64 decoder (for example, the base64 program provided with DS/OpenDJ or http://www.base64decode.org/) to retrieve the original data.
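
As an illustration of the decoding step, the following sketch extracts and decodes includedAttributes values from ldapsearch output saved as LDIF. It is a minimal sketch only: decode_included_attributes is a hypothetical helper, it uses the operating system's base64 utility (assumed to accept --decode) rather than the tool shipped with DS/OpenDJ, and it assumes Base64 values are marked with a double colon and may be folded across lines as in standard LDIF.

```shell
#!/bin/sh
# Hypothetical helper: read LDIF on stdin and print the decoded contents of
# every base64-encoded includedAttributes value.
decode_included_attributes() {
  # Unfold LDIF continuation lines (a leading space continues the previous
  # line), keep only the includedAttributes values, then decode each one.
  awk '
    /^ / { line = line substr($0, 2); next }
    { if (line != "") print line; line = $0 }
    END { if (line != "") print line }
  ' | sed -n 's/^includedAttributes:: //p' | while IFS= read -r value; do
    printf '%s' "$value" | base64 --decode
    printf '\n'
  done
}

# Demonstration with made-up data: encode two attribute lines, wrap them in
# an includedAttributes line, then decode them again.
sample=$(printf 'cn: Babs Jensen\nsn: Jensen\n' | base64)
printf 'includedAttributes:: %s\n' "$sample" | decode_included_attributes
```

In practice you would redirect the ldapsearch output to a file and pass that file to the function on standard input.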

See How do I search and view the changelog records in DS/OpenDJ (All versions)? for further information on querying the changelog.

Finding change numbers

If you do not know the change number, you can use one of the following approaches to find it depending on which is the most appropriate to your setup:

Return the firstchangenumber and lastchangenumber attributes

You can use an ldapsearch command to return the firstchangenumber and lastchangenumber attributes. For example:

$ ./ldapsearch --port 1389 --hostname ds1.example.com --bindDN "cn=Directory Manager" --bindPassword password --baseDN "" --searchScope base '(objectclass=*)' firstchangenumber lastchangenumber

Example output:

dn:
firstchangenumber: 1
lastchangenumber: 6

Return a range of change numbers

You can use an ldapsearch command to return change numbers for a known range. For example:

$ ./ldapsearch --port 1389 --hostname ds1.example.com --bindDN "cn=Directory Manager" --bindPassword password --baseDN cn=changelog --searchScope one "(&(changeNumber>=2)(changeNumber<=5))"

Return change numbers for a specific change time

You can use an ldapsearch command to return change numbers with a known change time. For example:

$ ./ldapsearch --port 1389 --hostname ds1.example.com --bindDN "cn=Directory Manager" --bindPassword password --baseDN cn=changelog --searchScope one '(changeTime<=20180417224638Z)'
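
The changeTime value in the filter is a UTC timestamp written as year, month, day, hours, minutes and seconds followed by Z. The following sketch builds such a filter from a more readable timestamp; change_time_filter is a hypothetical helper and it assumes the input is already in UTC and formatted as YYYY-MM-DD HH:MM:SS.

```shell
#!/bin/sh
# Hypothetical helper: turn "YYYY-MM-DD HH:MM:SS" (UTC) into a changeTime
# filter like the one used in the example above.
change_time_filter() {
  stamp=$(printf '%s' "$1" | tr -d ' :-')   # drop spaces, colons and hyphens
  printf '(changeTime<=%sZ)\n' "$stamp"
}

change_time_filter "2018-04-17 22:46:38"
# prints (changeTime<=20180417224638Z)
```

You can pass the resulting string as the filter argument to ldapsearch, quoted so that the shell does not interpret the parentheses.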

Return all change numbers

You can use an ldapsearch command to check the changelog status. For example:

$ ./ldapsearch --port 1389 --hostname ds1.example.com --bindDN "cn=Directory Manager" --bindPassword password --baseDN cn=changelog --searchScope one changelog=*

Caution

The above ldapsearch command provides detailed output of your changelog history. If you make extensive or frequent changes to your directory server, checking the entire changelog can produce a substantial amount of data.

See Also

How do I understand the changelogDb directory in DS/OpenDJ (All versions)?

How do I restore old backup data to a DS/OpenDJ (All versions) replication topology?

How do I control how long replication changes are retained in DS/OpenDJ (All versions)?

Replication in DS/OpenDJ

Administration Guide › Managing Data Replication › Configuring Replication

Administration Guide › To Include Unchanged Attributes in the External Change Log

Administration Guide › Tools Reference › dsconfig

Administration Guide › Tools Reference › ldapsearch

Administration Guide › Tools Reference › ldapmodify

Related Training

ForgeRock Directory Services Core Concepts

Related Issue Tracker IDs

N/A


How do I control how long replication changes are retained in DS/OpenDJ (All versions)?

The purpose of this article is to provide assistance on changing the replication purge delay setting to control how long replication changes are retained in DS/OpenDJ. Replication changes are saved to the External Changelog (cn=changelog) and are kept for three days by default. Changing this setting will affect the size of the changelog. The changelog is stored in the replication changes database (changelogDb directory).

Warning

Do not compress, tamper with, or otherwise alter changelog database files directly unless specifically instructed to do so by a qualified ForgeRock technical support engineer. External changes to changelog database files can render them unusable by the server. By default, changelog database files are located under the /path/to/ds/changelogDb directory.

Changing the replication purge delay

You can change the replication purge delay as follows in your terminal window:

  1. Enter the following command:
    $ ./dsconfig --port 4444 --hostname ds1.example.com --bindDN "cn=Directory Manager" --bindPassword password --no-prompt set-replication-server-prop --provider-name "Multimaster Synchronization" --set replication-purge-delay:[timeperiod]
    Replace [timeperiod] with an appropriate value that consists of a number and a letter, for example, 5d for five days or 1w for one week.

Caution

Information about replication changes is permanently lost once it has been purged from the changelog; you must set the purge delay so that data is kept long enough for replication and data recovery purposes.
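
When comparing a candidate value against your recovery requirements, it can help to normalize it to days. The following is a minimal sketch only: purge_delay_days is a hypothetical helper that understands just the two forms shown above (a number followed by d or w); it does not validate every unit the server may accept.

```shell
#!/bin/sh
# Hypothetical helper: convert a purge delay such as 5d or 1w to days.
purge_delay_days() {
  value=$1
  number=${value%[dw]}            # strip a trailing d or w, if present
  case "$number" in
    ''|0?*|*[!0-9]*)              # empty, leading zero or non-numeric
      echo "unsupported value: $value" >&2; return 1 ;;
  esac
  case "$value" in
    *d) echo "$number" ;;
    *w) echo $((number * 7)) ;;
    *)  echo "unsupported value: $value" >&2; return 1 ;;
  esac
}

purge_delay_days 5d   # prints 5
purge_delay_days 1w   # prints 7
```

For example, a value of 1w keeps changes for more than twice as long as the three-day default, with a corresponding increase in changelog size.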

See Also

How do I configure DS/OpenDJ (All versions) to ensure accidentally deleted or changed data can be restored when replication is enabled?

FAQ: Backup and restore in DS/OpenDJ

Related Training

ForgeRock Directory Services Core Concepts 

Related Issue Tracker IDs

N/A


How do I enable the External Change Log on a single DS/OpenDJ (All versions) server?

The purpose of this article is to provide information on enabling the External Change Log (cn=changelog) on a single DS/OpenDJ server. You may want to do this for testing purposes or if you want the changes to be available to a third-party system for synchronization purposes.

Warning

Do not compress, tamper with, or otherwise alter changelog database files directly unless specifically instructed to do so by a qualified ForgeRock technical support engineer. External changes to changelog database files can render them unusable by the server. By default, changelog database files are located under the /path/to/ds/changelogDb directory.

Enabling the external change log on a single DS/OpenDJ server

Note

If you want to enable replication across multiple servers, you should use dsreplication configure (DS 5 and later) or dsreplication enable (pre-DS 5) instead as described in Reference › Tools Reference › dsreplication configure and OpenDJ Reference › Tools Reference › dsreplication enable. When you enable replication across multiple servers, the external change log is enabled by default. See Administration Guide › Managing Data Replication › To Enable the External Change Log for further information.

You can enable the external change log on a single DS/OpenDJ server as follows:

  1. Configure your server as a replication server using the dsconfig create-replication-server command. For example:
    $ ./dsconfig create-replication-server --provider-name "Multimaster Synchronization" --set replication-port:8989 --set replication-server-id:2 --type generic --hostName ldap.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --no-prompt --trustAll
    
  2. Create the replication domain for your server using the dsconfig create-replication-domain command. For example:
    $ ./dsconfig create-replication-domain --provider-name "Multimaster Synchronization" --set base-dn:ou=people,dc=example,dc=com --set replication-server:localhost:8989 --set server-id:3 --type generic --domain-name example_com --hostName ldap.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --no-prompt --trustAll
    

Searching the external change log

Once the external change log is enabled, you can search it using the ldapsearch command. You should include the ECL Cookie Control (1.3.6.1.4.1.26027.1.5.4) as this is the optimized way to search the external change log. The following example command also includes ; in the Cookie Control; this indicates the cookie is unknown, which means the search starts from the first change. You can replace this with the string value associated with the last change received if you want to continue from that point. You can find this string value from the ChangeLogCookie operational attribute (if requested) or the comment before the change itself. See Administration Guide › Managing Data Replication › To Use the External Change Log for further information.

For example:

$ ./ldapsearch --bindDN "cn=Directory Manager" --bindPassword password --hostName ldap.example.com --port 1389 --control "1.3.6.1.4.1.26027.1.5.4:false:;" --baseDN "cn=changelog" '(objectclass=*)'
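
The value passed to --control has three parts separated by colons: the control OID, the criticality (false in the example above) and the cookie. The following sketch assembles that value so that a saved cookie can be substituted for the initial ; placeholder. It is a minimal sketch only: ecl_cookie_control is a hypothetical helper and the cookie string in the usage comment is made up for illustration.

```shell
#!/bin/sh
# OID of the ECL Cookie Control, as given in the text above.
ECL_COOKIE_OID="1.3.6.1.4.1.26027.1.5.4"

# Hypothetical helper: build the --control value for an ECL search.
# With no argument the cookie is ";", meaning "start from the first change".
ecl_cookie_control() {
  cookie=${1:-";"}
  printf '%s:false:%s\n' "$ECL_COOKIE_OID" "$cookie"
}

ecl_cookie_control
# prints 1.3.6.1.4.1.26027.1.5.4:false:;

# Resume from a saved cookie (made-up value for illustration):
# ecl_cookie_control "my-saved-cookie;"
```

Always quote the result when passing it to ldapsearch, because cookies contain characters such as ; that the shell would otherwise interpret.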

See Also

Replication in DS/OpenDJ

Administration Guide › Managing Data Replication › Change Notification For Your Applications

Configuration Reference › create-replication-server

Configuration Reference › delete-replication-domain

Related Training

N/A

Related Issue Tracker IDs

N/A


How do I reset the cn=changelog changeNumber in DS/OpenDJ (All versions)?

The purpose of this article is to provide assistance if you need to reset the cn=changelog changeNumber in DS/OpenDJ. You only need to reset a changeNumber if one becomes out of sync across multiple DS/OpenDJ servers.

Warning

Do not compress, tamper with, or otherwise alter changelog database files directly unless specifically instructed to do so by a qualified ForgeRock technical support engineer. External changes to changelog database files can render them unusable by the server. By default, changelog database files are located under the /path/to/ds/changelogDb directory.

Reset changeNumber (DS / OpenDJ 3.x)

In DS / OpenDJ 3.x, you can reset the changeNumber on one server with the changeNumber from another using a dsreplication command such as the following:

$ ./dsreplication reset-change-number --adminUID admin --adminPassword password --hostSource ds1.example.com --portSource 4444 --hostDestination ds2.example.com --portDestination 4444 --trustAll --no-prompt

Alternatively, you can set the changeNumber to an exact number if needed, for example:

$ ./dsreplication reset-change-number --adminUID admin --adminPassword password --hostSource ds1.example.com --portSource 4444 --hostDestination ds2.example.com --portDestination 4444 --change-number 1234 --trustAll --no-prompt

See Reference › Tools Reference › dsreplication reset-change-number for further information.

Reset changeNumber (OpenDJ 2.6.x)

In OpenDJ 2.6.x, you need to follow a manual process to reset the changeNumber. You can either reset it to 0 or to a known changeNumber on a different server.

Caution

You should test this procedure in a development environment first and ensure you have up to date backups in case you encounter any issues.

Reset to 0

  1. Choose the most current Master (Master 1 in this example).
  2. Send all traffic to Master 1.
  3. Check that all changes have been replicated. You can do this using the dsreplication status command:
    $ ./dsreplication status --adminUID admin --adminPassword password --hostname ds1.example.com --port 4444 --trustAll
  4. Stop replication temporarily on Master 1:
    $ ./dsconfig set-synchronization-provider-prop --port 4444 --hostname ds1.example.com --bindDN "cn=Directory Manager" --bindPassword password --provider-name "Multimaster Synchronization" --set enabled:false --trustAll --no-prompt
  5. Stop replication on one of the OpenDJ servers:
    $ ./dsreplication disable --disableAll --port 4444 --hostname ds2.example.com --bindDN "cn=Directory Manager" --adminPassword password --trustAll --no-prompt
  6. Stop this OpenDJ server:
    $ ./stop-ds
  7. Remove the changelogDb directory from this OpenDJ server:
    $ rm -rf /path/to/ds/changelogDb
  8. Restart this OpenDJ server:
    $ ./start-ds
  9. Re-enable replication on this server, where Master 1 is the source of the replicated data:
    $ ./dsreplication enable --adminUID admin --adminPassword password --baseDN dc=example,dc=com  --host1 ds2.example.com --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 ds1.example.com --port2 4444 --bindDN2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 8989 --trustAll --no-prompt
    If replication does not flow and the cn=changelog is not populated, restart OpenDJ using the following command instead:
    $ ./stop-ds --restart
  10. Repeat steps 5 to 9 on all servers.

Reset to a known changeNumber

  1. Choose the server that has the changeNumber you want your applications to keep (Master 1 in this example).
  2. Send all traffic to Master 1.
  3. Check that all changes have been replicated. You can do this using the dsreplication status command:
    $ ./dsreplication status --adminUID admin --adminPassword password --hostname ds1.example.com --port 4444 --trustAll
  4. Stop replication on the OpenDJ server that needs its changeNumber reset:
    $ ./dsreplication disable --disableAll --port 4444 --hostname ds2.example.com --bindDN "cn=Directory Manager" --adminPassword password --trustAll --no-prompt
  5. Stop this OpenDJ server:
    $ ./stop-ds
  6. Remove the changelogDb directory from this OpenDJ server:
    $ rm -rf /path/to/ds/changelogDb
  7. Restart this OpenDJ server:
    $ ./start-ds
  8. Re-enable replication on this server to add it back into the replication topology, where Master 1 is the source of the replicated data:
    $ ./dsreplication enable --adminUID admin --adminPassword password --baseDN dc=example,dc=com  --host1 ds2.example.com --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 ds1.example.com --port2 4444 --bindDN2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 8989 --trustAll --no-prompt
    If replication does not flow and the cn=changelog is not populated, restart OpenDJ using the following command instead:
    $ ./stop-ds --restart
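
Both procedures ask you to check that all changes have been replicated before you stop replication. The following sketch shows one way to read that from saved dsreplication status output. It assumes the colon-separated table layout shown in the sample status output elsewhere in this book, with M.C. (missing changes) in the eighth column; the function names missing_changes and in_sync are hypothetical, and you should confirm the column layout against the output of your own version before relying on it.

```shell
#!/bin/sh
# missing_changes (hypothetical helper)
# Reads dsreplication status output on stdin and prints
# "<server> <M.C.>" for each data row.
# Assumption: columns are separated by " : " and M.C. is column 8.
# Lines with fewer than 8 columns (date line, separator row) are skipped.
missing_changes() {
    awk -F ' *: +' 'NF >= 8 && $1 != "Suffix DN" { print $2, $8 }'
}

# in_sync (hypothetical helper)
# Succeeds only if at least one data row was found and every row
# reports 0 missing changes. An empty M.C. value counts as not in sync.
in_sync() {
    missing_changes | awk '
        { rows++; if ($2 != "0") bad++ }
        END { exit (rows > 0 && bad + 0 == 0) ? 0 : 1 }'
}
```

For example, you could save the output of the dsreplication status command to a file and run in_sync < status.txt before continuing; a non-zero exit status suggests waiting and checking again.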

See Also

How do I control how long replication changes are retained in DS/OpenDJ (All versions)?

How do I configure DS/OpenDJ (All versions) to ensure accidentally deleted or changed data can be restored when replication is enabled?

FAQ: Backup and restore in DS/OpenDJ

Administration Guide › Managing Data Replication

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-1937 (Replication draft change log changeNumber attribute should be synchronized with other RSs during initialization)


How do I configure an external or CA Signed certificate for replication in DS/OpenDJ (All versions)?

The purpose of this article is to provide assistance with configuring an external or CA Signed certificate for replication in DS/OpenDJ. This allows you to use a certificate other than a self-signed one for increased security. This article assumes replication is not enabled.

Overview

Caution

The following process assumes that replication is not enabled and that the instance only has the default self-signed ads-truststore certificates created by DS/OpenDJ.

In summary, the steps are:

  1. Delete existing replication instance keys.
  2. Delete existing ads-truststore certificates (or create a blank keystore).
  3. Generate a new keypair.
  4. Generate a certificate signing request.
  5. Export and import the certificate to create a trusted cert entry.
  6. Have the CA sign the certificate request.
  7. Import the CA certificate chain.
  8. Import the signed certificate.
  9. Enable replication.

Configuring an external or CA Signed certificate for replication

This process refers to two masters (Master 1 and Master 2).

You can configure an external or CA Signed certificate for replication as follows:

  1. Search for and delete the instance key in cn=admin data that matches the MD5 hash string of the ads-certificate held within the ads-truststore; this can be done on any instance:
    $ keytool -list -storetype jks -keystore ads-truststore -v -storepass `cat ads-truststore.pin`
    
    Keystore type: JKS
    Keystore provider: SUN
    
    Your keystore contains 2 entries
    
    Alias name: ads-certificate
    Creation date: Aug 9, 2016
    Entry type: PrivateKeyEntry
    Certificate chain length: 1
    Certificate[1]:
    Owner: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Issuer: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Serial number: 207149be
    Valid from: Tue Aug 09 17:17:34 MDT 2016 until: Mon Aug 04 17:17:34 MDT 2036
    Certificate fingerprints:
         MD5:  0A:BC:37:0A:0A:1B:C8:B0:EB:C4:A2:91:E1:86:05:36
         SHA1: E2:BC:6A:7C:A1:F1:71:EE:ED:01:98:25:0E:9D:3D:B8:1C:8C:AD:9C
         SHA256: 09:C1:2A:A6:40:B9:4B:49:CC:B0:FA:3F:86:C6:9E:E2:BB:C7:F1:B3:C1:21:81:54:94:21:C4:84:A5:4C:8F:98
         Signature algorithm name: SHA1withRSA
         Version: 3
    
    
    *******************************************
    *******************************************
    
    
    Alias name: 0abc370a0a1bc8b0ebc4a291e1860536
    Creation date: Aug 9, 2016
    Entry type: trustedCertEntry
    
    Owner: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Issuer: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Serial number: 207149be
    Valid from: Tue Aug 09 17:17:34 MDT 2016 until: Mon Aug 04 17:17:34 MDT 2036
    Certificate fingerprints:
         MD5:  0A:BC:37:0A:0A:1B:C8:B0:EB:C4:A2:91:E1:86:05:36
         SHA1: E2:BC:6A:7C:A1:F1:71:EE:ED:01:98:25:0E:9D:3D:B8:1C:8C:AD:9C
         SHA256: 09:C1:2A:A6:40:B9:4B:49:CC:B0:FA:3F:86:C6:9E:E2:BB:C7:F1:B3:C1:21:81:54:94:21:C4:84:A5:4C:8F:98
         Signature algorithm name: SHA1withRSA
         Version: 3
    
    
    *******************************************
    *******************************************
    
    The above MD5 hash is 'MD5: 0A:BC:37:0A:0A:1B:C8:B0:EB:C4:A2:91:E1:86:05:36'; if you remove the colons (:) from the hash, you can determine the instance key (ds-cfg-key-id) this matches in the cn=admin data backend:
    dn: ds-cfg-key-id=0ABC370A0A1BC8B0EBC4A291E1860536,cn=instance keys,cn=admin data
    objectClass: top
    objectClass: ds-cfg-instance-key
    ds-cfg-public-key-certificate;binary:: MIIC/DCCAeSgAwIBAgIEIHFJvjANBgkqhkiG9w0BAQUFADBAMR8wHQYDVQQKExZPc<SNIP>
    ds-cfg-key-id: 0ABC370A0A1BC8B0EBC4A291E1860536
    createTimestamp: 20160809231735Z
    creatorsName: cn=Internal Client,cn=Root DNs,cn=config
    entryUUID: c2106051-7b4b-4188-adcd-01cf08f3f268
    
    You can now delete this instance key:
    $ ./ldapdelete --port 1389 --bindDN "cn=Directory Manager" --bindPassword password "ds-cfg-key-id=0ABC370A0A1BC8B0EBC4A291E1860536,cn=instance keys,cn=admin data"
    
    Processing DELETE request for ds-cfg-key-id=0ABC370A0A1BC8B0EBC4A291E1860536,cn=instance keys,cn=admin data
    DELETE operation successful for DN ds-cfg-key-id=0ABC370A0A1BC8B0EBC4A291E1860536,cn=instance keys,cn=admin data
    
  2. Delete the existing ads-certificate PrivateKeyEntry and its corresponding trustedCertEntry (shown in the above keytool -list output) or create a blank ads-truststore on the instance you want to configure with the external or CA Signed certificate:
    • Option 1: Delete existing certificates:
      $ keytool -delete -alias ads-certificate -storetype jks -keystore ads-truststore -v -storepass `cat ads-truststore.pin`
      [Storing ads-truststore]
      
      $ keytool -delete -alias 0abc370a0a1bc8b0ebc4a291e1860536 -storetype jks -keystore ads-truststore -v -storepass `cat ads-truststore.pin`
      [Storing ads-truststore]
      
      $ keytool -list -storetype jks -keystore ads-truststore -v -storepass `cat ads-truststore.pin`
      
      Keystore type: JKS
      Keystore provider: SUN
      
      Your keystore contains 0 entries
      
    • Option 2: Create a blank ads-truststore:
      $ rm ads-truststore
      
      $ keytool -genkey -alias foo -keystore ads-truststore -storepass `cat ads-truststore.pin` -keypass `cat ads-truststore.pin` -dname "CN=ds1.example.com, O=OpenDJ RSA Certificate"
      
      $ keytool -delete -alias foo -keystore ads-truststore -storepass `cat ads-truststore.pin` -keypass `cat ads-truststore.pin`
      
  3. Create new PrivateKeyEntry and trustedCertEntry truststore entries on Master 1 (CN = ds1.example.com):
    $ keytool -genkeypair -alias ads-certificate -keyalg RSA -validity 7300 -keysize 2048 -storetype JKS -keystore ads-truststore -storepass `cat ads-truststore.pin` -keypass `cat ads-truststore.pin` -dname "CN=ds1.example.com, O=OpenDJ RSA Certificate"
    
    You can check the certificate in the ads-truststore; it should now look like the following:
    $ keytool -list -storetype jks -keystore ads-truststore -v -storepass `cat ads-truststore.pin`
    
    Keystore type: JKS
    Keystore provider: SUN
    
    Your keystore contains 1 entry
    
    Alias name: ads-certificate
    Creation date: Aug 12, 2016
    Entry type: PrivateKeyEntry
    Certificate chain length: 1
    Certificate[1]:
    Owner: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Issuer: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Serial number: 24a0d8df
    Valid from: Fri Aug 12 12:01:27 MDT 2016 until: Thu Aug 07 12:01:27 MDT 2036
    Certificate fingerprints:
         MD5:  AF:43:1B:F5:BB:3A:41:FD:83:AF:F2:13:FE:B1:62:3A
         SHA1: 65:65:04:C7:68:2D:C7:EC:41:FA:8C:C7:61:E0:E3:3A:CE:1A:D7:C0
         SHA256: 88:B9:95:81:A8:7F:DB:62:CD:4A:04:8A:1D:C8:F2:B9:56:6E:6F:66:9F:1D:D1:FE:45:FA:AA:91:17:C5:EB:EB
         Signature algorithm name: SHA256withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 77 1E 72 73 87 23 24 D5   E1 E9 71 29 16 33 04 3C  w.rs.#$...q).3.<
    0010: 3A 4D 46 64                                        :MFd
    ]
    ]
    
    
    *******************************************
    *******************************************
    
  4. Create a Certificate Signing Request from the above certificate on Master 1:
    $ keytool -certreq -alias ads-certificate -keystore ads-truststore -storepass `cat ads-truststore.pin` -keypass `cat ads-truststore.pin` -file newreq.pem
    
  5. Export and import the certificate to create a trusted certificate entry on Master 1. The trusted certificate entry requires a certificate alias that is the MD5 fingerprint in lowercase with no colons (:), so keytool -printcert, awk, tr and sed are used to derive it from the exported certificate:
    $ keytool -export -alias ads-certificate -keystore ads-truststore -storepass `cat ads-truststore.pin` -keypass `cat ads-truststore.pin` -file ads-cert.crt
    Certificate stored in file <ads-cert.crt>
    
    $ export md5hash=`keytool -printcert -file ads-cert.crt | grep MD5 | awk '{print $2}' | tr '[:upper:]' '[:lower:]' | sed "s/://g"`
    
    $ keytool -import -trustcacerts -alias $md5hash -keystore ads-truststore -storepass `cat ads-truststore.pin` -keypass `cat ads-truststore.pin` -file ads-cert.crt
    Certificate already exists in keystore under alias <ads-certificate>
    Do you still want to add it? [no]:  yes
    Certificate was added to keystore
    
    You can re-check the certificate in the ads-truststore; it should now look like the following:
    $ keytool -list -storetype jks -keystore ads-truststore -v -storepass `cat ads-truststore.pin`
    
    Keystore type: JKS
    Keystore provider: SUN
    
    Your keystore contains 2 entries
    
    Alias name: ads-certificate
    Creation date: Aug 12, 2016
    Entry type: PrivateKeyEntry
    Certificate chain length: 1
    Certificate[1]:
    Owner: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Issuer: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Serial number: 24a0d8df
    Valid from: Fri Aug 12 12:01:27 MDT 2016 until: Thu Aug 07 12:01:27 MDT 2036
    Certificate fingerprints:
         MD5:  AF:43:1B:F5:BB:3A:41:FD:83:AF:F2:13:FE:B1:62:3A
         SHA1: 65:65:04:C7:68:2D:C7:EC:41:FA:8C:C7:61:E0:E3:3A:CE:1A:D7:C0
         SHA256: 88:B9:95:81:A8:7F:DB:62:CD:4A:04:8A:1D:C8:F2:B9:56:6E:6F:66:9F:1D:D1:FE:45:FA:AA:91:17:C5:EB:EB
         Signature algorithm name: SHA256withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 77 1E 72 73 87 23 24 D5   E1 E9 71 29 16 33 04 3C  w.rs.#$...q).3.<
    0010: 3A 4D 46 64                                        :MFd
    ]
    ]
    
    
    *******************************************
    *******************************************
    
    
    Alias name: af431bf5bb3a41fd83aff213feb1623a
    Creation date: Aug 12, 2016
    Entry type: trustedCertEntry
    
    Owner: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Issuer: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Serial number: 24a0d8df
    Valid from: Fri Aug 12 12:01:27 MDT 2016 until: Thu Aug 07 12:01:27 MDT 2036
    Certificate fingerprints:
         MD5:  AF:43:1B:F5:BB:3A:41:FD:83:AF:F2:13:FE:B1:62:3A
         SHA1: 65:65:04:C7:68:2D:C7:EC:41:FA:8C:C7:61:E0:E3:3A:CE:1A:D7:C0
         SHA256: 88:B9:95:81:A8:7F:DB:62:CD:4A:04:8A:1D:C8:F2:B9:56:6E:6F:66:9F:1D:D1:FE:45:FA:AA:91:17:C5:EB:EB
         Signature algorithm name: SHA256withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 77 1E 72 73 87 23 24 D5   E1 E9 71 29 16 33 04 3C  w.rs.#$...q).3.<
    0010: 3A 4D 46 64                                        :MFd
    ]
    ]
    
    
    *******************************************
    *******************************************
    
  6. Import the CA certificate chain on Master 1:
    $ keytool -import -alias ca-cert -keystore ads-truststore -storepass `cat ads-truststore.pin` -keypass `cat ads-truststore.pin` -file cacert.pem
    
    Owner: CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US
    Issuer: CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US
    Serial number: 92d5b8cc173128b4
    Valid from: Wed Jul 15 10:48:11 MDT 2015 until: Sat Jul 14 10:48:11 MDT 2018
    Certificate fingerprints:
         MD5:  DC:3E:00:6B:AE:D7:76:AC:D2:A1:84:E4:C3:02:AD:C1
         SHA1: C8:9D:B3:31:76:DE:88:42:57:66:63:02:7D:87:8A:EF:29:25:FB:94
         SHA256: ED:1C:30:8F:66:74:AB:95:79:D7:ED:C7:84:05:79:30:B8:3F:3F:77:90:BF:68:98:C4:77:B9:99:1E:10:FE:79
         Signature algorithm name: SHA1withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.35 Criticality=false
    AuthorityKeyIdentifier [
    KeyIdentifier [
    0000: C1 DE 0D A3 D6 E8 0D 7F   58 35 13 6A C4 06 ED 20  ........X5.j... 
    0010: C5 C6 E9 EB                                        ....
    ]
    [CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US]
    SerialNumber: [    92d5b8cc 173128b4]
    ]
    
    #2: ObjectId: 2.5.29.19 Criticality=false
    BasicConstraints:[
      CA:true
      PathLen:2147483647
    ]
    
    #3: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: C1 DE 0D A3 D6 E8 0D 7F   58 35 13 6A C4 06 ED 20  ........X5.j... 
    0010: C5 C6 E9 EB                                        ....
    ]
    ]
    
    Trust this certificate? [no]:  yes
    Certificate was added to keystore
    
    You can re-check the certificate in the ads-truststore; it should now look like the following, where the Owner and Issuer have the same value for the ads-certificate:
    $ keytool -list -storetype jks -keystore ads-truststore -v -storepass `cat ads-truststore.pin`
    
    Keystore type: JKS
    Keystore provider: SUN
    
    Your keystore contains 3 entries
    
    Alias name: af431bf5bb3a41fd83aff213feb1623a
    Creation date: Aug 12, 2016
    Entry type: trustedCertEntry
    
    Owner: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Issuer: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Serial number: 24a0d8df
    Valid from: Fri Aug 12 12:01:27 MDT 2016 until: Thu Aug 07 12:01:27 MDT 2036
    Certificate fingerprints:
         MD5:  AF:43:1B:F5:BB:3A:41:FD:83:AF:F2:13:FE:B1:62:3A
         SHA1: 65:65:04:C7:68:2D:C7:EC:41:FA:8C:C7:61:E0:E3:3A:CE:1A:D7:C0
         SHA256: 88:B9:95:81:A8:7F:DB:62:CD:4A:04:8A:1D:C8:F2:B9:56:6E:6F:66:9F:1D:D1:FE:45:FA:AA:91:17:C5:EB:EB
         Signature algorithm name: SHA256withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 77 1E 72 73 87 23 24 D5   E1 E9 71 29 16 33 04 3C  w.rs.#$...q).3.<
    0010: 3A 4D 46 64                                        :MFd
    ]
    ]
    
    
    *******************************************
    *******************************************
    
    
    Alias name: ads-certificate
    Creation date: Aug 12, 2016
    Entry type: PrivateKeyEntry
    Certificate chain length: 1
    Certificate[1]:
    Owner: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Issuer: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Serial number: 24a0d8df
    Valid from: Fri Aug 12 12:01:27 MDT 2016 until: Thu Aug 07 12:01:27 MDT 2036
    Certificate fingerprints:
         MD5:  AF:43:1B:F5:BB:3A:41:FD:83:AF:F2:13:FE:B1:62:3A
         SHA1: 65:65:04:C7:68:2D:C7:EC:41:FA:8C:C7:61:E0:E3:3A:CE:1A:D7:C0
         SHA256: 88:B9:95:81:A8:7F:DB:62:CD:4A:04:8A:1D:C8:F2:B9:56:6E:6F:66:9F:1D:D1:FE:45:FA:AA:91:17:C5:EB:EB
         Signature algorithm name: SHA256withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 77 1E 72 73 87 23 24 D5   E1 E9 71 29 16 33 04 3C  w.rs.#$...q).3.<
    0010: 3A 4D 46 64                                        :MFd
    ]
    ]
    
    
    *******************************************
    *******************************************
    
    
    Alias name: ca-cert
    Creation date: Aug 12, 2016
    Entry type: trustedCertEntry
    
    Owner: CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US
    Issuer: CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US
    Serial number: 92d5b8cc173128b4
    Valid from: Wed Jul 15 10:48:11 MDT 2015 until: Sat Jul 14 10:48:11 MDT 2018
    Certificate fingerprints:
         MD5:  DC:3E:00:6B:AE:D7:76:AC:D2:A1:84:E4:C3:02:AD:C1
         SHA1: C8:9D:B3:31:76:DE:88:42:57:66:63:02:7D:87:8A:EF:29:25:FB:94
         SHA256: ED:1C:30:8F:66:74:AB:95:79:D7:ED:C7:84:05:79:30:B8:3F:3F:77:90:BF:68:98:C4:77:B9:99:1E:10:FE:79
         Signature algorithm name: SHA1withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.35 Criticality=false
    AuthorityKeyIdentifier [
    KeyIdentifier [
    0000: C1 DE 0D A3 D6 E8 0D 7F   58 35 13 6A C4 06 ED 20  ........X5.j... 
    0010: C5 C6 E9 EB                                        ....
    ]
    [CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US]
    SerialNumber: [    92d5b8cc 173128b4]
    ]
    
    #2: ObjectId: 2.5.29.19 Criticality=false
    BasicConstraints:[
      CA:true
      PathLen:2147483647
    ]
    
    #3: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: C1 DE 0D A3 D6 E8 0D 7F   58 35 13 6A C4 06 ED 20  ........X5.j... 
    0010: C5 C6 E9 EB                                        ....
    ]
    ]
    
    
    *******************************************
    *******************************************
    
  7. Import the signed certificate on Master 1:
    $ keytool -import -trustcacerts -alias ads-certificate -keystore ads-truststore -storepass `cat ads-truststore.pin` -keypass `cat ads-truststore.pin` -file newcert.pem
    
    Certificate reply was installed in keystore
    
    You can re-check the certificate in the ads-truststore; it should now look like the following, where the Owner and Issuer have different values. The Issuer is now from the Certificate Authority for the ads-certificate.
    $ keytool -list -storetype jks -keystore ads-truststore -v -storepass `cat ads-truststore.pin`
    
    Keystore type: JKS
    Keystore provider: SUN
    
    Your keystore contains 3 entries
    
    Alias name: ads-certificate
    Creation date: Aug 12, 2016
    Entry type: PrivateKeyEntry
    Certificate chain length: 2
    Certificate[1]:
    Owner: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Issuer: CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US
    Serial number: 92d5b8cc173128b9
    Valid from: Fri Aug 12 12:04:21 MDT 2016 until: Sat Aug 12 12:04:21 MDT 2017
    Certificate fingerprints:
         MD5:  1E:5C:27:F6:2F:E2:1A:77:E7:CB:4C:12:A7:AB:08:4A
         SHA1: 19:F9:01:D1:26:DA:C9:5C:F5:F8:63:A9:E9:66:72:C1:1C:74:86:F3
         SHA256: 86:AA:2C:10:3D:FE:28:C9:8D:96:C5:EF:8E:1B:6C:9C:CB:E3:18:6A:C1:20:CE:A9:B7:6F:BC:8E:92:F5:83:47
         Signature algorithm name: SHA1withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.16.840.1.113730.1.13 Criticality=false
    0000: 16 1D 4F 70 65 6E 53 53   4C 20 47 65 6E 65 72 61  ..OpenSSL Genera
    0010: 74 65 64 20 43 65 72 74   69 66 69 63 61 74 65     ted Certificate
    
    
    #2: ObjectId: 2.5.29.35 Criticality=false
    AuthorityKeyIdentifier [
    KeyIdentifier [
    0000: C1 DE 0D A3 D6 E8 0D 7F   58 35 13 6A C4 06 ED 20  ........X5.j... 
    0010: C5 C6 E9 EB                                        ....
    ]
    ]
    
    #3: ObjectId: 2.5.29.19 Criticality=false
    BasicConstraints:[
      CA:false
      PathLen: undefined
    ]
    
    #4: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 77 1E 72 73 87 23 24 D5   E1 E9 71 29 16 33 04 3C  w.rs.#$...q).3.<
    0010: 3A 4D 46 64                                        :MFd
    ]
    ]
    
    Certificate[2]:
    Owner: CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US
    Issuer: CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US
    Serial number: 92d5b8cc173128b4
    Valid from: Wed Jul 15 10:48:11 MDT 2015 until: Sat Jul 14 10:48:11 MDT 2018
    Certificate fingerprints:
         MD5:  DC:3E:00:6B:AE:D7:76:AC:D2:A1:84:E4:C3:02:AD:C1
         SHA1: C8:9D:B3:31:76:DE:88:42:57:66:63:02:7D:87:8A:EF:29:25:FB:94
         SHA256: ED:1C:30:8F:66:74:AB:95:79:D7:ED:C7:84:05:79:30:B8:3F:3F:77:90:BF:68:98:C4:77:B9:99:1E:10:FE:79
         Signature algorithm name: SHA1withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.35 Criticality=false
    AuthorityKeyIdentifier [
    KeyIdentifier [
    0000: C1 DE 0D A3 D6 E8 0D 7F   58 35 13 6A C4 06 ED 20  ........X5.j... 
    0010: C5 C6 E9 EB                                        ....
    ]
    [CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US]
    SerialNumber: [    92d5b8cc 173128b4]
    ]
    
    #2: ObjectId: 2.5.29.19 Criticality=false
    BasicConstraints:[
      CA:true
      PathLen:2147483647
    ]
    
    #3: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: C1 DE 0D A3 D6 E8 0D 7F   58 35 13 6A C4 06 ED 20  ........X5.j... 
    0010: C5 C6 E9 EB                                        ....
    ]
    ]
    
    
    *******************************************
    *******************************************
    
    
    Alias name: af431bf5bb3a41fd83aff213feb1623a
    Creation date: Aug 12, 2016
    Entry type: trustedCertEntry
    
    Owner: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Issuer: CN=ds1.example.com, O=OpenDJ RSA Certificate
    Serial number: 24a0d8df
    Valid from: Fri Aug 12 12:01:27 MDT 2016 until: Thu Aug 07 12:01:27 MDT 2036
    Certificate fingerprints:
         MD5:  AF:43:1B:F5:BB:3A:41:FD:83:AF:F2:13:FE:B1:62:3A
         SHA1: 65:65:04:C7:68:2D:C7:EC:41:FA:8C:C7:61:E0:E3:3A:CE:1A:D7:C0
         SHA256: 88:B9:95:81:A8:7F:DB:62:CD:4A:04:8A:1D:C8:F2:B9:56:6E:6F:66:9F:1D:D1:FE:45:FA:AA:91:17:C5:EB:EB
         Signature algorithm name: SHA256withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 77 1E 72 73 87 23 24 D5   E1 E9 71 29 16 33 04 3C  w.rs.#$...q).3.<
    0010: 3A 4D 46 64                                        :MFd
    ]
    ]
    
    
    *******************************************
    *******************************************
    
    
    Alias name: ca-cert
    Creation date: Aug 12, 2016
    Entry type: trustedCertEntry
    
    Owner: CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US
    Issuer: CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US
    Serial number: 92d5b8cc173128b4
    Valid from: Wed Jul 15 10:48:11 MDT 2015 until: Sat Jul 14 10:48:11 MDT 2018
    Certificate fingerprints:
         MD5:  DC:3E:00:6B:AE:D7:76:AC:D2:A1:84:E4:C3:02:AD:C1
         SHA1: C8:9D:B3:31:76:DE:88:42:57:66:63:02:7D:87:8A:EF:29:25:FB:94
         SHA256: ED:1C:30:8F:66:74:AB:95:79:D7:ED:C7:84:05:79:30:B8:3F:3F:77:90:BF:68:98:C4:77:B9:99:1E:10:FE:79
         Signature algorithm name: SHA1withRSA
         Version: 3
    
    Extensions: 
    
    #1: ObjectId: 2.5.29.35 Criticality=false
    AuthorityKeyIdentifier [
    KeyIdentifier [
    0000: C1 DE 0D A3 D6 E8 0D 7F   58 35 13 6A C4 06 ED 20  ........X5.j... 
    0010: C5 C6 E9 EB                                        ....
    ]
    [CN=ForgeRock CA, O=ForgeRock AS, ST=California, C=US]
    SerialNumber: [    92d5b8cc 173128b4]
    ]
    
    #2: ObjectId: 2.5.29.19 Criticality=false
    BasicConstraints:[
      CA:true
      PathLen:2147483647
    ]
    
    #3: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: C1 DE 0D A3 D6 E8 0D 7F   58 35 13 6A C4 06 ED 20  ........X5.j... 
    0010: C5 C6 E9 EB                                        ....
    ]
    ]
    
    
    *******************************************
    *******************************************
    
  8. Repeat steps 3 to 7 to create new PrivateKeyEntry and trustedCertEntry truststore entries on Master 2 (CN = ds2.example.com).
  9. Enable replication on Master 1 using the dsreplication command applicable to your version:
    • DS 5 and later:
      $ ./dsreplication configure --adminUid admin --adminPassword password --baseDn dc=example,dc=com --host1 ds1.example.com --port1 4444 --bindDn1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 ds2.example.com --port2 5444 --bindDn2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 9989 --trustAll --no-prompt
    • Pre-DS 5:
      $ ./dsreplication enable --adminUID admin --adminPassword password --baseDN dc=example,dc=com --host1 ds1.example.com --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 ds2.example.com --port2 5444 --bindDN2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 9989 --trustAll --no-prompt
    You can then check the replication status to ensure it is successful:
    $ ./dsreplication status --adminUID admin --adminPassword password --hostname ds1.example.com --port 4444 --trustAll
    
    Fri Aug 12 16:46:45 MDT 2016
    Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    ------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    dc=example,dc=com : ds1.example.com:4444 : 10002   : true                : 15311 : 17455 : 8989        : 0        :              : false
    dc=example,dc=com : ds2.example.com:5444 : 10002   : true                : 212   : 5611  : 9989        : 0        :              : false
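
Steps 1 and 5 above both derive an identifier from the MD5 fingerprint of the ads-certificate: the instance key ID (ds-cfg-key-id) is the fingerprint in uppercase with the colons removed, and the trustedCertEntry alias is the same string in lowercase. The following is a minimal sketch of those two conversions; the function names are hypothetical and the sketch only reformats a fingerprint string that you supply, it does not read any keystore.

```shell
#!/bin/sh
# fingerprint_to_key_id FINGERPRINT  (hypothetical helper)
# Prints the fingerprint without colons, in uppercase, which is the
# form used for ds-cfg-key-id in cn=instance keys,cn=admin data.
fingerprint_to_key_id() {
    printf '%s\n' "$1" | tr -d ':' | tr '[:lower:]' '[:upper:]'
}

# fingerprint_to_alias FINGERPRINT  (hypothetical helper)
# Prints the fingerprint without colons, in lowercase, which is the
# form used for the trustedCertEntry alias in the ads-truststore.
fingerprint_to_alias() {
    printf '%s\n' "$1" | tr -d ':' | tr '[:upper:]' '[:lower:]'
}
```

For example, passing the fingerprint AF:43:1B:F5:BB:3A:41:FD:83:AF:F2:13:FE:B1:62:3A from the keytool output above to fingerprint_to_alias gives the alias af431bf5bb3a41fd83aff213feb1623a shown for the trustedCertEntry.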
    

See Also

FAQ: SSL certificate management in DS/OpenDJ

How do I use externally created SSL keys with DS/OpenDJ (All versions)?

Administration Guide › Changing Server Certificates

Related Training

N/A

Related Issue Tracker IDs

N/A


How do I migrate an existing DS+RS replication topology to a DS to RS topology in DS/OpenDJ (All versions)?

The purpose of this article is to provide information on migrating an existing topology, in which each directory server (DS) also acts as a replication server (RS) (DS+RS), to a topology that uses a standalone RS in DS/OpenDJ.

Overview

This article details the steps necessary to migrate from a DS with RS topology (DS+RS <-> DS+RS) to a standalone RS topology (DS <-> RS <-> DS), where:

  • DS+RS (1) <-> DS+RS (2) is the original topology.
  • DS (3) <-> RS (4) <-> DS (5) is the migrated topology.

The following hostnames and port numbers have been used in these examples:

Server(s)    Hostname           Admin Port    Replication Port
DS+RS (1)    ds1.example.com    4444          8989
DS+RS (2)    ds2.example.com    5444          9989
DS (3)       ds3.example.com    6444          --
RS (4)       rs4.example.com    7444          10989
DS (5)       ds5.example.com    8444          --

Migrating your replication topology

Starting with an initial configuration of DS+RS <-> DS+RS:

Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
dc=example,dc=com : ds1.example.com:4444 : 100     : true                : 10000 : 10000 : 8989        : 0        :              : false
dc=example,dc=com : ds2.example.com:5444 : 100     : true                :  7310 :  2040 : 9989        : 0        :              : true

You can migrate as follows: 

  1. Add RS (4) and replicate it to DS+RS (2) using the setup and dsreplication commands applicable to your version, for example:
    • DS 5 and later:
      $ ./setup --ldapPort 389 --adminConnectorPort 7444 --rootUserDN "cn=Directory Manager" --rootUserPassword password --enableStartTLS --ldapsPort 636 --hostName rs4.example.com --acceptLicense
      
      $ ./dsreplication configure --host1 rs4.example.com --port1 7444 --bindDn1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 10989 --secureReplication1 --onlyReplicationServer1 --host2 ds2.example.com --port2 5444 --bindDn2 "cn=Directory Manager" --bindPassword2 password --secureReplication2 --noReplicationServer2 --baseDn dc=example,dc=com --adminUid admin --adminPassword password --no-prompt --noPropertiesFile --trustAll
      
    • Pre-DS 5:
      $ ./setup --cli --ldapPort 389 --adminConnectorPort 7444 --rootUserDN "cn=Directory Manager" --rootUserPassword password --enableStartTLS --ldapsPort 636 --generateSelfSignedCertificate --hostName rs4.example.com --no-prompt --noPropertiesFile --acceptLicense
      
      $ ./dsreplication enable --host1 rs4.example.com --port1 7444 --bindDN1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 10989 --secureReplication1 --onlyReplicationServer1 --host2 ds2.example.com --port2 5444 --bindDN2 "cn=Directory Manager" --bindPassword2 password --secureReplication2 --noReplicationServer2 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --no-prompt --noPropertiesFile --trustAll
      
    This results in DS+RS (1) <-> DS+RS (2) <-> RS (4):
    Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    ------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    dc=example,dc=com : ds1.example.com:4444 : 100     : true                : 10000 : 10000 :  8989       : 0        :              : false
    dc=example,dc=com : ds2.example.com:5444 : 100     : true                :  7310 :  2040 :  9989       : 0        :              : true
                      : rs4.example.com:7444 : (6)     : true                :       : 11351 : 10989       :          :              : true
    
  2. Add DS (3) and replicate it to RS (4) using the setup and dsreplication commands applicable to your version, for example:
    • DS 5 and later:
      $ ./setup --ldapPort 1389 --adminConnectorPort 6444 --rootUserDN "cn=Directory Manager" --rootUserPassword password --enableStartTLS --ldapsPort 1636 --hostName ds3.example.com --addBaseEntry --baseDN dc=example,dc=com --acceptLicense
      
      $ ./dsreplication configure --host1 ds3.example.com --port1 6444 --bindDn1 "cn=Directory Manager" --bindPassword1 password --secureReplication1 --noReplicationServer1 --host2 rs4.example.com --port2 7444 --bindDn2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 10989 --secureReplication2 --onlyReplicationServer2 --baseDn dc=example,dc=com --adminUid admin --adminPassword password --no-prompt --noPropertiesFile --trustAll
    • Pre-DS 5:
      $ ./setup --cli --ldapPort 1389 --adminConnectorPort 6444 --rootUserDN "cn=Directory Manager" --rootUserPassword password --enableStartTLS --ldapsPort 1636 --generateSelfSignedCertificate --hostName ds3.example.com --addBaseEntry --baseDN dc=example,dc=com --no-prompt --noPropertiesFile --acceptLicense
      
      $ ./dsreplication enable --host1 ds3.example.com --port1 6444 --bindDN1 "cn=Directory Manager" --bindPassword1 password --secureReplication1 --noReplicationServer1 --host2 rs4.example.com --port2 7444 --bindDN2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 10989 --secureReplication2 --onlyReplicationServer2 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --no-prompt --noPropertiesFile --trustAll
       
  3. Initialize DS (3) from an existing server (such as DS+RS (2)) using the dsreplication initialize command, for example:
    $ ./dsreplication initialize --adminUID admin --adminPassword password --baseDN dc=example,dc=com --hostSource ds2.example.com --portSource 5444 --hostDestination ds3.example.com --portDestination 6444 --trustAll --no-prompt
    This results in DS+RS (1) <-> DS+RS (2) <-> RS (4) <-> DS (3):
    Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    ------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    dc=example,dc=com : ds1.example.com:4444 : 100     : true                : 10000 : 10000 :  8989       : 0        :              : false
    dc=example,dc=com : ds2.example.com:5444 : 100     : true                :  7310 : 2040  :  9989       : 0        :              : true
    dc=example,dc=com : ds3.example.com:6444 : 100     : true                :  8834 : (5)   :             : 0        :              : 
                      : rs4.example.com:7444 : (6)     : true                :       : 11351 : 10989       :          :              : true
    
  4. Disable replication for DS+RS (1) using the dsreplication command applicable to your version:
    • DS 5 and later:
      $ ./dsreplication unconfigure --unconfigureAll --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --adminPassword password --trustAll --no-prompt
    • Pre-DS 5:
      $ ./dsreplication disable --disableAll --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --adminPassword password --trustAll --no-prompt
      
    This results in DS+RS (2) <-> RS (4) <-> DS (3):
    Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    ------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    dc=example,dc=com : ds2.example.com:5444 : 100     : true                : 7310  : 2040  :  9989       : 0        :              : true
    dc=example,dc=com : ds3.example.com:6444 : 100     : true                : 8834  : (5)   :             : 0        :              : 
                      : rs4.example.com:7444 : (6)     : true                :       : 11351 : 10989       :          :              : true
    
  5. Add DS (5) and replicate it to RS (4) using the setup and dsreplication commands applicable to your version, for example:
    • DS 5 and later:
      $ ./setup --ldapPort 2389 --adminConnectorPort 8444 --rootUserDN "cn=Directory Manager" --rootUserPassword password --enableStartTLS --ldapsPort 2636 --hostName ds5.example.com --addBaseEntry --baseDN dc=example,dc=com --acceptLicense
      
      $ ./dsreplication configure --host1 ds5.example.com --port1 8444 --bindDn1 "cn=Directory Manager" --bindPassword1 password --secureReplication1 --noReplicationServer1 --host2 rs4.example.com --port2 7444 --bindDn2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 10989 --secureReplication2 --onlyReplicationServer2 --baseDn dc=example,dc=com --adminUid admin --adminPassword password --no-prompt --noPropertiesFile --trustAll
    • Pre-DS 5:
      $ ./setup --cli --ldapPort 2389 --adminConnectorPort 8444 --rootUserDN "cn=Directory Manager" --rootUserPassword password --enableStartTLS --ldapsPort 2636 --generateSelfSignedCertificate --hostName ds5.example.com  --addBaseEntry --baseDN dc=example,dc=com --no-prompt --noPropertiesFile --acceptLicense
      
      $ ./dsreplication enable --host1 ds5.example.com --port1 8444 --bindDN1 "cn=Directory Manager" --bindPassword1 password --secureReplication1 --noReplicationServer1 --host2 rs4.example.com --port2 7444 --bindDN2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 10989 --secureReplication2 --onlyReplicationServer2 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --no-prompt --noPropertiesFile --trustAll
  6. Initialize DS (5) from an existing server (such as DS+RS (2)) using the dsreplication initialize command, for example:
    $ ./dsreplication initialize --adminUID admin --adminPassword password --baseDN dc=example,dc=com --hostSource ds2.example.com --portSource 5444 --hostDestination ds5.example.com --portDestination 8444 --trustAll --no-prompt
    
    This results in DS+RS (2) <-> RS (4) <-> DS (3) <-> DS (5):
    Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    ------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    dc=example,dc=com : ds2.example.com:5444 : 100     : true                :  7310 : 2040  :  9989       : 0        :              : true
    dc=example,dc=com : ds3.example.com:6444 : 100     : true                :  8834 : (5)   :             : 0        :              : 
    dc=example,dc=com : ds5.example.com:8444 : 100     : true                : 31314 : (5)   :             : 0        :              : 
                      : rs4.example.com:7444 : (6)     : true                :       : 11351 : 10989       :          :              : true
    
  7. Disable replication for DS+RS (2) using the dsreplication command applicable to your version:
    • DS 5 and later:
      $ ./dsreplication unconfigure --unconfigureAll --hostname ds2.example.com --port 5444 --bindDN "cn=Directory Manager" --adminPassword password --trustAll --no-prompt
    • Pre-DS 5:
      $ ./dsreplication disable --disableAll --hostname ds2.example.com --port 5444 --bindDN "cn=Directory Manager" --adminPassword password --trustAll --no-prompt
      
    This results in your standalone RS topology, DS (3) <-> RS (4) <-> DS (5):
    Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    ------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    dc=example,dc=com : ds3.example.com:6444 : 100     : true                :  8834 : (5)   :             : 0        :              : 
    dc=example,dc=com : ds5.example.com:8444 : 100     : true                : 31314 : (5)   :             : 0        :              : 
                      : rs4.example.com:7444 : (6)     : true                :       : 11351 : 10989       :          :              : true
    

See Also

How do I troubleshoot replication issues in DS/OpenDJ (All versions)?

Replication in DS/OpenDJ

Administration Guide › Managing Data Replication › Configuring Replication

Administration Guide › Managing Data Replication › Standalone Replication Servers

Related Training

N/A

Related Issue Tracker IDs

N/A


How do I change the admin account password used for replication in DS/OpenDJ (All versions)?

The purpose of this article is to provide information on changing the admin account password used for replication in DS/OpenDJ.

Changing the admin account password used for replication

There is no default admin account password used for replication. When you enable replication for the first time using the dsreplication configure (DS 5 and later) or dsreplication enable (pre-DS 5) command, you set the password to be used for this account. This user is created in the cn=admin data backend, which is replicated across all servers.

You can change the password using a standard LDAP operation. For example, you would use a command such as the following if the user was 'admin':

$ ./ldappasswordmodify --bindDN "cn=Directory Manager" --bindPassword password --port 4444 --newPassword Passw0rd --authzID "cn=admin,cn=Administrators,cn=admin data" --trustAll --useSSL

You can then test the new password with the dsreplication status command, for example:

$ ./dsreplication status --adminUID admin --adminPassword Passw0rd --hostname ds1.example.com --port 4444 --trustAll

See Also

FAQ: Passwords in DS/OpenDJ

Troubleshooting DS/OpenDJ

Administration Guide › Troubleshooting Server Problems › Resetting Administrator Passwords

Reference › Tools Reference › dsreplication

Related Training

N/A

Related Issue Tracker IDs

N/A


Troubleshooting Replication


How do I troubleshoot replication issues in DS/OpenDJ (All versions)?

The purpose of this article is to provide assistance for troubleshooting replication issues in DS/OpenDJ. It also provides other useful information about replication, including: background information, regular tasks you should perform to ensure replication is behaving as expected (to avoid future issues) and the recommended ways to recover and stop replication.

Warning

Do not compress, tamper with, or otherwise alter changelog database files directly unless specifically instructed to do so by a qualified ForgeRock technical support engineer. External changes to changelog database files can render them unusable by the server. By default, changelog database files are located under the /path/to/ds/changelogDb directory.

Overview

This article provides background information on replication and how it works, and explains how to monitor and troubleshoot replication issues.

Background information

DS/OpenDJ uses a DS and RS model for replication.

  • A DS is a Directory Server. DSs contain the backend databases and answer client requests.
  • An RS is a Replication Server. RSs contain a changelog and handle replication traffic with DSs and with other RSs; they receive, send and store only changes to directory data rather than the directory data itself. A DS connects to an RS for replication purposes.

When installed without replication enabled, the DS/OpenDJ instance is a DS by default. If you enable replication, the instance spins up RS threads within the process. The instance then becomes a DS+RS.

A DS/OpenDJ replication topology can consist of the following types of instances:

  • DS+RS
  • Standalone DS
  • Standalone RS

With this in mind, each DS and each RS has a unique ID assigned to it. The DS ID is used for keeping track of changes to the system and is included in the CSN. See Using the CSN to troubleshoot data consistency for further information on decoding the CSN and identifying the DS ID.

See Administration Guide › Understanding Directory Services › About Replication and Administration Guide › Managing Data Replication › Replication Per Suffix for further information on replication.

Note

You should be cautious about changing the hostname as this affects replication. If you need to change it, follow the procedure given in How do I change the hostname for DS/OpenDJ (All versions)? to ensure replication is correctly handled. Additionally, you should be consistent with your use of either FQDNs or IP addresses for hostnames as noted in Administration Guide › Managing Data Replication › Configuring Replication.

changelogDb

The changelog stores replication changes in the replication changes database (changelogDb directory). These changes are purged from the changelog according to the replication purge delay setting. You must ensure this is set appropriately to keep data long enough for replication and data recovery purposes; these changes are permanently lost once they are purged from the changelog. See How do I control how long replication changes are retained in DS/OpenDJ (All versions)? and FAQ: Backup and restore in DS/OpenDJ (Q. When does the replication purge take place?) for further information.

Change Sequence Number (CSN)

The CSN is used to track changes via replication and the changelog. The CSN is an encoded value that represents the date and time, the DS's server ID and the change number for that timestamp. See Using the CSN to troubleshoot data consistency for further information on decoding the CSN.

Generation ID

The generation ID is a checksum of attributes from some of the entries and is used during replication to check that the suffix being updated is the same as the one offering the updates.

Checking the status of replication

It is very important to check the status of replication on a regular basis so that you can be confident that all changes are being replayed successfully; in particular it can be helpful to check if you notice that replication changes are slower than expected.

You can use the dsreplication status command to give you an overall view of the replication topology. The output shows you information on a per server/suffix basis, including how many entries each server has as well as any missing changes. For example:

$ ./dsreplication status --adminUID admin --adminPassword password --hostname ds1.example.com --port 4444 --trustAll

Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
dc=example,dc=com : ds1.example.com:4444 : 2002    : true                : 10000 : 11500 : 8989        : 0        :              : true
dc=example,dc=com : ds2.example.com:5444 : 2002    : true                : 14057 : 12210 : 9989        : 0        :              : true

Compare the Entries values across the servers to ensure they match and check there are no missing changes (M.C.). If you see discrepancies, you should run dsreplication status a few times in short succession to see if the server catches up. If Entries values continue to be different and/or you still have missing changes, you know replication is out of sync.
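The comparison described above can also be scripted. The following is a minimal sketch, assuming you have captured the text output of dsreplication status and that it uses the column layout shown in this article; the function names parse_status and check_in_sync are hypothetical helpers, not part of DS/OpenDJ:

```python
import re

SAMPLE = """\
Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
dc=example,dc=com : ds1.example.com:4444 : 2002    : true                : 10000 : 11500 : 8989        : 0        :              : true
dc=example,dc=com : ds2.example.com:5444 : 2002    : true                : 14057 : 12210 : 9989        : 0        :              : true
"""

def parse_status(text):
    """Return one dict per data row of a dsreplication status table."""
    header, rows = None, []
    for line in text.splitlines():
        # Columns are separated by " : "; host:port has no spaces around ":".
        cells = [c.strip() for c in re.split(r" :(?: |$)", line)]
        if cells[0] == "Suffix DN":
            header = cells
        elif header and len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows

def check_in_sync(rows):
    """Return a list of problems; an empty list means no discrepancy was seen."""
    problems = []
    by_suffix = {}
    for row in rows:
        if not row["Entries"].isdigit():
            continue  # for example a standalone RS row, which holds no entries
        by_suffix.setdefault(row["Suffix DN"], []).append(row)
    for suffix, replicas in by_suffix.items():
        counts = {r["Server"]: int(r["Entries"]) for r in replicas}
        if len(set(counts.values())) > 1:
            problems.append(f"{suffix}: entry counts differ: {counts}")
        for r in replicas:
            if r["M.C. (2)"] not in ("", "0"):
                problems.append(
                    f"{suffix}: {r['Server']} reports {r['M.C. (2)']} missing changes"
                )
    return problems

if __name__ == "__main__":
    print(check_in_sync(parse_status(SAMPLE)) or "no discrepancies in this snapshot")
```

A single clean snapshot does not prove replication is healthy, and a single discrepancy does not prove it is broken; as noted above, run the check several times in short succession before drawing conclusions.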

The M.C. metric is deprecated in DS 6; you should monitor replication delay instead. See Administration Guide › Monitoring Replication Delay Over LDAP and Administration Guide › Monitoring Replication Delay Over HTTP for further information.

Note

It is possible that you will sometimes see missing changes but as long as they do go away (that is, the count returns to 0) it is normal. dsreplication status searches each server's backend for the information it uses to calculate entries / missing entries. This is like trying to hit a moving target and can sometimes display missing changes when there aren't any. This is why it is important to monitor replication on a regular basis. There is also a known issue in OpenDJ 3.x, which mistakenly reports missing changes: OPENDJ-3133 (dsreplication status reports M.C. (Missing Changes) when none exist.)

See Reference › Tools Reference › dsreplication for further information.

Monitoring replication

It is very important to monitor replication on a regular basis; you should also ensure your systems are not down for longer than the period set in your purge delay. If a server is down longer than the purge delay and the entry count has not changed, any LDAP MOD operations that took place during this time will not be seen as Missing Changes in the M.C. column of the dsreplication status output. Any ADD or DEL operations during this time will show a difference in the number of entries. For example, if the purge delay is one day and Master 2 is down for two days, any modifications made to Master 1 will be purged after one day and never seen in the M.C. column in the replication status.
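The purge delay reasoning above is simple date arithmetic. As an illustrative sketch only (the function names are hypothetical, and the purge delay value is whatever you have configured), you could check a planned or actual outage like this:

```python
from datetime import datetime, timedelta, timezone

def outage_exceeds_purge_delay(downtime, purge_delay):
    """True when a server was down longer than changes are kept in the changelog."""
    return downtime > purge_delay

def latest_safe_restart(stopped_at, purge_delay):
    """Latest time the server can come back before changes start being purged."""
    return stopped_at + purge_delay

if __name__ == "__main__":
    # The example from the text: purge delay of one day, server down for two days.
    print(outage_exceeds_purge_delay(timedelta(days=2), timedelta(days=1)))
    stopped = datetime(2017, 7, 5, 16, 0, tzinfo=timezone.utc)
    print(latest_safe_restart(stopped, timedelta(days=1)).isoformat())
```

If the outage exceeds the purge delay, do not rely on the M.C. column to tell you what was missed.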

You can run an ldapsearch against the base DN "cn=Replication,cn=monitor" to monitor replication. You can either return all attributes or request specific ones as required. For example, you may use a command such as this, which includes some useful attributes to check for replication purposes:

$ ./ldapsearch --port 1389 --bindDN "cn=Directory Manager" --bindPassword password --baseDN "cn=Replication,cn=monitor" --searchScope sub "(&(objectClass=*)(domain-name=dc=example,dc=com))" \* + last-status-change-date last-change lost-connections received-updates sent-updates replayed-updates pending-updates replayed-updates-ok resolved-modify-conflicts resolved-naming-conflicts unresolved-naming-conflicts missing-changes approximate-delay

Example of a partial response (you can see the full response in this file: ReplicationMonitoringResponse.txt):

dn: cn=Directory server DS(10000) ds1.example.com:54209,cn=dc_example_dc_com,cn=Replication,cn=monitor
objectClass: top
objectClass: ds-monitor-entry
objectClass: extensibleObject
domain-name: dc=example,dc=com
server-id: 10000
connected-to: ds1.example.com/198.51.100.0:8989
lost-connections: 0
received-updates: 0
sent-updates: 95
replayed-updates: 0
max-rcv-window: 100000
current-rcv-window: 100000
max-send-window: 100000
current-send-window: 99905
server-state: 0000015d13a119d23b470000005f Wed Jul 05 09:41:51 PDT 2017 1499272911314
ssl-encryption: false
generation-id: 19363628
pending-updates: 0
replayed-updates-ok: 0
resolved-modify-conflicts: 0
resolved-naming-conflicts: 0
unresolved-naming-conflicts: 0
remote-pending-changes-size: 0
dependent-changes-size: 0
changes-in-progress-size: 0
assured-sr-sent-updates: 0
assured-sr-acknowledged-updates: 0
assured-sr-not-acknowledged-updates: 0
assured-sr-timeout-updates: 0
assured-sr-wrong-status-updates: 0
assured-sr-replay-error-updates: 0
assured-sr-received-updates: 0
assured-sr-received-updates-acked: 0
assured-sr-received-updates-not-acked: 0
assured-sd-sent-updates: 0
assured-sd-acknowledged-updates: 0
assured-sd-timeout-updates: 0
last-status-change-date: 20170705163523.699Z
status: Normal
cn: Directory server DS(10000) ds1.example.com:54209
etag: 00000000345e8a3d
pwdPolicySubentry: cn=Default Password Policy,cn=Password Policies,cn=config
structuralObjectClass: ds-monitor-entry
subschemaSubentry: cn=schema
entryDN: cn=Directory server DS(10000) ds1.example.com:54209,cn=dc_example_dc_com,cn=Replication,cn=monitor
entryUUID: 39ed7ebc-6c79-38a7-ae4d-c84819eab121
hasSubordinates: false
numSubordinates: 0

Meanings of key attributes, such as the ones included in the above example:

Attribute                    Meaning
last-status-change-date      The date of the last status change.
last-change                  The CSN of the last change and associated timestamp.
lost-connections             The number of times connection has been lost between DSs and RSs. This value should roughly equate to the number of times you have stopped replication. If it is much greater, you should investigate to find out what is causing these connection losses.
received-updates             The number of replicated changes received by this server. This value should match replayed-updates.
sent-updates                 The number of changes that have been sent by this server. You should see these being received and applied to other servers via the received-updates / replayed-updates attribute values. This value, in conjunction with the received-updates / replayed-updates values on the other servers, indicates how well replication is working.
replayed-updates             The number of replicated changes that have been applied to this server. This value should match received-updates.
pending-updates              The number of replicated changes waiting to be applied to this server.
replayed-updates-ok          The number of replicated changes that have been successfully applied to this server. This value should match received-updates.
resolved-modify-conflicts    The number of modify conflicts that have been resolved since the server was last started. Modify conflicts are always resolved automatically.
resolved-naming-conflicts    The number of naming conflicts that have been resolved since the server was last started. This value includes both automatically and manually resolved conflicts.
unresolved-naming-conflicts  The number of unresolved naming conflicts since the server was last started. This value should equal 0, otherwise it means there are naming conflicts that you need to identify and resolve manually. Naming conflicts are identified by an additional entryuuid RDN in the DN as demonstrated in the Identifying replication issues from the DS/OpenDJ log files section. This value does not decrease once a conflict has been manually resolved. An RFE exists for this: OPENDJ-251 (Provide count of unresolved replication naming conflicts as part of the Monitoring information)
missing-changes              The number of changes that have been sent by the other server but have not yet been applied to this server. This value should equal 0.
approximate-delay            The difference between the RS current time and the timestamp of the oldest update that has not yet been sent to the DS; this indicates replication latency. This value should equal 0.
numSubordinates              The number of entries below the baseDN.

Note

Some of these attributes (for example, missing-changes) can indicate issues with replication; however, there can be an innocent reason for these discrepancies since replication is not real time and replicated changes can be delayed because of things such as network speed or latency, CPU usage and overall load on the individual instances. That is why it is important to monitor replication on a regular basis to understand what is normal for your environment.
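To make regular monitoring easier, the expectations listed in the table can be turned into a small check. The following is a rough sketch, assuming you have saved one monitor entry from the ldapsearch output as plain "attribute: value" lines (no folded lines or base64 values) and that the counters are integers; parse_entry and review are hypothetical helper names:

```python
SAMPLE = """\
dn: cn=Directory server DS(10000) ds1.example.com:54209,cn=dc_example_dc_com,cn=Replication,cn=monitor
server-id: 10000
lost-connections: 0
received-updates: 0
sent-updates: 95
replayed-updates: 0
pending-updates: 0
replayed-updates-ok: 0
unresolved-naming-conflicts: 0
"""

def parse_entry(ldif):
    """Parse one simple LDIF entry into a dict of lower-cased names to value lists."""
    entry = {}
    for line in ldif.strip().splitlines():
        name, sep, value = line.partition(": ")
        if sep:
            entry.setdefault(name.lower(), []).append(value)
    return entry

def first_int(entry, name):
    """Return the first value of an attribute as an int, or 0 when it is absent."""
    return int(entry.get(name, ["0"])[0])

def review(entry):
    """Return findings for values that differ from what the table says to expect."""
    findings = []
    if first_int(entry, "unresolved-naming-conflicts") != 0:
        findings.append("unresolved naming conflicts need manual resolution")
    if first_int(entry, "received-updates") != first_int(entry, "replayed-updates"):
        findings.append("received-updates and replayed-updates do not match")
    for name in ("pending-updates", "missing-changes", "approximate-delay"):
        value = first_int(entry, name)
        if value != 0:
            findings.append(f"{name} is {value}; check again to see if it returns to 0")
    return findings

if __name__ == "__main__":
    print(review(parse_entry(SAMPLE)) or "nothing unusual in this snapshot")
```

As the note above explains, a non-zero value in one snapshot can be innocent; treat findings as prompts to re-check rather than as proof of a problem.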

Using the CSN to troubleshoot data consistency

Some attributes, such as ds-sync-state and ds-sync-hist, give the CSN so you can determine the replication state. The CSN consists of the following information:

  • The first 16 hexadecimal digits are a timestamp (milliseconds since the Unix epoch).
  • The next 4 hexadecimal digits make up the replica-id (DS ID).
  • The last 8 hexadecimal digits are the sequence number that identifies the change.

For example, decoding the ds-sync-state value 00000155a8bd5aa3271000000002 using the decodecsn tool gives the following output:

CSN 00000155a8bd5aa3271000000002
 -> ts=00000155a8bd5aa3 (1467414829731) Fri Jul  1 2016 17:13:49.731
 id=2710 (10000)
 no=00000002 (2)

This gives you the timestamp of the change, the ID of the DS server where the change was made (10000) and the sequence number. Comparing this DS ID to the information output from dsreplication status identifies this DS as Master 1.

Innocent replication example

The following ldapsearch example shows an innocent replication issue where there are multiple ds-sync-state values:

$ ./ldapsearch --port 51389 --bindDN "cn=Directory Manager" --bindPassword password --searchScope base --baseDN dc=openam,dc=forgerock,dc=org "(objectClass=*)" \* ds-sync-state +
    
dn: dc=openam,dc=forgerock,dc=org
objectClass: top
objectClass: domain
dc: openam
ds-sync-state: 0000015d190d9ab64d710000003a
ds-sync-state: 0000015d18b72dd8550b0000000a

The presence of multiple ds-sync-state values indicates that you have (or have had) several different replicas in the lifetime of your replicated directory service.
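To see which replicas the values refer to, you can extract the replica ID from each one. The sketch below assumes the CSN layout described earlier (the DS ID occupies hexadecimal digits 17 to 20); replica_ids is a hypothetical helper name:

```python
def replica_ids(sync_state_values):
    """Map each ds-sync-state CSN to the decimal replica (DS) ID embedded in it."""
    return {value: int(value[16:20], 16) for value in sync_state_values}

if __name__ == "__main__":
    values = [
        "0000015d190d9ab64d710000003a",
        "0000015d18b72dd8550b0000000a",
    ]
    for csn, ds_id in replica_ids(values).items():
        print(csn, "-> DS ID", ds_id)
```

You can then compare the resulting IDs with the DS ID column of the dsreplication status output to tell current replicas from ones that no longer exist.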

ds-sync-hist example

The following ldapsearch example shows that the attributes sn, cn, postalAddress and givenName were changed to "Doe", "John Doe", "John Doe$01251 Chestnut Street$Panama City, DE  50369" and "John" respectively:

$ ./ldapsearch --port 1389 --bindDN "cn=Directory Manager" --bindPassword password --searchScope sub --baseDN dc=example,dc=com "(uid=user.0)" \* ds-sync-hist +

dn: uid=user.0,ou=People,dc=example,dc=com
objectClass: top
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
mail: user.0@maildomain.net
initials: ASA
homePhone: +1 225 216 5900
pager: +1 779 041 6341
givenName: John
employeeNumber: 0
telephoneNumber: +1 685 622 6202
mobile: +1 010 154 3228
sn: Doe
cn: John Doe
userPassword: {SSHA}f+6nCXygJSBwS9G3VDAOXNDRvI+YXI3CYswvug==
description: This is the description for Aaccf Amar.
street: 01251 Chestnut Street
st: DE
postalAddress: John Doe$01251 Chestnut Street$Panama City, DE  50369
uid: user.0
l: Panama City
postalCode: 50369
ds-sync-hist: sn:0000015d13a119d23b470000005f:repl:Doe
ds-sync-hist: cn:0000015d13a119d23b470000005f:repl:John Doe
ds-sync-hist: postaladdress:0000015d13a119d23b470000005f:repl:John Doe$01251 Chestnut Street$Panama City, DE  50369
ds-sync-hist: modifiersName:0000015d13a119d23b470000005f:repl:cn=Directory Manager,cn=Root DNs,cn=config
ds-sync-hist: modifyTimestamp:0000015d13a119d23b470000005f:repl:20170705164151Z
ds-sync-hist: givenname:0000015d13a119d23b470000005f:repl:John
modifyTimestamp: 20170705164151Z
modifiersName: cn=Directory Manager,cn=Root DNs,cn=config
entryUUID: 0d3ce3bf-4107-3b34-9e5a-fa71deb8b504
pwdPolicySubentry: cn=Default Password Policy,cn=Password Policies,cn=config
subschemaSubentry: cn=schema
hasSubordinates: false
numSubordinates: 0
etag: 000000001d86d123
structuralObjectClass: inetOrgPerson
entryDN: uid=user.0,ou=People,dc=example,dc=com
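Each ds-sync-hist value in this output has the shape attribute:CSN:operation:value. As a sketch only, assuming that shape holds for the values you are inspecting (it is inferred from the example above rather than from a format specification, and the value part is treated as optional), the values can be split like this; SyncHist and parse_sync_hist are hypothetical names:

```python
from typing import NamedTuple, Optional

class SyncHist(NamedTuple):
    attribute: str
    csn: str
    operation: str
    value: Optional[str]

def parse_sync_hist(raw):
    """Split one ds-sync-hist value into attribute, CSN, operation and value."""
    # Split at most three times so that colons inside the value are preserved.
    parts = raw.split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"unexpected ds-sync-hist value: {raw!r}")
    value = parts[3] if len(parts) == 4 else None
    return SyncHist(parts[0], parts[1], parts[2], value)

if __name__ == "__main__":
    lines = [
        "sn:0000015d13a119d23b470000005f:repl:Doe",
        "cn:0000015d13a119d23b470000005f:repl:John Doe",
        "givenname:0000015d13a119d23b470000005f:repl:John",
    ]
    parsed = [parse_sync_hist(line) for line in lines]
    for item in parsed:
        print(item.attribute, item.operation, item.value)
    # Values sharing one CSN were recorded under the same change.
    print(len({item.csn for item in parsed}), "distinct CSN(s)")
```

In the example above, all six ds-sync-hist values carry the same CSN, which is consistent with them having been changed by a single modify operation.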

Data consistency example

Each CSN (Change Sequence Number) represents a single change, be it an ADD, DELETE or MODIFY, so a CSN effectively acts as a consistency ID for that change. Since dsreplication status derives its results from the deltas of these CSNs, we can extrapolate that if all servers have all changes (CSNs), the data can be deemed consistent between the instances. If an instance is missing a change, it can be assumed that the affected data on that instance has diverged from the other instances.

The following worked example demonstrates using the CSN to determine data consistency:

  1. Let's take a two-master replication topology with 2000 entries. Since the following dsreplication status is taken just after instance setup, data creation and initialization, the data is known to be consistent; at this point, the backends are the same and there are no changes yet, that is, the changelogDb has 0 changes:
    $ ./dsreplication status --adminUID admin --adminPassword password --hostname ds1.example.com --port 4444 --trustAll
    
    Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    ------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    dc=example,dc=com : ds1.example.com:4444 : 2000    : true                : 14409 : 5070  : 8989        : 0        :              : true
    dc=example,dc=com : ds1.example.com:5444 : 2000    : true                : 26696 : 3946  : 9989        : 0        :              : true
    
  2. Make a change in the form of an ADD (adding John Doe). The CSN for this change is 00000155c6188be0384900000001 as seen in the changelog (both masters' changelogs now have the same change - an ADD):
    dn: changeNumber=1,cn=changelog
    objectClass: top
    objectClass: changeLogEntry
    changeNumber: 1
    changeTime: 20160707160225Z
    changeType: add
    targetDN: uid=jdoe,ou=People,dc=example,dc=com
    changes:: b2JqZWN0Q2xhc3M6IG9yZ2FuaXphdGlvbmFsUGVyc29uCm9iamVjdENsYXNzOiB0b3AKb2JqZWN0Q2xhc3M6IHBlcnNvbgpvYmplY3RDbGFzczogaW5ldE9yZ1BlcnNvbgp1aWQ6IGpkb2UKZ2l2ZW5OYW1lOiBKb2huCnNuOiBEb2UKY246IEpvaG4gRG9lCnVzZXJQYXNzd29yZDoge1NTSEF9WmJTcnJDL05BMHEwUFBzQmRPaVdRZTRaV3FQTDQ5Nll2RmR2NVE9PQplbnRyeVVVSUQ6IGY0MTRmZmVkLTVlZDAtNDUzNy1iMDU5LTU5YzUyMTc5MmNkMApjcmVhdGVUaW1lc3RhbXA6IDIwMTYwNzA3MTYwMjI1Wgpwd2RDaGFuZ2VkVGltZTogMjAxNjA3MDcxNjAyMjUuMzc2WgpjcmVhdG9yc05hbWU6IGNuPURpcmVjdG9yeSBNYW5hZ2VyLGNuPVJvb3QgRE5zLGNuPWNvbmZpZw==
    subschemaSubentry: cn=schema
    numSubordinates: 0
    hasSubordinates: false
    entryDN: changeNumber=1,cn=changelog
    replicationCSN: 00000155c6188be0384900000001
    replicaIdentifier: 14409
    changeInitiatorsName: cn=Directory Manager,cn=Root DNs,cn=config
    targetEntryUUID: f414ffed-5ed0-4537-b059-59c521792cd0
    changeLogCookie: dc=example,dc=com:00000155c6188be0384900000001;
    
  3. Check the replication status again:
    $ ./dsreplication status --adminUID admin --adminPassword password --hostname ds1.example.com --port 4444 --trustAll
    
    Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    ------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    dc=example,dc=com : ds1.example.com:4444 : 2001    : true                : 14409 : 5070  : 8989        : 0        :              : true
    dc=example,dc=com : ds1.example.com:5444 : 2001    : true                : 26696 : 3946  : 9989        : 0        :              : true
    
    Since the dsreplication command uses the CSN as its basis for replication and data consistency, we now have proof that:
    • The data element representing the ADD is on both servers.
    • Because the data element is on both servers, we know based on the matching "entry count" from the replication status, that the entries in the backend are consistent.
    If the CSN (change) was not played to the other server, the servers' CSNs would not match and therefore the data would not be consistent. This would be seen as a difference in the entry count displayed by dsreplication status.
  4. Make a simple MODIFY to John's entry (add a description). The CSN for this change is 00000155c619858e384900000002.
    dn: changeNumber=2,cn=changelog
    objectClass: top
    objectClass: changeLogEntry
    changeNumber: 2
    changeTime: 20160707160329Z
    changeType: modify
    targetDN: uid=jdoe,ou=People,dc=example,dc=com
    changes:: YWRkOiBkZXNjcmlwdGlvbgpkZXNjcmlwdGlvbjogVGhpcyBpcyBKb2huJ3MgRGVzY3JpcHRpb24KLQpyZXBsYWNlOiBtb2RpZmllcnNOYW1lCm1vZGlmaWVyc05hbWU6IGNuPURpcmVjdG9yeSBNYW5hZ2VyLGNuPVJvb3QgRE5zLGNuPWNvbmZpZwotCnJlcGxhY2U6IG1vZGlmeVRpbWVzdGFtcAptb2RpZnlUaW1lc3RhbXA6IDIwMTYwNzA3MTYwMzI5Wgot
    subschemaSubentry: cn=schema
    numSubordinates: 0
    hasSubordinates: false
    entryDN: changeNumber=2,cn=changelog
    replicationCSN: 00000155c619858e384900000002
    replicaIdentifier: 14409
    changeInitiatorsName: cn=Directory Manager,cn=Root DNs,cn=config
    targetEntryUUID: f414ffed-5ed0-4537-b059-59c521792cd0
    changeLogCookie: dc=example,dc=com:00000155c619858e384900000002;
    
  5. Check the replication status again and observe there are no Missing Changes:
    $ ./dsreplication status --adminUID admin --adminPassword password --hostname ds1.example.com --port 4444 --trustAll
    Thu Jul  7 10:26:12 MDT 2016
    Suffix DN         : Server               : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
    ------------------:----------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
    dc=example,dc=com : ds1.example.com:4444 : 2001    : true                : 14409 : 5070  : 8989        : 0        :              : true
    dc=example,dc=com : ds1.example.com:5444 : 2001    : true                : 26696 : 3946  : 9989        : 0        :              : true
    
    Applying the same criteria as above, all CSNs have been replayed from Master 1 to Master 2, so again we have proof that the data is consistent between Master 1 and Master 2.
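
The CSNs in the walkthrough above can be taken apart to cross-check them against the changelog entries. The following Python sketch assumes a 28-character hexadecimal CSN made up of a 16-digit timestamp (milliseconds since the Unix epoch), a 4-digit server ID and an 8-digit sequence number. This layout is an assumption inferred from the example values, not something this article states, and parse_csn is a hypothetical helper name:

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: datetime  # when the change was made (UTC)
    server_id: int       # ID of the replica that originated the change
    sequence: int        # sequence number assigned by that replica


def parse_csn(csn: str) -> CSN:
    """Split a CSN into its fields (assumed layout: 16 + 4 + 8 hex digits)."""
    if len(csn) != 28:
        raise ValueError(f"expected 28 hex characters, got {len(csn)}")
    millis = int(csn[0:16], 16)      # milliseconds since the Unix epoch
    server_id = int(csn[16:20], 16)
    sequence = int(csn[20:28], 16)
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return CSN(epoch + timedelta(milliseconds=millis), server_id, sequence)


if __name__ == "__main__":
    # The two CSNs from the ADD and MODIFY changelog entries above.
    for value in ("00000155c6188be0384900000001", "00000155c619858e384900000002"):
        print(parse_csn(value))
```

For the ADD above, this yields server ID 14409 (the replicaIdentifier in the changelog entry and the DS ID of the first master in the dsreplication status output), sequence number 1 and a timestamp that matches the changeTime of 20160707160225Z.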

Identifying replication issues from the DS/OpenDJ log files

The following table shows error messages you may see in your logs along with what they mean and possible resolutions:

Error Meaning / resolution
dn="entryuuid=bfbbd0fd-53ba-451f-93a1-2f446f4de18+uid=user1,dc=example,dc=com"

This entry indicates a naming conflict (that is, the DN contains an additional entryuuid RDN). This can happen when conflicting changes are applied to two servers at the same time, in which case replication creates a duplicate entry.

It can also occur as a result of the following known issue: OPENDJ-3343 (Invalid Conflict resolution on Add sequence when Parent & Child are added on different replica).

This can be resolved by locating the duplicate entries on both servers and then deleting the entry that was modified first. DS/OpenDJ understands that you are deleting a conflicting entry and cleans up after itself.

category=SYNC severity=MILD_ERROR msgID=14876739 msg=Could not replay operation AddOperation(connID=-1, opID=72, dn=uid=user1,ou=People,dc=example,dc=com) with ChangeNumber 00000148a9d35134620200000002 error Unwilling to Perform There is not enough space on the disk for the database to perform the write operation"

The server has low disk space, which prevents write operations from taking place.

This can be resolved by increasing or freeing up disk space. Once space is available, replication will resume with the next set of changes, although you will be missing the changes indicated in this error as they will have been skipped.

You should avoid running out of disk space by utilizing the disk space monitoring tools detailed in Administration Guide › Managing Directory Data › Setting Disk Space Thresholds For Database Backends.

category=SYNC severity=SEVERE_WARNING msgID=14811232 msg=Directory server DS(42134) has connected to replication server RS(42706) for domain "cn=admin data" at ds1.example.com/198.51.100.0:8080, but the generation IDs do not match, indicating that a full re-initialization is required. The local (DS) generation ID is 36215293 and the remote (RS) generation ID is 172193
The generation ID contained in the restored data is not the same as the one in the current replication topology. See Generation IDs do not match error after restoring a DS/OpenDJ (All versions) replica for further information on resolving this.
category=SYNC severity=INFORMATION msgID=14680180 msg=Late monitor data received for domain "cn=schema" from replication server RS(1797), and will be ignored
category=SYNC severity=SEVERE_WARNING msgID=14811242 msg=Timed out waiting for monitor data for the domain "dc=example,dc=com" from replication server RS(1797)

The other server is not responding quickly enough with monitoring information or there is a heavy load on the server.

This can be caused by an underlying network issue but is typically related to high levels of Garbage Collection (GC) and can be resolved by tuning your JVM. See Best practice for JVM Tuning for further information.

category=SYNC severity=SEVERE_ERROR msgID=14841194 msg=Replication server caught exception while listening for client connections Read timed out
This entry indicates network issues or there is a heavy load on the server.
category=SYNC severity=NOTICE msgID=14837116 msg=New replication connection from 198.51.100.0 started with unexpected message StopMsg and is being closed
The RS exits silently in a misconfigured network. This is a known issue, which is fixed in DS 5: OPENDJ-3309 (Replication server connection listener thread exits silently).
category=SYNC severity=ERROR msgID=org.opends.messages.replication.178 msg=Directory server 8764 was attempting to connect to replication server 1245 but has disconnected in handshake phase
category=SYNC severity=SEVERE_ERROR msgID=14942387 msg=Replication server 1797 was attempting to connect to replication server ds1.example.com/198.51.100.0:8989 but has disconnected in handshake phase

The DS cannot connect to the RS. See Directory server 1 was attempting to connect to replication server 2 but has disconnected in handshake phase error in DS 5 and OpenDJ 2.6.x, 3.0, 3.5, 3.5.1 for further information on resolving this.

The underlying issue is: OPENDJ-1135 (DS sometimes fail to connect to RS after server restart), which is fixed in OpenDJ 3.5.2.

The addition of the IP address in this message can suggest that the hostname cannot be resolved to the IP address and, in many cases, is trying to connect to itself. DS/OpenDJ can be a DS+RS so depending upon your topology, the DS is connecting to the RS if it's on the same instance/system. If you believe the issue is related to the IP address, you can resolve this by ensuring each server can connect to each other's replication port via the hostname used and also that the hostnames resolve to IP addresses physically present on the servers. You should also ensure that you consistently use either FQDNs or IP addresses for hostnames.

category=SYNC severity=ERROR msgID=org.opends.messages.replication.11 msg=The replication server failed to start because the database /path/to/ds/changelogDb could not be read : Could not get or create replica DB for baseDN 'dc=example,dc=com', serverId '10566', generationId '72390'

The server fails to compute the newest record after being restarted. See Replication server fails to start after starting an OpenDJ 3 instance for further information on resolving this.

The underlying issue is: OPENDJ-2969 (changelogDb could not be read on OpenDJ instance startup), which is fixed in OpenDJ 3.5.
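
When an errors log contains many replication messages, it can help to count them by severity and message ID before looking each one up in the table above. The following Python sketch relies only on the category=, severity=, msgID= and msg= fields visible in the examples; the function name and the idea of reading lines from a file are illustrative assumptions:

```python
import re
from collections import Counter

# Matches the "category=... severity=... msgID=... msg=..." fields
# that appear in the example error messages.
SYNC_LINE = re.compile(
    r"category=(?P<category>\S+)\s+severity=(?P<severity>\S+)\s+"
    r"msgID=(?P<msg_id>\S+)\s+msg=(?P<msg>.*)"
)


def summarize_sync_messages(lines):
    """Count replication (category=SYNC) messages by (severity, msgID)."""
    counts = Counter()
    for line in lines:
        match = SYNC_LINE.search(line)
        if match and match.group("category") == "SYNC":
            counts[(match.group("severity"), match.group("msg_id"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        'category=SYNC severity=SEVERE_WARNING msgID=14811242 msg=Timed out '
        'waiting for monitor data for the domain "dc=example,dc=com" from '
        'replication server RS(1797)',
        'category=SYNC severity=SEVERE_ERROR msgID=14841194 msg=Replication '
        'server caught exception while listening for client connections '
        'Read timed out',
    ]
    for (severity, msg_id), count in summarize_sync_messages(sample).most_common():
        print(f"{count:5d}  {severity}  msgID={msg_id}")
```

To run it against a real log, pass an open file object instead of the sample list; lines that do not contain these fields are ignored.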

Identifying replication issues in the embedded DS/OpenDJ

The following error in the AM/OpenAM Configuration log indicates replication is failing in the embedded DS/OpenDJ:

ERROR: EmbeddedOpenDS:syncReplication:cmd failed

OpenAM 13 and 12.x

In OpenAM 13 and earlier, you must ensure the hostname and port details used for the embedded DS/OpenDJ during install exactly match the details you use when re-enabling replication (including the case used for the hostname); otherwise, replication will fail if you make any subsequent changes to the embedded DS/OpenDJ configuration. See OPENAM-8254 (Uppercase characters in server URL hostnames break embedded replication) and OPENAM-11263 (Improve Embedded DJ logging to pick up replication configuration errors.) for further information. A mismatch in these connection details when re-enabling replication triggers AM/OpenAM to run its clean up process, which removes the replication servers on startup. At this point, replication threads are shut down and replication fails.

Recovering replication

You can quickly recover replication by restoring a backup or you can initialize from a known good node. However, you should ensure you are not restoring a backup that has a corrupted database as this will restore the corrupted database as well. See the following articles for further information on recovering replication in different situations:

Stopping replication

Warning

Do not allow modifications on the DS server when replication is disabled, as no record of such changes is kept, and the changes cause replication to diverge.

The way in which you stop replication varies depending on which version you are using as follows:

DS 5 and later

The following dsreplication commands are used to stop replication; you should ensure you use the correct command for your use case as follows:

  • dsreplication suspend: this command is used to temporarily stop replication, for example, to do maintenance etc.
  • dsreplication unconfigure: this command is used to permanently stop replication and completely removes the replication configuration information from the server.
  • dsreplication unconfigure --unconfigureAll: this command is used to fully remove the local server's replication configuration from itself and all other servers in the topology.

Using the wrong command can cause issues with replication.

See Reference › Tools Reference › dsreplication and Administration Guide › Managing Data Replication › Stopping Replication for further information and examples.

OpenDJ 3.5.x and earlier

The dsconfig and dsreplication disable commands are used to stop replication, but you should ensure you use the correct command for your use case as follows:

  • dsconfig set-synchronization-provider-prop: this command is used to temporarily stop replication, for example, to do maintenance etc.
  • dsreplication disable --disableAll: this command is used to fully remove the local server's replication configuration from itself and all other servers in the topology.

Using the wrong command can cause issues with replication.

See OpenDJ Administration Guide > Stopping Replication for further information and examples.

See Also

FAQ: Monitoring DS/OpenDJ

How do I use cn=monitor entry in DS/OpenDJ (All versions) for monitoring?

FAQ: Backup and restore in DS/OpenDJ

Replication in DS/OpenDJ

Administration Guide › Understanding Directory Services › About Replication 

Administration Guide › Managing Data Replication › Configuring Replication

Administration Guide › Managing Data Replication

Reference › Tools Reference › dsreplication 

Related Training

ForgeRock Directory Services Core Concepts 

Related Issue Tracker IDs

OPENAM-11263 (Improve Embedded DJ logging to pick up replication configuration errors.) 

OPENAM-8254 (Uppercase characters in server URL hostnames break embedded replication) 

OPENDJ-3337 (dsreplication status on a DS, shows a DS+RS missing after the DS+RS is disabled/enabled.)

OPENDJ-3343 (Invalid Conflict resolution on Add sequence when Parent & Child are added on different replica)

OPENDJ-3309 (Replication server connection listener thread exits silently)

OPENDJ-3133 (dsreplication status reports M.C. (Missing Changes) when none exist.)

OPENDJ-2969 (changelogDb could not be read on OpenDJ instance startup)

OPENDJ-1577 (Server should stop all DB-related activity when disk full is detected)

OPENDJ-1135 (DS sometimes fail to connect to RS after server restart)

OPENDJ-251 (Provide count of unresolved replication naming conflicts as part of the Monitoring information)

OPENDJ-49 (Replication replay does not take into consideration the server/backend's writability mode.)


How do I find replication conflicts in DS/OpenDJ (All versions)?

The purpose of this article is to provide information on searching for replication conflicts in DS/OpenDJ so that you can fix them. A common error associated with replication conflicts is "Plug-in org.forgerock.openam.idrepo.ldap.DJLDAPv3Repo encountered a ldap exception. ldap errorcode=95" which indicates multiple matching entries.

Background information

Replication conflicts occur when incompatible changes are applied concurrently to multiple read-write replicas. There are two types of conflict possible:

  • Modify conflicts - these involve concurrent modifications to the same entry.
  • Naming conflicts - these involve other operations that affect the DN of the entry.

Replication can resolve modify conflicts and most naming conflicts without intervention. However, the following types of naming conflicts cannot be resolved during replication and must be fixed manually:

  • Different entries that share the same DN are added concurrently on multiple replicas.
  • An entry on one replica is moved (renamed) to use the same DN as a new entry concurrently added on another replica.
  • A parent entry is deleted on one replica while a child entry is added or renamed concurrently on another replica.

This is a brief summary of the information documented in the Administration Guide › Resolving Replication Conflicts.

Multiple matching entries

The following error is commonly seen when there are multiple matching entries:

Plug-in org.forgerock.openam.idrepo.ldap.DJLDAPv3Repo encountered a ldap exception. ldap errorcode=95

This error can occur in a variety of scenarios, including (but not limited to) the following common examples:

Finding replication conflicts

You can use the logs to identify all types of replication issues; this is covered in detail in How do I troubleshoot replication issues in DS/OpenDJ (All versions)? along with information on monitoring replication to identify issues quickly. However, if you know you have naming conflicts, you can run an ldapsearch command on ds-sync-conflict to identify the specific entries.

Note

You must run the ldapsearch command on each node since replication conflict entries can differ across nodes. Alternatively, you can resolve the replication conflicts on one node and re-initialize the remaining nodes.

Example ldapsearch command:

$ ./ldapsearch --bindDN "cn=Directory Manager" --bindPassword password --hostname ds1.example.com --port 1389 --trustAll --baseDN "dc=example,dc=com" "(ds-sync-conflict=*)" ds-sync-conflict > replication-conflict-entries.out

The naming conflicts written to the file would look similar to the output below (where the addition of entryuuid=[string]+ signifies a naming conflict):

entryuuid=bfbbd0fd-42f4-4d54-b0b2-69b32233cec9+cn=jdoe,ou=group,ou=employees,dc=example,dc=com
entryuuid=c49df422-5f96-ea2b-85a2-d921eb0c3309+uid=user1,dc=example,dc=com
entryuuid=eb8ad149-aa61-ea2b-8b0f-935663bcd8ed+uid=user74,dc=example,dc=com
...

You should resolve the naming conflicts as described in the documentation: Administration Guide › Resolving Replication Conflicts.
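
Before resolving the conflicts, it can be useful to see which original DN each conflict entry relates to. The following Python sketch splits a conflict DN of the form shown above into its entryuuid value and the remaining DN. The function name is hypothetical, and the sketch assumes each DN sits on a single line (optionally prefixed with "dn: "), which may not hold for long or base64-encoded DNs in real ldapsearch output:

```python
import re

# A conflict DN starts with an extra "entryuuid=<uuid>+" component.
CONFLICT_DN = re.compile(
    r"^(?:dn:\s*)?entryuuid=(?P<uuid>[0-9a-f-]+)\+(?P<dn>.+)$",
    re.IGNORECASE,
)


def split_conflict_dn(line: str):
    """Return (entryuuid, original_dn) for a conflict DN, or None otherwise."""
    match = CONFLICT_DN.match(line.strip())
    if match is None:
        return None
    return match.group("uuid"), match.group("dn")


if __name__ == "__main__":
    sample = [
        "entryuuid=c49df422-5f96-ea2b-85a2-d921eb0c3309+uid=user1,dc=example,dc=com",
        "entryuuid=eb8ad149-aa61-ea2b-8b0f-935663bcd8ed+uid=user74,dc=example,dc=com",
    ]
    for line in sample:
        uuid, original_dn = split_conflict_dn(line)
        print(f"{original_dn}  (conflict entryuuid {uuid})")
```

Running the same script against the output from each node makes it easier to compare which entries are in conflict on which server.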

See Also

How do I use the Access log to troubleshoot DS/OpenDJ (All versions)?

Troubleshooting DS/OpenDJ

LDAP Schema Reference › ds-sync-conflict

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-3343 (Invalid Conflict resolution on Add sequence when Parent & Child are added on different replica)


How do I use the Access log to troubleshoot DS/OpenDJ (All versions)?

The purpose of this article is to provide information on understanding the Access log to help you troubleshoot DS/OpenDJ.

Overview

This article provides information to help you understand the Access log and use it to troubleshoot DS/OpenDJ. The following topics are covered:

You can also analyze operation performance in the log files using a tool called slowops; the source for the slowops tool can be found here: GitHub: OpenDJ Utilities - slowops.

Log details

In DS 5, the default log format changed to JSON, with each entry providing details of the request (REQ) and the response (RES). Earlier versions of OpenDJ use a file-based log format. This article shows examples based on both the JSON and file-based log formats; each set of examples contains the same details to help you correlate the log formats.

In all versions of DS/OpenDJ, you can choose which log format to use: Administration Guide › Monitoring, Logging, and Alerts › Access Logs. The following table indicates which log formats are available depending on your version of DS/OpenDJ:

Access log format   DS 6.5          DS 6            DS 5.x          OpenDJ 3.5.x    OpenDJ 3
File-based          --              --              --              Yes (default)   Yes (default)
JSON                Yes (default)   Yes (default)   Yes (default)   --              --
CSV                 Yes             Yes             Yes             Yes             Yes
Syslog              Yes             Yes             Yes             Yes             Yes
JDBC                Yes             Yes             Yes             Yes             Yes
Elasticsearch       Yes             Yes             Yes             Yes             --
JMS                 Yes             Yes             Yes             --              --
Splunk              Yes             Yes             Yes             --              --
Standard Output     Yes             --              --              --              --

The log files are located in the following directory depending on whether you are using an embedded or external DS/OpenDJ:

  • Embedded: $HOME/[am_instance]/opends/logs
  • External: /path/to/ds/logs, where /path/to/ds is the directory where DS/OpenDJ is installed.

The default log files are named as follows depending on version:

  • DS 5 and later: the files are called ldap-access.audit.json.<datestamp>, for example: ldap-access.audit.json.20170722112257
  • Pre-DS 5: the files are called access.<datestamp>, for example, access.20170707085044Z

Identifying source of the change

You can identify whether a change was made locally or via replication by checking the logs as follows depending on your DS/OpenDJ version:

DS 5 and later

  • Local changes are made on a real connection as indicated by actual client/server addresses and the connection value ("connId":nnnn), for example:
    {"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},"server":{"ip":"198.51.100.0","port":1389},"request":{"protocol":"LDAP","operation":"ADD","connId":5799,"msgId":361,"dn":"uid=user1,ou=People,ou=employees,dc=example,dc=com"},"transactionId":"0"
    
  • Replicated changes are shown with internal IP addresses ("ip":"internal","port":-1), a negative connection value ("connId":-n) and a sync type operation ("opType":"sync"), for example:
    {"eventName":"DJ-LDAP","client":{"ip":"internal","port":-1},"server":{"ip":"internal","port":-1},"request":{"protocol":"internal","operation":"ADD","connId":-4,"msgId":361,"opType":"sync","dn":"uid=user101,ou=People,ou=employees,dc=example,dc=com"},"transactionId":"0"

See How do I troubleshoot replication issues in DS/OpenDJ (All versions)? for further information on troubleshooting any replication issues that are identified. 

Pre-DS 5

  • Local changes are made on a real connection (conn=nnnn), for example:
    [21/Jul/2017:15:47:01 +0000] ADD REQ conn=5799 op=210 msgID=361 dn="uid=user1,ou=People,ou=employees,dc=example,dc=com"
    
  • Replicated changes are shown with a conn=-1 (to indicate it's a replicated internal operation) and type=synchronization, for example:
    [21/Jul/2017:15:47:01 +0000] ADD REQ conn=-1 op=210 msgID=361 dn="uid=user101,ou=People,ou=employees,dc=example,dc=com" type=synchronization

See How do I troubleshoot replication issues in DS/OpenDJ (All versions)? for further information on troubleshooting any replication issues that are identified. 
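
The checks described above can be sketched in Python as a single function that handles both log formats. It treats a JSON entry as replicated when the request has "opType":"sync", and a file-based entry as replicated when it has conn=-1 together with type=synchronization. The function name is hypothetical, and the sketch expects complete JSON log lines (the JSON examples above are abbreviated and would need their closing brace to parse):

```python
import json
import re


def is_replicated(line: str) -> bool:
    """Return True if an access log line describes a replicated operation."""
    line = line.strip()
    if line.startswith("{"):
        # JSON format (DS 5 and later): replicated operations carry opType "sync".
        request = json.loads(line).get("request", {})
        return request.get("opType") == "sync"
    # File-based format (pre-DS 5): conn=-1 plus type=synchronization.
    return (
        re.search(r"\bconn=-1\b", line) is not None
        and "type=synchronization" in line
    )


if __name__ == "__main__":
    sample = (
        '[21/Jul/2017:15:47:01 +0000] ADD REQ conn=-1 op=210 msgID=361 '
        'dn="uid=user101,ou=People,ou=employees,dc=example,dc=com" '
        'type=synchronization'
    )
    print("replicated" if is_replicated(sample) else "local")
```

Filtering a log with this function gives a quick view of how much write traffic on a server originates from clients and how much arrives through replication.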

Identifying missing entries

Missing entries are indicated by a "statusCode":"32" or a result=32 in your logs (depending on DS/OpenDJ version). These log entries can indicate a genuine missing entry or have an innocent explanation (such as a client application testing to see if the entry exists by doing a read):

  • If the entry is really missing, you will need to add the entry or restructure your operation to use an entry that does exist.
  • If you have a misconfigured client application, for example pointing to the wrong base DN, you need to correct the configuration.
  • If you have a client application doing test reads, you can ignore these entries.

DS 5 and later

  • Search issue:
    "response":{"status":"FAILED","statusCode":"32","elapsedTime":2,"elapsedTimeUnits":"MILLISECONDS","detail":"The search base entry 'ou=agent1,ou=default,ou=1.0,ou=AgentService,ou=services,dc=example,dc=com' does not exist","nentries":0},"timestamp":"2017-07-21T15:04:53.850Z","_id":"2fff31b5-4682-4760-b32a-fbbfed6bd9ac-433"}
  • Add issues:
    "response":{"status":"FAILED","statusCode":"32","elapsedTime":1,"elapsedTimeUnits":"MILLISECONDS","detail":"The provided entry entryuuid=f3182f2c-0ed1-83e1-7d61-4f1c712e7fd2+ou=People,o=employees cannot be added because its suffix is not defined as one of the suffixes within the Directory Server"},"timestamp":"2017-07-21T15:04:53.850Z","_id":"2fff31b5-4682-4760-b32a-fbbfed6bd9ac-433"}
    "response":{"status":"FAILED","statusCode":"32","elapsedTime":0,"elapsedTimeUnits":"MILLISECONDS","detail":"Entry uid=jdoe,ou=People,ou=employees,dc=example,dc=com cannot be added because its parent entry ou=People,ou=employees,dc=example,dc=com does not exist in the server"},"timestamp":"2017-07-21T15:04:53.850Z","_id":"2fff31b5-4682-4760-b32a-fbbfed6bd9ac-433"}

Pre-DS 5

  • Search issue:
    [21/Jul/2017:15:04:51 +0800] SEARCH RES conn=189 op=213 msgID=14 result=32 message="The search base entry 'ou=agent1,ou=default,ou=1.0,ou=AgentService,ou=services,dc=example,dc=com' does not exist" nentries=0 etime=2
  • Add issues:
    [21/Jul/2017:15:04:51 +0800] ADD RES conn=-189 op=192 msgID=87 result=32 message="The provided entry entryuuid=f3182f2c-0ed1-83e1-7d61-4f1c712e7fd2+ou=People,o=employees cannot be added because its suffix is not defined as one of the suffixes within the Directory Server" etime=1
    [21/Jul/2017:15:04:51 +0800] ADD RES conn=189 op=176 msgID=29 result=32 message="Entry uid=jdoe,ou=People,ou=employees,dc=example,dc=com cannot be added because its parent entry ou=People,ou=employees,dc=example,dc=com does not exist in the server" etime=0
    

Understanding slow searches

Slow searches are indicated by high etimes in the access log and are often the result of unindexed searches. Etimes are typically measured in milliseconds, although you can configure the server to use nanoseconds instead. You can also learn other things about the search, which may be causing an issue.

See How do I troubleshoot issues with my indexes in DS/OpenDJ (All versions)? for further information on resolving index issues.

The following examples show some common log entries and their meanings/resolution.

Unindexed search

  • DS 5 and later:
    {"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},"server":{"ip":"198.51.100.0","port":1389},"request":{"protocol":"LDAP","operation":"SEARCH","connId":159078,"msgId":154,"dn":"ou=groups,dc=AMusers","scope":"wholeSubtree","filter":"(memberUid=JDoe)","attrs":["1.1"]},"transactionId":"0","response":{"status":"SUCCESSFUL","statusCode":"0","elapsedTime":11999,"elapsedTimeUnits":"MILLISECONDS","additionalItems":" unindexed","nentries":0},"timestamp":"2017-07-21T15:04:51.850Z","_id":"e3a44b25-4800-8cfa-1bad-6e2bb6461607-38233"}
  • Pre-DS 5:
    [21/Jul/2017:15:04:51 +0000] SEARCH REQ conn=159078 op=1403 msgID=154 base="ou=groups,dc=AMusers" scope=wholeSubtree filter="(memberUid=JDoe)" attrs="1.1"
    [21/Jul/2017:15:04:51 +0000] SEARCH RES conn=159078 op=1403 msgID=154 result=0 nentries=0 unindexed etime=11999

The search is shown as unindexed and has a high etime value (etime values for indexed searches should typically be around 2 or 3 milliseconds). A high etime value on its own is also an indicator that a search may be unindexed.

This could be resolved by appropriate indexing as detailed in Unindexed searches causing slow searches and poor performance on DS/OpenDJ (All versions) server.
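
In the file-based format, the filter is on the SEARCH REQ line while the unindexed flag and etime are on the matching SEARCH RES line, so the two need to be paired by conn and op. The following Python sketch shows one way to do that; the function name is hypothetical and the regular expressions only cover the fields visible in the example above:

```python
import re

REQ = re.compile(
    r'SEARCH REQ conn=(?P<conn>-?\d+) op=(?P<op>\d+) .*?filter="(?P<filter>[^"]*)"'
)
RES = re.compile(
    r"SEARCH RES conn=(?P<conn>-?\d+) op=(?P<op>\d+) .*?etime=(?P<etime>\d+)"
)


def unindexed_searches(lines):
    """Return (filter, etime) for each search whose RES line is flagged unindexed."""
    pending = {}   # (conn, op) -> filter from the REQ line
    results = []
    for line in lines:
        req = REQ.search(line)
        if req:
            pending[(req.group("conn"), req.group("op"))] = req.group("filter")
            continue
        res = RES.search(line)
        if res:
            # Always drop the pending request so the dictionary does not grow.
            search_filter = pending.pop((res.group("conn"), res.group("op")), None)
            if re.search(r"\bunindexed\b", line):
                results.append((search_filter, int(res.group("etime"))))
    return results


if __name__ == "__main__":
    sample = [
        '[21/Jul/2017:15:04:51 +0000] SEARCH REQ conn=159078 op=1403 msgID=154 '
        'base="ou=groups,dc=AMusers" scope=wholeSubtree '
        'filter="(memberUid=JDoe)" attrs="1.1"',
        "[21/Jul/2017:15:04:51 +0000] SEARCH RES conn=159078 op=1403 msgID=154 "
        "result=0 nentries=0 unindexed etime=11999",
    ]
    for search_filter, etime in unindexed_searches(sample):
        print(f"etime={etime} filter={search_filter}")
```

The filters reported this way indicate which attributes are candidates for indexing. If the REQ line falls in an earlier, rotated log file, the filter is reported as None.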

Unindexed search (objectclass=person filter)

  • DS 5 and later:
    {"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},"server":{"ip":"198.51.100.0","port":1389},"request":{"protocol":"LDAP","operation":"SEARCH","connId":159078,"msgId":135,"dn":"dc=example,dc=com","scope":"sub","filter":"(objectclass=person)","attrs":"cn,memberof,ismemberof"},"transactionId":"0","response":{"status":"SUCCESSFUL","statusCode":"0","elapsedTime":11999,"elapsedTimeUnits":"MILLISECONDS","additionalItems":" unindexed","nentries":500},"timestamp":"2017-07-21T15:04:51.850Z","_id":"e3a44b25-4800-8cfa-1bad-6e2bb6461607-38233"}
  • Pre-DS 5:
    [21/Jul/2017:15:04:51 +0000] SEARCH REQ conn=159078 op=254 msgID=135 base="dc=example,dc=com" scope=sub filter="(objectclass=person)" attrs="cn,memberof,ismemberof"
    [21/Jul/2017:15:04:51 +0000] SEARCH RES conn=159078 op=254 msgID=135 result=0 nentries=500 unindexed etime=11999

This example also indicates the search is unindexed.

Additionally, you can see the search is using the objectclass=person filter. It is very likely that all user entries have an objectclass=person value, which would take the IDs maintained by the objectclass=person index over the index-entry-limit. This means all subsequent searches would be forced to search through the entire backend database to find the search candidates. These searches only hit the objectclass=person index; the requested return attributes listed do not touch any indexes as the values for these attributes are returned by DS/OpenDJ.

This could be resolved by appropriate indexing and also by reconstructing the search to avoid using the objectclass=person filter.

Unindexed search (objectclass=* complex filter)

  • DS 5 and later:
    {"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},"server":{"ip":"198.51.100.0","port":1389},"request":{"protocol":"LDAP","operation":"SEARCH","connId":159078,"msgId":135,"dn":"dc=example,dc=com","scope":"sub","filter":"(|(objectclass=*)(myAttr=12345))","attrs":"cn,uid,mail"},"transactionId":"0","response":{"status":"SUCCESSFUL","statusCode":"0","elapsedTime":20379,"elapsedTimeUnits":"MILLISECONDS","additionalItems":" unindexed","nentries":500},"timestamp":"2017-07-21T15:04:51.850Z","_id":"e3a44b25-4800-8cfa-1bad-6e2bb6461607-38233"}
  • Pre-DS 5:
    [21/Jul/2017:15:04:51 +0000] SEARCH REQ conn=159078 op=254 msgID=135 base="dc=example,dc=com" scope=sub filter="(|(objectclass=*)(myAttr=12345))" attrs="cn,uid,mail"
    [21/Jul/2017:15:04:51 +0000] SEARCH RES conn=159078 op=254 msgID=135 result=0 nentries=500 unindexed etime=20379

This example also indicates the search is unindexed.

objectclass=* is a presence filter, meaning it matches all objectclass values. Since all LDAP entries have at least one objectclass, this type of search will always be slow and come back unindexed unless your directory is tiny (fewer than 4k entries); above this size, searches for objectclass=* will always be unindexed.

To further troubleshoot these types of issue, you may also need to log internal operations. You can enable this logging using the following dsconfig command:

$ ./dsconfig set-log-publisher-prop --publisher-name "File-Based Access Logger" --set suppress-internal-operations:false --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password

High etime

  • DS 5 and later:
    "response":{"status":"SUCCESSFUL","statusCode":"0","elapsedTime":1187,"elapsedTimeUnits":"MILLISECONDS","nentries":1},"timestamp":"2017-07-21T15:04:51.850Z","_id":"e3a44b25-4800-8cfa-1bad-6e2bb6461607-38233"}
  • Pre-DS 5:
    [21/Jul/2017:15:04:51 -0000] SEARCH RES conn=2220857 op=1335 msgID=1136 result=0 nentries=1 etime=1187

This example has a high etime but does not state it is unindexed. Unindexed only appears in the access log if no indexes were used to process the search, but the high etime still suggests a partially unindexed search.

This could be resolved by appropriate indexing.
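
For the JSON format, slow searches can be listed by reading elapsedTime from the response, whether or not the entry is flagged as unindexed. The following Python sketch does this for complete JSON log lines; the function name and the 1000 millisecond default threshold are illustrative assumptions, and entries logged in a unit other than milliseconds are skipped:

```python
import json


def slow_searches(lines, threshold_ms=1000):
    """Yield (elapsed_ms, unindexed, filter) for SEARCH entries at or above threshold_ms."""
    for line in lines:
        entry = json.loads(line)
        request = entry.get("request", {})
        response = entry.get("response", {})
        if request.get("operation") != "SEARCH":
            continue
        if response.get("elapsedTimeUnits") != "MILLISECONDS":
            continue  # this sketch only handles the default unit
        elapsed = response.get("elapsedTime", 0)
        if elapsed >= threshold_ms:
            # The unindexed flag, when present, is in "additionalItems".
            unindexed = "unindexed" in response.get("additionalItems", "")
            yield elapsed, unindexed, request.get("filter")


if __name__ == "__main__":
    sample = (
        '{"request":{"operation":"SEARCH","connId":159078,"msgId":154,'
        '"filter":"(memberUid=JDoe)"},'
        '"response":{"status":"SUCCESSFUL","statusCode":"0","elapsedTime":11999,'
        '"elapsedTimeUnits":"MILLISECONDS","additionalItems":" unindexed",'
        '"nentries":0}}'
    )
    for elapsed, unindexed, search_filter in slow_searches([sample]):
        print(elapsed, "unindexed" if unindexed else "indexed or partial", search_filter)
```

Entries reported with a high elapsed time but without the unindexed flag correspond to the "High etime" case above and are worth checking for partially unindexed filters.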

Understanding slow write operations

Slow write operations can cause their own issues, but if they are too slow, they can eventually time out, causing the write to fail. The following example shows entry locks that prevent the write taking place:

  • DS 5 and later:
    {"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},"server":{"ip":"198.51.100.0","port":1389},"request":{"protocol":"LDAP","operation":"MODIFY","connId":2131,"msgId":47,"dn":"uid=jdoe,ou=People,ou=employees,dc=example,dc=com"},"transactionId":"0","response":{"status":"FAILED","statusCode":"51","elapsedTime":9000,"elapsedTimeUnits":"MILLISECONDS","detail":"Entry uid=jdoe,ou=People,ou=employees,dc=example,dc=com cannot be modified because the server failed to obtain a write lock for this entry after multiple attempts"},"timestamp":"2017-08-08T14:22:43.683Z","_id":"c2a5dbd3-4960-b7d2-96e2-20370bf25b31-737"}
  • Pre-DS 5:
    [08/Aug/2017:14:22:43 -0000] MODIFY conn=2131 op=5428 msgID=47 dn="uid=jdoe,ou=People,ou=employees,dc=example,dc=com" result=51 message="Entry uid=jdoe,ou=People,ou=employees,dc=example,dc=com cannot be modified because the server failed to obtain a write lock for this entry after multiple attempts" etime=9000

These types of errors indicate potential contention issues with clients updating the same entries at the same time. If one client updates an entry on a system that is highly loaded and then another client tries to update the same entry, you will see these write errors.

Expensive indexes

Some index types (for example, substring) are expensive for the server to maintain, particularly with regard to write performance. Indexes should be configured for the types of searches specific to your client application's needs and unnecessary indexes should be avoided. See the following links for further information:

Static groups

If you see writes taking place against static groups, for example:

  • DS 5 and later:
    {"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},"server":{"ip":"198.51.100.0","port":1389},"request":{"protocol":"LDAP","operation":"MODIFY","connId":2131,"msgId":3,"dn":"cn=Admin1,ou=groups,ou=People,o=Test"},"transactionId":"0","response":{"status":"SUCCESSFUL","statusCode":"0","elapsedTime":26495,"elapsedTimeUnits":"MILLISECONDS"},"timestamp":"2017-05-23T09:56:50.683Z","_id":"c2a5dbd3-4960-b7d2-96e2-20370bf25b31-737"}
  • Pre-DS 5:
    [23/May/2017:09:56:50 -0000] MODIFY REQ conn=2131 op=2 msgID=3 dn="cn=Admin1,ou=groups,ou=People,o=Test"
    [23/May/2017:09:56:50 -0000] MODIFY RES conn=2131 op=2 msgID=3 result=0 etime=26495 

You need to ask yourself the following questions to help understand why these write operations are not as fast as expected:

  • How big is this static group? That is, how many uniqueMembers does it contain?
  • How is the application doing the MODIFY operation on these static groups? Is it replacing the whole group or just adding a single member?
  • Are you performing any isMemberOf searches while the group modifications are taking place? 
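
To find the slow writes in the first place, you can filter the JSON access log by elapsed time. This is a rough sketch that relies only on the request and response fields visible in the DS 5 example above; the function name and the 1000 ms threshold are assumptions chosen for illustration:

```python
import json

def slow_operations(lines, threshold_ms=1000):
    """Yield (operation, dn, elapsedTime) for entries at or above threshold_ms."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        request = event.get("request", {})
        response = event.get("response", {})
        elapsed = response.get("elapsedTime", 0)
        # Only compare values that are reported in milliseconds.
        if response.get("elapsedTimeUnits") == "MILLISECONDS" and elapsed >= threshold_ms:
            yield request.get("operation"), request.get("dn"), elapsed

# Abridged from the DS 5 example above.
sample = (
    '{"eventName":"DJ-LDAP","request":{"protocol":"LDAP","operation":"MODIFY",'
    '"connId":2131,"msgId":3,"dn":"cn=Admin1,ou=groups,ou=People,o=Test"},'
    '"response":{"status":"SUCCESSFUL","statusCode":"0","elapsedTime":26495,'
    '"elapsedTimeUnits":"MILLISECONDS"}}'
)

for operation, dn, elapsed in slow_operations([sample]):
    print(operation, dn, elapsed)
```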

Finding out more about the operation

Once you have identified your issue, you can grep on a specific connId or conn value to discover more about that operation to help pinpoint the issue. For example:

Identify the origin of the call

Grepping for CONNECT in your logs for a specific connId or conn value tells you where the call originated; in these examples, the call came from the 198.51.100.0 IP address:

  • DS 5 and later:
    grep 'connId":159078' ldap-access.audit.* | grep CONNECT
    
    ldap-access.audit.json.20170722112257:{"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},"server":{"ip":"198.51.100.0","port":1389},"request":{"protocol":"LDAP","operation":"CONNECT","connId":159078},"transactionId":"0","response":{"status":"SUCCESSFUL","statusCode":"0","elapsedTime":0,"elapsedTimeUnits":"MILLISECONDS"},"timestamp":"2017-07-21T00:38:33.437Z","_id":"3a6d298f-f9e7-4bbf-bc03-0b2c82224c15-1961"}
    
  • Pre-DS 5:
    grep conn=159078 access.* | grep CONNECT
    
    access.201704221607Z:[22/Apr/2017:11:28:43 +0000] CONNECT conn=159078 from=198.51.100.0:8080 to=198.51.100.0:1389 protocol=LDAP
    

Identify who made the call

Grepping for BIND or 'BIND REQ' in your logs for a specific connId or conn value tells you about the authentication request; in these examples, the call was made by cn=Directory Manager:

  • DS 5 and later:
    grep 'connId":159078' ldap-access.audit.* | grep BIND
    
    ldap-access.audit.json.20170722112257:{"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},"server":{"ip":"198.51.100.0","port":1389},"request":{"protocol":"LDAP","operation":"BIND","connId":159078,"msgId":1,"version":"3","dn":"cn=Directory Manager","authType":"SIMPLE"},"transactionId":"1f80c37c-436a-800f-9528-33ec2b6707be-0/16","response":{"status":"SUCCESSFUL","statusCode":"0","elapsedTime":37,"elapsedTimeUnits":"MILLISECONDS"},"userId":"cn=Directory Manager,cn=Root DNs,cn=config","timestamp":"2017-07-21T00:38:33.437Z","_id":"1f80c37c-800f-436a-9528-33ec2b6707be-661"}
    
  • Pre-DS 5:
    grep conn=159078 access.* | grep 'BIND REQ'
    
    access.201704221607Z:[22/Apr/2017:11:28:43 +0000] BIND REQ conn=159078 op=0 msgID=1 version=3 type=SIMPLE dn="cn=Directory\ Manager"
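
If you prefer to collect both answers in one pass, the two grep steps can be combined. The sketch below assumes the DS 5 JSON log format shown above and reads the log lines directly (without the file name prefix that grep adds); describe_connection is a hypothetical helper name:

```python
import json

def describe_connection(lines, conn_id):
    """Return the client address and bind DN recorded for one connection."""
    info = {"client": None, "bind_dn": None}
    for line in lines:
        event = json.loads(line)
        request = event.get("request", {})
        if request.get("connId") != conn_id:
            continue
        if request.get("operation") == "CONNECT":
            client = event.get("client", {})
            info["client"] = f'{client.get("ip")}:{client.get("port")}'
        elif request.get("operation") == "BIND":
            info["bind_dn"] = request.get("dn")
    return info

# Abridged from the DS 5 examples above.
sample_lines = [
    '{"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},'
    '"request":{"protocol":"LDAP","operation":"CONNECT","connId":159078},'
    '"response":{"status":"SUCCESSFUL","statusCode":"0"}}',
    '{"eventName":"DJ-LDAP","client":{"ip":"198.51.100.0","port":8080},'
    '"request":{"protocol":"LDAP","operation":"BIND","connId":159078,"msgId":1,'
    '"version":"3","dn":"cn=Directory Manager","authType":"SIMPLE"},'
    '"response":{"status":"SUCCESSFUL","statusCode":"0"}}',
]

print(describe_connection(sample_lines, 159078))
```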
    

See Also

Unindexed searches causing slow searches and poor performance on DS/OpenDJ (All versions) server

How do I troubleshoot issues with my indexes in DS/OpenDJ (All versions)?

How do I use the Support Extract tool in DS 5.x, 6 and OpenDJ 3.x to capture troubleshooting data?

How do I troubleshoot replication issues in DS/OpenDJ (All versions)?

How do I change the location of log files for DS/OpenDJ (All versions)?

How do I configure DS (All versions) and OpenDJ 3.x to use the Syslog audit event handler?

Troubleshooting DS/OpenDJ

Administration Guide › Monitoring, Logging, and Alerts › Access Logs

Administration Guide

Related Training

N/A

Related Issue Tracker IDs

OPENAM-11257 (Numerous "result=32" entries in the OpenDJ's access log whenever OpenAM restarts)


How do I understand the changelogDb directory in DS/OpenDJ (All versions)?

The purpose of this article is to provide information to help you understand the changelogDb directory in DS/OpenDJ. The format of this directory differs depending on which version you are using.

Warning

Do not compress, tamper with, or otherwise alter changelog database files directly unless specifically instructed to do so by a qualified ForgeRock technical support engineer. External changes to changelog database files can render them unusable by the server. By default, changelog database files are located under the /path/to/ds/changelogDb directory.

Overview

The changelogDb directory changed substantially in OpenDJ 3.x. The JE backend used in OpenDJ 3 differs from the old 'local-db' JE format that OpenDJ 2.6.x used to store all changes. See OpenDJ 3 Release Notes › What's New In OpenDJ › New Features for further information on these changes.

The following sections explain the structure of the directory depending on which version of DS/OpenDJ you are using.

See How do I search and view the changelog records in DS/OpenDJ (All versions)? for information on querying the changelog files (cn=changelog).

Change Sequence Number (CSN)

The CSN is used to track changes via replication and the changelog. The CSN is an encoded value that represents the date and time, the Directory Server's (DS) server ID and the change number for that timestamp. 

The CSN consists of the following information:

  • The first 16 hexadecimal digits are a timestamp (milliseconds since the Unix epoch).
  • The next 4 hexadecimal digits make up the replica-id (DS ID).
  • The last 8 hexadecimal digits are the sequence number that identifies the change.

For example, with a CSN of 00000155a8bd5aa3271000000002, you can decode this using the decodecsn tool to give the following output:

CSN 00000155a8bd5aa3271000000002
 -> ts=00000155a8bd5aa3 (1467414829731) Fri Jul  1 2016 17:13:49.731
 id=2710 (10000)
 no=00000002 (2)

This gives you the timestamp of the change, the ID of the DS server where the change was made (10000) and the sequence number.
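
If the decodecsn tool is not to hand, the same split can be reproduced in a few lines. The following is an illustrative sketch based only on the field layout described above; decode_csn is a hypothetical helper, not a DS/OpenDJ API. It reports the time in UTC, whereas the decodecsn output above appears to use a local time zone:

```python
from datetime import datetime, timedelta, timezone

def decode_csn(csn):
    """Split a 28-character CSN into timestamp, replica ID and sequence number."""
    if len(csn) != 28:
        raise ValueError("expected a 28-character CSN")
    millis = int(csn[:16], 16)  # 16 hex digits: milliseconds since the Unix epoch
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return {
        "timestamp_ms": millis,
        "time_utc": epoch + timedelta(milliseconds=millis),
        "replica_id": int(csn[16:20], 16),  # 4 hex digits: DS ID
        "sequence": int(csn[20:], 16),      # 8 hex digits: sequence number
    }

decoded = decode_csn("00000155a8bd5aa3271000000002")
print(decoded["time_utc"].isoformat(), decoded["replica_id"], decoded["sequence"])
```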

See How do I troubleshoot replication issues in DS/OpenDJ (All versions)? (Using the CSN to troubleshoot data consistency) for further information on the CSN.

Understanding the changelogDb directory (DS 5 and later; OpenDJ 3.x)

A typical changelogDb directory in DS 5 and later, and OpenDJ 3.x is as follows:

/path/to/ds/changelogDb:

total 8
drwxr-xr-x   7 opendj   staff  238 Nov 24 09:22 .
drwxr-xr-x@ 29 opendj   staff  986 Nov 24 09:16 ..
drwxr-xr-x   4 opendj   staff  136 Nov 24 09:20 1.dom
drwxr-xr-x   4 opendj   staff  136 Nov 24 09:21 2.dom
drwxr-xr-x   4 opendj   staff  136 Nov 24 09:22 3.dom
drwxr-xr-x   3 opendj   staff  102 Nov 24 09:18 changenumberindex
-rw-r--r--   1 opendj   staff   48 Nov 24 09:22 domains.state

Where:

  • The .dom directories each represent a replicated domain in the changelog and contain the changes related to that replicated baseDN domain. See the Replicated domain directories example below for more information.
  • The changenumberindex directory contains a head.log file, which indexes the changes for the replicated domains that the changes originate in:
    /path/to/ds/changelogDb/changenumberindex:
    
    total 8
    drwxr-xr-x  3 opendj  opendj  102 Nov 24 09:18 .
    drwxr-xr-x  7 opendj  opendj  238 Nov 24 09:22 ..
    -rw-r--r--  1 opendj  opendj   44 Nov 24 09:21 head.log
    
  • The domains.state file contains a list of all replicated baseDN domains the changelog holds, for example:
    /path/to/ds/changelogDb$ cat domains.state 
    
    3:cn=admin data
    2:dc=example,dc=com
    1:cn=schema 
    These domain numbers correspond to the .dom directories above.
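
The mapping between domain numbers and .dom directories can be listed with a short read-only script. This is a sketch that assumes the number:baseDN line format shown in the example above; it only parses text you pass to it and never writes to the changelogDb directory:

```python
def parse_domains_state(text):
    """Map each domain number to its base DN."""
    domains = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first colon only; the base DN itself contains no leading number.
        number, _, base_dn = line.partition(":")
        domains[int(number)] = base_dn
    return domains

sample = "3:cn=admin data\n2:dc=example,dc=com\n1:cn=schema \n"

for number, base_dn in sorted(parse_domains_state(sample).items()):
    print(f"{number}.dom -> {base_dn}")
```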

Replicated domain directories example

In this example, we will look at the replicated domain for dc=example,dc=com (which we know is domain 2 from the domains.state file); this corresponds to the 2.dom directory:

/path/to/ds/changelogDb/2.dom:

total 0
drwxr-xr-x  3 opendj  opendj  102 Nov 24 09:21 22173.server
-rw-r--r--  1 opendj  opendj    0 Nov 24 09:21 generation19363676.id

This directory contains a file named after the generation ID for dc=example,dc=com, as well as a server directory: 22173.server. The number in the 22173.server directory name indicates that the changes it holds for dc=example,dc=com came into the DS with the DS ID of 22173, as shown by dsreplication status:

$ ./dsreplication status --adminUID admin --adminPasswordFile pass --hostname opendj.forgerock.com --port 4444 --trustAll

Suffix DN         : Server                    : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
------------------:---------------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
dc=example,dc=com : opendj.forgerock.com:4444 : 2020    : true                : 22173 : 31291 : 8989        : 0        :              : false
dc=example,dc=com : opendj.forgerock.com:5444 : 2020    : true                : 23188 : 365   : 9989        : 0        :              : false

Drilling down into the 22173.server directory, we see the following files:

/path/to/ds/changelogDb/2.dom/22173.server:

total 23464
-rw-r--r--  1 opendj  opendj  10486242 Nov 24 09:43 000001589724136a569d00000001_000001589738b791569d00005ca0.log
-rw-r--r--  1 opendj  opendj   1518252 Nov 24 09:44 head.log

Where:

  • head.log contains the current changes. When this file reaches 10MB, it rolls over into a new file that uses the naming convention described next.
  • 000001589724136a569d00000001_000001589738b791569d00005ca0.log contains older changes from timeX to timeY. The file name indicates the range of CSNs included in the file (firstCSN_lastCSN). There are likely to be multiple files named this way; the number of files depends on the number of changes being made in your system and the purge delay setting. 
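
Because the file name is simply firstCSN_lastCSN, you can work out which replica and what time span a rolled-over file covers without opening it. The sketch below assumes the CSN layout described earlier in this article (16 hex digits of timestamp, 4 of replica ID, 8 of sequence number); describe_log_file is a hypothetical helper name:

```python
def csn_parts(csn):
    """Return (timestamp in ms, replica ID, sequence number) for one CSN."""
    return int(csn[:16], 16), int(csn[16:20], 16), int(csn[20:], 16)

def describe_log_file(name):
    """Summarize the CSN range encoded in a firstCSN_lastCSN.log file name."""
    stem = name[:-len(".log")] if name.endswith(".log") else name
    first_csn, last_csn = stem.split("_")
    first_ms, first_id, first_seq = csn_parts(first_csn)
    last_ms, last_id, last_seq = csn_parts(last_csn)
    return {
        "replica_ids": {first_id, last_id},
        "first_seq": first_seq,
        "last_seq": last_seq,
        "span_ms": last_ms - first_ms,
    }

info = describe_log_file("000001589724136a569d00000001_000001589738b791569d00005ca0.log")
print(info)
```

For the example file, both CSNs carry replica ID 22173, which matches the 22173.server directory name.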

You will also see an offline.state file when the server for that replicated domain / DS ID is shut down; the file is removed when that server starts up. This file contains a CSN that records the state of the replicated domain when the server was shut down.

Understanding the changelogDb directory (OpenDJ 2.6.x)

A typical changelogDb directory in OpenDJ 2.6.x is as follows:

/path/to/opendj/changelogDb:

total 14
drwxr-xr-x  2 root root    4096 Oct 15 16:28 .
drwxr-xr-x 21 root root    4096 Aug  6 14:11 ..
-rw-r--r--  1 root root 9999791 Oct  6 17:04 0004ea11.jdb
-rw-r--r--  1 root root 9999526 Oct  6 18:06 0004ea12.jdb
-rw-r--r--  1 root root 9997748 Oct  6 19:09 0004ea13.jdb
-rw-r--r--  1 root root 9997608 Oct  6 20:03 0004ea14.jdb
-rw-r--r--  1 root root 9999762 Oct  6 21:11 0004ea15.jdb
-rw-r--r--  1 root root    9858 Sep 17 14:09 je.config.csv
-rw-r--r--  1 root root 2233503 Oct  9 16:26 je.info.0
-rw-r--r--  1 root root       0 Sep 17 14:09 je.info.0.lck
-rw-r--r--  1 root root       0 Aug 22 19:37 je.lck
-rw-r--r--  1 root root  745786 Oct  7 17:54 je.stat.0.csv
-rw-r--r--  1 root root  754422 Oct  8 17:55 je.stat.1.csv
-rw-r--r--  1 root root  709478 Oct  9 16:38 je.stat.csv

Where:

  • The .jdb files are rolling log files that store all changes. Write operations append entries as the last items in the log file. When a certain size is reached (10MB by default), a new log file is created. This results in consistent write performance regardless of the database size. The initial log file is 00000000.jdb. When that file reaches a size of 10MB, a new file is created as 00000001.jdb and so on.
  • Internal logging information is written to je.info.* files.
  • The je.*.csv files store internal environment configuration and statistics that can be opened in a spreadsheet.

See Also

How do I configure DS/OpenDJ (All versions) to ensure accidentally deleted or changed data can be restored when replication is enabled?

How do I control how long replication changes are retained in DS/OpenDJ (All versions)?

How do I reset the cn=changelog changeNumber in DS/OpenDJ (All versions)?

How do I enable the External Change Log on a single DS/OpenDJ (All versions) server?

FAQ: Backup and restore in DS/OpenDJ

Replication in DS/OpenDJ

Administration Guide › Managing Data Replication

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-3283 (Cleaner threads unable to clean files, changelogDb grows until disk fills up)

OPENDJ-3522 (Idle replication change log consumes CPU and disk IO)


How do I search and view the changelog records in DS/OpenDJ (All versions)?

The purpose of this article is to provide information on searching and viewing the changelog information in DS/OpenDJ.

Warning

Do not compress, tamper with, or otherwise alter changelog database files directly unless specifically instructed to do so by a qualified ForgeRock technical support engineer. External changes to changelog database files can render them unusable by the server. By default, changelog database files are located under the /path/to/ds/changelogDb directory.

Overview

The External Changelog (cn=changelog) records all replication changes.

The changelog shows a changeType attribute for each entry so you can identify whether it resulted from a change or delete operation. The original data that was changed or deleted is encoded in the includedAttributes attribute. You can decode this using a Base64 decoder (for example, the base64 program provided with DS/OpenDJ or http://www.base64decode.org/).
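
Base64 values in ldapsearch output are folded across several lines, with each continuation line starting with a single space, so they must be joined before decoding. The following sketch shows one way to do that with the Python standard library; decode_ldif_base64 is a hypothetical helper, and the sample is a short folded value of the kind found in a changes attribute:

```python
import base64

def decode_ldif_base64(folded_value):
    """Join LDIF continuation lines, then Base64-decode the result to text."""
    joined = "".join(part.strip() for part in folded_value.splitlines())
    return base64.b64decode(joined).decode("utf-8")

# The value that follows "changes:: ", including one folded continuation line.
sample = (
    "cmVwbGFjZTogZGVzY3JpcHRpb24KZGVzY3JpcHRpb246IE1vZGlm\n"
    " aWVkIHZhbHVlCi0K"
)

print(decode_ldif_base64(sample))
```

Pass only the value itself (the text after the double colon), not the attribute name.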

Note

The changelog only shows the value of the new change by default. To view the original values that were changed, you must configure the changelog to record additional information for deleted data as described in How do I configure DS/OpenDJ (All versions) to ensure accidentally deleted or changed data can be restored when replication is enabled?

Querying the changelog

You can query your changelog for specific changes using the ldapsearch command. You can filter the search based on different attributes, for example, the change number or change time. 

The following example demonstrates querying and decoding the changelog by changenumber.

  1. Query the changelog using a command similar to the following (this example looks at change 5 only):
    $ ./ldapsearch --port 1389 --hostname ds1.example.com --bindDN "cn=Directory Manager" --bindPassword password --baseDN cn=changelog --searchScope one "(changenumber=5)" "*" "+"
    This gives an output similar to this:
    dn: changeNumber=5,cn=changelog
    objectClass: top
    objectClass: changeLogEntry
    changeNumber: 5
    changeTime: 20140501152505Z
    changeType: modify
    targetDN: o=test organization,dc=example,dc=com
    changes:: cmVwbGFjZTogZGVzY3JpcHRpb24KZGVzY3JpcHRpb246IE1vZGlmaWVkIHZhbHVlCi0Kcm
     VwbGFjZTogbW9kaWZpZXJzTmFtZQptb2RpZmllcnNOYW1lOiBjbj1EaXJlY3RvcnkgTWFuYWdlcixjb
     j1Sb290IEROcyxjbj1jb25maWcKLQpyZXBsYWNlOiBtb2RpZnlUaW1lc3RhbXAKbW9kaWZ5VGltZXN0
     YW1wOiAyMDE0MDUwMTE1MjUwNVoKLQo=
    subschemaSubentry: cn=schema
    numSubordinates: 0
    hasSubordinates: false
    entryDN: changeNumber=5,cn=changelog
    replicationCSN: 00000145b86396664d4b00000005
    replicaIdentifier: 19787
    changeInitiatorsName: cn=Directory Manager,cn=Root DNs,cn=config
    targetEntryUUID: d1f8fa64-d0ef-42a8-b551-038415a2ae3b
    changeLogCookie: dc=example,dc=com:00000145b86396664d4b00000005;
    includedAttributes:: b2JqZWN0Q2xhc3M6IHRvcApvYmplY3RDbGFzczogb3JnYW5pemF0aW9uCmR
     lc2NyaXB0aW9uOiBPcmlnaW5hbCB2YWx1ZQpvOiBUZXN0IE9yZ2FuaXphdGlvbgpkcy1zeW5jLWhpc3
     Q6IGRuOjAwMDAwMTQ1Yjg2MzQzNDg0ZDRiMDAwMDAwMDQ6YWRkCmVudHJ5VVVJRDogZDFmOGZhNjQtZ
     DBlZi00MmE4LWI1NTEtMDM4NDE1YTJhZTNiCmNyZWF0ZVRpbWVzdGFtcDogMjAxNDA1MDExNTI0NDRa
     CmNyZWF0b3JzTmFtZTogY249RGlyZWN0b3J5IE1hbmFnZXIsY249Um9vdCBETnMsY249Y29uZmlnCmV
     0YWc6IDAwMDAwMDAwM2E5ZjQ2ODcKc3RydWN0dXJhbE9iamVjdENsYXNzOiBvcmdhbml6YXRpb24KcH
     dkUG9saWN5U3ViZW50cnk6IGNuPURlZmF1bHQgUGFzc3dvcmQgUG9saWN5LGNuPVBhc3N3b3JkIFBvb
     GljaWVzLGNuPWNvbmZpZwpudW1TdWJvcmRpbmF0ZXM6IDAKaGFzU3Vib3JkaW5hdGVzOiBmYWxzZQpz
     dWJzY2hlbWFTdWJlbnRyeTogY249c2NoZW1hCmVudHJ5RE46IG89dGVzdCBvcmdhbml6YXRpb24sZGM
     9ZXhhbXBsZSxkYz1jb20K
    This change represents a modification to the <o=test organization,dc=example,dc=com> entry.
  2. Decode the changes attribute using a Base64 decoder. This provides details of the change (along with changes the server is making to the standard modifiersName and modifyTimestamp attributes):
    replace: description
    description: Modified value
    -
    replace: modifiersName
    modifiersName: cn=Directory Manager,cn=Root DNs,cn=config
    -
    replace: modifyTimestamp
    modifyTimestamp: 20140501152505Z
    -
    
    In this example, the type of modification used (replace) does not include the old values of the changed attribute. If you want DS to store old values, as shown in the includedAttributes attribute in this example, you must first configure the changelog to record additional information for deleted data.
  3. Decode the includedAttributes attribute using a Base64 decoder:
    objectClass: top
    objectClass: organization
    description: Original value
    o: Test Organization
    ds-sync-hist: dn:00000145b86343484d4b00000004:add
    entryUUID: d1f8fa64-d0ef-42a8-b551-038415a2ae3b
    createTimestamp: 20140501152444Z
    creatorsName: cn=Directory Manager,cn=Root DNs,cn=config
    etag: 000000003a9f4687
    structuralObjectClass: organization
    pwdPolicySubentry: cn=Default Password Policy,cn=Password Policies,cn=config
    numSubordinates: 0
    hasSubordinates: false
    subschemaSubentry: cn=schema
    entryDN: o=test organization,dc=example,dc=com
    
    Now we can see the original value (in this example, it is Original value).

Reverting a change

Once you have identified the original value (description: Original value in the above example), you can restore it as follows:

  1. Create an LDIF file that replaces the current value with the value you want to restore. For example:
    $ cat revert-changes.ldif
    dn: o=test organization,dc=example,dc=com
    changetype: modify
    replace: description
    description: Original value
    
  2. Apply the changes using the following ldapmodify command depending on your version:
    • DS 5 and later:
      $ ./ldapmodify --hostname ds1.example.com --port 1389 --bindDN "cn=Directory Manager" --bindPassword password revert-changes.ldif
    • Pre-DS 5:
      $ ./ldapmodify --hostname ds1.example.com --port 1389 --bindDN "cn=Directory Manager" --bindPassword password --filename revert-changes.ldif
      

See Also

How do I understand the changelogDb directory in DS/OpenDJ (All versions)?

How do I control how long replication changes are retained in DS/OpenDJ (All versions)?

How do I troubleshoot replication issues in DS/OpenDJ (All versions)?

Replication in DS/OpenDJ

Administration Guide › To Include Unchanged Attributes in the External Change Log

Related Training

N/A

Related Issue Tracker IDs

N/A


Recovering Replication


How do I restore old backup data to a DS/OpenDJ (All versions) replication topology?

The purpose of this article is to provide information on restoring old backup data to a DS/OpenDJ replication topology and using this old data to initialize replication. In this way, you can restore good data from an old backup and ensure it replicates across all servers. This article assumes replication has already been enabled.

Restoring old backup data and initializing replication

The old backup data can be in either of the following formats:

  • A binary backup - created using the DS/OpenDJ backup --backUpAll command.
  • LDIF format - created using the DS/OpenDJ export-ldif command.

Note

If you are restoring a binary backup, you must ensure that the backup file exists on the local server on which you want to run the restore command. You cannot restore a backup file from a remote instance.

You can restore old backup data to a DS/OpenDJ replication topology and initialize replication as follows:

  1. Enter the following command on one of the servers (server1 in this example) to prepare the domain on all servers for external initialization. You must specify the baseDN of the data you are going to change, for example, dc=example,dc=com:
    $ ./dsreplication pre-external-initialization --hostname ds1.example.com --port 4444 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --no-prompt
  2. Restore the backup data to server1 as follows, depending on what type it is:
    • Binary backup - run the restore command on server1, for example:
      $ ./restore --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --backupID 20161019100952Z --backupDirectory /path/to/ds/binaryBackup_bak
    • LDIF format - use the import-ldif command, for example:
      $ ./import-ldif --hostname ds1.example.com --port 4444 --baseDN dc=example,dc=com --backendID userRoot --ldifFile /path/to/backupfile.ldif
      
  3. Back up server1 (which now includes your restored backup data) using the backup command, for example:
    $ ./backup --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --backendID userRoot --backupDirectory /path/to/ds/server1_bak --start 0
    
  4. Copy the backup file you created in the previous step and the accompanying backup.info file to each server you want to restore (server2 and server3 in this example).
  5. Restore this backup to all the other servers by running the restore command locally on each server in the topology, for example:
    $ server2/bin/restore --hostname ds2.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --backupID 20161019102414Z --backupDirectory /path/to/ds/server1_bak
    $ server3/bin/restore --hostname ds3.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --backupID 20161019102414Z --backupDirectory /path/to/ds/server1_bak
    [...]
  6. Enter the following command on server1 to set the new generation ID for the entire domain. Ensure you use the same baseDN as in step 1:
    $ ./dsreplication post-external-initialization --hostname ds1.example.com --port 4444 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --no-prompt
Note

Instead of backing up and restoring (steps 3 to 5), you could use the dsreplication initialize-all command to initialize all servers over the network. This command initializes the contents of the data under the specified base DN on all the servers with the contents on the specified server. See Reference › dsreplication for further details.

The above steps alter the generation ID of the replicated domain. "Old" changes will not get replayed because they were targeting the data using the previous generation ID. The final step calculates a new generation ID for the domain and broadcasts it to all the servers, which allows them to replicate again.

Replication will now proceed as normal, but from the restored point in time.

See Also

Generation IDs do not match error after restoring a DS/OpenDJ (All versions) replica

How do I roll back an entire network of DS/OpenDJ (All versions) replicas to a previous backup?

How do I configure DS/OpenDJ (All versions) to ensure accidentally deleted or changed data can be restored when replication is enabled?

How do I design and implement my backup and restore strategies for DS/OpenDJ (All versions)?

FAQ: Backup and restore in DS/OpenDJ

How do I quickly create a new DS/OpenDJ (All versions) replica?

Administration Guide › Backing Up and Restoring Data

Administration Guide › Managing Data Replication › Initializing Replicas

Related Training

ForgeRock Directory Services Core Concepts

Related Issue Tracker IDs

N/A


How do I roll back an entire network of DS/OpenDJ (All versions) replicas to a previous backup?

The purpose of this article is to provide guidance on rolling back an entire network of DS/OpenDJ replication servers to a previous backup. This approach can be used if you want to be rid of undesirable changes that have occurred and are now being replicated across all servers.

Overview

Accidental deletions of data in DS/OpenDJ can be reverted in two ways:

  • The first way, described in How do I configure DS/OpenDJ (All versions) to ensure accidentally deleted or changed data can be restored when replication is enabled?, configures the replication changelog to record additional information about each change. This allows changes to be reverted at a very fine-grained level and with very little impact to the servers in the replication topology. However, reverting each change requires several manual steps.
  • The second way, described in this article, uses the backup and restore tools. This is comparatively coarse, because you can only restore to the point of a given backup, and it requires every replicating server to be reinitialized.

Rolling back an entire network of DS/OpenDJ replicas

To roll back an entire network of DS/OpenDJ replicas to a previous backup, you must restore the same backup to every replica and use pre-external-initialization and post-external-initialization as follows:

  1. Enter the following command on one of the servers to prepare the domain on all servers for external initialization. You must specify the baseDN of the data you are going to change, for example, dc=example,dc=com:
    $ ./dsreplication pre-external-initialization --hostname ds1.example.com --port 4444 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --trustAll --no-prompt
    
  2. Enter the following command to restore the backup to each server (this command performs an online restore, so you do not need to stop the server first):
    $ ds1/bin/restore --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --backupID [backupid] --backupDirectory /path/to/ds/bak
    $ ds2/bin/restore --hostname ds2.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --backupID [backupid] --backupDirectory /path/to/ds/bak
    [...]
  3. Enter the following command on one of the servers to set the new generation ID for the entire domain. Ensure you use the same baseDN as in step 1:
    $ ./dsreplication post-external-initialization --hostname ds1.example.com --port 4444 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --trustAll --no-prompt
    

The above steps alter the generation ID of the replicated domain. "Old" changes will not get replayed because they were targeting the data using the previous generation ID. The final step calculates a new generation ID for the domain and broadcasts it to all the servers, which allows them to replicate again.

Replication will now proceed as normal, but from the restored point in time.
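
The order of the steps matters: prepare once, restore on every replica, then finalize once. As a planning aid, the sketch below prints that sequence for a list of hosts. It only builds command lines from the options used in the steps above and does not run anything; rollback_plan and the BACKUP_ID placeholder are assumptions for illustration, and shlex.join requires Python 3.8 or later:

```python
import shlex

def rollback_plan(hosts, base_dn, backup_id, backup_dir):
    """Return ordered command lines: prepare once, restore everywhere, finalize once."""
    def dsreplication(subcommand):
        return ["./dsreplication", subcommand,
                "--hostname", hosts[0], "--port", "4444",
                "--baseDN", base_dn,
                "--adminUID", "admin", "--adminPassword", "password",
                "--trustAll", "--no-prompt"]

    plan = [dsreplication("pre-external-initialization")]
    for host in hosts:
        # One restore per replica, using the same backup ID everywhere.
        plan.append(["./restore", "--hostname", host, "--port", "4444",
                     "--bindDN", "cn=Directory Manager", "--bindPassword", "password",
                     "--backupID", backup_id, "--backupDirectory", backup_dir])
    plan.append(dsreplication("post-external-initialization"))
    return plan

hosts = ["ds1.example.com", "ds2.example.com"]
plan = rollback_plan(hosts, "dc=example,dc=com", "BACKUP_ID", "/path/to/ds/bak")
for step, argv in enumerate(plan, start=1):
    print(step, shlex.join(argv))
```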

See Also

How do I configure DS/OpenDJ (All versions) to ensure accidentally deleted or changed data can be restored when replication is enabled?

How do I restore old backup data to a DS/OpenDJ (All versions) replication topology?

How do I design and implement my backup and restore strategies for DS/OpenDJ (All versions)?

FAQ: Backup and restore in DS/OpenDJ

FAQ: General DS/OpenDJ

Administration Guide › Backing Up and Restoring Data › Backing Up Directory Data

Administration Guide › Backing Up and Restoring Data › Restoring Directory Data From Backup

Reference › Tools Reference › dsreplication

Reference › Tools Reference › restore

Related Training

ForgeRock Directory Services Core Concepts

Related Issue Tracker IDs

N/A


How do I repair replication configuration in DS/OpenDJ (All versions) when dsreplication has failed?

The purpose of this article is to provide information on re-aligning replication configuration in DS/OpenDJ when the dsreplication tool has been unable to make a full set of necessary updates.

Summary

DS/OpenDJ replication configuration is split between two main locations:

  • Each DS or RS server has local configuration in its cn=config (config.ldif) backend.
  • Shared configuration (including public key entries and global admin users) is stored in the replicated backend: cn=admin data (admin-backend.ldif). 

When enabling and disabling replication, changes are made to the local configuration of each server in the topology; this is done by the dsreplication tool directly against the admin port on each server. Changes to the global configuration are made once and replicated to the other instances.

Because of this mechanism, there are a number of scenarios where configuration can become inconsistent across the topology. For example, if Server4 is removed from a 4-server topology while Server1 is unavailable (offline or otherwise unable to execute configuration changes, for example because it is out of disk space), only Server2 and Server3 have their local configuration updated; the replicated admin data configuration is also updated. When Server1 comes back online, it still picks up the replicated admin data changes via another RS changelog, but it does not get the local configuration changes.

Note

If the dsreplication disable --disableAll command (pre-DS 5) has failed to stop replication, you can use the Replica Remover tool as detailed in How do I use the Replica Remover tool in OpenDJ 2.6.x and 3.x to remove replication when the --disableAll command has failed? or you can use the manual processes detailed below.

Checking and repairing local replication configuration with dsconfig

You can list the replication domains and servers in the local configuration of a DS/OpenDJ instance with the following commands:

  • To list replication domains in the local configuration:
    $ ./dsconfig list-replication-domains --provider-name "Multimaster Synchronization" --hostname ds.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --no-prompt --trustAll
    
    Replication Domain : server-id : replication-server                                                     : base-dn
    -------------------:-----------:------------------------------------------------------------------------:--------------------
    cn=admin data      : 300       : ds-1.example.com:18989, ds-2.example.com:28989, ds-3.example.com:38989 : cn=admin data
    cn=schema          : 23155     : ds-1.example.com:18989, ds-2.example.com:28989, ds-3.example.com:38989 : cn=schema
    dc=example,dc=com  : 15187     : ds-1.example.com:18989, ds-2.example.com:28989, ds-3.example.com:38989 : "dc=example,dc=com"
  • To list replication servers in the local configuration:
    $ ./dsconfig list-replication-server --provider-name "Multimaster Synchronization" --hostname ds.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --no-prompt
    
    Replication Server : replication-server-id : replication-port : replication-server
    -------------------:-----------------------:------------------:-----------------------------------------------------------------------------------
    replication-server : 14275                 : 18989            : ds-1.example.com:18989, ds-2.example.com:28989, ds-3.example.com:38989
    

In this particular example, this is the configuration on ds-1, but ds-3 is actually gone from the topology. To remove it here, you can either navigate the interactive dsconfig menus or use the non-interactive commands as follows:

  • To remove from the replication server configuration:
    $ ./dsconfig set-replication-server-prop --provider-name "Multimaster Synchronization" --remove replication-server:ds-3.example.com:38989 --hostname ds.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --no-prompt
  • To remove from the replication domain (dc=example,dc=com):
    $ ./dsconfig set-replication-domain-prop --provider-name "Multimaster Synchronization" --domain-name dc=example,dc=com --remove replication-server:ds-3.example.com:38989 --hostname ds.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --no-prompt

Repeat for each replication domain, including cn=schema and cn=admin data.
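
When there are several domains, it can help to generate the per-domain commands from the list-replication-domains output instead of typing them by hand. The following is a sketch under the assumption that the table is laid out as in the example above, with columns separated by " : "; both function names are hypothetical, and the generated commands reuse only the options shown in this section:

```python
import shlex

def domains_referencing(table_text, stale_server):
    """Return replication domain names whose server list includes stale_server."""
    found = []
    for line in table_text.splitlines():
        # Split on " : " so that host:port values inside a column stay intact.
        columns = [col.strip() for col in line.split(" : ")]
        if len(columns) != 4 or columns[0] == "Replication Domain":
            continue
        servers = [server.strip() for server in columns[2].split(",")]
        if stale_server in servers:
            found.append(columns[0])
    return found

def remove_commands(domains, stale_server):
    """Build one set-replication-domain-prop command line per domain."""
    return [shlex.join([
        "./dsconfig", "set-replication-domain-prop",
        "--provider-name", "Multimaster Synchronization",
        "--domain-name", domain,
        "--remove", f"replication-server:{stale_server}",
        "--hostname", "ds.example.com", "--port", "4444",
        "--bindDN", "cn=Directory Manager", "--bindPassword", "password",
        "--no-prompt",
    ]) for domain in domains]

table = """\
Replication Domain : server-id : replication-server : base-dn
-------------------:-----------:--------------------:--------
cn=admin data      : 300       : ds-1.example.com:18989, ds-2.example.com:28989, ds-3.example.com:38989 : cn=admin data
cn=schema          : 23155     : ds-1.example.com:18989, ds-2.example.com:28989, ds-3.example.com:38989 : cn=schema
dc=example,dc=com  : 15187     : ds-1.example.com:18989, ds-2.example.com:28989, ds-3.example.com:38989 : "dc=example,dc=com"
"""

stale = "ds-3.example.com:38989"
affected = domains_referencing(table, stale)
for command in remove_commands(affected, stale):
    print(command)
```

Review the printed commands before running any of them against a server.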

Checking and repairing cn=admin data

Inconsistencies in the admin data configuration are less likely. This is a replicated backend that can catch up from the changelog if a server is temporarily unavailable when topology changes are made. 

The most important thing is to ensure that all admin data backends contain the same data. Since this is a file-based backend, you can find all of the data in LDIF format in admin-backend.ldif (located in /path/to/ds/db/admin in DS 6 and later, or /path/to/ds/config in pre-DS 6). Use the ldifdiff tool in DS or the ldif-diff tool in OpenDJ to compare the file from one instance to another. The only differences present should be for ds-sync-* attributes (replication housekeeping).
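
Because ds-sync-* differences are expected, a comparison that ignores them can make real discrepancies easier to spot. The sketch below is a simplified illustration, not a replacement for ldifdiff or ldif-diff: it assumes plain (not Base64-encoded) DNs, compares entries as sets of attribute lines, and uses hypothetical function names. Run it against copies of the files:

```python
def ldif_entries(text, ignore_prefix="ds-sync-"):
    """Parse LDIF text into {dn: set of 'attr: value' lines}, skipping ignored attributes."""
    # Unfold continuation lines: a newline followed by one space continues the line.
    unfolded = text.replace("\r\n", "\n").replace("\n ", "")
    entries = {}
    for block in unfolded.split("\n\n"):
        lines = [line for line in block.splitlines() if line and not line.startswith("#")]
        if not lines or not lines[0].lower().startswith("dn:"):
            continue
        dn = lines[0].split(":", 1)[1].strip().lower()
        entries[dn] = {line for line in lines[1:]
                       if not line.lower().startswith(ignore_prefix)}
    return entries

def differing_dns(left_text, right_text):
    """Return the DNs that are missing on one side or differ outside ds-sync-* attributes."""
    left, right = ldif_entries(left_text), ldif_entries(right_text)
    return sorted(dn for dn in left.keys() | right.keys() if left.get(dn) != right.get(dn))

good = (
    "dn: cn=Servers,cn=admin data\nobjectClass: top\nds-sync-hist: x\n\n"
    "dn: cn=a,cn=Servers,cn=admin data\ncn: a\n"
)
bad = "dn: cn=Servers,cn=admin data\nobjectClass: top\nds-sync-hist: y\n"

print(differing_dns(good, bad))
```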

You can copy an admin-backend.ldif from a good server to overwrite one on a bad server (stop the server first and start it afterwards). If everything else is in order this will resolve any inconsistencies.

There is one particular case where you may need to make changes to the admin backend manually. This is when a server has been completely removed and is no longer available, but replication was not disabled beforehand. For pre-DS 5 versions, you can use the Replica Remover tool as detailed in How do I use the Replica Remover tool in OpenDJ 2.6.x and 3.x to remove replication when the --disableAll command has failed?; for any version, you can use the manual process below.

The suggested manual process for this is as follows:

  1. Create a copy of admin-backend.ldif called admin-backend-new.ldif.
  2. Manually edit this new ldif file and delete references to the server being removed:
    • The uniqueMember reference in 'cn=all-servers,cn=Server Groups,cn=admin data'
    • The entry for the server under 'cn=Servers,cn=admin data', for example: cn=<server>,cn=Servers,cn=admin data
  3. Run an ldifdiff or ldif-diff between the existing and new ldif files:
    $ ./ldifdiff --sourceLDIF admin-backend.ldif --targetLDIF admin-backend-new.ldif --outputLDIF changes.ldif
    
    $ ./ldif-diff --sourceLDIF admin-backend.ldif --targetLDIF admin-backend-new.ldif --outputLDIF changes.ldif
  4. Execute changes.ldif against a server in the topology:
    $ ./ldapmodify --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --continueOnError --useSSL changes.ldif
    In pre-DS 5, you need to include --filename in this command before changes.ldif.
  5. Check that the change has been replicated to all the other servers.
  6. Perform the steps in the previous section (local configuration) for each server still in the topology.
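
The entry removal in step 2 can be sketched with awk rather than a text editor. This is an illustration under stated assumptions: entries in the file are separated by blank lines, and the DN line of the entry to remove is neither folded nor base64-encoded. The helper name drop_server_entry is hypothetical, and the uniqueMember reference in the all-servers group still has to be removed separately.

```shell
# drop_server_entry is a hypothetical helper. It reads LDIF from the file in
# $1 and prints every entry except the one whose first line is exactly
# "dn: $2".
drop_server_entry() {
  awk -v target="dn: $2" '
    BEGIN { RS = ""; ORS = "\n\n" }   # paragraph mode: one record per LDIF entry
    { split($0, lines, "\n") }        # lines[1] is the DN line of the entry
    lines[1] != target { print }
  ' "$1"
}

# Usage (the server name is a placeholder, as in step 2):
#   drop_server_entry admin-backend.ldif "cn=<server>,cn=Servers,cn=admin data" > admin-backend-new.ldif
```

Review admin-backend-new.ldif by hand before generating changes.ldif from it in step 3.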

See Also

How do I use the Replica Remover tool in OpenDJ 2.6.x and 3.x to remove replication when the --disableAll command has failed?

How do I delete an AM/OpenAM instance (All versions) from a site along with the replicated embedded DS/OpenDJ server?

Replication in DS/OpenDJ

Configuration Reference › Replication Server

Configuration Reference › Replication Domain

Configuration Reference › Replication Synchronization Provider

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-1054 (Provide the ability to disable replication for a replica which is already offline)

OPENDJ-3029 (dsreplication disable --disableAll does not remove all replication data from other instances cn=admin data backend.)


Known Issues


Replication error in DS/OpenDJ (All versions) indicates same ServerId

The purpose of this article is to provide assistance if you encounter a "same ServerId" replication error in DS/OpenDJ.

Symptoms

The following error is shown on a DS/OpenDJ replication server (in the Errors and Replication logs):

[21/Mar/2017:11:43:29 -0500] category=SYNC severity=SEVERE_ERROR msgID=10 msg=In Replication server Replication Server 8989 18989: replication servers 192.0.0.0:8989 and 203.0.113.0:8989 have the same ServerId : 12345
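
To see which addresses are involved, you can extract them from the log. The following sketch reads log lines on standard input and prints the two addresses and the shared server ID; it assumes the message wording matches the example above. The helper name same_server_id is hypothetical.

```shell
# same_server_id is a hypothetical helper: it reads log lines on stdin and
# prints "address1 address2 serverId" for each "same ServerId" error.
same_server_id() {
  sed -n 's/.*replication servers \([^ ]*\) and \([^ ]*\) have the same ServerId : \([0-9][0-9]*\).*/\1 \2 \3/p'
}

# Example using the message shown above:
printf '%s\n' '[21/Mar/2017:11:43:29 -0500] category=SYNC severity=SEVERE_ERROR msgID=10 msg=In Replication server Replication Server 8989 18989: replication servers 192.0.0.0:8989 and 203.0.113.0:8989 have the same ServerId : 12345' | same_server_id
# prints: 192.0.0.0:8989 203.0.113.0:8989 12345
```

Against a real log you would redirect the file instead, for example same_server_id < /path/to/ds/logs/errors (the path is a placeholder).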

Recent Changes

N/A

Causes

This error can occur if you cloned the configuration from one replication server to another; however, the most common reason is having multiple IP addresses.

For example, in the scenario where replication server (server1) is on a host that has two IP addresses (192.0.0.0 and 203.0.113.0) and connects to another replication server (server2), this error would be logged on server2 if server1 connected to it via a different IP address than the one used when configuring replication initially.

This issue is considered a severe error because it suggests that a fundamental aspect of the replication topology is being violated; namely, that every server in the topology has a unique server ID. 

Solution

This issue can be resolved by configuring the source-address property on your replication servers to just one of the interfaces; you should choose the one that is configured with the FQDN being used in the replication setup. That way you should get connections coming consistently from the correct address.

You can use the dsconfig set-replication-server-prop command to make the appropriate change. For example, the following command configures the server at ds1.example.com to only make outbound replication connections from the NIC with the address rs.example.com:

$ ./dsconfig set-replication-server-prop --provider-name "Multimaster Synchronization" --set source-address:rs.example.com --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --no-prompt --trustAll 

See Also

Reference › Tools Reference › dsconfig

Configuration Reference › set-replication-server-prop

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-4221 (same server ID for replication servers)

OPENDJ-567 (Better support for replication on multi-homed servers)


Connection issues cause replication to fail in DS/OpenDJ (All versions)

The purpose of this article is to provide assistance if replication fails in DS/OpenDJ due to a connection issue. You will see an error such as "The connection from this replication server RS(1234) to replication server RS(5678) at ds1.example.com/203.0.113.0:9989 for domain "dc=example,dc=com" has failed" when this happens.

Symptoms

Replication fails and you see an error such as the following in your logs:

[25/Aug/2017:10:27:23 +0000] category=SYNC severity=SEVERE_ERROR msgID=14942389 msg=The connection from this replication server RS(1234) to replication server RS(5678) at ds1.example.com/203.0.113.0:9989 for domain "dc=example,dc=com" has failed

Observing the connection flow

When a DS/OpenDJ instance starts, the connection flow should be as follows:

  1. The instance starts up the Replication Server (RS) and starts listening for connections on the replication port.
  2. The local RS connects to a remote RS for each domain; cn=schema, cn=admin data and the replicated backend.
  3. The local Directory Server (DS) for each domain connects to its local RS.

When you encounter this error, you will be able to observe the DS connecting to its local RS in the log files, but you will not see the local RS connecting to the remote RS:

  • You will see messages similar to the following to indicate that the DS has connected to the local RS:
    [21/Jun/2017:16:17:24 -0400] category=SYNC severity=INFORMATION msgID=131 msg=Replication server RS(1234) has accepted a connection from directory server DS(5678) for domain "dc=example,dc=com" at ds1.example.com/203.0.113.0:9989
    
  • But you will not see messages like the following that show the local RS connecting to the remote RS:
    [21/Jun/2017:16:17:56 -0400] category=SYNC severity=INFORMATION msgID=116 msg=Replication server RS(9012) has accepted a connection from replication server RS(1234) for domain "dc=example,dc=com" at ds1.example.com/203.0.113.0:9989
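
A quick way to check for both kinds of message is to count them in a log. This is a minimal sketch that assumes the message wording shown above; the helper name count_accepted is hypothetical.

```shell
# count_accepted is a hypothetical helper: it reads log lines on stdin and
# prints how many accepted connections came from directory servers (DS) and
# how many came from replication servers (RS).
count_accepted() {
  awk '
    /has accepted a connection from directory server/   { ds++ }
    /has accepted a connection from replication server/ { rs++ }
    END { printf "DS=%d RS=%d\n", ds, rs }
  '
}

# Example using only the DS message shown above:
printf '%s\n' '[21/Jun/2017:16:17:24 -0400] category=SYNC severity=INFORMATION msgID=131 msg=Replication server RS(1234) has accepted a connection from directory server DS(5678) for domain "dc=example,dc=com" at ds1.example.com/203.0.113.0:9989' | count_accepted
# prints: DS=1 RS=0
```

A result with DS connections but RS=0 matches the pattern described in this section.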
    

Recent Changes

Network changes, such as updates to the firewall.

Causes

The local RS cannot connect to the remote RS; this happens when the remote RS is down or unreachable.

Solution

You need to check that the following are all true, and if not, resolve any issues you encounter:

  • The remote RS is up and running.
  • The network is working correctly.
  • The local RS can successfully connect to the remote RS over the network. You should test connectivity as indicated below. If connectivity fails, check the following common causes:
    • Is there a firewall or other network device blocking the replication port and/or the admin port?
    • Is the hostname resolution as expected? For example, can the DNS resolve each hostname from the other server? Do the hostnames resolve to IP addresses physically present on the servers?

Once you have resolved any issues and confirmed that the local RS can connect to the remote RS, you will need to reinitialize replication and ensure the servers are in sync. You can reinitialize replication using the initialize command, for example:

$ ./dsreplication initialize --adminUID admin --adminPassword password --baseDN dc=example,dc=com --hostSource ds1.example.com --portSource 4444 --hostDestination ds2.example.com --portDestination 5444 --trustAll --no-prompt

Testing connectivity

You should test connectivity to ensure that each server can connect to each other's replication ports. You can use a variety of tools for this, for example:

  • OpenSSL:
    $ openssl s_client -connect [remote_server]:[replication_port]
  • Telnet:
    $ telnet [remote_server] [replication_port]

Ensure that you test connectivity from all servers to verify that all connections are working as expected.
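
With more than two servers it is easy to miss a direction. The sketch below lists every source and destination pair so that each check can be ticked off; it only prints the OpenSSL command to run on each source host and does not open any connections itself. The helper name connectivity_pairs and the third server are illustrative assumptions.

```shell
# connectivity_pairs is a hypothetical helper. Each argument is host:port for
# one server's replication port. For every ordered pair of different servers
# it prints the check to run on the source host.
connectivity_pairs() {
  for src in "$@"; do
    for dst in "$@"; do
      if [ "$src" != "$dst" ]; then
        printf 'on %s: openssl s_client -connect %s\n' "${src%%:*}" "$dst"
      fi
    done
  done
}

# Example topology (host names and ports are placeholders):
connectivity_pairs ds1.example.com:8989 ds2.example.com:8989 ds3.example.com:8989
```

Three servers produce six checks, because each server must be able to reach both of the others.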

See Also

How do I use the Support Extract tool in DS 5.x, 6 and OpenDJ 3.x to capture troubleshooting data?

How do I troubleshoot replication issues in DS/OpenDJ (All versions)?

Replication in DS/OpenDJ

Related Training

N/A

Related Issue Tracker IDs

N/A


Generation IDs do not match error after restoring a DS/OpenDJ (All versions) replica

The purpose of this article is to provide assistance if a "Generation IDs do not match" error occurs during replication after restoring an old backup or LDIF data to a DS/OpenDJ instance.

Symptoms

When restoring a replica, the following error is shown in the DS/OpenDJ error logs:

[10/Oct/2016:15:19:33 +0000] category=SYNC severity=SEVERE_WARNING msgID=14811232 msg=Directory server DS(42134) has connected to replication server RS(42706) for domain "cn=admin data" at ds1.example.com/10.127.0.61:8080, but the generation IDs do not match, indicating that a full re-initialization is required. The local (DS) generation ID is 36215293 and the remote (RS) generation ID is 172193
[10/Oct/2016:15:19:33 +0000] category=SYNC severity=SEVERE_WARNING msgID=14811272 msg=Replication server RS(42706) not sending update 00000148c7a7e07c102600000001 for domain "cn=admin data" to directory server DS(42134) at ds1.example.com/10.127.0.61:8989 because its generation ID 36215293 is different to the local generation ID 172193

The restored replica is not synchronized with the other replication servers.

Recent Changes

Restored old backup or imported LDIF data to a DS/OpenDJ instance.

Causes

The generation ID contained in the restored data is not the same as the one in the current replication topology. The generation ID is a checksum of attributes from some of the entries and is used during replication to check that the suffix being updated is the same as the one offering the updates.
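
To confirm which two generation IDs are in conflict, you can pull them out of the warning. This is a small sketch that assumes the message wording shown in the Symptoms section; the helper name generation_ids is hypothetical.

```shell
# generation_ids is a hypothetical helper: it reads log lines on stdin and
# prints "localId remoteId" for each generation ID mismatch warning.
generation_ids() {
  sed -n 's/.*The local (DS) generation ID is \([0-9][0-9]*\) and the remote (RS) generation ID is \([0-9][0-9]*\).*/\1 \2/p'
}

# Example using the first warning shown in the Symptoms section:
printf '%s\n' '[10/Oct/2016:15:19:33 +0000] category=SYNC severity=SEVERE_WARNING msgID=14811232 msg=Directory server DS(42134) has connected to replication server RS(42706) for domain "cn=admin data" at ds1.example.com/10.127.0.61:8080, but the generation IDs do not match, indicating that a full re-initialization is required. The local (DS) generation ID is 36215293 and the remote (RS) generation ID is 172193' | generation_ids
# prints: 36215293 172193
```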

Solution

This issue can be resolved using one of the following approaches:

  • Re-initializing the suffix from another replica, since the generation IDs will naturally be in sync.
  • Using pre-external-initialization and post-external-initialization as part of your restore process if you are certain the data is correct.

Re-initializing the suffix

You can re-initialize the suffix as follows:

  1. Initialize the suffix using the dsreplication command, for example:
    $ ./dsreplication initialize --adminUID admin --adminPassword password --baseDN dc=example,dc=com --hostSource ds1.example.com --portSource 4444 --hostDestination ds2.example.com --portDestination 5444 --trustAll --no-prompt

Using pre-external-initialization and post-external-initialization

If you are certain the data is correct, you can use pre-external-initialization and post-external-initialization as part of your restore process. These commands ensure the generation ID of the replicated domain is updated. "Old" changes will not get replayed because they were targeting the data using the previous generation ID.  

You restore a replica as follows:

  1. Prepare the domain on all servers for being externally initialized. You must specify the baseDN of the data you are going to be changing, for example:
    $ ./dsreplication pre-external-initialization --hostname ds1.example.com --port 4444 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --trustAll --no-prompt
  2. Restore the data from a backup or import it from LDIF:
    • Using the restore command, for example:
      $ ./restore --hostname ds1.example.com --port 4444 --bindDN "cn=Directory Manager" --bindPassword password --backupID 20160614031032Z --backupDirectory /path/to/ds/backupfile
      
    • Using the import-ldif command, for example:
      $ ./import-ldif --hostname ds1.example.com --port 4444 --includeBranch dc=example,dc=com --backendID userRoot --ldifFile /path/to/ldif_file
  3. Enter the following command to set the new generation ID for the domain and broadcast it to all the servers, which allows them to replicate again. Ensure you use the same baseDN as in step 1:
    $ ./dsreplication post-external-initialization --hostname ds1.example.com --port 4444 --baseDN dc=example,dc=com --adminUID admin --adminPassword password --trustAll --no-prompt
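
Because the same base DN must be used in steps 1 and 3, it can help to generate all three commands from one set of variables. The sketch below only prints the commands, in order, for review; it uses the restore variant of step 2 and the example values from this procedure. The helper name external_init_plan and the variable names are hypothetical.

```shell
# Example values from this procedure; adjust for your environment.
HOST="ds1.example.com"
BASE_DN="dc=example,dc=com"
COMMON="--hostname $HOST --port 4444 --baseDN $BASE_DN --adminUID admin --adminPassword password --trustAll --no-prompt"

# external_init_plan is a hypothetical helper. $1 is the backup ID and $2 is
# the backup directory. It prints the three commands in the order they run.
external_init_plan() {
  printf '%s\n' "./dsreplication pre-external-initialization $COMMON"
  printf '%s\n' "./restore --hostname $HOST --port 4444 --bindDN \"cn=Directory Manager\" --bindPassword password --backupID $1 --backupDirectory $2"
  printf '%s\n' "./dsreplication post-external-initialization $COMMON"
}

external_init_plan 20160614031032Z /path/to/ds/backupfile
```

Sharing the COMMON variable between the first and last command guarantees that both use the same base DN.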

See Also

How do I restore old backup data to a DS/OpenDJ (All versions) replication topology?

How do I roll back an entire network of DS/OpenDJ (All versions) replicas to a previous backup?

How do I configure DS/OpenDJ (All versions) to ensure accidentally deleted or changed data can be restored when replication is enabled?

How do I design and implement my backup and restore strategies for DS/OpenDJ (All versions)?

FAQ: Backup and restore in DS/OpenDJ

Administration Guide › Backing Up and Restoring Data

Administration Guide › Managing Data Replication › Initializing Replicas

Related Training

ForgeRock Directory Services Core Concepts

Related Issue Tracker IDs

N/A


Enabling or initializing replication interactively fails in DS (All versions) and OpenDJ 3.x with There is an error with the certificate presented by the server

The purpose of this article is to provide assistance if you cannot enable or initialize replication in interactive mode in DS/OpenDJ. You will see a "There is an error with the certificate presented by the server" error.

Symptoms

The following error is shown when attempting to enable or initialize replication interactively using the dsreplication menu with option 1 (Enable Replication) or option 3 (Initialize Replication on one Server):

Establishing connections ..... 
Error reading data from server localhost:4444. There is an error with the certificate presented by the server.
Details: simple bind failed: localhost:4444

You should also be aware that:

  • Using option 4 (Initialize All Servers) appears to be successful, but the servers are not in sync.
  • You can enable or initialize replication successfully via the command line. 

Recent Changes

Upgraded to, or installed DS 5 or later.

Upgraded to, or installed OpenDJ 3.x.

Causes

This is a trust issue caused by the order of operations in how you enabled/initialized replication interactively:

  • If you launch dsreplication on Master 1 and use Master 1's connection details for the first server, replication is not enabled or initialized because Master 1 does not trust Master 2's replication certificate.
  • If you launch dsreplication on Master 1 and use Master 2's connection details for the first server, Master 1 connects to Master 2, retrieves the replication certificate and sets up a trust relationship; this allows replication to be successfully enabled or initialized.

Solution

This issue can be resolved by specifying the correct connection details for the first server. Per the explanation above, you must specify the remote server as the first server (rather than the local server) to allow a trust relationship to be formed.

Alternatively, you can use the command line to enable and/or initialize replication:

  • Enable replication using the dsreplication command applicable to your version:
    • DS 5 and later:
      $ ./dsreplication configure --adminUid admin --adminPassword password --baseDn dc=example,dc=com --host1 ds1.example.com --port1 4444 --bindDn1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 ds2.example.com --port2 4444 --bindDn2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 8989 --trustAll --no-prompt
    • Pre-DS 5:
      $ ./dsreplication enable --adminUID admin --adminPassword password --baseDN dc=example,dc=com --host1 ds1.example.com --port1 4444 --bindDN1 "cn=Directory Manager" --bindPassword1 password --replicationPort1 8989 --host2 ds2.example.com --port2 4444 --bindDN2 "cn=Directory Manager" --bindPassword2 password --replicationPort2 8989 --trustAll --no-prompt
  • Initialize the new server to ensure both servers have the same data:
    $ ./dsreplication initialize --adminUID admin --adminPassword password --baseDN dc=example,dc=com --hostSource ds1.example.com --portSource 4444 --hostDestination ds2.example.com --portDestination 4444 --trustAll --no-prompt

See Administration Guide › To Configure Replication Interactively for further information. 

See Also

Replication in DS/OpenDJ

Administration Guide › Configuring Replication

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-3475 (Docs: Replication (interactively) fails when using the local master as the "First Servers" )


High CPU in DS 5, 5.5, 5.5.1 and OpenDJ 3.5.2, 3.5.3 when RS is writing to another RS

The purpose of this article is to provide assistance if you observe high CPU in DS/OpenDJ when one Replication Server (RS) is writing to another RS.

Symptoms

You observe high CPU utilization on one or more replication servers. Restarting the DS/OpenDJ service and/or server does not resolve it.

An error similar to the following is shown in the stack trace (jstack) when this happens:

CPU = 99.9

"Replication server RS(1000) writing to Replication server RS(2000) for domain "dc=example,dc=com" at ds1.example.com/203.0.113.0:8989" #111 prio=5 os_prio=0 tid=0x0007fbaf01c71800 nid=0x617d runnable [0x00007f6f7fbf9000]

   java.lang.Thread.State: RUNNABLE
   at java.io.RandomAccessFile.length(Native Method)
   at java.io.RandomAccessFile.skipBytes(RandomAccessFile.java:468)
   at org.opends.server.replication.server.changelog.file.BlockLogReader.readNextRecord(BlockLogReader.java:311)
   at org.opends.server.replication.server.changelog.file.BlockLogReader.readRecord(BlockLogReader.java:244)
   at org.opends.server.replication.server.changelog.file.BlockLogReader.searchClosestBlockStartToKey(BlockLogReader.java:399)
   at org.opends.server.replication.server.changelog.file.BlockLogReader.seekToRecord(BlockLogReader.java:158)
   at org.opends.server.replication.server.changelog.file.LogFile$LogFileCursor.positionTo(LogFile.java:634)
   at org.opends.server.replication.server.changelog.file.Log$InternalLogCursor.positionTo(Log.java:1286)
   at org.opends.server.replication.server.changelog.file.Log$AbortableLogCursor.positionTo(Log.java:1537)
   at org.opends.server.replication.server.changelog.file.FileReplicaDBCursor.nextWhenCursorIsExhaustedOrNotCorrectlyPositionned(FileReplicaDBCursor.java:117)
   at org.opends.server.replication.server.changelog.file.FileReplicaDBCursor.next(FileReplicaDBCursor.java:111)
   at org.opends.server.replication.server.changelog.file.ReplicaCursor.next(ReplicaCursor.java:123)
   at org.opends.server.replication.server.changelog.file.CompositeDBCursor.addCursor(CompositeDBCursor.java:170)
   at org.opends.server.replication.server.changelog.file.CompositeDBCursor.recycleExhaustedCursors(CompositeDBCursor.java:126)
   at org.opends.server.replication.server.changelog.file.CompositeDBCursor.next(CompositeDBCursor.java:107)
   at org.opends.server.replication.server.changelog.file.DomainDBCursor.next(DomainDBCursor.java:32)

You can collect a stack trace as shown in How do I collect JVM data for troubleshooting DS/OpenDJ (All versions)?

In the changelogDb, you may notice DSID.server files for obsolete servers. Alternatively, you may have a large number of directory servers in your replication topology.
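
To see which replica IDs are present in the changelogDb, you can list them and compare the result with the server IDs you expect in your topology. This sketch is read-only and assumes the entries are named with the replica ID followed by .server, as described above. The helper name list_changelog_replica_ids is hypothetical.

```shell
# list_changelog_replica_ids is a hypothetical helper. It prints the unique
# replica IDs that have "<ID>.server" entries anywhere under the directory
# given in $1. It only reads the directory; it never modifies it.
list_changelog_replica_ids() {
  find "$1" -name '*.server' | sed 's|.*/||; s|\.server$||' | sort -u
}

# Usage (path is a placeholder):
#   list_changelog_replica_ids /path/to/ds/changelogDb
```

Any ID in the output that does not belong to a current server is a candidate obsolete replica; do not delete anything based on this output without support guidance.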

Recent Changes

Upgraded to, or installed DS 5, 5.5 or 5.5.1.

Upgraded to, or installed OpenDJ 3.5.2 or 3.5.3.

Repeatedly enabled and disabled replication (using dsreplication configure/unconfigure or dsreplication enable/disable depending on your version).

Causes

When you repeatedly enable and disable replication, old replica ID data (DSID.server) remains in the changelogDb. Due to the obsolete DSID.server data and/or the sheer number of directory servers, the changelogDb will contain a lot of data. When one RS is writing to another and iterating through the changelogDb files, this data is constantly being opened and read, which results in high CPU.

Solution

This issue can be resolved by upgrading to DS 5.5.2 or later; you can download this from BackStage.

Workaround

Please raise a ticket for assistance; the procedure to work around this issue should only be performed under support supervision.

See Also

How do I troubleshoot high CPU utilization on DS/OpenDJ (All versions) servers?

How do I find which thread is consuming CPU in a Java process in DS/OpenDJ (All versions)?

How do I migrate an existing DS+RS replication topology to a DS to RS topology in DS/OpenDJ (All versions)?

Troubleshooting DS/OpenDJ

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-4598 (Replication Server cursoring through obsolete replica ID's causing high CPU spin)


Directory server 1 was attempting to connect to replication server 2 but has disconnected in handshake phase error in DS 5 and OpenDJ 2.6.x, 3.0, 3.5, 3.5.1

The purpose of this article is to provide assistance if you encounter a "Directory server 1 was attempting to connect to replication server 2 but has disconnected in handshake phase" error in DS/OpenDJ. You may also see a result=53 message="The Replication is configured for suffix dc=openam,dc=example,dc=org but was not able to connect to any Replication Server". This issue also affects the embedded DS/OpenDJ in AM/OpenAM.

Symptoms

The errors you will see vary slightly depending on whether DS/OpenDJ is standalone or embedded as follows: 

Standalone DS/OpenDJ

The following errors are shown in the DS/OpenDJ Errors log:

[11/Nov/2016:13:22:51 -0400] category=SYNC severity=NOTICE msgID=org.opends.messages.replication.204 msg=Replication server RS(1245) started listening for new connections on address 0.0.0.0 port 8989
[11/Nov/2016:13:22:55 -0400] category=SYNC severity=WARNING msgID=org.opends.messages.replication.208 msg=Directory server DS(8764) was unable to connect to any replication servers for domain "cn=admin data"
[11/Nov/2016:13:22:59 -0400] category=SYNC severity=WARNING msgID=org.opends.messages.replication.208 msg=Directory server DS(8764) was unable to connect to any replication servers for domain "dc=example,dc=com"
[11/Nov/2016:13:23:04 -0400] category=SYNC severity=WARNING msgID=org.opends.messages.replication.208 msg=Directory server DS(8764) was unable to connect to any replication servers for domain "cn=schema"
[11/Nov/2016:13:23:26 -0400] category=SYNC severity=ERROR msgID=org.opends.messages.replication.178 msg=Directory server 8764 was attempting to connect to replication server 1245 but has disconnected in handshake phase
[11/Nov/2016:13:43:25 -0400] MODIFY RES conn=4 op=114 msgID=115 result=53 message="The Replication is configured for suffix dc=openam,dc=example,dc=org but was not able to connect to any Replication Server" etime=1

Note

A quick trace of the log messages as demonstrated in OPENDJ-1135 (DS sometimes fail to connect to RS after server restart) points to a problem with timed-out connections because the RS is unreachable. The logs show the RS listener is accepting local connections after they have already timed out on the client side (DS).

Embedded DS/OpenDJ

The following error is shown in the embedded DS/OpenDJ logs:

ERROR: Directory server DS(8764) encountered an unexpected error while connecting to replication server host1.example.com:8080 for domain "dc=example,dc=com": SocketException: Broken pipe (SocketOutputStream.java:-2 SocketOutputStream.java:113 SocketOutputStream.java:159 OutputRecord.java:377 OutputRecord.java:363 SSLSocketImpl.java:849 SSLSocketImpl.java:820 SSLSocketImpl.java:691 Handshaker.java:1011 ClientHandshaker.java:1187 ClientHandshaker.java:1099 ClientHandshaker.java:345 Handshaker.java:913 Handshaker.java:849 SSLSocketImpl.java:1035 SSLSocketImpl.java:1344 SSLSocketImpl.java:1371 SSLSocketImpl.java:1355 ReplSessionSecurity.java:196 ReplicationBroker.java:1080 ReplicationBroker.java:792 ...)
EmbeddedDJ:10/02/2016 12:26:48:618 AM CDT: Thread[Replication server RS(1245) connection listener on port 50889,5,Directory Server Thread Group]: TransactionId[ea06d007-df28-492b-b74c-7395e70e49bc-6038342]
ERROR: Directory server 8764 was attempting to connect to replication server 1245 but has disconnected in handshake phase

The following error is shown in the AM/OpenAM Configuration debug log:

amSetupServlet:11/16/2016 10:18:52:187 AM CDT: Thread[localhost-startStop-2,5,main]: TransactionId[af908ced-ff94-4ae6-a375-dc06e33eb0d9-5856958]
ERROR: EmbeddedOpenDS:shutdown hook failed
java.lang.NullPointerException
   at org.opends.server.core.DirectoryServer.shutDown(DirectoryServer.java:6170)
   at com.sun.identity.setup.EmbeddedOpenDS.shutdownServer(EmbeddedOpenDS.java:513)
   at com.sun.identity.setup.EmbeddedOpenDS$1.shutdown(EmbeddedOpenDS.java:490)
   at com.sun.identity.common.ShutdownManager.shutdown(ShutdownManager.java:211)
   at com.sun.identity.common.ShutdownServletContextListener.contextDestroyed(ShutdownServletContextListener.java:51)
   at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:5014)
   at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5659)
   at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:232)
   at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1575)
   at org.apache.catalina.core.ContainerBase$StopChild.call(ContainerBase.java:1564)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)

You will also see generic "Unwilling to Perform" errors such as the following in the AM/OpenAM Configuration log: 

amSMSEmbeddedLdap:11/16/2016 12:09:52:289 PM CDT: Thread[http-nio-0.0.0.0-9443-exec-16,5,main]: TransactionId[daaeffd3-d4a3-4324-8f1d-865e43f34773-168]
ERROR: SMSEmbeddedLdapObject.modify: Error modifying entry ou=AgentUsers,ou=default,ou=OrganizationConfig,ou=1.0,ou=sunidentityrepositoryservice,ou=services,o=agentusers,ou=services,dc=openam,dc=example,dc=com by Principal: id=amadmin,ou=user,dc=openam,dc=example,dc=com, error code = Unwilling to Perform 

AM/OpenAM

If you are using DS/OpenDJ with AM/OpenAM, you will encounter a variety of issues if AM/OpenAM cannot reach the DS/OpenDJ configuration store and/or user store. These issues include failures when making configuration changes, users not able to authenticate and installations failing with the following error shown in the Install log:

AMSetupServlet.processRequest: errororg.forgerock.opendj.ldap.ConnectionException: Server Connection Closed: Heartbeat failed

Recent Changes

Shut down the DS/OpenDJ replication server (RS) or the RS is down for another reason.

Restarted the RS instance.

Rebooted the VM in which DS/OpenDJ is running.

Causes

In a replicated system, the directory server (DS) must be able to connect to the RS. If the RS cannot be contacted (for example, it is down), any writes to the DS/OpenDJ server (whether they originate from DS/OpenDJ or AM/OpenAM) will fail. Restarting a server can also prevent the DS connecting to its own RS during the handshake phase. 

Solution

This issue can be resolved by upgrading to DS 5.5 or later, or to OpenDJ 3.5.2; you can download this from BackStage.

Workaround

As a workaround, you can restart the RS. If you are using the embedded DS/OpenDJ in AM/OpenAM, you should restart the web application container in which AM/OpenAM runs to do this.

If the other DS/OpenDJ instance is offline for a period longer than your purge delay, you will need to initialize it from a running server once it's back online to take account of any updates that occurred while it was down, for example:

$ ./dsreplication initialize --adminUID admin --adminPassword password --baseDN dc=example,dc=com --hostSource ds1.example.com --portSource 4444 --hostDestination ds2.example.com --portDestination 4444 --trustAll --no-prompt 
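
The decision above comes down to comparing the outage duration with the purge delay. The sketch below shows that comparison in hours; the 72-hour value is an assumption (a 3-day purge delay), so check the value configured on your own replication servers. The helper name needs_reinit is hypothetical.

```shell
# needs_reinit is a hypothetical helper: it succeeds (exit status 0) when the
# time a replica was offline exceeds the purge delay. Both values are in hours.
needs_reinit() {
  [ "$1" -gt "$2" ]
}

PURGE_DELAY_HOURS=72   # assumption: a 3-day purge delay; check your own setting
OFFLINE_HOURS=100      # example value

if needs_reinit "$OFFLINE_HOURS" "$PURGE_DELAY_HOURS"; then
  echo "offline longer than the purge delay: initialize from a running server"
else
  echo "within the purge delay: no re-initialization needed on this basis"
fi
```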

See Also

How do I repair replication configuration in DS/OpenDJ (All versions) when dsreplication has failed?

How do I use the Replica Remover tool in OpenDJ 2.6.x and 3.x to remove replication when the --disableAll command has failed?

Generation IDs do not match error after restoring a DS/OpenDJ (All versions) replica

How do I use cn=monitor entry in DS/OpenDJ (All versions) for monitoring?

FAQ: Monitoring DS/OpenDJ

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-1135 (DS sometimes fail to connect to RS after server restart)


OpenDJ 3.x Java upgrade causes certificate exceptions with control-panel/dsreplication/status commands

The purpose of this article is to provide assistance if you encounter certificate and "Unable to connect to the server" errors when running dsreplication commands after upgrading to Java® 1.7.0_191, 1.8.0_181 or later. A similar "An error occurred connecting to the server" is shown when running control-panel.

Symptoms

You will encounter connection errors when using dsreplication or control-panel commands:

  • dsreplication commands: running any dsreplication commands where the hostname is not a FQDN, for example:
    $ ./dsreplication status --hostname host1 --port 4444 --adminUID admin --adminPassword password --trustAll --no-prompt
    
    Gives a response similar to one of the following:
    Unable to connect to the server at host1 on port 4444. Check this port is an administration port
    
    Error reading data from server host1:4444. There is an error with the certificate presented by the server.
    Details: simple bind failed: host1:4444
    
    Whereas running the same command using a FQDN as the hostname succeeds.
  • control-panel command: login will fail with the following error when running the control-panel command:
    An error occurred connecting to the server. Details:
    javax.naming.CommunicationException: 0.0.0.0:4444 [Root exception is javax.net.ssl.SSLHandshakeException:
    java.security.cert.CertificateException: No subject alternative names present]
    

SSL debug log

An error similar to the following is shown in the SSL debug log when this happens:

LDAP Request Handler 0 for connection handler Administration Connector 192.0.2.0 port 4444, WRITE: TLSv1.2 Handshake, length = 947

LDAP Request Handler 0 for connection handler Administration Connector 192.0.2.0 port 4444, READ: TLSv1.2 Alert, length = 2

LDAP Request Handler 0 for connection handler Administration Connector 192.0.2.0 port 4444, RECV TLSv1.2 ALERT:  fatal, certificate_unknown

LDAP Request Handler 0 for connection handler Administration Connector 192.0.2.0 port 4444, fatal: engine already closed.  Rethrowing javax.net.ssl.SSLException: Received fatal alert: certificate_unknown

LDAP Request Handler 0 for connection handler Administration Connector 192.0.2.0 port 4444, fatal: engine already closed.  Rethrowing javax.net.ssl.SSLException: Received fatal alert: certificate_unknown

Note

You can generate SSL debug logs as described in FAQ: SSL certificate management in DS/OpenDJ (Q. How do I debug a SSL handshake error?)

Recent Changes

Upgraded Java to version 1.7.0_191, 1.8.0_181 or later (including Oracle® JDK and OpenJDK).

Causes

Java 1.7.0_191 and 1.8.0_181 introduced changes to improve LDAP support by enabling endpoint identification algorithms by default for LDAPS connections.

For further information see:

 Java SE Development Kit 8, Update 181 (JDK 8u181) 

 Java SE Development Kit 7, Update 191 (JDK 7u191) 
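
To check whether a given JVM falls into the affected range, a version comparison along the following lines may help. This is a minimal sketch and not part of OpenDJ: the function name endpoint_id_enforced is hypothetical, and it only understands legacy 1.7.0_NNN and 1.8.0_NNN version strings; anything else is reported as unknown (exit status 2).

```shell
# Hypothetical helper: succeeds (exit 0) when the given legacy Java version
# string is at or above the update that enabled endpoint identification for
# LDAPS connections (1.7.0_191 or 1.8.0_181).
endpoint_id_enforced() {
    version=$1
    update=${version##*_}          # text after the last underscore, e.g. 181
    case $update in
        ''|*[!0-9]*) return 2 ;;   # no plain numeric update number: cannot tell
    esac
    case $version in
        1.7.0_*) [ "$update" -ge 191 ] ;;
        1.8.0_*) [ "$update" -ge 181 ] ;;
        *)       return 2 ;;       # other version schemes are out of scope
    esac
}

# Example: endpoint_id_enforced "1.8.0_181" && echo "endpoint identification is on by default"
```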

Solution

This issue can be resolved using one of the following options:

  • Always use a FQDN for the hostname. This is a requirement for replication as stated in the Release Notes: Release Notes › FQDNs For Replication.
  • Set the new system property (com.sun.jndi.ldap.object.disableEndpointIdentification) to true to disable endpoint identification if appropriate for your environment.
  • Downgrade your version of Java.
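
As a quick pre-check for the first option, a small shell helper can flag hostnames that are not fully qualified before they are passed to dsreplication. This is only a heuristic sketch under stated assumptions: the function name is_fqdn is hypothetical, it looks at the shape of the name only, and it does not verify that the name resolves or matches the server certificate.

```shell
# Hypothetical helper (not an OpenDJ tool): succeeds when the argument looks
# like a fully qualified hostname. Heuristic only; it does not query DNS.
is_fqdn() {
    name=${1%.}                    # ignore a single trailing root dot
    case $name in
        *[!0-9.]*) ;;              # has a non-digit character: keep checking
        *) return 1 ;;             # empty, or digits and dots only (IPv4 literal)
    esac
    case $name in
        .*|*.|*..*) return 1 ;;    # empty label
        *.*)        return 0 ;;    # at least two dot-separated labels
        *)          return 1 ;;    # bare hostname such as host1
    esac
}

# Example: is_fqdn host1 || echo "host1 is not fully qualified"
```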

Setting the new system property

You can set this system property in OpenDJ as follows:

  1. Add the new system property to dsreplication.java-args in the java.properties file, for example:
    dsreplication.java-args=... -Dcom.sun.jndi.ldap.object.disableEndpointIdentification=true
  2. Apply this change by running the bin/dsjavaproperties command:
    $ ./dsjavaproperties
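
Before running dsjavaproperties, it can be worth confirming that the flag really is on the dsreplication.java-args line. The following is a minimal sketch; the function name has_endpoint_flag is hypothetical and the file path in the usage comment is a placeholder for your own instance.

```shell
# Hypothetical helper: succeeds when the dsreplication.java-args line in the
# given java.properties file carries the disableEndpointIdentification flag.
has_endpoint_flag() {
    grep -E '^dsreplication\.java-args=' "$1" |
        grep -q -e '-Dcom\.sun\.jndi\.ldap\.object\.disableEndpointIdentification=true'
}

# Example: has_endpoint_flag /path/to/opendj/config/java.properties && echo "flag present"
```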

See Also

How do I change DS/OpenDJ (All versions) to use a different JDK version?

How do I ensure DS/OpenDJ (All versions) uses the Java settings from java.properties file when DS/OpenDJ is started?

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-5336 (Dsreplication and control-panel connection fails with JVM 1.8.0_181)


Invalid Credentials response in OpenDJ 3.5 when dsreplication commands fail

The purpose of this article is to provide assistance if dsreplication commands fail in OpenDJ 3.5 with an "Invalid Credentials" error. This can happen if the global admin account password is different to the Directory Manager password (authFailureReason: "The password provided by the user did not match any password(s) stored in the user's entry") or if the rootDN is not cn=Directory Manager (authFailureReason: "Unable to bind to the Directory Server because no such user exists in the server").

Symptoms

You will see the following errors depending on your use case:

  • The admin account password is different to the Directory Manager password; the following error is shown in the access log:
    [21/Oct/2016:15:11:29 +0100] BIND REQ conn=11 op=0 msgID=1 version=3 type=SIMPLE dn="cn=Directory Manager"
    [21/Oct/2016:15:11:29 +0100] BIND RES conn=11 op=0 msgID=1 result=49 authFailureReason="The password provided by the user did not match any password(s) stored in the user's entry" authDN="cn=Directory Manager" etime=1
    
    The following error is shown in response to a dsreplication command:
    The provided credentials are not valid in server opendj.example.com:4444.
    Details: [LDAP: error code 49 - Invalid Credentials]
    
    
  • The rootDN is not cn=Directory Manager; the following is shown in the access log:
    [21/Oct/2016:15:11:29 -0100] BIND REQ conn=9 op=0 msgID=1 version=3 type=SIMPLE dn="cn=Directory Manager"
    [21/Oct/2016:15:11:29 -0100] BIND RES conn=9 op=0 msgID=1 result=49 authFailureReason="Unable to bind to the Directory Server because no such user exists in the server" authDN="cn=Directory Manager" etime=0
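
To tell the two cases apart quickly, the authFailureReason can be pulled out of the failed bind responses in the access log. This is a minimal sketch that assumes the log line layout shown above; the function name failed_bind_reasons is hypothetical.

```shell
# Hypothetical helper: prints the authFailureReason of every failed bind
# response (result=49) found in the access log given as the first argument.
failed_bind_reasons() {
    grep 'BIND RES' "$1" | grep 'result=49' |
        sed -n 's/.*authFailureReason="\([^"]*\)".*/\1/p'
}

# Example: failed_bind_reasons /path/to/opendj/logs/access | sort | uniq -c
```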
    

Recent Changes

Upgraded to, or installed, OpenDJ 3.5.

Changed the admin account password.

Installed an OpenDJ instance using a rootDN other than "cn=Directory Manager".

Causes

The dsreplication command attempts to bind with "cn=Directory Manager" regardless of whether it should be binding with the admin account or a different rootDN. Since the password and/or user do not match, the dsreplication command fails.

Solution

This issue can be resolved by upgrading to OpenDJ 3.5.1 or later; you can download this from BackStage.

If the issue is caused by the global admin and Directory Manager passwords being different, you can update them so they match as a workaround.

See Also

How do I change the admin account password used for replication in DS/OpenDJ (All versions)?

Replication in DS/OpenDJ

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-3231 (dsreplication status uses wrong bind DN)


Replication server fails to start after starting an OpenDJ 3 instance

The purpose of this article is to provide assistance if the replication server fails to start after starting an OpenDJ instance. The following error is shown when this happens: "The replication server failed to start because the database /path/to/opendj/changelogDb could not be read".

Symptoms

The following error is shown in the Errors log when the replication server fails to start:

[22/Oct/2016:16:29:52 +1000] category=SYNC severity=ERROR msgID=org.opends.messages.replication.274 msg=The following log '/path/to/opendj/changelogDb/2.dom/10566.server' must be released but it is not referenced."
[22/Oct/2016:16:29:52 +1000] category=SYNC severity=ERROR msgID=org.opends.messages.replication.11 msg=The replication server failed to start because the database /path/to/opendj/changelogDb could not be read : Could not get or create replica DB for baseDN 'dc=example,dc=com', serverId '10566', generationId '72390'

You may also experience issues with OpenAM when this happens since OpenDJ will not be able to save any configuration changes you make. You will see generic "Unwilling to Perform" errors such as the following in the OpenAM Configuration log:

amSMSEmbeddedLdap:22/10/2016 16:31:56:289 PM GMT: Thread[http-nio-0.0.0.0-9443-exec-16,5,main]: TransactionId[daaeffd3-d4a3-4324-8f1d-865e43f34773-168]
ERROR: SMSEmbeddedLdapObject.modify: Error modifying entry ou=AgentUsers,ou=default,ou=OrganizationConfig,ou=1.0,ou=sunidentityrepositoryservice,ou=services,o=agentusers,ou=services,dc=openam,dc=example,dc=com by Principal: id=amadmin,ou=user,dc=openam,dc=example,dc=com, error code = Unwilling to Perform 

If this happens, you will need to check the OpenDJ logs to verify it is the same issue. 

Recent Changes

Restarted OpenDJ.

Causes

When the changelogDb file size is a multiple of 256 (the internal block size), the server fails to compute the newest record after being restarted.
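
To see whether any changelog files currently sit on a 256-byte boundary, a check along these lines could be run against the changelogDb directory. This is a sketch under stated assumptions: the function name files_on_block_boundary is hypothetical, the article does not say which files under changelogDb are affected so every regular file is checked, and empty files are skipped.

```shell
# Hypothetical helper: prints every regular file under the given directory
# whose size in bytes is a non-zero multiple of 256.
files_on_block_boundary() {
    find "$1" -type f | while IFS= read -r f; do
        size=$(( $(wc -c < "$f") ))    # arithmetic expansion strips any padding
        if [ "$size" -gt 0 ] && [ $((size % 256)) -eq 0 ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Example: files_on_block_boundary /path/to/opendj/changelogDb
```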

Solution

This issue can be resolved by upgrading to OpenDJ 3.5 or later; you can download this from BackStage.

Workaround

You can follow this process as a workaround until your next restart:

  1. Stop the OpenDJ instance.
  2. Move the changelogDb directory to a temporary location to allow a new one to be re-created.
  3. Restart the OpenDJ instance and wait for the changelogDb directory to be rebuilt.

Once the new changelogDb directory has been rebuilt and you have verified replication is working as expected, you can remove the old changelogDb directory from the temporary location.
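
The move in step 2 can be sketched as a small shell function that refuses to overwrite an earlier copy. This is illustrative only: the function name move_changelog_aside and the paths in the comments are placeholders, and the instance must already be stopped before it is called.

```shell
# Hypothetical helper: moves the changelogDb directory of a stopped instance
# into a holding directory, refusing to overwrite an earlier copy.
move_changelog_aside() {
    instance_dir=$1
    holding_dir=$2
    [ -d "$instance_dir/changelogDb" ] || return 1    # nothing to move
    [ ! -e "$holding_dir/changelogDb" ] || return 1   # earlier copy still present
    mkdir -p "$holding_dir" && mv "$instance_dir/changelogDb" "$holding_dir/"
}

# Intended sequence (paths are placeholders):
#   /path/to/opendj/bin/stop-ds
#   move_changelog_aside /path/to/opendj /path/to/tmp
#   /path/to/opendj/bin/start-ds
```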

See Also

N/A

Related Training

N/A

Related Issue Tracker IDs

OPENAM-8347 (Embedded OpenDJ 'errors' and 'replication' logs are not written)

OPENDJ-2969 (changelogDb could not be read on OpenDJ instance startup)


Out Of Memory Error when installing OpenDJ 3, or using import-ldif, rebuild-index or dsreplication commands

The purpose of this article is to provide assistance if you encounter an "Exception in thread "main" java.lang.OutOfMemoryError at sun.misc.Unsafe.allocateMemory(Native Method)" error. This error can occur when you use the setup command to install or upgrade, or when you use the import-ldif or rebuild-index commands. Similarly, you might see an error beginning "OutOfMemoryError (Unsafe.java" when using the dsreplication command.

Symptoms

You will see one of the following errors in your errors log depending on which command you are using:

  • setup (this includes an import-ldif command that's used during the setup process):
    [12/09/2016:08:39:51 +0100] category=QUICKSETUP seq=73 severity=INFO msg=import-ldif out log: [12/09/2016:08:39:51 +0100] category=PLUGGABLE seq=11 severity=INFO msg=Import LDIF environment close took 0 seconds
    [12/09/2016:08:39:51 +0100] category=QUICKSETUP seq=74 severity=INFO msg=import-ldif out log: [12/09/2016:08:39:51 +0100] category=PLUGGABLE seq=12 severity=INFO msg=Flushing data to disk
    [12/09/2016:08:39:51 +0100] category=QUICKSETUP seq=75 severity=WARNING msg=import-ldif error log: Exception in thread "main" java.lang.OutOfMemoryError
    [12/09/2016:08:39:51 +0100] category=QUICKSETUP seq=76 severity=WARNING msg=import-ldif error log:  at sun.misc.Unsafe.allocateMemory(Native Method)
    [12/09/2016:08:39:51 +0100] category=QUICKSETUP seq=77 severity=WARNING msg=import-ldif error log:  at org.opends.server.backends.pluggable.OnDiskMergeImporter$BufferPool$OffHeapBuffer.<init>(OnDiskMergeImporter.java:2858)
    [12/09/2016:08:39:51 +0100] category=QUICKSETUP seq=78 severity=WARNING msg=import-ldif error log:  at org.opends.server.backends.pluggable.OnDiskMergeImporter$BufferPool.<init>(OnDiskMergeImporter.java:2780)
    [12/09/2016:08:39:51 +0100] category=QUICKSETUP seq=79 severity=WARNING msg=import-ldif error log:  at org.opends.server.backends.pluggable.OnDiskMergeImporter$StrategyImpl.importLDIF(OnDiskMergeImporter.java:203)
    [12/09/2016:08:39:51 +0100] category=QUICKSETUP seq=80 severity=WARNING msg=import-ldif error log:  at org.opends.server.backends.pluggable.BackendImpl.importLDIF(BackendImpl.java:689)
    ...
    
  • import-ldif:
    Exception in thread "main" java.lang.OutOfMemoryError 
       at sun.misc.Unsafe.allocateMemory(Native Method) 
       at org.opends.server.backends.pluggable.OnDiskMergeImporter$BufferPool$OffHeapBuffer.<init>(OnDiskMergeImporter.java:2858) 
       at org.opends.server.backends.pluggable.OnDiskMergeImporter$BufferPool.<init>(OnDiskMergeImporter.java:2780) 
       at org.opends.server.backends.pluggable.OnDiskMergeImporter$StrategyImpl.importLDIF(OnDiskMergeImporter.java:203) 
       at org.opends.server.backends.pluggable.BackendImpl.importLDIF(BackendImpl.java:689) 
       at org.opends.server.tools.ImportLDIF.processLocal(ImportLDIF.java:1092) 
       at org.opends.server.tools.tasks.TaskTool.process(TaskTool.java:362) 
       at org.opends.server.tools.ImportLDIF.process(ImportLDIF.java:292) 
       at org.opends.server.tools.ImportLDIF.mainImportLDIF(ImportLDIF.java:147) 
       at org.opends.server.tools.ImportLDIF.main(ImportLDIF.java:110)
    ...
    
  • rebuild-index:
    Exception in thread "main" java.lang.OutOfMemoryError 
       at sun.misc.Unsafe.allocateMemory(Native Method) 
       at org.opends.server.backends.pluggable.OnDiskMergeImporter$BufferPool$OffHeapBuffer.<init>(OnDiskMergeImporter.java:2858) 
       at org.opends.server.backends.pluggable.OnDiskMergeImporter$BufferPool.<init>(OnDiskMergeImporter.java:2780) 
       at org.opends.server.backends.pluggable.OnDiskMergeImporter$StrategyImpl.rebuildIndex(OnDiskMergeImporter.java:312) 
       at org.opends.server.backends.pluggable.OnDiskMergeImporter$StrategyImpl.rebuildIndex(OnDiskMergeImporter.java:275) 
       at org.opends.server.backends.pluggable.BackendImpl.rebuildBackend(BackendImpl.java:806) 
       at org.opends.server.tools.RebuildIndex.rebuildIndex(RebuildIndex.java:559) 
       at org.opends.server.tools.RebuildIndex.processLocal(RebuildIndex.java:321) 
       at org.opends.server.tools.tasks.TaskTool.process(TaskTool.java:362) 
       at org.opends.server.tools.RebuildIndex.process(RebuildIndex.java:228) 
       at org.opends.server.tools.RebuildIndex.mainRebuildIndex(RebuildIndex.java:138) 
       at org.opends.server.tools.RebuildIndex.main(RebuildIndex.java:110) 
    ...
    
  • dsreplication:
    [17/Sep/2016:08:39:51 +0100] category=org.opends.server.api.DirectoryThread severity=ERROR msgID=org.opends.messages.core.140 msg=An uncaught exception during processing for thread Replica DS(18687) listener for domain "dc=forgerock,dc=com" has caused it to terminate abnormally. The stack trace for that exception is: OutOfMemoryError (Unsafe.java:-2 OnDiskMergeImporter.java:2858 OnDiskMergeImporter.java:2780 OnDiskMergeImporter.java:203 BackendImpl.java:689 LDAPReplicationDomain.java:3565 ReplicationDomain.java:2275 ReplicationDomain.java:778 ReplicationDomain.java:106 ReplicationDomain.java:2965 Thread.java:744)
    [17/Sep/2016:08:39:51 +0100] category=CORE severity=NOTICE msgID=org.opends.messages.core.139 msg=The Directory Server has sent an alert notification generated by class org.opends.server.api.DirectoryThread (alert type org.opends.server.UncaughtException, alert ID org.opends.messages.core-140): An uncaught exception during processing for thread Replica DS(18687) listener for domain "dc=forgerock,dc=com" has caused it to terminate abnormally. The stack trace for that exception is: OutOfMemoryError (Unsafe.java:-2 OnDiskMergeImporter.java:2858 OnDiskMergeImporter.java:2780 OnDiskMergeImporter.java:203 BackendImpl.java:689 LDAPReplicationDomain.java:3565 ReplicationDomain.java:2275 ReplicationDomain.java:778 ReplicationDomain.java:106 ReplicationDomain.java:2965 Thread.java:744)
    ...
    

Recent Changes

Upgraded to, or installed, OpenDJ 3.0.

Causes

Changes were made in OpenDJ 3.0.0 tools that introduced a new off-heap memory management mechanism, which means that certain commands now request memory outside of the Java® heap using the sun.misc.Unsafe.allocateMemory() method.

The number of allocations depends on the number of threads and the number of attribute indexes. OpenDJ calculates how many threads it uses based on the number of CPUs available to the system. This error most likely occurs because the allocations made with the default number of threads exceed the amount of memory a 32-bit JVM can allocate.
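
To confirm whether the JVM in use is 32-bit or 64-bit, the output of java -version can be inspected. The following is a minimal sketch: the function name is_64bit_jvm is hypothetical, and it relies on the "64-Bit" marker that HotSpot and OpenJDK builds print in their VM line; other vendors may format this differently.

```shell
# Hypothetical helper: reads `java -version` output on stdin and succeeds
# when it advertises a 64-bit VM.
is_64bit_jvm() {
    grep -q '64-Bit'
}

# Example (java -version writes to stderr, hence the redirect):
#   java -version 2>&1 | is_64bit_jvm && echo "64-bit JVM"
```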

Solution

This issue can be resolved by upgrading to OpenDJ 3.5 or later which has improved/fixed memory management for imports; you can download this from BackStage.

Workaround

You can work around this issue by switching to a 64-bit JVM as follows to allow more memory to be allocated:

  1. Add the -d64 option to the start-ds.java-args property in the java.properties file (located in the /path/to/opendj/config directory), for example:
    start-ds.java-args=-server -Xms2g -Xmx2g -d64
    
  2. Run the bin/dsjavaproperties command to apply the changes you have made to the java.properties file:
    $ ./dsjavaproperties
  3. Restart the OpenDJ server.

Note

If your heap size is less than 32GB, you should also specify CompressedOops as detailed in How do I tune DS/OpenDJ (All versions) process sizes: JVM heap and database cache?

If your JVM does not support 64-bit, you can try one of the following approaches instead:

  • Limit the number of CPUs being used by these commands as this will reduce the amount of memory required. This is a server level change, but for import-ldif you can limit the number of threads being used by adding the --threadCount option to your command. For example, add the following to use 8 threads:
    --threadCount 8
  • Increase the JVM heap size. See How do I tune DS/OpenDJ (All versions) process sizes: JVM heap and database cache? for further information.
  • Run ./setup without creating the baseDN entry (that is, exclude the --addBaseEntry or -a option) and then run an import-ldif command afterwards to create it.

See Also

OpenDJ Release Notes › OpenDJ Fixes, Limitations, and Known Issues › Key Fixes in 3.5.0

How do I tune DS/OpenDJ (All versions) process sizes: JVM heap and database cache?

How do I ensure DS/OpenDJ (All versions) uses the Java settings from java.properties file when DS/OpenDJ is started?

Best practice for JVM Tuning

OpenDJ Administration Guide › Managing Directory Data › Importing and Exporting Data

How do I collect JVM data for troubleshooting DS/OpenDJ (All versions)?

Related Training

N/A

Related Issue Tracker IDs

OPENDJ-3212 (When we are upgrading Opendj from 2.4.4 to 3.0.0 version, java.lang.OutOfMemoryError occured)

OPENDJ-2721 (JE is using all the available heap memory during import.)


Missing Errors and Replication logs for embedded OpenDJ in OpenAM 13.0

The purpose of this article is to provide assistance if you notice the Errors and Replication logs for embedded OpenDJ are missing in OpenAM 13.0 or are empty.

Symptoms

The Errors and Replication logs for embedded OpenDJ are no longer available in the $HOME/[openam_instance]/opends/logs directory or are empty. The Access log is still available and populated.
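
The symptom can be confirmed with a quick check of the logs directory. This is a minimal sketch: the function name report_embedded_logs is hypothetical, and the log file names access, errors and replication are assumed to be the defaults.

```shell
# Hypothetical helper: reports, for each embedded OpenDJ log in the given
# directory, whether it is populated or missing/empty.
report_embedded_logs() {
    logs_dir=$1
    for name in access errors replication; do
        if [ -s "$logs_dir/$name" ]; then      # exists and is non-empty
            printf '%s: populated\n' "$name"
        else
            printf '%s: missing or empty\n' "$name"
        fi
    done
}

# Example: report_embedded_logs "$HOME/[openam_instance]/opends/logs"
```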

Recent Changes

Upgraded to, or installed, OpenAM 13.0.

Causes

Changes were introduced in OpenDJ 3.0 (the embedded OpenDJ version shipped with OpenAM) to make use of SLF4J logging, and OpenAM was changed to accommodate this. As a result of these changes, content from the Errors and Replication logs for embedded OpenDJ is now included in the OpenAM EmbeddedDJ debug log (located in the $HOME/[openam_instance]/openam/debug directory) when message level is enabled.

OpenAM debug logging defaults to error level. OpenDJ messages such as the following map to message level in the OpenAM logging system, which means they are not displayed in the EmbeddedDJ log unless OpenAM is set to message level:

[10/Nov/2016:09:03:51 +1200] category=UTIL severity=NOTICE msgID=org.opends.messages.runtime.21 msg=Installation Directory:  /Users/jdoe/work/forgerock/opendj
[10/Nov/2016:09:03:51 +1200] category=UTIL severity=NOTICE msgID=org.opends.messages.runtime.23 msg=Instance Directory:      /Users/jdoe/work/forgerock/opendj
[10/Nov/2016:09:03:51 +1200] category=UTIL severity=NOTICE msgID=org.opends.messages.runtime.17 msg=JVM Information: 1.7.0_80-b15 by Oracle Corporation, 64-bit architecture, 3817865216 bytes heap size
[10/Nov/2016:09:03:51 +1200] category=UTIL severity=NOTICE msgID=org.opends.messages.runtime.18 msg=JVM Host: 192.168.42.152, running Mac OS X 10.11.3 x86_64, 17179869184 bytes physical memory size, number of processors available 8

Solution

This issue can be resolved by upgrading to OpenAM 13.5 or later; you can download this from BackStage.

Note

The EmbeddedDJ debug log has been renamed OpenDJ-SDK in OpenAM 13.5.

Workaround

Alternatively, you can increase the debug level to message for the JVM as described in How do I enable message level debugging for install and upgrade issues with AM/OpenAM (All versions) ? to include the content from the old Errors and Replication logs in the OpenAM EmbeddedDJ debug log.

Warning

Increasing the debug level to message will generate large debug logs, so you are advised to look at log rotation to stop disks filling up: How do I rotate AM/OpenAM (All versions) debug logs?

In OpenAM 13.0, there is a known issue that causes the EmbeddedDJ debug log to grow excessively: OPENAM-8696 (OpenAM 13 EmbeddedDJ log spam). This is fixed in OpenAM 13.5.

See Also

How do I enable message level debugging for install and upgrade issues with AM/OpenAM (All versions) ?

How do I rotate AM/OpenAM (All versions) debug logs?

How do I clear debug logs in AM/OpenAM (All versions)?

Troubleshooting AM/OpenAM and Policy Agents

Related Training

N/A

Related Issue Tracker IDs

OPENAM-8696 (OpenAM 13 EmbeddedDJ log spam )

OPENAM-8347 (Embedded OpenDJ 'errors' and 'replication' logs are not written)

OPENAM-8070 (OpenDJ logs are redirected to the OpenAM debug logs when enabling the message level)


Copyright and Trademarks

Copyright © 2018 ForgeRock, all rights reserved.
