DS 7.1.7

Indexes

About Indexes

A basic, standard directory feature is the ability to respond quickly to searches.

An LDAP search specifies the information that directly affects how long the directory might take to respond:

  • The base DN for the search.

    The more specific the base DN, the less information to check during the search. For example, a request with base DN dc=example,dc=com potentially involves checking many more entries than a request with base DN uid=bjensen,ou=people,dc=example,dc=com.

  • The scope of the search.

    A subtree or one-level scope targets many entries, whereas a base search is limited to one entry.

  • The search filter to match.

    A search filter asserts that for an entry to match, it has an attribute that corresponds to some value. For example, (cn=Babs Jensen) asserts that cn must have a value that equals Babs Jensen.

    A directory server would waste resources checking all entries for a match. Instead, directory servers maintain indexes to expedite checking for a match.

LDAP directory servers disallow searches that cannot be handled expediently using indexes. Maintaining appropriate indexes is a key aspect of directory administration.

The role of an index is to answer the question, "Which entries have an attribute with this corresponding value?" Each index is therefore specific to an attribute. Each index is also specific to the comparison implied in the search filter. For example, a directory server maintains distinct indexes for exact (equality) matching and for substring matching. The types of indexes are explained in Index Types. Furthermore, indexes are configured in specific directory backends.

An index is implemented as a tree of key-value pairs. The key is a form of the value to match, such as babs jensen. The value is a list of IDs for entries that match the key. The figure that follows shows an equality (case ignore exact match) index with five keys from a total of four entries. If the data set were large, there could be more than one entry ID per key:

An index is implemented as a tree of key-value pairs.

This is how DS directory servers use indexes. When the search filter is (cn=Babs Jensen), the directory server retrieves the IDs for entries with a CN matching Babs Jensen from the equality index of the CN attribute. (For a complex filter, it might optimize the search by changing the order in which it uses the indexes.) A successful result is zero or more entry IDs. These are the candidate result entries.

For each candidate, the DS directory server retrieves the entry by ID from a special system index called id2entry. As its name suggests, this index returns an entry for an entry ID. If there is a match, and the client application has the right to access to the data, the directory server returns the search result. It continues this process until no candidates are left.

If there are no indexes that correspond to a search request, the server must check for a match against every entry in the scope of the search. Evaluating every entry for a match is referred to as an unindexed search. An unindexed search is an expensive operation, particularly for large directories. A server refuses unindexed searches unless the user has specific permission to make such requests. The permission to perform an unindexed search is granted with the unindexed-search privilege. This privilege is reserved for the directory superuser by default. It should not be granted lightly.

If the number of entries is smaller than the default resource limits, you can still perform what appear to be unindexed searches, meaning searches with filters for which no index appears to exist. That is because the dn2id index returns all user data entries without hitting a resource limit that would make the search unindexed.

Use cases that may call for unindexed searches include the following:

  • An application must periodically retrieve a very large amount of directory data all at once through an LDAP search.

    For example, an application performs an LDAP search to retrieve everything in the directory once a week as part of a batch job that runs during off hours.

    Make sure the application has no resource limits. For details, see Resource Limits.

  • A directory data administrator occasionally browses directory data through a graphical UI without initially knowing what they are looking for or how to narrow the search.

    For this case, DS directory servers can sort unindexed search results as long as they are paged.

    This capability has the following limitations:

    • The simple paged results control must specify a page size that is less than or equal to the index-entry-limit (default: 4000).

    • For each page, the server reads the entire backend database, retaining page size number of sorted entries.

    Alternatively, DS directory servers can use an appropriately configured VLV index to sort results for an unindexed search. For details, see VLV for Paged Server-Side Sort.

What To Index

DS directory server search performance depends on indexes. The default settings are fine for evaluating DS software, and they work well with sample data. The default settings do not necessarily fit your directory data, and the searches your applications perform.

Necessary Indexes

Index maintenance has its costs. Every time an indexed attribute is updated, the server must update each affected index to reflect the change. This is wasteful if the index is not used. Indexes, especially substring indexes, can occupy more memory and disk space than the corresponding data.

Aim to maintain only indexes that speed up appropriate searches, and that allow the server to operate properly. The former indexes depend on how directory users search, and require thought and investigation. The latter includes non-configurable internal indexes, that should not change.

Begin by reviewing the attributes of your directory data. Which attributes would you expect to see in a search filter? If an attribute is going to show up frequently in reasonable search filters, then index it.

Compare your guesses with what you see actually happening in the directory. One approach is to review the access log for search results with additional items like unindexed. The following example shows the relevant fields in an access log message:

{
  "request": {
    "protocol": "LDAP",
    "operation": "SEARCH"
  },
  "response": {
    "detail": "You do not have sufficient privileges to perform an unindexed search",
    "additionalItems": {
      "unindexed": null
    }
  }
}

Review the messages in the access log, as they also specify the search filter and scope. Understand the search that led to each unindexed search. If the filter is appropriate and frequently used, add an index to facilitate the search. You can either consume the access logs to determine how often a search filter is used, or monitor what is happening in the directory with the index analysis feature.

DS servers provide the index analysis feature to collect information about filters in search requests. You can activate the index analysis mechanism using the dsconfig set-backend-prop command:

$ dsconfig \
 set-backend-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --set index-filter-analyzer-enabled:true \
 --no-prompt \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin

The command causes the server to analyze filters used, and to keep the results in memory. You can read the results as monitoring information:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDN uid=admin \
 --bindPassword password \
 --baseDN ds-cfg-backend-id=dsEvaluation,cn=Backends,cn=monitor \
 --searchScope base \
 "(&)" \
 ds-mon-backend-filter-use-start-time ds-mon-backend-filter-use-indexed ds-mon-backend-filter-use-unindexed ds-mon-backend-filter-use

dn: ds-cfg-backend-id=dsEvaluation,cn=backends,cn=monitor
ds-mon-backend-filter-use-start-time: <timestamp>
ds-mon-backend-filter-use-indexed: 2
ds-mon-backend-filter-use-unindexed: 3
ds-mon-backend-filter-use: {"search-filter":"(employeenumber=86182)","nb-hits":1,"latest-failure-reason":"caseIgnoreMatch index type is disabled for the employeeNumber attribute"}

The ds-mon-backend-filter-use values include the following fields:

search-filter

The LDAP search filter.

nb-hits

The number of times the filter was used.

latest-failure-reason

A message describing why the server could not use any index for this filter.

The output can include filters for internal use, such as (aci=*). In the example above, you see a filter used by a client application.

In the example, a search filter that led to an unindexed search, (employeenumber=86182), had no matches because, "caseIgnoreMatch index type is disabled for the employeeNumber attribute". Some client application has tried to find users by employee number, but no index exists for that purpose. If this appears regularly as a frequent search, add an employee number index.

To avoid impacting server performance, turn off index analysis after you collect the information you need. Turn off index analysis with the dsconfig set-backend-prop command:

$ dsconfig \
 set-backend-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --set index-filter-analyzer-enabled:false \
 --no-prompt \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin

Directory users might complain to you that their searches are refused because they are unindexed. Ask for the result code, additional information, and search filter. DS directory servers respond to LDAP client applications that attempt unindexed searches with a result code of 50 and additional information about the unindexed search. The following example attempts, anonymously, to get the entries for all users whose email address ends in .com:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --baseDN ou=people,dc=example,dc=com \
 "(&(mail=*.com)(objectclass=person))"

# The LDAP search request failed: 50 (Insufficient Access Rights)
# Additional Information:  You do not have sufficient privileges to perform an unindexed search

Before you change settings to permit a search, understand why the user wants to perform an unindexed search.

Perhaps they are unintentionally requesting an unindexed search. If so, you can help them find a less expensive search, by using an approach that limits the number of candidate result entries. For example, if a GUI application lets a user browse a group of entries, the application could use a browsing index to retrieve a block of entries for each screen, rather than retrieving all the entries at once.

Perhaps they do have a legitimate reason to get the full list of all entries in one operation, such as regularly rebuilding some database that depends on the directory. If so, assign their application’s account the unindexed-search privilege.

In addition to responding to client search requests, a server performs internal searches. Internal searches let the server retrieve data needed for a request, and maintain internal state information. In some cases internal searches can become unindexed. When this happens, the server logs a warning similar to the following:

The server is performing an unindexed internal search request
with base DN '%s', scope '%s', and filter '%s'. Unindexed internal
searches are usually unexpected and could impact performance.
Please verify that that backend's indexes are configured correctly
for these search parameters.

When you see a message like this in the server log, take these actions:

  • Figure out which indexes are missing, and add them.

    For details, see Debug Search Indexes, and Configure Indexes.

  • Check the integrity of the indexes.

    For details, see Verify Indexes.

  • If the relevant indexes exist, and you have verified that they are sound, the index entry limit might be too low.

    This can happen, for example, in directory servers with more than 4000 groups in a single backend. For details, see Index Entry Limits.

  • If you have made the changes described in the steps above, and problem persists, contact technical support.

Sometimes it is not obvious by inspection how a directory server processes a given search request. The directory superuser can gain insight with the debugsearchindex attribute.

The default global access control prevents users from reading the debugsearchindex attribute. To allow an administrator to read the attribute, add a global ACI such as the following:

$ dsconfig \
 set-access-control-handler-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --add global-aci:"(targetattr=\"debugsearchindex\")(version 3.0; acl \"Debug search indexes\"; \
  allow (read,search,compare) userdn=\"ldap:///uid=user.0,ou=people,dc=example,dc=com\";)" \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

The format of debugsearchindex values has interface stability: Internal.

The values are intended to be read by human beings, not scripts. If you do write scripts that interpret debugsearchindex values, be aware that they are not stable. Be prepared to adapt your scripts for every upgrade or patch.

The debugsearchindex attribute value indicates how the server would process the search. The server use its indexes to prepare a set of candidate entries. It iterates through the set to compare candidates with the search filter, returning entries that match. The following example demonstrates this feature for a subtree search with a complex filter:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDN uid=user.0,ou=people,dc=example,dc=com \
 --bindPassword password \
 --baseDN dc=example,dc=com \
 "(&(objectclass=person)(givenName=aa*))" \
 debugsearchindex | sed -n -e "s/^debugsearchindex: //p"

{
  "filter": {
    "intersection": [
      {
        "index": "givenName.caseIgnoreSubstringsMatch:6",
        "range": "[aa,ab[",
        "candidates": 171,
        "retained": 171
      },
      {
        "index": "givenName.caseIgnoreMatch",
        "range": "[aa,ab[",
        "candidates": 50,
        "retained": 50
      },
      {
        "filter": "(objectclass=person)",
        "union": [
          {
            "index": "objectClass.objectIdentifierMatch",
            "exact": "person",
            "candidates": "[LIMIT-EXCEEDED]"
          }
        ],
        "candidates": "[LIMIT-EXCEEDED]",
        "retained": 50
      }
    ],
    "candidates": 50
  },
  "scope": {
    "type": "sub",
    "candidates": "[NOT-INDEXED]",
    "retained": 50
  },
  "final": 50
}

The filter in the example matches person entries whose given name starts with aa. The search scope is not explicitly specified, so the scope defaults to the subtree including the base DN.

Notice that the debugsearchindex value has the following top-level fields:

  • (Optional) "vlv" describes how the server uses VLV indexes.

    The VLV field is not applicable for this example, and so is not present.

  • "filter" describes how the server uses the search filter to narrow the set of candidates.

  • "scope" describes how the server uses the search scope.

  • "final" indicates the final number of candidates in the set.

In the output, notice that the server uses the equality and substring indexes to find candidate entries whose given name starts with aa. If the filter indicated given names containingaa, as in givenName=aa, the server would rely only on the substring index.

Notice that the output for the (objectclass=person) portion of the filter shows "candidates": "[LIMIT-EXCEEDED]". In this case, there are so many entries matching the value specified that the index is not useful for narrowing the set of candidates. The scope is also not useful for narrowing the set of candidates. Ultimately, however, the givenName indexes help the server to narrow the set of candidates. The overall search is indexed and the result is 50 matching entries.

The following example shows a subtree search for accounts with initials starting either with aa or with zz:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --baseDN dc=example,dc=com \
 --bindDN uid=user.0,ou=people,dc=example,dc=com \
 --bindPassword password \
 "(|(initials=aa*)(initials=zz*))" \
 debugsearchindex | sed -n -e "s/^debugsearchindex: //p"

{
  "filter": {
    "union": [
      {
        "filter": "(initials=aa*)",
        "index": "initials.presence",
        "diagnostic": "not indexed",
        "candidates": "[NOT-INDEXED]"
      }
    ],
    "candidates": "[NOT-INDEXED]"
  },
  "scope": {
    "type": "sub",
    "candidates": "[NOT-INDEXED]",
    "retained": "[NOT-INDEXED]"
  },
  "final": "[NOT-INDEXED]"
}

As shown in the output, the search is not indexed. To fix this, index the initials attribute:

$ dsconfig \
 create-backend-index \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --index-name initials \
 --set index-type:equality \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

$ rebuild-index \
 --hostname localhost \
 --port 4444 \
 --bindDn uid=admin \
 --bindPassword password \
 --baseDn dc=example,dc=com \
 --index initials \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin

After configuring and building the new index, try the same search again:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --baseDN dc=example,dc=com \
 --bindDN uid=user.0,ou=people,dc=example,dc=com \
 --bindPassword password \
 "(|(initials=aa*)(initials=zz*))" \
 debugsearchindex | sed -n -e "s/^debugsearchindex: //p"

{
  "filter": {
    "union": [
      {
        "filter": "(initials=aa*)",
        "intersection": [
          {
            "index": "initials.caseIgnoreSubstringsMatch:6",
            "range": "[aa,ab[",
            "diagnostic": "not indexed",
            "tryPresenceIndex": {
              "index": "initials.presence",
              "diagnostic": "not indexed"
            },
            "candidates": "[NOT-INDEXED]",
            "retained": "[NOT-INDEXED]"
          },
          {
            "index": "initials.caseIgnoreMatch",
            "range": "[aa,ab[",
            "candidates": 378,
            "retained": 378
          }
        ],
        "candidates": 378
      },
      {
        "filter": "(initials=zz*)",
        "intersection": [
          {
            "index": "initials.caseIgnoreSubstringsMatch:6",
            "range": "[zz,z{[",
            "diagnostic": "not indexed",
            "tryPresenceIndex": {
              "index": "initials.presence",
              "diagnostic": "not indexed"
            },
            "candidates": "[NOT-INDEXED]",
            "retained": "[NOT-INDEXED]"
          },
          {
            "index": "initials.caseIgnoreMatch",
            "range": "[zz,z{[",
            "candidates": 26,
            "retained": 26
          }
        ],
        "candidates": 26
      }
    ],
    "candidates": 404
  },
  "scope": {
    "type": "sub",
    "candidates": "[NOT-INDEXED]",
    "retained": 404
  },
  "final": 404
}

Notice that the server can narrow the list of candidates using the equality index you created. The server would require a substring index instead of an equality index if the filter were not matching initial strings.

If an index already exists, but you suspect it is not working properly, see Verify Indexes.

Index Types

DS directory servers support multiple index types, each corresponding to a different type of search.

View what is indexed by using the backendstat list-indexes command. For details about a particular index, you can use the backendstat dump-index command.

Presence Index

A presence index matches an attribute that is present on the entry, regardless of the value. By default, the aci attribute is indexed for presence:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDN uid=admin \
 --bindPassword password \
 --baseDN dc=example,dc=com \
 "(aci=*)" \
 aci

A presence index takes up less space than other indexes. In a presence index, there is just one key with a list of IDs.

The following command examines the ACI presence index for a server configured with the evaluation profile:

$ stop-ds

$ backendstat \
 dump-index \
 --backendId dsEvaluation \
 --baseDn dc=example,dc=com \
 --indexName aci.presence

Key (len 1): PRESENCE
Value (len 3): [COUNT:2] 1 9

Total Records: 1
Total / Average Key Size: 1 bytes / 1 bytes
Total / Average Data Size: 3 bytes / 3 bytes

In this case, entries with ACI attributes have IDs 1 and 9.

Equality Index

An equality index matches values that correspond exactly (generally ignoring case) to those in search filters. An equality index requires clients to match values without wildcards or misspellings:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --baseDN dc=example,dc=com \
 "(uid=bjensen)" \
 mail

dn: uid=bjensen,ou=People,dc=example,dc=com
mail: bjensen@example.com

An equality index has one list of entry IDs for each attribute value. Depending on the backend implementation, the keys in a case-insensitive index might not be strings. For example, a key of 6A656E73656E could represent jensen.

The following command examines the SN equality index for a server configured with the evaluation profile:

$ stop-ds

$ backendstat \
 dump-index \
 --backendID dsEvaluation \
 --baseDN dc=example,dc=com \
 --indexName sn.caseIgnoreMatch | grep -A 1 "jensen$"

Key (len 6): jensen
Value (len 26): [COUNT:17] 18 31 32 66 79 94 133 134 150 5996 19415 32834 46253 59672 73091 86510 99929

In this case, there are 17 entries that have an SN of Jensen.

Unless the keys are encrypted, the server can reuse an equality index for ordering and initial substring searches.

Approximate Index

An approximate index matches values that "sound like" those provided in the filter. An approximate index on sn lets client applications find people even when they misspell surnames:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --baseDN dc=example,dc=com \
 "(&(sn~=Jansen)(cn=Babs*))" \
 cn

dn: uid=bjensen,ou=People,dc=example,dc=com
cn: Barbara Jensen
cn: Babs Jensen

An approximate index squashes attribute values into a normalized form.

The following command examines an SN approximate index added to a server configured with the evaluation profile:

$ stop-ds

$ backendstat \
 dump-index \
 --backendID dsEvaluation \
 --baseDN dc=example,dc=com \
 --indexName sn.ds-mr-double-metaphone-approx | grep -A 1 "JNSN$"

Key (len 4): JNSN
Value (len 83): [COUNT:74] 18 31 32 59 66 79 94 133 134 150 5928 5939 5940 5941 5996 5997 6033 6034 19347 19358 19359 19360 19415 19416 19452 19453 32766 32777 32778 32779 32834 32835 32871 32872 46185 46196 46197 46198 46253 46254 46290 46291 59604 59615 59616 59617 59672 59673 59709 59710 73023 73034 73035 73036 73091 73092 73128 73129 86442 86453 86454 86455 86510 86511 86547 86548 99861 99872 99873 99874 99929 99930 99966 99967

In this case, there are 74 entries that have an SN that sounds like Jensen.

Substring Index

A substring index matches values that are specified with wildcards in the filter. Substring indexes can be expensive to maintain, especially for large attribute values:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --baseDN dc=example,dc=com \
 "(cn=Barb*)" \
 cn

dn: uid=bfrancis,ou=People,dc=example,dc=com
cn: Barbara Francis

dn: uid=bhal2,ou=People,dc=example,dc=com
cn: Barbara Hall

dn: uid=bjablons,ou=People,dc=example,dc=com
cn: Barbara Jablonski

dn: uid=bjensen,ou=People,dc=example,dc=com
cn: Barbara Jensen
cn: Babs Jensen

dn: uid=bmaddox,ou=People,dc=example,dc=com
cn: Barbara Maddox

In a substring index, there are enough keys to match any substring in the attribute values. Each key is associated with a list of IDs. The default maximum size of a substring key is 6 bytes.

The following command examines an SN substring index for a server configured with the evaluation profile:

$ stop-ds

$ backendstat \
dump-index \
 --backendID dsEvaluation \
 --baseDN dc=example,dc=com \
 --indexName sn.caseIgnoreSubstringsMatch:6

...
Key (len 1): e
Value (len 25): [COUNT:22] ...
...
Key (len 2): en
Value (len 15): [COUNT:12] ...
...
Key (len 3): ens
Value (len 3): [COUNT:1] 147
Key (len 5): ensen
Value (len 10): [COUNT:9] 18 31 32 66 79 94 133 134 150
...
Key (len 6): jensen
Value (len 10): [COUNT:9] 18 31 32 66 79 94 133 134 150
...
Key (len 1): n
Value (len 35): [COUNT:32] ...
...
Key (len 2): ns
Value (len 3): [COUNT:1] 147
Key (len 4): nsen
Value (len 10): [COUNT:9] 18 31 32 66 79 94 133 134 150
...
Key (len 1): s
Value (len 13): [COUNT:12] 12 26 47 64 95 98 108 131 135 147 149 154
...
Key (len 2): se
Value (len 7): [COUNT:6] 52 58 75 117 123 148
Key (len 3): sen
Value (len 10): [COUNT:9] 18 31 32 66 79 94 133 134 150
...

In this case, the SN value Jensen shares substrings with many other entries. The size of the lists and number of keys make a substring index much more expensive to maintain than other indexes. This is particularly true for longer attribute values.

Ordering Index

An ordering index is used to match values for a filter that specifies a range. For example, the ds-sync-hist attribute used by replication has an ordering index by default. Searches on that attribute often seek entries with changes more recent than the last time a search was performed.

The following example shows a search that specifies a range on the SN attribute value:

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --baseDN dc=example,dc=com \
 "(sn>=zyw)" \
 sn

dn: uid=user.13401,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.26820,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.40239,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.53658,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.67077,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.80496,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.93915,ou=People,dc=example,dc=com
sn: Zywiel

In this case, the server only requires an ordering index if it cannot reuse the (ordered) equality index instead. For example, if the equality index is encrypted, an ordering index must be maintained separately.

Virtual List View (Browsing) Index

A virtual list view (VLV) or browsing index is designed to help applications that list results. For example, a GUI application might let users browse through a list of users. VLVs help the server respond to clients that request server-side sorting of the search results.

VLV indexes correspond to particular searches. Configure your VLV indexes using the command line.

Extensible Matching Rule Index

In some cases, you need an index for a matching rule other than those described above.

For example, a generalized time-based matching index lets applications find entries with a time-based attribute later or earlier than a specified time.

Configure Indexes

You modify index configurations by using the dsconfig command. Configuration changes take effect after you rebuild the index with the new configuration, using the rebuild-index command. The dsconfig --help-database command lists subcommands for creating, reading, updating, and deleting index configuration.

Indexes are per directory backend rather than per base DN. To maintain separate indexes for different base DNs on the same server, put the entries in different backends.

Standard Indexes

New Index

The following example creates a new equality index for the description attribute:

$ dsconfig \
 create-backend-index \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --index-name description \
 --set index-type:equality \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

Approximate Index

The following example adds an approximate index for the sn (surname) attribute:

$ dsconfig \
 set-backend-index-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --index-name sn \
 --add index-type:approximate \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

Approximate indexes depend on the Double Metaphone matching rule.

Extensible Match Index

DS servers support matching rules defined in LDAP RFCs. They also define DS-specific extensible matching rules.

The following are DS-specific extensible matching rules:

Name: ds-mr-double-metaphone-approx

Double Metaphone Approximate Match described at http://aspell.net/metaphone/. The DS implementation always produces a single value rather than one or possibly two values.

Configure approximate indexes as described in Approximate Index.

For an example using this matching rule, see Approximate Match.

Name: ds-mr-user-password-exact

User password exact matching rule used to compare encoded bytes of two hashed password values for exact equality.

Name: ds-mr-user-password-equality

User password matching rule implemented as the user password exact matching rule.

Name: partialDateAndTimeMatchingRule

Partial date and time matching rule for matching parts of dates in time-based searches.

For an example using this matching rule, see Active Accounts.

Name: relativeTimeOrderingMatch.gt

Greater-than relative time matching rule for time-based searches.

For an example using this matching rule, see Active Accounts.

Name: relativeTimeOrderingMatch.lt

Less-than relative time matching rule for time-based searches.

For an example using this matching rule, see Active Accounts.

The following example configures an extensible matching rule index for "later than" and "earlier than" generalized time matching on a lastLoginTime attribute:

$ dsconfig \
 create-backend-index \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --set index-type:extensible \
 --set index-extensible-matching-rule:1.3.6.1.4.1.26027.1.4.5 \
 --set index-extensible-matching-rule:1.3.6.1.4.1.26027.1.4.6 \
 --index-name lastLoginTime \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

Indexes for JSON

DS servers support attribute values that have JSON syntax. The following schema excerpt defines a json attribute with case-insensitive matching:

attributeTypes: ( json-attribute-oid NAME 'json'
  SYNTAX 1.3.6.1.4.1.36733.2.1.3.1 EQUALITY caseIgnoreJsonQueryMatch
  SINGLE-VALUE X-ORIGIN 'DS Documentation Examples' )

When you index a JSON attribute defined in this way, the default directory server behavior is to maintain index keys for each JSON field. Large or numerous JSON objects can result in large indexes, which is wasteful. If you know which fields are used in search filters, you can choose to index only those fields.

As described in Schema and JSON, for some JSON objects only a certain field or fields matter when comparing for equality. In these special cases, the server can ignore other fields when checking equality during updates, and you would not maintain indexes for other fields.

How you index a JSON attribute depends on the matching rule in the attribute’s schema definition, and on the JSON fields you expect to be used as search keys for the attribute.

Index JSON Attributes

The examples that follow demonstrate these steps:

  1. Using the schema definition and the information in the following table, configure a custom schema provider for the attribute’s matching rule, if necessary.

    Matching Rule in Schema Definition Fields in Search Filter Custom Schema Provider Required?

    caseExactJsonQueryMatch

    caseIgnoreJsonQueryMatch

    Any JSON field

    No

    caseExactJsonQueryMatch

    caseIgnoreJsonQueryMatch

    Specific JSON field or fields

    caseExactJsonIdMatch

    caseIgnoreJsonIdMatch

    "_id" field only

    Custom JSON equality matching rule

    Specific field(s)

    A custom schema provider applies to all attributes using this matching rule.

  2. Add the schema definition for the JSON attribute.

  3. Configure the index for the JSON attribute.

  4. Add the JSON attribute values in the directory data.

JSON Query Matching Rule Index

This example illustrates the steps in Index JSON Attributes.

If you installed a directory server with the ds-evaluation profile, the custom index configuration is already present.

The following command configures a custom, case-insensitive JSON query matching rule. This only maintains keys for the access_token and refresh_token fields:

$ dsconfig \
 create-schema-provider \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --provider-name "Custom JSON Query Matching Rule" \
 --type json-query-equality-matching-rule \
 --set enabled:true \
 --set case-sensitive-strings:false \
 --set ignore-white-space:true \
 --set matching-rule-name:caseIgnoreJsonQueryMatch \
 --set matching-rule-oid:1.3.6.1.4.1.36733.2.1.4.1 \
 --set indexed-field:access_token \
 --set indexed-field:refresh_token \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

The following commands add schemas for a json attribute that uses the matching rule:

$ ldapmodify \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDN uid=admin \
 --bindPassword password << EOF
dn: cn=schema
changetype: modify
add: attributeTypes
attributeTypes: ( json-attribute-oid NAME 'json'
  SYNTAX 1.3.6.1.4.1.36733.2.1.3.1 EQUALITY caseIgnoreJsonQueryMatch
  SINGLE-VALUE X-ORIGIN 'DS Documentation Examples' )
-
add: objectClasses
objectClasses: ( json-object-class-oid NAME 'jsonObject' SUP top
  AUXILIARY MAY ( json ) X-ORIGIN 'DS Documentation Examples' )
EOF

The following command configures an index using the custom matching rule implementation:

$ dsconfig \
 create-backend-index \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --index-name json \
 --set index-type:equality \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

For an example of how a client application could use this index, see JSON Query Filters.

JSON Equality Matching Rule Index

This example illustrates the steps in Index JSON Attributes.

If you installed a directory server with the ds-evaluation profile, the custom index configuration is already present.

The following command configures a custom, case-insensitive JSON equality matching rule, caseIgnoreJsonTokenIdMatch. This indexes the id rather than the _id field:

$ dsconfig \
 create-schema-provider \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --provider-name "Custom JSON Token ID Matching Rule" \
 --type json-equality-matching-rule \
 --set enabled:true \
 --set case-sensitive-strings:false \
 --set ignore-white-space:true \
 --set matching-rule-name:caseIgnoreJsonTokenIDMatch \
 --set matching-rule-oid:1.3.6.1.4.1.36733.2.1.4.4.1 \
 --set json-keys:id \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

Notice that this example defines a matching rule with OID 1.3.6.1.4.1.36733.2.1.4.4.1. In production deployments, use a numeric OID allocated for your own organization.

The following commands add schemas for a jsonToken attribute, where the unique identifier is in the "id" field of the JSON object:

$ ldapmodify \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDN uid=admin \
 --bindPassword password << EOF
dn: cn=schema
changetype: modify
add: attributeTypes
attributeTypes: ( jsonToken-attribute-oid NAME 'jsonToken'
  SYNTAX 1.3.6.1.4.1.36733.2.1.3.1 EQUALITY caseIgnoreJsonTokenIDMatch
  SINGLE-VALUE X-ORIGIN 'DS Documentation Examples' )
-
add: objectClasses
objectClasses: ( json-token-object-class-oid NAME 'JsonTokenObject' SUP top
  AUXILIARY MAY ( jsonToken ) X-ORIGIN 'DS Documentation Examples' )
EOF

The following command configures an index using the custom matching rule implementation:

$ dsconfig \
 create-backend-index \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --index-name jsonToken \
 --set index-type:equality \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

For an example of how a client application could use this index, see JSON Assertions.

Virtual List View Index

The following example shows how to create a VLV index. This example applies where GUI users browse user accounts, sorting on surname then given name:

$ dsconfig \
 create-backend-vlv-index \
 --hostname localhost \
 --port 4444 \
 --bindDn uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --index-name people-by-last-name \
 --set base-dn:ou=People,dc=example,dc=com \
 --set filter:"(|(givenName=*)(sn=*))" \
 --set scope:single-level \
 --set sort-order:"+sn +givenName" \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

When referring to a VLV index after creation, you must add vlv. as a prefix. In other words, if you named the VLV index people-by-last-name, refer to it as vlv.people-by-last-name when rebuilding indexes, changing index properties such as the index entry limit, or verifying indexes.

VLV for Paged Server-Side Sort

A special VLV index lets the server sort results for a search that is technically unindexed. For example, users page through an entire directory database in a GUI. The user does not filter the data before seeing what is available.

Because the search is technically unindexed, the user performing the search must have the unindexed-search privilege.

The VLV index must have the following characteristics:

  • Its filter must be "always true," (&).

  • Its scope must cover the search scope of the requests.

  • Its base DN must match or be a parent of the base DN of the search requests.

  • Its sort order must match the sort keys of the requests in the order they occur in the requests, starting with the first sort key used in the request. The VLV index sort order can include additional keys not present in a request.

    In other words, if the sort order of the VLV index is +l +sn +cn, then it works with requests having the following sort orders:

    • +l +sn +cn

    • +l +sn

    • +l

The following example commands demonstrate creating and using a VLV index to sort paged results by locality, surname, and then full name. The l attribute is not indexed by default. This example makes use of the rebuild-index command described below. The directory superuser is not subject to resource limits on the LDAP search operation:

$ dsconfig \
 create-backend-vlv-index \
 --hostname localhost \
 --port 4444 \
 --bindDn uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --index-name by-name \
 --set base-dn:ou=People,dc=example,dc=com \
 --set filter:"(&)" \
 --set scope:subordinate-subtree \
 --set sort-order:"+l +sn +cn" \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

$ rebuild-index \
 --hostname localhost \
 --port 4444 \
 --bindDn uid=admin \
 --bindPassword password \
 --baseDn dc=example,dc=com \
 --index vlv.by-name \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin

$ ldapsearch \
 --hostname localhost \
 --port 1636 \
 --useSsl \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --bindDn uid=admin \
 --bindPassword password \
 --baseDn dc=example,dc=com \
 --sortOrder +l,+sn,+cn \
 --simplePageSize 5 \
 "(&)" \
 cn l sn

dn: uid=user.93953,ou=People,dc=example,dc=com
cn: Access Abedi
l: Abilene
sn: Abedi

dn: uid=user.40283,ou=People,dc=example,dc=com
cn: Achal Abernathy
l: Abilene
sn: Abernathy

dn: uid=user.67240,ou=People,dc=example,dc=com
cn: Alaine Alburger
l: Abilene
sn: Alburger

dn: uid=user.26994,ou=People,dc=example,dc=com
cn: Alastair Alexson
l: Abilene
sn: Alexson

dn: uid=user.53853,ou=People,dc=example,dc=com
cn: Alev Allen
l: Abilene
sn: Allen

Press RETURN to continue ^C

Rebuild Indexes

When you first import directory data, the directory server builds the indexes as part of the import process. DS servers maintain indexes automatically, updating them as directory data changes.

Only rebuild an index manually when it is necessary to do so. Rebuilding valid indexes wastes server resources, and is disruptive for client applications.

When you rebuild an index while the server is online, the index appears as degraded and unavailable while the server rebuilds it.

A search request that relies on an index in this state may temporarily fail as an unindexed search.

However, you must manually intervene when you:

  • Create a new index for a new directory attribute.

  • Create a new index for existing directory attribute.

  • Change the server configuration in a way that affects the index, for example, by changing Index Entry Limits.

  • Verify an existing index, and find that it has errors or is not in a valid state.

Automate Index Rebuilds

To automate the process of rebuilding indexes, use the --rebuildDegraded option. This rebuilds only degraded indexes, and does not affect valid indexes:

$ rebuild-index \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --baseDN dc=example,dc=com \
 --rebuildDegraded \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin

Clear a New Index for a New Attribute

When you add a new attribute, as described in Update LDAP Schema, and create an index for the new attribute, the new index appears as degraded or invalid. The attribute has not yet been used, and so the index is sure to be empty, not degraded.

In this case, you can safely use the rebuild-index --clearDegradedState command. The server can complete this operation quickly, because there is no need to scan the entire directory backend to rebuild a new, unused index.

In this example, an index has just been created for newUnusedAttribute. If the newly indexed attribute has already been used, rebuild the index instead of clearing the degraded state.

Before using the rebuild-index command, test the index status to make sure is has not been used:

  1. DS servers must be stopped before you use the backendstat command:

    $ stop-ds
  2. Check the status of the index(es).

    The third column of the output is the Valid column, which is false before the rebuild:

    $ backendstat \
     show-index-status \
     --backendID dsEvaluation \
     --baseDN dc=example,dc=com \
     | grep newUnusedAttribute
    
    newUnusedAttribute.presence                    ... false ...
    newUnusedAttribute.caseIgnoreMatch             ... false ...
    newUnusedAttribute.caseIgnoreSubstringsMatch:6 ... false ...
  3. Update the index information to fix the value of the unused index:

    $ rebuild-index \
     --offline \
     --baseDN dc=example,dc=com \
     --clearDegradedState \
     --index newUnusedAttribute

    Alternatively, you can first start the server, and perform this operation while the server is online.

  4. (Optional) With the server offline, check that the Index Valid column for the index status is now set to true:

    $ backendstat \
     show-index-status \
     --backendID dsEvaluation \
     --baseDN dc=example,dc=com \
     | grep newUnusedAttribute
    
    newUnusedAttribute.presence                    ... true ...
    newUnusedAttribute.caseIgnoreMatch             ... true ...
    newUnusedAttribute.caseIgnoreSubstringsMatch:6 ... true ...
  5. Start the server if you have not done so already:

    $ start-ds

Rebuild an Index

When you make a change that affects an index configuration, manually rebuild the index.

Individual indexes appear as degraded and are unavailable while the server rebuilds them. A search request that relies on an index in this state may temporarily fail as an unindexed search.

The following example rebuilds a degraded cn index immediately with the server online.

While the server is rebuilding the cn index, search requests that would normally succeed may fail:

$ rebuild-index \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --baseDN dc=example,dc=com \
 --index cn \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin

Avoid Rebuilding All Indexes at Once

Rebuilding multiple non-degraded indexes at once is disruptive, and not recommended unless some change has affected all indexes.

If you use the --rebuildAll option, first take the backend offline, stop the server, or, at minimum, make sure that no applications connect to the server while it is rebuilding indexes.

Index Entry Limits

An index is a tree of key-value pairs. The key is what the search is trying to match. The value is a list of entry IDs.

As the number of entries in the directory grows, the list of entry IDs for some keys can become very large. For example, every entry in the directory has objectClass: top. If the directory maintains a substring index for mail, the number of entries ending in .com could be huge.

A directory server therefore defines an index entry limit. When the number of entry IDs for a key exceeds the limit, the server stops maintaining a list of IDs for that key. The limit effectively makes a search using that key unindexed. Searches using other keys in the same index are not affected.

The following figure shows a fragment from a substring index for the mail attribute. The number of email addresses ending in .com has exceeded the index entry limit. For the other substring keys, the entry ID lists are still maintained. To save space, the entry IDs are not shown in the figure.

Illustration of the index entry limit reached for the substring .COM

Ideally, the limit is set at the point where it becomes more expensive to maintain the entry ID list for a key, and to perform an indexed search than to perform an unindexed search. In practice, the limit is a trade off, with a default index entry limit value of 4000.

Check Index Entry Limits

The following steps show how to get information about indexes where the index entry limit is exceeded for some keys. In this case, the directory server holds 100,000 user entries. The settings for this directory server are reasonable.

  1. Stop DS servers before you use the backendstat command:

    $ stop-ds
  2. Non-zero values in the Over Entry Limit column of the output table indicate the number of keys for which the limit has been reached. The keys that are over the limit are then listed below the table:

    $ backendstat show-index-status --backendID dsEvaluation --baseDN dc=example,dc=com
    
    Index Name                                       ... Valid  ... Over Entry Limit ...
    ...
    mail.caseIgnoreIA5SubstringsMatch:6              ... true   ... 34 ...
    ...
    givenName.caseIgnoreSubstringsMatch:6            ... true   ... 9  ...
    telephoneNumber.telephoneNumberSubstringsMatch:6 ... true   ... 10 ...
    ...
    sn.caseIgnoreSubstringsMatch:6                   ... true   ... 14 ...
    ...
    cn.caseIgnoreSubstringsMatch:6                   ... true   ... 14 ...
    objectClass.objectIdentifierMatch                ... true   ... 4  ...
    ...
    
    Index: /dc=com,dc=example/mail.caseIgnoreIA5SubstringsMatch:6
    Over index-entry-limit keys: [.com] [0@mail] ...
    
    Index: /dc=com,dc=example/givenName.caseIgnoreSubstringsMatch:6
    Over index-entry-limit keys: [a] [e] [i] [ie] [l] [n] [na] [ne] [y]
    
    Index: /dc=com,dc=example/telephoneNumber.telephoneNumberSubstringsMatch:6
    Over index-entry-limit keys: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
    
    Index: /dc=com,dc=example/cn.caseIgnoreSubstringsMatch:6
    Over index-entry-limit keys: [a] [an] [e] [er] [i] [k] [l] [n] [o] [on] [r] [s] [t] [y]
    
    Index: /dc=com,dc=example/objectClass.objectIdentifierMatch
    Over index-entry-limit keys: [inetorgperson] [organizationalperson] [person] [top]
    
    Index: /dc=com,dc=example/sn.caseIgnoreSubstringsMatch:6
    Over index-entry-limit keys: [a] [an] [e] [er] [i] [k] [l] [n] [o] [on] [r] [s] [t] [y]

    For example, every user entry has the object classes listed, and every user entry has an email address ending in .com, so those values are not specific enough to be used in search filters.

  3. Start the server:

    $ start-ds

Index Entry Limit Changes

In rare cases, the index entry limit might be too low for a certain key. This could manifest itself as a frequent, useful search becoming unindexed with no reasonable way to narrow the search.

You can change the index entry limit on a per-index basis. Do not do this in production unless you can explain and show why the benefits outweigh the costs.

Changing the index entry limit significantly can result in serious performance degradation. Be prepared to test performance thoroughly before you roll out an index entry limit change in production.

Consider a directory with more than 4000 groups in a backend. When the backend is brought online, a directory server searches for the groups with a search filter of (|(objectClass=groupOfNames)(objectClass=groupOfEntries)(objectClass=groupOfUniqueNames)), which is an unindexed search due to the default index entry limit setting. The following example raises the index entry limit for the objectClass index to 10000. It rebuilds the index for the configuration change to take effect. The steps are the same for any other index:

$ dsconfig \
 set-backend-index-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --index-name objectClass \
 --set index-entry-limit:10000 \
 --no-prompt \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin

$ stop-ds

$ rebuild-index \
 --baseDN dc=example,dc=com \
 --index objectClass \
 --offline

$ start-ds

It is also possible, but not recommended, to configure the global index-entry-limit for a backend. This changes the default for all indexes in the backend. Use the dsconfig set-backend-prop command:

# Not recommended
$ dsconfig \
 set-backend-prop \
 --hostname localhost \
 --port 4444 \
 --bindDN uid=admin \
 --bindPassword password \
 --backend-name dsEvaluation \
 --set index-entry-limit:10000 \
 --usePkcs12TrustStore /path/to/opendj/config/keystore \
 --trustStorePassword:file /path/to/opendj/config/keystore.pin \
 --no-prompt

Verify Indexes

You can verify that indexes correspond to current directory data, and do not contain errors. Use the verify-index command.

The following example verifies the cn index offline:

$ stop-ds

$ verify-index \
 --baseDN dc=example,dc=com \
 --index cn \
 --clean \
 --countErrors

The output indicates whether any errors are found in the index.

Copyright © 2010-2023 ForgeRock, all rights reserved.