Indexes
About Indexes
A basic, standard directory feature is the ability to respond quickly to searches.
An LDAP search specifies the information that directly affects how long the directory might take to respond:
The base DN for the search.
The more specific the base DN, the less information to check during the search. For example, a request with base DN dc=example,dc=com potentially involves checking many more entries than a request with base DN uid=bjensen,ou=people,dc=example,dc=com.
The scope of the search.
A subtree or one-level scope targets many entries, whereas a base search is limited to one entry.
The search filter to match.
A search filter asserts that for an entry to match, it has an attribute that corresponds to some value. For example, (cn=Babs Jensen) asserts that cn must have a value that equals Babs Jensen.
A directory server would waste resources checking all entries for a match. Instead, directory servers maintain indexes to expedite checking for a match.
LDAP directory servers disallow searches that cannot be handled expediently using indexes. Maintaining appropriate indexes is a key aspect of directory administration.
The role of an index is to answer the question, "Which entries have an attribute with this corresponding value?" Each index is therefore specific to an attribute. Each index is also specific to the comparison implied in the search filter. For example, a directory server maintains distinct indexes for exact (equality) matching and for substring matching. The types of indexes are explained in "Index Types". Furthermore, indexes are configured in specific directory backends.
An index is implemented as a tree of key-value pairs. The key is a form of the value to match, such as babs jensen. The value is a list of IDs for entries that match the key. The figure that follows shows an equality (case ignore exact match) index with five keys from a total of four entries. If the data set were large, there could be more than one entry ID per key:
This is how DS directory servers use indexes. When the search filter is (cn=Babs Jensen), the directory server retrieves the IDs for entries with a CN matching Babs Jensen from the equality index of the CN attribute. (For a complex filter, it might optimize the search by changing the order in which it uses the indexes.) A successful result is zero or more entry IDs. These are the candidate result entries.
For each candidate, the DS directory server retrieves the entry by ID from a special system index called id2entry. As its name suggests, this index returns an entry for an entry ID. If there is a match, and the client application has the right to access the data, the directory server returns the search result. It continues this process until no candidates are left.
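The lookup flow described above can be sketched in a few lines of Python. This is only an illustrative model, not the DS implementation: the sample entries, the normalize helper, and the dictionary standing in for id2entry are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample entries keyed by entry ID (a stand-in for id2entry).
id2entry = {
    1: {"dn": "uid=bjensen,ou=People,dc=example,dc=com",
        "cn": ["Barbara Jensen", "Babs Jensen"]},
    2: {"dn": "uid=kjensen,ou=People,dc=example,dc=com",
        "cn": ["Kurt Jensen"]},
    3: {"dn": "uid=scarter,ou=People,dc=example,dc=com",
        "cn": ["Sam Carter"]},
}


def normalize(value):
    """Assumed case-ignore normalization: lowercase, collapse whitespace."""
    return " ".join(value.lower().split())


def build_equality_index(entries, attribute):
    """Map each normalized attribute value (key) to a list of entry IDs."""
    index = defaultdict(list)
    for entry_id in sorted(entries):
        for value in entries[entry_id].get(attribute, []):
            index[normalize(value)].append(entry_id)
    return dict(index)


def search_equal(index, entries, attribute, assertion):
    """Get candidate IDs from the index, then fetch and re-check each entry."""
    wanted = normalize(assertion)
    results = []
    for entry_id in index.get(wanted, []):
        entry = entries[entry_id]  # the id2entry lookup
        if any(normalize(v) == wanted for v in entry.get(attribute, [])):
            results.append(entry["dn"])
    return results


cn_index = build_equality_index(id2entry, "cn")
print(sorted(cn_index))
print(search_equal(cn_index, id2entry, "cn", "Babs Jensen"))
```

The index answers "which entry IDs have this value?" without touching the entries themselves. Only the candidates are read from the entry store.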
If there are no indexes that correspond to a search request, the server must check for a match against every entry in the scope of the search. Evaluating every entry for a match is referred to as an unindexed search. An unindexed search is an expensive operation, particularly for large directories. A server refuses unindexed searches unless the user has specific permission to make such requests. The permission to perform an unindexed search is granted with the unindexed-search privilege. This privilege is reserved for the directory superuser by default. It should not be granted lightly.
If the number of entries is smaller than the default resource limits, you can still perform what appear to be unindexed searches, meaning searches with filters for which no index appears to exist. That is because the dn2id index returns all user data entries without hitting a resource limit that would make the search unindexed.
Use cases that may call for unindexed searches include the following:
An application must periodically retrieve a very large amount of directory data all at once through an LDAP search.
For example, an application performs an LDAP search to retrieve everything in the directory once a week as part of a batch job that runs during off hours.
Make sure the application has no resource limits. For details, see Resource Limits.
A directory data administrator occasionally browses directory data through a graphical UI without initially knowing what they are looking for or how to narrow the search.
For this case, DS directory servers can sort unindexed search results as long as they are paged.
This capability has the following limitations:
The simple paged results control must specify a page size that is less than or equal to the index-entry-limit (default: 4000).
For each page, the server reads the entire backend database, retaining page size number of sorted entries.
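To see why this is costly, consider the following rough Python sketch of paging through sorted results without an index. It is an assumption about the general technique, not the server's code: the sorted_page function and the sample data are hypothetical. Each page requires a full scan, and only page-size entries are kept in memory.

```python
import heapq


def sorted_page(entries, sort_key, page_size, after=None):
    """Scan every entry once, keeping only the page_size smallest sort keys
    that come strictly after the last key of the previous page."""
    eligible = (e for e in entries if after is None or sort_key(e) > after)
    return heapq.nsmallest(page_size, eligible, key=sort_key)


# Hypothetical data: (entry ID, surname) pairs in storage order.
entries = [(1, "Jensen"), (2, "Carter"), (3, "Zywiel"), (4, "Atp"), (5, "Jensen")]


def key(entry):
    """Unique sort key: surname (case ignored), then entry ID."""
    return (entry[1].lower(), entry[0])


pages, last = [], None
while True:
    page = sorted_page(entries, key, page_size=2, after=last)
    if not page:
        break
    pages.append(page)
    last = key(page[-1])  # resume point for the next full scan

print(pages)
```

With N entries and page size P, this approach performs about N/P full scans in total, which is why it suits occasional browsing rather than routine application traffic.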
Alternatively, DS directory servers can use an appropriately configured VLV index to sort results for an unindexed search. For details, see "VLV for Paged Server-Side Sort".
What To Index
DS directory server search performance depends on indexes. The default settings are fine for evaluating DS software, and they work well with sample data. The default settings do not necessarily fit your directory data and the searches your applications perform.
Necessary Indexes
Index maintenance has its costs. Every time an indexed attribute is updated, the server must update each affected index to reflect the change. This is wasteful if the index is not used. Indexes, especially substring indexes, can occupy more memory and disk space than the corresponding data.
Aim to maintain only indexes that speed up appropriate searches, and that allow the server to operate properly. The former indexes depend on how directory users search, and require thought and investigation. The latter includes non-configurable internal indexes, that should not change.
Begin by reviewing the attributes of your directory data. Which attributes would you expect to see in a search filter? If an attribute is going to show up frequently in reasonable search filters, then index it.
Compare your guesses with what you see actually happening in the directory. One approach is to review the access log for search results with additional items like unindexed. The following example shows the relevant fields in an access log message:
{
  "request": {
    "protocol": "LDAP",
    "operation": "SEARCH"
  },
  "response": {
    "detail": "You do not have sufficient privileges to perform an unindexed search",
    "additionalItems": {
      "unindexed": null
    }
  }
}
Review the messages in the access log, as they also specify the search filter and scope. Understand the search that led to each unindexed search. If the filter is appropriate and frequently used, add an index to facilitate the search. You can either consume the access logs to determine how often a search filter is used, or monitor what is happening in the directory with the index analysis feature.
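As one way to consume the access logs, the following Python sketch counts unindexed searches per filter and scope. It assumes one JSON object per line. The request, operation, response, and additionalItems fields come from the example above, but the filter and scope field names are assumptions, so check them against your own log messages before relying on this.

```python
import json
from collections import Counter


def count_unindexed(lines):
    """Count unindexed SEARCH operations by (filter, scope)."""
    counts = Counter()
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message
        if not isinstance(record, dict):
            continue
        request = record.get("request") or {}
        response = record.get("response") or {}
        if request.get("operation") != "SEARCH":
            continue
        if "unindexed" not in (response.get("additionalItems") or {}):
            continue
        # ASSUMPTION: the request object carries "filter" and "scope" fields.
        key = (request.get("filter", "<unknown>"), request.get("scope", "<unknown>"))
        counts[key] += 1
    return counts


def make_line(search_filter, unindexed):
    """Build a hypothetical access log line for demonstration only."""
    response = {"additionalItems": {"unindexed": None}} if unindexed else {}
    request = {"protocol": "LDAP", "operation": "SEARCH",
               "filter": search_filter, "scope": "sub"}
    return json.dumps({"request": request, "response": response})


sample = [
    make_line("(mail=*.com)", True),
    make_line("(uid=bjensen)", False),
    make_line("(mail=*.com)", True),
    make_line("(employeenumber=86182)", True),
    "not a JSON line",
]
counts = count_unindexed(sample)
for (search_filter, scope), hits in counts.most_common():
    print(hits, scope, search_filter)
```

To run it against a real file, pass an open file object instead of the sample list, for example count_unindexed(open(path)) with the path to your access log.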
DS servers provide the index analysis feature to collect information about filters in search requests. You can activate the index analysis mechanism using the dsconfig set-backend-prop command:
$ dsconfig \
set-backend-prop \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--set index-filter-analyzer-enabled:true \
--no-prompt \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin
The command causes the server to analyze filters used, and to keep the results in memory. You can read the results as monitoring information:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--bindDN uid=admin \
--bindPassword password \
--baseDN ds-cfg-backend-id=dsEvaluation,cn=Backends,cn=monitor \
--searchScope base \
"(&)" \
ds-mon-backend-filter-use-start-time ds-mon-backend-filter-use-indexed ds-mon-backend-filter-use-unindexed ds-mon-backend-filter-use
dn: ds-cfg-backend-id=dsEvaluation,cn=backends,cn=monitor
ds-mon-backend-filter-use-start-time: <timestamp>
ds-mon-backend-filter-use-indexed: 2
ds-mon-backend-filter-use-unindexed: 3
ds-mon-backend-filter-use: {"search-filter":"(employeenumber=86182)","nb-hits":1,"latest-failure-reason":"caseIgnoreMatch index type is disabled for the employeeNumber attribute"}
The ds-mon-backend-filter-use values include the following fields:
search-filter
The LDAP search filter
nb-hits
The number of times the filter was used
latest-failure-reason
Message describing why the server could not use any index for this filter
The output can include filters for internal use, such as (aci=*). In the example above, you see a filter used by a client application.
In the example, a search filter that led to an unindexed search, (employeenumber=86182), could not use an index because, "caseIgnoreMatch index type is disabled for the employeeNumber attribute". Some client application has tried to find users by employee number, but no index exists for that purpose. If this appears regularly as a frequent search, add an employee number index.
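When there are many ds-mon-backend-filter-use values, it can help to rank them by use. The following Python sketch parses the JSON values and sorts them by nb-hits. The first sample value is the one shown above. The second is hypothetical and only there to show the ordering.

```python
import json


def rank_filters(values):
    """Parse ds-mon-backend-filter-use values; most frequently used first."""
    parsed = [json.loads(value) for value in values]
    return sorted(parsed, key=lambda item: item.get("nb-hits", 0), reverse=True)


values = [
    # Value taken from the monitoring output shown above.
    '{"search-filter":"(employeenumber=86182)","nb-hits":1,'
    '"latest-failure-reason":"caseIgnoreMatch index type is disabled '
    'for the employeeNumber attribute"}',
    # Hypothetical value, for illustration only.
    '{"search-filter":"(initials=aa*)","nb-hits":5,'
    '"latest-failure-reason":"(hypothetical reason)"}',
]

ranked = rank_filters(values)
for item in ranked:
    print(item["nb-hits"], item["search-filter"], "-", item["latest-failure-reason"])
```

Filters near the top of the list with a failure reason are the first candidates to review for a new index.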
To avoid impacting server performance, turn off index analysis after you collect the information you need. Turn off index analysis with the dsconfig set-backend-prop command:
$ dsconfig \
set-backend-prop \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--set index-filter-analyzer-enabled:false \
--no-prompt \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin
Directory users might complain to you that their searches are refused because they are unindexed. Ask for the result code, additional information, and search filter. DS directory servers respond to LDAP client applications that attempt unindexed searches with a result code of 50 and additional information about the unindexed search. The following example attempts, anonymously, to get the entries for all users whose email address ends in .com:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--baseDN ou=people,dc=example,dc=com \
"(&(mail=*.com)(objectclass=person))"
# The LDAP search request failed: 50 (Insufficient Access Rights)
# Additional Information: You do not have sufficient privileges to perform an unindexed search
Before you change settings to permit a search, understand why the user wants to perform an unindexed search.
Perhaps they are unintentionally requesting an unindexed search. If so, you can help them find a less expensive search, by using an approach that limits the number of candidate result entries. For example, if a GUI application lets a user browse a group of entries, the application could use a browsing index to retrieve a block of entries for each screen, rather than retrieving all the entries at once.
Perhaps they do have a legitimate reason to get the full list of all entries in one operation, such as regularly rebuilding some database that depends on the directory. If so, assign their application's account the unindexed-search privilege.
In addition to responding to client search requests, a server performs internal searches. Internal searches let the server retrieve data needed for a request, and maintain internal state information. In some cases internal searches can become unindexed. When this happens, the server logs a warning similar to the following:
The server is performing an unindexed internal search request with base DN '%s', scope '%s', and filter '%s'. Unindexed internal searches are usually unexpected and could impact performance. Please verify that that backend's indexes are configured correctly for these search parameters.
When you see a message like this in the server log, take these actions:
Figure out which indexes are missing, and add them.
For details, see "Debug Search Indexes", and "Configure Indexes".
Check the integrity of the indexes.
For details, see "Verify Indexes".
If the relevant indexes exist and you have verified that they are sound, the index entry limit might be too low.
This can happen, for example, in directory servers with more than 4000 groups in a single backend. For details, see "Index Entry Limits".
If you have made the changes described in the steps above, and the problem persists, contact technical support.
Debug Search Indexes
Sometimes it is not obvious by inspection how a directory server processes a given search request. The directory superuser can gain insight with the debugsearchindex attribute.
The default global access control prevents users from reading the debugsearchindex attribute. To allow an administrator to read the attribute, add a global ACI such as the following:
$ dsconfig \
set-access-control-handler-prop \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--add global-aci:"(targetattr=\"debugsearchindex\")(version 3.0; acl \"Debug search indexes\"; \
allow (read,search,compare) userdn=\"ldap:///uid=user.0,ou=people,dc=example,dc=com\";)" \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
Note
The format of debugsearchindex values has interface stability: Internal.
The values are intended to be read by human beings, not scripts. If you do write scripts that interpret debugsearchindex values, be aware that they are not stable. Be prepared to adapt your scripts for every upgrade or patch.
The debugsearchindex attribute value indicates how the server would process the search. The server uses its indexes to prepare a set of candidate entries. It iterates through the set to compare candidates with the search filter, returning entries that match. The following example demonstrates this feature for a subtree search with a complex filter:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--bindDN uid=user.0,ou=people,dc=example,dc=com \
--bindPassword password \
--baseDN dc=example,dc=com \
"(&(objectclass=person)(givenName=aa*))" \
debugsearchindex | sed -n -e "s/^debugsearchindex: //p"
{
  "filter": {
    "intersection": [
      {
        "index": "givenName.caseIgnoreSubstringsMatch:6",
        "range": "[aa,ab[",
        "candidates": 171,
        "retained": 171
      },
      {
        "index": "givenName.caseIgnoreMatch",
        "range": "[aa,ab[",
        "candidates": 50,
        "retained": 50
      },
      {
        "filter": "(objectclass=person)",
        "union": [
          {
            "index": "objectClass.objectIdentifierMatch",
            "exact": "person",
            "candidates": "[LIMIT-EXCEEDED]"
          }
        ],
        "candidates": "[LIMIT-EXCEEDED]",
        "retained": 50
      }
    ],
    "candidates": 50
  },
  "scope": {
    "type": "sub",
    "candidates": "[NOT-INDEXED]",
    "retained": 50
  },
  "final": 50
}
The filter in the example matches person entries whose given name starts with aa. The search scope is not explicitly specified, so the scope defaults to the subtree including the base DN.
Notice that the debugsearchindex value has the following top-level fields:
(Optional) "vlv" describes how the server uses VLV indexes.
The VLV field is not applicable for this example, and so is not present.
"filter" describes how the server uses the search filter to narrow the set of candidates.
"scope" describes how the server uses the search scope.
"final" indicates the final number of candidates in the set.
In the output, notice that the server uses the equality and substring indexes to find candidate entries whose given name starts with aa. If the filter indicated given names containing aa, as in givenName=*aa*, the server would rely only on the substring index.
Notice that the output for the (objectclass=person) portion of the filter shows "candidates": "[LIMIT-EXCEEDED]". In this case, there are so many entries matching the value specified that the index is not useful for narrowing the set of candidates. The scope is also not useful for narrowing the set of candidates. Ultimately, however, the givenName indexes help the server to narrow the set of candidates. The overall search is indexed and the result is 50 matching entries.
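For long debugsearchindex values, a throwaway script can point out the index nodes that did not help. The following Python sketch walks the JSON from the example above and lists index nodes whose candidates value is a bracketed marker rather than a number. Because the format is internal and unstable, treat this as a reading aid under that assumption, not as something to build automation on. The find_unhelpful name is hypothetical.

```python
import json


def find_unhelpful(node, path="$"):
    """Yield (path, index, marker) for index nodes whose "candidates" value
    is a string marker such as "[LIMIT-EXCEEDED]" instead of a count."""
    if isinstance(node, dict):
        candidates = node.get("candidates")
        if "index" in node and isinstance(candidates, str):
            yield path, node["index"], candidates
        for key, child in node.items():
            yield from find_unhelpful(child, f"{path}.{key}")
    elif isinstance(node, list):
        for position, child in enumerate(node):
            yield from find_unhelpful(child, f"{path}[{position}]")


# The debugsearchindex value from the example above.
value = """
{"filter": {"intersection": [
  {"index": "givenName.caseIgnoreSubstringsMatch:6", "range": "[aa,ab[",
   "candidates": 171, "retained": 171},
  {"index": "givenName.caseIgnoreMatch", "range": "[aa,ab[",
   "candidates": 50, "retained": 50},
  {"filter": "(objectclass=person)",
   "union": [{"index": "objectClass.objectIdentifierMatch", "exact": "person",
              "candidates": "[LIMIT-EXCEEDED]"}],
   "candidates": "[LIMIT-EXCEEDED]", "retained": 50}],
  "candidates": 50},
 "scope": {"type": "sub", "candidates": "[NOT-INDEXED]", "retained": 50},
 "final": 50}
"""

findings = list(find_unhelpful(json.loads(value)))
for path, index, marker in findings:
    print(f"{index}: {marker} at {path}")
```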
The following example shows a subtree search for accounts with initials starting either with aa or with zz:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--baseDN dc=example,dc=com \
--bindDN uid=user.0,ou=people,dc=example,dc=com \
--bindPassword password \
"(|(initials=aa*)(initials=zz*))" \
debugsearchindex | sed -n -e "s/^debugsearchindex: //p"
{
  "filter": {
    "union": [
      {
        "filter": "(initials=aa*)",
        "index": "initials.presence",
        "diagnostic": "not indexed",
        "candidates": "[NOT-INDEXED]"
      }
    ],
    "candidates": "[NOT-INDEXED]"
  },
  "scope": {
    "type": "sub",
    "candidates": "[NOT-INDEXED]",
    "retained": "[NOT-INDEXED]"
  },
  "final": "[NOT-INDEXED]"
}
As shown in the output, the search is not indexed. To fix this, index the initials attribute:
$ dsconfig \
create-backend-index \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--index-name initials \
--set index-type:equality \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
$ rebuild-index \
--hostname localhost \
--port 4444 \
--bindDn uid=admin \
--bindPassword password \
--baseDn dc=example,dc=com \
--index initials \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin
After configuring and building the new index, try the same search again:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--baseDN dc=example,dc=com \
--bindDN uid=user.0,ou=people,dc=example,dc=com \
--bindPassword password \
"(|(initials=aa*)(initials=zz*))" \
debugsearchindex | sed -n -e "s/^debugsearchindex: //p"
{
  "filter": {
    "union": [
      {
        "filter": "(initials=aa*)",
        "intersection": [
          {
            "index": "initials.caseIgnoreSubstringsMatch:6",
            "range": "[aa,ab[",
            "diagnostic": "not indexed",
            "tryPresenceIndex": {
              "index": "initials.presence",
              "diagnostic": "not indexed"
            },
            "candidates": "[NOT-INDEXED]",
            "retained": "[NOT-INDEXED]"
          },
          {
            "index": "initials.caseIgnoreMatch",
            "range": "[aa,ab[",
            "candidates": 378,
            "retained": 378
          }
        ],
        "candidates": 378
      },
      {
        "filter": "(initials=zz*)",
        "intersection": [
          {
            "index": "initials.caseIgnoreSubstringsMatch:6",
            "range": "[zz,z{[",
            "diagnostic": "not indexed",
            "tryPresenceIndex": {
              "index": "initials.presence",
              "diagnostic": "not indexed"
            },
            "candidates": "[NOT-INDEXED]",
            "retained": "[NOT-INDEXED]"
          },
          {
            "index": "initials.caseIgnoreMatch",
            "range": "[zz,z{[",
            "candidates": 26,
            "retained": 26
          }
        ],
        "candidates": 26
      }
    ],
    "candidates": 404
  },
  "scope": {
    "type": "sub",
    "candidates": "[NOT-INDEXED]",
    "retained": 404
  },
  "final": 404
}
Notice that the server can narrow the list of candidates using the equality index you created. The server would require a substring index instead of an equality index if the filter were not matching initial strings.
If an index already exists, but you suspect it is not working properly, see "Verify Indexes".
Index Types
DS directory servers support multiple index types, each corresponding to a different type of search.
View what is indexed by using the backendstat list-indexes command. For details about a particular index, you can use the backendstat dump-index command.
Presence Index
A presence index matches an attribute that is present on the entry, regardless of the value. By default, the aci attribute is indexed for presence:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--bindDN uid=admin \
--bindPassword password \
--baseDN dc=example,dc=com \
"(aci=*)" \
aci
A presence index takes up less space than other indexes. In a presence index, there is just one key with a list of IDs.
The following command examines the ACI presence index for a server configured with the evaluation profile:
$ stop-ds
$ backendstat \
dump-index \
--backendId dsEvaluation \
--baseDn dc=example,dc=com \
--indexName aci.presence
Key (len 1): PRESENCE
Value (len 3): [COUNT:2] 1 9
Total Records: 1
Total / Average Key Size: 1 bytes / 1 bytes
Total / Average Data Size: 3 bytes / 3 bytes
In this case, entries with ACI attributes have IDs 1 and 9.
Equality Index
An equality index matches values that correspond exactly (generally ignoring case) to those in search filters. An equality index requires clients to match values without wildcards or misspellings:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--baseDN dc=example,dc=com \
"(uid=bjensen)" \
mail
dn: uid=bjensen,ou=People,dc=example,dc=com
mail: bjensen@example.com
An equality index has one list of entry IDs for each attribute value. Depending on the backend implementation, the keys in a case-insensitive index might not be strings. For example, a key of 6A656E73656E could represent jensen.
The following command examines the SN equality index for a server configured with the evaluation profile:
$ stop-ds
$ backendstat \
dump-index \
--backendID dsEvaluation \
--baseDN dc=example,dc=com \
--indexName sn.caseIgnoreMatch | grep -A 1 "jensen$"
Key (len 6): jensen
Value (len 26): [COUNT:17] 18 31 32 66 79 94 133 134 150 5996 19415 32834 46253 59672 73091 86510 99929
In this case, there are 17 entries that have an SN of Jensen.
Unless the keys are encrypted, the server can reuse an equality index for ordering and initial substring searches.
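The reuse works because equality keys are stored in sorted order, so an initial substring or an ordering assertion becomes a key range. The ranges "[aa,ab[" and "[zz,z{[" in the debugsearchindex output earlier show the idea: the upper bound is the prefix with its last character incremented. The following Python sketch models this with a sorted list and bisect. The helper names and sample keys are hypothetical, and the prefix is assumed to be non-empty.

```python
import bisect


def prefix_range(prefix):
    """Return (lower, upper) so that lower <= key < upper for every key
    that starts with the (non-empty) prefix."""
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


def keys_with_prefix(sorted_keys, prefix):
    """Initial substring match, as in (sn=jens*), using a range scan."""
    lower, upper = prefix_range(prefix)
    start = bisect.bisect_left(sorted_keys, lower)
    end = bisect.bisect_left(sorted_keys, upper)
    return sorted_keys[start:end]


def keys_at_least(sorted_keys, lower):
    """Ordering match, as in (sn>=zyw), using the same sorted keys."""
    return sorted_keys[bisect.bisect_left(sorted_keys, lower):]


# Hypothetical normalized keys from an equality index, in sorted order.
keys = ["carter", "jensen", "jenson", "zywiel"]

print(prefix_range("aa"), prefix_range("zz"))
print(keys_with_prefix(keys, "jens"))
print(keys_at_least(keys, "zyw"))
```

If the keys were encrypted, their stored order would no longer follow the order of the values, and neither range scan would be possible.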
Approximate Index
An approximate index matches values that "sound like" those provided in the filter. An approximate index on sn lets client applications find people even when they misspell surnames, as in the following example:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--baseDN dc=example,dc=com \
"(&(sn~=Jansen)(cn=Babs*))" \
cn
dn: uid=bjensen,ou=People,dc=example,dc=com
cn: Barbara Jensen
cn: Babs Jensen
An approximate index squashes attribute values into a normalized form.
The following command examines an SN approximate index added to a server configured with the evaluation profile:
$ stop-ds
$ backendstat \
dump-index \
--backendID dsEvaluation \
--baseDN dc=example,dc=com \
--indexName sn.ds-mr-double-metaphone-approx | grep -A 1 "JNSN$"
Key (len 4): JNSN
Value (len 83): [COUNT:74] 18 31 32 59 66 79 94 133 134 150 5928 5939 5940 5941 5996 5997 6033 6034 19347 19358 19359 19360 19415 19416 19452 19453 32766 32777 32778 32779 32834 32835 32871 32872 46185 46196 46197 46198 46253 46254 46290 46291 59604 59615 59616 59617 59672 59673 59709 59710 73023 73034 73035 73036 73091 73092 73128 73129 86442 86453 86454 86455 86510 86511 86547 86548 99861 99872 99873 99874 99929 99930 99966 99967
In this case, there are 74 entries that have an SN that sounds like Jensen.
Substring Index
A substring index matches values that are specified with wildcards in the filter. Substring indexes can be expensive to maintain, especially for large attribute values:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--baseDN dc=example,dc=com \
"(cn=Barb*)" \
cn
dn: uid=bfrancis,ou=People,dc=example,dc=com
cn: Barbara Francis

dn: uid=bhal2,ou=People,dc=example,dc=com
cn: Barbara Hall

dn: uid=bjablons,ou=People,dc=example,dc=com
cn: Barbara Jablonski

dn: uid=bjensen,ou=People,dc=example,dc=com
cn: Barbara Jensen
cn: Babs Jensen

dn: uid=bmaddox,ou=People,dc=example,dc=com
cn: Barbara Maddox
In a substring index, there are enough keys to match any substring in the attribute values. Each key is associated with a list of IDs. The default maximum size of a substring key is 6 bytes.
The following command examines an SN substring index for a server configured with the evaluation profile:
$ stop-ds
$ backendstat \
dump-index \
--backendID dsEvaluation \
--baseDN dc=example,dc=com \
--indexName sn.caseIgnoreSubstringsMatch:6
...
Key (len 1): e
Value (len 25): [COUNT:22] ...
...
Key (len 2): en
Value (len 15): [COUNT:12] ...
...
Key (len 3): ens
Value (len 3): [COUNT:1] 147
Key (len 5): ensen
Value (len 10): [COUNT:9] 18 31 32 66 79 94 133 134 150
...
Key (len 6): jensen
Value (len 10): [COUNT:9] 18 31 32 66 79 94 133 134 150
...
Key (len 1): n
Value (len 35): [COUNT:32] ...
...
Key (len 2): ns
Value (len 3): [COUNT:1] 147
Key (len 4): nsen
Value (len 10): [COUNT:9] 18 31 32 66 79 94 133 134 150
...
Key (len 1): s
Value (len 13): [COUNT:12] 12 26 47 64 95 98 108 131 135 147 149 154
...
Key (len 2): se
Value (len 7): [COUNT:6] 52 58 75 117 123 148
Key (len 3): sen
Value (len 10): [COUNT:9] 18 31 32 66 79 94 133 134 150
...
In this case, the SN value Jensen shares substrings with many other entries. The size of the lists and number of keys make a substring index much more expensive to maintain than other indexes. This is particularly true for longer attribute values.
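The keys in the dump above are consistent with taking, at each position in the normalized value, the next six characters at most. The following Python sketch generates keys that way. This is inferred from the dump rather than from the server's source, so treat the exact rule as an assumption. The substring_keys name is hypothetical.

```python
def substring_keys(value, max_len=6):
    """Return the sorted set of substring keys for one attribute value:
    for each start position, up to max_len characters from that position."""
    normalized = value.lower()
    return sorted({normalized[i:i + max_len] for i in range(len(normalized))})


print(substring_keys("Jensen"))
print(len(substring_keys("Jablonski")))
```

A single value of length n can therefore contribute up to n keys to the substring index, compared with one key in an equality index. This is why longer values make substring indexes disproportionately expensive.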
Ordering Index
An ordering index is used to match values for a filter that specifies a range. For example, the ds-sync-hist attribute used by replication has an ordering index by default. Searches on that attribute often seek entries with changes more recent than the last time a search was performed.
The following example shows a search that specifies a range on the SN attribute value:
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--baseDN dc=example,dc=com \
"(sn>=zyw)" \
sn
dn: uid=user.13401,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.26820,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.40239,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.53658,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.67077,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.80496,ou=People,dc=example,dc=com
sn: Zywiel

dn: uid=user.93915,ou=People,dc=example,dc=com
sn: Zywiel
In this case, the server only requires an ordering index if it cannot reuse the (ordered) equality index instead. For example, if the equality index is encrypted, an ordering index must be maintained separately.
Virtual List View (Browsing) Index
A virtual list view (VLV) or browsing index is designed to help applications that list results. For example, a GUI application might let users browse through a list of users. VLVs help the server respond to clients that request server-side sorting of the search results.
VLV indexes correspond to particular searches. Configure your VLV indexes using the command line.
Extensible Matching Rule Index
In some cases you need an index for a matching rule other than those described above.
For example, a generalized time-based matching index lets applications find entries with a time-based attribute later or earlier than a specified time.
Configure Indexes
You modify index configurations by using the dsconfig command. Configuration changes take effect after you rebuild the index with the new configuration, using the rebuild-index command. The dsconfig --help-database command lists subcommands for creating, reading, updating, and deleting index configuration.
Tip
Indexes are per directory backend rather than per base DN. To maintain separate indexes for different base DNs on the same server, put the entries in different backends.
Standard Indexes
The following example creates a new equality index for the description attribute:
$ dsconfig \
create-backend-index \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--index-name description \
--set index-type:equality \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
The following example adds an approximate index for the sn (surname) attribute:
$ dsconfig \
set-backend-index-prop \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--index-name sn \
--add index-type:approximate \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
Approximate indexes depend on the Double Metaphone matching rule.
DS servers support matching rules defined in LDAP RFCs. They also define DS-specific extensible matching rules.
The following are DS-specific extensible matching rules:
- Name: ds-mr-double-metaphone-approx, OID: 1.3.6.1.4.1.26027.1.4.1
Double Metaphone Approximate Match described at http://aspell.net/metaphone/. The DS implementation always produces a single value rather than one or possibly two values.
Configure approximate indexes as described in "Approximate Index".
For an example using this matching rule, see "Approximate Match".
- Name: ds-mr-user-password-exact, OID: 1.3.6.1.4.1.26027.1.4.2
User password exact matching rule used to compare encoded bytes of two hashed password values for exact equality.
- Name: ds-mr-user-password-equality, OID: 1.3.6.1.4.1.26027.1.4.3
User password matching rule implemented as the user password exact matching rule.
- Name: partialDateAndTimeMatchingRule, OID: 1.3.6.1.4.1.26027.1.4.7
Partial date and time matching rule for matching parts of dates in time-based searches.
For an example using this matching rule, see "Active Accounts".
- Name: relativeTimeOrderingMatch.gt, OID: 1.3.6.1.4.1.26027.1.4.5
Greater-than relative time matching rule for time-based searches.
For an example using this matching rule, see "Active Accounts".
- Name: relativeTimeOrderingMatch.lt, OID: 1.3.6.1.4.1.26027.1.4.6
Less-than relative time matching rule for time-based searches.
For an example using this matching rule, see "Active Accounts".
The following example configures an extensible matching rule index for "later than" and "earlier than" generalized time matching on a lastLoginTime attribute:
$ dsconfig \
create-backend-index \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--set index-type:extensible \
--set index-extensible-matching-rule:1.3.6.1.4.1.26027.1.4.5 \
--set index-extensible-matching-rule:1.3.6.1.4.1.26027.1.4.6 \
--index-name lastLoginTime \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
Indexes for JSON
DS servers support attribute values that have JSON syntax. The following schema excerpt defines a json attribute with case-insensitive matching:
attributeTypes: ( json-attribute-oid NAME 'json' SYNTAX 1.3.6.1.4.1.36733.2.1.3.1 EQUALITY caseIgnoreJsonQueryMatch SINGLE-VALUE X-ORIGIN 'DS Documentation Examples' )
When you index a JSON attribute defined in this way, the default directory server behavior is to maintain index keys for each JSON field. Large or numerous JSON objects can result in large indexes, which is wasteful. If you know which fields are used in search filters, you can choose to index only those fields.
As described in "Schema and JSON", for some JSON objects only a certain field or fields matter when comparing for equality. In these special cases, the server can ignore other fields when checking equality during updates, and you would not maintain indexes for other fields.
How you index a JSON attribute depends on the matching rule in the attribute's schema definition, and on the JSON fields you expect to be used as search keys for the attribute.
The examples that follow demonstrate these steps:
Using the schema definition and the information in the following table, configure a custom schema provider for the attribute's matching rule, if necessary.
Matching Rule in Schema Definition | Fields in Search Filter | Custom Schema Provider Required?
caseExactJsonQueryMatch, caseIgnoreJsonQueryMatch | Any JSON field | No
caseExactJsonQueryMatch, caseIgnoreJsonQueryMatch | Specific JSON field or fields | Yes
caseExactJsonIdMatch, caseIgnoreJsonIdMatch | "_id" field only | No
Custom JSON equality matching rule | Specific field(s) | Yes
A custom schema provider applies to all attributes using this matching rule.
Add the schema definition for the JSON attribute.
Configure the index for the JSON attribute.
Add the JSON attribute values in the directory data.
This example illustrates the steps in "Index JSON Attributes".
Note
If you installed a directory server with the ds-evaluation profile, the custom index configuration is already present.
The following command configures a custom, case-insensitive JSON query matching rule. This only maintains keys for the access_token and refresh_token fields:
$ dsconfig \
create-schema-provider \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--provider-name "Custom JSON Query Matching Rule" \
--type json-query-equality-matching-rule \
--set enabled:true \
--set case-sensitive-strings:false \
--set ignore-white-space:true \
--set matching-rule-name:caseIgnoreJsonQueryMatch \
--set matching-rule-oid:1.3.6.1.4.1.36733.2.1.4.1 \
--set indexed-field:access_token \
--set indexed-field:refresh_token \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
The following command adds schema definitions for a json attribute that uses the matching rule:
$ ldapmodify \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--bindDN uid=admin \
--bindPassword password << EOF
dn: cn=schema
changetype: modify
add: attributeTypes
attributeTypes: ( json-attribute-oid NAME 'json'
  SYNTAX 1.3.6.1.4.1.36733.2.1.3.1 EQUALITY caseIgnoreJsonQueryMatch
  SINGLE-VALUE X-ORIGIN 'DS Documentation Examples' )
-
add: objectClasses
objectClasses: ( json-object-class-oid NAME 'jsonObject' SUP top
  AUXILIARY MAY ( json ) X-ORIGIN 'DS Documentation Examples' )
EOF
The following command configures an index using the custom matching rule implementation:
$ dsconfig \
create-backend-index \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--index-name json \
--set index-type:equality \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
For an example of how a client application could use this index, see "JSON Query Filters".
This example illustrates the steps in "Index JSON Attributes".
Note
If you installed a directory server with the ds-evaluation profile, the custom index configuration is already present.
The following command configures a custom, case-insensitive JSON equality matching rule, caseIgnoreJsonTokenIDMatch. This indexes the id field rather than the _id field:
$ dsconfig \
create-schema-provider \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--provider-name "Custom JSON Token ID Matching Rule" \
--type json-equality-matching-rule \
--set enabled:true \
--set case-sensitive-strings:false \
--set ignore-white-space:true \
--set matching-rule-name:caseIgnoreJsonTokenIDMatch \
--set matching-rule-oid:1.3.6.1.4.1.36733.2.1.4.4.1 \
--set json-keys:id \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
Notice that this example defines a matching rule with OID 1.3.6.1.4.1.36733.2.1.4.4.1. In production deployments, use a numeric OID allocated for your own organization.
The following command adds schema definitions for a jsonToken attribute, where the unique identifier is in the "id" field of the JSON object:
$ ldapmodify \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--bindDN uid=admin \
--bindPassword password << EOF
dn: cn=schema
changetype: modify
add: attributeTypes
attributeTypes: ( jsonToken-attribute-oid NAME 'jsonToken'
  SYNTAX 1.3.6.1.4.1.36733.2.1.3.1 EQUALITY caseIgnoreJsonTokenIDMatch
  SINGLE-VALUE X-ORIGIN 'DS Documentation Examples' )
-
add: objectClasses
objectClasses: ( json-token-object-class-oid NAME 'JsonTokenObject' SUP top
  AUXILIARY MAY ( jsonToken ) X-ORIGIN 'DS Documentation Examples' )
EOF
The following command configures an index using the custom matching rule implementation:
$ dsconfig \
create-backend-index \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--index-name jsonToken \
--set index-type:equality \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
For an example of how a client application could use this index, see "JSON Assertions".
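The effect of comparing JSON values on a single identifying field can be modeled with a small Python sketch. It assumes that ignoring white space means trimming and collapsing runs of white space in strings, and that case-insensitive matching means lowercasing strings. The function names are hypothetical and the code is not the server's matching rule implementation.

```python
import json


def normalize(value, case_sensitive=False, ignore_white_space=True):
    """Normalize a JSON string for comparison; other types pass through."""
    if not isinstance(value, str):
        return value
    if ignore_white_space:
        # Assumption: trim and collapse internal runs of white space.
        value = " ".join(value.split())
    return value if case_sensitive else value.lower()


def match_key(document, json_keys=("id",)):
    """Build the comparison key from only the configured fields."""
    obj = json.loads(document)
    return tuple(normalize(obj.get(key)) for key in json_keys)


def json_equal(a, b, json_keys=("id",)):
    """Two JSON values are equal when their configured fields match."""
    return match_key(a, json_keys) == match_key(b, json_keys)


# Hypothetical stored value and assertion values.
stored = '{"id": "Token-42", "scope": "read", "expires_in": 3600}'
assertion = '{"id": " token-42 "}'
other = '{"id": "token-43", "scope": "read", "expires_in": 3600}'

print(json_equal(stored, assertion))
print(json_equal(stored, other))
```

Because only the id field contributes to the comparison key in this model, an equality index for the attribute needs one key per object, regardless of how many other fields the object contains.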
Virtual List View Index
The following example shows how to create a VLV index. This example applies where GUI users browse user accounts, sorting on surname then given name:
$ dsconfig \
create-backend-vlv-index \
--hostname localhost \
--port 4444 \
--bindDn uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--index-name people-by-last-name \
--set base-dn:ou=People,dc=example,dc=com \
--set filter:"(|(givenName=*)(sn=*))" \
--set scope:single-level \
--set sort-order:"+sn +givenName" \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
Note
When referring to a VLV index after creation, you must add vlv. as a prefix. In other words, if you named the VLV index people-by-last-name, refer to it as vlv.people-by-last-name when rebuilding indexes, changing index properties such as the index entry limit, or verifying indexes.
A special VLV index lets the server sort results for a search that is technically unindexed. For example, users page through an entire directory database in a GUI. The user does not filter the data before seeing what is available.
Because the search is technically unindexed, the user performing the search must have the unindexed-search privilege.
The VLV index must have the following characteristics:
- Its filter must be "always true," (&).
- Its scope must cover the search scope of the requests.
- Its base DN must match or be a parent of the base DN of the search requests.
- Its sort order must match the sort keys of the requests in the order they occur in the requests, starting with the first sort key used in the request. The VLV index sort order can include additional keys not present in a request.
  In other words, if the sort order of the VLV index is +l +sn +cn, then it works with requests having the following sort orders:
  - +l +sn +cn
  - +l +sn
  - +l
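The sort order rule amounts to a prefix check, which the following Python sketch illustrates. The function names are hypothetical, and the sketch covers only the sort order condition, not the filter, scope, or base DN conditions.

```python
def parse_sort_order(order):
    """Split a sort order such as "+l +sn +cn" or "+l,+sn,+cn" into keys."""
    return [key.lower() for key in order.replace(",", " ").split()]


def vlv_covers(index_order, request_order):
    """True if the request's sort keys are a leading prefix of the index's."""
    index_keys = parse_sort_order(index_order)
    request_keys = parse_sort_order(request_order)
    if not request_keys or len(request_keys) > len(index_keys):
        return False
    return index_keys[:len(request_keys)] == request_keys


index_order = "+l +sn +cn"
for request in ("+l +sn +cn", "+l,+sn", "+l", "+sn +cn", "+l +cn"):
    print(request, vlv_covers(index_order, request))
```

Requests that skip the first key, such as +sn +cn, or that skip a key in the middle, such as +l +cn, are not covered in this model, because the entries in the index are ordered by the leading keys first.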
The following example commands demonstrate creating and using a VLV index to sort paged results by locality, surname, and then full name. The l attribute is not indexed by default. This example makes use of the rebuild-index command described below. The directory superuser is not subject to resource limits on the LDAP search operation:
$ dsconfig \
create-backend-vlv-index \
--hostname localhost \
--port 4444 \
--bindDn uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--index-name by-name \
--set base-dn:ou=People,dc=example,dc=com \
--set filter:"(&)" \
--set scope:subordinate-subtree \
--set sort-order:"+l +sn +cn" \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
$ rebuild-index \
--hostname localhost \
--port 4444 \
--bindDn uid=admin \
--bindPassword password \
--baseDn dc=example,dc=com \
--index vlv.by-name \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin
$ ldapsearch \
--hostname localhost \
--port 1636 \
--useSsl \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--bindDn uid=admin \
--bindPassword password \
--baseDn dc=example,dc=com \
--sortOrder +l,+sn,+cn \
--simplePageSize 5 \
"(&)" \
cn l sn
dn: uid=user.93953,ou=People,dc=example,dc=com
cn: Access Abedi
l: Abilene
sn: Abedi

dn: uid=user.40283,ou=People,dc=example,dc=com
cn: Achal Abernathy
l: Abilene
sn: Abernathy

dn: uid=user.67240,ou=People,dc=example,dc=com
cn: Alaine Alburger
l: Abilene
sn: Alburger

dn: uid=user.26994,ou=People,dc=example,dc=com
cn: Alastair Alexson
l: Abilene
sn: Alexson

dn: uid=user.53853,ou=People,dc=example,dc=com
cn: Alev Allen
l: Abilene
sn: Allen

Press RETURN to continue ^C
Rebuild Indexes
After you change an index configuration, or when you find that an index is corrupt, rebuild the index. When you rebuild indexes, you specify the base DN of the data to index, and either the list of indexes to rebuild or the --rebuildAll option.
Important
When you rebuild an index with the server online, the server takes the index's backend database offline while it rebuilds the index.
The data that the backend database serves is unavailable to client operations during this time.
The following example rebuilds the cn index immediately with the server online:
$ rebuild-index \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--baseDN dc=example,dc=com \
--index cn \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin
The following example rebuilds degraded indexes immediately with the server online:
$ rebuild-index \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--baseDN dc=example,dc=com \
--rebuildDegraded \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin
When you add a new attribute as described in "Update LDAP Schema", and then create indexes for the new attribute, the new indexes appear as degraded. Because the attribute has not yet been used, the new indexes are certain to be empty rather than genuinely degraded.
In this special case, you can safely use the rebuild-index --clearDegradedState command. The server avoids scanning the entire directory backend before rebuilding the new, unused index. In this example, an index has just been created for newUnusedAttribute. If the newly indexed attribute has already been used, rebuild the index instead of clearing the degraded state.
Before using the rebuild-index command, check the index status to make sure it has not been used.
DS servers must be stopped before you use the backendstat command:
$ stop-ds
The third column of the output is the Valid column, which is false before the rebuild:
$ backendstat \
show-index-status \
--backendID dsEvaluation \
--baseDN dc=example,dc=com \
| grep newUnusedAttribute
newUnusedAttribute.presence ... false ...
newUnusedAttribute.caseIgnoreMatch ... false ...
newUnusedAttribute.caseIgnoreSubstringsMatch:6 ... false ...
Update the index information to fix the value of the unused index:
$ rebuild-index \
--offline \
--baseDN dc=example,dc=com \
--clearDegradedState \
--index newUnusedAttribute
Check that the Index Valid column for the index status is now set to true:
$ backendstat \
show-index-status \
--backendID dsEvaluation \
--baseDN dc=example,dc=com \
| grep newUnusedAttribute
newUnusedAttribute.presence ... true ...
newUnusedAttribute.caseIgnoreMatch ... true ...
newUnusedAttribute.caseIgnoreSubstringsMatch:6 ... true ...
Start the server:
$ start-ds
Index Entry Limits
An index is implemented as a tree of key-value pairs. The key is what the search is trying to match. The value is a list of entry IDs.
As the number of entries in the directory grows, the list of entry IDs for some keys can become very large. For example, every entry in the directory has objectClass: top. If the directory maintains a substring index for mail, the number of entries with a mail value ending in .com could be huge.
A directory server therefore defines an index entry limit. When the number of entry IDs for a key exceeds the limit, the server stops maintaining a list of IDs for that key. The limit effectively makes a search using that key unindexed. Searches using other keys in the same index are not affected.
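The behavior can be modeled with a short Python sketch. It is a simplified illustration, not the server's storage format: the class and method names are hypothetical, and the limit is set to 3 so the effect is visible with a handful of entries.

```python
class SimpleIndex:
    """Toy index: maps each key to a set of entry IDs, up to a limit."""

    def __init__(self, entry_limit):
        self.entry_limit = entry_limit
        self.ids = {}            # key -> set of entry IDs
        self.over_limit = set()  # keys whose ID lists are no longer kept

    def add(self, key, entry_id):
        if key in self.over_limit:
            return  # the ID list for this key is no longer maintained
        ids = self.ids.setdefault(key, set())
        ids.add(entry_id)
        if len(ids) > self.entry_limit:
            # Too many matches: drop the list and remember the key.
            del self.ids[key]
            self.over_limit.add(key)

    def lookup(self, key):
        """Return candidate IDs, or None if the key is effectively unindexed."""
        if key in self.over_limit:
            return None
        return self.ids.get(key, set())


index = SimpleIndex(entry_limit=3)
for entry_id in range(1, 6):
    index.add("top", entry_id)      # every entry has this value
index.add("groupofnames", 5)        # only one entry has this value

print(index.lookup("top"))
print(index.lookup("groupofnames"))
```

A lookup for the over-limit key returns no candidate list, so a search relying on that key alone would have to examine every entry, while the lookup for the other key in the same index still returns its ID list.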
The following figure shows a fragment from a substring index for the mail attribute. The number of email addresses ending in .com has exceeded the index entry limit. For the other substring keys, the entry ID lists are still maintained. To save space, the entry IDs are not shown in the figure.
Ideally, the limit is set at the point where it becomes more expensive to maintain the entry ID list for a key and to perform an indexed search than to perform an unindexed search. In practice, the limit is a trade-off, with a default index entry limit value of 4000.
The following steps show how to get information about indexes where the index entry limit is exceeded for some keys. In this case, the directory server holds 100,000 user entries. The settings for this directory server are reasonable.
Use the backendstat show-index-status command.
Stop DS servers before you use the backendstat command:
$ stop-ds
Non-zero values in the Over Entry Limit column of the output table indicate the number of keys for which the limit has been reached. The keys that are over the limit are then listed below the table:
$ backendstat show-index-status --backendID dsEvaluation --baseDN dc=example,dc=com
Index Name ... Valid ... Over Entry Limit ...
...
mail.caseIgnoreIA5SubstringsMatch:6 ... true ... 34 ...
...
givenName.caseIgnoreSubstringsMatch:6 ... true ... 9 ...
telephoneNumber.telephoneNumberSubstringsMatch:6 ... true ... 10 ...
...
sn.caseIgnoreSubstringsMatch:6 ... true ... 14 ...
...
cn.caseIgnoreSubstringsMatch:6 ... true ... 14 ...
objectClass.objectIdentifierMatch ... true ... 4 ...
...

Index: /dc=com,dc=example/mail.caseIgnoreIA5SubstringsMatch:6
Over index-entry-limit keys: [.com] [0@mail] ...

Index: /dc=com,dc=example/givenName.caseIgnoreSubstringsMatch:6
Over index-entry-limit keys: [a] [e] [i] [ie] [l] [n] [na] [ne] [y]

Index: /dc=com,dc=example/telephoneNumber.telephoneNumberSubstringsMatch:6
Over index-entry-limit keys: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]

Index: /dc=com,dc=example/cn.caseIgnoreSubstringsMatch:6
Over index-entry-limit keys: [a] [an] [e] [er] [i] [k] [l] [n] [o] [on] [r] [s] [t] [y]

Index: /dc=com,dc=example/objectClass.objectIdentifierMatch
Over index-entry-limit keys: [inetorgperson] [organizationalperson] [person] [top]

Index: /dc=com,dc=example/sn.caseIgnoreSubstringsMatch:6
Over index-entry-limit keys: [a] [an] [e] [er] [i] [k] [l] [n] [o] [on] [r] [s] [t] [y]
For example, every user entry has the object classes listed, and every user entry has an email address ending in .com, so those values are not specific enough to be used in search filters.
Start the server:
$ start-ds
In rare cases, the index entry limit might be too low for a certain key. This could manifest itself as a frequent, useful search becoming unindexed with no reasonable way to narrow the search.
You can change the index entry limit on a per-index basis. Do not do this in production unless you can explain and show why the benefits outweigh the costs.
Important
Changing the index entry limit significantly can result in serious performance degradation. Be prepared to test performance thoroughly before you roll out an index entry limit change in production.
Consider a directory with more than 4000 groups in a backend. When the backend is brought online, a directory server searches for the groups with a search filter of (|(objectClass=groupOfNames)(objectClass=groupOfEntries)(objectClass=groupOfUniqueNames)), which is an unindexed search due to the default index entry limit setting. The following example raises the index entry limit for the objectClass index to 10000. It rebuilds the index for the configuration change to take effect. The steps are the same for any other index:
$ dsconfig \
set-backend-index-prop \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--index-name objectClass \
--set index-entry-limit:10000 \
--no-prompt \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin
$ stop-ds
$ rebuild-index \
--baseDN dc=example,dc=com \
--index objectClass \
--offline
$ start-ds
It is also possible, but not recommended, to configure the global index-entry-limit for a backend. This changes the default for all indexes in the backend. Use the dsconfig set-backend-prop command as shown in the following example:
# Not recommended
$ dsconfig \
set-backend-prop \
--hostname localhost \
--port 4444 \
--bindDN uid=admin \
--bindPassword password \
--backend-name dsEvaluation \
--set index-entry-limit:10000 \
--usePkcs12TrustStore /path/to/opendj/config/keystore \
--trustStorePasswordFile /path/to/opendj/config/keystore.pin \
--no-prompt
Verify Indexes
You can verify that indexes correspond to current directory data, and do not contain errors. Use the verify-index command.
The following example verifies the cn index offline:
$ stop-ds
$ verify-index \
--baseDN dc=example,dc=com \
--index cn \
--clean \
--countErrors
The output indicates whether any errors are found in the index.
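Conceptually, verifying an index means checking it against the entries in both directions. The Python sketch below illustrates the idea for a case-insensitive equality index. It is an assumption about the general technique, not a description of what the verify-index command does internally, and the function name and data structures are hypothetical.

```python
def verify_index(entries, index, attribute):
    """Return a list of (problem, key, entry ID); empty means no errors."""
    errors = []
    # Every ID listed under a key must refer to an entry with that value.
    for key, ids in index.items():
        for entry_id in sorted(ids):
            attrs = entries.get(entry_id, {})
            values = [value.lower() for value in attrs.get(attribute, [])]
            if key not in values:
                errors.append(("stale", key, entry_id))
    # Every value in the data must be listed under its key in the index.
    for entry_id, attrs in sorted(entries.items()):
        for value in attrs.get(attribute, []):
            if entry_id not in index.get(value.lower(), set()):
                errors.append(("missing", value.lower(), entry_id))
    return errors


# Hypothetical data: entry 2 is indexed under the wrong key.
entries = {
    1: {"cn": ["Babs Jensen"]},
    2: {"cn": ["Sam Carter"]},
}
index = {"babs jensen": {1, 2}}

for problem in verify_index(entries, index, "cn"):
    print(problem)
```

In this model, an index with any stale or missing references would be a candidate for a rebuild.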