12
votes

I'd like to create a php script that runs as a daily cron. What I'd like to do is enumerate through all users within an Active Directory, extract certain fields from each entry, and use this information to update fields within a MySQL database.

Basically what I want to to do is sync up certain user information between Active Directory and a MySQL table.

The problem I have is that the sizelimit on the Active Directory server is often set at 1000 entries per search result. I had hoped that the php function "ldap_next_entry" would get around this by only fetching one entry at a time, but before you can call "ldap_next_entry", you first have to call "ldap_search", which can trigger the SizeLimit exceeded error.

Is there any way besides removing the sizelimit from the server? Can I somehow get "pages" of results?

BTW - I am currently not using any 3rd party libraries or code. Just PHPs ldap methods. Although, I am certainly open to using a library if that will help.

5

5 Answers

16
votes

I've been struck by the same problem while developing Zend_Ldap for the Zend Framework. I'll try to explain what the real problem is, but to make it short: until PHP 5.4, it wasn't possible to use paged results from an Active Directory with an unpatched PHP (ext/ldap) version due to limitations in exactly this extension.

Let's try to unravel the whole thing... Microsoft Active Directory uses a so called server control to accomplish server-side result paging. This control ist described in RFC 2696 "LDAP Control Extension for Simple Paged Results Manipulation" .

ext/php offers an access to LDAP control extensions via its ldap_set_option() and the LDAP_OPT_SERVER_CONTROLS and LDAP_OPT_CLIENT_CONTROLS option respectively. To set the paged control you do need the control-oid, which is 1.2.840.113556.1.4.319, and we need to know how to encode the control-value (this is described in the RFC). The value is an octet string wrapping the BER-encoded version of the following SEQUENCE (copied from the RFC):

realSearchControlValue ::= SEQUENCE {
        size            INTEGER (0..maxInt),
                                -- requested page size from client
                                -- result set size estimate from server
        cookie          OCTET STRING
}

So we can set the appropriate server control prior to executing the LDAP query:

$pageSize    = 100;
$pageControl = array(
    'oid'        => '1.2.840.113556.1.4.319', // the control-oid
    'iscritical' => true, // the operation should fail if the server is not able to support this control
    'value'      => sprintf ("%c%c%c%c%c%c%c", 48, 5, 2, 1, $pageSize, 4, 0) // the required BER-encoded control-value
);

This allows us to send a paged query to the LDAP/AD server. But how do we know if there are more pages to follow and how do we specify with which control-value we have to send our next query?

This is where we're getting stuck... The server responds with a result set that includes the required paging information but PHP lacks a method to retrieve exactly this information from the result set. PHP provides a wrapper for the LDAP API function ldap_parse_result() but the required last parameter serverctrlsp is not exposed to the PHP function, so there is no way to retrieve the required information. A bug report has been filed for this issue but there has been no response since 2005. If the ldap_parse_result() function provided the required parameter, using paged results would work like

$l = ldap_connect('somehost.mydomain.com');
$pageSize    = 100;
$pageControl = array(
    'oid'        => '1.2.840.113556.1.4.319',
    'iscritical' => true,
    'value'      => sprintf ("%c%c%c%c%c%c%c", 48, 5, 2, 1, $pageSize, 4, 0)

);
$controls = array($pageControl);

ldap_set_option($l, LDAP_OPT_PROTOCOL_VERSION, 3);
ldap_bind($l, 'CN=bind-user,OU=my-users,DC=mydomain,DC=com', 'bind-user-password');

$continue = true;
while ($continue) {
    ldap_set_option($l, LDAP_OPT_SERVER_CONTROLS, $controls);
    $sr = ldap_search($l, 'OU=some-ou,DC=mydomain,DC=com', 'cn=*', array('sAMAccountName'), null, null, null, null);
    ldap_parse_result ($l, $sr, $errcode, $matcheddn, $errmsg, $referrals, $serverctrls); // (*)
    if (isset($serverctrls)) {
        foreach ($serverctrls as $i) {
            if ($i["oid"] == '1.2.840.113556.1.4.319') {
                    $i["value"]{8}   = chr($pageSize);
                    $i["iscritical"] = true;
                    $controls        = array($i);
                    break;
            }
        }
    }

    $info = ldap_get_entries($l, $sr);
    if ($info["count"] < $pageSize) {
        $continue = false;
    }

    for ($entry = ldap_first_entry($l, $sr); $entry != false; $entry = ldap_next_entry($l, $entry)) {
        $dn = ldap_get_dn($l, $entry);
    }
}

As you see there is a single line of code (*) that renders the whole thing useless. On my way though the sparse information on this subject I found a patch against the PHP 4.3.10 ext/ldap by Iñaki Arenaza but neither did I try it nor do I know if the patch can be applied on a PHP5 ext/ldap. The patch extends ldap_parse_result() to expose the 7th parameter to PHP:

--- ldap.c 2004-06-01 23:05:33.000000000 +0200
+++ /usr/src/php4/php4-4.3.10/ext/ldap/ldap.c 2005-09-03 17:02:03.000000000 +0200
@@ -74,7 +74,7 @@
 ZEND_DECLARE_MODULE_GLOBALS(ldap)

 static unsigned char third_argument_force_ref[] = { 3, BYREF_NONE, BYREF_NONE, BYREF_FORCE };
-static unsigned char arg3to6of6_force_ref[] = { 6, BYREF_NONE, BYREF_NONE, BYREF_FORCE, BYREF_FORCE, BYREF_FORCE, BYREF_FORCE };
+static unsigned char arg3to7of7_force_ref[] = { 7, BYREF_NONE, BYREF_NONE, BYREF_FORCE, BYREF_FORCE, BYREF_FORCE, BYREF_FORCE, BYREF_FORCE };

 static int le_link, le_result, le_result_entry, le_ber_entry;

@@ -124,7 +124,7 @@
 #if ( LDAP_API_VERSION > 2000 ) || HAVE_NSLDAP
  PHP_FE(ldap_get_option,   third_argument_force_ref)
  PHP_FE(ldap_set_option,        NULL)
- PHP_FE(ldap_parse_result,   arg3to6of6_force_ref)
+ PHP_FE(ldap_parse_result,   arg3to7of7_force_ref)
  PHP_FE(ldap_first_reference,      NULL)
  PHP_FE(ldap_next_reference,       NULL)
 #ifdef HAVE_LDAP_PARSE_REFERENCE
@@ -1775,14 +1775,15 @@
    Extract information from result */
 PHP_FUNCTION(ldap_parse_result) 
 {
- pval **link, **result, **errcode, **matcheddn, **errmsg, **referrals;
+ pval **link, **result, **errcode, **matcheddn, **errmsg, **referrals, **serverctrls;
  ldap_linkdata *ld;
  LDAPMessage *ldap_result;
+ LDAPControl **lserverctrls, **ctrlp, *ctrl;
  char **lreferrals, **refp;
  char *lmatcheddn, *lerrmsg;
  int rc, lerrcode, myargcount = ZEND_NUM_ARGS();

- if (myargcount  6 || zend_get_parameters_ex(myargcount, &link, &result, &errcode, &matcheddn, &errmsg, &referrals) == FAILURE) {
+ if (myargcount  7 || zend_get_parameters_ex(myargcount, &link, &result, &errcode, &matcheddn, &errmsg, &referrals, &serverctrls) == FAILURE) {
   WRONG_PARAM_COUNT;
  }

@@ -1793,7 +1794,7 @@
     myargcount > 3 ? &lmatcheddn : NULL,
     myargcount > 4 ? &lerrmsg : NULL,
     myargcount > 5 ? &lreferrals : NULL,
-    NULL /* &serverctrls */,
+    myargcount > 6 ? &lserverctrls : NULL,
     0 );
  if (rc != LDAP_SUCCESS ) {
   php_error(E_WARNING, "%s(): Unable to parse result: %s", get_active_function_name(TSRMLS_C), ldap_err2string(rc));
@@ -1805,6 +1806,29 @@

  /* Reverse -> fall through */
  switch(myargcount) {
+  case 7 :
+   zval_dtor(*serverctrls);
+
+   if (lserverctrls != NULL) {
+    array_init(*serverctrls);
+    ctrlp = lserverctrls;
+
+    while (*ctrlp != NULL) {
+     zval *ctrl_array;
+
+     ctrl = *ctrlp;
+     MAKE_STD_ZVAL(ctrl_array);
+     array_init(ctrl_array);
+
+     add_assoc_string(ctrl_array, "oid", ctrl->ldctl_oid,1);
+     add_assoc_bool(ctrl_array, "iscritical", ctrl->ldctl_iscritical);
+     add_assoc_stringl(ctrl_array, "value", ctrl->ldctl_value.bv_val,
+           ctrl->ldctl_value.bv_len,1);
+     add_next_index_zval (*serverctrls, ctrl_array);
+     ctrlp++;
+    }
+    ldap_controls_free (lserverctrls);
+   }
   case 6 :
    zval_dtor(*referrals);
    if (array_init(*referrals) == FAILURE) {

Actually the only option left would be to change the Active Directory configuration and raise the maximum result limit. The relevant option is called MaxPageSize and can be altered by using ntdsutil.exe - please see "How to view and set LDAP policy in Active Directory by using Ntdsutil.exe".

EDIT (reference to COM):

Or you can go the other way round and use the COM-approach via ADODB as suggested in the link provided by eykanal.

6
votes

Support for paged results was added in PHP 5.4.

See ldap_control_paged_result for more details.

3
votes

This isn't a full answer, but this guy was able to do it. I don't understand what he did, though.

By the way, a partial answer is that you CAN get "pages" of results. From the documentation:

resource ldap_search ( resource $link_identifier , string $base_dn ,
     string $filter [, array $attributes [, int $attrsonly [, int $sizelimit [, 
     int $timelimit [, int $deref ]]]]] )
...

sizelimit Enables you to limit the count of entries fetched. Setting this to 0 means no limit.

Note: This parameter can NOT override server-side preset sizelimit. You can set it lower though. Some directory server hosts will be configured to return no more than a preset number of entries. If this occurs, the server will indicate that it has only returned a partial results set. This also occurs if you use this parameter to limit the count of fetched entries.

I don't know how to specify that you want to search STARTING from a certain position, though. I.e., after you get your first 1000, I don't know how to specify that now you need the next 1000. Hopefully someone else can help you there :)

0
votes

Here's an alternative (which works pre PHP 5.4). If you have 10,000 records you need to get but your AD server only returns 5,000 per page:

$ldapSearch = ldap_search($ldapResource, $basedn, $filter, array('member;range=0-4999')); 
$ldapResults = ldap_get_entries($dn, $ldapSearch);
$members = $ldapResults[0]['member;range=0-4999'];

$ldapSearch = ldap_search($ldapResource, $basedn, $filter, array('member;range=5000-10000')); 
$ldapResults = ldap_get_entries($dn, $ldapSearch);
$members = array_merge($members, $ldapResults[0]['member;range=5000-*']);
0
votes

I was able to get around the size limitation using ldap_control_paged_result

ldap_control_paged_result is used to Enable LDAP pagination by sending the pagination control. The below function worked perfectly in my case.

function retrieves_users($conn)
    {
        $dn        = 'ou=,dc=,dc=';
        $filter    = "(&(objectClass=user)(objectCategory=person)(sn=*))";
        $justthese = array();

        // enable pagination with a page size of 100.
        $pageSize = 100;

        $cookie = '';

        do {
            ldap_control_paged_result($conn, $pageSize, true, $cookie);

            $result  = ldap_search($conn, $dn, $filter, $justthese);
            $entries = ldap_get_entries($conn, $result);

            if(!empty($entries)){
                for ($i = 0; $i < $entries["count"]; $i++) {
                    $data['usersLdap'][] = array(
                            'name' => $entries[$i]["cn"][0],
                            'username' => $entries[$i]["userprincipalname"][0]
                    );
                }
            }
            ldap_control_paged_result_response($conn, $result, $cookie);

        } while($cookie !== null && $cookie != '');

        return $data;
    }