Searching LDAP Servers with Net::LDAP

By: Mark Wilcox

The Lightweight Directory Access Protocol (LDAP) is the Internet standard for directory services. Perl provides one of the best tools you can use to build LDAP-based applications. In this article we're going to learn how to use Perl to search LDAP servers.

Why Do We Need LDAP?

A directory service is a specialized database that has been optimized for read access. The data contained in a directory service is usually the type of data that is accessed more often than it's updated (things like password databases, DNS tables, etc). A directory is usually organized in a hierarchical fashion as opposed to a relational format. Finally, a directory service is generally designed so that it can be easily distributed across different servers.

Directory services let us offer all sorts of services, such as a network email address book, single sign-on, or management of networked devices like printers.

There are several different directory service protocols, such as NIS/NIS+, Novell's NDS and X.500. LDAP is an open standard (X.500 is also an open standard and is the forebear of LDAP, but it wasn't widely implemented because of its roots in the OSI protocol) and has gained popularity because the major proprietary directory service vendors (Sun, Novell, Microsoft) have all released versions of their directory service products that can 'speak' LDAP.

The most important thing LDAP provides for us is the ability to centralize the management of data that has been decentralized across the enterprise. This data has been decentralized because it was too hard to get access to it if it was managed centrally (e.g. in a mainframe). The reason why we want to move this data back to the 'center' is that in a networked environment, much of the information we need to accurately manage our connected world is stored in various data stores that aren't easily accessed via a standard management interface.

Take user databases for example.

At the University of North Texas we created the concept of an 'enterprise userid' that could form the basis of single-signon. We came up with this specification about 8 years ago and in Academic Computing Services we have implemented the system. However, we have never had a real good idea of who should have access and when their access should be turned off (e.g. they graduated, left for a different job, etc). The best we could do was attempt to 'read the tea leaves' that we got in the form of a flat-file from our ID Card Office. In reality this information was actively maintained in a particular database in the mainframe, but we didn't have a way to get access to the data in the mainframe from our Solaris boxes (CORBA was out of the question because our mainframe database doesn't support a CORBA interface, it's a long story ;).

Another problem was that other systems on campus, such as our Novell group, wanted to be able to use our EUID database so that students could have the same userid on our UNIX systems and Novell systems.

Finally, in an ironic twist, the Administrative Computing Services group (aka 'the mainframe programmers') wanted access to our student email addresses so that they could send them official messages like financial aid statements and grades.

Thus we decided to build an LDAP server since it would allow us to give access to this type of data via a standard, secure, open, Internet-based protocol.

Now the mainframe will provide us with a better feed (via a CORBA like product called EntireBroker) so that we have a better idea of who should & shouldn't have access to computing services. We are also able to provide extra services such as automatically creating 'bulk mail' services for our administration and faculty. For example select members of the administration can send messages to students based on a particular query (e.g. all freshmen biology majors) and faculty can send messages to students in their courses. We use the LDAP server to answer these queries and to control access to who can send these messages.

Onto Perl!

Ok, you're now wondering what does this have to do with Perl? Well a couple of things. One is that Perl is the de-facto standard for CGI programming and one of the most popular things to do with LDAP is to provide a Web interface to its information (generally in the form of an enterprise email/phone search). Second, LDAP is often populated with data from legacy information and Perl is one of the best tools you can use to move data from one source to another (in particular if you have to do any type of data normalization).

In the rest of this article we are going to cover how to search an LDAP server using the Net::LDAP module.

First A Brief Intro to LDAP

Before we dive directly into LDAP programming, there are a few other things you need to know.

First the term 'LDAP' can mean one of four things. One is a data model. The data model defines what an LDAP entry looks like. Second, there is the naming model which defines how LDAP entries are named. Third is the security model which defines how to protect your data in an LDAP server. Finally there is an access model. The access model defines how LDAP clients and servers talk to each other.

The data stored in the LDAP server (as specified by the data model) are stored as entries. Entries have a collection of attributes which are represented as name-value pairs. Each attribute can have one or more values associated with it. The values can be text or binary. There is a special attribute called the Distinguished Name (DN) which must be unique within its particular directory server. A DN is made up of attributes found in the entry. By reading a DN you can determine where that entry fits into the directory server, just like you can figure out where a file fits on a hard-drive by looking at its path-name. One key difference between a DN and a path-name is that a DN goes from leaf to root while a path-name goes from root to leaf.

Here is an example of a DN:

uid=mewilcox,ou=people,o=airius.com
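
To make the leaf-to-root ordering concrete, here is a minimal sketch in plain Perl that splits the DN above into its components and then reverses them so the result reads like a path-name. The dn_components helper is hypothetical (it is not part of Net::LDAP) and it assumes a simple DN with no escaped commas.

```perl
use strict;
use warnings;

# Hypothetical helper: split a DN into its components.
# Assumes a simple DN with no escaped commas or multi-valued components.
sub dn_components {
    my ($dn) = @_;
    return split /\s*,\s*/, $dn;
}

my @parts = dn_components('uid=mewilcox,ou=people,o=airius.com');

# A DN reads leaf first, root last
print join(' -> ', @parts), "\n";

# Reversed, it reads root first, like a file path
print join('/', reverse @parts), "\n";
```

The first line printed runs from the entry up to the root; the second runs from the root down, the way a path-name would.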

Check the resources section at the end of this article to learn where you can find more about LDAP.

Why Net::LDAP

Now we'll actually start to do some programming. In our examples here we are going to be using Graham Barr's Net::LDAP module. There are two other LDAP modules, Net::LDAPapi (no longer actively maintained) and Netscape's PerLDAP which are Perl wrappers around the Netscape C LDAP API.

The Net::LDAP module is written in pure Perl with no C compiler required, which makes this module incredibly portable. All you need is a copy of Perl 5.004 or later and you're ready to go. On the Net::LDAP mailing list we've heard from people who have used the module on everything from MacPerl, to Windows, to all flavors of Unix, and I'm pretty sure someone has even had a copy running on a mainframe somewhere, all without having to recompile or port any code!

Assuming that you have the Perl libnet bundle (required to do any network programming in Perl & I'm pretty sure a standard module set these days), you only need to install two modules: Convert::BER (soon to be replaced by Convert::ASN1, I'll explain more about this later) and Net::LDAP itself. Optionally you can also install Digest::MD5 (used for SASL CRAM-MD5 authentication) and the URI module (used to parse LDAP URLs). All of these modules are available on CPAN.

After you have downloaded the modules, they are a pretty straightforward install. If you are going to use the optional modules, you should install those modules first. Then you must install Convert::BER followed by Net::LDAP.

All of the modules follow a similar install procedure:

  1. unpack the module
  2. cd into their unpacked directory (e.g. perl-ldap-0.14)
  3. type perl Makefile.PL
  4. type make
  5. type make test
  6. type make install

Now you are ready to start hacking away.

Examples

In this article we are going to confine our examples to just searching an LDAP server. The steps to do this are:

  1. load the Net::LDAP module
  2. connect to the LDAP server
  3. bind (authenticate) to the LDAP server
  4. define what attributes we wish to have returned from the server
  5. define a search base
  6. define a search scope
  7. define a search filter
  8. perform the LDAP search
  9. display the results

To connect to an LDAP server all you have to do is:

my $ldap = Net::LDAP->new('hostname') or die "failed to connect: $@\n";

The next step is to authenticate your connection to the LDAP server via an LDAP bind. The LDAP bind operation associates a client connection with a particular entry in the LDAP server. While LDAP can support a variety of authentication mechanisms including digital certificates and Kerberos (via the SASL mechanism), we are only going to concern ourselves with simple authentication, which requires the Distinguished Name (DN) of an entry and a password. A DN is the unique name each LDAP entry has. Under LDAP you can also authenticate to the server anonymously. This is very common when you are using LDAP to publish a public directory service and you don't want to issue accounts to the millions of potential users on the Internet. Generally, anonymous connections only have very limited access to the server.

Here is how you authenticate as a particular user:

my $mesg = $ldap->bind('uid=mewilcox,ou=people,o=airius.com', password => 'password');

Here is how to authenticate anonymously:

my $mesg = $ldap->bind();

Nearly every function in Net::LDAP will return a Net::LDAP::Message object, which I've put in the $mesg variable in my example here. You can use the Net::LDAP::Message::code function to check the LDAP return code. This will tell you if you successfully authenticated or not. Here is the code I use:

die ("failed to bind with ",$mesg->code(),"\n") if $mesg->code();

The code I just showed you works because under LDAP all non-successful operations will have an LDAP result code of 1 or greater. Since under Perl zero is false and any non-zero value is true, the die function will only be called if the LDAP operation did not succeed.
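
The same truthiness test can be tried without a server. The sketch below is only an illustration: FakeMessage is a hypothetical stand-in for a Net::LDAP::Message object that does nothing but hold a result code, and describe is a made-up helper that reports the outcome instead of calling die.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Net::LDAP::Message: it only stores a result code
package FakeMessage;
sub new  { my ($class, $code) = @_; return bless { code => $code }, $class }
sub code { return $_[0]{code} }

package main;

# Made-up helper: report the outcome as a string instead of dying
sub describe {
    my ($mesg) = @_;
    return $mesg->code() ? 'failed with code ' . $mesg->code() : 'ok';
}

print describe(FakeMessage->new(0)),  "\n";   # 0 is success, and false in Perl
print describe(FakeMessage->new(49)), "\n";   # 49 is the LDAP code for invalid credentials
```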

Our next step is to build our search function. A search requires three elements, a search base, a search scope and a filter.

The search base is the DN of the entry where you want your search to begin.

The search scope defines how much of the LDAP server you want to search at one time. There are three scopes. The most common scope is sub, which means start the search at the search base and search all entries below the base, including the base entry itself. There is also the base scope, which searches only the entry specified by the search base. Finally there is a scope of one, which searches the entries immediately below the search base (one level down), but does not include the base entry.
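
In the LDAP protocol, base covers only the base entry, one covers the entries exactly one level below it, and sub covers the base entry plus everything beneath it. Here is a rough sketch of that rule in plain Perl. The in_scope helper is hypothetical, the cn=laptop entry is invented for the example, and the comparison assumes the DNs are already normalized (same case, no spaces after commas, no escaped commas). A real server does this work for you.

```perl
use strict;
use warnings;

# Hypothetical helper: is $dn inside $scope of $base?
# Assumes normalized DNs, so plain string comparison is enough.
sub in_scope {
    my ($dn, $base, $scope) = @_;

    my $is_base = $dn eq $base;

    # Anything below the base ends with ",<base>"
    my $suffix = ",$base";
    my $below  = length($dn) > length($suffix)
              && substr($dn, -length($suffix)) eq $suffix;

    if ($scope eq 'base') {
        return $is_base ? 1 : 0;
    }
    if ($scope eq 'one') {
        return 0 unless $below;
        # What is left in front of the base must be a single component
        my $rest = substr($dn, 0, length($dn) - length($suffix));
        return $rest =~ /,/ ? 0 : 1;
    }
    if ($scope eq 'sub') {
        return ($is_base || $below) ? 1 : 0;
    }
    die "unknown scope '$scope'\n";
}

my $base = 'ou=people,o=airius.com';
my @dns  = (
    'o=airius.com',
    'ou=people,o=airius.com',
    'uid=scarter,ou=people,o=airius.com',
    'cn=laptop,uid=scarter,ou=people,o=airius.com',   # invented entry
);

for my $scope (qw(base one sub)) {
    my @hits = grep { in_scope($_, $base, $scope) } @dns;
    print "$scope: @hits\n";
}
```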

In a search filter you specify the attributes and values you want to match to declare a successful search. Whether a search is case sensitive or not depends upon what attribute you are searching. When you configure an LDAP server you tell it which attributes are case-sensitive and which ones are not.

A search filter can be very simple like this one:

(sn=Wilcox)

Which says find all entries that have an sn attribute with a value of Wilcox. (Don't put spaces around the = sign, because whitespace is significant inside a filter.)

You can also do wildcard searches like this:

(sn=Wil*)

Which will find all entries that have an sn attribute that contains a value that starts with Wil.

You can also do boolean searches. AND is represented by the &. OR is represented by the | and NOT is represented by the !. You can combine AND and OR with different filters, but NOT can only apply to one filter.

For example here's how you can say 'give me all entries who have a last name of Wilcox and work in Accounting (where they work is specified in the ou attribute)'.

(&(sn=Wilcox)(ou=Accounting))

Or we can say 'give me all people who work in Accounting or Engineering'.

(|(ou=Accounting)(ou=Engineering))

Or we can say 'give me all entries except for those who work in Accounting'.

(!(ou=Accounting))

Or we can combine some of these like this.

(&(sn=Wilcox)(|(ou=Accounting)(ou=Engineering)))
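
Writing nested filters by hand gets error-prone quickly, so here is a minimal sketch of building them from small pieces in plain Perl. All of the helper names (escape_value, eq_filter, and_filter, or_filter, not_filter) are made up for this example and are not part of Net::LDAP. The escaping step assumes the convention used in the LDAP filter string format, where a special character in a value is written as a backslash followed by its two-digit hex code.

```perl
use strict;
use warnings;

# Escape characters that have a special meaning inside a filter value
sub escape_value {
    my ($value) = @_;
    $value =~ s/([\\*()\0])/sprintf('\\%02x', ord $1)/ge;
    return $value;
}

# Build a simple equality test; the value is matched literally
sub eq_filter {
    my ($attr, $value) = @_;
    return '(' . $attr . '=' . escape_value($value) . ')';
}

sub and_filter { return '(&' . join('', @_) . ')' }
sub or_filter  { return '(|' . join('', @_) . ')' }
sub not_filter { return '(!' . $_[0] . ')' }      # NOT takes exactly one filter

my $filter = and_filter(
    eq_filter(sn => 'Wilcox'),
    or_filter(
        eq_filter(ou => 'Accounting'),
        eq_filter(ou => 'Engineering'),
    ),
);
print "$filter\n";

# A literal asterisk is escaped, so it no longer acts as a wildcard
print eq_filter(cn => 'Wil*'), "\n";
```

The first filter printed is the same combined filter shown above. Escaping matters mostly when the value comes from user input, such as a Web form.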

Here is an example search in Net::LDAP:

$mesg = $ldap->search(
                      base   => 'ou=people,o=airius.com',
                      scope  => 'sub',
                      filter => '(objectclass=*)'
                     );

The search base is ou=people,o=airius.com, the scope is sub and the filter is (objectclass=*). This last filter says return all entries that have a value in their objectclass attribute. The objectclass attribute is the only attribute guaranteed to be in each entry in the LDAP server (the DN is a special attribute that is not really part of the entry). An objectclass attribute is used by the LDAP server to determine what attributes a particular LDAP entry is required to have and which ones it is simply allowed to have.

Again we can check for the success of the operation by using the returned Net::LDAP::Message object.

die ("search failed with ",$mesg->code(),"\n") if $mesg->code();

The entries returned from a search are also contained in our Net::LDAP::Message object.

We can get them one at a time using either Net::LDAP::Message::shift_entry or Net::LDAP::Message::pop_entry. Or we can get all of the entries at once using Net::LDAP::Message::entries. An LDAP entry is stored in a Net::LDAP::Entry object.

I generally dump my entries using code that looks something like this:

while (my $entry = $mesg->shift_entry())
{
   $entry->dump();
}

The easiest way to display the results of a search is using the Net::LDAP::Entry::dump method which simply dumps the contents of an entry to STDOUT in LDIF format. The LDAP Data Interchange Format is the standard way to display LDAP data in a human readable format. There is a newer XML standard, DSML, but it hasn't been widely implemented anywhere (yet).
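
To give a feel for the LDIF layout, here is a minimal sketch that formats one entry by hand: a dn line first, then one "name: value" line per value. The to_ldif helper is hypothetical, and it assumes every value is plain printable text. Real LDIF writers also handle binary values and long lines, which this sketch leaves out.

```perl
use strict;
use warnings;

# Hypothetical helper: format one entry as simplified LDIF.
# Assumes every value is plain printable text.
sub to_ldif {
    my ($dn, $attrs) = @_;
    my @lines = ("dn: $dn");
    for my $name (sort keys %$attrs) {
        # A multi-valued attribute gets one line per value
        push @lines, "$name: $_" for @{ $attrs->{$name} };
    }
    return join("\n", @lines) . "\n";
}

print to_ldif(
    'uid=scarter,ou=People,o=airius.com',
    {
        cn          => ['Sam Carter'],
        mail        => ['scarter@airius.com'],
        objectclass => [qw(top person organizationalPerson inetOrgPerson)],
    },
);
```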

If you use simplesearch.pl or bindedsimplesearch.pl to search your local LDAP server, you will see results that look something like this:

dn:uid=scarter, ou=People, o=airius.com
cn: Sam Carter
sn: Carter
givenname: Sam
objectclass: top
                       person
                       organizationalPerson
                       inetOrgPerson
ou: Accounting
       People
l: Sunnyvale
uid: scarter
mail: scarter@airius.com
telephonenumber: +1 408 555 4798
facsimiletelephonenumber: +1 408 555 9751
roomnumber: 4612
------------------------------------------------------------------------

One thing you'll likely notice when you run these two example scripts is that the bindedsimplesearch.pl results will display more attributes than the simplesearch.pl. This is because LDAP uses ACLs to control access to the attributes stored in its entries.  By properly configuring ACLs you can easily provide different 'views' to entries stored in the LDAP server.

While these scripts are handy for testing out LDAP servers, they don't work real well when you want to either change how your results are displayed or when you only want to retrieve a select number of attributes from an entry.

One way to limit the attributes you are seeing is by telling the LDAP server to only retrieve particular attributes. This requires two steps.

First you must specify the attributes via an anonymous array like this:

my $attrs = ['cn','mail'];

Second you must pass these attributes to the LDAP server in the search method like this:

$mesg = $ldap->search(
                      base   => 'ou=people,o=airius.com',
                      scope  => 'sub',
                      filter => '(objectclass=*)',
                      attrs  => $attrs,
                     );

Now let's say that instead of dumping the attributes out to STDOUT, you would like to dump the results out to a text file. While it is true you could simply redirect the output from the earlier scripts into a text file, there is a cleaner solution in Net::LDAP with the Net::LDAP::LDIF class. The Net::LDAP::LDIF class lets you read and write LDIF files. What's great about this is that LDIF allows us to specify LDAP update commands in the LDIF itself, and by using Net::LDAP::LDIF we can take other LDIF files and, with relative ease, modify our LDAP server without having to write a lot of code.

However, for now we just simply want to write our search results out to a file. First you must directly import Net::LDAP::LDIF with use Net::LDAP::LDIF because Net::LDAP doesn't automatically load the module into your program.

Next you must obtain an instance of the Net::LDAP::LDIF class like this:

my $ldif = Net::LDAP::LDIF->new('example.ldif', 'w') or die ("failed to open example.ldif. $!\n");

This will open a new file named example.ldif that is ready to be written to.

Third instead of simply dumping the results to the screen with Net::LDAP::Entry::dump, you write them to the LDIF file like this:

$ldif->write($entry);

Finally when you are done with the LDIF file you can close it by:

$ldif->done();

But what if you don't want plain old LDIF? What if you wanted to be ahead of the curve and actually write your results out in DSML so that you could do other things with it (e.g. apply style sheets to it for displaying in a browser or feed another datasource that only takes XML tagged data)?

Here's how you could do that.

First we must open a file to print our XML data out to.

open(DSML,">example.xml") || die("failed to open example.xml. $!\n");

Second we print out the first lines of our XML file.

print DSML "<?xml version=\"1.0\"?>\n";
print DSML "<directory-entries>\n";

Next we perform a search on the server and retrieve each entry one by one. Instead of dumping the entry to STDOUT or into a LDIF object, we are going to retrieve each of the entry's attributes and values so that we can transform them into DSML entities.

The first thing we must get is the DN of the entry, which we print out as DSML like this:

my $dn = $entry->dn();
print DSML "<entry dn=\"$dn\">\n";

Next we get an array of the attributes stored in the entry.

  #get a list of attributes
  my @attributes = $entry->attributes;

Our next trick is to step through each attribute, get its values from the entry and then print out the resulting DSML.

  for my $attribute (@attributes)
  {
    my $values = $entry->get($attribute);

    #in DSML objectclasses are written differently than other attributes
    #(compare in lowercase because a server may return 'objectClass')
    if ( lc($attribute) eq 'objectclass')
    {
      print DSML "<objectclass>\n";
      for my $v (@{$values})
      {
        print DSML "<oc-value>$v</oc-value>\n";
      }
      print DSML "</objectclass>\n";
    }
    else
    {
      #<dsml:attr name="sn"><dsml:value>Rabbit</dsml:value></dsml:attr>
      print DSML "<attr name=\"$attribute\">\n";
      for my $v (@{$values})
      {
        print DSML "<value>$v</value>\n";
      }
      print DSML "</attr>\n";
    }
  }

Finally we must close off all of our DSML entry tags.

  print DSML "</entry>\n";
}
print DSML "</directory-entries>\n";

The final steps are to close the DSML filehandle and the LDAP connection (via Net::LDAP::unbind, something you should do in all of your LDAP programs, though Perl will take care of it for you if you forget).

The resulting program, dsmlsearch.pl will print out an entry that looks something like this:

<entry dn="uid=scarter, ou=People, o=airius.com">
<attr name="cn">
<value>Sam Carter</value>
</attr>
<attr name="mail">
<value>scarter@airius.com</value>
</attr>
<objectclass>
<oc-value>top</oc-value>
<oc-value>person</oc-value>
<oc-value>organizationalPerson</oc-value>
<oc-value>inetOrgPerson</oc-value>
</objectclass>
</entry>
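
One caveat with printing values straight into XML: a value that contains &, < or a double quote would produce a file that XML parsers reject. Here is a minimal sketch of an escaping step you could apply to each value and DN before printing. The xml_escape helper is hypothetical, and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: make a string safe for XML text and attribute values
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;     # must run first so later entities are not re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Made-up values that would otherwise break the XML
my $ou = 'Research & Development <R&D>';
my $dn = 'cn="Wilcox, Mark",ou=people,o=airius.com';

print '<entry dn="', xml_escape($dn), "\">\n";
print '<value>', xml_escape($ou), "</value>\n";
print "</entry>\n";
```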

Our final example will show you how to use the callback parameter of Net::LDAP. A callback is simply a subroutine that is called every time an entry arrives from the server during a search. The reason you want to go through the trouble of doing such a thing is that it can make your program appear to run faster, which is important when you're using Net::LDAP to build a Web application. It appears to run faster because you can start to work on the results as soon as they arrive, instead of having to wait for all of the results as you do in the non-callback approach.

To make it easier to compare the code, I've rewritten the simplesearch.pl script to use a callback in a script called callbacksearch.pl.

Here's the callback routine.

sub callback {
 my ($mesg,$entry) = @_;

 $entry->dump() if ref $entry eq 'Net::LDAP::Entry';
}

Now here's the modified search method.

$mesg = $ldap->search(
                      base     => 'ou=people,o=airius.com',
                      scope    => 'sub',
                      filter   => '(objectclass=*)',
                      callback => \&callback
                     );
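
If you don't have an LDAP server handy, the callback pattern itself can be tried in plain Perl. In the sketch below, fake_search is a hypothetical stand-in for a search: it takes a list of pretend entries and hands each one to the callback as soon as it is 'received', passing a placeholder where the message object would be. The entry names are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a search: each pretend entry is handed to
# the callback right away instead of being collected first
sub fake_search {
    my (%args) = @_;
    my $mesg = { count => 0 };          # placeholder for the message object
    for my $entry (@{ $args{entries} }) {
        $mesg->{count}++;
        $args{callback}->($mesg, $entry);
    }
    return $mesg;
}

sub show_entry {
    my ($mesg, $entry) = @_;
    print "entry $mesg->{count}: $entry\n";
}

my $mesg = fake_search(
    entries  => ['uid=scarter', 'uid=tmorris', 'uid=kvaughan'],   # made-up entries
    callback => \&show_entry,
);
print "done after $mesg->{count} entries\n";
```

Each line is printed while the loop is still running, which is the effect a callback gives you during a real search.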

Convert::ASN1

The most common complaint about Net::LDAP is that it's 'too slow'. Of course this is all a matter of perception. For all of the applications that I have written, Net::LDAP's performance has been more than adequate. And any of its inefficiencies have either been covered up by network latency on the client's connection or have been minor compared to the headache of compiling and installing a C based Perl module (especially if you must compile the C API from scratch). Another great thing about Net::LDAP is that the people who contribute to its development represent users of all of the popular LDAP servers out there, which helps make sure that Net::LDAP stays vendor neutral.

Net::LDAP will always be a bit slower than its C wrapper counterparts because it is pure Perl. Occasionally users have asked Graham to completely rewrite the API in C to speed it up. But as Graham has pointed out, this isn't necessary. The bulk of the work during any LDAP operation is in the encoding and decoding of the ASN.1 op-codes that LDAP uses to conduct its operations (occasionally LDAP does show its X.500/OSI roots). This work has been carried out in the past by a module named Convert::BER (BER stands for Basic Encoding Rules, which is the encoding set LDAP uses for ASN.1). Graham has recently written a new module called Convert::ASN1. This module (still in pure Perl!) has already doubled the performance of Net::LDAP (at least in the development releases Graham and I have been testing out). He might even have a release on CPAN of Net::LDAP that uses Convert::ASN1 by the time you are reading this.

Graham has also written Convert::ASN1 in a way that will make it easier to plug in an XS module, so that those of us who can use (or want) a C based back-end can have one, once such a module is written.
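
To give a feel for the encoding work involved, here is a small sketch of how BER lays out a single value as a tag, a length and the contents. It hand-encodes one OCTET STRING (the type LDAP uses for most strings) and prints the bytes in hex. The ber_length and ber_octet_string helpers are written for this example only. They are not how Convert::BER or Convert::ASN1 are implemented, and real LDAP messages nest many such values.

```perl
use strict;
use warnings;

# Encode a BER length: one byte for lengths below 128, otherwise a
# count byte with the high bit set, followed by the length bytes
sub ber_length {
    my ($len) = @_;
    return chr($len) if $len < 128;
    my $bytes = '';
    while ($len > 0) {
        $bytes = chr($len & 0xff) . $bytes;   # most significant byte ends up first
        $len >>= 8;
    }
    return chr(0x80 | length($bytes)) . $bytes;
}

# Encode an OCTET STRING: tag 0x04, then the length, then the raw bytes
sub ber_octet_string {
    my ($value) = @_;
    return chr(0x04) . ber_length(length($value)) . $value;
}

# Tag 04, length 02, then the two bytes of 'sn'
print unpack('H*', ber_octet_string('sn')), "\n";   # prints 0402736e
```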

Conclusion

This ends our brief introduction to LDAP programming with Net::LDAP. I hope you have learned something and enjoyed the ride. If you did, please let me know. If you didn't, please let me know. If you just didn't care, well, at least no trees were wasted in the process. Of course we have just barely scratched the surface of LDAP application programming. If you want to learn more, you can check out the resources I have provided at the end of this article, chat with me on one of the many LDAP mailing lists and newsgroups that I hang out in, or even meet me in person at the O'Reilly conference (where I'll be giving a tutorial on this module).

Resources

Net::LDAP Homepage
LDAP Central
Implementing LDAP

About Mark

Mark Wilcox (mark@mjwilcox.com) is the Web Administrator & LDAP Guru for the University of North Texas in Denton, the 4th largest university in the state. He's a regular author and speaker on LDAP as well as a frequent contributor on LDAP mailing lists and newsgroups. When he's not busy spreading the good news of LDAP, he's spending time with his wife or dreaming of the day he can buy his 64 inch TV.