For generating data, I use the Faker library, which is partly inspired by the PHP version.
I recently released Eris, an internal project that uses Faker to generate LDAP data for testing. It has not been heavily tested, but it works for my needs. Eris is like a chaos monkey for Samba 4 or MS AD: it creates, deletes, and moves accounts at random.
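As a rough illustration of the kind of data Faker can produce for a directory, here is a minimal sketch. It is not Eris's actual code. It assumes the Python Faker package (`pip install Faker`); the function names, the base DN, the attribute selection, and the simplified DN escaping are all assumptions made for the example.

```python
import re


def escape_dn_value(value):
    """Backslash-escape the most common special characters in a DN value.

    Simplified for illustration; it does not handle leading '#' or
    leading/trailing spaces.
    """
    return re.sub(r'([,+"\\<>;=])', r'\\\1', value)


def build_user_entry(fake, base_dn, index):
    """Return (dn, attributes) for one AD-style test user.

    `fake` is any object with first_name() and last_name() methods,
    for example an instance of faker.Faker.
    """
    first = fake.first_name()
    last = fake.last_name()
    # Build a short login name from ASCII letters and digits only,
    # with a numeric suffix to avoid collisions between generated users.
    stem = re.sub(r"[^a-z0-9]", "", (first[0] + last).lower())[:16]
    sam = f"{stem}{index:03d}"
    cn = f"{first} {last} {index:03d}"
    dn = f"CN={escape_dn_value(cn)},{base_dn}"
    attributes = {
        "objectClass": ["top", "person", "organizationalPerson", "user"],
        "cn": cn,
        "givenName": first,
        "sn": last,
        "displayName": f"{first} {last}",
        "sAMAccountName": sam,
    }
    return dn, attributes


def demo(count=3, base_dn="OU=Test,DC=example,DC=com"):
    """Print a few generated entries (the base DN is a placeholder)."""
    from faker import Faker  # third-party package, imported only here

    Faker.seed(1234)  # fixed seed so repeated runs give the same names
    fake = Faker()
    for i in range(count):
        dn, attrs = build_user_entry(fake, base_dn, i)
        print(dn)
        for key, value in attrs.items():
            print(f"  {key}: {value}")
```

Calling `demo()` prints a few entries. The `(dn, attributes)` pairs can then be handed to whichever LDAP client you use to add them to a test directory.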
That said, as I'm focused on Microsoft directories, I recommend using Samba 4. It uses the standard Microsoft Active Directory schema out of the box.
The easiest setup is Debian (preferably 8) with the distribution packages. This is a pretty good guide for Ubuntu.
If you're an Amazon AWS user, you can stand up an instance of their Simple AD service (which is Samba 4) in no time.
The AWS option will help you get a feel for long-distance LDAP request/response latency, while a local VM (Debian, Ubuntu, ...) will be easier to reset or rebuild if you mess up the database, configuration, etc.
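To put numbers on that latency difference, a small timing helper along these lines may be enough. This is only a sketch: `measure_latency` is a hypothetical name, and `request` stands for any zero-argument callable you write that performs one LDAP operation (a search, for instance) with the client library of your choice.

```python
import statistics
import time


def measure_latency(request, repeats=20):
    """Call `request` repeatedly and summarize round-trip times in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        request()  # one LDAP round trip, supplied by the caller
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

Running it once against the local VM and once against the remote directory, with the same request, gives a direct comparison.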
Note: The AWS option is a managed service with no shell access. As such, it will have no external connectivity.