113
votes

Is there a straightforward way to find all the modules that are part of a Python package? I've found this old discussion, which is not really conclusive, but I'd love to have a definitive answer before I roll my own solution based on os.listdir().

5
@S.Lott: There are more general solutions available. Python packages are not always directories on the filesystem; they can also live inside zips. -- u0b34a0f6ae
Why reinvent the wheel? If Python acquires hypermodules in Python 4 and pkgutil is updated for that, my code will still work. I like to use the abstractions that are available. Use the obvious method provided; it is tested and known to work. If you reimplement it, you have to find and work around every corner case yourself. -- u0b34a0f6ae
@S.Lott: So every time the application starts, it will unzip its own egg, if installed inside one, just to check this? Please submit a patch against my project to reinvent the wheel in this function: git.gnome.org/cgit/kupfer/tree/kupfer/plugins.py#n17. Please consider both eggs and normal directories, and do not exceed 20 lines. -- u0b34a0f6ae
@S.Lott: Why you don't understand that it is relevant is something I can't understand. Discovering this programmatically means that the application, not the user, takes an interest in the contents of a package. -- u0b34a0f6ae
Of course I mean programmatically! Otherwise I wouldn't have mentioned "rolling out my own solution with os.listdir()". -- static_rtti

5 Answers

151
votes

Yes, you want something based on pkgutil or similar. That way you can treat all packages alike, regardless of whether they live in directories, eggs, or zips (where os.listdir won't help).

import pkgutil

# this is the package we are inspecting -- for example 'email' from stdlib
import email

package = email
for importer, modname, ispkg in pkgutil.iter_modules(package.__path__):
    print("Found submodule %s (is a package: %s)" % (modname, ispkg))
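
To check the claim about zipped packages, here is a minimal sketch, assuming Python 3. It builds a throwaway zip containing a made-up package called zipdemo_pkg (the package and module names are hypothetical, chosen only for illustration), puts the archive on sys.path, and lists the submodules with the same pkgutil.iter_modules call:

```python
import importlib
import os
import pkgutil
import sys
import tempfile
import zipfile

# Build a zip archive containing a made-up package with two submodules.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "zipdemo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipdemo_pkg/__init__.py", "")
    zf.writestr("zipdemo_pkg/alpha.py", "VALUE = 1\n")
    zf.writestr("zipdemo_pkg/beta.py", "VALUE = 2\n")

# Putting the archive on sys.path lets the zip importer find the package.
sys.path.insert(0, archive)
package = importlib.import_module("zipdemo_pkg")

# package.__path__ points inside the zip, so os.listdir() cannot read it,
# but pkgutil asks the importer for the module names instead.
found = sorted(modname for _, modname, _ in pkgutil.iter_modules(package.__path__))
print(found)
```

The loop is unchanged from the directory case; only the location of the package differs.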

How to import them too? You can just use __import__ as normal:

import pkgutil

# this is the package we are inspecting -- for example 'email' from stdlib
import email

package = email
prefix = package.__name__ + "."
for importer, modname, ispkg in pkgutil.iter_modules(package.__path__, prefix):
    print("Found submodule %s (is a package: %s)" % (modname, ispkg))
    # a non-empty fromlist makes __import__ return the submodule, not the top-level package
    module = __import__(modname, fromlist=["dummy"])
    print("Imported", module)
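
On Python 2.7 and 3.x, importlib.import_module is a more direct alternative to __import__ with a dummy fromlist, because it returns the submodule itself. A minimal sketch, assuming Python 3; the helper name import_submodules is my own and not part of pkgutil:

```python
import importlib
import pkgutil

import email  # example package; any imported package object works


def import_submodules(package):
    """Import every direct submodule of `package` and return them keyed by name."""
    prefix = package.__name__ + "."
    modules = {}
    for _finder, modname, _ispkg in pkgutil.iter_modules(package.__path__, prefix):
        modules[modname] = importlib.import_module(modname)
    return modules


loaded = import_submodules(email)
for name in sorted(loaded):
    print("Imported", name)
```

Returning a dict keeps the imported module objects around, which is handy if the application wants to inspect them afterwards (plugin discovery, for example).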

51
votes

The right tool for this job is pkgutil.walk_packages.

To list all the modules on your system:

import pkgutil
for importer, modname, ispkg in pkgutil.walk_packages(path=None, onerror=lambda x: None):
    print(modname)

Be aware that walk_packages imports all subpackages, but not submodules.
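
That import behaviour can be observed directly. The sketch below assumes Python 3 and creates a throwaway package tree on disk (walkdemo_pkg and its contents are made-up names), walks it, and then checks sys.modules to see what was actually imported along the way:

```python
import importlib
import os
import pkgutil
import sys
import tempfile

# Create a throwaway package tree:
#   walkdemo_pkg/__init__.py
#   walkdemo_pkg/leaf.py
#   walkdemo_pkg/sub/__init__.py
#   walkdemo_pkg/sub/inner.py
root = tempfile.mkdtemp()
sub_dir = os.path.join(root, "walkdemo_pkg", "sub")
os.makedirs(sub_dir)
for path in (
    os.path.join(root, "walkdemo_pkg", "__init__.py"),
    os.path.join(root, "walkdemo_pkg", "leaf.py"),
    os.path.join(sub_dir, "__init__.py"),
    os.path.join(sub_dir, "inner.py"),
):
    open(path, "w").close()  # empty source files are enough here

sys.path.insert(0, root)
package = importlib.import_module("walkdemo_pkg")

names = sorted(
    modname
    for _finder, modname, _ispkg in pkgutil.walk_packages(
        package.__path__, prefix=package.__name__ + "."
    )
)
print(names)

# The subpackage had to be imported so that its __path__ could be walked;
# the plain modules were only listed, never imported.
imported_sub = "walkdemo_pkg.sub" in sys.modules
imported_leaf = "walkdemo_pkg.leaf" in sys.modules
imported_inner = "walkdemo_pkg.sub.inner" in sys.modules
print(imported_sub, imported_leaf, imported_inner)
```

This matters when a subpackage's __init__.py has side effects or can fail at import time, which is what the onerror callback is for.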

If you wish to list all submodules of a certain package then you can use something like this:

import pkgutil
import scipy

package = scipy
for importer, modname, ispkg in pkgutil.walk_packages(path=package.__path__,
                                                      prefix=package.__name__ + '.',
                                                      onerror=lambda x: None):
    print(modname)

iter_modules lists only the modules that are one level deep, whereas walk_packages recurses into every subpackage. In the case of scipy, for example, walk_packages returns

scipy.stats.stats

while iter_modules only returns

scipy.stats
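
scipy may not be installed everywhere, so here is the same comparison sketched with the stdlib email package instead (my choice of example, assuming Python 3). It collects the names from both functions and shows which ones only the recursive walk reports:

```python
import email
import pkgutil

prefix = email.__name__ + "."

# One level deep: direct children of the package only.
one_level = {name for _, name, _ in pkgutil.iter_modules(email.__path__, prefix)}

# Recursive: also descends into subpackages such as email.mime.
recursive = {name for _, name, _ in pkgutil.walk_packages(email.__path__, prefix)}

# Names that only the recursive walk reports.
only_recursive = sorted(recursive - one_level)
print(only_recursive)
```

Every name from iter_modules also appears in the walk_packages result; the difference consists of modules nested inside subpackages.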

The documentation on pkgutil (http://docs.python.org/library/pkgutil.html) does not list all the interesting functions defined in /usr/lib/python2.6/pkgutil.py.

Perhaps this means the functions are not part of the "public" interface and are subject to change.

However, at least as of Python 2.6 (and perhaps earlier versions?), pkgutil comes with a walk_packages function that recursively walks through all the available modules.

2
votes

This works for me:

import types

import nltk

# only finds modules already bound as attributes of the package
for key, obj in nltk.__dict__.items():
    if isinstance(obj, types.ModuleType):
        print(key)

0
votes

I was looking for a way to reload all the submodules that I'm editing live in my package. The snippet below combines the answers and comments above, so I've decided to post it here as an answer rather than a comment.

import importlib
import pkgutil

import yourPackageName  # placeholder: substitute the package you are editing

package = yourPackageName
for importer, modname, ispkg in pkgutil.walk_packages(path=package.__path__,
                                                      prefix=package.__name__ + '.',
                                                      onerror=lambda x: None):
    try:
        modulesource = importlib.import_module(modname)
        importlib.reload(modulesource)  # the reload() builtin exists only in Python 2
        print("reloaded: {}".format(modname))
    except Exception as e:
        print('Could not load {} {}'.format(modname, e))

-4
votes

Here's one way, off the top of my head:

>>> import os
>>> filter(lambda i: type(i) == type(os), [getattr(os, j) for j in dir(os)])
[<module 'UserDict' from '/usr/lib/python2.5/UserDict.pyc'>, <module 'copy_reg' from '/usr/lib/python2.5/copy_reg.pyc'>, <module 'errno' (built-in)>, <module 'posixpath' from '/usr/lib/python2.5/posixpath.pyc'>, <module 'sys' (built-in)>]

It could certainly be cleaned up and improved.

EDIT: Here's a slightly nicer version:

>>> [m[1] for m in filter(lambda a: type(a[1]) == type(os), os.__dict__.items())]
[<module 'copy_reg' from '/usr/lib/python2.5/copy_reg.pyc'>, <module 'UserDict' from '/usr/lib/python2.5/UserDict.pyc'>, <module 'posixpath' from '/usr/lib/python2.5/posixpath.pyc'>, <module 'errno' (built-in)>, <module 'sys' (built-in)>]
>>> [m[0] for m in filter(lambda a: type(a[1]) == type(os), os.__dict__.items())]
['_copy_reg', 'UserDict', 'path', 'errno', 'sys']

NOTE: This will also find modules that are not located in a subdirectory of the package if they are imported in its __init__.py file, so it depends on what you mean by "part of" a package.
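
One way to narrow the result to modules that really belong to the package is to compare each module's __name__ against the package name. This is a sketch, assuming Python 3; the helper name loaded_submodules is hypothetical. Like the approach above, it only sees modules that have already been imported:

```python
import os
import types

import email
import email.mime.text  # importing a submodule binds email.mime on the package


def loaded_submodules(package):
    """Return names of already-imported modules that live under `package`."""
    prefix = package.__name__ + "."
    return sorted(
        obj.__name__
        for obj in vars(package).values()
        if isinstance(obj, types.ModuleType) and obj.__name__.startswith(prefix)
    )


email_modules = loaded_submodules(email)
os_modules = loaded_submodules(os)  # sys, errno, etc. are filtered out
print(email_modules)
print(os_modules)
```

For discovering modules that have not been imported yet, pkgutil remains the better fit; this filter only cleans up the namespace-scanning approach.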