18
votes

The question is pretty much in the title: in terms of OS-level implementation, how are shared objects and DLLs different?

I ask because I recently read this page on extending Python, which states:

Unix and Windows use completely different paradigms for run-time loading of code. Before you try to build a module that can be dynamically loaded, be aware of how your system works.

In Unix, a shared object (.so) file contains code to be used by the program, and also the names of functions and data that it expects to find in the program. When the file is joined to the program, all references to those functions and data in the file’s code are changed to point to the actual locations in the program where the functions and data are placed in memory. This is basically a link operation.

In Windows, a dynamic-link library (.dll) file has no dangling references. Instead, an access to functions or data goes through a lookup table. So the DLL code does not have to be fixed up at runtime to refer to the program’s memory; instead, the code already uses the DLL’s lookup table, and the lookup table is modified at runtime to point to the functions and data.

Could anyone elaborate on that? Specifically I'm not sure I understand the description of shared objects containing references to what they expect to find. Similarly, a DLL sounds like pretty much the same mechanism to me.

Is this a complete explanation of what is going on? Are there better ones? Is there in fact any difference?

I am aware of how to link to a DLL or shared object, and of a couple of mechanisms (.def listings, dllexport/dllimport) for writing DLLs, so I'm explicitly not looking for a how-to on those areas; I'm more intrigued by what is going on in the background.

(Edit: another obvious point - I'm aware they work on different platforms, use different file types (ELF vs PE), are ABI-incompatible etc...)

1
DLL is closed, all its symbols are resolved (the linker knows where to find them). – n. 1.8e9-where's-my-share m.
That sounds very odd to me. A DLL has dangling references, of course it does. They need to be resolved by the loader. – David Heffernan
Sorry, sent the comment by mistake and could not edit it. Of course a DLL can have dangling references, but a DLL knows where (to which specific other DLL) each one points. Whereas with a .so, a dangling reference is resolved by the first symbol with that name, wherever it's found. If it happens to be in the executable itself and not in any other .so, that's OK too. – n. 1.8e9-where's-my-share m.

1 Answer

17
votes

A DLL is pretty much the same mechanism as a .so or .dylib (macOS) file, so it is hard to pin down exactly what the differences are.

The core difference is in what is visible by default from each type of file. A .so file exports language-level (gcc) linkage, which means that by default every C and C++ symbol with external linkage ("extern") is available for linking when the .so is pulled in. It also means that, because resolving .so files is essentially a link step, the loader doesn't care which .so file a symbol comes from. It simply searches the specified .so files in some order, following the usual link-step rules that .a files adhere to.
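To make the "loader doesn't care which .so a symbol comes from" point concrete, here is a toy model in Python. It is only a sketch: real loaders walk ELF symbol tables, not dicts, and every object name, symbol name, and address below is invented for illustration. The one thing it tries to show is that the lookup is by symbol name alone, and the first object in the search order that defines the name wins.

```python
# Toy model only: real loaders work on ELF symbol tables, not Python dicts.
# Each loaded object is (name, {symbol: address}); all names and addresses
# here are invented for illustration.
LOADED_OBJECTS = [
    ("main_program", {"main": 0x1000, "log_message": 0x1100}),
    ("libfoo.so", {"foo_init": 0x2000, "log_message": 0x2100}),
    ("libbar.so", {"bar_run": 0x3000}),
]


def resolve_flat(symbol, loaded_objects=LOADED_OBJECTS):
    """Return (provider, address) for the first object that defines `symbol`."""
    for object_name, symbols in loaded_objects:
        if symbol in symbols:
            return object_name, symbols[symbol]
    raise LookupError(f"undefined symbol: {symbol}")


if __name__ == "__main__":
    for name in ("bar_run", "log_message"):
        provider, address = resolve_flat(name)
        print(f"{name} -> {provider} @ {hex(address)}")
```

In this model, `log_message` resolves to the copy in the executable even though `libfoo.so` also defines it, which is the behaviour described in the comments above: the reference is satisfied by the first symbol with that name, wherever it is found.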

DLL files, on the other hand, are an operating system feature, completely separate from the language's link step. MSVC uses .lib files for linking both static and dynamic libraries (each DLL generates a paired .lib import library that is used for linking), so the resulting program is fully "linked" (from a language-centric point of view) once it's built.

During the link stage, however, symbols were resolved against the .lib files that represent the DLLs, allowing the linker to build the import table in the PE file, which contains an explicit list of DLLs and the entry points referenced in each one. At load time, Windows does not have to perform a "link" to resolve symbols from shared libraries: that step was already done. The Windows loader just loads the DLLs and hooks up the functions directly.
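A similarly rough Python sketch of the Windows side may help. Again this is a toy model, not the real PE format: the DLL names, function names, and addresses are made up, and the dicts stand in for the binary import and export tables. The point it illustrates is that each import already names the DLL it comes from, so the loader only ever looks in that DLL's exports.

```python
# Toy model only: real PE import/export tables are binary structures.
# DLL names, function names, and addresses are invented for illustration.
EXPORTS = {
    "alpha.dll": {"open_thing": 0x5000, "log_message": 0x5100},
    "beta.dll": {"log_message": 0x6100},
}

# What the linker recorded at build time: each import names its DLL explicitly.
IMPORT_TABLE = {
    "beta.dll": ["log_message"],
    "alpha.dll": ["open_thing"],
}


def bind_imports(import_table, exports):
    """Fill an import-address-table-like dict, looking only in the named DLL."""
    address_table = {}
    for dll_name, functions in import_table.items():
        if dll_name not in exports:
            raise LookupError(f"cannot load {dll_name}")
        dll_exports = exports[dll_name]
        for function in functions:
            if function not in dll_exports:
                raise LookupError(f"{function} not exported by {dll_name}")
            address_table[(dll_name, function)] = dll_exports[function]
    return address_table


if __name__ == "__main__":
    for (dll, func), addr in bind_imports(IMPORT_TABLE, EXPORTS).items():
        print(f"{dll}!{func} -> {hex(addr)}")
```

Here `log_message` is bound to the copy in `beta.dll` because the import table says so; the same-named export in `alpha.dll` is never considered. Compare that with a name-only search, where whichever definition comes first in the search order would be used.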