2
votes

Something I still haven't quite absorbed: nearly all of my development has involved compiling and linking my code statically, without linking against dynamic libraries (.dll or .so).

I understand in this case how the compiler and linker can resolve the symbols as they step through the code.

However, when linking in a dynamic library for example, and the dynamic library has been compiled with a different compiler than the main code, will the symbol names not be different?

For example, I define a structure called RequiredData in my main code as shown below. Imagine that the FuncFromDynamicLib() function is part of a separate dynamic library and takes this same structure RequiredData as an argument. Does that mean the structure must be defined again somewhere in the dynamic library? How is it established at dynamic link time that these structures are the same? What if the data members are different? Thanks in advance.

#include <iostream>
#include <string>

using namespace std;

struct RequiredData
{
    string name;
    int value1;
    int value2;
};

int FuncFromDynamicLib(RequiredData);


int main()
{
    RequiredData data;
    data.name = "test";
    data.value1 = 120;
    data.value2 = 200;

    FuncFromDynamicLib(data);
    return 0;
}


//------------Imagine that this func is Part of dynamic library in another file--------------------
int FuncFromDynamicLib(RequiredData rd)
{
    cout<<rd.name<<endl;
    cout<<rd.value1<<endl;
    cout<<rd.value2<<endl;
    return 0; // the function is declared to return int, so it must return a value
}
//------------------------------------------------------
1
If the data members are different, your program may crash. You can prevent this by sharing the definitions of your types amongst all pieces of your program (.so's and the main binary). If you change the definition for any reason, you also have to remember to compile both... - Charles
The process used to uniquely identify symbol names in C/C++ is called "name mangling". The Wikipedia page has some useful info: en.wikipedia.org/wiki/Name_mangling - Dominic Dos Santos
@DominicDosSantos name mangling is consistent across compilations with the same compiler, though. - Charles
Would the library still link at runtime if the structure names were different? Furthermore, if the data members in the structures were not the same size, would it still link anyway and leave it up to the programmer to know what he is doing? - Engineer999

1 Answer

5
votes

You are completely right that you cannot in general mix your tools. The process of "linking" is not standardized or specified, and so in general you have to use the same toolchain for the entire process; otherwise one part won't know what names the other part gave the symbols in the object code.

The problem is in fact worse: it's not just symbol names that must match, but also calling conventions: how to propagate function arguments, return values, exceptions, and how to handle RTTI and dynamic casts.

All these factors are summarized under the term "application binary interface", ABI for short. And the problem you describe can be summarized as "there is no standard ABI".

Several factors make the situation less dire:

  • For C code, every major platform has a de facto standard ABI which is followed by most tools. Therefore, C code is in practice highly portable and interoperable.

  • For C++ code, the ABI is much more complex than for C. However, on x86 and x86-64, two major architectures, the Itanium C++ ABI is used by a wide variety of toolchains and operating systems, which creates at least a somewhat interoperable environment. In particular, that ABI's rules for name mangling (i.e. how to represent namespace-qualified names and function overload sets as flat strings) are well-known and have widespread tooling support.
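To make the name-mangling point concrete, here is a minimal sketch that decodes an Itanium-style symbol back into a readable signature. It assumes a GCC or Clang toolchain, since `<cxxabi.h>` and `abi::__cxa_demangle` are not available with MSVC. The `demangle` helper is a hypothetical name introduced only for this illustration, and the mangled string is the one an Itanium-ABI compiler would be expected to emit for the question's `int FuncFromDynamicLib(RequiredData)`.

```cpp
#include <cxxabi.h>   // GCC/Clang-specific: exposes the Itanium ABI demangler
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper (not part of any library): converts an Itanium-mangled
// symbol into a readable C++ signature. If the demangler rejects the input,
// the input is returned unchanged.
std::string demangle(const char* mangled)
{
    int status = 0;
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        return mangled;
    }
    std::string readable(raw);
    std::free(raw);  // the demangler allocates its result with malloc
    return readable;
}
```

Calling `demangle("_Z18FuncFromDynamicLib12RequiredData")` should give `FuncFromDynamicLib(RequiredData)`. Note that only the names are encoded: the symbol mentions the type RequiredData but says nothing about its members, so a library built against a different definition of the struct would still link. Declaring the function `extern "C"` would drop the mangling entirely and leave the plain symbol `FuncFromDynamicLib`.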

For C++, there are further peculiar concerns. Shipping compiled code to customers is one thing, but another issue is that much of C++ library code lives in headers and gets compiled each time. So the problem is not just that you may be using different compilers, but also different library implementations. Internal details of class layout become relevant, because you can't just access vendor A's vector as if it were vendor B's vector. GCC has dubbed this "library ABI".
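One common way to sidestep the library ABI problem is to keep standard library types such as std::string out of the boundary and pass a plain C-compatible struct instead. The following is only a sketch under that assumption: `RequiredDataC`, `FuncFromDynamicLibC`, and `to_boundary` are hypothetical names invented here, and the function body is a stand-in defined in the same file so the example is self-contained.

```cpp
#include <cstdint>      // fixed-width integer types
#include <string>
#include <type_traits>  // layout checks

// Hypothetical boundary type. In a real project it would live in one header
// that both the main program and the dynamic library include.
extern "C" {
struct RequiredDataC {
    const char*  name;    // borrowed, NUL-terminated; the caller keeps ownership
    std::int32_t value1;
    std::int32_t value2;
};
}

// Compile-time checks that the struct stays C-compatible.
static_assert(std::is_standard_layout<RequiredDataC>::value,
              "boundary struct must have standard layout");
static_assert(std::is_trivially_copyable<RequiredDataC>::value,
              "boundary struct must be trivially copyable");

// Stand-in for the exported library function. extern "C" gives it an
// unmangled symbol name and the platform's C calling convention.
extern "C" int FuncFromDynamicLibC(const RequiredDataC* rd)
{
    if (rd == nullptr || rd->name == nullptr) {
        return -1;  // reject missing input
    }
    return rd->value1 + rd->value2;  // placeholder work for the sketch
}

// Caller-side helper: the returned struct borrows name's buffer, so `name`
// must outlive it (do not pass a temporary string).
RequiredDataC to_boundary(const std::string& name,
                          std::int32_t value1, std::int32_t value2)
{
    return RequiredDataC{ name.c_str(), value1, value2 };
}
```

With this shape the main program keeps using std::string internally and converts only at the call, for example `std::string n = "test"; RequiredDataC rd = to_boundary(n, 120, 200); FuncFromDynamicLibC(&rd);`. Both sides still have to be compiled against the same struct definition; nothing at link time verifies that.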