
I wanted to build an example to demonstrate my understanding of binary compatibility, but it failed. I changed the layout of a class in the DLL by adding members at the beginning or in the middle, expecting that the client would then read the variables incorrectly or crash. However, everything works. No matter where I add a member variable to the class, there is no crash and binary compatibility does not appear to break. My code is as follows:

//untitled1_global.h
#include <QtCore/qglobal.h>

#if defined(UNTITLED1_LIBRARY)
#  define UNTITLED1_EXPORT Q_DECL_EXPORT
#else
#  define UNTITLED1_EXPORT Q_DECL_IMPORT
#endif
//base.h
class UNTITLED1_EXPORT Base
{
public:
    Base();

    double getA();
    double getB();

private:
    int arr[100]; //Add later to update the DLL
    double a;
    double b;
};
//derived.h
#include "dpbase.h"
class UNTITLED1_EXPORT Derived :  public Base
{
public:
    Derived();
    void setC(double d);
    double getC();

private:
    char arrCh[100]; //Add later to update the DLL
    double c;
};

Below is the client code. The base.h and derived.h it includes are not the same as those in the DLL: in one version the added members are commented out, in the other they are not. Declarations and implementations are separate in the DLL. I tried accessing the variables both directly and through functions (such as the ones called at the beginning of main.cpp).

//main.cpp
#include "dpbase.h"
#include "dpbase2.h"
#include <QDebug>

#include <QApplication>

int main(int argc, char *argv[])
{
    QApplication a(argc, argv);

    Base base;
    qDebug() << base.getA();
    qDebug() << base.getB();

    Derived base2;
    base2.setC(50);
    qDebug() << base2.getC();

    return a.exec();
}

Both classes, Base and Derived, are exported from the DLL. No matter where I add a member variable to Base or Derived, there is no crash and binary compatibility does not appear to break.

I am using Qt. There is a similar question here, but it did not help me.

Furthermore, when I delete all member variables of the class in the DLL, the client linked against the DLL can still use the nonexistent variables: assign values, read them back, and so on. It seems as if the dynamic library reserves enough space for the client to redefine the members, even when none are defined. So strange!

My question is: why does changing the layout of a class's members in the DLL not break binary compatibility? And after deleting all member variables of the class in the DLL, why can the caller still use the members declared in its .h file?

1
The language lawyers will tell you that you shouldn't do this. In reality, adding member variables to the end of a class declaration usually works, provided you are passing the class around via pointer. The calling code expecting the old implementation gets the same expected layout. However, the moment the calling code copies the class instance, all bets are off. And there are plenty of other corner cases with multiple/nested inheritance where this can break. Basically, if it works, it works. But when you add another variable tomorrow, it might break. End of story. - selbie
What you really want to do is either use the COM approach or something similar: a factory function exported out of the DLL that returns a pointer to an interface declared with only pure virtual methods (e.g. declare an interface called IDPBase that only has pure virtual methods, and export a function that returns instances of this class). - selbie
@selbie I know that adding a member variable will break binary compatibility, and I ran a test to verify it. I added member variables at any position in the class, not just the end. What confuses me is that there was no crash and the program ran fine. - Crawl.W
If you've added a variable to the middle of the class, it might still work, but you've added a lot more risk. If the calling code is inheriting from your class and adding its own members, it's going to depend on the layout of the members being the same. If it is just calling methods that are linked from your DLL, it is less of an issue. - selbie
I don't know what you want me to tell you. You're doing risky stuff by adding variables into the middle of a class and still hoping to achieve binary compatibility. It might work provided the caller is only accessing your class via pointer, only calling methods that aren't inlined in the header, and not copying/inheriting. But if you want guaranteed binary compatibility between changes, you really should be exporting classes out of a DLL via factory functions and interfaces with only pure virtual methods. - selbie
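
The interface-plus-factory approach suggested in the comments above could look roughly like the sketch below. The names IBase, BaseImpl, createBase and destroyBase are hypothetical and not from the original code, and the export macro is left out so the snippet compiles on its own.

```cpp
// Hypothetical interface header shared by the DLL and its clients.
// It contains only pure virtual methods and no data members, so the
// client never depends on the layout of the implementation class.
class IBase {
public:
    virtual ~IBase() = default;
    virtual double getA() const = 0;
    virtual double getB() const = 0;
};

// Implementation that would live only inside the DLL (assumed name).
// Members can be added or reordered here without the client noticing.
class BaseImpl final : public IBase {
public:
    double getA() const override { return a_; }
    double getB() const override { return b_; }

private:
    int arr_[100] = {};  // added in a later DLL version
    double a_ = 3.14;
    double b_ = 1.42;
};

// Hypothetical factory functions; in a real DLL these would carry the
// export macro. Allocation and deallocation both happen inside the DLL.
IBase *createBase() { return new BaseImpl(); }
void destroyBase(IBase *p) { delete p; }
```

The client would only ever hold an IBase pointer obtained from createBase() and hand it back to destroyBase(), so it never needs to know the object's size or member offsets.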

1 Answers


First of all you have to understand how members are accessed.

Let's take

class UNTITLED1_EXPORT Base
{
public:
    Base();

    double getA();
    double getB();

private:
    double a;
    double b;
};

When you access a it is like doing *(this + 0). When you access b it is like doing *(this + 8) (assuming double is 8 bytes).

Then when you change your class like so:

class UNTITLED1_EXPORT Base
{
public:
    Base();

    double getA();
    double getB();

private:
    int arr[100];
    double a;
    double b;
};

When you access a it will do *(this + 400) and when you access b it will do *(this + 408).
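
The offsets mentioned above can be checked with offsetof. This is a minimal sketch using two hypothetical plain structs, LayoutV1 and LayoutV2, that mirror the private members of the two versions of Base; the values in the comments assume a 4-byte int and an 8-byte double, which is typical but not guaranteed.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the data members of the two Base versions.
struct LayoutV1 {
    double a;
    double b;
};

struct LayoutV2 {
    int arr[100];  // 400 bytes, assuming sizeof(int) == 4
    double a;
    double b;
};

// Byte offsets of the members in each version.
constexpr std::size_t kOffsetA_V1 = offsetof(LayoutV1, a);  // typically 0
constexpr std::size_t kOffsetB_V1 = offsetof(LayoutV1, b);  // typically 8
constexpr std::size_t kOffsetA_V2 = offsetof(LayoutV2, a);  // typically 400
constexpr std::size_t kOffsetB_V2 = offsetof(LayoutV2, b);  // typically 408
```

Code compiled against one version has these constants baked into its machine code, which is why it keeps using the old offsets when the other side is rebuilt.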

Now this may or may not be an issue. If you access a and b only through getA() and getB() and they are defined in a .cpp in your DLL, then you will update the definitions of the getters at the same time you update your class.

You could create some weird behavior by making the definition of a getter inline:

class UNTITLED1_EXPORT Base
{
public:
    Base();

    double getA() { return a; }
    double getB();

private:
    double a;
    double b;
};

In this case your .exe might have its own copy of getA(). That means that even after you update your DLL and add int arr[100], your .exe will still try to access *(this + 0), which is now occupied by arr.

This is undefined behavior, but it is unlikely to crash your program, as you are still accessing allocated memory.

If you do the opposite:

  1. Compile your exe with the arr
  2. Remove arr
  3. Build the DLL
  4. Run the .exe

Then you are more likely to have a crash, because the exe will try to access this + 400 while Base is only 16 bytes. But it is still not guaranteed to crash, for multiple reasons. For instance, this + 400 might happen to be valid memory. More importantly, it depends on where the memory for Base was allocated. If you do new Base in your .exe, it will allocate 416 bytes even after you changed the DLL. But if you do new Base in your DLL, it will allocate only 16 bytes.
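
The dependence on who allocates can be made concrete with a small bounds check. This is only a sketch: ExeView and DllView are hypothetical structs standing for what each side believes Base looks like in the scenario above, and accessFits is an illustrative helper, not part of any library.

```cpp
#include <cstddef>

// What the stale .exe believes Base looks like (compiled with arr).
struct ExeView {
    int arr[100];
    double a;
    double b;
};

// What the rebuilt DLL believes Base looks like (arr removed).
struct DllView {
    double a;
    double b;
};

// Returns true if reading `width` bytes at `offset` stays inside an
// allocation of `allocated` bytes.
constexpr bool accessFits(std::size_t offset, std::size_t width,
                          std::size_t allocated) {
    return offset + width <= allocated;
}
```

With these definitions, the exe's read of a at offsetof(ExeView, a) fits inside an object allocated with sizeof(ExeView), but not inside one allocated with sizeof(DllView). In the second case the read goes past the end of the object, and whether that crashes depends on what happens to be in memory there.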


Here is an example.

Here is the header for the .exe:

class Base
{
public:
    Base();
    double getA() const { return a; }
    double getB() const; // not inline: defined in the DLL
    static Base *create();
    void print();
private:
    double a;
    double b;
};

The header for the DLL:

class Base
{
public:
    Base();
    double getA() const { return a; }
    double getB() const;
    static Base *create();
    void print();
private:
    int arr[100];
    double a;
    double b;
};

The code of the DLL:

#include <iostream>

Base::Base()
{
    for (int i = 0 ; i < 100 ; ++i) {
        arr[i] = i;
    }
    a = 3.14;
    b = 1.42;
}

double Base::getB() const { return b; }

Base *Base::create()
{
    return new Base();
}

void Base::print()
{
    std::cout << a << std::endl;
    std::cout << b << std::endl;
}

And the code in my exe is:

Base *b = Base::create();

std::cout << b->getA() << std::endl;
std::cout << b->getB() << std::endl;
b->print();

Normally I would expect the output:

3.14
1.42
3.14
1.42

But in practice I get:

2.122e-314
1.42
3.14
1.42

The reason is that for the first line (2.122e-314) the exe looks for a at address this + 0, but since this memory is now occupied by arr, it is as if we had done std::cout << *((double *)arr). The values for b are not affected because b is always read by code in the DLL, as getB() is not inlined.
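
The garbage value can be reproduced in a single program, without any DLL, by reading the first eight bytes of the new layout as a double. This is a sketch under assumptions: NewLayout, makeLikeDll and staleGetA are hypothetical names, and the result described below assumes a little-endian machine with a 4-byte int and an IEEE 754 double.

```cpp
#include <cstring>

// Hypothetical stand-in for the DLL's version of Base.
struct NewLayout {
    int arr[100];
    double a;
    double b;
};

// Fills the object the same way the DLL's constructor does.
NewLayout makeLikeDll() {
    NewLayout obj;
    for (int i = 0; i < 100; ++i) {
        obj.arr[i] = i;
    }
    obj.a = 3.14;
    obj.b = 1.42;
    return obj;
}

// What the exe's stale inline getA() effectively does: it reads a
// double at offset 0, which now holds arr[0] and arr[1].
double staleGetA(const NewLayout &obj) {
    double value;
    std::memcpy(&value, &obj, sizeof value);
    return value;
}
```

Under those assumptions, staleGetA(makeLikeDll()) yields a tiny denormal number of about 2.122e-314, because the bytes of arr[0] = 0 and arr[1] = 1 are being interpreted as a double.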

Now if I replace Base *b = Base::create(); with Base *b = new Base(); the program crashes, because the new is then done in the .exe and allocates only 16 bytes, while the Base() constructor in the DLL writes to its member variables across 416 bytes.

Note

For the purpose of this answer I wrote pointer arithmetic as if adding 1 to a pointer always advanced it by one byte. In reality, it advances by the size of the pointed-to type.

So when I write this + 8 it is to be understood in C++ as reinterpret_cast<double *>(reinterpret_cast<char *>(this) + 8).
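
As a small illustration of that byte-offset reading, the helper below fetches a double located a given number of bytes into an object. The names readDoubleAt and TwoDoubles are hypothetical, and the sketch uses std::memcpy instead of dereferencing a cast pointer so that the read itself is well defined.

```cpp
#include <cstddef>
#include <cstring>

// Reads a double located `byteOffset` bytes after the start of `object`.
// The caller must make sure the object really holds a double there.
double readDoubleAt(const void *object, std::size_t byteOffset) {
    const char *bytes = static_cast<const char *>(object) + byteOffset;
    double value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// Hypothetical stand-in for the original two-member Base.
struct TwoDoubles {
    double a;
    double b;
};
```

For a TwoDoubles object t, readDoubleAt(&t, 0) returns t.a and readDoubleAt(&t, offsetof(TwoDoubles, b)) returns t.b, which is what "this + 0" and "this + 8" stand for in the answer.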