ECS serialization

votes

I am currently developing an ECS system for a game engine and have stumbled across a problem with serialization. The data structure I use for component storage in my ECS implementation is a pool of preconstructed components, which are recycled. Adding an entity is therefore as trivial as assigning values to the preconstructed components. The system is designed to use component identifiers instead of types, and this is where the problem lies: when serializing and deserializing, the components are treated as BaseComponent pointers.

Component struct hierarchy

#include <cstdint> // uint_fast32_t

struct BaseComponent {}; // struct used as base to store components

template<typename T>
struct Component : public BaseComponent {
    static const uint_fast32_t ID; // identifier used by the rest of the system
};

struct SomeComponent : public Component<SomeComponent> {
    // component specific data
};

It is the component-specific data that I want to serialize and deserialize, assigning the deserialized values to the appropriate fields in SomeComponent.

I have a simple solution, but it is quite shady when it comes to clean code: dump the component's memory directly into a file and read it back into memory via a char buffer. This, however, does not work for components that contain pointers, and is, in my opinion, quite disgusting. The other solution I have found is to use, if it is possible, a variadic function which constructs a temporary with aggregate initialization using variadic expansion. However, this method does not solve the serialization and deserialization, only the assignment.
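For reference, the memory-dump approach described above could look roughly like the sketch below. The `Position` component and the helper names `dumpBytes` and `loadBytes` are made up for illustration and are not part of the original code. The `static_assert` makes the "no pointers" limitation partly explicit: it rejects types that are not trivially copyable, although a struct holding a raw pointer would still pass the check and silently store a meaningless address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical component holding plain data only (no pointers, no virtuals).
struct Position {
    float x;
    float y;
};

// Append the raw bytes of a component to a buffer.
template <typename T>
void dumpBytes(const T& component, std::vector<char>& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw dumps are only valid for trivially copyable types");
    const char* bytes = reinterpret_cast<const char*>(&component);
    out.insert(out.end(), bytes, bytes + sizeof(T));
}

// Copy bytes from the buffer into an existing (pooled) component.
// Returns the offset just past the bytes that were consumed.
template <typename T>
std::size_t loadBytes(const std::vector<char>& in, std::size_t offset, T& component) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw loads are only valid for trivially copyable types");
    assert(offset + sizeof(T) <= in.size()); // the caller must supply enough bytes
    std::memcpy(&component, in.data() + offset, sizeof(T));
    return offset + sizeof(T);
}
```

Keep in mind that the resulting bytes depend on padding, endianness and the exact struct layout, so a file written this way is only safe to read back with the same build on the same platform.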

My question is therefore: Is there a good way of serializing and deserializing polymorphic types, generically, when no type information is known, and if not, is there a better way of doing it?

c++ serialization deserialization game-engine

2 Answers

votes

I would consider the CRTP (curiously recurring template pattern) in this case. The base class defines serialize/deserialize functions which call the actual implementation in the derived class. The Wikipedia article has some great examples.

votes

First things first: I think you are trying to avoid it, but one way or another, you will need to define a serialize/deserialize function for every kind of Component you have.
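To make that concrete, here is a minimal sketch of what such a per-component pair might look like. The `Health` component, its fields and the space-separated text format are all assumptions chosen for illustration; any format works as long as both functions agree on it.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical component; the fields are made up for illustration.
struct Health {
    int current = 0;
    int maximum = 0;

    // Write the fields in a fixed order, separated by a space.
    std::string serialize() const {
        std::ostringstream out;
        out << current << ' ' << maximum;
        return out.str();
    }

    // Read the fields back in the same order.
    // Returns false and leaves the component untouched on malformed input.
    bool deserialize(const std::string& text) {
        std::istringstream in(text);
        int c = 0;
        int m = 0;
        if (!(in >> c >> m)) {
            return false;
        }
        current = c;
        maximum = m;
        return true;
    }
};

// Usage: round-trip one component through its text form.
void healthRoundTrip() {
    Health saved;
    saved.current = 7;
    saved.maximum = 10;

    Health loaded;
    const bool ok = loaded.deserialize(saved.serialize());
    assert(ok);
    assert(loaded.current == 7 && loaded.maximum == 10);
    (void)ok; // silence the unused-variable warning when NDEBUG is defined
}
```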

That being clear, you have three options.

1. Not using BaseComponent pointers.
2. Change BaseComponent to have virtual serialize/deserialize member functions (and a virtual destructor, to keep things safe). Then use CRTP, as suggested by @licensed-slacker, in the Component struct. Like this:

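#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

struct BaseComponent {
    virtual ~BaseComponent() = default; // safe destruction through a base pointer
    virtual std::string serialize() = 0;
};
template <typename T>
struct Component: public BaseComponent {
    static const uint_fast32_t ID;
    // Forward the virtual call to the derived class (CRTP).
    std::string serialize() override {
        return static_cast<T*>(this)->serializeImpl();
    }
};

struct SomeComponent: public Component<SomeComponent> {
    // Named differently from serialize() so that it does not override the
    // virtual function; a component that forgets to define it then fails
    // to compile instead of recursing forever.
    std::string serializeImpl() {
        return "{a: b, c: d}";
    }
};

struct OtherComponent: public Component<OtherComponent> {
    std::string serializeImpl() {
        return "{d: c, b: a}";
    }
};

void myfunc() {
    OtherComponent o1, o2;
    SomeComponent s1, s2;

    std::vector<BaseComponent*> components;
    components.push_back(&o1);
    components.push_back(&s1);
    components.push_back(&o2);
    components.push_back(&s2);

    // Each call dispatches through the vtable to the matching serializeImpl().
    for (auto* comp : components) {
        std::cout << comp->serialize() << "\n";
    }
}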

3. Change BaseComponent to have an enum or ID that identifies the derived component type, so that you can recover the type in a big switch.

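#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

enum class ComponentKind {
    SomeComponent,
    OtherComponent
};
struct BaseComponent {
    ComponentKind kind; // set by each derived constructor
};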
template <typename T>
struct Component: public BaseComponent {
    static const uint_fast32_t ID;
};

struct SomeComponent: public Component<SomeComponent> {
    SomeComponent() {
        kind = ComponentKind::SomeComponent;
    }
    std::string serialize() {
        return "{a: b, c: d}";
    }
};

struct OtherComponent: public Component<OtherComponent> {
    OtherComponent() {
        kind = ComponentKind::OtherComponent;
    }
    std::string serialize() {
        return "{d: c, b: a}";
    }
};

std::string serialize(BaseComponent* comp) {
    switch (comp->kind) {
        case (ComponentKind::SomeComponent):
            return static_cast<SomeComponent*>(comp)->serialize();
        case (ComponentKind::OtherComponent):
            return static_cast<OtherComponent*>(comp)->serialize();
        default:
            return "";
    }
}

void myfunc() {
    OtherComponent o1, o2;
    SomeComponent s1, s2;

    std::vector<BaseComponent*> components;
    components.push_back(&o1);
    components.push_back(&s1);
    components.push_back(&o2);
    components.push_back(&s2);

    for (auto* comp : components) {
        std::cout << serialize(comp) << "\n";
    }
}

Option 2 adds a vtable to BaseComponent (so you pay the cost of dynamic dispatch on every call) and forces you to declare a virtual destructor in order to be safe (check Scott Meyers' advice on that matter).

Option 3 does not need a vtable, but it throws type safety away and forces you to be very careful with the static_casts scattered all over the place.
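The examples above only cover serialize. Since the question uses a pool of preconstructed components, deserialization with a kind tag could be sketched as follows: the tag is written in front of the payload and checked against the target component before any cast. The field names, the record format and the functions `serializeRecord` and `deserializeRecord` are assumptions for illustration, not part of the original design.

```cpp
#include <cassert>
#include <sstream>
#include <string>

enum class ComponentKind { SomeComponent, OtherComponent };

struct BaseComponent {
    ComponentKind kind;
    explicit BaseComponent(ComponentKind k) : kind(k) {}
};

// Hypothetical payloads; the fields are made up for illustration.
struct SomeComponent : BaseComponent {
    int a = 0;
    SomeComponent() : BaseComponent(ComponentKind::SomeComponent) {}
};

struct OtherComponent : BaseComponent {
    int b = 0;
    int c = 0;
    OtherComponent() : BaseComponent(ComponentKind::OtherComponent) {}
};

// Record format (an assumption): "<kind as int> <fields...>".
std::string serializeRecord(const BaseComponent& comp) {
    std::ostringstream out;
    out << static_cast<int>(comp.kind) << ' ';
    switch (comp.kind) {
        case ComponentKind::SomeComponent:
            out << static_cast<const SomeComponent&>(comp).a;
            break;
        case ComponentKind::OtherComponent: {
            const auto& other = static_cast<const OtherComponent&>(comp);
            out << other.b << ' ' << other.c;
            break;
        }
    }
    return out.str();
}

// Fill a preconstructed component from a record. The tag is compared with
// the target's kind first, so a mismatched record is rejected before any cast.
// Note: if a field fails to parse, the target may be left partially modified.
bool deserializeRecord(const std::string& record, BaseComponent& target) {
    std::istringstream in(record);
    int tag = 0;
    if (!(in >> tag) || tag != static_cast<int>(target.kind)) {
        return false;
    }
    switch (target.kind) {
        case ComponentKind::SomeComponent:
            return static_cast<bool>(in >> static_cast<SomeComponent&>(target).a);
        case ComponentKind::OtherComponent: {
            auto& other = static_cast<OtherComponent&>(target);
            return static_cast<bool>(in >> other.b >> other.c);
        }
    }
    return false;
}

// Usage: a record only loads into a component of the matching kind.
void recordRoundTrip() {
    SomeComponent saved;
    saved.a = 42;
    SomeComponent loaded;
    const bool ok = deserializeRecord(serializeRecord(saved), loaded);
    assert(ok && loaded.a == 42);
    (void)ok; // silence the unused-variable warning when NDEBUG is defined
}
```

Every new component type needs a new case in both switches, which is the maintenance cost this option trades for the missing vtable.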

I think people would prefer option 3 over option 2, but honestly, I would consider option 1 first.
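Since option 1 has no example above, here is one way it might look: keep one pool per concrete component type, so the type is always known statically and no BaseComponent pointer is ever needed. Everything in this sketch (the `Pool` template, the `writeComponent`/`readComponent` overloads, the two components and the line-based format) is hypothetical and only meant to show the shape of the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical components; plain aggregates with no common base class.
struct Position { int x; int y; };
struct Health { int current; };

// One overload pair per component type.
void writeComponent(std::ostream& out, const Position& p) { out << p.x << ' ' << p.y << '\n'; }
void writeComponent(std::ostream& out, const Health& h) { out << h.current << '\n'; }
bool readComponent(std::istream& in, Position& p) { return static_cast<bool>(in >> p.x >> p.y); }
bool readComponent(std::istream& in, Health& h) { return static_cast<bool>(in >> h.current); }

// A pool that stores exactly one component type, so the right overload
// is picked at compile time.
template <typename T>
struct Pool {
    std::vector<T> items;

    // Write the element count followed by one line per component.
    void save(std::ostream& out) const {
        out << items.size() << '\n';
        for (const T& item : items) {
            writeComponent(out, item);
        }
    }

    // Replace the pool contents with what the stream holds.
    // Returns false if the stream runs out or contains malformed data.
    bool load(std::istream& in) {
        std::size_t count = 0;
        if (!(in >> count)) {
            return false;
        }
        items.assign(count, T());
        for (T& item : items) {
            if (!readComponent(in, item)) {
                return false;
            }
        }
        return true;
    }
};

// Usage: save one pool to a stream and load it into another.
void poolRoundTrip() {
    Pool<Health> saved;
    saved.items.push_back(Health{5});
    std::stringstream file;
    saved.save(file);

    Pool<Health> loaded;
    const bool ok = loaded.load(file);
    assert(ok && loaded.items.size() == 1 && loaded.items[0].current == 5);
    (void)ok; // silence the unused-variable warning when NDEBUG is defined
}
```

The engine would then hold one such pool per component type, for example as members of a world struct, and save or load them one after another in a fixed order.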