Almost every C++ resource I've seen that discusses this kind of thing tells me that I should prefer polymorphic approaches to using RTTI (run-time type identification). In general, I take this kind of advice seriously, and will try and understand the rationale -- after all, C++ is a mighty beast and hard to understand in its full depth. However, for this particular question, I'm drawing a blank and would like to see what kind of advice the internet can offer. First, let me summarize what I've learned so far, by listing the common reasons that are quoted why RTTI is "considered harmful":
Some compilers don't use it / RTTI is not always enabled
I really don't buy this argument. It's like saying I shouldn't use C++14 features because there are compilers out there that don't support them. And yet, no one would discourage me from using C++14 features. The majority of projects will have influence over the compiler they're using, and how it's configured. Even quoting the gcc manpage:
-fno-rtti
Disable generation of information about every class with virtual functions for use by the C++ run-time type identification features (dynamic_cast and typeid). If you don't use those parts of the language, you can save some space by using this flag. Note that exception handling uses the same information, but G++ generates it as needed. The dynamic_cast operator can still be used for casts that do not require run-time type information, i.e. casts to "void *" or to unambiguous base classes.
What this tells me is that if I'm not using RTTI, I can disable it. That's like saying, if you're not using Boost, you don't have to link to it. I don't have to plan for the case where someone is compiling with -fno-rtti. Plus, the compiler will fail loud and clear in this case.
It costs extra memory / Can be slow
Whenever I'm tempted to use RTTI, that means I need to access some kind of type information or trait of my class. If I implement a solution that does not use RTTI, this usually means I will have to add some fields to my classes to store this information, so the memory argument is kind of void (I'll give an example of this further down).
A dynamic_cast can be slow, indeed. There are usually ways to avoid having to use it in speed-critical situations, though. And I don't quite see the alternative. This SO answer suggests using an enum, defined in the base class, to store the type. That only works if you know all your derived classes a priori. That's quite a big "if"!
From that answer, it seems also that the cost of RTTI is not clear, either. Different people measure different stuff.
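To make the enum suggestion concrete, here is a minimal sketch of how I understand it. All names (node_kind, tagged_node, tagged_red, tagged_yellow, total_redness) and the placeholder redness value are made up for illustration and are not taken from the linked answer.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical closed set of node kinds: every derived class has to be listed here.
enum class node_kind { red, yellow };

class tagged_node {
public:
    explicit tagged_node(node_kind k) : kind_(k) {}
    virtual ~tagged_node() = default;
    node_kind kind() const { return kind_; }
private:
    node_kind kind_;  // one extra field per object
};

class tagged_red : public tagged_node {
public:
    tagged_red() : tagged_node(node_kind::red) {}
    int redness() const { return 7; }  // placeholder value for the sketch
};

class tagged_yellow : public tagged_node {
public:
    tagged_yellow() : tagged_node(node_kind::yellow) {}
};

// Sums redness over the red nodes, using the tag plus static_cast
// instead of dynamic_cast.
int total_redness(const std::vector<std::shared_ptr<tagged_node>>& nodes) {
    int sum = 0;
    for (const auto& n : nodes) {
        assert(n != nullptr);
        if (n->kind() == node_kind::red) {
            sum += static_cast<const tagged_red&>(*n).redness();
        }
    }
    return sum;
}
```

The tag is stored in every object, and a user of the library cannot add a new kind without editing the enum, which is the "big if" I mean.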
Elegant polymorphic designs will make RTTI unnecessary
This is the kind of advice I take seriously. In this case, I simply can't come up with good non-RTTI solutions that cover my RTTI use case. Let me provide an example:
Say I'm writing a library to handle graphs of some kind of objects. I want to allow users to generate their own types when using my library (so the enum method is not available). I have a base class for my node:
class node_base
{
public:
node_base();
virtual ~node_base();
std::vector< std::shared_ptr<node_base> > get_adjacent_nodes();
};
Now, my nodes can be of different types. How about these:
class red_node : virtual public node_base
{
public:
red_node();
virtual ~red_node();
void get_redness();
};
class yellow_node : virtual public node_base
{
public:
yellow_node();
virtual ~yellow_node();
void set_yellowness(int);
};
Hell, why not even one of these:
class orange_node : public red_node, public yellow_node
{
public:
orange_node();
virtual ~orange_node();
void poke();
void poke_adjacent_oranges();
};
The last function is interesting. Here's a way to write it:
void orange_node::poke_adjacent_oranges()
{
auto adj_nodes = get_adjacent_nodes();
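for (const auto& node : adj_nodes) {
// In this case, typeid() and static_cast might be faster
// std::dynamic_pointer_cast (from <memory>) returns an empty pointer if node is not an orange_node
std::shared_ptr<orange_node> o_node = std::dynamic_pointer_cast<orange_node>(node);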
if (o_node) {
o_node->poke();
}
}
}
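For anyone who wants to try this, here is a condensed, compilable sketch of the same hierarchy. The add_adjacent helper and the pokes counter are hypothetical additions that only make the effect observable; they are not part of my actual interface.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

class node_base {
public:
    virtual ~node_base() = default;
    // Hypothetical helper, added only so the sketch can build a small graph.
    void add_adjacent(std::shared_ptr<node_base> n) {
        assert(n != nullptr);
        adjacent_.push_back(std::move(n));
    }
    std::vector<std::shared_ptr<node_base>> get_adjacent_nodes() const {
        return adjacent_;
    }
private:
    std::vector<std::shared_ptr<node_base>> adjacent_;
};

class red_node : virtual public node_base {};
class yellow_node : virtual public node_base {};

class orange_node : public red_node, public yellow_node {
public:
    void poke() { ++pokes_; }
    int pokes() const { return pokes_; }  // hypothetical counter
    void poke_adjacent_oranges() {
        for (const auto& node : get_adjacent_nodes()) {
            // Empty pointer when node is not an orange_node.
            if (auto o_node = std::dynamic_pointer_cast<orange_node>(node)) {
                o_node->poke();
            }
        }
    }
private:
    int pokes_ = 0;
};
```

One caveat about the comment in the loop: because red_node and yellow_node inherit virtually from node_base, a static_cast from node_base down to orange_node does not compile, so the typeid-plus-static_cast shortcut would only apply to a hierarchy without the virtual base.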
This all seems clear and clean. I don't have to define attributes or methods where I don't need them, the base node class can stay lean and mean. Without RTTI, where do I start? Maybe I can add a node_type attribute to the base class:
class node_base
{
public:
node_base();
virtual ~node_base();
std::vector< std::shared_ptr<node_base> > get_adjacent_nodes();
private:
std::string my_type;
};
Is std::string a good idea for a type? Maybe not, but what else can I use? Make up a number and hope no one else is using it yet? Also, in the case of my orange_node, what if I want to use the methods from red_node and yellow_node? Would I have to store multiple types per node? That seems complicated.
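Here is roughly what "multiple types per node" might look like if I went down that road. This is only a sketch under my own assumptions: the sketch_* class names, add_tag, and is_a are invented for illustration, and I use a std::set<std::string> instead of the single string shown above.

```cpp
#include <cassert>
#include <set>
#include <string>

class sketch_base {
public:
    virtual ~sketch_base() = default;
    // True if any class in this object's hierarchy registered the tag.
    bool is_a(const std::string& tag) const { return tags_.count(tag) != 0; }
protected:
    void add_tag(const std::string& tag) {
        assert(!tag.empty());
        tags_.insert(tag);
    }
private:
    std::set<std::string> tags_;  // a whole container per node, not just one field
};

class sketch_red : virtual public sketch_base {
public:
    sketch_red() { add_tag("red"); }
};

class sketch_yellow : virtual public sketch_base {
public:
    sketch_yellow() { add_tag("yellow"); }
};

class sketch_orange : public sketch_red, public sketch_yellow {
public:
    // The base constructors have already registered "red" and "yellow".
    sketch_orange() { add_tag("orange"); }
};
```

Even with the tags in place, is_a only answers a yes/no question. To actually call a method of sketch_orange through a sketch_base pointer I would still need a cast, and with the virtual base that cast has to be a dynamic_cast, so this duplicates part of what RTTI already provides.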
Conclusion
This example doesn't seem overly complex or unusual (I'm working on something similar in my day job, where the nodes represent actual hardware that gets controlled through the software, and which do very different things depending on what they are). Yet I wouldn't know a clean way of doing this with templates or other methods. Please note that I'm trying to understand the problem, not defend my example. My reading of pages such as the SO answer I linked above and this page on Wikibooks seems to suggest I'm misusing RTTI, but I would like to learn why.
So, back to my original question: Why is 'pure polymorphism' preferable over using RTTI?
node_base is part of a library and users will make their own node types. Then they can't modify node_base to allow another solution, so maybe RTTI becomes their best option then. On the other hand, there are other ways to design such a library so that new node types can fit in much more elegantly without needing to use RTTI (and other ways to design the new node types, too). – Matthew Walton