30
votes

tl;dr

What strategies exist to overcome parameter type invariance for specializations, in a language (PHP) without support for generics?

Note: I wish I could say my understanding of type theory/safety/variance/etc. was more complete; I'm no CS major.


Situation

You've got an abstract class, Consumer, that you'd like to extend. Consumer declares an abstract method consume(Argument $argument) which needs a definition. Shouldn't be a problem.


Problem

Your specialized Consumer, called SpecializedConsumer, has no logical business working with every type of Argument. Instead, it should accept a SpecializedArgument (and subclasses thereof). Our method signature changes to consume(SpecializedArgument $argument).

abstract class Argument { }

class SpecializedArgument extends Argument { }

abstract class Consumer { 
    abstract public function consume(Argument $argument);
}

class SpecializedConsumer extends Consumer {
    public function consume(SpecializedArgument $argument) {
        // i dun goofed.
    }
}

We're breaking the Liskov substitution principle, and causing type safety problems. Poop.


Question

Ok, so this isn't going to work. However, given this situation, what patterns or strategies exist to overcome the type safety problem, and the violation of LSP, yet still maintain the type relationship of SpecializedConsumer to Consumer?

I suppose it's perfectly acceptable that an answer can be distilled down to "ya dun goofed, back to the drawing board".


Considerations, Details, & Errata

  • Alright, an immediate solution presents itself as "don't define the consume() method in Consumer". Ok, that makes sense, because a method declaration is only as good as its signature. Semantically, though, the absence of consume(), even with an unknown parameter list, hurts my brain a bit. Perhaps there is a better way.

  • From what I'm reading, few languages support parameter type covariance; PHP is not one of them, and it's the implementation language here. Further complicating things, I've seen creative "solutions" involving generics, another feature not supported in PHP.

  • From Wiki's Variance (computer science) - Need for covariant argument types?:

    This creates problems in some situations, where argument types should be covariant to model real-life requirements. Suppose you have a class representing a person. A person can see the doctor, so this class might have a method virtual void Person::see(Doctor d). Now suppose you want to make a subclass of the Person class, Child. That is, a Child is a Person. One might then like to make a subclass of Doctor, Pediatrician. If children only visit pediatricians, we would like to enforce that in the type system. However, a naive implementation fails: because a Child is a Person, Child::see(d) must take any Doctor, not just a Pediatrician.

    The article goes on to say:

    In this case, the visitor pattern could be used to enforce this relationship. Another way to solve the problems, in C++, is using generic programming.

    Again, generics can be used creatively to solve the problem. I'm exploring the visitor pattern, as I have a half-baked implementation of it anyway; however, most implementations described in articles rely on method overloading, yet another feature PHP doesn't support.
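For what it's worth, the lack of overloading doesn't rule the visitor pattern out: the usual workaround is one distinctly named visit method per argument type. Here is a minimal sketch of that idea. Only Argument and SpecializedArgument come from the question; ArgumentVisitor, GenericArgument, and the visit*() method names are made up for illustration.

```php
<?php
// Hypothetical visitor interface: one named method per concrete Argument type,
// standing in for the overloads PHP does not have.
interface ArgumentVisitor {
    public function visitGenericArgument(GenericArgument $argument);
    public function visitSpecializedArgument(SpecializedArgument $argument);
}

abstract class Argument {
    // Each concrete argument calls the visitor method that matches its own type.
    abstract public function accept(ArgumentVisitor $visitor);
}

class GenericArgument extends Argument {   // hypothetical second argument type
    public function accept(ArgumentVisitor $visitor) {
        return $visitor->visitGenericArgument($this);
    }
}

class SpecializedArgument extends Argument {
    public function accept(ArgumentVisitor $visitor) {
        return $visitor->visitSpecializedArgument($this);
    }
}

class SpecializedConsumer implements ArgumentVisitor {
    public function visitGenericArgument(GenericArgument $argument) {
        return 'ignored';
    }
    public function visitSpecializedArgument(SpecializedArgument $argument) {
        // Fully typed: only a SpecializedArgument can ever arrive here.
        return 'consumed';
    }
}

$consumer = new SpecializedConsumer();
$results = [];
foreach ([new GenericArgument(), new SpecializedArgument()] as $argument) {
    $results[] = $argument->accept($consumer);
}
```

The trade-off is that every new Argument subtype means a new method on the visitor interface, so this probably only pays off when the set of argument types is small and stable.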


<too-much-information>

Implementation

Due to recent discussion, I'll expand on the specific implementation details I've neglected to include (as in, I'll probably include way too much).

For brevity, I've excluded method bodies for those which are (should be) abundantly clear in their purpose. I've tried to keep this brief, but I tend to get wordy. I didn't want to dump a wall of code, so explanations follow/precede code blocks. If you have edit privileges, and want to clean this up, please do. Also, code blocks aren't copy-pasta from a project. If something doesn't make sense, it might not; yell at me for clarification.

With respect to the original question, hereafter the Rule class is the Consumer and the Adapter class is the Argument.

The tree-related classes are structured as follows:

abstract class Rule {
    abstract public function evaluate(Adapter $adapter);
    abstract public function getAdapter(Wrapper $wrapper);
}

abstract class Node {
    protected $rules = [];
    protected $command;
    public function __construct(array $rules, $command) {
        $this->addEachRule($rules);
        $this->setCommand($command);
    }
    public function addRule(Rule $rule) { }
    public function addEachRule(array $rules) { }
    public function setCommand(Command $command) { }
    public function evaluateEachRule(Wrapper $wrapper) {
        // see below
    }
    abstract public function evaluate(Wrapper $wrapper);
}

class InnerNode extends Node {
    protected $nodes = [];
    public function __construct(array $rules, $command, array $nodes) {
        parent::__construct($rules, $command);
        $this->addEachNode($nodes);
    }
    public function addNode(Node $node) { }
    public function addEachNode(array $nodes) { }
    public function evaluateEachNode(Wrapper $wrapper) {
        // see below
    }
    public function evaluate(Wrapper $wrapper) {
        // see below
    }
}

class OuterNode extends Node {
    public function evaluate(Wrapper $wrapper) {
        // see below
    }
}

So each InnerNode contains Rule and Node objects, and each OuterNode contains only Rule objects. Node::evaluate() evaluates each Rule (Node::evaluateEachRule()) to a boolean. If every Rule passes, the Node has passed and its Command is added to the Wrapper. An InnerNode then descends to its children for evaluation (InnerNode::evaluateEachNode()), while an OuterNode simply returns true.
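Since the method bodies above are elided, here is a rough, self-contained sketch of the evaluation flow just described. It uses simplified stand-ins, not the real classes: Rule takes the Wrapper directly (no Adapter), commands are plain strings, FixedRule is hypothetical, and "the first passing child wins" is an assumption the description doesn't confirm.

```php
<?php
class Wrapper {
    public $commands = [];
    public function addCommand($command) { $this->commands[] = $command; }
}

interface Rule {
    public function evaluate(Wrapper $wrapper);
}

class FixedRule implements Rule {   // hypothetical rule with a canned result
    private $result;
    public function __construct($result) { $this->result = $result; }
    public function evaluate(Wrapper $wrapper) { return $this->result; }
}

abstract class Node {
    protected $rules;
    protected $command;
    public function __construct(array $rules, $command) {
        $this->rules = $rules;
        $this->command = $command;
    }
    // True only if every rule passes; stops at the first failure.
    public function evaluateEachRule(Wrapper $wrapper) {
        foreach ($this->rules as $rule) {
            if (!$rule->evaluate($wrapper)) {
                return false;
            }
        }
        return true;
    }
    abstract public function evaluate(Wrapper $wrapper);
}

class OuterNode extends Node {
    public function evaluate(Wrapper $wrapper) {
        if (!$this->evaluateEachRule($wrapper)) {
            return false;
        }
        $wrapper->addCommand($this->command);
        return true;
    }
}

class InnerNode extends Node {
    protected $nodes;
    public function __construct(array $rules, $command, array $nodes) {
        parent::__construct($rules, $command);
        $this->nodes = $nodes;
    }
    // Assumption: children are tried in order and the first one that passes wins.
    public function evaluateEachNode(Wrapper $wrapper) {
        foreach ($this->nodes as $node) {
            if ($node->evaluate($wrapper)) {
                return true;
            }
        }
        return false;
    }
    public function evaluate(Wrapper $wrapper) {
        if (!$this->evaluateEachRule($wrapper)) {
            return false;
        }
        $wrapper->addCommand($this->command);
        return $this->evaluateEachNode($wrapper);
    }
}

$tree = new InnerNode([new FixedRule(true)], 'root', [
    new OuterNode([new FixedRule(false)], 'admin'),
    new OuterNode([new FixedRule(true)], 'public'),
]);
$wrapper = new Wrapper();
$passed = $tree->evaluate($wrapper);
```

Here the 'admin' branch is skipped because its rule fails, so the Wrapper ends up holding the 'root' and 'public' commands.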

As for Wrapper: the Wrapper object proxies a Request object, and has a collection of Adapter objects. The Request object is a representation of the HTTP request. The Adapter object is a specialized interface (and maintains specific state) for specific use with specific Rule objects. (This is where the LSP problems come in.)

The Command object is an action (a neatly packaged callback, really) which is added to the Wrapper object. Once all is said and done, the array of Command objects will be fired in sequence, passing the Request (among other things) in.

class Request { 
    // all teh codez for HTTP stuffs
}

class Wrapper {
    protected $request;
    protected $commands = [];
    protected $adapters = [];
    public function __construct(Request $request) {
        $this->request = $request;
    }
    public function addCommand(Command $command) { }
    public function getEachCommand() { }
    public function adapt(Rule $rule) {
        $type = get_class($rule);
        return isset($this->adapters[$type]) 
            ? $this->adapters[$type]
            : $this->adapters[$type] = $rule->getAdapter($this);
    }
    public function commit(){
        foreach($this->adapters as $adapter) {
            $adapter->commit($this->request);
        }
    }
}

abstract class Adapter {
    protected $wrapper;
    public function __construct(Wrapper $wrapper) {
        $this->wrapper = $wrapper;
    }
    abstract public function commit(Request $request);
}

So a given user-land Rule accepts the expected user-land Adapter. If the Adapter needs information about the request, it's routed through Wrapper, in order to preserve the integrity of the original Request.

As the Wrapper aggregates Adapter objects, it will pass existing instances to subsequent Rule objects, so that the state of an Adapter is preserved from one Rule to the next. Once an entire tree has passed, Wrapper::commit() is called, and each of the aggregated Adapter objects will apply its state as necessary against the original Request.

We are then left with an array of Command objects, and a modified Request.
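To make the Adapter lifecycle concrete, here is a small sketch of the caching and commit behaviour described above. The classes are stand-ins rather than the real ones: Request is reduced to a bare header bag, and HeaderRule and HeaderAdapter are hypothetical. Note that HeaderRule::evaluate() takes the narrower HeaderAdapter, which only works here because it doesn't extend an abstract Rule; that is exactly the problem under discussion.

```php
<?php
class Request {
    public $headers = [];
}

abstract class Adapter {
    abstract public function commit(Request $request);
}

class HeaderAdapter extends Adapter {   // hypothetical adapter
    private $pending = [];
    public function set($name, $value) { $this->pending[$name] = $value; }
    // Buffered state is only written to the Request on commit.
    public function commit(Request $request) {
        foreach ($this->pending as $name => $value) {
            $request->headers[$name] = $value;
        }
    }
}

class HeaderRule {   // hypothetical rule, deliberately not extending Rule
    private $name;
    private $value;
    public function __construct($name, $value) {
        $this->name = $name;
        $this->value = $value;
    }
    public function getAdapter(Wrapper $wrapper) { return new HeaderAdapter(); }
    public function evaluate(HeaderAdapter $adapter) {
        $adapter->set($this->name, $this->value);
        return true;
    }
}

class Wrapper {
    private $request;
    private $adapters = [];
    public function __construct(Request $request) { $this->request = $request; }
    // One adapter per rule class, created lazily and then reused.
    public function adapt($rule) {
        $type = get_class($rule);
        if (!isset($this->adapters[$type])) {
            $this->adapters[$type] = $rule->getAdapter($this);
        }
        return $this->adapters[$type];
    }
    public function commit() {
        foreach ($this->adapters as $adapter) {
            $adapter->commit($this->request);
        }
    }
}

$request = new Request();
$wrapper = new Wrapper($request);
$first = new HeaderRule('X-Zone', 'admin');
$second = new HeaderRule('X-Auth', 'ok');

$sameAdapter = $wrapper->adapt($first) === $wrapper->adapt($second);
$first->evaluate($wrapper->adapt($first));
$second->evaluate($wrapper->adapt($second));
$untouched = $request->headers === [];   // nothing applied before commit()
$wrapper->commit();
```

Both rules share one HeaderAdapter because the cache is keyed on the rule's class, and the Request stays untouched until commit() runs.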


What the hell is the point?

Well, I didn't want to recreate the prototypical "routing table" common in many PHP frameworks/applications, so instead I went with a "routing tree". By allowing arbitrary rules, you can quickly create and append an AuthRule (for example) to a Node, and no longer is that whole branch accessible without passing the AuthRule. In theory (in my head) it's like a magical unicorn, preventing code duplication, and enforcing zone/module organization. In practice, I'm confused and scared.

Why did I leave this wall of nonsense?

Well, this is the implementation for which I need to fix the LSP problem. Each Rule corresponds to an Adapter, and that ain't good. I want to preserve the relationship between each Rule, so as to ensure type safety when constructing the tree, etc.; however, I can't declare the key method (evaluate()) in the abstract Rule, as the signature changes for subtypes.

On another note, I'm working on sorting out the Adapter creation/management scheme; whether it is the responsibility of the Rule to create it, etc.

</too-much-information>

2
I don't have time to go in-depth at the moment, but this is IMHO an important reason why it's better to prefer composition over inheritance. Your problem arises because you're trying to extend an abstract in the first place. My initial suggestion is to use interfaces instead of abstract classes, typehint the interfaces in your method signatures and compose functionality as necessary via the constructors. - rdlowrey
@Dan I find myself asking this question you voiced years ago. Do you recall which path you took? Have your thoughts changed since then? - jules

2 Answers

13
votes

To properly answer this question, we must really take a step back and look at the problem you're trying to solve in a more general manner (and your question was already pretty general).

The Real Problem

The real problem is that you're trying to use inheritance to solve a problem of business logic. That's never going to work, because of the LSP violations and, more importantly, because it tightly couples your business logic to the application's structure.

So inheritance is out as a method to solve this problem (for the above, and the reasons you stated in the question). Fortunately, there are a number of compositional patterns that we can use.

Now, considering how generic your question is, it's going to be very hard to identify a solid solution to your problem. So let's go over a few patterns and see how they can solve this problem.

Strategy

The Strategy Pattern is the first that came to my mind when I first read the question. Basically, it separates the implementation details from the execution details. It allows for a number of different "strategies" to exist, and the caller would determine which to load for the particular problem.

The downside here is that the caller must know about the strategies in order to pick the correct one. But it also allows for a cleaner distinction between the different strategies, so it's a decent choice...
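As a rough illustration (all names here are hypothetical, and the payload is just an array for brevity), the consumer keeps one general signature while the caller hands in whichever strategy fits:

```php
<?php
// Hypothetical strategy interface and two interchangeable implementations.
interface ConsumeStrategy {
    public function handle(array $payload);
}

class JsonStrategy implements ConsumeStrategy {
    public function handle(array $payload) {
        return json_encode($payload);
    }
}

class QueryStringStrategy implements ConsumeStrategy {
    public function handle(array $payload) {
        return http_build_query($payload);
    }
}

class Consumer {
    // One general signature; the specialized behaviour lives in the strategy.
    public function consume(array $payload, ConsumeStrategy $strategy) {
        return $strategy->handle($payload);
    }
}

$consumer = new Consumer();
$asJson = $consumer->consume(['id' => 7], new JsonStrategy());
$asQuery = $consumer->consume(['id' => 7], new QueryStringStrategy());
```

No subclass of Consumer ever needs to narrow a parameter type, because nothing subclasses Consumer at all.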

Command

The Command Pattern would also decouple the implementation just like Strategy would. The main difference is that in Strategy, the caller is the one that chooses the consumer. In Command, it's someone else (a factory or dispatcher perhaps)...

Each "Specialized Consumer" would implement only the logic for a specific type of problem. Then someone else would make the appropriate choice.
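One possible reading of this, sketched with a hypothetical Dispatcher that picks the handler based on the argument's class (OtherArgument is also made up). Because SpecializedConsumer no longer shares a base class with other consumers, it is free to declare the narrow parameter type:

```php
<?php
abstract class Argument { }
class SpecializedArgument extends Argument { }
class OtherArgument extends Argument { }   // hypothetical second argument type

// Hypothetical dispatcher: maps an argument class name to a registered handler.
class Dispatcher {
    private $handlers = [];
    public function register($argumentClass, callable $handler) {
        $this->handlers[$argumentClass] = $handler;
    }
    public function dispatch(Argument $argument) {
        $class = get_class($argument);
        if (!isset($this->handlers[$class])) {
            throw new InvalidArgumentException("No handler registered for $class");
        }
        return call_user_func($this->handlers[$class], $argument);
    }
}

class SpecializedConsumer {
    // No inherited signature to stay compatible with, so the type can be narrow.
    public function consume(SpecializedArgument $argument) {
        return 'specialized';
    }
}

$dispatcher = new Dispatcher();
$dispatcher->register('SpecializedArgument', [new SpecializedConsumer(), 'consume']);
$result = $dispatcher->dispatch(new SpecializedArgument());

try {
    $dispatcher->dispatch(new OtherArgument());
    $unhandled = false;
} catch (InvalidArgumentException $e) {
    $unhandled = true;
}
```

This sketch matches on the exact class name only; supporting subclasses of SpecializedArgument would need an instanceof-based lookup instead.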

Chain Of Responsibility

The next pattern that may be applicable is the Chain of Responsibility Pattern. This is similar to the strategy pattern discussed above, except that instead of the caller deciding which strategy is called, each one of the strategies is called in sequence until one handles the request. So, in your example, you would take the more generic argument, but check if it's the specific one. If it is, handle the request. Otherwise, let the next one give it a try...
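A minimal sketch of such a chain, assuming hypothetical setNext(), canConsume(), and doConsume() methods plus made-up OtherArgument and FallbackConsumer classes. Every link keeps the general consume(Argument $argument) signature, so nothing is narrowed:

```php
<?php
abstract class Argument { }
class SpecializedArgument extends Argument { }
class OtherArgument extends Argument { }   // hypothetical second argument type

abstract class Consumer {
    private $next;
    public function setNext(Consumer $next) {
        $this->next = $next;
        return $next;
    }
    // General signature; each link decides whether the argument is its own.
    public function consume(Argument $argument) {
        if ($this->canConsume($argument)) {
            return $this->doConsume($argument);
        }
        if ($this->next !== null) {
            return $this->next->consume($argument);
        }
        return null; // nobody in the chain handled it
    }
    abstract protected function canConsume(Argument $argument);
    abstract protected function doConsume(Argument $argument);
}

class SpecializedConsumer extends Consumer {
    protected function canConsume(Argument $argument) {
        return $argument instanceof SpecializedArgument;
    }
    protected function doConsume(Argument $argument) {
        return 'specialized';
    }
}

class FallbackConsumer extends Consumer {   // hypothetical catch-all link
    protected function canConsume(Argument $argument) {
        return true;
    }
    protected function doConsume(Argument $argument) {
        return 'fallback';
    }
}

$chain = new SpecializedConsumer();
$chain->setNext(new FallbackConsumer());
$a = $chain->consume(new SpecializedArgument());
$b = $chain->consume(new OtherArgument());
```

A consumer that receives the "wrong" argument simply passes it along instead of failing, which keeps substitutability intact.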

Bridge

A Bridge Pattern may be appropriate here as well. This is in some sense similar to the Strategy pattern, but it's different in that a bridge implementation would pick the strategy at construction time, instead of at run time. So then you would build a different "consumer" for each implementation, with the details composed inside as dependencies.
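Roughly, and with entirely hypothetical names (Backend, MemoryBackend, LoudConsumer), that could look like the sketch below: the implementation is injected once when the consumer is constructed, and the consumer hierarchy can then vary independently of it.

```php
<?php
// Hypothetical implementor side of the bridge.
interface Backend {
    public function write($message);
}

class MemoryBackend implements Backend {
    public $lines = [];
    public function write($message) {
        $this->lines[] = $message;
    }
}

// Abstraction side: the backend is fixed when the consumer is built.
class Consumer {
    protected $backend;
    public function __construct(Backend $backend) {
        $this->backend = $backend;
    }
    public function consume($value) {
        $this->backend->write((string) $value);
    }
}

// A refined abstraction changes behaviour without touching the backend.
class LoudConsumer extends Consumer {
    public function consume($value) {
        $this->backend->write(strtoupper((string) $value));
    }
}

$backend = new MemoryBackend();
(new Consumer($backend))->consume('hello');
(new LoudConsumer($backend))->consume('hello');
```

Note that LoudConsumer overrides consume() without changing its parameter list, so the inheritance that remains doesn't run into the variance problem.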

Visitor Pattern

You mentioned the Visitor Pattern in your question, so I figured I'd mention it here. I'm not really sure it's appropriate in this context, because a visitor is really similar to a strategy pattern that's designed to traverse a structure. If you don't have a data structure to traverse, then the visitor pattern will be distilled to look fairly similar to a strategy pattern. I say fairly, because the direction of control is different, but the end relationship is pretty much the same.

Other Patterns

In the end, it really depends on the concrete problem that you're trying to solve. If you're trying to handle HTTP requests, where each "Consumer" handles a different request type (XML vs HTML vs JSON etc), the best choice will likely be very different than if you're trying to handle finding the geometric area of a polygon. Sure, you could use the same pattern for both, but they are not really the same problem.

With that said, the problem could also be solved with a Mediator Pattern (in the case where multiple "Consumers" need a chance to process data), a State Pattern (in the case where the "Consumer" will depend on past consumed data) or even an Adapter Pattern (in the case where you're abstracting a different sub-system in the specialized consumer)...

In short, it's a difficult problem to answer, because there are so many solutions that it's hard to say which is correct...

5
votes

The only one known to me is a DIY strategy: accept a plain Argument in the method definition and immediately check whether it is specialized enough:

class SpecializedConsumer extends Consumer {
    public function consume(Argument $argument) {
        if(!($argument instanceof SpecializedArgument)) {
            throw new InvalidArgumentException('Argument was not specialized.');
        }
        // move on
    }
}
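For completeness, a self-contained version showing how that behaves from the caller's side. OtherArgument and the return value are made up for the demonstration; the rest mirrors the classes in the question.

```php
<?php
abstract class Argument { }
class SpecializedArgument extends Argument { }
class OtherArgument extends Argument { }   // hypothetical second argument type

abstract class Consumer {
    abstract public function consume(Argument $argument);
}

class SpecializedConsumer extends Consumer {
    // Signature stays compatible with Consumer; the narrowing happens at runtime.
    public function consume(Argument $argument) {
        if (!($argument instanceof SpecializedArgument)) {
            throw new InvalidArgumentException('Argument was not specialized.');
        }
        return 'consumed';
    }
}

$consumer = new SpecializedConsumer();
$ok = $consumer->consume(new SpecializedArgument());

try {
    $consumer->consume(new OtherArgument());
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

Keep in mind this only moves the check from declaration time to runtime: code that holds a plain Consumer can still trigger the exception, so callers have to be prepared for it.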