26 votes

My Flask-Restful application has a number of "objects". In the first version of the app these are simple data structures with no behaviour, implemented as dicts or lists of dicts.

The attributes of these "objects" can change. I use a generator function to track the changes, and then alert web-clients via server-sent-events (SSEs). This works by maintaining an "old" copy of the object to be tracked, and comparing it to the latest state.
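
In outline, the tracker looks something like this (a simplified sketch with illustrative names, not my actual code):

import copy
import json
import time

def track_changes(get_state, poll_interval=1.0):
    # get_state is any callable returning the current state
    # (a dict, or a list of dicts)
    old = copy.deepcopy(get_state())
    while True:
        new = get_state()
        if new != old:
            # push the new state to the client as a server-sent event
            yield "data: {}\n\n".format(json.dumps(new))
            old = copy.deepcopy(new)
        time.sleep(poll_interval)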

In the next version of the app I populate the "objects" from a SQLite DB using SQLAlchemy. The objects are now implemented as SQLAlchemy declarative classes, or lists of such classes.

To compare "old" and "new" instances based on equality of attributes only, I had to add an __eq__ override to my SQLAlchemy objects, i.e. the instances are considered equal / unchanged when their attributes have the same values. (I have posted example code at the bottom of this question.)

Technically this works, but raises some architectural alarm bells: Am I sailing in the wrong direction?

a) If I add __eq__ and __ne__ overrides to SQLAlchemy objects, could this cause SQLAlchemy a problem when I later want to re-persist the objects back to the database?

b) How far into my application should the SQLAlchemy objects reach: is there a "pythonic best practice"? I.e. is it OK / normal to extend SQLAlchemy objects with business logic / behaviours unconnected with DB persistence (such as tracking changes), or should they be used only as simple DTOs between the database and server, with business logic in other objects?

Note: it is clear to me that the data presented to the clients via the REST APIs and the SSEs should be abstracted from the implementation details in the web server and DB, so that is not part of this question.

Related reading:

- sqlalchemy id equality vs reference equality
- https://codereview.stackexchange.com/questions/93511/data-transfer-objects-vs-entities-in-java-rest-server-application
- http://www.mehdi-khalili.com/orm-anti-patterns-part-4-persistence-domain-model/

from copy import deepcopy

class EqualityMixin(object):
    # extended from the concept in:
    # https://stackguides.com/questions/390250/elegant-ways-to-support-equivalence-equality-in-python-classes

    def __eq__(self, other):
        classes_match = isinstance(other, self.__class__)
        a, b = deepcopy(self.__dict__), deepcopy(other.__dict__)
        # compare our attributes for equality, ignoring SQLAlchemy's
        # internal bookkeeping in '_sa_instance_state'
        a.pop('_sa_instance_state', None)
        b.pop('_sa_instance_state', None)
        attrs_match = (a == b)
        return classes_match and attrs_match

    def __ne__(self, other):
        return not self.__eq__(other)
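
For illustration, a hypothetical model using the mixin (the Sensor class and its columns are made up for this question, not my real schema):

from sqlalchemy import Column, Float, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Sensor(EqualityMixin, Base):
    __tablename__ = 'sensors'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    value = Column(Float)

# two instances with the same attribute values compare equal
assert Sensor(id=1, name='t1', value=20.5) == Sensor(id=1, name='t1', value=20.5)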
yes, you are sailing in the wrong direction on this. A deep comparison is going to be very slow and error-prone. Do the work and write explicit __eq__() routines for each object that wishes to include this functionality, comparing the actual attributes you care about individually. – zzzeek
In the above code you might want to check symmetrically: classes_match = isinstance(other, self.__class__) and isinstance(self, other.__class__) – Daniel Böckenhoff

1 Answer

5 votes

I'll drill down into what's happening behind the Base class, to show that the __eq__ and __ne__ overrides are fine. When you create your Base class by calling declarative_base(), SQLAlchemy uses a metaclass behind the scenes to set it up (it might be worth reading this metaclass explanation to better understand why a metaclass is involved). That setup is configurable, and includes things like adding a custom constructor to your Base class and setting how it will map from the object to a table.
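
You can see the metaclass directly from a shell (a quick sketch; the import path shown is the pre-1.4 location of declarative_base):

from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

# Base looks like an ordinary class, but its type is the DeclarativeMeta
# metaclass rather than plain `type`
print(type(Base))  # <class 'sqlalchemy.ext.declarative.api.DeclarativeMeta'>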

declarative_base() is then going to return a new Base class whose type is the DeclarativeMeta metaclass. The whole reason metaclasses are involved here is so that, at the moment you define a class extending your Base, it gets mapped to a table. If you trace down this path a little way, you will see how it maps the Columns you declare on your object to a table.

self.cls.__mapper__ = mp_ = mapper_cls(
    self.cls,            # cls is your model
    self.local_table,
    **self.mapper_args   # the columns you have defined
)

Although the actual Mapper doing this work looks like it gets very complicated and low level, at this stage it is operating with primary keys and columns rather than with actual object instances. That doesn't confirm instance equality is never used, however, so I looked through the usages of == and != in the source and didn't see any cause for concern.
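
The key point is that queries compare at the class level, where the columns are InstrumentedAttribute objects with their own operator overloads, so your instance-level override never enters the picture. A quick sketch, reusing the hypothetical Sensor model from the question:

# class-level comparison: handled by InstrumentedAttribute.__eq__,
# which builds a SQL expression instead of returning a bool
print(Sensor.name == 'kitchen')  # sensors.name = :name_1

# instance-level comparison: handled by the EqualityMixin override
print(Sensor(name='kitchen') == Sensor(name='kitchen'))  # True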

As for your second question, I can only really offer my own opinion. I've googled around this subject many times in the past and haven't found much in the way of "gold standard" SQLAlchemy usage. I've used SQLAlchemy for a couple of projects so far, and my feeling is that your use of the objects can extend about as far as you can still sensibly abstract away the session life-cycle. Enough of the SQLAlchemy "magic" is kept out of the models themselves that, when sessions are handled well, the data layer is sufficiently far removed that business logic on the classes doesn't get in the way.
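
For example, one common way to keep the session life-cycle away from the models is a context-manager helper along the lines of the session_scope() recipe from the SQLAlchemy docs (sketched here; the engine URL is illustrative):

from contextlib import contextmanager

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine('sqlite:///app.db')  # illustrative connection string
Session = sessionmaker(bind=engine)

@contextmanager
def session_scope():
    # provide a transactional scope around a series of operations
    session = Session()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()

# the models stay free of session handling; calling code does e.g.:
# with session_scope() as session:
#     session.add(changed_object)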