Fun with Operator Overloading
The other day I was asked to take a look at code I had written years ago that basically allowed the user to build up logical formulas while always keeping them in CNF (conjunctive normal form – that is, a logical formula where the outermost connective is always a conjunction).
This was done by building a class hierarchy with an abstract class Expr on the top level,
two classes ExprAnd and ExprOr that represent conjunction and disjunction respectively and
both contain lists of Expr. Various other classes where used to represent various atoms of
the formulas, the particular code was about solar physics so the atoms happen to be things like
Wavelength(x, y) for selecting observations of a particular wavelength.
The base class Expr defines __and__ and __or__, that is the methods that need to be
implemented to overload the & and | operations that just return ExprAnd and ExprOr
of the object and the left hand operand.
ExprAnd and ExprOr also implement the operation corresponding to their node type
by just returning a new ExprAnd or ExprOr with the left hand operand added to the
list, respectively. They also implement __rand__ and __ror__, so their special implementation
is also applied if they are used at the right hand side of the operation – or that was the plan.
ExprAnd commutes | operations into the conjuction to keep conjunction as the top level
connective.
The problem that I asked to help with was that foo & (bar & baz) did not seem to evaluate into
ExprAnd(foo, bar, baz), but rather ExprAnd(foo, ExprAnd(bar, baz)), that is, the automatic
transformation to CNF was not working.
Python actually implements logic that when you have an operation foo & bar where the type of
bar is a subclass of the type of foo and implements its own version of __rand__, the
expression will be evaluating using that. So when you read __rand__ (or any of the operators)
is used of the left hand side does not implement the operator, that is wrong. For instance,
In [18]: class Foo(object):
    ...:     def __add__(self, x):
    ...:         print "Foo ", self, x
    ...:         
In [19]: class Qux(Foo):
    ...:     def __radd__(self, x):
    ...:         print "Qux ", self, x
    ...:         
In [20]: Foo() + Qux()
Qux  <__main__.Qux object at 0x112f9f750> <__main__.Foo object at 0x112f7e510>This behaviour was what I was aiming to exploit by making ExprAnd (for instance) implement
both __and__ and __rand__ and make Python use the most specific implementation. Sadly, it
does not work this way. If you now introduce a third class that inherits from Foo and does not
implement either of the methods itself, the base class’ will be used, for instance
In [21]: class Bar(Foo):
    ...:     pass
    ...: 
In [22]: Bar() + Qux()
Foo  <__main__.Bar object at 0x112f80890> <__main__.Qux object at 0x112f93a90>The semantics of Python operator overloading will only prefer the __rand__ method of the
right hand side if it is an object of a subclass of the type of the right hand side object.
Note that this does not take into account in which class the implementation of the left hand side
was actually done, which explains the behaviour above.
If you look at the class layout described above, this was exactly the case that I was hitting. Disclaimer: What follows below is not how it was actually fixed in the code.
Going Crazy
I realized that what I had intended to implement back then was something like commutative multimethod. Python’s dynamicity, for better or worse, gives you the tools to do most of such things, so I decided to give it a go.
The gist of the idea is to try to find the most specific implementation of the operation, considering both operands. The operation of the left hand side operand is more specific if it overrides the definition of the operation on the right hand side (and the other way round).
This can conveniently be expressed in terms of Python: The operation of the left hand side operand is more specific if the operation defined by the right hand side is in the MRO (method resolution order – the list of classes, in order, that get consulted when looking up a method in an object).
To implement this, first we define a functions to get all functions. It is important to know
that the function is different to the unbound and bound method object and does not contain
information about which class it is contained in; the function object can be obtained from
an unbound method object by using im_func.
import inspect
def fn_mro(cls, fn):
    mro = inspect.getmro(cls)
    return [getattr(x, fn).im_func for x in mro if hasattr(x, fn)]This can then be used to implement the logic as described above as a decorator that wraps a method. Note that we have to compare whether the left and right side operator is actually the same to prevent an infinite loop.
def most_specific(fn):
    def func(self, other):
        name = fn.__name__
        self_fn = getattr(self, name).im_func
        other_fn = getattr(other, name).im_func
        if self_fn == other_fn:
            return fn(self, other)
        if self_fn not in fn_mro(other.__class__, name):
            # We did not inherit this from something in other's MRO.
            if other_fn not in fn_mro(self.__class__, name):
                # They didn't either. PANIC!
                raise TypeError
            return fn(self, other)
        else:
            return getattr(other, name)(self)
    return funcIf we use the decorator on a slightly modified version of the previous example
(we use __add__ instead of __radd__ because the decorator assumes commutativity
and uses the same operations on both sides).
In [7]: class Foo(object):
    @most_specific
    def __add__(self, x):
        print "Foo ", self, x
   ...:         
In [8]: class Qux(Foo):
    @most_specific
    def __add__(self, x):
        print "Qux ", self, x
   ...:         
In [9]: Foo() + Qux()
Qux  <__main__.Qux object at 0x103103690> <__main__.Foo object at 0x103103f50>
In [10]: class Bar(Foo):
   ....:     pass
   ....: 
In [11]: Bar() + Qux()
Qux  <__main__.Qux object at 0x103103e50> <__main__.Bar object at 0x10310b050>