author Brian Neal <bgneal@gmail.com>
date Thu, 30 Jan 2014 21:45:03 -0600
A C & Python chained assignment gotcha
######################################

:date: 2012-12-27 14:45
:tags: Python, C++
:slug: a-c-python-chained-assignment-gotcha
:author: Brian Neal

Late last night I had a marathon debugging session in which I discovered I had
been burned by not fully understanding chained assignment statements in
Python. I was porting some C code to Python that used chained assignment
expressions. C and C++ programmers are well used to this idiom, which has the
following meaning:

.. sourcecode:: c

   a = b = c = d = e;      // C/C++ code

   // The above is equivalent to this:

   a = (b = (c = (d = e)));

This is because in C an assignment is an expression that yields a value, and
the assignment operator is right-associative.

I knew that Python supported this syntax, and I had a vague memory that it was
not the same semantically as C, but I was in a hurry. After playing a bit in the
shell I convinced myself this chained assignment was doing what I wanted. My
Python port kept this syntax and I drove on. A huge mistake!

Hours later, of course, I found out the hard way that the two are not exactly
equivalent. For one thing, in Python, assignment is a statement, not an
expression. There is no 'return value' from an assignment. The Python syntax
does allow chaining for convenience, but the meaning is subtly different: the
right-hand side is evaluated once, and the result is then assigned to each
target in turn, from left to right.

.. sourcecode:: python

   a = b = c = d = e    # Python code

   # The above is equivalent to these lines of code
   # (e is evaluated only once, then assigned left to right):
   temp = e
   a = temp
   b = temp
   c = temp
   d = temp
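
To make the ordering concrete, here is a minimal sketch. The names
``make_value`` and ``Recorder`` are hypothetical helpers invented for this
illustration; they only exist to make the evaluation count and the assignment
order visible.

.. sourcecode:: python

   calls = []

   def make_value():
       """Hypothetical right-hand side with a visible side effect."""
       calls.append("evaluated")
       return []

   class Recorder:
       """Hypothetical helper that records the order of attribute assignments."""

       def __init__(self):
           # Bypass our own __setattr__ so the log itself is not recorded.
           object.__setattr__(self, "log", [])

       def __setattr__(self, name, value):
           self.log.append(name)
           object.__setattr__(self, name, value)

   r = Recorder()
   r.a = r.b = r.c = make_value()

   print(calls)               # ['evaluated']: the right-hand side ran once
   print(r.log)               # ['a', 'b', 'c']: targets assigned left to right
   print(r.a is r.b is r.c)   # True: all three refer to the same list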

Now usually, I suspect, you can assume the C/C++ meaning in Python and not get
tripped up. But I was porting some tricky red-black tree code, and it made
a huge difference. Here is the C code first, and then the Python.

.. sourcecode:: c

   p = p->link[last] = tree_rotate(q, dir);

   // The above is equivalent to:

   p = (p->link[last] = tree_rotate(q, dir));


The straight (and incorrect) Python port of this code:

.. sourcecode:: python

   p = p.link[last] = tree_rotate(q, d)

   # The above code is equivalent to this:
   temp = tree_rotate(q, d)
   p = temp               # Oops: p is rebound to the new node first
   p.link[last] = temp    # Oops: this sets the link on the new node
   
Do you see the problem? It is glaringly obvious to me now. The C and Python
versions are not equivalent because the Python version performs the
assignments in a different order. The flaw comes about because ``p`` appears
more than once in the chained assignment, both as a target and inside another
target, so the result depends on which assignment happens first.

In the C version, the tree node pointed at by ``p`` has one of its child links
changed first, then ``p`` is advanced to the value of the new child. In the
Python version, the name ``p`` is rebound to the new node first, and only then
is ``p.link[last]`` assigned. By that point ``p`` no longer refers to the
original node, so the new node ends up linked to itself and the original
node's child link is never updated! This introduced a very subtle bug that
cost me a few hours of bleary-eyed debugging.
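
The difference can be reproduced in a few lines. This is only a sketch under
stated assumptions: ``Node`` is a hypothetical minimal node class, and
``tree_rotate`` here is a placeholder that simply returns ``q`` rather than
performing a real rotation. Neither is the actual red-black tree code.

.. sourcecode:: python

   class Node:
       """Hypothetical minimal tree node with two child links."""

       def __init__(self, name):
           self.name = name
           self.link = [None, None]

   def tree_rotate(q, d):
       """Placeholder: a real rotation would restructure the subtree."""
       return q

   last, d = 0, 1

   def buggy_port():
       parent = Node("parent")
       p = parent
       q = Node("new root")
       # p is rebound first, so p.link[last] refers to the new node.
       p = p.link[last] = tree_rotate(q, d)
       return parent, p

   def explicit_port():
       parent = Node("parent")
       p = parent
       q = Node("new root")
       new_root = tree_rotate(q, d)
       p.link[last] = new_root   # update the old node's link first, as in C
       p = new_root              # then advance p
       return parent, p

   parent, p = buggy_port()
   print(parent.link[last])        # None: the old node was never updated
   print(p.link[last] is p)        # True: the new node links to itself

   parent, p = explicit_port()
   print(parent.link[last] is p)   # True: the old node links to the new one
   print(p.link[last])             # None

Because Python assigns to targets from left to right, writing the targets in
the other order, ``p.link[last] = p = tree_rotate(q, d)``, should also
reproduce the C ordering. The explicit two-statement form is probably the
safer choice, since it does not depend on the reader remembering that rule.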

Watch out for this when you are porting C to Python or vice versa. I generally
avoid this syntax in both languages in my own code, but I admit it is nicely
concise, and I let it slip in occasionally.