diff content/Coding/021-python-chained-assignment.rst @ 4:7ce6393e6d30

Adding converted blog posts from old blog.
author Brian Neal <bgneal@gmail.com>
date Thu, 30 Jan 2014 21:45:03 -0600
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/021-python-chained-assignment.rst	Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,83 @@
+A C & Python chained assignment gotcha
+######################################
+
+:date: 2012-12-27 14:45
+:tags: Python, C++
+:slug: a-c-python-chained-assignment-gotcha
+:author: Brian Neal
+
+Late last night I had a marathon debugging session where I discovered I had
+been burned by not fully understanding how chained assignment works in
+Python. I was porting some C code to Python that had some chained assignment
+expressions. C and C++ programmers are well used to this idiom, which has the
+following meaning:
+
+.. sourcecode:: c
+
+   a = b = c = d = e;      // C/C++ code
+
+   // The above is equivalent to this:
+
+   a = (b = (c = (d = e)));
+
+This is because in C, assignments are actually expressions that return a value,
+and they are right-associative.
+
+I knew that Python supported this syntax, and I had a vague memory that it was
+not the same semantically as C, but I was in a hurry. After playing a bit in the
+shell I convinced myself this chained assignment was doing what I wanted. My
+Python port kept this syntax and I drove on. A huge mistake!
+
+Hours later, of course, I found out the hard way the two are not exactly
+equivalent. For one thing, in Python, assignment is a statement, not an
+expression. There is no 'return value' from an assignment. The Python syntax
+does allow chaining for convenience, but the meaning is subtly different.
+
+.. sourcecode:: python
+
+   a = b = c = d = e    # Python code
+
+   # The above is equivalent to these lines of code
+   # (e is evaluated once, then the targets are assigned left to right):
+   temp = e
+   a = temp
+   b = temp
+   c = temp
+   d = temp
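+
+To check the evaluation order for yourself, here is a minimal sketch. The
+helper ``make_value`` and all variable names are hypothetical, made up for this
+illustration. It suggests that the right-hand side is evaluated only once and
+that the targets are then assigned from left to right:
+

```python
calls = []

def make_value():
    """Hypothetical helper that records every call made to it."""
    calls.append("called")
    return [0, 0, 0]

# The right-hand side is evaluated exactly once...
a = b = make_value()
assert len(calls) == 1
assert a is b              # both names refer to the same list object

# ...and the targets are then assigned from left to right.
i = 0
items = ["x", "y", "z"]
i = items[i] = 2           # i becomes 2 first, then items[2] is set to 2
assert i == 2
assert items == ["x", "y", 2]
```

+
+Note that ``items[0]`` is left untouched: by the time the second target is
+assigned, ``i`` already holds its new value.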
+
+Now usually, I suspect, you can mix the C/C++ meaning with Python and not get
+tripped up. But I was porting some tricky red-black tree code, and it made
+a huge difference. Here is the C code first, and then the Python.
+
+.. sourcecode:: c
+
+   p = p->link[last] = tree_rotate(q, dir);
+
+   // The above is equivalent to:
+
+   p = (p->link[last] = tree_rotate(q, dir));
+
+
+The straight (and incorrect) Python port of this code:
+
+.. sourcecode:: python
+
+   p = p.link[last] = tree_rotate(q, d)
+
+   # The above code is equivalent to this:
+   temp = tree_rotate(q, d)
+   p = temp                # Oops: p is rebound to the new node first
+   p.link[last] = temp     # Oops: the link is set on the new node, not the old
+
+Do you see the problem? It is glaringly obvious to me now. The C and Python
+versions are not equivalent because Python evaluates the right-hand side once
+and then assigns to the targets from left to right. The flaw comes about
+because ``p`` appears in more than one target: by the time ``p.link[last]`` is
+assigned, ``p`` already refers to the new node.
+
+In the C version, the tree node pointed at by ``p`` has one of its child links
+changed first, then ``p`` is advanced to the value of the new child. In the
+Python version, the name ``p`` is rebound to the new node first, and then the
+child link of that new node is altered, so the new node ends up linked to
+itself and the original node is never updated! This introduced a very subtle
+bug that cost me a few hours of bleary-eyed debugging.
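+
+Here is a small, self-contained sketch of the bug and two possible fixes. The
+``Node`` class and ``tree_rotate_stub`` are hypothetical stand-ins invented for
+this illustration, not the real red-black tree code:
+

```python
class Node:
    """Hypothetical stand-in for a tree node with two child links."""
    def __init__(self, name):
        self.name = name
        self.link = [None, None]

def tree_rotate_stub(node):
    """Stand-in for tree_rotate(): simply returns a new subtree root."""
    return Node("rotated")

last = 0

# Incorrect straight port: p is rebound before p.link[last] is assigned.
parent = Node("parent")
p = parent
p = p.link[last] = tree_rotate_stub(p)
buggy_parent_link = parent.link[last]    # None: the parent was never updated
buggy_self_link = p.link[last] is p      # True: the new node links to itself

# One possible fix: two statements, in the same order as the C code.
parent = Node("parent")
p = parent
p.link[last] = tree_rotate_stub(p)
p = p.link[last]
fixed_parent_link = parent.link[last]    # the rotated node
fixed_p = p

# Another possible fix: keep the chain, but put the attribute target first.
parent2 = Node("parent2")
p = parent2
p.link[last] = p = tree_rotate_stub(p)
swapped_parent_link = parent2.link[last]
```

+
+The last variant works only because targets are assigned from left to right,
+so the two-statement form is probably the easier one to read and maintain.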
+
+Watch out for this when you are porting C to Python or vice versa. I generally
+avoid this syntax in both languages in my own code, but I do admit it is nice
+for conciseness, and I let it slip in occasionally.