diff content/Coding/005-django-unicode-error-uploads.rst @ 4:7ce6393e6d30

Adding converted blog posts from old blog.
author Brian Neal <bgneal@gmail.com>
date Thu, 30 Jan 2014 21:45:03 -0600
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/005-django-unicode-error-uploads.rst	Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,53 @@
+Django Uploads and UnicodeEncodeError
+#####################################
+
+:date: 2011-06-04 20:00
+:tags: Django, Python, Linux, Unicode
+:slug: django-uploads-and-unicodeencodeerror
+:author: Brian Neal
+
+Something strange happened that I wish to document in case it helps others.  I
+had to reboot my Ubuntu server while troubleshooting a disk problem. After the
+reboot, I began receiving internal server errors whenever someone tried to view
+a certain forum thread on my Django_-powered website. After some detective work,
+I determined it was because a user who had posted in the thread had an avatar
+image whose filename contained non-ASCII characters. The image file had been
+there for months, and I still cannot explain why the errors suddenly started
+appearing.
+
+The traceback I was getting ended with something like this:
+
+.. sourcecode:: python
+
+   File "/django/core/files/storage.py", line 159, in _open
+   return File(open(self.path(name), mode))
+
+   UnicodeEncodeError: 'ascii' codec can't encode characters in position 72-79: ordinal not in range(128)
+
+So it appeared that the ``open()`` call was triggering the error. This led me on
+a twisty Google search which had many dead ends. Eventually I found a suitable
+explanation. Apparently, Linux filesystems don't enforce a particular Unicode
+encoding for filenames. Linux applications must decide how to interpret
+filenames all on their own. Python (on Linux) uses environment variables to
+determine what locale you are in, and the locale determines the encoding used
+for filenames. If these environment variables are not set, Python falls back to
+ASCII by default, which was the source of my ``UnicodeEncodeError``.
+
+So how do you tell a Python instance that is running under Apache / ``mod_wsgi``
+about these environment variables? It turns out the answer is in the `Django
+documentation`_, albeit in the ``mod_python`` integration section.
+
+So, to fix the issue, I added the following lines to my ``/etc/apache2/envvars``
+file:
+
+.. sourcecode:: bash
+
+   export LANG='en_US.UTF-8'
+   export LC_ALL='en_US.UTF-8'
+
+Note that you must fully stop and then start Apache for these changes to take
+effect. I got tripped up at first because I ran ``apache2ctl graceful``, and
+that was not sufficient to create a new environment.
+
+.. _Django: http://djangoproject.com
+.. _Django documentation: https://docs.djangoproject.com/en/1.3/howto/deployment/modpython/#if-you-get-a-unicodeencodeerror