diff content/Coding/005-django-unicode-error-uploads.rst @ 4:7ce6393e6d30
Adding converted blog posts from old blog.
author | Brian Neal <bgneal@gmail.com> |
---|---|
date | Thu, 30 Jan 2014 21:45:03 -0600 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/005-django-unicode-error-uploads.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,53 @@
+Django Uploads and UnicodeEncodeError
+#####################################
+
+:date: 2011-06-04 20:00
+:tags: Django, Python, Linux, Unicode
+:slug: django-uploads-and-unicodeencodeerror
+:author: Brian Neal
+
+Something strange happened that I wish to document in case it helps others. I
+had to reboot my Ubuntu server while troubleshooting a disk problem. After the
+reboot, I began receiving internal server errors whenever someone tried to view
+a certain forum thread on my Django_ powered website. After some detective work,
+I determined it was because a user that had posted in the thread had an avatar
+image whose filename contained non-ASCII characters. The image file had been
+there for months, and I still cannot explain why it just suddenly started
+happening.
+
+The traceback I was getting ended with something like this:
+
+.. sourcecode:: python
+
+   File "/django/core/files/storage.py", line 159, in _open
+     return File(open(self.path(name), mode))
+
+   UnicodeEncodeError: 'ascii' codec can't encode characters in position 72-79: ordinal not in range(128)
+
+So it appeared that the ``open()`` call was triggering the error. This led me on
+a twisty Google search which had many dead ends. Eventually I found a suitable
+explanation. Apparently, Linux filesystems don't enforce a particular Unicode
+encoding for filenames. Linux applications must decide how to interpret
+filenames all on their own. The Python OS library (on Linux) uses environment
+variables to determine what locale you are in, and this chooses the encoding for
+filenames. If these environment variables are not set, Python falls back to
+ASCII (by default), and hence the source of my ``UnicodeEncodeError``.
+
+So how do you tell a Python instance that is running under Apache / ``mod_wsgi``
+about these environment variables? It turns out the answer is in the `Django
+documentation`_, albeit in the ``mod_python`` integration section.
+
+So, to fix the issue, I added the following lines to my ``/etc/apache2/envvars``
+file:
+
+.. sourcecode:: bash
+
+   export LANG='en_US.UTF-8'
+   export LC_ALL='en_US.UTF-8'
+
+Note that you must cold stop and re-start Apache for these changes to take
+effect. I got tripped up at first because I did an ``apache2ctl
+graceful``, and that was not sufficient to create a new environment.
+
+.. _Django: http://djangoproject.com
+.. _Django documentation: https://docs.djangoproject.com/en/1.3/howto/deployment/modpython/#if-you-get-a-unicodeencodeerror
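
The locale dependence described in the post can be checked from inside a running Python process. What follows is a minimal Python 3 sketch under stated assumptions, not code from the post or from Django: the helper names ``can_encode`` and ``filename_encoding_report`` and the sample filename are hypothetical, chosen only for illustration. It reports the ``LANG`` and ``LC_ALL`` values the process sees, the encoding Python uses for filenames, and whether a non-ASCII name can be encoded with it. Newer Python 3 releases may coerce an unset or ``C`` locale to UTF-8, so the ASCII fallback from the post may not reproduce there.

```python
import locale
import os
import sys


def can_encode(name, encoding):
    """Return True if `name` can be encoded with `encoding` without error."""
    try:
        name.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def filename_encoding_report(sample_name="avatar_\u00e9\u00e8.png"):
    """Collect the settings that influence how this process encodes filenames.

    `sample_name` is a hypothetical non-ASCII filename used as a probe.
    """
    fs_encoding = sys.getfilesystemencoding()
    return {
        # Locale variables as seen by this process (None if unset).
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        # Encoding Python uses to convert str filenames to bytes.
        "filesystem_encoding": fs_encoding,
        # Encoding derived from the current locale settings.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Whether the probe filename survives that conversion.
        "sample_encodable": can_encode(sample_name, fs_encoding),
    }


if __name__ == "__main__":
    for key, value in filename_encoding_report().items():
        print("{}: {}".format(key, value))
```

Running this in the same environment as the web server process (for example, by logging the report from inside the WSGI application) should show whether the ``envvars`` change has actually reached Python.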