Django Uploads and UnicodeEncodeError
#####################################

:date: 2011-06-04 20:00
:tags: Django, Python, Linux, Unicode
:slug: django-uploads-and-unicodeencodeerror
:author: Brian Neal

Something strange happened that I wish to document in case it helps others. I
had to reboot my Ubuntu server while troubleshooting a disk problem. After the
reboot, I began receiving internal server errors whenever someone tried to view
a certain forum thread on my Django_ powered website. After some detective work,
I determined the cause: a user who had posted in the thread had an avatar
image whose filename contained non-ASCII characters. The image file had been
there for months, and I still cannot explain why the errors suddenly started.

The traceback I was getting ended with something like this:

.. sourcecode:: python

   File "/django/core/files/storage.py", line 159, in _open
     return File(open(self.path(name), mode))

   UnicodeEncodeError: 'ascii' codec can't encode characters in position 72-79: ordinal not in range(128)

So it appeared that the ``open()`` call was triggering the error. This led me on
a twisty Google search with many dead ends. Eventually I found a suitable
explanation. Apparently, Linux filesystems don't enforce a particular encoding
for filenames; Linux applications must decide how to interpret filenames on
their own. On Linux, Python uses environment variables to determine what locale
you are in, and the locale determines the encoding used for filenames. If these
environment variables are not set, Python falls back to ASCII (by default),
which was the source of my ``UnicodeEncodeError``.

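To see which encoding a given Python process will actually use for filenames, a
small check along these lines can help. This is only a sketch: the helper names
``describe_encodings`` and ``encode_filename`` and the sample filename are made
up for illustration, and the values reported vary with the Python version and
the environment the process was started in.

.. sourcecode:: python

   import locale
   import sys


   def describe_encodings():
       """Report the encodings this process uses (hypothetical helper)."""
       return {
           # Encoding Python uses to convert Unicode filenames for the OS.
           "filesystem": sys.getfilesystemencoding(),
           # Encoding implied by the current locale settings.
           "preferred": locale.getpreferredencoding(False),
       }


   def encode_filename(name, encoding):
       """Encode a Unicode filename with the given codec."""
       return name.encode(encoding)


   if __name__ == "__main__":
       for key, value in sorted(describe_encodings().items()):
           print("%s: %s" % (key, value))

       # A made-up filename containing one non-ASCII character.
       name = u"avatar_\u00e9.png"
       try:
           encode_filename(name, "ascii")
       except UnicodeEncodeError as exc:
           print("ascii failed: %s" % exc)
       print(repr(encode_filename(name, "utf-8")))

Encoding the sample name with the ASCII codec raises the same kind of
``UnicodeEncodeError`` as in the traceback, while UTF-8 handles it without
complaint.
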
So how do you tell a Python instance that is running under Apache / ``mod_wsgi``
about these environment variables? It turns out the answer is in the `Django
documentation`_, albeit in the ``mod_python`` integration section.

So, to fix the issue, I added the following lines to my ``/etc/apache2/envvars``
file:

.. sourcecode:: bash

   export LANG='en_US.UTF-8'
   export LC_ALL='en_US.UTF-8'

Note that you must fully stop and restart Apache for these changes to take
effect. I got tripped up at first because I did an ``apache2ctl graceful``,
and that was not sufficient to create a new environment.

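One way to confirm that the restarted Apache processes really picked up the new
settings is to serve a tiny diagnostic WSGI application temporarily. The
following is a minimal sketch rather than anything from the original setup: the
``application`` callable follows the standard WSGI interface, and how you mount
it under ``mod_wsgi`` is left to your own configuration.

.. sourcecode:: python

   import os
   import sys


   def application(environ, start_response):
       """Report the locale variables and filename encoding of this process."""
       lines = [
           "LANG=%s" % os.environ.get("LANG"),
           "LC_ALL=%s" % os.environ.get("LC_ALL"),
           "filesystem encoding=%s" % sys.getfilesystemencoding(),
       ]
       body = "\n".join(lines).encode("utf-8")
       headers = [
           ("Content-Type", "text/plain; charset=utf-8"),
           ("Content-Length", str(len(body))),
       ]
       start_response("200 OK", headers)
       return [body]

If the response still shows ``None`` for the variables or an ASCII filesystem
encoding, the server is most likely still running with its old environment.
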
.. _Django: http://djangoproject.com
.. _Django documentation: https://docs.djangoproject.com/en/1.3/howto/deployment/modpython/#if-you-get-a-unicodeencodeerror