changeset 4:7ce6393e6d30
Adding converted blog posts from old blog.
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/000-blog-reboot.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,65 @@

Blog reboot with Blogofile
##########################

:date: 2011-04-17 14:10
:tags: Blogging, Blogofile
:slug: blog-reboot-with-blogofile
:author: Brian Neal

Welcome to my new blog. I've been meaning to start blogging again for some time, especially since
the new version of SurfGuitar101.com_ went live almost two months ago. But the idea of dealing with
WordPress was putting me off. Don't get me wrong, WordPress really is a nice general purpose
blogging platform, but it didn't really suit me anymore.

I considered creating a new blog in Django_, but I really want to spend all my time and energy on
improving SurfGuitar101 and not tweaking my blog. I started thinking about doing something
simpler.

Almost by accident, I discovered Blogofile_ by seeing it mentioned in my Twitter feed. Blogofile is
a static blog generator written in Python. After playing with it for a while, I decided to use it
for a blog reboot. It is simple to use, Pythonic, and very configurable. The advantages for me of
going with a static blog are:

1. No more dealing with WordPress and plugin updates. To be fair, WordPress is very easy to update
   these days. Plugins are still a pain, and are often needed to display source code.
2. I can write my blog posts in Markdown_ or reStructuredText_ using my `favorite editor`_ instead
   of some lame Javascript editor. Formatting source code is dead simple now.
3. All of my blog content is under version control.
4. Easier to work offline.
5. Easier to deploy. Very little (if any) server configuration.
6. I can use version control with a post-commit hook to deploy the site.

Disadvantages:

1. Not as "dynamic". For my blog, this isn't really a problem. Comments can be handled by a service
   like Disqus_.
2. Regenerating the entire site can take time. This is only an issue if you have a huge blog with
   years of content. A fresh blog takes a fraction of a second to build, and I don't anticipate
   this affecting me for some time, if ever. I suspect Blogofile will be improved to include caching
   and smarter rebuilds in the future.

It should be noted that Blogofile seems to require Python 2.6 or later. My production server is
still running 2.5, and I can't easily change this for a while. This really only means I can't use
Mercurial with a *changegroup* hook to automatically deploy the site. This should only be a temporary
issue; I hope to upgrade the server in the future.

Blogofile comes with some scripts for importing WordPress blogs. Looking over my old posts, some of
them make me cringe. I think I'll save importing them for a rainy day.

The bottom line is, this style of blogging suits me as a programmer. I get to use all the same
tools I use to write code: a good text editor, the same markup I use for documentation, and version
control. Deployment is a snap, and I don't have a database or complicated server setup to maintain.
Hopefully this means I will blog more.

Finally, I'd like to give a shout-out to my friend `Trevor Oke`_, who just switched to a static blog
for many of the same reasons.


.. _SurfGuitar101.com: http://surfguitar101.com
.. _Django: http://djangoproject.com
.. _Blogofile: http://blogofile.com
.. _Markdown: http://daringfireball.net/projects/markdown/
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
.. _favorite editor: http://www.vim.org
.. _Disqus: http://disqus.com/
.. _Trevor Oke: http://trevoroke.com/2011/04/12/converting-to-jekyll.html

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/001-blogofile-rst.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,74 @@

Blogofile, reStructuredText, and Pygments
#########################################

:date: 2011-04-17 19:15
:tags: Blogofile, Pygments, reStructuredText
:slug: blogofile-restructuredtext-and-pygments
:author: Brian Neal

Blogofile_ has support out-of-the-box for reStructuredText_ and Pygments_. Blogofile's
``syntax_highlight.py`` filter wants you to mark your code blocks with a token such as
``$$code(lang=python)``. I wanted to use the method I am more familiar with: configuring
reStructuredText with a `custom directive`_. Luckily this is very easy. Here is how I did it.

First, I checked which version of Pygments I had, since I used Ubuntu's package manager to
install it. I then visited `Pygments on BitBucket`_, switched to the tag that matched my
version, and drilled into the ``external`` directory. There I found ``rst-directive.py``, which
I saved to my blog's local repository under the name ``_rst_directive.py``. I named it with a
leading underscore so that Blogofile would ignore it. If this bothers you, you could also add it
to Blogofile's ``site.file_ignore_patterns`` setting.

Next, I tweaked the settings in ``_rst_directive.py`` by un-commenting the ``linenos`` variant
(see the note at the end of this post).

All we have to do now is get Blogofile to import this module. This can be accomplished by making
use of the `pre_build() hook`_ in your ``_config.py`` file. This is a convenient place to hang
custom code that will run before your blog is built. I added the following code to my
``_config.py`` module:

.. sourcecode:: python

    def pre_build():
        # Register the Pygments Docutils directive
        import _rst_directive

This allows me to embed code in my ``.rst`` files with the ``sourcecode`` directive. For example,
here is what I typed to create the source code snippet above::

    .. sourcecode:: python

        def pre_build():
            # Register the Pygments Docutils directive
            import _rst_directive

Of course, to get it to look nice, we'll need some CSS. I used this Pygments command to generate
a ``.css`` file for the blog.

.. sourcecode:: bash

    $ pygmentize -f html -S monokai -a .highlight > pygments.css

I saved ``pygments.css`` in my ``css`` directory and updated my site template to link it in.
Blogofile will copy this file into my ``_site`` directory when I build the blog.

Here is what I added to my blog's main ``.css`` file to style the code snippets. The important thing
for me was to add an ``overflow: auto;`` setting. This will ensure that a scrollbar will
appear on long lines instead of the code being truncated.

.. sourcecode:: css

    .highlight {
        width: 96%;
        padding: 0.5em 0.5em;
        border: 1px solid #00ff00;
        margin: 1.0em auto;
        overflow: auto;
    }

That's it!

.. _Blogofile: http://blogofile.com
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
.. _Pygments: http://pygments.org/
.. _custom directive: http://pygments.org/docs/rstdirective/
.. _Pygments on BitBucket: https://bitbucket.org/birkenfeld/pygments-main
.. _pre_build() hook: http://blogofile.com/documentation/config_file.html#pre-build
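
A note on the ``linenos`` tweak mentioned above: in the copy of ``rst-directive.py`` I remember,
it amounts to un-commenting one entry in a ``VARIANTS`` dictionary. Roughly like this; check your
copy of the file, as the details may differ between Pygments versions:

.. sourcecode:: python

    # excerpt from _rst_directive.py (approximate, from Pygments'
    # external/rst-directive.py -- your copy may vary)
    from pygments.formatters import HtmlFormatter

    INLINESTYLES = False
    DEFAULT = HtmlFormatter(noclasses=INLINESTYLES)

    # The 'linenos' entry ships commented out; removing the '#'
    # enables the line-numbered variant of the directive.
    VARIANTS = {
        'linenos': HtmlFormatter(noclasses=INLINESTYLES, linenos=True),
    }
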

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/002-redis-whos-online.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,269 @@

A better "Who's Online" with Redis & Python
###########################################

:date: 2011-04-25 12:00
:tags: Redis, Python
:slug: a-better-who-s-online-with-redis-python
:author: Brian Neal

**Updated on December 17, 2011:** I found a better solution. Head on over to
the `new post`_ to check it out.


Who's What?
-----------

My website, like many others, has a "who's online" feature. It displays the
names of authenticated users that have been seen over the course of the last ten
minutes or so. It may seem a minor feature at first, but I find it really does a lot to
"humanize" the site and make it seem more like a community gathering place.

My first implementation of this feature used the MySQL database to update a
per-user timestamp whenever a request from an authenticated user arrived.
Actually, this seemed excessive to me, so I used a strategy involving an "online"
cookie that has a five minute expiration time. Whenever I see an authenticated
user without the online cookie I update their timestamp and then hand them back
a cookie that will expire in five minutes. In this way I don't have to hit the
database on every single request.

This approach worked fine, but it had some aspects that didn't sit right with me:

* It seems like overkill to use the database to store temporary, trivial information like
  this. It doesn't feel like a good use of a full-featured relational database
  management system (RDBMS).
* I am writing to the database during a GET request. Ideally, all GET requests should
  be idempotent. Of course if this is strictly followed, it would be
  impossible to create a "who's online" feature in the first place. You'd have
  to require the user to POST data periodically. However, writing to a RDBMS
  during a GET request is something I feel guilty about and try to avoid when I
  can.


Redis
-----

Enter Redis_. I discovered Redis recently, and it is pure, white-hot
awesomeness. What is Redis? It's one of those projects that gets slapped with
the "NoSQL" label. And while I'm still trying to figure that buzzword out, Redis makes
sense to me when described as a lightweight data structure server.
Memcached_ can store key-value pairs very fast, where the value is always a string.
Redis goes one step further and stores not only strings, but data
structures like lists, sets, and hashes. For a great overview of what Redis is
and what you can do with it, check out `Simon Willison's Redis tutorial`_.

Another reason why I like Redis is that it is easy to install and deploy.
It is straight C code without any dependencies. Thus you can build it from
source just about anywhere. Your Linux distro may have a package for it, but it
is just as easy to grab the latest tarball and build it yourself.

I've really come to appreciate Redis for being such a small and lightweight
tool. At the same time, it is very powerful and effective for filling those
tasks that a traditional RDBMS is not good at.

For working with Redis in Python, you'll need to grab Andy McCurdy's redis-py_
client library. It can be installed with a simple

.. sourcecode:: sh

    $ sudo pip install redis


Who's Online with Redis
-----------------------

Now that we are going to use Redis, how do we implement a "who's online"
feature? The first step is to get familiar with the `Redis API`_.
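
The interactive interpreter is a great place to do that. Here is an illustrative redis-py
session against a local server with default settings; the key names are throwaways, and the
exact return values can differ between redis-py versions:

.. sourcecode:: python

    >>> import redis
    >>> conn = redis.Redis(host='localhost', port=6379, db=0)
    >>> conn.sadd('demo_current', 'alice')  # the first SADD creates the set
    True
    >>> conn.sadd('demo_current', 'bob')
    True
    >>> conn.sunion(['demo_current', 'demo_old'])  # missing keys act as empty sets
    set(['alice', 'bob'])
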
One approach to the "who's online" problem is to add a user name to a set
whenever we see a request from that user. That's fine, but how do we know when
they have stopped browsing the site? We have to periodically clean out the
set in order to time people out. A cron job, for example, could delete the
set every five minutes.

A small problem with deleting the set is that people will abruptly disappear
from the site every five minutes. In order to give more gradual behavior we
could utilize two sets, a "current" set and an "old" set. As users are seen, we
add their names to the current set. Every five minutes or so (season to taste),
we simply overwrite the old set with the contents of the current set, then clear
out the current set. At any given time, the set of who's online is the union
of these two sets.

This approach doesn't give exact results of course, but it is perfectly fine for my site.

Looking over the Redis API, we see that we'll be making use of the following
commands:

* SADD_ for adding members to the current set.
* RENAME_ for copying the current set to the old, as well as destroying the
  current set, all in one step.
* SUNION_ for performing a union on the current and old sets to produce the set
  of who's online.

And that's it! With these three primitives we have everything we need. This is
because of the following useful Redis behaviors:

* Performing a ``SADD`` against a set that doesn't exist creates the set and is
  not an error.
* Performing a ``SUNION`` with sets that don't exist is fine; they are simply
  treated as empty sets.

The one caveat involves the ``RENAME`` command. If the key you wish to rename
does not exist, the Python Redis client treats this as an error and an exception
is thrown.

Experimenting with algorithms and ideas is quite easy with Redis. You can either
use the Python Redis client in a Python interactive interpreter shell, or you can
use the command-line client that comes with Redis. Either way you can quickly
try out commands and refine your approach.


Implementation
--------------

My website is powered by Django_, but I am not going to show any Django-specific
code here. Instead I'll show just the pure Python parts, and hopefully you can
adapt it to whatever framework, if any, you are using.

I created a Python module to hold this functionality:
``whos_online.py``. Throughout this module I use a lot of exception handling,
mainly because if the Redis server has crashed (or if I forgot to start it, say
in development) I don't want my website to be unusable. If Redis is unavailable,
I simply log an error and drive on. Note that in my limited experience Redis is
very stable and has not crashed on me once, but it is good to be defensive.

The first important function used throughout this module obtains
a connection to the Redis server:

.. sourcecode:: python

    import logging
    import redis

    logger = logging.getLogger(__name__)

    def _get_connection():
        """
        Create and return a Redis connection. Returns None on failure.
        """
        try:
            conn = redis.Redis(host=HOST, port=PORT, db=DB)
            return conn
        except redis.RedisError, e:
            logger.error(e)

        return None

The ``HOST``, ``PORT``, and ``DB`` constants can come from a
configuration file, or they could be module-level constants. In my case they are set in my
Django ``settings.py`` file. Once we have this connection object, we are free to
use the Redis API exposed via the Python Redis client.
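
With a connection in hand, the ``RENAME`` caveat described above is easy to see for yourself.
An illustrative session (the keys are throwaways, and the output of the setup call is omitted):

.. sourcecode:: python

    >>> conn = _get_connection()
    >>> conn.delete('wo_user_current')  # make sure the key does not exist
    >>> conn.rename('wo_user_current', 'wo_user_old')
    Traceback (most recent call last):
      ...
    ResponseError: no such key
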
To update the current set whenever we see a user, I call this function:

.. sourcecode:: python

    # Redis key names:
    USER_CURRENT_KEY = "wo_user_current"
    USER_OLD_KEY = "wo_user_old"

    def report_user(username):
        """
        Call this function when a user has been seen. The username will be added to
        the current set.
        """
        conn = _get_connection()
        if conn:
            try:
                conn.sadd(USER_CURRENT_KEY, username)
            except redis.RedisError, e:
                logger.error(e)

If you are using Django, a good spot to call this function is from a piece
of `custom middleware`_. I kept my "5 minute cookie" algorithm to avoid doing this on
every request, although it is probably unnecessary on my low traffic site.

Periodically you need to "age out" the sets by destroying the old set, moving
the current set to the old set, and then emptying the current set.

.. sourcecode:: python

    def tick():
        """
        Call this function to "age out" the old set by renaming the current set
        to the old.
        """
        conn = _get_connection()
        if conn:
            # An exception may be raised if the current key doesn't exist; if that
            # happens we have to delete the old set because no one is online.
            try:
                conn.rename(USER_CURRENT_KEY, USER_OLD_KEY)
            except redis.ResponseError:
                try:
                    del conn[USER_OLD_KEY]
                except redis.RedisError, e:
                    logger.error(e)
            except redis.RedisError, e:
                logger.error(e)

As mentioned previously, if no one is on your site, eventually your current set
will cease to exist as it is renamed and not populated further. If you attempt to
rename a non-existent key, the Python Redis client raises a ``ResponseError`` exception.
If this occurs we just manually delete the old set. In a bit of Pythonic cleverness,
the Python Redis client supports the ``del`` syntax for this operation.

The ``tick()`` function can be called periodically by a cron job, for example. If you are using Django,
you could create a `custom management command`_ that calls ``tick()`` and schedule cron
to execute it. Alternatively, you could use something like Celery_ to schedule a
job to do the same. (As an aside, Redis can be used as a back-end for Celery, something that I hope
to explore in the near future.)

Finally, you need a way to obtain the current "who's online" set, which again is
a union of the current and old sets.

.. sourcecode:: python

    def get_users_online():
        """
        Returns a set of user names which is the union of the current and old
        sets.
        """
        conn = _get_connection()
        if conn:
            try:
                # Note that keys that do not exist are considered empty sets
                return conn.sunion([USER_CURRENT_KEY, USER_OLD_KEY])
            except redis.RedisError, e:
                logger.error(e)

        return set()

In my Django application, I call this function from a `custom inclusion template tag`_.


Conclusion
----------

I hope this blog post gives you some idea of the usefulness of Redis. I expanded
on this example to also keep track of non-authenticated "guest" users. I simply added
another pair of sets to track IP addresses.

If you are like me, you are probably already thinking about shifting some functions that you
awkwardly jammed onto a traditional database over to Redis and other "NoSQL"
technologies.

.. _Redis: http://redis.io/
.. _Memcached: http://memcached.org/
.. _Simon Willison's Redis tutorial: http://simonwillison.net/static/2010/redis-tutorial/
.. _redis-py: https://github.com/andymccurdy/redis-py
.. _Django: http://djangoproject.com
.. _Redis API: http://redis.io/commands
.. _SADD: http://redis.io/commands/sadd
.. _RENAME: http://redis.io/commands/rename
.. _SUNION: http://redis.io/commands/sunion
.. _custom middleware: http://docs.djangoproject.com/en/1.3/topics/http/middleware/
.. _custom management command: http://docs.djangoproject.com/en/1.3/howto/custom-management-commands/
.. _Celery: http://celeryproject.org/
.. _custom inclusion template tag: http://docs.djangoproject.com/en/1.3/howto/custom-template-tags/#inclusion-tags
.. _new post: http://deathofagremmie.com/2011/12/17/who-s-online-with-redis-python-a-slight-return/

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/003-nl2br-markdown-ext.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,100 @@

A newline-to-break Python-Markdown extension
############################################

:date: 2011-05-09 22:40
:tags: Markdown, Python
:slug: a-newline-to-break-python-markdown-extension
:author: Brian Neal

When I launched a new version of my website, I decided the new forums would use
Markdown_ instead of BBCode_ for the markup. This decision was mainly a personal
one for aesthetic reasons. I felt that Markdown was more natural to write compared
to the clunky square brackets of BBCode.

My new site is coded in Python_ using the Django_ framework. For a Markdown implementation
I chose `Python-Markdown`_.

My mainly non-technical users seemed largely ambivalent about the change from
BBCode to Markdown. This was probably because I gave them a nice Javascript editor
(`MarkItUp!`_) which inserted the correct markup for them.

However, shortly after launch, one particular feature of Markdown really riled up
some users: the default line break behavior. In strict Markdown, to create a new
paragraph, you must insert a blank line between paragraphs. Hard returns (newlines)
are simply ignored, just like they are in HTML. You can, however, force a break by
ending a line with two blank spaces. This isn't very intuitive, unlike the rest of
Markdown.

Now I agree the default behavior is useful if you are creating an online document, like a blog post.
However, non-technical users really didn't understand this behavior at all in the context
of a forum post. For example, many of my users post radio-show playlists, formatted with
one song per line. When such a playlist was pasted into a forum post, Markdown made it
all one giant run-together paragraph. This did not please my users. Arguably, they should
have used a Markdown list. But it became clear teaching people the new syntax wasn't
going to work, especially when it used to work just fine in BBCode and they had created
their playlists in the same way for several years.

It turns out I am not alone in my observations (or on the receiving end of user wrath). Other,
much larger sites, like StackOverflow_ and GitHub_, have altered their Markdown parsers
to treat newlines as hard breaks. How can this be done with Python-Markdown?

It turns out this is really easy. Python-Markdown was designed with user customization
in mind by offering an extension facility. The `extension documentation`_ is good,
and you can find extension-writing help on the friendly `mailing list`_.

Here is a simple extension for Python-Markdown that turns newlines into HTML ``<br />`` tags.

.. sourcecode:: python

    """
    A python-markdown extension to treat newlines as hard breaks; like
    StackOverflow and GitHub flavored Markdown do.

    """
    import markdown


    BR_RE = r'\n'

    class Nl2BrExtension(markdown.Extension):

        def extendMarkdown(self, md, md_globals):
            br_tag = markdown.inlinepatterns.SubstituteTagPattern(BR_RE, 'br')
            md.inlinePatterns.add('nl', br_tag, '_end')


    def makeExtension(configs=None):
        return Nl2BrExtension(configs)

I saved this code in a file called ``mdx_nl2br.py`` and put it on my ``PYTHONPATH``. You can then use
it in a Django template like this:

.. sourcecode:: django

    {{ value|markdown:"nl2br" }}

To use the extension in Python code, something like this should do the trick:

.. sourcecode:: python

    import markdown
    md = markdown.Markdown(safe_mode=True, extensions=['nl2br'])
    converted_text = md.convert(text)

**Update (June 21, 2011):** This extension is now being distributed with
Python-Markdown! See `issue 13 on github`_ for the details. Thanks to Waylan
Limberg for the help in creating the extension and for including it with
Python-Markdown.


.. _Markdown: http://daringfireball.net/projects/markdown/
.. _BBCode: http://en.wikipedia.org/wiki/BBCode
.. _Python: http://python.org
.. _Django: http://djangoproject.com
.. _MarkItUp!: http://markitup.jaysalvat.com/home/
.. _StackOverflow: http://blog.stackoverflow.com/2009/10/markdown-one-year-later/
.. _GitHub: http://github.github.com/github-flavored-markdown/
.. _Python-Markdown: http://www.freewisdom.org/projects/python-markdown/
.. _extension documentation: http://www.freewisdom.org/projects/python-markdown/Writing_Extensions
.. _mailing list: http://lists.sourceforge.net/lists/listinfo/python-markdown-discuss
.. _issue 13 on github: https://github.com/waylan/Python-Markdown/issues/13
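
To see the extension's effect, here is an illustrative interpreter session; the exact HTML
output may vary a bit between Python-Markdown versions:

.. sourcecode:: python

    >>> import markdown
    >>> text = "Song one\nSong two"
    >>> markdown.markdown(text)  # stock behavior: the newline is ignored
    u'<p>Song one\nSong two</p>'
    >>> markdown.markdown(text, extensions=['nl2br'])  # newline becomes a break
    u'<p>Song one<br />\nSong two</p>'
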

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/004-fructose-contrib.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,42 @@

I contributed to Fructose
#########################

:date: 2011-05-31 21:40
:tags: Fructose, C++, Python, UnitTesting
:slug: i-contributed-to-fructose
:author: Brian Neal

At work we started using CxxTest_ as our unit testing framework. We like it because
it is very light-weight and easy to use. We've gotten a tremendous amount of benefit
from using a unit testing framework, much more than I had ever imagined. We now have
almost 700 tests, and I cannot imagine going back to the days of no unit tests or ad-hoc
testing. It is incredibly reassuring to see all the tests pass after making a significant
change to the code base. There is no doubt in my mind that our software-hardware integration
phases have gone much smoother thanks to our unit tests.

Sadly, it seems CxxTest is no longer actively supported. However, this is not of great
concern to us. The code is so small that we are fairly confident we could tweak it if necessary.

I recently discovered Fructose_, a unit testing framework written by Andrew Marlow. It has
similar goals of being small and simple to use. One thing I noticed that CxxTest had that
Fructose did not was a Python code generator that took care of creating the ``main()`` function
and registering all the tests with the framework. Since C++ has so few introspection
capabilities, C++ unit testing frameworks have historically laid the burden of registering
tests on the programmer. Some use macros to help with this chore, but littering your code
with ugly macros makes tests annoying to write. And if anything, you want your tests to be
easy to write so your colleagues will write lots of tests. CxxTest approached this problem by
providing first a Perl script, then later a Python script, to automate this part of the process
(a toy sketch of this idea appears at the end of this post).

I decided it would be interesting to see if I could provide such a script for Fructose. After
a Saturday of hacking, I'm happy to say Andrew has accepted the script and it now ships with
Fructose version 1.1.0. I hope to improve the script to not only run all the tests but to also
print out a summary of the number of tests that passed and failed at the end, much like CxxTest does.
This will require some changes to the C++ code. Also on my wish list is to make the script
extensible, so that others can easily change the output and code generation to suit their needs.

I've hosted the code for the Python script, which I call ``fructose_gen.py``, on Bitbucket_.
Feedback is greatly appreciated.

.. _CxxTest: http://cxxtest.tigris.org/
.. _Fructose: http://fructose.sourceforge.net/
.. _Bitbucket: https://bitbucket.org/bgneal/fructose_gen/src
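
To give a flavor of what a test-registration generator does, here is a toy sketch. This is
*not* ``fructose_gen.py``; it only illustrates the idea, under some big assumptions: every test
is a method whose name starts with ``test``, and all tests live in a single hypothetical
fixture class called ``my_test``.

.. sourcecode:: python

    # toy_gen.py - illustration only; not the real fructose_gen.py
    import re
    import sys

    TEST_METHOD_RE = re.compile(r'void\s+(test\w*)\s*\(')

    def generate_main(source_files):
        """Scan C++ sources for test methods and emit a main() that
        registers each one with the (hypothetical) my_test fixture."""
        tests = []
        for path in source_files:
            with open(path) as f:
                tests.extend(TEST_METHOD_RE.findall(f.read()))

        lines = ['int main(int argc, char* argv[])', '{', '    my_test t;']
        for name in tests:
            lines.append('    t.add_test("%s", &my_test::%s);' % (name, name))
        lines.append('    return t.run(argc, argv);')
        lines.append('}')
        return '\n'.join(lines)

    if __name__ == '__main__':
        print generate_main(sys.argv[1:])
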

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/005-django-unicode-error-uploads.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,53 @@

Django Uploads and UnicodeEncodeError
#####################################

:date: 2011-06-04 20:00
:tags: Django, Python, Linux, Unicode
:slug: django-uploads-and-unicodeencodeerror
:author: Brian Neal

Something strange happened that I wish to document in case it helps others. I
had to reboot my Ubuntu server while troubleshooting a disk problem. After the
reboot, I began receiving internal server errors whenever someone tried to view
a certain forum thread on my Django_ powered website. After some detective work,
I determined it was because a user that had posted in the thread had an avatar
image whose filename contained non-ASCII characters. The image file had been
there for months, and I still cannot explain why the error just suddenly started
happening.

The traceback I was getting ended with something like this:

.. sourcecode:: python

    File "/django/core/files/storage.py", line 159, in _open
        return File(open(self.path(name), mode))

    UnicodeEncodeError: 'ascii' codec can't encode characters in position 72-79: ordinal not in range(128)

So it appeared that the ``open()`` call was triggering the error. This led me on
a twisty Google search which had many dead ends. Eventually I found a suitable
explanation. Apparently, Linux filesystems don't enforce a particular Unicode
encoding for filenames. Linux applications must decide how to interpret
filenames all on their own. The Python OS library (on Linux) uses environment
variables to determine what locale you are in, and this chooses the encoding for
filenames. If these environment variables are not set, Python falls back to
ASCII (by default), and hence the source of my ``UnicodeEncodeError``.

So how do you tell a Python instance that is running under Apache / ``mod_wsgi``
about these environment variables? It turns out the answer is in the `Django
documentation`_, albeit in the ``mod_python`` integration section.

So, to fix the issue, I added the following lines to my ``/etc/apache2/envvars``
file:

.. sourcecode:: bash

    export LANG='en_US.UTF-8'
    export LC_ALL='en_US.UTF-8'

Note that you must cold stop and restart Apache for these changes to take
effect. I got tripped up at first because I did an ``apache2ctl graceful``, and
that was not sufficient to create a new environment.

.. _Django: http://djangoproject.com
.. _Django documentation: https://docs.djangoproject.com/en/1.3/howto/deployment/modpython/#if-you-get-a-unicodeencodeerror
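
The underlying behavior is easy to reproduce outside of Django and Apache. In a Python 2
shell on Linux with the locale variables unset, the filesystem encoding falls back to ASCII.
An illustrative session (the exact encoding name and message may differ on your system):

.. sourcecode:: python

    >>> import sys
    >>> sys.getfilesystemencoding()  # with LANG/LC_ALL unset
    'ANSI_X3.4-1968'
    >>> open(u'avatar-caf\xe9.png')
    Traceback (most recent call last):
      ...
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in
    position 10: ordinal not in range(128)
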

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/006-nl2br-in-python-markdown.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,16 @@

My newline-to-break extension now shipping with Python-Markdown
###############################################################

:date: 2011-06-21 22:15
:tags: Markdown, Python
:slug: my-newline-to-break-extension-now-shipping-with-python-markdown
:author: Brian Neal

Here is a quick update on a `previous post`_ I made about a newline-to-break
extension for `Python-Markdown`_. I'm very happy to report that the extension will
now be `shipping with Python-Markdown`_! Thanks to developer Waylan Limberg for
including it!

.. _previous post: http://deathofagremmie.com/2011/05/09/a-newline-to-break-python-markdown-extension/
.. _Python-Markdown: http://www.freewisdom.org/projects/python-markdown/
.. _shipping with Python-Markdown: https://github.com/waylan/Python-Markdown/issues/13

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/007-subversion-contrib.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,143 @@

Contributing to open source - a success story and advice for newbies
####################################################################

:date: 2011-06-23 21:45
:tags: Subversion, OpenSource
:slug: contributing-to-open-source-a-success-story-and-advice-for-newbies
:author: Brian Neal

Recently, my team at work found a `bug in Subversion`_, I submitted a patch, and it
was accepted! This was very exciting for me, so I thought I would share this
story in the hopes of inspiring others to contribute to open source projects.
It may not be as hard as you might think!

The Bug
=======

We use Subversion_ at work for revision control. My colleague and I were trying
to merge a branch back to trunk when we ran into some strange behavior. We make
use of Subversion properties, which allow you to attach arbitrary metadata to
files and directories. Our project has to deliver our source code and
documentation to the customer in a required directory format (can you guess who
our customer is?). However, not all files need to be sent to the customer. To
solve this problem we use a simple "yes/no" delivery property on each file to
control whether it is delivered or not. Before making a delivery, a script is
run that prunes out the files that have the delivery flag set to "no".

When our merge was completed, many files were marked as having merge conflicts
on the delivery property. Looking through the logs, we discovered that after
we had made our branch, someone had changed the delivery property on some files
to "yes" on the trunk. Someone else had also changed the delivery property
independently to "yes" on the branch. When we attempted to merge the branch back
to trunk, we were getting merge conflicts, even though we were trying to change
the delivery property value to "yes" on both the trunk and branch. Why was this
a conflict? This didn't seem quite right.

I signed up for the Subversion users' mailing list and made a short post
summarizing our issue. I later learned that it is proper etiquette to attach a
bash script that can demonstrate the problem. Despite this, a Subversion developer
took interest in my post and created a test script in an attempt to reproduce our
issue. At first it looked like he could not reproduce the problem. However, another
helpful user pointed out a bug in his script. Once this was fixed, the developer
declared our problem a genuine bug and created a ticket for it in the issue tracker.

The Patch
=========

Impressed by all of this, I thanked him for his efforts and tentatively asked if
I could help. The developer told me which file and function he thought the
problem might be in. I downloaded the Subversion source and began looking at the
code. I was fairly impressed with the code quality, so I decided I would try to
create a patch for the bug over the weekend. We really wanted that bug fixed,
and I was genuinely curious to see if I would be able to figure something out.
It would be an interesting challenge and a test of my novice open source skills.

When the weekend came, I began a more thorough examination of the Subversion
website. The Subversion team has done a great job in providing documentation on
their development process. This includes a contributing guide and patch
submittal process.
I also discovered they had recently added a makefile that
downloads the Subversion source code and the source for all of Subversion's
dependencies. The makefile then builds everything with debugging turned on. Wow! It
took me a few tries to get this working, but the problems were because I did not
have all the development tools installed on my Ubuntu box. Once this was
sorted, everything went smoothly, and in a matter of minutes I had a Subversion
executable I could run under the gdb debugger. Nice!

I studied the code for about an hour, peeking and poking at a few things in the
debugger. I used the script the developer wrote to recreate the problem. I
wasn't quite sure what I was doing, as I was brand new to this code base. But
the code was clearly written and commented well. My goal was to get a patch that
was in the 80-100% complete range. I wanted to do enough work that a core
developer would be able to see what I was doing and either commit it outright or
easily fill in the parts that I missed. After a while I thought I had a solution
and generated a patch. I sent it to the Subversion developers' mailing list as
per the contributing guide.

The Wait
========

Next I began probably the worst part for a contributor: I had to wait and see if
I got any feedback. On some open source projects a patch may languish for months.
It all depends on the number of developers and how busy they are. My chances
didn't look good, as the developers were in the initial stages of getting a
beta version of 1.7 out the door. It was also not clear to me who "owned" the
issue tracker. On some projects, the issue tracker is wide open to the
community. Was I supposed to update the ticket? I wasn't quite sure, and the
contributing guide was silent on this issue. I eventually concluded I was not;
it looked like only committers were using the tracker. Patches were being
discussed on the mailing list instead of in the tracker. This is a bit different
from some projects I am familiar with.

I didn't have to wait long. After a few days, the original developer who
confirmed my bug took interest again. He looked at my patch and thought I had
missed something. He suggested a change and asked for my opinion. I looked at
the code again; it seemed like a good change, and I told him I agreed. I also
warned him that I was brand new to the code, and to take my opinion with a grain of
salt. After running my change against the tests, he then committed my patch!
One small victory for open source!

Final Thoughts
==============

So what went right here? I have to hand it to the Subversion team. They have
been in business a long time, and they have excellent documentation for people
who want to contribute. The makefile they created that sets up a complete
development environment most definitely tipped the scale for me and enabled me
to create my patch. Without that, I'm not sure I would have had the time or
patience to get all that unfamiliar source code built. The Subversion team has
really worked hard at lowering the barrier for new contributors.

My advice to people who want to contribute to open source but aren't quite sure
how to go about doing it:

- Spend some time reading the documentation. This includes any FAQs and
  contributor guides (if any).
- Monitor the user and developer mailing lists to get a feel for how the
  community operates. Each project has different customs and traditions.
- You may also wish to hang out on the project's IRC channel for the same
  reason.
- When writing on the mailing lists, be extremely concise and polite.
  You don't want to waste anyone's time, and you don't want to
  be seen as someone who thinks they are entitled to a fix. Just remember you
  are the new guy. You can't just barge in and make demands.
- Ask how you can help. Nothing makes a developer happier than someone asking how
  they can help. Remember, most of the people in the community are volunteers.
- Open source can sometimes be "noisy". There will be people who
  won't quite understand your issue and may hurriedly suggest an incorrect solution or give
  incomplete advice. Study their responses and be polite. You may also wish to resist the temptation
  to reply right away. This is especially hard when you are new and you don't
  know who the "real" developers are. However, you should assume everyone is trying to
  help.
- Finally, be patient. Again, most folks are volunteers. They have real jobs,
  families, and lives. The project may also be preoccupied with other tasks, like
  getting a beta out the door. Now may not be a good time for a brand new
  feature, or your bug may not be considered a show stopper by the majority of
  the community.

A big thank-you to Stefan Sperling from the Subversion team, who shepherded my
bug report and patch through their process.

I hope this story encourages you to contribute to open source software!

.. _bug in Subversion: http://subversion.tigris.org/issues/show_bug.cgi?id=3919
.. _Subversion: http://subversion.apache.org/

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/008-oauth-python-gdata.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,219 @@

Implementing OAuth using Google's Python Client Library
#######################################################

:date: 2011-07-04 13:00
:tags: Python, OAuth, Google, GData
:slug: implementing-oauth-using-google-s-python-client-library
:author: Brian Neal

My Django_ powered website allows users to submit events for a site calendar
that is built upon Google Calendar. After an admin approves events, I use
Google's `Python Client Library`_ to add, delete, or update events on the Google
calendar associated with my personal Google account. I wrote this application a
few years ago, and it used the ClientLogin_ method for authentication. I
recently decided to upgrade this to the OAuth_ authentication method. The
ClientLogin method isn't very secure, and it doesn't play well with Google's
`two-step verification`_. After hearing about a friend who had his GMail account
compromised and all his email deleted, I decided it was long past due to get
two-step verification on my account. But first I needed to upgrade my web
application to OAuth.

In this post I'll boil down the code I used to implement the elaborate OAuth
dance. It really isn't that much code, but the Google documentation is somewhat
confusing and scattered across a bewildering number of documents. I found at
least one error in the documentation that I will point out. Although I am using
Django, I will omit details specific to Django where I can.

In addition to switching from ClientLogin to OAuth, I also upgraded to version
2.0 of the Google Data API. This had more implications for my calendar-specific
code, and perhaps I can go over that in a future post.

Getting started and registering with Google
===========================================

To understand the basics of OAuth, I suggest you read `OAuth 1.0 for Web
Applications`_. I decided to go for maximum security and use RSA-SHA1 signing on
all my requests to Google. This requires that I verify my domain and then
`register my application`_ with Google, which includes uploading a security
certificate. Google provides documentation that describes how you can `create a
self-signing private key and certificate`_ using OpenSSL.

Fetching a Request Token and authorizing access
===============================================

To perform the first part of the OAuth dance, you must ask Google for a request
token. When you make this request, you state the "scope" of your future work by
listing the Google resources you are going to access. In our case, this is the
calendar resources. You also provide a "consumer key" that Google assigned to
you when you registered your application. This allows Google to retrieve the
security certificate you previously uploaded when you registered. This is very
important because this request is going to be signed with your private key.
Fortunately the Python library takes care of all the signing details; you simply
must provide your private key in PEM format. And finally, you provide a
"callback URL" that Google will send your browser to after you (or your users)
have manually authorized this request.

Once you have received the request token from Google, you have to squirrel it
away somewhere, then redirect your (or your user's) browser to a Google
authorization page.
Once the user has authorized your application, Google sends
the browser to the callback URL to continue the process. Here I show the
distilled code I used to ask for a request token and then send the user to the
authorization page.

.. sourcecode:: python

    import gdata.gauth
    from gdata.calendar_resource.client import CalendarResourceClient

    USER_AGENT = 'mydomain-myapp-v1'  # my made-up user agent string

    client = CalendarResourceClient(None, source=USER_AGENT)

    # obtain my private key that I saved previously on the filesystem:
    with open(settings.GOOGLE_OAUTH_PRIVATE_KEY_PATH, 'r') as f:
        rsa_key = f.read()

    # Ask for a request token:
    # scopes - a list of scope strings that the request token is for. See
    #    http://code.google.com/apis/gdata/faq.html#AuthScopes
    # callback_url - URL to send the user after authorizing our app

    scopes = ['https://www.google.com/calendar/feeds/']
    callback_url = 'http://example.com/some/url/to/callback'

    request_token = client.GetOAuthToken(
        scopes,
        callback_url,
        settings.GOOGLE_OAUTH_CONSUMER_KEY,  # from the registration process
        rsa_private_key=rsa_key)

    # Before redirecting, save the request token somewhere; here I place it in
    # the session (this line is Django specific):
    request.session[REQ_TOKEN_SESSION_KEY] = request_token

    # Generate the authorization URL.
    # Despite the documentation, don't do this:
    # auth_url = request_token.generate_authorization_url(domain=None)
    # Do this instead if you are not using a Google Apps domain:
    auth_url = request_token.generate_authorization_url()

    # Now redirect the user somehow to the auth_url; here is how you might do
    # it in Django:
    return HttpResponseRedirect(auth_url)

A couple of notes on the above:

* You don't have to use ``CalendarResourceClient``; it just made the most sense
  for me since I am doing calendar stuff later on. Any class that inherits from
  ``gdata.client.GDClient`` will work. You might be able to use that class
  directly. Google uses ``gdata.docs.client.DocsClient`` in their examples.
* I chose to store my private key in a file rather than the database. If you do
  so, it's probably a good idea to make the file readable only to the user your
  webserver runs your application as.
* After getting the request token you must save it somehow. You can save it in
  the session, the database, or perhaps a file. Since this is only temporary, I
  chose to save it in the session. The code I have here is Django specific.
* When generating the authorization URL, don't pass in ``domain=None`` if you
  aren't using a Google Apps domain, like the documentation states. This appears
  to be an error in the documentation. Just omit it and let it use the default
  value of ``"default"`` (see the source code).
* After using the request token to generate the authorization URL, redirect the
  browser to it.

Extracting and upgrading to an Access Token
===========================================

The user will then be taken to a Google authorization page. The page will show the
user what parts of their Google account your application is trying to access
using the information you provided in the ``scopes`` parameter. If the user
accepts, Google will then redirect the browser to your callback URL where we can
complete the process.

The code running at our callback URL must retrieve the request token that we
saved earlier, and combine that with certain ``GET`` parameters Google attached
to our callback URL.
This is all done for us by the Python library. We then send
this new token back to Google to upgrade it to an actual access token. If this
succeeds, we can then save this new access token in our database for use in
subsequent Google API operations. The access token is a Python object, so you
can serialize it using the pickle module, or use routines provided by Google
(shown below).

.. sourcecode:: python

    # Code running at our callback URL:
    # Retrieve the request token we saved earlier in our session
    saved_token = request.session[REQ_TOKEN_SESSION_KEY]

    # Authorize it by combining it with GET parameters received from Google
    request_token = gdata.gauth.AuthorizeRequestToken(saved_token,
        request.build_absolute_uri())

    # Upgrade it to an access token
    client = CalendarResourceClient(None, source=USER_AGENT)
    access_token = client.GetAccessToken(request_token)

    # Now save access_token somewhere, e.g. a database. So first serialize it:
    access_token_str = gdata.gauth.TokenToBlob(access_token)

    # Save to database (details omitted)

Some notes on the above code:

* Once called back, our code must retrieve the request token we saved in our
  session. The code shown is specific to Django.
* We then combine this saved request token with certain ``GET`` parameters that
  Google added to our callback URL. The ``AuthorizeRequestToken`` function takes care of
  those details for us. The second argument to that function requires the full URL
  including ``GET`` parameters as a string. Here I populate that argument by
  using a Django-specific method of retrieving that information.
* Finally, you upgrade your token to an access token by making one last call to
  Google. You should now save a serialized version of this access token in your
  database for future use.

Using your shiny new Access Token
=================================

Once you have saved your access token, you won't have to do this crazy dance
again until the token either expires, or the user revokes your application's
access to the Google account. To use it in a calendar operation, for example,
you simply retrieve it from your database, deserialize it, and then use it to
create a ``CalendarClient``.

.. sourcecode:: python

    from gdata.calendar.client import CalendarClient

    # retrieve access token from the database:
    access_token_str = ...
    access_token = gdata.gauth.TokenFromBlob(access_token_str)

    client = CalendarClient(source=USER_AGENT, auth_token=access_token)

    # now use client to make calendar operations...

Conclusion
==========

The main reason I wrote this blog post is that I wanted to show a concrete example of
using RSA-SHA1 and version 2.0 of the Google API together. All of the
information I have presented is in the Google documentation, but it is spread
across several documents and jumbled up with example code for version 1.0 and
HMAC-SHA1. Do not be afraid to look at the source code for the Python client
library. Despite Google's strange habit of ignoring PEP-8_ and using
LongJavaLikeMethodNames, the code is logical and easy to read. Their library is
built up in layers, and you may have to dip down a few levels to find out what
is going on, but it is fairly straightforward to read if you combine it with
their online documentation.

I hope someone finds this useful. Your feedback is welcome.


.. _Django: http://djangoproject.com
.. _Python Client Library: http://code.google.com/apis/calendar/data/2.0/developers_guide_python.html
.. _ClientLogin: http://code.google.com/apis/calendar/data/2.0/developers_guide_python.html#AuthClientLogin
.. _OAuth: http://code.google.com/apis/gdata/docs/auth/oauth.html
.. _two-step verification: http://googleblog.blogspot.com/2011/02/advanced-sign-in-security-for-your.html
.. _OAuth 1.0 for Web Applications: http://code.google.com/apis/accounts/docs/OAuth.html
.. _register my application: http://code.google.com/apis/accounts/docs/RegistrationForWebAppsAuto.html
.. _create a self-signing private key and certificate: http://code.google.com/apis/gdata/docs/auth/oauth.html#GeneratingKeyCert
.. _PEP-8: http://www.python.org/dev/peps/pep-0008/

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/009-windows-trac-upgrade.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,43 @@

Upgrading Trac on Windows Gotchas
#################################

:date: 2011-09-12 22:15
:tags: Python, Trac, Subversion, Windows
:slug: upgrading-trac-on-windows-gotchas
:author: Brian Neal

At work, we are outfitted with Windows servers. Despite this obstacle, I managed
to install Trac_ and Subversion_ a few years ago. During a break in the action, we
decided to update Subversion (SVN) and Trac. Since we are on Windows, this
means we have to rely on the `kindness of strangers`_ for Subversion binaries. I
ran into a couple of gotchas I'd like to document here to help anyone else who
runs into them.

I managed to get Subversion and Trac up and running without any real problems.
However, when Trac needed to access SVN to display changesets or a timeline, for
example, I got an error similar to:

``TracError: Unsupported version control system "svn": No module named _fs``

After some googling, I finally found that this issue is `documented on the Trac
wiki`_, but it was kind of hard to find. To fix this problem, you have to rename
the Python SVN bindings' DLLs to ``*.pyd``. Specifically, change the
``libsvn/*.dll`` files to ``libsvn/*.pyd``, but don't change the name of
``libsvn_swig_py-1.dll`` (a small script that automates this appears at the end
of this post). I'd really like to hear an explanation of why one
needs to do this. Why doesn't the Python-Windows build process do this for you?

The second problem I ran into dealt with mod_wsgi_ on Windows. Originally, a few
years ago, I set up Trac to run under mod_python_. mod_python has long been
out of favor, so I decided to cut over to mod_wsgi. On my Linux boxes, I always
run mod_wsgi in daemon mode. When I tried to configure this on Windows, Apache
complained about an unknown directive ``WSGIDaemonProcess``. It turns out that
`this mode is not supported on Windows`_. You'll have to use the embedded mode on
Windows.

.. _Trac: http://trac.edgewall.org/
.. _Subversion: http://subversion.apache.org/
.. _kindness of strangers: http://sourceforge.net/projects/win32svn/
.. _documented on the Trac wiki: http://trac.edgewall.org/wiki/TracSubversion
.. _mod_wsgi: http://code.google.com/p/modwsgi/
.. _mod_python: http://www.modpython.org/
.. _this mode is not supported on Windows: http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIDaemonProcess
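
For reference, here is the kind of throwaway script I mean for the renaming step. It is a
hypothetical helper, not something that ships with Trac or Subversion; run it from the
directory containing ``libsvn`` after making a backup:

.. sourcecode:: python

    # rename_bindings.py - hypothetical helper for the *.dll -> *.pyd step
    import glob
    import os

    for dll in glob.glob('libsvn/*.dll'):
        # leave the SWIG runtime library alone, per the Trac wiki
        if os.path.basename(dll) == 'libsvn_swig_py-1.dll':
            continue
        os.rename(dll, dll[:-4] + '.pyd')
        print 'renamed', dll
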

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/010-redis-whos-online-return.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,100 @@

Who's Online with Redis & Python, a slight return
#################################################

:date: 2011-12-17 19:05
:tags: Redis, Python
:slug: who-s-online-with-redis-python-a-slight-return
:author: Brian Neal

In a `previous post`_, I blogged about building a "Who's Online" feature using
Redis_ and Python_ with redis-py_. I've been integrating Celery_ into my
website, and I stumbled across this old code. Since I made that post, I
discovered yet another cool feature in Redis: sorted sets. So here is an even
better way of implementing this feature using Redis sorted sets.

A sorted set in Redis is like a regular set, but each member has a numeric
score. When you add a member to a sorted set, you also specify the score for
that member. You can then retrieve set members if their score falls into a
certain range. You can also easily remove members outside a given score range.

For a "Who's Online" feature, we need a sorted set to represent the set
of all users online. Whenever we see a user, we insert that user into the set
along with the current time as their score. This is accomplished with the Redis
zadd_ command. If the user is already in the set, zadd_ simply updates
their score with the current time.

To obtain the current list of who's online, we use the zrangebyscore_ command to
retrieve the list of users whose score (time) lies between, say, 15 minutes ago
and now.

Periodically, we need to remove stale members from the set. This can be
accomplished by using the zremrangebyscore_ command. This command will remove
all members that have a score between minimum and maximum values. In this case,
we can use the beginning of time for the minimum, and 15 minutes ago for the
maximum.

That's really it in a nutshell. This is much simpler than my previous
solution, which used two sets.

So let's look at some code. The first problem we need to solve is how to
convert a Python ``datetime`` object into a score. This can be accomplished by
converting the ``datetime`` into a POSIX timestamp integer, which is the number
of seconds from the UNIX epoch of January 1, 1970.

.. sourcecode:: python

    import datetime
    import time

    def to_timestamp(dt):
        """
        Turn the supplied datetime object into a UNIX timestamp integer.

        """
        return int(time.mktime(dt.timetuple()))

With that handy function, here are some examples of the operations described
above.

.. sourcecode:: python

    import redis

    # Redis set keys:
    USER_SET_KEY = "whos_online:users"

    # the period over which we collect who's online stats:
    MAX_AGE = datetime.timedelta(minutes=15)

    # obtain a connection to redis:
    conn = redis.StrictRedis()

    # add/update a user to the who's online set:

    username = "sally"
    ts = to_timestamp(datetime.datetime.now())
    conn.zadd(USER_SET_KEY, ts, username)

    # retrieve the list of users who have been active in the last MAX_AGE minutes

    now = datetime.datetime.now()
    min = to_timestamp(now - MAX_AGE)
    max = to_timestamp(now)

    whos_online = conn.zrangebyscore(USER_SET_KEY, min, max)

    # e.g. whos_online = ['sally', 'harry', 'joe']

    # periodically remove stale members

    cutoff = to_timestamp(datetime.datetime.now() - MAX_AGE)
    conn.zremrangebyscore(USER_SET_KEY, 0, cutoff)

.. _previous post: http://deathofagremmie.com/2011/04/25/a-better-who-s-online-with-redis-python/
.. _Redis: http://redis.io/
.. _Python: http://www.python.org
.. _redis-py: https://github.com/andymccurdy/redis-py
.. _Celery: http://celeryproject.org
.. _zadd: http://redis.io/commands/zadd
.. _zrangebyscore: http://redis.io/commands/zrangebyscore
.. _zremrangebyscore: http://redis.io/commands/zremrangebyscore
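
Since Celery came up: the periodic cleanup is a natural fit for a Celery periodic task
instead of cron. A sketch, assuming the snippets above live in a module named
``whos_online`` and a Celery 2.x era API; the task name and module layout are made up:

.. sourcecode:: python

    # tasks.py - illustrative sketch only
    import datetime

    from celery.task import periodic_task

    from whos_online import conn, to_timestamp, USER_SET_KEY, MAX_AGE

    @periodic_task(run_every=MAX_AGE)
    def prune_whos_online():
        """Drop members whose last-seen time is older than MAX_AGE."""
        cutoff = to_timestamp(datetime.datetime.now() - MAX_AGE)
        conn.zremrangebyscore(USER_SET_KEY, 0, cutoff)
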

--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/011-ts3-python-javascript.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,311 @@

A TeamSpeak 3 viewer with Python & Javascript
#############################################

:date: 2012-01-20 19:15
:tags: Python, Javascript, TeamSpeak
:slug: a-teamspeak-3-viewer-with-python-javascript
:author: Brian Neal

The Problem
===========

My gaming clan started using `TeamSpeak 3`_ (TS3) for voice communications, so
it wasn't long before we wanted to see who was on the TS3 server from the clan's
server status page. Long ago, before I met Python, I had built the clan a server
status page in PHP. This consisted of cobbling together various home-made and
3rd party PHP scripts for querying game servers (Call of Duty, Battlefield) and
voice servers (TeamSpeak 2 and Mumble). But TeamSpeak 3 was a new one for us,
and I didn't have anything to query it. My interests in PHP are long behind
me, but we needed to add a TS3 viewer to the PHP page. The gaming clan's web
hosting is pretty vanilla; in other words, PHP is the first-class citizen. If I
really wanted to host a Python app I probably could have resorted to FastCGI or
something. But I had no experience with that and no desire to go that way.

I briefly thought about finding a 3rd party PHP library to query a TS3 server.
The libraries are out there, but they are as you might expect: overly
complicated and/or pretty amateurish (no public source code repository). I even
considered writing my own PHP code to do the job, so I started looking for any
documentation on the TS3 server query protocol. Luckily, there is a `TS3
query protocol document`_, and it is fairly decent.

But I just could not bring myself to write PHP again. On top of this, the
gaming clan's shared hosting blocks non-standard ports. If I did have a PHP
solution, the outgoing query to the TS3 server would have been blocked by the
host's firewall. It is a hassle to contact their technical support and try to
find a person who knows what a port is and get it unblocked (we've had to do
this over and over as each game comes out). Thus it ultimately boiled down to me
wanting to do this in Python. For me, life is too short to write PHP scripts.

I started thinking about writing a query application in Python using my
dedicated server that I use to host a few Django_ powered websites. At first I
thought I'd generate the server status HTML on my server and display it in an
``<iframe>`` on the gaming clan's server. But then it hit me that all I really
needed to do was have my Django_ application output a representation of the TS3
server status in JSON_, and then perhaps I could find a slick jQuery_ tree menu
to display the status graphically. I really liked this idea, so here is a post
about the twists and turns I took implementing it.

The Javascript
==============

My searching turned up several jQuery tree menu plugins, but in the end I
settled on dynatree_. Dynatree had clear documentation I could understand, it
seems to be actively maintained, and it can generate a menu from JSON. After one
evening of reading the docs, I built a static test HTML page that could display
a tree menu built from JSON. Here is the Javascript code I put in the test page's
``<head>`` section:
sourcecode:: javascript + + var ts3_data = [ + {title: "Phantom Aces", isFolder: true, expand: true, + children: [ + {title: "MW3", isFolder: true, expand: true, + children: [ + {title: "Hogan", icon: "client.png"}, + {title: "Fritz!!", icon: "client.png"} + ] + }, + {title: "COD4", isFolder: true, expand: true, + children: [ + {title: "Klink", icon: "client.png"} + ] + }, + {title: "Away", isFolder: true, children: [], expand: true} + ] + } + ]; + + $(function(){ + $("#ts3-tree").dynatree({ + persist: false, + children: ts3_data + }); + }); + +Note that ``client.png`` is a small icon I found that I use in place of +dynatree's default file icon to represent TS3 clients. If I omitted the icon +attribute, the TS3 client would have appeared as a small file icon. Channels +appear as folder icons, and this didn't seem too unreasonable to me. In other +words I had no idea what a channel icon would look like. A folder was fine. + +With dynatree, you don't need a lot of HTML markup; it does all the heavy +lifting. You simply have to give it an empty ``<div>`` tag it can render +into. + +.. sourcecode:: html + + <body> + <div id="ts3-tree"></div> + </body> + </html> + +Here is a screenshot of the static test page in action. + +.. image:: /images/011-tree1.png + +Nice! Thanks dynatree! Now all I need to do is figure out how to dynamically +generate the JSON data and get it into the gaming clan's server status page. + +The Python +========== + +Looking through the `TS3 protocol documentation`_ I was somewhat surprised to +see that TS3 used the Telnet protocol for queries. So from my trusty shell I +telnet'ed into the TS3 server and played with the available commands. I made +notes on what commands I needed to issue to build my status display. + +My experiments worked, and I could see a path forward, but there were still some +kinks to be worked out with the TS3 protocol. For one thing, the data it sent back +was escaped in a strange way. I would have to post-process the data in Python +before I could use it. I didn't want to reinvent the wheel, so I did a quick +search for Python libraries for working with TS3. I found a few, but quickly +settled on Andrew Williams' python-ts3_ library. It was small, easy to +understand, had tests, and a GitHub page. Perfect. + +One of the great things about Python, of course, is the interactive shell. Armed +with the `TS3 protocol documentation`_, python-ts3_, and the Python shell, I was +able to interactively connect to the TS3 server and poke around again. This time +I was sitting above telnet using python-ts3_ and I confirmed it would do the job +for me. + +Another evening was spent coding up a Django view to query the TS3 server using +python-ts3_ and to output the channel status as JSON. + +.. sourcecode:: python + + from django.conf import settings + from django.core.cache import cache + from django.http import HttpResponse, HttpResponseServerError + from django.utils import simplejson + import ts3 + + CACHE_KEY = 'ts3-json' + CACHE_TIMEOUT = 2 * 60 + + def ts3_query(request): + """ + Query the TeamSpeak3 server for status, and output a JSON + representation. + + The JSON we return is targeted towards the jQuery plugin Dynatree + http://code.google.com/p/dynatree/ + + """ + # Do we have the result cached? 
+ result = cache.get(CACHE_KEY) + if result: + return HttpResponse(result, content_type='application/json') + + # Cache miss, go query the remote server + + try: + svr = ts3.TS3Server(settings.TS3_IP, settings.TS3_PORT, + settings.TS3_VID) + except ts3.ConnectionError: + return HttpResponseServerError() + + response = svr.send_command('serverinfo') + if response.response['msg'] != 'ok': + return HttpResponseServerError() + svr_info = response.data[0] + + response = svr.send_command('channellist') + if response.response['msg'] != 'ok': + return HttpResponseServerError() + channel_list = response.data + + response = svr.send_command('clientlist') + if response.response['msg'] != 'ok': + return HttpResponseServerError() + client_list = response.data + + # Start building the channel / client tree. + # We save tree nodes in a dictionary, keyed by their id so we can find + # them later in order to support arbitrary channel hierarchies. + channels = {} + + # Build the root, or channel 0 + channels[0] = { + 'title': svr_info['virtualserver_name'], + 'isFolder': True, + 'expand': True, + 'children': [] + } + + # Add the channels to our tree + + for channel in channel_list: + node = { + 'title': channel['channel_name'], + 'isFolder': True, + 'expand': True, + 'children': [] + } + parent = channels[int(channel['pid'])] + parent['children'].append(node) + channels[int(channel['cid'])] = node + + # Add the clients to the tree + + for client in client_list: + if client['client_type'] == '0': + node = { + 'title': client['client_nickname'], + 'icon': 'client.png' + } + channel = channels[int(client['cid'])] + channel['children'].append(node) + + tree = [channels[0]] + + # convert to JSON + json = simplejson.dumps(tree) + + cache.set(CACHE_KEY, json, CACHE_TIMEOUT) + + return HttpResponse(json, content_type='application/json') + +I have to make three queries to the TS3 server to get all the information I +need. The ``serverinfo`` command is issued to retrieve the TS3 virtual server's +name. The ``channellist`` command retrieves the list of channels. The +``clientlist`` command gets the list of TS3 clients that are currently +connected. For more information on these three commands see the TS3 query +protocol document. + +The only real tricky part of this code was figuring out how to represent an +arbitrary, deeply-nested channel tree in Python. I ended up guessing that +``cid`` meant channel ID and ``pid`` meant parent ID in the TS3 query data. I +squirrel away the channels in a ``channels`` dictionary, keyed by channel ID. +The root channel has an ID of 0. While iterating over the channel list, I can +retrieve the parent channel from the ``channels`` dictionary by ID and append +the new channel to the parent's ``children`` list. Clients are handled the same +way, but have different attributes. By inspecting the ``clientlist`` data in the +Python shell, I noticed that my Telnet client also showed up in that list. +However it had a ``client_type`` of 1, whereas the normal PC clients had a +``client_type`` of 0. + +I decided to cache the results for 2 minutes to reduce hits on the TS3 server, +as it has flood protection. This probably isn't needed given the size of our +gaming clan, but Django makes it easy to do, so why not? + +Putting it all together +======================= + +At this point I knew how to use my Django application to query the TS3 server +and build status in JSON format. 
I also knew what the Javascript and HTML on the +gaming clan's server status page (written in PHP) had to look like to render +that JSON status. + +The problem was the server status page was on one server, and my Django +application was on another. At first I thought it would be no problem for the +Javascript to do a ``GET`` on my Django server and retrieve the JSON. However I +had some vague memory of the browser security model, and after some googling I +was reminded of the `same origin policy`_. Rats. That wasn't going to work. + +I briefly researched JSONP_, which is the technique that Facebook & Google use +to embed those little "like" and "+1" buttons on your web pages. But in the end +it was just as easy to have the PHP script make the ``GET`` request to my Django +application using a `file_get_contents()`_ call. The PHP can then embed the JSON +directly into the server status page: + +.. sourcecode:: php + + $ts3_source = 'http://example.com/ts3/'; + $ts3_json = file_get_contents($ts3_source); + + require_once 'header.php'; + +And in header.php, some HTML sprinkled with some PHP: + +.. sourcecode:: html + + <script type="text/javascript"> + var ts3_data = <?php echo $ts3_json; ?>; + + $(function(){ + $("#ts3-tree").dynatree({ + persist: false, + children: ts3_data + }); + }); + </script> + +That did the trick. In the end I had to touch a little PHP, but it was +tolerable. That was a very round-about solution to building a TS3 viewer in +Python and Javascript. While I doubt you will have the same strange requirements +that I had (multiple servers), I hope you can see how to combine a few +technologies to make a TS3 viewer in Python. + + +.. _TeamSpeak 3: http://teamspeak.com/?page=teamspeak3 +.. _TS3 query protocol document: http://media.teamspeak.com/ts3_literature/TeamSpeak%203%20Server%20Query%20Manual.pdf +.. _Django: https://www.djangoproject.com/ +.. _JSON: http://json.org +.. _jQuery: http://jquery.org +.. _dynatree: http://code.google.com/p/dynatree/ +.. _python-ts3: http://pypi.python.org/pypi/python-ts3/0.1 +.. _same origin policy: http://en.wikipedia.org/wiki/Same_origin_policy +.. _JSONP: http://en.wikipedia.org/wiki/JSONP +.. _file_get_contents(): http://php.net/manual/en/function.file-get-contents.php +.. _TS3 protocol documentation: http://media.teamspeak.com/ts3_literature/TeamSpeak%203%20Server%20Query%20Manual.pdf
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/Coding/012-windows-trac-upgrade-2.rst Thu Jan 30 21:45:03 2014 -0600 @@ -0,0 +1,69 @@ +Upgrading Trac on Windows Gotcha - Part 2 +######################################### + +:date: 2012-03-21 20:10 +:tags: Python, Trac, Subversion, Windows +:slug: upgrading-trac-on-windows-gotcha-part-2 +:author: Brian Neal + +I have `previously reported`_ on some problems I had when upgrading our Trac_ +install at work. Today I attempted another upgrade to Subversion_ 1.7.4 and Trac +0.12.3 on Windows. I upgraded to Python_ 2.7.2 along the way. I ran into another +problem with the Python bindings to Subversion, which took a while to figure out. +I had no problems upgrading Subversion, but Trac could not see our repository. +The symptoms were that the Trac "timeline" and "browse source" features were +missing. + +Following the `Trac troubleshooting advice`_, I opened an interactive Python +session and tried this: + +:: + + E:\Trac_Data>python + Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win + 32 + Type "help", "copyright", "credits" or "license" for more information. + >>> from svn import client + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + File "C:\Python27\lib\site-packages\svn\client.py", line 26, in <module> + from libsvn.client import * + File "C:\Python27\lib\site-packages\libsvn\client.py", line 25, in <module> + _client = swig_import_helper() + File "C:\Python27\lib\site-packages\libsvn\client.py", line 21, in swig_import + _helper + _mod = imp.load_module('_client', fp, pathname, description) + ImportError: DLL load failed: The operating system cannot run %1 + +After some head scratching and googling I finally found the problem. I had used +the Windows .msi installer, `graciously provided by Alagazam`_, aka David Darj, +to install Subversion. This placed the Subversion binaries and DLL's in +``C:\Program Files\Subversion\bin``. I then unzipped the Python 2.7 bindings to +the ``C:\Python27\Lib\site-packages`` folder. The bindings depend on the DLL's +in the ``Subversion\bin`` folder. But unfortunately for me, there were already +two older versions of the DLL's, ``libeay32.dll`` and ``ssleay32.dll``, on my +path. So when the bindings went looking for those two DLL's, instead of finding +them in ``Subversion\bin``, they found the older versions somewhere else. + +To fix this, you can either rearrange your path, or copy those two DLL's to your +``Python27\Lib\site-packages\libsvn`` folder. In the future, I am going to just +copy all the DLL's from ``Subversion\bin`` to the ``libsvn`` folder. + +I examined the pre-built Subversion packages from Bitnami_ and CollabNet_. They +had packaged all of the Subversion DLL's with the Python bindings together in +the same directory, so this seems reasonable. Later, on the Subversion users' +mailing list, Alagazam gave the nod to this approach. + +A big thank you to Alagazam for the help and for the Windows binaries. And of +course thanks to the Apache_, Subversion_, Trac_, & Python_ teams for making +great tools. + +.. _previously reported: /2011/09/12/upgrading-trac-on-windows-gotchas +.. _Trac: http://trac.edgewall.org/ +.. _Subversion: http://subversion.apache.org/ +.. _Trac troubleshooting advice: http://trac.edgewall.org/wiki/TracSubversion#Checklist +.. _graciously provided by Alagazam: http://sourceforge.net/projects/win32svn/ +.. _Bitnami: http://bitnami.org/ +.. _CollabNet: http://www.collab.net/ +.. _Python: http://www.python.org/ +.. 
_Apache: http://subversion.apache.org/
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/Coding/013-omniorbpy-virtualenv.rst Thu Jan 30 21:45:03 2014 -0600 @@ -0,0 +1,24 @@ +Installing omniORBpy in a virtualenv +#################################### + +:date: 2012-03-27 20:30 +:tags: Python, omniORB, virtualenv +:slug: installing-omniorbpy-in-a-virtualenv +:author: Brian Neal + +I came across an interesting question on Stackoverflow_, which was essentially +`How to install omniORBpy in a virtualenv`_? I've had a lot of experience at work +with omniORB_, a high-quality, open-source CORBA ORB. omniORB is a C++ ORB, +but it also includes omniORBpy, a thin Python_ wrapper around the C++ core. +Despite using omniORBpy extensively, I had never used it in a virtualenv_. + +So I got curious, and gave it a shot. You can read `my answer`_, which the +questioner accepted. I'd love to hear of any better ways of doing it. I worked +it out for Ubuntu, but a Windows solution would be nice to see also. + +.. _Stackoverflow: http://stackoverflow.com/ +.. _How to install omniORBpy in a virtualenv: http://stackoverflow.com/questions/9716611/install-omniorb-python-in-a-virtualenv +.. _omniORB: http://omniorb.sourceforge.net/ +.. _Python: http://www.python.org/ +.. _virtualenv: http://www.virtualenv.org/ +.. _my answer: http://stackoverflow.com/a/9881882/63485
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/Coding/014-upgrading-django1.4.rst Thu Jan 30 21:45:03 2014 -0600 @@ -0,0 +1,205 @@ +Upgrading to Django 1.4 +####################### + +:date: 2012-04-15 14:50 +:tags: Django +:slug: upgrading-to-django-1.4 +:author: Brian Neal + +`Django 1.4`_ came out recently, and I took a few hours to upgrade my first site +yesterday. I thought it would be useful for my own reference to write down what +I did. I hope it will be useful to others. I'd love to read what you had to do, +so if you went through this process and blogged about it, please leave a +comment. Please keep in mind these aren't hard and fast steps or a recipe to +follow, as my sites are probably nothing like yours and may use different +features of Django. + +Preparation +----------- + +The first thing I did was to read very carefully the `Django 1.4 release +notes`_. The Django team does a great job of documenting what has changed, so it +is well worth your time to read the release notes. It is also a good idea to at +least skim the `Django Deprecation Timeline`_. After reading these, you should +make a list of the things you want to change, add, or remove. + +Tips +---- + +After deciding what areas you want or need to change in your code, these tips +may be useful to help you implement the changes. + +#. **Run with warnings turned on**. Use this command to run the development + server: ``$ python -Wall manage.py runserver``. Django makes use of `Python's + warning system`_ to flag features that are deprecated. By running Python with + the ``-Wall`` switch, you'll see these warnings in the development server + output. + +#. **Use the debugger to track down warnings**. Not sure where a pesky warning + is coming from? Just open the Django source code in your editor and put an + ``import pdb; pdb.set_trace()`` line right above or below the warning. You + can then use the debugger's ``w`` command to get a stack trace and find out + exactly what code is leading to the warning. In my case I kept getting a few + warnings with no idea where they were coming from. I used this technique to + verify the warnings were coming from third party code and not my own. For + more information on using the debugger (and you really **should** know how to + use this invaluable tool), see the `Pdb documentation`_. + +My upgrade experience +--------------------- + +Here is a list of things that I did during my port. Again, you may not need to +do these, and the next site I upgrade may have a different list. All of these +changes (except for the first) are described in the `Django 1.4 release notes`_. + +#. **Upgrade my Django debug toolbar**. As of this writing, the Django debug + toolbar I got from PyPI was not compatible with Django 1.4. I simply + uninstalled it and grabbed the development version from GitHub with + ``pip install git+https://github.com/django-debug-toolbar/django-debug-toolbar.git``. + +#. **Remove the ADMIN_MEDIA_PREFIX setting**. The admin application in + Django 1.4 now relies on the ``staticfiles`` application (introduced in + Django 1.3) to handle the serving of static assets. + +#. **Remove use of the {% admin_media_prefix %} template tag**. Related to the + above, this tag is now deprecated. I had a custom admin view that used this + template tag, and I simply replaced it with ``{{ STATIC_URL }}/admin``. + +#. **Remove verify_exists on URLFields**. The ``verify_exists`` option to + the ``URLField`` has been removed for performance and security reasons. 
I had + always set this to ``False``; now I just had to remove it altogether. + +#. **Add the require_debug_false filter to logging settings**. As explained in + the release notes, this change prevents admin error emails from being sent + while in ``DEBUG`` mode. + +#. **django.conf.urls.defaults is deprecated**. I changed my imports in all + ``urls.py`` files to use ``django.conf.urls`` instead of + ``django.conf.urls.defaults`` to access ``include()``, ``patterns()``, and + ``url()``. The Django team had recently moved these functions and updated the + docs and tutorial to stop using the frowned-upon ``from + django.conf.urls.defaults import *``. + +#. **Enable the new clickjacking protection**. A nice new feature is some new + middleware that adds the ``X-Frame-Options`` header to all responses. + This provides clickjacking_ protection in modern browsers. + +#. **Add an admin password reset feature**. By adding a few new lines to your + ``urlconf`` you get a nifty new password reset feature for your admin. + +#. **Update to the new manage.py**. This was the biggest change with the most + impact. The Django team has finally removed a long-standing wart with its + ``manage.py`` utility. Previously, ``manage.py`` used to play games with your + ``PYTHONPATH``, which led to confusion when migrating to production. It could + also lead to having your settings imported twice. See the next section in + this blog entry for more on what I did here. + +Reorganizing for the new manage.py +---------------------------------- + +The change with the largest impact for me was reorganizing my directory +structure for the new ``manage.py`` command. Before this change, I had organized +my directory structure like this: + +:: + + mysite/ + media/ + static/ + mysite/ + myapp1/ + __init__.py + models.py + views.py + urls.py + myapp2/ + __init__.py + models.py + views.py + urls.py + settings/ + __init__.py + base.py + local.py + production.py + test.py + apache/ + myproject.wsgi + logs/ + templates/ + manage.py + urls.py + LICENSE + fabfile.py + requirements.txt + +After replacing the contents of my old ``manage.py`` with the new content, I +then reorganized my directory structure to this: + +:: + + mysite/ + media/ + static/ + myapp1/ + __init__.py + models.py + views.py + urls.py + myapp2/ + __init__.py + models.py + views.py + urls.py + myproject/ + settings/ + __init__.py + base.py + local.py + production.py + test.py + apache/ + myproject.wsgi + logs/ + templates/ + urls.py + LICENSE + fabfile.py + manage.py + requirements.txt + +It is a subtle change, but I like it. It now makes it clear that my project is +just an application itself, consisting of the top-level ``urls.py``, settings, +templates and logs. The ``manage.py`` file is now in the top-level directory +also, which seems right. + +I had always made my imports as ``from app.models import MyModel`` instead of +``from myproject.app.models``, so I didn't have to update any imports. + +Since I use the "settings as a package" scheme, I did have to update the imports +in my settings files. For example, in my ``local.py`` I had to change ``from +settings.base import *`` to ``from myproject.settings.base import *``. + +What I didn't do +---------------- + +Django 1.4's largest new feature is probably its support for timezones. I +decided for this project not to take advantage of that. It would require a lot +of changes, and it isn't really worth it for this small site. 
I may use it on +the next site I convert to Django 1.4, and I will definitely be using it on new +projects. + +Conclusion +---------- + +The upgrade process went smoother and quicker than I thought thanks to the +excellent release notes and the Django team's use of Python warnings to flag +deprecated features. + + +.. _Django 1.4: https://www.djangoproject.com/weblog/2012/mar/23/14/ +.. _Django 1.4 release notes: https://docs.djangoproject.com/en/1.4/releases/1.4/ +.. _Django Deprecation Timeline: https://docs.djangoproject.com/en/1.4/internals/deprecation/ +.. _Python's warning system: http://docs.python.org/library/warnings.html +.. _Pdb documentation: http://docs.python.org/library/pdb.html +.. _clickjacking: http://en.wikipedia.org/wiki/Clickjacking
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/Coding/015-synology-ubuntu.rst Thu Jan 30 21:45:03 2014 -0600 @@ -0,0 +1,65 @@ +Mounting a Synology DiskStation on Ubuntu 12.04 +############################################### + +:date: 2012-05-01 20:45 +:tags: Linux, Ubuntu, Synology +:slug: mounting-a-synology-diskstation-on-ubuntu-12.04 +:author: Brian Neal + +I have a Synology DiskStation that I use as a home NAS. I didn't take good +notes when I got it to work with Ubuntu 10.04, so I had to fumble about when I +upgraded to Ubuntu 12.04 last weekend. For next time, here is the recipe I +used. + +First, you need to install the **cifs-utils** package:: + + $ sudo apt-get install cifs-utils + +Next, I added the following text to my ``/etc/fstab`` file:: + + //192.168.1.2/shared /mnt/syn cifs noauto,iocharset=utf8,uid=1000,gid=1000,credentials=/home/brian/.cifspwd 0 0 + +Take note of the following: + +* Replace ``//192.168.1.2/`` with the IP address or hostname of your + DiskStation. Likewise, ``/shared`` is just the path on the DiskStation that I + wanted to mount. +* ``/mnt/syn`` is the mount point where the DiskStation will appear on your local + filesystem. +* I didn't want my laptop to mount the DiskStation on bootup, so I used the + ``noauto`` parameter. +* The ``uid`` and ``gid`` should match the user and group IDs of your Ubuntu + user. You can find these by grepping for your username in ``/etc/passwd``. +* The ``credentials`` parameter should point to a file you create that contains + the username and password of the DiskStation user you want to impersonate (see + below). + +Your ``.cifspwd`` file should look like the following:: + + username=username + password=password + +Obviously you'll want to use the real username / password pair of a user on your +DiskStation. + +To be paranoid, you should make the file owned by root and readable only by +root. Do this after you get everything working:: + + $ sudo chown root:root .cifspwd + $ sudo chmod 0600 .cifspwd + +I created the mount point for the DiskStation with:: + + $ sudo mkdir -p /mnt/syn + +Then, whenever I want to use the DiskStation I use:: + + $ sudo mount /mnt/syn + +And to unmount it:: + + $ sudo umount /mnt/syn + +You can avoid the ``mount`` and ``umount`` commands if you remove the +``noauto`` parameter from the ``/etc/fstab`` entry. In that case, Ubuntu will +automatically try to mount the DiskStation at startup.
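+
+As an aside, if you only need the share occasionally, you can skip the
+``/etc/fstab`` entry entirely and pass the same options directly to ``mount``.
+This one-liner is just a sketch using the example values from above::
+
+    $ sudo mount -t cifs //192.168.1.2/shared /mnt/syn \
+        -o iocharset=utf8,uid=1000,gid=1000,credentials=/home/brian/.cifspwd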
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/Coding/016-django-fixed-pages.rst Thu Jan 30 21:45:03 2014 -0600 @@ -0,0 +1,172 @@ +Fixed pages for Django +###################### + +:date: 2012-05-13 13:30 +:tags: Django +:slug: fixed-pages-for-django +:author: Brian Neal + +I have been using Django's `flatpages app`_ for some simple, static pages that +were supposed to be temporary. I slapped a Javascript editor on the admin page +and it has worked very well. However some of the pages have long outlived their +"temporary" status, and I find myself needing to update them. It is then that I +get angry at the Javascript editor, and there is no way to keep any kind of +history on the page without having to fish through old database backups. I +started to think it would be nice to write the content in a nice markup +language, for example reStructuredText_, which I could then commit to version +control. I would just need a way to generate HTML from the source text to +produce the flatpage content. + +Of course I could use the template filters in `django.contrib.markup`_. But +turning markup into HTML at page request time can be more expensive than I like. +Yes, I could cache the page, but I'd like the process to be more explicit. + +In my first attempt at doing this, I wrote a `custom management command`_ that +used a dictionary in my ``settings.py`` file to map reStructuredText files to +flatpage URLs. My management command would open the input file, convert it to +HTML, then find the ``FlatPage`` object associated with the URL. It would then +update the object with the new HTML content and save it. + +This worked okay, but in the end I decided that the pages I wanted to update +were not temporary, quick & dirty pages, which is kind of how I view flatpages. +So I decided to stop leaning on the flatpages app for these pages. + +I then modified the management command to read a given input file, convert it +to an HTML fragment, then save it in my templates directory. Thus, a file stored +in my project directory as ``fixed/about.rst`` would get transformed to +``templates/fixed/about.html``. Here is the source to the command which I saved +as ``make_fixed_page.py``: + +.. sourcecode:: python + + import os.path + import glob + + import docutils.core + from django.core.management.base import LabelCommand, CommandError + from django.conf import settings + + + class Command(LabelCommand): + help = "Generate HTML from restructured text files" + args = "<inputfile1> <inputfile2> ... 
| all" + + def handle_label(self, filename, **kwargs): + """Process input file(s)""" + + if not hasattr(settings, 'PROJECT_PATH'): + raise CommandError("Please add a PROJECT_PATH setting") + + self.src_dir = os.path.join(settings.PROJECT_PATH, 'fixed') + self.dst_dir = os.path.join(settings.PROJECT_PATH, 'templates', 'fixed') + + if filename == 'all': + files = glob.glob("%s%s*.rst" % (self.src_dir, os.path.sep)) + files = [os.path.basename(f) for f in files] + else: + files = [filename] + + for f in files: + self.process_page(f) + + def process_page(self, filename): + """Processes one fixed page""" + + # retrieve source text + src_path = os.path.join(self.src_dir, filename) + try: + with open(src_path, 'r') as f: + src_text = f.read() + except IOError, ex: + raise CommandError(str(ex)) + + # transform text + content = self.transform_input(src_text) + + # write output + basename = os.path.splitext(os.path.basename(filename))[0] + dst_path = os.path.join(self.dst_dir, '%s.html' % basename) + + try: + with open(dst_path, 'w') as f: + f.write(content.encode('utf-8')) + except IOError, ex: + raise CommandError(str(ex)) + + prefix = os.path.commonprefix([src_path, dst_path]) + self.stdout.write("%s -> %s\n" % (filename, dst_path[len(prefix):])) + + def transform_input(self, src_text): + """Transforms input restructured text to HTML""" + + return docutils.core.publish_parts(src_text, writer_name='html', + settings_overrides={ + 'doctitle_xform': False, + 'initial_header_level': 2, + })['html_body'] + +Next I would need a template that could render these fragments. I remembered +that the Django `include tag`_ could take a variable as an argument. Thus I +could create a single template that could render all of these "fixed" pages. +Here is the template ``templates/fixed/base.html``:: + + {% extends 'base.html' %} + {% block title %}{{ title }}{% endblock %} + {% block content %} + {% include content_template %} + {% endblock %} + +I just need to pass in ``title`` and ``content_template`` context variables. The +latter will control which HTML fragment I include. + +I then turned to the view function which would render this template. I wanted to +make this as generic and easy to do as possible. Since I was abandoning +flatpages, I would need to wire these up in my ``urls.py``. At first I didn't +think I could use Django's new `class-based generic views`_ for this, but after +some fiddling around, I came up with a very nice solution: + +.. sourcecode:: python + + from django.views.generic import TemplateView + + class FixedView(TemplateView): + """ + For displaying our "fixed" views generated with the custom command + make_fixed_page. + + """ + template_name = 'fixed/base.html' + title = '' + content_template = '' + + def get_context_data(self, **kwargs): + context = super(FixedView, self).get_context_data(**kwargs) + context['title'] = self.title + context['content_template'] = self.content_template + return context + +This allowed me to do the following in my ``urls.py`` file: + +.. sourcecode:: python + + urlpatterns = patterns('', + # ... + + url(r'^about/$', + FixedView.as_view(title='About', content_template='fixed/about.html'), + name='about'), + url(r'^colophon/$', + FixedView.as_view(title='Colophon', content_template='fixed/colophon.html'), + name='colophon'), + + # ... + +Now I have a way to efficiently serve reStructuredText files as "fixed pages" +that I can put under source code control. + +.. _flatpages app: https://docs.djangoproject.com/en/1.4/ref/contrib/flatpages/ +.. 
_reStructuredText: http://docutils.sourceforge.net/rst.html +.. _custom management command: https://docs.djangoproject.com/en/1.4/howto/custom-management-commands/ +.. _include tag: https://docs.djangoproject.com/en/1.4/ref/templates/builtins/#include +.. _class-based generic views: https://docs.djangoproject.com/en/1.4/topics/class-based-views/ +.. _django.contrib.markup: https://docs.djangoproject.com/en/1.4/ref/contrib/markup/
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/Coding/017-weighmail.rst Thu Jan 30 21:45:03 2014 -0600 @@ -0,0 +1,69 @@ +Introducing weighmail +##################### + +:date: 2012-05-24 19:30 +:tags: Python, Gmail, IMAPClient, weighmail +:slug: introducing-weighmail +:author: Brian Neal + +Recently my wife approached me and told me that Gmail was warning her that she +was using 95% of her (free) quota. This was a bit surprising, but my wife does a +lot of photography, and sends lots of photos through the mail to various people. +So I began helping her try to find her large messages. It was then that I +learned that Gmail provides no easy way to do this. You can't sort by size. You +can search for specific attachments, for example .psd or .jpg, and that is what +she ended up doing. + +Surely, I thought, there must be an easier way. I thought that perhaps Gmail might +have an API like their other products. A bit of searching turned up that the +only API to Gmail is IMAP_. I didn't know anything about IMAP, but I do know +some Python. And sure enough, Python has a library for IMAP called imaplib_. +Glancing through imaplib I got the impression it was a very low-level library +and I began to get a bit discouraged. + +I continued doing some searching and I quickly found IMAPClient_, a high-level +and friendlier library for working with IMAP. This looked like it could work +very well for me! + +I started thinking about writing an application to find big messages in a Gmail +account. The most obvious and natural way to flag large messages would be to +slap a Gmail label on them. But could IMAPClient do this? It didn't look like +it. It turns out that labels are part of a set of `custom Gmail IMAP +extensions`_, and IMAPClient didn't support them. Yet. + +I contacted the author of IMAPClient, `Menno Smits`_, and quickly learned he is +a very friendly and encouraging guy. I decided to volunteer a patch, as this +would give me a chance to learn something about IMAP. He readily agreed and I +dove in. + +The short version of the story is I learned a heck of a lot from reading the +source to IMAPClient, and was able to contribute a patch for Gmail label support +and even some tests! + +After the patch was accepted I went to work on my application, which I have +dubbed weighmail_. I was even able to figure out how to put `weighmail up on +PyPI`_ thanks to Menno's example. + +So if you need a program to categorize your Gmail by message size, I hope +weighmail will meet your needs. Please try it out and feel free to send me +feedback and feature requests on the Bitbucket issue tracker. + +I have used it maybe a half-dozen times on my Gmail account now. My Gmail +account is only about 26% full and I have about 26,300 messages in my "All Mail" +folder. Run times for weighmail have varied from six to fifteen minutes when adding +three label categories for size. I was (and am) kind of worried that Gmail may lock me +out of my account for accessing it too heavily, but knock on wood it hasn't yet. +Perhaps they rate limit the responses and that is why the run times vary so +much. + +In any event, I hope you find it useful. A big thank you to Menno Smits for his +IMAPClient library, his advice, and working with me on the patch. Hooray for +open source! + +.. _IMAP: http://en.wikipedia.org/wiki/Internet_Message_Access_Protocol +.. _imaplib: http://docs.python.org/library/imaplib.html +.. _IMAPClient: http://imapclient.freshfoo.com/ +.. 
_custom Gmail IMAP extensions: https://developers.google.com/google-apps/gmail/imap_extensions +.. _weighmail: https://bitbucket.org/bgneal/weighmail/ +.. _weighmail up on PyPI: http://pypi.python.org/pypi/weighmail/0.1.0 +.. _Menno Smits: http://freshfoo.com/blog/
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/Coding/018-pyenigma.rst Thu Jan 30 21:45:03 2014 -0600 @@ -0,0 +1,84 @@ +Introducing Py-Enigma +##################### + +:date: 2012-06-06 18:45 +:tags: Python, Py-Enigma, Enigma +:slug: introducing-py-enigma +:author: Brian Neal + +Py-Enigma +--------- + +For some strange reason, I don't even remember why or how, I found myself +browsing the `Wikipedia page for the Enigma Machine +<http://en.wikipedia.org/wiki/Enigma_machine>`_. I quickly became fascinated and +started surfing around some more. I found several Enigma machine simulators, +some of them even online. It suddenly occurred to me that it would be a fun +project to try and write my own in Python_. I wanted to challenge myself to see +if I could figure it all out without help, so I vowed not to look at anyone else's +source code. + +In order to write an Enigma simulator, you need a really good technical +explanation of how it worked, along with numerous details like how the rotors +were wired. Fortunately I very quickly found Dirk Rijmenants' incredible `Cipher +Machines and Cryptology website`_. In particular, his `Technical Details of the +Enigma Machine`_ page was exactly what I was looking for: a real gold mine of +information. + +I had a long Memorial Day weekend, so I spent many hours in front of the +computer, consuming every word of Dirk's explanations and trying to sort it all +out in my head. And so a very fun marathon hacking session began. In the end I +got it figured out, and you cannot believe how excited I was when I was able to +decode actual Enigma messages from World War II! + +And so `Py-Enigma <https://bitbucket.org/bgneal/enigma>`_ was born! Py-Enigma is +a historically accurate simulation library for war-time Enigma machines, written +in Python 3. I also included a simple command-line application for doing +scripting or quick command-line experiments. + +Lessons learned +--------------- + +Since I didn't really know what I was doing at first, I wrote little classes for +each of the components and unit tested them. I'm really glad I did this because +not only did it find bugs, it also made me think harder about the problem and +made me realize I didn't understand everything. When you make a mistake in +cryptography, very often the answer you get is gibberish and there is no way to +tell how close you are to the right answer. This was almost the case here, and I +think I stumbled upon an Enigma weakness that the allied codebreakers must have +seen also. I had a bug in my rotor stepping algorithm, and I got the right +answer up until the point where the right rotor stepped the middle rotor, and +then the output was all garbage. Once I noticed this, I was able to focus on the +stepping algorithm and find the bug. I'm sure the allied codebreakers must have +experienced the same thing when they were guessing what rotors were being used +for the message they were cracking. + +I also decided to use this little project to really learn Sphinx_. I had dabbled +around in it before when I contributed some documentation patches to Django_. I +think writing the documentation took almost as long as my research and coding, +but with Sphinx at least it was fun! It is a very useful and powerful package. I +also checked out the awesome `readthedocs.org <http://readthedocs.org>`_ and +quickly got my documentation `hosted there +<http://py-enigma.readthedocs.org/>`_. What a fantastic service! Now whenever I +push changes to bitbucket my docs are automatically built on readthedocs! 
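+
+To give you a taste of the result, here is roughly what driving the library
+looks like. This is only a sketch based on the key sheet interface described
+in the documentation, and the ciphertext below is just a stand-in; see the
+docs for the authoritative parameter names:
+
+.. sourcecode:: python
+
+    from enigma.machine import EnigmaMachine
+
+    # Key the machine as if from a daily key sheet
+    machine = EnigmaMachine.from_key_sheet(
+        rotors='II IV V',
+        reflector='B',
+        ring_settings=[1, 20, 11],
+        plugboard_settings='AV BS CG DL FU HZ IN KM OW RX')
+
+    # Set the rotor start position and decrypt the 3-letter message key
+    machine.set_display('WXC')
+    msg_key = machine.process_text('KCH')
+
+    # Re-key the machine with the message key, then decipher the ciphertext
+    machine.set_display(msg_key)
+    plaintext = machine.process_text('NIBLFMYMLLUFWCASCSSNVHAZ')
+    print(plaintext)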
+ +This was also my second project that I put up on PyPI. I'm a little bit more +comfortable with the packaging process, but it is still a bit bewildering given +all the choices. + +Conclusion +---------- + +I hope folks who are interested in history, cryptography, World War II, and +Python find Py-Enigma useful and fun! You can get the code from either the +`Py-Enigma Bitbucket page`_ or from PyPI_. And a big thank-you to Dirk +Rijmenants! Without his hard work and detailed explanation, Py-Enigma would +have been considerably more difficult. + +.. _Python: http://www.python.org +.. _Cipher Machines and Cryptology website: http://users.telenet.be/d.rijmenants/index.htm +.. _Technical Details of the Enigma Machine: http://users.telenet.be/d.rijmenants/en/enigmatech.htm +.. _Sphinx: http://sphinx.pocoo.org/ +.. _Django: http://www.djangoproject.com/ +.. _Py-Enigma Bitbucket page: https://bitbucket.org/bgneal/enigma +.. _PyPI: http://pypi.python.org/pypi/py-enigma/
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/Coding/019-enigma-challenge.rst Thu Jan 30 21:45:03 2014 -0600 @@ -0,0 +1,223 @@ +Completing the Enigma Challenge +############################### + +:date: 2012-07-22 13:30 +:tags: Enigma, Py-Enigma, Cpp-Enigma, C++ +:slug: completing-the-enigma-challenge +:author: Brian Neal + +Since my last `blog post +<http://deathofagremmie.com/2012/06/06/introducing-py-enigma/>`_, I have spent +almost all my free time working on `Dirk Rijmenants' Enigma Challenge +<http://users.telenet.be/d.rijmenants/en/challenge.htm>`_. I'm very happy to +report that I cracked the last message on the evening of July 10. I thought it +might be interesting to summarize my experiences working on this challenge in a +blog post. + +The first nine +-------------- + +I had a lot of fun building Py-Enigma_, an Enigma machine simulator library +written in Python_ based on the information I found on Dirk's `Technical Details +of the Enigma Machine`_ page. After I had built it and played with it for a +while, I discovered Dirk also had an `Enigma Challenge`_. I decided this would +be a good test for Py-Enigma_, and it might give me some ideas on how to improve +the library for breaking messages. + +The first nine messages were pretty easy to solve using Py-Enigma_. Dirk gives +you most of the key information for each problem, and one can then write a small +program to brute force the missing information. I was able to solve the first +nine messages in a weekend. This was very enjoyable, as it did require some +thinking about how to simulate the problem and then executing it. I also learned +about `Python's itertools <http://docs.python.org/library/itertools.html>`_ +library, which contains some very handy functions for generating permutations +and combinations for brute forcing a solution. + +Dirk did a great job on the messages. They referenced actual events from World +War II and I ended up spending a fair amount of time on Wikipedia reading about +them. Translating the messages from German to English was a bit difficult, but I +relied heavily on `Google Translate <http://translate.google.com>`_. + +Message 10 +---------- + +But then I came to the last message. Here, Dirk doesn't give you any key +information, just ciphertext. Gulp. If you include 2 possible reflectors, the +Enigma machine's key space for message 10 is somewhere around 2 x 10\ +:sup:`23` keys, so it is impossible to brute force every possible key, unless +you can afford to wait around for the heat death of the universe. + +I then did some research, and learned about a technique for brute-forcing only +the rotor and message settings, and then performing a heuristic technique called +"hill-climbing" on the plugboard settings. I am deeply indebted to Frode Weierud +and Geoff Sullivan, who have published high-level descriptions of this +technique. See Frode's `The Enigma Collection +<http://cryptocellar.org/Enigma/>`_ page and Geoff's `Crypto Barn +<http://www.hut-six.co.uk/>`_ for more information. + +Python is too slow, or is it? +----------------------------- + +After a week or so of studying and tinkering, I came up with a hill-climbing +algorithm in Python. I then started an attempt to break message 10, but after +running the program for a little over 24 hours, I realized it was too slow. I +did a "back of the envelope" calculation and determined I would need several +hundred years to hill-climb every rotor and message setting. This was a bit +disappointing, but I knew it was a possibility going in. 
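+
+To give a sense of the shape of that search, here is a sketch of the outer
+loop. This is not my actual cracker; the names and the scoring stub are made
+up for illustration:
+
+.. sourcecode:: python
+
+    import itertools
+
+    ROTORS = ['I', 'II', 'III', 'IV', 'V']
+    ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
+
+    def hill_climb(rotor_order, reflector, msg_key):
+        """Stand-in for the plugboard hill-climb; returns a fitness score."""
+        return 0  # the real version decrypts the ciphertext and scores it
+
+    best_score, best_key = -1, None
+    for reflector in ('B', 'C'):
+        for rotor_order in itertools.permutations(ROTORS, 3):       # 60 orders
+            for msg_key in itertools.product(ALPHABET, repeat=3):   # 26**3 keys
+                score = hill_climb(rotor_order, reflector, msg_key)
+                if score > best_score:
+                    best_score, best_key = score, (rotor_order, reflector, msg_key)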
+ +I then set out, somewhat reluctantly, to port Py-Enigma to C++. These days I +don't do any C++ at home, only at work. However I got much happier when I +decided this was an opportunity to try out some new C++11 features (the embedded +compiler we use at work has no support for C++11). Okay, things are fun again! I +have to say that C++11 is really quite enjoyable, and I made good use of the new +``auto`` style variable declarations, range-based for-loops, and brace-style +initialization for containers like ``vector`` and ``map``. + +C++ is too slow, or is it? +-------------------------- + +It took a fair amount of time to re-tool my code to C++11. I ensured all my unit +tests originally written in Python could pass in C++, and that I could solve +some of the previous messages. I then re-implemented my hill-climbing +algorithm in C++ and made another attempt at cracking message 10. + +My C++ solution was indeed faster, but only by a factor of 8 or 9. I calculated +I would need about 35 years to hill-climb all rotor and message settings, and I +didn't really want to see if I could out-live my program. My morale was very low +at this point. I wasn't sure if I could ever solve this self-imposed problem. + +Algorithms & Data Structures are key +------------------------------------ + +I had wanted to test myself with this problem and had avoided looking at anyone +else's source code. However, at this juncture, I needed some help. I turned to +Stephan Krah's `M4 Message Breaking Project +<http://www.bytereef.org/m4_project.html>`_. Stephan had started a distributed +computing attack on several 4-rotor naval Enigma machine messages. He had +graciously made his source code available. Perhaps I could get some hints by +looking at his code. + +Indeed, Stephan's code provided the break I needed. I discovered a very cool +trick that Stephan was doing right before he started his hill-climb. He +pre-computed every path through the machine for every letter of the alphabet. +Since hill-climbing involves running the message through the machine roughly +600 to 1000 times, this proved to be a huge time saver. After borrowing this +idea, my hill-climbing times for one fixed rotor & message setting went down +from 250 milliseconds to 2! + +I immediately stopped looking at Stephan's code at this point; I didn't bother +to compare our hill-climbing algorithms. I believe both of us are influenced by +Weierud & Sullivan here. I was now convinced I had the speed-up I needed to make +a serious attempt at message 10 again. + +This showed me how critically important it is to have the right data structure +and algorithm for a problem. However there were two other lessons I needed to +learn before I was successful. + +Utilize all your computing resources +------------------------------------ + +Up to this point my cracking program was linear. It started at the first rotor +and message setting, then hill-climbed and noted the score I got. Then it +proceeded to the next message setting and hill-climbed, noting the score of that +attempt, and so on. There were two problems with this approach; the first was +that it searched with only a single process on a single core. + +I realized if I could do some sort of multi-processing, I could search the key +space faster. I thought about forking off multiple processes or maybe using +threads, but that was over-thinking it. In the end, I added command-line +arguments to the program to tell it where in the key space to start looking. I +could then simply run multiple instances of my program. 
My development laptop, +which runs Ubuntu, is from 2008, and it has 2 cores in it. My desktop PC (for +gaming) has 4 cores! I next spent a couple nights installing MinGW_ and getting +my application to build in that environment. I could now run 6 instances of my +cracking program. Hopefully this would pay off and shorten the time further. + +(I didn't even attempt to commandeer my wife's and kids' computers. That +wouldn't have gone down so well.) + +How do you know when you are done? +---------------------------------- + +The final problem that I had was determining when my cracking program had in +fact cracked the message. The hill-climbing attempts produced scores for each +key setting. Up to this point, based on running my program on messages 1 through +9, I had noticed that when I got a score divided by the message length of around +10 I had cracked the message. But my tests on messages 1 through 9 were +hill-climbing using the known ring settings. My message cracker was using a +shortcut: I was only searching ring settings on the "fast" rotor to save time. +This would let me get the first part of a message, but if the middle or +left-most rotor stepped at the wrong time the rest of the message would be +unintelligible. + +To verify my message cracking program, I ran it on some of the previous +messages. I was shooting for a score divided by message length of 10. Using this +heuristic, I was in fact able to crack some of the previous messages, but not +others. It took me a while to realize that not searching the middle rotor's ring +setting was causing this. The light bulb came on, and I realized that my +heuristic of 10 is only valid when I search all the ring settings. Instead I +should just keep track of the highest scores. When you get close enough, the +score will be significantly higher than the average score. Thus I may not know when +I am done until I see a large spike in the score. I would then have to write +another program to refine the solution to search for the middle and left-most +ring setting. + +Success at last +--------------- + +It took a little over a month of evenings and weekends to get to this point. +I even had a false start where I ran my program on 6 cores for 1.5 days only to +realize I had a bug in my code and I was only searching ring setting 0 on the +fast ring. But once I had that worked out, I started a search on 6 cores on a +Sunday night. I checked the logs off and on Monday, but no significant spike in +the scores was noted. Before leaving for work on Tuesday I checked again, and +noticed that one instance of my program had found a score twice as high as any +other. Could that be it? I really wanted to stay and find out, but I had to go +to work! Needless to say I was a bit distracted at work that day. + +After work, I ran the settings through my Python application and my heart +pounded as I recognized a few German words at the beginning of the message. In +fact I had gotten around 75% of the message, then some garbage, but then the +last few words were recognizable German. I then wrote another application to +search the middle ring setting space. Using the highest score from that program +allowed me to finally see the entire message. I cannot tell you how happy and +relieved I was at the same time! + +Final thoughts +-------------- + +For a while I didn't think I was going to be smart enough or just plain up to +the task of solving message 10. 
Looking back on it, I see a nice progression of +trying things, experimenting, setbacks, and a few "a-ha" moments that led to the +breakthrough. I am very grateful to Dirk Rijmenants for creating the challenge, +which was a lot of fun. And I also wish to thank Frode Weierud and Geoff +Sullivan for their hill-climbing algorithm description. Thanks again to Stephan +Krah for the key speedup that made my attempt possible. + +For now I have decided not to publish my cracking program, since it is a tiny +bit specific to Dirk's challenge. I don't want to ruin his Enigma Challenge by +giving anything away. I don't believe my broad descriptions here have revealed +too much. Other people need to experience the challenge for themselves to fully +appreciate the thrill and hard work that go into code breaking. + +I should also note that there are other ways of solving Dirk's challenge. You +could, with a bit of patience, solve messages 1 through 9 with one of the many +Enigma Machine GUI simulators (Dirk has a fine one). I'm not certain how +you could break message 10 using a GUI simulator and trial and error, however. + +It is my understanding that this style of Enigma message breaking was not the +technique used by the allied code-breakers during the war. I believe they used +knowledge of certain key words, or "cribs", found in regularly transmitted +messages such as weather reports, to find the key for the day. I look forward +to reading more about their efforts now. + +Finally, I have published the C++11 port of Py-Enigma_, which I am calling +(rather unimaginatively) Cpp-Enigma_ on bitbucket. I hope Cpp-Enigma_ and +Py-Enigma_ are useful for other people who want to explore the fascinating world +of Enigma codes. + +.. _Py-Enigma: https://bitbucket.org/bgneal/enigma +.. _Python: http://www.python.org +.. _Technical Details of the Enigma Machine: http://users.telenet.be/d.rijmenants/en/enigmatech.htm +.. _Enigma Challenge: http://users.telenet.be/d.rijmenants/en/challenge.htm +.. _Cpp-Enigma: https://bitbucket.org/bgneal/cpp-enigma +.. _MinGW: http://www.mingw.org/
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/content/Coding/020-django-moinmoin.rst Thu Jan 30 21:45:03 2014 -0600 @@ -0,0 +1,443 @@ +Integrating Django and MoinMoin with Redis +########################################## + +:date: 2012-12-02 14:50 +:tags: Django, MoinMoin, Redis +:slug: integrating-django-and-moinmoin-with-redis +:author: Brian Neal + +We want a Wiki! +=============== + +Over at SurfGuitar101.com_, we decided we'd like to have a wiki to capture +community knowledge. I briefly looked at candidate wiki engines with an eye +towards integrating them with Django_, the framework that powers +SurfGuitar101.com_. And of course I was biased towards a wiki solution that was +written in Python_. I had tried a few wikis in the past, including the behemoth +MediaWiki_. MediaWiki is a very powerful piece of software, but it is also +quite complex, and I didn't want to have to maintain a PHP infrastructure to run +it. + +Enter MoinMoin_. This is a mature wiki platform that is actively maintained and +written in Python. It is full-featured but did not seem overly complex to me. +It stores its pages in flat files, which seemed appealing for our likely small +wiki needs. It turns out I had been a user of MoinMoin for many years without +really knowing it. The `Python.org wiki`_, `Mercurial wiki`_, and `omniORB +wiki`_ are all powered by MoinMoin_. We'd certainly be in good company. + +Single Sign-On +============== + +The feature that clinched it was MoinMoin's flexible `authentication system`_. +It would be very desirable if my users did not have to sign into Django and +then sign in again to the wiki with possibly a different username. Managing two +different password databases would be a real headache. The ideal solution would +mean signing into Django would log the user into MoinMoin with the same +username automatically. + +MoinMoin supports this with their `external cookie authentication`_ mechanism. +The details are provided in the previous link; basically a Django site needs to +perform the following: + +#. Set an external cookie for MoinMoin to read whenever a user logs into Django. +#. To prevent spoofing, the Django application should create a record, in some + shared storage area accessible to MoinMoin, noting that the cookie was created. + This allows MoinMoin to validate that the cookie is legitimate and not a fake. +#. When the user logs out of the Django application, it should delete this + external cookie and the entry in the shared storage area. +#. Periodically the Django application should expire entries in the shared + storage area for clean-up purposes. Otherwise this storage would grow and + grow if users never logged out. Deleting entries older than the external + cookie's age should suffice. + +My Django Implementation +======================== + +There are of course many ways to approach this problem. Here is what I came up +with. I created a Django application called *wiki* to hold this integration +code. There is quite a lot of code here, too much to conveniently show in this +blog post. I will post snippets below, but you can refer to the complete code +in my `Bitbucket repository`_. You can also view online the `wiki application in +bitbucket`_ for convenience. + +Getting notified when users log into or out of Django is made easy thanks to +Django's `login and logout signals`_. By creating a signal handler I can be +notified when a user logs in or out. The signal handler code looks like this: + +.. 
sourcecode:: python + + import logging + from django.contrib.auth.signals import user_logged_in, user_logged_out + from wiki.constants import SESSION_SET_MEMBER + + logger = logging.getLogger(__name__) + + def login_callback(sender, request, user, **kwargs): + """Signal callback function for a user logging in. + + Sets a flag for the middleware to create an external cookie. + + """ + logger.info('User login: %s', user.username) + + request.wiki_set_cookie = True + + def logout_callback(sender, request, user, **kwargs): + """Signal callback function for a user logging out. + + Sets a flag for the middleware to delete the external cookie. + + Since the user is about to log out, her session will be wiped out after + this function returns. This forces us to set an attribute on the request + object so that the response middleware can delete the wiki's cookie. + + """ + if user: + logger.info('User logout: %s', user.username) + + # Remember what Redis set member to delete by adding an attribute to the + # request object: + request.wiki_delete_cookie = request.session.get(SESSION_SET_MEMBER) + + + user_logged_in.connect(login_callback, dispatch_uid='wiki.signals.login') + user_logged_out.connect(logout_callback, dispatch_uid='wiki.signals.logout') + +When a user logs in I want to create an external cookie for MoinMoin. But +cookies can only be created on HttpResponse_ objects, and all we have access to +here in the signal handler is the request object. The solution here is to set +an attribute on the request object that a later piece of middleware_ will +process. I at first resisted this approach, thinking it was kind of hacky. +I initially decided to set a flag in the session, but then found out that the +session is not always available. I then reviewed some of the +Django-supplied middleware classes and saw that they also set attributes on the +request object, so this must be an acceptable practice. + +My middleware looks like this: + +.. sourcecode:: python + + class WikiMiddleware(object): + """ + Check for flags on the request object to determine when to set or delete an + external cookie for the wiki application. When creating a cookie, also + set an entry in Redis that the wiki application can validate to prevent + spoofing. + + """ + + def process_response(self, request, response): + + if hasattr(request, 'wiki_set_cookie'): + create_wiki_session(request, response) + elif hasattr(request, 'wiki_delete_cookie'): + destroy_wiki_session(request.wiki_delete_cookie, response) + + return response + +The ``create_wiki_session()`` function creates the cookie for MoinMoin and +stores a hash of the cookie in a shared storage area for MoinMoin to validate. +In our case, Redis_ makes an excellent shared storage area. We create a sorted +set in Redis to store our cookie hashes. The score for each hash is the +timestamp of when the cookie was created. This allows us to easily delete +expired cookies by score periodically. + +.. sourcecode:: python + + def create_wiki_session(request, response): + """Sets up the session for the external wiki application. + + Creates the external cookie for the Wiki. + Updates the Redis set so the Wiki can verify the cookie. + + """ + now = datetime.datetime.utcnow() + value = cookie_value(request.user, now) + response.set_cookie(settings.WIKI_COOKIE_NAME, + value=value, + max_age=settings.WIKI_COOKIE_AGE, + domain=settings.WIKI_COOKIE_DOMAIN) + + # Update a sorted set in Redis with a hash of our cookie and a score + # of the current time as a timestamp. 
This allows us to delete old + # entries by score periodically. To verify the cookie, the external wiki + # application computes a hash of the cookie value and checks to see if + # it is in our Redis set. + + h = hashlib.sha256() + h.update(value) + name = h.hexdigest() + score = time.mktime(now.utctimetuple()) + conn = get_redis_connection() + + try: + conn.zadd(settings.WIKI_REDIS_SET, score, name) + except redis.RedisError: + logger.error("Error adding wiki cookie key") + + # Store the set member name in the session so we can delete it when the + # user logs out: + request.session[SESSION_SET_MEMBER] = name + +We store the name of the Redis set member in the user's session so we can +delete it from Redis when the user logs out. During logout, this set member is +retrieved from the session in the logout signal handler and stored on the +request object. This is because the session will be destroyed after the logout +signal handler runs and before the middleware can access it. The middleware +can check for the existence of this attribute as its cue to delete the wiki +session. + +.. sourcecode:: python + + def destroy_wiki_session(set_member, response): + """Destroys the session for the external wiki application. + + Delete the external cookie. + Deletes the member from the Redis set as this entry is no longer valid. + + """ + response.delete_cookie(settings.WIKI_COOKIE_NAME, + domain=settings.WIKI_COOKIE_DOMAIN) + + if set_member: + conn = get_redis_connection() + try: + conn.zrem(settings.WIKI_REDIS_SET, set_member) + except redis.RedisError: + logger.error("Error deleting wiki cookie set member") + +As suggested in the MoinMoin external cookie documentation, I create a cookie +whose value consists of the username, email address, and a key separated by +the ``#`` character. The key is just a string of stuff that makes it difficult +for a spoofer to recreate. + +.. sourcecode:: python + + def cookie_value(user, now): + """Creates the value for the external wiki cookie.""" + + # The key part of the cookie is just a string that would make things + # difficult for a spoofer; something that can't be easily made up: + + h = hashlib.sha256() + h.update(user.username + user.email) + h.update(now.isoformat()) + h.update(''.join(random.sample(string.printable, 64))) + h.update(settings.SECRET_KEY) + key = h.hexdigest() + + parts = (user.username, user.email, key) + return '#'.join(parts) + +Finally on the Django side we should periodically delete expired Redis set +members in case users do not log out. Since I am using Celery_ with my Django +application, I created a Celery task that runs periodically to delete old set +members. This function is a bit longer than it probably needs to be, but +I wanted to log how big this set is before and after we cull the expired +entries. + +.. sourcecode:: python + + @task + def expire_cookies(): + """ + Periodically run this task to remove expired cookies from the Redis set + that is shared between this Django application & the MoinMoin wiki for + authentication. 
+ + """ + now = datetime.datetime.utcnow() + cutoff = now - datetime.timedelta(seconds=settings.WIKI_COOKIE_AGE) + min_score = time.mktime(cutoff.utctimetuple()) + + conn = get_redis_connection() + + set_name = settings.WIKI_REDIS_SET + try: + count = conn.zcard(set_name) + except redis.RedisError: + logger.error("Error getting zcard") + return + + try: + removed = conn.zremrangebyscore(set_name, 0.0, min_score) + except redis.RedisError: + logger.error("Error removing by score") + return + + total = count - removed + logger.info("Expire wiki cookies: removed %d, total is now %d", + removed, total) + +MoinMoin Implementation +======================= + +As described in the MoinMoin external cookie documentation, you have to +configure MoinMoin to use your external cookie authentication mechanism. +It is also nice to disable the ability for the MoinMoin user to change their +username and email address since that is being managed by the Django +application. These changes to the MoinMoin ``Config`` class are shown below. + +.. sourcecode:: python + + class Config(multiconfig.DefaultConfig): + + # ... + + # Use ExternalCookie method for integration authentication with Django: + auth = [ExternalCookie(autocreate=True)] + + # remove ability to change username & email, etc. + user_form_disable = ['name', 'aliasname', 'email',] + user_form_remove = ['password', 'password2', 'css_url', 'logout', 'create', + 'account_sendmail', 'jid'] + +Next we create an ``ExternalCookie`` class and associated helper functions to +process the cookie and verify it in Redis. This code is shown in its entirety +below. It is based off the example in the MoinMoin external cookie +documentation, but uses Redis as the shared storage area. + +.. sourcecode:: python + + import hashlib + import Cookie + import logging + + from MoinMoin.auth import BaseAuth + from MoinMoin.user import User + import redis + + COOKIE_NAME = 'YOUR_COOKIE_NAME_HERE' + + # Redis connection and database settings + REDIS_HOST = 'localhost' + REDIS_PORT = 6379 + REDIS_DB = 0 + + # The name of the set in Redis that holds cookie hashes + REDIS_SET = 'wiki_cookie_keys' + + logger = logging.getLogger(__name__) + + + def get_cookie_value(cookie): + """Returns the value of the Django cookie from the cookie. + None is returned if the cookie is invalid or the value cannot be + determined. + + This function works around an issue with different Python versions. + In Python 2.5, if you construct a SimpleCookie with a dict, then + type(cookie[key]) == unicode + whereas in later versions of Python: + type(cookie[key]) == Cookie.Morsel + """ + if cookie: + try: + morsel = cookie[COOKIE_NAME] + except KeyError: + return None + + if isinstance(morsel, unicode): # Python 2.5 + return morsel + elif isinstance(morsel, Cookie.Morsel): # Python 2.6+ + return morsel.value + + return None + + + def get_redis_connection(host=REDIS_HOST, port=REDIS_PORT, db=REDIS_DB): + """ + Create and return a Redis connection using the supplied parameters. + + """ + return redis.StrictRedis(host=host, port=port, db=db) + + + def validate_cookie(value): + """Determines if cookie was created by Django. Returns True on success, + False on failure. + + Looks up the hash of the cookie value in Redis. If present, cookie + is deemed legit. 
+ + """ + h = hashlib.sha256() + h.update(value) + set_member = h.hexdigest() + + conn = get_redis_connection() + success = False + try: + score = conn.zscore(REDIS_SET, set_member) + success = score is not None + except redis.RedisError: + logger.error('Could not check Redis for ExternalCookie auth') + + return success + + + class ExternalCookie(BaseAuth): + name = 'external_cookie' + + def __init__(self, autocreate=False): + self.autocreate = autocreate + BaseAuth.__init__(self) + + def request(self, request, user_obj, **kwargs): + user = None + try_next = True + + try: + cookie = Cookie.SimpleCookie(request.cookies) + except Cookie.CookieError: + cookie = None + + val = get_cookie_value(cookie) + if val: + try: + username, email, _ = val.split('#') + except ValueError: + return user, try_next + + if validate_cookie(val): + user = User(request, name=username, auth_username=username, + auth_method=self.name) + + changed = False + if email != user.email: + user.email = email + changed = True + + if user: + user.create_or_update(changed) + if user and user.valid: + try_next = False + + return user, try_next + +Conclusion +========== + +I've been running this setup for a month now and it is working great. My users +and I are enjoying our shiny new MoinMoin wiki integrated with our Django +powered community website. The single sign-on experience is quite seamless and +eliminates the need for separate accounts. + + +.. _SurfGuitar101.com: http://surfguitar101.com +.. _Django: https://www.djangoproject.com +.. _MediaWiki: http://www.mediawiki.org +.. _MoinMoin: http://moinmo.in/ +.. _Python: http://www.python.org +.. _Python.org wiki: http://wiki.python.org/moin/ +.. _Mercurial wiki: http://mercurial.selenic.com/wiki/ +.. _omniORB wiki: http://www.omniorb-support.com/omniwiki +.. _authentication system: http://moinmo.in/HelpOnAuthentication +.. _external cookie authentication: http://moinmo.in/HelpOnAuthentication/ExternalCookie +.. _login and logout signals: https://docs.djangoproject.com/en/1.4/topics/auth/#login-and-logout-signals +.. _HttpResponse: https://docs.djangoproject.com/en/1.4/ref/request-response/#httpresponse-objects +.. _middleware: https://docs.djangoproject.com/en/1.4/topics/http/middleware/ +.. _Redis: http://redis.io/ +.. _Bitbucket repository: https://bitbucket.org/bgneal/sg101 +.. _wiki application in bitbucket: https://bitbucket.org/bgneal/sg101/src/a5b8f25e1752faf71ed429ec7f22ff6f3b3dc851/wiki?at=default +.. _Celery: http://celeryproject.org/
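
If you want to poke at the shared-storage scheme by itself, outside of Django
and MoinMoin, here is a minimal sketch of the round trip. It assumes a Redis
server on localhost, Python 2 (like the code above), and the redis-py 2.x
``StrictRedis`` API used in the snippets above (redis-py 3.x changed ``zadd``
to take a mapping); the cookie value is a made-up example:

.. sourcecode:: python

    import hashlib
    import time

    import redis

    conn = redis.StrictRedis(host='localhost', port=6379, db=0)

    # Django side: hash the cookie value and record it with a timestamp score.
    value = 'alice#alice@example.com#0123abcd'  # username#email#key (made up)
    name = hashlib.sha256(value).hexdigest()
    conn.zadd('wiki_cookie_keys', time.time(), name)

    # MoinMoin side: the cookie is valid if its hash has a score in the set.
    assert conn.zscore('wiki_cookie_keys', name) is not None

    # Clean-up side: drop entries older than the cookie age (an hour here).
    conn.zremrangebyscore('wiki_cookie_keys', 0, time.time() - 3600)

This also shows why a sorted set is such a good fit here: validation is a
single ``ZSCORE`` lookup, and expiry is a single ``ZREMRANGEBYSCORE`` sweep.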
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/021-python-chained-assignment.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,83 @@
A C & Python chained assignment gotcha
######################################

:date: 2012-12-27 14:45
:tags: Python, C++
:slug: a-c-python-chained-assignment-gotcha
:author: Brian Neal

Late last night I had a marathon debugging session where I discovered I had
been burned by not fully understanding chained assignment statements in
Python. I was porting some C code to Python that had some chained assignment
expressions. C and C++ programmers are well used to this idiom, which has the
following meaning:

.. sourcecode:: c

    a = b = c = d = e; // C/C++ code

    // The above is equivalent to this:

    a = (b = (c = (d = e)));

This is because in C, assignments are actually expressions that return a value,
and they are right-associative.

I knew that Python supported this syntax, and I had a vague memory that it was
not the same semantically as C, but I was in a hurry. After playing a bit in
the shell I convinced myself this chained assignment was doing what I wanted.
My Python port kept this syntax and I drove on. A huge mistake!

Hours later, of course, I found out the hard way that the two are not exactly
equivalent. For one thing, in Python, assignment is a statement, not an
expression. There is no 'return value' from an assignment. The Python syntax
does allow chaining for convenience, but the meaning is subtly different.

.. sourcecode:: python

    a = b = c = d = e # Python code

    # The above is equivalent to these lines of code:
    a = e
    b = e
    c = e
    d = e

Now usually, I suspect, you can mix the C/C++ meaning with Python and not get
tripped up. But I was porting some tricky red-black tree code, and it made
a huge difference. Here is the C code first, and then the Python.

.. sourcecode:: c

    p = p->link[last] = tree_rotate(q, dir);

    // The above is equivalent to:

    p = (p->link[last] = tree_rotate(q, dir));


The straight (and incorrect) Python port of this code:

.. sourcecode:: python

    p = p.link[last] = tree_rotate(q, d)

    # The above code is equivalent to this:
    temp = tree_rotate(q, d)
    p = temp # Oops
    p.link[last] = temp # Oops

Do you see the problem? It is glaringly obvious to me now. The C and Python
versions are not equivalent because the Python version is executing the code
in a different order. The flaw comes about because ``p`` is used multiple
times in the chained assignment and is now susceptible to an out-of-order
problem.

In the C version, the tree node pointed at by ``p`` has one of its child links
changed first, then ``p`` is advanced to the value of the new child. In the
Python version, the tree node referenced by the name ``p`` is changed first,
and then its child link is altered! This introduced a very subtle bug that
cost me a few hours of bleary-eyed debugging.

Watch out for this when you are porting C to Python or vice versa. I already
avoid this syntax in both languages in my own code, but I do admit it is nice
for conciseness, and I let it slip in occasionally.
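
For the record, the fix is simply to break the chain and make the C evaluation
order explicit, reusing the ``temp`` variable from the expansion above:

.. sourcecode:: python

    temp = tree_rotate(q, d)
    p.link[last] = temp  # update the old node's child link first, as C does
    p = temp             # then advance p to the new child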
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/022-python-red-black-tree.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,63 @@
A Red-Black Tree in Python
##########################

:date: 2012-12-28 11:10
:tags: Python, datastructures
:slug: a-red-black-tree-in-python
:author: Brian Neal

I've been working my way through Allen Downey's `Think Complexity`_ book. I'm
not very far in, but so far it's a great way to brush up on algorithms and
datastructures, and learn some new stuff about complexity science. Plus, all
of the code is written in Python! I've been doing the exercises, and most of
them take at most fifteen or twenty minutes. But at the end of his explanation
on hash tables he casually lobs this one out (3.4, exercise 5):

    A drawback of hashtables is that the elements have to be hashable, which
    usually means they have to be immutable. That’s why, in Python, you can
    use tuples but not lists as keys in a dictionary. An alternative is to
    use a tree-based map.

    Write an implementation of the map interface called TreeMap that uses
    a red-black tree to perform add and get in log time.

I've never researched red-black trees before, but as a C++ programmer I know
they are the datastructure that powers ``std::set`` and ``std::map``. So
I decided to take a look. I quickly realized this was not going to be a simple
exercise, as red-black trees are quite complicated to understand and
implement. They are basically binary search trees that do a lot of work to
keep themselves approximately balanced.

I spent a few nights reading up on red-black trees. A good explanation can be
found in Wikipedia_. There are even a handful of Python implementations
floating about, of varying quality. But finally I found a detailed explanation
that really clicked with me at `Eternally Confuzzled`_. Julienne Walker
derives a unique algorithm based upon the rules for red-black trees, and the
implementation code is non-recursive and top-down. Most of the other
implementations I found on the web seem to be based on the textbook
`Introduction To Algorithms`_, and often involve parent pointers and dummy
nodes to represent the nil leaves of the tree. Julienne's solution avoided
these things and seemed a bit less complex. However, the best reason to study
the tutorial was that the explanation was very coherent and detailed. The
other sources on the web seemed to be fragmented, missing details, and lacking
in explanation.

So to complete my `Think Complexity`_ exercise, I decided to port Julienne's
red-black tree algorithm from C to Python, and hopefully learn something along
the way. After a couple nights of work, and one `very embarrassing bug`_, I've
completed it. I can't say I quite understand every bit of the algorithm, but
I certainly learned a lot. You can view the `source code at Bitbucket`_, or
clone my `Think Complexity repository`_.

Many thanks to Julienne Walker for the `great tutorial`_! And I highly
recommend `Think Complexity`_. Check them both out.


.. _Think Complexity: http://www.greenteapress.com/compmod/
.. _Wikipedia: http://en.wikipedia.org/wiki/Red%E2%80%93black_tree
.. _Eternally Confuzzled: http://www.eternallyconfuzzled.com/tuts/datastructures/jsw_tut_rbtree.aspx
.. _great tutorial: http://www.eternallyconfuzzled.com/tuts/datastructures/jsw_tut_rbtree.aspx
.. _Introduction to Algorithms: http://mitpress.mit.edu/books/introduction-algorithms
.. _very embarrassing bug: http://deathofagremmie.com/2012/12/27/a-c-python-chained-assignment-gotcha/
.. _source code at Bitbucket: https://bitbucket.org/bgneal/think_complexity/src/0326803882adc4a598d890ee4d7d39d93cb64af7/redblacktree.py?at=default
.. _Think Complexity repository: https://bitbucket.org/bgneal/think_complexity
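
If you are curious what the top-down style looks like in Python, here is a
small sketch of the node structure my port is built around. It follows
Julienne's C layout (children in a two-element ``link`` list rather than
``left``/``right`` attributes), but treat it as illustrative rather than an
excerpt from my repository:

.. sourcecode:: python

    class Node(object):
        """A red-black tree node in the style of Julienne Walker's C code."""

        def __init__(self, key, value):
            self.key = key
            self.value = value
            self.red = True           # new nodes are inserted red
            self.link = [None, None]  # link[0] = left child, link[1] = right


    def is_red(node):
        """True if node is red; nil leaves (None) count as black."""
        return node is not None and node.red

Indexing the children with a direction of 0 or 1 lets the insertion and
deletion code collapse the left and right symmetric cases into one, which is
a big part of why this approach felt less complex to me.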
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/023-finding-empty-dirs.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,29 @@
Finding empty directories in Python
###################################

:date: 2013-05-26 16:00
:tags: Python
:slug: finding-empty-directories-in-python
:author: Brian Neal

Late one night I needed to write some Python code to recursively find empty
directories. Searching online produced some unsatisfactory solutions. Here is
a simple solution that leverages `os.walk`_.

.. sourcecode:: python

    import os

    empty_dirs = []
    for root, dirs, files in os.walk(starting_path):
        if not dirs and not files:
            empty_dirs.append(root)

If you then wish to delete these empty directories:

.. sourcecode:: python

    for path in empty_dirs:
        os.removedirs(path)

.. _os.walk: http://docs.python.org/2/library/os.html#os.walk
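
Wrapped up as a reusable function, the same idea looks like this (the function
name is mine). One caveat to be aware of: ``os.removedirs`` also prunes parent
directories that become empty once the leaf is gone, and it will keep pruning
upward even past ``starting_path``:

.. sourcecode:: python

    import os

    def remove_empty_dirs(starting_path):
        """Remove all empty directories under starting_path."""
        empty_dirs = [root for root, dirs, files in os.walk(starting_path)
                      if not dirs and not files]
        for path in empty_dirs:
            # removedirs() removes the leaf, then keeps removing parents
            # until it hits a non-empty one (that error is ignored).
            os.removedirs(path)
        return empty_dirs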
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/024-m209.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,82 @@
Introducing m209
################

:date: 2013-08-01 20:05
:tags: Python, m209, Enigma
:slug: introducing-m209
:author: Brian Neal

I'm very pleased to announce yet another M-209_ simulator written in Python,
creatively called m209_. Last summer I worked on Enigma_ simulators in both
`Python <http://py-enigma.readthedocs.org/en/latest/>`_ and `C++
<https://bitbucket.org/bgneal/cpp-enigma>`_, and I thought it would be fun to
try another World War II-era crypto device. m209_ is a Python 3 library and
command-line utility for encrypting and decrypting text by simulating the
operation of an actual M-209_ device.

One fun part about doing something like this is researching the original
device. It seems like there are more resources online about the M-209_ than
the Enigma_. I even found an actual 1940's War Department training film on
YouTube that explains how to operate the M-209_, including the procedure for
encrypting and decrypting messages! I want to thank `Mark J. Blair`_ for his
very informative pages on the M-209_, which were very helpful to me. Check out
the `m209 references section
<https://m209.readthedocs.org/en/latest/#references>`_ for these and other
useful links.

The M-209_ isn't as complex as the Enigma_. That isn't meant to knock it. The
M-209_, while cryptographically not as secure as the Enigma_, is a remarkable
piece of mechanical engineering. It is much more portable and easier to
operate compared to the Enigma_. It has user-friendly features like printing
to paper tape and a letter counter for backing up when mistakes are made.
According to Wikipedia, about 140,000 of these machines were produced. They
even come up on eBay a few times a year, and seem to go for between $1000 and
$2000 USD. Maybe someday I can score an actual unit!

Coding the actual simulator wasn't all that hard. I spent much more time on
the unit tests, documentation, and creating an application to generate key
lists. Writing the documentation gave me some good practice with Sphinx_, an
awesome Python-based documentation tool that uses the `reStructured Text`_
markup language.

Writing the key list generator was actually the hardest part. The procedure
for creating key lists is spelled out in an M-209 manual from 1944 (which
exists online as a series of photos). The procedure is kind of loosely
specified, and a lot is left up to the soldier creating the key list. I came
up with an ad-hoc, heuristic-based algorithm that works most of the time. If
it got stuck, it simply started over, retrying up to a certain number of
attempts.

While researching the procedure, I noticed what appears to be a typo in the
data tables in the manual that are used when developing a key list. On top of
that, I found several sets of initial numbers that I could not generate a key
list from. In other words, using these starting numbers, my algorithm could
not generate M-209 settings that satisfied the exit criteria for the procedure
in the manual. After a while, I just removed those troublesome initial
conditions as possible inputs. It would be interesting to return to this some
day and write a program to search the solution space exhaustively to see if
there really was a solution for these numbers. It could just be that my
trial-and-error algorithm could not find a solution, even after tens of
thousands of attempts. However, this doesn't seem likely. I wonder if these
initial settings caused lots of head scratching for the poor officer trying
to create a key list.

In any event, if you are into this kind of thing, I hope you check out m209_.
Doing a project like this is a lot of fun. I enjoy doing the research,
creating the code, and working on the test suite. I also get some practice
with Python packaging and writing documentation with Sphinx.

Future enhancements include adding the ability to read Mark Blair's key lists
that he created for his C++ simulator. This would make it easier for our two
simulators to interoperate.

Links:

* `m209 documentation <http://m209.readthedocs.org>`_
* `m209 on PyPi <https://pypi.python.org/pypi/m209>`_
* `m209 source code repository on Bitbucket <https://bitbucket.org/bgneal/m209/>`_

.. _M-209: http://en.wikipedia.org/wiki/M-209
.. _m209: http://m209.readthedocs.org/
.. _Enigma: https://en.wikipedia.org/wiki/Enigma_machine
.. _Mark J. Blair: http://www.nf6x.net/groups/m209group/
.. _Sphinx: http://sphinx-doc.org/index.html
.. _reStructured Text: http://docutils.sf.net/rst.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/025-upgrading-django1.5.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,120 @@
Upgrading to Django 1.5
#######################

:date: 2013-08-29 19:30
:tags: Django
:slug: upgrading-to-django-1.5
:author: Brian Neal

Getting started
===============

`Django`_ 1.5 came out a while ago, and I finally got around to upgrading two
of my largest sites. I thought I would make a list of what I changed at a high
level for my own reference. Perhaps someone else may find it useful as well.

In any event, I highly recommend you read the excellent `release notes`_ and
`deprecation timeline`_. I also recommend you run with warnings turned on::

    $ python -Wall manage.py runserver

This will help you flag down issues in your code. If you aren't sure where
a warning is coming from, you can turn warnings into exceptions and get
a traceback (see the Python docs on the warnings_ library). Another trick is
to put a pdb_ breakpoint in the Django code before or after the warning; then
you can examine the call stack with the ``w`` command.

Upgrading Issues
================

Here are the issues that I ran into. Of course you may have a very different
experience depending on what features of Django you used and the details of
your site. A few of the more mechanical changes are sketched in code at the
end of this post.

#. **Replace occurrences of Django's simplejson with json**. The Django team
   deprecated their version of the json library since they had dropped
   support for older versions of Python.

#. **select_related's depth argument is deprecated**. Instead of using
   ``depth`` I had to explicitly name which relationships to follow.

#. **The cleanup management command is now called clearsessions**. I have
   a Celery_ task I had to update because of this name change.

#. **Remove my {% load url from future %} tags**. Long ago I had converted to
   the newer ``url`` tag behavior by using this tag. Since this is now the
   current behavior I could remove all of these tags.

#. **The LOGIN_* settings now accept URL names**. These settings can now
   accept URL names, which allows me to be more DRY (Don't Repeat Yourself).
   I no longer have to maintain URL patterns in two places.

#. **Management commands should use self.stdout, etc. for output**. I'm not
   sure if this is required, but it makes it easier to test your management
   commands.

#. **slugify is now available at django.utils.text**. I was doing something
   kind of hacky by invoking the slugify template filter in Python code. Now
   this slugify code is more naturally available as a free Python function
   without the template filter baggage.

#. **Remove all references to django.contrib.markup**. Django decided not to
   be so tightly coupled to various third party markup languages. I can see
   their reasoning, as it does add to their maintenance burden, but it is
   kind of inconvenient if you already relied on it. Fortunately, I had
   stopped relying on the Markdown filter a while ago and had used a custom
   solution. Seeing this issue in the release notes caused me to go looking
   through my code, where I found some unused templates and an
   ``INSTALLED_APPS`` entry that I could remove. On my second site, I had one
   tiny usage of the Textile filter that I decided to just get rid of since
   it was hardly worth carrying the dependency for. I wrote a quick and dirty
   management command to convert the database from storing Textile markup to
   HTML and I was done.

#. **localflavor is deprecated in preference to a third party app**. I was
   using some US state fields in a few models. All I had to do was pip_
   install `django-localflavor`_, fix up some imports, update my requirements
   files, and I was good.

#. **Function-based generic views are now gone**. A third party application
   I am using made use of the old function-based generic views. These have
   been replaced with the newer class-based generic views. Luckily I wasn't
   actually using the views in this application; I am only relying on its
   models and utilities. In this case I simply removed the inclusion of the
   application's ``urls.py`` and all was well. The application is no longer
   maintained, so I would have had to port to the class-based views if I had
   needed them.

#. **The adminmedia template tags library is now gone**. I wasn't actually
   using this anymore, but I had a template that was still loading it. This
   cruft was removed.

What I didn't do
================

Django 1.5's largest new feature is probably its support for a `configurable
User model`_. This impacts my largest site, and I wasn't quite sure how to
proceed on this front. This is the only issue where I feel like Django threw
me under the bus. Fortunately, the ``get_profile()`` and
``AUTH_PROFILE_MODULE`` stuff will not be removed until Django 1.7. In
addition, the author of the South_ library is developing model migration
support, which will also land in Django 1.7. Thus there is some time to sort
this out, and I'm hoping Django will have some tutorials or docs on how to
migrate away from the old ``AUTH_PROFILE_MODULE`` scheme.

Conclusion
==========

The upgrade process went smoother and quicker than I thought thanks to the
excellent release notes and the Django team's use of Python warnings to flag
deprecated features.


.. _Django: https://www.djangoproject.com/
.. _release notes: https://docs.djangoproject.com/en/1.5/releases/1.5/
.. _deprecation timeline: https://docs.djangoproject.com/en/1.5/internals/deprecation/
.. _warnings: http://docs.python.org/library/warnings.html
.. _pdb: http://docs.python.org/library/pdb.html
.. _Celery: http://www.celeryproject.org/
.. _pip: http://www.pip-installer.org/en/latest/
.. _django-localflavor: https://pypi.python.org/pypi/django-localflavor
.. _configurable user model: https://docs.djangoproject.com/en/1.5/topics/auth/customizing/#auth-custom-user
.. _South: http://south.aeracode.org/
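
For reference, here is what a few of the more mechanical changes from the
list above look like in before/after form. This is just a sketch; the
``Entry`` model and its relation names are made up:

.. sourcecode:: python

    # 1. Django's bundled simplejson is deprecated; use the stdlib:
    # from django.utils import simplejson as json    # old
    import json                                      # new

    # 2. select_related() is losing its depth argument; name the
    #    relationships to follow instead:
    # Entry.objects.select_related(depth=1)               # old
    # Entry.objects.select_related('author', 'category')  # new

    # 3. slugify is now a plain function in django.utils.text:
    # from django.template.defaultfilters import slugify  # old
    from django.utils.text import slugify                 # new

    print(json.dumps({'slug': slugify(u'Upgrading to Django 1.5')}))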
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/026-haystack-safe-tip.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,78 @@
Haystack Search Quoting Issue
#############################

:date: 2013-09-21 11:45
:tags: Django, Haystack
:slug: haystack-search-quoting-issue
:author: Brian Neal

The case of the missing forum topic
===================================

I use the awesome Haystack_ search framework for my Django_ powered website.
I have found Haystack to be a huge win. It is easy to set up, configure, and
customize when you have to. As someone who doesn't know very much about the
world of searching, I'm grateful to have a powerful tool that just works
without me having to get too involved in arcane details.

One day one of our users noticed that he could not find a forum topic with
the title ``"Hawaiian" sounding chords``. Notice the word *Hawaiian* is in
quotes. The topic would turn up if you searched for *sounding* or *chords*.
But no combination of *Hawaiian*, with or without quotes, would uncover this
topic.

I should mention I am using the `Xapian backend`_. I know the backend tries
to remove punctuation and special characters to create uniform searches. But
I could not figure out where this was getting dropped. After a bit of
searching online, I found a few hints which led to the solution.

Safety versus correctness
=========================

As suggested in the documentation, I am using templates to build the document
used for the search engine. My template for forum topics looked like this::

    {{ object.name }}
    {{ object.user.username }}
    {{ object.user.get_full_name }}

A mailing list post from another user suggested the problem. Django by
default escapes text in templates. Thus the forum topic title::

    "Hawaiian" sounding chords

was being turned into this by the Django templating engine::

    &quot;Hawaiian&quot; sounding chords

Now what Haystack and/or the Xapian backend were doing with
``&quot;Hawaiian&quot;`` I have no idea. I tried searching for this unusual
term but it did not turn up any results. Apparently it is just getting
dropped.

The solution was to modify the template to this::

    {{ object.name|safe }}
    {{ object.user.username|safe }}
    {{ object.user.get_full_name|safe }}

But is it safe?
===============

After changing my template and rebuilding the index, the troublesome topic
was then found. Hooray! But have I just opened myself up to an XSS_ attack?
Can user supplied content now show up unescaped in the search results? Well,
I can't answer this authoritatively, but I did spend a fair amount of time
experimenting with this. I'm using Haystack's ``highlight`` template tag, and
my users' input is done in Markdown_, and I could not inject malicious text
into the search results. You should test this yourself on your site.

Conclusion
==========

This turned out to be a simple fix and I hope it helps someone else. I will
make enquiries to see if this should be added to the Haystack documentation.

.. _Haystack: http://haystacksearch.org/
.. _Django: https://www.djangoproject.com/
.. _Xapian backend: https://github.com/notanumber/xapian-haystack
.. _XSS: http://en.wikipedia.org/wiki/Cross-site_scripting
.. _Markdown: http://daringfireball.net/projects/markdown/
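
If you want to see the escaping behavior in isolation, here is a quick sketch
you can try in a Django shell. ``django.utils.html.escape`` is essentially
what the template engine's autoescaping applies to variables:

.. sourcecode:: python

    from django.utils.html import escape

    title = u'"Hawaiian" sounding chords'
    print(escape(title))
    # -> &quot;Hawaiian&quot; sounding chords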
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/content/Coding/027-upgrading-django1.6.rst Thu Jan 30 21:45:03 2014 -0600
@@ -0,0 +1,121 @@
Upgrading to Django 1.6
#######################

:date: 2013-12-29 18:00
:tags: Django
:slug: upgrading-to-django-1.6
:author: Brian Neal

Getting started
===============

`Django`_ 1.6 came out recently, which was soon followed by 1.6.1, and it
looks like 1.6.2 is on the way. I finally got around to upgrading two of my
largest sites. I thought I would make a list of what I changed at a high
level for my own reference. Perhaps someone else may find it useful as well.

In any event, I highly recommend you read the excellent `release notes`_ and
`deprecation timeline`_. The changes in 1.6 didn't seem groundbreaking, but
they were numerous. I spent a lot of time reading through the notes and
trying to decide if the issues affected me or not.

I recommend you run with warnings turned on::

    $ python -Wall manage.py runserver

This will help you flag down issues in your code. If you aren't sure where
a warning is coming from, you can turn warnings into exceptions and get
a traceback (see the Python docs on the warnings_ library). Another trick is
to put a pdb_ breakpoint in the Django code before or after the warning; then
you can examine the call stack with the ``w`` command.

Upgrade Issues
==============

Here are the issues that I ran into. Of course you may have a very different
experience depending on what features of Django you used and the details of
your site.

#. The location of the ``XViewMiddleware`` changed. I had to update my
   ``MIDDLEWARE_CLASSES`` setting as a result.
#. Various ``get_query_set`` to ``get_queryset`` changes. The Django
   developers have ironed out some naming inconsistencies in method names on
   both model managers and ``ModelAdmin`` classes.
#. In template / form processing, the ``label_tag`` now includes the
   ``label_suffix``. I noticed this when I saw that I had two colons on a
   form field's label.
#. One very nice change that I am pleased to see is that Django now does test
   discovery just like the unittest_ module in the standard library. To take
   advantage of this I renamed all my test modules from ``view_tests.py`` to
   ``test_views.py``, for example. This also let me get rid of ``import``
   statements in various ``__init__.py`` files in test subdirectories. In
   other words, you no longer have to have silly lines like
   ``from view_tests import *`` in your test packages' ``__init__.py`` files.
#. Django now supports database connection persistence. To take advantage of
   this you need to set the CONN_MAX_AGE_ setting to a non-zero value.
#. The ``CACHE_MIDDLEWARE_ANONYMOUS_ONLY`` setting is now deprecated and can
   be removed. For various reasons explained in the notes this never really
   worked right anyway.
#. Updated to version 1.0 of the django-debug-toolbar_. The version I was
   using would not work in Django 1.6. It is so good to see that this project
   is being actively maintained again. There are several new panels and neat
   features, check it out!
#. You now get a warning if you have a ``ModelForm`` without an ``exclude``
   or ``fields`` meta option. This is rather nice, as I have been bitten by
   this in the past when a form suddenly started showing a newly added field
   that it should not have. I added a ``fields`` option to a ``ModelForm`` as
   a result (a minimal example is sketched at the end of this post).
   Unfortunately some third party applications I am using have this problem
   as well.
#. The ``cycle`` tag has new XSS_ protection. To make use of it now, you have
   to add a ``{% load cycle from future %}`` tag into your templates.
#. The ``django.contrib.auth`` password reset function is now using base 64
   encoding of the ``User`` primary key. The details are `here
   <https://docs.djangoproject.com/en/1.6/releases/1.6/#django-contrib-auth-password-reset-uses-base-64-encoding-of-user-pk>`_.
   This affected me because I am using a custom password reset URL, and thus
   I needed to update my URL pattern for both the new parameter name and the
   regular expression for base 64. I missed this originally and I started
   getting 404s on my password reset confirmation URLs. And yes, this is
   something I should have a test for!

What I didn't do
================

Many of the warnings that I got came from third party modules that I have not
updated in a long time, including Celery_ and Haystack_. I am going to have
to schedule some time to update to the latest versions of these apps.
Hopefully the warnings will be fixed in the newer versions, but if not I can
write tickets or possibly submit patches / pull requests. This is the price
of progress, I suppose.

I also use a couple of smaller third party applications that seem to be no
longer maintained. These apps are now generating some warnings. I'll have to
fork them and fix these myself. Luckily these projects are on GitHub so this
should not be a problem.

Finally, I am still facing the problem of what to do about the deprecation of
the ``AUTH_PROFILE_MODULE`` and the ``get_profile`` method. This will be
removed in Django 1.7. I've been doing some more reading about this and I'm
less scared about it than I used to be. I'll probably just change my profile
model to have a one-to-one relationship with the provided ``User`` model.
I'll have to do some more research and thinking about this before Django 1.7.


Conclusion
==========

Once again the upgrade process went smoother and quicker than I thought,
thanks to the excellent release notes and the Django team's use of Python
warnings to flag deprecated features.


.. _Django: https://www.djangoproject.com/
.. _release notes: https://docs.djangoproject.com/en/1.6/releases/1.6/
.. _deprecation timeline: https://docs.djangoproject.com/en/1.6/internals/deprecation/
.. _warnings: http://docs.python.org/library/warnings.html
.. _pdb: http://docs.python.org/library/pdb.html
.. _unittest: http://docs.python.org/2/library/unittest.html
.. _CONN_MAX_AGE: https://docs.djangoproject.com/en/1.6/ref/settings/#conn-max-age
.. _XSS: http://en.wikipedia.org/wiki/Cross-site_scripting
.. _configurable user model: https://docs.djangoproject.com/en/1.5/topics/auth/customizing/#auth-custom-user
.. _django-debug-toolbar: https://pypi.python.org/pypi/django-debug-toolbar
.. _Celery: http://www.celeryproject.org/
.. _Haystack: http://haystacksearch.org/
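
For reference, here is what the ``ModelForm`` change mentioned above looks
like. The ``Profile`` model, its app, and the field names are made up for
illustration:

.. sourcecode:: python

    from django import forms

    from myapp.models import Profile  # hypothetical app and model

    class ProfileForm(forms.ModelForm):
        class Meta:
            model = Profile
            # Explicitly whitelist the fields so that a newly added model
            # field never silently shows up on (or is accepted by) the form:
            fields = ['display_name', 'location', 'website']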