Adding converted blog posts from old blog.
author Brian Neal <bgneal@gmail.com>
date Thu, 30 Jan 2014 21:45:03 -0600

Haystack Search Quoting Issue
#############################

:date: 2013-09-21 11:45
:tags: Django, Haystack
:slug: haystack-search-quoting-issue
:author: Brian Neal

The case of the missing forum topic
===================================

I use the awesome Haystack_ search framework for my Django_ powered website.
I have found Haystack to be a huge win. It is easy to set up, configure, and
customize when you have to. As someone who doesn't know very much about the
world of searching, I'm grateful to have a powerful tool that just works
without me having to get too involved in arcane details.

One day one of our users noticed that he could not find a forum topic with the
title ``"Hawaiian" sounding chords``. Notice the word *Hawaiian* is in quotes. The
topic would turn up if you searched for *sounding* or *chords*. But no
search involving *Hawaiian*, with or without quotes, would uncover this topic.

I should mention I am using the `Xapian backend`_. I know the backend tries to
remove punctuation and special characters to create uniform searches. But
I could not figure out where the word was getting dropped. After a bit of
searching online, I found a few hints which led to the solution.

Safety versus correctness
=========================

As suggested in the documentation, I am using templates to build the document
used for the search engine. My template for forum topics looked like this::

    {{ object.name }}
    {{ object.user.username }}
    {{ object.user.get_full_name }}

A mailing list post from another user suggested the problem. Django by default
HTML-escapes text in templates. Thus the forum topic title::

    "Hawaiian" sounding chords

was being turned into this by the Django templating engine::

    &quot;Hawaiian&quot; sounding chords

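The effect is easy to reproduce outside of Django. Here is a minimal sketch
that uses Python's standard library ``html.escape`` as a stand-in for the
template engine's auto-escaping. That substitution is an assumption made for
illustration, not the code path Haystack actually runs, and ``index_text`` is
just a hypothetical name for the text handed to the search backend:

```python
import html

# The forum topic title as the user typed it.
title = '"Hawaiian" sounding chords'

# Stand-in for Django's template auto-escaping (assumption: html.escape
# replaces double quotes with &quot;, as the template engine does).
index_text = html.escape(title)

# This is the text that would end up in the search document.
print(index_text)
```

The word *Hawaiian* no longer stands alone in the escaped text; it is fused
with the ``&quot;`` entities on either side.
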
Now what Haystack and/or the Xapian backend were doing with
``&quot;Hawaiian&quot;`` I have no idea. I tried searching for this unusual
term but it did not turn up any results. Apparently it is just getting dropped.

The solution was to mark each value with the ``safe`` filter, which tells
Django not to escape it::

    {{ object.name|safe }}
    {{ object.user.username|safe }}
    {{ object.user.get_full_name|safe }}

But is it safe?
===============

After changing my template and rebuilding the index, the troublesome topic was
then found. Hooray! But have I just opened myself up to an XSS_ attack? Can
user-supplied content now show up unescaped in the search results? Well, I can't
answer this authoritatively, but I did spend a fair amount of time experimenting
with this. I'm using Haystack's ``highlight`` template tag, and my users' input
is done in Markdown_, and I could not inject malicious text into the search
results. You should test this yourself on your site.

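One way to think about the risk: the index now holds raw text, so whatever
renders a search hit has to do the escaping. The sketch below illustrates the
idea, again with ``html.escape`` standing in for template auto-escaping.
``render_hit`` is a hypothetical helper, not part of Haystack, and this says
nothing about how any particular site renders its results:

```python
import html


def render_hit(raw_title):
    """Hypothetical helper: wrap a raw (unescaped) indexed title in HTML."""
    # Escape at output time so markup stored in the index stays inert.
    return "<li>{}</li>".format(html.escape(raw_title))


# A title that a malicious user might try to submit.
rendered = render_hit('<script>alert("xss")</script>')
print(rendered)
```

If the page that displays results escapes like this, unescaped text in the
index should not reach the browser as live markup.
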
Conclusion
==========

This turned out to be a simple fix and I hope it helps someone else. I will
make enquiries to see if this should be added to the Haystack documentation.

.. _Haystack: http://haystacksearch.org/
.. _Django: https://www.djangoproject.com/
.. _Xapian backend: https://github.com/notanumber/xapian-haystack
.. _XSS: http://en.wikipedia.org/wiki/Cross-site_scripting
.. _Markdown: http://daringfireball.net/projects/markdown/