comparison content/Coding/026-haystack-safe-tip.rst @ 4:7ce6393e6d30
Adding converted blog posts from old blog.
author | Brian Neal <bgneal@gmail.com> |
---|---|
date | Thu, 30 Jan 2014 21:45:03 -0600 |
parents | |
children | 49bebfa6f9d3 |
3:c3115da3ff73 | 4:7ce6393e6d30 |
---|---|
1 Haystack Search Quoting Issue | |
2 ############################# | |
3 | |
4 :date: 2013-09-21 11:45 | |
5 :tags: Django, Haystack | |
6 :slug: haystack-search-quoting-issue | |
7 :author: Brian Neal | |
8 | |
9 The case of the missing forum topic | |
10 =================================== | |
11 | |
12 I use the awesome Haystack_ search framework for my Django_ powered website. | |
13 I have found Haystack to be a huge win. It is easy to set up, configure, and | |
14 customize when you have to. As someone who doesn't know very much about the | |
15 world of searching, I'm grateful to have a powerful tool that just works | |
16 without me having to get too involved in arcane details. | |
17 | |
18 One day one of our users noticed that he could not find a forum topic with the | |
19 title ``"Hawaiian" sounding chords``. Notice the word *Hawaiian* is in quotes. The | |
20 topic would turn up if you searched for *sounding* or *chords*. But no | |
21 combination of *Hawaiian*, with or without quotes, would uncover this topic. | |
22 | |
23 I should mention I am using the `Xapian backend`_. I know the backend tries to | |
24 remove punctuation and special characters to create uniform searches. But | |
25 I could not figure out where the word was getting dropped. After a bit of | |
26 searching online, I found a few hints which led to the solution. | |
27 | |
28 Safety versus correctness | |
29 ========================= | |
30 | |
31 As suggested in the documentation, I am using templates to build the document | |
32 used for the search engine. My template for forum topics looked like this:: | |
33 | |
34 {{ object.name }} | |
35 {{ object.user.username }} | |
36 {{ object.user.get_full_name }} | |
37 | |
38 A mailing list post from another user suggested the problem. Django by default | |
39 escapes text in templates. Thus the forum topic title:: | |
40 | |
41 "Hawaiian" sounding chords | |
42 | |
43 was being turned into this by the Django templating engine:: | |
44 | |
45 &quot;Hawaiian&quot; sounding chords | |
46 | |
47 Now what Haystack and/or the Xapian backend were doing with | |
48 ``&quot;Hawaiian&quot;``, I have no idea. I tried searching for this unusual | |
49 term but it did not turn up any results. Apparently it is just getting dropped. | |
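
The escaping step can be reproduced outside of Django. The following is only a minimal sketch: it uses ``html.escape`` from Python's standard library as a stand-in for Django's template autoescaping (an assumption made for illustration, not the code path Haystack actually runs), and it says nothing about what Xapian does with the result.

```python
import html

# The forum topic title exactly as the user typed it.
title = '"Hawaiian" sounding chords'

# html.escape (with its default quote=True) replaces &, <, > and quote
# characters with HTML entities, much like autoescaping in a template.
escaped = html.escape(title)
print(escaped)  # &quot;Hawaiian&quot; sounding chords

# The word the user searches for is now glued to entity text, so the
# document handed to the indexer no longer contains a clean "Hawaiian".
# Unescaping recovers the original title.
restored = html.unescape(escaped)
print(restored == title)  # True
```

The point of the sketch is simply that the indexed text differs from what the user typed once escaping has been applied.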
50 | |
51 The solution was to modify the template to this:: | |
52 | |
53 {{ object.name|safe }} | |
54 {{ object.user.username|safe }} | |
55 {{ object.user.get_full_name|safe }} | |
56 | |
57 But is it safe? | |
58 =============== | |
59 | |
60 After changing my template and rebuilding the index, the troublesome topic was | |
61 then found. Hooray! But have I just opened myself up to an XSS_ attack? Can | |
62 user-supplied content now show up unescaped in the search results? Well, I can't | |
63 answer this authoritatively, but I did spend a fair amount of time experimenting | |
64 with this. I'm using Haystack's ``highlight`` template tag, and my users' input | |
65 is done in Markdown_, and I could not inject malicious text into the search | |
66 results. You should test this yourself on your site. | |
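
One way to approach that kind of testing is to post a recognizable probe string and then check whether it comes back verbatim in the rendered search results. The helper below is a hypothetical sketch: ``PROBE`` and ``probe_is_unescaped`` are names invented for this example, and the two sample pages are simulated strings rather than output from a real site.

```python
import html

# A harmless, easy-to-spot probe (hypothetical; pick your own marker).
PROBE = '<script>alert("xss-probe")</script>'


def probe_is_unescaped(rendered_html: str, probe: str = PROBE) -> bool:
    """Return True if the probe appears verbatim in the rendered page."""
    return probe in rendered_html


# Simulated search result fragments, standing in for real responses.
escaped_page = "<li>" + html.escape(PROBE) + "</li>"  # what you hope to see
raw_page = "<li>" + PROBE + "</li>"                   # what would be a problem

print(probe_is_unescaped(escaped_page))  # False
print(probe_is_unescaped(raw_page))      # True
```

A substring check like this only catches the most direct case, so it complements manual experimenting rather than replacing it.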
67 | |
68 Conclusion | |
69 ========== | |
70 | |
71 This turned out to be a simple fix, and I hope it helps someone else. I will | |
72 make enquiries to see if this should be added to the Haystack documentation. | |
73 | |
74 .. _Haystack: http://haystacksearch.org/ | |
75 .. _Django: https://www.djangoproject.com/ | |
76 .. _Xapian backend: https://github.com/notanumber/xapian-haystack | |
77 .. _XSS: http://en.wikipedia.org/wiki/Cross-site_scripting | |
78 .. _Markdown: http://daringfireball.net/projects/markdown/ |