diff custom_search/fields.py @ 943:cf9918328c64
Haystack tweaks for Django 1.7.7.
I had to upgrade to Haystack 2.3.1 to get it to work with Django
1.7.7. I also had to update the Xapian backend. But I ran into
problems.

First, on my laptop anyway (Ubuntu 14.04), Xapian gets mad when search
terms are greater than 245 chars (or something) when indexing. So I
created a custom field that simply omits terms greater than 64 chars
and used this field everywhere I previously used a CharField.

Second, the custom search form was now broken. Something changed in
the Xapian backend and exact searches stopped working. Fortunately the
auto_query (which I was using originally, until an upgrade broke it)
started working again. So I cut the search form back over to doing an
auto_query. I kept the form the same (3 fields) because I didn't want
to change it, and I think it's better that way.
| author | Brian Neal <bgneal@gmail.com> |
|---|---|
| date | Wed, 13 May 2015 20:25:07 -0500 |
| parents | |
| children | |
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/custom_search/fields.py	Wed May 13 20:25:07 2015 -0500
@@ -0,0 +1,29 @@
+"""Custom Haystack SearchFields."""
+
+import haystack.fields
+
+
+class MaxTermSizeCharField(haystack.fields.CharField):
+    """A CharField that discards large terms when preparing the search index.
+
+    Some backends (e.g. Xapian) throw errors when terms are bigger than some
+    limit. This field omits the terms over a limit when preparing the data for
+    the search index.
+
+    The keyword argument max_term_size sets the maximum size of a whitespace
+    delimited word/term. Terms over this size are not indexed. The default value
+    is 64.
+    """
+    DEFAULT_MAX_TERM_SIZE = 64
+
+    def __init__(self, *args, **kwargs):
+        self.max_term_size = kwargs.pop('max_term_size', self.DEFAULT_MAX_TERM_SIZE)
+        super(MaxTermSizeCharField, self).__init__(*args, **kwargs)
+
+    def prepare(self, obj):
+        text = super(MaxTermSizeCharField, self).prepare(obj)
+        if text is None or self.max_term_size is None:
+            return text
+
+        terms = (term for term in text.split() if len(term) <= self.max_term_size)
+        return u' '.join(terms)
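
For readers who want to see what the filtering in prepare() does without installing Haystack, here is a minimal standalone sketch of the same idea. The function name drop_long_terms and the sample text are made up for illustration and are not part of this changeset; the sketch only mirrors the split/filter/join step and assumes the text has already been produced by the parent field.

```python
def drop_long_terms(text, max_term_size=64):
    """Drop whitespace-delimited terms longer than max_term_size.

    Hypothetical helper for illustration only; it mirrors the filtering
    step of MaxTermSizeCharField.prepare() but does not touch Haystack.
    """
    # Same early exit as the field: nothing to filter, or filtering disabled.
    if text is None or max_term_size is None:
        return text

    # split() with no arguments splits on any run of whitespace, so the
    # result is re-joined with single spaces (newlines and tabs are lost).
    return " ".join(term for term in text.split() if len(term) <= max_term_size)


if __name__ == "__main__":
    # A 300-character "word", e.g. a long URL or pasted junk in a forum post.
    sample = "short words " + "x" * 300 + " survive"
    print(drop_long_terms(sample))  # prints: short words survive
```

One side effect worth noting: because the text is split and re-joined, all whitespace in the indexed text is normalized to single spaces whenever filtering is enabled. Passing max_term_size=None returns the text untouched.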