view custom_search/fields.py @ 943:cf9918328c64

Haystack tweaks for Django 1.7.7. I had to upgrade to Haystack 2.3.1 to get search working with Django 1.7.7, and I also had to update the Xapian backend. That caused two problems. First, on my laptop at least (Ubuntu 14.04), Xapian raises an error during indexing when a term is longer than roughly 245 characters. So I created a custom field that simply omits terms longer than 64 characters and used it everywhere I previously used a CharField. Second, the custom search form broke: something changed in the Xapian backend and exact searches stopped working. Fortunately auto_query, which I was using originally and which broke during an upgrade, started working again, so I cut the search form back over to auto_query. I kept the form's 3 fields as they were because I didn't want to change the form, and I think it's better that way.
author Brian Neal <bgneal@gmail.com>
date Wed, 13 May 2015 20:25:07 -0500
"""Custom Haystack SearchFields."""

import haystack.fields


class MaxTermSizeCharField(haystack.fields.CharField):
    """A CharField that discards large terms when preparing the search index.

    Some backends (e.g. Xapian) throw errors when terms are bigger than some
    limit. This field omits the terms over a limit when preparing the data for
    the search index.

    The keyword argument max_term_size sets the maximum size of a
    whitespace-delimited word/term. Terms over this size are not indexed. The
    default value is 64; passing None disables the filtering.
    """
    DEFAULT_MAX_TERM_SIZE = 64

    def __init__(self, *args, **kwargs):
        self.max_term_size = kwargs.pop('max_term_size', self.DEFAULT_MAX_TERM_SIZE)
        super(MaxTermSizeCharField, self).__init__(*args, **kwargs)

    def prepare(self, obj):
        text = super(MaxTermSizeCharField, self).prepare(obj)
        if text is None or self.max_term_size is None:
            return text

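        # Keep only the terms within the size limit. Splitting and rejoining
        # also collapses runs of whitespace into single spaces.
        terms = (term for term in text.split() if len(term) <= self.max_term_size)
        return u' '.join(terms)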
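
The filtering done in prepare() can be tried without Haystack installed. The following is a minimal sketch, not part of the module above: drop_long_terms is a hypothetical stand-alone helper that mirrors the same split/filter/join logic, written for Python 3.

```python
def drop_long_terms(text, max_term_size=64):
    """Remove whitespace-delimited terms longer than max_term_size.

    Hypothetical helper mirroring MaxTermSizeCharField.prepare(); a None
    text or a None limit returns the input unchanged.
    """
    if text is None or max_term_size is None:
        return text
    # split() with no arguments breaks on any run of whitespace.
    kept = (term for term in text.split() if len(term) <= max_term_size)
    return ' '.join(kept)


# A 300-character "word" of the kind that upsets the backend is dropped,
# while the ordinary terms around it survive.
sample = 'short ' + 'x' * 300 + ' words'
print(drop_long_terms(sample))  # prints "short words"
```

Note that a term of exactly max_term_size characters is kept; only longer terms are discarded.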