Django-SphinxQL implements Sphinx search for Django, thanks for checking it out.
Django is a Web framework for building websites with relational databases; Sphinx is a search engine designed for relational databases. Django-SphinxQL defines an ORM for using Sphinx in Django. As corollary, it allows you to implement full text search with Sphinx in your Django website.
Specifically, this API allows you to:
Django-SphinxQL requires:
Our build matrix in Travis has 8 builds:
For more details, you can check the directory tests
and .travis.yml
.
To run the tests locally, use:
PYTHONPATH=..:$PYTHONPATH django-admin.py test --settings=tests.settings_test tests
The next sections present a minimal setup to use this package. The full documentation is available here.
Django-SphinxQL has no requirements besides Django and Sphinx. To install Sphinx, use:
export VERSION=2.2.10
wget http://sphinxsearch.com/files/sphinx-$VERSION-release.tar.gz
tar -xf sphinx-$VERSION-release.tar.gz
cd sphinx-$VERSION-release
./configure --prefix=$HOME --with-pgsql
make
make install
To install Django-SphinxQL, use:
pip install git+https://github.com/jorgecarleitao/django-sphinxql.git
Django-SphinxQL requires a directory to store its database and be registered as installed app (it doesn't contain Django models):
add sphinxql
to the INSTALLED_APPS
;
add INDEXES
to settings:
INDEXES = {
'path': os.path.join(BASE_DIR, '_index'), # also do `mkdir _index`.
'sphinx_path': BASE_DIR
}
path
is where Sphinx database is going to be createdsphinx_path
is the directory that will contain Sphinx-specific files.Assume you have a model Document
with a summary
, a text
and a
number
that you want to index. To index it, create a file indexes.py
in
your app with:
from sphinxql import indexes, fields
from myapp import models
class DocumentIndex(indexes.Index):
my_summary = fields.Text(model_attr='summary')
my_text = fields.Text(model_attr='text')
my_number = fields.Integer(model_attr='number')
class Meta:
model = models.Document
model_attr
can be either a string with lookups or an F expression.
E.g. type_name = fields.Text(model_attr='type__name')
will index the name of
the ForeignKey type
of your model, while
type_name = fields.Text(model_attr=Concat('type__name', Value(' '),
'my_text',
output_field=CharField()))
indexes the concatenation of two fields (see also Django documentation). In principle the index fields accept any Django expression Django annotate accepts.
To index your indexes, run:
python manage.py index_sphinx
At this moment you may notice that some files will be created in
settings.INDEXES['path']
: Sphinx database is populated.
Then, start Sphinx daemon (only has to be started once):
python manage.py start_sphinx
(for the sake of reversibility, to stop Sphinx use python manage.py stop_sphinx
)
Django-SphinxQL defines a subclass of Django QuerySet
's, that interfaces with
all Sphinx-related operations. SearchQuerySet
only adds functionality: if you
don't use Sphinx-related, it is a QuerySet
.
Sphinx has a dedicated syntax for text search that Django-SphinxQL also accepts:
>>> q = SearchQuerySet(DocumentIndex).search('@my_text toys for babies')
This particular query returns Documents
restricted to the ones where
"toys for babies" match in field my_text
, ordered by the most relevant match.
Once you perform it, it does the following:
DocumentIndex
instances;Document
instances;Document
instances with the respective DocumentIndex
search_result
)Document
instances.Step 2. is done using .filter(pk__in=[...])
. The results are ordered by relevance
because there was no specific call of order_by
: if you set any ordering
in Django Query, it uses Django ordering (i.e. it overrides the default ordering
but not an explicit ordering). See docs for detailed information.
You should check if Django-Haystack suits your needs.
Django-SphinxQL is useful when you can index your data on a time-scale different from "real time". It should be much faster in indexing, and it should have lower memory requirements.