🔎 Powerful, configurable, and extensible text search for your content
MIT License
Create a copy of your content optimised for full-text search
Store this in Elasticsearch, and automatically keep it up to date
Generate a query with out-of-the-box support for:
Contentful's search is good, but not optimised for text content. You might want to consider Elasticsearch over the built-in search when:
npm install --save contentful-text-search
const ContentfulTextSearch = require('contentful-text-search')
const search = new ContentfulTextSearch({ space: 'space_id', token: 'access_token' })
search.indexer.fullReindex()
// later
search.query('searchTerm', 'en-US')
Initialise the module using the new operator, passing in the mandatory values for space (your space ID) and token (your access token).
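Optionally, also pass in elasticHost (default: http://localhost:9200), contentfulHost (default: cdn.contentful.com), redisHost (default: redis://localhost:6379), elasticUser (default: elastic), elasticPassword, and elasticLogLevel (default: info).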
const ContentfulTextSearch = require('contentful-text-search')
const search = new ContentfulTextSearch({
space: 'string',
token: 'string',
elasticHost: 'optionalString',
contentfulHost: 'optionalString',
redisHost: 'optionalString',
elasticUser: 'optionalString',
elasticPassword: 'optionalString',
elasticLogLevel: 'optionalString'
})
Although most of the indexing functions return a promise, you should be aware that Elasticsearch is 'near real-time', so you might have to deal with a short delay before an indexed document is available in search results.
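One way to cope with that delay is to poll until the document shows up. The following is a minimal sketch, not part of this package: waitForResult, its isReady callback, and the attempts and delayMs options are hypothetical names, and the shape of the query response is left to the caller to inspect.

```javascript
// Poll a query function until `isReady` accepts the response or attempts run out.
// `queryFn` would typically be () => search.query('searchTerm', 'en-US').
// waitForResult, isReady, attempts and delayMs are illustrative names only.
async function waitForResult(queryFn, isReady, { attempts = 5, delayMs = 500 } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    const response = await queryFn()
    if (isReady(response)) return response
    if (attempt < attempts) {
      // Give Elasticsearch time to refresh before asking again
      await new Promise(resolve => setTimeout(resolve, delayMs))
    }
  }
  throw new Error(`No matching result after ${attempts} attempts`)
}
```

You would call this right after an indexing promise resolves, passing a check that suits whatever your results look like.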
Delete and recreate an index for each locale in the space, and index the content into these indices. You need to call this the first time you use the module, but after that only when your content model changes.
search.indexer.fullReindex() // returns a promise
Clear the indices and reindex all the content from Contentful. You can use this to update the indices if they are out of date, assuming the content model hasn't changed.
search.indexer.reindexContent() // returns a promise
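Choosing between the two reindexing methods can be wrapped in a small helper. This is only a sketch: refreshIndex and its contentModelChanged option are hypothetical, and search is assumed to be an initialised instance as shown earlier.

```javascript
// Pick the lighter reindex unless the content model has changed.
// refreshIndex and contentModelChanged are illustrative names, not package API.
function refreshIndex(search, { contentModelChanged = false } = {}) {
  return contentModelChanged
    ? search.indexer.fullReindex() // deletes and recreates an index per locale, then indexes content
    : search.indexer.reindexContent() // clears the indices and reindexes content
}
```

Both branches return the promise from the underlying call, so you can await the helper in the same way.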
Deletes all indices related to this space. Could be used to clean up your Elasticsearch cluster after deleting a Contentful space.
search.indexer.deleteAllIndices() // returns a promise
Queries the Elasticsearch index and returns search results as JSON ordered by relevance, with highlights showing where your search term appears in each result. Both parameters are mandatory.
search.query('searchTerm', 'localeCode') // returns a promise containing the results and highlights
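Because both parameters are mandatory, you may want to validate them before querying. A minimal sketch follows; safeQuery is a hypothetical wrapper, and what search.query itself does with a missing argument is not covered here.

```javascript
// Guard around search.query: fail fast with a clear error when either
// mandatory argument is missing. safeQuery is an illustrative name.
function safeQuery(search, term, locale) {
  if (typeof term !== 'string' || term.trim() === '') {
    throw new TypeError('A non-empty search term is required')
  }
  if (typeof locale !== 'string' || locale.trim() === '') {
    throw new TypeError('A locale code such as "en-US" is required')
  }
  return search.query(term.trim(), locale) // returns the promise from search.query
}
```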
Uses the contentful-webhook-listener package to listen for webhooks. You can start the server like this:
const server = search.update.createServer()
server.listen(3000)
The createServer method returns an extended instance of Node's http.Server, so you have all those methods available, and could for example pass in an Express instance as the requestListener. The server object is also always available at search.update.server.
You can use basic auth with the webhooks like this:
const server = search.update.createServer({ auth: 'username:password' })
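To keep credentials out of your source, the auth string could be built from environment variables. This is a sketch under assumptions: webhookServerOptions is a hypothetical helper, and WEBHOOK_USER and WEBHOOK_PASSWORD are names chosen for this example, not something the package reads itself.

```javascript
// Build the options for search.update.createServer from environment variables.
// WEBHOOK_USER and WEBHOOK_PASSWORD are hypothetical names for this example.
function webhookServerOptions(env = process.env) {
  const { WEBHOOK_USER: user, WEBHOOK_PASSWORD: password } = env
  if (!user || !password) return undefined // no credentials set, so no auth option
  return { auth: `${user}:${password}` }
}

// Usage (assumes `search` was initialised as shown earlier):
// const server = search.update.createServer(webhookServerOptions())
```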
You should set up webhooks in Contentful pointing to the URL of this server, with all events, or at least all events for Entries.
When developing locally, or behind a proxy, this package uses contentful-webhook-tunnel instead, which automatically sets up the webhook in Contentful, and creates a tunnel through ngrok. See the documentation of that package for more details on how to enable this.
See the debug module. Use the package name (contentful-text-search) as the string in the DEBUG environment variable.
Use the Contentful Sync API to keep a local copy of our content in Redis. Because we need all or most of our content for indexing, reading it from Redis should be faster than fetching it from the Content Delivery API.
Here we remap Contentful fields (e.g. dereferencing, de-localising, and stripping out extraneous info), and reformat some data, for example converting markdown to plain text.
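To illustrate the de-localising part of this step: the Sync API returns each field keyed by locale, so a transform has to pick one value per field. The sketch below shows the idea only and is not this package's implementation; delocalise and its fallbackLocale parameter are hypothetical.

```javascript
// Collapse locale-keyed fields (as returned by the Sync API) into plain values
// for a single locale. delocalise and fallbackLocale are illustrative names.
function delocalise(fields, locale, fallbackLocale) {
  const result = {}
  for (const [name, byLocale] of Object.entries(fields)) {
    // Prefer the requested locale, otherwise try the fallback (if one was given)
    const value = locale in byLocale ? byLocale[locale] : byLocale[fallbackLocale]
    if (value !== undefined) result[name] = value
  }
  return result
}
```

Fields with no value in either locale are left out, so nothing empty ends up in the index for that locale.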
At this step the transformed data is passed through our analysis chain.
The content for each locale from Contentful is uploaded to a separate index.
Each index uses the analyser for its language, for example the english analyser for English content or the german analyser for German content.
Send a string and get back search results as JSON ordered by relevance.
The query is a best_fields multi-match on all fields.
Also get back highlighted text snippets with your search results, showing where your query appears in the results.
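For reference, a best_fields multi-match with highlighting looks roughly like this in the Elasticsearch query DSL. This is a sketch of the general technique; buildSearchBody is a hypothetical name, and the body this package actually sends may differ.

```javascript
// Build an Elasticsearch request body: best_fields multi-match across all
// fields, plus highlighting. buildSearchBody is an illustrative name.
function buildSearchBody(term) {
  return {
    query: {
      multi_match: {
        query: term,
        type: 'best_fields', // score each document by its best-matching field
        fields: ['*']
      }
    },
    highlight: {
      fields: { '*': {} } // return highlighted snippets for any matching field
    }
  }
}
```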
We keep the index up to date via Contentful webhooks. You could also perform a complete reindex of the content every so often using a scheduler like node-cron.
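A scheduled reindex could be sketched as follows. makeReindexJob is a hypothetical helper that skips a run while the previous one is still in progress; the node-cron call and the 03:00 schedule appear only in comments, as an example of how the job might be wired up.

```javascript
// Wrap reindexContent in a job that never overlaps with itself.
// makeReindexJob is an illustrative name, not part of this package.
function makeReindexJob(search) {
  let running = false
  return async function reindexJob() {
    if (running) return false // previous run still in progress, skip this tick
    running = true
    try {
      await search.indexer.reindexContent()
      return true
    } finally {
      running = false // allow the next tick to run, even after a failure
    }
  }
}

// With node-cron installed, the job could run every night at 03:00:
// const cron = require('node-cron')
// cron.schedule('0 3 * * *', makeReindexJob(search))
```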
Exclude content types and fields from being indexed
Specify the transformation and analyser for each field
Exclude fields from query
Boost fields in query
All contributions welcome! Please feel free to open an issue/PR 😄