kerko

A web application component that provides a faceted search interface for bibliographies managed with Zotero.

GPL-3.0 License


Kerko

Kerko is a web application component that provides a user-friendly search and browsing interface for sharing a bibliography managed with the Zotero reference manager.

The combination of Kerko and Zotero gives you the best of both worlds: a rich but easy to use web interface for end-users of the bibliography, and a well-established and powerful bibliographic reference management tool for individuals or teams working on the bibliography's content.

Demo site

A KerkoApp-based demo site is available for you to try. You may also view the Zotero library that contains the source data for the demo site.

Powered by Kerko

Here are some sites that are powered by Kerko:

Features

The main features provided by Kerko are:

  • Faceted search interface: allows exploration of the bibliography both in
    search mode and in browsing mode, potentially suiting different user needs,
    behaviors, and abilities. For example, users with a prior idea of the topic or
    expected results may enter keywords or a more complex query in a search field,
    while those who wish to become familiar with the content of the bibliography
    or discover new topics may choose to navigate along the proposed facets, to
    narrow or broaden their search. Since both modes are integrated into a single
    interface, it is possible to combine them.
  • Keyword search features:
    • Boolean operators:
      • AND: matches items that contain all specified terms. This is the
        default relation between terms when no operator is specified, e.g.,
        a b is the same as a AND b.
      • OR: matches items that contain any of the specified terms, e.g.,
        a OR b.
      • NOT: excludes items that match the term, e.g., NOT a.
      • Boolean operators must be specified in uppercase and may be translated
        into other languages.
    • Logical grouping (with parentheses), e.g., (a OR b) AND c.
    • Sequence of words (with double quotes), e.g., "a b c". The default
      maximum distance between word positions is 1, meaning that an item will
      match only if it contains the words next to each other, but a different
      maximum distance may be selected (with the tilde character), e.g.,
      "web search"~2 allows up to 1 word between web and search, meaning it
      could match web site search as well as web search.
    • Term boosting (with the caret), e.g., faceted^2 search browsing^0.5
      specifies that faceted is twice as important as search when computing
      the relevance score of results, while browsing is half as important.
      Boosting may be applied to a logical grouping, e.g., (a b)^3 c.
    • Keyword search is case-insensitive, accents are folded, and punctuation is
      ignored. To further improve recall (albeit at the cost of precision),
      stemming is also performed on terms from most text fields, e.g., title,
      abstract, notes. Stemming relieves the user from having to specify all
      variants of a word when searching, e.g., terms such as search,
      searches, and searching all return the same results. The Snowball
      algorithm is used for that purpose.
    • Full-text search: the text content of PDF attachments can be searched.
    • Scope of search: users may choose to search everywhere, in
      author/contributor names, in titles, in publication years, in all fields
      (i.e., in metadata and notes), or in documents (i.e., in the text content
      of attachments). Applications may provide additional choices.
  • Faceted browsing: allows filtering by topic (Zotero tag), by resource type
    (Zotero item type), by publication year, or by resource language. Moreover,
    you may define additional facets modeled on collections and subcollections. In
    that case, any collection can be represented as a facet, and each
    subcollection as a value within that facet. By taking advantage of Zotero's
    ability to assign any given item to multiple collections, a faceted
    classification scheme can be designed, including hierarchical subdivisions
    within facets.
  • Relevance scoring: provided by the Whoosh library and based on the
    BM25F algorithm, which determines how important a term is to a document in
    the context of the whole collection of documents, while taking into account
    its relation to document structure (in this regard most fields are neutral,
    but the score is boosted when a term appears in specific fields, e.g., DOI,
    ISBN, ISSN, title, author/contributor). Any keyword search asks the question
    "how well does this document match this query clause?", which requires
    calculating a relevance score for each document. Filtering with facets, on the
    other hand, has no effect on the score because it asks "does this document
    match this query clause?", which leads to a yes or no answer.
  • Sort options: by relevance score (only applicable to keyword search), by
    publication date, by author, by title.
  • Citation styles: any from the Zotero Style Repository, or a
    custom stylesheet defined in the Citation Style Language (the stylesheet
    must be accessible by URL).
  • Language support: the default language of the user interface is English,
    but some translations are provided. Additional
    translations may be created using gettext-compatible tools. Two other
    factors limit language support: the locales supported by the Zotero Data
    Schema (which provides the names of fields, item types and author types
    displayed by Kerko), and the languages supported by Whoosh (which provides
    the search capabilities), i.e.,
    ar, da, nl, en, fi, fr, de, hu, it, no, pt, ro, ru, es, sv, tr.
  • Semantic markup: pages generated by Kerko embed HTML markup that can be
    detected by web crawlers (helping the indexing of your records by search
    engines) or by web browsers (allowing users of reference management tools to
    easily import metadata in their library). Supported schemes are:
    • OpenURL COinS, in search results pages and individual
      bibliographic record pages. COinS is recognized by many reference
      management tools, including the Zotero Connector browser extension.
    • Highwire Press tags, in the individual bibliographic record pages of book,
      conference paper, journal article, report or thesis items. These tags are
      recommended for indexing by Google Scholar, and
      are recognized by many other databases and reference management tools,
      including the Zotero Connector browser extension.
  • Web feeds: users of news aggregators or feed readers may get updates when
    new bibliographic records are added. They may subscribe to the main feed, or
    to one or more custom feeds.
    • The main feed lists the most recently added bibliographic records.
    • Any search page has a related custom feed that lists the most recently
      added bibliographic records that match the search criteria. Thus, a user
      can obtain a custom feed for a particular area of interest simply by
      entering keywords to search and/or selecting filters.
    • Feeds are provided in the Atom syndication format.
    • Basic metadata is provided directly in the feeds, using both Atom and
      unqualified Dublin Core elements.
    • An age limit may be configured to exclude older items from the feeds. This
      may be useful for bibliographies that are frequently updated and mostly
      meant to promote recent literature (all resources remain visible in
      the search interface regardless of their age).
  • Sitemap: an XML Sitemap is automatically generated, and you
    may use it to help search engines discover your bibliographic records.
  • Exporting: users may export individual records as well as complete
    bibliographies corresponding to search results. By default, download links are
    provided for the RIS and BibTeX formats, but applications may be configured to
    export any format supported by the Zotero API.
  • Printing: stylesheets are provided for printing individual bibliographic
    records as well as lists of search results. When printing search results, all
    results get printed (not just the current page of results).
  • Notes and attachments: notes, attached files, and attached links to URIs
    are synchronized from zotero.org and made available to users of the
    bibliography. Regular expressions may be used to include or exclude such child
    items from the bibliography, based on their tags.
  • DOI, ISBN and ISSN resolver: items that have such an identifier in your
    library can be referenced by appending their identifier to your Kerko site's
    base URL.
  • Relations: bibliographic record pages show links to related items, if any.
    You may define such relations using Zotero's Related field. Moreover, Kerko
    adds the Cites and Cited by relation types, which can be managed in Zotero
    through notes. Custom applications can add more types of relations if desired.
  • Pages: basic informational pages can be defined using content from Zotero
    standalone notes.
  • Badges: custom applications can have icons conditionally displayed next to
    items.
  • Responsive design: the simple default implementation works on large
    monitors as well as on small screens. It is based on Bootstrap.
  • Google Analytics integration: just provide a Google Analytics stream ID to
    have Kerko automatically include the tracking code in its pages.
  • Integration: as a Flask blueprint, Kerko can be
    integrated into any Flask application. For a standalone application, however,
    you may simply install KerkoApp.
  • Customizable front-end: applications may partly or fully replace the
    default templates, scripts and stylesheets with their own.
  • Command line interface (CLI): Kerko provides commands for synchronizing or
    deleting its data.
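
To make the keyword search rules listed above more concrete, here is a minimal sketch of case folding, accent folding, punctuation removal, and phrase matching with a maximum distance (the tilde syntax). It is an illustration only, written with the Python standard library. It is not how Kerko or Whoosh implement these features, the function names `normalize` and `phrase_matches` are hypothetical, and stemming is left out.

```python
import unicodedata


def normalize(text: str) -> list[str]:
    """Lowercase, fold accents, and drop punctuation, returning a list of words."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    # Remove the combining marks left over after decomposition (accent folding).
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Treat every non-alphanumeric character as a word separator.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in folded)
    return cleaned.split()


def phrase_matches(phrase: str, document: str, slop: int = 1) -> bool:
    """Return True if the phrase words occur in order in the document, each one
    at most `slop` positions after the previous one (slop=1 means adjacent)."""
    words = normalize(phrase)
    tokens = normalize(document)
    if not words:
        return False

    def search(word_index: int, prev_pos: int) -> bool:
        # All phrase words have been placed: the phrase matches.
        if word_index == len(words):
            return True
        last = min(prev_pos + slop, len(tokens) - 1)
        for pos in range(prev_pos + 1, last + 1):
            if tokens[pos] == words[word_index] and search(word_index + 1, pos):
                return True
        return False

    return any(
        tokens[start] == words[0] and search(1, start)
        for start in range(len(tokens))
    )
```

With this sketch, `phrase_matches("web search", "web site search", slop=2)` is true, while the same call with the default `slop=1` is false, which mirrors the "web search"~2 example above.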
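
As an illustration of the semantic markup feature listed above, the following sketch builds an OpenURL COinS element for a journal article. COinS stores a URL-encoded OpenURL ContextObject in the `title` attribute of an empty `span` with the class `Z3988`. The helper name `coins_span` and the sample record are hypothetical, and the markup Kerko actually generates may contain more or different keys.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title: str, journal: str, year: str, authors: list[str]) -> str:
    """Build a COinS span describing a journal article (illustrative sketch)."""
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", title),
        ("rft.jtitle", journal),
        ("rft.date", year),
    ]
    # The author key may be repeated, once per author.
    pairs += [("rft.au", name) for name in authors]
    # URL-encode the key/value pairs, then escape the result for use in HTML.
    return f'<span class="Z3988" title="{escape(urlencode(pairs))}"></span>'


# Hypothetical record, for demonstration only.
span = coins_span(
    title="An example article",
    journal="Journal of Examples",
    year="2020",
    authors=["Doe, Jane", "Roe, Richard"],
)
```

A reference management tool that recognizes COinS can read the `title` attribute, decode it, and import the metadata without parsing the visible page content.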

KerkoApp is a standalone application built around Kerko. It inherits all of Kerko's features and it provides a few additions of its own:

  • Configuration files: allow separation of configuration from code and
    enable the Twelve-factor App methodology. Environment
    variables and TOML configuration files are supported. Secrets,
    server-specific parameters, and general parameters can be configured in
    separate files.
  • Page templates for common HTTP errors.
  • Syslog logging handler (for Unix environments).

Learn more

Please refer to the documentation for more details.