fast_rss

Fast Elixir RSS feed parser, a NIF wrapper around the Rust RSS crate

Intro

Parse RSS feeds very quickly

  • This is a Rust NIF built using rustler
  • Uses the Rust RSS crate to do the actual RSS parsing

Speed

FastRSS is already much faster than most of the pure Elixir/Erlang packages available. In the benchmarks it was between 6.12x and 50.09x faster than the next fastest package tested (feeder_ex).

Compared to the slowest Elixir options tested (feed_raptor, elixir_feed_parser), FastRSS was up to 259.91x faster and used up to 5,412,308.17x less memory (0.00156 MB vs 8423.70 MB).

See full benchmarks below:

Compatibility

FastRSS requires a minimum combination of Elixir 1.6.0 and Erlang/OTP 20.0, and is tested with a maximum combination of Elixir 1.11.1 and Erlang/OTP 22.0.

Installation

This package is available on Hex.

It can be installed by adding fast_rss to your list of dependencies in mix.exs:

def deps do
  [
    {:fast_rss, "~> 0.5.0"}
  ]
end

You also need the Rust compiler installed: https://www.rust-lang.org/tools/install

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Usage

There are only two functions: parse_rss/1 for parsing RSS feeds and parse_atom/1 for parsing Atom feeds. Each takes a string and returns {:ok, map()}, where the map has string keys.

iex(1)> {:ok, map_of_rss} = FastRSS.parse_rss("...rss_feed_string...")
iex(2)> Map.keys(map_of_rss)
["categories", "cloud", "copyright", "description", "docs", "dublin_core_ext",
 "extensions", "generator", "image", "items", "itunes_ext", "language",
 "last_build_date", "link", "managing_editor", "namespaces", "pub_date",
 "rating", "skip_days", "skip_hours", "syndication_ext", "text_input", "title",
 "ttl", "webmaster"]

The docs can be found at https://hexdocs.pm/fast_rss.
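
As a rough illustration of working with the returned map, the sketch below pulls the item titles out of a parsed feed. FeedTitles is a hypothetical helper module, not part of FastRSS. The "items" key appears in the key list above, but the "title" key on each item and the {:error, reason} shape for failures are assumptions here, so check the docs before relying on them.

```elixir
defmodule FeedTitles do
  # Hypothetical helper, not part of FastRSS.

  # Parses an RSS string with FastRSS.parse_rss/1 and returns the item titles.
  def from_rss(rss_string) when is_binary(rss_string) do
    rss_string
    |> FastRSS.parse_rss()
    |> titles()
  end

  # Takes the result tuple directly, so the extraction logic can be
  # exercised without calling the NIF.
  # Assumption: each entry under "items" is a map with a "title" key.
  def titles({:ok, %{"items" => items}}) when is_list(items) do
    {:ok, Enum.map(items, &Map.get(&1, "title"))}
  end

  # A parsed feed without an "items" list yields no titles.
  def titles({:ok, _feed}), do: {:ok, []}

  # Assumption: failures are reported as an {:error, reason} tuple.
  def titles({:error, reason}), do: {:error, reason}
end
```

Calling FeedTitles.from_rss(rss_feed_string) would then return {:ok, list_of_titles} on success. Items that lack a title show up as nil rather than being dropped, which keeps the list aligned with the order of "items".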

Supported Feeds

Reading from the following feed formats and extensions is supported:

  • RSS 0.90
  • RSS 0.91
  • RSS 0.92
  • RSS 1.0
  • RSS 2.0
  • iTunes
  • Dublin Core
  • Atom

Benchmark

HTML: https://avencera.github.io/fast_rss/

Benchmark run from 2020-02-22 05:23:47.524699Z UTC

The system, configuration, and per-input run-time statistics (inputs: anxiety, ben, daily, dave, sleepy, stuff) are available in the HTML report linked above.

Deploying

Deploying Rust NIFs can be a little annoying, as you have to install the Rust compiler. We try to alleviate this with rustler_precompiled, which creates precompiled assets for a number of targets (see release.yml for the full list) but does not cover all environments. If you are having trouble deploying this package, open an issue and I will try to help you out.

I will then add it to the FAQ below.

Q. How do I deploy using an Alpine Dockerfile?

A. I recommend using a multi-stage Dockerfile and doing the following:

  1. On the stages where you build your deps and your release, make sure to install build-base and libgcc:

    # This step installs all the build tools we'll need
    RUN apk update && \
        apk upgrade --no-cache && \
        apk add --no-cache \
        git \
        curl \
        build-base \
        libgcc  && \
        mix local.rebar --force && \
        mix local.hex --force
    
  2. Install the Rust compiler and allow dynamic linking to the C library by setting RUSTFLAGS:

    # install rustup
    RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
    ENV RUSTUP_HOME=/root/.rustup \
        RUSTFLAGS="-C target-feature=-crt-static" \
        CARGO_HOME=/root/.cargo  \
        PATH="/root/.cargo/bin:$PATH"
    
  3. On the stage where you actually run your Elixir release, install libgcc:

    ################################################################################
    ## STEP 4 - FINAL
    FROM alpine:3.11
    
    ENV MIX_ENV=prod
    
    RUN apk update && \
        apk add --no-cache \
        bash \
        libgcc \
        openssl-dev
    
    COPY --from=release-builder /opt/built /app
    WORKDIR /app
    CMD ["/app/my_app/bin/my_app", "start"]
    

License

FastRSS is released under the Apache License 2.0 - see the LICENSE file.