Elixir SortedSet backed by a Rust-based NIF
MIT License
SortedSet is a fast and efficient data structure that provides certain guarantees and functionality. The core data structure and algorithms are implemented in a Native Implemented Function in the Rust Programming Language, using the Rustler crate.
Add SortedSet to your dependencies and then install with mix do deps.get, deps.compile
def deps do
[
{:sorted_set_nif, "~> 1.0.0"}
]
end
Internally the Elixir terms stored in the SortedSet are converted to Rust equivalents and stored in a Vector of Vectors. The structure is similar to a skip-list, almost every operation on the SortedSet will perform a linear scan through the buckets to find the bucket that owns the term, then a binary search is done within the bucket to complete the operation.
Why not just a Vector of Terms? This approach was explored but when the Vector needs to grow beyond it's capacity, copying Terms over to the new larger Vector proved to be a performance bottle neck. Using a Vector of Vectors, the Bucket pointers can be quickly copied when additional capacity is required.
This strategy provides a reasonable trade off between performance and implementation complexity.
When using a SortedSet, the caller can tune bucket sizes to their use case. A default bucket
size of 500 was chosen as it provides good performance for most use cases. See new/2
for
details on how to provide custom tuning details.
There is some special functionality that SortedSet provides beyond sorted and uniqueness guarantees.
reference
, pid
, port
,function
, and float
. Attempting to store any of these types (or an allowed composite{:error, :unsupported_type}
Documentation is hosted on hexdocs.
For a local copy of the documentation, the mix.exs
file is already set up for generating
documentation, simply run the following commands to generate the documentation from source.
$ mix deps.get
$ mix docs
There are two test suites available in this library, an ExUnit test suite that tests the
correctness of the implementation from a black box point of view. These tests can be run by
running mix test
in the root of the library.
The rust code also contains tests, these can be run by running cargo test
in the
native/sorted_set_nif
directory.
Before running any benchmarks it's important to remember that during development the NIF will be built unoptimized. Make sure to rebuild an optimized version of the NIF before running the benchmarks.
There are benchmarks available in the bench
folder, these are written with
Benchee and can be run with the following command.
$ OPTIMIZE_NIF=true mix run bench/{benchmark}.exs
Adding the OPTIMIZE_NIF=true
will force the benchmark to run against the fully optimized NIF.
SortedSet lives in the Discord
namespace to prevent symbol collision, it can be used directly
defmodule ExampleModule do
def get_example_sorted_set() do
Discord.SortedSet.new()
|> Discord.SortedSet.add(1)
|> Discord.SortedSet.add(:atom),
|> Discord.SortedSet.add("hi there!")
end
end
You can always add an alias
to make this code less verbose
defmodule ExampleModule do
alias Discord.SortedSet
def get_example_sorted_set() do
SortedSet.new()
|> SortedSet.add(1)
|> SortedSet.add(:atom),
|> SortedSet.add("hi there!")
end
end
Full API Documentation is available, there is also a full test suite with examples of how the library can be used.