Real-time Vamp plugin SDK for C++20
MIT License
Vamp is a C/C++ plugin API for audio analysis / feature extraction plugins: https://www.vamp-plugins.org
This SDK for plugins and hosts targets performance-critical applications by:

- `constexpr` evaluation for compile-time errors instead of runtime errors

The SDK aims to be well tested, cross-platform and use modern C++. The plugin SDK is available as a single-header library (download as asset from latest release page).
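To illustrate what compile-time checking can look like, the following is a minimal sketch and not the SDK's actual implementation. The helper `isValidIdentifier` and the reduced `Meta` struct are hypothetical names introduced for this example. The check relies on the Vamp API documentation, which restricts identifiers to the characters `[a-zA-Z0-9_-]`.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper (not part of the SDK): accepts only [a-zA-Z0-9_-],
// the character set the Vamp API documents for identifiers.
constexpr bool isValidIdentifier(std::string_view id) {
    if (id.empty()) return false;
    for (const char c : id) {
        const bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                        (c >= '0' && c <= '9') || c == '_' || c == '-';
        if (!ok) return false;
    }
    return true;
}

// Hypothetical, reduced stand-in for static plugin metadata.
struct Meta {
    std::string_view identifier;
};

inline constexpr Meta meta{.identifier = "zerocrossing"};

// Evaluated by the compiler: an invalid identifier stops the build
// instead of failing when the plugin is loaded.
static_assert(isValidIdentifier(meta.identifier));
static_assert(!isValidIdentifier("zero crossing"));  // whitespace is rejected
```

Because the metadata is a `constexpr` object, a typo such as a space in the identifier is reported by the compiler rather than discovered by a host at load time.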
Compiler support: GCC >= 10, Clang >= 11, MSVC >= 19.30
Note: Python bindings for the hostsdk are available via PyPI. Please check out the documentation.
The following benchmarks compare the performance/overhead of the plugin SDKs based on a simple RMS plugin. Performance is measured as throughput (number of processed samples per second).
Results with an i7-9850H CPU (12 cores): Throughput vs. block size, Multithreading
Results with an ARMv7 CPU: Throughput vs block size, Multithreading
The official SDK offers a convenient C++ plugin interface. However, it has some drawbacks for real-time processing:
A large number of memory allocations, caused by C++ containers such as vectors and lists being passed by value.
Let's have a look at the `process` method of the `Vamp::Plugin` class, which does the main work:
FeatureSet process(const float *const *inputBuffers, RealTime timestamp)
`FeatureSet` is returned by value and is a `std::map<int, FeatureList>`. `FeatureList` is a `std::vector<Feature>`, and `Feature` is a struct containing the actual feature values as a `std::vector<float>`.
So in total, there are three nested containers, all of which are heap-allocated.
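The cost of the nested containers can be made visible by counting heap allocations. The sketch below uses simplified stand-ins for the official types (the real `Feature` struct has more fields) and a replaced global `operator new` as a counter. The names `processByValue` and `countAllocationsPerCall` are hypothetical, and the counter is for illustration only, not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <new>
#include <vector>

// Global allocation counter (illustration only, not thread-safe).
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Simplified stand-ins for the official SDK types, reduced to the value vector.
struct Feature { std::vector<float> values; };
using FeatureList = std::vector<Feature>;
using FeatureSet  = std::map<int, FeatureList>;

// Mimics a process() call that returns one feature with one value for output 0.
FeatureSet processByValue(float value) {
    FeatureSet result;
    result[0].push_back(Feature{{value}});
    return result;
}

// Counts the heap allocations caused by a single call.
std::size_t countAllocationsPerCall() {
    const std::size_t before = g_allocations;
    {
        const FeatureSet featureSet = processByValue(0.5f);
        assert(featureSet.at(0).front().values.size() == 1);
    }
    return g_allocations - before;
}
```

Even for a single scalar result, each call allocates at least one map node, one `FeatureList` buffer and one value buffer, and all of them are freed again as soon as the result goes out of scope.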
The C++ API is a wrapper of the C API:
On the plugin side, the `PluginAdapter` class converts the C++ containers to their C-level representation. The C++ containers are therefore temporary objects and are deallocated shortly after creation.
On the host side, the `PluginHostAdapter` converts the C representation back to the C++ one.
The `rt-vamp-plugin-sdk` aims to keep the overhead minimal but still provide an easy and safe-to-use API:

- Static plugin information is provided as `static constexpr` variables to generate the C plugin descriptor at compile time.
- Computed features are returned as a view (`std::span`) to prevent heap allocations during processing.
- The input buffer is provided either as a `TimeDomainBuffer` (`std::span<const float>`) or as a `FrequencyDomainBuffer` (`std::span<const std::complex<float>>`). The `process` method therefore takes a `std::variant<TimeDomainBuffer, FrequencyDomainBuffer>`. A wrong input buffer type will result in an exception. The sized spans enable easy iteration over the input buffer data.

The following features of the Vamp API `Vamp::Plugin` are restricted within the `rt-vamp-plugin-sdk`:
- `OutputDescriptor::hasFixedBinCount == true` for every output. The number of values is constant for each feature during processing. This has the advantage that memory for the feature vector can be preallocated.
- `OutputDescriptor::SampleType == OneSamplePerStep` for every output. The plugin will generate one feature set for each input block. The following parameters are therefore negligible:
  - `OutputDescriptor::sampleRate`
  - `OutputDescriptor::hasDuration`
  - `Feature::hasTimestamp` & `Feature::timestamp`
  - `Feature::hasDuration` & `Feature::duration`
- Only one input channel is allowed: `getMinChannelCount() == 1`
More examples can be found here: https://github.com/lukasberbuer/rt-vamp-plugin-sdk/tree/master/examples.
Plugin:

```cpp
class ZeroCrossing : public rtvamp::pluginsdk::Plugin<1 /* one output */> {
public:
    using Plugin::Plugin;  // inherit constructor

    static constexpr Meta meta{
        .identifier    = "zerocrossing",
        .name          = "Zero crossings",
        .description   = "Detect and count zero crossings",
        .maker         = "LB",
        .copyright     = "MIT",
        .pluginVersion = 1,
        .inputDomain   = InputDomain::Time,
    };

    OutputList getOutputDescriptors() const override {
        return {
            OutputDescriptor{
                .identifier  = "counts",
                .name        = "Zero crossing counts",
                .description = "The number of zero crossing points per processing block",
                .unit        = "",
                .binCount    = 1,
            },
        };
    }

    bool initialise(uint32_t stepSize, uint32_t blockSize) override {
        initialiseFeatureSet();  // automatically resizes feature set to number of outputs and bins
        return true;
    }

    void reset() override {
        previousSample_ = 0.0f;
    }

    const FeatureSet& process(InputBuffer buffer, uint64_t nsec) override {
        const auto signal = std::get<TimeDomainBuffer>(buffer);

        size_t crossings   = 0;
        bool   wasPositive = (previousSample_ >= 0.0f);

        for (const auto& sample : signal) {
            const bool isPositive = (sample >= 0.0f);
            crossings += int(isPositive != wasPositive);
            wasPositive = isPositive;
        }

        previousSample_ = signal.back();

        auto& result = getFeatureSet();
        result[0][0] = crossings;  // first and only output, first and only bin
        return result;             // return span/view of the results
    }

private:
    float previousSample_ = 0.0f;
};

RTVAMP_ENTRY_POINT(ZeroCrossing)
```
Host:

```cpp
// list all plugin keys (library:plugin)
for (auto&& key : rtvamp::hostsdk::listPlugins()) {
    std::cout << key.get() << std::endl;
}

auto plugin = rtvamp::hostsdk::loadPlugin("minimal-plugin:zerocrossing", 48000 /* samplerate */);
plugin->initialise(4096 /* step size */, 4096 /* block size */);

std::vector<float> buffer(4096);
// fill buffer with data from audio file, sound card, ...

auto features = plugin->process(buffer, 0 /* timestamp nanoseconds */);
std::cout << "Zero crossings: " << features[0][0] << std::endl;
```