⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
APACHE-2.0 License
⚡Edgen lets you use GenAI in your app, completely locally on your users' devices, for free and with data privacy. It's a drop-in replacement for OpenAI (it exposes a compatible API), supports functions like text generation and speech-to-text, and works on Windows, Linux, and macOS.
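Because the API is OpenAI-compatible, a client only needs to point its requests at the local server. Here is a minimal sketch using only the Python standard library; the base URL, port 33322, and model name "default" are assumptions for illustration, so check your Edgen configuration for the actual values:

```python
# Sketch: single-turn chat completion against a local Edgen server.
# EDGEN_BASE_URL and the model name are assumptions; adjust to your setup.
import json
import urllib.error
import urllib.request

EDGEN_BASE_URL = "http://127.0.0.1:33322/v1"  # assumed default; adjust as needed

def chat(prompt: str) -> str:
    """Send one user message to the chat completions endpoint and return the reply."""
    payload = json.dumps({
        "model": "default",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{EDGEN_BASE_URL}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            body = json.load(resp)
            return body["choices"][0]["message"]["content"]
    except (urllib.error.URLError, OSError) as exc:
        # Server not running or bound to a different address.
        return f"request failed: {exc}"

if __name__ == "__main__":
    print(chat("Say hello in one word."))
```

Existing OpenAI client libraries should also work by overriding their base URL to the local endpoint.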
Check out the documentation
Data Private: On-device inference means users' data never leaves their devices.
Scalable: More and more users? No need to expand your cloud computing infrastructure. Just let your users run on their own hardware.
Reliable: No internet, no downtime, no rate limits, no API keys.
Free: It runs locally on hardware the user already owns.
Ready to start your own GenAI application? Check out our guides!
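Speech-to-text works the same way, through the OpenAI-compatible audio transcriptions endpoint. Below is a dependency-free sketch that uploads audio as multipart form data; the base URL, port, field names, and model name are assumptions for illustration, not a definitive client:

```python
# Sketch: POST a WAV file to a local Edgen transcription endpoint.
# EDGEN_BASE_URL and the "default" model name are assumptions; adjust to your setup.
import urllib.error
import urllib.request
import uuid

EDGEN_BASE_URL = "http://127.0.0.1:33322/v1"  # assumed default; adjust as needed

def transcribe(audio_bytes: bytes, filename: str = "audio.wav") -> str:
    """Upload raw WAV bytes to the transcription endpoint and return the response body."""
    boundary = uuid.uuid4().hex
    # Build the multipart/form-data body by hand to avoid extra dependencies.
    body = (
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: audio/wav\r\n\r\n"
        ).encode()
        + audio_bytes
        + (
            f"\r\n--{boundary}\r\n"
            'Content-Disposition: form-data; name="model"\r\n\r\n'
            "default\r\n"
            f"--{boundary}--\r\n"
        ).encode()
    )
    req = urllib.request.Request(
        f"{EDGEN_BASE_URL}/audio/transcriptions",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.read().decode()
    except (urllib.error.URLError, OSError) as exc:
        # Server not running or bound to a different address.
        return f"request failed: {exc}"
```

In a real application you would read `audio_bytes` from a recording or a file on disk.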
⚡Edgen usage:
Usage: edgen [<command>] [<args>]
Top-level CLI commands and options. Subcommands are optional. If no command is provided, "serve" is invoked with default options.
Options:
--help display usage information
Commands:
serve Starts the edgen server. This is the default command when no
command is provided.
config Configuration-related subcommands.
version Prints the edgen version to stdout.
oasgen Generates the Edgen OpenAPI specification.
edgen serve usage:
Usage: edgen serve [-b <uri...>] [-g]
Starts the edgen server. This is the default command when no command is provided.
Options:
-b, --uri if present, one or more URIs/hosts to bind the server to.
`unix://` (on Linux), `http://`, and `ws://` are supported.
For use in scripts, it is recommended to explicitly add this
option to make your scripts future-proof.
-g, --nogui if present, edgen will not start the GUI; the default
behavior is to start the GUI.
--help display usage information
⚡Edgen also supports compilation and execution on a GPU when building from source, through Vulkan, CUDA, and Metal. The following cargo features enable GPU support:
llama_vulkan - execute LLM models using Vulkan. Requires a Vulkan SDK to be installed.
llama_cuda - execute LLM models using CUDA. Requires a CUDA Toolkit to be installed.
llama_metal - execute LLM models using Metal.
whisper_cuda - execute Whisper models using CUDA. Requires a CUDA Toolkit to be installed.

Note that, at the moment, llama_vulkan, llama_cuda, and llama_metal cannot be enabled at the same time.
Example usage (building from source; you first need to install the prerequisites):
cargo run --features llama_vulkan --release -- serve
If you don't know where to start, check Edgen's roadmap! Before you start working on something, see if there's an existing issue or pull request. Pop into Discord to check with the team or see if someone's already tackling it.
llama.cpp, whisper.cpp, and ggml for being