Introducing Server Mode, which exposes an easy-to-use REST API for the underlying GPT model. Let's go to production :)
Published by gotzmann over 1 year ago
Nothing special: more stable inference and saner default parameters.
Published by gotzmann over 1 year ago
Inference performance was boosted for CPUs that support vector math.
Please use:
--neon for Apple Silicon (M1-M3 processors) and ARM servers
--avx for Intel and AMD CPUs that support the AVX2 instruction set
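The flag choice above follows directly from the CPU architecture. As a minimal sketch of that decision (this helper is illustrative only and not part of llama.go itself), one could map the build architecture to the matching flag:

```go
package main

import (
	"fmt"
	"runtime"
)

// suggestFlag is a hypothetical helper that picks the vector-math flag
// matching the CPU architecture the binary was built for.
func suggestFlag() string {
	switch runtime.GOARCH {
	case "arm64":
		return "--neon" // Apple Silicon (M1-M3) and ARM servers
	case "amd64":
		return "--avx" // Intel and AMD CPUs with AVX2
	default:
		return "" // no vector-math flag for other architectures
	}
}

func main() {
	// Output depends on the architecture this runs on.
	fmt.Println(suggestFlag())
}
```

Note that on amd64 this only reflects the architecture, not whether the particular CPU actually implements AVX2; that would require a runtime feature check.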
Published by gotzmann over 1 year ago
This version supports bigger, multipart LLaMA models (tested with 7B and 13B) converted into the latest GGMJ binary format with a custom Python script (see README).
Published by gotzmann over 1 year ago
The very first public release of LLaMA.go.