RayLLM - LLMs on Ray
APACHE-2.0 License
Thanks for contributions from:
@avnishn
@csivanich
@sihanwang41
@Yard1
@tterrysun
Published by avnishn 12 months ago
The following changes are introduced:
Thanks for contributions from:
@avnishn
@csivanich
@shrekris-anyscale
@sihanwang41
@richardliaw
@Yard1
Published by Yard1 about 1 year ago
Full Changelog: https://github.com/ray-project/ray-llm/compare/v0.3.0...v0.3.1
Published by Yard1 about 1 year ago
Please note that API stability is not expected until 1.0 release. This update introduces breaking changes.
This release introduces a new vLLM backend and removes the dependency on TGI. TGI is no longer Apache 2.0 licensed, and its new license is too restrictive for most organizations to run in production; vLLM, by contrast, is Apache 2.0 licensed and is a better foundation to build on. There are some breaking changes to model configuration YAMLs related to the new vLLM backend.
Refer to the updated ray-llm/models/README.md file for details on the updated configuration file format.
Documentation
API & SDK
Backend
In order to use RayLLM, ensure you are using the official Docker image anyscale/aviary:latest.
Published by Yard1 about 1 year ago
Documentation
API & SDK
- Added an OpenAI-compatible REST API (can be used with the openai Python package), with the following endpoints:
  - /v1/completions (the suffix, n, logprobs, echo, best_of, logit_bias and user OpenAI parameters are not supported; additional top_k, typical_p, watermark and seed parameters are available)
  - /v1/chat/completions (the n, logprobs, echo, logit_bias and user OpenAI parameters are not supported; additional top_k, typical_p, watermark and seed parameters are available)
  - /v1/models
  - /v1/models/<MODEL>
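As a quick sketch of how the OpenAI-compatible chat endpoint can be used, the helper below builds a request body for /v1/chat/completions. The model id, parameter values, and the local server URL in the comment are illustrative assumptions, not part of the release notes.

```python
import json

# Build a request body for the OpenAI-compatible /v1/chat/completions
# endpoint. top_k is one of the additional (non-OpenAI) parameters;
# the model id and default values here are illustrative.
def build_chat_request(model, prompt, temperature=0.7, top_k=40):
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_k": top_k,
    }

payload = build_chat_request("meta-llama/Llama-2-7b-chat-hf", "Say hello.")
print(json.dumps(payload))

# To send it against a locally running server (the port is an assumption):
# import requests
# resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
# print(resp.json()["choices"][0]["message"]["content"])
```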
- Added frequency_penalty and presence_penalty parameters
- aviary run is now blocking by default; note that rerunning aviary run will remove existing models
- The frontend is now served under the /frontend route to avoid conflicts with the backend
- The openai package is now a dependency for Aviary
Backend
- Renamed Predictor to Engine. Engine combines the functionality of initializers, predictors and pipelines; the separate Predictor and Pipeline abstractions are gone
- Added frequency_penalty and presence_penalty parameters
- The HUGGING_FACE_HUB_TOKEN env var is now propagated to all Aviary backend processes to allow access to gated models such as Llama 2

This update introduces breaking changes to model configuration YAMLs and the Aviary SDK. Refer to the migration guide below for more details.
In order to use the Aviary backend, ensure you are using the official Docker image anyscale/aviary:latest. Using the backend without Docker is not a supported use case. The anyscale/aviary:latest-tgi image has been superseded by anyscale/aviary:latest.
In the most recent version of Aviary we introduce breaking changes in the model YAMLs. This guide will help you migrate your existing model YAMLs to the new format.
1. Move everything under model_config.initialization to be directly under model_config, then remove model_config.initialization.
2. Remove the following sections/fields and everything that is under them:
   - model_config.initializer
   - model_config.pipeline
   - model_config.batching
3. Rename model_config to engine_config.
In v0.2, we introduce Engine, the Aviary abstraction for interacting with a model. In short, Engine combines the functionality of initializers, pipelines, and predictors.
Pipeline and initializer parameters are no longer configurable.
In v0.2 we remove the option to specify static batching; continuous batching is now used by default, for improved performance.
Add the Scheduler and Policy configs.
The scheduler is a component of the engine that determines which requests to run inference on. The policy is a component of the scheduler that determines the scheduling strategy. These components previously existed in Aviary, however they weren't explicitly configurable.
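To make the idea concrete, here is a toy sketch of continuous batching under a token-budget policy. All names and numbers are invented for illustration; this is not Aviary's actual scheduler implementation.

```python
from collections import deque

# Toy continuous-batching loop: each iteration, every in-flight request
# generates one token, finished requests leave the batch, and waiting
# requests are admitted while the max_batch_total_tokens budget allows.
# (Requests larger than the budget are out of scope for this sketch.)
def run(requests, max_batch_total_tokens):
    waiting = deque(requests)  # items are (request_id, tokens_remaining)
    batch, finished_order = [], []
    while waiting or batch:
        # Admit waiting requests while the token budget allows.
        while waiting and sum(t for _, t in batch) + waiting[0][1] <= max_batch_total_tokens:
            batch.append(waiting.popleft())
        # One inference iteration: each request produces one token.
        batch = [(rid, t - 1) for rid, t in batch]
        finished_order += [rid for rid, t in batch if t == 0]
        batch = [(rid, t) for rid, t in batch if t > 0]
    return finished_order

# Short requests finish and free budget for waiting ones mid-stream.
print(run([("a", 2), ("b", 1), ("c", 3)], max_batch_total_tokens=4))  # → ['b', 'a', 'c']
```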
Previously, the following parameters were specified under model_config.generation:
- max_batch_total_tokens
- max_total_tokens
- max_waiting_tokens
- max_input_length
- max_batch_prefill_tokens
To migrate, rename max_waiting_tokens to max_iterations_curr_batch and place these parameters under engine_config.scheduler.policy, for example:
engine_config:
  scheduler:
    policy:
      max_iterations_curr_batch: 100
      max_batch_total_tokens: 100000
      max_total_tokens: 100000
      max_input_length: 100
      max_batch_prefill_tokens: 100000
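The migration steps above can be sketched as a small script operating on the config as a Python dict (as if loaded from YAML). The field values and the hf_model_id key are illustrative; this is a sketch of the guide, not an official migration tool.

```python
# Old-style config, as if loaded from a model YAML (values illustrative).
old = {
    "model_config": {
        "initialization": {"hf_model_id": "example/model"},
        "initializer": {"type": "example"},
        "pipeline": "example",
        "batching": "static",
        "generation": {
            "max_waiting_tokens": 100,
            "max_batch_total_tokens": 100000,
            "max_total_tokens": 100000,
            "max_input_length": 100,
            "max_batch_prefill_tokens": 100000,
        },
    }
}

SCHEDULER_PARAMS = (
    "max_batch_total_tokens",
    "max_total_tokens",
    "max_input_length",
    "max_batch_prefill_tokens",
)

def migrate(config):
    mc = config.pop("model_config")
    # 1. Hoist fields under model_config.initialization, then drop it.
    mc.update(mc.pop("initialization", {}))
    # 2. Remove the dropped sections.
    for section in ("initializer", "pipeline", "batching"):
        mc.pop(section, None)
    # 3. Move scheduler parameters out of generation, renaming
    #    max_waiting_tokens to max_iterations_curr_batch.
    gen = mc.get("generation", {})
    policy = {p: gen.pop(p) for p in SCHEDULER_PARAMS if p in gen}
    if "max_waiting_tokens" in gen:
        policy["max_iterations_curr_batch"] = gen.pop("max_waiting_tokens")
    mc["scheduler"] = {"policy": policy}
    # 4. Rename model_config to engine_config.
    config["engine_config"] = mc
    return config

new = migrate(old)
print(new["engine_config"]["scheduler"]["policy"]["max_iterations_curr_batch"])  # → 100
```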
Published by Yard1 about 1 year ago
Full Changelog: https://github.com/ray-project/aviary/compare/v0.1.1...v0.1.2
Published by Yard1 over 1 year ago
Note: This update requires changes to model config YAMLs
Full Changelog: https://github.com/ray-project/aviary/compare/v0.1.0...v0.1.1
Published by Yard1 over 1 year ago
Note: This update breaks existing APIs and requires changes to model config YAMLs
Full Changelog: https://github.com/ray-project/aviary/compare/v0.0.3...v0.1.0
Published by Yard1 over 1 year ago
Full Changelog: https://github.com/ray-project/aviary/compare/v0.0.2...v0.0.3
Published by Yard1 over 1 year ago
Full Changelog: https://github.com/ray-project/aviary/compare/v0.0.1...v0.0.2
Published by Yard1 over 1 year ago
Initial public release.