dstack is an open-source orchestration engine for cost-effectively running AI workloads in the cloud as well as on-premises. Discord: https://discord.gg/u8SmfwPpMd
MPL-2.0 License
Published by peterschmidt85 7 months ago
This is a preview of the upcoming `0.17.0` release, which brings several new features and multiple bug fixes.
- `instance_type` via CLI and profiles by @r4victor in https://github.com/dstackai/dstack/pull/1023
- Fix: `shm_size` property in resources doesn't take effect by @peterschmidt85 in https://github.com/dstackai/dstack/pull/1007
- Fix: `vastai` doesn't show any offers since 0.16.0 by @iRohith in https://github.com/dstackai/dstack/pull/959
- `main` by @peterschmidt85 in https://github.com/dstackai/dstack/pull/992
Full changelog: https://github.com/dstackai/dstack/compare/0.16.5...0.17.0rc2
Published by peterschmidt85 7 months ago
Full changelog: https://github.com/dstackai/dstack/compare/0.16.4...0.16.5
Published by peterschmidt85 7 months ago
The `0.16.4` update introduces the `cudo` backend, which allows running workloads with CUDO Compute, a cloud GPU marketplace.

To configure the `cudo` backend, specify your CUDO Compute project ID and API key:
```yaml
projects:
- name: main
  backends:
  - type: cudo
    project_id: my-cudo-project
    creds:
      type: api_key
      api_key: 7487240a466624b48de22865589
```
Once it's done, you can restart the `dstack server` and use the `dstack` CLI or API to run workloads.
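For example (a quick sketch; the GPU size is illustrative, and `--backend cudo` simply pins the run to the new backend):

```shell
# Restart the server so it picks up the new backend configuration
dstack server

# Then submit a workload, optionally pinning it to the cudo backend
dstack run . --gpu 24GB --backend cudo
```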
> [!NOTE]
> Limitations
> - The `dstack gateway` feature is not yet compatible with `cudo`, but it is expected to be supported in version `0.17.0`, planned for release within a week.
> - The `cudo` backend cannot yet be used with dstack Sky, but it will also be enabled within a week.
Full changelog: https://github.com/dstackai/dstack/compare/0.16.3...0.16.4
Published by peterschmidt85 7 months ago
- `shm_size` property in `resources` doesn't take effect #1006
- `~/.dstack/server/config.yml` #991

Full changelog: https://github.com/dstackai/dstack/compare/0.16.2...0.16.3
Published by peterschmidt85 8 months ago
`dstack pool`

- Default idle duration for `dstack pool add` set to `72h` #964
- Default spot policy for `dstack pool add` set to `on-demand` #962
- `lambda`, `azure`, and `tensordock` #923
- `dstack pool add` #918
- `dstack run` does not respect pool-related `profiles.yml` parameters #949
- `vastai` backend doesn't show any offers since 0.16.0 #958
- `~/.ssh/config` #937
- `~/.ssh/config` #933
- `PATH` is overridden when logging via SSH #930
- `Too many authentication failures` #927

We've also updated our guide on how to add new backends. It's now available here.
Full Changelog: https://github.com/dstackai/dstack/compare/0.16.0...0.16.1
Published by peterschmidt85 8 months ago
The `0.16.0` release is the next major update, which, in addition to many bug fixes, introduces pools: a new feature that enables a more efficient way to manage instance lifecycles and reuse instances across runs.
Previously, when running a dev environment, task, or service, `dstack` provisioned an instance in a configured backend, and upon completion of the run, deleted the instance.

Now, when using the `dstack run` command, it tries to reuse an instance from a pool. If no ready instance meets the requirements, `dstack` automatically provisions a new one and adds it to the pool. Once the workload finishes, the instance is marked as ready (to run other workloads). If the instance remains idle for the configured duration, `dstack` tears it down.
`dstack pool`

The `dstack pool` command allows for managing instances within pools.

To manually add an instance to a pool, use `dstack pool add`:
```shell
dstack pool add --gpu 80GB --idle-duration 1d
```
The `dstack pool add` command allows specifying resource requirements, along with the spot policy, idle duration, max price, retry policy, and other policies.

If no idle duration is configured, `dstack` sets it to `72h` by default. To override it, use the `--idle-duration DURATION` argument.
To learn more about pools, refer to the official documentation. To learn more about `0.16.0`, refer to the changelog.
- `dstack pool show` by @TheBits in https://github.com/dstackai/dstack/pull/898
Full changelog: https://github.com/dstackai/dstack/compare/0.15.1...0.16.0
Published by peterschmidt85 8 months ago
- Use `--gpus=all` instead of `--runtime=nvidia` #910

Full changelog: https://github.com/dstackai/dstack/compare/0.15.1...0.15.2rc2
Published by peterschmidt85 8 months ago
In addition to a few bug fixes, the latest update brings initial integration with Kubernetes (experimental) and adds the possibility to configure a custom VPC for AWS. Read below for more details.
With the latest update, it's now possible to configure a Kubernetes backend. In this case, if you run a workload, dstack will provision infrastructure within your Kubernetes cluster. This may work with both self-managed and managed clusters.
If you're using dstack with AWS, it's now possible to configure a `vpc_name` via `~/.dstack/server/config.yml`.
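For illustration, here's a minimal `~/.dstack/server/config.yml` sketch combining both new options; the `kubeconfig` field layout is an assumption, and `my-custom-vpc` is a placeholder:

```yaml
projects:
- name: main
  backends:
  # Experimental Kubernetes backend; exact fields are an assumption
  - type: kubernetes
    kubeconfig:
      filename: ~/.kube/config
  # AWS with the new custom VPC option
  - type: aws
    vpc_name: my-custom-vpc
    creds:
      type: default
```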
**Learn more about the new features in detail on the changelog page.**
- `get_latest_runner_build` by @Egor-S in https://github.com/dstackai/dstack/pull/871
Full Changelog: https://github.com/dstackai/dstack/compare/0.15.0...0.15.1
Published by peterschmidt85 8 months ago
With `0.15.0`, it is now possible to configure resources in the YAML configuration file:
```yaml
type: dev-environment

python: 3.11
ide: vscode

# (Optional) Configure `gpu`, `memory`, `disk`, etc
resources:
  gpu: 24GB
```
Supported properties include: `gpu`, `cpu`, `memory`, `disk`, and `shm_size`.

If you specify memory size, you can either specify an explicit size (e.g. `24GB`) or a range (e.g. `24GB..`, or `24GB..80GB`, or `..80GB`).
The `gpu` property allows specifying not only memory size but also GPU names and their quantity. Examples: `A100` (one A100), `A10G,A100` (either A10G or A100), `A100:80GB` (one A100 of 80GB), `A100:2` (two A100s), `24GB..40GB:2` (two GPUs with between 24GB and 40GB of memory), etc.
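For instance, the name, memory, and count specifiers can be combined in a single spec (a sketch based on the syntax above; combining all three specifiers this way is an assumption):

```yaml
resources:
  # Two A100s with 80GB of memory each (name:memory:count)
  gpu: A100:80GB:2
```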
Service endpoints now require the `Authorization` header with `"Bearer <dstack token>"`. This also includes the OpenAI-compatible endpoints.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com",
    api_key="<dstack token>"
)

completion = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    messages=[
        {"role": "user", "content": "Compose a poem that explains the concept of recursion in programming."}
    ]
)

print(completion.choices[0].message)
```
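If you call an endpoint directly rather than through an OpenAI client, pass the same token in the header. A sketch, assuming a TGI-style `/generate` route (the path and payload are illustrative):

```shell
curl -X POST https://<run-name>.<domain-name>/generate \
  -H 'Authorization: Bearer <dstack token>' \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is deep learning?"}'
```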
Authentication can be disabled by setting `auth` to `false` in the service configuration file.
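For example (a minimal sketch; the command and port are placeholders):

```yaml
type: service

# Disable token authentication for this endpoint
auth: false

port: 8000
commands:
  - python app.py
```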
Model mapping (required to enable the OpenAI-compatible interface) now supports `format: openai`.

For example, if you run vLLM using the OpenAI mode, it's possible to configure model mapping for it:
```yaml
type: service

python: "3.11"

env:
  - MODEL=NousResearch/Llama-2-7b-chat-hf

commands:
  - pip install vllm
  - python -m vllm.entrypoints.openai.api_server --model $MODEL --port 8000

port: 8000

resources:
  gpu: 24GB

model:
  format: openai
  type: chat
  name: NousResearch/Llama-2-7b-chat-hf
```
In case you have any questions, experience bugs, or need help,
drop us a message on our Discord server or submit it as a
GitHub issue.
Full Changelog: https://github.com/dstackai/dstack/compare/0.14.0...0.15.0
Published by peterschmidt85 9 months ago
With the upcoming `dstack 0.14.0`, we are extending the service configuration in dstack to enable you to optionally map your custom LLM to an OpenAI-compatible endpoint.
To try the preview of this upcoming feature, install `0.14.0rc1` and restart your server.
```shell
pip install "dstack[all]==0.14.0rc1"
```
Note: In order to use the new feature, it's important to delete your existing gateway (if any) using `dstack gateway delete` and then create it again with `dstack gateway create`.
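In other words (a sketch; the gateway name is a placeholder, and `dstack gateway create` may require additional options depending on your setup):

```shell
# Remove the old gateway, then create a fresh one
dstack gateway delete my-gateway
dstack gateway create
```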
To learn more about how the new mapping works, read our blog post on it.
Full Changelog: https://github.com/dstackai/dstack/compare/0.13.1...0.14.0rc1
Published by peterschmidt85 9 months ago
`dstack 0.13.1` is a minor update that introduces a couple of important fixes.
If you submit a task or a service via the Python API, you can now specify the `repo` with the `Client.runs.submit` method. This argument accepts an instance of `dstack.api.LocalRepo` (which mounts additional files to the run from a local folder), `dstack.api.RemoteRepo` (which mounts them from a remote Git repo), or `dstack.api.VirtualRepo` (which mounts them programmatically).
Here's an example:
```python
from dstack.api import Client, RemoteRepo

client = Client.from_config()

repo = RemoteRepo.from_url(
    repo_url="https://github.com/dstackai/dstack-examples",
    repo_branch="main"
)
client.repos.init(repo)

run = client.runs.submit(
    configuration=...,
    repo=repo,
)
```
This allows you to access the additional files in your run from the mounted repo.
More examples are now available in the API documentation.
Note that the Python API is just one possible way to manage runs. Another is the CLI, which automatically mounts the repo in the current folder, as shown below.
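For example (a sketch; `train.dstack.yml` is a placeholder configuration file):

```shell
# Initialize the repo in the current folder, then submit a run from it
dstack init
dstack run . -f train.dstack.yml
```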
Among other improvements, the update addresses the issue that previously prevented passing custom arguments to the run using `${{ run.args }}` in the YAML configuration.
Here's an example:
```yaml
type: task

python: "3.11" # (Optional) If not specified, your local version is used

commands:
  - pip install -r requirements.txt
  - python train.py ${{ run.args }}
```

Now, you can pass custom arguments to the run via `dstack run`:

```shell
dstack run . -f train.dstack.yml --gpu A100 --train_batch_size=1 --num_train_epochs=100
```
In this case, `--train_batch_size=1 --num_train_epochs=100` will be passed to `python train.py`.
Last but not least, we've extended our contribution guide with a new wiki page that guides you through the steps of adding a custom backend. This can be helpful if you decide to extend dstack with support for a custom backend (cloud provider).
Feel free to check out this new wiki page and share your feedback. As always, if you need help with adding custom backend support, you can ask our team for assistance.
To try out the update, run the following command:
```shell
pip install "dstack[all]==0.13.1"
```
After that, make sure to restart the server.
As always, you're very welcome to join our Discord server with any questions and feedback!
Stay tuned for more news to be announced soon regarding the next major release.
Published by peterschmidt85 10 months ago
The `dstack` 0.13.0 update introduces several new features and includes a new guide on deploying Mixtral 8x7B.
Previously, `dstack` set the disk size to `100GB` regardless of the cloud provider. Now, to accommodate larger language models and datasets, `dstack` enables setting a custom disk size using `--disk` in `dstack run` or via the `disk` property in `.dstack/profiles.yml`.
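For illustration, a minimal `.dstack/profiles.yml` sketch (the profile name is a placeholder, and the exact schema is an assumption; per these notes, `disk` goes under `resources`):

```yaml
profiles:
  - name: large-disk   # placeholder profile name
    resources:
      disk: 200GB      # instead of the former 100GB default
```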
With `dstack`, whether you're using dev environments, tasks, or services, you can opt for a custom Docker image (for self-installed dependencies) or stick with the default Docker image (`dstack` pre-installs CUDA drivers, Conda, Python, etc.).
We've upgraded the default Docker image's CUDA drivers to 12.1 (for better compatibility with modern libraries).
Lastly, and most importantly, we've added a guide on deploying Mixtral 8x7B as a service. This guide allows you to effortlessly deploy a Mixtral endpoint on any cloud platform of your preference.
That's all! Feel free to try out the update and the new guide, and share your feedback with us.
For updates or assistance, join our Discord.
--
For more details about the update, read our official blog.
Published by peterschmidt85 10 months ago
This is a release candidate for the upcoming `dstack` 0.13.0 update.
The major changes in this release include:
- The default Docker image doesn't include the CUDA compiler (`nvcc`). If you'd like to install the compiler, use `conda install cuda`; it will automatically install `nvcc` among other utilities. This can be important if you're using a training or serving framework that requires building a custom CUDA kernel.
- Configurable disk size (previously fixed at `100GB`). Use `--disk` with `dstack run` (e.g. `--disk 200GB`), or specify `disk` under `resources` in `.dstack/profiles.yml`.

All of these changes can already be tried with `dstack` 0.13.0rc1. Here's how to install the release candidate:
```shell
pip install "dstack[all]==0.13.0rc1"
```
And don't forget to restart the server!
As always, for questions and assistance, visit our Discord server.
Published by peterschmidt85 11 months ago
This update focuses on bug fixes and stability improvements:
- `dstack.FineTuningTask` failed because of a missing file

Go ahead and update the CLI:
```shell
pip install "dstack[all]" -U
```
And don't forget to restart the server!
As always, for questions and assistance, visit our Discord server.
Published by peterschmidt85 11 months ago
With `dstack 0.12.3`, you can now use `dstack` with Vast.ai, a marketplace providing GPUs from independent hosts at notably lower prices.
Configuring Vast.ai is very easy. Log into your Vast.ai account, click Account in the sidebar, and copy your API Key.
Then, go ahead and configure the backend via `~/.dstack/server/config.yml`:
```yaml
projects:
- name: main
  backends:
  - type: vastai
    creds:
      type: api_key
      api_key: d75789f22f1908e0527c78a283b523dd73051c8c7d05456516fc91e9d4efd8c5
```
Now you can restart the server and proceed to using the CLI or API for running development environments, tasks, and services.
```shell
$ dstack run --gpu 24GB --backend vastai --max-price 0.4

 #  REGION            INSTANCE  RESOURCES                       PRICE
 1  pl-greaterpoland  6244171   16xCPU, 32GB, 1xRTX3090 (24GB)  $0.18478
 2  ee-harjumaa       6648481   16xCPU, 64GB, 1xA5000 (24GB)    $0.29583
 3  pl-greaterpoland  6244172   32xCPU, 64GB, 2xRTX3090 (24GB)  $0.36678

Continue? [y/n]:
```
Questions and requests for help are very much welcome in our Discord server.
Published by peterschmidt85 11 months ago
The upcoming dstack 0.12.3 release introduces two major new features: a fine-grained API for deploying LLMs, and integration with Vast.ai.

Try these features with `0.12.3rc2`, the release candidate.
The fine-grained API enables deploying a text generation model with a single call:
```python
from dstack.api import Client, GPU, CompletionService, Resources

client = Client.from_config()

# Pass a model and quantization params
service = CompletionService(
    model_name="TheBloke/CodeLlama-34B-GPTQ",
    quantize="gptq"
)

# Deploy the model as a public endpoint
run = client.runs.submit(
    run_name="CodeLlama-34B-GPTQ",  # If not set, assigned randomly
    configuration=service,
    resources=Resources(gpu=GPU(memory="24GB"))
)
```
Once deployed, the model's endpoint is accessible at `https://<run-name>.<domain-name>`, supporting features like streaming, continuous batching, and tensor parallelism. Integration with LangChain is on the way.
Another major change is the integration with Vast.ai, a cost-effective GPU marketplace provider. If cost is an important factor, Vast.ai may be a perfect choice, especially for dev environments and tasks.
```shell
dstack run . --gpu 24GB --backend vastai

 Configuration  .dstack.yml
 Project        main
 User           admin
 Min resources  1xGPU (24GB)
 Max price      -
 Max duration   6h
 Spot policy    on-demand
 Retry policy   no

 #  BACKEND  REGION        INSTANCE  RESOURCES                         SPOT  PRICE
 1  vastai   vn-hochiminh  7326430   12xCPU, 128GB, 1xRTX4090 (24GB)   no    $0.41983
 2  vastai   es-seville    7058202   12xCPU, 64GB, 1xRTX4090 (24GB)    no    $0.50944
 3  vastai   tw-newtaipei  7371375   128xCPU, 193GB, 4xRTX4090 (24GB)  no    $1.95472
    ...

Continue? [y/n]:
```
To use Vast.ai, simply configure the corresponding backend in `~/.dstack/server/config.yml`. See example.
Something doesn't work, or do you have a question? Write to us via Discord.
- Pass `backend_data` to `compute.terminate_instance` by @Egor-S in https://github.com/dstackai/dstack/pull/775
Full Changelog: https://github.com/dstackai/dstack/compare/0.12.2...0.12.3rc2
Published by peterschmidt85 12 months ago
With `dstack 0.12.2`, you can now access TensorDock's cloud GPUs, leveraging their highly competitive pricing.

Configuring your TensorDock account with `dstack` is very easy. Simply generate an authorization key in your TensorDock API settings and set it up in `~/.dstack/server/config.yml`:
```yaml
projects:
- name: main
  backends:
  - type: tensordock
    creds:
      type: api_key
      api_key: 248e621d-9317-7494-dc1557fa5825b-98b
      api_token: FyBI3YbnFEYXdth2xqYRnQI7hiusssBC
```
Now you can restart the server and proceed to using the CLI or API for running development environments, tasks, and services.
```shell
dstack run . -f .dstack.yml --gpu 40GB

 Min resources  1xGPU (40GB)
 Max price      -
 Max duration   6h
 Retry policy   no

 #  REGION        INSTANCE  RESOURCES                     SPOT  PRICE
 1  unitedstates  ef483076  10xCPU, 80GB, 1xA6000 (48GB)  no    $0.6235
 2  canada        0ca177e7  10xCPU, 80GB, 1xA6000 (48GB)  no    $0.6435
 3  canada        45d0cabd  10xCPU, 80GB, 1xA6000 (48GB)  no    $0.6435
    ...

Continue? [y/n]:
```
Questions and requests for help are very much welcome in our Discord server.
Published by peterschmidt85 12 months ago
Exciting news! In the upcoming `v0.12.2` release, we are adding support for TensorDock, which enables the use of cloud GPUs at a very low cost.
To give it a try, install the `0.12.2rc1` preview build and configure the `tensordock` backend following the updated docs.
Published by peterschmidt85 12 months ago
Version 0.12.1 introduces the API for fine-tuning LLMs with just a single line of code.
This API takes the name of the model and dataset from the Hugging Face hub, along with training parameters and your Hugging Face API key. It fine-tunes the model using the SFT method. Once the fine-tuning is complete, the model is pushed to the Hugging Face hub.
Learn more about the new API at dstack.ai/docs/guides/fine-tuning/.
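For a rough idea of the API's shape, here's a hypothetical sketch in the style of the deployment example above; apart from `FineTuningTask` (mentioned elsewhere in these notes), the parameter names are assumptions, not the confirmed API:

```python
from dstack.api import Client, FineTuningTask, Resources, GPU

client = Client.from_config()

# Hypothetical parameters: names are assumptions, not the confirmed API
task = FineTuningTask(
    model_name="NousResearch/Llama-2-7b-hf",   # base model on the HF hub
    dataset_name="<user>/<dataset>",           # dataset on the HF hub
    env={"HUGGING_FACE_HUB_TOKEN": "<your token>"},
    num_train_epochs=2,
)

# Fine-tunes via SFT; the resulting model is pushed to the HF hub
run = client.runs.submit(
    configuration=task,
    resources=Resources(gpu=GPU(memory="24GB")),
)
```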
Published by peterschmidt85 12 months ago
Version 0.12.1, planned for later this week, introduces the fine-grained API for fine-tuning LLMs with just a single line of code.
This API takes the name of the model and dataset from the Hugging Face hub, along with training parameters and your Hugging Face API key. It fine-tunes the model using the SFT method. Once the fine-tuning is complete, the model is pushed to the Hugging Face hub.
To try out the new API now, feel free to use the `0.12.1rc1` release candidate build.
Learn more about the new API at dstack.ai/docs/guides/fine-tuning/.