GPL-3.0 License
CLOUDY automates the execution of experiments on Google Cloud. It creates VM instances and buckets, installs dependencies, runs Python scripts, and handles resource cleanup.
The workflow of CLOUDY comprises the following steps:
The script launch.sh prepares a VM instance according to the options specified in the config.json file.
The script setup.sh is executed in the VM to install dependencies and run the indicated Python script.
The output is saved to an existing bucket, or a new one is created as required.
The instance is automatically deleted once its execution has finished.
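The steps above correspond roughly to a handful of gcloud invocations. The sketch below only builds those command lines from a dictionary shaped like config.json; it does not run them. The helper name `build_workflow_commands`, the `output` directory, the example service account, and the idea of passing setup.sh as a startup script are all assumptions made for illustration, so the actual launch.sh and setup.sh may work differently.

```python
import shlex


def build_workflow_commands(cfg, output_dir="output"):
    """Return one gcloud command line per workflow step (nothing is executed).

    Hypothetical helper for illustration; not taken from launch.sh or setup.sh.
    """
    name, zone = cfg["INSTANCE_NAME"], cfg["ZONE"]
    bucket = f"gs://{cfg['BUCKET_NAME']}"
    return {
        # Step 1: create the VM described by the configuration.
        "create_vm": [
            "gcloud", "compute", "instances", "create", name,
            f"--zone={zone}",
            f"--machine-type={cfg['MACHINE_TYPE']}",
            f"--image-family={cfg['IMAGE_FAMILY']}",
            f"--image-project={cfg['IMAGE_PROJECT']}",
            f"--service-account={cfg['SERVICE_ACCOUNT']}",
            # Assumption: setup.sh is handed to the VM as its startup script.
            f"--metadata-from-file=startup-script={cfg['SETUP_SCRIPT']}",
        ],
        # Step 3a: only needed when the bucket does not exist yet.
        "create_bucket": [
            "gcloud", "storage", "buckets", "create", bucket,
            f"--location={cfg['BUCKET_ZONE']}",
        ],
        # Step 3b: copy the results (assumed to live in output_dir) to the bucket.
        "save_output": [
            "gcloud", "storage", "cp", "--recursive", output_dir, bucket,
        ],
        # Step 4: delete the instance without an interactive prompt.
        "delete_vm": [
            "gcloud", "compute", "instances", "delete", name,
            f"--zone={zone}", "--quiet",
        ],
    }


# Example values mirror the sample config.json; the service account is a placeholder.
example_cfg = {
    "INSTANCE_NAME": "vm",
    "BUCKET_NAME": "bucket",
    "SERVICE_ACCOUNT": "sa@example-project.iam.gserviceaccount.com",
    "SETUP_SCRIPT": "setup.sh",
    "MACHINE_TYPE": "n2-standard-2",
    "ZONE": "europe-southwest1-b",
    "IMAGE_FAMILY": "ubuntu-2004-lts",
    "IMAGE_PROJECT": "ubuntu-os-cloud",
    "BUCKET_ZONE": "eu",
}

for step, cmd in build_workflow_commands(example_cfg).items():
    print(f"{step}: {shlex.join(cmd)}")
```

Printing the commands before running anything is a cheap way to review what would be created and deleted in your project.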
This project consists of the following scripts:
launch.sh: creates a VM instance on Google Cloud according to the configuration defined in config.json. It also downloads and copies your repository to the VM instance.
setup.sh: runs on the VM instance. Installs dependencies, runs your Python script, and saves the results to a Google Cloud bucket, creating it if necessary.
clean.sh: cleans up all VM instances and buckets on Google Cloud.
Makefile: enables the execution of the scripts through simple commands.
Prerequisites
First, create a service account on GCP with the required permissions for Compute Engine and Cloud Storage (e.g., the Storage Admin and Compute Instance Admin (v1) roles).
Then, install the following dependencies:
Google Cloud SDK: required to interact with Google Cloud from the command line.
jq: used to read the JSON configuration file.
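Before launching anything, it can help to confirm that both command-line tools are on your PATH. The following is a minimal sketch using only the Python standard library; the function name `missing_tools` is hypothetical and not part of CLOUDY.

```python
import shutil

# Executables provided by the Google Cloud SDK and by jq.
REQUIRED_TOOLS = ("gcloud", "jq")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.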
Edit config.json
Define your custom configuration in the config.json file, located in the root directory of the project. For example:
{
"INSTANCE_NAME": "vm",
"BUCKET_NAME": "bucket",
"REPO_URL": "https://github.com/manjavacas/cloudy.git",
"SCRIPT_PATH": "foo/foo.py",
"SCRIPT_ARGS": "cloudy",
"DEPENDENCIES": "numpy pandas",
"SERVICE_ACCOUNT": "[email protected]",
"SETUP_SCRIPT": "setup.sh",
"MACHINE_TYPE": "n2-standard-2",
"ZONE": "europe-southwest1-b",
"IMAGE_FAMILY": "ubuntu-2004-lts",
"IMAGE_PROJECT": "ubuntu-os-cloud",
"BUCKET_ZONE": "eu"
}
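A typo in a key name is easy to miss until a VM has already been created, so it may be worth checking the file first. The sketch below parses the configuration text and reports any absent keys. Treating every key from the example as required is an assumption, and `load_config` is a hypothetical helper rather than part of CLOUDY.

```python
import json

# Keys taken from the example configuration above.
# Assumption: all of them are required by the scripts.
REQUIRED_KEYS = {
    "INSTANCE_NAME", "BUCKET_NAME", "REPO_URL", "SCRIPT_PATH", "SCRIPT_ARGS",
    "DEPENDENCIES", "SERVICE_ACCOUNT", "SETUP_SCRIPT", "MACHINE_TYPE", "ZONE",
    "IMAGE_FAMILY", "IMAGE_PROJECT", "BUCKET_ZONE",
}


def load_config(text):
    """Parse configuration text and verify that every expected key is present."""
    cfg = json.loads(text)
    if not isinstance(cfg, dict):
        raise TypeError("config.json must contain a JSON object")
    missing = sorted(REQUIRED_KEYS - set(cfg))
    if missing:
        raise KeyError("missing configuration keys: " + ", ".join(missing))
    return cfg
```

To check the real file, pass its contents, for example `load_config(open("config.json").read())`.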
The main options to edit are:
INSTANCE_NAME and BUCKET_NAME: identifiers for the created instance and bucket.
REPO_URL: the repository to clone. This is where the code you want to execute is located.
SCRIPT_PATH and SCRIPT_ARGS: path to the Python script you want to execute in the repository, along with its input arguments.
DEPENDENCIES: dependencies required to run the Python script.
SERVICE_ACCOUNT: GCP service account to be used. It must have the necessary permissions.
Run CLOUDY
a. Using Makefile
$ make launch
$ make clean
$ make reset
b. Using cloudy.py
Alternatively, you can use the Python script cloudy.py for the same operations:
$ python cloudy.py launch
$ python cloudy.py clean
$ python cloudy.py reset
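Since both interfaces expose the same three operations, one way to picture the Python entry point is as a thin wrapper over the Makefile targets. The sketch below is an assumption about how such a wrapper could be written, not a description of the actual cloudy.py, which may call the shell scripts directly; `build_command` is a hypothetical name.

```python
import argparse

# The three operations shown in the commands above.
ACTIONS = ("launch", "clean", "reset")


def build_command(argv):
    """Map a command-line action to the equivalent make invocation."""
    parser = argparse.ArgumentParser(prog="cloudy.py")
    parser.add_argument("action", choices=ACTIONS, help="operation to perform")
    args = parser.parse_args(argv)
    # Assumption: each action corresponds to a Makefile target of the same name.
    return ["make", args.action]
```

A script built this way would finish with something like `subprocess.run(build_command(sys.argv[1:]), check=True)`, and argparse would reject any action other than the three listed.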