Code associated with our paper "Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations": https://arxiv.org/abs/1909.06993
This repository provides a code base to evaluate and train models from the paper "Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations".
ArXiv pre-print: https://arxiv.org/abs/1909.06993
Paper video: https://youtu.be/VKc3A5HlUU8
This project is licensed under the terms of the MIT license. By using the software, you are agreeing to the terms of the license agreement.
If you use this code in your research, please cite us as follows:
@article{bonatti2020learning,
  title={Learning Visuomotor Policies for Aerial Navigation Using Cross-Modal Representations},
  author={Bonatti, Rogerio and Madaan, Ratnesh and Vineet, Vibhav and Scherer, Sebastian and Kapoor, Ashish},
  journal={arXiv preprint arXiv:1909.06993},
  year={2020}
}
Recommended system (tested):
Python packages used by the example provided and their recommended versions:
In order to train the models and run AirSim, you first need to download all image datasets, behavior cloning datasets, network weights and AirSim binaries:
- Download and extract all the files
- Make sure there is a settings.json file inside ~/Documents/AirSim on your computer
In order to train the cross-modal representations you can either use the downloaded image dataset from the previous step, or generate the data yourself using AirSim:
- Go to the cmvae folder and, inside the file train_cmvae.py, edit the variable data_dir to the correct path of the extracted dataset on your computer (see the sketch after this list). The default value is the directory with 1K images, but for final training you will need more images, such as the 50K or 300K datasets
- Edit the variable output_dir to the place where you want the models to be saved
- Run the training script: train_cmvae.py
- Once training is done, evaluate the model with eval_cmvae.py
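As a quick reference, the edits to train_cmvae.py boil down to pointing two path variables at your own folders. The paths below are placeholders, not the script's actual defaults:

```python
# Sketch of the two variables to edit in cmvae/train_cmvae.py.
# Both paths are placeholders; substitute the locations on your own machine.
data_dir = '/yourpath/all_files/img_datasets/img_50k'        # extracted image dataset (use the 50K/300K sets for final training)
output_dir = '/yourpath/all_files/model_outputs/cmvae_run0'  # folder where model checkpoints will be saved
```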
You may want to generate a custom dataset for training your cross-modal VAE. Here are the steps to do it:
- Start the AirSim environment from the downloaded binary:
$ cd /yourpath/all_files/airsim_binaries/vae_env
$ ./AirSimExe.sh -windowed
- If a pop-up asks whether you want car simulation, click No
- In the file datagen/img_generator/main.py, first change the desired number of samples and the saved dataset path (see the sketch after this list)
- Run the dataset generation script: main.py (inside datagen/img_generator)
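The exact identifiers inside main.py may differ from the ones below; this is only a hypothetical sketch of the two values the step above asks you to change:

```python
# Hypothetical sketch of the values to change in datagen/img_generator/main.py
# before generating a custom dataset; the real variable names in the script may differ.
num_samples = 50000                                            # how many images to generate
dataset_path = '/yourpath/all_files/img_datasets/custom_50k'   # where the generated dataset will be written
```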
In order to train the behavior cloning networks you can either use the downloaded image-action pairs dataset or generate the data yourself using AirSim:
- Go to the imitation_learning folder and, inside the file train_bc.py, edit the variables base_path, data_dir_list and output_dir (see the sketch after this list). By default you will be using the downloaded datasets with 0m to 3m of random gate displacement amplitude over a course with an 8m nominal radius
- Run the training script: train_bc.py
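For reference, the edits amount to something like the following. The variable names come from the step above, but the paths and dataset folder names are placeholders, and the real script may combine base_path and data_dir_list differently:

```python
# Sketch of the variables to edit in imitation_learning/train_bc.py.
# Paths and folder names are placeholders for your own extracted datasets.
base_path = '/yourpath/all_files/il_datasets'   # root folder holding the behavior cloning datasets
data_dir_list = [
    'bc_dataset_0m',    # e.g. a dataset recorded with 0m of gate displacement noise
    'bc_dataset_3m',    # several datasets with different noise levels can be combined
]
output_dir = '/yourpath/all_files/model_outputs/bc_run0'   # folder where the trained policy will be saved
```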
You may want to generate a custom dataset for training your behavior cloning policies. Here are the steps to do it:
- Start the AirSim environment from the downloaded binary:
$ cd /yourpath/all_files/airsim_binaries/recording_env
$ ./AirSimExe.sh -windowed
- In datagen/action_generator/src/soccer_datagen.py, change the desired meta-parameters (number of gates, track radius, gate displacement noise, etc). See the sketch after this list
- Run the trajectory generator: soccer_datagen.py
- Once you're satisfied with the motion, turn off trajectory visualization by setting viz_traj to False. Otherwise the recorded images will show the motion line
- Press 'r' on your keyboard to start recording images. Velocities will be automatically recorded. Both are saved inside ~/Documents/AirSim
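Only viz_traj is named above; the remaining identifiers and values below are hypothetical, meant just to illustrate the kind of meta-parameters soccer_datagen.py exposes:

```python
# Hypothetical sketch of the meta-parameters in datagen/action_generator/src/soccer_datagen.py.
# Apart from viz_traj, the variable names here may not match the actual script.
num_gates = 8          # number of gates on the generated track
track_radius = 8.0     # nominal course radius in meters
gate_noise = 3.0       # amplitude of the random gate displacement in meters
viz_traj = False       # keep False while recording, otherwise images show the motion line
```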
Now you'll need to process the raw recordings so that you can match the time-stamps from velocity commands and images into a cohesive dataset. To do it:
- Inside ~/Documents/AirSim, copy the contents of both folders (the moveOnSpline_vel_cmd.txt file, the images folder and the images.txt file) into a new directory, for example /all_files/il_datasets/bc_test
- In datagen/action_generator/src/data_processor.py, modify the variable base_path to /all_files/il_datasets/bc_test (see the sketch after this list). Then run: data_processor.py
- Train the behavior cloning policy with train_bc.py, following the previous steps. You can combine different datasets with different noise levels to train the same policy
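As a point of reference, the edit and the expected folder layout look roughly like this; base_path comes from the step above, while the layout comments simply restate what was copied:

```python
# Sketch of the edit in datagen/action_generator/src/data_processor.py.
base_path = '/all_files/il_datasets/bc_test'
# Before running the script, base_path is expected to contain the copied recording:
#   moveOnSpline_vel_cmd.txt   velocity commands with time-stamps
#   images/                    recorded images
#   images.txt                 image time-stamps
```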
Now you can deploy the trained policies in AirSim, following these steps:
- Start the AirSim environment from the downloaded binary:
$ cd /yourpath/all_files/airsim_binaries/vae_env
$ ./AirSimExe.sh -windowed
- In imitation_learning/bc_navigation.py, modify policy_type and gate_noise (see the sketch after this list). Then run: bc_navigation.py
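The two settings named above might look roughly like this; the values are placeholders and the policy identifier is hypothetical, so check bc_navigation.py for the options it actually accepts:

```python
# Hypothetical sketch of the deployment settings in imitation_learning/bc_navigation.py.
# policy_type and gate_noise are named in this README; the values below are placeholders.
policy_type = 'bc_latent'   # which trained policy to fly (identifier is hypothetical)
gate_noise = 1.0            # gate displacement noise of the test track, in meters
```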
The policies trained in AirSim using the cross-modal representations can be transferred directly to real-world applications. Please check out the paper and video for more results from real-life deployments.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.