Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.)
Apache-2.0 License
Thank you to all the contributors (in no particular order): @sheevy @alculquicondor @terrytangyuan @tenzen-y @kuizhiqing @lowang-bh @vsoch @emsixteeen @wang-mask @benash @yeahdongcn @xhejtman @pheianox @lianghao208
Published by alculquicondor over 1 year ago
Special thanks to @tenzen-y for multiple contributions.
Thank you to all the contributors (in no particular order): @mimowo @adilhusain-s @davidLif @ArangoGutierrez @shaowei-su @ggaaooppeenngg @pugangxa @HeGaoYuan @Dimss @alculquicondor @terrytangyuan
Published by terrytangyuan about 3 years ago
runPolicy (ttlSecondsAfterFinish, activeDeadlineSeconds, backoffLimit)
Published by terrytangyuan over 4 years ago
Published by terrytangyuan about 5 years ago
JobStatus from kubeflow/common
Published by terrytangyuan over 5 years ago
v1alpha2 MPI Operator.
Published by terrytangyuan over 5 years ago
ActiveDeadlineSeconds in MPIJobSpec
launcherOnMaster field
StartTime and CompletionTime in job status
Published by rongou almost 6 years ago
Initial release of the MPI Operator.