kserve
-
Published by yuzisun over 3 years ago
InferenceService V1Beta1
🚢 KFServing 0.5 promotes the core InferenceService from v1alpha2 to v1beta1!
The minimum required versions are Kubernetes 1.16, Istio 1.3.1, and Knative 0.14.3. A conversion webhook is installed to automatically convert v1alpha2 inference services to v1beta1.
🆕 What's new?
- You can now specify container fields on the ML framework spec, such as environment variables and liveness/readiness probes.
- You can now specify pod template fields on the component spec, such as node affinity.
- Allow specifying timeouts on component spec
- Tensorflow Serving gRPC support.
- Triton Inference server V2 inference REST/gRPC protocol support, see examples
- TorchServe predict integration, see examples
- SKLearn/XGBoost V2 inference REST/gRPC protocol support with MLServer, see SKLearn and XGBoost examples
- PMMLServer support, see examples
- LightGBM support, see examples
- Simplified canary rollout: traffic is split at the Knative revision level instead of the service level, see examples
- The transformer-to-predictor call now uses AsyncIO by default
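To illustrate the new v1beta1 fields, here is a rough sketch of an InferenceService that sets container fields on the framework spec and pod/component-level fields on the predictor. The storage URI, probe, and values are illustrative placeholders, not taken from this release; consult the linked examples for the authoritative spec.

```yaml
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    # component-level fields, new in v1beta1
    timeout: 60
    nodeSelector:
      disktype: ssd
    sklearn:
      storageUri: gs://your-bucket/models/sklearn/iris  # placeholder
      # container fields can now be set directly on the framework spec
      env:
      - name: LOG_LEVEL
        value: INFO
      livenessProbe:
        httpGet:
          path: /
          port: 8080
```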
⚠️ What's gone?
- The default/canary levels are removed; canaryTrafficPercent is moved to the component level
- The rollout_canary and promote_canary APIs are deprecated in the KFServing SDK
- The parallelism field is renamed to containerConcurrency
- The custom keyword is removed and the container field is changed to an array
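A rough sketch of what a custom predictor looks like after these changes (the image and container name are illustrative placeholders): the custom wrapper is gone, containers is an array on the predictor, and containerConcurrency replaces parallelism at the component level.

```yaml
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: custom-model
spec:
  predictor:
    containerConcurrency: 4        # formerly the parallelism field
    containers:                    # formerly a single container under the custom keyword
    - name: kfserving-container    # conventional container name; placeholder
      image: example.com/my-model:latest  # placeholder image
```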
⬆️ What actions are needed to upgrade?
- Make sure all canary traffic is rolled out before upgrading, as the v1alpha2 canary spec is deprecated; use the v1beta1 spec for the canary rollout feature.
- Although KFServing automatically converts the InferenceService to v1beta1, we recommend rewriting all your specs with the v1beta1 API, as we plan to drop support for v1alpha2 in a later version.
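A minimal before/after sketch of such a rewrite, with a placeholder storage URI: the v1alpha2 default level collapses into the component spec, and canaryTrafficPercent moves onto the component.

```yaml
# v1alpha2 (deprecated)
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  default:
    predictor:
      sklearn:
        storageUri: gs://your-bucket/models/sklearn/iris  # placeholder
---
# v1beta1 equivalent
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    canaryTrafficPercent: 10  # optional: route 10% of traffic to the latest revision
    sklearn:
      storageUri: gs://your-bucket/models/sklearn/iris  # placeholder
```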
Contribution list
- Make KFServer HTTP requests asynchronous #983 by @salanki
- Add support for generic HTTP/HTTPS URI for Storage Initializer #979 by @tduffy000
- InferenceService v1beta1 API #991 by @yuzisun
- Validation check for InferenceService Name #1079 by @jazzsir
- Set KFServing default worker to 1 #1106 by @yuzliu
- Add support for MLServer in the SKLearn predictor #1155 by @adriangonz
- Add V2 support to XGBoost predictor #1196 by @adriangonz
- Support PMML server #1141 by @AnyISalIn
- Generate SDK for KFServing v1beta1 #1150 by @jinchihe
- Support Kubernetes 1.18 #1128 by @pugangxa
- Integrate TorchServe to v1beta1 spec #1161 by @jagadeeshi2i
- Merge batcher to model agent #1287 by @yuzisun
- Fix torchserve protocol version and update doc #1271 #1277
- Support CloudEvent(Avro/Protobuf) for KFServer #1343 @mtickoobb
Multi Model Serving V1Alpha1
🌈 KFServing 0.5 introduces Multi Model Serving with the v1alpha1 TrainedModel CR. This is currently experimental and we are looking for your feedback!
Check out the sklearn and triton MMS examples.
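The new TrainedModel CR can be sketched roughly as below; the names, URI, and memory value are placeholders, so consult the linked examples for the authoritative spec. A TrainedModel attaches an individual model to an existing multi-model InferenceService:

```yaml
apiVersion: serving.kubeflow.org/v1alpha1
kind: TrainedModel
metadata:
  name: example-model
spec:
  inferenceService: sklearn-mms   # parent multi-model InferenceService (placeholder name)
  model:
    framework: sklearn
    storageUri: gs://your-bucket/models/example  # placeholder
    memory: 256Mi
```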
- Multi-Model Puller #989 by @ifilonenko
- Add multi model configmap #992 by @wengyao04
- Trained model v1alpha1 api #1009 by @yuzliu
- TrainedModel controller #1013 by @yuzliu
- Harden model puller logic and add tests #1055 by @yuzisun
- Puller streamlining/simplification #1057 by @njhill
- Integrate MMS inferenceservice controller, configmap controller, model agent #1132 by @yuzliu
- Add load/unload endpoint for SKLearn/XGBoost KFServer #1082 by @wengyao04
- Sync from model config on agent startup #1204 by @yuzisun
- Fix model puller flag for MMS #1281 by @yuzisun
- TrainedModel status url #1319 by @abchoo
- Add MMS support for SKLearn/XGBoost MLServer #1290 @adriangonz
- Support GCS for model agent #1105 @mszacillo
Explanation
- Add support for AIX360 explanations #1094 by @drewbutlerbb4
- Alibi 0.5.5 #1168 by @cliveseldon
- Adversarial robustness explainer(ART) #1244 by @drewbutlerbb4
- PyTorch Captum explain integration, see example
Documentation
- Docs/custom domain #1036 by @adamkgray
- Update ingress gateway access instruction #1008 by @yuzisun
- Document working k8s version #1062 by @riklopfer
- Add triton torchscript example with prediction v2 protocol #1131 by @yuzisun
- Add torchserve custom server with pv storage example #1182 by @jagadeeshi2i
- Add torchserve custom server example #1156 by @jagadeeshi2i
- Add torchserve custom server bert sample #1185 by @jagadeeshi2i
- Bump up minimal Kube and Istio requirements #1166 by @animeshsingh
- V1beta1 canary rollout examples #1267 by @yuzisun
- Prometheus based metrics and monitoring docs #1276 by @sriumcp
Developer Experience
- Migrate controller tests to use BDD testing style #936 by @yuzisun
- Genericized component logic #1018 by @ellistarn
- Use github action for kfserving controller tests #1056 by @yuzisun
- Make standalone installation kustomizable #1103 by @jazzsir
- Move KFServing CI to AWS #1170 by @yuzisun
- Upgrade k8s and kn go library versions #1144 by @ryandawsonuk
- Add e2e test for torchserve #1265 by @jagadeeshi2i
- Add e2e test for SKLearn/XGBoost MMS #1306 by @abchoo
- Upgrade k8s client library to 1.19 #1305 by @ivan-valkov
- Upgrade controller-runtime to 0.7.0 #1341 by @pugangxa