kserve

Standardized Serverless ML Inference Platform on Kubernetes

APACHE-2.0 License


kserve - v0.6.0-rc0

Published by yuzisun over 3 years ago

🌈 What's New?

  • Web app for managing InferenceServices #1328
  • web-app: Add manifests for launching and exposing the app #1505
  • web-app: Implement a GitHub action for building the web app #1504
  • [storage-initializer] add support for aws sts #1451
  • MMS: Add healthcheck endpoint for InferenceService agent #1041
  • MMS: Trained Model Validation Webhook + Memory in trained model immutable #1394
  • MMS: multi-model-serving support for custom container in predictorSpec #1427
  • MMS: Added annotation to use anonymous credentials for s3 #1538
  • MMS: Adds condition for Trained Model to check if isvc predictor supports MMS #1522
  • MMS: Introducing HTTP protocol for MMS downloader
  • Improve PMMLServer predict performance #1405

🐛 What's Fixed?

  • Fix duplicated revision when creating the service initially #1467
  • The ingress virtual service is not reconciled when updating annotations/labels of inference service #1524
  • Model server response status code not propagated when using logger #1530
  • MMS service gets 404 during autoscaling #1338
  • MMS: Added mutex for downloader providers. Fixes #1531
  • MMS: Prevents /mnt/models/ from being converted into a file #1549
  • MMS: Watcher should not be started until models downloaded in MMS #1429
  • Resolve knative service diff to prevent dup revision #1484
  • Storage initializer download tar.gz or zip from uri with query params fails #1462
  • Make v1beta1 custom predictors have configurable protocol #1483
  • Fix logger for error response case #1533
  • [xgboostserver] Convert list input to numpy array before creating DMatrix #1513

What's Changed?

  • support knative 0.19+, defaults to knative-local-gateway #1334

Development experience and docs

  • speed-up alibi-explainer image build #1395
  • Update logger samples for newer eventing versions #1526
  • Update pipelines documentation #1498
  • Add github action for python lint #1485
  • Add Spark model inference example with export pmml file #1434
  • Update kubeflow overlay #1424
  • reorg multi-model serving doc #1412
kserve -

Published by yuzisun over 3 years ago

Features

  • Support credentials for HTTP storage URIs (#1372)
  • Trained Model Validation Webhook + Memory in trained model immutable (#1394)
  • Validate the parent inference service is ready in trained model controller (#1402)
  • Validation for storage URI in Trained Model webhook (#1407)

Bug Fixes

  • Use custom local gateway for isvc external service (#1382)
  • Avoid overwriting arguments specified on container fields (#1400)
  • Bug Fix for CloudEvent data access (#1396)
  • Propagate Inferenceservice annotations to top level virtualservice (#1403)
  • Remove unnecessary "latest" routing tag (#1378)
kserve -

Published by yuzisun over 3 years ago

InferenceService V1Beta1

🚢 KFServing 0.5 promotes the core InferenceService from v1alpha2 to v1beta1!

The minimum required versions are Kubernetes 1.16 and Istio 1.3.1/Knative 0.14.3. Conversion webhook is installed to automatically convert v1alpha2 inference service to v1beta1.

🆕 What's new?

  • You can now specify container fields on the ML Framework spec, such as environment variables and liveness/readiness probes.
  • You can now specify pod template fields on the component spec, such as NodeAffinity.
  • Allow specifying timeouts on component spec
  • Tensorflow Serving gRPC support.
  • Triton Inference server V2 inference REST/gRPC protocol support, see examples
  • TorchServe predict integration, see examples
  • SKLearn/XGBoost V2 inference REST/gRPC protocol support with MLServer, see SKLearn and XGBoost examples
  • PMMLServer support, see examples
  • LightGBM support, see examples
  • Simplified canary rollout, traffic split at knative revisions level instead of services level, see examples
  • Transformer to predictor call is now using AsyncIO by default
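To make the V2 inference REST protocol mentioned above concrete, here is a minimal sketch of building a request body; the model name, tensor name, and data values are hypothetical, but the field layout follows the V2 dataplane protocol (`inputs` with `name`, `shape`, `datatype`, `data`):

```python
import json

# V2 inference protocol request body (hypothetical "sklearn-iris" model).
# Each input tensor carries a name, shape, datatype, and flattened data.
payload = {
    "inputs": [
        {
            "name": "input-0",
            "shape": [2, 4],
            "datatype": "FP32",
            "data": [6.8, 2.8, 4.8, 1.4, 6.0, 3.4, 4.5, 1.6],
        }
    ]
}

# This body would be POSTed to the V2 endpoint, e.g.:
#   POST http://<ingress-host>/v2/models/sklearn-iris/infer
body = json.dumps(payload)
```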

⚠️ What's gone?

  • Default/Canary level is removed, canaryTrafficPercent is moved to the component level
  • rollout_canary and promote_canary API is deprecated on KFServing SDK
  • Parallelism field is renamed to containerConcurrency
  • Custom keyword is removed and container field is changed to be an array

⬆️ What actions are needed to upgrade?

  • Make sure all canary traffic is rolled out before upgrading, as the v1alpha2 canary spec is deprecated; please use the v1beta1 spec for the canary rollout feature.
  • Although KFServing automatically converts the InferenceService to v1beta1, we recommend rewriting all your specs with the v1beta1 API, as we plan to drop support for v1alpha2 in later versions.
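As a rough illustration of the v1beta1 shape (with `canaryTrafficPercent` moved to the component level, as described under "What's gone?"), here is a minimal spec expressed as a Python dict mirroring the YAML manifest; the service name and storageUri are placeholders:

```python
# Minimal v1beta1 InferenceService, expressed as a Python dict mirroring the
# YAML manifest. The storageUri below is a placeholder, not a real bucket.
isvc = {
    "apiVersion": "serving.kubeflow.org/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "sklearn-iris"},
    "spec": {
        "predictor": {
            # canaryTrafficPercent now lives at the component level;
            # the old top-level default/canary split is gone.
            "canaryTrafficPercent": 10,
            "sklearn": {"storageUri": "gs://example-bucket/sklearn/iris"},
        }
    },
}
```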

Contribution list

  • Make KFServer HTTP requests asynchronous #983 by @salanki
  • Add support for generic HTTP/HTTPS URI for Storage Initializer #979 by @tduffy000
  • InferenceService v1beta1 API #991 by @yuzisun
  • Validation check for InferenceService Name #1079 by @jazzsir
  • Set KFServing default worker to 1 #1106 by @yuzliu
  • Add support for MLServer in the SKLearn predictor #1155 by @adriangonz
  • Add V2 support to XGBoost predictor #1196 by @adriangonz
  • Support PMML server #1141 by @AnyISalIn
  • Generate SDK for KFServing v1beta1 #1150 by @jinchihe
  • Support Kubernetes 1.18 #1128 by @pugangxa
  • Integrate TorchServe to v1beta1 spec #1161 by @jagadeeshi2i
  • Merge batcher to model agent #1287 by @yuzisun
  • Fix torchserve protocol version and update doc #1271 #1277
  • Support CloudEvent(Avro/Protobuf) for KFServer #1343 by @mtickoobb

Multi Model Serving V1Alpha1

🌈 KFServing 0.5 introduces Multi Model Serving with the V1Alpha1 TrainedModel CR. This is currently experimental and we are looking for your feedback!

Check out the sklearn and triton MMS examples.

  • Multi-Model Puller #989 by @ifilonenko
  • Add multi model configmap #992 by @wengyao04
  • Trained model v1alpha1 api #1009 by @yuzliu
  • TrainedModel controller #1013 by @yuzliu
  • Harden model puller logic and add tests #1055 by @yuzisun
  • Puller streamlining/simplification #1057 by @njhill
  • Integrate MMS inferenceservice controller, configmap controller, model agent #1132 by @yuzliu
  • Add load/unload endpoint for SKLearn/XGBoost KFServer #1082 by @wengyao04
  • Sync from model config on agent startup #1204 by @yuzisun
  • Fix model puller flag for MMS #1281 by @yuzisun
  • TrainedModel status url #1319 by @abchoo
  • Add MMS support for SKLearn/XGBoost MLServer #1290 by @adriangonz
  • Support GCS for model agent #1105 by @mszacillo

Explanation

  • Add support for AIX360 explanations #1094 by @drewbutlerbb4
  • Alibi 0.5.5 #1168 by @cliveseldon
  • Adversarial robustness explainer(ART) #1244 by @drewbutlerbb4
  • PyTorch Captum explain integration, see example

Documentation

  • Docs/custom domain #1036 by @adamkgray
  • Update ingress gateway access instruction #1008 by @yuzisun
  • Document working k8s version #1062 by @riklopfer
  • Add triton torchscript example with prediction v2 protocol #1131 by @yuzisun
  • Add torchserve custom server with pv storage example #1182 by @jagadeeshi2i
  • Add torchserve custom server example #1156 by @jagadeeshi2i
  • Add torchserve custom server bert sample #1185 by @jagadeeshi2i
  • Bump up minimal Kube and Istio requirements #1166 by @animeshsingh
  • V1beta1 canary rollout examples #1267 by @yuzisun
  • Prometheus-based metrics and monitoring docs #1276 by @sriumcp

Developer Experience

  • Migrate controller tests to use BDD testing style #936 by @yuzisun
  • Genericized component logic #1018 by @ellistarn
  • Use github action for kfserving controller tests #1056 by @yuzisun
  • Make standalone installation kustomizable #1103 by @jazzsir
  • Move KFServing CI to AWS #1170 by @yuzisun
  • Upgrade k8s and kn go library versions #1144 by @ryandawsonuk
  • Add e2e test for torchserve #1265 by @jagadeeshi2i
  • Add e2e test for SKLearn/XGBoost MMS #1306 by @abchoo
  • Upgrade k8s client library to 1.19 #1305 by @ivan-valkov
  • Upgrade controller-runtime to 0.7.0 #1341 by @pugangxa
kserve -

Published by yuzisun almost 4 years ago

Final RC release for InferenceService V1Beta1

Merge logger/batcher to model agent

  • Merge batcher to model agent #1287
  • Fix model puller flag for MMS #1281
  • Fix torchserve protocol version and update doc #1271 #1277
  • Add e2e test for torchserve #1265
  • V1beta1 canary rollout examples #1267
  • Prometheus-based metrics and monitoring docs #1276
kserve -

Published by yuzisun almost 4 years ago

InferenceService V1Beta1

🚢 TorchServe Integration!

  • Add TorchServe to v1beta1 spec #1161 by @jagadeeshi2i

📝 Documentation

https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1beta1/torchserve

kserve -

Published by yuzisun almost 4 years ago

InferenceService V1Beta1

🚢 KFServing 0.5 promotes the core InferenceService from v1alpha2 to v1beta1!

The minimum required versions are Kubernetes 1.15 and Istio 1.3.1. Conversion webhook is installed to automatically convert v1alpha2 inference service to v1beta1.

🆕 What's new?

  • You can now specify container fields on the ML Framework spec, such as environment variables and liveness/readiness probes.
  • You can now specify pod template fields on the component spec, such as NodeAffinity.
  • Tensorflow Serving gRPC support.
  • Triton Inference server V2 inference REST/gRPC protocol support
  • SKLearn/XGBoost V2 inference REST/gRPC protocol support with MLServer
  • PMMLServer support
  • Allow specifying timeouts on component spec
  • Simplified canary rollout, traffic split at knative revisions level instead of services level
  • Transformer to predictor call is now made async

What's gone?

  • Default/Canary level is removed, canaryTrafficPercent is moved to the component level
  • Parallelism field is renamed to containerConcurrency

What actions are needed to upgrade?

  • Make sure all canary traffic is rolled out before upgrading, as the v1alpha2 canary spec is deprecated; please use the v1beta1 spec for the canary rollout feature.
  • Although KFServing automatically converts the InferenceService to v1beta1, we recommend rewriting all your specs with the v1beta1 API, as we plan to drop support for v1alpha2 in later versions.

Contribution list

  • Make KFServer HTTP requests asynchronous #983 by @salanki
  • Add support for generic HTTP/HTTPS URI for Storage Initializer #979 by @tduffy000
  • InferenceService v1beta1 API #991 by @yuzisun
  • Validation check for InferenceService Name #1079 by @jazzsir
  • Set KFServing default worker to 1 #1106 by @yuzliu
  • Add support for MLServer in the SKLearn predictor #1155 by @adriangonz
  • Add V2 support to XGBoost predictor #1196 by @adriangonz
  • Support PMML server #1141 by @AnyISalIn
  • Generate SDK for KFServing v1beta1 #1150 by @jinchihe
  • Support Kubernetes 1.18 #1128 by @pugangxa

Multi Model Serving V1Alpha1

🌈 KFServing 0.5 introduces Multi Model Serving with the V1Alpha1 TrainedModel CR. This is currently experimental and we are looking for your feedback!

  • Multi-Model Puller #989 by @ifilonenko
  • Add multi model configmap #992 by @wengyao04
  • Trained model v1alpha1 api #1009 by @yuzliu
  • TrainedModel controller #1013 by @yuzliu
  • Harden model puller logic and add tests #1055 by @yuzisun
  • Puller streamlining/simplification #1057 by @njhill
  • Integrate MMS inferenceservice controller, configmap controller, model agent #1132 by @yuzliu
  • Add load/unload endpoint for SKLearn/XGBoost KFServer #1082 by @wengyao04
  • Sync from model config on agent startup #1204 by @yuzisun

Explanation

  • Add support for AIX360 explanations #1094 by @drewbutlerbb4
  • Alibi 0.5.5 #1168 by @cliveseldon

Documentation

  • Docs/custom domain #1036 by @adamkgray
  • Update ingress gateway access instruction #1008 by @yuzisun
  • Document working k8s version #1062 by @riklopfer
  • Add triton torchscript example with prediction v2 protocol #1131 by @yuzisun
  • Add torchserve custom server with pv storage example #1182 by @jagadeeshi2i
  • Add torchserve custom server example #1156 by @jagadeeshi2i
  • Add torchserve custom server bert sample #1185 by @jagadeeshi2i
  • Bump up minimal Kube and Istio requirements #1166 by @animeshsingh

Developer Experience

  • Migrate controller tests to use BDD testing style #936 by @yuzisun
  • Genericized component logic #1018 by @ellistarn
  • Use github action for kfserving controller tests #1056 by @yuzisun
  • Make standalone installation kustomizable #1103 by @jazzsir
  • Move KFServing CI to AWS #1170 by @yuzisun
  • Upgrade k8s and kn go library versions #1144 by @ryandawsonuk
kserve - KFServing 0.4.1 release

Published by animeshsingh almost 4 years ago

KFServing patch release on top of v0.4 to enable deployment on OpenShift. Fixes include:

  • Fixed issues on openshift (#1122)
      • change to use port 9443
      • add rbac for finalizer
  • Add inferenceservice finalizer rbac rules (#1134)
  • Fixes KFServing SDK 0.4 import error while running the custom built image (#1117)
kserve - KFServing 0.4 release

Published by yuzisun about 4 years ago

Action Required

  • KFServing has added an object selector on the pod mutator webhook configuration, which requires at least Kubernetes 1.15 to take effect.
  • The generated KFServing InferenceService openAPI schema validation now includes markers like x-kubernetes-list-map-keys and x-kubernetes-map-type, which require at least Kubernetes 1.16. If you are on Kubernetes 1.15 or lower, please install KFServing with the --validate=false flag.
  • Tensorrt inference server has been renamed to Triton inference server; if you use a tensorrt predictor in your inference service YAML, please rename it to triton.
  • KFServing has removed the default percentage-based queue proxy resource limit due to #844. Please set queue proxy requests/limits in the knative config-deployment.yaml config map (introduced in Knative 0.16), or add the queue proxy resource limit annotation if you are on a lower version and your cluster has resource quota turned on. We highly recommend upgrading the Linux kernel if you are hitting the same CPU throttling issue.
  • The default S3 credential keys have been renamed from awsAccessKeyID and awsSecretAccessKey to the conventional AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; if you have secrets configured the old way, please update them accordingly.
  • KFServing has stopped maintaining model server image versions in the configmap; you can now set the corresponding model server version on the runtimeVersion field if you need a version different from the default.
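The S3 credential rename above amounts to a simple key mapping; a minimal sketch (the secret values are placeholders, and `migrate_s3_secret` is an illustrative helper, not part of KFServing):

```python
# Map the old KFServing S3 secret keys to the new conventional names.
RENAMES = {
    "awsAccessKeyID": "AWS_ACCESS_KEY_ID",
    "awsSecretAccessKey": "AWS_SECRET_ACCESS_KEY",
}

def migrate_s3_secret(data: dict) -> dict:
    """Return a copy of a secret's data with old keys renamed."""
    return {RENAMES.get(key, key): value for key, value in data.items()}

old = {"awsAccessKeyID": "abc", "awsSecretAccessKey": "xyz"}
new = migrate_s3_secret(old)
# new == {"AWS_ACCESS_KEY_ID": "abc", "AWS_SECRET_ACCESS_KEY": "xyz"}
```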

New features

  • Add batcher module as sidecar #847 @zhangrongguo
  • Add Default LivenessProbe to Tensorflow Predictor #925 @salanki
  • Remove framework image version list from configmap  #917 @yuzisun
  • Record Events when InferenceService goes in and out of readiness state #876 @ifilonenko
  • Triton inference server rename and integrations #747 @deadeyegoodwin
  • Alibi explainer upgrade to 0.4.0 #803 @cliveseldon
  • Make default request logger url more flexible #837 @ryandawsonuk 
  • Allow customized url paths on data plane #907 @iamlovingit
  • Add object selector for KFServing pod mutator webhook configuration #893 @yuzisun
  • Update logger to CloudEvents V1 protocol #886 @cliveseldon
  • Set ContainerConcurrency to Parallelism #806 @salanki

Bug Fixes

  • Disable retries in Istio VirtualService  #807 @salanki 
  • Remove default queue proxy resource limit and Add KFServing benchmarking #894 @yuzisun
  • Enhance SDK watch API to avoid traceback  #889 @jinchihe
  • Update KNative annotation when modifying minReplicas to 0 #963 @salanki
  • Allow configurable region name when creating minio client #823 @harshavardhana
  • Return 503 from healthhandler when model is not ready #818 @kolasanichaitanya
  • Updated S3 credential variable names to commonly used env var names #704 @karlschriek
  • Fix duplicated volume issue when attaching GCS secret #766 @kangwoo

Documentations

  • Add BERT example for triton inference server integration #750 @yuzisun
  • Add KFServing Debugging guide #829 @yuzisun
  • Add new KFServing sample for GCP IAP #853 @owennewo 
  • Add KFServing on Kubeflow with Istio-Dex Example #821 #822 @sachua 
  • Add Outlier Detection and Drift Detection Examples #764 @cliveseldon
  • Update pipeline sample to point to mnist e2e one  #926 @animeshsingh 
  • Add custom gRPC sample  #921 @Iamlovingit
  • Add custom inference example using BentoML #800 @yubozhao
  • Update KFServing roadmap for Q3/Q4 #861 @yuzisun

Developer Experience

  • Migrate KFServing to Go Module  #796 @yuzisun
  • Add tabular explainer e2e test #865 @janeman @yuzisun
  • Add logger and improve batcher e2e tests #938 @yuzisun
kserve - v0.3 "Stability"

Published by ellistarn over 4 years ago

Features

  • Pytorch model server with GPU inference #540
  • Support internal mesh routing to inference service, e.g. routing from a Kafka event source #583
  • Add storage URI for transformer #643
  • Add parallelism field to allow setting autoscaling target concurrency and number of tornado workers #637
  • SKLearn model server to support pickled model #560
  • Add extra information for Logger #699
  • Default min replica to 1 instead of 0 #655
  • Upgrade knative API from v1alpha1 to v1 for KFServing #585
  • Upgrade KFServing Kubernetes dependency 1.15 and knative dependency to 1.11 #630
  • Upgrade openapi-gen #600
  • Expose containerPort to let knative listen on logger port, support logger for custom spec #592
  • Self-signed certs generation script #650

Bug Fixes

  • Fix default queue proxy container resource limit which was too low #608
  • Allow configuring max buffer size for tornado server #665
  • Relax data plane "instances" key validation #705
  • Return application/json in response header #615
  • Fix top level virtual service for HTTPS #726

Developer Experience, Tools & Testing, Examples

  • Enable local development for model servers, explainer and storage initializer #591
  • Add wait inference service SDK api #610
  • Adding custom examples #678 #698
  • Add canary rollout examples #691
  • Add e2e tests for canary rollout #658
kserve - v0.2 Prerelease

Published by ellistarn about 5 years ago