serving

Kubernetes-based, scale-to-zero, request-driven compute

APACHE-2.0 License



serving - Knative Serving release v0.14.1

Published by knative-prow-releaser-robot over 4 years ago

Meta

Monitoring Bundle is deprecated

We have made the decision to deprecate the bundled monitoring tools that have remained unchanged since 2018 due to a lack of community interest. We will stop releasing them in a coming release and will instead focus on documenting how to integrate with existing monitoring systems using OpenTelemetry.

V1 is now our storage version

We have included a new migration Job to migrate existing resources. See the serving-storage-version-migration.yaml release artifact.

Several new net-* repos!

  • Our Istio integration has moved out of Serving and into knative/net-istio.
  • Kourier has moved to knative/net-kourier.
  • We have a new knative/net-http01 project for implementing auto-TLS.

We have NOT bumped our minimum Kubernetes dependency (still 1.15)

We were unable to bump our minimum Kubernetes dependency to 1.16 this release as planned due to its lack of availability in GKE (on which we have a hard dependency for CI/CD). The principle behind our choice of minimum upstream version remains the same, and users should expect future releases to attempt to “catch up”.

Autoscaling

  • Disable metric scraping in situations where the activator is always in path for increased efficiency #7431 (thanks @dsimansk)
  • Added a metric for measuring metric scraping overhead #7232 (thanks @rmoe)
  • The “Metric” resource now surfaces potential errors in its status #7525 (thanks @markusthoemmes)
  • Activator tracks revision public service endpoints to assign downstream pods #7208 (thanks @vagababov)
  • Documented the internal autoscaling systems #7126 (thanks @markusthoemmes)
  • Cleanups and improvements (logging, metrics, config map, unit and e2e tests, etc.); many PRs (thanks @julz, @mgencur, @vagababov, @markusthoemmes)

Fixed various bugs

  • Fixed races where a revision briefly scales below minScale only to immediately scale up again #7110, #7214 (thanks @tanzeeb)
  • Fixed a bug where a revision would never become ready if minScale was set > 1 #7514 (thanks @markusthoemmes)
  • Fixed a bug where request counts have been reported off by 1 on scale-from-0 #7109 (thanks @vagababov)
  • Fixed potential panics around timeout handling in the queue-proxy #7138, #7146 (thanks @JRBANCEL)
  • Fixed a rare race condition where the activator would fail to schedule new requests even though there was capacity in the system #7360 (thanks @markusthoemmes)

Core API

V1 is now our storage version #7204, #7499 (thanks @dprotaso)

After installing 0.14, a new migration Job must be run to migrate pre-existing resources, and remove v1alpha1 as a stored version from our CRDs.

Support for resolving AWS ECR images #7244 (thanks @mattmoor)

Fixes a long-standing issue where tag resolution did not work properly for AWS ECR.

Assorted Cleanups:

  • Leader election config map cleaned up, defaulting is implemented, example verified as default values (thanks @vagababov)

Networking

Introducing knative/net-istio repository (thanks @mattmoor, @nghia, @tshafer):

The Istio KIngress reconciler is now separated into its own repository, knative/net-istio, enabling more focused testing on presubmits. In the future, Istio integration bugs should be filed against this new repository.

Introducing knative/net-http01 repository (thanks @mattmoor):

knative/net-http01 is a simple standalone ACME HTTP01 solver for the Knative Certificate abstraction.

Introducing knative/net-kourier repository (thanks @dortiz, @jmprussi):

A new home for Kourier - a lightweight Envoy-based Knative Ingress reconciler previously hosted at https://github.com/3scale/kourier.

Support Istio canonical service and revision #6832 (thanks @tshafer):

Adding Istio canonical service labels (https://github.com/istio/istio/pull/20943) to Knative objects for better integration with Istio UX.

Use /healthz for probe path for easier whitelisting #5918 (thanks itsmurugappa, shreejad)

We changed our probe path from /_internal/knative/activator/probe to /healthz and made that consistent across all probe receivers in Knative Serving.

Best effort Istio probing #6962 (thanks JRBANCEL)

Any scenario where probing would fail forever with the current implementation is now treated as a successful probe. This allows failing open in cases where users have a 3-legged OAuth setup that would cause probing to fail indefinitely.

Generated VirtualService contains wrong gateways field knative/net-istio#44 (thanks @yanniszark)

Previously, we sometimes referred to unused Gateways in a VirtualService. That caused issues with Istio validation logic if those unused Gateways did not exist. Unused Gateways are no longer referenced from VirtualServices.

Assorted cleanups:

  • Remove usages of deprecated field VirtualService.WebsocketUpgrade knative/net-istio#53 (thanks @nak3)
  • Networking ConfigMap cleaned up, example verified as defaults, and Go templates are cached rather than parsed on every invocation #7403, #7408, #7395 (thanks @vagababov)

serving - Knative Serving release v0.15.0

Published by knative-prow-releaser-robot over 4 years ago

Meta

go mod migration

Knative is now completely migrated to Go modules.

Serving release artifact deprecations

serving.yaml and serving-cert-manager.yaml will be shipped for the last time in this release. They have been broken out into separate artifacts. Please refer to the current installation docs for guidance on how to install Knative Serving and its optional components.

Minimum supported Kubernetes version bumped to 1.16

In line with the Kubernetes minimum version principle, our minimum supported Kubernetes version is now 1.16.

Autoscaling

Activator Subsetting (thanks @vagababov)

We compute a subset of Activator pods for each revision in a consistent manner, rather than assigning all. This noticeably improves load balancing for smaller revisions with small container concurrency values.
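The idea of picking a consistent subset can be sketched as follows. This is not Knative's actual algorithm; it swaps in rendezvous (highest-random-weight) hashing as one well-known way to choose a stable subset per key. The function names `score` and `subset`, and the pod and revision names, are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// score hashes a (revision, activator) pair; a higher score means the
// activator is preferred for that revision.
func score(revision, activator string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(revision))
	h.Write([]byte{0}) // separator so "ab"+"c" differs from "a"+"bc"
	h.Write([]byte(activator))
	return h.Sum64()
}

// subset returns n activators for the revision. The choice depends only on
// the names involved, not on the order of the input slice, so every caller
// that sees the same activators computes the same subset.
func subset(revision string, activators []string, n int) []string {
	if n < 0 {
		n = 0
	}
	ranked := append([]string(nil), activators...)
	if n < len(ranked) {
		sort.Slice(ranked, func(i, j int) bool {
			si, sj := score(revision, ranked[i]), score(revision, ranked[j])
			if si != sj {
				return si > sj
			}
			return ranked[i] < ranked[j] // deterministic tie-break
		})
		ranked = ranked[:n]
	}
	sort.Strings(ranked)
	return ranked
}

func main() {
	activators := []string{"activator-a", "activator-b", "activator-c", "activator-d"}
	fmt.Println(subset("default/hello-00001", activators, 2))
	fmt.Println(subset("default/world-00003", activators, 2))
}
```

A small revision then receives traffic from only a few activators, so each of them sees a larger share of that revision's requests and can balance them more evenly across its pods.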

  • Improved pod scraping latency by directly scraping pods if available #7804 (thanks @vagababov)
  • Autoscaling Documentation (thanks @markusthoemmes)
  • Last pod retention period #7931 (thanks @vagababov)
  • Unify Activator and QueueProxy stats reporting libraries and report more precise concurrency values from Activator #7775 (thanks @markusthoemmes)
  • Add a global setting which prohibits setting container concurrency to 0 #7932 (thanks @julz)
  • Progress deadline is now a configurable parameter #7649 (thanks @vagababov)
  • Burst capacity is calculated over the panic window now (thanks @vagababov)
  • General code cleanup, test stabilization, etc. (thanks @julz, @markusthoemmes, @vagababov, @nak3)

Core API

  • Our Revision shape has slightly changed to support multiple containers in the future #7373 (thanks @savitaashture)
    • Revision.Status.ImageDigest is deprecated and the digest will appear in Revision.Status.ContainerStatus.
  • Enable K8s dry-run as an experimental feature to provide faster feedback when your template won't create a valid Pod #3425 (thanks @whaught)
    • These are currently opt-in via the following annotation (may change)
      • features.knative.dev/podspec-dryrun: enabled
      • features.knative.dev/podspec-dryrun: strict
    • Strict mode will return failures if dry-run is not supported. This happens when webhooks have side-effects.
  • Webhook infrastructure now supports receiving a callback when a deletion occurs pkg/#1219 (thanks @whaught)
  • Some lingering and deprecated v1alpha1 properties have been removed from our go types
    • Revision's concurrencyModel #7893 (thanks @vagababov)
    • Revision's buildName and buildRef #7896 (thanks @vagababov)
  • Reduced some churn reconciling deleted objects when they were tracking dependent resources #7679 (thanks @markusthoemmes)
  • genreconciler now allows developers to override the controller’s name pkg/#1137 (thanks @shashwathi @andrew-su)

Networking

  • Remove /var/log symlink logic from the queue proxy #7882 (thanks @dprotaso)
    • /var/log log capture now supports containers that aren't named user-container.
  • Add support for labels in DomainTemplate #7647 (thanks @duglin)
    • This allows users to create custom URLs via the template and to choose custom domains in the config-domain configMap via labels.
  • net-certmanager repository setup and code migration (thanks @ZhiminXiang)
    • Cert-manager related resources for AutoTLS are generated and released from the net-certmanager repository now.
  • KIngress no longer uses retries #7842 (thanks @tcnghia)
  • Operation names for the activator's proxy span and the queue-proxy's span are renamed to {activator,queue}_proxy #7934 (thanks @nak3)
  • Ingress conformance test for visibility and path #7666 (thanks @andrew-su)
  • Better timeouts for the ingress prober #7702 (thanks @JRBANCEL)
  • For ingress prober, use default http.Transport and context with timeout for better timeouts #7702 (thanks @JRBANCEL)
  • Use "go mod" within net-istio, net-contour, net-certmanager, net-http01 (thanks @andrew-su, @mattmoor, @tcnghia, @ZhiminXiang)
  • Propagate status from KCert to Route #7163 (thanks @nak3)
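The label support in DomainTemplate noted above can be sketched with Go's text/template package. The `domainValues` struct and its field names (`Name`, `Namespace`, `Domain`, `Labels`) are assumptions modeled on the usual `{{.Name}}.{{.Namespace}}.{{.Domain}}` template shape, and the `team` label is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// domainValues is an assumed shape for the data passed to the template.
type domainValues struct {
	Name      string
	Namespace string
	Domain    string
	Labels    map[string]string
}

// renderDomain parses the template and executes it against the values.
func renderDomain(tmpl string, v domainValues) (string, error) {
	t, err := template.New("domain").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, v); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	v := domainValues{
		Name:      "hello",
		Namespace: "default",
		Domain:    "example.com",
		Labels:    map[string]string{"team": "payments"}, // hypothetical label
	}

	// A template that pulls one label into the generated host name.
	host, err := renderDomain(`{{.Name}}-{{index .Labels "team"}}.{{.Domain}}`, v)
	if err != nil {
		panic(err)
	}
	fmt.Println(host) // hello-payments.example.com
}
```

Using `index` keeps the template valid for label keys that contain dots or slashes, which field-style access cannot express.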
serving - Knative Serving release v0.13.3

Published by knative-prow-releaser-robot over 4 years ago

Meta

Minimum Kubernetes version remains 1.15

This is NOT a change from 0.12; however, with the adoption of Conversion webhooks, this is no longer something that may be overridden without consequence.
The target minimum version for our next (0.14) release will be Kubernetes 1.16.

Deprecation of the Alpha and Beta Serving APIs

The v1 APIs are now available in every supported version of Knative, and our controllers are now consuming v1 themselves.
We will continue to ship the deprecated APIs for 9 months (6 releases), so these will be removed in the 0.19 release.

We now rely on CRD Conversion webhooks

We take advantage of this long-awaited Beta+ feature in 1.15+ to manage converting between v1alpha1, v1beta1, and v1 types.

Autoscaling

  • Probe and forward traffic to non-ready pods (#6695 thanks @markusthoemmes)
  • gRPC e2e autoscaling test (#6778, thanks @tanzeeb, @shashwathi)
  • We no longer restrict min target at 1, permitting correct target utilization with CC=1 (#6951 thanks @vagababov)
  • Ignore young pods from metric computation (#6649, #6626, thanks @vagababov)
  • Metric Reporting Refactoring
    • Activator #6804, #6843 (thanks @markusthoemmes)
    • Autoscaler (#6774,#6707, #6712), QP #6852 (thanks @vagababov)
  • Cleanups and improvements (logging, metrics, unit and e2e tests, etc.); many PRs (thanks @jpeach, @MIBc, @taragu, @markusthoemmes, @vagababov)
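The effect of no longer restricting the minimum target to 1 can be shown with a little arithmetic. This is a simplified sketch, not the autoscaler's code: it assumes the per-pod target is containerConcurrency multiplied by a target utilization (0.7 is used here as an assumed value) and that the desired pod count is the observed concurrency divided by that target, rounded up. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// perPodTarget computes the concurrency each pod should carry. With
// clampAtOne set it mimics the old behaviour of never going below 1.
func perPodTarget(containerConcurrency, utilization float64, clampAtOne bool) float64 {
	t := containerConcurrency * utilization
	if clampAtOne && t < 1 {
		return 1
	}
	return t
}

// desiredPods is the simplified scaling rule: observed load over target,
// rounded up to a whole pod.
func desiredPods(observedConcurrency, target float64) int {
	return int(math.Ceil(observedConcurrency / target))
}

func main() {
	const observed = 10.0 // concurrent requests currently in flight

	old := perPodTarget(1, 0.7, true)  // clamped to 1: utilization is ignored
	now := perPodTarget(1, 0.7, false) // 0.7: utilization is respected

	fmt.Println(desiredPods(observed, old)) // 10
	fmt.Println(desiredPods(observed, now)) // 15
}
```

With the clamp, a CC=1 revision is sized as if every pod were always fully busy; without it, the configured headroom is kept.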

Core API

  • We’ve adopted generated reconcilers to help minimize the boilerplate in our controllers (thanks @n3wscott, @mattmoor, @shashwathi #6993 #6973 #6969 #6952)
  • We’ve removed the serving stats reporter that was reporting some nonsensical metrics (thanks @mattmoor #6939)
  • Webhook certificates now rotate (thanks @mpetason knative/pkg#1101)
  • The validating admission webhook will now apply the correct defaults (thanks @itsmurugappan #6938)
  • We’ve started our journey of actually deprecating the v1alpha1 APIs for the resources Service, Configuration, Revision and Route.
    • The controllers for these resources use the v1 APIs (thanks @dprotaso, @mattmoor #6933 #6949 #6950 #6957 #6958 #6959 #6960)
    • Thanks for the conversion webhook framework @dprotaso (knative/pkg#993)
    • Note: v1alpha1 will remain the storage version until we provide guidance on how to migrate the storage version to v1 - see #6726
    • Our current plan is to remove the v1alpha1 and v1beta1 APIs in 0.19
    • HPA autoscaling using Revision metrics (concurrency & requests per second) now uses the v1 APIs (thanks @dprotaso #6957).
      • Consuming revision metrics for the resource version v1alpha1 is deprecated and will be removed in the next release (0.14)

Networking

  • Deprecate the istio.sidecar.includeOutboundIPRanges in config-network #6597 (thanks @nak3)
  • Avoid unconditionally reconciling the Gateways on deletion #6934 (thanks @ZhiminXiang)
  • Remove "internal" in class name of Certificates #6887 (thanks @ZhiminXiang)
  • Wrong revision is picked up for traffic target marked as "latest" #6876 (thanks @taragu)
  • Fix Ready -> NotReady -> Ready flip flops in Ingress Prober #6648 (thanks @JRBANCEL)
  • Clean up orphaned VirtualService when migrating from Istio KIngress to other KIngress #6570 (thanks @nak3)
  • Avoid specifying IngressTLS before Certificate reports Ready. #6870 (thanks @mattmoor)
  • queue-proxy now returns 504 on connect timeouts #6859 (thanks @vagababov)
  • KIngress to disallow ServiceNamespace that differs from its own #6868 (thanks @MIBc)
  • Fix name collision when having two Routes named ${route} and ${route}-mesh #6362 (thanks @sreddy)
  • Route reconciler to separate cluster local Ingress rules and external domain rules to avoid ClusterLocal special-casing in KIngress implementation #6727 (thanks @tcnghia, @andrew-su)
  • Correctly set the network prober User-Agent #6644 (thanks @jpeach)
  • gRPC, AutoTLS, and KIngress testing (thanks @ZhiminXiang, @tanzeeb, @sreddy, @rmoe, @andrew-su)

serving - Knative Serving release v0.11.2

Published by knative-prow-releaser-robot over 4 years ago

Meta

Load-balancing improvements with low containerConcurrency

At low containerConcurrency values we now perform significantly better due to improvements in the application-specific load-balancing performed by the Activator component.

Kourier networking support

We have a new option for handling the ingress capabilities used by knative/serving. Kourier is the first Knative-native ingress implementation, which reconciles the Knative networking CRDs to program Envoy directly.

Autoscaling

Locally perfect loadbalancing and endpoint subsetting improvements (thanks @vagababov)

These are further improvements to the loadbalancing enhancements made over the last releases. Given a stable activator count, loadbalancing of a revision with the activator on the path is now locally ideal.

Reduced the needed Kubernetes Services per Revision from 3 to 2 #5900 (thanks @markusthoemmes)

The third service used to be used exclusively for metric scraping. This is now done via the private service as well. Metric services are no longer created and are actively removed from existing deployments.

Allow applications with a livenessProbe to properly scale down #5986 (thanks @nak3)

The queue-proxy wrongly counted requests sent via livenessProbes as actual requests, causing the revision to never shut down. These requests are now properly ignored.

Target annotation values can now exceed configured defaults #5975 (thanks @markusthoemmes)

This fixes a bug in the logic to determine the actual target of the autoscaler which capped the user-defined target value to the configured default value.

Report desired/actual scale in PodAutoscalers for the HPA as well (thanks @vagababov)

The values for desired and actual scale are now plumbed through from the HPA into the PodAutoscaler’s status.

Assorted code readability, optimizations and clean ups (thanks @vagababov, @markusthoemmes, @mgencur)

Core API

Improved error messages for image tag resolving #5920 (thanks @markusthoemmes)

Previous error messages did not indicate that the image pull failure occurred during digest resolution, and did not provide further details as to why the digest resolution failed. This change aids users in debugging problems with container registry permissions.

Enabled imagePullSecrets in PodSpec #5917 (thanks @taragu)

Users may now specify imagePullSecrets directly without attaching them to their Kubernetes ServiceAccount.

Add permissions for caching.internal.knative.dev to edit and view cluster roles #5992 (thanks @nak3)

Knative provides aggregated ClusterRoles that users can use to manage their Knative resources. These roles previously did not include the caching resource. This change adds the caching resource to both the edit and view roles.

Split apart defaulting and validation webhooks #5947 (thanks @mattmoor)

This fixes a problem where our validation wasn’t necessarily applied to the final object because it runs at the same time as defaulting, which might be before additional mutating webhooks. By separating things out we ensure that the validation occurs on the final object to be committed to etcd.

Configuration and Service now labeled with duck.knative.dev/podspecable #6121 (thanks @mattmoor)

This enables tools that reflect over the Kubernetes type system to reason about the podspec portion of these Knative resources.

Bug Fixes:

  • Fix bug where latestRevision routes can point to wrong revision #5319 (thanks @taragu)
  • Fix issue where config-defaults were not getting applied #5892 (thanks @taragu)
  • Fix validation issue for lastModifier when using multiple service accounts #6072 (thanks @savitaashture)
  • Fix problem with Configuration reporting Ready early #6096 (thanks @taragu)
  • Validation added for name and generateName fields in RevisionTemplate #5110 (thanks @savitaashture)

Test Improvements:

  • File access checks in conformance tests #5102 (thanks @shashwathi)
  • Improve route visibility test #5831 (thanks @andrew-su)
  • Added ability to run e2e tests with https #5157 (thanks @taragu)
  • More reliable WaitForService function #5956 (thanks @mgencur)
  • Added new upgrade test to catch broken defaulting behavior #6165 (thanks @dgerd)
  • Fix digest validation in conformance to account for image templating #6106 (thanks @markusthoemmes)

Networking

Compatibility with Istio 1.4 #6058 (thanks @nak3)

Istio 1.4 introduced a breaking restriction to the length of regular expressions allowed in VirtualServices. We switched to using prefixes to be compatible with Istio 1.4.

Integration with istio/client-go #5969 pkg#208 pkg#831 (thanks @skaslev)

Knative now uses istio/client-go instead of its own version of the Istio API client. This addresses the long-standing pain point of maintaining a manually translated API client for a changing API.

3scale/kourier integration #5983 (thanks @bbrowning @davidor @jmprusi)

Kourier is a light-weight Ingress for Knative (a deployment of Kourier consists of an Envoy proxy and a control plane for it). In v0.11 we add Kourier as an option to run Knative e2e integration tests.

Better LoadBalancerReady condition when VirtualService failed to be reconciled #6048 (thanks @nak3)

Previously, when a VirtualService failed to be reconciled, the LoadBalancerReady Condition wasn’t updated. We now surface the reason and message from the failing VirtualService.

Post-ClusterIngress migration cleanups (thanks @markusthoemmes)

Clean up port names of Knative components to follow Istio convention #5070 (thanks @iamejboy)

Bug fix #5734: Do not permit cluster local kservices on the cluster ingress #6174 (thanks @vagababov)

Fix a bug where cluster-local ksvcs are erroneously exposed to the public ingress.

Monitoring

Add default request metrics backend in observability config #6022 (thanks @drpmma)

This change makes Prometheus the default backend, consistent with the default example value in config-observability.yaml.

Fix missing required selector for node-exporter #5934 (thanks @lionelvillard)

serving - Knative Serving release v0.14.0

Published by knative-prow-releaser-robot over 4 years ago

Meta

Monitoring Bundle is deprecated

We have made the decision to deprecate the bundled monitoring tools that have remained unchanged since 2018 due to a lack of community interest. We will stop releasing them in a coming release and will instead focus on documenting how to integrate with existing monitoring systems using OpenTelemetry.

V1 is now our storage version

We have included a new migration Job to migrate existing resources. See the serving-storage-version-migration.yaml release artifact.

Several new net-* repos!

  • Our Istio integration has moved out of Serving and into knative/net-istio.
  • Kourier has moved to knative/net-kourier.
  • We have a new knative/net-http01 project for implementing auto-TLS.

We have NOT bumped our minimum Kubernetes dependency (still 1.15)

We were unable to bump our minimum Kubernetes dependency to 1.16 this release as planned due to its lack of availability in GKE (on which we have a hard dependency for CI/CD). The principle behind our choice of minimum upstream version remains the same, and users should expect future releases to attempt to “catch up”.

Autoscaling

  • Disable metric scraping in situations where the activator is always in path for increased efficiency #7431 (thanks @dsimansk)
  • Added a metric for measuring metric scraping overhead #7232 (thanks @rmoe)
  • The “Metric” resource now surfaces potential errors in its status #7525 (thanks @markusthoemmes)
  • Activator tracks revision public service endpoints to assign downstream pods #7208 (thanks @vagababov)
  • Documented the internal autoscaling systems #7126 (thanks @markusthoemmes)
  • Cleanups and improvements (logging, metrics, config map, unit and e2e tests, etc.); many PRs (thanks @julz, @mgencur, @vagababov, @markusthoemmes)

Fixed various bugs

  • Fixed races where a revision briefly scales below minScale only to immediately scale up again #7110, #7214 (thanks @tanzeeb)
  • Fixed a bug where a revision would never become ready if minScale was set > 1 #7514 (thanks @markusthoemmes)
  • Fixed a bug where request counts have been reported off by 1 on scale-from-0 #7109 (thanks @vagababov)
  • Fixed potential panics around timeout handling in the queue-proxy #7138, #7146 (thanks @JRBANCEL)
  • Fixed a rare race condition where the activator would fail to schedule new requests even though there was capacity in the system #7360 (thanks @markusthoemmes)

Core API

V1 is now our storage version #7204, #7499 (thanks @dprotaso)

After installing 0.14, a new migration Job must be run to migrate pre-existing resources, and remove v1alpha1 as a stored version from our CRDs.

Support for resolving AWS ECR images #7244 (thanks @mattmoor)

Fixes a long-standing issue where tag resolution did not work properly for AWS ECR.

Assorted Cleanups:

  • Leader election config map cleaned up, defaulting is implemented, example verified as default values (thanks @vagababov)

Networking

Introducing knative/net-istio repository (thanks @mattmoor, @nghia, @tshafer):

The Istio KIngress reconciler is now separated into its own repository, knative/net-istio, enabling more focused testing on presubmits. In the future, Istio integration bugs should be filed against this new repository.

Introducing knative/net-http01 repository (thanks @mattmoor):

knative/net-http01 is a simple standalone ACME HTTP01 solver for the Knative Certificate abstraction.

Introducing knative/net-kourier repository (thanks @dortiz, @jmprussi):

A new home for Kourier - a lightweight Envoy-based Knative Ingress reconciler previously hosted at https://github.com/3scale/kourier.

Support Istio canonical service and revision #6832 (thanks @tshafer):

Adding Istio canonical service labels (https://github.com/istio/istio/pull/20943) to Knative objects for better integration with Istio UX.

Use /healthz for probe path for easier whitelisting #5918 (thanks itsmurugappa, shreejad)

We changed our probe path from /_internal/knative/activator/probe to /healthz and made that consistent across all probe receivers in Knative Serving.

Best effort Istio probing #6962 (thanks JRBANCEL)

Any scenario where probing would fail forever with the current implementation is now treated as a successful probe. This allows failing open in cases where users have a 3-legged OAuth setup that would cause probing to fail indefinitely.

Generated VirtualService contains wrong gateways field knative/net-istio#44 (thanks @yanniszark)

Previously, we sometimes referred to unused Gateways in a VirtualService. That caused issues with Istio validation logic if those unused Gateways did not exist. Unused Gateways are no longer referenced from VirtualServices.

Assorted cleanups:

  • Remove usages of deprecated field VirtualService.WebsocketUpgrade knative/net-istio#53 (thanks @nak3)
  • Networking ConfigMap cleaned up, example verified as defaults, and Go templates are cached rather than parsed on every invocation #7403, #7408, #7395 (thanks @vagababov)

serving - Knative Serving release v0.13.2

Published by knative-prow-releaser-robot over 4 years ago

Meta

Minimum Kubernetes version remains 1.15

This is NOT a change from 0.12; however, with the adoption of Conversion webhooks, this is no longer something that may be overridden without consequence.
The target minimum version for our next (0.14) release will be Kubernetes 1.16.

Deprecation of the Alpha and Beta Serving APIs

The v1 APIs are now available in every supported version of Knative, and our controllers are now consuming v1 themselves.
We will continue to ship the deprecated APIs for 9 months (6 releases), so these will be removed in the 0.19 release.

We now rely on CRD Conversion webhooks

We take advantage of this long-awaited Beta+ feature in 1.15+ to manage converting between v1alpha1, v1beta1, and v1 types.

Autoscaling

  • Probe and forward traffic to non-ready pods (#6695 thanks @markusthoemmes)
  • gRPC e2e autoscaling test (#6778, thanks @tanzeeb, @shashwathi)
  • We no longer restrict min target at 1, permitting correct target utilization with CC=1 (#6951 thanks @vagababov)
  • Ignore young pods from metric computation (#6649, #6626, thanks @vagababov)
  • Metric Reporting Refactoring
    • Activator #6804, #6843 (thanks @markusthoemmes)
    • Autoscaler (#6774,#6707, #6712), QP #6852 (thanks @vagababov)
  • Cleanups and improvements (logging, metrics, unit and e2e tests, etc.); many PRs (thanks @jpeach, @MIBc, @taragu, @markusthoemmes, @vagababov)

Core API

  • We’ve adopted generated reconcilers to help minimize the boilerplate in our controllers (thanks @n3wscott, @mattmoor, @shashwathi #6993 #6973 #6969 #6952)
  • We’ve removed the serving stats reporter that was reporting some nonsensical metrics (thanks @mattmoor #6939)
  • Webhook certificates now rotate (thanks @mpetason knative/pkg#1101)
  • The validating admission webhook will now apply the correct defaults (thanks @itsmurugappan #6938)
  • We’ve started our journey of actually deprecating the v1alpha1 APIs for the resources Service, Configuration, Revision and Route.
    • The controllers for these resources use the v1 APIs (thanks @dprotaso, @mattmoor #6933 #6949 #6950 #6957 #6958 #6959 #6960)
    • Thanks for the conversion webhook framework @dprotaso (knative/pkg#993)
    • Note: v1alpha1 will remain the storage version until we provide guidance on how to migrate the storage version to v1 - see #6726
    • Our current plan is to remove the v1alpha1 and v1beta1 APIs in 0.19
    • HPA autoscaling using Revision metrics (concurrency & requests per second) now uses the v1 APIs (thanks @dprotaso #6957).
      • Consuming revision metrics for the resource version v1alpha1 is deprecated and will be removed in the next release (0.14)

Networking

  • Deprecate the istio.sidecar.includeOutboundIPRanges in config-network #6597 (thanks @nak3)
  • Avoid unconditionally reconciling the Gateways on deletion #6934 (thanks @ZhiminXiang)
  • Remove "internal" in class name of Certificates #6887 (thanks @ZhiminXiang)
  • Wrong revision is picked up for traffic target marked as "latest" #6876 (thanks @taragu)
  • Fix Ready -> NotReady -> Ready flip flops in Ingress Prober #6648 (thanks @JRBANCEL)
  • Clean up orphaned VirtualService when migrating from Istio KIngress to other KIngress #6570 (thanks @nak3)
  • Avoid specifying IngressTLS before Certificate reports Ready. #6870 (thanks @mattmoor)
  • queue-proxy now returns 504 on connect timeouts #6859 (thanks @vagababov)
  • KIngress to disallow ServiceNamespace that differs from its own #6868 (thanks @MIBc)
  • Fix name collision when having two Routes named ${route} and ${route}-mesh #6362 (thanks @sreddy)
  • Route reconciler to separate cluster local Ingress rules and external domain rules to avoid ClusterLocal special-casing in KIngress implementation #6727 (thanks @tcnghia, @andrew-su)
  • Correctly set the network prober User-Agent #6644 (thanks @jpeach)
  • gRPC, AutoTLS, and KIngress testing (thanks @ZhiminXiang, @tanzeeb, @sreddy, @rmoe, @andrew-su)

serving - Knative Serving release v0.13.1

Published by knative-prow-releaser-robot over 4 years ago

Meta

Minimum Kubernetes version remains 1.15

This is NOT a change from 0.12; however, with the adoption of Conversion webhooks, this is no longer something that may be overridden without consequence.
The target minimum version for our next (0.14) release will be Kubernetes 1.16.

Deprecation of the Alpha and Beta Serving APIs

The v1 APIs are now available in every supported version of Knative, and our controllers are now consuming v1 themselves.
We will continue to ship the deprecated APIs for 9 months (6 releases), so these will be removed in the 0.19 release.

We now rely on CRD Conversion webhooks

We take advantage of this long-awaited Beta+ feature in 1.15+ to manage converting between v1alpha1, v1beta1, and v1 types.

Autoscaling

  • Probe and forward traffic to non-ready pods (#6695 thanks @markusthoemmes)
  • gRPC e2e autoscaling test (#6778, thanks @tanzeeb, @shashwathi)
  • We no longer restrict min target at 1, permitting correct target utilization with CC=1 (#6951 thanks @vagababov)
  • Ignore young pods from metric computation (#6649, #6626, thanks @vagababov)
  • Metric Reporting Refactoring
    • Activator #6804, #6843 (thanks @markusthoemmes)
    • Autoscaler (#6774,#6707, #6712), QP #6852 (thanks @vagababov)
  • Cleanups and improvements (logging, metrics, unit and e2e tests, etc.); many PRs (thanks @jpeach, @MIBc, @taragu, @markusthoemmes, @vagababov)

Core API

  • We’ve adopted generated reconcilers to help minimize the boilerplate in our controllers (thanks @n3wscott, @mattmoor, @shashwathi #6993 #6973 #6969 #6952)
  • We’ve removed the serving stats reporter that was reporting some nonsensical metrics (thanks @mattmoor #6939)
  • Webhook certificates now rotate (thanks @mpetason knative/pkg#1101)
  • The validating admission webhook will now apply the correct defaults (thanks @itsmurugappan #6938)
  • We’ve started our journey of actually deprecating the v1alpha1 APIs for the resources Service, Configuration, Revision and Route.
    • The controllers for these resources use the v1 APIs (thanks @dprotaso, @mattmoor #6933 #6949 #6950 #6957 #6958 #6959 #6960)
    • Thanks for the conversion webhook framework @dprotaso (knative/pkg#993)
    • Note: v1alpha1 will remain the storage version until we provide guidance on how to migrate the storage version to v1 - see #6726
    • Our current plan is to remove the v1alpha1 and v1beta1 APIs in 0.19
    • HPA autoscaling using Revision metrics (concurrency & requests per second) now uses the v1 APIs (thanks @dprotaso #6957).
      • Consuming revision metrics for the resource version v1alpha1 is deprecated and will be removed in the next release (0.14)

Networking

  • Deprecate the istio.sidecar.includeOutboundIPRanges in config-network #6597 (thanks @nak3)
  • Avoid unconditionally reconciling the Gateways on deletion #6934 (thanks @ZhiminXiang)
  • Remove "internal" in class name of Certificates #6887 (thanks @ZhiminXiang)
  • Wrong revision is picked up for traffic target marked as "latest" #6876 (thanks @taragu)
  • Fix Ready -> NotReady -> Ready flip flops in Ingress Prober #6648 (thanks @JRBANCEL)
  • Clean up orphaned VirtualService when migrating from Istio KIngress to other KIngress #6570 (thanks @nak3)
  • Avoid specifying IngressTLS before Certificate reports Ready. #6870 (thanks @mattmoor)
  • queue-proxy now returns 504 on connect timeouts #6859 (thanks @vagababov)
  • KIngress to disallow ServiceNamespace that differs from its own #6868 (thanks @MIBc)
  • Fix name collision when having two Routes named ${route} and ${route}-mesh #6362 (thanks @sreddy)
  • Route reconciler to separate cluster local Ingress rules and external domain rules to avoid ClusterLocal special-casing in KIngress implementation #6727 (thanks @tcnghia, @andrew-su)
  • Correctly set the network prober User-Agent #6644 (thanks @jpeach)
  • gRPC, AutoTLS, and KIngress testing (thanks @ZhiminXiang, @tanzeeb, @sreddy, @rmoe, @andrew-su)
serving - Knative Serving release v0.13.0

Published by knative-prow-releaser-robot over 4 years ago

Meta

Minimum Kubernetes version remains 1.15

This is NOT a change from 0.12; however, with the adoption of Conversion webhooks, the minimum version can no longer be overridden without consequence.
The target minimum version for our next (0.14) release will be Kubernetes 1.16.

Deprecation of the Alpha and Beta Serving APIs

The v1 APIs are now available in every supported version of Knative, and our controllers are now consuming v1 themselves.
We will continue to ship the deprecated APIs for 9 months (6 releases), so these will be removed in the 0.19 release.

We now rely on CRD Conversion webhooks

We take advantage of this long-awaited Beta+ feature in 1.15+ to manage converting between v1alpha1, v1beta1, and v1 types.

Autoscaling

  • Probe and forward traffic to non-ready pods (#6695 thanks @markusthoemmes)
  • gRPC e2e autoscaling test (#6778, thanks @tanzeeb, @shashwathi)
  • We no longer restrict min target at 1, permitting correct target utilization with CC=1 (#6951 thanks @vagababov)
  • Ignore young pods from metric computation (#6649, #6626, thanks @vagababov)
  • Metric Reporting Refactoring
    • Activator #6804, #6843 (thanks @markusthoemmes)
    • Autoscaler (#6774,#6707, #6712), QP #6852 (thanks @vagababov)
  • Cleanups and improvements (logging, metrics, unit and e2e tests, etc.); many PRs (thanks @jpeach, @MIBc, @taragu, @markusthoemmes, @vagababov)
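
The minimum-target change listed above can be illustrated with a little arithmetic. This is only a sketch under stated assumptions: it assumes the per-pod target is containerConcurrency multiplied by a target utilization fraction (0.5 is an arbitrary illustrative value), and that the desired pod count is the observed concurrency divided by that target, rounded up. The function names and the small floor value are hypothetical and not taken from the Knative code base.

```python
import math

def per_pod_target(container_concurrency, utilization, min_target):
    """Per-pod concurrency target, clamped from below by min_target."""
    return max(min_target, container_concurrency * utilization)

def desired_pods(observed_concurrency, target):
    """Pods needed so that each pod carries at most `target` requests."""
    return math.ceil(observed_concurrency / target)

# With a floor of 1, the utilization setting has no effect when CC=1.
old_target = per_pod_target(1, 0.5, min_target=1.0)
# With a (hypothetical) much smaller floor, utilization is honoured.
new_target = per_pod_target(1, 0.5, min_target=0.01)

print(desired_pods(10, old_target), desired_pods(10, new_target))
```

Under these assumptions the autoscaler would provision headroom for CC=1 revisions instead of running every pod at full capacity.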

Core API

  • We’ve adopted generated reconcilers to help minimize the boilerplate in our controllers (thanks @n3wscott, @mattmoor, @shashwathi #6993 #6973 #6969 #6952)
  • We’ve removed the serving stats reporter that was reporting some nonsensical metrics (thanks @mattmoor #6939)
  • Webhook certificates now rotate (thanks @mpetason knative/pkg#1101)
  • The validating admission webhook will now apply the correct defaults (thanks @itsmurugappan #6938)
  • We’ve started our journey of actually deprecating the v1alpha1 APIs for the resources Service, Configuration, Revision and Route.
    • The controllers for these resources use the v1 APIs (thanks @dprotaso, @mattmoor #6933 #6949 #6950 #6957 #6958 #6959 #6960)
    • Thanks for the conversion webhook framework @dprotaso (knative/pkg#993)
    • Note: v1alpha1 will remain the storage version until we provide guidance on how to migrate the storage version to v1 - see #6726
    • Our current plan is to remove the v1alpha1 and v1beta1 APIs in 0.19
    • HPA autoscaling using Revision metrics (concurrency & requests per second) now uses the v1 APIs (thanks @dprotaso #6957).
      • Consuming revision metrics for the resource version v1alpha1 is deprecated and will be removed in the next release (0.14)

Networking

  • Deprecate the istio.sidecar.includeOutboundIPRanges in config-network #6597 (thanks @nak3)
  • Avoid unconditionally reconciling the Gateways on deletion #6934 (thanks @ZhiminXiang)
  • Remove "internal" in class name of Certificates #6887 (thanks @ZhiminXiang)
  • Wrong revision is picked up for traffic target marked as "latest" #6876 (thanks @taragu)
  • Fix Ready -> NotReady -> Ready flip flops in Ingress Prober #6648 (thanks @JRBANCEL)
  • Clean up orphaned VirtualService when migrating from Istio KIngress to other KIngress #6570 (thanks @nak3)
  • Avoid specifying IngressTLS before Certificate reports Ready. #6870 (thanks @mattmoor)
  • queue-proxy now returns 504 on connect timeouts #6859 (thanks @vagababov)
  • KIngress to disallow ServiceNamespace that differs from its own #6868 (thanks @MIBc)
  • Fix name collision when having two Routes named ${route} and ${route}-mesh #6362 (thanks @sreddy)
  • Route reconciler to separate cluster local Ingress rules and external domain rules to avoid ClusterLocal special-casing in KIngress implementation #6727 (thanks @tcnghia, @andrew-su)
  • Correctly set the network prober User-Agent #6644 (thanks @jpeach)
  • gRPC, AutoTLS, and KIngress testing (thanks @ZhiminXiang, @tanzeeb, @sreddy, @rmoe, @andrew-su)
serving - Knative Serving release v0.12.1

Published by knative-prow-releaser-robot over 4 years ago

Meta

Kubernetes minimum version increased to 1.15

This release of Knative adopts the 1.16.4 Kubernetes client, which supports Kubernetes versions 1.15 through 1.17.

Change to Revision GC defaults

Revisions are now retained for 48 hours and the latest 20 are kept before being considered for garbage collection. This is up from 24 hours and a single revision previously. The prior behavior can be restored by updating the “config-gc” configmap in knative-serving.

Auto-TLS now supports HTTP01 challenges

The Certificate interface now supports the use of HTTP01 challenges, which can be significantly faster than DNS01 challenges for provisioning certificates and don’t require permissions to rewrite DNS records.

Support for Contour-based networking

Support has been added to use projectcontour.io as the networking layer for knative/serving.

Autoscaling

Constant-time metrics computations #5981 (thanks @vagababov)

A multi-step change replacing the map-based O(N) metric average computation with an O(1) running-window computation and a constant-memory allocation scheme based on a circular buffer.
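
The general technique can be sketched as follows. This is a minimal illustration of a running-window average over a circular buffer, not the autoscaler's actual implementation; the class and method names are hypothetical, and the real code works with time-based buckets rather than a fixed number of samples.

```python
class WindowedAverage:
    """Running average over the last `size` samples using a circular buffer."""

    def __init__(self, size):
        self.buckets = [0.0] * size  # allocated once and reused
        self.size = size
        self.count = 0    # samples recorded so far, capped at size
        self.index = 0    # next slot to overwrite
        self.total = 0.0  # running sum of the live buckets

    def record(self, value):
        # Subtract the evicted sample and add the new one: O(1) per update.
        self.total += value - self.buckets[self.index]
        self.buckets[self.index] = value
        self.index = (self.index + 1) % self.size
        self.count = min(self.count + 1, self.size)

    def average(self):
        # No iteration over the window is needed to read the average.
        return self.total / self.count if self.count else 0.0


window = WindowedAverage(3)
for sample in (1.0, 2.0, 3.0, 7.0):  # the fourth sample evicts the first
    window.record(sample)
print(window.average())
```

Because the sum is maintained incrementally, both recording a sample and reading the average take constant time, and no memory is allocated after construction.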

Activator performance improvements (thanks @markusthoemmes)

Various improvements to the allocation behavior of the activator’s hot path to avoid as many allocations per request as possible. Most notably, this introduced a BufferPool to pool the buffers of the HTTP reverse proxy.

Runs the activator with optimized garbage-collector settings (thanks @chizhg)

This was buggy in earlier releases as the deployment manifest of the activator was not correctly put together.

Various system clean ups, test stability (thanks @skaslev, @markusthoemmes, @vagababov, @nak3)

Core API

Change default Revision garbage collection to 20 Revisions and 48 hours #6252 (thanks @vagababov)

Previous defaults were 24 hours before consideration and 1 minimum Revision to keep.

Greatly reduced the amount of update conflicts #6164 (thanks @markusthoemmes)

Status updates are now retried locally during reconciliation to avoid constant event and reconciliation-retry churn.
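
The idea of retrying a status write locally can be sketched like this. It is an illustration only: `ConflictError`, `get_latest`, and `update` are hypothetical stand-ins for an optimistic-concurrency conflict and the API client calls, and the attempt count is arbitrary.

```python
class ConflictError(Exception):
    """Stands in for an optimistic-concurrency (resourceVersion) conflict."""


def update_status_with_retry(get_latest, update, desired_status, attempts=3):
    """Re-read the object and retry the status write when a conflict occurs."""
    for attempt in range(attempts):
        obj = get_latest()              # always start from the freshest copy
        obj["status"] = desired_status
        try:
            return update(obj)
        except ConflictError:
            if attempt == attempts - 1:
                raise                   # give up and let the caller requeue
```

Retrying in place means a conflict costs one extra read and write instead of an error event plus a full requeue of the reconciliation.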

Allow the usage of execProbes on the container #5712 (thanks @nak3)

Previously, when execProbes were specified they were dropped, resulting in a 'no probe specified' error. With this change the exec probe will be executed on the user container and the default TCP probe will be executed on the queue-proxy container.

Test Improvements:

  • Remove client SetDefaults from e2e tests #6202 (thanks @taragu)
  • Change runtime test image user from 2020 to 65532 #6244 (thanks @mattmoor)
  • Fix race in TestUpdateConfigurationMetadata #6257 (thanks @markusthoemmes)
  • Validate last pinned timestamp on Routes #5830 (thanks @taragu)

Networking

AutoTLS now supports ACME HTTP01 Challenge #4100 (thanks @rmoe)

We added support for the ACME HTTP01 challenge, which is less restrictive than the DNS01 challenge while offering faster certificate provisioning.

Ingress conformance suite (thanks @mattmoor @tcnghia @vagababov)

A new suite of directed testing has been added for our Ingress abstraction, which can be used to validate that an implementation properly implements the semantics expected for KIngress without running the full e2e test suite. This suite uncovered a number of underdocumented semantics and several bugs in all of the networking integrations.

Shorter timeout for Ingress readiness Prober #6407 (thanks @MIBc)

A default timeout setting has been added to help the Ingress prober avoid hanging.

Ingress Prober not to share Pod State between different runs #6422 (thanks @JRBANCEL)

We prevent a potential race condition when probing multiple Ingresses by not sharing any state between the probe runs of different Ingresses.

Switch to 'Set' semantic for header manipulations #6303 (thanks @tcnghia)

Previously, our Istio reconciler used header appending to manipulate headers when routing to Revisions. This caused issues if the header had already been set. We switched to using ‘Set’ instead of ‘Append’ to ensure our header manipulation always works.
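
The difference between the two semantics can be shown with a small sketch. It models headers as a plain mapping from name to a list of values; the helper functions and the header name are hypothetical and are not part of the Istio or Knative APIs.

```python
def append_header(headers, name, value):
    """Append semantics: keep existing values and add one more."""
    updated = {key: list(values) for key, values in headers.items()}
    updated.setdefault(name, []).append(value)
    return updated


def set_header(headers, name, value):
    """Set semantics: replace whatever was there with a single value."""
    updated = {key: list(values) for key, values in headers.items()}
    updated[name] = [value]
    return updated


# A request that already carries the header (hypothetical header name).
incoming = {"x-example-revision": ["client-supplied"]}
print(append_header(incoming, "x-example-revision", "rev-1"))
print(set_header(incoming, "x-example-revision", "rev-1"))
```

With append, a pre-existing value survives alongside the routing value, so the receiver sees two values; with set, the routing value is the only one left.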

Fix bug where VirtualService not owned by Ingress may be cleaned up #6342 (thanks @tcnghia)

Previously, it was possible for an Ingress to mistakenly delete a VirtualService that it didn’t own. We now make sure owner references are confirmed before triggering deletion.

Fix "Duplicate entry of domain" error when using local-gateway.mesh #6488 (thanks @ZhiminXiang)

Fix a bug in our Istio-based reconciler to avoid generating the short hostnames for ‘mesh’ VirtualService, which may intermittently cause Envoy to get ‘Duplicate entry of domain’ errors.

Observability

Metrics for counting certificates created pkg#976 (thanks @ZhiminXiang)

Added metric knative.dev/internal/serving/controller/cert_creation_count.

Experimental support for metrics export to OpenCensus Agent pkg#953 and pkg#978 (thanks @evankanderson @anniefu)

Added experimental support for exporting metrics to OpenCensus. Note that the format of the exported metrics is likely to change.

serving - Knative Serving release v0.12.0

Published by knative-prow-releaser-robot over 4 years ago

Meta

Kubernetes minimum version increased to 1.15

This release of Knative adopts the 1.16.4 Kubernetes client, which supports Kubernetes versions 1.15 through 1.17.

Change to Revision GC defaults

Revisions are now retained for 48 hours and the latest 20 are kept before being considered for garbage collection. This is up from 24 hours and a single revision previously. The prior behavior can be restored by updating the “config-gc” configmap in knative-serving.
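
One way to read the two settings together is sketched below. This is an assumption about how they combine, not the controller's actual logic: a revision is treated as a candidate only if it falls outside the newest 20 and is older than 48 hours. The function name is hypothetical, and the sketch ignores whether a revision is still referenced by a Route.

```python
from datetime import datetime, timedelta


def gc_candidates(revisions, now, retain_for=timedelta(hours=48), keep_latest=20):
    """Return names of revisions eligible for collection.

    `revisions` is a list of (name, created_at) tuples. A revision is a
    candidate only if it is outside the newest `keep_latest` AND older
    than `retain_for` (an assumed reading of the two defaults).
    """
    newest_first = sorted(revisions, key=lambda rev: rev[1], reverse=True)
    beyond_keep = newest_first[keep_latest:]
    return [name for name, created in beyond_keep if now - created > retain_for]


now = datetime(2020, 1, 10)
# 25 revisions, created 0, 10, 20, ... hours ago.
revs = [("rev-%d" % i, now - timedelta(hours=10 * i)) for i in range(25)]
print(gc_candidates(revs, now))                          # new defaults
print(gc_candidates(revs, now, timedelta(hours=24), 1))  # previous defaults
```

Passing the previous values (24 hours, 1 revision) shows how many more revisions would have been eligible under the old defaults.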

Auto-TLS now supports HTTP01 challenges

The Certificate interface now supports the use of HTTP01 challenges, which can be significantly faster than DNS01 challenges for provisioning certificates and don’t require permissions to rewrite DNS records.

Support for Contour-based networking

Support has been added to use projectcontour.io as the networking layer for knative/serving.

Autoscaling

Constant-time metrics computations #5981 (thanks @vagababov)

A multi-step change replacing the map-based O(N) metric average computation with an O(1) running-window computation and a constant-memory allocation scheme based on a circular buffer.

Activator performance improvements (thanks @markusthoemmes)

Various improvements to the allocation behavior of the activator’s hot path to avoid as many allocations per request as possible. Most notably, this introduced a BufferPool to pool the buffers of the HTTP reverse proxy.

Runs the activator with optimized garbage-collector settings (thanks @chizhg)

This was buggy in earlier releases as the deployment manifest of the activator was not correctly put together.

Various system clean ups, test stability (thanks @skaslev, @markusthoemmes, @vagababov, @nak3)

Core API

Change default Revision garbage collection to 20 Revisions and 48 hours #6252 (thanks @vagababov)

Previous defaults were 24 hours before consideration and 1 minimum Revision to keep.

Greatly reduced the amount of update conflicts #6164 (thanks @markusthoemmes)

Status updates are now retried locally during reconciliation to avoid constant event and reconciliation-retry churn.

Allow the usage of execProbes on the container #5712 (thanks @nak3)

Previously, when execProbes were specified they were dropped, resulting in a 'no probe specified' error. With this change the exec probe will be executed on the user container and the default TCP probe will be executed on the queue-proxy container.

Test Improvements:

  • Remove client SetDefaults from e2e tests #6202 (thanks @taragu)
  • Change runtime test image user from 2020 to 65532 #6244 (thanks @mattmoor)
  • Fix race in TestUpdateConfigurationMetadata #6257 (thanks @markusthoemmes)
  • Validate last pinned timestamp on Routes #5830 (thanks @taragu)

Networking

AutoTLS now supports ACME HTTP01 Challenge #4100 (thanks @rmoe)

We added support for the ACME HTTP01 challenge, which is less restrictive than the DNS01 challenge while offering faster certificate provisioning.

Ingress conformance suite (thanks @mattmoor @tcnghia @vagababov)

A new suite of directed testing has been added for our Ingress abstraction, which can be used to validate that an implementation properly implements the semantics expected for KIngress without running the full e2e test suite. This suite uncovered a number of underdocumented semantics and several bugs in all of the networking integrations.

Shorter timeout for Ingress readiness Prober #6407 (thanks @MIBc)

A default timeout setting has been added to help the Ingress prober avoid hanging.

Ingress Prober not to share Pod State between different runs #6422 (thanks @JRBANCEL)

We prevent a potential race condition when probing multiple Ingresses by not sharing any state between the probe runs of different Ingresses.

Switch to 'Set' semantic for header manipulations #6303 (thanks @tcnghia)

Previously, our Istio reconciler used header appending to manipulate headers when routing to Revisions. This caused issues if the header had already been set. We switched to using ‘Set’ instead of ‘Append’ to ensure our header manipulation always works.

Fix bug where VirtualService not owned by Ingress may be cleaned up #6342 (thanks @tcnghia)

Previously, it was possible for an Ingress to mistakenly delete a VirtualService that it didn’t own. We now make sure owner references are confirmed before triggering deletion.
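
An ownership check of this kind can be sketched as follows. The resources are modelled as plain dictionaries shaped like Kubernetes objects (`metadata.ownerReferences[].uid` is a standard Kubernetes field); the helper names and sample data are hypothetical and do not reflect the reconciler's actual code.

```python
def owned_by(resource, owner_uid):
    """True if any ownerReference on the resource carries the given UID."""
    refs = resource.get("metadata", {}).get("ownerReferences", [])
    return any(ref.get("uid") == owner_uid for ref in refs)


def stale_virtual_services(ingress_uid, virtual_services, wanted_names):
    """Names safe to delete: no longer wanted AND owned by this Ingress."""
    return [
        vs["metadata"]["name"]
        for vs in virtual_services
        if vs["metadata"]["name"] not in wanted_names
        and owned_by(vs, ingress_uid)
    ]


services = [
    {"metadata": {"name": "mine-old", "ownerReferences": [{"uid": "ing-1"}]}},
    {"metadata": {"name": "mine-current", "ownerReferences": [{"uid": "ing-1"}]}},
    {"metadata": {"name": "someone-elses", "ownerReferences": [{"uid": "ing-2"}]}},
    {"metadata": {"name": "unowned"}},
]
print(stale_virtual_services("ing-1", services, {"mine-current"}))
```

Only the unwanted VirtualService that is actually owned by the Ingress is selected; resources owned by another Ingress, or by nobody, are left alone.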

Fix "Duplicate entry of domain" error when using local-gateway.mesh #6488 (thanks @ZhiminXiang)

Fix a bug in our Istio-based reconciler to avoid generating the short hostnames for ‘mesh’ VirtualService, which may intermittently cause Envoy to get ‘Duplicate entry of domain’ errors.

Observability

Metrics for counting certificates created pkg#976 (thanks @ZhiminXiang)

Added metric knative.dev/internal/serving/controller/cert_creation_count.

Experimental support for metrics export to OpenCensus Agent pkg#953 and pkg#978 (thanks @evankanderson @anniefu)

Added experimental support for exporting metrics to OpenCensus. Note that the format of the exported metrics is likely to change.

serving - Knative Serving release v0.11.1

Published by knative-prow-releaser-robot almost 5 years ago

Meta

Load-balancing improvements with low containerConcurrency

At low containerConcurrency values we now perform significantly better due to improvements in the application-specific load balancing performed by the Activator component.

Kourier networking support

We have a new option for handling the ingress capabilities used by knative/serving. Kourier is the first Knative-native ingress implementation, which reconciles the Knative networking CRDs to program Envoy directly.

Autoscaling

Locally perfect loadbalancing and endpoint subsetting improvements (thanks @vagababov)

These are further improvements to the load-balancing enhancements of the last few releases. Given a stable activator count, load balancing of a revision with the activator on the path is now locally ideal. See the graph.

Reduced the needed Kubernetes Services per Revision from 3 to 2 #5900 (thanks @markusthoemmes)

The third service used to be used exclusively for metric scraping. This is now done via the private service as well. Metric services are no longer created and are actively removed from existing deployments.

Allow applications with a livenessProbe to properly scale down #5986 (thanks @nak3)

The queue-proxy wrongly counted requests sent via livenessProbes as actual requests, causing the revision to never shut down. These requests are now properly ignored.
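
The idea of excluding probe traffic from request accounting can be sketched as below. Kubelet HTTP probes send a User-Agent beginning with `kube-probe/`; using that header as the detection mechanism is an assumption made for this illustration, and the class is hypothetical rather than the queue-proxy's actual code.

```python
KUBELET_PROBE_PREFIX = "kube-probe/"  # User-Agent prefix of kubelet HTTP probes


class RequestCounter:
    """Counts in-flight user requests, ignoring kubelet probe traffic."""

    def __init__(self):
        self.in_flight = 0

    def is_probe(self, headers):
        return headers.get("User-Agent", "").startswith(KUBELET_PROBE_PREFIX)

    def start(self, headers):
        """Return True if the request was counted towards concurrency."""
        if self.is_probe(headers):
            return False
        self.in_flight += 1
        return True

    def finish(self):
        """Call once for every request for which start() returned True."""
        self.in_flight -= 1


counter = RequestCounter()
counter.start({"User-Agent": "kube-probe/1.15"})  # ignored
counter.start({"User-Agent": "curl/7.64.0"})      # counted
print(counter.in_flight)
```

If probes were counted, a revision with a periodic livenessProbe would never report zero traffic and so could never scale to zero.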

Target annotation values can now exceed configured defaults #5975 (thanks @markusthoemmes)

This fixes a bug in the logic to determine the actual target of the autoscaler which capped the user-defined target value to the configured default value.

Report desired/actual scale in PodAutoscalers for the HPA as well (thanks @vagababov)

The values for desired and actual scale are now plumbed through from the HPA into the PodAutoscaler’s status.

Assorted code readability, optimizations and clean ups (thanks @vagababov, @markusthoemmes, @mgencur)

Core API

Improved error messages for image tag resolving #5920 (thanks @markusthoemmes)

Previous error messages did not indicate that the image pull failure occurred during digest resolution, and did not provide further details as to why the digest resolution failed. This change aids users in debugging problems in container registry permissions.

Enabled imagePullSecrets in PodSpec #5917 (thanks @taragu)

Users may now specify imagePullSecrets directly without attaching them to their Kubernetes ServiceAccount.

Add permissions for caching.internal.knative.dev to edit and view cluster roles #5992 (thanks @nak3)

Knative provides aggregated ClusterRoles that users can use to manage their Knative resources. These roles previously did not include the caching resource. This change adds the caching resource to both the edit and view roles.

Split apart defaulting and validation webhooks #5947 (thanks @mattmoor)

This fixes a problem where our validation wasn’t necessarily applied to the final object because it runs at the same time as defaulting, which might be before additional mutating webhooks. By separating things out we ensure that the validation occurs on the final object to be committed to etcd.

Configuration and Service now labeled with duck.knative.dev/podspecable #6121 (thanks @mattmoor)

This enables tools that reflect over the Kubernetes type system to reason about the podspec portion of these Knative resources.

Bug Fixes:

  • Fix bug where latestRevision routes can point to wrong revision #5319 (thanks @taragu)
  • Fix issue where config-defaults were not getting applied #5892 (thanks @taragu)
  • Fix validation issue for lastModifier when using multiple service accounts #6072 (thanks @savitaashture)
  • Fix problem with Configuration reporting Ready early #6096 (thanks @taragu)
  • Validation added for name and generateName fields in RevisionTemplate #5110 (thanks @savitaashture)

Test Improvements:

  • File access checks in conformance tests #5102 (thanks @shashwathi)
  • Improve route visibility test #5831 (thanks @andrew-su)
  • Added ability to run e2e tests with https #5157 (thanks @taragu)
  • More reliable WaitForService function #5956 (thanks @mgencur)
  • Added new upgrade test to catch broken defaulting behavior #6165 (thanks @dgerd)
  • Fix digest validation in conformance to account for image templating #6106 (thanks @markusthoemmes)

Networking

Compatibility with Istio 1.4 #6058 (thanks @nak3)

Istio 1.4 introduced a breaking restriction to the length of regular expressions allowed in VirtualServices. We switched to using prefixes to be compatible with Istio 1.4.

Integration with istio/client-go #5969 pkg#208 pkg#831 (thanks @skaslev)

Knative now uses istio/client-go instead of its own version of Istio API client. This addressed a long pain-point of maintaining a manually-translated API client to a changing API.

3scale/kourier integration #5983 (thanks @bbrowning @davidor @jmprusi)

Kourier is a light-weight Ingress for Knative (a deployment of Kourier consists of an Envoy proxy and a control plane for it). In v0.11 we add Kourier as an option to run Knative e2e integration tests.

Better LoadBalancerReady condition when VirtualService failed to be reconciled #6048 (thanks @nak3)

Previously, when a VirtualService failed to be reconciled, the LoadBalancerReady condition wasn’t updated. We now surface the reason and message from the failing VirtualService.

Post-ClusterIngress migration cleanups (thanks @markusthoemmes)

Clean up port names of Knative components to follow Istio convention #5070 (thanks @iamejboy)

Bug fix #5734: Do not permit cluster local kservices on the cluster ingress #6174 (thanks @vagababov)

Fix a bug where cluster-local ksvcs are erroneously exposed to the public ingress.

Monitoring

Add default request metrics backend in observability config #6022 (thanks @drpmma)

This change makes Prometheus the default backend, consistent with the default example value in config-observability.yaml.

Fix missing required selector for node-exporter #5934 (thanks @lionelvillard)

serving - Knative Serving release v0.11.0

Published by knative-prow-releaser-robot almost 5 years ago

Meta

Load-balancing improvements with low containerConcurrency

At low containerConcurrency values we now perform significantly better due to improvements in the application-specific load balancing performed by the Activator component.

Kourier networking support

We have a new option for handling the ingress capabilities used by knative/serving. Kourier is the first Knative-native ingress implementation, which reconciles the Knative networking CRDs to program Envoy directly.

Autoscaling

Locally perfect loadbalancing and endpoint subsetting improvements (thanks @vagababov)

These are further improvements to the load-balancing enhancements of the last few releases. Given a stable activator count, load balancing of a revision with the activator on the path is now locally ideal. See the graph.

Reduced the needed Kubernetes Services per Revision from 3 to 2 #5900 (thanks @markusthoemmes)

The third service used to be used exclusively for metric scraping. This is now done via the private service as well. Metric services are no longer created and are actively removed from existing deployments.

Allow applications with a livenessProbe to properly scale down #5986 (thanks @nak3)

The queue-proxy wrongly counted requests sent via livenessProbes as actual requests, causing the revision to never shut down. These requests are now properly ignored.

Target annotation values can now exceed configured defaults #5975 (thanks @markusthoemmes)

This fixes a bug in the logic to determine the actual target of the autoscaler which capped the user-defined target value to the configured default value.
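
The nature of the bug can be sketched in a few lines. Both functions are hypothetical reconstructions from the description above, not the autoscaler's code, and they ignore any other limits (such as containerConcurrency) that may apply to the final target.

```python
def resolve_target_buggy(annotation_target, default_target):
    """Behaviour as described for the bug: the default acted as an upper bound."""
    if annotation_target is None:
        return default_target
    return min(annotation_target, default_target)


def resolve_target(annotation_target, default_target):
    """Fixed behaviour: an explicit annotation wins; the default is a fallback."""
    return default_target if annotation_target is None else annotation_target


print(resolve_target_buggy(200, 100))  # capped at the default
print(resolve_target(200, 100))        # honours the annotation
```

Values below the default behave the same in both versions; only annotations above the default were affected.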

Report desired/actual scale in PodAutoscalers for the HPA as well (thanks @vagababov)

The values for desired and actual scale are now plumbed through from the HPA into the PodAutoscaler’s status.

Assorted code readability, optimizations and clean ups (thanks @vagababov, @markusthoemmes, @mgencur)

Core API

Improved error messages for image tag resolving #5920 (thanks @markusthoemmes)

Previous error messages did not indicate that the image pull failure occurred during digest resolution, and did not provide further details as to why the digest resolution failed. This change aids users in debugging problems in container registry permissions.

Enabled imagePullSecrets in PodSpec #5917 (thanks @taragu)

Users may now specify imagePullSecrets directly without attaching them to their Kubernetes ServiceAccount.

Add permissions for caching.internal.knative.dev to edit and view cluster roles #5992 (thanks @nak3)

Knative provides aggregated ClusterRoles that users can use to manage their Knative resources. These roles previously did not include the caching resource. This change adds the caching resource to both the edit and view roles.

Split apart defaulting and validation webhooks #5947 (thanks @mattmoor)

This fixes a problem where our validation wasn’t necessarily applied to the final object because it runs at the same time as defaulting, which might be before additional mutating webhooks. By separating things out we ensure that the validation occurs on the final object to be committed to etcd.

Configuration and Service now labeled with duck.knative.dev/podspecable #6121 (thanks @mattmoor)

This enables tools that reflect over the Kubernetes type system to reason about the podspec portion of these Knative resources.

Bug Fixes:

  • Fix bug where latestRevision routes can point to wrong revision #5319 (thanks @taragu)
  • Fix issue where config-defaults were not getting applied #5892 (thanks @taragu)
  • Fix validation issue for lastModifier when using multiple service accounts #6072 (thanks @savitaashture)
  • Fix problem with Configuration reporting Ready early #6096 (thanks @taragu)
  • Validation added for name and generateName fields in RevisionTemplate #5110 (thanks @savitaashture)

Test Improvements:

  • File access checks in conformance tests #5102 (thanks @shashwathi)
  • Improve route visibility test #5831 (thanks @andrew-su)
  • Added ability to run e2e tests with https #5157 (thanks @taragu)
  • More reliable WaitForService function #5956 (thanks @mgencur)
  • Added new upgrade test to catch broken defaulting behavior #6165 (thanks @dgerd)
  • Fix digest validation in conformance to account for image templating #6106 (thanks @markusthoemmes)

Networking

Compatibility with Istio 1.4 #6058 (thanks @nak3)

Istio 1.4 introduced a breaking restriction to the length of regular expressions allowed in VirtualServices. We switched to using prefixes to be compatible with Istio 1.4.

Integration with istio/client-go #5969 pkg#208 pkg#831 (thanks @skaslev)

Knative now uses istio/client-go instead of its own version of Istio API client. This addressed a long pain-point of maintaining a manually-translated API client to a changing API.

3scale/kourier integration #5983 (thanks @bbrowning @davidor @jmprusi)

Kourier is a light-weight Ingress for Knative (a deployment of Kourier consists of an Envoy proxy and a control plane for it). In v0.11 we add Kourier as an option to run Knative e2e integration tests.

Better LoadBalancerReady condition when VirtualService failed to be reconciled #6048 (thanks @nak3)

Previously, when a VirtualService failed to be reconciled, the LoadBalancerReady condition wasn’t updated. We now surface the reason and message from the failing VirtualService.

Post-ClusterIngress migration cleanups (thanks @markusthoemmes)

Clean up port names of Knative components to follow Istio convention #5070 (thanks @iamejboy)

Bug fix #5734: Do not permit cluster local kservices on the cluster ingress #6174 (thanks @vagababov)

Fix a bug where cluster-local ksvcs are erroneously exposed to the public ingress.

Monitoring

Add default request metrics backend in observability config #6022 (thanks @drpmma)

This change makes Prometheus the default backend, consistent with the default example value in config-observability.yaml.

Fix missing required selector for node-exporter #5934 (thanks @lionelvillard)

serving - Knative Serving release v0.10.1

Published by knative-prow-releaser-robot almost 5 years ago

Meta

Eliminated errors across dataplane-probe scenarios!

With some of the work to improve the activator’s load balancing, the errors we would see with containerConcurrency: 1 due to queuing have been eliminated. With the CC aware load balancing in the activator we actually see better latency going through the activator than talking directly to the user container (everywhere else the extra hop adds latency).

Moving minimum Kubernetes version to 1.14

As part of moving to a single install path we are moving the minimum supported Kubernetes version to 1.14. This allows us to take advantage of multiple CRD endpoints and begin to experiment with future Kubernetes CRD features. Installation will fail if a lower version is detected.

We’re using Go1.13 for Serving

Serving has switched to Go 1.13, which offers much better performance around synchronization, plus error wrapping and other features such as Duration.Milliseconds.

Autoscaling

We no longer use GenerateName for service names (@vagababov)

We were using GenerateName to create the names of metrics and private K8s Services, which caused various problems in testing and reliability. We no longer have this problem.

Activator refactoring and load balancing (@vagababov, @markusthoemmes):

After great changes by @greghaynes that permit probing and routing requests to individual pods, we were able to significantly simplify the code and remove redundant and ineffective pieces.

Semi-ideal load balancing (@vagababov):

Based on previous work, we are now able to always route requests to individual pods (when pods are individually addressable), permitting significant improvements in the CC=1 case. See the graph. Work in this direction is not complete, though (the CC=10 case is still in progress).
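
Why per-pod routing helps at CC=1 can be illustrated with a toy picker. This is a sketch of concurrency-aware selection in general, not the activator's algorithm: the function name, the least-loaded tie-breaking, and the bookkeeping dictionary are all assumptions made for the example.

```python
def pick_pod(in_flight, container_concurrency):
    """Return the least-loaded pod that still has a free slot, or None."""
    candidates = [
        (count, pod)
        for pod, count in in_flight.items()
        if count < container_concurrency
    ]
    if not candidates:
        return None  # every pod is full; the request has to wait
    _, pod = min(candidates)
    in_flight[pod] += 1  # reserve the slot for this request
    return pod


load = {"pod-a": 1, "pod-b": 0}
print(pick_pod(load, 1))  # pod-b: the only pod with a free slot
print(pick_pod(load, 1))  # None: both pods are at capacity
```

A balancer that is unaware of per-pod concurrency could send the request to pod-a, where it would queue behind the in-flight request even though pod-b is idle.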

Significant improvements to the integration tests that led to a big reduction in flakes (@nak3 @JRBANCEL):

Assorted improvements to the code and the tests have brought the number of flakes to practically white noise. See graph (check test/e2e.TestAutoscale.* tests)

Assorted code readability, optimizations and clean ups (@markusthoemmes, @nak3, @JRBANCEL, @skaslev)

Core API

Move to K8s 1.15.3 client libs #5570 (thanks @mattmoor)

The client version allows Knative to take advantage of newer features within Kubernetes going forward. This client version was chosen as it is compatible with 1.14 (our minimum version), but can also be used for longer as it is compatible with 1.16 Kubernetes releases as well.

Move back to single install path with all API versions (v1alpha1, v1beta1, v1) #5594 (thanks @mattmoor)

As we move to a 1.14 minimum Kubernetes version we are migrating back to a single install path that contains all 3 API endpoints. See Release Principles Draft for how this will work for future releases.

Skip copying last-applied-configuration annotation to Route #5468 (thanks @nak3)

The last-applied-configuration annotation is applied when using kubectl to manipulate the Service object. Previously this annotation would be copied down to Route which would cause confusion when describing the Route.

ConfigMaps are now validated synchronously #5404 (thanks @nimakaviani)

Malformed configmaps were a potential source of latent error and were often difficult to debug. This release adds synchronous validation of configmaps which ensures that bad values aren't able to sneak through.

New ClusterRoles for editing/viewing Knative resources through User-facing roles #5683 (thanks @nak3)

The ClusterRoles knative-serving-namespaced-edit and knative-serving-namespaced-view have been added as User-facing Roles. Read more about User-facing Roles in the Kubernetes documentation.

Add annotation /creator and /lastModifier to Configuration and Route #5240 (thanks @savitaashture)

These annotations were added to Services in previous releases, where the information was propagated down. This release adds the annotations for Routes and Configurations created directly by a user.

Service and Route CRDs are now labeled with duck.knative.dev/addressable=true #5874 (thanks @n3wscott)

Bugs:

  • Fix v1 Route validation for multiple traffic targets with empty tags #5583 (thanks @skaslev)
  • Fix error message for missing probe field. #5713 (thanks @nak3)
  • Fail Service creation on invalid delaySeconds value #5733 (thanks @savitaashture)
  • Better error message for invalid combinations #5566 (thanks @nak3)
  • Delete finalizer on Route on deletion #5715 (thanks @nak3)
  • Validate name and generateName for RevisionTemplate #5110 (thanks @savitaashture)

Tests:

  • Use Status.URL instead of Status.URL.Host for conformance tests #5503 (thanks @bancel)
  • Enable downgrade testing #5596 (thanks @mattmoor)
  • New e2e test for rollback scenario #5702 (thanks @taragu)
  • Remove logURL assertion from conformance tests #5448 (thanks @markusthoemmes)

Networking

Activator graceful shutdown #5542 (thanks @nak3)

Fixes a bug where readiness probes kept succeeding while an activator was shutting down, so requests were still routed to terminating activators. SIGTERM now triggers a readinessProbe failure, which lets terminating activators drain properly.

Avoid creating wildcard certs when AutoTLS is disabled #5636 (thanks @nak3)

In v0.9.0, we attempted to create wildcard certs even when cert-manager was not set up, causing certificate creation errors when creating a new namespace. We now avoid requiring wildcard certs when AutoTLS is disabled.

Finalize ClusterIngress migration #5689 (thanks @wtam2018)

We removed the remaining references to ClusterIngress, and that completes the ClusterIngress migration feature track. Going forward, knative/serving no longer has cluster-scoped CRDs.

Improved readiness prober to support HTTPS redirect, HTTPS, HTTP2 & custom ports #5223 (thanks @JRBANCEL)

Previously, Route readiness probers relied on a special domain coded into Istio VirtualServices. The probes now exercise the actual data path, so they work with the domains that the Route is expected to serve.

Fix cluster-local visibility when using tags #5734 (thanks @andrew-su)

Fix a visibility resolution bug, introduced in v0.9.0, causing cluster-local tags to not have cluster-local visibility even if explicitly set.

Avoid creating invalid certificate for cluster-local service #5611 (thanks @nak3)

Monitoring

Add container and pod labels to revision and activator metrics #5824 (thanks @yanweiguo)

serving - Knative Serving release v0.10.0

Published by knative-prow-releaser-robot almost 5 years ago

Meta

Eliminated errors across dataplane-probe scenarios!

With some of the work to improve the activator’s load balancing, the errors we would see with containerConcurrency: 1 due to queuing have been eliminated. With the CC aware load balancing in the activator we actually see better latency going through the activator than talking directly to the user container (everywhere else the extra hop adds latency).

Moving minimum Kubernetes version to 1.14

As part of moving to a single install path we are moving the minimum supported Kubernetes version to 1.14. This allows us to take advantage of multiple CRD endpoints and begin to experiment with future Kubernetes CRD features. Installation will fail if a lower version is detected.

We’re using Go1.13 for Serving

Serving has switched to Go 1.13, which has much better performance around synchronization and adds error wrapping and other features, such as Duration.Milliseconds.
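As a minimal sketch of the two Go 1.13 features named above, the following uses only the standard library. The sentinel error and the helper names are hypothetical and are not taken from the Serving code base.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical sentinel error used only for this illustration.
var errNotReady = errors.New("revision not ready")

// probe wraps the sentinel with %w (new in Go 1.13) so callers can still detect it.
func probe(name string) error {
	return fmt.Errorf("probing %q: %w", name, errNotReady)
}

// isNotReady reports whether err wraps errNotReady anywhere in its chain.
func isNotReady(err error) bool {
	return errors.Is(err, errNotReady)
}

func main() {
	err := probe("hello-world")
	fmt.Println(err)             // probing "hello-world": revision not ready
	fmt.Println(isNotReady(err)) // true

	// Duration.Milliseconds returns the duration as an integer millisecond count.
	d := 1500 * time.Millisecond
	fmt.Println(d.Milliseconds()) // 1500
}
```

Before Go 1.13, the same check would typically need string matching or a custom wrapper type.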

Autoscaling

We no longer use GenerateName for service names (@vagababov)

We were using GenerateName to create the names of metrics and private K8s Services, which caused us various problems in testing and reliability. We no longer have this problem.

Activator refactoring and load balancing (@vagababov, @markusthoemmes):

After great changes by @greghaynes that permit probing and routing requests to individual pods, we were able to significantly simplify the code and remove redundant and ineffective pieces.

Semi-ideal load balancing (@vagababov):

Based on previous work, we are now able to always route requests to individual pods (when pods are individually addressable), permitting us to achieve significant improvements in the CC=1 case. See the graph. Work in this direction is not complete, though (the CC=10 case is still in progress).

Significant improvements to the integration tests that led to a big reduction in flakes (@nak3 @JRBANCEL):

Assorted improvements to the code and the tests have brought the number of flakes to practically white noise. See graph (check test/e2e.TestAutoscale.* tests)

Assorted code readability, optimizations and clean ups (@markusthoemmes, @nak3, @JRBANCEL, @skaslev)

Core API

Move to K8s 1.15.3 client libs #5570 (thanks @mattmoor)

The client version allows Knative to take advantage of newer features within Kubernetes going forward. This client version was chosen as it is compatible with 1.14 (our minimum version), but can also be used for longer as it is compatible with 1.16 Kubernetes releases as well.

Move back to single install path with all API versions (v1alpha1, v1beta1, v1) #5594 (thanks @mattmoor)

As we move to a 1.14 minimum Kubernetes version we are migrating back to a single install path that contains all 3 API endpoints. See Release Principles Draft for how this will work for future releases.

Skip copying last-applied-configuration annotation to Route #5468 (thanks @nak3)

The last-applied-configuration annotation is applied when using kubectl to manipulate the Service object. Previously this annotation would be copied down to Route which would cause confusion when describing the Route.
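A minimal sketch of the idea, assuming annotations are propagated as a plain string map. The function name is hypothetical; the annotation key is the one kubectl writes.

```go
package main

import "fmt"

// lastApplied is the annotation kubectl writes on objects it applies.
const lastApplied = "kubectl.kubernetes.io/last-applied-configuration"

// routeAnnotations returns a copy of the Service annotations without the
// kubectl bookkeeping annotation. The input map is left untouched.
// The function name is hypothetical.
func routeAnnotations(svc map[string]string) map[string]string {
	out := make(map[string]string, len(svc))
	for k, v := range svc {
		if k == lastApplied {
			continue // describes the Service, so it would be misleading on the Route
		}
		out[k] = v
	}
	return out
}

func main() {
	svc := map[string]string{
		lastApplied:                   `{"kind":"Service"}`,
		"serving.knative.dev/creator": "alice@example.com",
	}
	fmt.Println(routeAnnotations(svc))
}
```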

ConfigMaps are now validated synchronously #5404 (thanks @nimakaviani)

Malformed configmaps were a potential source of latent error and were often difficult to debug. This release adds synchronous validation of configmaps which ensures that bad values aren't able to sneak through.
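The general technique can be sketched as follows: parse each value at write time the same way a consumer would parse it later, and reject the update if parsing fails. This is an illustration only, not the webhook's actual code; the function name is hypothetical and the key is used purely as an example.

```go
package main

import (
	"fmt"
	"strconv"
)

// validateConfig parses a value the same way a consumer would later,
// so a malformed entry is rejected when the ConfigMap is written.
// The key name below is used for illustration only.
func validateConfig(data map[string]string) error {
	const key = "container-concurrency-target-default"
	raw, ok := data[key]
	if !ok {
		return nil // unset keys fall back to built-in defaults
	}
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return fmt.Errorf("%s: %w", key, err)
	}
	if v <= 0 {
		return fmt.Errorf("%s: must be positive, got %v", key, v)
	}
	return nil
}

func main() {
	const key = "container-concurrency-target-default"
	fmt.Println(validateConfig(map[string]string{key: "100"}))
	fmt.Println(validateConfig(map[string]string{key: "one hundred"}))
}
```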

New ClusterRoles for editing/viewing Knative resources through User-facing roles #5683 (thanks @nak3)

The ClusterRoles knative-serving-namespaced-edit and knative-serving-namespaced-view have been added as User-facing Roles. Read more about User-facing Roles in the Kubernetes documentation.

Add annotation /creator and /lastModifier to Configuration and Route #5240 (thanks @savitaashture)

These annotations were added to Services in previous releases, where the information was propagated down. This release adds the annotations for Routes and Configurations created directly by a user.

Service and Route CRDs are now labeled with duck.knative.dev/addressable=true #5874 (thanks @n3wscott)

Bugs:

  • Fix v1 Route validation for multiple traffic targets with empty tags #5583 (thanks @skaslev)
  • Fix error message for missing probe field. #5713 (thanks @nak3)
  • Fail Service creation on invalid delaySeconds value #5733 (thanks @savitaashture)
  • Better error message for invalid combinations #5566 (thanks @nak3)
  • Delete finalizer on Route on deletion #5715 (thanks @nak3)
  • Validate name and generateName for RevisionTemplate #5110 (thanks @savitaashture)

Tests:

  • Use Status.URL instead of Status.URL.Host for conformance tests #5503 (thanks @bancel)
  • Enable downgrade testing #5596 (thanks @mattmoor)
  • New e2e test for rollback scenario #5702 (thanks @taragu)
  • Remove logURL assertion from conformance tests #5448 (thanks @markusthoemmes)

Networking

Activator graceful shutdown #5542 (thanks @nak3)

Fixes a bug where readiness probes kept succeeding while an activator was shutting down, so requests were still routed to terminating activators. SIGTERM now triggers a readinessProbe failure, which lets terminating activators drain properly.
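The pattern can be sketched with the standard library alone. This is not the activator's code: the handler, the flag, and the /healthz path are hypothetical, and the SIGTERM wiring is only described in a comment so that the example stays runnable in-process.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// draining is 1 once a shutdown signal has been received.
var draining int32

// setDraining flips the flag; a real process would call setDraining(true)
// from a goroutine that waits on signal.Notify(ch, syscall.SIGTERM).
func setDraining(on bool) {
	var v int32
	if on {
		v = 1
	}
	atomic.StoreInt32(&draining, v)
}

// readiness answers the readiness probe: 200 while serving, 503 once draining.
func readiness(w http.ResponseWriter, _ *http.Request) {
	if atomic.LoadInt32(&draining) == 1 {
		http.Error(w, "shutting down", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// probeStatus runs one in-memory probe against the handler and returns the status code.
func probeStatus() int {
	rec := httptest.NewRecorder()
	readiness(rec, httptest.NewRequest(http.MethodGet, "/healthz", nil))
	return rec.Code
}

func main() {
	fmt.Println(probeStatus()) // 200
	setDraining(true)
	fmt.Println(probeStatus()) // 503
}
```

Once the probe fails, the endpoint is removed from load balancing while the process keeps serving in-flight requests until it exits.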

Avoid creating wildcard certs when AutoTLS is disabled #5636 (thanks @nak3)

In v0.9.0, we attempted to create wildcard certs even when cert-manager was not set up, causing certificate creation errors when creating a new namespace. We now avoid requiring wildcard certs when AutoTLS is disabled.

Finalize ClusterIngress migration #5689 (thanks @wtam2018)

We removed the remaining references to ClusterIngress, and that completes the ClusterIngress migration feature track. Going forward, knative/serving no longer has cluster-scoped CRDs.

Improved readiness prober to support HTTPS redirect, HTTPS, HTTP2 & custom ports #5223 (thanks @JRBANCEL)

Previously, Route readiness probers relied on a special domain coded into Istio VirtualServices. The probes now exercise the actual data path, so they work with the domains that the Route is expected to serve.

Fix cluster-local visibility when using tags #5734 (thanks @andrew-su)

Fix a visibility resolution bug, introduced in v0.9.0, causing cluster-local tags to not have cluster-local visibility even if explicitly set.

Avoid creating invalid certificate for cluster-local service #5611 (thanks @nak3)

Monitoring

Add container and pod labels to revision and activator metrics #5824 (thanks @yanweiguo)

serving - Knative Serving release v0.9.0

Published by knative-prow-releaser-robot about 5 years ago

Meta

This is “Serving v1” RC2

There is discussion ongoing within the community about how we will message and document that Serving (within constraints) is ready for production workloads, and how we coordinate this with the rest of Knative, which is not yet there.

v1 API

The v1 API shape and endpoint is available starting in this release. Due to potential minimum version constraints this release can be deployed with either just the v1alpha1 endpoint or with all endpoints (v1alpha1, v1beta1, and v1) enabled. The v1 API shape is usable through all endpoints.

To use the v1beta1 or v1 endpoints, a minimum Kubernetes version of 1.14 is required (1.13.10 also had the fix backported). The minimum required Kubernetes version will become 1.14 in the next release of Knative.

autoscaling.knative.dev/minScale now only applies to routable revisions

We have changed the behavior of minScale to only apply to Revisions that are referenced by a Route. This addresses a long-standing pain point where users used minScale, but Revisions would stick around until garbage collected, which takes at least 10 hours.

Cold Start improvements

We have made some improvements to our cold-start latency, which should result in a small net improvement across the board, and which notably improve:

  • Cold-starts that are sequenced (e.g. front-end calls back-end and both cold-start)
  • Events with responses (e.g. passing events back to the broker with each hop cold starting)
  • The long tail of cold-start latency (this should now be reliably under 10s for small container images)

Autoscaling

Cold Start Improvements #4902 and #3885 (thanks @greghaynes)

The Activator will now send requests directly to the pods when the ClusterIP is not yet ready, providing us with ~200ms latency from the time the pod is ready to the time we send the first request, compared to up to 10s before.
This also fixes a problem where cold start was subject to the iptables-min-sync-period of the kubelet (10s on GKE), which created a relatively high floor for cold start times under certain circumstances.

RPS autoscaling #3416 (thanks @yanweiguo and @taragu)

Autoscaling can now be driven not only by concurrency but also by an RPS/QPS/OPS metric, which is a better fit for short, lightweight requests (@yanweiguo)
Report RPS metrics (@taragu)
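To illustrate why the metric choice matters, here is a simplified sketch of a target-based scaling calculation. It assumes the desired pod count is the observed metric divided by a per-pod target, rounded up; the real autoscaler applies further windows and limits that are not shown, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// desiredPods returns how many pods are needed so that each pod handles at
// most targetPerPod of the observed metric (requests per second or
// concurrent requests, depending on the configured metric).
func desiredPods(observed, targetPerPod float64) int {
	if targetPerPod <= 0 {
		return 0 // invalid target; a real system would reject this earlier
	}
	return int(math.Ceil(observed / targetPerPod))
}

func main() {
	// RPS-driven: 950 requests/second with a target of 200 per pod.
	fmt.Println(desiredPods(950, 200)) // 5
	// Concurrency-driven: 12 in-flight requests with a target of 10 per pod.
	fmt.Println(desiredPods(12, 10)) // 2
}
```

Short requests finish before they accumulate as concurrency, so a rate-based target can reflect their load more directly.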

minScale only applies to routable revisions #4183 (thanks @tanzeeb)

Previously Revisions would keep around the minScale instance even when they were no longer routable.
Added Reachability concept to the PodAutoscaler.
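The rule can be summarized in a few lines. This is a sketch of the behavior described above, with a hypothetical function name, not the PodAutoscaler implementation.

```go
package main

import "fmt"

// effectiveMinScale returns the lower bound the autoscaler honors.
// reachable means at least one Route references the Revision.
func effectiveMinScale(minScale int, reachable bool) int {
	if !reachable {
		return 0 // unreferenced Revisions may scale to zero
	}
	return minScale
}

func main() {
	fmt.Println(effectiveMinScale(3, true))  // 3
	fmt.Println(effectiveMinScale(3, false)) // 0
}
```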

Continuous benchmarks are live at https://mako.dev (thanks @mattmoor, @srinivashegde86, @Fredy-Z, @vagababov)

Autoscaler scaledown rate #4993 (thanks @vagababov)

The rate at which the autoscaler scales down revisions can now be limited to a rate configured in config-autoscaler.
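One way to picture such a limit, assuming the rate is a factor greater than 1 by which the pod count may shrink per scaling decision. The function and parameter names are hypothetical; the exact semantics are defined by config-autoscaler.

```go
package main

import (
	"fmt"
	"math"
)

// limitScaleDown clamps the desired pod count so that one scaling decision
// shrinks the revision by at most a factor of rate.
// A rate of 1 or less is treated here as "no limit" (a choice of this sketch).
func limitScaleDown(current, desired int, rate float64) int {
	if rate <= 1 {
		return desired
	}
	floor := int(math.Floor(float64(current) / rate))
	if desired < floor {
		return floor
	}
	return desired
}

func main() {
	fmt.Println(limitScaleDown(10, 1, 2.0)) // 5: cannot drop below half in one step
	fmt.Println(limitScaleDown(10, 7, 2.0)) // 7: already within the limit
}
```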

Various bug fixes/improvements:

  • AutoScaler did not update metric service #5291 (@vagababov)
  • SKS goes to Serve mode after Autoscaler restart #5327 (@vagababov)
  • Activator scale down problems #5364 (@mattmoor and @yanweiguo)
  • TBC is 200 by default now (thanks @vagababov)
  • PA now exports desired/actual Pods in the Status (thanks @vagababov)
  • Code cleanups, test stability, etc. (@markusthoemmes, @taragu, @savitaashture, etc.)

Core API

v1 API #5483, #5259, #5337, #5439, #5559 (thanks @dgerd, @mattmoor)

The v1 API shape and endpoint is available starting in this release. See the "Meta" section for more details.

Validate system annotations #4995 (thanks @shashwathi)

Webhook validation now ensures that serving.knative.dev annotations have appropriate values.

Revisions now have the serving.knative.dev/route label #5048 (thanks @mattmoor)

Revisions are now labeled by the referencing Route to enable querying.

Revision GC refactored into its own reconciler #4876 (thanks @taragu)

Revision reconciliation now occurs separately from Configuration reconciliation.

Surface Deployment failures to Revision status #5077 (thanks @jonjohnsonjr)

DeploymentProgressing and DeploymentReplicaFailure information is propagated up to Revision status. An event is no longer emitted when the deployment times out.

Validate VolumeSources and VolumeProjections #5128 (thanks @markusthoemmes)

We now validate the KeyToPath items in the webhook to ensure that both Key and Path are specified. This prevents potential pod deployment problems.
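A sketch of this kind of check follows. The KeyToPath struct is a local stand-in for the Kubernetes type so the example stays self-contained, and the error wording is hypothetical.

```go
package main

import "fmt"

// KeyToPath is a local stand-in for the Kubernetes type of the same name.
type KeyToPath struct {
	Key  string
	Path string
}

// validateKeyToPath reports the first item that is missing Key or Path.
func validateKeyToPath(items []KeyToPath) error {
	for i, it := range items {
		if it.Key == "" {
			return fmt.Errorf("items[%d].key: missing field", i)
		}
		if it.Path == "" {
			return fmt.Errorf("items[%d].path: missing field", i)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateKeyToPath([]KeyToPath{{Key: "tls.crt", Path: "certs/tls.crt"}}))
	fmt.Println(validateKeyToPath([]KeyToPath{{Key: "tls.crt"}}))
}
```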

ContainerConcurrency default is now configurable #5099 (thanks @taragu, @zyqsempai)

ContainerConcurrency is now configured through the config-defaults ConfigMap. Unspecified values will receive the default value, and explicit zero values will receive 'unlimited' concurrency.
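The difference between "unspecified" and "explicit zero" can be modeled with a pointer, as in the sketch below. The function name and the default of 10 are hypothetical.

```go
package main

import "fmt"

// resolveConcurrency applies the defaulting rule described above:
// a nil value is unspecified and receives the cluster default, while an
// explicit 0 means unlimited concurrency.
func resolveConcurrency(specified *int64, clusterDefault int64) (limit int64, unlimited bool) {
	if specified == nil {
		return clusterDefault, clusterDefault == 0
	}
	return *specified, *specified == 0
}

func main() {
	two, zero := int64(2), int64(0)
	fmt.Println(resolveConcurrency(nil, 10))   // 10 false
	fmt.Println(resolveConcurrency(&two, 10))  // 2 false
	fmt.Println(resolveConcurrency(&zero, 10)) // 0 true
}
```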

Apply Route's labels to the child Ingress #5467 (thanks @nak3)

Labels on the Route will be propagated to the Ingress owned by the Route.

Jitter global resyncs to improve performance at scale #5275 (thanks @mattmoor)

Global resyncs no longer enqueue all objects at once. This prevents latency spikes in reconciliation time and improves the performance of larger clusters.
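A minimal sketch of jittering, assuming each object is re-enqueued after a random delay drawn from a fixed window. The function name, window, and seed are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitteredDelays returns one enqueue delay per object, each drawn uniformly
// from [0, window), so a global resync is spread over the window instead of
// hitting the work queue all at once.
func jitteredDelays(n int, window time.Duration, seed int64) []time.Duration {
	rng := rand.New(rand.NewSource(seed))
	delays := make([]time.Duration, n)
	if window <= 0 {
		return delays // no jitter requested: every delay stays zero
	}
	for i := range delays {
		delays[i] = time.Duration(rng.Int63n(int64(window)))
	}
	return delays
}

func main() {
	for i, d := range jitteredDelays(3, 10*time.Second, 42) {
		// A controller would schedule the object's key on its work queue after d.
		fmt.Printf("object %d: enqueue after %v\n", i, d)
	}
}
```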

Improved error messages for readiness probes #5385 (thanks @nak3)

Bug Fixes:

  • Fix Revisions stuck in updating when scaled-to-zero #5106 (thanks @tanzeeb)
  • Fix Service reconcile when using named Revisions #5547 (thanks @dgerd)
  • Skip copying kubectl.kubernetes.io/last-applied-configuration annotation #5202 (thanks @skaslev)
  • Image repository credentials now work for image pulling #5477 (thanks @jonjohnsonjr)
  • Error earlier if using invalid autoscaling annotations #5412 (thanks @savitaashture)
  • Fix potential NPE in Route reconciler #5333 (thanks @mjaow)
  • Fix timeoutSeconds=0 to set default timeout #5224 (thanks @nak3)
  • Consistent update for Ingress ObservedGeneration #5250 (thanks @taragu)

Test Improvements:

  • Fix cgroup test for non-default CPU periods #5322 (thanks @duglin)
  • Improve Revision unit test coverage #5248 (thanks @savitaashture)

Networking

Cold start improvement

The activator sends requests directly to Pods #3885 #4902 (thanks @greghaynes)

Disable and remove ClusterIngress resources #5024 (thanks @wtam)

Various bug fixes

  • Prober ignores Gateways that can’t be probed #5129 (thanks @JRBANCEL)
  • Make port name in Gateway unique by adding namespace prefix #5324 (thanks @nak3)
  • Activator to handle graceful shutdown correctly #5364 (thanks @mattmoor)
  • Route cluster-local visibility should take precedence over placeholder Services #5411 (thanks @tcnghia)

Monitoring

  • Upgrade Grafana image to official release 6.3.3 #5288 (thanks @yanweiguo)
  • Remove addonmanager labels from monitoring.yaml #5235 (thanks @yanweiguo)
  • Make reconciler dashboard a generic one #5247 (thanks @sayanh)
  • Report RPS for autoscaler metrics #5238 (thanks @taragu)
  • Remove shadowed logging package. #5132 (thanks @markusthoemmes)
  • Profiling support #5083 (thanks @mgencur)
  • Update log level to run tests on debug level #5071 (thanks @taragu)
  • Report Activator request concurrency to metrics backend #4931 (thanks @yanweiguo)
  • Add the grafana metric for the excess burst capacity #4820 (thanks @vagababov)
  • Export webhook metrics to prometheus #4707 (thanks @anniefu)

serving - Knative Serving release v0.8.1

Published by knative-prow-releaser-robot about 5 years ago

Meta

This release is our first “release candidate” for Serving v1

We are burning down remaining issues here, but barring major issues we will declare 0.9 the “v1” release of knative/serving.

Istio minimum version is now 1.1.x

In order to support #4755 we have to officially remove support for Istio 1.0.x (which is end-of-life).

Route/Service Ready actually means Ready!

Route now only reports Ready if it is accessible from the Istio Ingress. This allows users to start using a Service/Route the moment it reports Ready.

Target Burst Capacity (TBC) support

The activator can now be used to shield user services at smaller scales (not just zero!), where it will buffer requests until adequate capacity is available. This is configurable on cluster and revision level; it is currently off by default.

Migrate to knative.dev/serving import path

We have migrated github.com/knative/serving import paths to use knative.dev/serving.

Autoscaling

Target Burst Capacity (TBC) support #4443, #4516, #4580, #4758 (thanks @vagababov)

The activator can now be used to shield user services at smaller scales (not just zero!), where it will buffer requests until adequate capacity is available. This is configurable on cluster and revision level; it is currently off by default.

Activator HPA and performance improvements #4886, #4772 (thanks @yanweiguo)

With the activator on the dataplane more often (for TBC), several performance and scale problems popped up. We now horizontally scale the activator on CPU, and have made several latency improvements to its request handling.

Faster Scale Down to 0 #4883, #4949, #4938, etc (thanks @vagababov)

We will now elide the scale-to-zero “grace period” when the activator was already in the request path (this is now possible through the use of “target burst capacity”).
The scale-to-zero “grace period” is now computed from the time the activator was confirmed on the data path vs. a fixed duration.

Metrics Resource #4753, #4894, #4895, #4913, #4924 (thanks @markusthoemmes)

Autoscaling metrics are now full-fledged resources in Knative; this enables new autoscalers to plug in from out-of-process.

HPA is a separate controller now #4990 (thanks @markusthoemmes)

This proves that the metrics resource model enables a fully capable autoscaler outside of the main autoscaling controller.

Stability and performance (thanks to many):

  • Improvements to test flakiness
  • Better validation of annotation and config maps is performed
  • Autoscaler will wait for a reasonable population of metrics to be collected before scaling user pods down after it has been restarted.

Core API

Readiness probe cold-start improvements #4148, #4649, #4667, #4668, #4731 (thanks @joshrider, @shashwathi)

The queue-proxy sidecar will now evaluate both user specified readiness probes and the (default) TCP probe. This enables us to much more aggressively probe the user-provided container for readiness (vs. K8s default second granularity).
The default periodSeconds for the readinessProbe is now 0 which enables a system defined sub-second readiness check.
This contains a breaking change for users relying on the default periodSeconds while specifying either timeoutSeconds or failureThreshold. Services using these values should remove them to enable the benefits of faster probing, or they should specify a periodSeconds greater than 0 to restore previous behavior.

Enable specifying protocol without port number #4515 (thanks @tanzeeb)

Container ports can now be specified without a port number. This allows for specifying just a name (i.e. "http1", "h2c") to select the protocol.

Tag-to-digest resolution now works with AWS ECR #4084 (thanks @jonjohnsonjr)

Knative has been updated to use the new AWS credential provider to enable pulling images from AWS ECR.

Revisions annotated with serving.knative.dev/creator #4526 (thanks @nak3)

Annotation Validations #4560, #4656, #4669, #4888, #4879, #4763 (thanks @vagababov, @markusthoemmes, @savitaashture , @shashwathi)

System annotations (autoscaling.knative.dev/* and serving.knative.dev/*) are now validated by the webhook for correctness and immutability (where applicable). This improves visibility to errors in annotations, and ensures annotations on Knative objects are accurate and valid.

ServiceAccountName Validation #4733, #4919 (thanks @shashwathi)

Service account names are now validated as Kubernetes identifiers, which shortens the time to error and reduces the potential impact of an incorrect identifier.

Fixes

  • Tag resolution for schema 1 images #4432 (thanks @jonjohnsonjr )
  • Don't display user-defined template for cluster-local #4615 (thanks @duglin)
  • Fix error message when multiple containers are specified #4709 (thanks @nak3)
  • Update observedGeneration even when Route fails #4594 (thanks @taragu)

Tests:

  • Improved header test for 'Forwarded' header #4626 (thanks @markusthoemmes)
  • Reduce number of test images #4687, #4677, #4679, #4720, #4721 (thanks @markusthoemmes, @dgerd)
  • Replace test.options with functional options #4762 (thanks @markusthoemmes)

Docs:

  • Remove misuse of RFC2119 keywords #4550 (thanks @duglin)
  • Add links to conformance tests from Runtime Contract #4428 (thanks @dgerd)
  • New API Specification document docs#1642 (thanks @dgerd)

Networking

Honest Route/Service Readiness (#1582, #3312) (thanks @JRBANCEL)

Route now only reports Ready if it is accessible from the Istio Ingress. This allows users to start using a Service or Route the moment it reports Ready.

Remove cluster scoping of ClusterIngress (#4028) (thanks @wtam)

networking.internal.knative.dev/ClusterIngress is now replaced by networking.internal.knative.dev/Ingress, which is a namespace-scoped resource. The ClusterIngress resource will be removed in 0.9.

Enable visibility settings for sub-Route (#3419) (thanks @andrew-su)

Each sub-Route (tag) can have its own visibility setting, set by labelling the corresponding placeholder K8s Service.

Correct split percentage for inactive Revisions (#882, #4755) (thanks @tcnghia)

We no longer route only to the biggest inactive split when there is more than one inactive traffic split. To support this fix we now officially remove support for Istio 1.0 (which was announced to be EOL).

Integration with Gloo Ingress (thanks @scottweiss and Solo.io team)

Knative-on-Gloo now has its own continuous build to ensure good integration.
Gloo now officially supports networking.internal.knative.dev/Ingress (see #4028).

Ambassador officially announces Knative support (thanks @richarddli and Ambassador team)

blog post

Fixes

  • Fix activator crash due to trailing dot in resolv.conf (#4407) (thanks @tcnghia)
  • Activator to wait for active requests to drain before terminating (#4654) (thanks @vagababov)
  • Fix cluster-local Service URL (#4204) (thanks @duglin)
  • Remove cert-manager controller from default serving.yaml install (#4120) (thanks @ZhiminXiang)

Monitoring

Automate cold-start timing collection #2495 (thanks @greghaynes)

Record the time spent broken down into components during cold-start including “how much time is spent before we ask our deployment to scale up” and “how much time is spent before our user application begins executing”.

Dash in controller name causes metrics to be dropped #4716 (thanks @JRBANCEL)

Fixed an issue where some controller metrics were not getting into Prometheus due to invalid characters in their component names.

serving - Knative Serving release v0.8.0 (aka "v1rc1")

Published by knative-prow-releaser-robot about 5 years ago

Meta

This release is our first “release candidate” for Serving v1

We are burning down remaining issues here, but barring major issues we will declare 0.9 the “v1” release of knative/serving.

Istio minimum version is now 1.1.x

In order to support #4755 we have to officially remove support for Istio 1.0.x (which is end-of-life).

Route/Service Ready actually means Ready!

Route now only reports Ready if it is accessible from the Istio Ingress. This allows users to start using a Service/Route the moment it reports Ready.

Target Burst Capacity (TBC) support

The activator can now be used to shield user services at smaller scales (not just zero!), where it will buffer requests until adequate capacity is available. This is configurable on cluster and revision level; it is currently off by default.

Migrate to knative.dev/serving import path

We have migrated github.com/knative/serving import paths to use knative.dev/serving.

Autoscaling

Target Burst Capacity (TBC) support #4443, #4516, #4580, #4758 (thanks @vagababov)

The activator can now be used to shield user services at smaller scales (not just zero!), where it will buffer requests until adequate capacity is available. This is configurable on cluster and revision level; it is currently off by default.

Activator HPA and performance improvements #4886, #4772 (thanks @yanweiguo)

With the activator on the dataplane more often (for TBC), several performance and scale problems popped up. We now horizontally scale the activator on CPU, and have made several latency improvements to its request handling.

Faster Scale Down to 0 #4883, #4949, #4938, etc (thanks @vagababov)

We will now elide the scale-to-zero “grace period” when the activator was already in the request path (this is now possible through the use of “target burst capacity”).
The scale-to-zero “grace period” is now computed from the time the activator was confirmed on the data path vs. a fixed duration.

Metrics Resource #4753, #4894, #4895, #4913, #4924 (thanks @markusthoemmes)

Autoscaling metrics are now full-fledged resources in Knative; this enables new autoscalers to plug in from out-of-process.

HPA is a separate controller now #4990 (thanks @markusthoemmes)

This proves that the metrics resource model enables a fully capable autoscaler outside of the main autoscaling controller.

Stability and performance (thanks to many):

  • Improvements to test flakiness
  • Better validation of annotation and config maps is performed
  • Autoscaler will wait for a reasonable population of metrics to be collected before scaling user pods down after it has been restarted.

Core API

Readiness probe cold-start improvements #4148, #4649, #4667, #4668, #4731 (thanks @joshrider, @shashwathi)

The queue-proxy sidecar will now evaluate both user specified readiness probes and the (default) TCP probe. This enables us to much more aggressively probe the user-provided container for readiness (vs. K8s default second granularity).
The default periodSeconds for the readinessProbe is now 0 which enables a system defined sub-second readiness check.
This contains a breaking change for users relying on the default periodSeconds while specifying either timeoutSeconds or failureThreshold. Services using these values should remove them to enable the benefits of faster probing, or they should specify a periodSeconds greater than 0 to restore previous behavior.

Enable specifying protocol without port number #4515 (thanks @tanzeeb)

Container ports can now be specified without a port number. This allows for specifying just a name (i.e. "http1", "h2c") to select the protocol.

Tag-to-digest resolution now works with AWS ECR #4084 (thanks @jonjohnsonjr)

Knative has been updated to use the new AWS credential provider to enable pulling images from AWS ECR.

Revisions annotated with serving.knative.dev/creator #4526 (thanks @nak3)

Annotation Validations #4560, #4656, #4669, #4888, #4879, #4763 (thanks @vagababov, @markusthoemmes, @savitaashture , @shashwathi)

System annotations (autoscaling.knative.dev/* and serving.knative.dev/*) are now validated by the webhook for correctness and immutability (where applicable). This improves visibility to errors in annotations, and ensures annotations on Knative objects are accurate and valid.

ServiceAccountName Validation #4733, #4919 (thanks @shashwathi)

Service account names are now validated as Kubernetes identifiers, which shortens the time to error and reduces the potential impact of an incorrect identifier.
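As an illustration of failing early on a bad identifier, the sketch below assumes the DNS-1123 subdomain rules that Kubernetes applies to most object names (lowercase alphanumerics, '-' and '.', at most 253 characters). Whether the webhook uses exactly these rules is an assumption, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// dns1123Subdomain matches dot-separated labels made of lowercase
// alphanumerics and '-', each starting and ending with an alphanumeric.
var dns1123Subdomain = regexp.MustCompile(
	`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// validServiceAccountName reports whether name satisfies the assumed rules.
func validServiceAccountName(name string) bool {
	return len(name) > 0 && len(name) <= 253 && dns1123Subdomain.MatchString(name)
}

func main() {
	for _, n := range []string{"default", "build-bot.v2", "My_Account", "-bad"} {
		fmt.Printf("%q valid: %v\n", n, validServiceAccountName(n))
	}
}
```

Rejecting the name in the webhook surfaces the mistake at creation time rather than when a pod later fails to start.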

Fixes

  • Tag resolution for schema 1 images #4432 (thanks @jonjohnsonjr )
  • Don't display user-defined template for cluster-local #4615 (thanks @duglin)
  • Fix error message when multiple containers are specified #4709 (thanks @nak3)
  • Update observedGeneration even when Route fails #4594 (thanks @taragu)

Tests:

  • Improved header test for 'Forwarded' header #4626 (thanks @markusthoemmes)
  • Reduce number of test images #4687, #4677, #4679, #4720, #4721 (thanks @markusthoemmes, @dgerd)
  • Replace test.options with functional options #4762 (thanks @markusthoemmes)

Docs:

  • Remove misuse of RFC2119 keywords #4550 (thanks @duglin)
  • Add links to conformance tests from Runtime Contract #4428 (thanks @dgerd)
  • New API Specification document docs#1642 (thanks @dgerd)

Networking

Honest Route/Service Readiness (#1582, #3312) (thanks @JRBANCEL)

Route now only reports Ready if it is accessible from the Istio Ingress. This allows users to start using a Service or Route the moment it reports Ready.

Remove cluster scoping of ClusterIngress (#4028) (thanks @wtam)

networking.internal.knative.dev/ClusterIngress is now replaced by networking.internal.knative.dev/Ingress, which is a namespace-scoped resource. The ClusterIngress resource will be removed in 0.9.

Enable visibility settings for sub-Route (#3419) (thanks @andrew-su)

Each sub-Route (tag) can have its own visibility setting, set by labelling the corresponding placeholder K8s Service.

Correct split percentage for inactive Revisions (#882, #4755) (thanks @tcnghia)

We no longer route only to the biggest inactive split when there is more than one inactive traffic split. To support this fix we now officially remove support for Istio 1.0 (which was announced to be EOL).

Integration with Gloo Ingress (thanks @scottweiss and Solo.io team)

Knative-on-Gloo now has its own continuous build to ensure good integration.
Gloo now officially supports networking.internal.knative.dev/Ingress (see #4028).

Ambassador officially announces Knative support (thanks @richarddli and Ambassador team)

blog post

Fixes

  • Fix activator crash due to trailing dot in resolv.conf (#4407) (thanks @tcnghia)
  • Activator to wait for active requests to drain before terminating (#4654) (thanks @vagababov)
  • Fix cluster-local Service URL (#4204) (thanks @duglin)
  • Remove cert-manager controller from default serving.yaml install (#4120) (thanks @ZhiminXiang)

Monitoring

Automate cold-start timing collection #2495 (thanks @greghaynes)

Record the time spent broken down into components during cold-start including “how much time is spent before we ask our deployment to scale up” and “how much time is spent before our user application begins executing”.

Dash in controller name causes metrics to be dropped #4716 (thanks @JRBANCEL)

Fixed an issue where some controller metrics were not getting into Prometheus due to invalid characters in their component names.
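Prometheus metric names may contain only letters, digits, underscores, and colons, so a dash in a component name makes the resulting name invalid. One common remedy is to replace such characters before building the name. The sketch below shows that approach; it is not necessarily how #4716 fixed the issue, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeMetricComponent replaces every character outside [a-zA-Z0-9_]
// with an underscore so the component can be embedded in a metric name.
// It does not handle a leading digit, which metric names also disallow.
func sanitizeMetricComponent(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_':
			return r
		}
		return '_'
	}, s)
}

func main() {
	fmt.Println(sanitizeMetricComponent("revision-controller")) // revision_controller
}
```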

serving - Knative Serving release v0.7.1

Published by knative-prow-releaser-robot over 5 years ago

Meta

serving.knative.dev/v1beta1 (requires K8s 1.14+ due to https://github.com/knative/serving/issues/4533)

  • In 0.6 we expanded our v1alpha1 API to include our v1beta1 fields. In this release, we are contracting the set of fields we store for v1alpha1 to that subset (and disallowing those that don’t fit). With this, we can leverage the “same schema” CRD-conversion supported by Kubernetes 1.11+ to ship v1beta1.

HPA-based scaling on concurrent requests

  • We previously supported using the HPA “class” autoscaler to enable Knative services to be scaled on CPU and Memory. In this release, we are adding support for using the HPA to scale them on the same “concurrent requests” metrics used by our default autoscaler.
  • HPA does not yet support scaling to zero, and more work is needed to expose these metrics to arbitrary autoscaler plugins, but this is exciting progress!

Non-root containers

  • In this release, all of the containers we ship run as a “nonroot” user. This includes the queue-proxy sidecar injected into the user pod. This enables the use of stricter “Pod Security Policies” with knative/serving.

Breaking Changes

  • Previously deprecated status fields are no longer populated.
  • Build and Manual (deprecated in 0.6) are now unsupported
  • The URLs generated for Route tags by default have changed, see the tagTemplate section below for how to avoid this break.

Autoscaling

Support concurrency-based scaling on the HPA (thanks @markusthoemmes).

Metric scraping and decision-making have been separated out of the Knative internal autoscaler (KPA). The metrics are now also available to the HPA.

Dynamically change autoscaling metrics sample size based on pod population (thanks @yanweiguo).

Depending on how many pods the specific revision has, the autoscaler now scrapes a computed number of pods to gain more confidence in the reported metrics while maintaining scalability.

Fixes:

  • Added readiness probes to the autoscaler #4456 (thanks @vagababov)
  • Adjust activator’s throttling behavior based on activator scale (thanks @shashwathi and @andrew-su).
  • Revisions wait until they have reached “minScale” before they are reported “Ready” (thanks @joshrider).

Core API

Expose v1beta1 API #4199 (thanks @mattmoor)

This release exposes resources under serving.knative.dev/v1beta1.

Non-root containers #3237 (thanks @bradhoekstra and @dprotaso)

This release, all of the containers we ship run as a “nonroot” user. This includes the queue-proxy sidecar injected into the user pod. This enables the use of stricter “Pod Security Policies” with knative/serving.

Allow users to specify their container name #4289 (thanks @mattmoor)

The container name defaults to user-container, which is what we use today. That default may be changed in config-defaults to a Go template with access to the parent resource’s (e.g. Service, Configuration) ObjectMeta fields.

Projected volume support #4079 (thanks @mattmoor)

Based on community feedback, we have added support for mounting ConfigMaps and Secrets via the projected volume type.

Drop legacy status fields #4197 (thanks @mattmoor)

A variety of legacy fields from our v1alpha1 have been dropped in preparation to serve these same objects over v1beta1.

Build is unsupported #4099 (thanks @mattmoor)

As mentioned in the 0.6 release notes, support for just-in-time builds has been removed, and requests containing a build will now be rejected.

Manual is unsupported #4188 (thanks @mattmoor)

As mentioned in the 0.6 release notes, support for manual mode has been removed, and requests containing it will now be rejected.

V1beta1 clients and conformance testing #4369 (thanks @mattmoor)

We have generated client libraries for v1beta1 and have a v1beta1 version of the API conformance test suite under ./test/conformance/api/v1beta1.

Defaulting based conversion #4080 (thanks @mattmoor)

Objects submitted with the old v1alpha1 schema will be upgraded via our “defaulting” logic in a mutating admission webhook.

New annotations for queue-proxy resource limits #4151 (thanks @raushan2016)

The queue.sidecar.serving.knative.dev/resourcePercentage annotation now allows setting the percentage of user container resources to be used for the queue-proxy.

Annotation propagation #4363, #4367 (thanks @vagababov)

Annotations now propagate from the Knative Service object to Route and Configuration.

Fixes:

  • Improve our Ready/Generation handling across resources #4185 (thanks @mattmoor)
  • Fix Revision GC #4187, #4245 (thanks @nak3, @greghaynes)
  • Surface pod schedule errors in Revision #4191 (thanks @shashwathi)
  • Allow container.name in RevisionTemplate #4289 (thanks @mattmoor)
  • Fix pulling older schema 1 manifests #4430 (thanks @jonjohnsonjr)

Test:

  • Add multiple namespace test #4108 (thanks @andrew-su)
  • Separate Conformance tests by type #4145 (thanks @tzununbekov)
  • Upgrade test improvements #4211, #4267 (thanks @jonjohnsonjr)
  • Add conformance test case for user set headers #4411 (thanks @dgerd)

Networking

Reconcile annotations from Route to ClusterIngress #4087 (thanks @vagababov)

This allows the ClusterIngress class annotation to be specified per-Route instead of cluster-wide through a config-network setting.

Introduce tagTemplate configuration #4292 (thanks @mattmoor)

This allows operators to configure the names that are given to the services created for tags in Route.
This also changes the default to transpose the tag and route name, which is a breaking change to the URLs these received in 0.6. To avoid this break, you can set tagTemplate: {{.Name}}-{{.Tag}} in config-network.
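To make the URL change concrete, the sketch below renders both templates for a hypothetical Route named myroute with a tag named canary. Knative evaluates tagTemplate as a Go template; the render helper here is only a minimal Python stand-in that substitutes {{.Name}} and {{.Tag}}, and the new default shown is inferred from the description above (tag and route name transposed) rather than copied from the source tree.

```python
import re

# Templates as described in the release notes. NEW_DEFAULT is an inference
# from "transpose the tag and route name", not a value taken from the code.
LEGACY_TEMPLATE = "{{.Name}}-{{.Tag}}"
NEW_DEFAULT = "{{.Tag}}-{{.Name}}"

# Matches {{.Field}} with optional whitespace inside the braces.
_FIELD = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")


def render(template, name, tag):
    """Substitute {{.Name}} and {{.Tag}}; a stand-in for Go's text/template."""
    values = {"Name": name, "Tag": tag}

    def replace(match):
        field = match.group(1)
        if field not in values:
            raise KeyError("unsupported template field: " + field)
        return values[field]

    return _FIELD.sub(replace, template)


# Hypothetical route and tag names, used only for illustration.
print(render(LEGACY_TEMPLATE, "myroute", "canary"))
print(render(NEW_DEFAULT, "myroute", "canary"))
```

Operators who depend on the 0.6 naming would keep the first form by setting it explicitly in config-network.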

Enable use of annotations in domainTemplate #4210 (thanks @raushan2016)

Users can now provide a custom subdomain via the label serving.knative.dev/subDomain.

Allow customizing max allowed request timeout #4172 (thanks @mdemirhan)

This introduces a new config entry max-revision-timeout-seconds in config-defaults to set the max allowed request timeout.

Set Forwarded header on request #4376 (thanks @tanzeeb)

The Forwarded header is constructed and appended to the headers by the queue-proxy if only legacy x-forwarded-* headers are set.
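As an illustration of the idea, the following sketch builds a Forwarded value (in the RFC 7239 format) from the legacy headers, and does nothing when a Forwarded header is already present. It is not the queue-proxy implementation: the function names, the ordering of the host and proto parameters, and the choice to attach them to the first element are assumptions made for this example.

```python
import re

# Characters allowed in an unquoted RFC 7230 token.
_TOKEN = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$")


def _param(name, value):
    """Format name=value, quoting the value when it is not a plain token."""
    if _TOKEN.match(value):
        return "{}={}".format(name, value)
    # Simplification: embedded quotes and backslashes are not escaped.
    return '{}="{}"'.format(name, value)


def forwarded_from_legacy(headers):
    """Build a Forwarded value from X-Forwarded-* headers.

    Returns None when a Forwarded header already exists or when no legacy
    headers are present, so the caller leaves the request untouched.
    """
    # Header names are case-insensitive, so normalize the keys first.
    h = {k.lower(): v for k, v in headers.items()}
    if "forwarded" in h:
        return None

    clients = [c.strip() for c in h.get("x-forwarded-for", "").split(",") if c.strip()]
    host = h.get("x-forwarded-host", "").strip()
    proto = h.get("x-forwarded-proto", "").strip()
    if not (clients or host or proto):
        return None

    # One element per hop. IPv6 literals must be bracketed (and therefore
    # quoted); this assumes the addresses carry no port suffix.
    elements = [_param("for", "[{}]".format(c) if ":" in c else c) for c in clients]

    extras = []
    if host:
        extras.append(_param("host", host))
    if proto:
        extras.append(_param("proto", proto))
    if extras:
        # Attach host and proto to the first element (the original client).
        first = [elements[0]] if elements else []
        elements = [";".join(first + extras)] + elements[1:]

    return ", ".join(elements)
```

A caller would append the returned value as the Forwarded header only when the result is not None.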

Fixes:

  • Enable short names for cluster-local Service without relying on sidecars #3824 (thanks @tcnghia)
  • Better surfacing of ClusterIngress Status #4288 #4144 (thanks @tanzeeb, @nak3)
  • SKS private service uses random names to avoid length limitation #4250 (thanks @vagababov)

Monitoring

Set memory request for zipkin pods #4353 (thanks @sebgoa)

This lowers the memory necessary to schedule the zipkin pod.

Collect /var/log without fluentd sidecar #4156 (thanks @jrbancel)

This allows /var/log collection without the need to load the fluentd sidecar, which is large and significantly increases pod startup time.

Enable queue-proxy metrics scraping by Prometheus. #4111 (thanks @mdemirhan)

The new queue-proxy metrics are now exposed as part of the pod spec, so Prometheus can scrape them.

Fixes:

  • Fix 'Revision CPU and Memory Usage' Grafana dashboard #4106 (thanks @jrbancel)
  • Fix 'Scaling Debugging' Grafana dashboard. #4096 (thanks @jrbancel)
  • Remove embedded jaeger-operator and include as dependency instead #3938 (thanks @objectiser)
  • Fix HTTP request dashboards #4418 (thanks @mdemirhan)

serving - Knative Serving release v0.7.0

Published by knative-prow-releaser-robot over 5 years ago

Meta

serving.knative.dev/v1beta1 (requires K8s 1.14+ due to https://github.com/knative/serving/issues/4533)

  • In 0.6 we expanded our v1alpha1 API to include our v1beta1 fields. In this release, we are contracting the set of fields we store for v1alpha1 to that subset (and disallowing those that don’t fit). With this, we can leverage the “same schema” CRD-conversion supported by Kubernetes 1.11+ to ship v1beta1.

HPA-based scaling on concurrent requests

  • We previously supported using the HPA “class” autoscaler to enable Knative services to be scaled on CPU and Memory. In this release, we are adding support for using the HPA to scale them on the same “concurrent requests” metrics used by our default autoscaler.
  • The HPA does not yet support scaling to zero, and more work is needed to expose these metrics to arbitrary autoscaler plugins, but this is exciting progress!

Non-root containers

  • This release, all of the containers we ship run as a “nonroot” user. This includes the queue-proxy sidecar injected into the user pod. This enables the use of stricter “Pod Security Policies” with knative/serving.

Breaking Changes

  • Previously deprecated status fields are no longer populated.
  • Build and Manual (deprecated in 0.6) are now unsupported.
  • The URLs generated for Route tags by default have changed; see the tagTemplate section below for how to avoid this break.

Autoscaling

Support concurrency-based scaling on the HPA (thanks @markusthoemmes).

Metric scraping and decision-making have been separated out of the Knative internal autoscaler (KPA). The metrics are now also available to the HPA.

Dynamically change autoscaling metrics sample size based on pod population (thanks @yanweiguo).

Depending on how many pods the specific revision has, the autoscaler now scrapes a computed number of pods to gain more confidence in the reported metrics while maintaining scalability.

Fixes:

  • Added readiness probes to the autoscaler #4456 (thanks @vagababov)
  • Adjust activator’s throttling behavior based on activator scale (thanks @shashwathi and @andrew-su).
  • Revisions wait until they have reached “minScale” before they are reported “Ready” (thanks @joshrider).

Core API

Expose v1beta1 API #4199 (thanks @mattmoor)

This release exposes resources under serving.knative.dev/v1beta1.

Non-root containers #3237 (thanks @bradhoekstra and @dprotaso)

This release, all of the containers we ship run as a “nonroot” user. This includes the queue-proxy sidecar injected into the user pod. This enables the use of stricter “Pod Security Policies” with knative/serving.

Allow users to specify their container name #4289 (thanks @mattmoor)

The container name defaults to user-container, which is what we use today. That default may be changed in config-defaults to a Go template with access to the parent resource’s (e.g. Service, Configuration) ObjectMeta fields.

Projected volume support #4079 (thanks @mattmoor)

Based on community feedback, we have added support for mounting ConfigMaps and Secrets via the projected volume type.

Drop legacy status fields #4197 (thanks @mattmoor)

A variety of legacy fields from our v1alpha1 have been dropped in preparation to serve these same objects over v1beta1.

Build is unsupported #4099 (thanks @mattmoor)

As mentioned in the 0.6 release notes, support for just-in-time builds has been removed, and requests containing a build will now be rejected.

Manual is unsupported #4188 (thanks @mattmoor)

As mentioned in the 0.6 release notes, support for manual mode has been removed, and requests containing it will now be rejected.

V1beta1 clients and conformance testing #4369 (thanks @mattmoor)

We have generated client libraries for v1beta1 and have a v1beta1 version of the API conformance test suite under ./test/conformance/api/v1beta1.

Defaulting based conversion #4080 (thanks @mattmoor)

Objects submitted with the old v1alpha1 schema will be upgraded via our “defaulting” logic in a mutating admission webhook.

New annotations for queue-proxy resource limits #4151 (thanks @raushan2016)

The queue.sidecar.serving.knative.dev/resourcePercentage annotation now allows setting the percentage of user container resources to be used for the queue-proxy.
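The arithmetic behind such a percentage can be sketched as follows. This is a simplified illustration rather than the actual queue-proxy sizing code: the function name, the accepted percentage range, and the floor and ceiling values are all hypothetical and are not taken from these notes.

```python
def queue_proxy_cpu_millis(user_cpu_millis, percentage,
                           floor_millis=25, ceiling_millis=100):
    """Return a CPU request (in millicores) for the queue-proxy.

    The request is `percentage` percent of the user container's CPU,
    clamped to [floor_millis, ceiling_millis]. The bounds and the valid
    range for `percentage` are illustrative assumptions.
    """
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be in (0, 100]")
    computed = user_cpu_millis * percentage / 100
    # Clamp so the sidecar is neither starved nor over-provisioned.
    return int(min(max(computed, floor_millis), ceiling_millis))


# A user container requesting 1000m CPU with a 5% annotation value.
print(queue_proxy_cpu_millis(1000, 5))
```

Clamping keeps a very small or very large user container from producing an unusable sidecar request, which is the main reason to pair a percentage with bounds.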

Annotation propagation #4363, #4367 (thanks @vagababov)

Annotations now propagate from the Knative Service object to Route and Configuration.

Fixes:

  • Improve our Ready/Generation handling across resources #4185 (thanks @mattmoor)
  • Fix Revision GC #4187, #4245 (thanks @nak3, @greghaynes)
  • Surface pod schedule errors in Revision #4191 (thanks @shashwathi)
  • Allow container.name in RevisionTemplate #4289 (thanks @mattmoor)
  • Fix pulling older schema 1 manifests #4430 (thanks @jonjohnsonjr)

Test:

  • Add multiple namespace test #4108 (thanks @andrew-su)
  • Separate Conformance tests by type #4145 (thanks @tzununbekov)
  • Upgrade test improvements #4211, #4267 (thanks @jonjohnsonjr)
  • Add conformance test case for user set headers #4411 (thanks @dgerd)

Networking

Reconcile annotations from Route to ClusterIngress #4087 (thanks @vagababov)

This allows the ClusterIngress class annotation to be specified per-Route instead of cluster-wide through a config-network setting.

Introduce tagTemplate configuration #4292 (thanks @mattmoor)

This allows operators to configure the names that are given to the services created for tags in Route.
This also changes the default to transpose the tag and route name, which is a breaking change to the URLs these received in 0.6. To avoid this break, you can set tagTemplate: {{.Name}}-{{.Tag}} in config-network.

Enable use of annotations in domainTemplate #4210 (thanks @raushan2016)

Users can now provide a custom subdomain via the label serving.knative.dev/subDomain.

Allow customizing max allowed request timeout #4172 (thanks @mdemirhan)

This introduces a new config entry max-revision-timeout-seconds in config-defaults to set the max allowed request timeout.

Set Forwarded header on request #4376 (thanks @tanzeeb)

The Forwarded header is constructed and appended to the headers by the queue-proxy if only legacy x-forwarded-* headers are set.

Fixes:

  • Enable short names for cluster-local Service without relying on sidecars #3824 (thanks @tcnghia)
  • Better surfacing of ClusterIngress Status #4288 #4144 (thanks @tanzeeb, @nak3)
  • SKS private service uses random names to avoid length limitation #4250 (thanks @vagababov)

Monitoring

Set memory request for zipkin pods #4353 (thanks @sebgoa)

This lowers the memory necessary to schedule the zipkin pod.

Collect /var/log without fluentd sidecar #4156 (thanks @jrbancel)

This allows /var/log collection without the need to load the fluentd sidecar, which is large and significantly increases pod startup time.

Enable queue-proxy metrics scraping by Prometheus. #4111 (thanks @mdemirhan)

The new queue-proxy metrics are now exposed as part of the pod spec, so Prometheus can scrape them.

Fixes:

  • Fix 'Revision CPU and Memory Usage' Grafana dashboard #4106 (thanks @jrbancel)
  • Fix 'Scaling Debugging' Grafana dashboard. #4096 (thanks @jrbancel)
  • Remove embedded jaeger-operator and include as dependency instead #3938 (thanks @objectiser)
  • Fix HTTP request dashboards #4418 (thanks @mdemirhan)

serving - Knative Serving release v0.6.1

Published by knative-prow-releaser-robot over 5 years ago

Meta

New API Shape

We have approved a proposal for the “v1beta1” API shape for knative/serving. These changes will make the Serving resources much more familiar for experienced Kubernetes users, unlock the power of Route to users of Service, and enable GitOps scenarios with features like “bring-your-own-Revision-name”. We will be working towards this over the next few releases.

In this release we have backported the new API surface to the v1alpha1 API as the first part of the transition to v1beta1 (aka “lemonade”). The changes that will become breaking in 0.7+ are:

  • Service and Configuration will no longer support “just-in-time” Builds.
  • Service will no longer support “manual” mode.

You can see the new API surface in use throughout our samples in knative/docs, but we will continue to support the majority of the legacy surface via v1alpha1 until we turn it down.

Overhauled Scale-to-Zero

We have radically changed the mechanism by which we scale to zero. The new architecture creates a better separation of concerns throughout the Serving resource model with fewer moving parts, and enables us to address a number of long-standing issues (some in this release, some to come). See below for more details.

Auto-TLS (alpha, opt-in)

We have added support for auto-TLS integration! The default implementation builds on cert-manager to provision certificates (e.g. via Let’s Encrypt), but similar to how we have made Istio pluggable, you can swap out cert-manager for other certificate provisioning systems. Currently certificates are provisioned per-Route, but stay tuned for wildcard support in a future release. This feature requires Istio 1.1, and must be explicitly enabled.

Moar Controller Decoupling

We have started to split the “pluggable” controllers in Knative into their own controller processes so that folks looking to replace Knative sub-systems can more readily remove the bundled default implementation. For example, to install Knative Serving without the Istio layer run:

kubectl apply -f serving.yaml \
  -l networking.knative.dev/ingress-provider!=istio

Note that you may see some errors because kubectl does not understand the YAML for Istio objects (even though they are filtered out by the label selector). It is safe to ignore errors such as: no matches for kind "Gateway" in version "networking.istio.io/v1alpha3".

You can also use this to omit the optional Auto-TLS controller based on cert-manager with:

kubectl apply -f serving.yaml \
  -l networking.knative.dev/certificate-provider!=cert-manager

Autoscaling

Moved the Knative PodAutoscaler (aka “KPA”) from using the /scale sub-resource for scaling to a PodScalable “duck type”. This enables us to leverage informer caching, and the expanded contract will enable the ServerlessService (aka “SKS”) to leverage the PodSpec to do neat optimizations in future releases. (Thanks @mattmoor)

We now ensure that our “activator” component has been successfully wired in before scaling a Revision down to zero (aka “positive hand-off”, #2949). This work was enabled by the Revision-managed activation work below. (Thanks @vagababov)

New annotations autoscaling.knative.dev/window, autoscaling.knative.dev/panicWindowPercentage, and autoscaling.knative.dev/panicThresholdPercentage allow customizing the sensitivity of KPA-class PodAutoscalers (#3103). (Thanks @josephburnett)
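A rough sketch of how these three annotations relate to each other is shown below. It assumes that panicWindowPercentage is interpreted as a percentage of the stable window and that panicThresholdPercentage is a percentage of the target. The default values, the seconds-only duration parsing, and the kpa_windows helper itself are assumptions for this example, not the autoscaler's code.

```python
PREFIX = "autoscaling.knative.dev/"


def kpa_windows(annotations, default_window_s=60.0,
                default_panic_pct=10.0, default_threshold_pct=200.0):
    """Derive autoscaling windows from PodAutoscaler annotations.

    Only durations of the form "<number>s" are handled; the defaults are
    illustrative assumptions, not values stated in these notes.
    """
    def number(key, default, suffix=""):
        raw = annotations.get(PREFIX + key)
        if raw is None:
            return default
        if suffix and raw.endswith(suffix):
            raw = raw[:-len(suffix)]
        return float(raw)

    window_s = number("window", default_window_s, suffix="s")
    panic_pct = number("panicWindowPercentage", default_panic_pct)
    threshold_pct = number("panicThresholdPercentage", default_threshold_pct)

    return {
        "stable_window_s": window_s,
        # The panic window is taken as a percentage of the stable window.
        "panic_window_s": window_s * panic_pct / 100.0,
        # Expressed as a ratio, e.g. 200% becomes 2.0.
        "panic_threshold_ratio": threshold_pct / 100.0,
    }


print(kpa_windows({PREFIX + "window": "120s",
                   PREFIX + "panicWindowPercentage": "5.0"}))
```

A shorter panic window or a lower threshold makes the autoscaler react faster to bursts, at the cost of more scaling churn.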

Added tracing to activator to get more detailed and persistently measured performance data (#2726). This fixes #1276 and will enable us to troubleshoot performance issues, such as cold start. (Thanks @greghaynes).

Fixed a scale-to-zero issue with the Istio 1.1 lean installation (#3987) by reducing the idle timeouts in default transports (#3996), which resolves the Kubernetes service not being terminated when the endpoint changes. (Thanks @vagababov)

Resolved an issue that prevented disabling scale-to-zero (#3629). The fix (#3688) takes enable-scale-to-zero from the configmap into account in the KPA reconciler when scaling: if the minScale annotation is not set or is set to 0 and enable-scale-to-zero is set to false, 1 pod is kept as the minimum. (Thanks @yanweiguo)
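The minimum-scale rule described above can be expressed as a small function. This is a sketch of the decision only; the function name and the dictionary-based annotation lookup are hypothetical and do not mirror the reconciler's actual structure.

```python
MIN_SCALE_KEY = "autoscaling.knative.dev/minScale"


def effective_min_scale(annotations, enable_scale_to_zero):
    """Return the lower bound on pods for a revision.

    When scale-to-zero is disabled and minScale is unset or 0, one pod
    is kept as the minimum; otherwise the annotated value is used.
    """
    raw = annotations.get(MIN_SCALE_KEY)
    min_scale = int(raw) if raw is not None else 0
    if min_scale == 0 and not enable_scale_to_zero:
        return 1
    return min_scale


# No annotation, scale-to-zero disabled in the configmap.
print(effective_min_scale({}, enable_scale_to_zero=False))
```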

Fixed an autoscaler bug that caused rash scaling decisions when the autoscaler restarts (#3771). This fixes issues #2705 and #2859. (Thanks @hohaichi)

Core API

We have an approved v1beta1 API shape! As above, we have started down the path to v1beta1 over the next several milestones. This milestone landed the v1beta1 API surface as a supported subset of v1alpha1. See above for more details. (Thanks to the v1beta1 task force for many hours of hard work on this).

We changed the way we perform validation to be based on a “fieldmask” of supported fields. We will now create a copy of each Kubernetes object limited to the fields we support, and then compare it against the original object; this ensures we are deliberate with which resource fields we want to leverage as the Kubernetes API evolves. (#3424, #3779) (Thanks @dgerd). This was extended to cleanup our internal API validations (#3789, #3911) (Thanks @mattmoor).
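The mask-and-compare idea can be sketched with plain dictionaries. The supported-field set below is a hypothetical subset chosen for illustration, and the helper names are not taken from the Knative code base, which operates on typed Kubernetes objects rather than dicts.

```python
import copy

# Hypothetical subset of container fields treated as supported.
SUPPORTED_CONTAINER_FIELDS = {"name", "image", "env", "resources"}


def mask(obj, supported):
    """Return a copy of obj limited to the supported top-level fields."""
    return {k: copy.deepcopy(v) for k, v in obj.items() if k in supported}


def disallowed_fields(obj, supported):
    """Compare obj against its masked copy and list what was dropped."""
    masked = mask(obj, supported)
    if masked == obj:
        return []
    return sorted(k for k in obj if k not in masked)


container = {
    "name": "user-container",
    "image": "example.com/app",   # placeholder image reference
    "lifecycle": {"preStop": {}},  # not in the supported set
}
print(disallowed_fields(container, SUPPORTED_CONTAINER_FIELDS))
```

Because validation is driven by an allow-list, a field added to Kubernetes later is rejected by default until it is deliberately added to the mask.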

status.domain has been deprecated in favor of status.url (#3970), which uses apis.URL for our URL status fields. This resolves the issue "Unable to get the service URL" (#1590). (Thanks @mattmoor)

Added the ability to specify default values for the matrix of {cpu, mem} x {request, limit} via our configmap for defaults. This also removes the previous CPU limit default so that we fall back on the configured Kubernetes defaults unless a value is explicitly specified by the operator. (#3550, #3912) (Thanks @mattmoor)

Dropped the use of the configurationMetadataGeneration label (#4012) (thanks @dprotaso), and wrapped up the last of the changes transitioning us to CRD sub-resources (#643).

Networking

Overhauled the way we scale-to-zero! (Thanks @vagababov) This enables us to have Revisions managing their own activation semantics, implement positive hand-off when scaling to zero, and increase the autoscaling controller’s resync period to be consistent with our other controllers.

Added support for automatically configuring TLS certificates! (Thanks @ZhiminXiang) See above for more details.

We have stopped releasing Istio yamls. It was never our intention for knative/serving to redistribute Istio, and prior releases exposed our “dev”-optimized Istio yamls. Users should consult either the Istio or vendor-specific documentation for how to get a “supported” Istio distribution. (Thanks @mattmoor)

We have started to adopt a flat naming scheme for the named sub-routes within a Service or Route. The old URLs will still work for now, but the new URLs will appear in the status.traffic[*].url fields. (Thanks @andrew-su)

Support the installation of Istio 1.1 (#3515, #3353) (Thanks @tcnghia)

Fixed readiness probes with Istio mTLS enabled (#4017) (Thanks @mattmoor)

Monitoring

Activator now reports request logs (#3781) with check-in (#3927) (Thanks @mdemirhan)

Test and Release

Assorted Fixes

  • The serving.knative.dev/release label now has the release name/number instead of devel (#3626), fixed by exporting TAG to fix our annotation manipulation (#3995). (Thanks @mattmoor)

  • Always install Istio from HEAD for upgrade tests (#3522), fixing errors with upgrade/downgrade testing of Knative (#3506). (Thanks @jonjohnsonjr)

  • Additional runtime conformance test coverage (9 new tests), improvements to existing conformance tests, and v1beta1 coverage. (Thanks @andrew-su, @dgerd, @yt3liu, @mattmoor, @tzununbekov)