antrea

Kubernetes networking based on Open vSwitch

APACHE-2.0 License

Stars
1.6K
Committers
115

antrea - Release v0.13.1

Published by antoninbas over 3 years ago

Fixed

  • Clean up stale IP addresses on Antrea host gateway interface. (#1900, @antoninbas)
    • If a Node leaves and later rejoins a cluster, a new Pod CIDR may be allocated to the Node for each supported IP family and the gateway receives a new IP address (first address in the CIDR)
    • If the previous addresses are not removed from the gateway, we observe connectivity issues across Nodes
  • Update libOpenflow to avoid crash in Antrea Agent for certain Traceflow requests. (#1833, @antoninbas)
  • Fix the deletion of stale port forwarding iptables rules installed for NodePortLocal, occurring when the Antrea Agent restarts. (#1887, @monotosh-avi)
  • Fix output formatting for the "antctl trace-packet" command: the result was displayed as a Go struct variable and newline characters were not rendered, making it hard to read. (#1897, @jianjuns)
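
The gateway address cleanup (#1900) can be pictured with a short Python sketch. It assumes that "first address in the CIDR" means the network address plus one, and the helper names `gateway_ip` and `stale_addresses` are hypothetical; this is an illustration, not the Antrea Agent's code.

```python
import ipaddress


def gateway_ip(pod_cidr):
    """Return the assumed gateway address (network address + 1) as a string."""
    network = ipaddress.ip_network(pod_cidr)
    return str(network[1])


def stale_addresses(configured, pod_cidrs):
    """Return configured addresses that match no current Pod CIDR gateway."""
    expected = {gateway_ip(cidr) for cidr in pod_cidrs}
    return [addr for addr in configured if addr not in expected]


if __name__ == "__main__":
    # The Node rejoined the cluster and received 10.10.1.0/24 instead of 10.10.0.0/24.
    print(stale_addresses(["10.10.0.1", "10.10.1.1"], ["10.10.1.0/24"]))
```

A real cleanup would also have to leave unrelated addresses alone (for example IPv6 link-local addresses), which this sketch does not attempt.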

antrea - Release v0.12.2

Published by antoninbas over 3 years ago

Fixed

  • Ensure that NodePort traffic does not bypass NetworkPolicies. (#1816, @tnqn)
    • NodePort traffic for which ExternalTrafficPolicy is set to Cluster goes through SNAT before NetworkPolicies are enforced; after SNAT the source IP is the IP of the local gateway interface (antrea-gw0)
    • Users will need to define the appropriate NetworkPolicies to allow ingress access to isolated Pods for NodePort traffic
    • This new behavior only applies to Linux Nodes using the OVS system datapath (default)
  • Clean up stale IP addresses on Antrea host gateway interface. (#1900, @antoninbas)
    • If a Node leaves and later rejoins a cluster, a new Pod CIDR may be allocated to the Node for each supported IP family and the gateway receives a new IP address (first address in the CIDR)
    • If the previous addresses are not removed from the gateway, we observe connectivity issues across Nodes
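
For the NodePort fix (#1816), one possible way to build the extra NetworkPolicy is sketched below in Python. It assumes that NodePort traffic may enter through any Node, that each gateway IP is the network address plus one of that Node's Pod CIDR, and that a plain K8s NetworkPolicy with ipBlock peers is sufficient; the function name and arguments are hypothetical.

```python
import ipaddress


def allow_nodeport_ingress(name, namespace, pod_labels, node_pod_cidrs):
    """Build a K8s NetworkPolicy admitting traffic sourced from Node gateway IPs."""
    peers = []
    for cidr in node_pod_cidrs:
        gateway = ipaddress.ip_network(cidr)[1]  # assumed antrea-gw0 address
        # Single-address ipBlock: /32 for IPv4, /128 for IPv6.
        peers.append({"ipBlock": {"cidr": f"{gateway}/{gateway.max_prefixlen}"}})
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": pod_labels},
            "policyTypes": ["Ingress"],
            "ingress": [{"from": peers}],
        },
    }
```

The returned dictionary can be serialized to JSON or YAML and applied with kubectl. Note that such a policy admits any traffic that is SNATed to a gateway IP, not only NodePort traffic.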

antrea - Release v0.11.3

Published by antoninbas over 3 years ago

Fixed

  • Ensure that NodePort traffic does not bypass NetworkPolicies. (#1816, @tnqn)
    • NodePort traffic for which ExternalTrafficPolicy is set to Cluster goes through SNAT before NetworkPolicies are enforced; after SNAT the source IP is the IP of the local gateway interface (antrea-gw0)
    • Users will need to define the appropriate NetworkPolicies to allow ingress access to isolated Pods for NodePort traffic
    • This new behavior only applies to Linux Nodes using the OVS system datapath (default)
  • Clean up stale IP addresses on Antrea host gateway interface. (#1900, @antoninbas)
    • If a Node leaves and later rejoins a cluster, a new Pod CIDR may be allocated to the Node for each supported IP family and the gateway receives a new IP address (first address in the CIDR)
    • If the previous addresses are not removed from the gateway, we observe connectivity issues across Nodes

antrea - Release v0.13.0

Published by antoninbas over 3 years ago

Includes all the changes from 0.12.1.

Added

  • Add NodePortLocal feature to improve integration with external load-balancers. (#1459 #1743 #1758, @monotosh-avi @shubhamavi @hemantavi) [Alpha - Feature Gate: NodePortLocal]
    • Services can be annotated with "nodeportlocal.antrea.io/enabled" to indicate that NodePortLocal should be enabled for this Service's Pod Endpoints
    • For each container port exposed by such a Pod, the Antrea Agent will allocate a local Node port value and traffic sent to this Node port will be forwarded to the container port using DNAT
    • The mapping from allocated Node ports to container ports is stored in a new Pod annotation, "nodeportlocal.antrea.io", e.g. to be consumed by external load-balancers
  • Introduce the ClusterGroup CRD to logically group different network endpoints and reference them together in Antrea-native policies. (#1782, @abhiraut @Dyanngg)
    • The extra level of indirection enables separation between workload selection and policy definition
    • ClusterGroups can be referenced in Antrea ClusterNetworkPolicies, either in the AppliedTo or as peers in policy rules (#1750 #1734)
    • In addition to the Pod / Namespace selectors and ipBlocks, ClusterGroups can reference a Service by name directly, and all Pod Endpoints for this Service will be included in the ClusterGroup (#1797)
    • ClusterGroups can also select ExternalEntities, which are used to represent labelled non-Pod endpoints (#1828)
    • The ClusterGroup CRD includes a Status subresource used to indicate whether the Antrea Controller has already computed the membership list for the group (#1778)
    • New APIs are defined in "controlplane.antrea.tanzu.vmware.com/v1beta2": "/clustergroupmembers" retrieves the list of members of a group and "/groupassociations" retrieves the list of groups that a given endpoint (Pod or ExternalEntity) belongs to (#1688)
  • Add support for containerd runtime on Windows Nodes. (#1781 #1832, @ruicao93) [Windows]
  • Add EndpointSlice support to AntreaProxy. (#1703, @hongliangl) [Alpha - Feature Gate: EndpointSlice]
    • EndpointSlice needs to be enabled in the K8s cluster
    • Only the "discovery.k8s.io/v1beta1" EndpointSlice API is supported
  • Add support for arm/v7 and arm64 by providing Antrea Docker images for these architectures. (#1771, @antoninbas)
    • Refer to the documentation for instructions on how to use the image
  • Support IPv6 packets in Traceflow. (#1579, @gran-vmv)
  • Add the following Prometheus metrics to the AntreaProxy implementation: "antrea_proxy_sync_proxy_rules_duration_seconds", "antrea_proxy_total_endpoints_installed", "antrea_proxy_total_endpoints_updates", "antrea_proxy_total_services_installed", "antrea_proxy_total_services_updates". (#1704, @weiqiangt)
  • Add the following Prometheus metrics to count Status updates for Antrea-native policies: "antrea_controller_acnp_status_updates", "antrea_controller_anp_status_updates". (#1801, @antoninbas)
  • Add support for TLS between the Antrea Agent FlowExporter and the FlowAggregator, using self-signed certificates. (#1649, @zyiou)
  • New Antrea Agent configuration option, "kubeAPIServerOverride", which can be used to explicitly provide an address for the K8s apiserver when the Agent is running as Pod; by default, the Agent uses the ClusterIP for the kubernetes Service. (#1735, @anfernee)
  • Provide ability to configure TLS cipher suites supported by the Antrea apiservers (Agent and Controller). (#1784, @lzhecheng)
  • Add liveness probe to Antrea Controller to ensure it is automatically restarted after a while by kubelet if it stops being responsive. (#1839, @antoninbas)
  • Document workaround to install OVS and Antrea on Windows Nodes for which the CPU does not have the required virtualization capabilities, as may be the case for cloud VMs. (#1744, @ruicao93) [Windows]
  • Improve documentation for "noEncap" and "hybrid" traffic modes, and add information about how to use Kube-router to advertise Pod CIDRs to the fabric with BGP. (#1798, @jianjuns)
  • Add new NetworkPolicy testsuite based on auto-generated test cases. (#1765, @mattfenwick)
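
As an illustration of how an external load-balancer integration might consume the NodePortLocal Pod annotation, here is a minimal Python sketch. The annotation key "nodeportlocal.antrea.io" comes from these notes; the value format (a JSON list of objects with "podPort", "nodeIP" and "nodePort" keys) is an assumption and should be checked against the NodePortLocal documentation for the release in use. The function name is hypothetical.

```python
import json

NPL_ANNOTATION = "nodeportlocal.antrea.io"


def backend_addresses(pod_annotations):
    """Map each container port to the (Node IP, Node port) pair that reaches it.

    Assumes the annotation value is a JSON list of objects with the keys
    "podPort", "nodeIP" and "nodePort" (an assumed format, not confirmed here).
    """
    raw = pod_annotations.get(NPL_ANNOTATION)
    if not raw:
        return {}
    return {
        entry["podPort"]: (entry["nodeIP"], entry["nodePort"])
        for entry in json.loads(raw)
    }
```

A load-balancer controller could then program `nodeIP:nodePort` as the backend for each container port instead of the Pod IP.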

Changed

  • Change permissions for the "/var/run/antrea" directory created by the Antrea Agent on each Node to prevent non-root users from accessing it; among other things, it includes the socket file used to send CNI commands to the Agent. (#1770, @jianjuns)
  • Add multi-table support to the "antctl get ovsflows" command, to dump flows from multiple tables at once. (#1708, @weiqiangt)
  • Change the sanity check performed by the Antrea Agent to validate that the Hyper-V dependency is satisfied. (#1741, @ruicao93)
  • Periodically verify that the static iptables rules required by Antrea are present and install missing rules if any. (#1751, @siddhant94)
  • Update Mellanox/sriovnet dependency to version v1.0.2 to support OVS hardware offload to Mellanox devices with Kernel versions 5.8 and above. (#1845, @Mmduh-483)
  • Remove dependency on juju libraries, which are distributed under an LGPL v3 license. (#1796, @antoninbas)

Fixed

  • Ensure that NodePort traffic does not bypass NetworkPolicies. (#1816, @tnqn)
    • NodePort traffic for which ExternalTrafficPolicy is set to Cluster goes through SNAT before NetworkPolicies are enforced; after SNAT the source IP is the IP of the local gateway interface (antrea-gw0)
    • Users will need to define the appropriate NetworkPolicies to allow ingress access to isolated Pods for NodePort traffic
    • This new behavior only applies to Linux Nodes using the OVS system datapath (default)
  • When clearing the flow-restore-wait config for the OVS bridge after re-installing flows, ensure that the operation happened successfully and retry if anything unexpected happens; if flow-restore-wait is not cleared, the bridge will not forward packets correctly. (#1730, @tnqn)
  • Stop mounting the host's kmod binary to the Antrea initContainer as it may depend on shared libraries not available in the container. (#1777, @antoninbas)
  • Fix crashes in the FlowAggregator, along with numerous spurious warnings, by updating the version of the go-ipfix library. (#1817, @zyiou @srikartati)
  • Fix issues with reference logstash configuration and improve reference Kibana dashboards for flow visualization with the FlowExporter feature. (#1727, @zyiou)

antrea - Release v0.11.2

Published by antoninbas over 3 years ago

Fixed

  • Send necessary updates to Antrea Agents when a Pod's IP address is updated, as otherwise NetworkPolicies are not enforced correctly. (#1808, @Dyanngg @tnqn)
  • On Antrea Agent restart, ensure that OpenFlow priorities are assigned correctly for NetworkPolicy rules, and that rules with the same tier and priority are assigned the same OpenFlow priority. (#1841, @Dyanngg)
  • Do not release the OpenFlow priority assigned to a NetworkPolicy rule in case of a transient error when installing the corresponding flows, if other rules are using the same OpenFlow priority. (#1844, @Dyanngg)
  • Do not delete Endpoint flows when an Endpoint is no longer used for a specific Service (or if a Service is deleted) if these flows are still required by another Service. (#1815, @weiqiangt)
  • Fix bugs in IPv6 AntreaProxy implementation, notably for flow "hairpinning" and ServiceAffinity support. (#1713, @lzhecheng)
  • Support non-standardized CIDRs (CIDRs for which some address bits may not have been masked off as per the prefix length) in NetworkPolicies. (#1767, @tnqn)
  • Fix minimum required Linux Kernel version (4.6) in documentation. (#1757, @hongliangl)
  • Fix Agent crash when creating an Antrea-native policy with a "drop" action, while the NetworkPolicyStats feature is enabled. (#1606, @ceclinux)
  • Fix Traceflow when Antrea-native policies are created with a "drop" action. (#1602, @gran-vmv @lzhecheng)
  • Fix Agent crash when enabling NetworkPolicyStats and Traceflow feature together and creating an Antrea-native policy with a "drop" action. (#1615, @tnqn)
  • When the destination is a Service in a Traceflow request, do not overwrite the default TCP SYN flag (needed for the packet to be processed by AntreaProxy correctly) unless the user explicitly provided a non-zero value. (#1602, @gran-vmv @lzhecheng)
  • Improve handling of transient OVS errors when installing flows for policy rules in the Agent, by ensuring that retries are executed correctly. (#1667, @tnqn)
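
The "non-standardized CIDR" case from #1767 (host bits set beyond the prefix length) can be illustrated with Python's standard `ipaddress` module. This is only a sketch of the normalization idea; `normalize_cidr` is a hypothetical helper, and Antrea's own handling is implemented in Go.

```python
import ipaddress


def normalize_cidr(cidr):
    """Mask off the host bits so that 10.0.0.5/24 becomes 10.0.0.0/24."""
    # strict=False accepts CIDRs whose host bits are set and masks them off.
    return str(ipaddress.ip_network(cidr, strict=False))


def is_standard_cidr(cidr):
    """Return True if no address bits are set beyond the prefix length."""
    try:
        ipaddress.ip_network(cidr, strict=True)
    except ValueError:
        return False
    return True
```

With this sketch, "10.0.0.5/24" is reported as non-standard and normalizes to "10.0.0.0/24", which selects the same set of addresses.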

antrea - Release v0.12.1

Published by antoninbas over 3 years ago

Changed

Fixed

  • Send necessary updates to Antrea Agents when a Pod's IP address is updated, as otherwise NetworkPolicies are not enforced correctly. (#1808, @Dyanngg @tnqn)
  • On Antrea Agent restart, ensure that OpenFlow priorities are assigned correctly for NetworkPolicy rules, and that rules with the same tier and priority are assigned the same OpenFlow priority. (#1841, @Dyanngg)
  • Do not release the OpenFlow priority assigned to a NetworkPolicy rule in case of a transient error when installing the corresponding flows, if other rules are using the same OpenFlow priority. (#1844, @Dyanngg)
  • Do not delete Endpoint flows when an Endpoint is no longer used for a specific Service (or if a Service is deleted) if these flows are still required by another Service. (#1815, @weiqiangt)
  • Fix AntreaProxy implementation on Windows for ClusterIP Services with endpoints outside of the cluster's Pod CIDR, by ensuring that SNAT is performed correctly. (#1824, @ruicao93) [Windows]
  • More robust error handling for network adapter operations on Windows; in particular add a retry mechanism if enabling the network adapter fails. (#1736, @ruicao93) [Windows]
  • When the Antrea Agent process is run using the provided PowerShell script, ensure that the Kubeconfigs used by the Agent to connect to the K8s and Antrea Controller apiservers are updated on every restart. (#1847, @ruicao93) [Windows]
  • Fix bugs in IPv6 AntreaProxy implementation, notably for flow "hairpinning" and ServiceAffinity support. (#1713, @lzhecheng)
  • Support non-standardized CIDRs (CIDRs for which some address bits may not have been masked off as per the prefix length) in NetworkPolicies. (#1767, @tnqn)
  • Fix minimum required Linux Kernel version (4.6) in documentation. (#1757, @hongliangl)
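
The invariant behind #1841 (rules with the same tier and priority share one OpenFlow priority, including after an Agent restart) can be sketched as follows. This is not Antrea's allocation algorithm: the base value 65000, the tuple layout, and the assumption that lower tier / policy priority values take precedence are all illustrative.

```python
def assign_openflow_priorities(rules, top=65000):
    """Assign one OpenFlow priority per distinct (tier, priority) pair.

    rules: iterable of (rule_name, tier_priority, policy_priority) tuples.
    Lower tier / policy priority values are assumed to take precedence,
    so they receive the numerically highest OpenFlow priorities.
    """
    keys = sorted({(tier, prio) for _, tier, prio in rules})
    of_priority = {key: top - offset for offset, key in enumerate(keys)}
    return {name: of_priority[(tier, prio)] for name, tier, prio in rules}
```

Because the result depends only on the set of (tier, priority) pairs, replaying the same rules in a different order, as may happen after a restart, yields the same assignment in this sketch.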

antrea - Release v0.12.0

Published by antoninbas almost 4 years ago

Includes all the changes from 0.11.1.

Added

  • Add support for rule-level AppliedTo for Antrea-native policies. (#1396, @Dyanngg)
    • Ability to select different endpoints on which to apply the different rules within the same policy, without having to define multiple policies
    • For a given policy, either the policy-level AppliedTo field must be used, or the rule-level AppliedTo fields
  • Add support for port ranges in the rules of Antrea-native policies. (#1557, @GraysonWu)
  • Introduce the FlowAggregator, an IPFIX mediator implementation to collect, process and export flow records generated by the Antrea Agents. (#1671 #1677, @srikartati @dreamtalen @zyiou)
    • Built using the go-ipfix library
    • Flow records exported by the FlowAggregator are not missing any K8s contextual information (e.g. source / destination Pod names)
    • It is recommended to always deploy the FlowAggregator when using the FlowExporter feature, as opposed to sending records directly from the Agent to a third-party collector
    • Refer to the Flow Exporter documentation for more information
  • Add ability to sort by "effective priority" when listing internal NetworkPolicy resources (computed by the Controller) with antctl: priorities are sorted in the effective order in which they are enforced. (#1530, @Dyanngg)
  • Add support for IPv6 to the FlowExporter implementation in the Agent. (#1677, @lzhecheng @antoninbas @srikartati)
    • Support for IPv6 IPFIX Information Elements in exported flow records
    • Agent can export flows to an IPFIX collector over IPv6
    • However, FlowAggregator is still missing support for IPv6
  • Add support for generating an Antrea manifest which is compatible with K8s 1.15 clusters (by default, Antrea requires K8s >= 1.16). (#1664, @guesslin)
    • This can be done by running the hack/generate-manifest.sh script with the "--k8s-1.15" flag
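
The either/or constraint on policy-level and rule-level AppliedTo (#1396) can be expressed as a small validation check. The Python sketch below operates on a policy spec loaded as a dictionary; the field names "appliedTo", "ingress" and "egress" are assumed to match the CRD, and the exact rule (every rule must set AppliedTo when the policy does not) is one interpretation of the note above.

```python
def applied_to_is_consistent(spec):
    """Check that AppliedTo is set at the policy level or on every rule, not both."""
    rules = spec.get("ingress", []) + spec.get("egress", [])
    policy_level = bool(spec.get("appliedTo"))
    rule_level = [bool(rule.get("appliedTo")) for rule in rules]
    if policy_level:
        # Policy-level AppliedTo: no rule may define its own.
        return not any(rule_level)
    # No policy-level AppliedTo: every rule must define one.
    return bool(rules) and all(rule_level)
```

Antrea itself enforces such constraints on the server side; a check like this is only useful as a client-side lint before applying manifests.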

Changed

  • Update the priority of the default Tiers, to space them out more evenly and to provide more room for user-defined Tiers with higher priority than Emergency. (#1665, @abhiraut)
    • This change will impact users who use custom Tiers in addition to the default Tiers, as the relative priorities between Tiers may change and impact the order in which Antrea-native policies are enforced
    • Impacted users will need to recreate their custom Tiers with updated priority values after upgrading Antrea to restore the enforcement order of their policies
  • Switch to VMware Harbor registry (projects.registry.vmware.com) for all user-facing Docker images, in response to new Docker Hub rate limits. (#1617, @antoninbas @lzhecheng)
    • When applying one of the official Antrea manifests, the Antrea Docker images will be pulled from projects.registry.vmware.com
  • Use ~/.kube/config as the default location of the Kubeconfig file in the Antrea Octant plugin: this gives a better user experience when running Octant and the plugin as a process (as opposed to running them as a Pod). (#1662, @mengdie-song)
  • Set OVS max revalidator delay to 200ms (instead of 500ms): this reduces the delay before a learned flow is installed in the OVS datapath and improves the quality of the SessionAffinity implementation in AntreaProxy. (#1584, @antoninbas)
  • Add more load-balancing information for Service traffic (destination Pod name and IP) in the generated Traceflow graph in Octant when applicable. (#1607, @ZhangYW18)
  • Clean up OVS flows in charge of SNAT in Windows Agent implementation. (#1453, @jianjuns) [Windows]
  • Make the OVS flows in charge of L2/L3 forwarding more uniform across different traffic cases. (#1594, @jianjuns)
  • Auto-generate listers and informers for AntreaAgentInfo and AntreaControllerInfo CRDs to facilitate consumption by other projects. (#1612, @liu4480)
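
To see why the Tier priority change (#1665) can reorder enforcement for custom Tiers, consider the Python sketch below. The priority values are purely illustrative and are not Antrea's actual defaults before or after the change; the only assumption taken from Antrea's model is that a lower priority value means earlier enforcement.

```python
def enforcement_order(tier_priorities):
    """Return Tier names ordered from first-enforced to last-enforced."""
    return [name for name, _ in sorted(tier_priorities.items(), key=lambda kv: kv[1])]


# Illustrative values only, not the real defaults.
before_upgrade = {"Emergency": 10, "my-custom-tier": 15, "SecurityOps": 20}
after_upgrade = {"Emergency": 100, "my-custom-tier": 15, "SecurityOps": 200}

if __name__ == "__main__":
    print(enforcement_order(before_upgrade))
    print(enforcement_order(after_upgrade))
```

In this example the custom Tier sits between the two default Tiers before the upgrade and ahead of both afterwards, which is the kind of shift that requires recreating custom Tiers with new priority values.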

Fixed

  • Fix Agent crash when creating an Antrea-native policy with a "drop" action, while the NetworkPolicyStats feature is enabled. (#1606, @ceclinux)
  • Fix Traceflow when Antrea-native policies are created with a "drop" action. (#1602, @gran-vmv @lzhecheng)
  • Fix Agent crash when enabling NetworkPolicyStats and Traceflow feature together and creating an Antrea-native policy with a "drop" action. (#1615, @tnqn)
  • Do not try to remove existing IP addresses from the Antrea OVS bridge on Windows before assigning the correct one, as there may not be any, which would cause an error. (#1660, @ruicao93) [Windows]
  • When the destination is a Service in a Traceflow request, do not overwrite the default TCP SYN flag (needed for the packet to be processed by AntreaProxy correctly) unless the user explicitly provided a non-zero value. (#1602, @gran-vmv @lzhecheng)
  • Do not decrement the IP TTL field during L3 forwarding if the packet entered the OVS pipeline from the local gateway. (#1436, @wenyingd @dumlutimuralp)
  • Improve handling of transient OVS errors when installing flows for policy rules in the Agent, by ensuring that retries are executed correctly. (#1667, @tnqn)

antrea - Release v0.11.1

Published by antoninbas almost 4 years ago

Fixed

  • Fix SessionAffinity implementation in AntreaProxy: the timeout value was not honored correctly and flows were not updated correctly when the SessionAffinity type changed. (#1576, @antoninbas)
  • Ensure that AntreaProxy deletes stale flows when a Service's port number changes. (#1576, @antoninbas)
  • Fix networkPolicyOnly traffic mode and support for AKS and EKS by ensuring that the proper criteria are used when determining whether to install IPv4 flows and / or IPv6 flows. (#1585 #1575, @antoninbas @Dyanngg)
  • Ensure backwards-compatibility of "controlplane.antrea.tanzu.vmware.com" for older Agents using the v1beta1 API version to communicate with a new Controller which defaults to v1beta2. (#1586, @tnqn)
    • During upgrade from 0.10.x to 0.11.0, NetworkPolicy enforcement was broken for older Agents (0.10.x) because of an API change
    • Upgrading from 0.10.x to 0.11.1 or from 0.11.0 to 0.11.1 is supported without disruption
  • Mutate empty "tier" field in Antrea-native policies to the default "Application" tier to ensure that the correct tier is reported when dumping policies (e.g. with kubectl). (#1567, @abhiraut)

antrea - Release v0.11.0

Published by antoninbas almost 4 years ago

Includes all the changes from 0.10.1 and 0.10.2.

The AntreaProxy feature is graduated from Alpha to Beta and is therefore enabled by default.

The Traceflow feature is graduated from Alpha to Beta and is therefore enabled by default.

Support for Prometheus metrics is graduated from Alpha to Beta and Antrea metrics are therefore exposed by default.

Added

  • Support for IPv6 and dual-stack clusters. (#1518 #1102, @wenyingd @lzhecheng @mengdie-song @ksamoray) [Alpha]
    • Note that the FlowExporter feature does not support IPv6 and should not be enabled in clusters where IPv6 addresses are used
  • Add "status" field to the Antrea-native policy CRDs to report the realization status of policies (how many Nodes are currently enforcing the policy). (#1442, @tnqn)
    • Each Agent reports its status using an internal API in "controlplane.antrea.tanzu.vmware.com" and everything is aggregated by the Controller which updates the "status" field
  • Support for audit logging for Antrea-native policy rules: logging can now be enabled for individual rules with the "enableLogging" field and logs will be written in human-readable format to "/var/log/antrea/networkpolicy/np.log" on the Node's filesystem. (#1216, @qiyueyao)
  • Add "name" field for individual rules in Antrea-native policy CRDs and auto-generate rule names when they are not provided by the user. (#1330 #1451, @GraysonWu)
  • Add "baseline" tier for Antrea-native policies: policies in that tier are enforced after (i.e. with a lower precedence than) K8s NetworkPolicies. (#1450, @Dyanngg)
  • Add support for Antrea-native policies to the "antctl get netpol" command. (#1301, @GraysonWu)
  • Add config option to disable SNAT for Pod-to-External traffic in noEncap mode, in case the Pod CIDR is routable in the Node network. (#1394, @jianjuns)
  • Add NetworkPolicy information (Namespace and Name of the NetworkPolicy allowing the connection) to the IPFIX flow records exported by the Agent when FlowExporter is enabled. (#1268, @srikartati)
  • Support for the FlowExporter feature for Windows Nodes. (#1321, @dreamtalen) [Windows]
  • Add support for Pod Traffic Shaping by leveraging the upstream bandwidth plugin, maintained by the CNI project. (#1414, @tnqn)
  • Add "antctl log-level" command to change log verbosity of a specific Antrea Agent or of the Controller at runtime; it invokes the "/loglevel" API. (#1340, @jianjuns)
  • Introduce the "antctl proxy" command, which gives antctl the ability to operate as a reverse proxy for the Antrea API, in order to simplify troubleshooting and profiling Antrea. (#1452, @antoninbas)
  • Support for providing a list of Node names when generating a support bundle with antctl. (#1267, @weiqiangt)
  • Additional documentation:

Changed

  • Upgrade the "controlplane.antrea.tanzu.vmware.com" API to v1beta2; the Antrea Controller still serves version v1beta1 of the API which is now deprecated. (#1467, @Dyanngg @tnqn)
    • Internal NetworkPolicy objects in "controlplane.antrea.tanzu.vmware.com/v1beta2" are cluster-scoped instead of Namespace-scoped and collisions between Antrea-native policies and K8s policies are no longer possible
  • Upgrade the "core.antrea.tanzu.vmware.com" API to v1alpha2 and remove the v1alpha1 version. (#1467, @Dyanngg)
  • Remove deprecated Prometheus metrics "antrea_agent_runtime_info" and "antrea_controller_runtime_info". (#1503, @srikartati)
  • Remove unnecessary writes to "send_redirects" Kernel parameters in the Agent; in theory antrea-agent no longer needs to be run as a "privileged" container, although it is recommended to keep doing so for the FlowExporter feature. (#1364, @tnqn)
  • Do not track Geneve / VXLAN overlay traffic in the host network; this improves data-plane performance when kube-proxy installs a large number of iptables rules. (#1425, @tnqn)
  • Optimize OpenFlow priority assignment in the Agent when converting policies to flows, by assigning all the rule priorities for a given policy in batch. (#1331, @Dyanngg)
  • Upgrade Octant to v0.16.1 and leverage support for "alerts" in the UI to display error messages to users when Traceflow request parameters are invalid or when an error occurs. (#1371, @ZhangYW18)
  • More robust script for preparing Windows Nodes before running the Antrea Agent. (#1480, @ruicao93)
  • Remove dependency on the serviceCIDR configuration parameter in the FlowExporter implementation, when AntreaProxy is enabled. (#1380, @srikartati)
  • Cache mapping from OVS flow ID to original NetworkPolicy in the Agent for a small time interval after the flow has been deleted, to ensure the information remains accessible when generating stats reports or flow records. (#1411, @srikartati)
  • Officially-supported Go version is no longer 1.13 but 1.15. (#1420, @antoninbas)

Fixed

  • Support for Antrea-native policies in Traceflow: without this change all the Traceflow requests would time out and fail. (#1361, @gran-vmv)
  • Use 32-bit unsigned integers for timestamps in flow records instead of 64-bit signed integers, as per the IPFIX RFC. (#1479, @zyiou)
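
The timestamp fix (#1479) concerns the on-the-wire width of second-resolution timestamps. In IPFIX, the dateTimeSeconds type is a 32-bit unsigned count of seconds since the UNIX epoch, in network byte order. The Python sketch below shows that encoding with the standard `struct` module; the function names are hypothetical and the go-ipfix library used by Antrea is not involved.

```python
import struct


def encode_datetime_seconds(epoch_seconds):
    """Encode a timestamp as a 4-byte big-endian unsigned integer."""
    # "!I" is network byte order, unsigned 32-bit; struct.error is raised
    # for negative values or values that do not fit in 32 bits.
    return struct.pack("!I", epoch_seconds)


def decode_datetime_seconds(data):
    """Decode a 4-byte big-endian unsigned integer back to epoch seconds."""
    (value,) = struct.unpack("!I", data)
    return value
```

A collector expecting 4-byte fields would misparse records in which the same fields are written as 8-byte signed integers, which is why the width matters.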

antrea - Release v0.10.2

Published by antoninbas almost 4 years ago

Added

  • Use logrotate to rotate OVS log files written to the Node and avoid filling up the disk partition; log rotation can be configured by changing the "--log_file_max_num" and "--log_file_max_size" command-line arguments for "start_ovs" in the Antrea manifest. (#1329, @jianjuns)

Changed

  • Update Octant plugin installation guide to simplify the steps when deploying Octant as a Pod. (#1473, @mengdie-song)

Fixed

  • Use IP DSCP field instead of Geneve TLV metadata to encode the Traceflow data-plane tag. (#1466, @gran-vmv)
    • This works around an OVS issue which was causing inter-Node Traceflow requests to frequently hang unless no other traffic was present in the cluster network
    • Traceflow can now be used regardless of the traffic mode: this includes other tunneling protocols (e.g. VXLAN) and noEncap mode
  • Update version of libOpenflow to fix a deadlock when an OpenFlow bundle times out, which was causing the Node to run out of Pod IPs; the issue was introduced in v0.10.0. (#1511, @weiqiangt @tnqn)
  • Do not fail Agent initialization if xtables lock cannot be acquired within a short amount of time, as it only creates more xtables lock contention and prevents Pods from being created. (#1497, @tnqn)
  • Bump up portmap CNI plugin version to 0.8.7 to further reduce the xtables lock contention. (#1534, @tnqn)
  • When a new Node is allocated the same Pod CIDR as a recently-deleted Node by the K8s control-plane, do not process the Node creation event in the Antrea Agent until after the deletion event for the old Node has been processed. (#1526, @tnqn)
  • Fix SessionAffinity implementation in AntreaProxy for non-TCP traffic (UDP & SCTP): the match defined in the learn action was incorrect as the transport protocol was hardcoded to TCP. (#1398, @wenyingd)
  • Respect the provided label selector in Antrea aggregated APIs instead of always returning the complete list of objects for each resource type. (#1481, @tnqn)
  • When the destination is a Service in a Traceflow request, automatically set the TCP SYN flag so the packet can be processed by AntreaProxy correctly. (#1386 #1378, @lzhecheng @mengdie-song)
  • Ignore Antrea-native policy resources in the Agent if the AntreaPolicy feature is not enabled, to avoid crashes. (#1336, @jianjuns)
  • When removing Service flows in AntreaProxy, remove Endpoint flows at the very end to avoid "infinite" packet recirculation in some scenarios. (#1381, @weiqiangt)
  • Set OVS version after the ovs-vswitchd service is started in the Windows installation script to ensure it can always be set successfully. (#1423, @ruicao93 @jayunit100) [Windows]
  • Ensure that the "appliedTo" and "priority" fields are required in the OpenAPI spec for the ClusterNetworkPolicy CRD. (#1359, @abhiraut)
  • Always restart OVS services on Windows in case of failure. (#1495, @ruicao93) [Windows]
  • Validate the Agent configuration on startup and log an error message if any enabled feature is not supported by the OS (in particular on Windows Nodes). (#1468, @jianjuns)
  • Add sanity checks for IPsec and log helpful error messages if some packages or components are missing. (#1430, @antoninbas)
  • Fix reference Kibana dashboard configuration file for FlowExporter feature: some IPFIX IE names did not match the names from the Antrea registry. (#1370, @zyiou)
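
Regarding the Traceflow change (#1466): the DSCP field occupies the upper 6 bits of the IPv4 ToS (or IPv6 Traffic Class) byte, with the lower 2 bits used for ECN, so a data-plane tag carried in DSCP is limited to 64 values. The Python sketch below shows the bit manipulation only; how Antrea allocates and matches tag values is not covered here, and the function names are hypothetical.

```python
DSCP_BITS = 6
ECN_BITS = 2
ECN_MASK = (1 << ECN_BITS) - 1


def set_dscp(tos, tag):
    """Return the ToS byte with its DSCP bits replaced by tag, ECN bits preserved."""
    if not 0 <= tag < (1 << DSCP_BITS):
        raise ValueError("tag must fit in 6 bits (0-63)")
    return (tag << ECN_BITS) | (tos & ECN_MASK)


def get_dscp(tos):
    """Extract the DSCP value (upper 6 bits) from a ToS byte."""
    return (tos & 0xFF) >> ECN_BITS
```

Since the ToS byte is part of the IP header itself, a tag encoded this way survives regardless of the tunneling protocol, which is consistent with Traceflow no longer depending on Geneve.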

antrea - Release v0.10.1

Published by antoninbas about 4 years ago

Fixed

  • Fix OpenAPI spec for the ClusterNetworkPolicy CRD: the incorrect spec was causing all CNPs with egress rules to be rejected by kubectl and the K8s apiserver. (#1314, @abhiraut)
    • This only affects users who enable the AntreaPolicy Feature Gate in their cluster and create ClusterNetworkPolicies

antrea - Release v0.10.0

Published by antoninbas about 4 years ago

Includes all the bug fixes from 0.9.1, 0.9.2 and 0.9.3.

Starting with Antrea 0.10.0, K8s version >= 1.16 is required.

Added

  • Add Antrea NetworkPolicy CRD API to define namespaced security policies which support additional features compared to K8s NetworkPolicies. (#1117 #1194, @Dyanngg @abhiraut) [Alpha - Feature Gate: AntreaPolicy]
    • The ClusterNetworkPolicy Feature Gate has been removed, AntreaPolicy is used for both Antrea NetworkPolicies and ClusterNetworkPolicies
    • Refer to the Antrea Policy CRDs documentation for information
  • Add "v1alpha1.stats.antrea.tanzu.vmware.com" API to query traffic statistics about NetworkPolicies (number of sessions / packets / bytes which are allowed or denied). (#1172 #1221 #1140, @tnqn @weiqiangt) [Alpha - Feature Gate: NetworkPolicyStats]
    • The stats are aggregated from each Antrea Agent using an internal API in "controlplane.antrea.tanzu.vmware.com"
  • Add ability for users to define their own policy tiers using a Tier CRD. (#926 #1237 #1260 #1290, @abhiraut @Dyanngg)
    • The 5 static tiers introduced in 0.9.x are mapped to read-only CRDs, in order to provide backwards-compatibility for clusters with existing tiered policies
    • Admission webhooks ensure consistency across Tiers, NetworkPolicies and ClusterNetworkPolicies
    • Refer to the Antrea Policy CRDs documentation for information
  • Support for ExternalEntity: rules in Antrea policies can select labelled non-Pod endpoints (e.g. VMs) which are represented by ExternalEntity CRD resources. (#1084, @Dyanngg @suwang48404)
  • Support for querying the list of NetworkPolicies which are applied to a specific Pod, or which select a specific Pod in an ingress / egress rule. (#1116, @jakesokol1 @antoninbas) [Alpha]
    • New "/endpoint" API endpoint in Antrea Controller - API may change in future releases
    • New "antctl query endpoint" command
  • Add Prometheus metrics for the connection tracking table (max size, total number of connections, total number of connections installed by Antrea) when FlowExporter is enabled. (#1232, @dreamtalen)
  • Configure access to Antrea NetworkPolicy and ClusterNetworkPolicy APIs for default cluster roles (admin / edit / view) using aggregated ClusterRoles. (#1206, @abhiraut)
  • Configure access to Traceflows API for default cluster roles (admin / edit / view) using aggregated ClusterRoles. (#1231, @abhiraut)

Changed

  • Re-introduce legacy "networking.antrea.tanzu.vmware.com" internal API group which was previously removed in 0.9.3, to avoid upgrade issues. (#1243, @tnqn)
    • Users can safely upgrade from any 0.9.x release to 0.10.0 without disruption in NetworkPolicy enforcement, assuming the Antrea Controller is upgraded first
  • Use the v1 version of "apiextensions.k8s.io" instead of "v1beta1"; v1 was introduced in K8s 1.15. (#1009, @abhiraut)
    • As part of this, the OpenAPI spec used for validation was improved for several of the Antrea CRDs
  • Use the v1 version of "rbac.authorization.k8s.io" instead of v1beta1; v1 was introduced in K8s 1.8. (#1274, @abhiraut)
  • Change type of some Prometheus metrics from "summary" to "histogram", which may impact consumers of these metrics; the metrics were incorrectly tagged as "STABLE" when they were first introduced. (#1202, @dreamtalen)
  • Deprecate "antrea_agent_runtime_info" and "antrea_controller_runtime_info" metrics, which will be removed in 0.11; the same information can now be obtained from the instance label of the target. (#1217, @srikartati)
  • Upgrade OVS version to 2.14.0 to pick up some recent patches. (#1121, @lzhecheng)
  • Collect additional information in support bundle. (#1145, @wenyingd)
    • OVS logs, kubelet logs and host network configuration on Windows Nodes [Windows]
    • Description of the ports associated with the OVS bridge
  • Restrict read permissions for the OVSDB file persisted on each Node. (#1293, @antoninbas)
  • Add more consistent short names for Antrea NetworkPolicies ("anp") and ClusterNetworkPolicies ("acnp"). (#1291, @abhiraut)
  • Add reference to the original user-defined policy object in the internal representation of policies computed by the Antrea Controller and served through the "controlplane.antrea.tanzu.vmware.com" internal API. (#1258, @tnqn)
  • Remove dependency on "github.com/goccy/go-graphviz" in the Traceflow UI implementation: usage of cgo was creating issues when cross-compiling assets and some of the module's dependencies were distributed under copyleft licenses. (#1127, @ZhangYW18)
  • Remove serviceCIDR Agent configuration parameter from Antrea manifests destined to public cloud K8s services (AKS, EKS, GKE) to avoid confusion: AntreaProxy is always enabled for those, which means that the parameter is not needed and will be ignored if provided. (#1177, @jianjuns)
  • Add status message in Traceflow UI for running Traceflow requests. (#1277, @ZhangYW18)
  • Optimize flow priority assignment for Antrea Policies when the Agent restarts. (#1105, @Dyanngg)

Fixed

  • Periodically check timeout of running Traceflow requests to provide a useful status to users and avoid leaking data-plane tags. (#1179, @jianjuns)
antrea - Release v0.9.3

Published by antoninbas about 4 years ago

Changed

  • Rename internal API group from "networking.antrea.tanzu.vmware.com" to "controlplane.antrea.tanzu.vmware.com". (#1147, @jianjuns)
    • This API is served by the Antrea Controller and consumed by Agents (directly) and antctl (through the K8s apiserver using an APIService)
    • Antrea Controller deletes the previous APIService on startup to avoid issues (e.g. with Namespace deletion)
    • During upgrade from a previous version, NetworkPolicy enforcement will be disrupted until the upgrade is complete: NetworkPolicy changes may not take effect and NetworkPolicies may not be applied to new Pods, until all components have been updated

Fixed

  • Fix IPsec support which was broken after updating the base distribution to Ubuntu 20.04 for the Antrea Docker image, as this update introduced a more recent version of strongSwan. (#1184 #1191, @jianjuns)
  • Fix deadlock in the NetworkPolicy implementation in the Antrea Agent: this issue could only be observed when using ClusterNetworkPolicies but was affecting the enforcement of all NetworkPolicies. (#1186, @Dyanngg @yktsubo @tnqn)
  • Fix unbound variable error in "start_ovs" Bash script, which was causing the antrea-ovs container to crash if one OVS daemon stopped for any reason. (#1190, @antoninbas @alex-vmw)
antrea - Release v0.9.2

Published by antoninbas about 4 years ago

Fixed

  • Fix incorrect conversion from unsigned integer to string when indexing the flows responsible for the implementation of a NetworkPolicy rule by their conjunction ID / rule ID; this issue could have caused incorrect NetworkPolicy enforcement when a large number of rules are applied to a Node. (#1161, @weiqiangt)
  • Fix self-signed certificate rotation in the Antrea Controller: after rotation (at half the expiration time), the new certificate was distributed to clients while the Controller apiserver kept using the old certificate. (#1154, @MatthewHinton56)
  • Support setting TCP flags when initiating a Traceflow request from antctl; for Pod-to-Service trace packets, the SYN flag must be set. (#1128, @lzhecheng)
  • Generate correct filename for support bundle archive temporary file: on Windows the name included an asterisk which is invalid. (#1150, @weiqiangt) [Windows]
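
The integer-to-string fix above (#1161) belongs to a well-known class of Go pitfalls. The sketch below only illustrates that class; it is not the actual Antrea code, and `buggyKey` and `ruleKey` are hypothetical helpers. In Go, converting an integer directly to a string yields the UTF-8 encoding of that code point rather than its decimal digits, and values outside the valid Unicode range all become U+FFFD, so distinct large IDs can end up with the same index key.

```go
package main

import (
	"fmt"
	"strconv"
)

// buggyKey mimics the pitfall: a direct integer-to-string conversion
// produces the UTF-8 encoding of the code point, not decimal digits.
func buggyKey(id uint32) string {
	return string(rune(id))
}

// ruleKey (hypothetical helper) formats the ID as decimal digits,
// which gives a unique key for every ID.
func ruleKey(id uint32) string {
	return strconv.FormatUint(uint64(id), 10)
}

func main() {
	// Both values are above the largest valid code point (0x10FFFF),
	// so the direct conversion maps each of them to U+FFFD.
	a, b := uint32(0x110000), uint32(0x110001)
	fmt.Println("buggy keys collide:", buggyKey(a) == buggyKey(b))
	fmt.Println("decimal keys:", ruleKey(a), ruleKey(b))
}
```

Small IDs do not collide under the buggy conversion, which would be consistent with the issue only showing up when a large number of rules are applied.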
antrea - Release v0.9.1

Published by antoninbas about 4 years ago

Changed

  • Rotate self-signed certificate generated by the Antrea Controller at half the expiration time, instead of one day before expiration. (#1115, @andrewsykim)
  • Collect heap profile data in Antrea support bundle to help troubleshoot issues related to memory usage. (#1110, @weiqiangt)

Fixed

  • Optimize processing of egress policy rules that do not include any named port by avoiding the creation and distribution of a "global" AddressGroup - which includes all the Pods - when unnecessary. (#1100, @tnqn)
  • Avoid duplicate processing of Traceflow requests in the Antrea Controller and fix data-plane tag allocation. (#1094, @jianjuns)
  • Work around race condition in github.com/containernetworking/plugins when determining the network namespace of the caller which was responsible for errors when configuring Pod networking at scale. (#1131, @tnqn)
  • Fail the CNI ADD request if the OF port value returned by OVS is -1, which indicates an error during interface creation. (#1112, @tnqn)
  • Resubmit traffic for which Antrea Proxy has performed DNAT to the correct table so that ClusterNetworkPolicies can be enforced correctly. (#1119, @weiqiangt @yktsubo)
  • Update Windows OVS package so that the dependency on Microsoft Visual C++ can be resolved during installation. (#1099, @ruicao93) [Windows]
  • Temporarily ignore sanity checks when issuing a Traceflow request from the Octant UI since the current version of Octant does not support reporting the errors to the user; instead the Traceflow CRD is created and its "Status" field can be used to troubleshoot. (#1097, @ZhangYW18)
  • Revert all priority updates to policy flows if flow installation / modification fails on OVS. (#1095, @Dyanngg)
  • Fix the Antrea manifest for EKS (antrea-eks.yml) published for each release. (#1090, @antoninbas)
antrea - Release v0.9.0

Published by antoninbas about 4 years ago

Added

  • Add flow exporter feature. [Alpha - Feature Gate: FlowExporter]
    • Support sending network flow records using the IPFIX protocol from each Agent (#825 #984, @srikartati)
    • Add reference cookbook to visualize exported flows using Elastic Stack (#836, @zyiou)
  • Support OVS hardware offload for Pod networking: Pods can now be assigned an SR-IOV Virtual Function. (#786, @moshe010)
    • Add new CI job to validate the hardware offload functionality (@AbdYsn)
  • Support Node MTU auto-discovery in the Antrea Agent; the user can still override this value in the Agent configuration if desired. (#909, @reachjainrahul)
  • Enable Antrea support for the AKS managed K8s service, using CNI chaining and the "networkPolicyOnly" traffic mode. (#998, @reachjainrahul)
  • Support for NetworkPolicy tiering (ClusterNetworkPolicy only). (#956 #986, @abhiraut @Dyanngg)
    • The ClusterNetworkPolicy Feature Gate must now be enabled for the Agent (in addition to the Controller) to activate the feature
  • Support executing Traceflow requests with antctl. (#932, @lzhecheng)
  • Support automatic rotation for the self-signed certificate generated by Antrea when no certificate is provided by the user. (#1024, @MatthewHinton56)
  • Add new Agent Prometheus metrics for OVS flow operations. (#866, @yktsubo)
  • Provide a DaemonSet to automatically restart Pods on new Nodes in EKS when Antrea becomes ready: this ensures that NetworkPolicies are enforced correctly for all Pods. (#1057, @reachjainrahul)
  • Add scripts to run the Antrea Agent directly without using a Pod to manage the lifecycle of the process. (#1013, @ruicao93) [Windows]

Changed

  • Restrict all traffic modes except for "encap" to use "Antrea Proxy" for Pod-to-Service traffic, as this greatly simplifies the datapath implementation. (#1015, @suwang48404)
  • Improve Antrea Octant plugin. (#913, @ZhangYW18)
    • Merge the two existing plugins (Agent / Controller Info, Traceflow) into a single plugin / binary
    • Enhance Traceflow graph color theme
    • Improve layout of the "Overview" page for the plugin: all CRDs are shown on the same page
    • Update Octant plugin installation guide (#914, @mengdie-song)
  • Use Ubuntu 20.04 (instead of Ubuntu 18.04) as the base distribution for the Antrea Docker image. (#1022, @antoninbas)
  • Enable outer UDP checksum for Geneve and VXLAN tunnels to benefit from Generic Receive Offload (GRO) on the receiver's side. (#1049, @tnqn)
  • Support Services as destinations for Traceflow. (#979, @gran-vmv)
  • Provide additional printer columns in the Traceflow CRD definition, so that more information is included in the "kubectl get" output. (#958, @abhiraut)
  • More comprehensive OpenAPI schema for Traceflow CRD validation. (#918, @abhiraut)
  • Optimize OVS flow updates for NetworkPolicies when the Agent restarts, by using batching. (#844, @Dyanngg)
  • Increase watch timeout for the Antrea apiserver to reduce reconnection frequency; reduce log verbosity when a legitimate reconnection happens. (#1055, @antoninbas)
  • Update OVS pipeline documentation to account for the new tables used for ClusterNetworkPolicy and tiering support. (#921 #1073, @abhiraut)

Fixed

  • Fix implementation of NodePort Service on Windows for traffic for which the destination Pod (Service backend) is on the same Node as the source Pod. (#948, @wenyingd) [Windows]
  • Fix IPsec support, which was broken because of a Python 3 error in an upstream OVS script. (#1046, @lzhecheng)
  • Support Pod-to-LoadBalancer Service traffic in "Antrea Proxy". (#943, @ruicao93)
  • Support incoming LoadBalancer Service traffic on Windows, by relying on kube-proxy. (#943, @ruicao93) [Windows]
  • Avoid OpenFlow bundle timeout issues when using Traceflow: if PacketIn messages are not consumed fast enough, all inbound messages from OVS are blocked, including bundle reply messages. (#951, @gran-vmv)
  • Move host routes from the uplink interface to the OVS bridge during Agent initialization on Windows. (#959, @ruicao93) [Windows]
  • Optimize handling of very large AddressGroups (introduced by NetworkPolicies which select a large number of Pods in to/from rules) in the Antrea Agent. (#1031, @tnqn)
  • Modify "List" apiserver requests in the Agent to use "resourceVersion=0", which forces requests to be served from the cache (instead of etcd persistent storage) and removes performance issues when many agents are restarted simultaneously. (#1045, @wenyingd)
  • Fix OVS deadlock caused by glibc bug, by upgrading base distribution to Ubuntu 20.04 in Antrea Docker image. (#1022, @antoninbas @alex-vmw)
  • Set the "no-flood" configuration option on the uplink bridge port in Windows, so that ARP broadcast traffic is not sent out to the underlay network. (#922, @wenyingd) [Windows]
  • Avoid inaccurate warnings in the logs about "POD_NAMESPACE" not set. (#925, @antoninbas)
  • Fix format of tracing packets for Traceflow:
    • Set protocol version to the correct value in the IP header (#946, @lzhecheng)
    • Add correct L3/L4 checksum values (#967, @gran-vmv)
    • Set destination MAC address correctly when the provided destination IP address matches a local Pod (#981, @ZhangYW18)
  • In "hybrid" traffic mode, reject Traceflow requests if the source and destination Nodes are not connected by a tunnel. (#944, @gran-vmv)
  • Log human-readable messages when the ofnet library returns an error. (#1065, @wenyingd)
  • Wait for the Antrea client in the Agent to be ready before starting watches to avoid error log messages. (#1042, @tnqn)
antrea - Release v0.8.2

Published by antoninbas over 4 years ago

Fixed

  • Fix Agent logic in charge of sending Gratuitous ARP messages when networking is configured for a Pod: the previous code was not thread-safe and was causing file descriptor leaks for concurrent CNI ADD requests. (#933, @tnqn)
  • Clean up some internal state in the Agent's NetworkPolicy implementation when a rule is updated. (#929, @jianjuns)
antrea - Release v0.8.1

Published by antoninbas over 4 years ago

Do not use this release; use v0.8.2 instead.

antrea - Release v0.8.0

Published by antoninbas over 4 years ago

Added

  • Add "Antrea Proxy" implementation to provide Pod-to-Service load-balancing (for ClusterIP Services) directly in the OVS pipeline. (#772, @weiqiangt) [Alpha - Feature Gate: AntreaProxy]
    • This feature is enabled by default for Windows Nodes, as it is required for correct NetworkPolicy implementation for Pod-to-Service traffic
  • Add ClusterNetworkPolicy CRD API, which enables cluster admins to define security policies which apply to the entire cluster (not just one Namespace). (#810 #872 #724, @abhiraut @Dyanngg) [Alpha - Feature Gate: ClusterNetworkPolicy]
  • Add Traceflow CRD API, which supports generating tracing requests for traffic going through the Antrea-managed Pod network. (#660 #731, @gran-vmv @lzhecheng) [Alpha - Feature Gate: Traceflow]
  • Add Traceflow Octant plugin: requests can be generated from the Web dashboard (by filling-out a form) and responses are displayed in graph format. (#841, @ZhangYW18)
  • Wrap klog so that one can specify a maximum number of log files to be kept for each verbosity level (using "--log_file_max_num"), while enforcing the size limit for each file (as specified with "--log_file_max_size"). (#879, @jianjuns @alex-vmw)
  • Support executing Agent API requests which depend on OVS command-line utilities (e.g., ovs-ofctl, ovs-appctl) on Windows Nodes; this enables using the "antctl get ovsflows" and "antctl trace-packet" commands for Windows Nodes. (#794, @wenyingd)
  • Support "antctl supportbundle" command for Windows Nodes. (#820, @weiqiangt)
  • Add "--controller-only" flag to "antctl supportbundle" command to only collect information from the Controller, without the Agents. (#791, @weiqiangt)
  • Add new Agent Prometheus metrics for NetworkPolicies:
    • "antrea_agent_ingress_networkpolicy_rule", "antrea_agent_egress_networkpolicy_rule" (#770, @yktsubo)
    • "antrea_agent_networkpolicy_count" (#834, @yktsubo)
  • Additional documentation:

Changed

  • Change default tunnel type from VXLAN to Geneve. (#858 #903, @jianjuns @antoninbas @abhiraut)
    • This may cause some disruption during upgrade, as inter-Node Pod communications between Nodes running Antrea pre-v0.8 and Nodes running Antrea post-v0.8 will be broken; edit the manifest if you want to keep using VXLAN
  • Move Octant plugin to a new "plugins/" folder and make it its own Go module. (#838, @mengdie-song)
  • Update antrea-cni to support CNI version 0.4.0. (#784, @moshe010)
  • Change gateway and tunnel interface names to antrea-gw0 and antrea-tun0 respectively. (#854, @jianjuns)
  • Make antrea-agent Pod tolerant of "NoExecute" taints to prevent unwanted evictions. (#815, @tnqn)
  • Use "Feature Gates" to control enabling / disabling experimental features instead of introducing separate temporary configuration parameters. (#847, @tnqn)
  • Upgrade K8s API version used by Antrea to 1.18. (#838, @mengdie-song)
  • Create controller-ca ConfigMap in the same Namespace as the Controller Deployment, instead of hard-coding it to "kube-system". (#876, @jianjuns)
  • Log error when "iptables-restore" command fails. (#839, @tnqn)
  • Update OVS version to 2.13.1 on Windows because of some issues, notably with the connection tracking implementation. (#856, @ruicao93)
  • Update behavior of "antctl supportbundle" command so that the Controller logs are not collected when a Node name or a Node filter is provided. (#857, @jianjuns)

Fixed

  • Fix runtime crash in the Agent when processing NetworkPolicy rules for which a Protocol has been provided, but no Port. (#882, @wenyingd @abhiraut)
  • Clean up stale OVS PID files to avoid failure loops in antrea-ovs startup. (#880, @jianjuns)
  • When using CNI chaining in a cloud-managed service, ensure that the initContainer blocks until the "primary CNI"'s conf file is found. (#864, @reachjainrahul)
  • Update version of go-iptables library to avoid deadlock when invoking iptables commands. (#873, @antoninbas)
  • Improve robustness of the liveness probe for the antrea-ovs container. (#861, @tnqn)
antrea - Release v0.7.2

Published by antoninbas over 4 years ago

Fixed

  • Fix handling of StatefulSet Pod rescheduling on same Node: a fast rescheduling can cause unexpected ordering of CNI ADD and DELETE commands, which means Antrea cannot use the Pod Namespace+Name as the unique identifier for a Pod's network configuration. #827
  • Fix IP address leak in IPAM caused by Antrea in-memory cache being out-of-sync with IPAM store. #828
  • Increase timeout to 5 seconds when waiting for ovs-vswitchd to report the allocated of_port number. #830
  • Fix CNI CHECK command implementation: the CNI server was always returning success even in case of failure. #821
  • Update ofnet library version to avoid a goroutine leak. #813
  • Exclude /healthz from authorization to avoid unnecessary calls to K8s API in readiness probes. #816
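
The StatefulSet rescheduling fix above (#827) can be illustrated with a small sketch. It assumes that per-Pod network state is keyed by the container ID carried in each CNI request, rather than by Pod Namespace+Name. The release note does not say which identifier Antrea actually uses, so the `store` type and its methods are hypothetical.

```go
package main

import "fmt"

// podConfig is a hypothetical record of the network state for one container.
type podConfig struct {
	namespace   string
	name        string
	containerID string
}

// store keys state by container ID, so a late DEL for an old container
// cannot remove the state of a newer container with the same Pod name.
type store struct {
	byContainer map[string]podConfig
}

func newStore() *store {
	return &store{byContainer: make(map[string]podConfig)}
}

func (s *store) add(c podConfig) { s.byContainer[c.containerID] = c }

func (s *store) del(containerID string) { delete(s.byContainer, containerID) }

func (s *store) has(containerID string) bool {
	_, ok := s.byContainer[containerID]
	return ok
}

func main() {
	s := newStore()
	s.add(podConfig{namespace: "default", name: "web-0", containerID: "old"})
	// Fast rescheduling: the ADD for the new container arrives
	// before the DEL for the old one.
	s.add(podConfig{namespace: "default", name: "web-0", containerID: "new"})
	s.del("old")
	fmt.Println("new container state kept:", s.has("new"))
}
```

Had the map been keyed by Namespace+Name, the second ADD would have overwritten the first entry and the late DEL would have removed the state of the running container.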