deepflow

eBPF Observability - Distributed Tracing and Profiling

APACHE-2.0 License

Stars
2.7K
Committers
98


deepflow - v6.2.6

Published by dundun9 over 1 year ago

Release v6.2.6

deepflow - v6.1.8.7

Published by dundun9 over 1 year ago

Release v6.1.8.7

deepflow - v6.2.5

Published by dundun9 over 1 year ago

New Features (Alpha)

  • Universal Application Topology
    • Added a direction score indicator: the higher the score, the more accurate the inferred client/server direction, and a score of 255 means the direction is guaranteed to be correct.
  • Querier API
    • When PromQL queries native Prometheus metrics, the tags automatically injected by DeepFlow AutoTagging are supported
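
The direction score described above can be treated as a confidence value. The following is a minimal sketch, assuming the score is an integer in the range 0 to 255; only the meaning of 255 comes from the release note, while the function names and the `direction_score` key are hypothetical.

```python
MAX_DIRECTION_SCORE = 255  # per the release note, 255 means the direction is certainly correct


def direction_is_certain(score: int) -> bool:
    """Return True only when the client/server direction is guaranteed correct."""
    if not 0 <= score <= MAX_DIRECTION_SCORE:
        raise ValueError(f"direction score out of range: {score}")
    return score == MAX_DIRECTION_SCORE


def more_reliable(flow_a: dict, flow_b: dict) -> dict:
    """Pick the record whose direction is more trustworthy (higher score wins, ties keep flow_a)."""
    if flow_a["direction_score"] >= flow_b["direction_score"]:
        return flow_a
    return flow_b
```

A consumer could, for example, keep only edges where `direction_is_certain` holds when drawing a strict topology, and fall back to `more_reliable` when two observations of the same flow disagree.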

New Features (GA)

  • Universal Application Topology
    • Automatically display the panoramic application topology at process granularity with zero code instrumentation FR-001-Xiaomi
  • Integration
    • Pre-aggregation of OpenTelemetry Span data into service and path metrics
  • Auto Tagging
    • When grouping by Pod is not possible, auto_service and auto_instance (resource_glX) prefer grouping by process

Optimization

  • Management
    • Support configuring the retention period of hour-granularity data
    • Support a unified setting of the extra docking routing interface for all managed K8s clusters under a public cloud account
    • Provide two deepflow-agent binary packages, dynamically linked and statically linked: the former depends on the glibc dynamic library, and the latter shows noticeable malloc/free lock contention under multi-threading
  • Querier API
    • The Category of custom type Tag (k8s.label/cloud.tag/os.app) is unified as map_item

deepflow - v6.2.4

Published by dundun9 over 1 year ago

New Features (Alpha)

  • Integration
    • Pre-aggregation of OpenTelemetry Span data into service and path metrics
  • Auto Tagging
    • Support batch input of load balancer and its listener information FR-022-Xiaomi
    • When grouping by Pod is not possible, auto_service and auto_instance (resource_glX) prefer grouping by process

New Features (GA)

  • Auto Tagging
    • Automatically inherit the metadata marked on the parent process FR-024-Xiaomi
  • SQL API
    • Support SLIMIT parameter to limit the number of Series returned

Optimization

  • Auto Tagging
    • Process granular application topology adapts to port multiplexing scenarios ISSUE-#2394
    • Field renaming: use auto_instance instead of resource_gl0, use auto_service instead of resource_gl2
  • Management
    • Support configuring the interval at which deepflow-agent performs list requests against k8s-apiserver
    • Support specifying the Hostname of the environment where the collector is located

deepflow - v6.1.8.6

Published by dundun9 over 1 year ago

Release v6.1.8.6

deepflow - v6.2.3

Published by dundun9 over 1 year ago

New Features (GA)

  • Universal Application Topology
    • Support using the TOA (TCP Option Address) mechanism to calculate the real access relationship before and after NAT FR-002-Xiaomi
  • Auto Tagging
    • A process automatically inherits its parent's metadata (os_app tag) FR-024-Xiaomi
    • Supports synchronization of Baidu Cloud Smart Network (CSN) resource information
  • Grafana
    • Add Grafana backend plug-in module to support standard Grafana alarm policy configuration

Optimization

  • Management
    • Remote upgrade of deepflow-agent on the cloud server can be done completely through deepflow-ctl, no need to manually mount hostPath for deepflow-server
  • Auto Tagging
    • Adapt to resource information synchronization of K8s 1.18 and 1.20
  • SQL API
    • When getting the optional values of an enum type Tag field, return the description corresponding to each value

deepflow - v6.2.2

Published by dundun9 over 1 year ago

New Features (GA)

  • AutoTracing
    • Support distributed tracing of Golang applications with zero code instrumentation
  • Auto Tagging
    • Support adding custom metadata for processes, cloud servers, and K8s Namespace FR-001-Xiaomi
    • Support automatic synchronization of K8s cluster information under AWS and Alibaba Cloud accounts
  • Management
    • Support the number of ClickHouse nodes greater than the number of deepflow-server replicas FR-003-Zhongtong
    • Support deepflow-agent running on K8s Node as a normal process (not Pod) FR-004-Tencent
    • Support specifying domain name controller or ingester address for deepflow-agent FR-008-Xiaomi

Optimization

  • deepflow-agent
    • The list of regular expressions for scanning processes (os-proc-regex) supports configuring action=drop to express ignore semantics FR-010-Xiaomi
    • Support running in Linux Kernel environment lower than 3.0 FR-012-Xiaomi
    • Use the socket information of the operating system to correct the flow log direction FR-011-Xiaomi
    • When the ctrl_ip or ctrl_mac of the agent's operating environment changes, it supports automatic update of the corresponding information of the agent
  • deepflow-server
    • At the end of the UDP flow timeout, the status field of l4_flow_log is set to OK
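
The `action=drop` semantics for the process-scanning regular expression list can be pictured as an ordered rule list. This is a minimal sketch and not the agent's actual implementation: the `ProcRule` structure, the `accept` action name, and the first-match-wins evaluation order are all assumptions made for illustration.

```python
import re
from typing import List, NamedTuple, Optional


class ProcRule(NamedTuple):
    pattern: str  # regular expression matched against the whole process name
    action: str   # "accept" or "drop" (hypothetical action names)


def decide(process_name: str, rules: List[ProcRule]) -> Optional[str]:
    """Return the action of the first rule that matches, or None if no rule matches."""
    for rule in rules:
        if re.fullmatch(rule.pattern, process_name):
            return rule.action
    return None


# Ignore short-lived helper processes first, then accept everything else.
rules = [ProcRule(r"sleep|sh", "drop"), ProcRule(r".*", "accept")]
```

With these rules, `decide("sleep", rules)` yields `"drop"` because the drop rule is listed before the catch-all accept rule.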
deepflow - v6.2.1

Published by dundun9 almost 2 years ago

New Features (Alpha)

  • Universal Application Topology
    • Automatically display the panoramic application topology at process granularity with zero code instrumentation FR-001-XIAOMI
  • Auto Tagging
    • Support adding custom metadata for processes, cloud servers, and K8s Namespace FR-001-XIAOMI
    • Support automatic synchronization of K8s cluster information under AWS and Alibaba Cloud accounts
    • Support synchronization of Baidu Cloud Smart Network (CSN) resource information
  • Querier API
    • Support PromQL
  • Management
    • Support the number of ClickHouse nodes greater than the number of deepflow-server replicas FR-003-ZHONGTONG
    • Support deepflow-agent running on K8s Node as a normal process (not Pod) FR-004-Tencent
    • Support specifying domain name controller or ingester address for deepflow-agent FR-008-XIAOMI

Optimization

  • Querier API
    • Support returning the original field name before AS
  • Grafana
    • Optimize the Variable of Enum type to avoid expanding all candidate values in SQL when selecting All

deepflow - v6.1.8.5

Published by dundun9 almost 2 years ago

Release v6.1.8.5

deepflow - v6.1.8.4

Published by dundun9 almost 2 years ago

Release v6.1.8.4

deepflow - v6.2.0

Published by dundun9 almost 2 years ago

Release v6.2.0

deepflow - v6.1.8.3

Published by dundun9 almost 2 years ago

Release v6.1.8.3

deepflow - v6.1.8.2

Published by dundun9 almost 2 years ago

New Features

  • AutoMetrics, AutoTracing, AutoLogging
    • Added support for SOFARPC protocol
  • Grafana
    • Support acting as a Grafana Tempo data source and displaying Tracing data on the Tempo page

deepflow - v6.1.8.1

Published by dundun9 almost 2 years ago

New Features

deepflow - v6.1.8

Published by dundun9 almost 2 years ago

New Features

Optimization

  • Management
    • Reduce the memory overhead of deepflow-agent when watching K8s apiserver through message compression and clipping
    • Optimize the memory consumption of deepflow-server when calculating service Tag through Pod aggregation
    • Optimize the write pressure of deepflow-server on the flow_tag database by means of memory pre-aggregation
    • Managed MySQL service supports connections using non-root users
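
The in-memory pre-aggregation mentioned above for flow_tag writes can be illustrated as follows. This is a minimal sketch under stated assumptions: the row shape, the `PreAggregator` class, and the flush threshold are hypothetical, and the real deepflow-server logic is not described in these notes. The idea is that repeated observations of the same tag row are merged in memory so that one flush writes each distinct row once.

```python
from collections import Counter
from typing import Callable, List, Tuple

Row = Tuple[str, str, str]  # (table, field name, field value); hypothetical row shape


class PreAggregator:
    """Merge duplicate rows in memory and write each distinct row once per flush."""

    def __init__(self, write_batch: Callable[[List[Tuple[Row, int]]], None],
                 max_keys: int = 1000) -> None:
        self._write_batch = write_batch
        self._max_keys = max_keys
        self._pending: Counter = Counter()

    def add(self, row: Row) -> None:
        self._pending[row] += 1
        # Flush early if too many distinct rows are buffered.
        if len(self._pending) >= self._max_keys:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self._write_batch(sorted(self._pending.items()))
            self._pending.clear()
```

Here `write_batch` stands in for whatever database writer is used; four calls to `add` covering two distinct rows result in a single batch of two entries.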
deepflow - v6.1.7

Published by dundun9 almost 2 years ago

New Features

  • Metrics
    • Mark content_length in OTel as a metric; see the reference documentation for the detailed field mapping
    • Added the session_length metric to Request Log
  • Tracing
    • Support parsing the sw8 field in the Dubbo protocol, extracting TraceID, SpanID
  • Event
    • Automatically generate cloud server, K8s Pod add, delete, and change events, and add Grafana Dashboard
  • Management
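
For the sw8 parsing mentioned under Tracing, the SkyWalking sw8 header carries eight `-`-separated fields, of which the second and third are the Base64-encoded trace ID and parent trace segment ID and the fourth is the parent span ID. The sketch below shows one way to extract them; the `Sw8Context` type and `parse_sw8` function are hypothetical names, and how DeepFlow maps these fields to its own TraceID and SpanID is not stated in these notes.

```python
import base64
from typing import NamedTuple


class Sw8Context(NamedTuple):
    sampled: bool
    trace_id: str
    segment_id: str
    span_id: int


def _b64_text(field: str) -> str:
    """Decode one Base64-encoded sw8 field into text."""
    return base64.b64decode(field).decode("utf-8")


def parse_sw8(value: str) -> Sw8Context:
    """Parse an sw8 header value and keep only the tracing identifiers."""
    # The standard Base64 alphabet contains no '-', so splitting on '-' is safe.
    parts = value.split("-")
    if len(parts) != 8:
        raise ValueError(f"expected 8 '-'-separated fields, got {len(parts)}")
    return Sw8Context(
        sampled=parts[0] == "1",
        trace_id=_b64_text(parts[1]),
        segment_id=_b64_text(parts[2]),
        span_id=int(parts[3]),
    )
```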

Optimization

  • SQL API
    • When the automatic grouping resource type resource_glX_type is an IP address, directly reuse resource_glX_id to represent the subnet ID, resource_glX to represent the IP address
  • Management
    • Reduced database permission requirements when using Managed RDS and ClickHouse
    • Automatically balance the number of deepflow-agents served by each deepflow-server
    • Use OpenTelemetry to monitor the call chain inside deepflow-server

deepflow - v6.1.6

Published by dundun9 almost 2 years ago

New Features

  • AutoLogging
    • Added attribute.http_user_agent and attribute.http_referer fields for HTTP protocol
    • Added attribute.rpc_service field, the value is ServiceName of Dubbo/gRPC
    • Added endpoint field, the value is ServiceName/MethodName of Dubbo/gRPC
    • Support for marking HTTP2 data conforming to the gRPC protocol specification as gRPC (instead of HTTP2) protocol
  • AutoTagging
    • Supports synchronizing the resource information of AWS public cloud, and automatically injects it into the observation data as a tag
    • Supports simultaneous injection of cloud resources and container tags for observation data of public cloud-hosted K8s clusters
  • Management
    • Support configuring whether deepflow-agent enables parsing of various application protocols
    • Support configuring the regular expression of Golang/openssl process name for deepflow-agent to collect data through eBPF uprobe
    • Supports deepflow-agent standalone mode, where Flow Log and Request Log are written to local log files
    • Support i18n; the default display language is English
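
The AutoLogging item above about marking gRPC-conformant HTTP2 data as gRPC can be illustrated with the content-type rule from the gRPC over HTTP/2 specification, where requests use `application/grpc` optionally followed by a `+` suffix such as `+proto`. This is a minimal sketch; the exact criteria deepflow-agent applies are not given in these notes, and `classify_http2` is a hypothetical helper.

```python
from typing import Mapping


def classify_http2(headers: Mapping[str, str]) -> str:
    """Label an HTTP/2 request as "gRPC" when its content-type follows the gRPC convention."""
    content_type = ""
    for name, value in headers.items():
        if name.lower() == "content-type":
            content_type = value.strip().lower()
            break
    # Drop any parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip()
    if media_type == "application/grpc" or media_type.startswith("application/grpc+"):
        return "gRPC"
    return "HTTP2"
```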

Optimization

  • AutoLogging
    • Match SQL keywords to reduce the false positive rate of MySQL and PostgreSQL protocol identification
  • Grafana
    • Display the SQL query statement in the Query Editor of the Panel to help developers understand how to call the API
    • Optimized the display of empty field information in Distributed Tracing flame graph
  • Management
    • Optimize the traffic between deepflow-server and clickhouse, and preferentially write to the clickhouse Pod on the same node
    • Support compressing the OTel Span data received by deepflow-agent, which can reduce the bandwidth consumed when sending to deepflow-server by a factor of about 7
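
To see why span data compresses well, the sketch below measures a compression ratio on a synthetic, repetitive span payload. It uses zlib purely as a stand-in: the algorithm deepflow-agent uses is not stated in these notes, the span fields are made up, and the ratio obtained here says nothing about the roughly 7x figure reported above.

```python
import json
import zlib


def compression_ratio(payload: bytes, level: int = 6) -> float:
    """Return uncompressed size divided by zlib-compressed size."""
    compressed = zlib.compress(payload, level)
    return len(payload) / len(compressed)


# Synthetic spans: field names repeat in every record, which is what makes them compressible.
spans = [
    {"trace_id": f"{i:032x}", "service": "checkout", "name": "GET /cart", "status": "OK"}
    for i in range(500)
]
payload = json.dumps(spans).encode("utf-8")
print(f"raw={len(payload)} bytes, ratio={compression_ratio(payload):.1f}x")
```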
deepflow - v6.1.5

Published by dundun9 almost 2 years ago

New Features

  • AutoMetrics, AutoTracing, AutoLogging
    • Support collecting PostgreSQL performance metrics and access logs, and associating them with distributed tracing
    • Support collecting performance metrics and access logs of HTTPS traffic that uses the openssl library, and associating them with distributed tracing
  • Integration
    • deepflow-server supports RemoteRead interface for Prometheus
    • deepflow-agent supports skipping otel-collector to receive OpenTelemetry data directly
    • The query statement of Grafana Variable supports the use of custom variables and built-in variables, [see documentation](https://deepflow.yunshan.net/docs/zh/server-integration/query/sql/#use-tag-self Name filtering), usage scenarios include:
      • Filter the value range of the variable pod with the currently selected value of the variable pod_cluster
      • Use the input content of the variable ingress_wildcard to change the value range of the variable ingress
      • Use the current values of the built-in variables $__from and $__to to improve the query speed of the variable value range
    • Add two zero-intrusion observability dashboards in Grafana: K8s Ingress, SQL Monitoring
  • SQL API
    • string_enum and int_enum types of Tag support using Enum() function to translate Value into Name for query filtering and result return
    • Support SELECT tags/attributes/labels to query all tag.X/attribute.X/label.X fields of each row of data without specifying specific field names
  • Management
    • Support ClickHouse cold data to use disk (as an alternative to object storage)
    • Support using deepflow-ctl agent rebalance to balance deepflow-agent to new and restored deepflow-server

Optimization

  • AutoLogging
    • Sort out the application protocol parsing process and lower the barrier to adding support for more application protocols
  • AutoTagging
    • Added deepflow-ctl cloud info command to debug the resource information synchronized from the cloud platform API
  • SmartEncoding
    • Recycle the tag encoding value of deleted resources to improve compression ratio and query speed
  • SQL API
    • Optimize the display_name of the enumeration value of the server_port Tag field, including the corresponding int value to avoid unclear meaning
  • Management
    • deepflow-server is now deployed using a Deployment controller; remember to update the helm chart when upgrading
    • deepflow-agent supports running in unprivileged mode. For specific permission requirements, please refer to [Reference Documentation](https://deepflow.yunshan.net/docs/zh/install/overview/#Running permissions and kernel requirements)
    • Optimize the mapping between Prometheus metrics and DeepFlow Tables so that each metric corresponds to one Table
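
The SmartEncoding item above, recycling the tag encoding values of deleted resources, can be pictured as a dictionary encoder with a free list. This is a minimal sketch with hypothetical names (`TagEncoder`, `encode`, `release`); it assumes a recycled code is safe to hand out again, which in a real system would also depend on older data that referenced the code no longer being queried.

```python
import heapq
from typing import Dict, List


class TagEncoder:
    """Map resource names to small integer codes, reusing codes of deleted resources."""

    def __init__(self) -> None:
        self._codes: Dict[str, int] = {}
        self._free: List[int] = []  # min-heap of recycled codes
        self._next = 1              # next never-used code

    def encode(self, name: str) -> int:
        if name in self._codes:
            return self._codes[name]
        if self._free:
            code = heapq.heappop(self._free)  # prefer the smallest recycled code
        else:
            code = self._next
            self._next += 1
        self._codes[name] = code
        return code

    def release(self, name: str) -> None:
        """Return the code of a deleted resource to the free list."""
        code = self._codes.pop(name, None)
        if code is not None:
            heapq.heappush(self._free, code)
```

Keeping codes small and dense is what helps here: smaller integers need fewer bits, which is the intuition behind the improved compression ratio and query speed named in the note.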
deepflow - v6.1.4

Published by dundun9 about 2 years ago

New Features

  • AutoTagging
    • Supports synchronizing the resource information of Tencent's public cloud, and automatically injects it into the observation data as a label
  • AutoTracing
    • Supports associated application spans and network spans in environments where eBPF cannot run, eliminating tracking blind spots
  • SQL API
    • show tags adds fields that return map types, such as labels, attributes, tags
    • show tag values adds the limit, offset, and like parameters
  • Production environment deployment
    • Supports using managed ClickHouse and MySQL
    • When deepflow-agent has not completed registration, it supports configuration issued by deepflow-server
  • Grafana
    • Added DeepFlow self-monitoring Dashboard

Optimization

  • SQL API
    • Metrics data in OTel Span can be returned via show metrics API
  • System capability
    • Support deepflow-server master election without relying on sidecar
    • deepflow-server supports backward compatibility with deepflow-agent
    • By default, it is synchronized with the NTP server of the container node where deepflow-server is located
    • Normalize the -v output of each process

deepflow - v6.1.3

Published by dundun9 about 2 years ago

Application

  • AutoMetrics
    • New metrics: client waiting delay, number of SYN packets, number of SYN-ACK packets, number of SYN retransmission packets, number of SYN-ACK retransmission packets
  • AutoTracing
    • Support associating eBPF uprobe Spans with cBPF Spans and OTel Spans, and displaying them in the tracing flame graph
  • AutoLogging
    • Support using eBPF uprobe to collect Golang HTTP2 and HTTP2_TLS calls
    • Support collecting uprobe data from Golang processes whose standard symbol table has been trimmed (Golang >= 1.13 and < 1.18)
  • AutoTagging
    • For K8s nodes that are not associated with a cloud server, the cloud server label is automatically generated
    • Support synchronizing resource information of Huawei public cloud
  • Querier SQL API
    • Fields after GROUP BY are returned automatically, with no need to declare them explicitly after SELECT
  • Grafana
    • Added thumbnail display for DeepFlow Topo and DeepFlow AppTracing
    • Optimized the Span Tip in the tracing flame graph to show the proportion of time consumed by the Span itself

System

  • The time to wait for the agent to come online at the first deployment was optimized from 7 minutes to 4 minutes
  • Access to deepflow-server and clickhouse in the same K8s cluster no longer uses NodeIP
  • deepflow-server uses externalTrafficPolicy=Cluster by default to avoid the unavailability of the externalTrafficPolicy=Local function of kube-proxy in some environments and some CNI compatibility issues. It can be manually changed to Local to optimize cross-cluster traffic
  • deepflow-server adds ext-metrics-ttl, flow-metrics-ttl, flow-log-ttl configuration parameters to initialize data retention time
  • deepflow-agent supports writing l4_flow_log and l7_flow_log to local files
  • deepflow-agent removes dependencies on libbpf