tempo

Grafana Tempo is a high-volume, minimal-dependency distributed tracing backend.

AGPL-3.0 License

tempo - v2.0.0-rc.0

Published by joe-elliott over 1 year ago

Breaking Changes

Config

  • [CHANGE] BREAKING CHANGE Use snake case on Azure Storage config #1879 (@faustodavid)
    Example of using snake case on Azure Storage config:
    # config.yaml
    storage:
      trace:
        azure:
          storage_account_name:
          storage_account_key:
          container_name:
    
  • [CHANGE] Config updates to prepare for Tempo 2.0. #1978 (@joe-elliott)
    query_frontend:
      query_shards:                  // removed. use trace_by_id.query_shards
    querier:
        query_timeout:               // removed. use trace_by_id.query_timeout
    compactor:
        compaction:
            chunk_size_bytes:        // renamed to v2_in_buffer_bytes
            flush_size_bytes:        // renamed to v2_out_buffer_bytes
            iterator_buffer_size:    // renamed to v2_prefetch_traces_count
    ingester:
        use_flatbuffer_search:       // removed. automatically set based on block type
    storage:
      trace:
        wal:
            encoding:                // renamed to v2_encoding
            version:                 // removed and pinned to block.version
        block:
            index_downsample_bytes:  // renamed to v2_index_downsample_bytes
            index_page_size_bytes:   // renamed to v2_index_page_size_bytes
            encoding:                // renamed to v2_encoding
            row_group_size_bytes:    // renamed to parquet_row_group_size_bytes
    
  • [CHANGE] BREAKING CHANGE Remove search_enabled and metrics_generator_enabled. Both default to true. #2004 (@joe-elliott)
  • [CHANGE] BREAKING CHANGE Parquet is the new default block version #1678
    To continue using the v2 backend set:
    storage:
      trace:
        block:
          version: v2
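
    The renames listed under "Config updates to prepare for Tempo 2.0" could look roughly like this in a migrated config file. This is only a sketch: every value is a placeholder picked for illustration, the nesting under trace_by_id and storage.trace is inferred from the notes above, and the encoding names (snappy, zstd) are assumed to be accepted values. Removed keys (use_flatbuffer_search, wal.version) would simply be deleted.

```yaml
# Illustrative post-upgrade fragment. All values are placeholders, not
# recommended settings (query_shards: 50 matches the default listed in
# these notes; everything else is hypothetical).
query_frontend:
  trace_by_id:
    query_shards: 50                  # was query_frontend.query_shards
querier:
  trace_by_id:
    query_timeout: 10s                # was querier.query_timeout
compactor:
  compaction:
    v2_in_buffer_bytes: 5242880       # was chunk_size_bytes
    v2_out_buffer_bytes: 20971520     # was flush_size_bytes
    v2_prefetch_traces_count: 1000    # was iterator_buffer_size
storage:
  trace:
    wal:
      v2_encoding: snappy             # was wal.encoding (assumed valid value)
    block:
      v2_index_downsample_bytes: 1048576       # was index_downsample_bytes
      v2_index_page_size_bytes: 256000         # was index_page_size_bytes
      v2_encoding: zstd                        # was block.encoding (assumed valid value)
      parquet_row_group_size_bytes: 100000000  # was row_group_size_bytes
```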
    

Jsonnet

  • [CHANGE] Delete TempoRequestErrors alert from mixin #1810 (@zalegrala)
    • BREAKING CHANGE Any jsonnet users relying on this alert should copy this into their own environment.

Metrics Generator

  • [CHANGE] metrics-generator: handle collisions between user defined and default dimensions #1794 (@stoewer)
    BREAKING CHANGE Custom dimensions colliding with intrinsic dimensions will be prefixed with __.

Changes

  • [CHANGE] Increase default values for server.grpc_server_max_recv_msg_size and server.grpc_server_max_send_msg_size from 4MB to 16MB #1688 (@mapno)
  • [CHANGE] Update Go to 1.19 #1665 (@ie-pham)
  • [CHANGE] Update alpine image version to 3.16. #1784 (@zalegrala)
  • [CHANGE] Config updates to prepare for Tempo 2.0. #1978 (@joe-elliott)
    Defaults updated:
    query_frontend:
      max_outstanding_per_tenant: 2000
      search:
          concurrent_jobs: 1000
          target_bytes_per_job: 104857600
          max_duration: 168h
          query_ingesters_until: 30m
      trace_by_id:
          query_shards: 50
    querier:
        max_concurrent_queries: 20
        search:
            prefer_self: 10
    ingester:
        concurrent_flushes: 4
        max_block_duration: 30m
        max_block_bytes: 524288000
    storage:
        trace:
            pool:
                max_workers: 400
                queue_depth: 20000
            search:
                read_buffer_count: 32
                read_buffer_size_bytes: 1048576
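
    For deployments that want to keep the previous gRPC message size limits after the default moved from 4MB to 16MB, the two server settings named above can be pinned explicitly. A minimal sketch, assuming "4MB" means 4 MiB (4 * 1024 * 1024 = 4194304 bytes):

```yaml
# Pin the pre-2.0 limits explicitly instead of inheriting the new 16MB default.
# 4194304 assumes the old default was 4 MiB; adjust if your limits differ.
server:
  grpc_server_max_recv_msg_size: 4194304
  grpc_server_max_send_msg_size: 4194304
```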
    

Features

  • [FEATURE] TraceQL Phase 1 support. A new query language for traces!
  • [FEATURE] Parquet backend is GA and default. By default Tempo will create Parquet blocks to enhance search performance
    and to provide users with their data in an open format. See breaking changes above for a note on how to continue using older backends.
  • [FEATURE] Add generic forwarder and implement otlpgrpc forwarder #1775 (@Blinkuu)
    New config options and example configuration:
    # config.yaml
    distributor:
      forwarders:
        - name: "otel-forwarder"
          backend: "otlpgrpc"
          otlpgrpc:
            endpoints: ['otelcol:4317']
            tls:
              insecure: true
    
    # overrides.yaml
    overrides:
      "example-tenant-1":
        forwarders: ['otel-forwarder']
      "example-tenant-2":
        forwarders: ['otel-forwarder']
    

Enhancements

  • [ENHANCEMENT] Add /status/usage-stats endpoint to show usage stats data #1782 (@electron0zero)
  • [ENHANCEMENT] Add TLS support to jaeger query plugin. #1999 (@rubenvp8510)
  • [ENHANCEMENT] Collect inspectedBytes from SearchMetrics #1975 (@electron0zero)
  • [ENHANCEMENT] Add zone awareness replication for ingesters. #1936 (@manohar-koukuntla)
    # Use the following fields in the _config field of the jsonnet config to enable zone-aware ingesters:
    multi_zone_ingester_enabled: false,
    multi_zone_ingester_migration_enabled: false,
    multi_zone_ingester_replicas: 0,
    multi_zone_ingester_max_unavailable: 25,
  • [ENHANCEMENT] Add new data-type aware searchtagvalues v2 api #1956 (@mdisibio)
  • [ENHANCEMENT] Filter namespace by cluster in tempo dashboards variables #1771 (@electron0zero)
  • [ENHANCEMENT] Exit early from sharded search requests #1742 (@electron0zero)
  • [ENHANCEMENT] Avoid running tempodb pool jobs with a cancelled context #1852 (@zalegrala)
  • [ENHANCEMENT] Add config flag to allow for compactor disablement for debug purposes #1850 (@zalegrala)
  • [ENHANCEMENT] Identify bloom that could not be retrieved from backend block #1737 (@AlexDHoffer)
  • [ENHANCEMENT] tempo: check configuration now returns a list of warnings #1663 (@frzifus)
  • [ENHANCEMENT] Make DNS address fully qualified to reduce DNS lookups in Kubernetes #1687 (@electron0zero)
  • [ENHANCEMENT] Return 200 instead of 206 when the number of failed blocks is less than tolerate_failed_blocks. #1725 (@joe-elliott)
  • [ENHANCEMENT] Add GOMEMLIMIT variable to compactor jsonnet and set the value to equal compactor memory limit. #1758 (@ie-pham)
  • [ENHANCEMENT] Add capability to configure the used S3 Storage Class #1697 (@amitsetty)
  • [ENHANCEMENT] cache: expose username and sentinel_username redis configuration options for ACL-based Redis Auth support #1708 (@jsievenpiper)
  • [ENHANCEMENT] metrics-generator: expose span size as a metric #1662 (@ie-pham)
  • [ENHANCEMENT] Set Max Idle connections to 100 for Azure, should reduce DNS errors in Azure #1632 (@electron0zero)
  • [ENHANCEMENT] Add PodDisruptionBudget to ingesters in jsonnet #1691 (@joe-elliott)
  • [ENHANCEMENT] Add a cli command to convert a block to the current parquet schema. #1707 (@joe-elliott)
  • [ENHANCEMENT] metrics-generator: filter out older spans before metrics are aggregated #1612 (@ie-pham)
  • [ENHANCEMENT] Add hedging to trace by ID lookups created by the frontend. #1735 (@mapno)
    New config options and defaults:
    query_frontend:
      trace_by_id:
        hedge_requests_at: 5s
        hedge_requests_up_to: 3
  • [ENHANCEMENT] Vulture now has improved distribution of the random traces it searches. #1763 (@rfratto)
  • [ENHANCEMENT] Add TLS support to the vulture #1874 (@zalegrala)
  • [ENHANCEMENT] metrics-generator: extract status_message field from spans #1786, #1794 (@stoewer)
  • [ENHANCEMENT] metrics-generator: handle collisions between user defined and default dimensions #1794 (@stoewer)
  • [ENHANCEMENT] metrics-generator: make intrinsic dimensions configurable and disable status_message by default #1960 (@stoewer)
  • [ENHANCEMENT] distributor: Log span names when distributor.log_received_spans.include_all_attributes is on #1790 (@suraciii)
  • [ENHANCEMENT] metrics-generator: truncate label names and values exceeding a configurable length #1897 (@kvrhdn)
  • [ENHANCEMENT] Convert last few Jsonnet alerts with per_cluster_label #2000 (@Whyeasy)
  • [ENHANCEMENT] New tenant dashboard #1901 (@mapno)
  • [ENHANCEMENT] Upgrade opentelemetry-proto submodule to v0.18.0. Internal types are updated to use scope instead of instrumentation_library.
    This is a breaking change in trace by ID queries if JSON is requested. #1754 (@mapno)

Bugfixes

  • [BUGFIX] Stop distributors on Otel receiver fatal error #1887 (@rdooley)
  • [BUGFIX] New wal file separator '+' for the NTFS filesystem and backward compatibility with the old separator ':' #1700 (@kilian-kier)
  • [BUGFIX] Honor caching and buffering settings when finding traces by id #1697 (@joe-elliott)
  • [BUGFIX] Correctly propagate errors from the iterator layer up through the queriers #1723 (@joe-elliott)
  • [BUGFIX] Make multitenancy work with HTTP #1781 (@gouthamve)
  • [BUGFIX] Fix parquet search bug on http.status_code that may cause incorrect results to be returned #1799 (@mdisibio)
  • [BUGFIX] tempo-mixin: tweak dashboards to support metrics without cluster label present #1913 (@kvrhdn)
  • [BUGFIX] Fix docker-compose examples not running on Apple M1 hardware #1920 (@stoewer)
  • [BUGFIX] Don't persist tenants without blocks in the ingester #1947 (@joe-elliott)
  • [BUGFIX] Return more consistent search results by combining partial traces # (@mapno)

tempo - v1.5.0-rc.2

Published by joe-elliott about 2 years ago

tempo - v1.3.1

Published by mapno over 2 years ago

This patch contains an important fix for users using etcd as kv store in Tempo's consistent hashing ring.

Bug Fixes

  • [BUGFIX] Fixed panic when using etcd as ring's kvstore #1260 (@mapno)

tempo - v1.3.0-rc.0

Published by mapno almost 3 years ago

Breaking changes

This release updates the OpenTelemetry libraries to v0.40.0, and with that, the OTLP gRPC receiver's default listening port changes from the legacy 55680 to the new 4317. There are two main ways to avoid downtime: configure the receiver to keep listening on the old port 55680, and/or push traces to both ports simultaneously until the rollout is complete.
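
The first route, keeping the receiver on the legacy port, might look like the fragment below. It is a sketch that assumes the OTLP receiver is configured under distributor.receivers in the usual OpenTelemetry Collector receiver layout; the bind address 0.0.0.0 is a placeholder.

```yaml
# Keep accepting OTLP gRPC on the legacy port during the rollout.
# Bind address is illustrative; only the port (55680) comes from the notes above.
distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:55680
```

Once all clients send to 4317, the explicit endpoint can be removed so the receiver falls back to the new default.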

As part of adding support for full backend search, the search config parameter query_frontend.search.max_result_limit has been renamed to query_frontend.search.default_result_limit.

  • [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
  • [CHANGE] BREAKING CHANGE Moved querier.search_max_result_limit and querier.search_default_result_limit to query_frontend.search.max_result_limit and query_frontend.search.default_result_limit #1174.
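
After the move described in #1174, the two search limits would sit under the query frontend rather than the querier. A minimal sketch; the numeric values are placeholders for illustration, not documented defaults:

```yaml
# Search result limits now live under query_frontend.search.
# Values are hypothetical; carry over whatever you previously set on the querier.
query_frontend:
  search:
    default_result_limit: 20   # was querier.search_default_result_limit
    max_result_limit: 100      # was querier.search_max_result_limit
```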

New Features and Enhancements

  • [FEATURE] Add support for inline environments. #1184 (@irizzant)
  • [FEATURE] Added support for full backend search. #1174 (@joe-elliott)
  • [ENHANCEMENT] Expose upto parameter on hedged requests for each backend with hedge_requests_up_to. #1085 (@joe-elliott)
  • [ENHANCEMENT] Search: drop use of TagCache, extract tags and tag values on-demand #1068 (@kvrhdn)
  • [ENHANCEMENT] Jsonnet: add $._config.namespace to filter by namespace in cortex metrics #1098 (@mapno)
  • [ENHANCEMENT] Add middleware to compress frontend HTTP responses with gzip if requested #1080 (@kvrhdn, @zalegrala)
  • [ENHANCEMENT] Allow query disablement in vulture #1117 (@zalegrala)
  • [ENHANCEMENT] Improve memory efficiency of compaction and block cutting. #1121 #1130 (@joe-elliott)
  • [ENHANCEMENT] Include metrics for configured limit overrides and defaults: tempo_limits_overrides, tempo_limits_defaults #1089 (@zalegrala)
  • [ENHANCEMENT] Add Envoy Proxy panel to Tempo / Writes dashboard #1137 (@kvrhdn)
  • [ENHANCEMENT] Reduce compactionCycle to improve performance in large multitenant environments #1145 (@joe-elliott)
  • [ENHANCEMENT] Added max_time_per_tenant to allow for independently configuring polling and compaction cycle. #1145 (@joe-elliott)
  • [ENHANCEMENT] Add tempodb_compaction_outstanding_blocks metric to measure compaction load #1143 (@mapno)
  • [ENHANCEMENT] Update mixin to use new backend metric #1151 (@zalegrala)
  • [ENHANCEMENT] Make TempoIngesterFlushesFailing alert more actionable #1157 (@dannykopping)
  • [ENHANCEMENT] Switch open-telemetry/opentelemetry-collector to grafana/opentelemetry-collector fork, update it to 0.40.0 and add missing dependencies due to the change #1142 (@tete17)
  • [ENHANCEMENT] Allow environment variables for Azure storage credentials #1147 (@zalegrala)
  • [ENHANCEMENT] jsonnet: set rollingUpdate.maxSurge to 3 for distributor, frontend and queriers #1164 (@kvrhdn)
  • [ENHANCEMENT] Reduce search data file sizes by optimizing contents #1165 (@mdisibio)
  • [ENHANCEMENT] Add tempo_ingester_live_traces metric #1170 (@mdisibio)
  • [ENHANCEMENT] Update compactor ring to automatically forget unhealthy entries #1178 (@mdisibio)
  • [ENHANCEMENT] Added the ability to pass ISO8601 date/times for start/end date to tempo-cli query api search #1208 (@joe-elliott)
  • [ENHANCEMENT] Prevent writes to large traces even after flushing to disk #1199 (@mdisibio)
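
As an illustration of the per-backend hedge_requests_up_to option from #1085, a backend block might be configured as follows. This is a sketch assuming an S3 backend and that hedge_requests_at sits alongside the new option; both values are placeholders.

```yaml
# Hedged requests for one storage backend (S3 chosen only as an example).
# hedge_requests_at: delay before a hedged request is issued (placeholder value).
# hedge_requests_up_to: upper bound on requests issued per call (placeholder value).
storage:
  trace:
    s3:
      hedge_requests_at: 500ms
      hedge_requests_up_to: 2
```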

Bug Fixes

  • [BUGFIX] Add process name to vulture traces to work around display issues #1127 (@mdisibio)
  • [BUGFIX] Fixed issue where compaction sometimes dropped spans. #1130 (@joe-elliott)
  • [BUGFIX] Ensure that the admin client jsonnet has correct S3 bucket property. (@hedss)
  • [BUGFIX] Publish tenant index age correctly for tenant index writers. #1146 (@joe-elliott)
  • [BUGFIX] Ingester startup panic slice bounds out of range #1195 (@mdisibio)

Other Changes

  • [CHANGE] Search: Add new per-tenant limit max_bytes_per_tag_values_query to limit the size of tag-values response. #1068 (@annanay25)
  • [CHANGE] Reduce MaxSearchBytesPerTrace ingester.max-search-bytes-per-trace default to 5KB #1129 (@annanay25)
  • [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
  • [CHANGE] Remove deprecated method Push from tempopb.Pusher #1173 (@kvrhdn)
  • [CHANGE] Upgrade cristalhq/hedgedhttp from v0.6.0 to v0.7.0 #1159 (@cristaloleg)
  • [CHANGE] Export trace id constant in api package #1176
  • [CHANGE] GRPC 1.33.3 => 1.38.0 broke compatibility with gogoproto.customtype. Enforce the use of gogoproto marshalling/unmarshalling for Tempo, Cortex & Jaeger structs. #1186 (@annanay25)
  • [CHANGE] BREAKING CHANGE Remove deprecated ingester gRPC endpoint and data encoding. The current data encoding was introduced in v1.0. If running earlier versions, first upgrade to v1.0 through v1.2 and allow time for all blocks to be switched to the "v1" data encoding. #1215 (@mdisibio)