rezolus

Systems performance telemetry

APACHE-2.0 License

rezolus - release v2.16.3 (latest release)

Published by brayniac over 2 years ago

Fixed

  • Fixes potential deadlock by updating dashmap dependency.
  • Updates other dependencies to pull in bugfixes.
  • Improves memcache sampler reconnect logic.

rezolus - release v2.16.2

Published by brayniac over 2 years ago

Fixed

  • Fixes scheduler runqueue latency BPF for newer kernels (>= 5.14) and those
    built with gcc >= 10
  • Fixes issue with release 2.16.1

rezolus - release v2.16.0

Published by brayniac over 2 years ago

Added

  • Adds a new process sampler which can monitor CPU and memory utilization for
    a process. (#282)

rezolus - release v2.15.2

Published by brayniac over 2 years ago

Fixed

  • Fixes issue where release archives do not build successfully due to inclusion
    of vergen in the build script. (#279)

rezolus - release v2.15.1

Published by brayniac almost 3 years ago

Fixed

  • Fixes tcp/connection/accepted and tcp/connection/initiated metrics on
    kernel 5.10. (#266)
  • Fixes tcp/receive/duplicate and tcp/receive/out_of_order metrics. (#267)

rezolus - release v2.15.0

Published by brayniac almost 3 years ago

Changed

  • Allows selective enablement of various BPF metrics. (#254)
  • Supports BCC up to 0.23.0 and makes it the new default version. (#256)
  • Removes SSL support from the HTTP sampler to drop the dependency on OpenSSL. (#257)

Added

  • Adds TCP jitter and connections accepted and initiated using BPF. (#247)
  • Adds Pelikan specific stats to memcache sampler. (#249)
  • Adds TCP packet drops counter using BPF. (#250)
  • Adds TCP tail loss recovery and retransmit timeout using BPF. (#253)
  • Adds TCP duplicate segment and out-of-order segment counters using BPF. (#255)

Fixed

  • Improved handling of BPF initialization errors so that samplers will continue
    to initialize remaining BPF probes if fault tolerant error handling is
    enabled. (#259)

rezolus - release v2.14.0

Published by brayniac about 3 years ago

Added

  • Adds new SRTT metric for TCP sampler using BPF. (#238)
  • Adds new krb5kdc sampler to get telemetry on MIT Kerberos. (#241)

rezolus - release v2.13.0

Published by brayniac over 3 years ago

Fixed

  • Interrupt sampler failed to sample all interrupts if it encountered an
    unexpected keyword. (#225)
  • Interrupt sampler incorrectly initialized per-NUMA node counts for NVMe and
    network interrupts. (#226)
  • Memory sampler failed to report some stats. (#227)
  • CPU c-state sampling now handles older style c-state names. (#229)
  • Prometheus metric exposition now includes type annotations and changes the
    format for percentiles to be encoded as a label value. This fixes collection
    with OpenTelemetry. (#230)
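
To make the exposition change in #230 concrete, here is a minimal Python sketch of a Prometheus text-format block that carries a type annotation and encodes each percentile as a label value. It is an illustration, not Rezolus code: the function name `render_percentiles`, the label name `percentile`, the `gauge` type, and the metric name in the usage note are all assumptions that the release note does not confirm.

```python
def render_percentiles(name: str, readings: dict[str, int]) -> str:
    """Render one metric family with a TYPE line and one sample per percentile.

    `readings` maps a percentile (as text, e.g. "99.9") to its value.
    The label name "percentile" and the "gauge" type are assumptions.
    """
    # The "# TYPE <name> <type>" comment is the Prometheus type annotation.
    lines = [f"# TYPE {name} gauge"]
    for percentile, value in readings.items():
        # The percentile is a label value rather than part of the metric name.
        lines.append(f'{name}{{percentile="{percentile}"}} {value}')
    return "\n".join(lines) + "\n"
```

Calling `render_percentiles("example_latency", {"50": 10, "99.9": 42})` with this hypothetical metric name yields a `# TYPE` line followed by two samples that share one metric name and differ only in the label.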

Changed

  • Removed unused interrupt/serial metric from the interrupt sampler. (#228)

rezolus - release v2.12.0

Published by brayniac over 3 years ago

Fixed

  • NTP sampler failed to build with musl toolchain. (#216)

Added

  • New usercall sampler for probing arbitrary userspace functions in shared
    libraries.

rezolus - release v2.11.1

Published by brayniac over 3 years ago

Fixed

  • HTTP and Memcache samplers reporting incorrect percentiles.

rezolus - release v2.11.0

Published by brayniac almost 4 years ago

Added

  • Nvidia GPU sampler which uses the Nvidia Management Library (NVML) to gather
    telemetry for GPU utilization and health.
  • NTP sampler to gather telemetry about NTP synchronization.

Fixed

  • Disk BPF sampling now compatible with newer kernels.
  • Fixes a bug introduced in 2.8.0 where sample rates greater than 1000ms caused
    errors.

rezolus - release v2.10.0

Published by brayniac almost 4 years ago

Changed

  • Updates tokio to 0.3.1 from 0.2.x
  • Reduces syscall load by reusing filehandles in memory, interrupt, and network
    samplers.
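
The filehandle reuse described above can be sketched in Python as follows. This is an illustration of the general technique, not the Rezolus implementation: the class name `ReusableReader` is hypothetical. The idea is to open a file once and rewind it for every sample, so each sample costs a seek and a read instead of an open, a read, and a close.

```python
class ReusableReader:
    """Keeps one file open and rereads it from the start on every sample."""

    def __init__(self, path: str):
        # Unbuffered binary mode so every sample rereads the file itself
        # instead of returning stale data from a userspace buffer.
        self._file = open(path, "rb", buffering=0)

    def sample(self) -> str:
        # Rewind rather than reopen: no open()/close() pair per sample.
        self._file.seek(0)
        return self._file.read().decode()

    def close(self) -> None:
        self._file.close()
```

On Linux, a sampler could hold one such reader per file, for example `/proc/meminfo`, for the lifetime of the process and call `sample()` at every interval.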

rezolus - release v2.9.0

Published by brayniac about 4 years ago

Added

  • Page Cache sampler which uses BPF to instrument Page Cache hit/miss.

Fixed

  • Updated rustcommon dependencies to get some runtime performance benefits.
  • Added proper core -> NUMA node mapping to address issues with per-node metrics
    for interrupt sampler.
  • Reduce the cost of disabled samplers by skipping all initialization of
    samplers which are not enabled in the config.
  • Documentation updates.
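
The core -> NUMA node mapping mentioned in this release can be illustrated with a short Python sketch. The release note does not say how Rezolus builds the mapping. This example assumes the Linux sysfs convention in which `/sys/devices/system/node/node<N>/cpulist` holds a list such as `0-3,8-11`, and the helper names `parse_cpulist` and `core_to_node` are hypothetical.

```python
def parse_cpulist(text: str) -> list[int]:
    """Expand a sysfs-style cpulist such as "0-3,8-11" into CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or stray comma
        start, _, end = part.partition("-")
        # A single id like "5" has no "-", so the range is just that id.
        cpus.extend(range(int(start), int(end or start) + 1))
    return cpus


def core_to_node(cpulists: dict[int, str]) -> dict[int, int]:
    """Invert {node: cpulist text} into {cpu id: node}."""
    mapping: dict[int, int] = {}
    for node, cpulist in cpulists.items():
        for cpu in parse_cpulist(cpulist):
            mapping[cpu] = node
    return mapping
```

With a mapping like this, a per-CPU interrupt count can be attributed to the node that owns the CPU, which is what per-node metrics need.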

rezolus - release v2.8.0

Published by brayniac about 4 years ago

Changed

  • Metrics library has been replaced with a new version which reduces memory
    footprint.
  • Samplers have been optimized to reduce number of system calls and temporary
    allocations.
  • Arbitrary percentiles may now be expressed in the configuration.
  • Percentile exposition format has changed to allow arbitrary percentiles. They
    are now expressed in a decimal format padded to 2 digits before the decimal.
    For example, the 5th percentile is now p05 and the 99.9th percentile is now
    p99.9.
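
The padding rule for percentile names can be sketched in Python as follows. This is an illustrative helper, not Rezolus code: the function name `percentile_label` is hypothetical, and only the `p05` and `p99.9` results are confirmed by the release note. Other outputs follow from the stated rule of padding to 2 digits before the decimal.

```python
def percentile_label(percentile: float) -> str:
    """Format a percentile as "p" plus a value padded to 2 integer digits."""
    if not 0.0 <= percentile <= 100.0:
        raise ValueError("percentile must be between 0 and 100")
    # Fixed-point text with trailing zeros (and a bare trailing ".") removed.
    text = f"{percentile:.10f}".rstrip("0").rstrip(".")
    whole, _, frac = text.partition(".")
    whole = whole.zfill(2)  # pad the integer part: "5" becomes "05"
    return f"p{whole}.{frac}" if frac else f"p{whole}"
```

Here `percentile_label(5)` gives `p05` and `percentile_label(99.9)` gives `p99.9`, matching the examples in the release note.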

rezolus - release v2.7.1

Published by brayniac about 4 years ago

Fixed

  • Fixed memcache sampler causing tokio worker to panic due to issues registering
    the tcp stream with the tokio runtime.

rezolus - release v2.7.0

Published by brayniac about 4 years ago

Changed

  • Perf event sampling now implemented with BPF. Now requires building with BPF
    support.
  • Renamed worker threads and set limit for total number of runtime threads
    instead of just core threads.

Added

  • CPU sampler now includes CPU frequency.

Fixed

  • BPF probes are now dropped properly on program termination. Previously, on
    some kernel versions, BPF probes might remain after exit.
  • Memcache sampler was not being initialized. It's now re-enabled.

rezolus - release v2.6.0

Published by brayniac about 4 years ago

Added

  • Expanded memory sampler coverage to include telemetry related to NUMA access
    patterns, transparent hugepages, and compaction.

Fixed

  • Disk sampler was not reporting stats for all disks on some multi-disk systems.

rezolus - release v2.5.0

Published by brayniac about 4 years ago

Added

  • Interrupt sampler now has BPF sampling of time distribution of hardirq/softirq
    handlers.

Fixed

  • Replaced remaining uses of chashmap with dashmap which has better performance
    characteristics.
  • Statically linking bcc/bpf is fixed by picking up fixes in upstream crates.

rezolus - release v2.4.0

Published by brayniac over 4 years ago

Added

  • HTTP sampler to poll JSON endpoint and provide summary metrics
  • Added support for bcc 0.15.0, making it the new default version

rezolus - release v2.3.0

Published by brayniac over 4 years ago

Added

  • TCP abort metrics added to tcp sampler
  • Increased max for context switch histogram to prevent clipping

Fixed

  • Fixed bug where percentiles could get stuck at the max value if they hit it