Release on: 2024-04-30
1.21.9
.logs_config.expected_tags_duration
now works for journald
logs.service
configuration parameter.normalize_only
to support normalizing SQL without obfuscating it. This mode is useful for customers who want to view unobfuscated SQL statements. By default, ObfuscationMode is set to obfuscate_and_normalize and every SQL statement is obfuscated and normalized.
oracle-dbm check which is now renamed to oracle.
display_container_name being tagged as N/A when container_name information is available.
metric_prefix in custom_queries to oracle.
global_custom_queries bug.
oracle.process.pga_maximum_memory metric for backward compatibility.
systemd
metrics when they are not set
Released on: 2024-04-30 Pinned to datadog-agent v7.53.0: CHANGELOG.
Published by kacper-murzyn 7 months ago
Release on: 2024-04-04
Published by kacper-murzyn 7 months ago
Release on: 2024-03-21
ddagentuser (DDAGENTUSER_NAME) account. If the account is a service account, such as LocalSystem or a gMSA account, no action is needed. If the account is a regular account, configure a different Datadog Agent service account.
expected_tags_duration amount of time.
chdir event to allow recent container escape detection.
oracle.locks.transaction_duration metric.
registry.json after their TTL expires.
logs_file_permissions.log file, in the form of either the journald directory or a specific file (if specified by the Agent journald configuration).
logs_file_permissions.log, which lists every file and that file's permissions that the Logs Agent can detect.
repo_digest to containerd ContainerImage to remove duplicate images in container images UI.
1.21.7.
1.21.8.
ddagenthostname tag.
oracle.tablespace.maxsize metric.
kubelet_core check to kubelet and change the metrics prefix from kubernetes_core to kubernetes so that it can replace the Python kubelet check.
instant_client. Replacing it with oracle_client
.stat
a file that doesn't exist.
ad.datadoghq.com/redis.checks: |
  {
    "redisdb": {
      "ignore_autodiscovery_tags": true,
      "instances": [
        {
          "host": "%%host%%",
          "port": "6379"
        }
      ]
    }
  }
Moving forward, hybrid setups that combine adv2 for check specification with adv1 for ignore_autodiscovery_tags are no longer supported by default. To keep this behavior, set the configuration parameter cluster_checks.support_hybrid_ignore_ad_tags to true.
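For readers who still depend on the hybrid setup, the opt-in might look like the fragment below. This is a minimal sketch: it assumes the dotted parameter name maps onto nested keys in the Agent configuration file, which the note above does not spell out.

```yaml
# Sketch only: nesting is inferred from the dotted parameter name
# cluster_checks.support_hybrid_ignore_ad_tags given in the release note.
cluster_checks:
  support_hybrid_ignore_ad_tags: true
```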
resource_manager
configuration to conf.yaml.example
.cluster_checks.rebalance_period
. The default value is 10 min.
Published by kacper-murzyn 8 months ago
Release on: 2024-02-29
win32_event_log check that occurs when processing an event that has a missing publisher and no EventData.
Published by kacper-murzyn 8 months ago
Release on: 2024-02-19
DD_ORCHESTRATOR_EXPLORER_RUN_ON_NODE_AGENT to false. The Process Agent pod check will be deprecated in the following release.
check_delay metric in Agent telemetry
KeepTrailingSemicolon - disable removing trailing semicolon. This option is only valid when ObfuscationMode is obfuscate_and_normalize.
KeepIdentifierQuotation - disable removing quotation marks around identifiers. This option is only valid when ObfuscationMode is obfuscate_and_normalize.
msodbcsql18 linux dependency needed for SQL Server to run in Docker Agent.
Updated the ntp check to support the default location of chrony.conf on Ubuntu (/etc/chrony/chrony.conf).
Agents are now built with Go 1.21.5.
CWS: Reloading the datadog-agent-sysprobe systemd service now reloads the runtime security policies.
CWS: Added ssdeep file hashing algorithm support.
USM will report the actual status code of the HTTP traffic, instead of reporting only the status code family (2xx, 3xx, etc.).
Improved performance of the activity sampling query on RDS and Oracle Cloud databases.
OTLP ingest log timestamps (i.e. '@timestamp') now include milliseconds.
Always report the following telemetry metrics about the retry queue capacity:
datadog.agent.retry_queue_duration.capacity_secs
datadog.agent.retry_queue_duration.bytes_per_sec
datadog.agent.retry_queue_duration.capacity_bytes
Support container metrics for kata containers using containerd.
System Probe can now expose its healthcheck on a dedicated HTTP port. The Kubernetes daemonset uses this by default on port 5558.
freetds and msodbcsql18 dependencies for py2.
postgresql dependency after upgrading psycopg2 to v2.9 in integrations-core. psycopg2 now comes with a pre-built wheel for the arm architecture.
rancher/mirrored-pause.*) are now excluded by default for containers and metrics collection.
UNKNOWN ERROR.
logs_no_ssl is set and dd_url contains an https prefix. logs_no_ssl will take precedence over the prefix in a future version.
Released on: 2024-02-19 Pinned to datadog-agent v7.51.0: CHANGELOG.
Published by kacper-murzyn 9 months ago
Release on: 2024-01-11
Published by kacper-murzyn 10 months ago
Release on: 2024-01-04
1.20.12.
Published by kacper-murzyn 10 months ago
Release on: 2023-12-21
Published by kacper-murzyn 10 months ago
Release on: 2023-12-19
legacy_mode: false
configuration options are backwards compatible except for some regular expressions used in the included_messages and excluded_messages options. For example, Go regular expressions do not support lookahead or lookbehind assertions. If you do not use these options, then no configuration changes are necessary. See the Python regular expression docs and the Go regular expression docs for more information on the supported regular expression syntax. Set legacy_mode_v2: true to revert to the Python implementation of the check. The Python implementation may be removed in a future version of the Agent.
The orchestrator check is moving from the Process Agent to the Node Agent. In the next release, this new check will replace the current pod check in the Process Agent. You can start using this new check now by manually setting the environment variable DD_ORCHESTRATOR_EXPLORER_RUN_ON_NODE_AGENT to true.
Adds the following CPU manager metrics to the kubelet core check: kubernetes_core.kubelet.cpu_manager.pinning_errors_total, kubernetes_core.kubelet.cpu_manager.pinning_requests_total.
Add a diagnosis for connecting to the agent logs endpoints. This is accessible through the agent diagnose command.
Add FIPS mode support for Network Device Monitoring products
Added support for collecting Cloud Foundry container names without the Cluster Agent.
The Kubernetes State Metrics Core check now collects kubernetes_state.ingress.tls.
APM: Added a new endpoint tracer_flare/v1/. This endpoint acts as a proxy to forward HTTP POST request from tracers to the serverless_flare endpoint, allowing tracer flares to be triggered via remote config, improving the support experience by automating the collection of logs.
CWS: Ability to send a signal to a process when a rule was triggered.
CWS: Add Kubernetes user session context to events, in particular the username, UID and groups of the user that ran the commands remotely.
Enable container image collection by default.
Enable container lifecycle events collection by default. This feature helps stopped containers to be cleaned from Datadog faster.
[netflow] Allow collecting configurable fields for Netflow V9/IPFIX
Add support for Oracle 12.1 and Oracle 11.
Add monitoring of Oracle ASM disk groups.
Add metrics for monitoring Oracle resource manager.
[corechecks/snmp] Load downloaded profiles
DBM: Add configuration option to SQL obfuscator to use go-sqllexer package to run SQL obfuscation and normalization
Support filtering metrics from endpoint and service checks based on namespace when the DD_CONTAINER_EXCLUDE_METRICS environment variable is set.
The Windows Event Log tailer saves its current position in an event log and resumes reading from that location when the Agent restarts. This allows the Agent to collect events created before the Agent starts.
1.20.11.
0.49.0 which adds some more telemetry and contains some small fixes.
collect_topology by default.
RemoveSpaceBetweenParentheses - remove spaces between parentheses. This option is only valid when ObfuscationMode is obfuscate_and_normalize.
KeepNull - disable obfuscating null values with ?. This option is only valid when ObfuscationMode is obfuscate_only or obfuscate_and_normalize.
KeepBoolean - disable obfuscating boolean values with ?. This option is only valid when ObfuscationMode is obfuscate_only or obfuscate_and_normalize.
KeepPositionalParameter - disable obfuscating positional parameters with ?. This option is only valid when ObfuscationMode is obfuscate_only or obfuscate_and_normalize.
min_collection_interval to collect. min_collection_interval now controls how frequently the check attempts to reconnect when the event subscription is in an error state.
timeout option for the win32_event_log check is no longer applicable and can be removed. If the option is set, the check logs a deprecation warning and ignores the option.
CVE-2023-45283 and CVE-2023-45284
kubelet_tls_verify is set to false with a misconfigured root certificate authority.
timeout option. The timeout option is now deprecated.
datadog-cluster-agent status.
kubernetes_state_core check that caused tag corruption when telemetry was set to true.
secret_backend_skip_checks set to true).
Published by kacper-murzyn 11 months ago
Release on: 2023-11-16
arch
field into agent context included in CWS events.
Published by kacper-murzyn 12 months ago
Release on: 2023-11-02
Add --use-unconnected-udp-socket flag to agent snmp walk command.
Add support for image pull metrics in the containerd check.
Add kubelet stats.summary check (kubernetes_core.kubelet.*) to the Agent's core checks to replace the old kubernetes.kubelet check generated from Python.
APM: [BETA] Adds peer_tags configuration to allow for more tags in APM stats that can add granularity and clarity to a peer.service. To set this config, use DD_APM_PEER_TAGS='["aws.s3.bucket", "db.instance", ...]' or apm_config.peer_tags: ["aws.s3.bucket", "db.instance", ...] in datadog.yaml. Please note that DD_APM_PEER_SERVICE_AGGREGATION or apm_config.peer_service_aggregation must also be set to true.
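Put together, the two settings named above could be written in datadog.yaml roughly as follows. The two tag names are the ones from the note; any further entries are up to the user.

```yaml
# datadog.yaml sketch for the beta peer tags feature.
apm_config:
  # Must be enabled for peer_tags to have any effect, per the note above.
  peer_service_aggregation: true
  # Example tag names taken from the release note; extend as needed.
  peer_tags: ["aws.s3.bucket", "db.instance"]
```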
Introduces new Windows crash detection check. Upon initial check run, sends a DataDog event if it is determined that the machine has rebooted due to a system crash.
Install the Aerospike integration on ARM platforms for Python 3
CWS: Detect patterns in processes and files paths to improve accuracy of anomaly detections.
Add Dynamic Instrumentation diagnostics proxy endpoint to the trace-agent http server.
At present, diagnostics are forwarded through the debugger endpoint on the trace-agent server to logs. Since Dynamic Instrumentation also allows adding dynamic metrics and dynamic spans, we want to remove the dependency on logs for diagnostics - the new endpoint uploads diagnostic messages on a dedicated track.
Adds a configurable jmxfetch telemetry check that collects additional data on the running jmxfetch JVM in addition to data about the JVMs jmxfetch is monitoring. The check can be configured by enabling the jmx_telemetry_enabled option in the Agent.
[NDM] Collect diagnoses from SNMP devices.
Adding support for Oracle 12.2.
Add support for Oracle 18c.
CWS now computes hashes for all the files involved in the generation of a Security Profile and an Anomaly Detection Event
[Beta] Cluster agent supports APM Single Step Instrumentation for Kubernetes. It can be enabled in a Kubernetes cluster by setting DD_APM_INSTRUMENTATION_ENABLED=true. Single Step Instrumentation can be turned on in specific namespaces using environment variable DD_APM_INSTRUMENTATION_ENABLED_NAMESPACES. Single Step Instrumentation can be turned off in specific namespaces using environment variable DD_APM_INSTRUMENTATION_DISABLED_NAMESPACES.
Moving the Orchestrator Explorer pod check from the process agent to the core agent. In the following release we will be removing the process agent check and defaulting to the core agent check. If you want to migrate ahead of time you can set orchestrator_explorer.run_on_node_agent = true in your configuration.
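An early migration might be expressed like this. The sketch assumes the setting lives in the main Agent configuration file and that the dotted name corresponds to nested keys.

```yaml
# Sketch: opt in to the core agent pod check ahead of the default change.
orchestrator_explorer:
  run_on_node_agent: true
```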
Add new GPU metrics in the KSM Core check:
kubernetes_state.node.gpu_capacity tagged by node, resource, unit and mig_profile.
kubernetes_state.node.gpu_allocatable tagged by node, resource, unit and mig_profile.
kubernetes_state.container.gpu_limit tagged by kube_namespace, pod_name, kube_container_name, node, resource, unit and mig_profile.
Tag container entity with image_id tag.
max_message_size_bytes can now be configured in logs_config. This allows the default message content limit of 256,000 bytes to be increased up to 1MB. If a log line is larger than this byte limit, the overflow bytes will be truncated.
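As an illustration, doubling the default limit could look like the fragment below. The value 512000 is an arbitrary example chosen to sit between the 256,000-byte default and the 1MB ceiling, not a recommendation.

```yaml
# Sketch: raise the per-message limit for the Logs Agent.
logs_config:
  # Illustrative value; log lines longer than this are truncated.
  max_message_size_bytes: 512000
```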
APM: Add regex support for filtering tags by apm_config.filter_tags_regex or environment variables DD_APM_FILTER_TAGS_REGEX_REQUIRE and DD_APM_FILTER_TAGS_REGEX_REJECT.
Agents are now built with Go 1.20.10.
CWS: Support fentry/fexit eBPF probes which provide lower overhead than kprobe/kretprobes (currently disabled by default and supported only on Linux kernel 5.10 and later).
CWS: Improved username resolution in containers and handle their creation and deletion at runtime.
CWS: Apply policy rules on processes already present at startup.
CWS: Reduce memory usage of BTF symbols.
Remote Configuration for Cloud Workload Security detection rules is enabled if Remote Configuration is globally enabled for the Datadog Agent. Remote Configuration for Cloud Workload Security can be disabled while Remote Configuration is globally enabled by setting the runtime_security_config.remote_configuration.enabled value to false. Remote Configuration for Cloud Workload Security cannot be enabled if Remote Configuration is not globally enabled.
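To keep Remote Configuration globally enabled while opting Cloud Workload Security out, the override might be written as below. The note does not say which configuration file holds runtime_security_config, so place the fragment wherever that section already lives in your setup.

```yaml
# Sketch: disable Remote Configuration for CWS detection rules only.
# Nesting is inferred from the dotted name
# runtime_security_config.remote_configuration.enabled.
runtime_security_config:
  remote_configuration:
    enabled: false
```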
Add gce-container-declaration to default GCE excluded host tags. See exclude_gce_tags configuration settings for more.
Add metrics for the workloadmeta extractor to process-agent status output.
Add a heartbeat mechanism for SBOM collection to avoid having to send the whole SBOM if it has not changed since the last computation. The default interval for the host SBOM has changed from 24 hours to 1 hour.
Prefix every entry in the log file with details about the database server and port to distinguish log entries originating from different databases.
JMXFetch internal telemetry is now included in the agent status output when the verbose flag is included in the request.
Sensitive information is now scrubbed from pod annotations.
The image_id tag no longer includes the docker-pullable:// prefix when using Kubernetes with Docker as runtime.
Improve SQL text collection for self-managed installations. The Agent selects text from V$SQL instead of V$SQLSTATS. If it isn't possible to query the text, the Agent tries to identify the context, such as parsing or closing cursor, and put it in the SQL text.
Improve the Oracle check example configuration file.
Collect Oracle execution plans by default.
Add global custom queries to Oracle checks.
Add connection refused handling.
Add the hosting-type tag, which can have one of the following values: self-managed, RDS, or OCI.
Add a hidden parameter to log unobfuscated execution plan information.
Adding real_hostname tag.
Add sql_id and plan_hash_value to obfuscation error message.
Add Oracle pga_over_allocation_count_metric.
Add information about missing privileges with the link to the grant commands.
Add TCPS configuration to conf.yaml.example.
The container check reports two new metrics:
container.memory.page_faults
container.memory.major_page_faults
to report the page fault counters per container.
prometheus_scrape: Adds support for multiple OpenMetrics V2 features in the prometheus_scrape.checks[].configurations[] items:
exclude_metrics_by_labels
raw_line_filters
cache_shared_labels
use_process_start_time
hostname_label
hostname_format
telemetry
ignore_connection_errors
request_size
log_requests
persist_connections
allow_redirects
auth_token
For a description of each option, refer to the sample configuration in https://github.com/DataDog/integrations-core/blob/master/openmetrics/datadog_checks/openmetrics/data/conf.yaml.example.
Improved the SBOM check function to now communicate the status of scans and any potential errors directly to DataDog for more streamlined error management and resolution.
Separate init-containers from containers in the KubernetesPod structure of workloadmeta.
Improve marshalling performance in the system-probe -> process-agent path. This improves memory footprint when NPM and/or USM are enabled.
Raise the default logs_config.open_files_limit to 500 on Windows.
DD_APM_OBFUSCATION_MEMCACHED_KEEP_COMMAND=true
or apm_config.obfuscation.memcached.keep_command: true
in datadog.yaml.CVE-2023-39325
golang.org/x/net
to v0.17.0 to fix CVE-2023-44487.docker.cpu.shares
metric emitted by the Docker check now reports the correct number of CPU shares when running on cgroups v2.workloadmeta
that was causing issues when a subscriber attempted to unsubscribe while events were being handled in another goroutine.containerd
as the container runtime.logs_config.use_podman_logs
from workingflare_stripped_keys
not working on YAML list.cdb
and pdb
tags.$action
.check_name
tag to the cluster_checks.configs_info
metric emitted by the Cluster Agent telemetry.
Published by kacper-murzyn about 1 year ago
Release on: 2023-10-17
Published by kacper-murzyn about 1 year ago
Release on: 2023-10-10
The EventIDs logged to the Windows Application Event Log by the Agent services have been normalized and now have the same meaning across Agent services. Some EventIDs have changed and the rendered message may be incorrect if you view an Event Log from a host that uses a different version of the Agent than the host that created the Event Log. To ensure you see the correct message, choose "Display information for these languages" when exporting the Event Log from the host. This does not affect Event Logs collected by the Datadog Agent's Windows Event Log integration, which renders the event messages on the originating host. The EventIDs and messages used by the Agent services can be viewed in pkg/util/winutil/messagestrings/messagestrings.mc.
datadog-connectivity and metadata-availability subcommands do not exist anymore and their diagnoses are reported in a more general and structured way.
Diagnostics previously reported via the datadog-connectivity subcommand are now reported as part of the connectivity-datadog-core-endpoints suite. Correspondingly, diagnostics previously reported via the metadata-availability subcommand are now reported as part of the connectivity-datadog-autodiscovery suite.
Streamlined settings by renaming workloadmeta.remote_process_collector.enabled and process_config.language_detection.enabled to language_detection.enabled.
The command line arguments to the Datadog Agent Trace Agent trace-agent have changed from single-dash arguments to double-dash arguments. For example, -config must now be provided as --config. Additionally, subcommands have been added; list them with the --help switch. For backward-compatibility reasons, the old CLI arguments will still work for the foreseeable future but may be removed in future versions.
Added the kubernetes_state.pod.tolerations metric to the KSM core check
Grab, base64 decode, and attach trace context from message attributes passed through SNS->SQS->Lambda
Add kubelet healthz check (check_run.kubernetes_core.kubelet.check) to the Agent's core checks to replace the old kubernetes.kubelet.check generated from Python.
Tag the aws.lambda span generated by the datadog-extension with a language tag based on runtime information in dotnet and java cases
Extended the "agent diagnose" CLI command to allow the easy addition of new diagnostics for diverse and dispersed Agent code.
Add support for the otlp_config.metrics.sums.initial_cumulative_monotonic_value setting.
[BETA] Adds Golang language and version detection through the system probe. This beta feature can be enabled by setting system_probe_config.language_detection.enabled to true in your system-probe.yaml.
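In system-probe.yaml, turning on the beta Golang detection would presumably look like this. The nesting is inferred from the dotted setting name.

```yaml
# system-probe.yaml sketch: enable beta Golang language and version detection.
system_probe_config:
  language_detection:
    enabled: true
```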
Add new kubelet corecheck, which will eventually replace the existing kubelet check.
Add custom queries to Oracle monitoring.
Adding new configuration setting otlp_config.logs.enabled to enable/disable logs support in the OTLP ingest endpoint.
Add logsagentexporter, which is used in OTLP agent to translate ingested logs and forward them to logs-agent
Flush in-flight requests and pending retries to disk at shutdown when disk-based buffering of metrics is enabled (for example, when forwarder_storage_max_size_in_bytes is set).
Added a new collector in the process agent in workloadmeta. This collector allows for collecting processes when the process_config.process_collection.enabled is false and language_detection.enabled is true. The interval at which this collector collects processes can be adjusted with the setting workloadmeta.local_process_collector.collection_interval.
Tag lambda cold starts and proactive initializations on the root aws.lambda span
APM - This change improves the acceptance and queueing strategy for trace payloads sent to the Trace Agent. These changes create a system of backpressure in the Trace Agent, causing it to reject payloads when it cannot keep up with the rate of traffic, rather than buffering and causing OOM issues.
This change has been shown to increase overall throughput in the Trace Agent while decreasing peak resource usage. Existing configurations for CPU and memory work at least as well, and often better, with these changes compared to previous Agent versions. This means users do not have to adjust their configuration to take advantage of these changes, and they do not experience performance degradation as a result of upgrading.
Process Language Detection Enabled in the output of the Agent Status command under the Process Agent section.
agent diagnose command to be executed in context of running Agent process.
1.20.7. This version of Golang fixes CVE-2023-29409.
container.memory.usage.peak metric to the container check. It shows the maximum memory usage recorded since the container started.
agent diagnose CLI command by removing all, datadog-connectivity, and metadata-availability subcommands. These separate subcommands became one of the diagnose suites. The all subcommand became unnecessary.
1.20.8.
collector.worker_utilization to the telemetry. This metric represents the amount of time that a runner worker has been running checks.
trace-agent have changed from single-dash arguments to double-dash arguments. For example, -config must now be provided as --config. For backward-compatibility reasons the old CLI arguments will still work in the foreseeable future but may be removed in future versions.
APM: In order to improve the default customer experience regarding sensitive data, the Agent now obfuscates database statements within span metadata by default. This includes MongoDB queries, ElasticSearch request bodies, and raw commands from Redis and MemCached. Previously, this setting was off by default. This update could have performance implications, or obfuscate data that is not sensitive, and can be disabled or configured through the obfuscation options within the apm_config, or with the environment variables prefixed with DD_APM_OBFUSCATION. Please read the [Data Security documentation for full details](https://docs.datadoghq.com/tracing/configure_data_security/#trace-obfuscation).
This update ensures the sql.query tag is always obfuscated by the Datadog Agent even if this tag was already set by a tracer or manually by a user. This is to prevent potentially sensitive data from being sent to Datadog. If you wish to have a raw, unobfuscated query within a span, then manually add a span tag of a different name (for example, sql.rawquery).
Fix CVE-2023-39320, CVE-2023-39318, CVE-2023-39319, and CVE-2023-39321.
Update OpenSSL from 3.0.9 to 3.0.11. This addresses CVEs CVE-2023-2975, CVE-2023-3446, CVE-2023-3817, CVE-2023-4807.
APM: Fix issue of agent status returning an error when run shortly after starting the trace agent.
APM: Fix incorrect filenames and line numbers in logs from the trace agent.
OTLP logs ingestion is now disabled by default. To enable it, set otlp_config.logs.enabled to true.
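Re-enabling OTLP logs ingestion might look like the fragment below. Only the logs toggle is shown; the rest of the OTLP ingest configuration is assumed to be in place already.

```yaml
# Sketch: turn OTLP logs ingestion back on after the default changed.
otlp_config:
  logs:
    enabled: true
```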
Avoids fetching tags for ECS tasks when they're not consumed.
APM: Concurrency issue at high volumes fixed in obfuscation.
datadog.agent.sbom_generation_duration to only be observed for successful scans.
Fixes a bug that prevents the Agent from writing permissions information about system-probe files when creating a flare.
Fixed a bug that causes the Agent to report the datadog.agent_name.running metric with missing tags in some environments with cgroups v1.
Fix dogstatsd_mapper_profiles wrong serialization when displaying the configuration (for example match_type was shown as matchtype). This also fixes a bug in which the secret management feature was incompatible with dogstatsd_mapper_profiles due to the renaming of the match_type key in the YAML data.
Fix a crash in the Cluster Agent when Remote Configuration is disabled
Corrected a bug in calculating the total size of a container image, now accounting for the configuration file size.
Fix the process-agent picking up processes that are kernel threads due to an integer overflow when parsing /proc/<pid>/stat.
Fixes a rare bug in the Kubernetes State check that causes the Agent to incorrectly tag the kubernetes_state.job.complete service check.
On Windows, the host metadata correctly reflects the Windows 11 version.
Fix a datadog.yaml configuration file parsing issue. When the datadog.yaml configuration file contained a complex configuration under prometheus.checks[*].configurations[*].metrics, a parsing error could lead to an OpenMetrics check not being properly scheduled. Instead, the Agent logged the following error:
2023-07-26 14:09:23 UTC | CORE | WARN | (pkg/autodiscovery/common/utils/prometheus.go:77 in buildInstances) | Error processing prometheus configuration: json: unsupported type: map[interface {}]interface {}
Fixes the KSM check to support HPA v2beta2 again. This stopped working in Agent v7.44.0.
Counts sent through the no-aggregation pipeline are now sent as rate with a forced interval 10 to mimic the normal DogStatsD pipelines.
Bug fix for the wrong query signature.
Populate OTLP resource attributes in Datadog logs
Changes mapping for jvm.loaded_classes from process.runtime.jvm.classes.loaded to process.runtime.jvm.classes.current_loaded
The minimum and maximum estimation for OTLP Histogram to Datadog distribution mapping now ensures the average is within [min, max].
This estimation is only used when the minimum and maximum are not available in the OTLP payload or this is a cumulative payload.
Fixes a panic in the OTLP ingest metrics pipeline when sending OpenTelemetry runtime metrics
Set correct tag value "otel_source:datadog_agent" for OTLP logs ingestion
Removed specific environment variable filter on the Windows platform to fetch ECS task tags.
diagnose datadog-connectivity subcommand now loads and resolves secrets before checking connectivity.
The Agent now starts even if it cannot write events to the Application event log
Fix Windows Service detection by replacing svc.IsAnInteractiveSession() (deprecated) with svc.IsWindowsService()
HorizontalPodAutoscaler collection in the orchestrator check.
cluster_checks.advanced_dispatching_enabled is set to true).
cluster_checks.advanced_dispatching_enabled is set to true).
kubernetes_state.job.complete service check.
Published by kacper-murzyn about 1 year ago
Release on: 2023-09-21
DD_APM_REPLACE_TAGS environment variable and apm_config.replace_tags setting now properly look for tags with numeric values.
Published by kacper-murzyn about 1 year ago
Release on: 2023-08-31
Add ability to send an Agent flare from the Datadog Application for Datadog support team troubleshooting. This feature requires enabling Remote Configuration.
Added workloadmeta remote process collector to collect process metadata from the Process-Agent and store it in the core agent.
Added new parameter workloadmeta.remote_process_collector.enabled to enable the workloadmeta remote process collector.
Added a new tag collector to datadog.agent.workloadmeta_remote_client_errors.
APM: Added support for obfuscating all Redis command arguments. For any Redis command, all arguments will be replaced by a single "?". Configurable using config variable apm_config.obfuscation.redis.remove_all_args and environment variable DD_APM_OBFUSCATION_REDIS_REMOVE_ALL_ARGS. Both accept a boolean value with default value false.
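A configuration-file version of the Redis option could be sketched as follows. The nesting follows the dotted variable name above, and the environment variable DD_APM_OBFUSCATION_REDIS_REMOVE_ALL_ARGS is the stated alternative.

```yaml
# Sketch: replace every Redis command's arguments with a single "?".
apm_config:
  obfuscation:
    redis:
      remove_all_args: true  # default is false
```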
Added an experimental setting process_config.language_detection.enabled. This enables detecting languages for processes. This feature is WIP.
Added an experimental gRPC server to process-agent in order to expose process entities with their detected language. This feature is WIP and controlled through the process_config.language_detection.enabled setting.
The Agent now sends its configuration to Datadog by default to be displayed in the Agent Configuration section of the host detail panel. See https://docs.datadoghq.com/infrastructure/list/#agent-configuration for more information. The Agent configuration is scrubbed of any sensitive information and only contains configuration you’ve set using the configuration file or environment variables. To disable this feature set inventories_configuration_enabled to false.
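Opting out might be as small as the fragment below. It assumes inventories_configuration_enabled is a top-level key of the Agent configuration file, which the note implies but does not state.

```yaml
# Sketch: stop sending the scrubbed Agent configuration to Datadog.
inventories_configuration_enabled: false
```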
The Windows installer can now send a report to Datadog in case of installation failure.
The Windows installer can now send APM telemetry.
Add support for Oracle Autonomous Database (Oracle Cloud Infrastructure).
Add shared memory (a.k.a. system global area - SGA) metric for Oracle databases: oracle.shared_memory.size
With this release, remote_config.enabled is set to true by default in the Agent configuration file. This causes the Agent to request configuration updates from the Datadog site.
To receive configurations from Datadog, you still need to enable Remote Configuration at the organization level and enable Remote Configuration capability on your API Key from the Datadog application. If you don't want the Agent to request configurations from Datadog, set remote_config.enabled to false in the Agent configuration file.
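For Agents that should not poll for remote configuration at all, the global opt-out described above would look roughly like this. The nesting is inferred from the dotted name.

```yaml
# Sketch: stop the Agent from requesting configuration updates from Datadog.
remote_config:
  enabled: false
```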
DD_SERVICE_MAPPING can be used to rename Serverless inferred spans' service names.
Adds a new agent command stream-event-platform to stream the event platform payloads being generated by the agent. This will help diagnose issues with payload generation, and should ease validation of payload changes.
Add two new initContainer metrics to the Kubernetes State Core check: kubernetes_state.initcontainer.waiting and kubernetes_state.initcontainer.restarts.
Add the following sysmetrics to improve DBA/SRE/SE perspective:
avg_synchronous_single_block_read_latency,
active_background_on_cpu, active_background,
branch_node_splits, consistent_read_changes,
consistent_read_gets, active_sessions_on_cpu, os_load,
database_cpu_time_ratio, db_block_changes, db_block_gets,
dbwr_checkpoints, enqueue_deadlocks, execute_without_parse,
gc_current_block_received, gc_average_cr_get_time,
gc_average_current_get_time, hard_parses,
host_cpu_utilization, leaf_nodes_splits, logical_reads,
network_traffic_volume, pga_cache_hit, parse_failures,
physical_read_bytes, physical_read_io_requests,
physical_read_total_io_requests, physical_reads_direct_lobs,
physical_read_total_bytes, physical_reads_direct,
physical_write_bytes, physical_write_io_requests,
physical_write_total_bytes, physical_write_total_io_requests,
physical_writes_direct_lobs, physical_writes_direct,
process_limit, redo_allocation_hit_ratio, redo_generated,
redo_writes, row_cache_hit_ratio, soft_parse_ratio,
total_parse_count, user_commits
Pause containers from the new Kubernetes community registry (registry.k8s.io/pause) are now excluded by default for containers and metrics collection.
[corechecks/snmp] Add forced type rate as an alternative to counter.
[corechecks/snmp] Add symbol level metric_type for table metrics.
Adds support for including the span.kind tag in APM stats aggregations.
Allow ad_identifiers to be used in file based logs integration configs in order to collect logs from disk.
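A sketch of what such a file-based logs configuration might look like. The file location, identifier, path, source, and service values below are all hypothetical placeholders, not values from these release notes:

```yaml
# conf.d/<integration>.d/conf.yaml (illustrative location)
ad_identifiers:
  - my-app                        # hypothetical container identifier
logs:
  - type: file
    path: /var/log/my-app/*.log   # hypothetical path on disk
    source: my-app                # hypothetical
    service: my-app               # hypothetical
```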
Agents are now built with Go 1.20.5.
Agents are now built with Go 1.20.6. This version of Golang fixes CVE-2023-29406.
Improve error handling in External Metrics query logic by running queries with errors individually with retry and backoff, and batching only queries without errors.
CPU metadata is now collected without running the sysctl binary on Darwin.
Memory metadata is now collected without running the sysctl binary on Darwin.
Always send the swap size value in metadata as an integer in kilobytes.
Platform metadata is now collected without running the uname binary on Linux and Darwin.
Add new metrics for resource aggregation to the Kubernetes State Core check:
The kube node name is now reported as a host tag kube_node.
[pkg/netflow] Collect flow_process_nf_errors_count metric from goflow2.
APM: Bind apm_config.obfuscation.* parameters to new obfuscation environment variables. In particular, bind:
apm_config.obfuscation.elasticsearch.enabled to DD_APM_OBFUSCATION_ELASTICSEARCH_ENABLED: It accepts a boolean value with default value false.
apm_config.obfuscation.elasticsearch.keep_values to DD_APM_OBFUSCATION_ELASTICSEARCH_KEEP_VALUES: It accepts a list of strings of the form ["id1", "id2"].
apm_config.obfuscation.elasticsearch.obfuscate_sql_values to DD_APM_OBFUSCATION_ELASTICSEARCH_OBFUSCATE_SQL_VALUES: It accepts a list of strings of the form ["key1", "key2"].
apm_config.obfuscation.http.remove_paths_with_digits to DD_APM_OBFUSCATION_HTTP_REMOVE_PATHS_WITH_DIGITS: It accepts a boolean value with default value false.
apm_config.obfuscation.http.remove_query_string to DD_APM_OBFUSCATION_HTTP_REMOVE_QUERY_STRING: It accepts a boolean value with default value false.
apm_config.obfuscation.memcached.enabled to DD_APM_OBFUSCATION_MEMCACHED_ENABLED: It accepts a boolean value with default value false.
apm_config.obfuscation.mongodb.enabled to DD_APM_OBFUSCATION_MONGODB_ENABLED: It accepts a boolean value with default value false.
apm_config.obfuscation.mongodb.keep_values to DD_APM_OBFUSCATION_MONGODB_KEEP_VALUES: It accepts a list of strings of the form ["id1", "id2"].
apm_config.obfuscation.mongodb.obfuscate_sql_values to DD_APM_OBFUSCATION_MONGODB_OBFUSCATE_SQL_VALUES: It accepts a list of strings of the form ["key1", "key2"].
apm_config.obfuscation.redis.enabled to DD_APM_OBFUSCATION_REDIS_ENABLED: It accepts a boolean value with default value false.
apm_config.obfuscation.remove_stack_traces to DD_APM_OBFUSCATION_REMOVE_STACK_TRACES: It accepts a boolean value with default value false.
apm_config.obfuscation.sql_exec_plan.enabled to DD_APM_OBFUSCATION_SQL_EXEC_PLAN_ENABLED: It accepts a boolean value with default value false.
apm_config.obfuscation.sql_exec_plan.keep_values to DD_APM_OBFUSCATION_SQL_EXEC_PLAN_KEEP_VALUES: It accepts a list of strings of the form ["id1", "id2"].
apm_config.obfuscation.sql_exec_plan.obfuscate_sql_values to DD_APM_OBFUSCATION_SQL_EXEC_PLAN_OBFUSCATE_SQL_VALUES: It accepts a list of strings of the form ["key1", "key2"].
apm_config.obfuscation.sql_exec_plan_normalize.enabled to DD_APM_OBFUSCATION_SQL_EXEC_PLAN_NORMALIZE_ENABLED: It accepts a boolean value with default value false.
apm_config.obfuscation.sql_exec_plan_normalize.keep_values to DD_APM_OBFUSCATION_SQL_EXEC_PLAN_NORMALIZE_KEEP_VALUES: It accepts a list of strings of the form ["id1", "id2"].
apm_config.obfuscation.sql_exec_plan_normalize.obfuscate_sql_values to DD_APM_OBFUSCATION_SQL_EXEC_PLAN_NORMALIZE_OBFUSCATE_SQL_VALUES: It accepts a list of strings of the form ["key1", "key2"].
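For orientation, here is a sketch of how a few of these parameters might appear in the Agent configuration file, assuming each dotted name maps to nested YAML keys. The chosen values are illustrative only; each comment names the environment variable bound to that key in the list above.

```yaml
apm_config:
  obfuscation:
    elasticsearch:
      enabled: true                  # DD_APM_OBFUSCATION_ELASTICSEARCH_ENABLED
      keep_values: ["id1", "id2"]    # DD_APM_OBFUSCATION_ELASTICSEARCH_KEEP_VALUES
    http:
      remove_query_string: true      # DD_APM_OBFUSCATION_HTTP_REMOVE_QUERY_STRING
      remove_paths_with_digits: true # DD_APM_OBFUSCATION_HTTP_REMOVE_PATHS_WITH_DIGITS
    remove_stack_traces: true        # DD_APM_OBFUSCATION_REMOVE_STACK_TRACES
```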
The Windows installer is now built using WixSharp.
Refactored the Windows installer custom actions in .Net.
Remove Oracle from the Heroku build.
[pkg/snmp/traps] Collect telemetry metrics for SNMP Traps.
[pkg/networkdevice] Add Meraki fields to NDM Metadata payload.
[corechecks/snmp] Add metric_type to metric root and deprecate forced_type.
[corechecks/snmp] Add tags to interface_configs to tag interface metrics.
[corechecks/snmp] Add user_profiles directory support.
%%var%% template variable pattern.
/etc/services does not follow the format port/protocol: https://gitlab.com/cznic/libc/-/issues/25
security-agent.yaml file in the flare.
datadog.agent.check_status is now disabled by default. To re-enable, set integration_check_status_enabled to true.
leader_election_default_resource to leases, available since Kubernetes version 1.14. If this parameter is empty, leader election automatically detects if leases are available and uses them. Set leader_election_default_resource to configmap on clusters running Kubernetes versions previous to 1.14.
Published by kacper-murzyn over 1 year ago
Release on: 2023-07-10
Refactor the SBOM collection parameters from:
conf.d/container_lifecycle.d/conf.yaml existence (A) # to schedule the container lifecycle long running check
conf.d/container_image.d/conf.yaml existence (B) # to schedule the container image metadata long running check
conf.d/sbom.d/conf.yaml existence (C) # to schedule the SBOM long running check
Inside datadog.yaml:
container_lifecycle:
  enabled:                        (D)  # Used to control the start of the container_lifecycle forwarder but has been decommissioned by #16084 (7.45.0-rc)
  dd_url:                              # \
  additional_endpoints:                # |
  use_compression:                     # |
  compression_level:                   #  > generic parameters for the generic EVP pipeline
  …                                    # |
  use_v2_api:                          # /
container_image:
  enabled:                        (E)  # Used to control the start of the container_image forwarder but has been decommissioned by #16084 (7.45.0-rc)
  dd_url:                              # \
  additional_endpoints:                # |
  use_compression:                     # |
  compression_level:                   #  > generic parameters for the generic EVP pipeline
  …                                    # |
  use_v2_api:                          # /
sbom:
  enabled:                        (F)  # controls host SBOM collection and does **not** control container-related SBOM since #16084 (7.45.0-rc)
  dd_url:                              # \
  additional_endpoints:                # |
  use_compression:                     # |
  compression_level:                   #  > generic parameters for the generic EVP pipeline
  …                                    # |
  use_v2_api:                          # /
  analyzers:                      (G)  # trivy analyzers used for host SBOM collection
  cache_directory:                (H)
  clear_cache_on_exit:            (I)
  use_custom_cache:               (J)
  custom_cache_max_disk_size:     (K)
  custom_cache_max_cache_entries: (L)
  cache_clean_interval:           (M)
container_image_collection:
  metadata:
    enabled:                      (N)  # Controls the collection of the container image metadata in workload meta
  sbom:
    enabled:                      (O)
    use_mount:                    (P)
    scan_interval:                (Q)
    scan_timeout:                 (R)
    analyzers:                    (S)  # trivy analyzers used for containers SBOM collection
    check_disk_usage:             (T)
    min_available_disk:           (U)
to:
conf.d/{container_lifecycle,container_image,sbom}.d/conf.yaml no longer needs to be created. A default version is always shipped with the Agent Docker image with an underscore-prefixed ad_identifier that will be synthesized by the agent at runtime based on config {container_lifecycle,container_image,sbom}.enabled parameters.
Inside datadog.yaml:
container_lifecycle:
  enabled:                        (A)  # Replaces the need for creating a conf.d/container_lifecycle.d/conf.yaml file
  dd_url:                              # \
  additional_endpoints:                # |
  use_compression:                     # |
  compression_level:                   #  > unchanged generic parameters for the generic EVP pipeline
  …                                    # |
  use_v2_api:                          # /
container_image:
  enabled:                        (B)  # Replaces the need for creating a conf.d/container_image.d/conf.yaml file
  dd_url:                              # \
  additional_endpoints:                # |
  use_compression:                     # |
  compression_level:                   #  > unchanged generic parameters for the generic EVP pipeline
  …                                    # |
  use_v2_api:                          # /
sbom:
  enabled:                        (C)  # Replaces the need for creating a conf.d/sbom.d/conf.yaml file
  dd_url:                              # \
  additional_endpoints:                # |
  use_compression:                     # |
  compression_level:                   #  > unchanged generic parameters for the generic EVP pipeline
  …                                    # |
  use_v2_api:                          # /
  cache_directory:                (H)
  clear_cache_on_exit:            (I)
  cache:                               # Factorize all settings related to the custom cache
    enabled:                      (J)
    max_disk_size:                (K)
    max_cache_entries:            (L)
    clean_interval:               (M)
  host:                                # for host SBOM parameters that were directly below `sbom` before.
    enabled:                      (F)  # sbom.host.enabled replaces sbom.enabled
    analyzers:                    (G)  # sbom.host.analyzers replaces sbom.analyzers
  container_image:                     # sbom.container_image replaces container_image_collection.sbom
    enabled:                      (O)
    use_mount:                    (P)
    scan_interval:                (Q)
    scan_timeout:                 (R)
    analyzers:                    (S)  # trivy analyzers used for containers SBOM collection
    check_disk_usage:             (T)
    min_available_disk:           (U)
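To make the new layout concrete, here is a small sketch of a datadog.yaml that turns on the three long running checks together with host and container image SBOM collection. It uses only keys from the mapping above and assumes the enabled parameters are plain booleans; all other parameters are left at their defaults.

```yaml
# datadog.yaml (new layout; no conf.d/*.d/conf.yaml files needed)
container_lifecycle:
  enabled: true      # (A)
container_image:
  enabled: true      # (B)
sbom:
  enabled: true      # (C)
  host:
    enabled: true    # (F) previously sbom.enabled
  container_image:
    enabled: true    # (O) previously container_image_collection.sbom.enabled
```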
This change adds support for ingesting information such as database settings and schemas as database "metadata"
Add the capability for the security-agent compliance module to export detailed Kubernetes node configurations.
Add unsafe-disable-verification flag to skip TUF/in-toto verification when downloading and installing wheels with the integrations install command
Add container.memory.working_set metric on Linux (computed as Usage - InactiveFile) and Windows (mapped to Private Working Set)
Enabling dogstatsd_metrics_stats_enable will now enable dogstatsd_logging_enabled. When enabled, dogstatsd_logging_enabled generates dogstatsd log files at:
Windows: c:\programdata\datadog\logs\dogstatsd_info\dogstatsd-stats.log
Linux: /var/log/datadog/dogstatsd_info/dogstatsd-stats.log
MacOS: /opt/datadog-agent/logs/dogstatsd_info/dogstatsd-stats.log
These log files are also automatically attached to the flare.
You can adjust the dogstatsd-stats logging configuration by using:
SizeInBytes (default: dogstatsd_log_file_max_size:"10Mb")
Int (default: dogstatsd_log_file_max_rolls:3)
The network_config.enable_http_monitoring configuration has changed to service_monitoring_config.enable_http_monitoring.
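The two configuration changes above could be sketched as follows. The dogstatsd values are the defaults quoted in the notes. The nesting of the renamed HTTP monitoring key and the value true are assumptions; it belongs in whichever configuration file previously held network_config.enable_http_monitoring.

```yaml
# DogStatsD stats logging (defaults shown explicitly)
dogstatsd_metrics_stats_enable: true
dogstatsd_log_file_max_size: "10Mb"
dogstatsd_log_file_max_rolls: 3
---
# Renamed setting; formerly network_config.enable_http_monitoring
service_monitoring_config:
  enable_http_monitoring: true
```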
Add Oracle execution plans
Oracle query metrics
Add support for Oracle RDS multi-tenant
agent status -v now shows verbose diagnostic information. Added tailer-specific stats to the verbose status page with improved auto multi-line detection information.
health command from the Agent and Cluster Agent now have a configurable timeout (60 seconds by default).
1.19.10
flush_timestamp to payload.
peer.service.
span.kind.
0.47.9 which has fixes to improve efficiency when fetching beans, fixes for process attachment in some JDK versions, and fixes a thread leak.
auto_multi_line_detection, auto_multi_line_sample_size, and auto_multi_line_match_threshold were not working when set through a pod annotation or container label.
device_ip to exporter_ip.
hostNetwork: true, the leader election mechanism was using a node name instead of the pod name. This was breaking the “follower to leader” forwarding mechanism. This change introduces the DD_POD_NAME environment variable as a more reliable way to set the cluster-agent pod name. It is supposed to be filled by the Kubernetes downward API.
Published by kacper-murzyn over 1 year ago
Release on: 2023-06-27
Published by kacper-murzyn over 1 year ago
Release on: 2023-06-05
peer.service to trace stats exported by the Agent.
span.kind value.
Cluster Agent: User config, cluster Agent deployment and node Agent daemonset manifests are now added to the flare archive, when the Cluster Agent is deployed with Helm (version 3.23.0+).
Datadog Agent running as a systemd service can optionally read environment variables from a text file /etc/datadog-agent/environment containing newline-separated variable assignments. See https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Environment
Add ability to filter Kubernetes containers based on autodiscovery annotation. Containers in a pod can now be omitted by setting ad.datadoghq.com/<container_name>.exclude as an annotation on the pod. Logs can now be omitted by setting ad.datadoghq.com/<container_name>.logs_exclude as an annotation on the pod.
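A sketch of a pod manifest using both annotations. The pod name, container names, and images are hypothetical, and the annotation value "true" is an assumption since the notes only name the annotation keys:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod                              # hypothetical
  annotations:
    # Omit the "sidecar" container entirely (value assumed to be "true").
    ad.datadoghq.com/sidecar.exclude: "true"
    # Omit only the logs of the "app" container (value assumed to be "true").
    ad.datadoghq.com/app.logs_exclude: "true"
spec:
  containers:
    - name: app
      image: example/app:latest                  # hypothetical image
    - name: sidecar
      image: example/sidecar:latest              # hypothetical image
```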
Added support for custom resource definitions metrics: crd.count and crd.condition.
sbom.cached_keys: Number of cache keys stored in memory.
sbom.cache_disk_size: Total size, in bytes, of the database as reported by BoltDB.
sbom.cached_objects_size: Total size, in bytes, of cached SBOM objects on disk. Limited by sbom.custom_cache_max_disk_size.
sbom.cache_hits_total: Total number of cache hits.
sbom.cache_misses_total: Total number of cache misses.
sbom.cache_evicts_total: Total number of cache evicts.
Added DD_ENV to the SBOMPayload in the SBOM check.
Added kubernetes_state.hpa.status_target_metric and kubernetes_state.deployment.replicas_ready metrics part of the kubernetes_state_core check.
Add support for emitting resources on metrics from tags in the format dd.internal.resource:type,name.
APM: Dynamic instrumentation logs and snapshots can now be shipped to multiple Datadog logs intakes.
Adds support for OpenTelemetry span links to the Trace Agent OTLP endpoint when converting OTLP spans (span links are added as metadata to the converted span).
Agents are now built with Go 1.19.9.
Make Podman DB path configurable for rootless environment. Now we can set $HOME/.local/share/containers/storage/libpod/bolt_state.db.
Add ownership information for containers to the container-lifecycle check.
Add Pod exit timestamp to container-lifecycle check.
The Agent now uses the ec2_metadata_timeout value when fetching EC2 instance tags with AWS SDK. The Agent fetches instance tags when collect_ec2_tags is set to true.
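As an illustration, the two settings might be combined like this in datadog.yaml. The timeout value is arbitrary, and the unit (milliseconds) is an assumption that is not stated in these notes:

```yaml
# Fetch EC2 instance tags, bounded by the metadata timeout.
collect_ec2_tags: true
ec2_metadata_timeout: 1000   # illustrative value; unit assumed to be milliseconds
```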
Upgraded JMXFetch to 0.47.8 which has improvements aimed to help large metric collections drop fewer payloads.
Kubernetes State Metrics Core: Adds collection of Kubernetes APIServices metrics
Add support for URLs with the http|https scheme in the dd_url or logs_dd_url parameters when configuring endpoints. Also automatically detects SSL needs, based on the scheme when it is present.
[pkg/netflow] Add NetFlow Exporter to NDM Metadata.
SUSE RPMs are now built with RPM 4.14.3 and have SHA256 digest headers.
observability_pipelines_worker can now be used in place of the vector config options.
Add an option and an annotation to skip kube_service tags on Kubernetes pods.
When the selector of a service matches a pod and that pod is ready, its metrics are decorated with a kube_service tag.
When the readiness of a pod flips, so does the kube_service tag. This could create visual artifacts (spikes when the tag flips) on dashboards where the queries are missing .fill(null).
If many services target a pod, the total number of tags attached to its metrics might exceed a limit that causes the whole metric to be discarded.
In order to mitigate these two issues, it’s now possible to set the kubernetes_ad_tags_disabled parameter to kube_service to globally remove the kube_service tags on all pods.
It’s also possible to add a tags.datadoghq.com/disable: kube_service annotation on only the pods for which we want to remove the kube_service tag.
Note that kube_service is the only tag that can be removed via this parameter and this annotation.
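The two options can be sketched as follows. The first document assumes kubernetes_ad_tags_disabled takes a list of tag names in datadog.yaml; the second shows the per-pod annotation on a hypothetical pod.

```yaml
# datadog.yaml: remove kube_service tags on all pods (list form is an assumption)
kubernetes_ad_tags_disabled:
  - kube_service
---
# Pod manifest: remove the kube_service tag for this pod only
apiVersion: v1
kind: Pod
metadata:
  name: example-pod                 # hypothetical
  annotations:
    tags.datadoghq.com/disable: kube_service
spec:
  containers:
    - name: app                     # hypothetical
      image: example/app:latest     # hypothetical
```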
Support OTel semconv 1.17.0 in OTLP ingest endpoint.
When otlp_config.metrics.histograms.send_aggregation_metrics is set to true, the OTLP ingest pipeline will now send min and max metrics for delta OTLP Histograms and OTLP Exponential Histograms when available, in addition to count and sum metrics.
The deprecated option otlp_config.metrics.histograms.send_count_sum_metrics now also sends min and max metrics when available.
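A minimal sketch of enabling the non-deprecated option in the Agent configuration file, assuming the dotted name maps to nested YAML keys:

```yaml
otlp_config:
  metrics:
    histograms:
      # Sends count, sum, and (when available) min and max metrics.
      send_aggregation_metrics: true
```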
OTLP: Use minimum and maximum values from cumulative OTLP Histograms. Values are used only when we can assume they are from the last time window or otherwise to clamp estimates.
The OTLP ingest endpoint now supports the same settings and protocol as the OpenTelemetry Collector OTLP receiver v0.75.0.
Secrets with ENC[] notation are now supported for proxy settings from environment variables. For more information, refer to the [Secrets Management](https://docs.datadoghq.com/agent/guide/secrets-management/) and [Agent Proxy Configuration](https://docs.datadoghq.com/agent/proxy/) documentation.
[corechecks/snmp] Adds ability to send constant metrics in SNMP profiles.
[corechecks/snmp] Adds ability to map metric tag value to string in SNMP profiles.
[corechecks/snmp] Add support to format bytes into ip_address
ADDLOCAL=NPM and REMOVE=NPM, no longer controls the install state of NPM components. The NPM components are now always installed, but will only run when enabled in the agent configuration. The Windows Installer NPM feature option still exists for backwards compatibility purposes, but has no effect.
otlp_config.metrics.histograms.send_count_sum_metrics in favor of otlp_config.metrics.histograms.send_aggregation_metrics.
pkg/forwarder
device tag on the system.disk group of metrics.
device.ip to exporter.ip
prometheus.io/scrape: true, the Agent used to schedule one openmetrics check per container in the pod unless a datadog.prometheusScrape.additionalConfigs[].autodiscovery.kubernetes_container_names list was defined, which restricted the potential container targets. The Agent is now able to leverage the prometheus.io/port annotation to schedule an openmetrics check only on the container of the pod that declares that port in its spec.
Published by kacper-murzyn over 1 year ago
Release on: 2023-05-16
1.19.8.
security-agent.yaml file in the flare.
Published by kacper-murzyn over 1 year ago
Release on: 2023-04-27
Added kubernetes_state.hpa.status_target_metric and kubernetes_state.deployment.replicas_ready metrics part of the kubernetes_state_core check.
The status page now includes a Status render errors section to highlight errors that occurred while rendering it.
APM:
127.0.0.1. The port is configurable through apm_config.debug.port and DD_APM_DEBUG_PORT, set it to 0 to disable the server.
APM: apm_config.features is now configurable from the Agent configuration file. It was previously only configurable via DD_APM_FEATURES.
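For example, disabling the server through the configuration file might look like the sketch below, assuming apm_config.debug.port maps to nested YAML keys. Setting DD_APM_DEBUG_PORT to 0 would be the environment variable equivalent.

```yaml
apm_config:
  debug:
    port: 0   # 0 disables the server, per the note above
```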
Agents are now built with Go 1.19.7.
The OTLP ingest endpoint now supports the same settings and protocol as the OpenTelemetry Collector OTLP receiver v0.71.0.
Collect Kubernetes Pod conditions.
Added the "availability-zone" tag to the Fargate integration. This matches the tag emitted by other AWS infrastructure integrations.
Allow to report all gathered data in case of partial failure of container metrics retrieval.
Upgraded JMXFetch to 0.47.8 which has improvements aimed to help large metric collections drop fewer payloads.
JMXFetch upgraded to 0.47.5 which now supports pulling metrics from javax.management.openmbean.TabularDataSupport. Also contains a fix for pulling metrics from javax.management.openmbean.TabularDataSupport when no tags are specified.
Updated chunking util and use cases to use generics. No behavior change.
[corechecks/snmp] Add interface_configs to override interface speed.
No longer increments TCP retransmit count when the retransmit fails.
The OTLP ingestion endpoint now supports the same settings and protocols as the OpenTelemetry Collector OTLP receiver v0.70.0.
Changes the retry mechanism of starting workloadmeta collectors so that instead of retrying every 30 seconds, it retries following an exponential backoff with initial interval of 1s and max of 30s. In general, this should help start sooner the collectors that failed on the first try.
Added the "pull_duration" metric in the workloadmeta telemetry. It measures the time that it takes to pull from the collectors.
enable_sketch_stream_payload_serialization is now deprecated.
agent status would show incorrect system-probe status for 15 seconds as the system-probe started up.
; with & in the URL to open GUI to follow golang.org/issue/25192.
logs_config.cca_in_ad has been removed.