The open-source observability platform everyone needs!
GPL-3.0 License
Published by netdatabot about 5 years ago
Release v1.17.1 contains 2 bug fixes, 6 improvements, and 2 documentation updates.
The main reason for the patch release is an essential fix to the repeating alarm notifications we introduced in v1.17.0. If you enabled repeating notifications, Netdata would not then send CLEAR notifications for the selected alarms.
The release also includes a significant improvement to Netdata's auto-detection capabilities, especially after a system restart. Netdata now remembers which python.d plugin jobs were successfully collecting data the last time it was running, and keeps retrying those jobs for 5 minutes before giving up. As a result, you no longer have to worry if your system starts Netdata before the monitored services have had a chance to start properly. We will complete the same improvement for go.d plugins in v1.18.0.
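The retry behaviour described above can be pictured with a small sketch. This is not the python.d plugin's actual code: `retry_job`, `check`, and the 5-second interval are hypothetical names and values chosen for illustration. Only the 5-minute window comes from the release notes.

```python
import time
from typing import Callable

RETRY_WINDOW_SECONDS = 5 * 60  # from the release notes: retry for 5 minutes


def retry_job(
    check: Callable[[], bool],
    window: float = RETRY_WINDOW_SECONDS,
    interval: float = 5.0,  # assumption: the real retry interval is not stated
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() until it succeeds or the retry window has elapsed."""
    deadline = clock() + window
    while True:
        if check():
            return True  # the monitored service is up, the job can start
        if clock() >= deadline:
            return False  # give up once the window has passed
        sleep(interval)
```

Passing `clock` and `sleep` as parameters is only a convenience for trying the function without waiting five real minutes.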
We also made some improvements to our binary packages and added a neat sample custom dashboard that can show charts from multiple Netdata agents.
Our thanks go to:
Dash.html, the custom dashboard that can show charts from multiple hosts.
dash.html #6603 (tnyeanderson)
configure.ac from linking against dbengine and https libraries when dbengine or https are disabled #6658 (mfundul)
Published by netdatabot about 5 years ago
Release v1.17.0 contains 38 bug fixes, 33 improvements, and 20 documentation updates.
You can now change the data collection frequency at will, without losing previously collected values. A major improvement to the new database engine allows you not only to store metrics at variable granularity, but also to autoscale the time axis of the charts, depending on the data collection frequencies used during the presented time.
You can also now monitor VM performance from one or more vCenter servers with a new VSphere collector. In addition, the proc plugin now also collects ZRAM device performance metrics and the apps plugin monitors process uptime for the defined process groups.
Continuing our efforts to integrate with as many existing solutions as possible, you can now directly archive metrics from Netdata to MongoDB via a new backend.
Netdata badges now support international (UTF8) characters! We also made our URL parser smarter, not only for international character support, but also for other strange API queries.
We also added .DEB packages to our binary distribution repositories at Packagecloud, a new collector for Linux zram device metrics, and support for plain text email notifications.
This release includes several fixes and improvements to the TLS encryption feature we introduced in v1.16.0. First, encryption of slave-to-master streaming connections wasn't working as intended. Second, our community helped us discover cases where HTTP requests were not correctly redirected to HTTPS with TLS enabled. This release mitigates those issues and improves TLS support overall.
Finally, we improved the way Netdata displays charts with no metrics. By default, Netdata displays charts for disks, memory, and networks only when the associated metrics are not zero. Users could enable these charts permanently using the corresponding configuration options, but they would need to change more than 200 options. With this new improvement, users can enable all charts with zero values using a single, global configuration parameter.
Our thanks go to:
nfacct plugin
.gitignore fixes
spigotmc collector
chart and data API calls (#6054) #6615 (alpes214)
proc.plugin #6276 #6424 (RaZeR-RBI)
/etc/passwd #6472 (vlvkobal)
password to pass #6518 (ilyam8)
arcstat.py and arc_summary.py in dashboard_info.js #6461 (TheLovinator1)
/tmp #6491 (cakrit)
edit-config, the configuration editor, not being able to run in MacOS. We no longer deliver edit-config as part of the distribution tarball, so that it can get generated with proper configuration during installation. #6507 (paulkatsoulakis)
.environment file getting overwritten, by moving tarball checksum information into lib dir of netdata #6555 (paulkatsoulakis)
undefined reference to LZ4_compress_default #6589 (mfundul)
kickstart.sh as a non-privileged user #6642 (paulkatsoulakis)
Thanks to the community for their help!
Published by netdatabot over 5 years ago
Release v1.16.0 contains 40 bug fixes, 31 improvements and 20 documentation updates
Binary distributions. To improve the security, speed and reliability of new netdata installations, we are delivering our own, industry standard installation method, with binary package distributions. The RPM binaries for the most common OSs are already available on packagecloud and we’ll have the DEB ones available very soon. All distributions are considered in Beta and, as always, we depend on our amazing community for feedback on improvements.
Netdata now supports SSL encryption! You can secure the communication to the web server, the streaming connections from slaves to the master and the connection to an openTSDB backend.
This version also brings two long-awaited features to netdata’s health monitoring:
As always, we’ve introduced new collectors, 5 of them this time.
perf plugin collects system-wide CPU performance statistics from Performance Monitoring Units (PMU) using the perf_event_open() system call. You can read a wonderful article on why this is useful here.
Finally, the DB Engine introduced in v1.15.0 now uses much less memory and is more robust than before.
As you’ll see in the detailed list below, once again we’ve had great help from our contributors.
We can't stress enough the immense help we get just from users creating an issue in GitHub, helping us identify the root cause and validate the change in their infrastructure. Unfortunately, we are not able to list all of them here, but their contribution is invaluable.
perf_event_open() system call. (perf plugin) #6225 (vlvkobal)
userstats and deadlocks charts to the python mysql collector #6118 #6115 (kam1kaze)
memory mode = dbengine, by adding empty page detection #6173 (mfundul)
stream.conf option health enabled by default = auto #6281 (cakrit)
cloud base url parameter to the notifications mechanism, so that modifications to the configuration are respected when creating the link to the alarm #6383 (ladakis)
.gitattributes file to improve git diff for C files #6381 (ac000)
CRITICAL: main[main] SIGPIPE received. error #6373 (vlvkobal)
ram_available alarm #6261 (octomike)
/dev and /run in the disk space and inode usage charts #6399 (vlvkobal)
'PERF_COUNT_HW_REF_CPU_CYCLES' undeclared here in old Linux kernels (perf plugin) #6382 (vlvkobal)
Failed to parse error in adaptec_raid #6338 (ilyam8)
cluster_health_nodes and cluster_stats_nodes charts in the elasticsearch collector #6311 (Wing924)
End key #6294 (thiagoftsm)
libexecdir directory #6272 (paulkatsoulakis)
Error: 'module' object has no attribute 'Retry' messages from python collectors, by enforcing minimum version check for the UrlService library #6263 (ilyam8)
freeipmi #6260 (vlvkobal)
Assertion `old_state & PG_CACHE_DESCR_ALLOCATED' failed of the new dbengine. Eliminated a page cache descriptor race condition #6202 (mfundul)
\r\n as per the RFC #6187 (toofar)
PLUGINSD : cannot open plugins directory #6080 #6089 (Steve8291)
Published by netdatabot over 5 years ago
Release v1.15.0 contains 11 bug fixes and 30 improvements.
We are very happy and proud to be able to include two major improvements in this release: The aggregated node view and the new database engine.
The No. 1 request from our community has been a better way to view and manage their Netdata installations, via an aggregated view. The node menu with the simple list of hosts on the agent UI just didn't do it for people with hundreds, or thousands of instances. This release introduces the node view, which uses the power of Netdata Cloud to deliver powerful views of a Netdata-based monitoring infrastructure.
You can read more about Netdata Cloud and the future of netdata here.
Historically, Netdata has required a lot of memory for long-term metrics storage. To mitigate this we've been building a new DB engine for several months and will continue improving until it can become the default memory mode for new Netdata installations. The version included in release v1.15.0 already permits longer-term storage of compressed data and we'll continue reducing the required memory in following releases.
We have added support for the AWS Kinesis backend and new collectors for OpenVPN, the Tengine web server, ScaleIO (VxFlex OS), ioping-like latency metrics and Energi Core node instances.
We now have a new, "text-only" chart type, cpu limits for v2 cgroups, docker swarm metrics and improved documentation.
We continued improving the Kubernetes helmchart with liveness probes for slaves, persistence options, a fix for a Cannot allocate memory issue and easy configuration for the kubelet, kube-proxy and coredns collectors.
Finally, we built a process to quickly replace any problematic nightly builds and added more automated CI tests to prevent such builds from being published in the first place.
Our heartfelt gratitude for this release goes to the following people:
Cannot allocate memory issue #18 (kam1kaze)
apiVersion to fix linting errors and correct the location of the env field #22, #23 (karuppiah7890)
node applications group did not include all node processes. #5962 (jonfairbanks)
timeout command #5938 (paulkatsoulakis)
lsns command, used to match network interfaces to containers #1 (kam1kaze)
Published by netdatabot over 5 years ago
Release 1.14 contains 14 bug fixes and 24 improvements.
The release introduces major additions to Kubernetes monitoring, with tens of new charts for Kubelet, kube-proxy and coredns metrics, as well as significant improvements to the netdata helm chart.
Two new collectors were added, to monitor Docker hub and Docker engine metrics.
Finally, v1.14 adds support for version 2 cgroups, OpenLDAP over TLS, NVIDIA SMI free and per process memory and configurable syslog facilities.
Our contributors knocked the ball out of the park this time. Our thanks go to the following people:
@ekartsonakis for the excellent addition of TLS support to the OpenLDAP collector
@Wing924 whose cat apparently leaves him enough time to help us with springboot2 and a lot more!
@huww98 for his contribution to the NVIDIA SMI plugin.
@varyumin for his help on the Kubernetes helm chart.
@skrzyp1 for the very significant addition of cgroup v2 support
@hsegnitz for his contribution to the web server log plugin.
@archisgore for the quick fixes to the Polyverse-enabled docker image.
@tctovsli for his Rocket Chat notifications improvements.
@JoeWrightss and @vinyasmusic for not letting us get away with spelling mistakes.
@andvgal for the addition to the MongoDB collector.
@piiiggg for the apache proxy documentation fix
@Ferroin for general awesomeness.
netdata-v1.14.0-rc0-39a9sf9g we would get a netdata-39a9sf9g. #5860 (paulkatsoulakis)
SIGTERM when netdata exits, resulting in zombie processes. Added a heartbeat so that the process can exit on SIGPIPE. #5797 (ilyam8)
sha256sum used by the installers is not available on all FreeBSD installations. Modified the installers to properly support FreeBSD. #5760 (paulkatsoulakis)
netdata.conf. #5792 (thiagoftsm)
Published by netdatabot over 5 years ago
Release 1.13 contains 14 bug fixes and 8 improvements.
netdata has taken the first step into the world of Kubernetes, with a beta version of a Helm chart for deployment to a k8s cluster and proper naming of the cgroup containers. We have big plans for Kubernetes, so stay tuned!
A major refactoring of the python.d plugin has resulted in a dramatic decrease of the required memory, making netdata even more resource efficient.
We also added charts for IPC shared memory segments and total memory used.
Published by netdatabot over 5 years ago
Patch release 1.12.2 contains 7 bug fixes and 4 improvements.
The main motivation behind a new patch release is the introduction of a stable release channel.
A "stable" installation and update channel was always on our roadmap, but it became a necessity when we realized that our users in China could not use the nightly releases published on Google Cloud. The "stable" channel is based on our official GitHub releases and uses assets hosted on GitHub.
We are also introducing a new Oracle DB collector module, implemented in Python.
Published by netdatabot over 5 years ago
Patch release 1.12.1 contains 22 bug fixes and 8 improvements.
/opt usage #5218
Published by netdatabot over 5 years ago
Release 1.12 is made out of 211 pull requests and 22 bug fixes.
The key improvements are:
netdata.cloud, the free netdata service for all netdata users
netdata.cloud is a free service for all netdata users. Currently it replaces the old netdata registry, while providing single sign on with GitHub and Google accounts.
Using netdata.cloud we plan to provide the following features:
and many more.
Read more about netdata.cloud here.
netdata can now bind its API functions to different ports.
The following API functions can be isolated:
dashboard for accessing the dashboard
badges for generating badges
streaming for receiving streamed metrics from remote netdata servers
management for receiving management commands
registry for accessing the netdata registry
netdata.conf for downloading the current configuration
To bind API functions to different ports, append =function|function|... to the port definition, like this:
[web]
bind to = *:19999=dashboard|netdata.conf *:20000=streaming
The above will bind netdata:
(*) at port 19999 for dashboard access and access to netdata.conf
(*) at port 20000 for receiving streamed data from remote netdata servers
For more information about binding API functions to different ports, check this.
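To make the port definition syntax concrete, here is a rough sketch of how such a value could be split into address, port and function list. It is an illustration only, not Netdata's parser: `parse_bind_to` is a hypothetical helper, and it assumes every entry carries an explicit port and uses a plain `address:port` form.

```python
from typing import List, Tuple


def parse_bind_to(value: str) -> List[Tuple[str, int, List[str]]]:
    """Split a 'bind to' value into (address, port, functions) entries.

    Simplifying assumption: entries are whitespace separated and always
    look like address:port or address:port=function|function|...
    """
    entries = []
    for item in value.split():
        endpoint, _, functions = item.partition("=")
        address, _, port = endpoint.rpartition(":")
        # An entry without "=..." is treated here as having no restriction list.
        allowed = functions.split("|") if functions else []
        entries.append((address, int(port), allowed))
    return entries


if __name__ == "__main__":
    for entry in parse_bind_to("*:19999=dashboard|netdata.conf *:20000=streaming"):
        print(entry)
```

Applied to the example from the release notes, the sketch yields one entry for port 19999 with the dashboard and netdata.conf functions and one entry for port 20000 with streaming.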
Netdata now has a management API. We plan to provide a full set of configuration commands using this API.
In this release, the management API supports disabling or silencing alarms during maintenance periods.
For more information about the management API, check this.
Anonymous usage information is collected by default and sent to Google Analytics. The statistics calculated from this information will be used for:
Quality assurance, to help us understand if netdata behaves as expected and help us identify repeating issues for certain distributions or environments.
Usage statistics, to help us focus on the parts of netdata that are used the most, or help us identify the extent to which our development decisions influence the community.
Information is sent to Netdata via two different channels:
anonymous-statistics.sh is executed by the Netdata daemon, when Netdata starts, stops cleanly, or fails.
Both methods are controlled via the same opt-out mechanism.
For more information, check this.
This release introduces a new Go plugin orchestrator. This plugin has its own github repo. It is open-source, using the same license and we welcome contributions. The orchestrator can also be used to build custom data collection plugins written in Go. We have used the orchestrator to write many new Go plugins in our go.d plugin github repo. For more information, check this.
New data collectors:
High performance versions of older data collectors:
Other improved data collectors:
N/A values.
my-netdata menu when signed in to netdata.cloud
DT_UNKNOWN files as regular files.
Published by netdatabot almost 6 years ago
This is a patch - bug fix release of netdata.
Our work to move all the documentation inside the repo is still in progress. Everything has been moved, but still we need to refactor a lot of the pages to be more meaningful.
The README file on netdata home has been rewritten. Check it here.
Overflown incremental values (counters) no longer show a zero point on the charts. Netdata detects the width (8bit, 16bit, 32bit, 64bit) of each counter and properly calculates the delta when the counter overflows.
The internal database format has been extended to support values above 64bit.
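The counter overflow handling described above comes down to modular arithmetic. The sketch below is a hedged illustration, not Netdata's implementation: `counter_delta` and `guess_width` are hypothetical names, and guessing the width from the previous sample is an assumption, since the notes do not say how the width is detected.

```python
def counter_delta(previous: int, current: int, width_bits: int) -> int:
    """Difference between two samples of a counter that wraps at 2**width_bits."""
    modulus = 1 << width_bits
    # Modular subtraction yields the true increment even after one wrap-around.
    return (current - previous) % modulus


def guess_width(previous: int) -> int:
    """Smallest of the widths named in the notes that can hold the last value.

    Assumption for illustration only; the real detection logic is not described.
    """
    for bits in (8, 16, 32, 64):
        if previous < (1 << bits):
            return bits
    raise ValueError("value does not fit in 64 bits")


if __name__ == "__main__":
    # An 8-bit counter going from 250 to 5 advanced by 11, not by -245.
    print(counter_delta(250, 5, guess_width(250)))
```

Without the modulus, the same pair of samples would produce a negative delta, which is the kind of artefact that used to appear as a drop to zero on the chart.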
openldap, to collect performance statistics from OpenLDAP servers.
tor, to collect traffic statistics from Tor.
nvidia_smi to monitor NVIDIA GPUs.
(:) in them were incorrectly parsed and resulted in faulty data collection values.
smartd_log has been refactored, has better python v2 compatibility, and now supports SCSI smart attributes
cpufreq has been re-written in C - since this module is common, we decided to convert it to an internal plugin to lower the pressure on the python ones. There are a few more that will be transitioned to C in the next release.
sensors got some compatibility fixes and improved handling for lm-sensors errors.
alerta.io notifications got a few improvements
BUG FIX: conntrack_max alarm has been restored (was not working due to an invalid variable name referenced)
my-netdata menu)
It has been refactored a bit to reveal the URLs known for each node and now it supports deleting individual URLs.
openrc service definition got a few improvements
Published by netdatabot almost 6 years ago
Hi all,
It has been 8 months since the last release of Netdata. We delayed releases a bit, but as you can see on these release notes, we were working hard to provide the best Netdata ever.
Thanks to synacktiv.com and red4sec.com, we fixed a number of vulnerabilities in the code base (check below), so release 1.11 of Netdata is the most secure Netdata so far. All users are advised to update to this version asap.
Netdata now has its own organization on GitHub. So, we moved from firehol/netdata to netdata/netdata! We also provide new docker images as netdata/netdata (the old ones are deprecated and are not updated any more).
Netdata community grows faster than ever. Currently netdata grows by +2k unique users and +1k unique installations per day, every day!
Contributions sky rocket too. To make it even easier for newcomers to get involved, we modularized all the code, now organized into a hierarchy of directories. We also moved most of the documentation, from the wiki into the repo. This is quite unique. Netdata is one of the first projects that organizes code and docs under the same hierarchy. Browse the repo; you will be surprised! Examples: data collection plugins, database, backends, web server, ARL, including benchmarks, etc.
Many thanks to all the contributors who help build, enhance and improve a project useful and helpful to hundreds of thousands of admins, devops and developers around the world!
You rock!
@ktsaou
There was an accidental breaking change in the master repo of netdata.
All users that use automatic updates, are advised to run:
sudo sh -c 'cd /usr/src/netdata.git && git fetch --all && git reset --hard origin/master && ./netdata-updater.sh -f'
After that, netdata-updater will be able to update your netdata.
/usr/lib/netdata
We are preparing netdata for binary packages. This required stock config files to be overwritten unconditionally when new netdata binary packages are installed. So, all config files we ship with netdata are now installed under /usr/lib/netdata/conf.d.
To edit config files, we have supplied the script /etc/netdata/edit-config that automatically moves the config file you need to edit to /etc/netdata and opens an editor for you.
The query engine of netdata has been re-written to support query plugins. We have already added the following algorithms that are available for alarm, charts and badges:
stddev, for calculating the standard deviation on any time-frame.
ses or ema or ewma, for calculating the exponential weighted moving average, or single/simple exponential smoothing on any time-frame.
des, for calculating the double exponential smoothing on any time-frame.
cv or rsd, for calculating the coefficient of variation for any time-frame.
CVE-2018-18836
Fixed JSON Header Injection (an attacker could send \n encoded in the request to inject a JSON fragment into the response).
CVE-2018-18837
Fixed HTTP Header Injection (an attacker could send \n encoded in the request to inject an HTTP header into the response).
CVE-2018-18838
Fixed LOG Injection (an attacker could send \n encoded in the request to inject a log line at access.log).
CVE-2018-18839
Not fixed Full Path Disclosure, since these are intended (netdata reports the absolute filename of web files, alarm config files and alarm handlers).
apps.plugin or cgroup-network error handling.
\n in them).
netdata/netdata. These images are based on Alpine Linux for optimal footprint. We provide images for i386, amd64, aarch64 and armhf.
netdata.service now allows configuring process scheduling priorities exclusively on netdata.service (no need to change netdata.conf too).
netdata.service is now installed in /usr/lib/systemd/system.
/usr/lib/netdata/conf.d and a new script has been added to allow easily copying and editing config files: /etc/netdata/edit-config.
rethinkdbs for monitoring RethinkDB performance
proxysql for monitoring ProxySQL performance
litespeed for monitoring LiteSpeed web server performance.
uwsgi for monitoring uWSGI performance
unbound for monitoring the performance of Unbound DNS servers.
powerdns for monitoring the performance of PowerDNS servers.
dockerd for monitoring the health of dockerd
puppet for monitoring Puppet Server and Puppet DB.
logind for monitoring the number of active users.
adaptec_raid and megacli for monitoring the relevant raid controller
spigotmc for monitoring minecraft server statistics
boinc for monitoring Berkeley Open Infrastructure Network Computing clients.
w1sensor for monitoring multiple 1-Wire temperature sensors.
monit for collecting process, host, filesystem, etc checks from monit.
linux_power_supplies for monitoring Linux Power Supplies attributes
node.d.plugin does not use the js command any more.
python.d.plugin now uses monotonic clocks. There was a discrepancy in clocks used in netdata that resulted in a shift in time of python modules after some time (it was missing 1 sec per day).
MySQLService for quickly adding plugins using mysql queries.
URLService now supports self-signed certificates and supports custom client certificates.
python.d.plugin modules that require sudo to collect metrics are now disabled by default, to avoid security alarms on installations that do not need them.
apps.plugin now detects changes in process file descriptors, also fixed a couple of memory leaks. Its default configuration has been enriched significantly, especially for IoT.
freeipmi.plugin now supports option ignore-status to ignore the status reported by given sensors.
statsd.plugin (for collecting custom APM metrics)
sets now report zeros instead of gaps when no data are collected
histograms and timers have been optimized for lowering their CPU consumption when several thousands of such metrics are collected.
histograms had wrong sampling rate calculations.
gauges now ignore sampling rate when no sign is included in the value.
proc.plugin (Linux, system monitoring)
/proc/net/snmp parsing of IcmpMsg lines that failed on a few systems.
TcpExtTCPReqQFullDrop and re-organizes metrics in charts to properly monitor the TCP SYN queue and the TCP Accept queue of the kernel.
ip.*.
SCTP.
/proc/interrupts and /proc/softirqs parsing fixes.
diskspace.plugin (Linux, disk space usage monitoring)
stat() excluded mount points any more (it was interfering with kerberos authenticated mount points).
freebsd.plugin (FreeBSD, PFSense, system monitoring)
laundry memory is now monitored.
system.net and system.packets charts added that report the total bandwidth and packets of all physical network interfaces combined.
python.d.plugin PYTHON modules (applications monitoring)
web_log module now supports virtual hosts, reports http/https metrics, supports squid logs
nginx_plus module now handles non-continuous peer IDs (bug fix)
ipfs module is optimized, the use of its Pin API is now disabled by default and can be enabled with a netdata module option (using the IPFS Pin API increases the load on the IPFS server).
fail2ban module now supports IPv6 too.
ceph module now checks permissions and properly reports issues
elasticsearch module got better error handling
nginx_plus module now uses upstream ip:port instead of transient id to identify dimensions.
redis, now it supports Pika, collects evicted keys, fixes authentication issues reported and improves exception handling.
beanstalk, bug fix for yaml config loading.
mysql, the % of active connections is now monitored, query types are also charted.
varnish, now it supports versions above 5.0.0
couchdb
phpfpm, now supports IPv6 too.
apache, now supports IPv6 too.
icecast
mongodb, added support for connect URIs
postgres
elasticsearch, now it supports versions above 6.3.0, fixed JSON parse errors
mdstat, now collects mismatch_cnt
openvpn_log
node.d.plugin NODE.JS modules
snmp was incorrectly parsing new OID names as float. Fixed it.
charts.d.plugin BASH modules
nut now supports naming UPSes.
$system.cpu.processors.
TCP SYN and TCP accept queue alarms, replacing the old softnet dropped alarm that was too generic and reported many false positives.
bcache alarms.
mdstat alarms.
apcupsd alarms.
mysql alarms.
undefined instead of never.
UTC timezone to the list of available time-zones.
Published by firehol-automation over 6 years ago
Posted on twitter, facebook, reddit r/linux,
Hi all,
Another great netdata release: netdata v1.10.0 !
This is a birthday release: netdata is now 2 years old !
Many thanks to all the contributors that help building, enhancing and improving a project useful and helpful for thousands of admins, devops and developers around the world! You rock!
- @ktsaou
netdata now has a new web server (called static) with a fixed number of threads, providing a lot better performance and finer control of the resources allocated to it.
All dashboard elements (javascript) have been updated to their latest versions - this allows a smoother experience when embedding netdata charts on third party web sites and apps.
IMPORTANT: all users using older netdata are advised to update to this version. This version offers improved stability, security and a huge number of bug fixes, compared to any prior version of netdata.
And as always, hundreds more enhancements, improvements and bugfixes.
BTRFS space usage monitoring and related alarms.
netdata is able to detect if any of the space-related components (physical disk allocation, data, metadata and system) of BTRFS is about to become exhausted!
#3150 - thanks to @Ferroin for explaining everything about btrfs...
netdata now monitors bcache metrics - they are automatically added to any disk that is found to be a bcache disk.
New plugin to monitor ceph, the unified, distributed storage system designed for excellent performance, reliability and scalability (#3166 @lets00).
systemd-nspawn containers.
virsh is now called with -r to avoid prompting for password #3144
cgroup-network is now a lot more strict, preventing unauthorized privilege escalation #3269
cgroup-network now searches for container processes in sub-cgroups too - this improves the mapping of network interfaces to containers
cgroup-network now works even when there are no veth interfaces in the system
netdata can now monitor isc-ntpd. @rda0 did a marvelous job decoding NTP Control Message Protocol, collecting ntpd metrics in the most efficient way #3421, #3454 @rda0
btw, netdata also monitors chrony but the chrony module of netdata is disabled by default, because certain CentOS versions ship a version of chrony that consumes 100% cpu when queried for statistics.
Added python plugin to monitor the operation of nginx plus servers. The plugin monitors everything about nginx+, except streaming #3312 @l2isbad
netdata now monitors libreswan tunnels - #3204
netdata now has an httpcheck plugin (module of python.d.plugin), that can query remote http/https servers, track the response timings and check that the response body contains certain text #3448 @ccremer.
netdata now has a portcheck plugin (module of python.d.plugin), that can check whether any remote TCP port is open #3447 @ccremer
netdata now monitors icecast servers #3511 @l2isbad.
netdata now monitors traefik reverse proxies - #3557.
netdata can now monitor java spring-boot applications @Wing924
netdata now monitors dnsdist name servers - @nobody-nobody #3009
hidden to add the dimension, but make it hidden on the dashboard - a hidden dimension can participate in various calculations, including alarms).
zinit to allow them to get initialized without altering their values (this is useful if you have rare metrics that you need to initialize when netdata starts).
Several new charts have been added to monitor (#3400 by @anayrat):
Also, the postgres plugin now also works when postgres is in recovery mode.
netdata prior to this version was detecting the user and group of processes by examining the ownership of /proc/PID/stat. Unfortunately it seems that the ownership of files in /proc does not change when the process switches user. So, netdata could not detect the user and group of processes that started as root and then switched to another user.
Now netdata reads /proc/PID/status:
/proc/PID/statm (all the information of /proc/PID/statm is available in /proc/PID/status)
VmSwap, so a new chart has been added to monitor the swap memory usage per process, user and group.
The new plugin is 20% more expensive in terms of CPU. We tried hard to optimize it, but this is as good as it can get. Read about it at #3434 and #3436
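As an illustration of what reading /proc/PID/status gives access to, the sketch below parses the text of such a file and pulls out a user ID and the VmSwap value. It is written in Python for brevity and is not how apps.plugin (a C program) does it. The helper names are hypothetical, and picking the effective UID is an assumption, since the notes do not say which of the four IDs on the Uid line is used.

```python
from typing import Dict


def parse_proc_status(text: str) -> Dict[str, str]:
    """Turn the 'Key:<tab>value' lines of /proc/PID/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def effective_uid(fields: Dict[str, str]) -> int:
    # The Uid line lists real, effective, saved-set and filesystem UIDs.
    # Using the effective UID here is an assumption made for this sketch.
    return int(fields["Uid"].split()[1])


def swap_kib(fields: Dict[str, str]) -> int:
    # VmSwap looks like "128 kB"; it is absent for kernel threads.
    return int(fields.get("VmSwap", "0 kB").split()[0])


if __name__ == "__main__":
    with open("/proc/self/status") as handle:
        status = parse_proc_status(handle.read())
    print(effective_uid(status), swap_kib(status))
```

Because the Uid line reflects the current credentials of the process, a daemon that started as root and then dropped privileges shows its new user here, which is the case the ownership of /proc/PID/stat could not cover.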
Added charts:
@ktarasz
netdata now uses /proc/uptime when CLOCK_BOOTTIME does not report the same uptime. In containers CLOCK_BOOTTIME reports the uptime of the host, while /proc/uptime reports the uptime of the container, so now netdata correctly reports the uptime of the container.
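The selection rule described above can be sketched as follows. This is an illustration under stated assumptions rather than Netdata's code: `parse_proc_uptime` and `pick_uptime` are hypothetical helpers, and the one-second tolerance used to decide that the two sources disagree is invented for the example.

```python
def parse_proc_uptime(text: str) -> float:
    """Return the first field of /proc/uptime, the uptime in seconds."""
    return float(text.split()[0])


def pick_uptime(boottime: float, proc_uptime: float, tolerance: float = 1.0) -> float:
    """Prefer /proc/uptime when it disagrees with CLOCK_BOOTTIME.

    The tolerance is an assumption; the notes only say the values
    'do not report the same uptime'.
    """
    if abs(boottime - proc_uptime) > tolerance:
        return proc_uptime  # likely inside a container
    return boottime


if __name__ == "__main__":
    import time

    with open("/proc/uptime") as handle:
        from_proc = parse_proc_uptime(handle.read())
    # time.CLOCK_BOOTTIME is available on Linux in recent Python 3 versions.
    from_clock = time.clock_gettime(time.CLOCK_BOOTTIME)
    print(pick_uptime(from_clock, from_proc))
```

On a regular host both sources agree and either value works; the difference only matters where /proc/uptime is virtualized for the container.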
various fixes to better monitor rebuild time and rate @l2isbad
to_scan dimension
Added several charts for translog / indices segments statistics and JVM buffer pool utilization, which are often helpful when evaluating an elasticsearch node health #3544 @NeonSludge
netdata now supports monitoring multiple APC UPSes.
netdata now also supports monitoring IPv6 leases - @l2isbad
solar_consumption @ccremer
Added web server response timings histogram #3558 @Wing924.
/etc/netdata/python.d.conf is missing @l2isbad
charts.d.plugin BASH modules can now have custom number of retries in case of data collection failures #3524.
static web server. This web server allows netdata to work around memory fragmentation (since the threads are fixed, the underlying memory allocators reuse the same memory arenas) and cpu utilization (we can control the number of threads that will be used by netdata). This is the default now. #3248
the print button now respects the URL path netdata is hosted.
dygraphs updated to the latest version - this fixes an issue that prevented netdata charts from being interactive under certain conditions
added dygraph theme logscale #3283
fontawesome updated to version 5
d3 updated to the latest version (this broke c3 charts that require an older version)
added d3pie charts
custom dashboards can now have alarms for specific roles (all, none, one or more).
allow stacked charts to zoom vertically when dimensions are selected
netdata now has a global XSS protection #3363
netdata now uses intersectionObserver when available #3280 - this improves the scrolling performance of the dashboard.
prevent date, time and units from wrapping at the charts legends #3286
various units scaling improvements #3285
added data-common-colors="NAME" chart option for custom dashboards #3282.
added wiki page for creating custom dashboards on Atlassian's Confluence.
prevented a double click on the charts' toolbox from selecting the text of the buttons.
fixed the alignment of dashboard icons #3224 @xPaw
added a simple js, called refresh-badges.js, to update badges on a custom web page
netdata badges can now be scaled #3474
gtime parameter, for group time. This is used to request that netdata return values at a different rate (i.e. gtime=60 on a X/sec dimension, will return X/min).
dimensions= parameter now supports simple patterns #3170 and added option values match-ids and match-names to control which matches are executed for dimensions.
system.swap alarms now send notifications with a 30 seconds delay, to work-around a kernel bug that incorrectly reports all swap as instantly used under containers #3380.
added alarm to predict the time a mount point will run out of inodes #3566.
all system alarms are now ported to FreeBSD too #3337 @arch273
added alerta.io notifications @kattunga
added available memory alarm
removed unsupported html tags from hipchat notifications.
pagerduty notifications have been modified to avoid incident duplication #3549.
alarm definitions can now use both chart IDs and chart names (prior to this version only chart IDs were allowed).
curl options (eg for disabling SSL certificates verification) for alarm-notify.sh can now be defined in health_alarm_notify.conf.
netdata can now send notifications to IRC channels #3458 @manosf
IRCCloud web client:
Irssi terminal client:
send hosts matching = * pattern.
EALREADY or EINPROGRESS.
host tags (the tags have to be formatted in a json friendly way) #3556.
timestamps=yes|no to /api/v1/allmetrics to support prometheus Pushgateway #3533
netdata_info variable with the version of netdata
netdata_host_tags to netdata_host_tags_info (the old exists but is deprecated and will be removed eventually)
average metrics, netdata remembers the last access time the prometheus collected metrics, on a per host basis.
stream.conf option multiple connections = accept | deny to allow or deny multiple connections for the same netdata host. The default remains accept, but it is likely to be changed to no on future versions.
netdata-updater was growing the PATH variable on each of its runs - fixed it.
--accept and --dont-start-it command line options to kickstart-static64.sh
long double support (useful in embedded devices that don't support long double numbers) #3354
netdata.spec to allow building netdata on older and newer rpm based distros. Also added a script to build a netdata rpm
curl provided with this path.
gap when lost iterations to control the number of iterations that should be lost to show a gap on the charts.
idle process scheduling priority, even when it was configured to do otherwise. Fixed it #3523
snapshots
We can now save and load dashboard snapshots for any timeframe in any resolution. snapshots allow us to save artifacts, evidence, documentation of incidents, or just the raw data for postmortem analysis.
highlighted time-frame
We can now highlight a selected time-frame on all dashboard charts. So, to quickly compare charts press ALT or CONTROL and select an area on one chart. The same area will be highlighted on all charts.
export to PDF
We can now export netdata dashboards to PDF, for any timeframe with any detail.
access lists (IP filtering)
We can now setup IP filtering at netdata.conf for all functions of netdata (dashboard access, streaming, registry, badges, etc - no more iptables rules for protecting netdata).
TCP overflows and connection drops
netdata can now detect TCP listening sockets overflows and connection drops, for any server running on the host (even the ones netdata is not aware of).
libvirt VMs
netdata now detects libvirt network interfaces and moves them to the VM section of the dashboard (it also supports .libvirt-qemu naming of cgroups).
Units auto-scaling
netdata dashboards can now scale units (KB -> MB -> GB -> TB, etc), on the fly.
Units conversions
netdata dashboards can now convert units (eg. Celsius to Fahrenheit, seconds to DDd:HH:MM:SS, etc), on the fly.
Multiple Timezones
netdata dashboards can now change timezone on the fly (yes, we can now compare charts with server logs).
python.d.plugin rewritten
@l2isbad rewrote the whole of it, to add flexibility and support the latest netdata features! The new plugin supports the old python modules.
better / faster dashboard scrolling
netdata now uses passive event listeners to detect page scrolling. This significantly improved the responsiveness of the dashboard (check your dashboard settings: sync scrolling is the fastest, async is closer to the older behavior).
netdata now monitors couchdb, powerdns, beanstalkd and dnsdist !
netdata now detects redis background save failures
netdata can now send flock.com and kavenegar.com alarm notifications
and as always... dozens more improvements, enhancements, new features and bug fixes!
Netdata can now export and import dashboard snapshots.
Snapshots are JSON files containing everything the dashboard needs to be rendered: charts and chart data.
They are exported as JSON files, to your computer. The saved snapshots can be loaded back on any netdata dashboard (even on a different host). When importing, no network traffic is generated. The web browser loads the local file and renders an interactive dashboard to examine it.
The current visible timeframe of the dashboard is respected, so first align the dashboard to the timeframe required and then click "Export". The pop-up allows selecting the resolution of the export (its detail).
Press the ALT or CONTROL key and select a time-frame at a chart. An overlay will appear with the selected time-frame and all the charts will highlight the same region.
The highlighted time-frame:
my-netdata menu
Also, netdata charts can now be zoomed vertically (use the SHIFT key, like in zoom, but select the chart vertically):
netdata dashboards can now be printed to PDF. Just click the 🖨️ icon on the dashboard.
The current visible timeframe of the dashboard is respected, so first align the dashboard to the timeframe required and then click "Print".
netdata can now check the client IPs connecting to it and deny/allow access based on your settings. No more iptables rules to control access to netdata.
All these settings are netdata simple patterns that are checked against the client IP (string matching - not subnet matching). localhost clients (IPv4, IPv6 and unix domain sockets) can be matched with localhost:
[web].allow connections from to match the clients' IPs allowed to connect to netdata. This has the same effect as iptables (but implemented at the application level - so clients will get connected, and disconnected immediately if they are not allowed access, without any response from netdata).
netdata.conf: [web].allow dashboard from to match the clients' IPs that are allowed to access the dashboard (ie fetch static files and query netdata API).
netdata.conf: [web].allow badges from to match the clients' IPs that are allowed to access badges (the dashboard clients are allowed to access badges too, so this setting allows badges to clients that do not have access to the dashboard).
netdata.conf: [web].allow streaming from to match the clients' IPs that are allowed to stream metrics.
stream.conf: [API_KEY].allow from to match the clients' IPs allowed to push metrics for the given API KEY.
stream.conf: [MACHINE_GUID].allow from to match the clients' IPs allowed to push metrics for the specific machine. netdata will also check the API keys supplied by slaves and proxies connected.
netdata.conf: [web].allow netdata.conf from to limit the clients that can get netdata.conf - by default netdata allows only private IPs.
netdata.conf: [registry].allow from to limit the clients allowed to access the registry (only when this netdata acts as a registry).
Added a new chart: ipv4.tcplistenissues with dimensions ListenOverflows and ListenDrops.
This chart detects if any listening TCP socket on the host, is overflown, or it drops connections. This is system-wide: any listening TCP socket, of any application.
The chart will not be shown if these kernel counters are zero. It will be enabled automatically if it is found non-zero at any point (it is collected via /proc/net/netstat every second). If you need to enable it even if it is zero, edit netdata.conf and set:
[plugin:proc:/proc/net/netstat]
TCP listen issues = yes
Two alarms have been added, one for ListenOverflows and one for ListenDrops, that detect if there is any overflow or drop in the last minute (they run every 10 seconds).
slack alarm for overflows:
slack alarm for drops:
and the alarms configuration:
The alarms will automatically be attached when the chart is active.
The overflows dimension and alarm is supported on FreeBSD too.
/proc/net/sockstat and /proc/net/sockstat6
These files provide sockets statistics for all protocols.
netdata also adds 3 new alarms:
netdata proxies with more than 100 slaves, had a timing issue that caused them to crash randomly on slave reconnects. Parts of the code have been rewritten to get rid of the timing issue.
netdata slaves and proxies, now have a protection that ensures they will never use 100% CPU, even if the master is misbehaving.
expired orphaned hosts are now removed from the my-netdata menu of the dashboard.
streaming functions can now be monitored via access.log
streaming now supports IP filtering. So the entire streaming functionality, API keys and MACHINE GUIDs can be associated with one or more IPs or IP patterns.
streaming now transfers alarm variables too
@l2isbad did a marvelous job rewriting python.d.plugin. The new plugin:
supports option autodetection_retry: SECONDS. When set to non-zero, the plugin will re-check the module at that interval. This solves the problem that netdata gave up collecting metrics from an application, if the application was not found running when netdata started. By default it is zero for all modules, so you need to enable it for each application you need it for.
got a rewrite of several functions, like logging, module configuration, chart and dimensions management.
the new URL service disables by default certificates checks, to allow self-signed certificates to work without configuration.
The new plugin is compatible with custom python modules developed for the previous version.
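The retry behavior that autodetection_retry describes can be pictured with a small sketch. This is not the plugin's code: check_with_retry, the max_attempts cap and the injectable sleep are hypothetical names added here so the example terminates and can be tested.

```python
import time


def check_with_retry(check, autodetection_retry=0, max_attempts=3, sleep=time.sleep):
    """Call check() until it succeeds; return True on success, False otherwise.

    autodetection_retry: seconds to wait between re-checks; 0 disables retrying.
    max_attempts: hypothetical cap, so that this sketch always terminates.
    sleep: injectable wait function (defaults to time.sleep).
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return True
        # With retrying disabled (or attempts exhausted), give up here.
        if autodetection_retry <= 0 or attempt == max_attempts:
            return False
        sleep(autodetection_retry)
    return False
```

With autodetection_retry left at zero the first failed check is final, which mirrors the default described above.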
custom regex now supports parsing hostnames and IPs @l2isbad
web_log now parses lines with error 408 (request timeout - these are a special case, since the request has not been received by the web server, so the log line is incomplete) @l2isbad
now properly parses resp_length with value - @racciari
CouchDB maintainer @wohali, submitted a couchdb plugin for netdata. The plugin monitors:
2 charts have been added to monitor background save health status, bundled with 2 alarms that detect if background save has failed, or background save is slow (warn > 10 mins, crit > 20min). @l2isbad
netdata now monitors PowerDNS, @l2isbad
netdata now monitors beanstalkd, @l2isbad
netdata now monitors dnsdist, @nobody-nobody
disks under Linux are renamed using /dev/disk/by-label. An option has been added at netdata.conf to also allow renaming based on /dev/disk/by-id.
chrony is now disabled by default, because there have been reports that chronyc enters an infinite loop in CentOS and RHEL.
tomcat improvements to support flavors of the tomcat server @Wing924
zfs on FreeBSD now monitors ZFS TRIM statistics
disks monitoring charts on FreeBSD got a lot more FreeBSD related dimensions.
added CPU frequency charts on FreeBSD (Linux already had them).
chart system.io (the total system Disk I/O) is now calculated by aggregating the reads and writes of all physical disks. The previous system.io chart (that is based on pgpgin and pgpgout from /proc/vmstat) is now named system.pgpgio. The key difference is that the new system.io now sees ZFS I/O, and it also correctly and accurately sums the real disk bandwidth of RAID arrays.
chart system.net (the total system network bandwidth) is now calculated by aggregating the bandwidth of all physical network interfaces and is common for both IPv4 and IPv6.
tc (QoS) charts now sort the dimensions on the legends, the same way tc reports them.
postgres: in versions before 10 the WAL directory was named pg_xlog, and from 10 upwards it has been renamed to pg_wal @facetoe
mysql (and mariadb) got new charts for galera replication @spinitron
openvpn_log improvements @l2isbad
smartd improvements @l2isbad
varnish module has been rewritten @l2isbad
mdstat regex fix @l2isbad
smartd_log improvements @l2isbad
dns_query_time improvements @wungad
isc_dhcpd improvements @wungad
freeipmi.plugin got a command line option (can be given at netdata.conf) to ignore certain sensor IDs that are faulty.
freeradius improvements @wungad
node.d.plugin bugfixes
netdata.conf, plugins directory = "DIRECTORY1" "DIRECTORY2" ..., up to 20 directories. By default netdata sets:
[global]
plugins directory = "/usr/libexec/netdata/plugins.d" "/etc/netdata/custom-plugins.d"
netdata now supports alarms variables.
Each plugin can now define host global and chart local variables with static values, that can be used in alarms' expressions. So, hosts and charts can now have any number of static values associated with them (eg. an application server may expose its max connections limit), and these static values can be used to trigger alarms (eg. the current connections, is compared to the max connections variable). The whole setup allows alarm templates to use this feature (eg each netdata can maintain different such variables for each server it monitors).
Alarm variables are propagated to upstream netdata servers.
added init file for SLC 6.9 and CloudLinux Server release 6.9
packages installer was incorrectly detecting all python versions as version 2.
a makeself bug that prevented the static netdata binaries from being installed on busybox systems, has been fixed.
openrc startup script (gentoo, alpine) had hardcoded the path to netdata. This affected all static-64bit builds when installed on these distros. Fixed.
the static 64bit installer now downloads netdata.conf, much like the git installer does.
openrc / gentoo init improvements @candrews
enabled support for macOS versions 10.5+ (10.11 was working already) @vlvkobal
enabled support for FreeBSD 12 @vlvkobal
fixed a crash on macOS hosts with empty disk names.
added Dockerfile.armv7hf for running netdata under docker on ARM v7 machines @justin8
hover selection of charts is now faster on all browsers. Perfect on Chrome, Firefox and Opera. Quite usable on Edge.
the dashboard is now fixed when a modal is open, preventing scrolling the page.
the dashboard now uses fontawesome 5.0.1 for icons.
the chart names can now be searched with browser control-F (find in page). Because netdata lazy loads all charts, it was impossible to search for a chart. Now the charts are searchable. This is important on dashboards with several hundreds of statsd charts, because all these charts appear under the same section.
netdata now detects libvirt VM network interfaces and moves them to the VM section of the dashboard. The same functionality already exists for containers.
Show the context of each chart. The context is used in alarm templates. (hover on the date of the chart)
Show the resolution of the chart. (hover on the time of the chart)
The dashboard now adds a tooltip at the date of the charts, to show the plugin and its module that collects each chart.
The dashboard should now put a lot less CPU pressure on the browser when the page does not have focus.
The dashboard does dynamic units scaling, on the fly! It converts:
kilobits/s to megabits/s or gigabits/s
kilobytes/s to megabytes/s or gigabytes/s, similarly for KB/s
MB to KB, GB or TB
GB to MB or TB
Chart units dynamically adapt based on the value of the selected dimension too:
Custom dashboards can give data-desired-units="UNITS" and netdata will automatically convert the presented values to the desired units. UNITS can be any of the supported ones, or auto for auto-scaling based on the values, or original to show the original units maintained by the netdata server.
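As a rough illustration of what auto-scaling means for a bit-rate chart, the sketch below walks a value up a unit ladder. It is not the dashboard's JavaScript: auto_scale and BIT_RATE_UNITS are hypothetical names, and the factor of 1000 follows the "1000 bits = 1 kilobit" convention mentioned elsewhere in these notes.

```python
# Hypothetical unit ladder for bit rates (1000 bits = 1 kilobit).
BIT_RATE_UNITS = ["kilobits/s", "megabits/s", "gigabits/s"]


def auto_scale(value, units=BIT_RATE_UNITS, factor=1000.0):
    """Step up the unit ladder while the magnitude is at least `factor`.

    Returns a (scaled_value, unit_name) tuple; values larger than the
    last unit can express simply stay in the last unit.
    """
    index = 0
    scaled = float(value)
    while abs(scaled) >= factor and index < len(units) - 1:
        scaled /= factor
        index += 1
    return scaled, units[index]
```

For example, auto_scale(2500) gives 2.5 in megabits/s, while a value below 1000 stays in kilobits/s.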
The dashboard now supports units conversions. Currently it converts:
temperatures from Celsius to Fahrenheit
seconds to human readable duration DDd:HH:MM:SS
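The two conversions listed above are simple arithmetic. A minimal sketch follows, with hypothetical function names and assuming non-negative whole seconds; the exact zero padding the dashboard uses is an assumption.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def seconds_to_duration(seconds):
    """Format non-negative whole seconds as DDd:HH:MM:SS."""
    seconds = int(seconds)
    days, rest = divmod(seconds, 86400)   # 86400 seconds per day
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return "%02dd:%02d:%02d:%02d" % (days, hours, minutes, secs)
```

For example, 90061 seconds is one day, one hour, one minute and one second.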
netdata can now convert all dates presented to any timezone. Traditionally netdata presented all charts at the timezone of the viewer. This allowed homogeneous central administration of systems that are installed all over the world. However, this was inefficient when we needed to compare the information presented on the dashboard, with the log files of the servers.
So, now netdata can present the charts on any timezone. The netdata server auto-detects the timezone of the server and new dashboard settings have been added to allow this conversion.
If autodetection of the server's timezone fails, the configuration option [global].timezone has been added in netdata.conf to set it. Also, the dashboard itself allows the viewers to configure the timezone (it is saved at browser local storage, so this has to be set just once per viewer).
To support all the above, the dashboard settings got a new tab, with all the required options:
statsd metrics can now be added to statsd synthetic charts using patterns. No need to add a dimension line for each statsd metric to be added. netdata will also extract the wildcarded part of the metric name and use that one for the dimension name.
dimensions added to statsd synthetic charts, can automatically be renamed using a dictionary. Each synthetic charts application has its own dictionary of name - value pairs, which is used to automatically rename statsd metrics when they are added to synthetic charts.
statsd timers and histograms now report zeros when nothing is collected
fixed a bug in netdata badges that was incorrectly matching zero values with the null color condition.
added API option display_absolute to allow badges to use the signed value for color evaluation, but present the absolute value.
warning emails sent by netdata, are now a little bit more orange (they were a bit greenish).
added flock.com notifications @tvarsis
added kavenegar.com support for SMS notifications @vahit
fixed a bug in email notifications that was triggering a corrupted MIME match by anti-spam solutions.
pushbullet notifications now track the devices, so that per device filtering at pushbullet is possible. Also improved the formatting a bit. @user501254
pushover notifications fixes (the priority of warnings was set incorrectly)
alarms can now use variables like this ${variable with spaces or +, -, *, / in it}. So, alarms can now use dimension names with any character in them.
access.log has been refactored to support monitoring all netdata operations
inodes monitoring is now by default disabled for mount points based on filesystems that do not have a maximum inode threshold (such as cephfs).
rabbitmq has been added to apps_groups.conf so that apps.plugin now monitors (cpu, memory, disk I/O, sockets, etc) for rabbitmq instances.
several email and log management apps have been added to email and logs targets of apps_groups.conf, @Flums
ceph target added to apps_groups.conf to allow netdata to monitor Ceph - the unified, distributed storage system, @k0ste
refactored several internal data collection plugins to eliminate a few hundreds of index lookups per second.
netdata.conf settings that are loaded from disk, but were the same with the default ones, were generated commented when the server was asked to give its config. Now all loaded settings are generated uncommented.
netdata simple patterns can now extract the wildcarded part of the string they match (used in statsd synthetic charts)
netdata simple patterns can allow escaping spaces by prefixing them with a backslash.
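To make the two simple-pattern features above concrete, here is a small sketch of a matcher. It is not netdata's C implementation: the function names are hypothetical, and it assumes a space-separated pattern list where * is a wildcard, a leading ! marks a negative pattern, and the first matching pattern decides the result.

```python
import re


def split_patterns(expression):
    """Split on unescaped spaces; a backslash-escaped space stays in the pattern."""
    parts = re.split(r"(?<!\\) +", expression.strip())
    return [p.replace("\\ ", " ") for p in parts if p]


def simple_pattern_match(expression, text):
    """Return (matched, wildcarded_part) for the first pattern that matches text."""
    for pattern in split_patterns(expression):
        negative = pattern.startswith("!")
        if negative:
            pattern = pattern[1:]
        # Translate to a regex: every '*' becomes a capturing wildcard.
        regex = "(.*)".join(re.escape(piece) for piece in pattern.split("*"))
        found = re.fullmatch(regex, text)
        if found:
            # The wildcarded part is whatever the '*' characters consumed.
            return (not negative, "".join(found.groups()))
    return (False, "")
```

For instance, matching "10.1.2.3" against "localhost 10.*" succeeds and yields "1.2.3" as the wildcarded part, which is the kind of value a statsd synthetic chart could use as a dimension name.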
netdata v1.8.0 released.
This release focuses on metrics streaming improvements and containers monitoring.
As always, this netdata is the fastest and the most stable netdata ever! Update now!
To install or update netdata, click here!
netdata, as a slave, was not handling all the error cases properly, resulting in 100% cpu utilization of a single core, under certain conditions. Especially under FreeBSD and macOS slaves, these conditions were always met, so using FreeBSD or macOS as netdata slaves, was completely broken.
netdata was incorrectly messing cached alarm state data between the alarms of the mirrored hosts, resulting in alarm notifications not dispatched under certain conditions. This was affecting only netdata masters (ie. netdata servers with more than one host databases, with health monitoring enabled). The alarms were generated and were visible at the dashboards, but the notifications were not always sent.
There was a minor issue with charts that were created with name aliases. When these charts were streamed from netdata slaves to netdata masters, they ended up with duplicate chart names (ie instead of type.name they had type.type.name).
Container network interfaces are now moved to the container section and they are rendered from the container view point (i.e. sent = what the container sent) - no more veth* garbage on the dashboard.
The interfaces also appear as eth0 (or whatever the container sees) and they are inside the container section of the dashboard. netdata maps each veth* interface to the right container, using plain cgroups features, so this works for all container managers (docker, lxc, etc).
Eliminated the nested containers shown under certain versions of lxc.
Also, containers and VMs now have summary gauges on the dashboard
netdata now uses urllib3 (shipped with netdata for both python v2 and v3) for URLService based plugins.
This enables HTTP keep-alive on all connections, which allows netdata to have permanent connections to third party web applications.
Fixed by @l2isbad
fping can now run as non-root, in static binary netdata packages
netdata can now listen on UNIX domain sockets (.sock files). This allows a local web server and netdata to communicate bypassing the network stack (for netdata set bind to = unix:/path/to/netdata.sock - this option supports multiple arguments, so netdata can listen to multiple unix sockets and tcp sockets, at the same time).
netdata was assuming that the JSON representation of a chart would at most be 1024 bytes, and it was generating corrupted JSON output when any chart was exceeding that limit. Removed the limitation (ie. now there is no limit).
netdata was crashing while starting, if no usable disks were found.
systemd netdata.service now allows setting negative netdata OOM score and restarts netdata if it crashes. The new netdata.service is not automatically installed when updating netdata. Either delete /etc/systemd/system/netdata.service and then update/re-install netdata, or copy the file by hand.
minor fixes at the installer, by @vincele
chrony plugin, by @domschl
web_log bugfixes, enhancements and optimizations (including squid logs), by @l2isbad
web_log now enables parsing HTTP/2 logs in custom_log_format, by @Funzinator
redis bugfixes, by @l2isbad
haproxy bugfixes, by @l2isbad
elasticsearch bugfixes and optimizations, by @l2isbad
rabbitmq bugfixes and optimizations, by @l2isbad
mdstat bugfixes, by @JeffHenson
tomcat improvements, by @Wing924
mysql improvements, by @alibo and @l2isbad
dovecot improvements
postgres improvements, by @facetoe
cpufreq fixed a bug that prevented accurate reporting of CPU frequencies. accurate works with the acpi-cpufreq driver and calculates the average CPU clock of the CPUs utilizing the accounting per frequency, as reported by the kernel, by @tycho
cpuidle performance improvements (faster under load) by @tycho
fail2ban bugfixes, by @l2isbad
SNMP plugin now uses latest net-snmp and the corrupted 64 bit counters encountered under certain node.js versions are now fixed.
easypiecharts and gauges can now render arbitrary ranges and animate clock wise or counter clock wise.
traditionally netdata was using 1024 bits = 1 kilobit. It is fixed: 1000 bits = 1 kilobit.
netdata charts should now work on wordpress pages.
alarm-notify.sh now supports debug mode, showing the exact commands it runs to send notifications, when export NETDATA_ALARM_NOTIFY_DEBUG=1
alarm-notify.sh now supports setting the sender email address of the emails it sends.
emails sent by alarm-notify.sh now include headers to reduce the possibility of them being scored as spam, by @Ferroin
network related alarms got new thresholds and improved badges
netdata now detects if the system has been suspended and pauses all alarms for 60 seconds on resume, to prevent false alarms (no more false alarms on laptops when they resume).
netdata alarms now support filtering based on hostname and O/S (linux, freebsd, macos). This means that netdata masters, can now support alarms for slaves of any O/S (i.e. a Linux netdata master can handle alarms for a FreeBSD slave).
netdata slack notifications now show the host that sent the alarm. In the image below, the alarm is about bangalore, and is sent by netdata-build-server (at the lower left corner):
Published by philwhineray over 7 years ago
This is release v1.7 of netdata.
netdata is still spreading fast: we are at 320.000 users and 132.000 servers! Almost 100k new users, 52k new installations and 800k docker pulls since the previous release 4 and a half months ago! netdata user base grows at about 1000 new users and 600 new servers per day! Thank you! You are awesome!
The next release (v1.8) will be focused on providing a global health monitoring service, for all netdata users, for free! Read more about it here. We need supporters for this cause. Join us!
netdata is now a (very fast) fully featured statsd server and the only one with automatic visualization: push a statsd metric and hit F5 on the netdata dashboard: your metric visualized. It also supports synthetic charts, defined by you, so that you can correlate and visualize your application the way you like it.
netdata got new installation options - it is now easier than ever to install netdata - we also distribute a statically linked netdata x86_64 binary, including key dependencies (like bash, curl, etc) that can run everywhere a Linux kernel runs (CoreOS, CirrOS, etc).
metrics streaming and replication has been improved significantly. All known issues have been solved and key enhancements have been added. headless collectors and proxies can now send metrics to backends when data source = as collected.
backends have got quite a few enhancements, including host tags, metrics filtering at the netdata side and sending of chart and dimension names instead of IDs; prometheus support has been re-written to utilize more prometheus features and provide more flexibility and integration options. IF YOU UPDATE FROM NETDATA 1.6 PLEASE CHECK YOUR DASHBOARDS, SINCE MANY METRICS HAVE CHANGED NAMES.
netdata now monitors ZFS (on Linux and FreeBSD), ElasticSearch, RabbitMQ, Go applications (via expvar), ipfw (on FreeBSD 11), samba, squid logs (with web_log plugin!).
netdata dashboard loading times have been improved significantly (hit F5 a few times on a netdata dashboard - it is now amazingly fast), to support dashboards with thousands of charts.
netdata alarms now support custom hooks, so you can run whatever you like in parallel with netdata alarms.
As usual, this release brings dozens more improvements, enhancements and compatibility fixes.
netdata is now a fully featured statsd server. It can collect statsd formatted metrics, visualize them on its dashboards, stream them to other netdata servers or archive them to backend time-series databases.
netdata statsd is fast. It can collect more than 1.200.000 metrics per second on modern hardware, more than 200Mbps of sustained statsd traffic. netdata statsd is inside netdata. This provides a distributed statsd implementation.
netdata also supports statsd synthetic charts: You can create dedicated sections on the dashboard to render the charts. You can control everything: the main menu, the submenus, the charts, the dimensions on each chart, etc.
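Pushing a metric to a statsd server takes only a UDP datagram. The sketch below is an illustration rather than an official client: format_statsd and send_statsd are hypothetical helpers, port 8125 is the conventional statsd port and is assumed here, and batching several newline-separated metrics in one datagram is also an assumption.

```python
import socket


def format_statsd(name, value, metric_type):
    """Build one statsd line in the name:VALUE|TYPE form, e.g. 'myapp.requests:1|c'."""
    return "%s:%s|%s" % (name, value, metric_type)


def send_statsd(lines, host="127.0.0.1", port=8125):
    """Send statsd lines in one UDP datagram, newline separated.

    host and port are assumptions; adjust them to where the statsd
    server actually listens. Returns the number of bytes sent.
    """
    payload = "\n".join(lines).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


# Usage sketch (needs a statsd server listening on the assumed address):
# send_statsd([format_statsd("myapp.requests", 1, "c"),
#              format_statsd("myapp.latency", 12.5, "ms")])
```

The type suffixes used here (c for counters, ms for timers) follow the metric formats listed in these notes.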
Read more about netdata statsd
name:INTEGER|c or name:INTEGER|C or name|c
INTEGER number supplied (positive, or negative).
name:FLOAT|g
FLOAT begins with + or -.
name:FLOAT|h
The same chart with sum unselected, to show the detail of the dimensions supported:
This is identical to counter.
name:INTEGER|m or name|m or just name
INTEGER number supplied (positive, or negative).
name:TEXT|s
name:FLOAT|ms
The same chart with the sum unselected:
There have been significant optimizations to the loading times of the dashboard. The dashboard loads instantly now, even when there are several hundreds of charts in it (hit F5 on the dashboard - it is super fast).
For those who know: we eliminated most browser reflows, by refactoring the way the charts are initialized and splitting initialization in 2 phases. Unfortunately we had to re-shape gauge and easypiecharts, so pay some attention to your custom dashboards after updating.
We now use natural sorting on the dashboard elements (i.e. instead of 1, 10, 2, 3 we get 1, 2, 3, 10).
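Natural sorting can be sketched in a few lines. The dashboard itself is JavaScript, so this Python version only illustrates the idea, and natural_key is a hypothetical helper name.

```python
import re


def natural_key(text):
    """Split text into digit and non-digit runs so numbers compare numerically.

    re.split with a capturing group alternates non-digit and digit runs,
    so odd positions always hold the digit runs.
    """
    parts = re.split(r"(\d+)", text)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


# Plain sorting compares character by character; natural sorting does not.
print(sorted(["1", "10", "2", "3"]))
print(sorted(["1", "10", "2", "3"], key=natural_key))
```

The first print shows the lexicographic order (1, 10, 2, 3) and the second the natural order (1, 2, 3, 10).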
There have been dozens of performance improvements on the netdata dashboard. Like all the previous releases, this release makes netdata the fastest netdata so far!
average, sum or volume (from the netdata database) are now more accurate.
contrib/nc-backend.sh, a script that can act as a fallback backend for graphite, opentsdb and compatibles.
as collected metrics to backends.
expvar! @kralewitz
web_log plugin can now monitor squid logs too! @l2isbad
web_log plugin can now monitor apache cache logs too (removed old apache_cache plugin) @l2isbad
web_log improvements - web_log is now a lot more powerful! @l2isbad
python.d.plugin LogService now supports monitoring web log files matching a pattern @l2isbad
/dev/mapper names. It also has improved docker compatibility.
haproxy improvements @l2isbad
dns_query_time plugin to monitor the response time of nameservers @l2isbad
cpufreq improvements @l2isbad
smartd_log improvements @pkoenig10
bind_rndc rewritten @l2isbad
lighttpd improvements (part of the apache plugin)
isc_dhcpd improvements @l2isbad
fping improvements
apps.plugin improvements (added many more applications to monitor, notably hadoop and friends, improved compatibility)
freeipmi improvements
mdstat improvements @l2isbad
mysql improvements @alibo
redis improvements @l2isbad
postgres rds fixes @facetoe
fail2ban improvements @l2isbad
idlejitter rewritten
openvpn improvements @l2isbad
numa improvements @Benje06
alarm-notify.sh now supports custom notification methods (you can hook whatever you like to netdata alarms).
lighttpd alarm
mongodb alarm @jnogol
ram utilizes KSM (kernel memory deduper).
map improvements for faster operation with huge databases.
clang, even on FreeBSD
Published by philwhineray over 7 years ago
Release announced on twitter, hacker news, reddit r/linux, reddit r/sysadmin, reddit r/linuxadmin, reddit r/freebsd, reddit r/devops, reddit r/homelab, facebook
netdata was first published on March 30th, 2016.
It has been a crazy year since then:
This is the first release that supports real-time streaming of metrics between netdata servers.
netdata can now be:
metrics databases can be configured on all nodes and each node maintaining a database may have a different retention policy and possibly run (even different) alarms on them.
There are 4 settings that control what netdata can be:
[global].memory mode in netdata.conf, controls if a netdata will maintain a local database and the type of it. For more information check Running a dedicated central netdata server.
[web].mode in netdata.conf, controls if netdata will expose its API, and the type of web server to enable (single or multi-threaded). Check netdata.conf configuration for streaming.
[stream].enabled in stream.conf, controls if netdata will stream its metrics to another netdata. Check stream.conf for sending metrics.
[API KEY].enabled in stream.conf, controls if netdata will accept metrics from other netdata. Check stream.conf for receiving metrics.
Using the above, we support a lot of different configurations, like these:
target | memory mode | web mode | stream enabled | send to backend | local alarms | local dashboard |
---|---|---|---|---|---|---|
headless collector | none | none | yes | not possible | not possible | no |
headless proxy | none | not none | yes | not possible | not possible | no |
proxy with db | not none | not none | yes | possible | possible | yes |
central netdata | not none | not none | no | possible | possible | yes |
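The table can be read as a decision rule over three of the four settings. The following is a minimal sketch of that rule, not something shipped with netdata: `netdata_role` is a hypothetical helper, and its arguments are assumed to be the values of [global].memory mode, [web].mode and [stream].enabled already read from the configuration files.

```shell
#!/bin/sh
# Hypothetical helper (not part of netdata): map three settings to the
# node types listed in the table.
#   $1 = [global].memory mode  ("none" or anything else)
#   $2 = [web].mode            ("none" or anything else)
#   $3 = [stream].enabled      ("yes" or "no")
netdata_role() {
    memory_mode=$1
    web_mode=$2
    stream_enabled=$3

    if [ "$stream_enabled" = "yes" ]; then
        if [ "$memory_mode" = "none" ] && [ "$web_mode" = "none" ]; then
            echo "headless collector"
        elif [ "$memory_mode" = "none" ]; then
            echo "headless proxy"
        elif [ "$web_mode" != "none" ]; then
            echo "proxy with db"
        else
            echo "not listed in the table"
        fi
    elif [ "$memory_mode" != "none" ] && [ "$web_mode" != "none" ]; then
        echo "central netdata"
    else
        echo "not listed in the table"
    fi
}

netdata_role none none yes    # prints: headless collector
```

Combinations that the table does not cover are reported as such rather than guessed.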
netdata now supports monitoring autoscaled ephemeral nodes, that are started and stopped on demand (their IP is not known).
When the ephemeral nodes start streaming metrics to the central netdata, the central netdata registers them in the my-netdata menu on the dashboard.
You can see this live at https://build.my-netdata.io (this server may not always be available for demo).
For more information check: monitoring ephemeral nodes.
netdata now cleans up the metrics of containers, guest VMs, network interfaces and mounted disks, automatically disabling their alarms too.
For more information check monitoring ephemeral containers.
Vladimir Kobal has ported apps.plugin to FreeBSD.
netdata can now provide Applications, Users and User Groups under FreeBSD too:
Also, the CPU utilization of netdata under FreeBSD is now a lot lower than in netdata v1.5.
See it live at our FreeBSD demo server.
Ilya Mashchenko has done a wonderful job creating a unified web log parsing plugin for all kinds of web server logs. With it, netdata provides real-time performance information and health monitoring alarms for web applications and web sites!
Requests by http status:
Requests by http status code family:
Requests by http status code:
Requests bandwidth:
Requests timings:
URL patterns of interest (you configure the patterns):
Requests by http method:
Requests by IP version:
Number of unique clients:
and a lot more, including alarms:
alarm | description | minimum requests | warning | critical |
---|---|---|---|---|
1m_redirects | The ratio of HTTP redirects (3xx except 304) over all the requests, during the last minute. Detects if the site or the web API is suffering from too many or circular redirects. (i.e. oops! this should not redirect clients to itself) | 120/min | > 20% | > 30% |
1m_bad_requests | The ratio of HTTP bad requests (4xx) over all the requests, during the last minute. Detects if the site or the web API is receiving too many bad requests, including 404, not found. (i.e. oops! a few files were not uploaded) | 120/min | > 30% | > 50% |
1m_internal_errors | The ratio of HTTP internal server errors (5xx), over all the requests, during the last minute. Detects if the site is facing difficulties to serve requests. (i.e. oops! this release crashes too much) | 120/min | > 2% | > 5% |
5m_requests_ratio | The percentage of successful web requests of the last 5 minutes, compared with the previous 5 minutes. Detects if the site or the web API is suddenly getting too many or too few requests. (i.e. too many = oops! we are under attack) (i.e. too few = oops! call the network guys) | 120/5min | > double or < half | > 4x or < 1/4x |
web_slow | The average time to respond to requests, over the last 1 minute, compared to the average of last 10 minutes. Detects if the site or the web API is suddenly a lot slower. (i.e. oops! the database is slow again) | 120/min | > 2x | > 4x |
1m_successful | The ratio of successful HTTP responses (1xx, 2xx, 304) over all the requests, during the last minute. Detects if the site or the web API is performing within limits. (i.e. oops! help us God!) | 120/min | < 85% | < 75% |
For more information check: the spectacles of a web server log file.
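To make the thresholds in the alarms table concrete, here is a minimal sketch of how the 1m_redirects row could be evaluated by hand. It is not netdata's health engine: `redirects_status` is a hypothetical function, and the NOT_EVALUATED label for traffic below the minimum request count is an assumption of this example. It uses integer arithmetic only, since plain shell has no decimals.

```shell
#!/bin/sh
# Hypothetical helper: classify one minute of traffic using the 1m_redirects
# thresholds from the table (minimum 120 requests, warning > 20%, critical > 30%).
#   $1 = redirects (3xx except 304) during the last minute
#   $2 = all requests during the last minute
redirects_status() {
    redirects=$1
    total=$2

    # Below the minimum request count the ratio is not judged at all
    # (the NOT_EVALUATED label is an assumption of this sketch).
    if [ "$total" -lt 120 ]; then
        echo "NOT_EVALUATED"
        return 0
    fi

    # Compare the ratio against 30% and 20% without decimals:
    # redirects/total > 30/100  is the same as  redirects*100 > total*30
    if [ $((redirects * 100)) -gt $((total * 30)) ]; then
        echo "CRITICAL"
    elif [ $((redirects * 100)) -gt $((total * 20)) ]; then
        echo "WARNING"
    else
        echo "CLEAR"
    fi
}

redirects_status 50 200    # 25% of 200 requests, prints: WARNING
```

The same pattern applies to the other ratio-based rows by swapping in their thresholds.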
netdata can now archive metrics to JSON backends (both push, by @lfdominguez, and pull modes).
netdata now has an IPMI plugin (based on freeipmi) for monitoring server hardware.
The plugin creates (up to) 8 charts, based on the information collected from IPMI:
It also supports alarms (including the number of sensors in critical state):
For more information, check monitoring IPMI.
Ilya Mashchenko builds python data collection plugins for netdata at a wonderful rate! He rocks!
nice
netdata has received a lot more improvements from many more contributors! (it was really a lot of work to dig into git log to collect all the above, so forgive me if I forgot to mention a few contributions and contributors).
Thank you all!
Published by ktsaou over 7 years ago
Release announced on twitter, hacker news, reddit r/linux, reddit r/sysadmin, reddit r/linuxadmin, reddit r/freebsd
Yet another release that makes netdata the fastest netdata ever!
This is probably the release with the largest changeset so far. A lot of work, by a lot of people made this release possible!
Vladimir Kobal has done magnificent work porting netdata to FreeBSD and MacOS.
Everything works:
Wow! Check it live on FreeBSD, at https://freebsd.my-netdata.io/
netdata supports data archiving to backend databases:
and of course all the compatible ones (KairosDB, InfluxDB, Blueflood, etc)
With this feature netdata can interface with your existing devops infrastructure and allow you to visualize its metrics with other tools, like grafana.
Ilya Mashchenko has created most of the python data collection plugins in this release! He rocks!
Shell scripts can now query netdata easily!
eval "$(curl -s 'http://localhost:19999/api/v1/allmetrics')"
After this command, all the netdata metrics are exposed to the shell as variables. Check:
# source the metrics
eval "$(curl -s 'http://localhost:19999/api/v1/allmetrics')"
# let's see if there are variables exposed by netdata for system.cpu
set | grep "^NETDATA_SYSTEM_CPU"
NETDATA_SYSTEM_CPU_GUEST=0
NETDATA_SYSTEM_CPU_GUEST_NICE=0
NETDATA_SYSTEM_CPU_IDLE=95
NETDATA_SYSTEM_CPU_IOWAIT=0
NETDATA_SYSTEM_CPU_IRQ=0
NETDATA_SYSTEM_CPU_NICE=0
NETDATA_SYSTEM_CPU_SOFTIRQ=0
NETDATA_SYSTEM_CPU_STEAL=0
NETDATA_SYSTEM_CPU_SYSTEM=1
NETDATA_SYSTEM_CPU_USER=4
NETDATA_SYSTEM_CPU_VISIBLETOTAL=5
# let's see the total cpu utilization of the system
echo ${NETDATA_SYSTEM_CPU_VISIBLETOTAL}
5
# what about alarms?
set | grep "^NETDATA_ALARM_SYSTEM_SWAP_"
NETDATA_ALARM_SYSTEM_SWAP_RAM_IN_SWAP_STATUS=CRITICAL
NETDATA_ALARM_SYSTEM_SWAP_RAM_IN_SWAP_VALUE=53
NETDATA_ALARM_SYSTEM_SWAP_USED_SWAP_STATUS=CLEAR
NETDATA_ALARM_SYSTEM_SWAP_USED_SWAP_VALUE=51
# let's get the current status of the alarm 'ram in swap'
echo ${NETDATA_ALARM_SYSTEM_SWAP_RAM_IN_SWAP_STATUS}
CRITICAL
# is it fast?
time curl -s 'http://localhost:19999/api/v1/allmetrics' >/dev/null
real 0m0,070s
user 0m0,000s
sys 0m0,007s
# it is...
# 0.07 seconds for curl to be loaded, connect to netdata and fetch the response back...
The _VISIBLETOTAL variable sums up all the dimensions of each chart.
The format of the variables is:
NETDATA_${chart_id^^}_${dimension_id^^}="${value}"
The value is rounded to the closest integer, since shell scripts cannot process decimal numbers.
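The naming rule can be reproduced in a script, for example to build a variable name from a chart id and a dimension id before reading it. This is a sketch under an assumption: the output above only shows a dot becoming an underscore (system.cpu gives NETDATA_SYSTEM_CPU_...), so replacing every other non-alphanumeric character in the same way is a guess. `netdata_var_name` is a hypothetical helper, not part of netdata.

```shell
#!/bin/sh
# Hypothetical helper: derive the shell variable name for a chart/dimension pair.
#   $1 = chart id (e.g. system.cpu)
#   $2 = dimension id (e.g. user)
netdata_var_name() {
    printf 'NETDATA_%s_%s\n' "$1" "$2" |
        tr '[:lower:]' '[:upper:]' |    # same effect as ${var^^} in bash
        tr -c 'A-Z0-9_\n' '_'           # assumption: any other character becomes "_"
}

netdata_var_name system.cpu user    # prints: NETDATA_SYSTEM_CPU_USER
```

Once the metrics have been sourced with the eval command shown earlier, something like `eval "value=\${$(netdata_var_name system.cpu user)}"` would copy the metric into `value`.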
netdata has received a lot more improvements from many more contributors! (it was really a lot of work to dig into git log to collect all the above, so forgive me if I forgot to mention a few contributions and contributors).
Thank you all!
Published by ktsaou about 8 years ago
Release announced on Hacker News
Release announced on reddit r/linux
Release announced on reddit r/sysadmin
Release announced on twitter
Many new alarms have been added to detect common kernel configuration errors and old alarms have been re-worked to avoid notification floods.
Alarms now support:
notification hysteresis (both static and dynamic)
notification self-cancellation, and
dynamic thresholds based on current alarm status
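As a rough illustration of a dynamic threshold based on the current alarm status (a sketch only, not netdata's health configuration syntax): the hypothetical function below raises a warning above 90 but, once raised, keeps it until the value drops to 80 or below, so a value hovering around 90 does not flip the alarm on every check. The numbers 80 and 90 are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper illustrating hysteresis with a status-dependent threshold.
#   $1 = current status (CLEAR or WARNING)
#   $2 = current integer value
next_status() {
    status=$1
    value=$2

    if [ "$status" = "WARNING" ]; then
        threshold=80    # already raised: stay raised until the value is 80 or below
    else
        threshold=90    # not raised: only raise above 90
    fi

    if [ "$value" -gt "$threshold" ]; then
        echo "WARNING"
    else
        echo "CLEAR"
    fi
}

next_status CLEAR 85      # prints: CLEAR (85 is not above 90)
next_status WARNING 85    # prints: WARNING (85 is still above 80)
```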
Also, a new alarms log:
netdata now supports:
For all the above methods, netdata supports role-based notifications, with multiple recipients for each role and severity filtering per recipient!
Also, netdata supports HTML5 notifications while the dashboard is open in a browser window (it does not need to be the active one).
All notifications (HTML5, emails, slack, pushover, telegram) are now clickable to get to the chart that raised the alarm.
improved IoT support!
netdata builds and runs with musl libc and runs on systems based on busybox.
improved containers support!
netdata runs on alpine linux (a low profile linux distribution used in containers).
Dozens of other improvements and bugfixes
netdata 1.4.0 - download release tarfiles from http://firehol.org/download/netdata/releases/v1.4.0
Published by ktsaou about 8 years ago
IMPORTANT:
Since netdata now uses python plugins, new packages must be
installed on a system to allow it to work.
For more information, please check the installation page.
Based on the POLL we made on github, health monitoring was the winner. So here it is!
netdata now has a powerful health monitoring system embedded.
netdata can generate badges with live information from the collected metrics.
Thanks to the great work of Paweł Krupa (@paulfantom), most BASH plugins have been ported to python.
The new python.d.plugin supports both python2 and python3 and data collection from multiple sources for all modules.
The following pre-existing modules have been ported to python:
The following new modules have been added:
Thanks to @simonnagl netdata now reports disk space usage.
dashboards now transfer certain settings from server to server when changing servers via the my-netdata menu.
The settings transferred are the dashboard theme, the online help status and current pan and zoom timeframe of the dashboard.
API improvements:
apps.plugin improvements:
netdata now runs with IDLE process priority (lower than nice 19)
netdata now instructs the kernel to kill it first when the system runs out of memory.
netdata listens for signals:
netdata can now bind to multiple IPs and ports.
netdata now has a new systemd service file (it starts as user netdata and does not fork).
Dozens of other improvements and bugfixes
netdata 1.3.0 - download release tarfiles from http://firehol.org/download/netdata/releases/v1.3.0
Published by ktsaou over 8 years ago
IMPORTANT:
This version requires libuuid. The package you need to build netdata is:
- uuid-dev (debian/ubuntu), or
- libuuid-devel (centos/fedora/redhat)
The central registry tracks all your netdata servers and bookmarks them for you at the my-netdata menu on all dashboards.
Every netdata can act as a registry, but there is also a global registry provided for free for all netdata users!
docker, lxc, or anything else. For each container it monitors CPU, RAM, DISK I/O (network interfaces were already monitored).
netdata 1.2.0 - download release tarfiles also from http://firehol.org/download/netdata/releases/v1.2.0