Instrumenting Go applications with Prometheus Demo
This repo contains a demo of code instrumentation using the official Prometheus Go client.
The application is a single-endpoint API that fetches the weather for a given city using the OpenWeatherMap API. The code responsible for fetching the weather information lives in the weather
package.
The application also uses a Redis server to store the number of requests made for each city. This logic lives in the redis
package and uses a channel to serialize the calls to the Redis backend. This is done purely to illustrate different instrumentation patterns.
Use the different branches to navigate through the code as we go from completely uninstrumented code to code with some instrumentation coverage.
This is the base code that implements the logic described in the introduction section.
In this step we introduced the Prometheus client to our vendored dependencies and included its HTTP handler on our server:
r.Handle("/metrics", promhttp.Handler())
With this single line of code we add several metrics about the Go runtime to our /metrics
endpoint.
We begin this branch by adding instrumentation to our own handler.
Registering the new metric:
var (
	handlerDuration = prometheus.NewHistogramVec(prometheus.HistogramOpts{
		Name: "myapp_handlers_duration_seconds",
		Help: "Handlers request duration in seconds",
	}, []string{"path"})
)

func init() {
	prometheus.MustRegister(handlerDuration)
}
We then create a wrapper handler that calls the underlying handler and measures the duration it took:
func instrumentHandler(pattern string, next http.HandlerFunc) http.HandlerFunc {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		now := time.Now()
		next(w, r)
		handlerDuration.WithLabelValues(pattern).Observe(time.Since(now).Seconds())
	})
}
We use pattern
as a label to be able to distinguish the duration of different handlers.
The router is updated to use this handler on top of the old one:
r.HandleFunc("/city/{name}/temp", instrumentHandler("/city/{name}/temp", cityTemp))
We can also instrument our server and keep track of open connections. First, registering the new metric:
var (
	openConnections = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "myapp_open_connections",
		Help: "The current number of open connections.",
	})
)

func init() {
	prometheus.MustRegister(openConnections)
}
And then updating our http.Server:
s := &http.Server{
	Addr:         ":8080",
	Handler:      router(),
	ReadTimeout:  10 * time.Second,
	WriteTimeout: 30 * time.Second,
	ConnState: func(c net.Conn, s http.ConnState) {
		switch s {
		case http.StateNew:
			openConnections.Inc()
		case http.StateHijacked, http.StateClosed:
			openConnections.Dec()
		}
	},
}
The ConnState
callback is called every time a connection changes state.
In this step we added instrumentation to our weather
package. The instrumentation on this package follows the RED pattern (Rate, Errors, Duration):
This is done by adding the following metrics and initializing them:
var (
	requestsDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name: "myapp_weather_request_duration_seconds",
		Help: "The duration of the requests to the weather service.",
	})
	requestsCurrent = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "myapp_weather_requests_current",
		Help: "The current number of requests to the weather service.",
	})
	requestsStatus = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "myapp_weather_requests_total",
		Help: "The total number of requests to the weather service by status.",
	}, []string{"status"})
	clientErrors = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "myapp_weather_errors",
		Help: "The total number of weather client errors.",
	})
)

func init() {
	prometheus.MustRegister(requestsDuration)
	prometheus.MustRegister(requestsCurrent)
	prometheus.MustRegister(requestsStatus)
	prometheus.MustRegister(clientErrors)
}
And updating the (w *weatherClient) do
func to properly update those metrics, by adding the following snippet at the beginning of the func. Note that resp and err must be named return values so the deferred closure can read them:
now := time.Now()
requestsCurrent.Inc()
defer func() {
	requestsDuration.Observe(time.Since(now).Seconds())
	requestsCurrent.Dec()
	if resp != nil {
		requestsStatus.WithLabelValues(strconv.Itoa(resp.StatusCode)).Inc()
	}
	if err != nil {
		clientErrors.Inc()
	}
}()
In this step we add instrumentation to the redis
package by implementing the prometheus.Collector
interface to expose some of the metrics already available on the redis
client.
The registration of the collector is done after the client is created, in the init
function.
prometheus.MustRegister(&redisCollector{client: client})
The collector is a simple struct that implements two methods:
var (
	requestsDesc  = prometheus.NewDesc("myapp_redis_connections_requests_total", "The total number of connection requests to redis pool.", []string{}, nil)
	hitsDesc      = prometheus.NewDesc("myapp_redis_connections_hits_total", "The total number of times a free connection was found in redis pool.", []string{}, nil)
	waitsDesc     = prometheus.NewDesc("myapp_redis_connections_waits_total", "The total number of times the redis pool had to wait for a connection.", []string{}, nil)
	timeoutsDesc  = prometheus.NewDesc("myapp_redis_connections_timeouts_total", "The total number of wait timeouts in redis pool.", []string{}, nil)
	connsDesc     = prometheus.NewDesc("myapp_redis_connections_current", "The current number of connections in redis pool.", []string{}, nil)
	freeConnsDesc = prometheus.NewDesc("myapp_redis_connections_free_current", "The current number of free connections in redis pool.", []string{}, nil)
)

type redisCollector struct {
	client *redis.Client
}

func (c *redisCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- requestsDesc
	ch <- hitsDesc
	ch <- waitsDesc
	ch <- timeoutsDesc
	ch <- connsDesc
	ch <- freeConnsDesc
}

func (c *redisCollector) Collect(ch chan<- prometheus.Metric) {
	stats := c.client.PoolStats()
	ch <- prometheus.MustNewConstMetric(requestsDesc, prometheus.CounterValue, float64(stats.Requests))
	ch <- prometheus.MustNewConstMetric(hitsDesc, prometheus.CounterValue, float64(stats.Hits))
	ch <- prometheus.MustNewConstMetric(waitsDesc, prometheus.CounterValue, float64(stats.Waits))
	ch <- prometheus.MustNewConstMetric(timeoutsDesc, prometheus.CounterValue, float64(stats.Timeouts))
	ch <- prometheus.MustNewConstMetric(connsDesc, prometheus.GaugeValue, float64(stats.TotalConns))
	ch <- prometheus.MustNewConstMetric(freeConnsDesc, prometheus.GaugeValue, float64(stats.FreeConns))
}
Every time Prometheus scrapes the /metrics
endpoint on our API, the Collect
function is called and sends the metrics to the provided channel.
Our handlers interact with the redis
package by calling redis.Increment
, which writes to a channel. Another goroutine in the redis
package keeps listening on this channel and talks to the Redis server every time a request is made through it.
It's useful to instrument the length of the channel to understand whether we need to increase its buffer size.
prometheus.MustRegister(prometheus.NewGaugeFunc(prometheus.GaugeOpts{
	Name: "myapp_redis_queue_current_length",
	Help: "The current number of items on redis queue.",
}, func() float64 {
	return float64(len(keys))
}))
This snippet registers a new metric using one of the New*Func
family of functions. This function (which returns the length of the keys
channel) will be called every time Prometheus scrapes our service. This instruments our channel and can be used to discover saturation and tune the buffer size.
Our call to redis.Increment
will block if the channel hits its buffer size. We can instrument our channel to discover if we are indeed blocking (saturation). This is done in this step.
We begin by introducing a new metric:
var (
	queueWaitDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name: "myapp_redis_queue_wait_duration_seconds",
		Help: "The wait duration when trying to write to the redis queue",
	})
)
And changing our redis.Increment
implementation to:
func Increment(key string) {
	select {
	case keys <- key:
	default:
		now := time.Now()
		keys <- key
		queueWaitDuration.Observe(time.Since(now).Seconds())
	}
}
This means that if the channel still has free buffer space, the first case succeeds: keys <- key
adds the key to the channel and we return immediately. If the buffer is full and the send would block, the default
case is reached and we record the time we spend blocked.
This metric can be used to measure our saturation and to tune the channel buffer size for our requirements.
In this step we add some instrumentation to our redis backend to measure the number of operations we are applying.
We start by registering a new metric:
var (
	redisOps = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "myapp_redis_worker_operations",
		Help: "The total operations performed by redis worker.",
	}, []string{"operation", "result"})
)
And then updating our background worker:
func startWorker(client *redis.Client) {
	fmt.Printf("Starting background worker...\n")
	for {
		opLabel := "ping"
		var err error
		select {
		case key := <-keys:
			result := client.Incr(key)
			opLabel = "incr"
			err = result.Err()
		case <-time.After(time.Second * 10):
			result := client.Ping()
			err = result.Err()
		}
		resultLabel := "ok"
		if err != nil {
			resultLabel = "error"
		}
		redisOps.WithLabelValues(opLabel, resultLabel).Inc()
	}
}
We use labels to distinguish the operation type and the result.