Prometheus is the new hotness on the monitoring front, thanks to its ability to integrate easily with the container environments that predominate in microservice architectures.
This article shows you how to add some quick and dirty Prometheus metrics to your Go application. The recipe adds a count, a total duration, and a quantile summary of durations for any RESTful endpoint built with go-restful. It requires the Prometheus Go client.
Install Prometheus Client
To start, install the Prometheus Go client with:
$ go get github.com/prometheus/client_golang/prometheus
If you are using Go vendoring, you'll need to run godep save or the like to copy the library into your repo.
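For example, with godep (assuming godep is installed and your package currently builds):
$ godep save ./...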
Add Metrics Variables
Then, add a file to your app called metrics.go (or some such name) to define your new metrics. In this example, I am going to instrument all of my RESTful endpoints as well as the hot path of my code that handles authentication. To do so, I define two summary vectors, register them, and add a small wrapper for my route functions:
package main

import (
	"strconv"
	"time"

	log "github.com/Sirupsen/logrus"

	"github.com/emicklei/go-restful"
	"github.com/prometheus/client_golang/prometheus"
)

var (
	requestDuration = prometheus.NewSummaryVec(
		prometheus.SummaryOpts{
			Name: "request_durations_microseconds",
			Help: "Request handler latency distributions.",
		},
		[]string{"verb", "resource", "code"},
	)

	authorizeDuration = prometheus.NewSummaryVec(
		prometheus.SummaryOpts{
			Name: "authorize_durations_microseconds",
			Help: "Authorize middleware latency distributions.",
		},
		[]string{"service"},
	)
)

func init() {
	prometheus.MustRegister(requestDuration)
	prometheus.MustRegister(authorizeDuration)
}

// simpleQuantile records a single-dimension latency distribution in Prometheus.
func simpleQuantile(begin time.Time, serviceName string) {
	authorizeDuration.WithLabelValues(
		serviceName,
	).Observe(float64(time.Since(begin)) / float64(time.Microsecond)) // nanoseconds -> microseconds, to match the metric name
}

// Quantile records a per-verb/resource/status-code latency distribution in Prometheus.
func Quantile(begin time.Time, req *restful.Request, resp *restful.Response) {
	requestDuration.WithLabelValues(
		req.Request.Method,
		req.SelectedRoutePath(),
		strconv.Itoa(resp.StatusCode()),
	).Observe(float64(time.Since(begin)) / float64(time.Microsecond)) // nanoseconds -> microseconds, to match the metric name
}

// instrumentHandler wraps a go-restful route function so that every call is
// timed and observed, with start/end debug logging.
func instrumentHandler(f restful.RouteFunction) restful.RouteFunction {
	return func(req *restful.Request, resp *restful.Response) {
		defer Quantile(time.Now(), req, resp)
		log.Debugf("%s %s start", req.Request.Method, req.SelectedRoutePath())
		f(req, resp)
		log.Debugf("%s %s end", req.Request.Method, req.SelectedRoutePath())
	}
}
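The request-side wrapper is used in the next section. The authorize metric works the same way: defer simpleQuantile at the top of the hot path. Here is a minimal sketch, assuming a hypothetical go-restful filter named authorize and a made-up service label value:
// authorize is a hypothetical go-restful filter for the authentication hot path.
func authorize(req *restful.Request, resp *restful.Response, chain *restful.FilterChain) {
	// "token-service" is an example label value; use whatever identifies your auth backend.
	defer simpleQuantile(time.Now(), "token-service")
	// ... check the request's credentials here ...
	chain.ProcessFilter(req, resp)
}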
Wrap Your Web Service Handlers
Wrap each handler function with the instrumentHandler() function, as has been done for the getAllDeploys function inside the To() call here:
ws.Route(ws.GET(DeploysPath).To(instrumentHandler(getAllDeploys)).
Doc("retrieves all deploy objects").
Operation("getAllDeploys").
Returns(200, OPSUCCESS, []Deploy{}).
Returns(404, DEPLOYNOTFOUND, ErrorResponse{}))
Create A Metrics Endpoint
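The Go client ships an HTTP handler that serves everything registered with prometheus.MustRegister(). Below is a minimal sketch, assuming your go-restful web services are added to the default container (which registers routes on http.DefaultServeMux) and that the app listens on port 9000 as in the output below; older versions of client_golang expose the same handler as prometheus.Handler().
package main

import (
	"net/http"

	log "github.com/Sirupsen/logrus"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// ... restful.Add(ws) calls for your web services go here ...

	// The default go-restful container serves from http.DefaultServeMux,
	// so the metrics handler can share the same mux and port.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9000", nil))
}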
View Your Metrics
Test, compile, and run the application. You will now see the new metrics show up once your endpoints have been visited.
$ curl http://localhost:9000/metrics
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
... snip ...
# HELP request_durations_microseconds Request handler latency distributions.
# TYPE request_durations_microseconds summary
request_durations_microseconds{code="200",resource="/cloud",verb="POST",quantile="0.5"} 1.11544818e+08
request_durations_microseconds{code="200",resource="/cloud",verb="POST",quantile="0.9"} 1.28751743e+08
request_durations_microseconds{code="200",resource="/cloud",verb="POST",quantile="0.99"} 1.28751743e+08
request_durations_microseconds_sum{code="200",resource="/cloud",verb="POST"} 4.1551865e+08
request_durations_microseconds_count{code="200",resource="/cloud",verb="POST"} 3
request_durations_microseconds{code="200",resource="/cloud/{key}",verb="DELETE",quantile="0.5"} 1.246744e+06
request_durations_microseconds{code="200",resource="/cloud/{key}",verb="DELETE",quantile="0.9"} 1.246744e+06
request_durations_microseconds{code="200",resource="/cloud/{key}",verb="DELETE",quantile="0.99"} 1.246744e+06
request_durations_microseconds_sum{code="200",resource="/cloud/{key}",verb="DELETE"} 1.246744e+06
request_durations_microseconds_count{code="200",resource="/cloud/{key}",verb="DELETE"} 1
request_durations_microseconds{code="200",resource="/cluster",verb="POST",quantile="0.5"} 1.20731725e+08
request_durations_microseconds{code="200",resource="/cluster",verb="POST",quantile="0.9"} 1.29331459e+08
request_durations_microseconds{code="200",resource="/cluster",verb="POST",quantile="0.99"} 1.29331459e+08
request_durations_microseconds_sum{code="200",resource="/cluster",verb="POST"} 4.34173575e+08
request_durations_microseconds_count{code="200",resource="/cluster",verb="POST"} 3
request_durations_microseconds{code="200",resource="/cluster/{key}",verb="DELETE",quantile="0.5"} 1.422451e+06
request_durations_microseconds{code="200",resource="/cluster/{key}",verb="DELETE",quantile="0.9"} 1.422451e+06
request_durations_microseconds{code="200",resource="/cluster/{key}",verb="DELETE",quantile="0.99"} 1.422451e+06
request_durations_microseconds_sum{code="200",resource="/cluster/{key}",verb="DELETE"} 1.422451e+06
request_durations_microseconds_count{code="200",resource="/cluster/{key}",verb="DELETE"} 1
request_durations_microseconds{code="200",resource="/deploy",verb="POST",quantile="0.5"} 2.717301e+06
request_durations_microseconds{code="200",resource="/deploy",verb="POST",quantile="0.9"} 2.717301e+06
request_durations_microseconds{code="200",resource="/deploy",verb="POST",quantile="0.99"} 2.717301e+06
request_durations_microseconds_sum{code="200",resource="/deploy",verb="POST"} 6.2752259e+07
request_durations_microseconds_count{code="200",resource="/deploy",verb="POST"} 2
request_durations_microseconds{code="200",resource="/deploy/{key}",verb="DELETE",quantile="0.5"} 1.396844e+06
request_durations_microseconds{code="200",resource="/deploy/{key}",verb="DELETE",quantile="0.9"} 1.396844e+06
request_durations_microseconds{code="200",resource="/deploy/{key}",verb="DELETE",quantile="0.99"} 1.396844e+06
request_durations_microseconds_sum{code="200",resource="/deploy/{key}",verb="DELETE"} 1.396844e+06
request_durations_microseconds_count{code="200",resource="/deploy/{key}",verb="DELETE"} 1
request_durations_microseconds{code="200",resource="/deploy/{key}",verb="PATCH",quantile="0.5"} 1.36752e+06
request_durations_microseconds{code="200",resource="/deploy/{key}",verb="PATCH",quantile="0.9"} 1.36752e+06
request_durations_microseconds{code="200",resource="/deploy/{key}",verb="PATCH",quantile="0.99"} 1.36752e+06
request_durations_microseconds_sum{code="200",resource="/deploy/{key}",verb="PATCH"} 1.36752e+06
request_durations_microseconds_count{code="200",resource="/deploy/{key}",verb="PATCH"} 1
request_durations_microseconds{code="200",resource="/deploys",verb="GET",quantile="0.5"} 1.344103e+06
request_durations_microseconds{code="200",resource="/deploys",verb="GET",quantile="0.9"} 1.344103e+06
request_durations_microseconds{code="200",resource="/deploys",verb="GET",quantile="0.99"} 1.344103e+06
request_durations_microseconds_sum{code="200",resource="/deploys",verb="GET"} 1.344103e+06
request_durations_microseconds_count{code="200",resource="/deploys",verb="GET"} 1
request_durations_microseconds{code="200",resource="/version",verb="GET",quantile="0.5"} 1.19220383e+08
request_durations_microseconds{code="200",resource="/version",verb="GET",quantile="0.9"} 1.19220383e+08
request_durations_microseconds{code="200",resource="/version",verb="GET",quantile="0.99"} 1.19220383e+08
request_durations_microseconds_sum{code="200",resource="/version",verb="GET"} 1.19220383e+08
request_durations_microseconds_count{code="200",resource="/version",verb="GET"} 1
Nota bene: metrics only start showing up after the corresponding endpoints have been hit at least once.