:title: Monitoring

Monitoring
==========

Statsd reporting
----------------

Zuul supports the statsd protocol. When enabled and configured (see below),
the Zuul scheduler emits raw metrics to a statsd receiver, which in turn
lets you generate graphs.

Configuration
~~~~~~~~~~~~~

Statsd support uses the statsd python module. Note that Zuul will start without
the statsd python module, so an existing Zuul installation may be missing it.

Configuration is done via the environment variables ``STATSD_HOST`` and
``STATSD_PORT``. They are interpreted by the statsd module directly, and there
is no such parameter in ``zuul.conf`` yet. Your init script must set both of
them before executing Zuul.

Your init script most probably loads a configuration file named
``/etc/default/zuul``, which would contain the environment variables::

  $ cat /etc/default/zuul
  STATSD_HOST=10.0.0.1
  STATSD_PORT=8125
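
To check that the variables from ``/etc/default/zuul`` actually reach the
process environment, a small stand-alone script can help. The sketch below is
an illustration only: it is neither Zuul code nor the statsd module's own
implementation. The helper names ``statsd_address``, ``format_counter`` and
``send_counter``, the fallback to port 8125, and the metric name
``zuul.example.check`` are all assumptions made for this example. It uses only
the standard library and the plain-text statsd counter format
(``<name>:<value>|c``).

```python
import os
import socket


def statsd_address(environ=None):
    """Return (host, port) built from STATSD_HOST and STATSD_PORT, or None."""
    environ = os.environ if environ is None else environ
    host = environ.get("STATSD_HOST")
    if not host:
        return None
    # Assumption: fall back to 8125 when STATSD_PORT is not set.
    return host, int(environ.get("STATSD_PORT", "8125"))


def format_counter(name, value=1):
    """Encode one counter sample in the plain-text statsd format."""
    return ("%s:%d|c" % (name, value)).encode("ascii")


def send_counter(name, address):
    """Send a single counter increment over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_counter(name), address)


if __name__ == "__main__":
    address = statsd_address()
    if address is None:
        print("STATSD_HOST is not set; no metrics would be sent")
    else:
        send_counter("zuul.example.check", address)  # hypothetical metric name
        print("sent a test counter to %s:%d" % address)
```

Run from the same environment the init script sets up, the script either
reports that ``STATSD_HOST`` is missing or sends one test counter that should
then show up on the receiver.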

Metrics
~~~~~~~

The metrics are emitted by the Zuul scheduler (`zuul/scheduler.py`):

**gerrit.event.<type> (counters)**
  Gerrit emits different kinds of messages over its `stream-events`
  interface. Zuul reports a counter for each type of event it
  receives from Gerrit.

  Some of the events emitted are:

  * patchset-created
  * draft-published
  * change-abandoned
  * change-restored
  * change-merged
  * merge-failed
  * comment-added
  * ref-updated
  * reviewer-added

  Refer to your Gerrit installation documentation for an exhaustive list of
  Gerrit event types.

**zuul.pipeline.**
  Holds metrics specific to jobs. The hierarchy is:

  #. **<pipeline name>** as defined in your `layout.yaml` file (ex: `gate`,
     `test`, `publish`). It contains:

     #. **all_jobs** Counter of jobs triggered by the pipeline.
     #. **current_changes** Gauge for the number of Gerrit changes being
        processed by this pipeline.
     #. **job** Subtree detailing per-job statistics:

        #. **<jobname>** The triggered job name.
        #. **<build result>** Result as defined in your triggering system. For
           Jenkins that would be SUCCESS, FAILURE, UNSTABLE, LOST. The
           metric holds an increasing counter. Whenever the result is
           SUCCESS or FAILURE, Zuul additionally reports the duration
           of the build as a timing event.

     #. **resident_time** Timing representing how long the change has been
        known by Zuul (which includes build time and Zuul overhead).
     #. **total_changes** Counter of the number of changes processed since
        Zuul started.
     #. **wait_time** Counter and timer of the wait time, measured as the
        difference between the job start time and the execute time, in
        milliseconds.

  Additionally, the `zuul.pipeline.<pipeline name>` hierarchy contains
  `current_changes` (gauge), `resident_time` (timing) and `total_changes`
  (counter) metrics for each project. The slash separator used in Gerrit
  project names is replaced by dots.

  As an example, given a job named `myjob` triggered by the `gate` pipeline
  which took 40 seconds to build, the Zuul scheduler will emit the following
  statsd events:

  * `zuul.pipeline.gate.job.myjob.SUCCESS` +1
  * `zuul.pipeline.gate.job.myjob` 40 seconds
  * `zuul.pipeline.gate.all_jobs` +1
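
The key layout described in this section can be sketched in a few lines of
Python. This is a hedged illustration of the naming scheme only, not the code
in `zuul/scheduler.py`: the function names, the ``(key, kind, value)`` tuples
and the project name ``openstack/nova`` are assumptions made for the example.
The timing entry follows the example above, with the duration in seconds.

```python
# Results for which a build duration is reported, per the description above.
TIMED_RESULTS = ("SUCCESS", "FAILURE")


def gerrit_event_key(event_type):
    """Counter key for one Gerrit stream-events message type."""
    return "gerrit.event.%s" % event_type


def pipeline_key(pipeline, *parts):
    """Join a pipeline name and sub-keys under the zuul.pipeline prefix."""
    return ".".join(("zuul.pipeline", pipeline) + parts)


def project_key(pipeline, project, metric):
    """Per-project key; slashes in the project name become dots."""
    return pipeline_key(pipeline, project.replace("/", "."), metric)


def build_events(pipeline, job, result, duration_seconds):
    """List the (key, kind, value) events for one finished build."""
    events = [(pipeline_key(pipeline, "job", job, result), "counter", 1)]
    if result in TIMED_RESULTS:
        events.append(
            (pipeline_key(pipeline, "job", job), "timing", duration_seconds)
        )
    events.append((pipeline_key(pipeline, "all_jobs"), "counter", 1))
    return events


if __name__ == "__main__":
    for event in build_events("gate", "myjob", "SUCCESS", 40):
        print(event)
    print(gerrit_event_key("patchset-created"))
    # "openstack/nova" is a hypothetical project name.
    print(project_key("gate", "openstack/nova", "total_changes"))
```

Keeping key construction in small pure functions like these makes it easy to
compare the expected names against what the statsd receiver actually stores.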