:title: Monitoring

Monitoring
==========

.. _statsd:

Statsd reporting
----------------

Zuul supports the statsd protocol. When enabled and configured (see
below), the Zuul scheduler emits raw metrics to a statsd receiver, which
in turn lets you generate graphs.

Configuration
~~~~~~~~~~~~~

Statsd support uses the ``statsd`` Python module. Note that support
is optional: Zuul will start without the ``statsd`` Python module
present.

Configuration is in the :attr:`statsd` section of ``zuul.conf``.

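As a rough illustration of what such a section might look like and how it
could be read, here is a minimal sketch using Python's standard
``configparser``. The option names ``server``, ``port`` and ``prefix``, the
sample values, and the helper ``read_statsd_settings`` are assumptions made
for this example; check the ``zuul.conf`` reference for the options your
version actually accepts.

```python
import configparser

# Hypothetical zuul.conf fragment. The option names (server, port, prefix)
# and their values are assumptions for illustration only.
SAMPLE_CONF = """
[statsd]
server = statsd.example.com
port = 8125
prefix = zuul-ci
"""


def read_statsd_settings(text):
    """Return (server, port, prefix), or None if statsd is not configured."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section("statsd"):
        return None
    section = parser["statsd"]
    server = section.get("server")
    if server is None:
        return None
    # 8125 is the conventional statsd UDP port.
    port = section.getint("port", fallback=8125)
    prefix = section.get("prefix", fallback=None)
    return server, port, prefix


settings = read_statsd_settings(SAMPLE_CONF)
```

Returning ``None`` when the section is absent mirrors the fact that statsd
support is optional.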
Metrics
~~~~~~~

These metrics are emitted by the Zuul :ref:`scheduler`:

.. stat:: zuul.event.<driver>.<type>
   :type: counter

   Zuul will report counters for each type of event it receives from
   each of its configured drivers.

.. stat:: zuul.tenant.<tenant>.pipeline

   Holds metrics specific to jobs. This hierarchy includes:

   .. stat:: <pipeline name>

      A set of metrics for each pipeline named as defined in the Zuul
      config.

      .. stat:: all_jobs
         :type: counter

         Number of jobs triggered by the pipeline.

      .. stat:: current_changes
         :type: gauge

         The number of items currently being processed by this
         pipeline.

      .. stat:: project

         This hierarchy holds more specific metrics for each project
         participating in the pipeline.

         .. stat:: <canonical_hostname>

            The canonical hostname for the triggering project.
            Embedded ``.`` characters will be translated to ``_``.

            .. stat:: <project>

               The name of the triggering project. Embedded ``/`` or
               ``.`` characters will be translated to ``_``.

               .. stat:: <branch>

                  The name of the triggering branch. Embedded ``/`` or
                  ``.`` characters will be translated to ``_``.

                  .. stat:: job

                     Subtree detailing per-project job statistics:

                     .. stat:: <jobname>

                        The triggered job name.

                        .. stat:: <result>
                           :type: counter, timer

                           A counter for each type of result (e.g.,
                           ``SUCCESS``, ``FAILURE``, or ``ERROR``) for the
                           job. If the result is ``SUCCESS`` or
                           ``FAILURE``, Zuul will additionally report the
                           duration of the build as a timer.

                  .. stat:: current_changes
                     :type: gauge

                     The number of items of this project currently being
                     processed by this pipeline.

                  .. stat:: resident_time
                     :type: timer

                     A timer metric reporting how long each item for this
                     project has been in the pipeline.

                  .. stat:: total_changes
                     :type: counter

                     The number of changes for this project processed by the
                     pipeline since Zuul started.

      .. stat:: resident_time
         :type: timer

         A timer metric reporting how long each item has been in the
         pipeline.

      .. stat:: total_changes
         :type: counter

         The number of changes processed by the pipeline since Zuul
         started.

      .. stat:: wait_time
         :type: timer

         How long each item spent in the pipeline before its first job
         started.

.. stat:: zuul.executor.<executor>

   Holds metrics emitted by individual executors. The ``<executor>``
   component of the key will be replaced with the hostname of the
   executor.

   .. stat:: builds
      :type: counter

      Incremented each time the executor starts a build.

   .. stat:: starting_builds
      :type: gauge

      The number of builds starting on this executor. These are
      builds which have not yet begun their first pre-playbook.

   .. stat:: running_builds
      :type: gauge

      The number of builds currently running on this executor. This
      includes starting builds.

   .. stat:: load_average
      :type: gauge

      The one-minute load average of this executor, multiplied by 100.

   .. stat:: pct_available_ram
      :type: gauge

      The available RAM (including buffers and cache) on this
      executor, as a percentage multiplied by 100.

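Because ``load_average`` and ``pct_available_ram`` are scaled by 100 before
being reported, a dashboard or script consuming them has to divide by 100 to
recover the original values. The sketch below shows that conversion; the
helper name ``decode_executor_gauges`` and the sample readings are
hypothetical.

```python
def decode_executor_gauges(load_average, pct_available_ram):
    """Convert raw executor gauge values back to natural units.

    Both gauges are reported multiplied by 100, so dividing by 100
    yields the one-minute load average and the percentage of
    available RAM.
    """
    return load_average / 100.0, pct_available_ram / 100.0


# Hypothetical raw gauge readings for one executor.
load, ram_pct = decode_executor_gauges(175, 6250)
print(load, ram_pct)  # prints "1.75 62.5"
```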
.. stat:: zuul.nodepool

   Holds metrics related to Zuul's node requests to Nodepool.

   .. stat:: requested
      :type: counter

      Incremented each time a node request is submitted to Nodepool.

      .. stat:: label.<label>
         :type: counter

         Incremented each time a request for a specific label is
         submitted to Nodepool.

      .. stat:: size.<size>
         :type: counter

         Incremented each time a request of a specific size is submitted
         to Nodepool. For example, a request for 3 nodes would use the
         key ``zuul.nodepool.requested.size.3``.

   .. stat:: canceled
      :type: counter, timer

      The counter is incremented each time a node request is canceled
      by Zuul. The timer records the elapsed time from request to
      cancelation.

      .. stat:: label.<label>
         :type: counter, timer

         The same, for a specific label.

      .. stat:: size.<size>
         :type: counter, timer

         The same, for a specific request size.

   .. stat:: fulfilled
      :type: counter, timer

      The counter is incremented each time a node request is fulfilled
      by Nodepool. The timer records the elapsed time from request to
      fulfillment.

      .. stat:: label.<label>
         :type: counter, timer

         The same, for a specific label.

      .. stat:: size.<size>
         :type: counter, timer

         The same, for a specific request size.

   .. stat:: failed
      :type: counter, timer

      The counter is incremented each time Nodepool fails to fulfill a
      node request. The timer records the elapsed time from request
      to failure.

      .. stat:: label.<label>
         :type: counter, timer

         The same, for a specific label.

      .. stat:: size.<size>
         :type: counter, timer

         The same, for a specific request size.

   .. stat:: current_requests
      :type: gauge

      The number of outstanding nodepool requests from Zuul.

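To make the key layout under ``zuul.nodepool`` concrete, the following sketch
lists the keys that a single node request event would touch. It is an
illustration rather than Zuul's implementation: the function
``nodepool_request_keys`` and the label names are hypothetical, and counting
each distinct label once per request is an assumption.

```python
def nodepool_request_keys(event, labels):
    """Return the statsd keys affected by one node request event.

    ``event`` is one of "requested", "canceled", "fulfilled" or "failed".
    ``labels`` holds one label name per requested node, so its length
    is the request size.
    """
    base = "zuul.nodepool." + event
    keys = [base]
    # Assumption: each distinct label is counted once per request.
    for label in dict.fromkeys(labels):
        keys.append(base + ".label." + label)
    keys.append(base + ".size." + str(len(labels)))
    return keys


# A request for 3 nodes using two hypothetical labels.
keys = nodepool_request_keys("requested", ["ubuntu", "ubuntu", "centos"])
```

For the ``canceled``, ``fulfilled`` and ``failed`` events, each of these keys
carries a timer as well as a counter.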
.. stat:: zuul.mergers

   Holds metrics related to Zuul mergers.

   .. stat:: online
      :type: gauge

      The number of Zuul merger processes online.

   .. stat:: jobs_running
      :type: gauge

      The number of merge jobs running.

   .. stat:: jobs_queued
      :type: gauge

      The number of merge jobs queued.

.. stat:: zuul.executors

   Holds metrics related to Zuul executors.

   .. stat:: online
      :type: gauge

      The number of Zuul executor processes online.

   .. stat:: accepting
      :type: gauge

      The number of Zuul executor processes accepting new jobs.

   .. stat:: jobs_running
      :type: gauge

      The number of executor jobs running.

   .. stat:: jobs_queued
      :type: gauge

      The number of executor jobs queued.


As an example, given a job named ``myjob`` in ``mytenant`` triggered by a
change to ``myproject`` (whose canonical hostname is ``example.com``) on
the ``master`` branch in the ``gate`` pipeline which took 40 seconds to
build, the Zuul scheduler will emit the following statsd events:

  * ``zuul.tenant.mytenant.pipeline.gate.project.example_com.myproject.master.job.myjob.SUCCESS`` +1
  * ``zuul.tenant.mytenant.pipeline.gate.project.example_com.myproject.master.job.myjob.SUCCESS`` 40 seconds
  * ``zuul.tenant.mytenant.pipeline.gate.all_jobs`` +1
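
The per-job key in this example can be derived mechanically from the
hierarchy and the character translation rules described above. The sketch
below shows one way to do so and to render the counter and timer in the
standard statsd line format (``<key>:<value>|c`` for counters,
``<key>:<value>|ms`` for timers). The helper names are hypothetical, and
reporting the 40 second duration as 40000 milliseconds is an assumption based
on the statsd timer type rather than something stated here.

```python
def sanitize(component, chars="./"):
    """Replace each character in ``chars`` with an underscore."""
    for ch in chars:
        component = component.replace(ch, "_")
    return component


def job_result_key(tenant, pipeline, hostname, project, branch, job, result):
    """Build the per-job result key from its hierarchy components."""
    return ".".join([
        "zuul.tenant", tenant,
        "pipeline", pipeline,
        "project",
        sanitize(hostname, "."),  # only "." is translated in hostnames
        sanitize(project),        # "/" and "." are translated
        sanitize(branch),         # "/" and "." are translated
        "job", job, result,       # job name used as-is (assumption)
    ])


def statsd_lines(key, duration_seconds):
    """Render a counter increment and a timer as statsd protocol lines."""
    millis = int(duration_seconds * 1000)
    return [f"{key}:1|c", f"{key}:{millis}|ms"]


key = job_result_key("mytenant", "gate", "example.com", "myproject",
                     "master", "myjob", "SUCCESS")
for line in statsd_lines(key, 40):
    print(line)
```

A project such as ``org/my.project`` on branch ``feature/x`` would yield the
components ``org_my_project`` and ``feature_x`` under the same rules.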