Reorganize docs into user/admin guide
Refresh the user and admin guide for v3 changes, and reorganize into
a narrative structure which makes more sense for v3.
Change-Id: I4ac3b18d5ed33b0fea4e2ef0318b19bfc3447ccc
diff --git a/doc/source/admin/monitoring.rst b/doc/source/admin/monitoring.rst
new file mode 100644
index 0000000..2a6c959
--- /dev/null
+++ b/doc/source/admin/monitoring.rst
@@ -0,0 +1,93 @@
+:title: Monitoring
+
+Monitoring
+==========
+
+Statsd reporting
+----------------
+
+Zuul comes with support for the statsd protocol. When enabled and configured
+(see below), the Zuul scheduler will emit raw metrics to a statsd receiver,
+which in turn lets you generate graphs.
+
+Configuration
+~~~~~~~~~~~~~
+
+Statsd support uses the statsd Python module. Note that Zuul will start without
+the statsd Python module, so an existing Zuul installation may be missing it.
+
+Configuration is done via the environment variables ``STATSD_HOST`` and
+``STATSD_PORT``. They are interpreted directly by the statsd module; there is
+no equivalent parameter in ``zuul.conf`` yet. Your init script must set both
+of them before executing Zuul.
+
+Your init script most probably loads a configuration file named
+``/etc/default/zuul`` which would contain the environment variables::
+
+ $ cat /etc/default/zuul
+ STATSD_HOST=10.0.0.1
+ STATSD_PORT=8125
+
+Metrics
+~~~~~~~
+
+The metrics are emitted by the Zuul scheduler (`zuul/scheduler.py`):
+
+**gerrit.event.<type> (counters)**
+ Gerrit emits different kinds of messages over its `stream-events`
+ interface. Zuul will report counters for each type of event it
+ receives from Gerrit.
+
+ Some of the events emitted are:
+
+ * patchset-created
+ * draft-published
+ * change-abandoned
+ * change-restored
+ * change-merged
+ * merge-failed
+ * comment-added
+ * ref-updated
+ * reviewer-added
+
+ Refer to your Gerrit installation documentation for an exhaustive list of
+ Gerrit event types.
+
+**zuul.pipeline.**
+ Holds metrics specific to jobs. The hierarchy is:
+
+ #. **<pipeline name>** as defined in your `layout.yaml` file (e.g. `gate`,
+ `test`, `publish`). It contains:
+
+ #. **all_jobs** counter of jobs triggered by the pipeline.
+ #. **current_changes** A gauge for the number of Gerrit changes being
+ processed by this pipeline.
+ #. **job** subtree detailing per-job statistics:
+
+ #. **<jobname>** The triggered job name.
+ #. **<build result>** Result as defined in your triggering system. For
+ Jenkins that would be SUCCESS, FAILURE, UNSTABLE, LOST. The
+ metric holds both an increasing counter and a timing
+ reporting the duration of the build. Whenever the result is a
+ SUCCESS or FAILURE, Zuul will additionally report the duration
+ of the build as a timing event.
+
+ #. **resident_time** timing representing how long the Change has been
+ known by Zuul (which includes build time and Zuul overhead).
+ #. **total_changes** counter of the number of changes processed since
+ Zuul started.
+ #. **wait_time** counter and timer of the wait time, measured as the
+ difference between the job start time and the execute time, in milliseconds.
+
+ Additionally, the `zuul.pipeline.<pipeline name>` hierarchy contains
+ `current_changes` (gauge), `resident_time` (timing) and `total_changes`
+ (counter) metrics for each project. The slash separators used in Gerrit
+ project names are replaced by dots.
+
+ As an example, given a job named `myjob` that was triggered by the `gate`
+ pipeline and took 40 seconds to build successfully, the Zuul scheduler
+ will emit the following statsd events:
+
+ * `zuul.pipeline.gate.job.myjob.SUCCESS` +1
+ * `zuul.pipeline.gate.job.myjob` 40 seconds
+ * `zuul.pipeline.gate.all_jobs` +1
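Reviewer sketch (not part of the patch): the key hierarchy and the example events documented above can be illustrated with a few lines of standalone Python. This is not Zuul's scheduler code. The helper names (`pipeline_key`, `project_key`, `statsd_line`, `build_events`, `receiver_address`) and the project name `example/project` are hypothetical. The sketch assumes timings are sent in milliseconds, as is conventional for statsd timers, and assumes a fallback port of 8125 when `STATSD_PORT` is unset.

```python
import os


def pipeline_key(pipeline, *parts):
    """Join a pipeline name and sub-keys into a dotted statsd key."""
    return ".".join(("zuul", "pipeline", pipeline) + parts)


def project_key(pipeline, project, metric):
    """Per-project key: slashes in the Gerrit project name become dots."""
    return pipeline_key(pipeline, project.replace("/", "."), metric)


def statsd_line(key, value, kind):
    """Format one metric in the plain-text statsd line protocol."""
    suffix = {"counter": "c", "timing": "ms", "gauge": "g"}[kind]
    return "%s:%d|%s" % (key, value, suffix)


def build_events(pipeline, job, result, seconds):
    """Return lines mirroring the three example events listed above."""
    return [
        statsd_line(pipeline_key(pipeline, "job", job, result), 1, "counter"),
        # Assumption: the duration is converted to milliseconds.
        statsd_line(pipeline_key(pipeline, "job", job), seconds * 1000, "timing"),
        statsd_line(pipeline_key(pipeline, "all_jobs"), 1, "counter"),
    ]


def receiver_address(environ=os.environ):
    """Read the receiver address from STATSD_HOST and STATSD_PORT.

    Assumption: fall back to port 8125 when STATSD_PORT is unset.
    The host is None when STATSD_HOST is unset.
    """
    return environ.get("STATSD_HOST"), int(environ.get("STATSD_PORT", 8125))


if __name__ == "__main__":
    for line in build_events("gate", "myjob", "SUCCESS", 40):
        print(line)
    print(project_key("gate", "example/project", "total_changes"))
```

Run as a script, this prints the three lines a receiver would see for the `myjob` example, followed by one per-project key with the slash replaced by a dot. Nothing is sent over the network, so the sketch is safe to run without a statsd receiver.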