.. Blame: James E. Blair, cdd0007, 2012-06-08 19:17:28 -0700

:title: Project Gating

Project Gating
==============

Traditionally, many software development projects merge changes from
developers into the repository, then identify regressions resulting
from those changes (perhaps by running a test suite with a continuous
integration system such as Jenkins), and finally fix those bugs with
further patches. When the mainline of development is broken, it can
be very frustrating for developers and can cause lost productivity,
particularly when the number of contributors or contributions is
large.

The process of gating attempts to prevent changes that introduce
regressions from being merged. This keeps the mainline of development
open and working for all developers: a change is merged only once it
is confirmed to work without disruption.

Many projects practice an informal method of gating where developers
with mainline commit access ensure that a test suite runs before
merging a change. With more developers, more changes, and more
comprehensive test suites, that process does not scale very well, and
is not the best use of a developer's time. Zuul can help automate
this process, with a particular emphasis on ensuring large numbers of
changes are tested correctly.

Zuul was designed to handle the workflow of the OpenStack project, but
can be used with any project.

A particular focus of Zuul is ensuring correctly ordered testing of
changes in parallel. A gating system should always test each change
applied to the tip of the branch exactly as it is going to be merged.
A simple way to do that would be to test one change at a time, and
merge it only if it passes tests. That works very well, but if
changes take a long time to test, developers may have to wait a long
time for their changes to make it into the repository. With some
projects, it may take hours to test changes, and it is easy for
developers to create changes at a rate faster than they can be tested
and merged.
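
The one-at-a-time approach can be sketched in a few lines of Python.
This is only an illustration, not Zuul code: ``gate_serially`` and the
``passes`` callback are hypothetical names, and ``passes`` stands in
for a complete test run against a proposed branch state.

```python
def gate_serially(changes, passes):
    """Gate changes one at a time, in approval order.

    changes: change identifiers in the order they were approved.
    passes:  hypothetical callback; receives the list of changes that
             would be on the branch (the last item is the change under
             test) and returns True if the test suite succeeds.
    Returns (merged, rejected).
    """
    merged = []    # changes accepted so far: the tip of the branch
    rejected = []  # changes whose tests failed
    for change in changes:
        candidate = merged + [change]  # exactly what would be merged
        if passes(candidate):
            merged = candidate
        else:
            rejected.append(change)
    return merged, rejected
```

Each call to ``passes`` must finish before the next change can start,
so the total waiting time grows with the number of queued changes.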

Zuul's DependentQueueManager allows for parallel execution of test
jobs for gating while ensuring changes are tested correctly, exactly
as if they had been tested one at a time. It does this by performing
speculative execution of test jobs; it assumes that all jobs will
succeed and tests the changes in parallel accordingly. If they all
succeed, they can all be merged. However, if one fails, then changes
that were expecting it to succeed are re-tested without the failed
change. In the best case, as many changes as there are available
execution contexts may be tested in parallel and merged at once. In
the worst case, changes are tested one at a time (as each subsequent
change fails, changes behind it start again). In practice, the
OpenStack project observes something closer to the best case.

For example, if a core developer approves five changes in rapid
succession::

  A, B, C, D, E

Zuul queues those changes in the order they were approved, and notes
that each subsequent change depends on the one ahead of it merging::

  A <-- B <-- C <-- D <-- E

Zuul then immediately starts testing all of the changes in parallel.
For a change that depends on others, it instructs the test system to
include the changes ahead of it, on the assumption that they pass.
That means jobs testing change *B* include change *A* as well::

  Jobs for A: merge change A, then test
  Jobs for B: merge changes A and B, then test
  Jobs for C: merge changes A, B and C, then test
  Jobs for D: merge changes A, B, C and D, then test
  Jobs for E: merge changes A, B, C, D and E, then test
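
The job list above follows a simple rule: the jobs for the change at a
given position in the queue merge every change up to and including
that position. A minimal sketch, assuming changes are represented by
unique strings (``speculative_states`` is a hypothetical helper, not
part of Zuul):

```python
def speculative_states(queue):
    """Map each queued change to the ordered list of changes its jobs merge."""
    return {change: queue[: i + 1] for i, change in enumerate(queue)}


for change, state in speculative_states(["A", "B", "C", "D", "E"]).items():
    print(f"Jobs for {change}: merge {', '.join(state)}, then test")
```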

If changes *A* and *B* pass tests, and *C*, *D*, and *E* fail::

  A[pass] <-- B[pass] <-- C[fail] <-- D[fail] <-- E[fail]

Zuul will merge change *A* followed by change *B*, leaving this queue::

  C[fail] <-- D[fail] <-- E[fail]

Since *D* was dependent on *C*, it is not clear whether *D*'s failure
is the result of a defect in *D* or *C*::

  C[fail] <-- D[unknown] <-- E[unknown]

Since *C* failed, Zuul will report the failure and drop *C* from the
queue::

  D[unknown] <-- E[unknown]

This queue is the same as if two new changes had just arrived, so Zuul
starts the process again, testing *D* against the tip of the branch
and *E* against *D*.
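
The whole walk-through can be reproduced with a small simulation. This
is a simplified model rather than Zuul's implementation: it assumes
every round of parallel tests finishes before any action is taken, and
``gate_speculatively`` and ``passes`` are hypothetical names, with
``passes`` standing in for a complete test run.

```python
def gate_speculatively(queue, passes):
    """Simulate a dependent queue with speculative, parallel testing.

    queue:  change identifiers in approval order.
    passes: hypothetical callback; receives the list of changes on the
            branch (the last item is the change under test) and returns
            True if the test suite succeeds.
    Returns (merged, rejected, rounds).
    """
    merged, rejected = [], []
    pending = list(queue)
    rounds = 0
    while pending:
        rounds += 1
        # Test every pending change "in parallel", each on top of the
        # merged tip plus all changes ahead of it in the queue.
        results = [passes(merged + pending[: i + 1])
                   for i in range(len(pending))]
        # Merge the unbroken run of passing changes at the head.
        head = 0
        while head < len(pending) and results[head]:
            head += 1
        merged += pending[:head]
        pending = pending[head:]
        if pending:
            # The new head failed on top of a known-good state, so the
            # defect is its own: report it and drop it. Results for the
            # changes behind it are unknown; the next round re-tests them.
            rejected.append(pending.pop(0))
    return merged, rejected, rounds
```

Calling ``gate_speculatively(list("ABCDE"), lambda state: "C" not in
state)`` models the example above, where only *C* is defective: *A*
and *B* merge after the first round, *C* is rejected, and *D* and *E*
merge after a second round. If every change passes, one round
suffices; if every change fails, the model needs one round per change.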