:title: Project Gating

Project Gating
==============

Traditionally, many software development projects merge changes from
developers into the repository, and then identify regressions
resulting from those changes (perhaps by running a test suite with a
continuous integration system such as Jenkins), followed by more
patches to fix those bugs. When the mainline of development is
broken, it can be very frustrating for developers and can cause lost
productivity, particularly so when the number of contributors or
contributions is large.

The process of gating attempts to prevent changes that introduce
regressions from being merged. This keeps the mainline of development
open and working for all developers, and only when a change is
confirmed to work without disruption is it merged.

Many projects practice an informal method of gating where developers
with mainline commit access ensure that a test suite runs before
merging a change. With more developers, more changes, and more
comprehensive test suites, that process does not scale very well, and
is not the best use of a developer's time. Zuul can help automate
this process, with a particular emphasis on ensuring large numbers of
changes are tested correctly.

Zuul was designed to handle the workflow of the OpenStack project, but
can be used with any project.

Testing in parallel
-------------------

A particular focus of Zuul is ensuring correctly ordered testing of
changes in parallel. A gating system should always test each change
applied to the tip of the branch exactly as it is going to be merged.
A simple way to do that would be to test one change at a time, and
merge it only if it passes tests. That works very well, but if
changes take a long time to test, developers may have to wait a long
time for their changes to make it into the repository. With some
projects, it may take hours to test changes, and it is easy for
developers to create changes at a rate faster than they can be tested
and merged.

Zuul's DependentPipelineManager allows for parallel execution of test
jobs for gating while ensuring changes are tested correctly, exactly
as if they had been tested one at a time. It does this by performing
speculative execution of test jobs; it assumes that all jobs will
succeed and tests them in parallel accordingly. If they do succeed,
they can all be merged. However, if one fails, then changes that were
expecting it to succeed are re-tested without the failed change. In
the best case, as many changes as execution contexts are available may
be tested in parallel and merged at once. In the worst case, changes
are tested one at a time (as each subsequent change fails, changes
behind it start again). In practice, the OpenStack project observes
something closer to the best case.

For example, if a core developer approves five changes in rapid
succession::

  A, B, C, D, E

Zuul queues those changes in the order they were approved, and notes
that each subsequent change depends on the one ahead of it merging:

.. blockdiag::

  blockdiag foo {
    node_width = 40;
    span_width = 40;
    A <- B <- C <- D <- E;
  }

Zuul then starts immediately testing all of the changes in parallel.
But in the case of changes that depend on others, it instructs the
test system to include the changes ahead of it, with the assumption
they pass. That means jobs testing change *B* include change *A* as
well::

  Jobs for A: merge change A, then test
  Jobs for B: merge changes A and B, then test
  Jobs for C: merge changes A, B and C, then test
  Jobs for D: merge changes A, B, C and D, then test
  Jobs for E: merge changes A, B, C, D and E, then test

Hence jobs triggered to test *A* will only test *A* and ignore *B*, *C*,
*D*, and *E*:

.. blockdiag::

  blockdiag foo {
    node_width = 40;
    span_width = 40;
    master -> A -> B -> C -> D -> E;
    group jobs_for_A {
      label = "Merged changes for A";
      master -> A;
    }
    group ignored_to_test_A {
      label = "Ignored changes";
      color = "lightgray";
      B -> C -> D -> E;
    }
  }

The jobs for *E* would include the whole dependency chain: *A*, *B*, *C*,
*D*, and *E*. *E* will be tested assuming *A*, *B*, *C*, and *D* passed:

.. blockdiag::

  blockdiag foo {
    node_width = 40;
    span_width = 40;
    group jobs_for_E {
      label = "Merged changes for E";
      master -> A -> B -> C -> D -> E;
    }
  }

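The cumulative merge lists described above can be sketched in a few lines
of Python. This is only an illustration, not Zuul's code: the function name
``speculative_merge_lists`` is hypothetical, and the queue is modeled as a
plain list of change names.

.. code-block:: python

    def speculative_merge_lists(queue):
        """Map each queued change to the changes its jobs merge before testing.

        Hypothetical helper: a change is tested on top of every change
        ahead of it in the queue, assuming those changes will pass.
        """
        return {change: queue[: i + 1] for i, change in enumerate(queue)}


    plan = speculative_merge_lists(["A", "B", "C", "D", "E"])
    for change, merged in plan.items():
        print(f"Jobs for {change}: merge {', '.join(merged)}, then test")

Each job sees the state the branch would have if every change ahead of it
merged, which is what allows the parallel results to stand in for testing
the changes one at a time.
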
If changes *A* and *B* pass tests (green), and *C*, *D*, and *E* fail (red):

.. blockdiag::

  blockdiag foo {
    node_width = 40;
    span_width = 40;

    A [color = lightgreen];
    B [color = lightgreen];
    C [color = pink];
    D [color = pink];
    E [color = pink];

    master <- A <- B <- C <- D <- E;
  }

Zuul will merge change *A* followed by change *B*, leaving this queue:

.. blockdiag::

  blockdiag foo {
    node_width = 40;
    span_width = 40;

    C [color = pink];
    D [color = pink];
    E [color = pink];

    C <- D <- E;
  }

Since *D* was dependent on *C*, it is not clear whether *D*'s failure is the
result of a defect in *D* or *C*. The same uncertainty applies to *E*:

.. blockdiag::

  blockdiag foo {
    node_width = 40;
    span_width = 40;

    C [color = pink];
    D [label = "D\n?"];
    E [label = "E\n?"];

    C <- D <- E;
  }

Since *C* failed, Zuul will report its failure and drop *C* from the queue,
keeping *D* and *E*:

.. blockdiag::

  blockdiag foo {
    node_width = 40;
    span_width = 40;

    D [label = "D\n?"];
    E [label = "E\n?"];

    D <- E;
  }

This queue is the same as if two new changes had just arrived, so Zuul
starts the process again testing *D* against the tip of the branch, and
*E* against *D*:

.. blockdiag::

  blockdiag foo {
    node_width = 40;
    span_width = 40;
    master -> D -> E;
    group jobs_for_D {
      label = "Merged changes for D";
      master -> D;
    }
    group ignored_to_test_D {
      label = "Skip";
      color = "lightgray";
      E;
    }
  }

.. blockdiag::

  blockdiag foo {
    node_width = 40;
    span_width = 40;
    group jobs_for_E {
      label = "Merged changes for E";
      master -> D -> E;
    }
  }

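The failure handling walked through above can be modeled with a short
simulation. It is a simplified sketch under stated assumptions, not Zuul's
implementation: ``run_gate`` and the ``passes`` callback are hypothetical
names, and a change is assumed to fail whenever a defective change is part
of what its jobs merge.

.. code-block:: python

    def run_gate(queue, passes):
        """Simulate a dependent gate queue (illustrative model only).

        queue:  change names in approval order.
        passes: hypothetical callback passes(change, ahead) -> bool, where
                ahead lists the queued changes merged before this one.
        """
        merged, rejected = [], []
        pending = list(queue)
        rounds = 0
        while pending:
            rounds += 1
            # Speculatively test every change on top of the changes ahead of it.
            results = [passes(c, pending[:i]) for i, c in enumerate(pending)]
            # Merge the leading run of successes, in order.
            count = 0
            while count < len(pending) and results[count]:
                count += 1
            merged.extend(pending[:count])
            pending = pending[count:]
            # The first remaining change failed against known-good changes:
            # report it and drop it; the changes behind it are re-tested.
            if pending:
                rejected.append(pending.pop(0))
        return merged, rejected, rounds


    # Assumption for this example: only change C is defective.
    def passes(change, ahead):
        return change != "C" and "C" not in ahead


    merged, rejected, rounds = run_gate(["A", "B", "C", "D", "E"], passes)
    print(merged, rejected, rounds)

With this input the first round merges *A* and *B* and rejects *C*, and a
second round re-tests and merges *D* and *E*, which mirrors the sequence of
diagrams in this section.
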
Cross-project dependencies
--------------------------

When your projects are closely coupled together, you want to make sure
changes entering the gate are going to be tested with the version of
other projects currently enqueued in the gate (since they will
eventually be merged and might introduce breaking features).

Such dependencies can be defined in Zuul configuration by registering a job
in a DependentPipeline of several projects. Whenever a change enters such a
pipeline, it will create references for the other projects as well. As an
example, given a main project ``acme`` and a plugin ``plugin``, you can
define a job ``acme-tests`` which should be run for both projects:

.. code-block:: yaml

    pipelines:
      - name: gate
        manager: DependentPipelineManager

    projects:
      - name: acme
        gate:
          - acme-tests
      - name: plugin
        gate:
          - acme-tests  # Register job again

Whenever a change enters the ``gate`` pipeline queue, Zuul creates a reference
for it. For each subsequent change, an additional reference is created for the
changes ahead in the queue. As a result, you will always be able to fetch the
future state of your project dependencies for each change in the queue.

Based on the pipeline and project definitions above, three changes are
inserted in the ``gate`` pipeline with the associated references:

======== ======= ====== =========
Change   Project Branch Zuul Ref.
======== ======= ====== =========
Change 1 acme    master master/Z1
Change 2 plugin  stable stable/Z2
Change 3 plugin  master master/Z3
======== ======= ====== =========

Since the changes enter a DependentPipelineManager pipeline, Zuul creates
additional references:

====== ======= ========= =============================
Change Project Zuul Ref. Description
====== ======= ========= =============================
1      acme    master/Z1 acme master + change 1
------ ------- --------- -----------------------------
2      acme    master/Z2 acme master + change 1
2      plugin  stable/Z2 plugin stable + change 2
------ ------- --------- -----------------------------
3      acme    master/Z3 acme master + change 1
3      plugin  stable/Z3 plugin stable + change 2
3      plugin  master/Z3 plugin master + change 3
====== ======= ========= =============================

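The way the second table follows from the first can be sketched as below.
This is a hedged illustration rather than Zuul's actual logic: ``zuul_refs``
is a hypothetical function, and the ``Z<n>`` suffix simply reuses the
placeholder names from the tables.

.. code-block:: python

    def zuul_refs(queue):
        """List the references available to each change in a dependent queue.

        queue: (change_id, project, branch) tuples in gate order.
        Returns rows of (change_id, project, ref, changes_included).
        """
        rows = []
        ahead = {}  # (project, branch) -> ids of changes enqueued so far
        for change_id, project, branch in queue:
            ahead.setdefault((project, branch), []).append(change_id)
            # One reference per project/branch touched so far, named after
            # the change being tested.
            for (proj, br), changes in ahead.items():
                rows.append((change_id, proj, f"{br}/Z{change_id}", list(changes)))
        return rows


    rows = zuul_refs([
        (1, "acme", "master"),
        (2, "plugin", "stable"),
        (3, "plugin", "master"),
    ])
    for change_id, project, ref, included in rows:
        print(change_id, project, ref, included)

The six rows produced correspond to the six rows of the table above, in the
same order.
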
In order to test change 3, you would clone both repositories and simply
fetch the Z3 reference for each combination of project/branch you are
interested in testing. For example, you could fetch ``acme`` with
master/Z3 and ``plugin`` with master/Z3, and thus have ``acme`` with
change 1 applied, which is the expected state when change 3 merges.
When your job fetches several repositories and some of them have no
changes ahead in the queue, those repositories may not have a Z reference,
in which case you can just check out the branch.
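
The fallback described above (use the Z reference when it exists, otherwise
the plain branch) might look like the following sketch. The names are
hypothetical: ``ref_to_fetch`` is not a Zuul function, ``available_refs``
stands in for whatever your job learns about the references Zuul created,
and ``docs`` is an invented third project with no changes in the queue.

.. code-block:: python

    def ref_to_fetch(project, branch, change_id, available_refs):
        """Pick the Zuul reference if one exists, else fall back to the branch."""
        zuul_ref = f"{branch}/Z{change_id}"
        if zuul_ref in available_refs.get(project, set()):
            return zuul_ref
        return branch


    # References available while testing change 3, taken from the table above.
    available = {
        "acme": {"master/Z3"},
        "plugin": {"stable/Z3", "master/Z3"},
    }
    acme_ref = ref_to_fetch("acme", "master", 3, available)
    plugin_ref = ref_to_fetch("plugin", "master", 3, available)
    docs_ref = ref_to_fetch("docs", "master", 3, available)  # no Z reference
    print(acme_ref, plugin_ref, docs_ref)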