Allow and document use of the uri module from localhost

The rtfd hook job just does an empty POST to a URI. There's no need to
allocate a node for that; we can just make REST calls from the executor.
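
As a rough illustration of the scheme restriction applied to such calls
(a standalone sketch, not the plugin code itself: check_url_scheme is a
hypothetical helper, and it raises ValueError where the action plugin
raises AnsibleError):

```python
from urllib.parse import urlparse

# Mirrors ALLOWED_URL_SCHEMES from zuul/ansible/action/normal.py.
ALLOWED_URL_SCHEMES = ('https', 'http', 'ftp')


def check_url_scheme(url):
    """Return the scheme of url, or raise ValueError if it is not allowed.

    Hypothetical helper for illustration only.
    """
    scheme = urlparse(url).scheme
    if scheme not in ALLOWED_URL_SCHEMES:
        raise ValueError(
            "{scheme} urls are not allowed from localhost".format(
                scheme=scheme))
    return scheme
```

With this sketch, an https URL passes, while file:///etc/passwd is
rejected because its scheme is "file".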

Also, there is enough going on here that it needs to be documented. Add
a section to the developer docs describing what we're doing with our
Ansible plugins. In support of that, add a simple Sphinx domain for
Ansible so that we can easily link to the upstream Ansible documentation
for modules.
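
A minimal sketch of what the module role expands to, assuming the
MODULE_URL pattern used in zuul/sphinx/ansible.py; module_link is a
hypothetical helper for illustration, not part of the domain:

```python
# Same URL pattern as MODULE_URL in zuul/sphinx/ansible.py.
MODULE_URL = 'http://docs.ansible.com/ansible/latest/{module_name}_module.html'


def module_link(module_name):
    """Return (link text, target URL) for an Ansible module name.

    Hypothetical helper; the real role wraps these in a docutils
    reference node.
    """
    text = "Ansible {module_name} module".format(module_name=module_name)
    return text, MODULE_URL.format(module_name=module_name)
```

So a reference written as :ansible:module:`uri` would be rendered with
the text and target that module_link('uri') returns.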

Change-Id: I9b0be1018388db7361aec10f30a70437de555615
diff --git a/doc/source/conf.py b/doc/source/conf.py
index 7c0d587..38c1689 100644
--- a/doc/source/conf.py
+++ b/doc/source/conf.py
@@ -29,6 +29,7 @@
     'sphinx.ext.autodoc',
     'sphinxcontrib.blockdiag',
     'sphinxcontrib.programoutput',
+    'zuul.sphinx.ansible',
     'zuul.sphinx.zuul',
 ]
 #extensions = ['sphinx.ext.intersphinx']
diff --git a/doc/source/developer/ansible.rst b/doc/source/developer/ansible.rst
new file mode 100644
index 0000000..e3ebca7
--- /dev/null
+++ b/doc/source/developer/ansible.rst
@@ -0,0 +1,66 @@
+Ansible Integration
+===================
+
+Zuul contains Ansible modules and plugins to control the execution of Ansible
+job content. These fall into two basic categories:
+
+* Restricted Execution on Executors
+* Build Log Support
+
+Restricted Execution
+--------------------
+
+Zuul runs ``ansible-playbook`` on executors to run job content on nodes. While
+the intent is that content is run on the remote nodes, Ansible is a flexible
+system that allows delegating actions to ``localhost``, and also reading and
+writing files. These capabilities can be desirable and necessary for tasks
+such as fetching log files or build artifacts, but they could also be used as
+a vector to attack the executor.
+
+For that reason, Zuul implements a set of Ansible action plugins and lookup
+plugins that override and intercept task execution during untrusted playbook
+execution. They ensure that local actions are either not executed at all or,
+for operations that are desirable to allow locally, that they only interact
+with files in the Zuul work directory.
+
+.. autoclass:: zuul.ansible.action.normal.ActionModule
+   :members:
+
+Build Log Support
+-----------------
+
+Zuul provides real-time build log streaming to end users so that users can
+watch long-running jobs in progress. Because a job may execute a shell script
+that runs for a long time, Zuul takes extra care to stream the stdout and
+stderr of shell tasks as they are produced rather than waiting for the
+command to finish.
+
+Zuul contains a modified version of the :ansible:module:`command`
+that starts a log streaming daemon on the build node.
+
+.. automodule:: zuul.ansible.library.command
+
+All jobs run with the :py:mod:`zuul.ansible.callback.zuul_stream` callback
+plugin enabled, which writes the build log to a file so that the
+:py:class:`zuul.lib.log_streamer.LogStreamer` can provide the data on demand
+over the finger protocol. Finally, :py:class:`zuul.web.LogStreamingHandler`
+exposes that log stream over a websocket connection as part of
+:py:class:`zuul.web.ZuulWeb`.
+
+.. autoclass:: zuul.ansible.callback.zuul_stream.CallbackModule
+   :members:
+
+.. autoclass:: zuul.lib.log_streamer.LogStreamer
+.. autoclass:: zuul.web.LogStreamingHandler
+.. autoclass:: zuul.web.ZuulWeb
+
+In addition to real-time streaming, Zuul also installs another callback module,
+:py:class:`zuul.ansible.callback.zuul_json.CallbackModule`, which collects all
+of the information about a given run into a JSON file that is written to the
+work dir so that it can be published along with build logs. Since the streaming
+log is by necessity a single text stream, choices have to be made for
+readability about what data is shown and what is not shown. The JSON log file
+is intended to allow for a richer, more interactive set of data to be displayed
+to the user.
+
+.. autoclass:: zuul.ansible.callback.zuul_json.CallbackModule
diff --git a/doc/source/developer/index.rst b/doc/source/developer/index.rst
index 7b16e9c..360dcd5 100644
--- a/doc/source/developer/index.rst
+++ b/doc/source/developer/index.rst
@@ -15,3 +15,4 @@
    triggers
    testing
    docs
+   ansible
diff --git a/tests/fixtures/config/ansible/git/org_plugin-project/playbooks/uri_bad_path.yaml b/tests/fixtures/config/ansible/git/org_plugin-project/playbooks/uri_bad_path.yaml
new file mode 100644
index 0000000..523aab7
--- /dev/null
+++ b/tests/fixtures/config/ansible/git/org_plugin-project/playbooks/uri_bad_path.yaml
@@ -0,0 +1,6 @@
+- hosts: localhost
+  tasks:
+    - uri:
+        method: GET
+        url: https://example.com
+        path: /tmp/example.out
diff --git a/tests/fixtures/config/ansible/git/org_plugin-project/playbooks/uri_bad_scheme.yaml b/tests/fixtures/config/ansible/git/org_plugin-project/playbooks/uri_bad_scheme.yaml
new file mode 100644
index 0000000..5d71793
--- /dev/null
+++ b/tests/fixtures/config/ansible/git/org_plugin-project/playbooks/uri_bad_scheme.yaml
@@ -0,0 +1,5 @@
+- hosts: localhost
+  tasks:
+    - uri:
+        method: GET
+        url: file:///etc/passwd
diff --git a/tests/unit/test_v3.py b/tests/unit/test_v3.py
index 7038471..7c36cc4 100755
--- a/tests/unit/test_v3.py
+++ b/tests/unit/test_v3.py
@@ -791,6 +791,8 @@
             ('credstash', 'FAILURE'),
             ('csvfile_good', 'SUCCESS'),
             ('csvfile_bad', 'FAILURE'),
+            ('uri_bad_path', 'FAILURE'),
+            ('uri_bad_scheme', 'FAILURE'),
         ]
         for job_name, result in plugin_tests:
             count += 1
diff --git a/zuul/ansible/action/normal.py b/zuul/ansible/action/normal.py
index 74e732e..b8a232b 100644
--- a/zuul/ansible/action/normal.py
+++ b/zuul/ansible/action/normal.py
@@ -1,4 +1,4 @@
-# Copyright 2016 Red Hat, Inc.
+# Copyright 2017 Red Hat, Inc.
 #
 # This module is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -13,13 +13,27 @@
 # You should have received a copy of the GNU General Public License
 # along with this software.  If not, see <http://www.gnu.org/licenses/>.
 
+from ansible.module_utils.six.moves.urllib.parse import urlparse
+from ansible.errors import AnsibleError
+
 from zuul.ansible import paths
 normal = paths._import_ansible_action_plugin('normal')
 
+ALLOWED_URL_SCHEMES = ('https', 'http', 'ftp')
+
 
 class ActionModule(normal.ActionModule):
+    '''Override the normal action plugin
+
+    :py:class:`ansible.plugins.action.normal.ActionModule` is run for every
+    module that does not have a more specific matching action plugin.
+
+    Our overridden version of it wraps the execution with checks to block
+    undesired actions on localhost.
+    '''
 
     def run(self, tmp=None, task_vars=None):
+        '''Overridden primary method from the base class.'''
 
         if (self._play_context.connection == 'local'
                 or self._play_context.remote_addr == 'localhost'
@@ -27,16 +41,61 @@
                 or self._task.delegate_to == 'localhost'
                 or (self._task.delegate_to
                     and self._task.delegate_to.startswith('127.'))):
-            if self._task.action == 'stat':
-                paths._fail_if_unsafe(self._task.args['path'])
-            elif self._task.action == 'file':
-                dest = self._task.args.get(
-                    'path', self._task.args.get(
-                        'dest', self._task.args.get(
-                            'name')))
-                paths._fail_if_unsafe(dest)
-            else:
-                return dict(
-                    failed=True,
-                    msg="Executing local code is prohibited")
+            if not self.dispatch_handler():
+                raise AnsibleError("Executing local code is prohibited")
         return super(ActionModule, self).run(tmp, task_vars)
+
+    def dispatch_handler(self):
+        '''Run per-action handler if one exists.'''
+        handler_name = 'handle_{action}'.format(action=self._task.action)
+        handler = getattr(self, handler_name, None)
+        if handler:
+            handler()
+            return True
+        return False
+
+    def handle_stat(self):
+        '''Allow stat module on localhost if it doesn't touch unsafe files.
+
+        The :ansible:module:`stat` can be useful in jobs for manipulating logs
+        and artifacts.
+
+        Block any access of files outside the zuul work dir.
+        '''
+        paths._fail_if_unsafe(self._task.args['path'])
+
+    def handle_file(self):
+        '''Allow file module on localhost if it doesn't touch unsafe files.
+
+        The :ansible:module:`file` can be useful in jobs for manipulating logs
+        and artifacts.
+
+        Block any access of files outside the zuul work dir.
+        '''
+        for arg in ('path', 'dest', 'name'):
+            dest = self._task.args.get(arg)
+            if dest:
+                paths._fail_if_unsafe(dest)
+
+    def handle_uri(self):
+        '''Allow uri module on localhost if it doesn't touch unsafe files.
+
+        The :ansible:module:`uri` can be used from the executor to do
+        things like pinging readthedocs.org that otherwise don't need a node.
+        However, it can also download content to a local file, or be used to
+        read from file:/// urls.
+
+        Block any use of url schemes other than https, http and ftp. Further,
+        block any local file interaction that falls outside of the zuul
+        work dir.
+        '''
+        # uri takes all the file arguments, so just let handle_file validate
+        # them for us.
+        self.handle_file()
+        scheme = urlparse(self._task.args['url']).scheme
+        if scheme not in ALLOWED_URL_SCHEMES:
+            raise AnsibleError(
+                "{scheme} urls are not allowed from localhost."
+                " Only {allowed_schemes} are allowed".format(
+                    scheme=scheme,
+                    allowed_schemes=', '.join(ALLOWED_URL_SCHEMES)))
diff --git a/zuul/sphinx/ansible.py b/zuul/sphinx/ansible.py
new file mode 100644
index 0000000..4a47bc3
--- /dev/null
+++ b/zuul/sphinx/ansible.py
@@ -0,0 +1,53 @@
+# Copyright 2017 Red Hat, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+from docutils import nodes
+from sphinx.domains import Domain
+
+MODULE_URL = 'http://docs.ansible.com/ansible/latest/{module_name}_module.html'
+
+
+def ansible_module_role(
+        name, rawtext, text, lineno, inliner, options={}, content=[]):
+    """Link to an upstream Ansible module.
+
+    Returns a two-part tuple containing a list of nodes to insert into the
+    document and a list of system messages.  Both are allowed to be
+    empty.
+
+    :param name: The role name used in the document.
+    :param rawtext: The entire markup snippet, with role.
+    :param text: The text marked with the role.
+    :param lineno: The line number where rawtext appears in the input.
+    :param inliner: The inliner instance that called us.
+    :param options: Directive options for customization.
+    :param content: The directive content for customization.
+    """
+    node = nodes.reference(
+        rawtext, "Ansible {module_name} module".format(module_name=text),
+        refuri=MODULE_URL.format(module_name=text), **options)
+    return ([node], [])
+
+
+class AnsibleDomain(Domain):
+    name = 'ansible'
+    label = 'Ansible'
+
+    roles = {
+        'module': ansible_module_role,
+    }
+
+
+def setup(app):
+    app.add_domain(AnsibleDomain)