Diff - d7f164293e19922d5846e9b12b301087310eae29^! - github/openstack-infra/zuul

commit	d7f164293e19922d5846e9b12b301087310eae29
author	James E. Blair <jeblair@redhat.com>	Tue Sep 05 16:51:56 2017 -0700
committer	James E. Blair <jeblair@redhat.com>	Tue Sep 05 16:51:56 2017 -0700
tree	1c2b68185e08ba3639df2f802ce2df249d310b27
parent	6fcf115d02461540b1abde92d26a010a6ba978a3

Be explicit about byte and encoding in command module

This module reads output from a command (via a pipe) one line at a
time. The only input we should receive from it is either:

* a byte string from the command output terminating with a \n
* a python "string" terminating with a \n
  * that means in python2, a bytestring with a \n
  * and in python3, a unicode string with a \n

For now, we only need to focus on python2 because we explicitly run
this code under python2, however, it's wise to be forward-compatible
with python3.

The error in the previous version of this code is to assume that the
value we read from the command was a unicode string which needed to be
encoded in order to be written to the log file. That's incorrect; what
we receive from the command should already be encoded according to the
system locale.

This change no longer encodes the lines received from the command
(because they are always bytestrings, in python2 or python3, they will
follow the code path where no further encoding happens).

Of course, if it turns out not to be encoded in utf-8, then zuul_stream
is likely going to bomb because it assumes everything it reads is
utf-8, but that's a different problem. In practice, we have utf-8 or C
locales universally at the moment.

Finally, there is a bunch of explicit encoding and bytestring handling
added to this method. That is mostly in service of the future codepath
under python3; elsewhere in this file we call
".addLine('[Zuul] ...')". Under python2, that's a bytestring so no
further work is necessary. In python3, that's a unicode string, so we
need to encode it. We should never hit the exception handler, however,
if somehow we manage to, it should at least be able to write some data
to the log file which approximates what it was given.

Change-Id: Iae2f3ee012d914454c335184a8ec7c7ecb924ec7

@@ -159,9 +159,14 @@
         # Jenkins format but with microsecond resolution instead of
         # millisecond. It is kept so log parsing/formatting remains
         # consistent.
-        ts = datetime.datetime.now()
-        outln = '%s | %s' % (ts, ln)
-        self.logfile.write(outln.encode('utf-8'))
+        ts = str(datetime.datetime.now()).encode('utf-8')
+        if not isinstance(ln, bytes):
+            try:
+                ln = ln.encode('utf-8')
+            except Exception:
+                ln = repr(ln).encode('utf-8') + b'\n'
+        outln = b'%s | %s' % (ts, ln)
+        self.logfile.write(outln)
 
 
 def follow(fd, log_uuid):
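
The behaviour described in the commit message can be tried outside Zuul with a small standalone sketch. It is not the module's actual code: `format_log_line`, its `now` parameter, the `io.BytesIO` stand-in for the log file, and the sample messages are all hypothetical names and values introduced for illustration. It assumes Python 3.5 or later, where `%` formatting on bytes is available (it also works on Python 2.7, where `bytes` is `str`).

```python
import datetime
import io


def format_log_line(ln, now=None):
    """Hypothetical helper mirroring the new addLine logic.

    Returns the timestamped record as bytes so that it can be written
    directly to a file opened in binary mode.
    """
    if now is None:
        now = datetime.datetime.now()
    # The timestamp is plain ASCII, so encoding it as utf-8 is safe.
    ts = str(now).encode('utf-8')
    if not isinstance(ln, bytes):
        # Text strings (e.g. internal '[Zuul] ...' messages under
        # Python 3) must be encoded before being written.
        try:
            ln = ln.encode('utf-8')
        except Exception:
            # Last resort: write an approximation of whatever we got.
            ln = repr(ln).encode('utf-8') + b'\n'
    # Bytes that came from the command's pipe pass through unchanged.
    return b'%s | %s' % (ts, ln)


# A binary in-memory buffer stands in for the real log file.
logfile = io.BytesIO()
fixed = datetime.datetime(2017, 9, 5, 16, 51, 56)

# Already-encoded command output: written as-is.
logfile.write(format_log_line(b'caf\xc3\xa9\n', now=fixed))
# A text string: encoded to utf-8 first.
logfile.write(format_log_line('[Zuul] example message\n', now=fixed))
# An object with no .encode(): falls back to repr().
logfile.write(format_log_line(12345, now=fixed))
```

Passing a fixed `now` only makes the output reproducible; the real method always uses the current time.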