From: Matthew Fernandez
Date: Mon, 6 Dec 2021 04:18:31 +0000 (-0800)
Subject: explicitly specify latin-1 encoding when dealing with PS files in tests
X-Git-Tag: 3.0.0~127^2~7
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=4ea0424037fc2eb6f15981ef38469694799d4b8d;p=graphviz

explicitly specify latin-1 encoding when dealing with PS files in tests

PostScript files are encoded in a character set called CCSID 1276.¹ This code
was working despite not specifying an encoding because the default encoding on
each platform coincidentally lines up with that used in their sample files.

More recent versions of Pylint warn that encoding should always be specified in
`open` calls, and attempting to use either ASCII or UTF-8 encoding fails. The
Linux sample files contain invalid ASCII bytes and the Windows sample files
contain invalid UTF-8 bytes. Using latin-1, which seems to be the closest
available encoding to CCSID 1276, works cross platform.

In future, this code should perhaps be adapted to do I/O in binary mode, thus
avoiding any encoding concerns.

¹ https://en.wikipedia.org/wiki/PostScript_Standard_Encoding
---

diff --git a/rtest/rtest.py b/rtest/rtest.py
index c95b89090..ab0db2fdf 100755
--- a/rtest/rtest.py
+++ b/rtest/rtest.py
@@ -124,8 +124,8 @@ def doDiff(OUTFILE, testname, subtest_index, fmt):
         return
 
     if F in ["ps", "ps2"]:
-        with open(FILE1, "rt") as src:
-            with open(TMPFILE1, "wt") as dst:
+        with open(FILE1, "rt", encoding="latin-1") as src:
+            with open(TMPFILE1, "wt", encoding="latin-1") as dst:
                 done_setup = False
                 for line in src:
                     if done_setup:
@@ -133,8 +133,8 @@ def doDiff(OUTFILE, testname, subtest_index, fmt):
                     else:
                         done_setup = re.match(r"%%End.*Setup", line) is not None
 
-        with open(FILE2, "rt") as src:
-            with open(TMPFILE2, "wt") as dst:
+        with open(FILE2, "rt", encoding="latin-1") as src:
+            with open(TMPFILE2, "wt", encoding="latin-1") as dst:
                 done_setup = False
                 for line in src:
                     if done_setup:
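
The commit message notes that this filter could later be moved to binary mode. A minimal sketch of what that might look like follows. It is not part of rtest.py: the helper name `strip_setup_binary`, the `SETUP_END` pattern constant, and the sample PostScript bytes are all hypothetical and chosen only for illustration. The sketch also checks why latin-1 is a safe text-mode choice (every byte value decodes and re-encodes unchanged) and shows a byte that neither ASCII nor UTF-8 can decode.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical constant: bytes version of the pattern used in the patch.
SETUP_END = re.compile(rb"%%End.*Setup")


def strip_setup_binary(src: Path, dst: Path) -> None:
    """Copy every line after the first %%End...Setup marker, as raw bytes.

    Hypothetical binary-mode variant of the loop in doDiff; no decoding
    happens, so no encoding needs to be chosen.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        done_setup = False
        for line in fin:
            if done_setup:
                fout.write(line)
            else:
                done_setup = SETUP_END.match(line) is not None


# latin-1 maps each byte value 0-255 to exactly one code point, so a
# decode/encode round trip is lossless for arbitrary bytes.
all_bytes = bytes(range(256))
assert all_bytes.decode("latin-1").encode("latin-1") == all_bytes

# Made-up sample: 0xE9 is not valid ASCII and, on its own, not valid UTF-8.
sample = b"%%BeginSetup\n%%EndSetup\n(caf\xe9) show\n"
for codec in ("ascii", "utf-8"):
    try:
        sample.decode(codec)
    except UnicodeDecodeError:
        print(f"{codec}: cannot decode sample")

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "in.ps"
    dst = Path(tmp) / "out.ps"
    src.write_bytes(sample)
    strip_setup_binary(src, dst)
    result = dst.read_bytes()
```

Working on bytes also sidesteps newline translation, which text mode applies on both read and write unless `newline=""` is passed to `open`.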