Of all the source files in the Graphviz tree, this file alone was encoded in
UTF-16 with a BOM.¹ BOMs have mostly fallen out of favor, with people
preferring to let the host operating system or locale determine encoding. Git
can also re-encode text files on checkout where a different local encoding is
needed (for example, via the working-tree-encoding attribute). With that in
mind, we can convert this file to UTF-8 and stop forcing developers on other
operating systems to pay the price for Windows’ poor past decisions.
¹ https://en.wikipedia.org/wiki/Byte_order_mark
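For illustration only, a conversion of this kind can be sketched in a few lines of Python. This is not necessarily how the file was actually converted; the helper name `utf16_to_utf8` is hypothetical, and the sketch assumes the input begins with a UTF-16 BOM.

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode BOM-prefixed UTF-16 bytes as BOM-less UTF-8.

    Hypothetical helper for illustration; not part of the Graphviz tree.
    """
    if not data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        raise ValueError("input does not start with a UTF-16 BOM")
    # The "utf-16" codec reads the BOM to pick the byte order and drops it
    # from the decoded text.
    text = data.decode("utf-16")
    # Plain "utf-8" (unlike "utf-8-sig") does not write a BOM.
    return text.encode("utf-8")
```

To convert a file on disk, one would read it in binary mode, pass the bytes through this helper, and write the result back in binary mode so that no implicit newline or encoding translation occurs.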
     print(f"checking {source}...")
-    # FIXME: this file contains invalid UTF-8 characters
-    if source == "windows/cmd/fc-fix/fc-fix.cpp":
-        print(f"skipping {source} due to malformed content")
-        continue
-
     with open(source, "rt", encoding="utf-8") as f:
         original = f.read()