commit generated ps_font_equiv.h and remove generator for this
The header ps_font_equiv.h was generated from 3 files, fontmap.cfg,
ps_font_equiv.txt, and ps_fontmap.txt. The generator itself was the Perl file,
mksvgfonts.pl.
The generation process is deterministic and not dependent on the end user’s
system. In fact, ps_font_equiv.h is generated during CI and included in the
portable source tarball. Furthermore, the main driving source for this,
ps_font_equiv.txt, is no clearer or more commented than the generated header.
Finally, a copy of the generated header was already committed to the
repository under windows/include/ for the MSBuild workflow, which does not want
to call Perl.
For these reasons, having this as a generated file offered no advantage. In
fact, it was a net negative in the build process, as it was the only thing in
the build that required Perl. As of this commit, Perl is no longer required to
build Graphviz itself. Perl is still necessary for building some optional
components like the Graphviz bindings for Perl.
consistently use float arithmetic in build_rotmatrix
By avoiding mixing doubles and floats, we avoid ever down-converting from a
double to a float and losing precision. Squashes some -Wfloat-conversion
warnings.
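
For illustration, a minimal sketch of the pattern, not the actual Smyrna code.
The name rotmatrix_sketch and the 3x3 layout are assumptions; the point is only
that float literals keep every intermediate value in float:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for build_rotmatrix: convert a unit quaternion
 * q = (x, y, z, w) into a 3x3 rotation matrix using only float arithmetic. */
static void rotmatrix_sketch(float m[3][3], const float q[4]) {
  assert(m != NULL && q != NULL);

  /* Float literals (1.0f, 2.0f) keep each expression in float. Writing 1.0 or
   * 2.0 instead would promote the expression to double and narrow it again on
   * assignment, which is what -Wfloat-conversion reports. */
  m[0][0] = 1.0f - 2.0f * (q[1] * q[1] + q[2] * q[2]);
  m[0][1] = 2.0f * (q[0] * q[1] - q[2] * q[3]);
  m[0][2] = 2.0f * (q[2] * q[0] + q[1] * q[3]);

  m[1][0] = 2.0f * (q[0] * q[1] + q[2] * q[3]);
  m[1][1] = 1.0f - 2.0f * (q[2] * q[2] + q[0] * q[0]);
  m[1][2] = 2.0f * (q[1] * q[2] - q[0] * q[3]);

  m[2][0] = 2.0f * (q[2] * q[0] - q[1] * q[3]);
  m[2][1] = 2.0f * (q[1] * q[2] + q[0] * q[3]);
  m[2][2] = 1.0f - 2.0f * (q[1] * q[1] + q[0] * q[0]);
}
```

Passing the identity quaternion (0, 0, 0, 1) should yield the identity matrix.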
consistently use float arithmetic in tb_project_to_sphere
By avoiding mixing doubles and floats, we avoid ever down-converting from a
double to a float and losing precision. Squashes a -Wfloat-conversion warning.
rephrase some magic numbers in Smyrna into their originating computation
This removes the use of some double literals with implicit conversion to floats,
squashing some -Wfloat-conversion compiler warnings. It also makes the resulting
code clearer. The compiler constant-folds these at -O1 and above, so there is no
loss of performance. The resulting expressions are slightly different to the
original, but if anything they are a more accurate representation of the intent
here.
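
A hedged sketch of the idea, using a made-up constant rather than one of the
actual Smyrna literals. DEG_TO_RAD, PI_F, and deg_to_rad are hypothetical names
chosen only for this example:

```c
#include <assert.h>

/* Before (hypothetical): a double literal stored into a float.
 *   static const float DEG_TO_RAD = 0.017453292519943295;
 */

/* After: the originating computation, written entirely in float. This is a
 * constant expression, so the compiler can evaluate it at build time. */
#define PI_F 3.14159265358979f
static const float DEG_TO_RAD = PI_F / 180.0f;

static float deg_to_rad(float degrees) {
  assert(degrees == degrees); /* reject NaN */
  return degrees * DEG_TO_RAD;
}
```

The reader no longer has to recognize 0.0174... as π/180; the expression says
so directly.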
By avoiding mixing doubles and floats, we avoid ever down-converting from a
double to a float and losing precision. Squashes some -Wfloat-conversion
warnings.
By avoiding mixing doubles and floats, we avoid ever down-converting from a
double to a float and losing precision. Squashes a -Wfloat-conversion warning.
use more appropriate square root function in vlength
The square root functions sqrt and sqrtf compute on doubles and floats
respectively. Using the float version here avoids a lossy conversion from double
to float when returning from this function.
This function was mixing floats and doubles, leading to mixed precision issues.
Standardizing on float usage throughout leads to more predictable behavior and
removes a number of -Wfloat-conversion warnings.
This marks the class as unnecessary outside of its containing translation unit.
This further emphasizes that it is an implementation detail of the containing
file as well as allowing the compiler to aggressively optimize its layout.
manage Node::r as a reference instead of a pointer
Generally having class members that are references is an anti-pattern. However,
in this case the Node class is only used within generate-constraints.cpp with
all Node objects having limited lifetimes. It is essentially an implementation
detail of generate-constraints.cpp, irrelevant to the outside. Making its
Rectangle member a reference will ease some upcoming changes.
take const references in Rectangle::overlap{X|Y} instead of mutable pointers
These functions do not handle nullptr and do not modify their parameter, so a
const reference is more appropriate. This change will ease some future work
towards removing manual memory management.
Apart from being unused, this function has multiple issues:
1. Comment typo “separater.”
2. Comment discusses a path returning 1 that does not exist.
3. Useless return value.
4. The last store to len in the last loop is dead. This variable is unused
after this point because the containing loop is exited and it is unused in
the remainder of the function.
5. The two mallocs in the last loop over-allocate by one byte. The len
variable tracking the length of strings being constructed here was
incremented *past* the '\0' terminator in the previous line. Thus the
length of the string including '\0' is len, not len + 1.
6. The allocations in the loop are essentially doing an open-coded strdup (or
even more efficiently, strndup).
7. Both the large static buffer swork and the optional larger allocation into
swork1 are unnecessary. Tracking offset and length in s would allow using
the original memory as a source to any strndup/strncpy.
8. The ss allocation reserves one entry too many. The number of tokens is
ntokens, not ntokens + 1, and this array is not zeroed on allocation so the
last entry merely contains garbage.
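
To illustrate points 5 to 7, a minimal sketch of copying a token straight out of
the original string by offset and length, with no intermediate work buffer.
copy_token is a hypothetical helper, not something that exists in the tree.
Note that here len counts only the token's characters, so len + 1 bytes is the
correct allocation size, unlike in the removed function where len had already
been advanced past the terminator:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the token s[offset .. offset + len) into a fresh
 * heap string. len excludes the terminator, so exactly len + 1 bytes are
 * needed: len characters plus one '\0'. The caller frees the result. */
static char *copy_token(const char *s, size_t offset, size_t len) {
  assert(s != NULL);
  char *token = malloc(len + 1);
  if (token == NULL)
    return NULL;
  memcpy(token, s + offset, len);
  token[len] = '\0';
  return token;
}
```

On POSIX systems, strndup(s + offset, len) does the same job in one call.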
These functions were intended for callers who want to allocate a Queue on the
heap. However, this is unnecessary: all callers allocate their Queue objects on
the stack and use mkQueue and freeQueue instead. This removal cleans up the code
a little and squashes two -Wmissing-prototypes warnings.
This function’s only use is in test code and the header declaring its prototype
is not shipped. It is simpler to omit this (and utility function halffunc) from
the libcommon build entirely. Squashes one -Wmissing-prototypes warning.
compute the extent of some gvputs writes at compile time
This change teaches the compiler how to unfold gvputs of a string literal into
gvwrite, computing the length of the literal at compile time. This is only
applied to the SVG back end in this commit.
On tests/regression_tests/large/long_chain, this drops the number of gvputs
calls from 2442066 to 363011, though the difference naturally reappears as
gvwrite calls. Total executed instructions drop from 8093310656 to 8009099650,
a speed up of ~1%.
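
A rough sketch of how such an unfolding can be written, assuming a macro-based
approach. sink_t, sink_write, sink_puts, and SINK_PUTS_LITERAL are hypothetical
stand-ins for the job type, gvwrite, gvputs, and whatever mechanism the commit
actually uses:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output sink standing in for a Graphviz job. */
typedef struct {
  char buf[64];
  size_t len;
} sink_t;

/* Stand-in for gvwrite: append exactly len bytes. */
static size_t sink_write(sink_t *sink, const char *s, size_t len) {
  assert(sink != NULL && s != NULL);
  assert(len <= sizeof(sink->buf) - sink->len);
  memcpy(sink->buf + sink->len, s, len);
  sink->len += len;
  return len;
}

/* Stand-in for gvputs: the length has to be discovered at run time. */
static size_t sink_puts(sink_t *sink, const char *s) {
  return sink_write(sink, s, strlen(s));
}

/* For a string literal, sizeof yields the length (including the '\0') at
 * compile time, so no strlen call is needed. The "" prefix makes the macro
 * fail to compile if its argument is not a string literal. */
#define SINK_PUTS_LITERAL(sink, lit) \
  sink_write((sink), "" lit, sizeof(lit) - 1)
```

Both routes append the same bytes; only the point at which the length is
computed differs.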
On tests/regression_tests/large/long_chain this drops the number of gvputs calls
from 2475072 to 2442066, reducing the amount of the trace for which gvputs is
responsible from 6.60% to 6.53%. Total executed instructions are reduced from
8098098396 to 8093310656, a speed up of ~0.05%.
optimize two calls to gvprintf with single characters
The expression `gvprintf(j, "%c", x)` is equivalent to `gvwrite(j, &x, 1)`.
However, it seems modern compilers, even with link-time optimization enabled,
are not clever enough to see this equivalence. By unraveling the gvprintf call
to what it eventually bottoms out to, we can accelerate SVG generation.
On tests/regression_tests/large/long_chain, this drops the number of gvprintf
calls from 297008 to 165008, reducing the amount of the trace for which gvprintf
is responsible from 3.27% to 2.24%. Total executed instructions are reduced from
8160974807 to 8098098396, a speed up of ~1%.
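
The equivalence can be sketched as follows. sink_t and sink_write are
hypothetical stand-ins for the job type and gvwrite, and put_char_formatted only
approximates what a printf-style call does internally:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical output sink standing in for a Graphviz job. */
typedef struct {
  char buf[64];
  size_t len;
} sink_t;

/* Stand-in for gvwrite: append exactly len bytes. */
static size_t sink_write(sink_t *sink, const char *s, size_t len) {
  assert(sink != NULL && s != NULL);
  assert(len <= sizeof(sink->buf) - sink->len);
  memcpy(sink->buf + sink->len, s, len);
  sink->len += len;
  return len;
}

/* The general route: parse a format string, format into a scratch buffer,
 * then write the result. */
static void put_char_formatted(sink_t *sink, char c) {
  char scratch[2];
  int n = snprintf(scratch, sizeof(scratch), "%c", c);
  assert(n == 1);
  sink_write(sink, scratch, (size_t)n);
}

/* The direct route: a single character is already a 1-byte buffer. */
static void put_char_direct(sink_t *sink, char c) {
  sink_write(sink, &c, 1);
}
```

Both functions append the same single byte, but the direct route skips format
string parsing entirely.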
convert some gvprintf calls with no format codes to gvputs
This is equivalent, but gvputs is less expensive to call than gvprintf.
Surprisingly,¹ with link-time optimization a compiler is able to see this
optimization for itself, so this makes no difference to performance in an LTO
build. However, this should be a slight optimization in non-LTO builds.
¹ I say surprisingly because compilers generally do not attempt inter-procedural
optimization across varargs calls. The calling convention and interpretation
of arguments is complex enough that they generally conservatively leave such
calls alone.
This function trims unnecessary trailing zeros from a printed floating-point
number. It was written to be extremely general, however it is only ever used to
trim a number printed with the format string "%.02f". We can take advantage of
this fact to know that, if it can locate a period, there are exactly two digits
following this that need to be checked. This then allows implementing the
remainder of the function not as a loop but as simply a few branches.
Using tests/regression_tests/large/long_chain, which has been used for other
profiling in this area, this drops total executed instructions from 8160952787
to 8143275099, a speed up of ~0.2%.
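
A minimal sketch of the branch-only approach, under the assumption that the
input always came from "%.02f". trim_zeros_2dp is a hypothetical name and this
is not a copy of the real function:

```c
#include <assert.h>
#include <string.h>

/* Trim trailing zeros, in place, from a number printed with "%.02f". */
static void trim_zeros_2dp(char *buf) {
  assert(buf != NULL);

  char *dot = strchr(buf, '.');
  if (dot == NULL)
    return; /* no period, nothing to trim */

  /* "%.02f" guarantees exactly two digits after the period. */
  assert(dot[1] != '\0' && dot[2] != '\0' && dot[3] == '\0');

  if (dot[2] != '0')
    return; /* e.g. "3.25": both digits are significant */
  if (dot[1] != '0')
    dot[2] = '\0'; /* e.g. "1.50" becomes "1.5" */
  else
    dot[0] = '\0'; /* e.g. "2.00" becomes "2" */
}
```

There is no loop: at most two characters are ever inspected after the period.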
The 0 path in this function is, as the comment says, to avoid printing confusing
numbers like -0.00. However, the remainder of the function prints the number to
2 decimal places. So actually any number that *rounds* to -0.00 is going to come
out this way. To avoid this, we can expand the cases where we take an early
exit. This is also a minor performance speed up in these cases, as the 0 path is
faster than the common path.
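
A sketch of the widened early exit, assuming the threshold is half a unit in the
second decimal place. format_2dp is a hypothetical function written for this
example; the real code writes to a job rather than a caller-supplied buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Print x to two decimal places into buf, never producing "-0.00". */
static void format_2dp(double x, char *buf, size_t size) {
  assert(buf != NULL && size >= 2);

  /* Anything strictly inside (-0.005, 0.005) would print as "0.00" or
   * "-0.00", so take the cheap zero path for the whole interval instead of
   * only for exactly 0. */
  if (x > -0.005 && x < 0.005) {
    memcpy(buf, "0", 2);
    return;
  }

  snprintf(buf, size, "%.02f", x);
}
```

With an exact comparison against 0 only, an input such as -0.004 would fall
through to the common path and come out as "-0.00".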