The definition of `Sfio_t` was structured to give it a public part and a private
part:
struct _sfio_s {
… public members …
_SFIO_PRIVATE; // ← only visible if sfio_t.h is included
};
This works only as long as no user ever allocates a `Sfio_t` themselves. If they
do, they will allocate too few bytes, based on their view of the size of
`Sfio_t` missing its private members.
A side effect of link-time optimization is that the compiler can see through
this trickery and witness both with-private and without-private definitions at
once. Creating objects of this type or pointers to objects of this type is a
violation of C’s strict aliasing rule and the compiler can now see this. This
can cause the compiler to make unsafe optimizations, like concluding any code
involving `Sfio_t*` variables must be unreachable.
The safer way to do a public/private class like this is with two structs, the
private one containing the public one as a member:
struct foo_public {
… public members …
};
struct foo_private {
struct foo_public *public;
… private members …
};
Public API functions then accept `struct foo_public*` parameters and internally
convert them to `struct foo_private*` variables in order to access the
internals:
int foo_do_something(struct foo_public *foo) {
// use something like the Linux kernel’s container_of
struct foo_private *f = container_of(foo, public);
…
}
But instead of this, we can observe that the definition of `Sfio_t` does not
need to have any public members at all, and we can make the entire type private.
We need an exception for libexpr which reaches into sfio internals.
sfio: remove macro implementations of character functions
These were implemented as an optimization to allow inlining. This is no longer
necessary on contemporary compilers with link-time optimization. Removing them
will unblock fixing some strict aliasing problems with the `Sfio_t` type.
These are present in the Autotools build system as a separate library, libsfiof,
linked into libsfio. They are also present in the MS Build build system inlined
into libsfio itself. This change copies the MS Build approach.
This did not cause problems because these sources only exist as fallback
implementations for functions that are also implemented as macros. All current
usages see the macro definitions so never call these functions.
Fixing this will allow removing the macro versions in an upcoming change.
gvmap make_map_internal: operate directly on 'xcombined' instead of 'xtemp'
Nothing appears to alias `xcombined` within this block. So we can save an
intermediate allocation and the cost of data movement by simply writing results
into their final destination.
This function was calling `realloc` and not checking the return value. To fix
this, we rephrase it into using string views and then a final checked
allocation. This removes the long lived static allocation here, now requiring
the caller to free the returned value. This function is not on a hot path, so
removing this optimization is fine.
ccomps process: replace 'sprintf' with 'agxbprint'
This replaces some error prone manual calculation with a dynamic buffer that
handles the calculation of needed allocations. This also potentially saves
memory in cases of short strings where an agxbuf can use inline storage.
ccomps processClusters: replace 'sprintf' with 'agxbprint'
This replaces some error prone manual calculation with a dynamic buffer that
handles the calculation of needed allocations. This also potentially saves
memory in cases of short strings where an agxbuf can use inline storage.
ccomps gwrite: fix crash when failing to open files
This code looks as if it and the similar code in bcomps.c were copy-pasted from
the same common source. Except the bcomps.c version remembered to exit when
failing to open the output file, whereas the ccomps.c code would go on to crash
when using the null `outf`.
It looks like a mistake in the initial revision of Graphviz that this function
never wrote to `bufsz`. The effect of this was that this function would
reallocate on every single call. So empirically it seems there was no need to
avoid allocating new memory on each call, which is what this commit does.
smyrna DrawEllipse: use doubles for local temporaries
Values being stored into these were doubles and later use of these promoted them
to double. So it is not clear why they were not double to begin with. Squashes 2
-Wfloat-conversion warnings.
Most of the calculations with this and uses of it are with doubles. So making it
a double as well avoids losing intermediate precision. Squashes 5
-Wfloat-conversion warnings.
Link-Time Optimization (LTO) is a mechanism that enables the compiler to
optimize across translation unit boundaries. In particular, it enables
cross-file function inlining. This is present and mature in the majority of
contemporary compilers and switching it on has few downsides.
¹ The test case from https://gitlab.com/graphviz/graphviz/-/issues/1652 run as
`neato -Tsvg -o /dev/null 1652.dot`.
² swedish-flat.dot Magnus attached to
https://gitlab.com/graphviz/graphviz/-/issues/1718 run as
`circo -Tsvg -o /dev/null swedish-flag.dot`.
³ The test case from https://gitlab.com/graphviz/graphviz/-/issues/1864 run as
`twopi -Tsvg -o /dev/null 1864.dot`.
⁴ The test case from https://gitlab.com/graphviz/graphviz/-/issues/2064 run as
`dot -Gnslimit=2 -Gnslimit1=2 -Gmaxiter=5000 -Tsvg -o /dev/null 2064.dot`.
⁵ The tests/2095.dot test case from prior to minimization
(3819821ea70fae730dd224936628ed3929b03531). Run as
`dot -Tsvg -o /dev/null 2095.dot`.
pathplan: replace unchecked allocation calls with cgraph wrappers
After the prior UB fixes, the #1999 example bottoms out in this code, failing
the second allocation call while trying to allocate ~938GB. The return values
for neither of these calls were checked, resulting in messy crashes when
scenarios like this occurred. This change swaps them for calls to the cgraph
allocation wrappers that exit gracefully on out-of-memory conditions.
After this change, a ASan+UBSan build of Graphviz can process the #1999 example
without crashing. Graphs with >46341 (⌈√INT_MAX⌉) nodes no longer cause an
integer overflow.
UBSan revealed the graph attached to #1999 was triggering an integer overflow in
this multiplication, later on causing a crash in `twopi`. Any number of nodes
≥⌈√INT_MAX⌉ exceeds INT_MAX during multiplication. This fix still does not
enable the graph to be processed in a reasonable amount of time, and it still
crashes later after several hours due to another integer overflow.