neatogen: push queue allocation into 'bfs_bounded'
Similar to the previous commit, the callee does not care about the contents of
the queue on entry and the caller does not care about the contents of the queue
on exit. Note that in this case we need to add an extra parameter because
`bfs_bounded` did not have the queue size already.
Callers of `bfs` were constructing `Queue` objects and passing them into `bfs`.
But in all of these cases neither the callee nor the caller cares about the
contents of the queue. This appears to have been an optimization to hoist
allocation of this object outside loops in callers. This is nowhere close to the
most expensive operation these locations are performing. And replicating
operations like this led to opportunities for errors like that fixed in the
prior commit. It seems a win to undo this for readability.
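
For illustration, the shape of this refactoring can be sketched as follows.
This is not the neatogen code: `int_queue`, `graph` and `bfs_count` are
hypothetical names and the adjacency layout is an assumption. The point is
only that the queue is allocated and freed inside the callee, sized from a
count the callee already has.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fixed-capacity FIFO of vertex indices. */
typedef struct {
  int *data;
  size_t head, tail;
} int_queue;

/* Hypothetical graph: the neighbors of vertex v are
 * adj[start[v]] .. adj[start[v + 1] - 1]; start has n + 1 entries. */
typedef struct {
  size_t n;
  const size_t *start;
  const int *adj;
} graph;

/* Counts the vertices reachable from `src`. The queue is a private detail:
 * it is allocated on entry and freed on exit, so callers never see it.
 * Returns 0 if allocation fails. */
size_t bfs_count(const graph *g, int src) {
  assert(g != NULL && src >= 0 && (size_t)src < g->n);

  int_queue q = {0};
  q.data = malloc(g->n * sizeof(q.data[0]));
  unsigned char *seen = calloc(g->n, 1);
  if (q.data == NULL || seen == NULL) {
    free(q.data);
    free(seen);
    return 0;
  }

  size_t visited = 0;
  seen[src] = 1;
  q.data[q.tail++] = src;
  while (q.head < q.tail) {
    int v = q.data[q.head++];
    ++visited;
    for (size_t i = g->start[v]; i < g->start[v + 1]; ++i) {
      int w = g->adj[i];
      if (!seen[w]) {
        seen[w] = 1;
        q.data[q.tail++] = w; /* each vertex is enqueued at most once */
      }
    }
  }

  free(seen);
  free(q.data);
  return visited;
}
```

A caller now writes `bfs_count(&g, 0)` inside its loop with no queue setup or
teardown of its own, at the cost of one allocation per call.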
neatogen embed_graph: remove free of output parameter
The only call into this function passes `NULL` for the `coords` parameter.
Additionally it did not make much sense to free this pointer on behalf of the
caller. This function is not designed to be called in a loop, so if the caller
wants their passed-in parameter freed they should do it themselves prior to
calling.
The lib/cgraph/alloc.h wrappers are similar to the older lib/common/memory.h
wrappers except (1) they are header-only and (2) they live in a directory
(cgraph) that is at the root of the dependency tree. The long term plan is to
replace all use of lib/common/memory.h with lib/cgraph/alloc.h.
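
A minimal sketch of what a header-only allocation wrapper can look like.
`xcalloc` is a hypothetical name, not the lib/cgraph/alloc.h API, and the
exit-on-failure policy is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical header-only wrapper: zero-initialized allocation that
 * terminates the program on failure, so callers need no NULL check.
 * `static inline` lets the definition live entirely in a header, with no
 * library to link against. */
static inline void *xcalloc(size_t nmemb, size_t size) {
  assert(size > 0 && "element size must be nonzero");

  /* reject requests whose total byte count would overflow size_t */
  if (nmemb > SIZE_MAX / size) {
    fprintf(stderr, "integer overflow in allocation\n");
    exit(EXIT_FAILURE);
  }

  void *p = calloc(nmemb, size);
  if (p == NULL && nmemb != 0) {
    fprintf(stderr, "out of memory\n");
    exit(EXIT_FAILURE);
  }
  return p;
}
```

Because everything is in the header, any directory can use the wrapper without
adding a link-time dependency, which is the property (1) above buys.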
common scanEntity: use a more appropriate type for 'len'
This addresses a problem where the subtraction of two pointers, `endp - t`, can
in theory exceed the size of an `int`. In practice this cannot occur, but using
a more correct type squashes a -Wconversion and -Wsign-conversion warning in
this code.
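
As a sketch of the pattern (the helper name `span_len` is hypothetical and is
not part of `scanEntity`): pointer subtraction yields a `ptrdiff_t`, and once
the ordering of the two pointers is known, converting to `size_t` is lossless,
whereas storing the result in an `int` may truncate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of the span [start, end) as an unsigned size.
 * `end - start` has type ptrdiff_t; with end >= start established, the cast
 * to size_t cannot change the value. */
static size_t span_len(const char *start, const char *end) {
  assert(start != NULL && end != NULL && end >= start);
  return (size_t)(end - start);
}
```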
common undoClusterEdges: use cgraph wrapper for allocation
The lib/cgraph/alloc.h wrappers are similar to the older lib/common/memory.h
wrappers except (1) they are header-only and (2) they live in a directory
(cgraph) that is at the root of the dependency tree. The long term plan is to
replace all use of lib/common/memory.h with lib/cgraph/alloc.h.
common new_queue: use cgraph wrappers for allocation
The lib/cgraph/alloc.h wrappers are similar to the older lib/common/memory.h
wrappers except (1) they are header-only and (2) they live in a directory
(cgraph) that is at the root of the dependency tree. The long term plan is to
replace all use of lib/common/memory.h with lib/cgraph/alloc.h.
A mistake in 632fe0bd1cfc6a4f636db4f85206aff6720bdc6b made this test read from
/dev/null instead of the input file it was supposed to read. Note that this
required some tweak to the skip condition. The Windows platforms on which this
fails seem all over the place and expressing the exact pattern seemed too
complex. For the curious, what we currently see in CI is:
I would not be surprised if these results are not stable. It is likely this
failure presents across all platforms, but is dependent on things like Address
Space Layout Randomization to exhibit.
This change switches both `agxbinit` and zero-initialization (`agxbuf xb = {0}`)
to result in a buffer with inline storage. This means string data is written
inline until it becomes too large to store within the `agxbuf` struct itself,
when it is then relocated to the heap.
This is an optimization. Short dynamic strings can now be written completely
without heap allocation. For example, stringifying a number
(`agxbprint(&xb, "%d", i)`) will fit fully within the inline buffer.
Some performance evaluation comparing the merge-base of this series
(740c4bf1edba931be0698de7bfde629eba2f40c0) to the current commit follows. These
numbers were derived using `/usr/bin/time` and Valgrind on `-O3 -flto -DNDEBUG`
builds of Graphviz. Instruction counts are provided for examples that are not
too long running.
¹ `dot -Tsvg -o /dev/null tests/regression_tests/large/long_chain`.
² “long” in the table is a graph generated by the following script:
  print("digraph {")
  for i in range(80000):
      print(f" N{i} -> N{i + 1}")
  print("}")
This represents something like a best case scenario to observe the effect of
this optimization. Lots of small unique strings. Run as
`dot -Tsvg -o /dev/null long.dot`.
³ The test case from https://gitlab.com/graphviz/graphviz/-/issues/456 run as
`dot -Tsvg -o /dev/null 456.dot`. Note that this actually errors out
eventually.
⁴ The test case from https://gitlab.com/graphviz/graphviz/-/issues/456 but with
`concentrate=true` removed. Run as `dot -Tsvg -o /dev/null 456.dot`.
⁵ The test case from https://gitlab.com/graphviz/graphviz/-/issues/1652 run as
`neato -Tsvg -o /dev/null 1652.dot`.
⁶ The test case from https://gitlab.com/graphviz/graphviz/-/issues/1652 run as
`dot -Tsvg -Gnslimit=2.0 -o /dev/null 1652.dot`. Note that this eventually
crashes.
⁷ swedish-flag.dot that Magnus attached to
https://gitlab.com/graphviz/graphviz/-/issues/1718 run as
`circo -Tsvg -o /dev/null swedish-flag.dot`.
⁸ The test case from https://gitlab.com/graphviz/graphviz/-/issues/1864 run as
`neato -Tsvg -o /dev/null 1864.dot`.
⁹ The test case from https://gitlab.com/graphviz/graphviz/-/issues/1864 run as
`twopi -Tsvg -o /dev/null 1864.dot`.
¹⁰ The test case from https://gitlab.com/graphviz/graphviz/-/issues/2064 run as
`dot -Gnslimit=2 -Gnslimit1=2 -Gmaxiter=5000 -Tsvg -o /dev/null 2064.dot`.
¹¹ The tests/2095.dot test case from prior to minimization
(3819821ea70fae730dd224936628ed3929b03531). Run as
`dot -Tsvg -o /dev/null 2095.dot`.
As indicated by the changes to the layout diagram in this diff, previously we
were essentially wasting a number of bytes (3 on x86, 7 on x86-64) on structure
padding. Through some rearrangement and packing, we can make these bytes usable
for the inline storage area.
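
The wasted bytes can be measured with a small sketch. `struct before` is a
hypothetical stand-in for the earlier layout (three word-sized fields followed
by a one-byte tag), not the real `agxbuf` definition, and the result depends
on the ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the pre-change layout. */
struct before {
  char *buf;
  size_t size;
  size_t capacity;
  unsigned char located;
};

/* Bytes the compiler appends after the last member so that
 * sizeof(struct before) is a multiple of the struct's alignment. */
static size_t trailing_padding(void) {
  size_t used = offsetof(struct before, located) + sizeof(unsigned char);
  assert(used <= sizeof(struct before)); /* a struct contains its members */
  return sizeof(struct before) - used;
}
```

On common ABIs where `size_t` and pointers are naturally aligned, this returns
`sizeof(size_t) - 1`, matching the 3 and 7 byte figures quoted above.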
The definition of the struct is not quite what you might expect to see. The
`located` field is in the first member of the union but is actually used
regardless of what mode an `agxbuf` is in (heap, stack, inline). This is
necessary to get the desired packing. It does not overlap with other struct
fields, though reading and writing a struct through different union members is
well defined in C99.
This optimization might seem a little niche. But `agxbuf` objects are
extensively used. Once inline `agxbuf` objects become possible (an upcoming
commit), this saves us heap allocations in numerous scenarios.
cgraph agxbuf: support storing short strings inline
This commit implements Small String Optimization (SSO) for `agxbuf`. Strings
up to a certain size (24 bytes unterminated on x86-64) can be stored inline
within an `agxbuf` structure itself with no external backing memory.
There is currently no way to create a buffer in inline mode. Neither
`agxbinit` nor zero-initialization gives a path to this. The ability to
create inline strings will be added in an upcoming commit.
When a string is stored inline, its size is maintained in the `located` field
instead of in the `size` field.
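
A simplified sketch of the technique, assuming a hypothetical `sbuf` type. The
names, the `INLINE_CAP` and `ON_HEAP` constants and the growth policy are all
assumptions, not the `agxbuf` implementation. Unlike this commit, the sketch
treats a zero-initialized buffer as an empty inline string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { INLINE_CAP = 24, ON_HEAP = 255 }; /* hypothetical constants */

/* Hypothetical small-string buffer. `located` is either ON_HEAP or, for an
 * inline string, the current length (0..INLINE_CAP). */
typedef struct {
  union {
    struct {
      char *buf;
      size_t size;
      size_t capacity;
    } heap;
    char store[INLINE_CAP];
  } u;
  unsigned char located;
} sbuf;

static int sbuf_is_inline(const sbuf *b) { return b->located <= INLINE_CAP; }

static size_t sbuf_len(const sbuf *b) {
  return sbuf_is_inline(b) ? b->located : b->u.heap.size;
}

static const char *sbuf_data(const sbuf *b) {
  return sbuf_is_inline(b) ? b->u.store : b->u.heap.buf;
}

/* Appends `n` bytes of `s`. Returns 0 on success, -1 on allocation failure.
 * Overflow of the capacity computation is ignored for brevity. */
static int sbuf_append(sbuf *b, const char *s, size_t n) {
  assert(b != NULL && s != NULL);
  size_t len = sbuf_len(b);

  if (sbuf_is_inline(b) && n <= INLINE_CAP - len) {
    /* still fits inline: the length lives in `located` */
    memcpy(b->u.store + len, s, n);
    b->located = (unsigned char)(len + n);
    return 0;
  }

  if (sbuf_is_inline(b)) {
    /* too large for inline storage: relocate the contents to the heap */
    size_t cap = 2 * (len + n);
    char *p = malloc(cap);
    if (p == NULL)
      return -1;
    memcpy(p, b->u.store, len); /* copy out before overwriting the union */
    b->u.heap.buf = p;
    b->u.heap.size = len;
    b->u.heap.capacity = cap;
    b->located = ON_HEAP;
  } else if (n > b->u.heap.capacity - len) {
    size_t cap = 2 * (len + n);
    char *p = realloc(b->u.heap.buf, cap);
    if (p == NULL)
      return -1;
    b->u.heap.buf = p;
    b->u.heap.capacity = cap;
  }

  memcpy(b->u.heap.buf + len, s, n);
  b->u.heap.size = len + n;
  return 0;
}

static void sbuf_free(sbuf *b) {
  if (!sbuf_is_inline(b))
    free(b->u.heap.buf);
  b->located = 0; /* back to an empty inline string */
}
```

Appending 5 bytes to an empty `sbuf` stays inline; appending 25 more exceeds
`INLINE_CAP` and moves the string to the heap.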
CI: suppress -Wmissing-braces in CentOS 7 CMake build
The compiler on CentOS 7 is an older version of GCC, 4.8.5, which spuriously
issues warnings for code like `foo_t x = {0}` where `foo_t` is a struct
containing an aggregate (array, struct, union). This form of initialization is
meant to be valid for all structs in C99. We need to suppress this to avoid CI
warnings that fail the build in an upcoming commit.
cgraph agxbuf: use an enumerated type for buffer location
This does not change the semantics; 0 still means heap-allocated and 1 means
stack-allocated. However this is preparation for introducing a third location a
buffer can be hosted in.
This eases some upcoming changes to the internal structure of the `agxbuf` type.
This alters the semantics of `deparse` (the buffer’s size ends up 0 on
completion of this function), but this is acceptable as all callers go on to
immediately free the buffer.
cgraph: avoid accessing agxbuf internals in the scanner
This eases some upcoming changes to the internal structure of the `agxbuf` type.
The trailing calls to `agxbclear` are removed because they are no longer
necessary; the `agxbuse` call does the same thing.
1349b895eca0bb04855ba46162d0ec7ba3c65010 refactored `printTok` to introduce a
call to `agxbput` that passes in the return value from a previous `agxbuse` on
the same buffer. This pointer still points into the destination buffer and
`agxbput` bottoms out on a call to `memcpy`. So this ends up calling `memcpy`
with overlapping source and destination, something that is undefined behavior in
C.
This change introduces a local alternative for this situation, because this is
the only known place where we need to do such a thing.
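
The hazard and one possible way around it can be sketched with a toy buffer.
`tiny_buf`, `tiny_use` and `tiny_put` are hypothetical and are not the
`agxbuf` API, and using `memmove` here is an assumption for illustration
rather than a description of the local alternative this change introduces.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-size buffer. */
typedef struct {
  char data[64];
  size_t len;
} tiny_buf;

/* NUL-terminates the contents, resets the length, and returns a pointer
 * into the buffer's own storage. */
static char *tiny_use(tiny_buf *b) {
  b->data[b->len] = '\0';
  b->len = 0;
  return b->data;
}

/* Appends the string `s`. memmove, unlike memcpy, is defined when source
 * and destination overlap, so `s` may point into `b->data` itself. */
static void tiny_put(tiny_buf *b, const char *s) {
  size_t n = strlen(s);
  assert(n < sizeof(b->data) - b->len); /* leave room for a terminator */
  memmove(b->data + b->len, s, n);
  b->len += n;
}
```

With this, `tiny_put(&b, tiny_use(&b))` copies the buffer onto itself without
invoking undefined behavior; the same call built on `memcpy` would not be
valid C.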