Magnus Jacobsson [Mon, 18 Jul 2022 07:47:15 +0000 (09:47 +0200)]
dotgen cluster: fix memory leaks by freeing data from removed rank leaders
The `build_skeleton` function inserts virtual nodes and edges into the
graph to be used by the clustering algorithms. The `expand_cluster`
function later installs real nodes or sub-clusters and deletes the
virtual nodes and edges by calling the `remove_rank_leaders` function
which removes these from the graph. However, it didn't free the data
allocated for them which caused several memory leaks.
Magnus Jacobsson [Sun, 10 Jul 2022 08:47:22 +0000 (10:47 +0200)]
add new test case test_clusters
Upcoming commits in this series will make changes to subgraph layout
and since clusters are also subgraphs, we want to weed out any memory
issues related to clusters separately.
This test is run in CI with ASan leak detection enabled. It currectly
fails because ASan detects memory leaks in the dot and patchwork
layout engines.
Upcoming commits in this series will make changes to gvFreeLayout and
we want to ensure that those changes don't introduce any memory leaks
when run with ASan, since gvFreeLayout handles cleanup of both
subgraphs and the root graph and the test case extracts subgraphs from
the root graph.
This test is run in CI with ASan leak detection enabled. It is more or
less a replica of 'dot.demo/neatopack.c', which is also tested in CI
by 'tests/test_examples.py', but with leak detection disabled. It
currently fails because of a heap-buffer-overflow in the gvFreeLayout.
Towards https://gitlab.com/graphviz/graphviz/-/issues/1800.
Normally gvLayoutJobs does this initialization, but since we call
ccomps before doing the layout and it expects graphs and nodes to have
'Agraphinfo_t' and 'Anodeinfo_t' records attached already, we must do
this initialization in the user program instead, see
https://graphviz.org/pdf/pack.3.pdf.
Fixes first part of
https://gitlab.com/graphviz/graphviz/-/issues/1800.
Upcoming commits in this series will make changes to gvFreeLayout and
we want to ensure that those changes don't introduce any memory leaks
when run with ASan, since gvFreeLayout handles cleanup of both
subgraphs and the root graph.
This test is run in CI with ASan leak detection enabled.
Towards https://gitlab.com/graphviz/graphviz/-/issues/1800.
Upcoming commits in this series will make changes to gvFreeLayout and
we want to ensure that those changes don't introduce any memory leaks
when run with ASan, since gvFreeLayout calls layout-specific cleanup
functions.
This test is run in CI with ASan leak detection enabled. It currently
fails because of a memory leak in the osage layout.
Towards https://gitlab.com/graphviz/graphviz/-/issues/1800.
patchwork nodecmp: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
pack cmpf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
neatogen cmpIpair: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
neatogen cmpitem: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
fdpgen ijcmpf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
dotgen edgeposcmpf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
dotgen nodeposcmpf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
dotgen ordercmpf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
common textfont_comparf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
common increasingrankcmpf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
common decreasingrankcmpf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
gvcolor cmpf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
dijkstra cmpf: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
dotgen edgecmp: rephrase comparator to avoid arithmetic
cccb8b1d22a18031fc92d93133c7fa14ef7e1361 fixed an integer overflow in a
`memcmp`-/`strcmp`-like comparator. The same situation exists in the code
touched in this commit. Rather than wait for an edge case to expose an overflow
here, this change makes the same update, removing arithmetic and the consequent
possibility of overflow.
Note that the first comparison in this function appears to be the wrong way
around. But this (mistake?) is left as-is.
The `del` field was added to `gvdevice_callbacks_t` in ad82f9fa3212cbaa01625f3c27a1e90c1c990fd2 under the name `delete`, but seemingly
has never been used by any in-tree plugin. This commit removes a no-op,
never-called callback for it, squashing some -Wunused-parameter warnings.
This commit does not remove the member because it is part of the public API. We
are assuming the convention is to check these pointers against null before
calling them, like the other GVC structs. But the struct has no comment
explaining it nor how it is meant to be used, so we cannot be sure how/if users
would be relying on this.
The `modify` field was added to `gvdevice_callbacks_t` in 50df9e82edcb61c5650f81ca317a31c9493ed07e, but seemingly has never been used by
any in-tree plugin. This commit removes a no-op, never-called callback for it,
squashing some -Wunused-parameter warnings.
This commit does not remove the member because it is part of the public API. We
are assuming the convention is to check these pointers against null before
calling them, like the other GVC structs. But the struct has no comment
explaining it nor how it is meant to be used, so we cannot be sure how/if users
would be relying on this.
common: manage directory search list as views instead of allocated strings
This list is derived from the `Gvfilepath` global, something that is live for
the entire run of a program. So we can avoid `strdup`-ing components being
extracted out of it and maintain these as read-only references instead. This is
both faster and decreases memory usage.
gvc auto_output_filename: avoid 'strdup' when constructing file name
This block of code is scanning a string of `:`-separated entries and writing
them into `buf` in reverse order `.`-separated. We can rewrite it to avoid
dynamic allocation, thus avoiding certain failure cases and locking overhead.
Unfortunately there seems to be no variant of `strrchr` that takes a length. So
we need to write our own loop for locating the last `:` with a limit.
cgraph tok: add extra check for duplicate separator characters
When a user supplies the same character more than once when constructing a
tokenizer, nothing is functionally incorrect but this is strong indicator of a
bug or misunderstanding by the caller. For example, a bug like this is
documented in #2259.
When using `shape=record`, certain characters within labels have alternate
semantics. The switch in `parse_reclbl` handles these, with most characters
having no special semantics and branching to the default case. The trailing loop
in this case was attempting to accrue UTF-8 continuation bytes. But lets take a
look at the Wikipedia table describing UTF-8:¹
Notice that the continuation bytes are distinguished by upper 0b10 bits. Now
consider that the trailing loop was using a mask with 128 (0b10000000) to
identify such bytes. Such a value masks _out_ bit 6. That is, this loop
condition expression was true for all values with upper 0b10 bits _or_ upper
0b11 bits.
The first consequence of this is that this loop thought multiple consecutive
non-ASCII characters were a single character. It treated the 0b11xxxxxx byte 1
of a new non-ASCII character as if it were another continuation byte of the
preceding non-ASCII character.
The second consequence of this is that an ASCII character followed by non-ASCII
characters would be treated as a single character. That is, 0b0xxxxxxx bytes led
to adjacent following 0b1xxxxxxx bytes being consumed along with them.
These factors combine in the #925 example to confuse the state machine of the
containing loop and result in a malformed label being produced.
This fix adjusts the mask such that it takes _both_ bit 7 and bit 6 and compares
against 0b10xxxxxx.
There are a number of other odd things going on with this code that I did not
attempt to change:
1. ASCII characters are allowed to have following UTF-8 continuation bytes.
This would be considered malformed, but this code treats it as legal.
2. UTF-8 permits a maximum of 3 continuation bytes (see table above) but this
code allows an arbitrary number of continuation bytes. It also does no
validation that the leading byte’s upper bit values and the number of
continuation bytes correspond.
3. There is something called “hard space” mode that this code toggles to keep
track of when a space that would otherwise be omitted needs to be
preserved in the output. Once the hard space flag is toggled on, the code
appears to never toggle it off. This looks like it has the (presumably
unintended) effect of something like `"\\ "` causing all later spaces in
the label to become hard spaces.
This function was using the current system locale to encode and decode data sent
to Graphviz and received from Graphviz when using a textual output format. As a
result, encoding exceptions would occur if either the input or the output
contained non-ASCII characters and the system locale was not a UTF-8 one.
Apparently none of the current test suite hits this scenario. However, an
upcoming commit adds a test case that does.
This change forces the encoding and decoding to be done as UTF-8, which is also
what Graphviz unconditionally uses.
This code was using `strtok` as if it splits based on the single separator
passed to it. But `strtok` actually treats the second parameter as a list of
character separators. In this change, we rephrase this code to do what its
original author appears to have intended.
This slightly changes the semantics of this code. But it seems we do not know
the exact intent of the original, so this is hoped to match the author’s
intention.
Clang seems to consider `{NULL}` different from `{0}`, with the latter being an
intent of zero initialization and the former a possible accidental omission of
other fields.
Vincent Fu [Fri, 15 Jul 2022 00:19:20 +0000 (20:19 -0400)]
dot.demo: replace LDFLAGS with LDLIBS in Makefile
With LDFLAGS I am unable to build the demo programs using the Makefile
but the Makefile works with LDFLAGS changed to LDLIBS. We are using
pkg-config to obtain the appropriate libraries. So LDLIBS is the
appropriate variable to use.
The previous use of `oldof` was a verbose way of allocating a single element, so
we replace it with the central allocation helper, also avoiding crashes if
allocation fails.
xdot sprintXDot: steal agxbuf’s buffer instead of double copying
8064f6e902cc4c3062cffa2d1d307ee9cf1893bb replaced lib/xdot’s inline copy of a
subset of the agxbuf.h API with an include of the header containing the full
API. This gives us access to `agxbdisown`. This function effectively does the
work of `agxbuse;strdup;agxbfree` by taking the existing dynamically allocated
buffer within the `agxbuf` object, rather than making yet another copy of this
data only to discard the original.
edgepaint: remove unnecessary 'strdup' of 'lightness'
Pointers `getopt` returns in `optarg` point into the original `argv` which lives
in immortal storage. There is no need to duplicate such a pointer to prolong its
lifetime.
This commit looks like it is changing the source string, but `arg` and `optarg`
point at the same thing at this point. But `optarg` is not `const` qualified, so
we can do this assignment without a compiler warning.
smyrna: remove unnecessary 'strdup' calls in 'mTestgvpr'
The strings being duplicated are passed through to `gvpr` which does not modify
its arguments. So by rearranging when we release `bf2`, we can remove the need
to dynamically allocate the members of `argv`.