This function was using the current system locale to encode and decode data sent
to Graphviz and received from Graphviz when using a textual output format. As a
result, encoding exceptions would occur if either the input or the output
contained non-ASCII characters and the system locale was not a UTF-8 one.
Apparently no test in the current suite hits this scenario. However, an
upcoming commit adds a test case that does.
This change forces the encoding and decoding to be done as UTF-8, which is also
what Graphviz unconditionally uses.
This code was using `strtok` as if it splits based on the single separator
passed to it. But `strtok` actually treats the second parameter as a list of
character separators. In this change, we rephrase this code to do what its
original author appears to have intended.
This slightly changes the semantics of this code. The exact intent of the
original is not known, so the new behavior is a best guess at what the author
intended.
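The difference between the two interpretations can be sketched as follows. This is an illustration only, not the code touched by the commit: `count_strtok_fields` and `count_substring_fields` are hypothetical helpers, and the `"->"` separator is an assumed example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the tokens strtok produces. strtok treats `seps` as a SET of
 * single-character separators, and it modifies `buf` in place. */
static size_t count_strtok_fields(char *buf, const char *seps) {
  size_t n = 0;
  for (char *tok = strtok(buf, seps); tok != NULL; tok = strtok(NULL, seps))
    ++n;
  return n;
}

/* Count the fields when `sep` is treated as ONE multi-character separator. */
static size_t count_substring_fields(const char *s, const char *sep) {
  assert(sep[0] != '\0' && "separator must be non-empty");
  const size_t sep_len = strlen(sep);
  size_t n = 1;
  for (const char *p = strstr(s, sep); p != NULL; p = strstr(p + sep_len, sep))
    ++n;
  return n;
}
```

For the input `a-b->c` and the separator `->`, the `strtok` version splits on every `-` and every `>`, while the `strstr` version splits only where the full two-character sequence occurs.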
Clang seems to consider `{NULL}` different from `{0}`, with the latter being an
intent of zero initialization and the former a possible accidental omission of
other fields.
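As a rough illustration, both spellings below zero-initialize every member; only the signal to the compiler differs. The `item_t` struct and both helpers are hypothetical, and whether a warning is emitted depends on the compiler version and flags in use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, for illustration only. */
typedef struct {
  const char *name;
  int width;
  int height;
} item_t;

/* Names only the first member; the rest are implicitly zeroed. A compiler
 * may read this as "the author forgot the other fields". */
static item_t zeroed_with_null(void) {
  item_t item = {NULL};
  return item;
}

/* The conventional "zero everything" idiom. */
static item_t zeroed_with_zero(void) {
  item_t item = {0};
  return item;
}
```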
Vincent Fu [Fri, 15 Jul 2022 00:19:20 +0000 (20:19 -0400)]
dot.demo: replace LDFLAGS with LDLIBS in Makefile
With LDFLAGS I am unable to build the demo programs using the Makefile
but the Makefile works with LDFLAGS changed to LDLIBS. We are using
pkg-config to obtain the appropriate libraries. So LDLIBS is the
appropriate variable to use.
The previous use of `oldof` was a verbose way of allocating a single element, so
we replace it with the central allocation helper, also avoiding crashes if
allocation fails.
xdot sprintXDot: steal agxbuf’s buffer instead of double copying
8064f6e902cc4c3062cffa2d1d307ee9cf1893bb replaced lib/xdot’s inline copy of a
subset of the agxbuf.h API with an include of the header containing the full
API. This gives us access to `agxbdisown`. This function effectively does the
work of `agxbuse;strdup;agxbfree` by taking the existing dynamically allocated
buffer within the `agxbuf` object, rather than making yet another copy of this
data only to discard the original.
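The ownership transfer described above can be sketched with a toy buffer. This is not the agxbuf implementation: `buf_t`, `buf_put`, and `buf_disown` are hypothetical names, and the real `agxbuf` has a different layout and API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; `data` is always NUL-terminated when set. */
typedef struct {
  char *data;
  size_t size;
} buf_t;

/* Append `s` to the buffer. Returns 0 on success, -1 if allocation fails. */
static int buf_put(buf_t *b, const char *s) {
  const size_t extra = strlen(s);
  char *grown = realloc(b->data, b->size + extra + 1);
  if (grown == NULL)
    return -1;
  memcpy(grown + b->size, s, extra + 1); /* copies the terminating NUL too */
  b->data = grown;
  b->size += extra;
  return 0;
}

/* Hand the heap allocation to the caller and reset the buffer. The caller
 * must eventually free() the result. No second copy is made. */
static char *buf_disown(buf_t *b) {
  char *owned = b->data;
  b->data = NULL;
  b->size = 0;
  return owned;
}
```

In this sketch, disowning an empty buffer yields `NULL`, which a caller would need to handle.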
edgepaint: remove unnecessary 'strdup' of 'lightness'
Pointers `getopt` returns in `optarg` point into the original `argv` which lives
in immortal storage. There is no need to duplicate such a pointer to prolong its
lifetime.
This commit looks like it is changing the source string, but `arg` and `optarg`
point at the same thing at this point. The difference is that `optarg` is not
`const` qualified, so we can do this assignment without a compiler warning.
smyrna: remove unnecessary 'strdup' calls in 'mTestgvpr'
The strings being duplicated are passed through to `gvpr` which does not modify
its arguments. So by rearranging when we release `bf2`, we can remove the need
to dynamically allocate the members of `argv`.
smyrna load_attributes: use a string view for 'ss'
This code contained multiple memory leaks and unchecked allocations:¹
1. `pch` was `strdup`-ed into `ss` on line 58. It was then `strdup`-ed again
when being saved to an `attr` field. This lost the memory originating from
the first `strdup`.
2. Cases 0, 3, and 4 of the switch do not save the full contents of `ss` at
all. This means naively removing the `strdup` calls in cases 1, 2, and
default would not have solved the memory leak in (1) because cases 0, 3,
and 4 would still leak memory.
3. None of the `strdup` calls in this function were checked for failure.
This commit attempts to solve all the above. We now take a read-only reference
to the string data on line 58 and only `strdup` it when needed.
¹ It also assumes all lines of the input file are fewer characters than
`BUFSIZ`, a platform-dependent constant. I do not know why this would be
guaranteed. However, this problem seems orthogonal to the above.
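The "read-only reference, copy only when needed" idea can be sketched like this. The `view_t` type and both helpers are hypothetical stand-ins, not the string view type used in the Graphviz tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical non-owning reference to `size` bytes starting at `data`. */
typedef struct {
  const char *data;
  size_t size;
} view_t;

/* Reference part of an existing string without allocating. */
static view_t view_of(const char *start, size_t size) {
  return (view_t){.data = start, .size = size};
}

/* Make an owned, NUL-terminated copy. Returns NULL if allocation fails, so
 * the caller can check the result instead of assuming success. */
static char *view_dup(view_t v) {
  char *copy = malloc(v.size + 1);
  if (copy == NULL)
    return NULL;
  memcpy(copy, v.data, v.size);
  copy[v.size] = '\0';
  return copy;
}
```

Code paths that only inspect the field never allocate, so they have nothing to leak. Paths that store the field call `view_dup` exactly once.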
This loop contains no `continue` statements, its counter is incremented in a
regular way, and the counter is unused outside the loop. So we can write the
loop more concisely and scope `attrcount` more tightly by using a `for` loop
instead of a `while` loop.
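The shape of this transformation, shown on a hypothetical summing loop rather than the code in the commit:

```c
#include <assert.h>
#include <stddef.h>

/* Before: the counter is declared outside the loop and outlives it. */
static int sum_while(const int *xs, size_t n) {
  int total = 0;
  size_t i = 0;
  while (i < n) {
    total += xs[i];
    i++;
  }
  return total;
}

/* After: same behavior, but the counter is scoped to the loop. */
static int sum_for(const int *xs, size_t n) {
  int total = 0;
  for (size_t i = 0; i < n; i++)
    total += xs[i];
  return total;
}
```

The rewrite is only safe under the conditions stated above. A `continue` inside the `while` body would skip the increment, whereas in a `for` loop the increment still runs.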
This loop contains no `continue` statements, its counter is incremented in a
regular way, and the counter is unused outside the loop. So we can write the
loop more concisely and scope `ind` more tightly by using a `for` loop instead
of a `while` loop.
Pointers `getopt` returns in `optarg` point into the original `argv` which lives
in immortal storage. There is no need to duplicate such a pointer to prolong its
lifetime.
Sequence IDs are calculated using 64-bit counters in `Agclos_s`. But then the
field used to store sequence IDs, `Agtag_s.seq`, is `sizeof(unsigned) * 8 - 4`
bits wide, which is 28 bits on x86 and x86-64. As a result, the compiler believes
IDs that exceed 2²⁸ - 1 can occur and overflow `Agtag_s.seq`:
edge.c:213:30: warning: conversion from 'int' to 'unsigned int:28' may change
value [-Wconversion]
213 | AGSEQ(in) = AGSEQ(out) = seq;
| ^~~
...
graph.c: In function 'agopen1':
graph.c:77:20: warning: conversion from 'uint64_t' {aka 'long unsigned int'}
to 'unsigned int:28' may change value [-Wconversion]
77 | AGSEQ(g) = agnextseq(par, AGRAPH);
| ^~~~~~~~~
...
node.c: In function 'newnode':
node.c:76:16: warning: conversion from 'uint64_t' {aka 'long unsigned int'} to
'unsigned int:28' may change value [-Wconversion]
76 | AGSEQ(n) = seq;
| ^~~
...
node.c: In function 'agnodebefore':
node.c:359:22: warning: conversion from 'uint64_t' {aka 'long unsigned int'}
to 'unsigned int:28' may change value [-Wconversion]
359 | AGSEQ(snd) = (g->clos->seq[AGNODE] + 2);
| ^
In practice, ingesting a graph of this size is not achievable, so these
overflows cannot occur.
This change introduces assertions and casts in these cases to explain the
assumptions to the compiler. It squashes the above warnings. In future, perhaps
these fields should all be made to consistently use the same type.
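A sketch of the assert-then-narrow pattern, under assumptions: `tag_t`, `SEQ_BITS`, and `set_seq` are hypothetical names, the struct is a simplified stand-in for `Agtag_s`, and masking is just one way to make the narrowing explicit. Whether a particular compiler stays quiet with this exact form is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

/* Width of the sequence field: 28 where `unsigned` is 32 bits. */
enum { SEQ_BITS = sizeof(unsigned) * 8 - 4 };

/* Hypothetical, simplified stand-in for the real tag struct. */
typedef struct {
  unsigned flags : 4;
  unsigned seq : SEQ_BITS;
} tag_t;

/* Store a 64-bit sequence ID into the narrower bit-field. */
static void set_seq(tag_t *tag, uint64_t seq) {
  const uint64_t max_seq = (UINT64_C(1) << SEQ_BITS) - 1;
  /* State the assumption that the ID fits; checked in debug builds. */
  assert(seq <= max_seq && "sequence ID overflows the bit-field");
  /* Mask and cast so the narrowing is explicit rather than implicit. */
  tag->seq = (unsigned)(seq & max_seq);
}
```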
gv_trim_zeros: identify string extent instead of writing a '\0'
The buffer that this function was truncating is destined for `gvwrite`. So we
can make the whole thing read-only by identifying a string extent instead of
modifying the buffer in place. The compiler may have been able to identify the
intent of this code anyway,¹ but if not, these changes make it clearer how this
code can be optimized.
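The idea of returning an extent instead of truncating can be sketched as below. `trimmed_length` is a hypothetical helper, and its trimming rules are an assumption that may not match `gv_trim_zeros` exactly. It assumes the input is a plain decimal number such as `printf("%.3f")` would produce.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return how many leading bytes of `s` remain after dropping trailing zeros
 * that follow a decimal point, and the point itself if nothing follows it.
 * `s` is never written to. */
static size_t trimmed_length(const char *s) {
  size_t len = strlen(s);
  const char *dot = strchr(s, '.');
  if (dot == NULL)
    return len; /* no fractional part, nothing to trim */
  const size_t after_dot = (size_t)(dot - s) + 1;
  while (len > after_dot && s[len - 1] == '0')
    --len;
  if (len == after_dot)
    --len; /* only "." is left of the fraction, so drop it too */
  return len;
}
```

A caller can then write just the extent, for example with `fwrite(s, 1, trimmed_length(s), stdout)`, and the buffer can be `const` throughout.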
This looks like a bit of a strange change, given that we now wrap the entire
file in `extern "C"`. However this has two key benefits:
1. `dot_builtins` and `dot_static` that include this source needed an
Autotools hack¹ to force compilation to use the C++ front end (`c++`)
instead of the C front end (`cc`) in order to link against the C++ standard
library. By moving this source into C++ we can remove this hack.
2. When trying to integrate `dot_builtins` into the CMake build system, MSVC
complains (correctly) that the initializers to the array in this file are
not compile-time constants. GCC and Clang apparently allow this by a
non-standard extension. By moving this into C++, we get more relaxed
initialization semantics that allow this on all compilers.
This code is not currently compiled and, in fact, will not compile if you try to
re-enable it. As an example issue, it uses `GD_inleaf`, a macro intended for
accessing `Agraphinfo_t` fields, on a `node_t`. This is sort of a double mistake
as `Agraphinfo_t` also has no `inleaf` field. This problem seems to have been
present in the very first Graphviz revision,
256ef66663ca0c072554ee3f5e7971911031b3c7. Fortunately the mistakes sort of
cancelled each other out because the `GD_*` macros did no casting and
`Agnodeinfo_t` _does_ have an `inleaf` field. The outcome seems to be what the
author intended, even if the route by which they got there was not intended.
The above is only one of several issues with this code. Resurrecting it has
unknown cost and unknown benefit, so we remove it here to avoid the implication
that it can be easily switched back on.
Contrary to the X11 documentation,¹ it seems button values other than 1-5 can be
returned as button press events. The assertions altered in this commit were
introduced to guarantee the value does not exceed the limits of the type of the
parameter in the user’s callback (`int`). So we can safely relax this to just
the limit itself.
This was validated by doing an exhaustive comparison of all inputs of length ≤ 2
to both the before and after versions of the function. Not bulletproof, but it
is a strong signal that the new version is functionally identical.