Magnus Jacobsson [Fri, 14 Aug 2020 10:04:08 +0000 (12:04 +0200)]
Fix publishing of Windows build artifcats
The COLLECTION was missing since we no longer get it from the
portable_source stage because the Windows builds start directly and do
not wait for the portable_source stage. Therefor we generate it in the
Windows builds directly instead.
Magnus Jacobsson [Fri, 14 Aug 2020 09:53:22 +0000 (11:53 +0200)]
Change to all-numeric version entry format and add collection
The version is now specified with a new collection variable which
shall be "stable" or "development" and an all-numeric version. The
"~dev" (and the .<commiter date>) suffix will be added by the script.
The distance between endb and next was being stored in an int variable. While
this should not exceed the bounds of an int during normal usage, it's risky to
store a pointer difference in a narrower type this way.
fix: pass format parameter into gverrorf explicitly
The gverrorf function was prototyped in a way that did not align with a function
pointer it was stored into. On platforms with more forgiving varargs calling
conventions, everything worked out OK. However, on other platforms calling
gverrorf through this function pointer would result in stack corruption. This
change aligns gverrorf with the type of the function pointer in which it is
stored.
change errorv calling convention to explicitly take a format string
This function implicitly assumed its variable arguments contained a format
string as the first argument. This change pushes the assumption into the calling
convention, making it visible to the compiler.
This spin lock incorrectly implied to the reader that this code was thread safe.
IMHO there are two primary issues with this spin lock and one secondary issue:
1. The intention of the spin lock is clearly to provide mutual exclusion
within fmtbuf(). However this function returns a pointer to a static
(thread-global) buffer. Excluding two threads from being in this function
concurrently does not help if both exit the function with pointers into the
same memory.
2. The spin lock uses no atomics or volatile. As a result, accesses to the
lock can be freely rearranged or coalesced, with no regard to thread
interleaving. This can result in one thread never seeing another thread's
update (unlock) to the lock. Even worse, my local compiler simply
recognizes what is happening and omits the lock operations entirely.
3. (secondary) If this lock worked as intended, it would spin writing to the
lock variable. This would cause fairly pessimistic cache line bouncing as
all waiting processors try to keep the line containing the lock in
exclusive state in their local cache. A more performant way to achieve
this is to spin waiting to see the lock value change *without* modifying
the lock yourself. This allows multiple waiting processors to have the line
in shared state, and minimizes cache coherence traffic.
Certain edge attributes are constructed in advance of their being seen in the
input because Graphviz knows it may need default values for them. Later, if seen
in the input, the values of these attributes are updated.
This all works fine unless the order in which these initially-defaulted edge
attributes appear in the input does not match the order in which the default
versions are constructed by Graphviz internally. In this case, the order in
which the attributes are seen in the input is used to construct a dictionary of
them, but the original copies are used to index into attribute values.
In the particular test case added in this commit,
digraph {
{ rank=same; n1; n2 }
n2 -> n1 [ headport=s, arrowhead=normal ]
}
arrowhead was constructed with symbol ID 0 and headport was constructed with
symbol ID 1. But then the later parsing of these attributes resulted in a
dictionary where the headport value was in ID 0 and the arrowhead value was in
ID 1. Indexing into this dictionary with the initially constructed E_arrowhead
resulted in incorrectly returning the value "s". This caused a spurious error
'Arrow type "s" unknown' as well as incorrect graph output.
Fixes #1444. Note that this may just be one of several issues resulting from
using these initially constructed E_* symbols.
When passed a large edge weight, e.g. 1073741824, an integer overflow would
occur when calculating virtual weights. This would go on to cause a segfault as
calculations were increasingly thrown off by negative values.
This change detects when an overflow will occur and exits. Calling exit() from
within a deeply nested library function like this is not good practice, but we
don't have a better alternative right now. The call chain involves gvLayout()
whose interface to the plugins inherently has no way of reporting failure.
Previously Cgraph only assigned odd IDs from the internal counter, because even
IDs were reserved for pointers derived from agstrdup. Now that we no longer use
pointers as IDs there is nothing special about an ID being even or odd.
Instead of using string pointers as IDs, we now assign the IDs for named
entities exactly the same way we assign them for anonymous identities. This
works because we first check the internal map before creating any new ID, so
processing the same name twice will result in the same ID as before.
This fixes #1767. Now clusters within a graph are consistently processed in the
order in which they appear in the input file, rather than an order dependent on
pointers given out by the allocator.
stop relying on pointers-as-IDs to retrieve strings
Now that strings are always stored in the internal map, they can be retrieved
from there instead of relying on the assumption that the ID aliases the name of
an entity. Related to #1767.
check internal map for IDs prior to creating one from a string
This has no effect currently because the case that preceded this check was for
non-local names, that are not inserted into the internal map anyway. However, an
upcoming change will alter this behavior, so we want to make sure that if a name
already has a known ID it is found first. Related to #1767.
fix: take a copy of font name when caching fonts in Pango plugin
The Pango plugin caches the last used font to save an expensive reconstruction
process each time it runs. To determine whether the cached font is eligible for
reuse, the name and size of the requested font are checked against the cache
entry. However the name of the cached font was only stored as a pointer to the
original name. The cached entry could outlive the original font, which could be
freed before a next call into the plugin. As a result, the plugin would perform
a strcmp using a stale freed pointer.
To address this we simply take a copy of the font name's string data instead of
just a pointer to it. There is no need to copy any of the other cached fields as
they are only accessed if the font name check finds the entry to be valid.
fix: suppress Xlib finalization if initialization failed
The initialization function of a device plugin has no way of reporting failure
to its called. So an attempt to use the x11 back end calls xlib_finalize() even
if xlib_initialize() failed. To make this safe, we set a flag if initialization
succeeds and make xlib_finalize() a no-op if the flag is not set. Fixes #1776.
fix: cast overflow with large font sizes in Pango plugin
When using an abnormally large font size, computing the Pango units for the size
would overflow. This resulted in an assertion failure in Pango when seeing a
negative size value. This issue was found by Google Autofuzz project. This
fixes #1314.
replace in-tree str[n]casecmp implementations with libc
Instead of carrying multiple implementations of these functions in the Graphviz
tree, we now call the built-in support on the current platform. Closes #1775.
This change makes the #includes of lib/sfio headers unambiguous within the
lib/sfio C sources. The upcoming plan is to also make the header #includes and
any of lib/sfio's dependents also unambiguous this way. Related to #1785.
This is a step along the way towards less ambiguous #includes. Eventually this
should become a more clearly qualified #include like <sfio/vthread.h>. Related
to #1785.
The reason this is a separate commit from the previoust commit, which
contains the renaming of the file, is to trick git to show the changes
as a one-file diff regardless of the similarity index, rather than one
deleted and one added file. This makes it easier to compare the ksh
code to the Python code.
The reason this is a separate commit from the next commit, which
contains the changed content, is to trick git to show the changes as a
one-file diff regardless of the similarity index, rather than one
deleted and one added file. This makes it easier to compare the ksh
code to the Python code.
compiler annotations to explain calling convention of agerr and friends
This change teaches suitably enlightened compilers (Clang and GCC) more about
how these functions are intended to be called. When told to detect misuse of
format strings (`-Wformat`), they can now detect incorrect calls like that fixed
in 31ef5a3c31214d77137e4b36603a2a97576e851e.
remove unnecessary temporary buffer to construct error message
The motivation for this change is to avoid passing a variable to agerr which
triggers so -Wformat-security errors after some upcoming changes. However, it's
nice to be able to just simplify this code anyway.